<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="http://jekyllrb.com" version="2.5.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2016-06-22T13:48:43-07:00</updated><id>/</id><entry><title>Apache Kudu (incubating) Weekly Update June 21, 2016</title><link href="/2016/06/21/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update June 21, 2016" /><published>2016-06-21T00:00:00-07:00</published><updated>2016-06-21T00:00:00-07:00</updated><id>/2016/06/21/weekly-update</id><content type="html" xml:base="/2016/06/21/weekly-update.html">&lt;p&gt;Welcome to the fourteenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Dan Burkert posted a series of patches to &lt;a href=&quot;https://gerrit.cloudera.org/#/c/3388/&quot;&gt;add support in the Java client&lt;/a&gt;
for non-covering range partitions. At the same time he improved how that client locates tables by
leveraging the tablets cache.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the context of making multi-master reliable in 1.0, Adar Dembo posted a &lt;a href=&quot;https://gerrit.cloudera.org/#/c/3393/&quot;&gt;design document&lt;/a&gt;
on how to handle permanent master failures. Currently the master’s code is missing some features
like &lt;code&gt;remote bootstrap&lt;/code&gt; which makes it possible for a new replica to download a snapshot of the data
from the leader replica.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Tsuyoshi Ozawa refreshed &lt;a href=&quot;https://gerrit.cloudera.org/#/c/2162/&quot;&gt;a patch&lt;/a&gt; posted in February that
makes it easier to get started contributing to Kudu by providing a Dockerfile with the right
environment.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;on-the-blog&quot;&gt;On the blog&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Mike Percy &lt;a href=&quot;http://getkudu.io/2016/06/17/raft-consensus-single-node.html&quot;&gt;wrote&lt;/a&gt; about how Kudu
uses Raft consensus on a single node, and some changes we’re making as Kudu is getting more mature.&lt;/li&gt;
&lt;/ul&gt;
&lt;!--more--&gt;
&lt;p&gt;Want to learn more about a specific topic from this blog post? Shoot an email to the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweet at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>Welcome to the fourteenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.
Development discussions and code in progress
Dan Burkert posted a series of patches to add support in the Java client
for non-covering range partitions. At the same time he improved how that client locates tables by
leveraging the tablets cache.
In the context of making multi-master reliable in 1.0, Adar Dembo posted a design document
on how to handle permanent master failures. Currently the master&amp;#8217;s code is missing some features
like remote bootstrap which makes it possible for a new replica to download a snapshot of the data
from the leader replica.
Tsuyoshi Ozawa refreshed a patch posted in February that
makes it easier to get started contributing to Kudu by providing a Dockerfile with the right
environment.
On the blog
Mike Percy wrote about how Kudu
uses Raft consensus on a single node, and some changes we&amp;#8217;re making as Kudu is getting more mature.</summary></entry><entry><title>Using Raft Consensus on a Single Node</title><link href="/2016/06/17/raft-consensus-single-node.html" rel="alternate" type="text/html" title="Using Raft Consensus on a Single Node" /><published>2016-06-17T00:00:00-07:00</published><updated>2016-06-17T00:00:00-07:00</updated><id>/2016/06/17/raft-consensus-single-node</id><content type="html" xml:base="/2016/06/17/raft-consensus-single-node.html">&lt;p&gt;As Kudu marches toward its 1.0 release, which will include support for
multi-master operation, we are working on removing old code that is no longer
needed. One such piece of code is called LocalConsensus. Once LocalConsensus is
removed, we will be using Raft consensus even on Kudu tables that have a
replication factor of 1.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;Using Raft consensus in single-node cases is important for multi-master
support because it will allow people to dynamically increase their Kudu
cluster’s existing master server replication factor from 1 to many (3 or 5 are
typical).&lt;/p&gt;
&lt;h1 id=&quot;the-consensus-interface&quot;&gt;The Consensus interface&lt;/h1&gt;
&lt;p&gt;In Kudu, the
&lt;a href=&quot;https://github.com/apache/incubator-kudu/blob/branch-0.9.x/src/kudu/consensus/consensus.h&quot;&gt;Consensus&lt;/a&gt;
interface was created as an abstraction to allow us to build the plumbing
around how a consensus implementation would interact with the underlying
tablet. We were able to build out this “scaffolding” long before our Raft
implementation was complete.&lt;/p&gt;
&lt;p&gt;The Consensus API has the following main responsibilities:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Support acting as a Raft &lt;code&gt;LEADER&lt;/code&gt; and replicate writes to a local
write-ahead log (WAL) as well as followers in the Raft configuration. For
each operation written to the leader, a Raft implementation must keep track
of how many nodes have written a copy of the operation being replicated, and
whether or not that constitutes a majority. Once a majority of the nodes
have written a copy of the data, it is considered committed.&lt;/li&gt;
&lt;li&gt;Support acting as a Raft &lt;code&gt;FOLLOWER&lt;/code&gt; by accepting writes from the leader and
preparing them to be eventually committed.&lt;/li&gt;
&lt;li&gt;Support voting in and initiating leader elections.&lt;/li&gt;
&lt;li&gt;Support participating in and initiating configuration changes (such as going
from a replication factor of 3 to 4).&lt;/li&gt;
&lt;/ol&gt;
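&lt;p&gt;The majority rule in the first responsibility can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are hypothetical and are not part of Kudu’s Consensus interface.&lt;/p&gt;

```python
# Hypothetical helper names; this is not Kudu's Consensus API.

def majority_size(num_voters):
    """Smallest number of voters that forms a strict majority."""
    return num_voters // 2 + 1


def is_committed(num_acks, num_voters):
    """An operation is committed once a majority of voters have written it."""
    return num_acks >= majority_size(num_voters)


# With 3 voters, the leader's own WAL write alone is not enough...
print(is_committed(1, 3))  # False
# ...but the leader plus one follower is.
print(is_committed(2, 3))  # True
# With a single voter, the leader's own write is already a majority.
print(is_committed(1, 1))  # True
```

&lt;p&gt;The last call suggests why a replication factor of 1 needs no special case: the leader’s own WAL write already constitutes a majority.&lt;/p&gt;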
&lt;p&gt;The first implementation of the Consensus interface was called LocalConsensus.
LocalConsensus only supported acting as a leader of a single-node configuration
(hence the name “local”). It could not replicate to followers, participate in
elections, or change configurations. These limitations have led us to
&lt;a href=&quot;https://gerrit.cloudera.org/3350&quot;&gt;remove&lt;/a&gt; LocalConsensus from the code base
entirely.&lt;/p&gt;
&lt;p&gt;Because Kudu has a full-featured Raft implementation, Kudu’s RaftConsensus
supports all of the above functions of the Consensus interface.&lt;/p&gt;
&lt;h1 id=&quot;using-a-single-node-raft-configuration&quot;&gt;Using a Single-node Raft configuration&lt;/h1&gt;
&lt;p&gt;A common question on the Raft mailing lists is: “Is it even possible to use
Raft on a single node?” The answer is yes.&lt;/p&gt;
&lt;p&gt;Fundamentally, Raft works by first electing a leader that is responsible for
replicating write operations to the other members of the configuration. In
order to elect a leader, Raft requires a (strict) majority of the voters to
vote “yes” in an election. When there is only a single eligible node in the
configuration, there is no chance of losing the election. Raft specifies that
when starting an election, a node must first vote for itself and then contact
the rest of the voters to tally their votes. If there is only a single node, no
communication is required and an election succeeds instantaneously.&lt;/p&gt;
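&lt;p&gt;As a rough sketch of the election rule described above, the following Python function tallies votes in the same order: the candidate votes for itself first, then contacts other voters only if it still needs their votes. The names &lt;code&gt;run_election&lt;/code&gt; and &lt;code&gt;request_vote&lt;/code&gt; are hypothetical and do not come from Kudu’s code base. Terms, log comparisons, and timeouts, which a real Raft election requires, are deliberately left out.&lt;/p&gt;

```python
def run_election(candidate, voters, request_vote):
    """Return True if `candidate` wins a strict majority of `voters`.

    `request_vote(peer)` is a caller-supplied function returning True or
    False; it stands in for the vote request sent to a remote peer.
    """
    votes = 1  # the candidate first votes for itself
    needed = len(voters) // 2 + 1  # strict majority
    for peer in voters:
        if peer == candidate:
            continue
        if votes >= needed:
            break  # already won, no further communication required
        if request_vote(peer):
            votes += 1
    return votes >= needed


def unreachable(peer):
    # Simulates a peer that never grants its vote.
    return False


# Single-node configuration: wins without contacting anyone.
print(run_election("node-1", ["node-1"], unreachable))  # True
# Three-node configuration with no reachable peers: cannot win.
print(run_election("node-1", ["node-1", "node-2", "node-3"], unreachable))  # False
```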
&lt;p&gt;So, when does it make sense to use Raft for a single node?&lt;/p&gt;
&lt;p&gt;It makes sense to do this when you want to allow growing the replication factor
in the future. This is something that Kudu needs to support. When deploying
Kudu, someone may wish to test it out with limited resources in a small
environment. Eventually, they may wish to transition that cluster to be a
staging or production environment, which would typically require the fault
tolerance achievable with multi-node Raft. Without a consensus implementation
that supports configuration changes, there would be no way to gracefully
support this. Because single-node Raft supports dynamically adding an
additional node to its configuration, it is possible to go from one replica to
2 and then 3 replicas and end up with a fault-tolerant cluster without
incurring downtime.&lt;/p&gt;
&lt;h1 id=&quot;more-about-raft&quot;&gt;More about Raft&lt;/h1&gt;
&lt;p&gt;To learn more about how Kudu uses Raft consensus, you may find the relevant
&lt;a href=&quot;https://github.com/apache/incubator-kudu/blob/master/docs/design-docs/README.md&quot;&gt;design docs&lt;/a&gt;
interesting. In the future, we may also post more articles on the Kudu blog
about how Kudu uses Raft to achieve fault tolerance.&lt;/p&gt;
&lt;p&gt;To learn more about the Raft protocol itself, please see the &lt;a href=&quot;https://raft.github.io/&quot;&gt;Raft consensus
home page&lt;/a&gt;. The design of Kudu’s Raft implementation
is based on the extended protocol described in Diego Ongaro’s Ph.D.
dissertation, which you can find linked from the above web site.&lt;/p&gt;</content><author><name>Mike Percy</name></author><summary>As Kudu marches toward its 1.0 release, which will include support for
multi-master operation, we are working on removing old code that is no longer
needed. One such piece of code is called LocalConsensus. Once LocalConsensus is
removed, we will be using Raft consensus even on Kudu tables that have a
replication factor of 1.</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update June 13, 2016</title><link href="/2016/06/13/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update June 13, 2016" /><published>2016-06-13T00:00:00-07:00</published><updated>2016-06-13T00:00:00-07:00</updated><id>/2016/06/13/weekly-update</id><content type="html" xml:base="/2016/06/13/weekly-update.html">&lt;p&gt;Welcome to the thirteenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The IPMC vote for 0.9.0 RC1 passed and Kudu 0.9.0 is now
&lt;a href=&quot;http://getkudu.io/2016/06/10/apache-kudu-0-9-0-released.html&quot;&gt;officially released&lt;/a&gt;. Per the
lazily agreed-upon &lt;a href=&quot;http://mail-archives.apache.org/mod_mbox/kudu-dev/201602.mbox/%3CCAGpTDNcMBWwX8p+yGKzHfL2xcmKTScU-rhLcQFSns1UVSbrXhw@mail.gmail.com%3E&quot;&gt;plan&lt;/a&gt;,
the next release will be 1.0.0 in about two months.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Adar Dembo has been cleaning up and improving the Master process’s code. Last week he
&lt;a href=&quot;https://gerrit.cloudera.org/#/c/2887/&quot;&gt;finished&lt;/a&gt; removing the per-tablet replica locations cache.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Alexey Serbin contributed his first patch last week by &lt;a href=&quot;https://gerrit.cloudera.org/#/c/3360/&quot;&gt;fixing&lt;/a&gt;
most of the unit tests that were failing on OSX.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sameer Abhyankar is nearly finished adding support for “in-list” predicates;
follow &lt;a href=&quot;https://gerrit.cloudera.org/#/c/2986/&quot;&gt;this link&lt;/a&gt; to the Gerrit
review. This will enable specifying predicates in the style of “column IN (list, of, values)”.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mike Percy posted a few patches that remove LocalConsensus for single-node tablets, with the actual
removal happening in this &lt;a href=&quot;https://gerrit.cloudera.org/#/c/3350/&quot;&gt;patch&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;slides-and-recordings&quot;&gt;Slides and recordings&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Todd Lipcon presented Kudu at Berlin Buzzwords earlier this month. The recording is available
&lt;a href=&quot;https://berlinbuzzwords.de/session/apache-kudu-incubating-fast-analytics-fast-data&quot;&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>Welcome to the thirteenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry><entry><title>Apache Kudu (incubating) 0.9.0 released</title><link href="/2016/06/10/apache-kudu-0-9-0-released.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) 0.9.0 released" /><published>2016-06-10T00:00:00-07:00</published><updated>2016-06-10T00:00:00-07:00</updated><id>/2016/06/10/apache-kudu-0-9-0-released</id><content type="html" xml:base="/2016/06/10/apache-kudu-0-9-0-released.html">&lt;p&gt;The Apache Kudu (incubating) team is happy to announce the release of Kudu
0.9.0!&lt;/p&gt;
&lt;p&gt;This latest version adds basic UPSERT functionality and an improved Apache Spark Data Source
that doesn’t rely on the MapReduce I/O formats. It also improves Tablet Server
restart time as well as write performance under high load. Finally, Kudu now enforces
the specification of a partitioning scheme for new tables.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Read the detailed &lt;a href=&quot;http://getkudu.io/releases/0.9.0/docs/release_notes.html&quot;&gt;Kudu 0.9.0 release notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Download the &lt;a href=&quot;http://getkudu.io/releases/0.9.0/&quot;&gt;Kudu 0.9.0 source release&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>The Apache Kudu (incubating) team is happy to announce the release of Kudu
0.9.0!
This latest version adds basic UPSERT functionality and an improved Apache Spark Data Source
that doesn&amp;#8217;t rely on the MapReduce I/O formats. It also improves Tablet Server
restart time as well as write performance under high load. Finally, Kudu now enforces
the specification of a partitioning scheme for new tables.
Read the detailed Kudu 0.9.0 release notes
Download the Kudu 0.9.0 source release</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update June 6, 2016</title><link href="/2016/06/06/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update June 6, 2016" /><published>2016-06-06T00:00:00-07:00</published><updated>2016-06-06T00:00:00-07:00</updated><id>/2016/06/06/weekly-update</id><content type="html" xml:base="/2016/06/06/weekly-update.html">&lt;p&gt;Welcome to the twelfth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Jean-Daniel Cryans put up &lt;a href=&quot;http://mail-archives.apache.org/mod_mbox/incubator-kudu-dev/201606.mbox/%3CCAGpTDNduoQM0ktuZc1eW1XeXCcXhvPGftJ%3DLRB8Er5c2dZptvw%40mail.gmail.com%3E&quot;&gt;0.9.0 RC1&lt;/a&gt;
for a vote on the dev mailing list, and it passed. The Incubator PMC (IPMC) will also need
to vote on it before it can officially be released.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mike Percy is working on removing LocalConsensus, which is currently used for
single-node Kudu deployments. We will instead use the Raft consensus implementation
with a replication factor of 1. This simplifies development, since we currently have to maintain two
consensus implementations. It will also provide a way to migrate from single-node to multi-node
deployments. See the discussion in this &lt;a href=&quot;http://mail-archives.apache.org/mod_mbox/kudu-dev/201605.mbox/%3CCADXBggeE6RUYchv5fa=J2geHGE8Mw4SOeoi=LjXjdfmYYSqyhQ@mail.gmail.com%3E&quot;&gt;dev thread&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Zhen Zhang got a patch in for &lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-1444&quot;&gt;KUDU-1444&lt;/a&gt;
that adds resource usage monitoring to scanners in the C++ client. In the future this could
be leveraged by systems like Impala to augment the query profiles.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Longer term efforts for 1.0 are making good progress. Dan Burkert &lt;a href=&quot;https://gerrit.cloudera.org/#/c/3255/&quot;&gt;added support&lt;/a&gt;
in the C++ client for non-covering range partitioned tables, and David Alves has a few
patches in for the &lt;a href=&quot;https://gerrit.cloudera.org/#/c/2642/&quot;&gt;Replay Cache&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>Welcome to the twelfth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry><entry><title>Default Partitioning Changes Coming in Kudu 0.9</title><link href="/2016/06/02/no-default-partitioning.html" rel="alternate" type="text/html" title="Default Partitioning Changes Coming in Kudu 0.9" /><published>2016-06-02T00:00:00-07:00</published><updated>2016-06-02T00:00:00-07:00</updated><id>/2016/06/02/no-default-partitioning</id><content type="html" xml:base="/2016/06/02/no-default-partitioning.html">&lt;p&gt;The upcoming Apache Kudu (incubating) 0.9 release is changing the default
partitioning configuration for new tables. This post will introduce the change,
explain the motivations, and show examples of how code can be updated to work
with the new release.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;The most common source of frustration for new Kudu users is the default
partitioning behavior when creating new tables. If partitioning is not
specified, the Kudu client prior to 0.9 creates tables with a &lt;em&gt;single tablet&lt;/em&gt;.
Single tablet tables are a Kudu anti-pattern, since they are unable to get the
scalability benefit of distributing data across the cluster, and instead keep
all data on a single machine.&lt;/p&gt;
&lt;p&gt;Unfortunately, automatically choosing a better default partitioning
configuration for new tables is not simple. In most cases, hash partitioning on
the primary key is a better default, but this approach can have its own
drawbacks. In particular, it is not clear how many buckets should be used for
the new table.&lt;/p&gt;
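&lt;p&gt;To illustrate why the bucket count matters, here is a minimal Python sketch of hash partitioning. It assumes a CRC32 hash and made-up key values purely for illustration; Kudu’s actual hash function and row encoding are different, and &lt;code&gt;bucket_for&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import zlib
from collections import Counter


def bucket_for(key_values, num_buckets):
    """Map a tuple of key column values to a bucket index.

    Uses CRC32 as a stand-in hash; this is not the hash Kudu uses.
    """
    encoded = "|".join(key_values).encode("utf-8")
    return zlib.crc32(encoded) % num_buckets


# Made-up rows keyed on two string columns.
rows = [("user-%d" % i, "2016-06-02") for i in range(1000)]

# Compare how the same rows spread out for different bucket counts.
for num_buckets in (1, 4, 16):
    counts = Counter(bucket_for(row, num_buckets) for row in rows)
    print(num_buckets, "buckets, largest holds", max(counts.values()), "rows")
```

&lt;p&gt;With a single bucket every row lands in the same place, which mirrors the single-tablet anti-pattern. More buckets spread the rows out, but the right number depends on the cluster and the workload, which is why no single default fits.&lt;/p&gt;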
&lt;p&gt;Since there is no bullet-proof default and changing the partitioning
configuration after table creation is impossible, &lt;a href=&quot;https://lists.apache.org/thread.html/ca8972620839109334493424a1022fc08c77c315d9d623f5caaa815f@1463699013@%3Cuser.kudu.apache.org%3E&quot;&gt;we
decided&lt;/a&gt;
to remove the default altogether. Removing the default is a backwards
incompatible change, so it must be done before the 1.0 release. If we later find
a better way to create a default partitioning configuration, it should be
possible to adopt it in a backwards compatible way. The result of removing the
default is that new tables created with the 0.9 client must specify a
partitioning configuration, or table creation will fail. You can still create a
table with a single tablet, but it must be configured explicitly. These changes
only affect new table creation; existing tables, including tables created with
default partitioning before the 0.9 release, will continue to work.&lt;/p&gt;
&lt;p&gt;In most cases updating existing code to explicitly set a partitioning
configuration should be simple. The examples below add hash partitioning, but
you can also specify range partitioning or a combination of range and hash
partitioning. See the &lt;a href=&quot;http://getkudu.io/docs/schema_design.html#data-distribution&quot;&gt;schema design
guide&lt;/a&gt; for more
advanced configurations.&lt;/p&gt;
&lt;h1 id=&quot;c-client&quot;&gt;C++ Client&lt;/h1&gt;
&lt;p&gt;With the C++ client, creating a new table with hash partitions is as simple as
calling &lt;code&gt;KuduTableCreator::add_hash_partitions&lt;/code&gt; with the columns to hash and the
number of buckets to use:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;cpp
unique_ptr&amp;lt;KuduTableCreator&amp;gt; table_creator(my_client-&amp;gt;NewTableCreator());
Status create_status = table_creator-&amp;gt;table_name(&quot;my-table&quot;)
.schema(my_schema)
.add_hash_partitions({ &quot;key_column_a&quot;, &quot;key_column_b&quot; }, 16)
.Create();
if (!create_status.ok()) { /* handle error */ }
&lt;/code&gt;&lt;/p&gt;
&lt;h1 id=&quot;java-client&quot;&gt;Java Client&lt;/h1&gt;
&lt;p&gt;And similarly, in Java:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;java
List&amp;lt;String&amp;gt; hashColumns = new ArrayList&amp;lt;&amp;gt;();
hashColumns.add(&quot;key_column_a&quot;);
hashColumns.add(&quot;key_column_b&quot;);
CreateTableOptions options = new CreateTableOptions().addHashPartitions(hashColumns, 16);
myClient.createTable(&quot;my-table&quot;, my_schema, options);
&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;In the examples above, if the hash partition configuration is omitted the create
table operation will fail with the error &lt;code&gt;Table partitioning must be specified
using setRangePartitionColumns or addHashPartitions&lt;/code&gt;. In the Java client this
manifests as a thrown &lt;code&gt;IllegalArgumentException&lt;/code&gt;, while in the C++ client it is
returned as a &lt;code&gt;Status::InvalidArgument&lt;/code&gt;.&lt;/p&gt;
&lt;h1 id=&quot;impala&quot;&gt;Impala&lt;/h1&gt;
&lt;p&gt;When creating Kudu tables with Impala, the formerly optional &lt;code&gt;DISTRIBUTE BY&lt;/code&gt;
clause is now required:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SQL
CREATE TABLE my_table (key_column_a STRING, key_column_b STRING, other_column STRING)
DISTRIBUTE BY HASH (key_column_a, key_column_b) INTO 16 BUCKETS
TBLPROPERTIES(
'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
'kudu.table_name' = 'my_table',
'kudu.master_addresses' = 'kudu-master.example.com:7051',
'kudu.key_columns' = 'key_column_a,key_column_b'
);
&lt;/code&gt;&lt;/p&gt;</content><author><name>Dan Burkert</name></author><summary>The upcoming Apache Kudu (incubating) 0.9 release is changing the default
partitioning configuration for new tables. This post will introduce the change,
explain the motivations, and show examples of how code can be updated to work
with the new release.</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update June 1, 2016</title><link href="/2016/06/01/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update June 1, 2016" /><published>2016-06-01T00:00:00-07:00</published><updated>2016-06-01T00:00:00-07:00</updated><id>/2016/06/01/weekly-update</id><content type="html" xml:base="/2016/06/01/weekly-update.html">&lt;p&gt;Welcome to the eleventh edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Jean-Daniel Cryans, the release manager for 0.9.0, &lt;a href=&quot;http://mail-archives.apache.org/mod_mbox/incubator-kudu-dev/201605.mbox/%3CCAGpTDNe_gV5TTsJQSjx_Q-hSGjK9TesWkyP-k9rnhd0mBtYAYg%40mail.gmail.com%3E&quot;&gt;indicated&lt;/a&gt;
that the release is almost ready and the first release candidate will be put up for vote this
week.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Dan Burkert pushed &lt;a href=&quot;http://gerrit.cloudera.org:8080/3131&quot;&gt;a change&lt;/a&gt; that disallows default
partitioning when creating a new table. This is due to many reports from users experiencing bad
performance because their table was created with only one tablet. Kudu will now force users to
partition their tables.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd Lipcon ran YCSB stress tests on a cluster and discovered that compactions were taking hours
instead of seconds. He pushed &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3221/&quot;&gt;a change&lt;/a&gt; that solves
the issue as part of our &lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-749&quot;&gt;general effort&lt;/a&gt; to
improve performance for zipfian update workloads.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd also &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3186/&quot;&gt;changed&lt;/a&gt; some flush-related defaults to
encourage parallel IO and larger flushes. This is based on his previous work that he documented
in this &lt;a href=&quot;http://getkudu.io/2016/04/26/ycsb.html&quot;&gt;blog post&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Will Berkeley made a few improvements last week, but &lt;a href=&quot;http://gerrit.cloudera.org:8080/3199&quot;&gt;one&lt;/a&gt;
we’d like to call out is that he removed the Java kudu-mapreduce module’s dependency on Hadoop’s
hadoop-common test jar. This solved build issues while also removing a nasty dependency.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>Welcome to the eleventh edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update May 23, 2016</title><link href="/2016/05/23/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update May 23, 2016" /><published>2016-05-23T00:00:00-07:00</published><updated>2016-05-23T00:00:00-07:00</updated><id>/2016/05/23/weekly-update</id><content type="html" xml:base="/2016/05/23/weekly-update.html">&lt;p&gt;Welcome to the tenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;kudu-related-podcast&quot;&gt;Kudu-related podcast&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Two committers, Mike Percy and Dan Burkert, appeared on the
&lt;a href=&quot;https://developer.ibm.com/tv/apachecon-apache-projects/&quot;&gt;IBM New Builders podcast&lt;/a&gt;
to talk about Apache Kudu, how they got involved, and what sort of
workloads it is best suited for.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Jean-Daniel Cryans is again acting as the release manager for the upcoming
0.9.0 release. The git branch for 0.9 has now been cut, and only bug fixes
or small improvements will be committed to that branch between now and the
first release candidate.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Since Kudu’s initial release, one of the most commonly requested features
has been support for the &lt;code&gt;UPSERT&lt;/code&gt; operation. &lt;code&gt;UPSERT&lt;/code&gt; is known in some other
databases as &lt;code&gt;INSERT ... ON DUPLICATE KEY UPDATE&lt;/code&gt;. This operation has the
semantics of an &lt;code&gt;INSERT&lt;/code&gt; if no row already exists with the provided primary
key. Otherwise, it replaces the existing row with the new values.&lt;/p&gt;
&lt;p&gt;This week, several developers collaborated to add support for this operation.
Todd Lipcon implemented
&lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3101/&quot;&gt;support on the server side&lt;/a&gt;,
C++ client, and &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3128/&quot;&gt;Python client&lt;/a&gt;.
Jean-Daniel Cryans added support in the
&lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3123/&quot;&gt;Java client&lt;/a&gt;. Ara Ebrahimi
and Will Berkeley have started working on
&lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3145/&quot;&gt;integrating upsert support into the Flume sink&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mike Percy started working on support for &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3135/&quot;&gt;basic disk
space reservations&lt;/a&gt;
in the tablet server. This feature will cause the tablet server to stop
writing to a disk before it’s full, preventing crashes due to running
out of space.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Chris George and Andy Grove collaborated on support for &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2992/&quot;&gt;insertions and
updates in the Spark DataSource&lt;/a&gt;,
and the patch was committed towards the end of the week. Brent Gardner
has also been helping with the Spark integration, and fixed an important
&lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-1453&quot;&gt;connection leak bug&lt;/a&gt;
in the initial implementation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;David Alves worked on reviving a 7-month-old patch by Jingkai Yuan which
implements an &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/1210/&quot;&gt;integer delta encoding scheme&lt;/a&gt;
that is meant to be efficient both in terms of CPU and disk space. This
encoding scheme is also designed to take advantage of modern CPU instruction sets
such as AVX and AVX2.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upcoming-talks-and-meetups&quot;&gt;Upcoming talks and meetups&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Ryan Bosshart will be presenting Kudu at the &lt;a href=&quot;http://www.meetup.com/DFW-Cloudera-User-Group/events/230547045/&quot;&gt;Dallas/Fort Worth
Cloudera User Group&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Todd Lipcon</name></author><summary>Welcome to the tenth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update May 16, 2016</title><link href="/2016/05/16/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update May 16, 2016" /><published>2016-05-16T00:00:00-07:00</published><updated>2016-05-16T00:00:00-07:00</updated><id>/2016/05/16/weekly-update</id><content type="html" xml:base="/2016/05/16/weekly-update.html">&lt;p&gt;Welcome to the ninth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Development and code reviews continued on Sameer Abhyankar’s patch which
adds support for pushing down &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2986/&quot;&gt;‘IN’ predicates&lt;/a&gt;
to the Kudu tablet servers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd Lipcon and Binglin Chang have continued to work on improving throughput
for a high-throughput random-read use case. Initial profiling indicated that the
RPC system was a bottleneck, and patches have started to land which improve
the throughput:&lt;/p&gt;
&lt;p&gt;The largest bottleneck was in the queue which transfers RPC calls from the
libev “reactor” threads which perform network IO to the “worker” threads
which service the actual requests. Binglin borrowed some ideas from Facebook’s
&lt;a href=&quot;https://github.com/facebook/folly&quot;&gt;folly&lt;/a&gt; library, and implemented an
&lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2938/&quot;&gt;improved queue&lt;/a&gt;
which reduces context switches and lock contention while also
improving CPU cache locality of the worker threads.&lt;/p&gt;
&lt;p&gt;Todd identified that the hash function used to map connections to reactor
threads was poor, resulting in uneven load distribution across cores.
A &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2939/&quot;&gt;simple patch to change the hashcode implementation&lt;/a&gt;
improved the distribution substantially.&lt;/p&gt;
&lt;p&gt;With just these patches, an RPC stress benchmark was improved from about 202K RPCs/second
to 768K RPCs/second on a 24-core machine. Further improvements are in flight
and under review this week.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Zhen Zhang is continuing to focus on adding more visibility into
performance and resource usage by adding the ability to propagate various
per-operation metrics from the server side back to the client. His latest patch
under review &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3013/&quot;&gt;exposes scanner cache hit rate metrics&lt;/a&gt;
to the client.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd Lipcon and Sarah Jelinek continue to make progress on the
implementation of a persistent-memory backed block cache.
This week a &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2957/&quot;&gt;substantial refactor to the block cache interface&lt;/a&gt;
was committed in preparation for the &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2593/&quot;&gt;NVM cache itself&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Congratulations to Will Berkeley, a new contributor who has been
submitting small fixes and improvements such as
&lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/3022/&quot;&gt;exposing table partitioning information in the master web UI&lt;/a&gt;.
Thanks, Will!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;David Alves has been continuing to make progress towards his implementation of
the &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2642/&quot;&gt;Replay Cache&lt;/a&gt;.
This week, he refactored and cleaned up much of the client code involving
error handling and retrying write operations, in preparation for inserting
unique identifiers for these and other operations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Chris George has continued to work on the Spark DataSource implementation.
In particular, work is progressing on support for &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2992/&quot;&gt;inserting and updating
rows via Spark&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd Lipcon and Mike Percy both committed improvements which will help speed up
startup. Measurements on a cluster where each node stores a few TB of data
showed a 3x improvement in startup time.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upcoming-talks-and-meetups&quot;&gt;Upcoming talks and meetups&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Mladen Kovacevic will be presenting Kudu at the
&lt;a href=&quot;http://www.meetup.com/Big-Data-Montreal/events/230879277/?eventId=230879277&quot;&gt;Big Data Montreal&lt;/a&gt;
meetup.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Todd Lipcon</name></author><summary>Welcome to the ninth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry><entry><title>Apache Kudu (incubating) Weekly Update May 9, 2016</title><link href="/2016/05/09/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu (incubating) Weekly Update May 9, 2016" /><published>2016-05-09T00:00:00-07:00</published><updated>2016-05-09T00:00:00-07:00</updated><id>/2016/05/09/weekly-update</id><content type="html" xml:base="/2016/05/09/weekly-update.html">&lt;p&gt;Welcome to the eighth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;If you find this post useful, please let us know by emailing the
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#117;&amp;#115;&amp;#101;&amp;#114;&amp;#064;&amp;#107;&amp;#117;&amp;#100;&amp;#117;&amp;#046;&amp;#105;&amp;#110;&amp;#099;&amp;#117;&amp;#098;&amp;#097;&amp;#116;&amp;#111;&amp;#114;&amp;#046;&amp;#097;&amp;#112;&amp;#097;&amp;#099;&amp;#104;&amp;#101;&amp;#046;&amp;#111;&amp;#114;&amp;#103;&quot;&gt;kudu-user mailing list&lt;/a&gt; or
tweeting at &lt;a href=&quot;https://twitter.com/ApacheKudu&quot;&gt;@ApacheKudu&lt;/a&gt;. Similarly, if you’re
aware of some Kudu news we missed, let us know so we can cover it in
a future post.&lt;/p&gt;
&lt;h2 id=&quot;development-discussions-and-code-in-progress&quot;&gt;Development discussions and code in progress&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Sameer Abhyankar posted a &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2986/&quot;&gt;patch&lt;/a&gt;
for &lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-1363&quot;&gt;KUDU-1363&lt;/a&gt;
that adds the ability to specify
&lt;a href=&quot;http://www.w3schools.com/sql/sql_in.asp&quot;&gt;IN&lt;/a&gt;-like predicates on column values.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Chris George and Andy Grove have both been adding new features in Kudu’s
Spark module such as methods to &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2981/&quot;&gt;create/delete tables&lt;/a&gt;
and &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2992/&quot;&gt;insert/update rows in the DataSource&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Todd Lipcon &lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-1437&quot;&gt;fixed a bug&lt;/a&gt; in RLE
encoding that was reported by Przemyslaw Maciolek. Thank you Przemyslaw for
reporting it and providing an easy way to reproduce it!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Adar Dembo is currently working on addressing the
&lt;a href=&quot;https://github.com/cloudera/kudu/blob/master/docs/design-docs/multi-master-1.0.md&quot;&gt;issues with multi-master&lt;/a&gt;
and early last week he got &lt;a href=&quot;http://gerrit.cloudera.org:8080/2879&quot;&gt;a&lt;/a&gt;
&lt;a href=&quot;http://gerrit.cloudera.org:8080/2928&quot;&gt;few&lt;/a&gt; &lt;a href=&quot;http://gerrit.cloudera.org:8080/2891&quot;&gt;patches&lt;/a&gt;
in that address some race conditions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Zhen Zhang got &lt;a href=&quot;http://gerrit.cloudera.org:8080/#/c/2858/&quot;&gt;a first contribution&lt;/a&gt;
in with a patch that adds statistics in the Java client. In 0.9.0 it will be
possible to query the client to get things like the number of bytes written or
how many write operations were sent.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upcoming-talks-and-meetups&quot;&gt;Upcoming talks and meetups&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Dan Burkert and Mike Percy will present Kudu at the
&lt;a href=&quot;http://www.meetup.com/Vancouver-Spark/events/229692936/&quot;&gt;Vancouver Spark Meetup&lt;/a&gt; in Vancouver, BC,
on May 10.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Jean-Daniel Cryans</name></author><summary>Welcome to the eighth edition of the Kudu Weekly Update. This weekly blog post
covers ongoing development and news in the Apache Kudu (incubating) project.</summary></entry></feed>