| <?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="http://jekyllrb.com" version="2.5.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2017-09-13T14:09:28-07:00</updated><id>/</id><entry><title>Apache Kudu 1.5.0 released</title><link href="/2017/09/08/apache-kudu-1-5-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.5.0 released" /><published>2017-09-08T00:00:00-07:00</published><updated>2017-09-08T00:00:00-07:00</updated><id>/2017/09/08/apache-kudu-1-5-0-released</id><content type="html" xml:base="/2017/09/08/apache-kudu-1-5-0-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.5.0!</p> |
| |
| <p>Apache Kudu 1.5.0 is a minor release which offers several new features, |
| improvements, optimizations, and bug fixes.</p> |
| |
| <p>Highlights include:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>optimizations to improve write throughput and failover recovery times</li> |
| <li>the Raft consensus implementation has been made more resilient and flexible |
| through “tombstoned voting”, which allows Kudu to self-heal in more edge-case |
| scenarios</li> |
| <li>the number of threads used by Kudu servers has been further reduced, with |
| additional reductions planned for the future</li> |
| <li>a new configuration dashboard on the web UI which provides a high-level |
| summary of important configuration values</li> |
| <li>a new <code>kudu tablet move</code> command which moves a tablet replica from one tablet |
| server to another</li> |
| <li>a new <code>kudu local_replica data_size</code> command which summarizes the space usage |
| of a local tablet</li> |
| <li>all on-disk data is now checksummed by default, which provides error detection |
| for improved confidence when running Kudu on unreliable hardware</li> |
| </ul> |
| |
| <p>The above list of changes is non-exhaustive. Please refer to the |
| <a href="/releases/1.5.0/docs/release_notes.html">release notes</a> |
| for an expanded list of important improvements, bug fixes, and |
| incompatible changes before upgrading.</p> |
| |
| <ul> |
| <li>Download the <a href="/releases/1.5.0/">Kudu 1.5.0 source release</a></li> |
| <li>Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</li> |
| </ul></content><author><name>Dan Burkert</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.5.0! |
| |
| Apache Kudu 1.5.0 is a minor release which offers several new features, |
| improvements, optimizations, and bug fixes. |
| |
| Highlights include:</summary></entry><entry><title>Apache Kudu 1.4.0 released</title><link href="/2017/06/13/apache-kudu-1-4-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.4.0 released" /><published>2017-06-13T00:00:00-07:00</published><updated>2017-06-13T00:00:00-07:00</updated><id>/2017/06/13/apache-kudu-1-4-0-released</id><content type="html" xml:base="/2017/06/13/apache-kudu-1-4-0-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.4.0!</p> |
| |
| <p>Apache Kudu 1.4.0 is a minor release which offers several new features, |
| improvements, optimizations, and bug fixes.</p> |
| |
| <p>Highlights include:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>ability to alter storage attributes and default values for existing columns</li> |
| <li>a new C++ client API to efficiently map primary keys to their associated partitions |
| and hosts</li> |
| <li>support for long-running fault-tolerant scans in the Java client</li> |
| <li>a new <code>kudu fs check</code> command which can perform offline consistency checks |
| and repairs on the local on-disk storage of a Tablet Server or Master</li> |
| <li>many optimizations to reduce disk space usage, improve write throughput, |
| and improve throughput of background maintenance operations</li> |
| </ul> |
| |
| <p>The above list of changes is non-exhaustive. Please refer to the |
| <a href="/releases/1.4.0/docs/release_notes.html">release notes</a> |
| for an expanded list of important improvements, bug fixes, and |
| incompatible changes before upgrading.</p> |
| |
| <ul> |
| <li>Download the <a href="/releases/1.4.0/">Kudu 1.4.0 source release</a></li> |
| <li>Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</li> |
| </ul></content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.4.0! |
| |
| Apache Kudu 1.4.0 is a minor release which offers several new features, |
| improvements, optimizations, and bug fixes. |
| |
| Highlights include:</summary></entry><entry><title>Apache Kudu 1.3.1 released</title><link href="/2017/04/19/apache-kudu-1-3-1-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.3.1 released" /><published>2017-04-19T00:00:00-07:00</published><updated>2017-04-19T00:00:00-07:00</updated><id>/2017/04/19/apache-kudu-1-3-1-released</id><content type="html" xml:base="/2017/04/19/apache-kudu-1-3-1-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.3.1!</p> |
| |
| <p>Apache Kudu 1.3.1 is a bug fix release which fixes critical issues discovered |
| in Apache Kudu 1.3.0. In particular, this fixes a bug in which data could be |
| incorrectly deleted after certain sequences of node failures. Several other |
| bugs are also fixed. See the release notes for details.</p> |
| |
| <p>Users of Kudu 1.3.0 are encouraged to upgrade to 1.3.1 immediately.</p> |
| |
| <ul> |
| <li>Download the <a href="/releases/1.3.1/">Kudu 1.3.1 source release</a></li> |
| <li>Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</li> |
| </ul></content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.3.1! |
| |
| Apache Kudu 1.3.1 is a bug fix release which fixes critical issues discovered |
| in Apache Kudu 1.3.0. In particular, this fixes a bug in which data could be |
| incorrectly deleted after certain sequences of node failures. Several other |
| bugs are also fixed. See the release notes for details. |
| |
| Users of Kudu 1.3.0 are encouraged to upgrade to 1.3.1 immediately. |
| |
| |
| Download the Kudu 1.3.1 source release |
| Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</summary></entry><entry><title>Apache Kudu 1.3.0 released</title><link href="/2017/03/20/apache-kudu-1-3-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.3.0 released" /><published>2017-03-20T00:00:00-07:00</published><updated>2017-03-20T00:00:00-07:00</updated><id>/2017/03/20/apache-kudu-1-3-0-released</id><content type="html" xml:base="/2017/03/20/apache-kudu-1-3-0-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.3.0!</p> |
| |
| <p>Apache Kudu 1.3 is a minor release which adds various new features, |
| improvements, bug fixes, and optimizations on top of Kudu |
| 1.2. Highlights include:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>significantly improved support for security, including Kerberos |
| authentication, TLS encryption, and coarse-grained (cluster-level) |
| authorization</li> |
| <li>automatic garbage collection of historical versions of data</li> |
| <li>lower space consumption and better performance in default |
| configurations</li> |
| </ul> |
| |
| <p>The above list of changes is non-exhaustive. Please refer to the |
| <a href="/releases/1.3.0/docs/release_notes.html">release notes</a> |
| for an expanded list of important improvements, bug fixes, and |
| incompatible changes before upgrading.</p> |
| |
| <p>Thanks to the 25 developers who contributed code or documentation to |
| this release!</p> |
| |
| <ul> |
| <li>Download the <a href="/releases/1.3.0/">Kudu 1.3.0 source release</a></li> |
| <li>Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</li> |
| </ul></content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.3.0! |
| |
| Apache Kudu 1.3 is a minor release which adds various new features, |
| improvements, bug fixes, and optimizations on top of Kudu |
| 1.2. Highlights include:</summary></entry><entry><title>Apache Kudu 1.2.0 released</title><link href="/2017/01/20/apache-kudu-1-2-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.2.0 released" /><published>2017-01-20T00:00:00-08:00</published><updated>2017-01-20T00:00:00-08:00</updated><id>/2017/01/20/apache-kudu-1-2-0-released</id><content type="html" xml:base="/2017/01/20/apache-kudu-1-2-0-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.2.0!</p> |
| |
| <p>The new release adds several new features and improvements, including:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>User data such as row contents is now redacted from logging statements.</li> |
| <li>Kudu’s ability to provide strong consistency guarantees has been substantially improved.</li> |
| <li>Various performance improvements in metadata management as well as optimizations for BITSHUFFLE encoding on AVX2-capable hosts.</li> |
| </ul> |
| |
| <p>Additionally, 1.2.0 fixes a number of important bugs, including:</p> |
| |
| <ul> |
| <li>Kudu now automatically limits its usage of file descriptors, preventing crashes due to ulimit exhaustion.</li> |
| <li>Fixed a long-standing issue which could cause ext4 file system corruption on RHEL 6.</li> |
| <li>Fixed a disk space leak.</li> |
| <li>Several fixes for correctness in various edge cases.</li> |
| </ul> |
| |
| <p>The above list of changes is non-exhaustive. Please refer to the |
| <a href="/releases/1.2.0/docs/release_notes.html">release notes</a> |
| for an expanded list of important improvements, bug fixes, and |
| incompatible changes before upgrading.</p> |
| |
| <ul> |
| <li>Download the <a href="/releases/1.2.0/">Kudu 1.2.0 source release</a></li> |
| <li>Convenience binary artifacts for the Java client and various Java |
| integrations (e.g. Spark, Flume) are also now available via the ASF Maven |
| repository.</li> |
| </ul></content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.2.0! |
| |
| The new release adds several new features and improvements, including:</summary></entry><entry><title>Apache Kudu Weekly Update November 15th, 2016</title><link href="/2016/11/15/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu Weekly Update November 15th, 2016" /><published>2016-11-15T00:00:00-08:00</published><updated>2016-11-15T00:00:00-08:00</updated><id>/2016/11/15/weekly-update</id><content type="html" xml:base="/2016/11/15/weekly-update.html"><p>Welcome to the twenty-fourth edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</p> |
| |
| <!--more--> |
| |
| <h2 id="project-news">Project news</h2> |
| |
| <ul> |
| <li> |
| <p>The first release candidate for Kudu 1.1.0 is <a href="http://mail-archives.apache.org/mod_mbox/kudu-dev/201611.mbox/%3CCADY20s7ZKZkPmUEcTexW%3D%2B_%2BLnDY2hABZg0-UZD3jvWAs9-pog%40mail.gmail.com%3E">now available</a>.</p> |
| |
| <dl> |
| <dt>Noteworthy new features/improvements:</dt> |
| <dd> |
| <ul> |
| <li>The Python client has been brought to feature parity with the C++ and Java clients.</li> |
| <li>Support for IN LIST predicates.</li> |
| <li>Java client now features client-side tracing.</li> |
| <li>Kudu now publishes jar files for Spark 2.0 compiled with Scala 2.11.</li> |
| <li>Kudu’s Raft implementation now features pre-elections. In our tests this has greatly improved stability.</li> |
| </ul> |
| </dd> |
| </dl> |
| |
| <p>Community developers and users are encouraged to download the source |
| tarball and vote on the release.</p> |
| |
| <p>For more information on what’s new, check out the |
| <a href="https://github.com/apache/kudu/blob/branch-1.1.x/docs/release_notes.adoc">release notes</a>. |
| <em>Note:</em> some links from these in-progress release notes will not be live until the |
| release itself is published.</p> |
| </li> |
| <li> |
| <p>On November 7th, the Kudu PMC announced that Jordan Birdsell, from State Farm, had been voted |
| in as a new committer and PMC member.</p> |
| |
| <p>Jordan’s contributions include extensive work on the Python client, giving it some much-needed |
| love and bringing it to feature parity with the other clients.</p> |
| |
| <p>Besides his extensive code contributions, Jordan has also been active in reviewing other |
| developers’ patches and helping the community in general, on Slack and other channels.</p> |
| |
| <p>Jordan has been doing great work and the Kudu PMC was pleased to recognize his contributions |
| with committership.</p> |
| </li> |
| <li> |
| <p>Mike Percy will be presenting Kudu on Wednesday, November 16th, at <a href="https://apachebigdataeu2016.sched.org/">Apache Big Data Europe, in Seville</a>.</p> |
| </li> |
| <li> |
| <p>Congratulations to Haijie Hong for his <a href="https://gerrit.cloudera.org/#/c/4822/">first contribution to Kudu</a>! |
| Haijie fixed some edge cases in BitWriter that were blocking RLE usage for 64 bit types.</p> |
| </li> |
| <li> |
| <p>Congratulations to Maxim Smyatkin for his <a href="https://gerrit.cloudera.org/#/q/Maxim">first contributions to Kudu</a>! |
| Maxim has contributed several patches helping with debug and cleanup.</p> |
| </li> |
| </ul> |
| |
| <h2 id="development-discussions-and-code-in-progress">Development discussions and code in progress</h2> |
| |
| <ul> |
| <li> |
| <p>A lot of progress has been made towards the goals that were set in the scope docs introduced in |
| the last couple of posts. Specifically:</p> |
| |
| <ul> |
| <li> |
| <p>Dan Burkert, Todd Lipcon and Alexey Serbin have doubled down on the security effort. They have |
| been working on enabling Kerberos authentication and rpc encryption. The <a href="https://docs.google.com/document/d/1cPNDTpVkIUo676RlszpTF1gHZ8l0TdbB7zFBAuOuYUw/edit#heading=h.gsibhnd5dyem">security scope doc</a> |
| has been updated with the latest plans for security and many patches have been merged already.</p> |
| </li> |
| <li> |
| <p>David Alves has continued the work on <a href="https://s.apache.org/7VCo">consistency</a>. Up for review |
| and partially pushed is a patch series to address row history loss if a row is deleted and then |
| re-inserted. Also in progress is work to make sure that scans at a snapshot from followers |
| always return the same data as if they were executed on the leader. This helps with Read-Your-Writes |
| when reading from lagging replicas.</p> |
| </li> |
| <li> |
| <p>Adar Dembo has been making good progress <a href="https://s.apache.org/uOOt">addressing issues seen with the LogBlockManager</a>. |
| A series of patches have been merged with various fixes to block managers in general and to the |
| log block manager in particular.</p> |
| </li> |
| <li> |
| <p>Dinesh Bhat has been working on improving the manual recovery tools for Kudu. Namely, he has |
| added a tool to force a remote replica copy to a destination server, and a tool to delete a |
| local replica of a tablet. The latter is useful when a tablet cannot come up due to bad state.</p> |
| </li> |
| <li> |
| <p>Jean-Daniel Cryans has implemented RPC tracing for the Java client, greatly improving |
| debuggability. JD has also added ReplicaSelection to the Java client, which allows scans to run |
| on replicas other than the leader and should be of great help for load balancing.</p> |
| </li> |
| <li> |
| <p>Besides the feature parity contributions, Jordan Birdsell has laid out a |
| <a href="http://mail-archives.apache.org/mod_mbox/kudu-dev/201611.mbox/%3CCAGaaj_VKfB4mhu6eExHCWo0%3D6Qd0HFWy7bg9e39JgOaFPGJ1nQ%40mail.gmail.com%3E">roadmap for Python client work</a> |
| for the 1.2 release. Feedback from other Python client users is certainly appreciated.</p> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| |
| <p>Want to learn more about a specific topic from this blog post? Shoot an email to the |
| <a href="&#109;&#097;&#105;&#108;&#116;&#111;:&#117;&#115;&#101;&#114;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">kudu-user mailing list</a> or |
| tweet at <a href="https://twitter.com/ApacheKudu">@ApacheKudu</a>. Similarly, if you’re |
| aware of some Kudu news we missed, let us know so we can cover it in |
| a future post.</p></content><author><name>David Alves</name></author><summary>Welcome to the twenty-fourth edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</summary></entry><entry><title>Apache Kudu Weekly Update November 1st, 2016</title><link href="/2016/11/01/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu Weekly Update November 1st, 2016" /><published>2016-11-01T00:00:00-07:00</published><updated>2016-11-01T00:00:00-07:00</updated><id>/2016/11/01/weekly-update</id><content type="html" xml:base="/2016/11/01/weekly-update.html"><p>Welcome to the twenty-third edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</p> |
| |
| <!--more--> |
| |
| <h2 id="development-discussions-and-code-in-progress">Development discussions and code in progress</h2> |
| |
| <ul> |
| <li> |
| <p>Dan Burkert committed a piece of test infrastructure |
| called “MiniKDC” for both Java and C++. The MiniKDC sets up a short-lived |
| Kerberos environment in the context of a single test case, making it |
| easy to build tests of security features without requiring any special |
| infrastructure on the part of the developer.</p> |
| </li> |
| <li> |
| <p>Todd Lipcon added Kerberos (GSSAPI) support to Kudu’s |
| RPC system, allowing servers to authenticate the user principal of |
| any inbound RPC connection. He also integrated Kudu’s C++ “MiniCluster” |
| test infrastructure to allow starting a Kerberized cluster in the |
| context of a test.</p> |
| </li> |
| <li> |
| <p>Dan, Todd, and Alexey Serbin have been iterating on a more detailed |
| <a href="https://docs.google.com/document/d/1Yu4iuIhaERwug1vS95yWDd_WzrNRIKvvVGUb31y-_mY/edit#">design doc</a> |
| for authentication in Kudu. This doc outlines the various non-Kerberos |
| methods that Kudu will use for authentication as well as how TLS will |
| be used to encrypt and authenticate some types of connections.</p> |
| </li> |
| <li> |
| <p>Part of the above design document involves Kudu servers generating and |
| signing X509 certificates on the fly to use for authenticated TLS. |
| Alexey has been working on a large <a href="https://gerrit.cloudera.org/#/c/4799/">patch</a> |
| which uses OpenSSL to provide key generation and signing functionality.</p> |
| </li> |
| <li> |
| <p>Sailesh Mukil has been working on adding support for |
| <a href="https://gerrit.cloudera.org/#/c/4789/">TLS in Kudu’s RPC system</a>. The TLS |
| support is a critical part of the overall design for security. This patch |
| has gone through several rounds of review and is nearing completion.</p> |
| </li> |
| <li> |
| <p>JD Cryans has been continuing to improve the Java client, including adding |
| the ability to specify that the client would like to read the “closest” |
| replica (e.g. reading from a local copy if possible). Additionally, |
| JD has been working on some basic <a href="https://gerrit.cloudera.org/#/c/4781/">tracing support</a> |
| within the Java client. This tracing aims to make timeouts easier to understand |
| and diagnose.</p> |
| </li> |
| <li> |
| <p>Jordan Birdsell committed 9 more patches to the Python client, bringing it |
| very close to feature parity with C++. Jordan has a few more patches in flight |
| which should complete this long-running effort.</p> |
| </li> |
| <li> |
| <p>Congrats to new contributor Haijie Hong who committed his first patch this week. |
| Haijie added support for <a href="https://gerrit.cloudera.org/#/c/4822/">run-length encoding 64-bit integers</a>.</p> |
| </li> |
| <li> |
| <p>Will Berkeley picked back up work on <a href="https://gerrit.cloudera.org/#/c/4310/">improving the capability of ALTER |
| TABLE</a>. His in-flight patch adds support |
| for changing the default value of a column as well as changing storage attributes |
| such as desired block size, encoding, and compression.</p> |
| </li> |
| <li> |
| <p>Adar Dembo has been working on a series of patches for the Block Manager, the |
| component of Kudu which is responsible for laying out blocks on the local |
| file system. His patch series consists of a number of refactors to clean up |
| and improve the code structure, followed by an <a href="https://gerrit.cloudera.org/#/c/4848/">improvement to reduce file system |
| fragmentation</a>.</p> |
| </li> |
| <li> |
| <p>David Alves has been working on a <a href="https://gerrit.cloudera.org/#/c/4819/">patch series</a> |
| which adds support for storing ‘REINSERT’ deltas on disk. These records are |
| generated if a user inserts a row, deletes it, and inserts a new row with the |
| same primary key. Current versions of Kudu lose track of the history of the |
| prior version of the row in this scenario, which prevents correct snapshot reads. |
| David’s patch series fixes this.</p> |
| </li> |
| </ul> |
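<p>The insert, delete, re-insert sequence described in the last item can be illustrated with a
small Python sketch. This is a toy in-memory model, not Kudu’s on-disk delta format: the
<code>RowHistory</code> class, its method names, and the integer timestamps are all hypothetical
and chosen only for illustration.</p>

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class RowHistory:
    """Toy mutation log for one primary key (hypothetical, not Kudu's format)."""

    timestamps: list = field(default_factory=list)  # ascending logical timestamps
    values: list = field(default_factory=list)      # row value, or None after a DELETE

    def _is_live(self):
        # The row exists if the most recent mutation left a value behind.
        return bool(self.values) and self.values[-1] is not None

    def insert(self, ts, value):
        """Record an insert; callers are assumed to pass increasing timestamps."""
        if self._is_live():
            raise KeyError("row already exists")
        # An insert that follows a delete of the same key is a REINSERT,
        # appended to the existing history instead of replacing it.
        kind = "REINSERT" if self.values else "INSERT"
        self.timestamps.append(ts)
        self.values.append(value)
        return kind

    def delete(self, ts):
        if not self._is_live():
            raise KeyError("row does not exist")
        self.timestamps.append(ts)
        self.values.append(None)

    def read_at(self, snapshot_ts):
        """Return the value visible at snapshot_ts, or None if the row is absent."""
        idx = bisect.bisect_right(self.timestamps, snapshot_ts)
        return self.values[idx - 1] if idx else None


history = RowHistory()
history.insert(10, "v1")   # INSERT
history.delete(20)         # DELETE
history.insert(30, "v2")   # REINSERT
print(history.read_at(15), history.read_at(25), history.read_at(35))
```

<p>Because the earlier mutations are kept, a snapshot read at timestamp 15 still sees the original
value, a read at 25 sees no row, and a read at 35 sees the re-inserted value. Dropping the
history on re-insert would make the first two reads impossible to answer correctly.</p>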
| |
| <p>Want to learn more about a specific topic from this blog post? Shoot an email to the |
| <a href="&#109;&#097;&#105;&#108;&#116;&#111;:&#117;&#115;&#101;&#114;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">kudu-user mailing list</a> or |
| tweet at <a href="https://twitter.com/ApacheKudu">@ApacheKudu</a>. Similarly, if you’re |
| aware of some Kudu news we missed, let us know so we can cover it in |
| a future post.</p></content><author><name>Todd Lipcon</name></author><summary>Welcome to the twenty-third edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</summary></entry><entry><title>Apache Kudu Weekly Update October 20th, 2016</title><link href="/2016/10/20/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu Weekly Update October 20th, 2016" /><published>2016-10-20T00:00:00-07:00</published><updated>2016-10-20T00:00:00-07:00</updated><id>/2016/10/20/weekly-update</id><content type="html" xml:base="/2016/10/20/weekly-update.html"><p>Welcome to the twenty-second edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</p> |
| |
| <!--more--> |
| |
| <h2 id="project-news">Project news</h2> |
| |
| <ul> |
| <li> |
| <p>Kudu 1.0.1 was <a href="http://mail-archives.apache.org/mod_mbox/kudu-user/201610.mbox/%3CCALo2W-UgTa%2BX15_q_9FQpRUPWN53eyqFS10C5MXK1KpsFgqcyQ%40mail.gmail.com%3E">released</a> |
| on October 11th. This is a bug fix release which fixes several bugs found |
| in 1.0.0. See the <a href="http://kudu.apache.org/releases/1.0.1/docs/release_notes.html">Kudu 1.0.1 release notes</a> |
| for more details.</p> |
| </li> |
| <li> |
| <p>Todd Lipcon has proposed a <a href="https://lists.apache.org/thread.html/4c94d313e28381bb107682ffaf43adfd38bd7fb3b03c98e3c86c52e2@%3Cdev.kudu.apache.org%3E">release plan</a> |
| for the next few months. The proposal is to have a 1.1 release in mid-November and |
| a 1.2 release in mid-January. These would be time-based releases rather than |
| gated on any particular feature scope; however, it’s anticipated that several |
| new features and improvements will be ready in time for these releases.</p> |
| </li> |
| <li> |
| <p>Happy fourth birthday to the Kudu project! The initial commit was made |
| on October 11th, 2012! Since then we’ve had 4888 more commits by 60 |
| authors!</p> |
| </li> |
| </ul> |
| |
| <h2 id="development-discussions-and-code-in-progress">Development discussions and code in progress</h2> |
| |
| <ul> |
| <li>As mentioned last week, a lot of contributors have been collaborating on |
| design documents for upcoming work. Here’s the complete list of in-flight |
| documents, along with the primary authors of these docs: |
| <ul> |
| <li><a href="https://docs.google.com/document/d/1cPNDTpVkIUo676RlszpTF1gHZ8l0TdbB7zFBAuOuYUw/edit#heading=h.gsibhnd5dyem">Security features</a> (Todd Lipcon)</li> |
| <li><a href="https://goo.gl/wP5BJb">Improved disk-failure handling</a> (Dinesh Bhat)</li> |
| <li><a href="https://s.apache.org/7K48">Tools for manual recovery from corruption</a> (Mike Percy and Dinesh Bhat)</li> |
| <li><a href="https://s.apache.org/uOOt">Addressing issues seen with the LogBlockManager</a> (Adar Dembo)</li> |
| <li><a href="https://s.apache.org/7VCo">Providing proper snapshot/serializable consistency</a> (David Alves)</li> |
| <li><a href="https://s.apache.org/ARUP">Improving re-replication of under-replicated tablets</a> (Mike Percy)</li> |
| <li><a href="https://docs.google.com/document/d/1066W63e2YUTNnecmfRwgAHghBPnL1Pte_gJYAaZ_Bjo/edit">Avoiding Raft election storms</a> (Todd Lipcon)</li> |
| <li><a href="https://s.apache.org/kudu-backup-scope">Backup and bulk load</a> (Dan Burkert)</li> |
| <li><a href="https://s.apache.org/SM6V">Improving diagnosability of client errors</a> (Alexey Serbin)</li> |
| </ul> |
| |
| <p>In many cases, work is now progressing on implementation of these ideas, |
| but these are considered living documents. It’s not too late to add your |
| comments or volunteer to help out.</p> |
| </li> |
| <li> |
| <p>JD Cryans has been working on cleaning up the Java client. Several complex pieces |
| of code were completely removed, and other parts were refactored into new |
| standalone classes for better modularity. Along the way, JD also |
| <a href="http://gerrit.cloudera.org:8080/4706">reduced lock contention</a> on a frequently-accessed |
| data structure.</p> |
| </li> |
| <li> |
| <p>Todd Lipcon implemented and committed Raft “pre-elections” as described in the |
| <a href="https://docs.google.com/document/d/1066W63e2YUTNnecmfRwgAHghBPnL1Pte_gJYAaZ_Bjo/edit">election storm mitigation design document</a>. |
| Initial experiments, detailed in the document, indicate that this will substantially |
| improve leader stability on clusters with overloaded disks and lots of tablets.</p> |
| |
| <p>Following this patch, Todd worked on some cleanup and refactoring of the Consensus |
| implementation, removing a bunch of dead code and splitting some classes up |
| into smaller pieces. This is preparing for some improvements in locking |
| granularity also described in the same document.</p> |
| </li> |
| <li> |
| <p>Dan Burkert and Todd Lipcon have started submitting patches to integrate Kerberos |
| authentication with Kudu’s RPC system. Dan posted a |
| <a href="https://gerrit.cloudera.org/#/c/4752/">patch</a> which adds “MiniKDC”, some test |
| infrastructure for starting and stopping a standalone Kerberos service in |
| the context of a test. Todd worked on adding |
| <a href="https://gerrit.cloudera.org/#/c/4763/">support for Kerberos authentication</a> |
| during RPC negotiation.</p> |
| |
| <p>These patches are just the beginning of the security work, but form an important |
| base to build on top of. The design uses Kerberos both as a mechanism to authenticate |
| clients as well as a way to mutually authenticate tablet servers with the master.</p> |
| </li> |
| </ul> |
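<p>For readers unfamiliar with the Raft “pre-elections” mentioned above, the general idea can be
sketched in a few lines of Python. This follows the commonly described Raft pre-vote scheme, not
Kudu’s actual implementation: the <code>Peer</code> class, the <code>run_pre_election</code>
function, and the simplified log comparison (last index only, ignoring the last entry’s term) are
assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical view of a voter's state; not a Kudu data structure."""

    term: int
    last_log_index: int
    hears_leader: bool  # True if a leader heartbeat arrived recently

    def would_vote(self, proposed_term, candidate_last_index):
        # A pre-vote request only asks a question; it changes no peer state.
        return (not self.hears_leader
                and proposed_term > self.term
                and candidate_last_index >= self.last_log_index)


def run_pre_election(candidate_term, candidate_last_index, peers):
    """Return the term the candidate should use after the pre-election.

    The term is bumped only if a majority (counting the candidate itself)
    indicates it would grant a vote in a real election.
    """
    proposed = candidate_term + 1
    yes_votes = 1 + sum(p.would_vote(proposed, candidate_last_index) for p in peers)
    cluster_size = len(peers) + 1
    if 2 * yes_votes > cluster_size:
        return proposed       # safe to start a real election at the new term
    return candidate_term     # stay put; the existing leader is not disturbed


# A node that merely missed some heartbeats cannot bump the term while
# the other two replicas still hear from a healthy leader.
healthy = [Peer(5, 10, True), Peer(5, 10, True)]
print(run_pre_election(5, 10, healthy))
```

<p>In this sketch the candidate keeps term 5 when its peers still hear from a leader, so a slow or
briefly isolated replica does not force the rest of the configuration into a new election.</p>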
| |
| <p>Want to learn more about a specific topic from this blog post? Shoot an email to the |
| <a href="&#109;&#097;&#105;&#108;&#116;&#111;:&#117;&#115;&#101;&#114;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">kudu-user mailing list</a> or |
| tweet at <a href="https://twitter.com/ApacheKudu">@ApacheKudu</a>. Similarly, if you’re |
| aware of some Kudu news we missed, let us know so we can cover it in |
| a future post.</p></content><author><name>Todd Lipcon</name></author><summary>Welcome to the twenty-second edition of the Kudu Weekly Update. This weekly blog post |
| covers ongoing development and news in the Apache Kudu project.</summary></entry><entry><title>Apache Kudu Weekly Update October 11th, 2016</title><link href="/2016/10/11/weekly-update.html" rel="alternate" type="text/html" title="Apache Kudu Weekly Update October 11th, 2016" /><published>2016-10-11T00:00:00-07:00</published><updated>2016-10-11T00:00:00-07:00</updated><id>/2016/10/11/weekly-update</id><content type="html" xml:base="/2016/10/11/weekly-update.html"><p>Welcome to the twenty-first edition of the Kudu Weekly Update. Astute |
| readers will notice that the weekly blog posts have been not-so-weekly |
| of late – in fact, it has been nearly two months since the previous post |
| as I and others have focused on releases, conferences, etc.</p> |
| |
| <p>So, rather than covering just this past week, this post will cover highlights |
| of the progress since the 1.0 release in mid-September. If you’re interested |
| in learning about progress prior to that release, check the |
| <a href="http://kudu.apache.org/releases/1.0.0/docs/release_notes.html">release notes</a>.</p> |
| |
| <!--more--> |
| |
| <h2 id="project-news">Project news</h2> |
| |
| <ul> |
| <li> |
| <p>On September 12th, the Kudu PMC announced that Alexey Serbin and Will |
| Berkeley had been voted in as new committers and PMC members.</p> |
| |
| <p>Alexey’s contributions prior to committership included |
| <a href="https://gerrit.cloudera.org/#/c/3952/">AUTO_FLUSH_BACKGROUND</a> support |
| in C++ as well as <a href="http://kudu.apache.org/apidocs/">API documentation</a> |
| for the C++ client API.</p> |
| |
| <p>Will’s contributions include several fixes to the web UIs, large |
| improvements to the Flume integration, and a lot of good work |
| burning down long-standing bugs.</p> |
| |
| <p>Both contributors were “acting the part” and the PMC was pleased to |
| recognize their contributions with committership.</p> |
| </li> |
| <li> |
| <p>Kudu 1.0.0 was <a href="https://kudu.apache.org/2016/09/20/apache-kudu-1-0-0-released.html">released</a> |
| on September 19th. Most community members have upgraded by this point |
| and have been reporting improved stability and performance.</p> |
| </li> |
| <li> |
| <p>Dan Burkert has been managing a Kudu 1.0.1 release to address a few |
| important bugs discovered since 1.0.0. The vote passed on Monday |
| afternoon, so the release should be made officially available |
| later this week.</p> |
| </li> |
| </ul> |
| |
| <h2 id="development-discussions-and-code-in-progress">Development discussions and code in progress</h2> |
| |
| <ul> |
| <li>After the 1.0 release, many contributors have gone into a design phase |
| for upcoming work. Over the last couple of weeks, developers have posted |
| scoping and design documents for topics including: |
| <ul> |
| <li><a href="https://docs.google.com/document/d/1cPNDTpVkIUo676RlszpTF1gHZ8l0TdbB7zFBAuOuYUw/edit#heading=h.gsibhnd5dyem">Security features</a> (Todd Lipcon)</li> |
| <li><a href="https://goo.gl/wP5BJb">Improved disk-failure handling</a> (Dinesh Bhat)</li> |
| <li><a href="https://s.apache.org/7K48">Tools for manual recovery from corruption</a> (Mike Percy and Dinesh Bhat)</li> |
| <li><a href="https://s.apache.org/uOOt">Addressing issues seen with the LogBlockManager</a> (Adar Dembo)</li> |
| <li><a href="https://s.apache.org/7VCo">Providing proper snapshot/serializable consistency</a> (David Alves)</li> |
| <li><a href="https://s.apache.org/ARUP">Improving re-replication of under-replicated tablets</a> (Mike Percy)</li> |
| <li><a href="https://docs.google.com/document/d/1066W63e2YUTNnecmfRwgAHghBPnL1Pte_gJYAaZ_Bjo/edit">Avoiding Raft election storms</a> (Todd Lipcon)</li> |
| </ul> |
| |
| <p>The development community has no particular rule that all work must be |
| accompanied by a design document, but such documents have proven useful in |
| the past for fleshing out ideas around a design before beginning implementation. |
| As Kudu matures, we can probably expect to see more of this kind of planning |
| and design discussion.</p> |
| |
| <p>If any of the above work areas sounds interesting to you, please take a |
| look and leave your comments! Similarly, if you are interested in contributing |
| in any of these areas, please feel free to volunteer on the mailing list. |
| Help of all kinds (coding, documentation, testing, etc.) is welcome.</p> |
| </li> |
| <li>Adar Dembo spent a chunk of time re-working the <code>thirdparty</code> directory |
| that contains most of Kudu’s native dependencies. The major resulting |
| changes are: |
| <ul> |
| <li>Build directories are now cleanly isolated from source directories, |
| improving cleanliness of re-builds.</li> |
| <li>ThreadSanitizer (TSAN) builds now use <code>libc++</code> instead of <code>libstdcxx</code> |
| for C++ library support. The <code>libc++</code> library has better support for |
| sanitizers, is easier to build in isolation, and solves some compatibility |
| issues that Adar was facing with GCC 5 on Ubuntu Xenial.</li> |
| <li>All of the thirdparty dependencies now build with TSAN instrumentation, |
| which improves our coverage of this very effective tooling.</li> |
| </ul> |
| |
| <p>The impact to most developers is that, if you have an old source checkout, |
| it’s highly likely you will need to clean and re-build the thirdparty |
| directory.</p> |
| </li> |
| <li>Many contributors spent time in recent weeks trying to address the |
| flakiness of various test cases. The Kudu project uses a |
| <a href="http://dist-test.cloudera.org:8080/">dashboard</a> to track the flakiness |
| of each test case, and <a href="http://dist-test.cloudera.org/">distributed test infrastructure</a> |
| to facilitate reproducing test flakes. <!-- spaces cause line break --> |
| As might be expected, some of the flaky tests were due to bugs or |
| timing assumptions in the tests themselves. However, this effort |
| also identified several real bugs: |
| <ul> |
| <li>A <a href="http://gerrit.cloudera.org:8080/4570">tight retry loop</a> in the |
| Java client.</li> |
| <li>A <a href="http://gerrit.cloudera.org:8080/4395">memory leak</a> due to circular |
| references in the C++ client.</li> |
| <li>A <a href="http://gerrit.cloudera.org:8080/4551">crash</a> which could affect |
| tools used for problem diagnosis.</li> |
| <li>A <a href="http://gerrit.cloudera.org:8080/4409">divergence bug</a> in Raft consensus |
| under particularly torturous scenarios.</li> |
| <li>A potential <a href="http://gerrit.cloudera.org:8080/4394">crash during tablet server startup</a>.</li> |
| <li>A case in which <a href="http://gerrit.cloudera.org:8080/4626">thread startup could be delayed</a> |
| by built-in monitoring code.</li> |
| </ul> |
| |
| <p>As a result of these efforts, the failure rate of these flaky tests has |
| decreased significantly and the stability of Kudu releases continues |
| to increase.</p> |
| </li> |
| <li> |
| <p>Dan Burkert picked up work originally started by Sameer Abhyankar on |
| <a href="https://issues.apache.org/jira/browse/KUDU-1363">KUDU-1363</a>, which adds |
| support for <code>IN (...)</code> predicates in scanners. Dan committed the |
| <a href="http://gerrit.cloudera.org:8080/2986">main patch</a> as well as corresponding |
| <a href="http://gerrit.cloudera.org:8080/4530">support in the Java client</a>. |
| Jordan Birdsell quickly added corresponding support in <a href="http://gerrit.cloudera.org:8080/4548">Python</a>. |
| This new feature will be available in an upcoming release.</p> |
| </li> |
| <li> |
| <p>Work continues on the <code>kudu</code> command line tool. Dinesh Bhat added |
| the ability to ask a tablet’s leader to <a href="http://gerrit.cloudera.org:8080/4533">step down</a> |
| and Alexey Serbin added a <a href="http://gerrit.cloudera.org:8080/4412">tool to insert random data into a |
| table</a>.</p> |
| </li> |
| <li> |
| <p>Jordan Birdsell continues to be on a tear improving the Python client. |
| The patches are too numerous to mention, but highlights include Python 3 |
| support as well as near feature parity with the C++ client.</p> |
| </li> |
| <li> |
| <p>Todd Lipcon has been doing some refactoring and cleanup in the Raft |
| consensus implementation. In addition to simplifying and removing code, |
| he committed <a href="https://issues.apache.org/jira/browse/KUDU-1567">KUDU-1567</a>, |
| which improves write performance in many cases by a factor of three |
| or more while also improving stability.</p> |
| </li> |
| <li> |
| <p>Brock Noland is working on support for <a href="https://gerrit.cloudera.org/#/c/4491/">INSERT IGNORE</a> |
| as a first-class part of the Kudu API. The same effect can already be |
| achieved by performing normal inserts and ignoring any resulting |
| errors, but pushing the operation to the server prevents the server |
| from counting such operations as errors.</p> |
| </li> |
| <li>Congratulations to Ninad Shringarpure for contributing his first patches |
| to Kudu. Ninad contributed two documentation fixes and improved |
| formatting on the Kudu web UI.</li> |
| </ul> |
| |
| <p>Want to learn more about a specific topic from this blog post? Shoot an email to the |
| <a href="&#109;&#097;&#105;&#108;&#116;&#111;:&#117;&#115;&#101;&#114;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">kudu-user mailing list</a> or |
| tweet at <a href="https://twitter.com/ApacheKudu">@ApacheKudu</a>. Similarly, if you’re |
| aware of some Kudu news we missed, let us know so we can cover it in |
| a future post.</p></content><author><name>Todd Lipcon</name></author><summary>Welcome to the twenty-first edition of the Kudu Weekly Update. Astute |
| readers will notice that the weekly blog posts have been not-so-weekly |
| of late – in fact, it has been nearly two months since the previous post |
| as I and others have focused on releases, conferences, etc. |
| |
| So, rather than covering just this past week, this post will cover highlights |
| of the progress since the 1.0 release in mid-September. If you’re interested |
| in learning about progress prior to that release, check the |
| release notes.</summary></entry><entry><title>Apache Kudu at Strata+Hadoop World NYC 2016</title><link href="/2016/09/26/strata-nyc-kudu-talks.html" rel="alternate" type="text/html" title="Apache Kudu at Strata+Hadoop World NYC 2016" /><published>2016-09-26T00:00:00-07:00</published><updated>2016-09-26T00:00:00-07:00</updated><id>/2016/09/26/strata-nyc-kudu-talks</id><content type="html" xml:base="/2016/09/26/strata-nyc-kudu-talks.html"><p>This week in New York, O’Reilly and Cloudera will be hosting Strata+Hadoop World |
| 2016. If you’re interested in Kudu, there will be several opportunities to |
| learn more, both from the open source development team and from some companies |
| who are already adopting Kudu for their use cases. |
| <!--more--> |
| Here are some of the sessions to check out:</p> |
| |
| <ul> |
| <li> |
| <p><a href="http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52146">Powering real-time analytics on Xfinity using Kudu</a> (Wednesday, 11:20am)</p> |
| |
| <p>Sridhar Alla and Kiran Muglurmath from Comcast will talk about how they’re using |
| Kudu to store hundreds of billions of Set-Top Box (STB) events, performing |
| analytics concurrently with real-time streaming ingest of thousands of events |
| per second.</p> |
| </li> |
| <li> |
| <p><a href="http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52248">Creating real-time, data-centric applications with Impala and Kudu</a> (Wednesday, 2:05pm)</p> |
| |
| <p>Marcel Kornacker and Todd Lipcon will introduce how Impala and Kudu together |
| allow users to build real-time applications that support streaming ingest, |
| random access updates and deletes, and high performance analytic SQL in |
| a single system.</p> |
| </li> |
| <li> |
| <p><a href="http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52168">Streaming cybersecurity into Graph: Accelerating data into Datastax Graph and Blazegraph</a> (Thursday, 1:15pm)</p> |
| |
| <p>Joshua Patterson, Michael Wendt, and Keith Kraus from Accenture Labs will discuss |
| how they have built cybersecurity solutions using graph analytics on top of open |
| source technology like Apache Kafka, Spark, and Flink. They will also touch on |
| why Kudu is becoming an integral part of Accenture’s technology stack.</p> |
| </li> |
| <li> |
| <p><a href="http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52050">How GE analyzes billions of mission-critical events in real time using Apache Apex, Spark, and Kudu</a> (Thursday, 2:05pm)</p> |
| |
| <p>Venkatesh Sivasubramanian and Luis Ramos from GE Digital will discuss how they |
| collect and process real-time IoT data using Apache Apex and Apache Spark, and |
| how they’ve been experimenting with Apache Kudu for time series data storage.</p> |
| </li> |
| <li> |
| <p><a href="http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/51887">Apache Kudu: 1.0 and Beyond</a> (Thursday, 4:35pm)</p> |
| |
| <p>Todd Lipcon from Cloudera will review the new features that were developed between Kudu 0.5 |
| (the first public release one year ago) and Kudu 1.0, released just last week. Additionally, |
| this talk will provide some insight into the project roadmap for the coming year.</p> |
| </li> |
| </ul> |
| |
| <p>Aside from these organized sessions, word has it that there will be various demos |
| featuring Apache Kudu at the Cloudera and ZoomData vendor booths.</p> |
| |
| <p>If you’re not attending the conference, but still based in NYC, all hope is |
| not lost. Michael Crutcher from Cloudera will be presenting an introduction |
| to Apache Kudu at the <a href="http://www.meetup.com/mysqlnyc/events/233599664/">SQL NYC Meetup</a>. |
| Be sure to RSVP as spots are filling up fast.</p></content><author><name>Todd Lipcon</name></author><summary>This week in New York, O’Reilly and Cloudera will be hosting Strata+Hadoop World |
| 2016. If you’re interested in Kudu, there will be several opportunities to |
| learn more, both from the open source development team and from some companies |
| who are already adopting Kudu for their use cases.</summary></entry></feed> |