<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="http://jekyllrb.com" version="2.5.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2018-08-17T17:34:56+02:00</updated><id>/</id><entry><title>Getting Started with Kudu - an O’Reilly Title</title><link href="/2018/08/06/getting-started-with-kudu-an-oreilly-title.html" rel="alternate" type="text/html" title="Getting Started with Kudu - an O&#39;Reilly Title" /><published>2018-08-06T00:00:00+02:00</published><updated>2018-08-06T00:00:00+02:00</updated><id>/2018/08/06/getting-started-with-kudu-an-oreilly-title</id><content type="html" xml:base="/2018/08/06/getting-started-with-kudu-an-oreilly-title.html">&lt;p&gt;The following article by Brock Noland was reposted from the
&lt;a href=&quot;https://www.phdata.io/getting-started-with-kudu/&quot;&gt;phData&lt;/a&gt;
blog with their permission.&lt;/p&gt;
&lt;p&gt;Five years ago, enabling Data Science and Advanced Analytics on the
Hadoop platform was hard. Organizations required strong Software Engineering
capabilities to successfully implement complex Lambda architectures or even
simply implement continuous ingest. Updating or deleting data was simply a
nightmare. General Data Protection Regulation (GDPR) would have been an extreme
challenge at that time.
&lt;!--more--&gt;
In that context, on October 11th, 2012, Todd Lipcon performed Apache Kudu’s initial
commit. The commit message was:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Code for writing cfiles seems to basically work
Need to write code for reading cfiles, still
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And Kudu development was off and running. Around this same time Todd, on his
internal Wiki page, started listing out the papers he was reading to develop
the theoretical background for creating Kudu. I followed along, reading as many
as I could, understanding little, because I knew Todd was up to something
important. About a year after that initial commit, I got my
&lt;a href=&quot;https://github.com/apache/kudu/commit/1d7e6864b4a31d3fe6897e4cb484dfcda6608d43&quot;&gt;first Kudu commit&lt;/a&gt;,
documenting the upper bound of a library. This is a small contribution of which I am still
proud.&lt;/p&gt;
&lt;p&gt;In the meantime, I was lucky enough to be a founder of a Hadoop Managed Services
and Consulting company known as &lt;a href=&quot;http://phdata.io/&quot;&gt;phData&lt;/a&gt;. We found that a majority
of our customers had use cases which Kudu vastly simplified. Whether it’s Change Data
Capture (CDC) from thousands of source tables or Internet of Things (IoT) ingest, Kudu
makes life much easier as both an operator of a Hadoop cluster and a developer providing
business value on the platform.&lt;/p&gt;
&lt;p&gt;Through this work, I was lucky enough to be a co-author of
&lt;a href=&quot;http://shop.oreilly.com/product/0636920065739.do&quot;&gt;Getting Started with Kudu&lt;/a&gt;.
The book sums up what my co-authors, Jean-Marc Spaggiari, Mladen
Kovacevic, and Ryan Bosshart, and I learned while cutting our teeth on early versions
of Kudu. Specifically, you will learn:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Theoretical understanding of Kudu concepts in plain-spoken words and simple diagrams&lt;/li&gt;
&lt;li&gt;Why, for many use cases, using Kudu is so much easier than other ecosystem storage technologies&lt;/li&gt;
&lt;li&gt;How Kudu enables Hybrid Transactional/Analytical Processing (HTAP) use cases&lt;/li&gt;
&lt;li&gt;How to design IoT, Predictive Modeling, and Mixed Platform Solutions using Kudu&lt;/li&gt;
&lt;li&gt;How to design Kudu Schemas&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&quot;/img/2018-08-06-getting-started-with-kudu-an-oreilly-title.gif&quot; alt=&quot;Getting Started with Kudu Cover&quot; class=&quot;img-responsive&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Looking forward, I am excited to see Kudu gain additional features and adoption
and eventually the second revision of this title. In the meantime, if you have
feedback or questions, please reach out on the &lt;code&gt;#getting-started-kudu&lt;/code&gt; channel of
the &lt;a href=&quot;https://getkudu-slack.herokuapp.com/&quot;&gt;Kudu Slack&lt;/a&gt; or if you prefer non-real-time
communication, please use the user@ mailing list!&lt;/p&gt;</content><author><name>Brock Noland</name></author><summary>The following article by Brock Noland was reposted from the
phData
blog with their permission.
Five years ago, enabling Data Science and Advanced Analytics on the
Hadoop platform was hard. Organizations required strong Software Engineering
capabilities to successfully implement complex Lambda architectures or even
simply implement continuous ingest. Updating or deleting data was simply a
nightmare. General Data Protection Regulation (GDPR) would have been an extreme
challenge at that time.</summary></entry><entry><title>Instrumentation in Apache Kudu</title><link href="/2018/07/10/instrumentation-in-kudu.html" rel="alternate" type="text/html" title="Instrumentation in Apache Kudu" /><published>2018-07-10T00:00:00+02:00</published><updated>2018-07-10T00:00:00+02:00</updated><id>/2018/07/10/instrumentation-in-kudu</id><content type="html" xml:base="/2018/07/10/instrumentation-in-kudu.html">&lt;p&gt;Last week, the &lt;a href=&quot;http://opentracing.io/&quot;&gt;OpenTracing&lt;/a&gt; community invited me to
their monthly Google Hangout meetup to give an informal talk on tracing and
instrumentation in Apache Kudu.&lt;/p&gt;
&lt;p&gt;While Kudu doesn’t currently support distributed tracing using OpenTracing,
it does have quite a lot of other types of instrumentation, metrics, and
diagnostics logging. The OpenTracing team was interested to hear about some of
the approaches that Kudu has used, and so I gave a brief introduction to topics
including:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ul&gt;
&lt;li&gt;The Kudu &lt;a href=&quot;/docs/administration.html#_diagnostics_logging&quot;&gt;diagnostics log&lt;/a&gt;,
which periodically logs metrics and stack traces.&lt;/li&gt;
&lt;li&gt;The &lt;a href=&quot;/docs/troubleshooting.html#kudu_tracing&quot;&gt;process-wide tracing&lt;/a&gt;
support based on the open source tracing framework implemented by Google Chrome.&lt;/li&gt;
&lt;li&gt;The &lt;a href=&quot;/docs/troubleshooting.html#kudu_tracing&quot;&gt;stack watchdog&lt;/a&gt;,
which helps us find various latency outliers and issues in our libraries and
the Linux kernel.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/docs/troubleshooting.html#heap_sampling&quot;&gt;Heap sampling&lt;/a&gt; support,
which helps us understand unexpected memory usage.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you’re interested in learning about these topics and more, check out the video recording
below. My talk spans the first 34 minutes.&lt;/p&gt;
&lt;iframe width=&quot;800&quot; height=&quot;500&quot; src=&quot;https://www.youtube.com/embed/qBXwKU6Ubjo?end=2058&amp;amp;start=23&quot;&gt;
&lt;/iframe&gt;
&lt;p&gt;If you have any questions about this content or about Kudu in general,
&lt;a href=&quot;http://kudu.apache.org/community.html&quot;&gt;join the community&lt;/a&gt;.&lt;/p&gt;</content><author><name>Todd Lipcon</name></author><summary>Last week, the OpenTracing community invited me to
their monthly Google Hangout meetup to give an informal talk on tracing and
instrumentation in Apache Kudu.
While Kudu doesn’t currently support distributed tracing using OpenTracing,
it does have quite a lot of other types of instrumentation, metrics, and
diagnostics logging. The OpenTracing team was interested to hear about some of
the approaches that Kudu has used, and so I gave a brief introduction to topics
including:</summary></entry><entry><title>Apache Kudu 1.7.0 released</title><link href="/2018/03/23/apache-kudu-1-7-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.7.0 released" /><published>2018-03-23T00:00:00+01:00</published><updated>2018-03-23T00:00:00+01:00</updated><id>/2018/03/23/apache-kudu-1-7-0-released</id><content type="html" xml:base="/2018/03/23/apache-kudu-1-7-0-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.7.0!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.7.0 is a minor release that offers new features, performance
optimizations, incremental improvements, and bug fixes.&lt;/p&gt;
&lt;p&gt;Release highlights:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ol&gt;
&lt;li&gt;Kudu now supports the decimal column type. The decimal type is a numeric
data type with fixed scale and precision suitable for financial and other
arithmetic calculations where the imprecise representation and rounding
behavior of float and double make those types impractical. The decimal type
is also useful for integers larger than int64 and cases with fractional values
in a primary key. See &lt;a href=&quot;/releases/1.7.0/docs/schema_design.html#decimal&quot;&gt;Decimal Type&lt;/a&gt;
for more details.&lt;/li&gt;
&lt;li&gt;The strategy Kudu uses for automatically healing tablets which have lost a
replica due to server or disk failures has been improved. The new re-replication
strategy, or replica management scheme, first adds a replacement tablet replica
before evicting the failed one.&lt;/li&gt;
&lt;li&gt;A new scan read mode, READ_YOUR_WRITES, has been added. Users can specify READ_YOUR_WRITES when
creating a new scanner in C++, Java and Python clients. If this mode is used,
the client will perform a read such that it follows all previously known writes
and reads from this client. Reads in this mode ensure read-your-writes and
read-your-reads session guarantees, while minimizing latency caused by waiting
for outstanding write transactions to complete. Note that this is still an
experimental feature which may be stabilized in future releases.&lt;/li&gt;
&lt;li&gt;The tablet server web UI scans dashboard (/scans) has been improved with several
new features, including: showing the most recently completed scans, a pseudo-SQL
scan descriptor that concisely shows the selected columns and applied predicates,
and more complete and better documented scan statistics.&lt;/li&gt;
&lt;li&gt;Kudu daemons now expose a web page /stacks which dumps the current stack trace of
every thread running in the server. This information can be helpful when diagnosing
performance issues.&lt;/li&gt;
&lt;li&gt;By default, each tablet replica will now stripe data blocks across 3 data directories
instead of all data directories. This decreases the likelihood that any given tablet
will be affected in the event of a single disk failure.&lt;/li&gt;
&lt;li&gt;The Java client now uses a predefined prioritized list of TLS ciphers when
establishing an encrypted connection to Kudu servers. This cipher list matches the
list of ciphers preferred for server-to-server communication and ensures that the
most efficient and secure ciphers are preferred. When the Kudu client is running on
Java 8 or newer, this provides a substantial speed-up to read and write performance.&lt;/li&gt;
&lt;li&gt;The performance of inserting rows containing many string or binary columns has been
improved, especially in the case of highly concurrent write workloads.&lt;/li&gt;
&lt;li&gt;The Java client will now automatically attempt to re-acquire Kerberos credentials
from the ticket cache when the prior credentials are about to expire. This allows
client instances to persist longer than the expiration time of a single Kerberos
ticket so long as some other process renews the credentials in the ticket cache.&lt;/li&gt;
&lt;/ol&gt;
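&lt;p&gt;To illustrate the first highlight, here is a minimal sketch of why binary floating point is a poor fit for exact arithmetic. It uses Python’s standard &lt;code&gt;decimal&lt;/code&gt; module as a stand-in for a fixed-scale type; it does not call any Kudu API, and the variable names are illustrative only.&lt;/p&gt;

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum drifts away from the expected value.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# A fixed-scale decimal keeps every digit exactly.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = (decimal_sum == Decimal("0.30"))

print(float_is_exact, decimal_is_exact)
```

&lt;p&gt;The same reasoning applies to monetary amounts stored in a column: a type with fixed scale and precision avoids accumulating representation error across many additions.&lt;/p&gt;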
&lt;p&gt;For more details, and the complete list of changes in Kudu 1.7.0, please see
the &lt;a href=&quot;/releases/1.7.0/docs/release_notes.html&quot;&gt;Kudu 1.7.0 release notes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Apache Kudu project only publishes source code releases. To build Kudu
1.7.0, follow these steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.7.0/&quot;&gt;Kudu 1.7.0 source release&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Follow the instructions in the documentation to
&lt;a href=&quot;/releases/1.7.0/docs/installation.html#build_from_source&quot;&gt;build Kudu 1.7.0 from source&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For your convenience, binary JAR files for the Kudu Java client library, Spark
DataSource, Flume sink, and other Java integrations are published to the ASF
Maven repository and are
&lt;a href=&quot;https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.kudu%22%20AND%20v%3A%221.7.0%22&quot;&gt;now available&lt;/a&gt;.&lt;/p&gt;</content><author><name>Grant Henke</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.7.0!
Apache Kudu 1.7.0 is a minor release that offers new features, performance
optimizations, incremental improvements, and bug fixes.
Release highlights:</summary></entry><entry><title>Apache Kudu 1.6.0 released</title><link href="/2017/12/08/apache-kudu-1-6-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.6.0 released" /><published>2017-12-08T00:00:00+01:00</published><updated>2017-12-08T00:00:00+01:00</updated><id>/2017/12/08/apache-kudu-1-6-0-released</id><content type="html" xml:base="/2017/12/08/apache-kudu-1-6-0-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.6.0!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.6.0 is a minor release that offers new features, performance
optimizations, incremental improvements, and bug fixes.&lt;/p&gt;
&lt;p&gt;Release highlights:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ol&gt;
&lt;li&gt;Kudu servers can now tolerate short interruptions in NTP clock
synchronization. NTP synchronization is still required when any Kudu daemon
starts up.&lt;/li&gt;
&lt;li&gt;Tablet servers will no longer crash when a disk containing data blocks
fails, unless that disk also stores WAL segments or tablet metadata. Instead
of crashing, the tablet server will shut down any tablets that may have lost
data locally and Kudu will re-replicate the affected tablets to another
tablet server. More information can be found in the documentation under
&lt;a href=&quot;/releases/1.6.0/docs/administration.html#disk_failure_recovery&quot;&gt;Recovering from Disk Failure&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Tablet server startup time has been improved significantly on servers
containing large numbers of blocks.&lt;/li&gt;
&lt;li&gt;The Spark DataSource integration now can take advantage of scan locality for
better scan performance. The scan will take place at the closest replica
instead of going to the leader.&lt;/li&gt;
&lt;li&gt;Support for Spark 1 has been removed in Kudu 1.6.0 and now only Spark 2 is
supported. Spark 1 support was deprecated in Kudu 1.5.0.&lt;/li&gt;
&lt;li&gt;HybridTime timestamp propagation now works in the Java client when using
scan tokens.&lt;/li&gt;
&lt;li&gt;Tablet servers now consider the health of all replicas of a tablet before
deciding to evict one. This can improve the stability of the Kudu cluster
when multiple servers temporarily go down at the same time.&lt;/li&gt;
&lt;li&gt;A bug in the C++ client was fixed that could cause tablets to be erroneously
pruned, or skipped, during certain scans, resulting in fewer results than
expected being returned from queries. The bug only affected tables whose
range partition columns are a proper prefix of the primary key.
See &lt;a href=&quot;https://issues.apache.org/jira/browse/KUDU-2173&quot;&gt;KUDU-2173&lt;/a&gt; for more
information.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For more details, and the complete list of changes in Kudu 1.6.0, please see
the &lt;a href=&quot;/releases/1.6.0/docs/release_notes.html&quot;&gt;Kudu 1.6.0 release notes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Apache Kudu project only publishes source code releases. To build Kudu
1.6.0, follow these steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.6.0/&quot;&gt;Kudu 1.6.0 source release&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Follow the instructions in the documentation to
&lt;a href=&quot;/releases/1.6.0/docs/installation.html#build_from_source&quot;&gt;build Kudu 1.6.0 from source&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For your convenience, binary JAR files for the Kudu Java client library, Spark
DataSource, Flume sink, and other Java integrations are published to the ASF
Maven repository and are
&lt;a href=&quot;https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.kudu%22%20AND%20v%3A%221.6.0%22&quot;&gt;now available&lt;/a&gt;.&lt;/p&gt;</content><author><name>Mike Percy</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.6.0!
Apache Kudu 1.6.0 is a minor release that offers new features, performance
optimizations, incremental improvements, and bug fixes.
Release highlights:</summary></entry><entry><title>Slides: A brave new world in mutable big data: Relational storage</title><link href="/2017/10/23/nosql-kudu-spanner-slides.html" rel="alternate" type="text/html" title="Slides: A brave new world in mutable big data: Relational storage" /><published>2017-10-23T00:00:00+02:00</published><updated>2017-10-23T00:00:00+02:00</updated><id>/2017/10/23/nosql-kudu-spanner-slides</id><content type="html" xml:base="/2017/10/23/nosql-kudu-spanner-slides.html">&lt;p&gt;Since the Apache Kudu project made its debut in 2015, there have been
a few common questions that kept coming up at every presentation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Is Kudu an open source version of Google’s Spanner system?&lt;/li&gt;
&lt;li&gt;Is Kudu NoSQL or SQL?&lt;/li&gt;
&lt;li&gt;Why does Kudu have a relational data model? Isn’t SQL dead?&lt;/li&gt;
&lt;/ul&gt;
&lt;!--more--&gt;
&lt;p&gt;A few of these questions are addressed in the
&lt;a href=&quot;https://kudu.apache.org/faq.html&quot;&gt;Kudu FAQ&lt;/a&gt;, but I thought they were
interesting enough that I decided to give a talk on these subjects
at &lt;a href=&quot;https://conferences.oreilly.com/strata/strata-ny&quot;&gt;Strata Data Conference NYC 2017&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Preparing this talk was particularly interesting, since Google recently released
Spanner to the public in SaaS form as &lt;a href=&quot;https://cloud.google.com/spanner/&quot;&gt;Google Cloud Spanner&lt;/a&gt;.
This meant that I was able to compare Kudu vs Spanner not just qualitatively
based on some academic papers, but quantitatively as well.&lt;/p&gt;
&lt;p&gt;To summarize the key points of the presentation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Despite the growing popularity of “NoSQL” from 2009 through 2013, SQL has
once again become the access mechanism of choice for the majority of
analytic applications. NoSQL has become “Not Only SQL”.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Spanner and Kudu share a lot of common features. However:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Spanner offers a superior feature set and performance for Online
Transactional Processing (OLTP) workloads, including ACID transactions and
secondary indexing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Kudu offers a superior feature set and performance for Online
Analytical Processing (OLAP) and Hybrid Transactional/Analytic Processing
(HTAP) workloads, including more complete SQL support and orders of
magnitude better performance on large queries.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For more details and for the full benchmark numbers, check out the slide deck
below:&lt;/p&gt;
&lt;iframe src=&quot;//www.slideshare.net/slideshow/embed_code/key/loQpO2vzlwGGgz&quot; width=&quot;595&quot; height=&quot;485&quot; frameborder=&quot;0&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; scrolling=&quot;no&quot; style=&quot;border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;&quot; allowfullscreen=&quot;&quot;&gt; &lt;/iframe&gt;
&lt;div style=&quot;margin-bottom:15px&quot;&gt; &lt;strong&gt; &lt;a href=&quot;//www.slideshare.net/ToddLipcon/a-brave-new-world-in-mutable-big-data-relational-storage-strata-nyc-2017&quot; title=&quot;A brave new world in mutable big data relational storage (Strata NYC 2017)&quot; target=&quot;_blank&quot;&gt;A brave new world in mutable big data relational storage (Strata NYC 2017)&lt;/a&gt; &lt;/strong&gt; from &lt;strong&gt;&lt;a href=&quot;https://www.slideshare.net/ToddLipcon&quot; target=&quot;_blank&quot;&gt;Todd Lipcon&lt;/a&gt;&lt;/strong&gt; &lt;/div&gt;
&lt;p&gt;Questions or comments? Join the &lt;a href=&quot;/community.html&quot;&gt;Apache Kudu Community&lt;/a&gt; to discuss.&lt;/p&gt;</content><author><name>Todd Lipcon</name></author><summary>Since the Apache Kudu project made its debut in 2015, there have been
a few common questions that kept coming up at every presentation:
Is Kudu an open source version of Google’s Spanner system?
Is Kudu NoSQL or SQL?
Why does Kudu have a relational data model? Isn’t SQL dead?</summary></entry><entry><title>Consistency in Apache Kudu, Part 1</title><link href="/2017/09/18/kudu-consistency-pt1.html" rel="alternate" type="text/html" title="Consistency in Apache Kudu, Part 1" /><published>2017-09-18T00:00:00+02:00</published><updated>2017-09-18T00:00:00+02:00</updated><id>/2017/09/18/kudu-consistency-pt1</id><content type="html" xml:base="/2017/09/18/kudu-consistency-pt1.html">&lt;p&gt;In this series of short blog posts we will introduce Kudu’s consistency model,
its design and ultimate goals, current features, and next steps.
On the way, we’ll shed some light on the more relevant components and how they
fit together.&lt;/p&gt;
&lt;p&gt;In Part 1 of the series (this one), we’ll cover motivation and design trade-offs, the end goals and
the current status.&lt;/p&gt;
&lt;!--more--&gt;
&lt;h2 id=&quot;what-is-consistency-and-why-is-it-relevant&quot;&gt;What is “consistency” and why is it relevant?&lt;/h2&gt;
&lt;p&gt;In order to cope with ever increasing data volumes, modern storage systems like Kudu have to support
many concurrent users while coordinating requests across many machines, each with many threads executing
work at the same time. However, application developers shouldn’t have to understand the internal
details of how these systems implement this parallel, distributed, execution in order to write
correct applications. &lt;em&gt;Consistency in the context of parallel, distributed systems roughly
refers to how the system behaves in comparison to a single-machine, single-thread system&lt;/em&gt;. In a
single-threaded, single-machine storage system operations happen one-at-a-time, in a clearly
defined order, making correct applications easy to code and reason about. A developer writing an
application against such a system doesn’t have to care about how simultaneous operations interact
or about ordering anomalies, so the code is simpler, but more importantly, cognitive load is greatly
reduced, freeing focus for the application logic itself.&lt;/p&gt;
&lt;p&gt;While such a simple system is definitely possible to build, it wouldn’t be able to cope with very
large amounts of data. In order to deal with big data volumes and high write throughput, modern storage
systems like Kudu are designed to be distributed, storing and processing data across many machines
and cores. This means that many things happen simultaneously on the same and on different machines,
and that there are more moving parts and thus more opportunity for mis-orderings and for components
to fail. How far systems like Kudu go (or don’t go) in emulating the simple single-threaded, single-machine
system in a distributed, parallel setting where failures are common is roughly what is referred to
as how &lt;em&gt;“consistent”&lt;/em&gt; the system is.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Consistency&lt;/em&gt; as a term is somewhat overloaded in the distributed systems and database communities:
there are many different models, properties, different names for the same concept, and often
different concepts under the same name. This post is not meant to introduce these concepts,
as there are excellent references already available elsewhere (we recommend Kyle Kingsbury’s
series of blog posts on the matter, like &lt;a href=&quot;https://aphyr.com/posts/313-strong-consistency-models&quot;&gt;this one&lt;/a&gt;).
Throughout this and follow-up posts we’ll refer to consistency loosely as the &lt;strong&gt;C&lt;/strong&gt; in &lt;strong&gt;CAP&lt;/strong&gt;[1]
in some cases and as the &lt;strong&gt;I&lt;/strong&gt; in &lt;strong&gt;ACID&lt;/strong&gt;[2] in others; we’ll try to be specific when relevant.&lt;/p&gt;
&lt;h2 id=&quot;design-decisions-trade-offs-and-motivation&quot;&gt;Design decisions, trade-offs and motivation&lt;/h2&gt;
&lt;p&gt;Consistency is essentially about ordering and ordering usually has a cost. Distributed storage
system design must choose to prioritize some properties over others according to the target use
cases. That is, trade-offs must be made or, borrowing a term from economics, there is
“no free lunch”. Different systems choose different trade-off points. For instance, systems inspired
by &lt;em&gt;Dynamo&lt;/em&gt;[3] usually favor availability in the consistency/availability
trade-off: by allowing a write to a data item to succeed even when a majority (or even all) of the
replicas serving that data item are unreachable, Dynamo’s design minimizes insertion errors and
insert latency (related to availability) at the cost of having to perform extra work for value
reconciliation on reads and possibly returning stale or disordered values (related to consistency).
On the other end of the spectrum, traditional DBMS design is often driven by the need to support
transactions of arbitrary complexity while providing the users stronger, predictable, semantics,
favoring consistency at the cost of scalability and availability.&lt;/p&gt;
&lt;p&gt;Kudu’s overarching goal is to enable &lt;em&gt;fast analytic workloads over large amounts of mutable&lt;/em&gt; data,
meaning it was designed to perform fast scans over large volumes of data stored in many servers.
In practical terms this means that, when given a choice, we more often than not opted for the
design that would give Kudu faster scan performance, favoring reads even if it meant pushing
a bit more work to the write path, i.e. the path that mutates data. This does not mean that the write path
was not a concern altogether. In fact, modern storage systems like &lt;em&gt;Google’s Spanner&lt;/em&gt;[4]
global-scale database demonstrate that, with the right set of trade-offs, it is possible to have strong
consistency semantics with write latencies and overall availability that are adequate for most use
cases (e.g. Spanner achieves 5 9’s of availability). For the write path, we often made similar choices in Kudu.&lt;/p&gt;
&lt;p&gt;Another important aspect that directed our design decisions is the type of &lt;em&gt;write workload&lt;/em&gt; we targeted.
Traditionally, analytical storage systems target periodic bulk write workloads and a continuous
stream of analytical scans. This design is often problematic in that it forces users to have to
build complex pipelines where data is accumulated in one place for later loading into the storage
system. Moreover, beyond the architectural complexity, this kind of design usually also
means that the data that is available for analytics is not the most recent. In Kudu we aimed for
enabling continuous ingest, i.e. having a continuous stream of small writes, obviating the need to
assemble a pipeline for data accumulation/loading and allowing analytical scans to have access to
the most recent data. Another important aspect of the write workloads that we targeted in Kudu is
that they are append-mostly, i.e. most writes insert new values into the table, with a smaller percentage
updating existing values. Both the average write size and the data distribution influence
the design of the write path, as we’ll see in the following sections.&lt;/p&gt;
&lt;p&gt;One last concern we had in mind is that different users have different needs when it comes to
consistency semantics, particularly as it applies to an analytical storage system like Kudu. For
some users consistency isn’t a primary concern: they just want fast scans and the ability to
update/insert/delete values without needing to build a complex pipeline. For example, many machine
learning models are mostly insensitive to data recency or ordering so, when using Kudu to store data that
will be used to train such a model, consistency is often not as primary a concern as read/write performance is.
In other cases consistency is a much higher priority. For example, when using Kudu to
store transaction data for fraud analysis it might be important to capture if events are causally
related. Fraudulent transactions might be characterized by a specific sequence of events and when
retrieving that data it might be important for the scan result to reflect that sequence. Kudu’s
design allows users to make a trade-off between consistency and performance at scan time. That is,
users can choose to have stronger consistency semantics for scans at the penalty of latency and
throughput or they can choose to weaken the consistency semantics for an extra performance boost.&lt;/p&gt;
&lt;h3 id=&quot;note&quot;&gt;Note&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;Kudu currently lacks support for atomic multi-row mutation operations (i.e. mutation
operations to more than one row in the same or different tablets, planned as a future feature).
So, when discussing writes, we’ll be talking about the consistency semantics of single row mutations.
In this context we’ll discuss Kudu’s properties more from a key/value store standpoint. On the
other hand Kudu is an analytical storage engine so, for the read path, we’ll also discuss the
semantics of large (multi-row) scans. This moves the discussion more into the field of traditional
DBMSs. These ingredients make for a non-traditional discussion that is not exactly apples-to-apples
with what the reader might be familiar with, but our hope is that it still provides valuable, or
at least interesting, insight.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;consistency-options-in-kudu&quot;&gt;Consistency options in Kudu&lt;/h2&gt;
&lt;p&gt;Consistency, as well as other properties, is underpinned in Kudu by the concept of a &lt;em&gt;timestamp&lt;/em&gt;.
In follow-up posts we’ll look in detail at how these are assigned and how they are assembled. For now
it’s sufficient to know that a timestamp is a single, usually large, number that has some mapping
to wall time. Each mutation of a Kudu row is tagged with one such timestamp. Globally, these timestamps
form a partial order over all the rows with the particularity that causally related mutations (e.g.
a write mutation that is the result of the value obtained from a previous read) may be required to
have increasing timestamps, depending on the user’s choices.&lt;/p&gt;
&lt;p&gt;Row mutations performed by a single client &lt;em&gt;instance&lt;/em&gt; are guaranteed to have increasing timestamps
thus reflecting their potential causal relationship. This property is always enforced. However
there are two major &lt;em&gt;“knobs”&lt;/em&gt; that are available to the user to make performance trade-offs, the
&lt;code&gt;Read&lt;/code&gt; mode, and the &lt;code&gt;External Consistency&lt;/code&gt; mode (see &lt;a href=&quot;https://kudu.apache.org/docs/transaction_semantics.html&quot;&gt;here&lt;/a&gt;
for more information on how to use the relevant APIs).&lt;/p&gt;
&lt;p&gt;The first and most important knob, the &lt;code&gt;Read&lt;/code&gt; mode, determines the guaranteed recency of
data returned by scans. Since Kudu uses replication for availability and fault-tolerance, there
are always multiple replicas of any data item.
Not all replicas are necessarily up to date, so if the user cares about recency, e.g. if the user requires
that any data read includes all previously written data &lt;em&gt;from a single client instance&lt;/em&gt;, then they must
choose the &lt;code&gt;READ_AT_SNAPSHOT&lt;/code&gt; read mode. With this mode enabled the client is guaranteed to observe
&lt;strong&gt;“READ YOUR OWN WRITES”&lt;/strong&gt; semantics, i.e. scans from a client will always include all previous mutations
performed by that client. Note that this property is local to a single client instance, not a global
property.&lt;/p&gt;
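&lt;p&gt;To make the difference concrete, here is a minimal sketch of the idea in Python. It is a toy
model, not Kudu’s client API: the &lt;code&gt;Replica&lt;/code&gt; and &lt;code&gt;Tablet&lt;/code&gt; classes, the integer
clock and the method names are all assumptions made for illustration, and “waiting for a replica to catch up”
is modelled simply as applying the writes it has already received.&lt;/p&gt;

```python
import bisect


class Replica:
    """Toy replica (hypothetical): receives every write but may apply it late."""

    def __init__(self):
        self.log = []      # (timestamp, key, value), in timestamp order
        self.applied = 0   # how many log entries are visible so far

    def replicate(self, ts, key, value, apply_now):
        self.log.append((ts, key, value))
        if apply_now:
            self.applied = len(self.log)

    def _state(self, count):
        # Replay the first `count` log entries into a key/value view.
        return {key: value for _, key, value in self.log[:count]}

    def read_latest(self):
        # Return whatever this replica has applied, however stale.
        return self._state(self.applied)

    def read_at_snapshot(self, snapshot_ts):
        # Count the entries with a timestamp at or below snapshot_ts, "wait"
        # until they are applied (modelled as applying them), then read.
        timestamps = [ts for ts, _, _ in self.log]
        upto = bisect.bisect_right(timestamps, snapshot_ts)
        self.applied = max(self.applied, upto)
        return self._state(upto)


class Tablet:
    """Toy tablet (hypothetical): an up-to-date leader and a lagging follower."""

    def __init__(self):
        self.leader = Replica()
        self.follower = Replica()
        self.clock = 0

    def write(self, key, value):
        self.clock += 1
        self.leader.replicate(self.clock, key, value, apply_now=True)
        self.follower.replicate(self.clock, key, value, apply_now=False)
        return self.clock   # the client remembers this timestamp


tablet = Tablet()
last_write_ts = tablet.write("row-1", "v1")

stale = tablet.follower.read_latest()                     # misses the write
fresh = tablet.follower.read_at_snapshot(last_write_ts)   # includes the write
print(stale)
print(fresh)
```

&lt;p&gt;In this sketch the plain read of the lagging follower returns an empty view, while the snapshot
read at the client’s last write timestamp returns the row the client just wrote. The guarantee follows only
from the timestamp the client itself carries, which is why it is local to a single client instance.&lt;/p&gt;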
&lt;p&gt;The second “knob”, the &lt;code&gt;External Consistency&lt;/code&gt; mode, defines the semantics of how reads and writes
are performed across multiple client instances. By default, &lt;code&gt;External Consistency&lt;/code&gt; is set to
&lt;code&gt;CLIENT_PROPAGATED&lt;/code&gt;, meaning it’s up to the user to pass a set of &lt;em&gt;timestamp tokens&lt;/em&gt; between clients (even
across different machines) whenever they perform writes or reads that are causally linked.
If done correctly this enables &lt;strong&gt;STRICT SERIALIZABILITY&lt;/strong&gt;[5], i.e. &lt;strong&gt;LINEARIZABILITY&lt;/strong&gt;[6] and
&lt;strong&gt;SERIALIZABILITY&lt;/strong&gt;[7] at the same time, at the cost of having the user coordinate the timestamp
tokens across clients (a survey of the meaning of these and other definitions can be found
&lt;a href=&quot;http://www.ics.forth.gr/tech-reports/2013/2013.TR439_Survey_on_Consistency_Conditions.pdf&quot;&gt;here&lt;/a&gt;).
The alternative is to set &lt;code&gt;External Consistency&lt;/code&gt; to
&lt;code&gt;COMMIT_WAIT&lt;/code&gt; (experimental), which guarantees the same properties through different means, by
implementing Google Spanner’s &lt;em&gt;TrueTime&lt;/em&gt;. This comes at the cost of higher latency (depending on how
tightly synchronized the system clocks of the various tablet servers are), but doesn’t require users
to propagate timestamps programmatically.&lt;/p&gt;
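&lt;p&gt;The role of the timestamp token can be sketched with another toy model in Python. Again, this is
an illustration under stated assumptions, not Kudu’s implementation or API: &lt;code&gt;TabletServer&lt;/code&gt;,
&lt;code&gt;Client&lt;/code&gt;, &lt;code&gt;observe&lt;/code&gt; and the integer clocks are hypothetical names, and the two servers
are given deliberately skewed clocks.&lt;/p&gt;

```python
class TabletServer:
    """Toy server (hypothetical) that assigns timestamps from a local clock."""

    def __init__(self, clock):
        self.clock = clock

    def write(self, propagated_ts=0):
        # Move past both the local clock and any timestamp the client carried.
        self.clock = max(self.clock, propagated_ts) + 1
        return self.clock


class Client:
    """Toy client (hypothetical) that keeps the last timestamp it has seen."""

    def __init__(self):
        self.last_ts = 0   # the "timestamp token"

    def observe(self, token):
        # Accept a token handed over by another client.
        self.last_ts = max(self.last_ts, token)

    def write(self, server):
        self.last_ts = server.write(self.last_ts)
        return self.last_ts


fast = TabletServer(clock=100)   # this server's clock runs ahead
slow = TabletServer(clock=10)    # this server's clock runs behind

a = Client()
b = Client()

ts_a = a.write(fast)          # A writes first
ts_unlinked = b.write(slow)   # B writes without A's token
b.observe(a.last_ts)          # A hands its token to B
ts_b = b.write(slow)          # B's causally later write

print(ts_a, ts_unlinked, ts_b)   # prints: 101 11 102
```

&lt;p&gt;Without the token, B’s write on the slower server receives a smaller timestamp than A’s earlier
write, so the timestamp order does not reflect the order in which the writes really happened. Once the token is
handed over, the write that follows is ordered after A’s. The burden of moving the token between
clients is exactly the coordination cost described above.&lt;/p&gt;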
&lt;h2 id=&quot;next-up&quot;&gt;Next up&lt;/h2&gt;
&lt;p&gt;In the following posts we’ll look into the components of Kudu’s architecture that come together
to enable the consistency semantics introduced in the previous section, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Transactions and the Transaction Driver&lt;/li&gt;
&lt;li&gt;Concurrent execution with Multi-Version Concurrency Control&lt;/li&gt;
&lt;li&gt;Exactly-Once semantics with Replay Cache&lt;/li&gt;
&lt;li&gt;Replication, Crash Recovery with Consensus and the Write-Ahead-Log&lt;/li&gt;
&lt;li&gt;Time keeping and timestamp assignment&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.3690&amp;amp;rep=rep1&amp;amp;type=pdf&quot;&gt;[1]&lt;/a&gt;: Armando Fox and Eric A. Brewer. 1999. Harvest, Yield, and Scalable Tolerant Systems. In Proceedings of the Seventh Workshop on Hot Topics in Operating Systems (HOTOS ‘99). IEEE Computer Society, Washington, DC, USA.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/ACID&quot;&gt;[2]&lt;/a&gt;: ACID - Wikipedia entry&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf&quot;&gt;[3]&lt;/a&gt;: Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon’s highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP ‘07). ACM, New York, NY, USA.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://research.google.com/archive/spanner-osdi2012.pdf&quot;&gt;[4]&lt;/a&gt;: James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. 2012. Spanner: Google’s globally-distributed database. In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, USA.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pdfs.semanticscholar.org/fafa/ebf830bc900bccc5e4fd508fd592f5581cbe.pdf&quot;&gt;[5]&lt;/a&gt;: Gifford, David K. Information storage in a decentralized computer system. Diss. Stanford University, 1981.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.doc.ic.ac.uk/~gbd10/aw590/Linearizability%20-%20A%20Correctness%20Condition%20for%20Concurrent%20Objects.pdf&quot;&gt;[6]&lt;/a&gt;: Herlihy, Maurice P., and Jeannette M. Wing. “Linearizability: A correctness condition for concurrent objects.” ACM Transactions on Programming Languages and Systems (TOPLAS) 12.3 (1990): 463-492.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.dtic.mil/get-tr-doc/pdf?AD=ADA078414&quot;&gt;[7]&lt;/a&gt;: Papadimitriou, Christos H. “The serializability of concurrent database updates.” Journal of the ACM (JACM) 26.4 (1979): 631-653.&lt;/p&gt;</content><author><name>David Alves</name></author><summary>In this series of short blog posts we will introduce Kudu’s consistency model,
its design and ultimate goals, current features, and next steps.
On the way, we’ll shed some light on the more relevant components and how they
fit together.
In Part 1 of the series (this one), we’ll cover motivation and design trade-offs, the end goals and
the current status.</summary></entry><entry><title>Apache Kudu 1.5.0 released</title><link href="/2017/09/08/apache-kudu-1-5-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.5.0 released" /><published>2017-09-08T00:00:00+02:00</published><updated>2017-09-08T00:00:00+02:00</updated><id>/2017/09/08/apache-kudu-1-5-0-released</id><content type="html" xml:base="/2017/09/08/apache-kudu-1-5-0-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.5.0!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.5.0 is a minor release which offers several new features,
improvements, optimizations, and bug fixes.&lt;/p&gt;
&lt;p&gt;Highlights include:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ul&gt;
&lt;li&gt;optimizations to improve write throughput and failover recovery times&lt;/li&gt;
&lt;li&gt;the Raft consensus implementation has been made more resilient and flexible
through “tombstoned voting”, which allows Kudu to self-heal in more edge-case
scenarios&lt;/li&gt;
&lt;li&gt;the number of threads used by Kudu servers has been further reduced, with
additional reductions planned for the future&lt;/li&gt;
&lt;li&gt;a new configuration dashboard on the web UI which provides a high-level
summary of important configuration values&lt;/li&gt;
&lt;li&gt;a new &lt;code&gt;kudu tablet move&lt;/code&gt; command which moves a tablet replica from one tablet
server to another&lt;/li&gt;
&lt;li&gt;a new &lt;code&gt;kudu local_replica data_size&lt;/code&gt; command which summarizes the space usage
of a local tablet&lt;/li&gt;
&lt;li&gt;all on-disk data is now checksummed by default, which provides error detection
for improved confidence when running Kudu on unreliable hardware&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above list of changes is non-exhaustive. Please refer to the
&lt;a href=&quot;/releases/1.5.0/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;
for an expanded list of important improvements, bug fixes, and
incompatible changes before upgrading.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.5.0/&quot;&gt;Kudu 1.5.0 source release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Convenience binary artifacts for the Java client and various Java
integrations (e.g. Spark, Flume) are also now available via the ASF Maven
repository.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Dan Burkert</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.5.0!
Apache Kudu 1.5.0 is a minor release which offers several new features,
improvements, optimizations, and bug fixes.
Highlights include:</summary></entry><entry><title>Apache Kudu 1.4.0 released</title><link href="/2017/06/13/apache-kudu-1-4-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.4.0 released" /><published>2017-06-13T00:00:00+02:00</published><updated>2017-06-13T00:00:00+02:00</updated><id>/2017/06/13/apache-kudu-1-4-0-released</id><content type="html" xml:base="/2017/06/13/apache-kudu-1-4-0-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.4.0!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.4.0 is a minor release which offers several new features,
improvements, optimizations, and bug fixes.&lt;/p&gt;
&lt;p&gt;Highlights include:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ul&gt;
&lt;li&gt;ability to alter storage attributes and default values for existing columns&lt;/li&gt;
&lt;li&gt;a new C++ client API to efficiently map primary keys to their associated partitions
and hosts&lt;/li&gt;
&lt;li&gt;support for long-running fault-tolerant scans in the Java client&lt;/li&gt;
&lt;li&gt;a new &lt;code&gt;kudu fs check&lt;/code&gt; command which can perform offline consistency checks
and repairs on the local on-disk storage of a Tablet Server or Master&lt;/li&gt;
&lt;li&gt;many optimizations to reduce disk space usage, improve write throughput,
and improve throughput of background maintenance operations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above list of changes is non-exhaustive. Please refer to the
&lt;a href=&quot;/releases/1.4.0/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;
for an expanded list of important improvements, bug fixes, and
incompatible changes before upgrading.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.4.0/&quot;&gt;Kudu 1.4.0 source release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Convenience binary artifacts for the Java client and various Java
integrations (e.g. Spark, Flume) are also now available via the ASF Maven
repository.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.4.0!
Apache Kudu 1.4.0 is a minor release which offers several new features,
improvements, optimizations, and bug fixes.
Highlights include:</summary></entry><entry><title>Apache Kudu 1.3.1 released</title><link href="/2017/04/19/apache-kudu-1-3-1-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.3.1 released" /><published>2017-04-19T00:00:00+02:00</published><updated>2017-04-19T00:00:00+02:00</updated><id>/2017/04/19/apache-kudu-1-3-1-released</id><content type="html" xml:base="/2017/04/19/apache-kudu-1-3-1-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.3.1!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.3.1 is a bug fix release which fixes critical issues discovered
in Apache Kudu 1.3.0. In particular, this fixes a bug in which data could be
incorrectly deleted after certain sequences of node failures. Several other
bugs are also fixed. See the release notes for details.&lt;/p&gt;
&lt;p&gt;Users of Kudu 1.3.0 are encouraged to upgrade to 1.3.1 immediately.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.3.1/&quot;&gt;Kudu 1.3.1 source release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Convenience binary artifacts for the Java client and various Java
integrations (e.g. Spark, Flume) are also now available via the ASF Maven
repository.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.3.1!
Apache Kudu 1.3.1 is a bug fix release which fixes critical issues discovered
in Apache Kudu 1.3.0. In particular, this fixes a bug in which data could be
incorrectly deleted after certain sequences of node failures. Several other
bugs are also fixed. See the release notes for details.
Users of Kudu 1.3.0 are encouraged to upgrade to 1.3.1 immediately.
Download the Kudu 1.3.1 source release
Convenience binary artifacts for the Java client and various Java
integrations (e.g. Spark, Flume) are also now available via the ASF Maven
repository.</summary></entry><entry><title>Apache Kudu 1.3.0 released</title><link href="/2017/03/20/apache-kudu-1-3-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.3.0 released" /><published>2017-03-20T00:00:00+01:00</published><updated>2017-03-20T00:00:00+01:00</updated><id>/2017/03/20/apache-kudu-1-3-0-released</id><content type="html" xml:base="/2017/03/20/apache-kudu-1-3-0-released.html">&lt;p&gt;The Apache Kudu team is happy to announce the release of Kudu 1.3.0!&lt;/p&gt;
&lt;p&gt;Apache Kudu 1.3 is a minor release which adds various new features,
improvements, bug fixes, and optimizations on top of Kudu
1.2. Highlights include:&lt;/p&gt;
&lt;!--more--&gt;
&lt;ul&gt;
&lt;li&gt;significantly improved support for security, including Kerberos
authentication, TLS encryption, and coarse-grained (cluster-level)
authorization&lt;/li&gt;
&lt;li&gt;automatic garbage collection of historical versions of data&lt;/li&gt;
&lt;li&gt;lower space consumption and better performance in default
configurations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above list of changes is non-exhaustive. Please refer to the
&lt;a href=&quot;/releases/1.3.0/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;
for an expanded list of important improvements, bug fixes, and
incompatible changes before upgrading.&lt;/p&gt;
&lt;p&gt;Thanks to the 25 developers who contributed code or documentation to
this release!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Download the &lt;a href=&quot;/releases/1.3.0/&quot;&gt;Kudu 1.3.0 source release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Convenience binary artifacts for the Java client and various Java
integrations (e.g. Spark, Flume) are also now available via the ASF Maven
repository.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Todd Lipcon</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.3.0!
Apache Kudu 1.3 is a minor release which adds various new features,
improvements, bug fixes, and optimizations on top of Kudu
1.2. Highlights include:</summary></entry></feed>