| <?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="http://jekyllrb.com" version="2.5.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2019-11-05T12:21:27-08:00</updated><id>/</id><entry><title>Apache Kudu 1.10.0 Released</title><link href="/2019/07/09/apache-kudu-1-10-0-release.html" rel="alternate" type="text/html" title="Apache Kudu 1.10.0 Released" /><published>2019-07-09T00:00:00-07:00</published><updated>2019-07-09T00:00:00-07:00</updated><id>/2019/07/09/apache-kudu-1-10-0-release</id><content type="html" xml:base="/2019/07/09/apache-kudu-1-10-0-release.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.10.0!</p> |
| |
| <p>The new release adds several new features and improvements, including the |
| following:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>Kudu now supports both full and incremental table backups via a job |
| implemented using Apache Spark, as well as restoring tables from full and |
| incremental backups via a corresponding Spark-based restore job. See the |
| <a href="/releases/1.10.0/docs/administration.html#backup">backup documentation</a> |
| for more details.</li> |
| <li>Kudu can now synchronize its internal catalog with the Apache Hive Metastore, |
| automatically updating Hive Metastore table entries upon table creation, |
| deletion, and alterations in Kudu. See the |
| <a href="/releases/1.10.0/docs/hive_metastore.html#metadata_sync">HMS synchronization documentation</a> |
| for more details.</li> |
| <li>Kudu now supports native fine-grained authorization via integration with |
| Apache Sentry. Kudu may now enforce access control policies defined for Kudu |
| tables and columns, as well as policies defined on Hive servers and databases |
| that may store Kudu tables. See the |
| <a href="/releases/1.10.0/docs/security.html#fine_grained_authz">authorization documentation</a> |
| for more details.</li> |
| <li>Kudu’s web UI now supports SPNEGO, a protocol for securing HTTP requests with |
| Kerberos by passing negotiation through HTTP headers. To enable, set the |
| <code>--webserver_require_spnego</code> command line flag.</li> |
| <li>Column comments can now be stored in Kudu tables, and can be updated using |
| the AlterTable API |
| (see <a href="https://issues.apache.org/jira/browse/KUDU-1711">KUDU-1711</a>).</li> |
| <li>The performance of mutations (i.e. UPDATE, DELETE, and re-INSERT) to |
| not-yet-flushed Kudu data has been significantly optimized |
| (see <a href="https://issues.apache.org/jira/browse/KUDU-2826">KUDU-2826</a> and |
| <a href="https://github.com/apache/kudu/commit/f9f9526d3">f9f9526d3</a>).</li> |
| <li>The performance of predicates on primitive columns, as well as of IS NULL |
| and IS NOT NULL predicates, has been optimized |
| (see <a href="https://issues.apache.org/jira/browse/KUDU-2846">KUDU-2846</a>).</li> |
| </ul> |
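<p>As an illustration, a full table backup runs as a Spark job. The jar path, master
address, backup root, and table name below are assumptions for the sketch; check the
backup documentation for the exact invocation matching your build:</p>

<pre><code>spark-submit --class org.apache.kudu.backup.KuduBackup \
  kudu-backup2_2.11-1.10.0.jar \
  --kuduMasterAddresses master-1:7051 \
  --rootPath hdfs:///kudu-backups \
  my_table
</code></pre>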
| |
| <p>The above is just a list of the highlights. For a more complete list of new |
| features, improvements, and fixes, please refer to the <a href="/releases/1.10.0/docs/release_notes.html">release |
| notes</a>.</p> |
| |
| <p>The Apache Kudu project only publishes source code releases. To build Kudu |
| 1.10.0, follow these steps:</p> |
| |
| <ul> |
| <li>Download the Kudu <a href="/releases/1.10.0">1.10.0 source release</a></li> |
| <li>Follow the instructions in the documentation to build Kudu <a href="/releases/1.10.0/docs/installation.html#build_from_source">1.10.0 from |
| source</a></li> |
| </ul> |
| |
| <p>For your convenience, binary JAR files for the Kudu Java client library, Spark |
| DataSource, Flume sink, and other Java integrations are published to the ASF |
| Maven repository and are <a href="https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.10.0">now |
| available</a>.</p> |
| |
| <p>The Python client source is also available on |
| <a href="https://pypi.org/project/kudu-python/">PyPI</a>.</p> |
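<p>For example, assuming a compatible Kudu C++ client library and build toolchain are
already present on the host (the package builds a native extension), the client can be
installed with pip:</p>

<pre><code>pip install kudu-python==1.10.0
</code></pre>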
| |
| <p>Additionally, experimental Docker images are published to |
| <a href="https://hub.docker.com/r/apache/kudu">Docker Hub</a>.</p></content><author><name>Grant Henke</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.10.0! |
| |
| The new release adds several new features and improvements, including the |
| following:</summary></entry><entry><title>Location Awareness in Kudu</title><link href="/2019/04/30/location-awareness.html" rel="alternate" type="text/html" title="Location Awareness in Kudu" /><published>2019-04-30T00:00:00-07:00</published><updated>2019-04-30T00:00:00-07:00</updated><id>/2019/04/30/location-awareness</id><content type="html" xml:base="/2019/04/30/location-awareness.html"><p>This post is about location awareness in Kudu. It gives an overview |
| of the following: |
| - principles of the design |
| - restrictions of the current implementation |
| - potential future enhancements and extensions</p> |
| |
| <!--more--> |
| |
| <h1 id="introduction">Introduction</h1> |
| <p>Kudu supports location awareness starting with the 1.9.0 release. The |
| initial implementation of location awareness in Kudu is built to satisfy the |
| following requirement:</p> |
| |
| <ul> |
| <li>In a Kudu cluster consisting of multiple servers spread over several racks, |
| place the replicas of a tablet in such a way that the tablet stays available |
| even if all the servers in a single rack become unavailable.</li> |
| </ul> |
| |
| <p>A rack failure can occur when a hardware component shared among servers in the |
| rack, such as a network switch or power supply, fails. More generally, |
| replace ‘rack’ with any other aggregation of nodes (e.g., chassis, site, |
| cloud availability zone, etc.) where some or all nodes in an aggregate become |
| unavailable in case of a failure. This even applies to a datacenter if the |
| network latency between datacenters is low. This is why we call the feature |
| <em>location awareness</em> and not <em>rack awareness</em>.</p> |
| |
| <h1 id="locations-in-kudu">Locations in Kudu</h1> |
| <p>In Kudu, a location is defined by a string that begins with a slash (<code>/</code>) and |
| consists of slash-separated tokens each of which contains only characters from |
| the set <code>[a-zA-Z0-9_-.]</code>. The components of the location string hierarchy |
| should correspond to the physical or cloud-defined hierarchy of the deployed |
| cluster, e.g. <code>/data-center-0/rack-09</code> or <code>/region-0/availability-zone-01</code>.</p> |
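<p>As a sketch, the grammar above can be checked with a small helper function. This is
not part of Kudu, just an illustration; note the <code>-</code> is placed last in the
character class so it is not parsed as a range:</p>

<pre><code>#!/bin/sh
# Succeeds iff the argument is a well-formed location string: a leading
# slash followed by slash-separated tokens of [a-zA-Z0-9_.-] characters.
is_valid_location() {
  printf '%s' "$1" | grep -Eq '^(/[a-zA-Z0-9_.-]+)+$'
}

if is_valid_location "/data-center-0/rack-09"; then echo "valid"; fi   # prints "valid"
if is_valid_location "data-center-0/rack-09"; then echo "valid"; else echo "rejected"; fi   # prints "rejected"
</code></pre>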
| |
| <p>The design choice of using hierarchical paths for location strings is |
| partially influenced by HDFS. The intention was to make it possible to reuse |
| the same locations as for existing HDFS nodes, because it’s common to deploy |
| Kudu alongside HDFS. In addition, the hierarchical structure of location |
| strings allows locations to be interpreted in terms of common ancestry and |
| relative proximity. As of now, Kudu does not exploit the hierarchical |
| structure of the location except for the client’s logic to find the closest |
| tablet server. However, we plan to leverage the hierarchical structure |
| in future releases.</p> |
| |
| <h1 id="defining-and-assigning-locations">Defining and assigning locations</h1> |
| <p>Kudu masters assign locations to tablet servers and clients.</p> |
| |
| <p>Every Kudu master runs the location assignment procedure to assign a location |
| to a tablet server when it registers. To determine the location for a tablet |
| server, the master invokes an executable that takes the IP address or hostname |
| of the tablet server and outputs the corresponding location string. If the |
| executable exits with a non-zero status, the master interprets that as an error |
| and logs a corresponding message. During tablet server registration, such an |
| outcome is treated as a registration failure, and the tablet server is not |
| added to the master’s registry. That renders the tablet server unusable to |
| Kudu clients, since unregistered tablet servers are not discoverable via the |
| <code>GetTableLocations</code> RPC.</p> |
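<p>The contract just described can be sketched as a toy mapping command (the hostnames
below are hypothetical): print exactly one location string and exit zero on success, or
exit with non-zero status so the master treats the lookup as a failure:</p>

<pre><code>#!/bin/sh
# Toy location mapping command: takes a tablet server hostname or IP
# address and prints its location; unknown hosts produce a non-zero exit.
assign_location() {
  case "$1" in
    ts0.example.com|ts1.example.com) echo "/dc0/rack00" ;;
    ts2.example.com|ts3.example.com) echo "/dc0/rack01" ;;
    *) return 1 ;;
  esac
}

assign_location ts2.example.com   # prints "/dc0/rack01"
</code></pre>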
| |
| <p>The master associates the produced location string with the registered tablet |
| server and keeps it until the tablet server re-registers, which only occurs |
| if the master or tablet server restarts. Masters use the assigned location |
| information internally to make replica placement decisions, trying to place |
| replicas evenly across locations and to keep tablets available in case all |
| tablet servers in a single location fail (see |
| <a href="https://s.apache.org/location-awareness-design">the design document</a> |
| for details). In addition, masters provide connected clients with |
| the information on the client’s assigned location, so the clients can make |
| informed decisions when they attempt to read from the closest tablet server. |
| Kudu tablet servers themselves are location agnostic, at least for now, |
| so the assigned location is not reported back to a registered tablet server.</p> |
| |
| <h1 id="the-location-aware-placement-policy-for-tablet-replicas-in-kudu">The location-aware placement policy for tablet replicas in Kudu</h1> |
| <p>While placing replicas of tablets in a location-aware cluster, Kudu uses a |
| best-effort approach to adhere to the following principle: |
| - Spread replicas across locations so that the failure of tablet servers |
| in one location does not make tablets unavailable.</p> |
| |
| <p>That’s referred to as the <em>replica placement policy</em> or just <em>placement policy</em>. |
| In Kudu, both the initial placement of tablet replicas and the automatic |
| re-replication are governed by that policy. As of now, that’s the only |
| replica placement policy available in Kudu. The placement policy isn’t |
| customizable and doesn’t have any configurable parameters.</p> |
| |
| <h1 id="automatic-re-replication-and-placement-policy">Automatic re-replication and placement policy</h1> |
| <p>By design, keeping the target replication factor for tablets has higher |
| priority than conforming to the replica placement policy. In other words, |
| when bringing up tablet replicas to replace failed ones, Kudu uses a best-effort |
| approach with regard to conforming to the constraints of the placement policy. |
| Essentially, that means that if there isn’t a way to place a replica to conform |
| with the placement policy, the system places the replica anyway. The resulting |
| violation of the placement policy can be addressed later on when unreachable |
| tablet servers become available again or the misconfiguration is addressed. |
| As of now, to fix the resulting placement policy violations it’s necessary |
| to run the CLI rebalancer tool manually (see below for details), |
| but in future releases that might be done <a href="https://issues.apache.org/jira/browse/KUDU-2780">automatically in the background</a>.</p> |
| |
| <h1 id="an-example-of-location-aware-rebalancing">An example of location-aware rebalancing</h1> |
| <p>This section illustrates what happens during each phase of the location-aware |
| rebalancing process.</p> |
| |
| <p>In the diagrams below, the larger outer boxes denote locations, and the |
| smaller inner ones denote tablet servers. As for the real-world objects behind |
| locations in this example, one might think of server racks with a shared power |
| supply or a shared network switch. It’s assumed that no more than one tablet |
| server is run at each node (i.e. machine) in a rack.</p> |
| |
| <p>The first phase of the rebalancing process is about detecting violations and |
| reinstating the placement policy in the cluster. In the diagram below, there |
| are three locations defined: <code>/L0</code>, <code>/L1</code>, <code>/L2</code>. Each location has two tablet |
| servers. Table <code>A</code> has a replication factor of three (RF=3) and consists of |
| four tablets: <code>A0</code>, <code>A1</code>, <code>A2</code>, <code>A3</code>. Table <code>B</code> has a replication factor of five |
| (RF=5) and consists of three tablets: <code>B0</code>, <code>B1</code>, <code>B2</code>.</p> |
| |
| <p>The distribution of the replicas for tablet <code>A0</code> violates the placement policy. |
| Why? Because replicas <code>A0.0</code> and <code>A0.1</code> constitute the majority of replicas |
| (two out of three) and reside in the same location <code>/L0</code>.</p> |
| |
| <pre><code>        /L0                  /L1                  /L2 |
| +-------------------+ +-------------------+ +-------------------+ |
| |   TS0      TS1    | |   TS2      TS3    | |   TS4      TS5    | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| | | A0.0 | | A0.1 | | | | A0.2 | |      | | | |      | |      | | |
| | |      | | A1.0 | | | | A1.1 | |      | | | | A1.2 | |      | | |
| | |      | | A2.0 | | | | A2.1 | |      | | | | A2.2 | |      | | |
| | |      | | A3.0 | | | | A3.1 | |      | | | | A3.2 | |      | | |
| | | B0.0 | | B0.1 | | | | B0.2 | | B0.3 | | | | B0.4 | |      | | |
| | | B1.0 | | B1.1 | | | | B1.2 | | B1.3 | | | | B1.4 | |      | | |
| | | B2.0 | | B2.1 | | | | B2.2 | | B2.3 | | | | B2.4 | |      | | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| +-------------------+ +-------------------+ +-------------------+ |
| </code></pre> |
| |
| <p>The location-aware rebalancer should initiate movement of either <code>A0.0</code> or |
| <code>A0.1</code> from <code>/L0</code> to another location, so the resulting replica distribution would |
| <em>not</em> contain the majority of replicas in any single location. In addition to |
| that, the rebalancer tool tries to evenly spread the load across all locations |
| and tablet servers within each location. The latter narrows down the list |
| of the candidate replicas to move: <code>A0.1</code> is the best candidate to move from |
| location <code>/L0</code>, so location <code>/L0</code> would not contain the majority of replicas |
| for tablet <code>A0</code>. The same principle dictates the target location and the target |
| tablet server to receive <code>A0.1</code>: that should be tablet server <code>TS5</code> in the |
| location <code>/L2</code>. The resulting distribution of the tablet replicas after the move |
| is represented in the diagram below.</p> |
| |
| <pre><code>        /L0                  /L1                  /L2 |
| +-------------------+ +-------------------+ +-------------------+ |
| |   TS0      TS1    | |   TS2      TS3    | |   TS4      TS5    | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| | | A0.0 | |      | | | | A0.2 | |      | | | |      | | A0.1 | | |
| | |      | | A1.0 | | | | A1.1 | |      | | | | A1.2 | |      | | |
| | |      | | A2.0 | | | | A2.1 | |      | | | | A2.2 | |      | | |
| | |      | | A3.0 | | | | A3.1 | |      | | | | A3.2 | |      | | |
| | | B0.0 | | B0.1 | | | | B0.2 | | B0.3 | | | | B0.4 | |      | | |
| | | B1.0 | | B1.1 | | | | B1.2 | | B1.3 | | | | B1.4 | |      | | |
| | | B2.0 | | B2.1 | | | | B2.2 | | B2.3 | | | | B2.4 | |      | | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| +-------------------+ +-------------------+ +-------------------+ |
| </code></pre> |
| |
| <p>The second phase of the location-aware rebalancing is about moving tablet |
| replicas across locations to make the locations’ load more balanced. For the |
| number <code>S</code> of tablet servers in a location and the total number <code>R</code> of replicas |
| in the location, the <em>load of the location</em> is defined as <code>R/S</code>.</p> |
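<p>For instance, under this definition a location hosting 10 replicas on 2 tablet
servers has a load of 5, while a location hosting 7 replicas on 2 tablet servers has a
load of 3.5, so the latter is the less loaded of the two. The arithmetic is easy to
check:</p>

<pre><code>awk 'BEGIN { printf "%g %g\n", 10 / 2, 7 / 2 }'   # prints "5 3.5"
</code></pre>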
| |
| <p>At this stage all violations of the placement policy are already rectified. The |
| rebalancer tool doesn’t attempt to make any moves which would violate the |
| placement policy.</p> |
| |
| <p>The load of the locations in the diagram above: |
| - <code>/L0</code>: 10/2 = 5 |
| - <code>/L1</code>: 10/2 = 5 |
| - <code>/L2</code>: 7/2 = 3.5</p> |
| |
| <p>A possible distribution of the tablet replicas after the second phase is |
| represented below. The resulting load of the locations: |
| - <code>/L0</code>: 9/2 = 4.5 |
| - <code>/L1</code>: 9/2 = 4.5 |
| - <code>/L2</code>: 9/2 = 4.5</p> |
| |
| <pre><code>        /L0                  /L1                  /L2 |
| +-------------------+ +-------------------+ +-------------------+ |
| |   TS0      TS1    | |   TS2      TS3    | |   TS4      TS5    | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| | | A0.0 | |      | | | | A0.2 | |      | | | |      | | A0.1 | | |
| | |      | | A1.0 | | | | A1.1 | |      | | | | A1.2 | |      | | |
| | |      | | A2.0 | | | | A2.1 | |      | | | | A2.2 | |      | | |
| | |      | | A3.0 | | | | A3.1 | |      | | | | A3.2 | |      | | |
| | | B0.0 | |      | | | | B0.2 | | B0.3 | | | | B0.4 | | B0.1 | | |
| | | B1.0 | | B1.1 | | | |      | | B1.3 | | | | B1.4 | | B1.2 | | |
| | | B2.0 | | B2.1 | | | | B2.2 | | B2.3 | | | | B2.4 | |      | | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| +-------------------+ +-------------------+ +-------------------+ |
| </code></pre> |
| |
| <p>The third phase of the location-aware rebalancing is about moving tablet |
| replicas within each location to make the distribution of replicas even, |
| both per-table and per-server.</p> |
| |
| <p>See below for a possible distribution of replicas in the example scenario |
| after the third phase of the location-aware rebalancing successfully completes.</p> |
| |
| <pre><code>        /L0                  /L1                  /L2 |
| +-------------------+ +-------------------+ +-------------------+ |
| |   TS0      TS1    | |   TS2      TS3    | |   TS4      TS5    | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| | | A0.0 | |      | | | |      | | A0.2 | | | |      | | A0.1 | | |
| | |      | | A1.0 | | | | A1.1 | |      | | | | A1.2 | |      | | |
| | |      | | A2.0 | | | | A2.1 | |      | | | | A2.2 | |      | | |
| | |      | | A3.0 | | | | A3.1 | |      | | | | A3.2 | |      | | |
| | | B0.0 | |      | | | | B0.2 | | B0.3 | | | | B0.4 | | B0.1 | | |
| | | B1.0 | | B1.1 | | | |      | | B1.3 | | | | B1.4 | | B1.2 | | |
| | | B2.0 | | B2.1 | | | | B2.2 | | B2.3 | | | |      | | B2.4 | | |
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ | |
| +-------------------+ +-------------------+ +-------------------+ |
| </code></pre> |
| |
| <h1 id="how-to-make-a-kudu-cluster-location-aware">How to make a Kudu cluster location-aware</h1> |
| <p>To make a Kudu cluster location-aware, it’s necessary to set the |
| <code>--location_mapping_cmd</code> flag for Kudu master(s) and make the corresponding |
| executable (a binary or a script) available on the nodes where Kudu masters run. |
| In case of multiple masters, it’s important to make sure that the location |
| mappings stay the same regardless of the node where the location assignment |
| command is running.</p> |
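<p>For example (the paths and addresses below are hypothetical), each master would be
started with the flag pointing at the mapping executable:</p>

<pre><code>kudu-master \
  --master_addresses=master-1:7051,master-2:7051,master-3:7051 \
  --location_mapping_cmd=/etc/kudu/bin/assign_location.sh
</code></pre>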
| |
| <p>It’s recommended to have at least three locations defined in a Kudu |
| cluster so that no location contains a majority of tablet replicas. |
| With two or fewer locations, it’s not possible to spread replicas |
| of tablets with a replication factor of three or higher such that no location |
| contains a majority of replicas.</p> |
| |
| <p>For example, when running a Kudu cluster in a single datacenter <code>dc0</code>, assign |
| location <code>/dc0/rack0</code> to tablet servers running at machines in the rack <code>rack0</code>, |
| <code>/dc0/rack1</code> to tablet servers running at machines in the rack <code>rack1</code>, |
| and <code>/dc0/rack2</code> to tablet servers running at machines in the rack <code>rack2</code>. |
| In a similar way, when running in the cloud, assign location <code>/regionA/az0</code> |
| to tablet servers running in availability zone <code>az0</code> of region <code>regionA</code>, |
| and <code>/regionA/az1</code> to tablet servers running in zone <code>az1</code> of the same region.</p> |
| |
| <h1 id="an-example-of-location-assignment-script-for-kudu">An example of location assignment script for Kudu</h1> |
| <pre><code>#!/bin/sh |
| # |
| # It's assumed a Kudu cluster consists of nodes with IPv4 addresses in the |
| # private 192.168.100.0/24 subnet. The nodes are hosted in racks, where |
| # each rack can contain at most 32 nodes. This results in 8 locations, |
| # one location per rack. |
| # |
| # This example script maps IP addresses into locations assuming that RPC |
| # endpoints of tablet servers are specified via IPv4 addresses. If tablet |
| # servers' RPC endpoints are specified using DNS hostnames (and that's how |
| # it's done by default), the script should consume a DNS hostname instead of |
| # an IP address as an input parameter. Check the `--rpc_bind_addresses` and |
| # `--rpc_advertised_addresses` command line flags of kudu-tserver for details. |
| # |
| # DISCLAIMER: |
| # This is an example Bourne shell script for Kudu location assignment. Please |
| # note it's just a toy script created for illustrative purposes only. |
| # The error handling and the input validation are minimalistic. Also, the |
| # network topology choice, supportability and capacity planning aspects of |
| # this script might be sub-optimal if applied as-is for real-world use cases. |
| |
| set -e |
| |
| if [ $# -ne 1 ]; then |
|   echo "usage: $0 &lt;ip_address&gt;" |
|   exit 1 |
| fi |
| |
| ip_address=$1 |
| shift |
| |
| suffix=${ip_address##192.168.100.} |
| if [ -z "${suffix##*.*}" ]; then |
|   # An IP address from a non-controlled subnet: maps into the 'other' location. |
|   echo "/other" |
|   exit 0 |
| fi |
| |
| # The mapping of the IP addresses within the controlled /24 subnet. |
| if [ -z "$suffix" -o "$suffix" -lt 0 -o "$suffix" -gt 255 ]; then |
|   echo "ERROR: '$ip_address' is not a valid IPv4 address" |
|   exit 2 |
| fi |
| |
| if [ "$suffix" -eq 0 -o "$suffix" -eq 255 ]; then |
|   echo "ERROR: '$ip_address' is the network or broadcast address of the subnet" |
|   exit 3 |
| fi |
| |
| if [ "$suffix" -lt 32 ]; then |
|   echo "/dc0/rack00" |
| elif [ "$suffix" -ge 32 -a "$suffix" -lt 64 ]; then |
|   echo "/dc0/rack01" |
| elif [ "$suffix" -ge 64 -a "$suffix" -lt 96 ]; then |
|   echo "/dc0/rack02" |
| elif [ "$suffix" -ge 96 -a "$suffix" -lt 128 ]; then |
|   echo "/dc0/rack03" |
| elif [ "$suffix" -ge 128 -a "$suffix" -lt 160 ]; then |
|   echo "/dc0/rack04" |
| elif [ "$suffix" -ge 160 -a "$suffix" -lt 192 ]; then |
|   echo "/dc0/rack05" |
| elif [ "$suffix" -ge 192 -a "$suffix" -lt 224 ]; then |
|   echo "/dc0/rack06" |
| else |
|   echo "/dc0/rack07" |
| fi |
| </code></pre> |
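<p>Assuming the example script above is saved as an executable file, its mapping can be
spot-checked from a shell. The condensed helper below (hypothetical, not part of Kudu)
reproduces the same rack arithmetic with integer division, skipping the script's
validation of the 0 and 255 host addresses:</p>

<pre><code>#!/bin/sh
# Each rack spans 32 consecutive host addresses, so rack index = suffix / 32.
rack_for() {
  suffix=${1##192.168.100.}
  case "$suffix" in
    *.*|"") echo "/other"; return 0 ;;  # outside the managed /24 subnet
  esac
  printf '/dc0/rack%02d\n' $((suffix / 32))
}

rack_for 192.168.100.1     # prints "/dc0/rack00"
rack_for 192.168.100.200   # prints "/dc0/rack06"
rack_for 10.0.0.1          # prints "/other"
</code></pre>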
| |
| <h1 id="reinstating-the-placement-policy-in-a-location-aware-kudu-cluster">Reinstating the placement policy in a location-aware Kudu cluster</h1> |
| <p>As explained earlier, even if the initial placement of tablet replicas conforms |
| to the placement policy, the cluster might get to a point where there are not |
| enough tablet servers to place a new or a replacement replica. Ideally, such |
| situations should be handled automatically: once there are enough tablet servers |
| in the cluster or the misconfiguration is fixed, the placement policy should |
| be reinstated. Currently, it’s possible to reinstate the placement policy using |
| the <code>kudu</code> CLI tool:</p> |
| |
| <p><code>sudo -u kudu kudu cluster rebalance &lt;master_rpc_endpoints&gt;</code></p> |
| |
| <p>In the first phase, the location-aware rebalancing process tries to |
| reestablish the placement policy. If that’s not possible, the tool |
| terminates. Use the <code>--disable_policy_fixer</code> flag to skip this phase and |
| continue to the cross-location rebalancing phase.</p> |
| |
| <p>The second phase is cross-location rebalancing, i.e. moving tablet replicas |
| between different locations in an attempt to spread tablet replicas among |
| locations evenly, equalizing the loads of locations throughout the cluster. |
| If the benefits of spreading the load among locations do not justify the cost |
| of the cross-location replica movement, the tool can be instructed to skip the |
| second phase of the location-aware rebalancing. Use the |
| <code>--disable_cross_location_rebalancing</code> command line flag for that.</p> |
| |
| <p>The third phase is intra-location rebalancing, i.e. balancing the distribution |
| of tablet replicas within each location as if each location is a cluster on its |
| own. Use the <code>--disable_intra_location_rebalancing</code> flag to skip this phase.</p> |
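<p>For instance (the master addresses below are hypothetical), to rebalance while
skipping the relatively expensive cross-location moves:</p>

<pre><code>sudo -u kudu kudu cluster rebalance \
  master-1:7051,master-2:7051,master-3:7051 \
  --disable_cross_location_rebalancing
</code></pre>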
| |
| <h1 id="future-work">Future work</h1> |
| <p>Having a CLI tool to reinstate placement policy is nice, but it would be great |
| to run the location-aware rebalancing in background, automatically reinstating |
| the placement policy and making tablet replica distribution even |
| across a Kudu cluster.</p> |
| |
| <p>In addition to that, there is an idea to make it possible to have |
| multiple customizable placement policies in the system. As of now, there is |
| a request to implement so-called ‘table pinning’, i.e. make it possible |
| to specify a placement policy where replicas of tablets of particular tables |
| are placed only at nodes within the specified locations. The table pinning |
| request is tracked in Apache JIRA; see |
| <a href="https://issues.apache.org/jira/browse/KUDU-2604">KUDU-2604</a>.</p> |
| |
| <h1 id="references">References</h1> |
| <p>[1] Location awareness in Kudu: <a href="https://github.com/apache/kudu/blob/master/docs/design-docs/location-awareness.md">design document</a></p> |
| |
| <p>[2] A proposal for Kudu tablet server labeling: <a href="https://issues.apache.org/jira/browse/KUDU-2604">KUDU-2604</a></p> |
| |
| <p>[3] Further improvement: <a href="https://issues.apache.org/jira/browse/KUDU-2780">automatic cluster rebalancing</a>.</p></content><author><name>Alexey Serbin</name></author><summary>This post is about location awareness in Kudu. It gives an overview |
| of the following: |
| - principles of the design |
| - restrictions of the current implementation |
| - potential future enhancements and extensions</summary></entry><entry><title>Fine-Grained Authorization with Apache Kudu and Impala</title><link href="/2019/04/22/fine-grained-authorization-with-apache-kudu-and-impala.html" rel="alternate" type="text/html" title="Fine-Grained Authorization with Apache Kudu and Impala" /><published>2019-04-22T00:00:00-07:00</published><updated>2019-04-22T00:00:00-07:00</updated><id>/2019/04/22/fine-grained-authorization-with-apache-kudu-and-impala</id><content type="html" xml:base="/2019/04/22/fine-grained-authorization-with-apache-kudu-and-impala.html"><p>Note: This is a cross-post from the Cloudera Engineering Blog |
| <a href="https://blog.cloudera.com/blog/2019/04/fine-grained-authorization-with-apache-kudu-and-impala/">Fine-Grained Authorization with Apache Kudu and Impala</a></p> |
| |
| <p>Apache Impala supports fine-grained authorization via Apache Sentry on all of the tables it |
| manages including Apache Kudu tables. Given Impala is a very common way to access the data stored |
| in Kudu, this capability allows users deploying Impala and Kudu to fully secure the Kudu data in |
| multi-tenant clusters even though Kudu does not yet have native fine-grained authorization of its |
| own. This solution works because Kudu natively supports coarse-grained (all or nothing) |
| authorization which enables blocking all access to Kudu directly except for the impala user and |
| an optional whitelist of other trusted users. This post will describe how to use Apache Impala’s |
| fine-grained authorization support along with Apache Kudu’s coarse-grained authorization to |
| achieve a secure multi-tenant deployment.</p> |
| |
| <!--more--> |
| |
| <h2 id="sample-workflow">Sample Workflow</h2> |
| |
| <p>The examples in this post enable a workflow that uses Apache Spark to ingest data directly into |
| Kudu and Impala to run analytic queries on that data. The Spark job, run as the <code>etl_service</code> user, |
| is permitted to access the Kudu data via coarse-grained authorization. Even though this gives |
| access to all the data in Kudu, the <code>etl_service</code> user is only used for scheduled jobs or by an |
| administrator. All queries on the data, from a wide array of users, will use Impala and leverage |
| Impala’s fine-grained authorization. Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_grant.html"><code>GRANT</code> statements</a> |
| allow you to flexibly control the privileges on the Kudu storage tables. Impala’s fine-grained |
| privileges along with support for |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_select.html"><code>SELECT</code></a>, |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_insert.html"><code>INSERT</code></a>, |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_update.html"><code>UPDATE</code></a>, |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_upsert.html"><code>UPSERT</code></a>, |
| and <a href="https://impala.apache.org/docs/build/html/topics/impala_delete.html"><code>DELETE</code></a> |
| statements, allow you to finely control who can read and write data to your Kudu tables while |
| using Impala. Below is a diagram showing the workflow described:</p> |
| |
| <p><img src="/img/fine-grained-authorization-with-apache-kudu.png" alt="png" class="img-responsive" /></p> |
| |
| <p><em>Note</em>: The examples below assume that Authorization has already been configured for Kudu, Impala, |
| and Spark. For help configuring authorization see the Cloudera |
| <a href="https://www.cloudera.com/documentation/enterprise/latest/topics/sg_auth_overview.html">authorization documentation</a>.</p> |
| |
| <h2 id="configuring-kudus-coarse-grained-authorization">Configuring Kudu’s Coarse-Grained Authorization</h2> |
| |
| <p>Kudu supports coarse-grained authorization of client requests based on the authenticated client |
| Kerberos principal. The two levels of access which can be configured are:</p> |
| |
| <ul> |
| <li><em>Superuser</em> – principals authorized as a superuser are able to perform certain administrative |
| functionality such as using the kudu command line tool to diagnose or repair cluster issues.</li> |
| <li><em>User</em> – principals authorized as a user are able to access and modify all data in the Kudu |
| cluster. This includes the ability to create, drop, and alter tables as well as read, insert, |
| update, and delete data.</li> |
| </ul> |
| |
| <p>Access levels are granted using whitelist-style Access Control Lists (ACLs), one for each of the |
| two levels. Each access control list either specifies a comma-separated list of users, or may be |
| set to <code>*</code> to indicate that all authenticated users are able to gain access at the specified level.</p> |
| |
| <p><em>Note</em>: The default value for the User ACL is <code>*</code>, which allows all users access to the cluster.</p> |
| |
| <h3 id="example-configuration">Example Configuration</h3> |
| |
| <p>The first and most important step is to remove the default ACL of <code>*</code> from Kudu’s |
| <a href="https://kudu.apache.org/docs/configuration_reference.html#kudu-master_user_acl"><code>--user_acl</code> configuration</a>. |
| This will ensure only the users you list will have access to the Kudu cluster. Then, to allow the |
| Impala service to access all of the data in Kudu, the Impala service user, usually impala, should |
| be added to the Kudu <code>--user_acl</code> configuration. Any user that is not using Impala will also need |
| to be added to this list. For example, an Apache Spark job might be used to load data directly |
| into Kudu. Generally, a single user is used to run scheduled jobs of applications that do not |
| support fine-grained authorization on their own. For this example, that user is <code>etl_service</code>. The |
| full <code>--user_acl</code> configuration is: |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">--user_acl<span class="o">=</span>impala,etl_service</code></pre></div> |
| |
| <p>For more details see the Kudu |
| <a href="https://kudu.apache.org/docs/security.html#_coarse_grained_authorization">authorization documentation</a>.</p> |
| |
| <h2 id="using-impalas-fine-grained-authorization">Using Impala’s Fine-Grained Authorization</h2> |
| |
| <p>Follow Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_authorization.html">authorization documentation</a> |
| to configure fine-grained authorization. Once configured, you can use Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_grant.html"><code>GRANT</code> statements</a> |
| to control the privileges of Kudu tables. These fine-grained privileges can be set at the database, |
| table, and column level. Additionally, you can individually control <code>SELECT</code>, <code>INSERT</code>, <code>CREATE</code>, |
| <code>ALTER</code>, and <code>DROP</code> privileges.</p> |
| |
| <p><em>Note</em>: A user needs the <code>ALL</code> privilege in order to run <code>DELETE</code>, <code>UPDATE</code>, or <code>UPSERT</code> |
| statements against a Kudu table.</p> |
| |
| <p>Below is a brief example with a couple of tables stored in Kudu:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">messages</span> |
| <span class="p">(</span> |
| <span class="n">name</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="n">time</span> <span class="k">TIMESTAMP</span><span class="p">,</span> |
| <span class="n">message</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">)</span> |
| <span class="p">)</span> |
| <span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">HASH</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="n">PARTITIONS</span> <span class="mi">4</span> |
| <span class="n">STORED</span> <span class="k">AS</span> <span class="n">KUDU</span><span class="p">;</span> |
| <span class="k">GRANT</span> <span class="k">ALL</span> <span class="k">ON</span> <span class="k">TABLE</span> <span class="n">messages</span> <span class="k">TO</span> <span class="n">userA</span><span class="p">;</span> |
| |
| <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">metrics</span> |
| <span class="p">(</span> |
| <span class="k">host</span> <span class="n">STRING</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span> |
| <span class="n">metric</span> <span class="n">STRING</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span> |
|   <span class="n">time</span> <span class="k">BIGINT</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span> |
| <span class="n">value</span> <span class="n">DOUBLE</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span> |
| <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">(</span><span class="k">host</span><span class="p">,</span> <span class="n">metric</span><span class="p">,</span> <span class="n">time</span><span class="p">)</span> |
| <span class="p">)</span> |
| <span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">HASH</span><span class="p">(</span><span class="k">host</span><span class="p">)</span> <span class="n">PARTITIONS</span> <span class="mi">4</span> |
| <span class="n">STORED</span> <span class="k">AS</span> <span class="n">KUDU</span><span class="p">;</span> |
| <span class="k">GRANT</span> <span class="k">ALL</span> <span class="k">ON</span> <span class="k">TABLE</span> <span class="n">metrics</span> <span class="k">TO</span> <span class="n">userB</span><span class="p">;</span></code></pre></div> |
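Privileges can also be narrowed below the table level. As a sketch (the role names `userC` and `userD` and the database name `telemetry` are hypothetical, not from this post), Impala supports column-level and database-level grants:

```sql
-- Hypothetical: restrict userC to reading only the host and time
-- columns of the metrics table (column-level grants support SELECT only).
GRANT SELECT(host, time) ON TABLE metrics TO userC;

-- Hypothetical: allow userD to read any table in a database named telemetry.
GRANT SELECT ON DATABASE telemetry TO userD;
```

With these statements, a query by `userC` touching any other column of `metrics` would be rejected by the authorization layer.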
| |
| <h2 id="conclusion">Conclusion</h2> |
| |
| <p>This brief example, which combines Kudu’s coarse-grained authorization with Impala’s |
| fine-grained authorization, should enable you to meet the security needs of your data workflow |
| today. The pattern described here can be applied to other services and workflows using Kudu as |
| well. For greater authorization flexibility, you can look forward to the near future when Kudu |
| supports native fine-grained authorization on its own. The Apache Kudu contributors understand the |
| importance of native fine-grained authorization and are working on integrations with |
| Apache Sentry and Apache Ranger.</p></content><author><name>Grant Henke</name></author><summary>Note: This is a cross-post from the Cloudera Engineering Blog |
| Fine-Grained Authorization with Apache Kudu and Impala |
| |
| Apache Impala supports fine-grained authorization via Apache Sentry on all of the tables it |
| manages including Apache Kudu tables. Given Impala is a very common way to access the data stored |
| in Kudu, this capability allows users deploying Impala and Kudu to fully secure the Kudu data in |
| multi-tenant clusters even though Kudu does not yet have native fine-grained authorization of its |
| own. This solution works because Kudu natively supports coarse-grained (all or nothing) |
| authorization which enables blocking all access to Kudu directly except for the impala user and |
| an optional whitelist of other trusted users. This post will describe how to use Apache Impala’s |
| fine-grained authorization support along with Apache Kudu’s coarse-grained authorization to |
| achieve a secure multi-tenant deployment.</summary></entry><entry><title>Testing Apache Kudu Applications on the JVM</title><link href="/2019/03/19/testing-apache-kudu-applications-on-the-jvm.html" rel="alternate" type="text/html" title="Testing Apache Kudu Applications on the JVM" /><published>2019-03-19T00:00:00-07:00</published><updated>2019-03-19T00:00:00-07:00</updated><id>/2019/03/19/testing-apache-kudu-applications-on-the-jvm</id><content type="html" xml:base="/2019/03/19/testing-apache-kudu-applications-on-the-jvm.html"><p>Note: This is a cross-post from the Cloudera Engineering Blog |
| <a href="https://blog.cloudera.com/blog/2019/03/testing-apache-kudu-applications-on-the-jvm/">Testing Apache Kudu Applications on the JVM</a></p> |
| |
| <p>Although the Kudu server is written in C++ for performance and efficiency, developers can write |
| client applications in C++, Java, or Python. To make it easier for Java developers to create |
| reliable client applications, we’ve added new utilities in Kudu 1.9.0 that allow you to write tests |
| using a Kudu cluster without needing to build Kudu yourself, without any knowledge of C++, |
| and without any complicated coordination around starting and stopping Kudu clusters for each test. |
| This post describes how the new testing utilities work and how you can use them in your application |
| tests.</p> |
| |
| <!--more--> |
| |
| <h2 id="user-guide">User Guide</h2> |
| |
| <p>Note: It is possible this blog post could become outdated – for the latest documentation on using |
| the JVM testing utilities see the |
| <a href="https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing">Kudu documentation</a>.</p> |
| |
| <h3 id="requirements">Requirements</h3> |
| |
| <p>In order to use the new testing utilities, the following requirements must be met:</p> |
| |
| <ul> |
| <li>OS |
| <ul> |
| <li>macOS El Capitan (10.11) or later</li> |
| <li>CentOS 6.6+, Ubuntu 14.04+, or another recent distribution of Linux |
| <a href="https://kudu.apache.org/docs/installation.html#_prerequisites_and_requirements">supported by Kudu</a></li> |
| </ul> |
| </li> |
| <li>JVM |
| <ul> |
| <li>Java 8+</li> |
|     <li>Note: Java 7 is deprecated, but still supported</li> |
| </ul> |
| </li> |
| <li>Build Tool |
| <ul> |
| <li>Maven 3.1 or later, required to support the |
| <a href="https://github.com/trustin/os-maven-plugin">os-maven-plugin</a></li> |
| <li>Gradle 2.1 or later, to support the |
| <a href="https://github.com/google/osdetector-gradle-plugin">osdetector-gradle-plugin</a></li> |
| <li>Any other build tool that can download the correct jar from Maven</li> |
| </ul> |
| </li> |
| </ul> |
| |
| <h3 id="build-configuration">Build Configuration</h3> |
| |
| <p>In order to use the Kudu testing utilities, add two dependencies to your classpath:</p> |
| |
| <ul> |
| <li>The <code>kudu-test-utils</code> dependency</li> |
| <li>The <code>kudu-binary</code> dependency</li> |
| </ul> |
| |
| <p>The <code>kudu-test-utils</code> dependency has useful utilities for testing applications that use Kudu. |
| Primarily, it provides the |
| <a href="https://github.com/apache/kudu/blob/master/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java">KuduTestHarness class</a> |
| to manage the lifecycle of a Kudu cluster for each test. The <code>KuduTestHarness</code> is a |
| <a href="https://junit.org/junit4/javadoc/4.12/org/junit/rules/TestRule.html">JUnit TestRule</a> |
| that not only starts and stops a Kudu cluster for each test, but also has methods to manage the |
| cluster and get pre-configured <code>KuduClient</code> instances for use while testing.</p> |
| |
| <p>The <code>kudu-binary</code> dependency contains the native Kudu (server and command-line tool) binaries for |
| the specified operating system. In order to download the right artifact for the running operating |
| system it is easiest to use a plugin, such as the |
| <a href="https://github.com/trustin/os-maven-plugin">os-maven-plugin</a> or |
| <a href="https://github.com/google/osdetector-gradle-plugin">osdetector-gradle-plugin</a>, to detect the |
| current runtime environment. The <code>KuduTestHarness</code> will automatically find and use the <code>kudu-binary</code> |
| jar on the classpath.</p> |
| |
| <p>WARNING: The <code>kudu-binary</code> module should only be used to run Kudu for integration testing purposes. |
| It should never be used to run an actual Kudu service, in production or development, because the |
| <code>kudu-binary</code> module includes native security-related dependencies that have been copied from the |
| build system and will not be patched when the operating system on the runtime host is patched.</p> |
| |
| <h4 id="maven-configuration">Maven Configuration</h4> |
| |
| <p>If you are using Maven to build your project, add the following entries to your project’s |
| <code>pom.xml</code> file:</p> |
| |
| <div class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt">&lt;build&gt;</span> |
| <span class="nt">&lt;extensions&gt;</span> |
| <span class="c">&lt;!-- Used to find the right kudu-binary artifact with the Maven</span> |
| <span class="c"> property ${os.detected.classifier} --&gt;</span> |
| <span class="nt">&lt;extension&gt;</span> |
| <span class="nt">&lt;groupId&gt;</span>kr.motd.maven<span class="nt">&lt;/groupId&gt;</span> |
| <span class="nt">&lt;artifactId&gt;</span>os-maven-plugin<span class="nt">&lt;/artifactId&gt;</span> |
| <span class="nt">&lt;version&gt;</span>1.6.2<span class="nt">&lt;/version&gt;</span> |
| <span class="nt">&lt;/extension&gt;</span> |
| <span class="nt">&lt;/extensions&gt;</span> |
| <span class="nt">&lt;/build&gt;</span> |
| |
| <span class="nt">&lt;dependencies&gt;</span> |
| <span class="nt">&lt;dependency&gt;</span> |
| <span class="nt">&lt;groupId&gt;</span>org.apache.kudu<span class="nt">&lt;/groupId&gt;</span> |
| <span class="nt">&lt;artifactId&gt;</span>kudu-test-utils<span class="nt">&lt;/artifactId&gt;</span> |
| <span class="nt">&lt;version&gt;</span>1.9.0<span class="nt">&lt;/version&gt;</span> |
| <span class="nt">&lt;scope&gt;</span>test<span class="nt">&lt;/scope&gt;</span> |
| <span class="nt">&lt;/dependency&gt;</span> |
| <span class="nt">&lt;dependency&gt;</span> |
| <span class="nt">&lt;groupId&gt;</span>org.apache.kudu<span class="nt">&lt;/groupId&gt;</span> |
| <span class="nt">&lt;artifactId&gt;</span>kudu-binary<span class="nt">&lt;/artifactId&gt;</span> |
| <span class="nt">&lt;version&gt;</span>1.9.0<span class="nt">&lt;/version&gt;</span> |
| <span class="nt">&lt;classifier&gt;</span>${os.detected.classifier}<span class="nt">&lt;/classifier&gt;</span> |
| <span class="nt">&lt;scope&gt;</span>test<span class="nt">&lt;/scope&gt;</span> |
| <span class="nt">&lt;/dependency&gt;</span> |
| <span class="nt">&lt;/dependencies&gt;</span></code></pre></div> |
| |
| <h4 id="gradle-configuration">Gradle Configuration</h4> |
| |
| <p>If you are using Gradle to build your project, add the following entries to your project’s |
| <code>build.gradle</code> file:</p> |
| |
| <div class="highlight"><pre><code class="language-groovy" data-lang="groovy"><span class="n">plugins</span> <span class="o">{</span> |
| <span class="c1">// Used to find the right kudu-binary artifact with the Gradle</span> |
| <span class="c1">// property ${osdetector.classifier}</span> |
| <span class="n">id</span> <span class="s2">&quot;com.google.osdetector&quot;</span> <span class="n">version</span> <span class="s2">&quot;1.6.2&quot;</span> |
| <span class="o">}</span> |
| |
| <span class="n">dependencies</span> <span class="o">{</span> |
| <span class="n">testCompile</span> <span class="s2">&quot;org.apache.kudu:kudu-test-utils:1.9.0&quot;</span> |
| <span class="n">testCompile</span> <span class="s2">&quot;org.apache.kudu:kudu-binary:1.9.0:${osdetector.classifier}&quot;</span> |
| <span class="o">}</span></code></pre></div> |
| |
| <h2 id="test-setup">Test Setup</h2> |
| |
| <p>Once your project is configured correctly, you can start writing tests using the <code>kudu-test-utils</code> |
| and <code>kudu-binary</code> artifacts. One line of code will ensure that each test automatically starts and |
| stops a real Kudu cluster and that cluster logging is output through <code>slf4j</code>:</p> |
| |
| <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="nd">@Rule</span> <span class="kd">public</span> <span class="n">KuduTestHarness</span> <span class="n">harness</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">KuduTestHarness</span><span class="o">();</span></code></pre></div> |
| |
| <p>The <a href="https://github.com/apache/kudu/blob/master/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java">KuduTestHarness</a> |
| has methods to get pre-configured clients, start and stop servers, and more. Below is an example |
| test to showcase some of the capabilities:</p> |
| |
| <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">import</span> <span class="nn">org.apache.kudu.*</span><span class="o">;</span> |
| <span class="kn">import</span> <span class="nn">org.apache.kudu.client.*</span><span class="o">;</span> |
| <span class="kn">import</span> <span class="nn">org.apache.kudu.test.KuduTestHarness</span><span class="o">;</span> |
| <span class="kn">import</span> <span class="nn">org.junit.*</span><span class="o">;</span> |
| |
| <span class="kn">import</span> <span class="nn">java.util.Arrays</span><span class="o">;</span> |
| <span class="kn">import</span> <span class="nn">java.util.Collections</span><span class="o">;</span> |
| <span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span> |
| |
| <span class="kd">public</span> <span class="kd">class</span> <span class="nc">MyKuduTest</span> <span class="o">{</span> |
| |
| <span class="nd">@Rule</span> |
| <span class="kd">public</span> <span class="n">KuduTestHarness</span> <span class="n">harness</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">KuduTestHarness</span><span class="o">();</span> |
| |
| <span class="nd">@Test</span> |
| <span class="kd">public</span> <span class="kt">void</span> <span class="nf">test</span><span class="o">()</span> <span class="kd">throws</span> <span class="n">Exception</span> <span class="o">{</span> |
| <span class="c1">// Get a KuduClient configured to talk to the running mini cluster.</span> |
| <span class="n">KuduClient</span> <span class="n">client</span> <span class="o">=</span> <span class="n">harness</span><span class="o">.</span><span class="na">getClient</span><span class="o">();</span> |
| |
| <span class="c1">// Some of the other most common KuduTestHarness methods include:</span> |
| <span class="n">AsyncKuduClient</span> <span class="n">asyncClient</span> <span class="o">=</span> <span class="n">harness</span><span class="o">.</span><span class="na">getAsyncClient</span><span class="o">();</span> |
|     <span class="n">String</span> <span class="n">masterAddresses</span> <span class="o">=</span> <span class="n">harness</span><span class="o">.</span><span class="na">getMasterAddressesAsString</span><span class="o">();</span> |
| <span class="n">List</span><span class="o">&lt;</span><span class="n">HostAndPort</span><span class="o">&gt;</span> <span class="n">masterServers</span> <span class="o">=</span> <span class="n">harness</span><span class="o">.</span><span class="na">getMasterServers</span><span class="o">();</span> |
| <span class="n">List</span><span class="o">&lt;</span><span class="n">HostAndPort</span><span class="o">&gt;</span> <span class="n">tabletServers</span> <span class="o">=</span> <span class="n">harness</span><span class="o">.</span><span class="na">getTabletServers</span><span class="o">();</span> |
| <span class="n">harness</span><span class="o">.</span><span class="na">killLeaderMasterServer</span><span class="o">();</span> |
| <span class="n">harness</span><span class="o">.</span><span class="na">killAllMasterServers</span><span class="o">();</span> |
| <span class="n">harness</span><span class="o">.</span><span class="na">startAllMasterServers</span><span class="o">();</span> |
| <span class="n">harness</span><span class="o">.</span><span class="na">killAllTabletServers</span><span class="o">();</span> |
| <span class="n">harness</span><span class="o">.</span><span class="na">startAllTabletServers</span><span class="o">();</span> |
| |
| <span class="c1">// Create a new Kudu table.</span> |
| <span class="n">String</span> <span class="n">tableName</span> <span class="o">=</span> <span class="s">&quot;myTable&quot;</span><span class="o">;</span> |
| <span class="n">Schema</span> <span class="n">schema</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">Schema</span><span class="o">(</span><span class="n">Arrays</span><span class="o">.</span><span class="na">asList</span><span class="o">(</span> |
| <span class="k">new</span> <span class="n">ColumnSchema</span><span class="o">.</span><span class="na">ColumnSchemaBuilder</span><span class="o">(</span><span class="s">&quot;key&quot;</span><span class="o">,</span> <span class="n">Type</span><span class="o">.</span><span class="na">INT32</span><span class="o">).</span><span class="na">key</span><span class="o">(</span><span class="kc">true</span><span class="o">).</span><span class="na">build</span><span class="o">(),</span> |
| <span class="k">new</span> <span class="n">ColumnSchema</span><span class="o">.</span><span class="na">ColumnSchemaBuilder</span><span class="o">(</span><span class="s">&quot;value&quot;</span><span class="o">,</span> <span class="n">Type</span><span class="o">.</span><span class="na">STRING</span><span class="o">).</span><span class="na">key</span><span class="o">(</span><span class="kc">true</span><span class="o">).</span><span class="na">build</span><span class="o">()</span> |
| <span class="o">));</span> |
| <span class="n">CreateTableOptions</span> <span class="n">opts</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">CreateTableOptions</span><span class="o">()</span> |
| <span class="o">.</span><span class="na">setRangePartitionColumns</span><span class="o">(</span><span class="n">Collections</span><span class="o">.</span><span class="na">singletonList</span><span class="o">(</span><span class="s">&quot;key&quot;</span><span class="o">));</span> |
| <span class="n">client</span><span class="o">.</span><span class="na">createTable</span><span class="o">(</span><span class="n">tableName</span><span class="o">,</span> <span class="n">schema</span><span class="o">,</span> <span class="n">opts</span><span class="o">);</span> |
| <span class="n">KuduTable</span> <span class="n">table</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">openTable</span><span class="o">(</span><span class="n">tableName</span><span class="o">);</span> |
| |
| <span class="c1">// Write a few rows to the table</span> |
| <span class="n">KuduSession</span> <span class="n">session</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">newSession</span><span class="o">();</span> |
| <span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">10</span><span class="o">;</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span> |
| <span class="n">Insert</span> <span class="n">insert</span> <span class="o">=</span> <span class="n">table</span><span class="o">.</span><span class="na">newInsert</span><span class="o">();</span> |
| <span class="n">PartialRow</span> <span class="n">row</span> <span class="o">=</span> <span class="n">insert</span><span class="o">.</span><span class="na">getRow</span><span class="o">();</span> |
| <span class="n">row</span><span class="o">.</span><span class="na">addInt</span><span class="o">(</span><span class="s">&quot;key&quot;</span><span class="o">,</span> <span class="n">i</span><span class="o">);</span> |
| <span class="n">row</span><span class="o">.</span><span class="na">addString</span><span class="o">(</span><span class="s">&quot;value&quot;</span><span class="o">,</span> <span class="n">String</span><span class="o">.</span><span class="na">valueOf</span><span class="o">(</span><span class="n">i</span><span class="o">));</span> |
| <span class="n">session</span><span class="o">.</span><span class="na">apply</span><span class="o">(</span><span class="n">insert</span><span class="o">);</span> |
| <span class="o">}</span> |
| <span class="n">session</span><span class="o">.</span><span class="na">close</span><span class="o">();</span> |
| |
| <span class="c1">// ... Continue the test. Read and validate the rows, alter the table, etc.</span> |
| <span class="o">}</span> |
| <span class="o">}</span></code></pre></div> |
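As a sketch of how the test above might continue, the rows just written can be read back and validated with a `KuduScanner`. This fragment assumes the `client` and `table` variables from the example test and is not a standalone program:

```java
// Scan the table and validate the rows inserted above.
KuduScanner scanner = client.newScannerBuilder(table).build();
int rowCount = 0;
while (scanner.hasMoreRows()) {
  for (RowResult result : scanner.nextRows()) {
    // Each row's value column mirrors its key, e.g. key 3 -> value "3".
    Assert.assertEquals(String.valueOf(result.getInt("key")),
                        result.getString("value"));
    rowCount++;
  }
}
Assert.assertEquals(10, rowCount);
```

Because the harness tears the cluster down after each test, this validation belongs in the same test method as the writes.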
| |
| <p>For a complete example of a project using the <code>KuduTestHarness</code>, see the |
| <a href="https://github.com/apache/kudu/tree/master/examples/java/java-example">java-example</a> project in |
| the Kudu source code repository. The Kudu project itself uses the <code>KuduTestHarness</code> for all of its |
| own integration tests. For more complex examples, you can explore the various |
| <a href="https://github.com/apache/kudu/tree/master/java/kudu-client/src/test/java/org/apache/kudu/client">Kudu integration</a> |
| tests in the Kudu source code repository.</p> |
| |
| <h2 id="feedback">Feedback</h2> |
| |
| <p>Kudu 1.9.0 is the first release to have these testing utilities available. Although these utilities |
| simplify testing of Kudu applications, there is always room for improvement. |
| Please report any issues, ideas, or feedback to the Kudu user mailing list, Jira, or Slack channel |
| and we will try to incorporate your feedback quickly. See the |
| <a href="https://kudu.apache.org/community.html">Kudu community page</a> for details.</p> |
| |
| <h2 id="thank-you">Thank You</h2> |
| |
| <p>We would like to give a special thank you to everyone who helped contribute to the <code>kudu-test-utils</code> |
| and <code>kudu-binary</code> artifacts. We would especially like to thank |
| <a href="https://www.linkedin.com/in/brian-mcdevitt-1136a08/">Brian McDevitt</a> at <a href="https://www.phdata.io/">phData</a> |
| and |
| <a href="https://twitter.com/timrobertson100">Tim Robertson</a> at <a href="https://www.gbif.org/">GBIF</a> who helped us |
| tremendously.</p></content><author><name>Grant Henke &amp; Mike Percy</name></author><summary>Note: This is a cross-post from the Cloudera Engineering Blog |
| Testing Apache Kudu Applications on the JVM |
| |
| Although the Kudu server is written in C++ for performance and efficiency, developers can write |
| client applications in C++, Java, or Python. To make it easier for Java developers to create |
| reliable client applications, we’ve added new utilities in Kudu 1.9.0 that allow you to write tests |
| using a Kudu cluster without needing to build Kudu yourself, without any knowledge of C++, |
| and without any complicated coordination around starting and stopping Kudu clusters for each test. |
| This post describes how the new testing utilities work and how you can use them in your application |
| tests.</summary></entry><entry><title>Apache Kudu 1.9.0 Released</title><link href="/2019/03/15/apache-kudu-1-9-0-release.html" rel="alternate" type="text/html" title="Apache Kudu 1.9.0 Released" /><published>2019-03-15T00:00:00-07:00</published><updated>2019-03-15T00:00:00-07:00</updated><id>/2019/03/15/apache-kudu-1-9-0-release</id><content type="html" xml:base="/2019/03/15/apache-kudu-1-9-0-release.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.9.0!</p> |
| |
| <p>The new release adds several new features and improvements, including the |
| following:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>Added support for location awareness for placement of tablet replicas.</li> |
| <li>Introduced docker scripts to facilitate building and running Kudu on various |
| operating systems.</li> |
| <li>Introduced an experimental feature to allow users to run tests against a Kudu |
| mini cluster without having to first locally build or install Kudu.</li> |
| <li>Updated the compaction policy to favor reducing the number of rowsets, which |
| can lead to significantly faster scans and bootup times in certain workloads.</li> |
| <li>Multiple tooling enhancements have been made to improve visibility into Kudu |
| tables.</li> |
| </ul> |
| |
| <p>The above is just a list of the highlights; for a more complete list of new |
| features, improvements, and fixes please refer to the <a href="/releases/1.9.0/docs/release_notes.html">release |
| notes</a>.</p> |
| |
| <p>The Apache Kudu project only publishes source code releases. To build Kudu |
| 1.9.0, follow these steps:</p> |
| |
| <ul> |
| <li>Download the Kudu <a href="/releases/1.9.0">1.9.0 source release</a></li> |
| <li>Follow the instructions in the documentation to build Kudu <a href="/releases/1.9.0/docs/installation.html#build_from_source">1.9.0 from |
| source</a></li> |
| </ul> |
| |
| <p>For your convenience, binary JAR files for the Kudu Java client library, Spark |
| DataSource, Flume sink, and other Java integrations are published to the ASF |
| Maven repository and are <a href="https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.kudu%22%20AND%20v%3A%221.9.0%22">now |
| available</a>.</p> |
| |
| <p>The Python client source is also available on |
| <a href="https://pypi.org/project/kudu-python/">PyPI</a>.</p></content><author><name>Andrew Wong</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.9.0! |
| |
| The new release adds several new features and improvements, including the |
| following:</summary></entry><entry><title>Transparent Hierarchical Storage Management with Apache Kudu and Impala</title><link href="/2019/03/05/transparent-hierarchical-storage-management-with-apache-kudu-and-impala.html" rel="alternate" type="text/html" title="Transparent Hierarchical Storage Management with Apache Kudu and Impala" /><published>2019-03-05T00:00:00-08:00</published><updated>2019-03-05T00:00:00-08:00</updated><id>/2019/03/05/transparent-hierarchical-storage-management-with-apache-kudu-and-impala</id><content type="html" xml:base="/2019/03/05/transparent-hierarchical-storage-management-with-apache-kudu-and-impala.html"><p>Note: This is a cross-post from the Cloudera Engineering Blog |
| <a href="https://blog.cloudera.com/blog/2019/03/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/">Transparent Hierarchical Storage Management with Apache Kudu and Impala</a></p> |
| |
| <p>When picking a storage option for an application, it is common to choose the single |
| option whose features best fit your use case. For mutability |
| and real-time analytics workloads you may want to use Apache Kudu, but for massive |
| scalability at a low cost you may want to use HDFS. For that reason, there is a need |
| for a solution that allows you to leverage the best features of multiple storage |
| options. This post describes the sliding window pattern using Apache Impala with data |
| stored in Apache Kudu and Apache HDFS. With this pattern you get all of the benefits |
| of multiple storage layers in a way that is transparent to users.</p> |
| |
| <!--more--> |
| |
| <p>Apache Kudu is designed for fast analytics on rapidly changing data. Kudu provides a |
| combination of fast inserts/updates and efficient columnar scans to enable multiple |
| real-time analytic workloads across a single storage layer. For that reason, Kudu fits |
| well into a data pipeline as the place to store real-time data that needs to be |
| queryable immediately. Additionally, Kudu supports updating and deleting rows in |
| real time, which allows for late-arriving data and data correction.</p> |
| |
| <p>Apache HDFS is designed to allow for limitless scalability at a low cost. It is |
| optimized for batch oriented use cases where data is immutable. When paired with the |
| Apache Parquet file format, structured data can be accessed with extremely high |
| throughput and efficiency.</p> |
| |
| <p>For situations in which the data is small and ever-changing, like dimension tables, |
| it is common to keep all of the data in Kudu. It is even common to keep large tables |
| in Kudu when the data fits within Kudu’s |
| <a href="https://kudu.apache.org/docs/known_issues.html#_scale">scaling limits</a> and can benefit |
| from Kudu’s unique features. In cases where the data is massive, batch oriented, and |
| unlikely to change, storing the data in HDFS using the Parquet format is preferred. |
| When you need the benefits of both storage layers, the sliding window pattern is a |
| useful solution.</p> |
| |
| <h2 id="the-sliding-window-pattern">The Sliding Window Pattern</h2> |
| |
| <p>In this pattern, matching Kudu and Parquet formatted HDFS tables are created in Impala. |
| These tables are partitioned by a unit of time based on how frequently the data is |
| moved between the Kudu and HDFS tables. It is common to use daily, monthly, or yearly |
| partitions. A unified view is created and a <code>WHERE</code> clause is used to define a boundary |
| that separates which data is read from the Kudu table and which is read from the HDFS |
| table. The defined boundary is important so that you can move data between Kudu and |
| HDFS without exposing duplicate records to the view. Once the data is moved, an atomic |
| <code>ALTER VIEW</code> statement can be used to move the boundary forward.</p> |
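As a sketch (the table, view, and column names here are illustrative, not from this post), the unified view and its moving boundary might look like the following in Impala:

```sql
-- A unified view hides the storage split from users; the WHERE
-- clauses define the boundary between the Kudu and Parquet tables.
CREATE VIEW my_events_view AS
SELECT name, time, message FROM my_events_kudu
WHERE time >= '2019-02-01'
UNION ALL
SELECT name, time, message FROM my_events_parquet
WHERE time < '2019-02-01';

-- After offloading January's data to HDFS, atomically move the
-- boundary forward so no duplicate rows are exposed.
ALTER VIEW my_events_view AS
SELECT name, time, message FROM my_events_kudu
WHERE time >= '2019-03-01'
UNION ALL
SELECT name, time, message FROM my_events_parquet
WHERE time < '2019-03-01';
```

Because the `ALTER VIEW` is a single metadata operation, readers of the view never see a window where a row appears in both tables.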
| |
| <p><img src="/img/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/sliding-window-pattern.png" alt="png" class="img-responsive" /></p> |
| |
| <p>Note: This pattern works best with somewhat sequential data organized into range |
| partitions, because having a sliding window of time and dropping partitions is very |
| efficient.</p> |
| |
| <p>This pattern results in a sliding window of time where mutable data is stored in Kudu |
| and immutable data is stored in the Parquet format on HDFS. Leveraging both Kudu and |
| HDFS via Impala provides the benefits of both storage systems:</p> |
| |
| <ul> |
| <li>Streaming data is immediately queryable</li> |
| <li>Updates for late arriving data or manual corrections can be made</li> |
<li>Data stored in HDFS is optimally sized, increasing performance and preventing small files</li> |
| <li>Reduced cost</li> |
| </ul> |
| |
| <p>Impala also supports cloud storage options such as |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_s3.html">S3</a> and |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_adls.html">ADLS</a>. |
| This capability allows convenient access to a storage system that is remotely managed, |
| accessible from anywhere, and integrated with various cloud-based services. Because |
| this data is remote, queries against S3 data are less performant, making S3 suitable |
| for holding “cold” data that is only queried occasionally. This pattern can be |
| extended to use cloud storage for cold data by creating a third matching table and |
| adding another boundary to the unified view.</p> |
| |
| <p><img src="/img/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/sliding-window-pattern-cold.png" alt="png" class="img-responsive" /></p> |
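<p>As a sketch, a three-tier unified view over hypothetical <code>bar_kudu</code>, |
<code>bar_parquet</code>, and <code>bar_s3</code> tables might look like the following. The |
table names, columns, and boundary dates here are illustrative assumptions, not part |
of the example implementation below:</p> |
 |
<div class="highlight"><pre><code class="language-sql" data-lang="sql">CREATE VIEW bar_view AS |
SELECT name, time, message FROM bar_kudu |
WHERE time &gt;= "2018-06-01"   -- hot, mutable data in Kudu |
UNION ALL |
SELECT name, time, message FROM bar_parquet |
WHERE time &gt;= "2018-01-01" AND time &lt; "2018-06-01"   -- warm data on HDFS |
UNION ALL |
SELECT name, time, message FROM bar_s3 |
WHERE time &lt; "2018-01-01";   -- cold data in cloud storage</code></pre></div> |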
| |
| <p>Note: For simplicity only Kudu and HDFS are illustrated in the examples below.</p> |
| |
| <p>The process for moving data from Kudu to HDFS is broken into two phases. The first |
| phase is the data migration, and the second phase is the metadata change. These |
| ongoing steps should be scheduled to run automatically on a regular basis.</p> |
| |
| <p>In the first phase, the now immutable data is copied from Kudu to HDFS. Even though |
| data is duplicated from Kudu into HDFS, the boundary defined in the view will prevent |
| duplicate data from being shown to users. This step can include any validation and |
| retries as needed to ensure the data offload is successful.</p> |
| |
| <p><img src="/img/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/phase-1.png" alt="png" class="img-responsive" /></p> |
| |
| <p>In the second phase, now that the data is safely copied to HDFS, the metadata is |
| changed to adjust how the offloaded partition is exposed. This includes shifting |
| the boundary forward, adding a new Kudu partition for the next period, and dropping |
| the old Kudu partition.</p> |
| |
| <p><img src="/img/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/phase-2.png" alt="png" class="img-responsive" /></p> |
| |
| <h2 id="building-blocks">Building Blocks</h2> |
| |
<p>Implementing the sliding window pattern relies on a few Impala fundamentals. Each |
fundamental building block of the pattern is described below.</p> |
| |
| <h3 id="moving-data">Moving Data</h3> |
| |
<p>Moving data among storage systems via Impala is straightforward provided you have |
matching tables defined in each of the storage formats. To keep this post |
brief, not all of the options available when creating an Impala table are described. |
| However, Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_create_table.html">CREATE TABLE documentation</a> |
| can be referenced to find the correct syntax for Kudu, HDFS, and cloud storage tables. |
| A few examples are shown further below where the sliding window pattern is illustrated.</p> |
| |
| <p>Once the tables are created, moving the data is as simple as an |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_insert.html">INSERT…SELECT</a> statement:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">table_foo</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">table_bar</span><span class="p">;</span></code></pre></div> |
| |
| <p>All of the features of the |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_select.html">SELECT</a> |
| statement can be used to select the specific data you would like to move.</p> |
| |
| <p>Note: If moving data to Kudu, an <code>UPSERT INTO</code> statement can be used to handle |
| duplicate keys.</p> |
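<p>For example, moving data back into Kudu with the same <code>table_foo</code> and |
<code>table_bar</code> placeholders as above, assuming <code>table_foo</code> is a Kudu table, |
could be sketched as:</p> |
 |
<div class="highlight"><pre><code class="language-sql" data-lang="sql">UPSERT INTO table_foo |
SELECT * FROM table_bar;</code></pre></div> |
 |
<p>Rows whose primary keys already exist in <code>table_foo</code> are updated in place |
rather than rejected.</p> |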
| |
| <h3 id="unified-querying">Unified Querying</h3> |
| |
<p>Querying data from multiple tables and data sources in Impala is also straightforward. |
For the sake of brevity, not all of the options available when creating an Impala view |
are described. However, see Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_create_view.html">CREATE VIEW documentation</a> |
| for more in-depth details.</p> |
| |
| <p>Creating a view for unified querying is as simple as a <code>CREATE VIEW</code> statement using |
| two <code>SELECT</code> clauses combined with a <code>UNION ALL</code>:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">VIEW</span> <span class="n">foo_view</span> <span class="k">AS</span> |
| <span class="k">SELECT</span> <span class="n">col1</span><span class="p">,</span> <span class="n">col2</span><span class="p">,</span> <span class="n">col3</span> <span class="k">FROM</span> <span class="n">foo_parquet</span> |
| <span class="k">UNION</span> <span class="k">ALL</span> |
| <span class="k">SELECT</span> <span class="n">col1</span><span class="p">,</span> <span class="n">col2</span><span class="p">,</span> <span class="n">col3</span> <span class="k">FROM</span> <span class="n">foo_kudu</span><span class="p">;</span></code></pre></div> |
| |
<p>WARNING: Be sure to use <code>UNION ALL</code> and not <code>UNION</code>. The <code>UNION</code> keyword by itself |
is the same as <code>UNION DISTINCT</code> and can have a significant performance impact. |
| More information can be found in the Impala |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_union.html">UNION documentation</a>.</p> |
| |
| <p>All of the features of the |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_select.html">SELECT</a> |
| statement can be used to expose the correct data and columns from each of the |
underlying tables. It is important to use the <code>WHERE</code> clause to pass through and |
push down any predicates that need special handling or transformations. More examples |
| will follow below in the discussion of the sliding window pattern.</p> |
| |
| <p>Additionally, views can be altered via the |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_alter_view.html">ALTER VIEW</a> |
| statement. This is useful when combined with the <code>SELECT</code> statement because it can be |
| used to atomically update what data is being accessed by the view.</p> |
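<p>For example, the <code>foo_view</code> defined above could be atomically redefined to read |
rows on either side of a boundary. This is only a sketch: it assumes <code>col1</code> holds |
the time value, and the boundary date is illustrative:</p> |
 |
<div class="highlight"><pre><code class="language-sql" data-lang="sql">ALTER VIEW foo_view AS |
SELECT col1, col2, col3 FROM foo_kudu |
WHERE col1 &gt;= "2018-02-01" |
UNION ALL |
SELECT col1, col2, col3 FROM foo_parquet |
WHERE col1 &lt; "2018-02-01";</code></pre></div> |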
| |
| <h2 id="an-example-implementation">An Example Implementation</h2> |
| |
| <p>Below are sample steps to implement the sliding window pattern using a monthly period |
| with three months of active mutable data. Data older than three months will be |
| offloaded to HDFS using the Parquet format.</p> |
| |
| <h3 id="create-the-kudu-table">Create the Kudu Table</h3> |
| |
| <p>First, create a Kudu table which will hold three months of active mutable data. |
| The table is range partitioned by the time column with each range containing one |
| period of data. It is important to have partitions that match the period because |
dropping Kudu partitions is much more efficient than removing the data via the |
<code>DELETE</code> statement. The table is also hash partitioned by the other key column to ensure |
| that all of the data is not written to a single partition.</p> |
| |
| <p>Note: Your schema design should vary based on your data and read/write performance |
| considerations. This example schema is intended for demonstration purposes and not as |
| an “optimal” schema. See the |
| <a href="https://kudu.apache.org/docs/schema_design.html">Kudu schema design documentation</a> |
| for more guidance on choosing your schema. For example, you may not need any hash |
| partitioning if your |
| data input rate is low. Alternatively, you may need more hash buckets if your data |
| input rate is very high.</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">my_table_kudu</span> |
| <span class="p">(</span> |
| <span class="n">name</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="n">time</span> <span class="k">TIMESTAMP</span><span class="p">,</span> |
| <span class="n">message</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">)</span> |
| <span class="p">)</span> |
| <span class="n">PARTITION</span> <span class="k">BY</span> |
| <span class="n">HASH</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="n">PARTITIONS</span> <span class="mi">4</span><span class="p">,</span> |
| <span class="n">RANGE</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> <span class="p">(</span> |
| <span class="n">PARTITION</span> <span class="s1">&#39;2018-01-01&#39;</span> <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="s1">&#39;2018-02-01&#39;</span><span class="p">,</span> <span class="c1">--January</span> |
| <span class="n">PARTITION</span> <span class="s1">&#39;2018-02-01&#39;</span> <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="s1">&#39;2018-03-01&#39;</span><span class="p">,</span> <span class="c1">--February</span> |
| <span class="n">PARTITION</span> <span class="s1">&#39;2018-03-01&#39;</span> <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="s1">&#39;2018-04-01&#39;</span><span class="p">,</span> <span class="c1">--March</span> |
| <span class="n">PARTITION</span> <span class="s1">&#39;2018-04-01&#39;</span> <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="s1">&#39;2018-05-01&#39;</span> <span class="c1">--April</span> |
| <span class="p">)</span> |
| <span class="n">STORED</span> <span class="k">AS</span> <span class="n">KUDU</span><span class="p">;</span></code></pre></div> |
| |
| <p>Note: There is an extra month partition to provide a buffer of time for the data to |
| be moved into the immutable table.</p> |
| |
| <h3 id="create-the-hdfs-table">Create the HDFS Table</h3> |
| |
| <p>Create the matching Parquet formatted HDFS table which will hold the older immutable |
| data. This table is partitioned by year, month, and day for efficient access even |
| though you can’t partition by the time column itself. This is addressed further in |
| the view step below. See Impala’s |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_partitioning.html">partitioning documentation</a> |
| for more details.</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">my_table_parquet</span> |
| <span class="p">(</span> |
| <span class="n">name</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="n">time</span> <span class="k">TIMESTAMP</span><span class="p">,</span> |
| <span class="n">message</span> <span class="n">STRING</span> |
| <span class="p">)</span> |
| <span class="n">PARTITIONED</span> <span class="k">BY</span> <span class="p">(</span><span class="k">year</span> <span class="nb">int</span><span class="p">,</span> <span class="k">month</span> <span class="nb">int</span><span class="p">,</span> <span class="k">day</span> <span class="nb">int</span><span class="p">)</span> |
| <span class="n">STORED</span> <span class="k">AS</span> <span class="n">PARQUET</span><span class="p">;</span></code></pre></div> |
| |
| <h3 id="create-the-unified-view">Create the Unified View</h3> |
| |
| <p>Now create the unified view which will be used to query all of the data seamlessly:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">VIEW</span> <span class="n">my_table_view</span> <span class="k">AS</span> |
| <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">,</span> <span class="n">message</span> |
| <span class="k">FROM</span> <span class="n">my_table_kudu</span> |
| <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&gt;=</span> <span class="ss">&quot;2018-01-01&quot;</span> |
| <span class="k">UNION</span> <span class="k">ALL</span> |
| <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">,</span> <span class="n">message</span> |
| <span class="k">FROM</span> <span class="n">my_table_parquet</span> |
| <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&lt;</span> <span class="ss">&quot;2018-01-01&quot;</span> |
| <span class="k">AND</span> <span class="k">year</span> <span class="o">=</span> <span class="k">year</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> |
| <span class="k">AND</span> <span class="k">month</span> <span class="o">=</span> <span class="k">month</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> |
| <span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="k">day</span><span class="p">(</span><span class="n">time</span><span class="p">);</span></code></pre></div> |
| |
| <p>Each <code>SELECT</code> clause explicitly lists all of the columns to expose. This ensures that |
| the year, month, and day columns that are unique to the Parquet table are not exposed. |
| If needed, it also allows any necessary column or type mapping to be handled.</p> |
| |
<p>The initial <code>WHERE</code> clauses applied to both <code>my_table_kudu</code> and <code>my_table_parquet</code> define |
the boundary between Kudu and HDFS to ensure duplicate data is not read while in the |
process of offloading data.</p> |
| |
<p>The additional <code>AND</code> clauses applied to <code>my_table_parquet</code> are used to ensure good |
predicate pushdown on the individual year, month, and day columns.</p> |
| |
<p>WARNING: As stated earlier, be sure to use <code>UNION ALL</code> and not <code>UNION</code>. The <code>UNION</code> |
keyword by itself is the same as <code>UNION DISTINCT</code> and can have a significant performance |
impact. More information can be found in the Impala |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_union.html"><code>UNION</code> documentation</a>.</p> |
| |
| <h3 id="ongoing-steps">Ongoing Steps</h3> |
| |
| <p>Now that the base tables and view are created, prepare the ongoing steps to maintain |
| the sliding window. Because these ongoing steps should be scheduled to run on a |
| regular basis, the examples below are shown using <code>.sql</code> files that take variables |
| which can be passed from your scripts and scheduling tool of choice.</p> |
| |
| <p>Create the <code>window_data_move.sql</code> file to move the data from the oldest partition to HDFS:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">INSERT</span> <span class="k">INTO</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">hdfs_table</span><span class="err">}</span> <span class="n">PARTITION</span> <span class="p">(</span><span class="k">year</span><span class="p">,</span> <span class="k">month</span><span class="p">,</span> <span class="k">day</span><span class="p">)</span> |
| <span class="k">SELECT</span> <span class="o">*</span><span class="p">,</span> <span class="k">year</span><span class="p">(</span><span class="n">time</span><span class="p">),</span> <span class="k">month</span><span class="p">(</span><span class="n">time</span><span class="p">),</span> <span class="k">day</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> |
| <span class="k">FROM</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">kudu_table</span><span class="err">}</span> |
| <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&gt;=</span> <span class="n">add_months</span><span class="p">(</span><span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> |
| <span class="k">AND</span> <span class="n">time</span> <span class="o">&lt;</span> <span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">;</span> |
| <span class="n">COMPUTE</span> <span class="n">INCREMENTAL</span> <span class="n">STATS</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">hdfs_table</span><span class="err">}</span><span class="p">;</span></code></pre></div> |
| |
| <p>Note: The |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_compute_stats.html">COMPUTE INCREMENTAL STATS</a> |
statement is not required, but it helps Impala optimize queries.</p> |
| |
| <p>To run the SQL statement, use the Impala shell and pass the required variables. |
| Below is an example:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_data_move.sql |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">hdfs_table</span><span class="o">=</span>my_table_parquet |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span></code></pre></div> |
| |
<p>Note: You can adjust the <code>WHERE</code> clause to match the given period and cadence of your |
offload. Here the <code>add_months</code> function is used with an argument of -1 to move the |
month of data immediately preceding the new boundary time.</p> |
| |
| <p>Create the <code>window_view_alter.sql</code> file to shift the time boundary forward by altering |
| the unified view:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">ALTER</span> <span class="k">VIEW</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">view_name</span><span class="err">}</span> <span class="k">AS</span> |
| <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">,</span> <span class="n">message</span> |
| <span class="k">FROM</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">kudu_table</span><span class="err">}</span> |
| <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&gt;=</span> <span class="ss">&quot;${var:new_boundary_time}&quot;</span> |
| <span class="k">UNION</span> <span class="k">ALL</span> |
| <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">time</span><span class="p">,</span> <span class="n">message</span> |
| <span class="k">FROM</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">hdfs_table</span><span class="err">}</span> |
| <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&lt;</span> <span class="ss">&quot;${var:new_boundary_time}&quot;</span> |
| <span class="k">AND</span> <span class="k">year</span> <span class="o">=</span> <span class="k">year</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> |
| <span class="k">AND</span> <span class="k">month</span> <span class="o">=</span> <span class="k">month</span><span class="p">(</span><span class="n">time</span><span class="p">)</span> |
| <span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="k">day</span><span class="p">(</span><span class="n">time</span><span class="p">);</span></code></pre></div> |
| |
| <p>To run the SQL statement, use the Impala shell and pass the required variables. |
| Below is an example:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_view_alter.sql |
| --var<span class="o">=</span><span class="nv">view_name</span><span class="o">=</span>my_table_view |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">hdfs_table</span><span class="o">=</span>my_table_parquet |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span></code></pre></div> |
| |
| <p>Create the <code>window_partition_shift.sql</code> file to shift the Kudu partitions forward:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">kudu_table</span><span class="err">}</span> |
| |
| <span class="k">ADD</span> <span class="n">RANGE</span> <span class="n">PARTITION</span> <span class="n">add_months</span><span class="p">(</span><span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">,</span> |
| <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">window_length</span><span class="err">}</span><span class="p">)</span> <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="n">add_months</span><span class="p">(</span><span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">,</span> |
| <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">window_length</span><span class="err">}</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span> |
| |
| <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="err">${</span><span class="n">var</span><span class="p">:</span><span class="n">kudu_table</span><span class="err">}</span> |
| |
| <span class="k">DROP</span> <span class="n">RANGE</span> <span class="n">PARTITION</span> <span class="n">add_months</span><span class="p">(</span><span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> |
| <span class="o">&lt;=</span> <span class="k">VALUES</span> <span class="o">&lt;</span> <span class="ss">&quot;${var:new_boundary_time}&quot;</span><span class="p">;</span></code></pre></div> |
| |
| <p>To run the SQL statement, use the Impala shell and pass the required variables. |
| Below is an example:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_partition_shift.sql |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span> |
| --var<span class="o">=</span><span class="nv">window_length</span><span class="o">=</span>3</code></pre></div> |
| |
| <p>Note: You should periodically run |
| <a href="https://impala.apache.org/docs/build/html/topics/impala_compute_stats.html">COMPUTE STATS</a> |
| on your Kudu table to ensure Impala’s query performance is optimal.</p> |
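<p>For example:</p> |
 |
<div class="highlight"><pre><code class="language-sql" data-lang="sql">COMPUTE STATS my_table_kudu;</code></pre></div> |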
| |
| <h3 id="experimentation">Experimentation</h3> |
| |
| <p>Now that you have created the tables, view, and scripts to leverage the sliding |
| window pattern, you can experiment with them by inserting data for different time |
| ranges and running the scripts to move the window forward through time.</p> |
| |
| <p>Insert some sample values into the Kudu table:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">my_table_kudu</span> <span class="k">VALUES</span> |
| <span class="p">(</span><span class="s1">&#39;joey&#39;</span><span class="p">,</span> <span class="s1">&#39;2018-01-01&#39;</span><span class="p">,</span> <span class="s1">&#39;hello&#39;</span><span class="p">),</span> |
| <span class="p">(</span><span class="s1">&#39;ross&#39;</span><span class="p">,</span> <span class="s1">&#39;2018-02-01&#39;</span><span class="p">,</span> <span class="s1">&#39;goodbye&#39;</span><span class="p">),</span> |
| <span class="p">(</span><span class="s1">&#39;rachel&#39;</span><span class="p">,</span> <span class="s1">&#39;2018-03-01&#39;</span><span class="p">,</span> <span class="s1">&#39;hi&#39;</span><span class="p">);</span></code></pre></div> |
| |
| <p>Show the data in each table/view:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_kudu</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_parquet</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span><span class="p">;</span></code></pre></div> |
| |
| <p>Move the January data into HDFS:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_data_move.sql |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">hdfs_table</span><span class="o">=</span>my_table_parquet |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span></code></pre></div> |
| |
| <p>Confirm the data is in both places, but not duplicated in the view:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_kudu</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_parquet</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span><span class="p">;</span></code></pre></div> |
| |
| <p>Alter the view to shift the time boundary forward to February:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_view_alter.sql |
| --var<span class="o">=</span><span class="nv">view_name</span><span class="o">=</span>my_table_view |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">hdfs_table</span><span class="o">=</span>my_table_parquet |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span></code></pre></div> |
| |
| <p>Confirm the data is still in both places, but not duplicated in the view:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_kudu</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_parquet</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span><span class="p">;</span></code></pre></div> |
| |
| <p>Shift the Kudu partitions forward:</p> |
| |
| <div class="highlight"><pre><code class="language-bash" data-lang="bash">impala-shell -i &lt;impalad:port&gt; -f window_partition_shift.sql |
| --var<span class="o">=</span><span class="nv">kudu_table</span><span class="o">=</span>my_table_kudu |
| --var<span class="o">=</span><span class="nv">new_boundary_time</span><span class="o">=</span><span class="s2">&quot;2018-02-01&quot;</span> |
| --var<span class="o">=</span><span class="nv">window_length</span><span class="o">=</span>3</code></pre></div> |
| |
| <p>Confirm the January data is now only in HDFS:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_kudu</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_parquet</span><span class="p">;</span> |
| <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span><span class="p">;</span></code></pre></div> |
| |
<p>Confirm predicate pushdown with Impala’s <code>EXPLAIN</code> statement:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">EXPLAIN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span><span class="p">;</span> |
| <span class="k">EXPLAIN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span> <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&lt;</span> <span class="ss">&quot;2018-02-01&quot;</span><span class="p">;</span> |
| <span class="k">EXPLAIN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table_view</span> <span class="k">WHERE</span> <span class="n">time</span> <span class="o">&gt;</span> <span class="ss">&quot;2018-02-01&quot;</span><span class="p">;</span></code></pre></div> |
| |
<p>In the explain output you should see “kudu predicates”, which include the time column |
filters, in the “SCAN KUDU” section, and “predicates”, which include the time, day, |
month, and year columns, in the “SCAN HDFS” section.</p></content><author><name>Grant Henke</name></author><summary>Note: This is a cross-post from the Cloudera Engineering Blog |
| Transparent Hierarchical Storage Management with Apache Kudu and Impala |
| |
| When picking a storage option for an application it is common to pick a single |
| storage option which has the most applicable features to your use case. For mutability |
| and real-time analytics workloads you may want to use Apache Kudu, but for massive |
| scalability at a low cost you may want to use HDFS. For that reason, there is a need |
| for a solution that allows you to leverage the best features of multiple storage |
| options. This post describes the sliding window pattern using Apache Impala with data |
| stored in Apache Kudu and Apache HDFS. With this pattern you get all of the benefits |
| of multiple storage layers in a way that is transparent to users.</summary></entry><entry><title>Call for Posts</title><link href="/2018/12/11/call-for-posts.html" rel="alternate" type="text/html" title="Call for Posts" /><published>2018-12-11T00:00:00-08:00</published><updated>2018-12-11T00:00:00-08:00</updated><id>/2018/12/11/call-for-posts</id><content type="html" xml:base="/2018/12/11/call-for-posts.html"><p>Most of the posts in the Kudu blog have been written by the project’s |
| committers and are either technical or news-like in nature. We’d like to hear |
| how you’re using Kudu in production, in testing, or in your hobby project and |
| we’d like to share it with the world!</p> |
| |
| <!--more--> |
| |
| <p>If you’d like to tell the world about how you are using Kudu in your project, |
| now is the time.</p> |
| |
| <p>To learn how to submit posts, read our <a href="/docs/contributing.html#_blog_posts">contributing |
| documentation</a>. Alternatively, you can |
| draft your post in Google Docs and share it with us at
| <a href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#117;&#100;&#117;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a> and we’re happy to review it |
| and post it to the blog for you.</p></content><author><name>Attila Bukor</name></author><summary>Most of the posts in the Kudu blog have been written by the project’s |
| committers and are either technical or news-like in nature. We’d like to hear |
| how you’re using Kudu in production, in testing, or in your hobby project and |
| we’d like to share it with the world!</summary></entry><entry><title>Apache Kudu 1.8.0 Released</title><link href="/2018/10/26/apache-kudu-1-8-0-released.html" rel="alternate" type="text/html" title="Apache Kudu 1.8.0 Released" /><published>2018-10-26T00:00:00-07:00</published><updated>2018-10-26T00:00:00-07:00</updated><id>/2018/10/26/apache-kudu-1-8-0-released</id><content type="html" xml:base="/2018/10/26/apache-kudu-1-8-0-released.html"><p>The Apache Kudu team is happy to announce the release of Kudu 1.8.0!</p> |
| |
| <p>The new release adds several new features and improvements, including the |
| following:</p> |
| |
| <!--more--> |
| |
| <ul> |
| <li>Introduced a manual data rebalancer tool which can be used to redistribute
| table replicas among tablet servers</li> |
| <li>Added support for <code>IS NULL</code> and <code>IS NOT NULL</code> predicates to the Kudu Python |
| client</li> |
| <li>Multiple tooling improvements make diagnostics and troubleshooting simpler</li> |
| <li>The Kudu Spark connector now supports Spark Streaming DataFrames</li> |
| <li>Added Pandas support to the Python client</li> |
| </ul> |
| |
| <p>The above is just a list of the highlights; for a more complete list of new
| features, improvements, and fixes, please refer to the <a href="/releases/1.8.0/docs/release_notes.html">release
| notes</a>.</p> |
| |
| <p>The Apache Kudu project only publishes source code releases. To build Kudu |
| 1.8.0, follow these steps:</p> |
| |
| <ul> |
| <li>Download the Kudu <a href="/releases/1.8.0">1.8.0 source release</a></li> |
| <li>Follow the instructions in the documentation to build Kudu <a href="/releases/1.8.0/docs/installation.html#build_from_source">1.8.0 from |
| source</a></li> |
| </ul> |
| |
| <p>For your convenience, binary JAR files for the Kudu Java client library, Spark |
| DataSource, Flume sink, and other Java integrations are published to the ASF |
| Maven repository and are <a href="https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.kudu%22%20AND%20v%3A%221.8.0%22">now |
| available</a>.</p> |
| |
| <p>The Python client source is also available on |
| <a href="https://pypi.org/project/kudu-python/">PyPI</a>.</p></content><author><name>Attila Bukor</name></author><summary>The Apache Kudu team is happy to announce the release of Kudu 1.8.0! |
| |
| The new release adds several new features and improvements, including the |
| following:</summary></entry><entry><title>Index Skip Scan Optimization in Kudu</title><link href="/2018/09/26/index-skip-scan-optimization-in-kudu.html" rel="alternate" type="text/html" title="Index Skip Scan Optimization in Kudu" /><published>2018-09-26T00:00:00-07:00</published><updated>2018-09-26T00:00:00-07:00</updated><id>/2018/09/26/index-skip-scan-optimization-in-kudu</id><content type="html" xml:base="/2018/09/26/index-skip-scan-optimization-in-kudu.html"><p>This summer I got the opportunity to intern with the Apache Kudu team at Cloudera. |
| My project was to optimize the Kudu scan path by implementing a technique called |
| index skip scan (a.k.a. scan-to-seek, see section 4.1 in [1]). I wanted to share |
| my experience and the progress we’ve made so far on the approach.</p> |
| |
| <!--more--> |
| |
| <p>Let’s begin by discussing the current query flow in Kudu.
| Consider the following table:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">metrics</span> <span class="p">(</span> |
| <span class="k">host</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="n">tstamp</span> <span class="nb">INT</span><span class="p">,</span> |
| <span class="n">clusterid</span> <span class="nb">INT</span><span class="p">,</span> |
| <span class="k">role</span> <span class="n">STRING</span><span class="p">,</span> |
| <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">(</span><span class="k">host</span><span class="p">,</span> <span class="n">tstamp</span><span class="p">,</span> <span class="n">clusterid</span><span class="p">)</span> |
| <span class="p">);</span></code></pre></div> |
| |
| <p><img src="/img/index-skip-scan/example-table.png" alt="png" class="img-responsive" /> |
| <em>Sample rows of table <code>metrics</code> (sorted by key columns).</em></p> |
| |
| <p>In this case, by default, Kudu internally builds a primary key index (implemented as a |
| <a href="https://en.wikipedia.org/wiki/B-tree">B-tree</a>) for the table <code>metrics</code>. |
| As shown in the table above, the index data is sorted by the composite of all key columns. |
| When the user query contains the first key column (<code>host</code>), Kudu uses the index (as the index data is |
| primarily sorted on the first key column).</p> |
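| <p>As a rough illustration (a toy Python stand-in, not Kudu’s actual B-tree code), ordering index entries by the composite key means a predicate on the first key column maps to one contiguous index range that a seek can find directly:</p>

```python
import bisect

# Toy stand-in for the primary key index: entries sorted by the
# composite key (host, tstamp, clusterid).
index_order = sorted([
    ("helium", 105, 1),
    ("carbon", 100, 1),
    ("helium", 100, 3),
])

# A predicate on the first key column (host = "helium") covers one
# contiguous run of index entries, found with two binary-search seeks.
lo = bisect.bisect_left(index_order, ("helium",))
hi = bisect.bisect_left(index_order, ("helium", float("inf")))
helium_rows = index_order[lo:hi]
```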
| |
| <p>Now, what if the user query does not contain the first key column and instead only contains the <code>tstamp</code> column? |
| In the above case, the <code>tstamp</code> column values are sorted with respect to <code>host</code>, |
| but are not globally sorted, and as such, it’s non-trivial to use the index to filter rows. |
| Instead, a full tablet scan is done by default. Other databases may optimize such scans by building secondary indexes
| (though it might be redundant to build one on one of the primary key columns). However, this isn’t an option for Kudu,
| given its lack of secondary index support.</p>
| |
| <p>The question is, can Kudu do better than a full tablet scan here?</p> |
| |
| <p>The answer is yes! Let’s observe the column preceding the <code>tstamp</code> column. We will refer to it as the |
| “prefix column” and its specific value as the “prefix key”. In this example, <code>host</code> is the prefix column. |
| Note that the prefix keys are sorted in the index and that all rows of a given prefix key are also sorted by the |
| remaining key columns. Therefore, we can use the index to skip to the rows that have distinct prefix keys, |
| and also satisfy the predicate on the <code>tstamp</code> column. |
| For example, consider the query:</p> |
| |
| <div class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="n">clusterid</span> <span class="k">FROM</span> <span class="n">metrics</span> <span class="k">WHERE</span> <span class="n">tstamp</span> <span class="o">=</span> <span class="mi">100</span><span class="p">;</span></code></pre></div> |
| |
| <p><img src="/img/index-skip-scan/skip-scan-example-table.png" alt="png" class="img-responsive" /> |
| <em>Skip scan flow illustration. The rows in green are scanned and the rest are skipped.</em></p> |
| |
| <p>The tablet server can use the index to <strong>skip</strong> to the first row with a distinct prefix key (<code>host = helium</code>) that |
| matches the predicate (<code>tstamp = 100</code>) and then <strong>scan</strong> through the rows until the predicate no longer matches. At that |
| point we would know that no more rows with <code>host = helium</code> will satisfy the predicate, and we can skip to the next |
| prefix key. This holds true for all distinct keys of <code>host</code>. Hence, this method is popularly known as |
| <strong>skip scan optimization</strong>[2, 3].</p> |
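| <p>To make the seek-and-scan loop concrete, here is a minimal Python sketch (a hypothetical model of the idea, not Kudu’s actual tablet server code) that applies an equality predicate on <code>tstamp</code> over rows sorted by the composite key:</p>

```python
import bisect

# Rows sorted by the composite primary key (host, tstamp, clusterid),
# mimicking the ordering of Kudu's primary key index.
rows = sorted([
    ("carbon", 100, 1), ("carbon", 110, 2),
    ("helium", 100, 3), ("helium", 100, 4), ("helium", 105, 1),
    ("nitrogen", 100, 2), ("nitrogen", 130, 1),
])

def skip_scan(rows, tstamp_value):
    """For each distinct prefix key, seek to (prefix, tstamp_value),
    scan while the predicate matches, then skip to the next prefix."""
    result = []
    i, n = 0, len(rows)
    while i < n:
        prefix = rows[i][0]
        # Seek: jump to the first row >= (prefix, tstamp_value).
        i = bisect.bisect_left(rows, (prefix, tstamp_value), i)
        # Scan: collect rows while the predicate still matches.
        while i < n and rows[i][0] == prefix and rows[i][1] == tstamp_value:
            result.append(rows[i])
            i += 1
        # Skip: advance past the remaining rows of this prefix key.
        i = bisect.bisect_left(rows, (prefix, float("inf")), i)
    return result
```

| <p>Each distinct <code>host</code> value costs one seek plus a short scan, instead of reading every row in the tablet.</p>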
| |
| <h1 id="performance">Performance</h1> |
| |
| <p>This optimization can speed up queries significantly, depending on the cardinality (number of distinct values) of the |
| prefix column. The lower the prefix column cardinality, the better the skip scan performance. In fact, when the |
| prefix column cardinality is high, skip scan is not a viable approach. The performance graph (obtained using the example |
| schema and query pattern mentioned earlier) is shown below.</p> |
| |
| <p>Based on our experiments on tablets of up to 10 million rows (as shown below), we found that skip scan performance
| begins to fall behind full tablet scan performance when the prefix column cardinality
| exceeds sqrt(number_of_rows_in_tablet).
| Therefore, in order to reap the benefits of skip scan when possible while maintaining consistent performance in cases
| of large prefix column cardinality, we have tentatively chosen to dynamically disable skip scan when the number of skips for
| distinct prefix keys exceeds sqrt(number_of_rows_in_tablet).
| Exploring more sophisticated heuristics for deciding when to dynamically disable skip scan
| remains an interesting follow-up project.</p>
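| <p>The tentative cutoff described above can be sketched as follows (the function name and signature are illustrative only, not Kudu internals):</p>

```python
import math

def should_disable_skip_scan(distinct_prefix_seeks, rows_in_tablet):
    """Tentative heuristic from the experiments: fall back to a full
    tablet scan once the number of seeks to distinct prefix keys
    exceeds the square root of the number of rows in the tablet."""
    return distinct_prefix_seeks > math.isqrt(rows_in_tablet)
```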
| |
| <p><img src="/img/index-skip-scan/skip-scan-performance-graph.png" alt="png" class="img-responsive" /></p> |
| |
| <h1 id="conclusion">Conclusion</h1> |
| |
| <p>Skip scan optimization in Kudu can lead to huge performance benefits that scale with the size of
| data in Kudu tablets. This is a work-in-progress <a href="https://gerrit.cloudera.org/#/c/10983/">patch</a>.
| The implementation in the patch currently works only for equality predicates on non-first primary key
| columns. Note that although the specific example above has a single prefix
| column (<code>host</code>), this approach generalizes to any number of prefix columns.</p>
| |
| <p>This work also lays the groundwork to leverage the skip scan approach and optimize query processing time in the |
| following use cases:</p> |
| |
| <ul> |
| <li>Range predicates</li> |
| <li>In-list predicates</li> |
| </ul> |
| |
| <p>This was my first time working on an open source project. I thoroughly enjoyed working on this challenging problem, |
| right from understanding the scan path in Kudu to working on a full-fledged implementation of |
| the skip scan optimization. I am very grateful to the Kudu team for guiding and supporting me throughout the |
| internship period.</p> |
| |
| <h1 id="references">References</h1> |
| |
| <p><a href="https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf">[1]</a>: Gupta, Ashish, et al. “Mesa: |
| Geo-replicated, near real-time, scalable data warehousing.” Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.</p> |
| |
| <p><a href="https://oracle-base.com/articles/9i/index-skip-scanning/">[2]</a>: Index Skip Scanning - Oracle Database</p> |
| |
| <p><a href="https://www.sqlite.org/optoverview.html#skipscan">[3]</a>: Skip Scan - SQLite</p></content><author><name>Anupama Gupta</name></author><summary>This summer I got the opportunity to intern with the Apache Kudu team at Cloudera. |
| My project was to optimize the Kudu scan path by implementing a technique called |
| index skip scan (a.k.a. scan-to-seek, see section 4.1 in [1]). I wanted to share |
| my experience and the progress we’ve made so far on the approach.</summary></entry><entry><title>Simplified Data Pipelines with Kudu</title><link href="/2018/09/11/simplified-pipelines-with-kudu.html" rel="alternate" type="text/html" title="Simplified Data Pipelines with Kudu" /><published>2018-09-11T00:00:00-07:00</published><updated>2018-09-11T00:00:00-07:00</updated><id>/2018/09/11/simplified-pipelines-with-kudu</id><content type="html" xml:base="/2018/09/11/simplified-pipelines-with-kudu.html"><p>I’ve been working with Hadoop now for over seven years and fortunately, or unfortunately, have run |
| across a lot of structured data use cases. What we, at <a href="https://phdata.io/">phData</a>, have found is |
| that end users are typically comfortable with tabular data and prefer to access their data in a |
| structured manner using tables. |
| <!--more--></p> |
| |
| <p>When working on new structured data projects, the first question we always get from non-Hadoop |
| followers is, <em>“how do I update or delete a record?”</em> The second question we get is, <em>“when adding |
| records, why don’t they show up in Impala right away?”</em> For those of us who have worked with HDFS |
| and Impala on HDFS for years, these are simple questions to answer, but hard ones to explain.</p> |
| |
| <p>The pre-Kudu years were filled with hundreds (or thousands) of self-join views (or materialization jobs)
| and compaction jobs, along with scheduled jobs to refresh the Impala cache periodically so new records
| show up. And while doable, for tens of thousands of tables, this basically became a distraction from solving
| real business problems.</p> |
| |
| <p>With the introduction of Kudu, mixing record-level updates, deletes, and inserts, while supporting
| large scans, is now something we can sustainably manage at scale. HBase is very good at record-level
| updates, deletes, and inserts, but doesn’t scale well for analytic use cases that often do full
| table scans. Moreover, for streaming use cases, changes are available in near real-time. End users, |
| accustomed to having to <em>”wait”</em> for their data, can now consume the data as it arrives in their |
| table.</p> |
| |
| <p>A common data ingest pattern where Kudu becomes necessary is change data capture (CDC): capturing
| inserts, updates, and hard deletes, and streaming them into Kudu where they can be applied
| immediately. Before Kudu, this pipeline was very tedious to implement. Now with tools like
| <a href="https://streamsets.com/">StreamSets</a>, you can get up and running in a few hours.</p> |
| |
| <p>A second common workflow is near real-time analytics. We’ve streamed data off mining trucks,
| oil wells, and manufacturing lines, and needed to make that data available to end users immediately. No
| longer do we need to batch up writes, flush to HDFS, and then refresh the cache in Impala. As mentioned
| before, with Kudu, the data is available as soon as it lands. This has been a significant
| enhancement for end users, who previously had to <em>”wait”</em> for data.</p> |
| |
| <p>In summary, Kudu has made a tremendous impact in removing the operational distractions of merging in |
| changes, and refreshing the cache of downstream consumers. This now allows data engineers |
| and users to focus on solving business problems, rather than being bothered by the tediousness of |
| the backend.</p></content><author><name>Mac Noland</name></author><summary>I’ve been working with Hadoop now for over seven years and fortunately, or unfortunately, have run |
| across a lot of structured data use cases. What we, at phData, have found is |
| that end users are typically comfortable with tabular data and prefer to access their data in a |
| structured manner using tables.</summary></entry></feed> |