<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="A new open source Apache Hadoop ecosystem project, Apache Kudu completes Hadoop's storage layer to enable fast analytics on fast data" />
<meta name="author" content="Cloudera" />
<title>Apache Kudu - Ecosystem</title>
<!-- Bootstrap core CSS -->
<link rel="stylesheet" href="" />
<!-- Custom styles for this template -->
<link href="/css/kudu.css" rel="stylesheet"/>
<link href="/css/asciidoc.css" rel="stylesheet"/>
<link rel="shortcut icon" href="/img/logo-favicon.ico" />
<link rel="stylesheet" href="" />
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src=""></script>
<script src=""></script>
<![endif]-->
<div class="kudu-site container-fluid">
<!-- Static navbar -->
<nav class="navbar navbar-default">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="logo" href="/"><img
srcset="// 1x, // 2x"
alt="Apache Kudu"/></a>
</div>
<div id="navbar" class="collapse navbar-collapse">
<ul class="nav navbar-nav navbar-right">
<li >
<a href="/">Home</a>
<li >
<a href="/overview.html">Overview</a>
<li >
<a href="/docs/">Documentation</a>
<li >
<a href="/releases/">Releases</a>
<li >
<a href="/blog/">Blog</a>
<!-- NOTE: this dropdown menu does not appear on Mobile, so don't add anything here
that doesn't also appear elsewhere on the site. -->
<li class="dropdown active">
<a href="/community.html" role="button" aria-haspopup="true" aria-expanded="false">Community <span class="caret"></span></a>
<ul class="dropdown-menu">
<li class="dropdown-header">GET IN TOUCH</li>
<li><a class="icon email" href="/community.html">Mailing Lists</a></li>
<li><a class="icon slack" href="">Slack Channel</a></li>
<li role="separator" class="divider"></li>
<li><a href="/community.html#meetups-user-groups-and-conference-presentations">Events and Meetups</a></li>
<li><a href="/committers.html">Project Committers</a></li>
<li><a href="/ecosystem.html">Ecosystem</a></li>
<!--<li><a href="/roadmap.html">Roadmap</a></li>-->
<li><a href="/community.html#contributions">How to Contribute</a></li>
<li role="separator" class="divider"></li>
<li class="dropdown-header">DEVELOPER RESOURCES</li>
<li><a class="icon github" href="">GitHub</a></li>
<li><a class="icon gerrit" href="">Gerrit Code Review</a></li>
<li><a class="icon jira" href="">JIRA Issue Tracker</a></li>
<li role="separator" class="divider"></li>
<li class="dropdown-header">SOCIAL MEDIA</li>
<li><a class="icon twitter" href="">Twitter</a></li>
<li><a href="">Reddit</a></li>
<li role="separator" class="divider"></li>
<li class="dropdown-header">APACHE SOFTWARE FOUNDATION</li>
<li><a href="" target="_blank">Security</a></li>
<li><a href="" target="_blank">Sponsorship</a></li>
<li><a href="" target="_blank">Thanks</a></li>
<li><a href="" target="_blank">License</a></li>
</ul>
</li>
<li >
<a href="/faq.html">FAQ</a>
</li>
</ul><!-- /.nav -->
</div><!-- /#navbar -->
</div><!-- /.container-fluid -->
</nav>
<div class="row-fluid">
<div class="col-lg-12 ecosystem">
<h2 id="apache-kudu-ecosystem">Apache Kudu Ecosystem</h2>
<p>While the Apache Kudu project provides client bindings that allow users to
mutate and fetch data, more complex access patterns are often written via SQL
and compute engines. This is a non-exhaustive list of projects that integrate
with Kudu to enhance ingest, querying capabilities, and orchestration.</p>
<h3 id="frequently-used">Frequently used</h3>
<p>The following integrations are among the most commonly used with Apache Kudu
(sorted alphabetically).</p>
<ul>
<li><a href="#apache-impala">Apache Impala</a></li>
<li><a href="#apache-nifi">Apache NiFi</a></li>
<li><a href="#apache-spark-sql">Apache Spark SQL</a></li>
<li><a href="#presto">Presto</a></li>
</ul>
<h3 id="sql">SQL</h3>
<h4 id="apache-drill"><a href="">Apache Drill</a></h4>
<p>Apache Drill provides a schema-free SQL query engine for Hadoop, NoSQL, and
cloud storage. See the <a href="">Drill Kudu API
documentation</a>
for more details.</p>
<h4 id="apache-hive"><a href="">Apache Hive</a></h4>
<p>The Apache Hive™ data warehouse software facilitates reading, writing, and
managing large datasets residing in distributed storage using SQL. See the
<a href="">Hive Kudu integration
documentation</a>
for more details.</p>
<h4 id="apache-impala"><a href="">Apache Impala</a></h4>
<p>Apache Impala is the open source, native analytic database for Apache Hadoop.
See the <a href="">Kudu Impala integration
documentation</a> for
more details.</p>
<h4 id="apache-spark-sql"><a href="">Apache Spark SQL</a></h4>
<p>Spark SQL is a Spark module for structured data processing. See the <a href="">Kudu Spark
documentation</a>
for more details.</p>
<h4 id="presto"><a href="">Presto</a></h4>
<p>Presto is an open source distributed SQL query engine for running interactive
analytic queries against data sources of all sizes ranging from gigabytes to
petabytes. See the <a href="">Presto Kudu connector
documentation</a> for more
details.</p>
<h3 id="computation">Computation</h3>
<h4 id="apache-beam"><a href="">Apache Beam</a></h4>
<p>Apache Beam is a unified model for defining both batch and streaming
data-parallel processing pipelines, as well as a set of language-specific SDKs
for constructing pipelines and Runners for executing them on distributed
processing backends. See the <a href="">Beam Kudu source and sink
documentation</a>
for more details.</p>
<h4 id="apache-spark"><a href="">Apache Spark</a></h4>
<p>Apache Spark is a unified analytics engine for large-scale data processing. See
the <a href="">Kudu Spark integration
documentation</a>
for more details.</p>
<h4 id="pandas"><a href="">Pandas</a></h4>
<p>Pandas is an open source, BSD-licensed library providing high-performance,
easy-to-use data structures and data analysis tools for the Python programming
language. Kudu Python scanners can be converted to Pandas DataFrames. See
<a href="">Kudu’s Python
tests</a>
for example usage.</p>
<h4 id="talend-big-data"><a href="">Talend Big Data</a></h4>
<p>Talend simplifies and automates big data integration projects with on-demand
serverless Spark and machine learning. See <a href="">Talend’s Kudu component
documentation</a>
for more details.</p>
<h3 id="ingest">Ingest</h3>
<h4 id="akka"><a href="">Akka</a></h4>
<p>Akka facilitates building highly concurrent, distributed, and resilient
message-driven applications on the JVM. See the <a href="">Alpakka Kudu connector
documentation</a> for more
details.</p>
<h4 id="apache-flink"><a href="">Apache Flink</a></h4>
<p>Apache Flink is a framework and distributed processing engine for stateful
computations over unbounded and bounded data streams. See the <a href="">Flink Kudu
documentation</a>
for more details.</p>
<h4 id="apache-nifi"><a href="">Apache NiFi</a></h4>
<p>Apache NiFi supports powerful and scalable directed graphs of data routing,
transformation, and system mediation logic. See the <a href="">PutKudu processor
documentation</a>
for more details.</p>
<h4 id="apache-spark-streaming"><a href="">Apache Spark Streaming</a></h4>
<p>Spark Streaming is an extension of the core Spark API that enables scalable,
high-throughput, fault-tolerant stream processing of live data streams.
See <a href="">Kudu’s Spark Streaming
tests</a>
for example usage.</p>
<h4 id="confluent-platform-kafka"><a href="">Confluent Platform Kafka</a></h4>
<p>Apache Kafka is an open-source distributed event streaming platform used by
thousands of companies for high-performance data pipelines, streaming
analytics, data integration, and mission-critical applications. See the <a href="">Kafka
Kudu connector
documentation</a>
for more details.</p>
<h4 id="streamsets-data-collector"><a href="">StreamSets Data Collector</a></h4>
<p>StreamSets Data Collector is a lightweight, powerful engine that streams data
in real time. See the <a href="">StreamSets Data Collector Kudu destination
documentation</a> for more details.</p>
<h4 id="striim"><a href="">Striim</a></h4>
<p>Striim is real-time data integration software that enables continuous data
ingestion, in-flight stream processing, and delivery. See the <a href="">Striim Kudu
documentation</a> for
more details.</p>
<h4 id="tibco-streambase"><a href="">TIBCO StreamBase</a></h4>
<p>TIBCO StreamBase® is an event processing platform for applying mathematical and
relational processing to real-time data streams. See the <a href="">StreamBase Kudu
documentation</a>
for more details.</p>
<h3 id="deployment-and-orchestration">Deployment and Orchestration</h3>
<h4 id="apache-camel"><a href="">Apache Camel</a></h4>
<p>Camel is an open source integration framework that empowers you to quickly and
easily integrate various systems consuming or producing data. See the <a href="">Camel
Kudu component
documentation</a>
for more details.</p>
<h4 id="cloudera-manager"><a href="">Cloudera Manager</a></h4>
<p>Cloudera Manager is an end-to-end application for managing CDH clusters. See
the <a href="">Cloudera Manager documentation for
Kudu</a>
for more details.</p>
<h4 id="docker"><a href="">Docker</a></h4>
<p>Docker facilitates packaging software into standardized units for development,
shipment, and deployment. See the official <a href="">Apache Kudu
Docker Hub</a> and the <a href="">Apache Kudu Docker
Quickstart</a> for more details.</p>
<h4 id="wavefront"><a href="">Wavefront</a></h4>
<p>Wavefront is a high-performance streaming analytics platform that supports 3D
observability. See the <a href="">Wavefront Kudu integration
documentation</a> for more details.</p>
<h3 id="visualization">Visualization</h3>
<h4 id="zoomdata"><a href="">Zoomdata</a></h4>
<p>Zoomdata provides a high-performance BI engine and visually engaging,
interactive dashboards. See <a href="">Zoomdata’s Kudu
page</a> for
more details.</p>
<h2 id="distribution-and-support">Distribution and Support</h2>
<p>While Kudu is an Apache-licensed open source project, software vendors may
package and license it with other components to facilitate consumption. These
offerings are typically bundled with support to tune and facilitate
administration.</p>
<ul>
<li><a href="">Cloudera CDH</a></li>
<li><a href="">phData</a></li>
</ul>
<footer class="footer">
<div class="row">
<div class="col-md-9">
<p class="small">
Copyright &copy; 2020 The Apache Software Foundation.
</p>
<p class="small">
Apache Kudu, Kudu, Apache, the Apache feather logo, and the Apache Kudu
project logo are either registered trademarks or trademarks of The
Apache Software Foundation in the United States and other countries.
</p>
</div>
<div class="col-md-3">
<a class="pull-right" href="">
<img src=""/>
</a>
</div>
</div>
</footer>
<script src=""></script>
<script src=""></script>