| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| |
| [[developing]] |
| = Developing Applications With Apache Kudu (incubating) |
| |
| :author: Kudu Team |
| :imagesdir: ./images |
| :icons: font |
| :toc: left |
| :toclevels: 3 |
| :doctype: book |
| :backend: html5 |
| :sectlinks: |
| :experimental: |
| |
| Kudu provides C++ and Java client APIs, as well as reference examples to illustrate |
| their use. A Python API is included, but it is currently considered experimental, |
| unstable, and subject to change at any time. |
| |
| WARNING: Use of server-side or private interfaces is not supported, and interfaces |
| which are not part of public APIs have no stability guarantees. |
| |
| == Viewing the API Documentation |
| include::installation.adoc[tags=view_api] |
| |
| == Working Examples |
| |
| Several example applications are provided in the |
| link:https://github.com/cloudera/kudu-examples[kudu-examples] Github |
| repository. Each example includes a `README` that shows how to compile and run |
| it. These examples illustrate correct usage of the Kudu APIs, as well as how to |
| set up a virtual machine to run Kudu. The following list includes some of the |
| examples that are available today. Check the repository itself in case this list goes |
| out of date. |
| |
| `java-example`:: |
| A simple Java application which connects to a Kudu instance, creates a table, writes data to it, then drops the table. |
| `collectl`:: |
| A small Java application which listens on a TCP socket for time series data corresponding to the Collectl wire protocol. |
| The commonly-available collectl tool can be used to send example data to the server. |
| `clients/python`:: |
| An experimental Python client for Kudu. |
| `demo-vm-setup`:: |
| Scripts to download and run a VirtualBox virtual machine with Kudu already installed. |
| See link:quickstart.html[Quickstart] for more information. |
| |
| These examples should serve as helpful starting points for your own Kudu applications and integrations. |
| |
| === Maven Artifacts |
| The following Maven `<dependency>` element is valid for the Kudu public beta: |
| |
| [source,xml] |
| ---- |
| <dependency> |
| <groupId>org.kududb</groupId> |
| <artifactId>kudu-client</artifactId> |
| <version>0.5.0</version> |
| </dependency> |
| ---- |
| |
| Because the Maven artifacts are not in Maven Central, use the following `<repository>` |
| element: |
| |
| [source,xml] |
| ---- |
| <repository> |
| <id>cdh.repo</id> |
| <name>Cloudera Repositories</name> |
| <url>https://repository.cloudera.com/artifactory/cloudera-repos</url> |
| <snapshots> |
| <enabled>false</enabled> |
| </snapshots> |
| </repository> |
| ---- |
| |
| See subdirectories of https://github.com/cloudera/kudu-examples/tree/master/java for |
| example Maven pom.xml files. |
| |
| == Example Impala Commands With Kudu |
| |
| See link:kudu_impala_integration.html[Using Impala With Kudu] for guidance on installing |
| and using Impala with Kudu, including several `impala-shell` examples. |
| |
| == Kudu integration with Spark |
| |
| Kudu integrates with spark through the spark data source api as of version 0.9 |
| Include the kudu-spark using the --jars |
| [source] |
| ---- |
| spark-shell --jars /kudu-spark-0.9.0.jar |
| ---- |
| Then import kudu-spark and create a dataframe: |
| [source] |
| ---- |
| // Import kudu datasource |
| import org.kududb.spark.kudu._ |
| val kuduDataFrame = sqlContext.read.options(Map("kudu.master"-> "your.kudu.master.here","kudu.table"-> "your.kudu.table.here")).kudu |
| // Then query using spark api or register a temporary table and use spark sql |
| kuduDataFrame.select("id").filter("id">=5).show() |
| // Register kuduDataFrame as a temporary table for spark-sql |
| kuduDataFrame.registerTempTable("kudu_table") |
| // Select from the dataframe |
| sqlContext.sql("select id from kudu_table where id>=5").show() |
| |
| // create a new kudu table from a dataframe |
| val kuduContext = new KuduContext("your.kudu.master.here") |
| kuduContext.createTable("testcreatetable", df.schema, Seq("key"), new CreateTableOptions().setNumReplicas(1)) |
| |
| // then we can insert data into the kudu table |
| df.write.options(Map("kudu.master"-> "your.kudu.master.here","kudu.table"-> "your.kudu.table.here")).mode("append").kudu |
| |
| // to update existing data change the mode to 'overwrite' |
| df.write.options(Map("kudu.master"-> "your.kudu.master.here","kudu.table"-> "your.kudu.table.here")).mode("overwrite").kudu |
| |
| // to check for existance of a kudu table |
| kuduContext.tableExists("your.kudu.table.here") |
| |
| // to delete a kudu table |
| kuduContext.deleteTable("your.kudu.table.here") |
| ---- |
| |
| == Integration with MapReduce, YARN, and Other Frameworks |
| |
| Kudu was designed to integrate with MapReduce, YARN, Spark, and other frameworks in |
| the Hadoop ecosystem. See link:https://github.com/apache/incubator-kudu/blob/master/java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/RowCounter.java[RowCounter.java] |
| and |
| link:https://github.com/apache/incubator-kudu/blob/master/java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/ImportCsv.java[ImportCsv.java] |
| for examples which you can model your own integrations on. Stay tuned for more examples |
| using YARN and Spark in the future. |