blob: e59c82e1a654fd2850b982bf0caa401a1b82f219 [file] [log] [blame]
// Copyright 2015 Cloudera, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
[[installation]]
= Installing Kudu
:author: Kudu Team
:imagesdir: ./images
:icons: font
:toc: left
:toclevels: 3
:doctype: book
:backend: html5
:sectlinks:
:experimental:
You can deploy Kudu on a CDH cluster using packages or parcels, or you can build Kudu
from source. To run Kudu without installing anything, use the link:quickstart.html#quickstart_vm
[Kudu Quickstart VM].
== Prerequisites and Requirements
.Hardware
- A Cloudera Manager or CDH cluster which meets these requirements:
- A host to run the Kudu master. // TODO multi-master
- One or more hosts to run Kudu tablet servers. Production clusters need at least
three tablet servers.
.Operating System
- An operating system and version supported by Cloudera.
[[req_hole_punching]]
- A kernel version and filesystem that support _hole punching_. On Linux, hole punching
is the use of the `fallocate()` system call with the `FALLOC_FL_PUNCH_HOLE` option
set.
- RHEL or CentOS 6.4 or later, patched to kernel version of 2.6.32-358 or later.
Unpatched RHEL or CentOS 6.4 does not include a kernel with support for hole punching.
- Ubuntu 14.04 includes version 3.13 of the Linux kernel, which supports hole punching.
- Newer versions of the EXT4 or XFS filesystems support hole punching, but EXT3 does
not. Older versions of XFS that do not support hole punching return a `EOPNOTSUPP`
(operation not supported) error. Older versions of either EXT4 or XFS that do
not support hole punching cause Kudu to emit an error message such as the following
and to fail to start:
+
----
Error during hole punch test. The log block manager requires a
filesystem with hole punching support such as ext4 or xfs. On el6,
kernel version 2.6.32-358 or newer is required. To run without hole
punching (at the cost of some efficiency and scalability), reconfigure
Kudu with --block_manager=file. Refer to the Kudu documentation for more
details. Raw error message follows.
----
- Without hole punching support, the log block manager is unsafe to use. It won't
ever delete blocks, consuming ever more space on disk.
- If you can't use hole punching in your environment, you can still
try Kudu. Enable the file block manager instead of the log block manager by
adding the `--block_manager=file` flag to the commands you use to start the master
and tablet servers. The file block manager does not scale as well as the log block
manager.
- OSX is not supported, even for building from source.
.Storage
- If solid state storage is available, storing Kudu WALs on such high-performance
media may significantly improve latency when Kudu is configured for its highest
durability levels.
- A filesystem that supports <<req_hole_punching,hole punching>> is recommended.
Supported filesystems
are EXT4 and XFS.
== Install Using Parcels
If you use Cloudera Manager, the easiest way to install Kudu is with parcels. First,
add the parcel repository to Cloudera Manager. Then distribute Kudu to your cluster.
Here is the procedure in detail.
WARNING: If you install Kudu using parcels and your filesystem does not support hole
punching, the initial service start-up will fail and you will need to use an advanced
configuration snippet to configure the file block manager. See <<req_hole_punching,Hole Punching>>.
// tag::quickstart_parcels[]
// tag::install_csd[]
. Download the Custom Service Descriptor (CSD) JAR to the `/opt/cloudera/csd` directory.
+
----
$ cd /opt/cloudera/csd
$ wget http://archive.cloudera.com/beta/kudu/csd/KUDU-0.5.0.jar
----
. Restart the Cloudera Manager server to detect the CSD.
+
----
$ sudo service cloudera-scm-server restart
----
+
// end::install_csd[]
// tag::add_service[]
. In Cloudera Manager, go to *Hosts > Parcels*. Find `KUDU` in the list, and click *Download*.
. When the download is complete, select your cluster from the *Locations* selector,
and click *Distribute*. If you only have one cluster, it is selected automatically.
. When distribution is complete, click *Activate* to activate the parcel. Restart
the cluster when prompted. This may take several minutes.
. Install the Kudu service on your cluster. Go to the cluster where you want to install Kudu.
Click *Actions > Add a Service*. Select *Kudu* from the list, and click *Continue*.
. Select a host to be the master and one or more hosts to be tablet servers. A
host can act as both a master and a tablet server, but this may cause performance
problems on a large cluster. The Kudu master process is not resource-intensive and
can be collocated with other similar processes such as the HDFS Namenode or YARN
ResourceManager.
+
After selecting hosts, click *Continue*.
. Configure the storage locations for Kudu data and WAL files on masters and tablet
servers. Cloudera Manager will create the directories.
- You can use the same directory to store data and WALs.
- You cannot store WALs in a subdirectory of the data directory.
- If any host is both a master and tablet server, configure different directories
for master and tablet server. For instance, `/data/kudu/master` and `/data/kudu/tserver`.
- If you choose a filesystem that does not support <<req_hole_punching,hole punching>>,
service start-up will fail. Exit the configuration wizard by clicking the *Cloudera*
logo at the top of the Cloudera Manager interface, and search for the
*Kudu Service Advanced Configuration Snippet (Safety Valve) for gflagfile* configuration option.
Add the following line to it, and save your changes: `--block_manager=file`
. If you did not need to exit the wizard, click *Continue*. Kudu masters and tablet
servers are started. Otherwise, go to the Kudu service, and click *Actions > Start*.
+
// tag::verify_install[]
. Verify that services are running using one of the following methods:
- Examine the output of the `ps` command on servers to verify one or both of `kudu-master`
or `kudu-tserver` processes is running.
- Access the Master or Tablet Server web UI by opening `\http://<_host_name_>:8051/`
for masters
or `\http://<_host_name_>:8050/` for tablet servers.
+
// end::verify_install[]
. To manage services, go to the Kudu service and use the *Actions* menu to stop, start, restart, or
otherwise manage the service.
// end::add_service[]
// end::quickstart_parcels[]
[[install_packages]]
== Install Using Packages
If you prefer, you can install Kudu using packages managed by the operating system instead of Cloudera
Manager parcels. After installing Kudu using packages, Cloudera recommends managing the Kudu service
using Cloudera Manager, but you can manage it manually via operating system utilities.
[[kudu_package_locations]]
.Kudu Package Locations
|===
| OS | Repository | Individual Packages
| RHEL | link:http://archive.cloudera.com/beta/kudu/redhat/6/x86_64/kudu/cloudera-kudu.repo[RHEL 6] | link:http://archive.cloudera.com/beta/kudu/redhat/6/x86_64/kudu/0.5.0/RPMS/x86_64/[RHEL 6]
| Ubuntu | link:http://archive.cloudera.com/beta/kudu/ubuntu/trusty/amd64/kudu/cloudera.list[Trusty] | http://archive.cloudera.com/beta/kudu/ubuntu/trusty/amd64/kudu/pool/contrib/k/kudu/[Trusty]
|===
----
NOTE: For later versions of Ubuntu, the Ubuntu Trusty packages are reported to install, though they have not been extensively tested.
----
=== Install On RHEL Hosts
. Download and configure the Kudu repositories for your operating system, or manually
download individual RPMs, the appropriate link from <<kudu_package_locations>>.
. If using a Yum repository, use the following commands to install Kudu packages on
each host.
+
----
sudo yum install kudu # Base Kudu files
sudo yum install kudu-master # Kudu master init.d service script and default configuration
sudo yum install kudu-tserver # Kudu tablet server init.d service script and default configuration
sudo yum install kudu-client # Kudu C++ client shared library
sudo yum install kudu-client-devel # Kudu C++ client SDK
----
. To manually install the Kudu RPMs, first download them, then use the command
`sudo rpm -ivh <RPM to install>`. If you do not use Cloudera Manager, install the
`kudu-master` and `kudu-tserver` packages on the appropriate hosts. These packages
provide the operating system commands to start and stop Kudu. Do not attempt to
use operating system commands to start or stop Kudu if you use Cloudera Manager.
include::installation.adoc[tags=install_csd]
include::installation.adoc[tags=add_service]
. To manage Kudu without using Cloudera Manager, continue to <<required_config_without_cm>>.
Cloudera recommends using Cloudera Manager.
=== Install On Ubuntu or Debian Hosts
. If using an Ubuntu or Debian repository, use the following commands to install Kudu
packages on each host.
+
----
sudo apt-get install kudu # Base Kudu files
sudo apt-get install kudu-master # Service scripts for managing kudu-master
# Use ONLY if you do not use Cloudera Manager
sudo apt-get install kudu-tserver # Service scripts for managing kudu-tserver
# Use ONLY if you do not use Cloudera Manager
sudo apt-get install libkuduclient0 # Kudu C++ client shared library
sudo apt-get install libkuduclient-dev # Kudu C++ client SDK
----
. To manually install individual DEBs, first download them, then use the command
`sudo dpkg -i <DEB to install>`. If you do not use Cloudera Manager, install the
`kudu-master` and `kudu-tserver` packages on the appropriate hosts. These packages
provide the operating system commands to start and stop Kudu. Do not attempt to
use operating system commands to start or stop Kudu if you use Cloudera Manager.
include::installation.adoc[tags=install_csd]
include::installation.adoc[tags=add_service]
. To manage Kudu without using Cloudera Manager, continue to <<required_config_without_cm>>.
Cloudera recommends using Cloudera Manager.
[[required_config_without_cm]]
=== Required Configuration Without Cloudera Manager
If you install Kudu from packages and choose not to use Cloudera Manager, additional
configuration steps are required on each host before you can start Kudu services.
. The packages create a `kudu-conf` entry in the operating system's alternatives database,
and they ship the built-in `conf.dist` alternative. To adjust your configuration,
you can either edit the files in `/etc/kudu/conf/` directly, or create a new alternative
using the operating system utilities, make sure it is the link pointed to by `/etc/kudu/conf/`,
and create custom configuration files there. Some parts of the configuration are configured
in `/etc/default/kudu-master` and `/etc/default/kudu-tserver` files as well. You
should include or duplicate these configuration options if you create custom configuration files.
+
Review the configuration, including the default WAL and data directory locations,
and adjust them according to your requirements.
. Start Kudu services using the following commands:
+
[source,bash]
----
$ sudo service kudu-master start
$ sudo service kudu-tserver start
----
. To stop Kudu services, use the following commands:
+
[source,bash]
----
$ sudo service kudu-master stop
$ sudo service kudu-tserver stop
----
. Configure the Kudu services to start automatically when the server starts, by adding
them to the default runlevel.
+
[source,bash]
----
$ sudo chkconfig kudu-master on # RHEL / CentOS
$ sudo chkconfig kudu-tserver on # RHEL / CentOS
$ sudo update-rc.d kudu-master defaults # Debian / Ubuntu
$ sudo update-rc.d kudu-tserver defaults # Debian / Ubuntu
----
. For additional configuration of Kudu services, see link:configuration.html[Configuring
Kudu].
== Build From Source
If installing Kudu using parcels or packages does not provide the flexibility you
need, you can build Kudu from source. You can build from source on any operating system
supported by Cloudera.
[WARNING]
.Known Build Issues
====
* It is not possible to build Kudu on OSX.
* Do not build Kudu using `gcc` 4.6. It is known to cause runtime and test failures.
====
=== RHEL or CentOS
. Install the prerequisite libraries, if they are not installed:
+
----
$ sudo yum install boost-static boost-devel openssl-devel cyrus-sasl-devel
----
. Optional: Install `liboauth` and `liboauth-devel` if you plan to build the Twitter demo.
+
----
$ sudo yum install liboauth liboauth-devel
----
. Optional: Install the `asciidoctor` gem if you plan to build documentation.
+
----
$ sudo gem install asciidoctor
----
. Clone the Git repository and change to the new `kudu` directory.
+
[source,bash]
----
$ git clone http://github.mtv.cloudera.com/CDH/kudu
$ cd kudu
----
. Build any missing third-party requirements using the `build-if-necessary.sh` script.
+
[source,bash]
----
$ thirdparty/build-if-necessary.sh
----
. Build Kudu, using the utilities installed in the previous step. Edit the install
prefix to the location where you would like the Kudu binaries, libraries, and headers
installed during the `make install` step. The default value is `/usr/local/`.
+
[source,bash]
----
thirdparty/installed/bin/cmake . -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_PREFIX=/opt/kudu
make -j4
----
[[build_install_kudu]]
. Optional: Install Kudu binaries, libraries, and headers.
If you do not specify a `DESTDIR`, `/usr/local/` is the default.
+
[source,bash]
----
sudo make DESTDIR=/opt/kudu install
----
. Optional: Build the documentation.
+
----
$ make docs
----
.RHEL / Centos Build Script
====
This script provides an overview of the procedure to build Kudu on a
newly-installed RHEL or Centos host, and can be used as the basis for an
automated deployment scenario. It skips the steps marked *Optional* above.
[source,bash]
----
#!/bin/bash
sudo yum -y install boost-static boost-devel openssl-devel cyrus-sasl-devel
git clone http://github.sf.cloudera.com/CDH/kudu
cd kudu
thirdparty/build-if-necessary.sh
thirdparty/installed/bin/cmake . -DCMAKE_BUILD_TYPE=release
make -j4
make install
----
====
=== Ubuntu or Debian
. Install the prerequisite libraries, if they are not installed:
+
----
$ sudo apt-get -y install git autoconf automake libboost-thread-dev curl gcc g++ \
libssl-dev libsasl2-dev libtool ntp
----
. Optional: Install `liboauth-dev` if you plan to build the Twitter demo.
+
----
$ sudo apt-get -y install liboauth-dev
----
. Optional: Install the `asciidoctor` gem if you plan to build documentation.
+
----
$ sudo gem install asciidoctor
----
. Clone the Git repository and change to the new `kudu` directory.
+
[source,bash]
----
$ git clone http://github.mtv.cloudera.com/CDH/kudu
$ cd kudu
----
. Build any missing third-party requirements using the `build-if-necessary.sh` script.
+
[source,bash]
----
$ thirdparty/build-if-necessary.sh
----
. Build Kudu.
+
[source,bash]
----
thirdparty/installed/bin/cmake . -DCMAKE_BUILD_TYPE=release
make -j4
----
. Optional: Build the documentation.
+
----
$ make docs
----
.Ubuntu / Debian Build Script
====
This script provides an overview of the procedure to build Kudu on RHEL or
Centos, and can be used as the basis for an automated deployment scenario. It skips
the steps marked *Optional* above.
[source,bash]
----
#!/bin/bash
sudo apt-get -y install git autoconf automake libboost-thread-dev curl \
gcc g++ libssl-dev libsasl2-dev libtool ntp
git clone http://github.sf.cloudera.com/CDH/kudu
cd kudu
thirdparty/build-if-necessary.sh
thirdparty/installed/bin/cmake . -DCMAKE_BUILD_TYPE=release
make -j4
make install
----
====
[[build_cpp_client]]
== Installing the C++ Client Libraries
If you use Cloudera Manager with parcels, and need the Kudu client libraries to be
available for local development, install the `kudu-client` and `kudu-client-devel`
package for your platform. See <<install_packages>>.
WARNING: Only build against the client libraries and headers (`kudu_client.so` and `client.h`).
Other libraries and headers are internal to Kudu and have no stability guarantees.
[[build_java_client]]
== Build the Java Client
.Requirements
- JDK 7
- Apache Maven 3.x
- `protoc` 2.6 or newer installed in your path, or built from the `thirdparty/` directory.
You can run the following commands to build `protoc` from the third-party dependencies:
[source,bash]
----
$ thirdparty/download-thirdparty.sh
$ thirdparty/build-thirdparty.sh protobuf
----
To build the Java client, clone the Kudu Git
repository, change to the `java` directory, and issue the following command:
[source,bash]
----
$ mvn install -DskipTests
----
For more information about building the Java API, as well as Eclipse integration,
see `java/README.md`.
[[view_api]]
== View API Documentation
// tag::view_api[]
.C++ API Documentation
The documentation for the C++ client APIs is included in the header files in
`/usr/include/kudu/` if you installed Kudu using packages or subdirectories
of `src/kudu/client/` if you built Kudu from source. If you installed Kudu using parcels,
no headers are included in your installation. and you will need to <<build_kudu,build
Kudu from source>> in order to have access to the headers and shared libraries.
The following command is a naive approach to finding relevant header files. Use
of any APIs other than the client APIs is unsupported.
[source,bash]
----
$ find /usr/include/kudu -type f -name *.h
----
.Java API Documentation
You can view the link:../apidocs/index.html[Java API documentation] online. Alternatively,
after <<build_java_client,building the Java client>>, Java API documentation is available
in `java/kudu-client/target/apidocs/index.html`.
// end::view_api[]
== Next Steps
- link:configuration.html[Configuring Kudu]
- link:administration.html[Kudu Administration]