# Running Apache Drill
## Prerequisites
* Linux, Windows, or OS X
* Oracle or OpenJDK 8 (JDK, not JRE)

Additional requirements when running in clustered mode:

* A Hadoop 2.3+ distribution (such as Apache or MapR)
* ZooKeeper
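
A quick way to check the JDK requirement is to look for `javac`, which ships with a JDK but not with a JRE. The sketch below assumes a POSIX shell. `check_jdk` is a hypothetical helper name, not something Drill provides, and it does not verify the Java version.

```shell
# check_jdk: hypothetical helper, not part of Drill.
# Succeeds only if a Java compiler is on the PATH, which indicates a JDK
# rather than a JRE-only install. It does not check the major version.
check_jdk() {
    if command -v javac >/dev/null 2>&1; then
        echo "JDK found: $(command -v javac)"
        return 0
    fi
    echo "javac not found; install a JDK (not just a JRE)" >&2
    return 1
}
```

Running `check_jdk && javac -version` also prints the compiler version, so you can confirm that it reports a 1.8 release.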
## Installing the Tarball
1. mkdir /opt/drill
2. tar xvzf [tarball] --strip-components=1 -C /opt/drill
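
The two steps above can be wrapped in a small shell function. This is a minimal sketch, assuming a POSIX shell and a gzip-compressed tarball with a single top-level directory. `install_drill` is a hypothetical helper name, not part of the Drill distribution.

```shell
# install_drill: hypothetical helper, not part of Drill.
# Usage: install_drill <tarball> [install_dir]
install_drill() {
    tarball="$1"               # path to the downloaded Drill tarball
    dest="${2:-/opt/drill}"    # install directory; defaults to /opt/drill

    # Create the install directory (and any missing parents).
    mkdir -p "$dest" || return 1

    # --strip-components=1 drops the archive's top-level directory,
    # so bin/, conf/ and the rest land directly under $dest.
    tar -xzf "$tarball" --strip-components=1 -C "$dest"
}
```

For example, `install_drill [tarball] /opt/drill` reproduces the steps above. Writing to `/opt` usually requires elevated privileges.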
## Running in embedded mode
1. cd /opt/drill
2. bin/sqlline -u jdbc:drill:zk=local
3. Run a query (below).
## Running in clustered mode
1. Edit conf/drill-override.conf to provide the ZooKeeper location
2. Start the Drillbit on the node using bin/drillbit.sh start
3. Repeat steps 1 and 2 on the other nodes
4. Connect with sqlline using bin/sqlline -u "jdbc:drill:zk=[zk_host:port]"
5. Run a query (below).
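
As an illustration of step 1, the sketch below writes a minimal `drill-override.conf` containing the `cluster-id` and `zk.connect` settings under `drill.exec`. `write_drill_override` is a hypothetical helper name, and the cluster ID and ZooKeeper hosts in the usage note are placeholder values, not values taken from this document.

```shell
# write_drill_override: hypothetical helper, not part of Drill.
# Usage: write_drill_override <conf_dir> <cluster_id> <zk_connect>
# Overwrites <conf_dir>/drill-override.conf with a minimal cluster config.
write_drill_override() {
    conf_dir="$1"      # e.g. /opt/drill/conf
    cluster_id="$2"    # must be the same on every node of the cluster
    zk_connect="$3"    # comma-separated ZooKeeper host:port list

    cat > "$conf_dir/drill-override.conf" <<EOF
drill.exec: {
  cluster-id: "$cluster_id",
  zk.connect: "$zk_connect"
}
EOF
}
```

For example, `write_drill_override /opt/drill/conf drillbits1 "zk1:2181,zk2:2181"` (placeholder host names) would be run on each node before starting its Drillbit. Because the function overwrites the file, back up any existing settings first.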
## Run a query
Drill comes preinstalled with a number of example data files, including a small copy of the TPC-H data in self-describing Parquet files and the foodmart database in JSON. You can query these files using the cp (classpath) schema. For example:

    USE cp;

    SELECT
      employee_id,
      first_name
    FROM `employee.json`;

## More information
For more information, including how to run an Apache Drill cluster, visit the [Apache Drill Documentation](http://drill.apache.org/docs/).