Running Apache Drill

Prerequisites

Linux, Windows, or macOS
Oracle/OpenJDK 8 (JDK, not JRE)

Additional requirements when running in clustered mode:

Hadoop 2.3+ distribution (such as Apache or MapR)
ZooKeeper
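
The JDK requirement above can be checked before installing. The following is a minimal sketch, not part of Drill itself: it assumes that javac is on the PATH whenever a JDK is installed, and the helper name java_major is made up for this example.

```shell
#!/bin/sh
# java_major VERSION: print the major Java version for a version string.
# Legacy strings look like "1.8.0_292" (major is the second field);
# newer ones look like "11.0.2" or "17" (major is the first field).
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

# javac ships with the JDK but not with a JRE, so finding it suggests a JDK.
if command -v javac >/dev/null 2>&1; then
  # Java 8 prints the version on stderr, later releases on stdout.
  version=$(javac -version 2>&1 | awk '/^javac/ {print $2}')
  echo "Found a JDK, major version $(java_major "$version")"
else
  echo "No javac on PATH: install a JDK, not just a JRE"
fi
```

For example, java_major 1.8.0_292 prints 8 and java_major 11.0.2 prints 11.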

Installing the Tarball

mkdir -p /opt/drill
tar xvzf [tarball] --strip-components=1 -C /opt/drill
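
The same two steps can be wrapped in a small function so the destination is easy to change. This is a sketch under assumptions: the function name install_drill is hypothetical, and it spells out the full --strip-components option name, which drops the tarball's top-level directory.

```shell
#!/bin/sh
# install_drill TARBALL DEST: unpack TARBALL into DEST without its top-level directory.
install_drill() {
  tarball=$1
  dest=$2
  # Fail early with a clear message if the tarball path is wrong.
  [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
  # -p creates missing parent directories and tolerates an existing DEST.
  mkdir -p "$dest" || return 1
  tar -xzf "$tarball" --strip-components=1 -C "$dest"
}
```

Calling install_drill [tarball] /opt/drill reproduces the commands above; writing to /opt usually requires elevated permissions.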

Running in embedded mode

cd /opt/drill
bin/sqlline -u jdbc:drill:zk=local
Run a query (below).

Running in clustered mode

Edit conf/drill-override.conf to provide the ZooKeeper location
Start the Drillbit using bin/drillbit.sh start
Repeat on the other nodes
Connect with sqlline using bin/sqlline -u "jdbc:drill:zk=[zk_host:port]"
Run a query (below).
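
As an illustration of the first and fourth steps, the sketch below prints a drill-override.conf body and the matching JDBC URL. The ZooKeeper host names are placeholders, the function names render_override and jdbc_url are made up for this example, and the cluster id is assumed to be left at Drill's default of drillbits1 so that the short URL form above applies.

```shell
#!/bin/sh
# Placeholder values: replace with your own ZooKeeper hosts.
ZK_CONNECT="zk1.example.com:2181,zk2.example.com:2181"
# Assumed to stay at the default cluster id.
CLUSTER_ID="drillbits1"

# render_override CLUSTER_ID ZK_CONNECT: print a drill-override.conf body.
render_override() {
  cat <<EOF
drill.exec: {
  cluster-id: "$1",
  zk.connect: "$2"
}
EOF
}

# jdbc_url ZK_CONNECT: print the JDBC URL that sqlline takes after -u.
jdbc_url() {
  echo "jdbc:drill:zk=$1"
}

render_override "$CLUSTER_ID" "$ZK_CONNECT"
jdbc_url "$ZK_CONNECT"
```

The script only prints to standard output. Review the result before copying the first part into conf/drill-override.conf on each node, and quote the URL when passing it to bin/sqlline -u.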

Run a query

Drill ships with a number of example data files, including a small copy of the TPC-H data in self-describing Parquet files and the foodmart database in JSON. You can query these files using the cp (classpath) schema. For example:

USE cp;

SELECT 
  employee_id, 
  first_name
FROM `employee.json`;

More information

For more information, including how to run an Apache Drill cluster, visit the Apache Drill Documentation.