Running Apache Drill
Prerequisites
- Linux, Windows, or OS X
- Oracle/OpenJDK 8 (JDK, not JRE)
Additional requirements when running in clustered mode:
- A Hadoop 2.3+ distribution (such as Apache or MapR)
- ZooKeeper
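The JDK requirement can be checked from a shell before installing. The following is a minimal sketch, assuming a POSIX shell; `has_command` is a hypothetical helper name, and the presence of `javac` on the PATH is used as a rough signal that a full JDK (rather than a JRE) is installed. It does not verify the Java version for you beyond printing it.

```shell
# Sketch: confirm a JDK (not just a JRE) is available on the PATH.
# has_command is a hypothetical helper, not part of Drill.
has_command() {
  command -v "$1" >/dev/null 2>&1
}

if has_command javac; then
  # javac ships only with a JDK; print its version for a manual check.
  javac -version
else
  echo "javac not found: install a JDK, not just a JRE" >&2
fi
```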
Installing the Tarball
- mkdir /opt/drill
- tar xvzf [tarball] --strip-components=1 -C /opt/drill
Running in embedded mode
- cd /opt/drill
- bin/sqlline -u jdbc:drill:zk=local
- Run a query (below).
Running in clustered mode
- Edit conf/drill-override.conf to provide the ZooKeeper location
- Start the drillbit using bin/drillbit.sh start
- Repeat on other nodes
- Connect with sqlline using bin/sqlline -u "jdbc:drill:zk=[zk_host:port]"
- Run a query (below).
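As an illustration of the ZooKeeper setting mentioned in the steps above, the sketch below prints a drill-override.conf fragment. It is not an official tool: `render_drill_override` is a hypothetical helper, the host names are placeholders, and `drillbits1` is assumed here to be the default cluster ID. Every drillbit in the same cluster should use the same `cluster-id` and `zk.connect` values.

```shell
# Sketch: print a minimal drill-override.conf fragment.
# render_drill_override is a hypothetical helper, not part of Drill.
#   $1 = ZooKeeper connect string (host:port, comma-separated for a quorum)
#   $2 = cluster ID (optional; "drillbits1" is an assumed default)
render_drill_override() {
  zk_connect="$1"
  cluster_id="${2:-drillbits1}"
  cat <<EOF
drill.exec: {
  cluster-id: "$cluster_id",
  zk.connect: "$zk_connect"
}
EOF
}

# Example: fragment for a placeholder three-node ZooKeeper quorum.
render_drill_override "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
```

The output can be reviewed and then copied into drill-override.conf on each node before starting the drillbit.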
Run a query
Drill ships with a number of example data files, including a small copy of the TPC-H data in self-describing Parquet files and the foodmart database in JSON. You can query these files using the cp schema. For example:
USE cp;
SELECT
  employee_id,
  first_name
FROM `employee.json`;
More information
For more information, including how to run an Apache Drill cluster, visit the Apache Drill Documentation.