The Drill Druid storage plugin lets you run SQL queries against Druid datasources. This storage plugin is part of Apache Drill.
Druid supports multiple native query types to address a range of use cases. To fetch raw Druid rows, the Druid API supports two forms of query: SELECT (no relation to SQL) and SCAN. Currently, this plugin uses the Scan query API to fetch raw rows from Druid as JSON.
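For orientation, here is a minimal Python sketch of what a native Scan query body looks like. It is an illustration only, not the plugin's own code: the helper name `build_scan_query`, the `wikipedia` datasource, the interval, and the column names are assumptions chosen for the example.

```python
import json


def build_scan_query(datasource, interval, columns, limit=None):
    """Build a native Druid Scan query body as a plain dict (hypothetical helper)."""
    query = {
        "queryType": "scan",          # selects the Scan query type
        "dataSource": datasource,     # Druid datasource to read from
        "intervals": [interval],      # ISO-8601 interval(s) to scan
        "columns": list(columns),     # columns to return for each raw row
        "resultFormat": "list",       # rows come back as JSON objects
    }
    if limit is not None:
        query["limit"] = limit        # cap on the number of rows returned
    return query


# Assumed example values; adjust to the datasource you actually ingested.
scan_query = build_scan_query(
    "wikipedia",
    "2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z",
    ["__time", "channel", "page"],
    limit=10,
)
print(json.dumps(scan_query, indent=2))
```

A body like this would typically be sent as a POST to the broker's native query endpoint (`/druid/v2/` on the configured `brokerAddress`).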
Filters are pushed down to Druid's native filter structure: SQL WHERE clauses are converted to the corresponding Druid filters.
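The pushdown described above can be pictured with a small Python sketch. It shows how a WHERE clause might map onto Druid's native `selector`, `bound`, and `and` filter types. The helper functions and the column names (`channel`, `added`) are hypothetical, and the plugin's actual translation rules may differ.

```python
def selector(dimension, value):
    """Equality filter: dimension = value."""
    return {"type": "selector", "dimension": dimension, "value": value}


def greater_than(dimension, lower):
    """Strict numeric lower bound: dimension > lower."""
    return {
        "type": "bound",
        "dimension": dimension,
        "lower": str(lower),      # bound filters take their limits as strings
        "lowerStrict": True,      # strict comparison (>), not >=
        "ordering": "numeric",    # compare as numbers rather than lexicographically
    }


def and_filter(*fields):
    """Logical AND over any number of child filters."""
    return {"type": "and", "fields": list(fields)}


# Assumed example: WHERE channel = '#en.wikipedia' AND added > 100
druid_filter = and_filter(
    selector("channel", "#en.wikipedia"),
    greater_than("added", 100),
)
print(druid_filter)
```

The resulting dict is the shape that goes into the `filter` field of a native Druid query.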
The plugin can be registered in Apache Drill through the Drill web interface by navigating to the Storage page. The default registration configuration is as follows.
{
  "type": "druid",
  "brokerAddress": "http://localhost:8082",
  "coordinatorAddress": "http://localhost:8081",
  "averageRowSizeBytes": 100,
  "enabled": false
}
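As an alternative to the web interface, the same configuration can in principle be submitted through Drill's REST API. The Python sketch below only builds the request and does not send it. It assumes Drill's web server is on its default port 8047 and that the plugin is registered under the name `druid`. The helper `build_registration_request` is hypothetical, and `enabled` is flipped to `true` here so the plugin is usable right after registration.

```python
import json
import urllib.request

DRILL_URL = "http://localhost:8047"  # assumption: Drill web/REST port left at its default

plugin_config = {
    "type": "druid",
    "brokerAddress": "http://localhost:8082",
    "coordinatorAddress": "http://localhost:8081",
    "averageRowSizeBytes": 100,
    "enabled": True,  # the default configuration above ships with false
}


def build_registration_request(name, config, drill_url=DRILL_URL):
    """Build (but do not send) a POST request that registers a storage plugin."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{drill_url}/storage/{name}.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_registration_request("druid", plugin_config)
print(request.get_method(), request.full_url)

# To send it against a running Drill instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```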
Building the plugin
mvn install -pl contrib/storage-druid
Building Drill
mvn clean install -DskipTests
Start Drill In Embedded Mode (mac)
distribution/target/apache-drill-1.20.0-SNAPSHOT/apache-drill-1.20.0-SNAPSHOT/bin/drill-embedded
Starting Druid (Docker and Docker Compose required)
cd contrib/storage-druid/src/test/resources/druid
docker-compose up -d
An indexing task JSON file is in the same folder as the Docker Compose file. It can be used to ingest the wikipedia datasource.
Make sure the druid storage plugin is enabled in Drill.