Druid Firehoses

Firehoses are used in the stream-pull ingestion model. They are pluggable, so the configuration schema varies with the type of firehose.

| Field | Type | Description | Required |
|-------|------|-------------|----------|
| type | String | Specifies the type of firehose. Each value has its own configuration schema. Firehoses packaged with Druid are described below. | yes |

Additional Firehoses

Several firehoses are readily available in Druid. Some are meant as examples; others can be used directly in a production environment.

For additional firehoses, please see our extensions list.

LocalFirehose

This firehose reads data from files on local disk. It is useful for proofs of concept (POCs) that ingest data already on disk. A sample local firehose spec is shown below:

{
    "type"    : "local",
    "filter"   : "*.csv",
    "baseDir"  : "/data/directory"
}

| property | description | required? |
|----------|-------------|-----------|
| type | This should be “local”. | yes |
| filter | A wildcard filter for files. See here for more information. | yes |
| baseDir | Directory to search recursively for files to be ingested. | yes |

IngestSegmentFirehose

This firehose reads data from existing Druid segments. It can be used to re-ingest existing segments with a new schema and to change the name, dimensions, metrics, rollup, etc. of the segment. A sample ingest firehose spec is shown below:

{
    "type"    : "ingestSegment",
    "dataSource"   : "wikipedia",
    "interval" : "2013-01-01/2013-01-02"
}

| property | description | required? |
|----------|-------------|-----------|
| type | This should be “ingestSegment”. | yes |
| dataSource | A String defining the data source to fetch rows from, very similar to a table in a relational database. | yes |
| interval | A String representing an ISO-8601 interval. This defines the time range to fetch the data over. | yes |
| dimensions | The list of dimensions to select. If left empty, no dimensions are returned. If left null or not defined, all dimensions are returned. | no |
| metrics | The list of metrics to select. If left empty, no metrics are returned. If left null or not defined, all metrics are selected. | no |
| filter | See Filters. | no |
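
The optional properties can be combined to re-ingest only part of a data source. The sketch below is illustrative only: the dimension names (page, language), the metric names (count, added), and the filter value are hypothetical, and the filter assumes the selector filter type described in the Filters documentation.

```json
{
    "type": "ingestSegment",
    "dataSource": "wikipedia",
    "interval": "2013-01-01/2013-01-02",
    "dimensions": ["page", "language"],
    "metrics": ["count", "added"],
    "filter": {
        "type": "selector",
        "dimension": "language",
        "value": "en"
    }
}
```

Under these assumptions, only rows whose language dimension equals "en" would be read, and only the listed dimensions and metrics would be carried into the new segments.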

CombiningFirehose

This firehose combines and merges data from a list of different firehoses. A sample combining firehose spec is shown below:

{
    "type"  :   "combining",
    "delegates" : [ { firehose1 }, { firehose2 }, ..... ]
}

| property | description | required? |
|----------|-------------|-----------|
| type | This should be “combining”. | yes |
| delegates | List of firehoses to combine data from. | yes |
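
To make the delegates placeholder concrete, the sketch below fills it with the local and ingestSegment sample specs shown earlier on this page. It is one possible pairing, not a recommended configuration, and it assumes both delegates yield rows compatible with the same parser and schema.

```json
{
    "type": "combining",
    "delegates": [
        {
            "type": "local",
            "filter": "*.csv",
            "baseDir": "/data/directory"
        },
        {
            "type": "ingestSegment",
            "dataSource": "wikipedia",
            "interval": "2013-01-01/2013-01-02"
        }
    ]
}
```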

EventReceiverFirehose

EventReceiverFirehoseFactory can be used to ingest events through an HTTP endpoint.

{
  "type": "receiver",
  "serviceName": "eventReceiverServiceName",
  "bufferSize": 10000
}

When using this firehose, events can be sent by submitting a POST request to the HTTP endpoint:

http://<peonHost>:<port>/druid/worker/v1/chat/<eventReceiverServiceName>/push-events/

| property | description | required? |
|----------|-------------|-----------|
| type | This should be “receiver”. | yes |
| serviceName | Name used to announce the event receiver service endpoint. | yes |
| bufferSize | Size of the buffer used by the firehose to store events. | no (default: 100000) |
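
As a rough illustration of a request body for the push-events endpoint, the sketch below assumes that the endpoint accepts a JSON array of event objects sent with a Content-Type of application/json, and that each event's fields match the parser configured for the task. The field names timestamp, page, and added are hypothetical.

```json
[
    { "timestamp": "2015-08-25T01:00:00.000Z", "page": "Druid", "added": 57 },
    { "timestamp": "2015-08-25T01:00:05.000Z", "page": "Firehose", "added": 12 }
]
```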

The shutdown time for an EventReceiverFirehose can be specified by submitting a POST request to:

http://<peonHost>:<port>/druid/worker/v1/chat/<eventReceiverServiceName>/shutdown?shutoffTime=<shutoffTime>

If shutoffTime is not specified, the firehose shuts off immediately.

TimedShutoffFirehose

This can be used to start a firehose that will shut down at a specified time. An example is shown below:

{
    "type": "timed",
    "shutoffTime": "2015-08-25T01:26:05.119Z",
    "delegate": {
        "type": "receiver",
        "serviceName": "eventReceiverServiceName",
        "bufferSize": 100000
    }
}

| property | description | required? |
|----------|-------------|-----------|
| type | This should be “timed”. | yes |
| shutoffTime | Time at which the firehose should shut down, in ISO 8601 format. | yes |
| delegate | Firehose to use. | yes |