Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of event data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple, extensible data model that supports online analytic applications.
The Apache Flume Twitter module provides a source that receives data from Twitter.
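As a rough illustration of how the source might be wired into an agent, the fragment below sketches a minimal agent configuration. The agent name (a1), the component names (r1, c1, k1), the memory channel, the logger sink, and the placeholder credentials are all assumptions chosen for illustration; the property names follow the Twitter source section of the Flume 1.x user guide and should be checked against the version you are running.

```properties
# Hypothetical agent "a1" with one source, one channel, one sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Twitter source; replace the placeholder values with your own credentials.
a1.sources.r1.type = org.apache.flume.source.twitter.TwitterSource
a1.sources.r1.channels = c1
a1.sources.r1.consumerKey = YOUR_TWITTER_CONSUMER_KEY
a1.sources.r1.consumerSecret = YOUR_TWITTER_CONSUMER_SECRET
a1.sources.r1.accessToken = YOUR_TWITTER_ACCESS_TOKEN
a1.sources.r1.accessTokenSecret = YOUR_TWITTER_ACCESS_TOKEN_SECRET

# In-memory channel, used here only to keep the example short.
a1.channels.c1.type = memory

# Logger sink, so received events show up in the agent log.
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```

A memory channel and logger sink are convenient for a first smoke test; a real deployment would normally swap in a durable channel and a sink that writes to the intended destination.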
Apache Flume Twitter is open source and licensed under the Apache License, Version 2.0.
Documentation is included in the binary distribution under the docs directory. In source form, it can be found in the flume-ng-doc directory.
The Flume 1.x guide and FAQ are available here:
Bug and Issue tracker.
Compiling Flume Twitter requires the following tools: