Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of event data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple, extensible data model that allows for online analytic applications.
The Apache Flume Hadoop module provides Flume components that leverage Hadoop technologies.
Apache Flume Hadoop is open-sourced under the Apache License, Version 2.0.
Documentation is included in the binary distribution under the docs directory. In source form, it can be found in the flume-ng-doc directory.
The Flume 1.x guide and FAQ are available here:
Bug and Issue tracker.
Compiling Flume Hadoop requires the following tools: