blob: 910705e7dbfed6239d7759af5c36402aea3b0a22 [file] [log] [blame]
Flume 0.9.5
Files written by a collector now have a '.tmp' suffix that indicates
that the file is still being written. When the file is completed,
this suffix is removed.
Flume 0.9.4
Old Thrift sinks have been removed and are no longer available.
These are 'tacksink' and 'trawsink'. Additionally, the old aliases
'tsink' and 'tsource' are no longer present. Use the full names
'thriftSink' and 'thriftSource' instead.
The setting 'flume.collector.dfs.compress.gzip' was deprecated in
a previous release. It has now been removed. You should use the
'flume.collector.dfs.compress.codec' setting instead.
Flume 0.9.3
There have been some changes that removed advanced and purposely
undocumented sinks such as the 'let' sink and the 'failchain' sink.
Additions to the language include a new syntax for decorators,
python-style keyword arguments, and a new 'collector' syntax that
allows for multiple collector destinations.
Flume 0.9.2
Multiple masters does not work in conjunction with automatic failover
chains configurations.
This version significantly reduces the chances of data duplication
encountered due to the FLUME-252 (races in tail) bug. However, it
does not completely fix the problem. A workaround is to use
'exec("tail -F <file>")' instead of the tail source Linux/Unix
systems. A new issue has been filed as FLUME-320 to continue tracking
this problem.
Flume 0.9.2 is *not backwards compatible* with <0.9.2 installations
due to FLUME-216 - this release changes how configurations and node
maps are persisted to ZooKeeper and 0.9.2 masters cannot read
configurations or node maps written by masters from <0.9.2
installations. There is no upgrade path yet to preserve both node maps
and configurations, but one is planned for the 0.9.2 release.
The bug that caused FLUME-286 has been fixed for Thrift RPCs but not
for Avro-based RPC mechanisms. Previous to the fix, the Thrift
version would not recover when a downstream RPC server had a network
partion such as a wire cut or power down. Unlike killing a server,
these situations provided no failure feedback. The Avro version
currently does not properly detect or recover from these situations so
the DFO mode can lose data. This is reported as issue FLUME-313.
Flume 0.9.1 Update 1 (CDH3b3) Release Notes
This release is a bugfix update to the 0.9.1 release.
The only new features added to this release are:
* Support for writing to kerberized HDFS.
* Some minor fixes that prevent some user errors
* Some cosmetic fixes to the web interface.
Bug fixes:
* Critical bug fixes having to do with node hangs or situations where
agents do not properly recover from failures.
Many documentation updates including:
* A "cookbook" of example use case tutorials
* Documentation corrections
* More details on plugin behavior
Compatiblity Changes:
* When run from the command line, Flume now defaults to storing the
agent data in /tmp/flume-${} instead of /tmp/flume. When
upgrading an existing installation, if you have not configured an
alternate path, please rename /tmp/flume to /tmp/flume-flume
Known bugs (includes 0.9.1 issues):
* Tail is updated to potentially fix a race condition but races are
still possible (FLUME-218). Another implementation is pending for a
0.9.2 release (FLUME-252).
Flume 0.9.1 Release notes
This release improves and adds several key features including:
* gzip HDFS output file compression
* tailDir source
* Exposed internal configuration and extension information via web
* Automated eclipse project generation via ant
* Added documentation to source repository
* Example plugins and documentation on plugins.
* Numerous typographic corrections.
There were several critical fixes in this release as well.
* Fixed support for data coming from scribe
* Fixed syslog source underflow problem
* Fixed issues with hangs when changing or decommissioning certain
node configurations
* Prevent certain user errors
Compatability changes:
* We no longer specifiy a -Xmx memory setting for the JVM. By default
the JVM will set this value to 128MB. This value may need to be
increased on nodes. (FLUME-72)
Now Stable:
* Fully-Distributed Mode
* Sink/Source/Decorator Plugins
* Automatic versions of failover.
* Logical Naming translations.
Highlighted known bugs:
* Multiple Flume Masters does not work with automatic failover chains
and automatic flow isolation. (FLUME-114, FLUME-150)
* There are several known performance bugs (FLUME-264, FLUME-192).
* One event is lost in BE mode and two events are lost in DFO mode
when recovering from collector outage. These losses are technically
allowed but can probably be improved. (FLUME-257)
* Tail has a race condition that results in event duplication
(FLUME-218) and there are issues with non ascii characters
Flume 0.9.0 (CDH3b2) Release Notes
This release of Flume implements the core framework and has
implementations of most of the major features of the Flume system
described in the Flume User Guide. Some sections are stable, while
others are under active development. We also list several known bugs
and instabilities.
Supported platforms: Linux, requires Java 1.6, not tested on windows
* Introduction / Design
* Quick Start
* Psuedo-Distributed Mode (node configuration via master)
** Simple web interface
** Agent/collector roles
* Fully-Distributed Mode section:
** Static configuration of failover chains.
** Single master Memory-Backed and Zookeeper-Backed Configuration
* Data model of Flume event
* Output Bucketing with the CollectorSink
* Flume's Dataflow language
* Flume Shell
* Fully-Distributed Mode:
** Modes for unreliable network conditions in testing
** Disk failover mode and End-to-end modes
** DFO and E2E retransmission is unstable when connected to a network
* Sink/Source/Decorator Plugins
Active Development:
* Automatic versions of failover.
* Logical Naming translations.
* Multiple Flume Masters
* Output Formatting options
* Custom metadata extraction
Known Bugs
* ZK session timeout makes master unrecoverable
* Shell does not provide useful feedback on errors
* Translated configurations are not always auto-updated when logical
nodes are newly spawned onto physical nodes. An extra 'refreshAll'
command may need to be executed.
* Two collectorSources on the same node sometimes fails to translate
the more recent collectorSource
* Logical node names must be unique
* Logical nodes with the same name as the physical nodes on which they
run cannot be deleted