~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.
~~
Overview
Apache Chukwa supports two different reliability strategies.
The first, and default, strategy is as follows: collectors write data to HDFS
and, as soon as the HDFS write call returns success, report success to the
agent, which then advances its checkpoint state.
This is potentially a problem if HDFS (or some other storage tier) has
non-durable or asynchronous writes, since a write call can then return success
before the data is safely stored. As a result, Apache Chukwa offers a second
mechanism, asynchronous acknowledgement, for coping with this case.
This mechanism is enabled by setting the option <httpConnector.asyncAcks>.
The option applies to both agents and collectors. On the collector side, it
tells the collector to return asynchronous acknowledgements. On the agent side,
it tells agents to look for and process them correctly. Agents with the option
set to false will still work correctly with collectors where it is set to true.
The reverse is not generally true: agents with the option enabled expect
collectors to be able to answer questions about the state of the filesystem.
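For instance, assuming a standard Hadoop-style configuration file (the exact
file names, such as <chukwa-agent-conf.xml> and <chukwa-collector-conf.xml>,
depend on the installation), the option could be set as follows:

+---+
<!-- Illustrative snippet; set this in both the agent and the collector
     configuration to enable asynchronous acknowledgement. -->
<property>
  <name>httpConnector.asyncAcks</name>
  <value>true</value>
</property>
+---+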
Theory
In this approach, rather than try to build a fault-tolerant collector,
Apache Chukwa agents look <<through>> the collectors to the underlying state of the
filesystem. This filesystem state is what is used to detect and recover from
failure. Recovery is handled entirely by the agent, without requiring anything
at all from the failed collector.
When an agent sends data to a collector, the collector responds with the name
of the HDFS file in which the data will be stored and the future location of
the data within that file. This is easy to compute: since each file is written
by only a single collector, the collector need only enqueue the data and add
up lengths.
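A minimal sketch of this bookkeeping, with hypothetical class and method names
(the actual collector code differs), might look like this:

+---+
// Sketch only: hypothetical names, not Chukwa's actual code. It shows why
// the promised offset is cheap to compute when a sink file has one writer.
public class SinkFileBookkeeper {

  /** The acknowledgement handed back to the agent. */
  public static final class CommitPromise {
    public final String fileName;
    public final long offset;
    public final long length;
    CommitPromise(String fileName, long offset, long length) {
      this.fileName = fileName;
      this.offset = offset;
      this.length = length;
    }
  }

  private final String currentFileName; // HDFS sink file this collector writes
  private long nextOffset = 0;          // bytes enqueued so far

  public SinkFileBookkeeper(String fileName) {
    this.currentFileName = fileName;
  }

  /** Enqueue a chunk and return where it will land in the sink file. */
  public synchronized CommitPromise enqueue(byte[] chunk) {
    long start = nextOffset;
    nextOffset += chunk.length; // single writer, so adding lengths suffices
    return new CommitPromise(currentFileName, start, chunk.length);
  }
}
+---+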
Every few minutes, each agent process polls a collector to find the current
length of each file to which data is being written. That length is then
compared with the end offset of each pending chunk (its promised offset plus
its size). If the file length reaches this value, the data has been committed
and the agent process advances its checkpoint accordingly. (Note that the
length returned by the filesystem counts only data that has been successfully
replicated.)
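Continuing the sketch above (names still hypothetical), the agent-side check
that turns reported file lengths into checkpoint advances is roughly:

+---+
// Sketch only: the real logic lives in Chukwa's agent code.
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class CommitChecker {

  /** Called every few minutes with the lengths a collector reported. */
  public static void checkCommits(
      List<SinkFileBookkeeper.CommitPromise> pending,
      Map<String, Long> reportedLengths) {
    for (Iterator<SinkFileBookkeeper.CommitPromise> it = pending.iterator();
         it.hasNext(); ) {
      SinkFileBookkeeper.CommitPromise p = it.next();
      Long len = reportedLengths.get(p.fileName);
      // The filesystem reports only successfully replicated bytes, so once
      // the file length covers the end of this chunk, the chunk is durable.
      if (len != null && len >= p.offset + p.length) {
        advanceCheckpoint(p);
        it.remove();
      }
    }
  }

  private static void advanceCheckpoint(SinkFileBookkeeper.CommitPromise p) {
    // Hypothetical: persist agent progress past this chunk.
  }
}
+---+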
There is nothing essential about the role of collectors in monitoring the
written files. Collectors store no per-agent state. The reason to poll
collectors, rather than the filesystem directly, is to reduce the load on
the filesystem master and to shield agents from the details of the storage
system.
The collector component that handles these requests is
<datacollection.collector.servlet.CommitCheckServlet>.
This servlet is started if <httpConnector.asyncAcks> is set to true in the
collector configuration.
On error, agents resume from their last checkpoint and pick a new collector.
The total volume of data retransmitted after a failure is bounded by the
period between collector file rotations: with five-minute rotations, for
instance, at most roughly five minutes' worth of data is resent.
The solution is end-to-end. Authoritative copies of data can only exist in
two places: the nodes where data was originally produced, and the HDFS file
system where it will ultimately be stored. Collectors only hold soft state;
the only ``hard'' state stored by Apache Chukwa is the agent checkpoints. The
flow of messages in this protocol can be sketched as follows.
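+---+
 Agent                        Collector                      HDFS
   |  -- post chunk ----------->  |                            |
   |                              |  -- append to sink file -->|
   |  <-- ack: (file, offset) --  |                            |
   |             ...              |                            |
   |  -- poll file lengths ---->  |                            |
   |                              |  -- query file length ---> |
   |                              |  <-- replicated length --- |
   |  <-- committed lengths ----  |                            |
   |                              |                            |
   |  advance checkpoint once length >= chunk offset + size    |
+---+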
Configuration
In addition to <httpConnector.asyncAcks> (which enables asynchronous
acknowledgement), a number of other options affect this mode of operation.
Option <chukwaCollector.asyncAcks.scanperiod> controls how often collectors
check the filesystem for commits. It defaults to twice the rotation interval.
Option <chukwaCollector.asyncAcks.scanpaths> determines where in HDFS
collectors will look. It defaults to the data sink directory plus the archive
directory.
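For example, the corresponding collector configuration entries might look like
this (the values shown are illustrative, not the defaults, and the scan period
is assumed to be in milliseconds):

+---+
<!-- Illustrative values only; the real defaults are described above. -->
<property>
  <name>chukwaCollector.asyncAcks.scanperiod</name>
  <value>600000</value> <!-- assumed milliseconds: scan every ten minutes -->
</property>
<property>
  <name>chukwaCollector.asyncAcks.scanpaths</name>
  <value>/chukwa/logs/,/chukwa/archive/</value> <!-- hypothetical paths -->
</property>
+---+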
In the future, ZooKeeper could be used instead to track rotations.