examples/streaming/fsio/README.md

This directory contains an example that reads a Hadoop sequence file and duplicates its content to new sequence files. This README explains how to get started with it quickly.

This is a deliberately simple example, and it has some limitations you should know about:

The example accepts only a single sequence file as input, not a directory, and the output is also written in sequence file format.
The example duplicates the input file continuously, so the output files grow large if it runs for a long time.
Each SeqFileStreamProcessor generates its own output file.
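
Because the output keeps growing while the application runs, it can help to watch the size of the output directory. The following is a minimal sketch that assumes the output directory is on the local file system. The helper names `output_size_kb` and `output_too_large` are hypothetical and not part of Gearpump. For HDFS, `hdfs dfs -du -s` reports the equivalent figure.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gearpump): print the size of a local
# output directory in kilobytes, using POSIX `du -sk`.
output_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Hypothetical helper: succeed (exit status 0) when the directory is larger
# than the given limit in kilobytes, fail otherwise.
output_too_large() {
  [ "$(output_size_kb "$1")" -gt "$2" ]
}
```

For example, `output_too_large "$OUTPUT_DIRECTORY" 1048576` succeeds once the directory exceeds roughly 1 GiB, at which point you may want to stop the application.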

To run the example:

Prepare a sequence file and save it to the local file system or HDFS.
Start a Gearpump cluster, including the Master and Workers.
Submit the application:

./target/pack/bin/gear app -jar ./examples/target/$SCALA_VERSION_MAJOR/gearpump-examples-assembly-$VERSION.jar org.apache.gearpump.streaming.examples.fsio.SequenceFileIO -input $INPUT_FILE_PATH -output $OUTPUT_DIRECTORY

Stop the application:

./target/pack/bin/gear kill -appid $APPID

Note that the -output parameter should point to a directory, not a file.
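
The path requirements above (a single input file, a directory for output) can be checked before submitting. This is a rough sketch for local paths only. The function name `check_fsio_paths` is hypothetical and not part of Gearpump, and whether the example creates a missing output directory is not stated here, so the sketch only rejects an output path that exists and is not a directory.

```shell
#!/bin/sh
# Hypothetical preflight check (not part of Gearpump), local paths only.
# Usage: check_fsio_paths INPUT_FILE_PATH OUTPUT_DIRECTORY
check_fsio_paths() {
  input=$1
  output=$2
  # The example accepts a single sequence file, not a directory.
  if [ ! -f "$input" ]; then
    echo "input must be a single sequence file: $input" >&2
    return 1
  fi
  # The output parameter should be a directory; reject an existing non-directory.
  if [ -e "$output" ] && [ ! -d "$output" ]; then
    echo "output must be a directory: $output" >&2
    return 1
  fi
  return 0
}
```

Running `check_fsio_paths "$INPUT_FILE_PATH" "$OUTPUT_DIRECTORY"` before the submit command catches the most common mistakes early. Paths on HDFS would need `hdfs dfs -test -f` and `hdfs dfs -test -d` instead of the local `[ -f ]` and `[ -d ]` tests.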