tree: 586a50ff7804d5bcce82d78379905de9e1c20787 [path history] [tgz]
  2. build.gradle
  3. src/

Example Pipelines

The examples included in this module serve to demonstrate the basic functionality of Apache Beam, and act as starting points for the development of more complex pipelines.

Word Count

A good starting point for new users is our set of word count examples, which computes word frequencies. This series of four successively more detailed pipelines is described in detail in the accompanying walkthrough.

  1. MinimalWordCount is the simplest word count pipeline and introduces basic concepts like Pipelines, PCollections, ParDo, and reading and writing data from external storage.

  2. WordCount introduces best practices like PipelineOptions and custom PTransforms.

  3. DebuggingWordCount demonstrates some best practices for instrumenting your pipeline code.

  4. WindowedWordCount shows how to run the same pipeline over either unbounded PCollections in streaming mode or bounded PCollections in batch mode.

Running Examples

See Apache Beam WordCount Example for information on running these examples.

Beyond Word Count [WIP]

After you've finished running your first few word count pipelines, take a look at the cookbook directory for some common and useful patterns like joining, filtering, and combining.

The complete directory contains a few realistic end-to-end pipelines.

See the other examples as well. This directory includes a Java 8 version of the MinimalWordCount example, as well as a series of examples in a simple ‘mobile gaming’ domain. This series introduces some advanced concepts.