tree 1cd93287596ca07146da0ca50f89e409a62735e7
parent 2fa4fdecd8ef06534e369e527b15ae8193823c8b
parent 32aeb7ac3d49ade0dc3ad79e711e7b624091d485
author Kenneth Knowles <klk@google.com> 1504809502 -0700
committer Kenneth Knowles <klk@google.com> 1504809502 -0700

This closes #3705: [BEAM-165] Initial implementation of the MapReduce runner

  mr-runner: remove WordCountTest, fix checkstyle and findbugs issues, and address review comments.
  mr-runner-hack: disable unrelated modules to shorten build time during development.
  mr-runner: support SourceMetrics, this fixes MetricsTest.testBoundedSourceMetrics().
  mr-runner: introduces duplicateFactor in FlattenOperation, this fixes testFlattenInputMultipleCopies().
  mr-runner: translate empty flatten into EmptySource, this fixes a few empty FlattenTests.
  mr-runner: ensure each Operation only starts/finishes once in a diamond-shaped DAG, this fixes ParDoLifecycleTest.
  mr-runner: Graph.getSteps() returns steps in topological order, this fixes a few CombineTests.
  mr-runner: fail early in the runner when MapReduce job fails.
  mr-runner: use InMemoryStateInternals in ParDoOperation, this fixes the ParDoTest cases that use state.
  mr-runner: use the correct step name in ParDoTranslator, this fixes MetricsTest.testAttemptedCounterMetrics().
  mr-runner: remove the hard-coded GlobalWindow coder, this fixes WindowingTest.
  mr-runner: handle the no-files case in FileSideInputReader for empty views.
  mr-runner: fix NPE in PipelineTest.testIdentityTransform().
  mr-runner: filter out unsupported features in ValidatesRunner tests.
  mr-runner: setMetricsSupported to run ValidatesRunner tests with TestPipeline.
  mr-runner: fix the bug where steps are attached multiple times in a diamond-shaped DAG.
  [BEAM-2783] support metrics in MapReduceRunner.
  mr-runner: setup file paths for read and write sides of materialization.
  mr-runner: support side inputs by reading in the contents of all views.
  mr-runner: support multiple SourceOperations by composing and partitioning.
  mr-runner: support PCollections materialization with multiple MR jobs.
  mr-runner: hack to work around ViewAsXXX.expand() returning the wrong output PValue.
  mr-runner: support graph visualization with dotfiles.
  mr-runner: refactor and create Graph data structures to handle general Beam pipelines.
  mr-runner: add JarClassInstanceFactory to run ValidatesRunner tests.
  mr-runner: support reduce side ParDos and WordCount.
  core-java: InMemoryTimerInternals expose getTimers() for timer firings in mr-runner.
  mr-runner: add BeamReducer and support GroupByKey.
  mr-runner: add ParDoOperation and support ParDos chaining.
  mr-runner: add JobPrototype and translate it to a MR job.
  mr-runner: support BoundedSource with BeamInputFormat.
  MapReduceRunner: add unit tests for GraphConverter and GraphPlanner.
  MapReduceRunner: add Graph and its visitors.
  Initial commit for MapReduceRunner.
