Join-library

Join-library provides inner join, left outer join, and right outer join functions for Apache Beam. The aim is to reduce the most common join cases to a single function call.

The functions are generic, so they can join any types supported by Beam. The inputs to the join functions are PCollections of key/value pairs (KV). The left and right PCollections must have the same key type. Every join function returns a PCollection of KVs in which the key is the join key and the value is itself a KV whose key is the left value and whose value is the right value.
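To make the nested result shape concrete, here is a minimal plain-Java model of an inner join. It is only a sketch of the semantics described above, not the library's implementation: it uses `java.util` lists instead of PCollections, `Map.Entry` as a stand-in for Beam's `KV`, and the class name `InnerJoinShape` and helper `kv` are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class InnerJoinShape {
  // Hypothetical stand-in for Beam's KV: an immutable key/value pair.
  static <K, V> Map.Entry<K, V> kv(K key, V value) {
    return new SimpleImmutableEntry<>(key, value);
  }

  // Emits one (key, (leftValue, rightValue)) element for every pair of
  // left and right elements that share the same key.
  static <K, L, R> List<Map.Entry<K, Map.Entry<L, R>>> innerJoin(
      List<Map.Entry<K, L>> left, List<Map.Entry<K, R>> right) {
    List<Map.Entry<K, Map.Entry<L, R>>> joined = new ArrayList<>();
    for (Map.Entry<K, L> l : left) {
      for (Map.Entry<K, R> r : right) {
        if (l.getKey().equals(r.getKey())) {
          joined.add(kv(l.getKey(), kv(l.getValue(), r.getValue())));
        }
      }
    }
    return joined;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> left =
        List.of(kv("a", "apple"), kv("b", "banana"));
    List<Map.Entry<String, Long>> right =
        List.of(kv("a", 1L), kv("c", 3L));

    // Only key "a" occurs on both sides, so exactly one element is produced.
    for (Map.Entry<String, Map.Entry<String, Long>> e : innerJoin(left, right)) {
      System.out.println("key=" + e.getKey()
          + " left=" + e.getValue().getKey()
          + " right=" + e.getValue().getValue());
    }
  }
}
```

Keys `b` and `c` each appear on one side only, so an inner join drops them.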

For outer joins, since null cannot be serialized, the user has to provide a value that represents null for that particular use case.
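The role of the null stand-in can be sketched in plain Java as follows. This is an illustrative model of a left outer join under stated assumptions, not the library's code: lists replace PCollections, `Map.Entry` replaces Beam's `KV`, and the names `LeftOuterJoinSketch`, `kv`, and `nullValue` are hypothetical. The value `-1L` is an arbitrary choice for "no match".

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class LeftOuterJoinSketch {
  // Hypothetical stand-in for Beam's KV: an immutable key/value pair.
  static <K, V> Map.Entry<K, V> kv(K key, V value) {
    return new SimpleImmutableEntry<>(key, value);
  }

  // Keeps every left element. A left element without a matching right
  // element is paired with the caller-supplied nullValue instead of null.
  static <K, L, R> List<Map.Entry<K, Map.Entry<L, R>>> leftOuterJoin(
      List<Map.Entry<K, L>> left, List<Map.Entry<K, R>> right, R nullValue) {
    List<Map.Entry<K, Map.Entry<L, R>>> joined = new ArrayList<>();
    for (Map.Entry<K, L> l : left) {
      boolean matched = false;
      for (Map.Entry<K, R> r : right) {
        if (l.getKey().equals(r.getKey())) {
          joined.add(kv(l.getKey(), kv(l.getValue(), r.getValue())));
          matched = true;
        }
      }
      if (!matched) {
        joined.add(kv(l.getKey(), kv(l.getValue(), nullValue)));
      }
    }
    return joined;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> left =
        List.of(kv("a", "apple"), kv("b", "banana"));
    List<Map.Entry<String, Long>> right = List.of(kv("a", 1L));

    // -1L stands in for "no right value"; key "b" has no match on the right.
    for (Map.Entry<String, Map.Entry<String, Long>> e
        : leftOuterJoin(left, right, -1L)) {
      System.out.println("key=" + e.getKey()
          + " left=" + e.getValue().getKey()
          + " right=" + e.getValue().getValue());
    }
  }
}
```

The stand-in should be a value that cannot occur in real data; otherwise downstream code cannot tell a missing match from a genuine value. How the library's outer join functions accept this value is best checked against the `Join` class itself.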

Example of how to use join-library:

PCollection<KV<String, String>> leftPcollection = ...;
PCollection<KV<String, Long>> rightPcollection = ...;

PCollection<KV<String, KV<String, Long>>> joinedPcollection =
  Join.innerJoin(leftPcollection, rightPcollection);

Join-library can be found on Maven Central:

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>join-library</artifactId>
  <version>0.1-incubating-SNAPSHOT</version>
</dependency>

Questions or comments: M.Runesson [at] gmail [dot] com