Join-library

Join-library provides inner join, left outer join, and right outer join functions for Apache Beam. It aims to reduce the most common join cases to a single function call.

The functions are generic, so they support joining any types supported by Beam. The inputs to the join functions are PCollections of Key/Value pairs. The left and right PCollections must use the same key type. Every join function returns a Key/Value whose key is the join key and whose value is itself a Key/Value: its key is the left value and its value is the right value.
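The nested result shape is easier to see with concrete data. The following is a minimal plain-Java sketch of that shape, not the library's implementation: `Map.Entry` stands in for Beam's `KV`, `List` stands in for `PCollection`, and the class name `InnerJoinShapeSketch` and its `innerJoin` helper are hypothetical names used only for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: Map.Entry plays the role of KV, List the role of PCollection.
final class InnerJoinShapeSketch {

    // Pairs every left value with every right value that shares its key.
    // Result shape: (join key -> (left value, right value)).
    public static <K, L, R> List<Map.Entry<K, Map.Entry<L, R>>> innerJoin(
            List<Map.Entry<K, L>> left, List<Map.Entry<K, R>> right) {
        List<Map.Entry<K, Map.Entry<L, R>>> out = new ArrayList<>();
        for (Map.Entry<K, L> l : left) {
            for (Map.Entry<K, R> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    Map.Entry<L, R> pair = new SimpleEntry<>(l.getValue(), r.getValue());
                    out.add(new SimpleEntry<>(l.getKey(), pair));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> left = List.of(
                new SimpleEntry<>("k1", "apple"),
                new SimpleEntry<>("k2", "pear"));
        List<Map.Entry<String, Long>> right = List.of(
                new SimpleEntry<>("k1", 3L),
                new SimpleEntry<>("k3", 7L));

        // Unpack the nested pair: outer key is the join key,
        // inner key is the left value, inner value is the right value.
        for (Map.Entry<String, Map.Entry<String, Long>> row : innerJoin(left, right)) {
            String joinKey = row.getKey();
            String leftValue = row.getValue().getKey();
            Long rightValue = row.getValue().getValue();
            System.out.println(joinKey + ": left=" + leftValue + ", right=" + rightValue);
        }
    }
}
```

Only `k1` appears on both sides, so the sketch prints a single row; `k2` and `k3` are dropped, as expected for an inner join.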

For outer joins, since null cannot be serialized, the user has to provide a value that represents null for that particular use case.
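To illustrate the placeholder idea, here is a minimal plain-Java sketch of a left outer join that fills unmatched rows with a caller-supplied value. It is an illustration under assumptions, not the library's code: `Map.Entry` stands in for `KV`, `List` for `PCollection`, and `LeftOuterJoinSketch`, its `leftOuterJoin` helper, and the choice of `-1L` as the placeholder are all hypothetical. In Beam's `Join` class the placeholder is passed as an extra argument to the outer join functions (for example `Join.leftOuterJoin(left, right, nullValue)`); check the Javadoc for the exact signatures in your Beam version.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: Map.Entry plays the role of KV, List the role of PCollection.
final class LeftOuterJoinSketch {

    // Keeps every left row. A left row with no matching right key is paired
    // with nullValue, the caller's stand-in for "no right value".
    public static <K, L, R> List<Map.Entry<K, Map.Entry<L, R>>> leftOuterJoin(
            List<Map.Entry<K, L>> left, List<Map.Entry<K, R>> right, R nullValue) {
        List<Map.Entry<K, Map.Entry<L, R>>> out = new ArrayList<>();
        for (Map.Entry<K, L> l : left) {
            boolean matched = false;
            for (Map.Entry<K, R> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    Map.Entry<L, R> pair = new SimpleEntry<>(l.getValue(), r.getValue());
                    out.add(new SimpleEntry<>(l.getKey(), pair));
                    matched = true;
                }
            }
            if (!matched) {
                Map.Entry<L, R> pair = new SimpleEntry<>(l.getValue(), nullValue);
                out.add(new SimpleEntry<>(l.getKey(), pair));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> left = List.of(
                new SimpleEntry<>("k1", "apple"),
                new SimpleEntry<>("k2", "pear"));
        List<Map.Entry<String, Long>> right = List.of(
                new SimpleEntry<>("k1", 3L));

        // -1L is an assumed placeholder; pick a value that cannot occur as real data.
        System.out.println(leftOuterJoin(left, right, -1L));
    }
}
```

The placeholder should be a value that cannot be confused with real data in the use case at hand, so downstream code can tell unmatched rows apart from matched ones.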

Example of how to use join-library:

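import org.apache.beam.sdk.extensions.joinlibrary.Join;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

PCollection<KV<String, String>> leftPcollection = ...
PCollection<KV<String, Long>> rightPcollection = ...

// Result: join key -> KV(left value, right value).
PCollection<KV<String, KV<String, Long>>> joinedPcollection =
  Join.innerJoin(leftPcollection, rightPcollection);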