By default, the coder for the output PCollection
is the same as the coder for the first PCollection
in the input PCollectionList
. However, the input PCollection
objects can each use different coders, as long as they all contain the same data type in your chosen language.
When using Flatten
to merge PCollection
objects that have a windowing strategy applied, all of the PCollection
objects you want to merge must use a compatible windowing strategy and window sizing. For example, all the collections you're merging must all use (hypothetically) identical 5-minute fixed windows or 4-minute sliding windows starting every 30 seconds.
If your pipeline attempts to use Flatten
to merge PCollection
objects with incompatible windows, Beam generates an IllegalStateException
error when your pipeline is constructed
See more information in the [Beam Programming Guide]({{ site.baseurl }}/documentation/programming-guide/#flatten).
Example: Apply a Flatten
transform to merge multiple PCollection
objects
// Flatten takes a PCollectionList of PCollection objects of a given type. // Returns a single PCollection that contains all of the elements in the PCollection objects in that list. PCollection<String> pc1 = Create.of("Hello"); PCollection<String> pc2 = Create.of("World", "Beam"); PCollection<String> pc3 = Create.of("Is", "Fun"); PCollectionList<String> collections = PCollectionList.of(pc1).and(pc2).and(pc3); PCollection<String> merged = collections.apply(Flatten.<String>pCollections());
The resulting collection now has all the elements: “Hello”, “World”, “Beam”, “Is”, and “Fun”.