---
layout: section
title: "Partition"
permalink: /documentation/transforms/java/elementwise/partition/
section_menu: section-menu/documentation.html
---

Partition

The number of partitions must be fixed at graph construction time. It cannot be determined mid-pipeline, for example from data computed after the pipeline graph has been built.

See more information in the [Beam Programming Guide]({{ site.baseurl }}/documentation/programming-guide/#partition).

Examples

Example: dividing a PCollection into percentile groups

// Partition.of takes the desired number of result partitions (an int) and a PartitionFn that
// implements the partitioning function. In this example, the PartitionFn is defined inline.
// The transform returns a PCollectionList that holds each resulting partition as its own PCollection.
PCollection<Student> students = ...;
// Split students up into 10 partitions, by percentile:
PCollectionList<Student> studentsByPercentile =
    students.apply(Partition.of(10, new PartitionFn<Student>() {
        @Override
        public int partitionFor(Student student, int numPartitions) {
            // getPercentile() is in 0..99, so the result is in 0..numPartitions-1.
            return student.getPercentile() * numPartitions / 100;
        }
    }));

// Extract a partition from the PCollectionList with get(index). Index 4 holds percentiles 40..49:
PCollection<Student> fortiethPercentile = studentsByPercentile.get(4);
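
The index arithmetic used by the PartitionFn above can be checked without running a pipeline. The following is a minimal plain-Java sketch, not Beam code: the class `PercentileBuckets` and its methods `bucketFor` and `split` are hypothetical names introduced only for illustration, and ordinary lists stand in for the resulting PCollection objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the Beam SDK.
public class PercentileBuckets {

    // Same arithmetic as the PartitionFn in the example: maps a percentile
    // in 0..99 to an index in 0..numPartitions-1 using integer division.
    public static int bucketFor(int percentile, int numPartitions) {
        return percentile * numPartitions / 100;
    }

    // Distributes percentiles into numPartitions lists, one list per index.
    // This only imitates the grouping; it is not how Beam executes Partition.
    public static List<List<Integer>> split(List<Integer> percentiles, int numPartitions) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int p : percentiles) {
            buckets.get(bucketFor(p, numPartitions)).add(p);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<Integer>> buckets = split(Arrays.asList(0, 9, 40, 49, 50, 99), 10);
        System.out.println(buckets.get(4)); // prints [40, 49]
    }
}
```

With 10 partitions, percentiles 40 through 49 all map to index 4, which is why the example reads that partition with `get(4)`. A percentile of 100 would map to index 10, outside the valid range 0..9, so the 0..99 bound noted in the code comment matters.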

Related transforms

  • [Filter]({{ site.baseurl }}/documentation/transforms/java/elementwise/filter) is useful if the function only decides whether or not to output an element.
  • [ParDo]({{ site.baseurl }}/documentation/transforms/java/elementwise/pardo) is the most general element-wise mapping operation, and includes other abilities such as multiple output collections and side inputs.
  • [CoGroupByKey]({{ site.baseurl }}/documentation/transforms/java/aggregation/cogroupbykey) performs a per-key equijoin.