Example multi-language pipelines
This project provides examples of Apache Beam multi-language pipelines:
- python/addprefix - A Python pipeline that reads a text file and uses a Java transform to attach a prefix to each line.
- python/javacount - A Python pipeline that counts words using the Java `Count.perElement()` transform.
- python/javadatagenerator - A Python pipeline that produces a set of strings generated from Java. This example demonstrates the `JavaExternalTransform` API.
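
To give a sense of what the Python side of such a pipeline looks like, here is a minimal sketch in the spirit of python/addprefix. It assumes an expansion service is already listening on the given port. The URN string and the `prefix` configuration key are assumptions for illustration and should be checked against the actual example source; the helper names `expansion_service_address` and `add_java_prefix` are hypothetical.

```python
def expansion_service_address(port: int, host: str = "localhost") -> str:
    """Return the host:port string that identifies a running expansion service."""
    return f"{host}:{port}"


def add_java_prefix(lines, port: int, prefix: str = "java:"):
    """Apply a Java-side prefix transform to a PCollection of strings.

    Beam is imported lazily so this module can be loaded without Beam installed.
    """
    import apache_beam as beam
    from apache_beam.transforms.external import ImplicitSchemaPayloadBuilder

    # Assumed URN: the real value is whatever the Java transform registers.
    urn = "beam:transform:org.apache.beam:javaprefix:v1"
    return lines | "JavaPrefix" >> beam.ExternalTransform(
        urn,
        # The payload is encoded as a schema-based row for the Java side.
        ImplicitSchemaPayloadBuilder({"prefix": prefix}),
        expansion_service_address(port),
    )
```

The input PCollection would typically need an explicit element type (for example via `.with_output_types(str)` on the read step) so that both SDKs agree on the coder.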
Instructions for running the pipelines
1) Start the expansion service
- Download the latest `beam-examples-multi-language` JAR. Starting with Apache Beam 2.36.0, you can find it in the Maven Central Repository.
- Run the following command, replacing `<version>` and `<port>` with valid values: `java -jar beam-examples-multi-language-<version>.jar <port> --javaClassLookupAllowlistFile='*'`
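
If you prefer to start the expansion service from a script rather than by hand, the command above can be assembled programmatically. The following is a minimal sketch, not part of the examples themselves: the function names are hypothetical, and it assumes `java` is on the `PATH` and that `jar_path` points at the downloaded JAR.

```python
import subprocess
from typing import List


def expansion_service_command(jar_path: str, port: int) -> List[str]:
    """Build the argument list for starting the expansion service."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return [
        "java",
        "-jar",
        jar_path,
        str(port),
        # Passed as a list element, so no shell quoting is needed around '*'.
        "--javaClassLookupAllowlistFile=*",
    ]


def start_expansion_service(jar_path: str, port: int) -> subprocess.Popen:
    """Start the service in the background; the caller should terminate() it."""
    return subprocess.Popen(expansion_service_command(jar_path, port))
```

Building the arguments as a list avoids invoking a shell, which is why the `'*'` quoting from the shell command is not needed here.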
2) Set up a Python virtual environment for Beam
- See the Python quickstart for more information.
3) Execute the Python pipeline
In a new shell, run a pipeline in the python directory using a Beam runner that supports multi-language pipelines.
The Python files contain details about the actual commands to run.
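
Because the pipeline in step 3 contacts the expansion service started in step 1, it can help to confirm that the service's port is accepting connections before launching the pipeline. The sketch below is one way to do that with the standard library only; `wait_for_port` is a hypothetical helper, and a successful TCP connection only indicates that something is listening, not that the service is fully initialized.

```python
import socket
import time


def wait_for_port(port: int, host: str = "localhost", timeout: float = 30.0) -> bool:
    """Return True once a TCP connection succeeds, or False after the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet; pause briefly and retry.
            time.sleep(0.2)
    return False
```

A launcher script could call `wait_for_port(<port>)` with the same port passed to the expansion service and only run the pipeline if it returns `True`.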