This quickstart walks you through running your first Beam pipeline, the WordCount example written with Beam's Go SDK, on a runner of your choice.
If you're interested in contributing to the Apache Beam Go codebase, see the Contribution Guide.
{{< toc >}}
The Beam SDK for Go requires Go version 1.20 or newer, which can be downloaded from the official Go website. Check which Go version you have by running:
{{< highlight >}}
go version
{{< /highlight >}}
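If you would rather check the requirement from code, for example in a setup script, the following is a minimal sketch. It assumes release-style version strings such as `go1.21.4`; the helper name `meetsMinimum` is hypothetical and not part of Beam or the Go toolchain.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// meetsMinimum reports whether a Go version string such as "go1.21.4"
// is at least wantMajor.wantMinor. Strings that do not look like a
// release version (for example development builds) return an error.
// This is a hypothetical helper for illustration only.
func meetsMinimum(version string, wantMajor, wantMinor int) (bool, error) {
	v := strings.TrimPrefix(version, "go")
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unrecognized Go version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("unrecognized Go version %q", version)
	}
	// The minor component may carry a suffix such as "21rc2";
	// keep only the leading digits.
	minorStr := parts[1]
	end := 0
	for end < len(minorStr) && minorStr[end] >= '0' && minorStr[end] <= '9' {
		end++
	}
	minor, err := strconv.Atoi(minorStr[:end])
	if err != nil {
		return false, fmt.Errorf("unrecognized Go version %q", version)
	}
	if major != wantMajor {
		return major > wantMajor, nil
	}
	return minor >= wantMinor, nil
}

func main() {
	// runtime.Version reports the version of the toolchain that built this binary.
	ok, err := meetsMinimum(runtime.Version(), 1, 20)
	if err != nil {
		fmt.Println("could not determine Go version:", err)
		return
	}
	fmt.Printf("%s meets the Go 1.20 minimum: %v\n", runtime.Version(), ok)
}
```

Running this with `go run` reports the version of the toolchain that compiled it, which should match what `go version` prints.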
If you are unfamiliar with Go, see the Get Started With Go Tutorial.
The Apache Beam examples directory contains many examples. Each one can be run by passing the required arguments described in that example.
For example, to run wordcount, run:
{{< runner direct >}}
go run github.com/apache/beam/sdks/v2/go/examples/wordcount@latest \
    --input "gs://apache-beam-samples/shakespeare/kinglear.txt" \
    --output counts
less counts
{{< /runner >}}
{{< runner dataflow >}}
go run github.com/apache/beam/sdks/v2/go/examples/wordcount@latest \
    --input gs://dataflow-samples/shakespeare/kinglear.txt \
    --output gs:///counts \
    --runner dataflow \
    --project your-gcp-project \
    --region your-gcp-region \
    --temp_location gs:///tmp/ \
    --staging_location gs:///binaries/
{{< /runner >}}
{{< runner spark >}}
./gradlew :runners:spark:3:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077
go run github.com/apache/beam/sdks/v2/go/examples/wordcount@latest \
    --input <PATH_TO_INPUT_FILE> \
    --output counts \
    --runner spark \
    --endpoint localhost:8099
{{< /runner >}}
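For orientation, the computation that a word count performs can be sketched in plain Go: split each line into words, count the occurrences of each word, and format the result. The sketch below deliberately does not use the Beam API and runs sequentially in one process. The function names and the regular expression that defines a "word" are assumptions made for illustration, not taken from the Beam example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// wordRE is an assumed definition of a word: a run of ASCII letters,
// optionally followed by an apostrophe and one lowercase letter.
var wordRE = regexp.MustCompile(`[a-zA-Z]+('[a-z])?`)

// countWords extracts words from every line and tallies how often each occurs.
func countWords(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		for _, w := range wordRE.FindAllString(line, -1) {
			counts[w]++
		}
	}
	return counts
}

// formatCounts renders each entry as "word: count", sorted by word
// so the output is deterministic.
func formatCounts(counts map[string]int) []string {
	words := make([]string, 0, len(counts))
	for w := range counts {
		words = append(words, w)
	}
	sort.Strings(words)
	out := make([]string, 0, len(words))
	for _, w := range words {
		out = append(out, fmt.Sprintf("%s: %d", w, counts[w]))
	}
	return out
}

func main() {
	// Stand-in input; the commands above read their input from a file instead.
	lines := []string{"the king and the fool", "the storm"}
	for _, l := range formatCounts(countWords(lines)) {
		fmt.Println(l)
	}
}
```

In a Beam pipeline these steps are expressed as transforms, so the chosen runner can distribute the work instead of looping over the input in a single process.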
Please don't hesitate to reach out if you encounter any issues!