<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – release</title><link>/categories/release/</link><description>Recent content in release on Apache Beam</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Wed, 26 Jun 2024 13:00:00 -0800</lastBuildDate><atom:link href="/categories/release/index.xml" rel="self" type="application/rss+xml"/><item><title>Blog: Apache Beam 2.57.0</title><link>/blog/beam-2.57.0/</link><pubDate>Wed, 26 Jun 2024 13:00:00 -0800</pubDate><guid>/blog/beam-2.57.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.57.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2570-2024-06-26">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.57.0, check out the &lt;a href="https://github.com/apache/beam/milestone/21">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Apache Beam adds Python 3.12 support (&lt;a href="https://github.com/apache/beam/issues/29149">#29149&lt;/a>).&lt;/li>
&lt;li>Added FlinkRunner for Flink 1.18 (&lt;a href="https://github.com/apache/beam/issues/30789">#30789&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Ensure that BigtableIO closes the reader streams (&lt;a href="https://github.com/apache/beam/issues/31477">#31477&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added Feast feature store handler for enrichment transform (Python) (&lt;a href="https://github.com/apache/beam/issues/30964">#30957&lt;/a>).&lt;/li>
&lt;li>BigQuery per-worker metrics are reported by default for Streaming Dataflow Jobs (Java) (&lt;a href="https://github.com/apache/beam/pull/31015">#31015&lt;/a>)&lt;/li>
&lt;li>Adds &lt;code>inMemory()&lt;/code> variant of Java List and Map side inputs for more efficient lookups when the entire side input fits into memory.&lt;/li>
&lt;li>Beam YAML now supports the Jinja templating syntax.
Template variables can be passed with the (JSON-formatted) &lt;code>--jinja_variables&lt;/code> flag.&lt;/li>
&lt;li>DataFrame API now supports pandas 2.1.x and adds 12 more string functions for Series (&lt;a href="https://github.com/apache/beam/pull/31185">#31185&lt;/a>).&lt;/li>
&lt;li>Added BigQuery handler for enrichment transform (Python) (&lt;a href="https://github.com/apache/beam/pull/31295">#31295&lt;/a>)&lt;/li>
&lt;li>Disable soft delete policy when creating the default bucket for a project (Java) (&lt;a href="https://github.com/apache/beam/pull/31324">#31324&lt;/a>).&lt;/li>
&lt;li>Added &lt;code>DoFn.SetupContextParam&lt;/code> and &lt;code>DoFn.BundleContextParam&lt;/code>, which can be used
as a Python &lt;code>DoFn.process&lt;/code>, &lt;code>Map&lt;/code>, or &lt;code>FlatMap&lt;/code> parameter to invoke a context
manager per DoFn setup or per bundle (analogous to using &lt;code>setup&lt;/code>/&lt;code>teardown&lt;/code>
or &lt;code>start_bundle&lt;/code>/&lt;code>finish_bundle&lt;/code>, respectively).&lt;/li>
&lt;li>Go SDK Prism Runner
&lt;ul>
&lt;li>Pre-built Prism binaries are now part of the release and are available via the GitHub release page (&lt;a href="https://github.com/apache/beam/issues/29697">#29697&lt;/a>).&lt;/li>
&lt;li>ProcessingTime is now handled synthetically in both TestStream and non-TestStream pipelines, for fast test pipeline execution by default (&lt;a href="https://github.com/apache/beam/issues/30083">#30083&lt;/a>).
&lt;ul>
&lt;li>Prism does NOT yet support &amp;ldquo;real time&amp;rdquo; execution for this release.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Improved processing of large elements to reduce the chance of exceeding the 2GB protobuf limit (Python) (&lt;a href="https://github.com/apache/beam/issues/31607">#31607&lt;/a>).&lt;/li>
&lt;/ul>
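&lt;p>As a rough illustration of the &lt;code>--jinja_variables&lt;/code> flag mentioned above, the sketch below builds a JSON-formatted flag value using only the Python standard library. The helper name and the variable names (&lt;code>input_path&lt;/code>, &lt;code>threshold&lt;/code>) are hypothetical; only the flag name and its JSON format come from the note above.&lt;/p>

```python
import json
import shlex


def jinja_variables_flag(variables):
    """Return a --jinja_variables=... argument carrying JSON-encoded variables."""
    # sort_keys keeps the output stable across runs.
    return "--jinja_variables=" + json.dumps(variables, sort_keys=True)


# Hypothetical template variables, for illustration only.
flag = jinja_variables_flag({"input_path": "gs://my-bucket/input.csv", "threshold": 10})

# shlex.quote makes the value safe to paste into a POSIX shell command line.
print(shlex.quote(flag))
```

&lt;p>Encoding the variables with &lt;code>json.dumps&lt;/code> rather than by hand avoids quoting mistakes when values contain spaces or quotes.&lt;/p>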
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Java&amp;rsquo;s View.asList() side inputs are now optimized for iterating rather than
indexing when in the global window.
This new implementation still supports all (immutable) List methods as before,
but some of the random access methods like get() and size() will be slower.
To use the old implementation one can use View.asList().withRandomAccess().&lt;/li>
&lt;li>SchemaTransforms implemented with TypedSchemaTransformProvider now produce a
configuration Schema with snake_case naming convention
(&lt;a href="https://github.com/apache/beam/pull/31374">#31374&lt;/a>). This will make the following
cases problematic:
&lt;ul>
&lt;li>Running a pre-2.57.0 remote SDK pipeline containing a 2.57.0+ Java SchemaTransform.&lt;/li>
&lt;li>Conversely, running a 2.57.0+ remote SDK pipeline containing a pre-2.57.0 Java SchemaTransform.&lt;/li>
&lt;li>All direct uses of Python&amp;rsquo;s &lt;a href="https://github.com/apache/beam/blob/a998107a1f5c3050821eef6a5ad5843d8adb8aec/sdks/python/apache_beam/transforms/external.py#L381">SchemaAwareExternalTransform&lt;/a>
should be updated to use new snake_case parameter names.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Upgraded Jackson Databind to 2.15.4 (Java) (&lt;a href="https://github.com/apache/beam/issues/26743">#26743&lt;/a>).
Jackson 2.15 has known breaking changes. An important one is that it imposes a buffer limit on the parser.
If your custom PTransform or DoFn is affected, refer to &lt;a href="https://github.com/apache/beam/pull/31580">#31580&lt;/a> for mitigation.&lt;/li>
&lt;/ul>
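&lt;p>For the snake_case change above, a small helper along these lines might ease renaming existing camelCase parameter names. This is a simplified approximation written for illustration, not Beam's own conversion: it may differ for acronyms or unusual names, and the example keys (&lt;code>tableId&lt;/code>, &lt;code>batchSize&lt;/code>) are hypothetical.&lt;/p>

```python
import re


def to_snake_case(name):
    """Convert a camelCase name such as 'tableId' to 'table_id'."""
    # Put "_" between a lowercase letter or digit and the uppercase letter after it.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def rename_config_keys(config):
    """Return a copy of config with every key converted to snake_case."""
    return {to_snake_case(key): value for key, value in config.items()}


# Hypothetical pre-2.57.0 style parameters, for illustration only.
old_kwargs = {"tableId": "my_table", "batchSize": 100}
print(rename_config_keys(old_kwargs))
```

&lt;p>Check the converted names against the transform's actual configuration Schema before relying on them.&lt;/p>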
&lt;p>For the most up-to-date list of known issues, see &lt;a href="https://github.com/apache/beam/blob/master/CHANGES.md">https://github.com/apache/beam/blob/master/CHANGES.md&lt;/a>.&lt;/p>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.57.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Anody Zhang&lt;/p>
&lt;p>Arvind Ram&lt;/p>
&lt;p>Ben Konz&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Celeste Zeng&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Claire McGinty&lt;/p>
&lt;p>Colm O hEigeartaigh&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Evan Galpin&lt;/p>
&lt;p>Ferran Fernández Garrido&lt;/p>
&lt;p>Florent Biville&lt;/p>
&lt;p>Jack Dingilian&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>John Casey&lt;/p>
&lt;p>Justin Uang&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kevin Zhou&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>Maarten Vercruysse&lt;/p>
&lt;p>Maciej Szwaja&lt;/p>
&lt;p>Maja Kontrec Rönn&lt;/p>
&lt;p>Marc hurabielle&lt;/p>
&lt;p>Martin Trieu&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Min Zhu&lt;/p>
&lt;p>Naireen Hussain&lt;/p>
&lt;p>Nick Anikin&lt;/p>
&lt;p>Pablo Rodriguez Defino&lt;/p>
&lt;p>Paul King&lt;/p>
&lt;p>Priyans Desai&lt;/p>
&lt;p>Radosław Stankiewicz&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Rodrigo Bozzolo&lt;/p>
&lt;p>RyuSA&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sergei Lilichenko&lt;/p>
&lt;p>Shahar Epstein&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Tomo Suzuki&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vincent Stollenwerk&lt;/p>
&lt;p>Vineet Kumar&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>XQ Hu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>akashorabek&lt;/p>
&lt;p>bzablocki&lt;/p>
&lt;p>kberezin&lt;/p></description></item><item><title>Blog: Apache Beam 2.56.0</title><link>/blog/beam-2.56.0/</link><pubDate>Wed, 01 May 2024 10:00:00 -0400</pubDate><guid>/blog/beam-2.56.0/</guid><description>
&lt;p>We are happy to present the new 2.56.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2560-2024-05-01">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.56.0, check out the &lt;a href="https://github.com/apache/beam/milestone/20">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Added FlinkRunner for Flink 1.17 and removed support for Flink 1.12 and 1.13. Pipelines running on Flink 1.16 or below can be upgraded to Flink 1.17, provided the pipeline is first updated to Beam 2.56.0 on the same Flink version. Once the pipeline has run with Beam 2.56.0, it should be possible to upgrade to the FlinkRunner for Flink 1.17 (&lt;a href="https://github.com/apache/beam/issues/29939">#29939&lt;/a>).&lt;/li>
&lt;li>New Managed I/O Java API (&lt;a href="https://github.com/apache/beam/pull/30830">#30830&lt;/a>).&lt;/li>
&lt;li>New Ordered Processing PTransform added for processing order-sensitive stateful data (&lt;a href="https://github.com/apache/beam/pull/30735">#30735&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Upgraded Avro version to 1.11.3, kafka-avro-serializer and kafka-schema-registry-client versions to 7.6.0 (Java) (&lt;a href="https://github.com/apache/beam/pull/30638">#30638&lt;/a>).
The newer Avro package is known to have breaking changes. If you are affected, you can keep pinned to older Avro versions which are also tested with Beam.&lt;/li>
&lt;li>Iceberg read/write support is available through the new Managed I/O Java API (&lt;a href="https://github.com/apache/beam/pull/30830">#30830&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Profiling of Cythonized code has been disabled by default. This might improve performance for some Python pipelines (&lt;a href="https://github.com/apache/beam/pull/30938">#30938&lt;/a>).&lt;/li>
&lt;li>Bigtable enrichment handler now accepts a custom function to build a composite row key. (Python) (&lt;a href="https://github.com/apache/beam/issues/30975">#30974&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Default consumer polling timeout for KafkaIO.Read was increased from 1 second to 2 seconds. Use KafkaIO.read().withConsumerPollingTimeout(Duration duration) to configure this timeout value when necessary (&lt;a href="https://github.com/apache/beam/issues/30870">#30870&lt;/a>).&lt;/li>
&lt;li>Python Dataflow users no longer need to manually specify &lt;code>--streaming&lt;/code> for pipelines using unbounded sources such as ReadFromPubSub.&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed locking issue when shutting down inactive bundle processors. Symptoms of this issue include slowness or stuckness in long-running jobs (Python) (&lt;a href="https://github.com/apache/beam/pull/30679">#30679&lt;/a>).&lt;/li>
&lt;li>Fixed a logging issue that silenced the pip output when installing dependencies provided in &lt;code>--requirements_file&lt;/code> (Python).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.56.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abacn&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Aravind Pedapudi&lt;/p>
&lt;p>Arun Pandian&lt;/p>
&lt;p>Arvind Ram&lt;/p>
&lt;p>Bartosz Zablocki&lt;/p>
&lt;p>Brachi Packter&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Clement DAL PALU&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Daria Bezkorovaina&lt;/p>
&lt;p>Dip Patel&lt;/p>
&lt;p>Evan Burrell&lt;/p>
&lt;p>Hai Joey Tran&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Julien Tournay&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Luís Bianchin&lt;/p>
&lt;p>Maciej Szwaja&lt;/p>
&lt;p>Melody Shen&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sergei Lilichenko&lt;/p>
&lt;p>Shahar Epstein&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Timothy Itodo&lt;/p>
&lt;p>Veronica Wasson&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>akashorabek&lt;/p>
&lt;p>bzablocki&lt;/p>
&lt;p>clmccart&lt;/p>
&lt;p>damccorm&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>dmitryor&lt;/p>
&lt;p>github-actions[bot]&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>xianhualiu&lt;/p></description></item><item><title>Blog: Apache Beam 2.55.0</title><link>/blog/beam-2.55.0/</link><pubDate>Mon, 25 Mar 2024 10:00:00 -0400</pubDate><guid>/blog/beam-2.55.0/</guid><description>
&lt;p>We are happy to present the new 2.55.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2550-2024-03-25">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.55.0, check out the &lt;a href="https://github.com/apache/beam/milestone/19">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>The Python SDK will now include automatically generated wrappers for external Java transforms! (&lt;a href="https://github.com/apache/beam/pull/29834">#29834&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added support for handling bad records to BigQueryIO (&lt;a href="https://github.com/apache/beam/pull/30081">#30081&lt;/a>).
&lt;ul>
&lt;li>Full Support for Storage Read and Write APIs&lt;/li>
&lt;li>Partial Support for File Loads (Failures writing to files supported, failures loading files to BQ unsupported)&lt;/li>
&lt;li>No Support for Extract or Streaming Inserts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Added support for handling bad records to PubSubIO (&lt;a href="https://github.com/apache/beam/pull/30372">#30372&lt;/a>).
&lt;ul>
&lt;li>Support is not available for handling schema mismatches, and enabling error handling for writing to Pub/Sub topics with schemas is not recommended&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;code>--enableBundling&lt;/code> pipeline option for BigQueryIO DIRECT_READ is replaced by &lt;code>--enableStorageReadApiV2&lt;/code>. Both were considered experimental and subject to change (Java) (&lt;a href="https://github.com/apache/beam/issues/26354">#26354&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Allow writing clustered and not time-partitioned BigQuery tables (Java) (&lt;a href="https://github.com/apache/beam/pull/30094">#30094&lt;/a>).&lt;/li>
&lt;li>Redis cache support added to RequestResponseIO and Enrichment transform (Python) (&lt;a href="https://github.com/apache/beam/pull/30307">#30307&lt;/a>)&lt;/li>
&lt;li>Merged &lt;code>sdks/java/fn-execution&lt;/code> and &lt;code>runners/core-construction-java&lt;/code> into the main SDK. These artifacts were never meant for users, but note
that they no longer exist. These are steps to bring portability into the core SDK alongside all other core functionality.&lt;/li>
&lt;li>Added Vertex AI Feature Store handler for Enrichment transform (Python) (&lt;a href="https://github.com/apache/beam/pull/30388">#30388&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Arrow version was bumped to 15.0.0 from 5.0.0 (&lt;a href="https://github.com/apache/beam/pull/30181">#30181&lt;/a>).&lt;/li>
&lt;li>Go SDK users who build custom worker containers may run into issues with the move to distroless containers as a base (see Security Fixes).
&lt;ul>
&lt;li>The issue stems from distroless containers lacking additional tools, which current custom container processes may rely on.&lt;/li>
&lt;li>See &lt;a href="https://beam.apache.org/documentation/runtime/environments/#from-scratch-go">https://beam.apache.org/documentation/runtime/environments/#from-scratch-go&lt;/a> for instructions on building and using a custom container.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Python SDK has changed the default value for the &lt;code>--max_cache_memory_usage_mb&lt;/code> pipeline option from 100 to 0. This option was first introduced in the 2.52.0 SDK version. This change restores the behavior of the 2.51.0 SDK, which does not use the state cache. If your pipeline uses iterable side input views, consider increasing the cache size by setting the option manually (&lt;a href="https://github.com/apache/beam/issues/30360">#30360&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>N/A&lt;/li>
&lt;/ul>
&lt;h2 id="bug-fixes">Bug fixes&lt;/h2>
&lt;ul>
&lt;li>Fixed &lt;code>SpannerIO.readChangeStream&lt;/code> to support propagating credentials from pipeline options
to the &lt;code>getDialect&lt;/code> calls for authenticating with Spanner (Java) (&lt;a href="https://github.com/apache/beam/pull/30361">#30361&lt;/a>).&lt;/li>
&lt;li>Reduced the number of HTTP requests in GCSIO function calls (Python) (&lt;a href="https://github.com/apache/beam/pull/30205">#30205&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="security-fixes">Security Fixes&lt;/h2>
&lt;ul>
&lt;li>Go SDK base container image moved to distroless/base-nossl-debian12, reducing vulnerable container surface to kernel and glibc (&lt;a href="https://github.com/apache/beam/pull/30011">#30011&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>In Python pipelines, when shutting down inactive bundle processors, shutdown logic can overaggressively hold the lock, blocking acceptance of new work. Symptoms of this issue include slowness or stuckness in long-running jobs. Fixed in 2.56.0 (&lt;a href="https://github.com/apache/beam/pull/30679">#30679&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.55.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrew Crites&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Arun Pandian&lt;/p>
&lt;p>Arvind Ram&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Chris Gray&lt;/p>
&lt;p>Claire McGinty&lt;/p>
&lt;p>Damon Douglas&lt;/p>
&lt;p>Dan Ellis&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Daria Bezkorovaina&lt;/p>
&lt;p>Dima I&lt;/p>
&lt;p>Edward Cui&lt;/p>
&lt;p>Ferran Fernández Garrido&lt;/p>
&lt;p>GStravinsky&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>Jason Mitchell&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Ritesh Tarway&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Scott Strong&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Steven van Rossum&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Talat UYARER&lt;/p>
&lt;p>Ukjae Jeong (Jay)&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>akashorabek&lt;/p>
&lt;p>case-k&lt;/p>
&lt;p>clmccart&lt;/p>
&lt;p>dengwe1&lt;/p>
&lt;p>dhruvdua&lt;/p>
&lt;p>hardshah&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>tvalentyn&lt;/p></description></item><item><title>Blog: Apache Beam 2.54.0</title><link>/blog/beam-2.54.0/</link><pubDate>Wed, 14 Feb 2024 09:00:00 -0400</pubDate><guid>/blog/beam-2.54.0/</guid><description>
&lt;p>We are happy to present the new 2.54.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.54.0, check out the &lt;a href="https://github.com/apache/beam/milestone/18">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://s.apache.org/enrichment-transform">Enrichment Transform&lt;/a> along with GCP BigTable handler added to Python SDK (&lt;a href="https://github.com/apache/beam/pull/30001">#30001&lt;/a>).&lt;/li>
&lt;li>Beam Java Batch pipelines run on Google Cloud Dataflow will default to the Portable Runner (v2) starting with this version. (All other languages are already on Runner V2.) See &lt;a href="https://cloud.google.com/dataflow/docs/runner-v2">Runner V2 documentation&lt;/a> for how to enable or disable it intentionally.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added support for writing to BigQuery dynamic destinations with Python&amp;rsquo;s Storage Write API (&lt;a href="https://github.com/apache/beam/pull/30045">#30045&lt;/a>)&lt;/li>
&lt;li>Added support for Tuples DataType in ClickHouse (Java) (&lt;a href="https://github.com/apache/beam/pull/29715">#29715&lt;/a>).&lt;/li>
&lt;li>Added support for handling bad records to FileIO, TextIO, AvroIO (&lt;a href="https://github.com/apache/beam/pull/29670">#29670&lt;/a>).&lt;/li>
&lt;li>Added support for handling bad records to BigtableIO (&lt;a href="https://github.com/apache/beam/pull/29885">#29885&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://s.apache.org/enrichment-transform">Enrichment Transform&lt;/a> along with GCP BigTable handler added to Python SDK (&lt;a href="https://github.com/apache/beam/pull/30001">#30001&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>N/A&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>N/A&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed a memory leak affecting some Go SDK pipelines since 2.46.0 (&lt;a href="https://github.com/apache/beam/pull/28142">#28142&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="security-fixes">Security Fixes&lt;/h2>
&lt;ul>
&lt;li>N/A&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Some Python pipelines that run with 2.52.0-2.54.0 SDKs and use large materialized side inputs might be affected by a performance regression. To restore the prior behavior on these SDK versions, supply the &lt;code>--max_cache_memory_usage_mb=0&lt;/code> pipeline option. (&lt;a href="https://github.com/apache/beam/issues/30360">#30360&lt;/a>).&lt;/li>
&lt;li>Python pipelines that run with 2.53.0-2.54.0 SDKs and perform file operations on GCS might be affected by excess HTTP requests. This could lead to a performance regression or a permission issue. (&lt;a href="https://github.com/apache/beam/issues/28398">#28398&lt;/a>)&lt;/li>
&lt;li>In Python pipelines, when shutting down inactive bundle processors, shutdown logic can overaggressively hold the lock, blocking acceptance of new work. Symptoms of this issue include slowness or stuckness in long-running jobs. Fixed in 2.56.0 (&lt;a href="https://github.com/apache/beam/pull/30679">#30679&lt;/a>).&lt;/li>
&lt;/ul>
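&lt;p>The side-input workaround above could be applied conditionally, as in the following sketch. It assumes the SDK version is available as a plain &lt;code>major.minor.patch&lt;/code> string; the helper names and the example runner argument are hypothetical, while the flag and the affected range (2.52.0-2.54.0) come from the known issue above.&lt;/p>

```python
AFFECTED_MINORS = range(52, 55)  # 2.52.x, 2.53.x and 2.54.x


def needs_cache_workaround(sdk_version):
    """Return True if sdk_version falls in the affected 2.52.0-2.54.0 range."""
    major, minor = (int(part) for part in sdk_version.split(".")[:2])
    return major == 2 and minor in AFFECTED_MINORS


def pipeline_args(sdk_version, base_args):
    """Append the workaround flag to a copy of base_args when the SDK is affected."""
    args = list(base_args)
    if needs_cache_workaround(sdk_version):
        args.append("--max_cache_memory_usage_mb=0")
    return args


print(pipeline_args("2.53.0", ["--runner=DataflowRunner"]))
print(pipeline_args("2.55.0", ["--runner=DataflowRunner"]))
```

&lt;p>Only pipelines that use large materialized side inputs are described as affected, so the flag may be unnecessary for others even on these versions.&lt;/p>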
&lt;p>For the most up-to-date list of known issues, see &lt;a href="https://github.com/apache/beam/blob/master/CHANGES.md">https://github.com/apache/beam/blob/master/CHANGES.md&lt;/a>.&lt;/p>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.54.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrew Crites&lt;/p>
&lt;p>Arun Pandian&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>caneff&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Changyu Li&lt;/p>
&lt;p>Cheskel Twersky&lt;/p>
&lt;p>Claire McGinty&lt;/p>
&lt;p>clmccart&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>Edward Cheng&lt;/p>
&lt;p>Ferran Fernández Garrido&lt;/p>
&lt;p>Hai Joey Tran&lt;/p>
&lt;p>hugo-syn&lt;/p>
&lt;p>Issac&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>Jerry Wang&lt;/p>
&lt;p>Jing&lt;/p>
&lt;p>Joey Tran&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Knut Olav Løite&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>Marc&lt;/p>
&lt;p>Mark Zitnik&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Naireen Hussain&lt;/p>
&lt;p>Neeraj Bansal&lt;/p>
&lt;p>Niel Markwick&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>pablo rodriguez defino&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>S. Veyrié&lt;/p>
&lt;p>Talat UYARER&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Zechen Jian&lt;/p></description></item><item><title>Blog: Apache Beam 2.53.0</title><link>/blog/beam-2.53.0/</link><pubDate>Thu, 04 Jan 2024 09:00:00 -0400</pubDate><guid>/blog/beam-2.53.0/</guid><description>
&lt;p>We are happy to present the new 2.53.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.53.0, check out the &lt;a href="https://github.com/apache/beam/milestone/17">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Python streaming users on Beam 2.47.0 or newer should update to version 2.53.0, which fixes a known issue (&lt;a href="https://github.com/apache/beam/issues/27330">#27330&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>TextIO now supports skipping multiple header lines (Java) (&lt;a href="https://github.com/apache/beam/issues/17990">#17990&lt;/a>).&lt;/li>
&lt;li>Python GCSIO is now implemented with GCP GCS Client instead of apitools (&lt;a href="https://github.com/apache/beam/issues/25676">#25676&lt;/a>)&lt;/li>
&lt;li>Added support for LowCardinality DataType in ClickHouse (Java) (&lt;a href="https://github.com/apache/beam/pull/29533">#29533&lt;/a>).&lt;/li>
&lt;li>Added support for handling bad records to KafkaIO (Java) (&lt;a href="https://github.com/apache/beam/pull/29546">#29546&lt;/a>)&lt;/li>
&lt;li>Added support for generating text embeddings in MLTransform for Vertex AI and Hugging Face Hub models (&lt;a href="https://github.com/apache/beam/pull/29564">#29564&lt;/a>).&lt;/li>
&lt;li>NATS IO connector added (Go) (&lt;a href="https://github.com/apache/beam/issues/29000">#29000&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>The Python SDK now type checks &lt;code>collections.abc.Collections&lt;/code> types properly. Some type hints that were erroneously allowed by the SDK may now fail. (&lt;a href="https://github.com/apache/beam/pull/29272">#29272&lt;/a>)&lt;/li>
&lt;li>Running multi-language pipelines locally no longer requires Docker.
Instead, the same (generally auto-started) subprocess used to perform the
expansion can also be used as the cross-language worker.&lt;/li>
&lt;li>Framework for adding Error Handlers to composite transforms added in Java (&lt;a href="https://github.com/apache/beam/pull/29164">#29164&lt;/a>).&lt;/li>
&lt;li>Python 3.11 images now include google-cloud-profiler (&lt;a href="https://github.com/apache/beam/pull/29651">#29561&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Euphoria DSL is deprecated and will be removed in a future release (not before 2.56.0) (&lt;a href="https://github.com/apache/beam/issues/29451">#29451&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>(Python) Fixed sporadic crashes in streaming pipelines that affected some users of 2.47.0 and newer SDKs (&lt;a href="https://github.com/apache/beam/issues/27330">#27330&lt;/a>).&lt;/li>
&lt;li>(Python) Fixed a bug that caused MLTransform to drop identical elements in the output PCollection (&lt;a href="https://github.com/apache/beam/issues/29600">#29600&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="security-fixes">Security Fixes&lt;/h2>
&lt;ul>
&lt;li>Upgraded the build to Go 1.21.5, fixing &lt;a href="https://security-tracker.debian.org/tracker/CVE-2023-45285">CVE-2023-45285&lt;/a> and &lt;a href="https://security-tracker.debian.org/tracker/CVE-2023-39326">CVE-2023-39326&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Potential race condition causing NPE in DataflowExecutionStateSampler in Dataflow Java Streaming pipelines (&lt;a href="https://github.com/apache/beam/issues/29987">#29987&lt;/a>).&lt;/li>
&lt;li>Some Python pipelines that run with 2.52.0-2.54.0 SDKs and use large materialized side inputs might be affected by a performance regression. To restore the prior behavior on these SDK versions, supply the &lt;code>--max_cache_memory_usage_mb=0&lt;/code> pipeline option. (&lt;a href="https://github.com/apache/beam/issues/30360">#30360&lt;/a>).&lt;/li>
&lt;li>Python pipelines that run with 2.53.0-2.54.0 SDKs and perform file operations on GCS might be affected by excess HTTP requests. This could lead to a performance regression or a permission issue. (&lt;a href="https://github.com/apache/beam/issues/28398">#28398&lt;/a>)&lt;/li>
&lt;li>In Python pipelines, when shutting down inactive bundle processors, the shutdown logic can hold the lock too aggressively, blocking acceptance of new work. Symptoms of this issue include slow or stuck long-running jobs. Fixed in 2.56.0 (&lt;a href="https://github.com/apache/beam/pull/30679">#30679&lt;/a>).&lt;/li>
&lt;/ul>
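&lt;p>As an illustration of the side-input workaround above, the following minimal sketch adds the &lt;code>--max_cache_memory_usage_mb=0&lt;/code> option only when the SDK version falls in the affected 2.52.0-2.54.0 range. The helper names are hypothetical and not part of Apache Beam, and the sketch assumes plain &lt;code>X.Y.Z&lt;/code> version strings.&lt;/p>

```python
# Hypothetical helper, not part of Apache Beam: adds the workaround flag only
# for SDK versions in the affected 2.52.0-2.54.0 range.
AFFECTED_MIN = (2, 52, 0)
AFFECTED_MAX = (2, 54, 0)
FLAG_NAME = "--max_cache_memory_usage_mb"


def parse_version(version):
    """Convert a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def with_cache_workaround(sdk_version, pipeline_args):
    """Return a copy of pipeline_args with the cache disabled if needed."""
    args = list(pipeline_args)
    parsed = parse_version(sdk_version)
    # Tuple comparison checks major, minor, and patch in order.
    affected = parsed >= AFFECTED_MIN and AFFECTED_MAX >= parsed
    # Respect a cache size that the user has already chosen explicitly.
    already_set = any(arg.startswith(FLAG_NAME) for arg in args)
    if affected and not already_set:
        args.append(FLAG_NAME + "=0")
    return args


print(with_cache_workaround("2.53.0", ["--runner=DirectRunner"]))
```

&lt;p>The returned list could then be passed to &lt;code>apache_beam.options.pipeline_options.PipelineOptions&lt;/code> when constructing the pipeline.&lt;/p>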
&lt;p>For the most up to date list of known issues, see &lt;a href="https://github.com/apache/beam/blob/master/CHANGES.md">https://github.com/apache/beam/blob/master/CHANGES.md&lt;/a>&lt;/p>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.53.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Arun Pandian&lt;/p>
&lt;p>Balázs Németh&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>Calvin Swenson Jr&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Clay Johnson&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Ferran Fernández Garrido&lt;/p>
&lt;p>Georgii Zemlianyi&lt;/p>
&lt;p>Israel Herraiz&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jacob Tomlinson&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>Julian Braha&lt;/p>
&lt;p>Julien Tournay&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Lawrence Qiu&lt;/p>
&lt;p>Mark Zitnik&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Mike Williamson&lt;/p>
&lt;p>Naireen&lt;/p>
&lt;p>Naireen Hussain&lt;/p>
&lt;p>Niel Markwick&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Radosław Stankiewicz&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Talat UYARER&lt;/p>
&lt;p>Tom Stepp&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Zechen Jiang&lt;/p>
&lt;p>clmccart&lt;/p>
&lt;p>damccorm&lt;/p>
&lt;p>darshan-sj&lt;/p>
&lt;p>gabry.wu&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>lrakla&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>tvalentyn&lt;/p></description></item><item><title>Blog: Apache Beam 2.52.0</title><link>/blog/beam-2.52.0/</link><pubDate>Fri, 17 Nov 2023 09:00:00 -0400</pubDate><guid>/blog/beam-2.52.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.52.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2520-2023-11-17">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.52.0, check out the &lt;a href="https://github.com/apache/beam/milestone/16">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Avro-dependent code, deprecated since Beam 2.46.0, has been removed from the Java SDK &amp;ldquo;core&amp;rdquo; package.
Use &lt;code>beam-sdks-java-extensions-avro&lt;/code> instead. This makes it easier to update the Avro version in user code without
potential breaking changes in Beam &amp;ldquo;core&amp;rdquo;, since the Beam Avro extension already supports the latest Avro versions and
should handle such updates (&lt;a href="https://github.com/apache/beam/issues/25252">#25252&lt;/a>).&lt;/li>
&lt;li>Publishing Java 21 SDK container images is now supported as part of the Apache Beam release process (&lt;a href="https://github.com/apache/beam/issues/28120">#28120&lt;/a>).
&lt;ul>
&lt;li>Direct Runner and Dataflow Runner support running pipelines on Java 21 (experimental until tests are fully set up). For other runners (Flink, Spark, Samza, etc.), support status depends on the runner projects.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added the &lt;code>UseDataStreamForBatch&lt;/code> pipeline option to the Flink runner. When it is set to true, the Flink runner runs batch
jobs using the DataStream API. By default the option is set to false, so batch jobs are still executed
using the DataSet API.&lt;/li>
&lt;li>&lt;code>upload_graph&lt;/code> as one of the Experiments options for DataflowRunner is no longer required when the graph is larger than 10MB for Java SDK (&lt;a href="https://github.com/apache/beam/pull/28621">PR#28621&lt;/a>).&lt;/li>
&lt;li>The state and side input cache is now enabled with a default size of 100 MB. Use &lt;code>--max_cache_memory_usage_mb=X&lt;/code> to set the cache size for the user state API and side inputs (Python) (&lt;a href="https://github.com/apache/beam/issues/28770">#28770&lt;/a>).&lt;/li>
&lt;li>Beam YAML stable release. Beam pipelines can now be written using YAML and leverage the Beam YAML framework, which includes a preliminary set of IOs and turnkey transforms. More information can be found in the YAML root folder and in the &lt;a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/README.md">README&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>&lt;code>org.apache.beam.sdk.io.CountingSource.CounterMark&lt;/code> now uses a custom &lt;code>CounterMarkCoder&lt;/code> as its default coder, since all Avro-dependent
classes have moved to &lt;code>extensions/avro&lt;/code>. If &lt;code>AvroCoder&lt;/code> is still required for &lt;code>CounterMark&lt;/code>, then,
as a workaround, copy the &amp;ldquo;old&amp;rdquo; &lt;code>CountingSource&lt;/code> class into your project code and use it directly
(&lt;a href="https://github.com/apache/beam/issues/25252">#25252&lt;/a>).&lt;/li>
&lt;li>Renamed &lt;code>host&lt;/code> to &lt;code>firestoreHost&lt;/code> in &lt;code>FirestoreOptions&lt;/code> to avoid potential conflict of command line arguments (Java) (&lt;a href="https://github.com/apache/beam/pull/29201">#29201&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed &amp;ldquo;Desired bundle size 0 bytes must be greater than 0&amp;rdquo; in Java SDK&amp;rsquo;s BigtableIO.BigtableSource when you have more cores than bytes to read (Java) &lt;a href="https://github.com/apache/beam/issues/28793">#28793&lt;/a>.&lt;/li>
&lt;li>The &lt;code>watch_file_pattern&lt;/code> argument of &lt;a href="https://github.com/apache/beam/blob/104c10b3ee536a9a3ea52b4dbf62d86b669da5d9/sdks/python/apache_beam/ml/inference/base.py#L997">RunInference&lt;/a> had no effect prior to 2.52.0. To get the behavior of &lt;code>watch_file_pattern&lt;/code> on versions prior to 2.52.0, follow the documentation at &lt;a href="https://beam.apache.org/documentation/ml/side-input-updates/">https://beam.apache.org/documentation/ml/side-input-updates/&lt;/a> and use the &lt;code>WatchFilePattern&lt;/code> PTransform as a side input. (&lt;a href="https://github.com/apache/beam/pulls/28948">#28948&lt;/a>)&lt;/li>
&lt;li>&lt;code>MLTransform&lt;/code> doesn&amp;rsquo;t output artifacts such as min, max and quantiles. Instead, &lt;code>MLTransform&lt;/code> will add a feature to output these artifacts in a human-readable format (&lt;a href="https://github.com/apache/beam/issues/29017">#29017&lt;/a>). For now, to use artifacts such as min and max that were produced by an earlier &lt;code>MLTransform&lt;/code>, use &lt;code>read_artifact_location&lt;/code> of &lt;code>MLTransform&lt;/code>, which reads artifacts that were produced earlier in a different &lt;code>MLTransform&lt;/code> (&lt;a href="https://github.com/apache/beam/pull/29016/">#29016&lt;/a>).&lt;/li>
&lt;li>Fixed a memory leak, which affected some long-running Python pipelines: &lt;a href="https://github.com/apache/beam/issues/28246">#28246&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="security-fixes">Security Fixes&lt;/h2>
&lt;ul>
&lt;li>Fixed &lt;a href="https://www.cve.org/CVERecord?id=CVE-2023-39325">CVE-2023-39325&lt;/a> (Java/Python/Go) (&lt;a href="https://github.com/apache/beam/issues/29118">#29118&lt;/a>).&lt;/li>
&lt;li>Mitigated &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2023-47248">CVE-2023-47248&lt;/a> (Python) &lt;a href="https://github.com/apache/beam/issues/29392">#29392&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.52.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Aleksandr Dudko&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>BjornPrime&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Bulat&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Devansh Modi&lt;/p>
&lt;p>Dominik Dębowczyk&lt;/p>
&lt;p>Ferran Fernández Garrido&lt;/p>
&lt;p>Hai Joey Tran&lt;/p>
&lt;p>Israel Herraiz&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>JayajP&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>Jiangjie Qin&lt;/p>
&lt;p>Jing&lt;/p>
&lt;p>Joar Wandborg&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>Julien Tournay&lt;/p>
&lt;p>Kanishk Karanawat&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kerry Donny-Clark&lt;/p>
&lt;p>Luís Bianchin&lt;/p>
&lt;p>Minbo Bae&lt;/p>
&lt;p>Pranav Bhandari&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>RyuSA&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Steven van Rossum&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vivek Sumanth&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>aku019&lt;/p>
&lt;p>brucearctor&lt;/p>
&lt;p>caneff&lt;/p>
&lt;p>damccorm&lt;/p>
&lt;p>ddebowczyk92&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>dpcollins-google&lt;/p>
&lt;p>edman124&lt;/p>
&lt;p>gabry.wu&lt;/p>
&lt;p>illoise&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>jonathan-lemos&lt;/p>
&lt;p>kennknowles&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>magicgoody&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>nancyxu123&lt;/p>
&lt;p>pablo rodriguez defino&lt;/p>
&lt;p>tvalentyn&lt;/p></description></item><item><title>Blog: Apache Beam 2.51.0</title><link>/blog/beam-2.51.0/</link><pubDate>Wed, 11 Oct 2023 09:00:00 -0400</pubDate><guid>/blog/beam-2.51.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.51.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2510-2023-10-03">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.51.0, check out the &lt;a href="https://github.com/apache/beam/milestone/15">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>In Python, &lt;a href="https://beam.apache.org/documentation/sdks/python-machine-learning/#why-use-the-runinference-api">RunInference&lt;/a> now supports loading many models in the same transform using a &lt;a href="https://beam.apache.org/documentation/sdks/python-machine-learning/#use-a-keyed-modelhandler">KeyedModelHandler&lt;/a> (&lt;a href="https://github.com/apache/beam/issues/27628">#27628&lt;/a>).&lt;/li>
&lt;li>In Python, the &lt;a href="https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.vertex_ai_inference.html#apache_beam.ml.inference.vertex_ai_inference.VertexAIModelHandlerJSON">VertexAIModelHandlerJSON&lt;/a> now supports passing in inference_args. These will be passed through to the Vertex endpoint as parameters.&lt;/li>
&lt;li>Added support to run &lt;code>mypy&lt;/code> on user pipelines (&lt;a href="https://github.com/apache/beam/issues/27906">#27906&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Removed fastjson library dependency for Beam SQL. Table property is changed to be based on jackson ObjectNode (Java) (&lt;a href="https://github.com/apache/beam/issues/24154">#24154&lt;/a>).&lt;/li>
&lt;li>Removed TensorFlow from Beam Python container images &lt;a href="https://github.com/apache/beam/pull/28424">PR&lt;/a>. If you have been negatively affected by this change, please comment on &lt;a href="https://github.com/apache/beam/issues/20605">#20605&lt;/a>.&lt;/li>
&lt;li>Removed the parameter &lt;code>t reflect.Type&lt;/code> from &lt;code>parquetio.Write&lt;/code>. The element type is derived from the input PCollection (Go) (&lt;a href="https://github.com/apache/beam/issues/28490">#28490&lt;/a>)&lt;/li>
&lt;li>Refactored &lt;code>BeamSqlSeekableTable.setUp&lt;/code> to add a &lt;code>joinSubsetType&lt;/code> parameter (&lt;a href="https://github.com/apache/beam/issues/28283">#28283&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed exception chaining issue in GCS connector (Python) (&lt;a href="https://github.com/apache/beam/issues/26769#issuecomment-1700422615">#26769&lt;/a>).&lt;/li>
&lt;li>Fixed streaming inserts exception handling, GoogleAPICallErrors are now retried according to retry strategy and routed to failed rows where appropriate rather than causing a pipeline error (Python) (&lt;a href="https://github.com/apache/beam/issues/21080">#21080&lt;/a>).&lt;/li>
&lt;li>Fixed a bug in Python SDK&amp;rsquo;s cross-language Bigtable sink that mishandled records that don&amp;rsquo;t have an explicit timestamp set: &lt;a href="https://github.com/apache/beam/issues/28632">#28632&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="security-fixes">Security Fixes&lt;/h2>
&lt;ul>
&lt;li>Python containers updated, fixing &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30474">CVE-2021-30474&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30475">CVE-2021-30475&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30473">CVE-2021-30473&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36133">CVE-2020-36133&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36131">CVE-2020-36131&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36130">CVE-2020-36130&lt;/a>, and &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36135">CVE-2020-36135&lt;/a>&lt;/li>
&lt;li>Used Go 1.21.1 to build, fixing &lt;a href="https://security-tracker.debian.org/tracker/CVE-2023-39320">CVE-2023-39320&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Python pipelines using BigQuery Storage Read API must pin &lt;code>fastavro&lt;/code> dependency to 1.8.3
or earlier: &lt;a href="https://github.com/apache/beam/issues/28811">#28811&lt;/a>&lt;/li>
&lt;/ul>
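&lt;p>The pin above can be checked at startup. The following minimal sketch assumes plain numeric version strings; the function names are hypothetical and not part of Apache Beam or fastavro.&lt;/p>

```python
from importlib import metadata

# Highest fastavro version allowed by the known issue above.
MAX_FASTAVRO = (1, 8, 3)


def fastavro_pin_satisfied(installed_version):
    """Return True if a plain 'X.Y.Z' version is 1.8.3 or earlier."""
    parts = tuple(int(p) for p in installed_version.split(".")[:3])
    return MAX_FASTAVRO >= parts


def check_installed_fastavro():
    """Return None if fastavro is not installed, else whether the pin holds."""
    try:
        installed = metadata.version("fastavro")
    except metadata.PackageNotFoundError:
        return None
    return fastavro_pin_satisfied(installed)


print(fastavro_pin_satisfied("1.8.3"), fastavro_pin_satisfied("1.9.0"))
```

&lt;p>A pipeline launcher could call &lt;code>check_installed_fastavro()&lt;/code> and refuse to submit the job when it returns &lt;code>False&lt;/code>.&lt;/p>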
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.51.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Adam Whitmore&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Aleksandr Dudko&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Arvind Ram&lt;/p>
&lt;p>Arwin Tio&lt;/p>
&lt;p>BjornPrime&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Bulat&lt;/p>
&lt;p>Celeste Zeng&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Clay Johnson&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>David Cavazos&lt;/p>
&lt;p>Dip Patel&lt;/p>
&lt;p>Hai Joey Tran&lt;/p>
&lt;p>Hao Xu&lt;/p>
&lt;p>Haruka Abe&lt;/p>
&lt;p>Jack Dingilian&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Jeffrey Kinard&lt;/p>
&lt;p>Joey Tran&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>Julien Tournay&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kerry Donny-Clark&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Melissa Pashniak&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Pranav Bhandari&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reeba Qureshi&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Ruwann&lt;/p>
&lt;p>Ryan Tam&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Sereana Seim&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Tim Grein&lt;/p>
&lt;p>Udi Meiri&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Zbynek Konecny&lt;/p>
&lt;p>Zechen Jiang&lt;/p>
&lt;p>bzablocki&lt;/p>
&lt;p>caneff&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>gDuperran&lt;/p>
&lt;p>gabry.wu&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>kberezin-nshl&lt;/p>
&lt;p>kennknowles&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>lostluck&lt;/p>
&lt;p>magicgoody&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>mosche&lt;/p>
&lt;p>olalamichelle&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>xqhu&lt;/p>
&lt;p>Łukasz Spyra&lt;/p></description></item><item><title>Blog: Apache Beam 2.50.0</title><link>/blog/beam-2.50.0/</link><pubDate>Wed, 30 Aug 2023 09:00:00 -0400</pubDate><guid>/blog/beam-2.50.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.50.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2500-2023-08-30">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.50.0, check out the &lt;a href="https://github.com/apache/beam/milestone/14">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Spark 3.2.2 is used as default version for Spark runner (&lt;a href="https://github.com/apache/beam/issues/23804">#23804&lt;/a>).&lt;/li>
&lt;li>The Go SDK has a new default local runner, called Prism (&lt;a href="https://github.com/apache/beam/issues/24789">#24789&lt;/a>).&lt;/li>
&lt;li>All Beam released container images are now &lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/build-multi-arch-for-arm#what_is_a_multi-arch_image">multi-arch images&lt;/a> that support both x86 and ARM CPU architectures.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Java KafkaIO now supports picking up topics via topicPattern (&lt;a href="https://github.com/apache/beam/pull/26948">#26948&lt;/a>)&lt;/li>
&lt;li>Support for read from Cosmos DB Core SQL API (&lt;a href="https://github.com/apache/beam/issues/23604">#23604&lt;/a>)&lt;/li>
&lt;li>Upgraded to HBase 2.5.5 for HBaseIO. (Java) (&lt;a href="https://github.com/apache/beam/issues/19554">#27711&lt;/a>)&lt;/li>
&lt;li>Added support for GoogleAdsIO source (Java) (&lt;a href="https://github.com/apache/beam/pull/27681">#27681&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>The Go SDK now requires Go 1.20 to build. (&lt;a href="https://github.com/apache/beam/issues/27558">#27558&lt;/a>)&lt;/li>
&lt;li>The Go SDK has a new default local runner, Prism. (&lt;a href="https://github.com/apache/beam/issues/24789">#24789&lt;/a>).
&lt;ul>
&lt;li>Prism is a portable runner that executes each transform independently, ensuring coders.&lt;/li>
&lt;li>At this point it supersedes the Go direct runner in functionality. The Go direct runner is now deprecated.&lt;/li>
&lt;li>See &lt;a href="https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/README.md">https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/README.md&lt;/a> for the goals and features of Prism.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Hugging Face Model Handler for RunInference added to Python SDK. (&lt;a href="https://github.com/apache/beam/pull/26632">#26632&lt;/a>)&lt;/li>
&lt;li>Hugging Face Pipelines support for RunInference added to Python SDK. (&lt;a href="https://github.com/apache/beam/pull/27399">#27399&lt;/a>)&lt;/li>
&lt;li>Vertex AI Model Handler for RunInference now supports private endpoints (&lt;a href="https://github.com/apache/beam/pull/27696">#27696&lt;/a>)&lt;/li>
&lt;li>MLTransform transform added with support for common ML pre/postprocessing operations (&lt;a href="https://github.com/apache/beam/pull/26795">#26795&lt;/a>)&lt;/li>
&lt;li>Upgraded the Kryo extension for the Java SDK to Kryo 5.5.0. This brings in bug fixes, performance improvements, and serialization of Java 14 records. (&lt;a href="https://github.com/apache/beam/issues/27635">#27635&lt;/a>)&lt;/li>
&lt;li>All Beam released container images are now &lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/build-multi-arch-for-arm#what_is_a_multi-arch_image">multi-arch images&lt;/a> that support both x86 and ARM CPU architectures. (&lt;a href="https://github.com/apache/beam/issues/27674">#27674&lt;/a>). The multi-arch container images include:
&lt;ul>
&lt;li>All versions of Go, Python, Java and Typescript SDK containers.&lt;/li>
&lt;li>All versions of Flink job server containers.&lt;/li>
&lt;li>Java and Python expansion service containers.&lt;/li>
&lt;li>Transform service controller container.&lt;/li>
&lt;li>Spark3 job server container.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Added support for batched writes to AWS SQS for improved throughput (Java, AWS 2) (&lt;a href="https://github.com/apache/beam/issues/21429">#21429&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Python SDK: Legacy runner support removed from Dataflow, all pipelines must use runner v2.&lt;/li>
&lt;li>Python SDK: Dataflow Runner will no longer stage Beam SDK from PyPI in the &lt;code>--staging_location&lt;/code> at pipeline submission. Custom container images that are not based on Beam&amp;rsquo;s default image must include an Apache Beam installation (&lt;a href="https://github.com/apache/beam/issues/26996">#26996&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>The Go Direct Runner is now deprecated. It remains available to reduce migration churn.
&lt;ul>
&lt;li>Tests can be set back to the direct runner by overriding TestMain: &lt;code>func TestMain(m *testing.M) { ptest.MainWithDefault(m, &amp;quot;direct&amp;quot;) }&lt;/code>&lt;/li>
&lt;li>It&amp;rsquo;s recommended to fix issues seen in tests using Prism, as they can also happen on any portable runner.&lt;/li>
&lt;li>Use the generic register package for your pipeline DoFns to ensure pipelines function on portable runners, like Prism.&lt;/li>
&lt;li>Do not rely on closures or using package globals for DoFn configuration. They don&amp;rsquo;t function on portable runners.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed DirectRunner bug in Python SDK where GroupByKey gets empty PCollection and fails when pipeline option &lt;code>direct_num_workers!=1&lt;/code> (&lt;a href="https://github.com/apache/beam/pull/27373">#27373&lt;/a>).&lt;/li>
&lt;li>Fixed BigQuery I/O bug when estimating size on queries that utilize row-level security (&lt;a href="https://github.com/apache/beam/pull/27474">#27474&lt;/a>)&lt;/li>
&lt;li>Beam Python containers rely on a version of Debian/aom that has several security vulnerabilities: &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30474">CVE-2021-30474&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30475">CVE-2021-30475&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2021-30473">CVE-2021-30473&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36133">CVE-2020-36133&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36131">CVE-2020-36131&lt;/a>, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36130">CVE-2020-36130&lt;/a>, and &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2020-36135">CVE-2020-36135&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Long-running Python pipelines might experience a memory leak: &lt;a href="https://github.com/apache/beam/issues/28246">#28246&lt;/a>.&lt;/li>
&lt;li>Python Pipelines using BigQuery IO or &lt;code>orjson&lt;/code> dependency might experience segmentation faults or get stuck: &lt;a href="https://github.com/apache/beam/issues/28318">#28318&lt;/a>.&lt;/li>
&lt;li>Python SDK&amp;rsquo;s cross-language Bigtable sink mishandles records that don&amp;rsquo;t have an explicit timestamp set: &lt;a href="https://github.com/apache/beam/issues/28632">#28632&lt;/a>. To avoid this issue, set explicit timestamps for all records before writing to Bigtable.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.50.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abacn&lt;/p>
&lt;p>acejune&lt;/p>
&lt;p>AdalbertMemSQL&lt;/p>
&lt;p>ahmedabu98&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>al97&lt;/p>
&lt;p>Aleksandr Dudko&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Anton Shalkovich&lt;/p>
&lt;p>ArjunGHUB&lt;/p>
&lt;p>Bjorn Pedersen&lt;/p>
&lt;p>BjornPrime&lt;/p>
&lt;p>Brett Morgan&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Buqian Zheng&lt;/p>
&lt;p>Burke Davison&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>bzablocki&lt;/p>
&lt;p>case-k&lt;/p>
&lt;p>Celeste Zeng&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Clay Johnson&lt;/p>
&lt;p>Connor Brett&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Damon Douglas&lt;/p>
&lt;p>Dan Hansen&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>Dip Patel&lt;/p>
&lt;p>Dmytro Sadovnychyi&lt;/p>
&lt;p>Florent Biville&lt;/p>
&lt;p>Gabriel Lacroix&lt;/p>
&lt;p>Hai Joey Tran&lt;/p>
&lt;p>Hong Liang Teoh&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>James Fricker&lt;/p>
&lt;p>Jeff Kinard&lt;/p>
&lt;p>Jeff Zhang&lt;/p>
&lt;p>Jing&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>jon esperanza&lt;/p>
&lt;p>Josef Šimánek&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Laksh&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>magicgoody&lt;/p>
&lt;p>Mahmud Ridwan&lt;/p>
&lt;p>Manav Garg&lt;/p>
&lt;p>Marco Vela&lt;/p>
&lt;p>martin trieu&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>mosche&lt;/p>
&lt;p>Peter Sobot&lt;/p>
&lt;p>Pranav Bhandari&lt;/p>
&lt;p>Reeba Qureshi&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>RyuSA&lt;/p>
&lt;p>Saba Sathya&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Steven Niemitz&lt;/p>
&lt;p>Steven van Rossum&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>Yichi Zhang&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Zechen Jiang&lt;/p></description></item><item><title>Blog: Apache Beam 2.49.0</title><link>/blog/beam-2.49.0/</link><pubDate>Mon, 17 Jul 2023 09:00:00 -0400</pubDate><guid>/blog/beam-2.49.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.49.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2490-2023-07-17">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.49.0, check out the &lt;a href="https://github.com/apache/beam/milestone/13">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Support for Bigtable Change Streams added in Java &lt;code>BigtableIO.ReadChangeStream&lt;/code> (&lt;a href="https://github.com/apache/beam/issues/27183">#27183&lt;/a>).&lt;/li>
&lt;li>Added Bigtable Read and Write cross-language transforms to Python SDK (&lt;a href="https://github.com/apache/beam/issues/26593">#26593&lt;/a>, &lt;a href="https://github.com/apache/beam/issues/27146">#27146&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Allow prebuilding large images when using &lt;code>--prebuild_sdk_container_engine=cloud_build&lt;/code>, like images depending on &lt;code>tensorflow&lt;/code> or &lt;code>torch&lt;/code> (&lt;a href="https://github.com/apache/beam/pull/27023">#27023&lt;/a>).&lt;/li>
&lt;li>Disabled &lt;code>pip&lt;/code> cache when installing packages on the workers. This reduces the size of prebuilt Python container images (&lt;a href="https://github.com/apache/beam/pull/27035">#27035&lt;/a>).&lt;/li>
&lt;li>Select dedicated avro datum reader and writer (Java) (&lt;a href="https://github.com/apache/beam/issues/18874">#18874&lt;/a>).&lt;/li>
&lt;li>Timer API for the Go SDK (Go) (&lt;a href="https://github.com/apache/beam/issues/22737">#22737&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Removed Python 3.7 support (&lt;a href="https://github.com/apache/beam/issues/26447">#26447&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed KinesisIO &lt;code>NullPointerException&lt;/code> when a progress check is made before the reader is started (IO) (&lt;a href="https://github.com/apache/beam/issues/23868">#23868&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Long-running Python pipelines might experience a memory leak: &lt;a href="https://github.com/apache/beam/issues/28246">#28246&lt;/a>.&lt;/li>
&lt;li>Python SDK&amp;rsquo;s cross-language Bigtable sink mishandles records that don&amp;rsquo;t have an explicit timestamp set: &lt;a href="https://github.com/apache/beam/issues/28632">#28632&lt;/a>. To avoid this issue, set explicit timestamps for all records before writing to Bigtable.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.49.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abzal Tuganbay&lt;/p>
&lt;p>AdalbertMemSQL&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alan Zhang&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Arwin Tio&lt;/p>
&lt;p>Bartosz Zablocki&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Burke Davison&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Charles Rothrock&lt;/p>
&lt;p>Chris Gavin&lt;/p>
&lt;p>Claire McGinty&lt;/p>
&lt;p>Clay Johnson&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Daniel Dopierała&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>David Cavazos&lt;/p>
&lt;p>Dip Patel&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>Gavin McDonald&lt;/p>
&lt;p>Jack Dingilian&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>James Fricker&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>Jasper Van den Bossche&lt;/p>
&lt;p>John Casey&lt;/p>
&lt;p>John Gill&lt;/p>
&lt;p>Joseph Crowley&lt;/p>
&lt;p>Kanishk Karanawat&lt;/p>
&lt;p>Katie Liu&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kyle Galloway&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>MakarkinSAkvelon&lt;/p>
&lt;p>Masato Nakamura&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Naireen Hussain&lt;/p>
&lt;p>Nathaniel Young&lt;/p>
&lt;p>Nelson Osacky&lt;/p>
&lt;p>Nick Li&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Reeba Qureshi&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Rouslan&lt;/p>
&lt;p>Saadat Su&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sanil Jain&lt;/p>
&lt;p>Shunping Huang&lt;/p>
&lt;p>Smeet nagda&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Timur Sultanov&lt;/p>
&lt;p>Udi Meiri&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vlado Djerek&lt;/p>
&lt;p>WuA&lt;/p>
&lt;p>XQ Hu&lt;/p>
&lt;p>Xianhua Liu&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Zachary Houfek&lt;/p>
&lt;p>alexeyinkin&lt;/p>
&lt;p>bigduu&lt;/p>
&lt;p>bullet03&lt;/p>
&lt;p>bzablocki&lt;/p>
&lt;p>jonathan-lemos&lt;/p>
&lt;p>jubebo&lt;/p>
&lt;p>magicgoody&lt;/p>
&lt;p>ruslan-ikhsan&lt;/p>
&lt;p>sultanalieva-s&lt;/p>
&lt;p>vitaly.terentyev&lt;/p></description></item><item><title>Blog: Apache Beam 2.48.0</title><link>/blog/beam-2.48.0/</link><pubDate>Wed, 31 May 2023 11:30:00 -0400</pubDate><guid>/blog/beam-2.48.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.48.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2480-2023-05-31">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.48.0, check out the &lt;a href="https://github.com/apache/beam/milestone/12">detailed release notes&lt;/a>.&lt;/p>
&lt;p>&lt;strong>Note: The release tag for the Go SDK for this release is sdks/v2.48.2 instead of sdks/v2.48.0 because an incorrect commit was attached to the release tag sdks/v2.48.0.&lt;/strong>&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>&amp;ldquo;Experimental&amp;rdquo; annotation cleanup: the annotation and concept have been removed from Beam to avoid
the misperception of code as &amp;ldquo;not ready&amp;rdquo;. Any proposed breaking changes will be subject to
case-by-case pro/con decision making (and generally avoided) rather than relying on the &amp;ldquo;Experimental&amp;rdquo;
annotation to allow them.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added rename for GCS and copy for local filesystem (Go) (&lt;a href="https://github.com/apache/beam/issues/26064">#26064&lt;/a>).&lt;/li>
&lt;li>Added support for enhanced fan-out in KinesisIO.Read (Java) (&lt;a href="https://github.com/apache/beam/issues/19967">#19967&lt;/a>).
&lt;ul>
&lt;li>This change is not compatible with Flink savepoints created by Beam 2.46.0 applications which had KinesisIO sources.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Added textio.ReadWithFilename transform (Go) (&lt;a href="https://github.com/apache/beam/issues/25812">#25812&lt;/a>).&lt;/li>
&lt;li>Added fileio.MatchContinuously transform (Go) (&lt;a href="https://github.com/apache/beam/issues/26186">#26186&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Allow passing service name for google-cloud-profiler (Python) (&lt;a href="https://github.com/apache/beam/issues/26280">#26280&lt;/a>).&lt;/li>
&lt;li>Dead letter queue support added to RunInference in Python (&lt;a href="https://github.com/apache/beam/issues/24209">#24209&lt;/a>).&lt;/li>
&lt;li>Support added for defining pre/postprocessing operations on the RunInference transform (&lt;a href="https://github.com/apache/beam/issues/26308">#26308&lt;/a>)&lt;/li>
&lt;li>Adds a Docker Compose based transform service that can be used to discover and use portable Beam transforms (&lt;a href="https://github.com/apache/beam/pull/26023">#26023&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Passing a tag into MultiProcessShared is now required in the Python SDK (&lt;a href="https://github.com/apache/beam/issues/26168">#26168&lt;/a>).&lt;/li>
&lt;li>CloudDebuggerOptions is removed (deprecated in Beam v2.47.0) for Dataflow runner as the Google Cloud Debugger service is &lt;a href="https://cloud.google.com/debugger/docs/deprecations">shutting down&lt;/a>. (Java) (&lt;a href="https://github.com/apache/beam/issues/25959">#25959&lt;/a>).&lt;/li>
&lt;li>AWS 2 client providers (deprecated in Beam &lt;a href="#2380---2022-04-20">v2.38.0&lt;/a>) are finally removed (&lt;a href="https://github.com/apache/beam/issues/26681">#26681&lt;/a>).&lt;/li>
&lt;li>AWS 2 SnsIO.writeAsync (deprecated in Beam v2.37.0 due to risk of data loss) was finally removed (&lt;a href="https://github.com/apache/beam/issues/26710">#26710&lt;/a>).&lt;/li>
&lt;li>AWS 2 coders (deprecated in Beam v2.43.0 when adding Schema support for AWS Sdk Pojos) are finally removed (&lt;a href="https://github.com/apache/beam/issues/23315">#23315&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed the Java bootloader failing with Too Long Args due to long classpaths by using a pathing jar (Java) (&lt;a href="https://github.com/apache/beam/issues/25582">#25582&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>PubsubIO writes will throw &lt;em>SizeLimitExceededException&lt;/em> for any message above 100 bytes, when used in batch (bounded) mode. (Java) (&lt;a href="https://github.com/apache/beam/issues/27000">#27000&lt;/a>).&lt;/li>
&lt;li>Long-running Python pipelines might experience a memory leak: &lt;a href="https://github.com/apache/beam/issues/28246">#28246&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.48.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abzal Tuganbay&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Andrey Devyatkin&lt;/p>
&lt;p>Balázs Németh&lt;/p>
&lt;p>Bazyli Polednia&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Clay Johnson&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Daniel Arn&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>Dip Patel&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>George Novitskiy&lt;/p>
&lt;p>Israel Herraiz&lt;/p>
&lt;p>Jack Dingilian&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>Jasper Van den Bossche&lt;/p>
&lt;p>Jeff Zhang&lt;/p>
&lt;p>Jeremy Edwards&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>John Casey&lt;/p>
&lt;p>Katie Liu&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kerry Donny-Clark&lt;/p>
&lt;p>Kuba Rauch&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>MakarkinSAkvelon&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Michel Davit&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Nick Li&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Pranav Bhandari&lt;/p>
&lt;p>Pranjal Joshi&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Rouslan&lt;/p>
&lt;p>RuiLong J&lt;/p>
&lt;p>RyujiTamaki&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sanil Jain&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Timur Sultanov&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Udi Meiri&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vishal Bhise&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>bullet03&lt;/p>
&lt;p>darshan-sj&lt;/p>
&lt;p>kellen&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>mokamoka03210120&lt;/p>
&lt;p>psolomin&lt;/p></description></item><item><title>Blog: Apache Beam 2.47.0</title><link>/blog/beam-2.47.0/</link><pubDate>Wed, 10 May 2023 12:00:00 -0500</pubDate><guid>/blog/beam-2.47.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.47.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2470-2023-05-10">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.47.0, check out the &lt;a href="https://github.com/apache/beam/milestone/10">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Apache Beam adds Python 3.11 support (&lt;a href="https://github.com/apache/beam/issues/23848">#23848&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>BigQuery Storage Write API is now available in Python SDK via cross-language (&lt;a href="https://github.com/apache/beam/issues/21961">#21961&lt;/a>).&lt;/li>
&lt;li>Added HbaseIO support for writing RowMutations (ordered by rowkey) to Hbase (Java) (&lt;a href="https://github.com/apache/beam/issues/25830">#25830&lt;/a>).&lt;/li>
&lt;li>Added fileio transforms MatchFiles, MatchAll and ReadMatches (Go) (&lt;a href="https://github.com/apache/beam/issues/25779">#25779&lt;/a>).&lt;/li>
&lt;li>Add integration test for JmsIO + fix issue with multiple connections (Java) (&lt;a href="https://github.com/apache/beam/issues/25887">#25887&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>The Flink runner now supports Flink 1.16.x (&lt;a href="https://github.com/apache/beam/issues/25046">#25046&lt;/a>).&lt;/li>
&lt;li>Schema&amp;rsquo;d PTransforms can now be directly applied to Beam dataframes just like PCollections.
(Note that when doing multiple operations, it may be more efficient to explicitly chain the operations
like &lt;code>df | (Transform1 | Transform2 | ...)&lt;/code> to avoid excessive conversions.)&lt;/li>
&lt;li>The Go SDK adds new transforms periodic.Impulse and periodic.Sequence that extend support
for slowly updating side input patterns (&lt;a href="https://github.com/apache/beam/issues/23106">#23106&lt;/a>).&lt;/li>
&lt;li>Several Google client libraries in the Python SDK dependency chain were updated to the latest available major versions (&lt;a href="https://github.com/apache/beam/pull/24599">#24599&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>If a main session fails to load, the pipeline will now fail at worker startup. (&lt;a href="https://github.com/apache/beam/issues/25401">#25401&lt;/a>).&lt;/li>
&lt;li>Python pipeline options will now ignore unparsed command line flags prefixed with a single dash. (&lt;a href="https://github.com/apache/beam/issues/25943">#25943&lt;/a>).&lt;/li>
&lt;li>The SmallestPerKey combiner now requires keyword-only arguments for specifying optional parameters, such as &lt;code>key&lt;/code> and &lt;code>reverse&lt;/code>. (&lt;a href="https://github.com/apache/beam/issues/25888">#25888&lt;/a>).&lt;/li>
&lt;/ul>
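&lt;p>The keyword-only requirement for &lt;code>SmallestPerKey&lt;/code> can be pictured with a small standard-library sketch. The function &lt;code>smallest_per_key&lt;/code> below is a hypothetical stand-in, not Beam&amp;rsquo;s implementation, and its handling of &lt;code>reverse&lt;/code> is an assumption made only for illustration. It shows how a bare &lt;code>*&lt;/code> in a Python signature forces optional parameters to be named at the call site.&lt;/p>

```python
import heapq
from collections import defaultdict


def smallest_per_key(pairs, n, *, key=None, reverse=False):
    """Hypothetical stand-in, not Beam's SmallestPerKey implementation.

    Every parameter after the bare * must be passed by keyword.
    """
    grouped = defaultdict(list)
    for k, value in pairs:
        grouped[k].append(value)
    # Assumption for this sketch: reverse=True selects the largest values.
    pick = heapq.nlargest if reverse else heapq.nsmallest
    return {k: pick(n, values, key=key) for k, values in grouped.items()}


pairs = [("a", 3), ("a", -5), ("a", 1), ("b", 2)]

# Keyword form: accepted.
by_abs = smallest_per_key(pairs, 2, key=abs)

# Positional form: rejected with TypeError because key is keyword-only.
try:
    smallest_per_key(pairs, 2, abs)
    positional_ok = True
except TypeError:
    positional_ok = False
```

&lt;p>Call sites that passed &lt;code>key&lt;/code> or &lt;code>reverse&lt;/code> positionally would need the same kind of change: name the argument explicitly.&lt;/p>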
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Cloud Debugger support and its pipeline options are deprecated and will be removed in the next Beam version,
in response to the Google Cloud Debugger service &lt;a href="https://cloud.google.com/debugger/docs/deprecations">turning down&lt;/a>. (Java) (&lt;a href="https://github.com/apache/beam/issues/25959">#25959&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed an issue where the BigQuery sink in STORAGE_WRITE_API mode in batch pipelines could cause data consistency issues while handling other, unrelated transient errors. Beam SDKs 2.35.0 - 2.46.0 (inclusive) are affected. For more details see: &lt;a href="https://github.com/apache/beam/issues/26521">https://github.com/apache/beam/issues/26521&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>BigQueryIO Storage API write with autoUpdateSchema may cause data corruption for Beam SDKs 2.45.0 - 2.47.0 (inclusive) (&lt;a href="https://github.com/apache/beam/issues/26789">#26789&lt;/a>)&lt;/li>
&lt;li>Long-running Python pipelines might experience a memory leak: &lt;a href="https://github.com/apache/beam/issues/28246">#28246&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.47.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Amir Fayazi&lt;/p>
&lt;p>Amrane Ait Zeouay&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrew Pilloud&lt;/p>
&lt;p>Andrey Kot&lt;/p>
&lt;p>Bjorn Pedersen&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Buqian Zheng&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>ChangyuLi28&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>George Ma&lt;/p>
&lt;p>Jack Dingilian&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jasper Van den Bossche&lt;/p>
&lt;p>Jeremy Edwards&lt;/p>
&lt;p>Jiangjie (Becket) Qin&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>Juta Staes&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kyle Weaver&lt;/p>
&lt;p>Mattie Fu&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Nick Li&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Reza Rokni&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Saadat Su&lt;/p>
&lt;p>Saifuddin53&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Shubham Krishna&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Theodore Ni&lt;/p>
&lt;p>Thomas Gaddy&lt;/p>
&lt;p>Timur Sultanov&lt;/p>
&lt;p>Udi Meiri&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yanan Hao&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Yuvi Panda&lt;/p>
&lt;p>andres-vv&lt;/p>
&lt;p>bochap&lt;/p>
&lt;p>dannikay&lt;/p>
&lt;p>darshan-sj&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>harrisonlimh&lt;/p>
&lt;p>hnnsgstfssn&lt;/p>
&lt;p>jrmccluskey&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>xianhualiu&lt;/p>
&lt;p>zhangskz&lt;/p></description></item><item><title>Blog: Apache Beam 2.46.0</title><link>/blog/beam-2.46.0/</link><pubDate>Fri, 10 Mar 2023 13:00:00 -0500</pubDate><guid>/blog/beam-2.46.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.46.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2460-2023-03-10">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.46.0, check out the &lt;a href="https://github.com/apache/beam/milestone/9?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Java SDK containers migrated to &lt;a href="https://hub.docker.com/_/eclipse-temurin">Eclipse Temurin&lt;/a>
as a base. This change migrates away from the deprecated &lt;a href="https://hub.docker.com/_/openjdk">OpenJDK&lt;/a>
container. Eclipse Temurin is currently based upon Ubuntu 22.04 while the OpenJDK
container was based upon Debian 11.&lt;/li>
&lt;li>RunInference PTransform will accept model paths as SideInputs in Python SDK. (&lt;a href="https://github.com/apache/beam/issues/24042">#24042&lt;/a>)&lt;/li>
&lt;li>RunInference supports ONNX runtime in Python SDK (&lt;a href="https://github.com/apache/beam/issues/22972">#22972&lt;/a>)&lt;/li>
&lt;li>Tensorflow Model Handler for RunInference in Python SDK (&lt;a href="https://github.com/apache/beam/issues/25366">#25366&lt;/a>)&lt;/li>
&lt;li>Java SDK modules migrated to use &lt;code>:sdks:java:extensions:avro&lt;/code> (&lt;a href="https://github.com/apache/beam/issues/24748">#24748&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added in JmsIO a retry policy for failed publications (Java) (&lt;a href="https://github.com/apache/beam/issues/24971">#24971&lt;/a>).&lt;/li>
&lt;li>Support for &lt;code>LZMA&lt;/code> compression/decompression of text files added to the Python SDK (&lt;a href="https://github.com/apache/beam/issues/25316">#25316&lt;/a>)&lt;/li>
&lt;li>Added ReadFrom/WriteTo Csv/Json as top-level transforms to the Python SDK.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Add UDF metrics support for Samza portable mode.&lt;/li>
&lt;li>Option for SparkRunner to avoid the need of SDF output to fit in memory (&lt;a href="https://github.com/apache/beam/issues/23852">#23852&lt;/a>).
This helps e.g. with ParquetIO reads. Turn the feature on by adding experiment &lt;code>use_bounded_concurrent_output_for_sdf&lt;/code>.&lt;/li>
&lt;li>Add &lt;code>WatchFilePattern&lt;/code> transform, which can be used as a side input to the RunInference PTransform to watch for model updates using a file pattern. (&lt;a href="https://github.com/apache/beam/issues/24042">#24042&lt;/a>)&lt;/li>
&lt;li>Add support for loading TorchScript models with &lt;code>PytorchModelHandler&lt;/code>. The TorchScript model path can be
passed to PytorchModelHandler using &lt;code>torch_script_model_path=&amp;lt;path_to_model&amp;gt;&lt;/code>. (&lt;a href="https://github.com/apache/beam/pull/25321">#25321&lt;/a>)&lt;/li>
&lt;li>The Go SDK now requires Go 1.19 to build. (&lt;a href="https://github.com/apache/beam/pull/25545">#25545&lt;/a>)&lt;/li>
&lt;li>The Go SDK now has an initial native Go implementation of a portable Beam Runner called Prism. (&lt;a href="https://github.com/apache/beam/pull/24789">#24789&lt;/a>)
&lt;ul>
&lt;li>For more details and current state see &lt;a href="https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/runners/prism">https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/runners/prism&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The deprecated SparkRunner for Spark 2 (see &lt;a href="#2410---2022-08-23">2.41.0&lt;/a>) was removed (&lt;a href="https://github.com/apache/beam/pull/25263">#25263&lt;/a>).&lt;/li>
&lt;li>Python&amp;rsquo;s BatchElements performs more aggressive batching in some cases,
capping at 10 second rather than 1 second batches by default and excluding
fixed cost in this computation to better handle cases where the fixed cost
is larger than a single second. To get the old behavior, one can pass
&lt;code>target_batch_duration_secs_including_fixed_cost=1&lt;/code> to BatchElements.&lt;/li>
&lt;/ul>
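&lt;p>To see why excluding the fixed cost matters, here is a deliberately simplified cost model. It is not Beam&amp;rsquo;s batch size estimator. The function name &lt;code>max_batch_size&lt;/code>, the linear model, and the sample costs are all assumptions for illustration. A batch of &lt;code>n&lt;/code> elements is assumed to take &lt;code>fixed_cost_secs + n * per_element_secs&lt;/code>.&lt;/p>

```python
def max_batch_size(target_secs, per_element_secs, fixed_cost_secs=0.0,
                   include_fixed_cost=False):
    """Largest batch whose modeled duration fits the target (at least 1).

    Simplified model, not Beam's estimator:
    duration = fixed_cost_secs + n * per_element_secs
    """
    if include_fixed_cost:
        # The fixed cost eats into the time budget before any element runs.
        budget = target_secs - fixed_cost_secs
    else:
        # Only the per-element work counts against the target.
        budget = target_secs
    return max(1, int(budget / per_element_secs))


# Assumed workload: 2 s of fixed cost per batch, 0.125 s per element.
fixed, per_element = 2.0, 0.125

# Old style: 1 s target that includes the fixed cost. The fixed cost alone
# exceeds the target, so the model collapses to single-element batches.
old_style = max_batch_size(1.0, per_element, fixed, include_fixed_cost=True)

# New style: 10 s target that excludes the fixed cost.
new_style = max_batch_size(10.0, per_element, fixed)
```

&lt;p>Under these assumed numbers the old-style target yields batches of one element, while the new-style target allows 80, which is the situation the note describes when the fixed cost is larger than a single second.&lt;/p>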
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Avro-related classes are deprecated in module &lt;code>beam-sdks-java-core&lt;/code> and will eventually be removed. Please migrate to the new module &lt;code>beam-sdks-java-extensions-avro&lt;/code> by importing the classes from the &lt;code>org.apache.beam.sdk.extensions.avro&lt;/code> package.
To keep migration simple, the relative package path and the whole class hierarchy of the Avro-related classes in the new module are the same as before.
For example, import the &lt;code>org.apache.beam.sdk.extensions.avro.coders.AvroCoder&lt;/code> class instead of &lt;code>org.apache.beam.sdk.coders.AvroCoder&lt;/code> (&lt;a href="https://github.com/apache/beam/issues/24749">#24749&lt;/a>).&lt;/li>
&lt;/ul>
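&lt;p>Because the relative package path is preserved, the import change is mechanical. The helper below is a minimal sketch of such a rewrite, not an official migration tool. The function &lt;code>migrate_import&lt;/code> and the &lt;code>AVRO_CLASSES&lt;/code> set are hypothetical, and only the &lt;code>AvroCoder&lt;/code> entry comes from the note above. Any other class names would have to be supplied by the caller.&lt;/p>

```python
OLD_PREFIX = "org.apache.beam.sdk."
NEW_PREFIX = "org.apache.beam.sdk.extensions.avro."

# Assumption: the caller lists which classes are Avro-related. Only the
# AvroCoder entry is taken from the release note itself.
AVRO_CLASSES = {"org.apache.beam.sdk.coders.AvroCoder"}


def migrate_import(line, avro_classes=AVRO_CLASSES):
    """Rewrite one Java import line if it names a listed Avro class."""
    stripped = line.strip()
    if not (stripped.startswith("import ") and stripped.endswith(";")):
        return line
    name = stripped[len("import "):-1].strip()
    if name not in avro_classes:
        return line
    # Keep the relative package path; only the prefix changes.
    return line.replace(name, NEW_PREFIX + name[len(OLD_PREFIX):])


migrated = migrate_import("import org.apache.beam.sdk.coders.AvroCoder;")
untouched = migrate_import("import org.apache.beam.sdk.coders.StringUtf8Coder;")
```

&lt;p>The sketch leaves unrelated imports untouched and does not handle static or wildcard imports. The build would also need the &lt;code>beam-sdks-java-extensions-avro&lt;/code> module as a dependency.&lt;/p>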
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.46.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alan Zhang&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Amrane Ait Zeouay&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrew Pilloud&lt;/p>
&lt;p>Brian Hulette&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>David Katz&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>Doug Judd&lt;/p>
&lt;p>Egbert van der Wal&lt;/p>
&lt;p>Elizaveta Lomteva&lt;/p>
&lt;p>Evan Galpin&lt;/p>
&lt;p>Herman Mak&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>John Casey&lt;/p>
&lt;p>Jozef Vilcek&lt;/p>
&lt;p>Junhao Liu&lt;/p>
&lt;p>Juta Staes&lt;/p>
&lt;p>Katie Liu&lt;/p>
&lt;p>Kiley Sok&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>Luke Cwik&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Ning Kang&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo E&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Ruslan Altynnikov&lt;/p>
&lt;p>Ryan Zhang&lt;/p>
&lt;p>Sam Rohde&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sam sam&lt;/p>
&lt;p>Sergei Lilichenko&lt;/p>
&lt;p>Shivam&lt;/p>
&lt;p>Shubham Krishna&lt;/p>
&lt;p>Theodore Ni&lt;/p>
&lt;p>Timur Sultanov&lt;/p>
&lt;p>Tony Tang&lt;/p>
&lt;p>Vachan&lt;/p>
&lt;p>Veronica Wasson&lt;/p>
&lt;p>Vincent Devillers&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>William Ross Morrow&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>ZhengLin Li&lt;/p>
&lt;p>Ziqi Ma&lt;/p>
&lt;p>ahmedabu98&lt;/p>
&lt;p>alexeyinkin&lt;/p>
&lt;p>aliftadvantage&lt;/p>
&lt;p>bullet03&lt;/p>
&lt;p>dannikay&lt;/p>
&lt;p>darshan-sj&lt;/p>
&lt;p>dependabot[bot]&lt;/p>
&lt;p>johnjcasey&lt;/p>
&lt;p>kamrankoupayi&lt;/p>
&lt;p>kileys&lt;/p>
&lt;p>liferoad&lt;/p>
&lt;p>nancyxu123&lt;/p>
&lt;p>nickuncaged1201&lt;/p>
&lt;p>pablo rodriguez defino&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>xqhu&lt;/p></description></item><item><title>Blog: Apache Beam 2.45.0</title><link>/blog/beam-2.45.0/</link><pubDate>Wed, 15 Feb 2023 09:00:00 -0700</pubDate><guid>/blog/beam-2.45.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.45.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2450-2023-02-15">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.45.0, check out the &lt;a href="https://github.com/apache/beam/milestone/8?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>MongoDB IO connector added (Go) (&lt;a href="https://github.com/apache/beam/issues/24575">#24575&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>RunInference Wrapper with Sklearn Model Handler support added in Go SDK (&lt;a href="https://github.com/apache/beam/issues/23382">#24497&lt;/a>).&lt;/li>
&lt;li>Added an override of allowed TLS algorithms (Java), now maintaining the disabled/legacy algorithms
present in 2.43.0 (up to 1.8.0_342, 11.0.16, 17.0.2 for respective Java versions). This is accompanied
by an explicit re-enabling of TLSv1 and TLSv1.1 for Java 8 and Java 11.&lt;/li>
&lt;li>Add UDF metrics support for Samza portable mode.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Portable Java pipelines, Go pipelines, Python streaming pipelines, and portable Python batch
pipelines on Dataflow are required to use Runner V2. The &lt;code>disable_runner_v2&lt;/code>,
&lt;code>disable_runner_v2_until_2023&lt;/code>, &lt;code>disable_prime_runner_v2&lt;/code> experiments will raise an error during
pipeline construction. You can no longer specify the Dataflow worker jar override. Note that
non-portable Java jobs and non-portable Python batch jobs are not impacted. (&lt;a href="https://github.com/apache/beam/issues/24515">#24515&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Avoids Cassandra syntax error when user-defined query has no where clause in it (Java) (&lt;a href="https://github.com/apache/beam/issues/24829">#24829&lt;/a>).&lt;/li>
&lt;li>Fixed JDBC connection failures (Java) during handshake due to deprecated TLSv1(.1) protocol for the JDK. (&lt;a href="https://github.com/apache/beam/issues/24623">#24623&lt;/a>)&lt;/li>
&lt;li>Fixed an issue where Python BigQuery batch load writes could truncate valid data when the write disposition is set to WRITE_TRUNCATE and the incoming data is large (Python) (&lt;a href="https://github.com/apache/beam/issues/24535">#24535&lt;/a>).&lt;/li>
&lt;li>Fixed Kafka watermark issue with sparse data on many partitions (&lt;a href="https://github.com/apache/beam/pull/24205">#24205&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.45.0 release. Thank you to all contributors!&lt;/p>
&lt;p>AdalbertMemSQL&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrea Nardelli&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Andrew Pilloud&lt;/p>
&lt;p>Benjamin Gonzalez&lt;/p>
&lt;p>BjornPrime&lt;/p>
&lt;p>Brian Hulette&lt;/p>
&lt;p>Bulat&lt;/p>
&lt;p>Byron Ellis&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Charles Rothrock&lt;/p>
&lt;p>Damon&lt;/p>
&lt;p>Daniela Martín&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>Dejan Spasic&lt;/p>
&lt;p>Diego Gomez&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>Doug Judd&lt;/p>
&lt;p>Elias Segundo Antonio&lt;/p>
&lt;p>Evan Galpin&lt;/p>
&lt;p>Evgeny Antyshev&lt;/p>
&lt;p>Fernando Morales&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>John Casey&lt;/p>
&lt;p>Junhao Liu&lt;/p>
&lt;p>Kanishk Karanawat&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kiley Sok&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>Lucas Marques&lt;/p>
&lt;p>Luke Cwik&lt;/p>
&lt;p>MakarkinSAkvelon&lt;/p>
&lt;p>Marco Robles&lt;/p>
&lt;p>Mark Zitnik&lt;/p>
&lt;p>Melanie&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Ning Kang&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Philippe Moussalli&lt;/p>
&lt;p>Piyush Sagar&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Rick Viscomi&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sergei Lilichenko&lt;/p>
&lt;p>Seung Jin An&lt;/p>
&lt;p>Shane Hansen&lt;/p>
&lt;p>Sho Nakatani&lt;/p>
&lt;p>Shunya Ueta&lt;/p>
&lt;p>Siddharth Agrawal&lt;/p>
&lt;p>Timur Sultanov&lt;/p>
&lt;p>Veronica Wasson&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Xinbin Huang&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Xinyue Zhang&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>ZhengLin Li&lt;/p>
&lt;p>alexeyinkin&lt;/p>
&lt;p>andoni-guzman&lt;/p>
&lt;p>andthezhang&lt;/p>
&lt;p>bullet03&lt;/p>
&lt;p>camphillips22&lt;/p>
&lt;p>gabihodoroaga&lt;/p>
&lt;p>harrisonlimh&lt;/p>
&lt;p>pablo rodriguez defino&lt;/p>
&lt;p>ruslan-ikhsan&lt;/p>
&lt;p>tvalentyn&lt;/p>
&lt;p>yyy1000&lt;/p>
&lt;p>zhengbuqian&lt;/p></description></item><item><title>Blog: Apache Beam 2.44.0</title><link>/blog/beam-2.44.0/</link><pubDate>Tue, 17 Jan 2023 09:00:00 -0700</pubDate><guid>/blog/beam-2.44.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.44.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2440-2023-01-13">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.44.0, check out the &lt;a href="https://github.com/apache/beam/milestone/7?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Support for Bigtable sink (Write and WriteBatch) added (Go) (&lt;a href="https://github.com/apache/beam/issues/23324">#23324&lt;/a>).&lt;/li>
&lt;li>S3 implementation of the Beam filesystem (Go) (&lt;a href="https://github.com/apache/beam/issues/23991">#23991&lt;/a>).&lt;/li>
&lt;li>Support for SingleStoreDB source and sink added (Java) (&lt;a href="https://github.com/apache/beam/issues/22617">#22617&lt;/a>).&lt;/li>
&lt;li>Added support for DefaultAzureCredential authentication in Azure Filesystem (Python) (&lt;a href="https://github.com/apache/beam/issues/24210">#24210&lt;/a>).&lt;/li>
&lt;li>Added new CdapIO for CDAP Batch and Streaming Source/Sinks (Java) (&lt;a href="https://github.com/apache/beam/issues/24961">#24961&lt;/a>).&lt;/li>
&lt;li>Added new SparkReceiverIO for Spark Receivers 2.4.* (Java) (&lt;a href="https://github.com/apache/beam/issues/24960">#24960&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Beam now provides a portable &amp;ldquo;runner&amp;rdquo; that can render pipeline graphs with
graphviz. See &lt;code>python -m apache_beam.runners.render --help&lt;/code> for more details.&lt;/li>
&lt;li>Local packages can now be used as dependencies in the requirements.txt file, rather
than requiring them to be passed separately via the &lt;code>--extra_package&lt;/code> option
(Python) (&lt;a href="https://github.com/apache/beam/pull/23684">#23684&lt;/a>).&lt;/li>
&lt;li>Pipeline Resource Hints now supported via &lt;code>--resource_hints&lt;/code> flag (Go) (&lt;a href="https://github.com/apache/beam/pull/23990">#23990&lt;/a>).&lt;/li>
&lt;li>Make Python SDK containers reusable on portable runners by installing dependencies to temporary venvs (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12792">BEAM-12792&lt;/a>).&lt;/li>
&lt;li>RunInference model handlers now support the specification of a custom inference function in Python (&lt;a href="https://github.com/apache/beam/issues/22572">#22572&lt;/a>)&lt;/li>
&lt;li>Support for &lt;code>map_windows&lt;/code> URN added to Go SDK (&lt;a href="https://github.com/apache/beam/pull/24307">#24307&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>&lt;code>ParquetIO.withSplit&lt;/code> was removed since splittable reading has been the default behavior since 2.35.0. The effect of
this change is to drop support for non-splittable reading (Java) (&lt;a href="https://github.com/apache/beam/issues/23832">#23832&lt;/a>).&lt;/li>
&lt;li>&lt;code>beam-sdks-java-extensions-google-cloud-platform-core&lt;/code> is no longer a
dependency of the Java SDK Harness. Some users of a portable runner (such as Dataflow Runner v2)
may have an undeclared dependency on this package (for example using GCS with
TextIO) and will now need to declare the dependency.&lt;/li>
&lt;li>&lt;code>beam-sdks-java-core&lt;/code> is no longer a dependency of the Java SDK Harness. Users of a portable
runner (such as Dataflow Runner v2) will need to provide this package and its dependencies.&lt;/li>
&lt;li>Slices now use the Beam Iterable Coder. This enables cross-language use, but breaks pipeline updates
if a Slice type is used as a PCollection element or State API element (Go) (&lt;a href="https://github.com/apache/beam/issues/24339">#24339&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed JmsIO acknowledgment issue (Java) (&lt;a href="https://github.com/apache/beam/issues/20814">#20814&lt;/a>).&lt;/li>
&lt;li>Fixed an issue where Beam SQL CalciteUtils (Java) and cross-language JdbcIO (Python) did not support the JDBC CHAR/VARCHAR and BINARY/VARBINARY logical types (&lt;a href="https://github.com/apache/beam/issues/23747">#23747&lt;/a>, &lt;a href="https://github.com/apache/beam/issues/23526">#23526&lt;/a>).&lt;/li>
&lt;li>Ensured that iterated and emitted types used with the generic register package are registered with the type and schema registries (Go) (&lt;a href="https://github.com/apache/beam/pull/23889">#23889&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.44.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud&lt;/p>
&lt;p>Ahmet Altay&lt;/p>
&lt;p>Alex Merose&lt;/p>
&lt;p>Alexey Inkin&lt;/p>
&lt;p>Alexey Romanenko&lt;/p>
&lt;p>Anand Inguva&lt;/p>
&lt;p>Andrei Gurau&lt;/p>
&lt;p>Andrej Galad&lt;/p>
&lt;p>Andrew Pilloud&lt;/p>
&lt;p>Ayush Sharma&lt;/p>
&lt;p>Benjamin Gonzalez&lt;/p>
&lt;p>Bjorn Pedersen&lt;/p>
&lt;p>Brian Hulette&lt;/p>
&lt;p>Bruno Volpato&lt;/p>
&lt;p>Bulat Safiullin&lt;/p>
&lt;p>Chamikara Jayalath&lt;/p>
&lt;p>Chris Gavin&lt;/p>
&lt;p>Damon Douglas&lt;/p>
&lt;p>Danielle Syse&lt;/p>
&lt;p>Danny McCormick&lt;/p>
&lt;p>Darkhan Nausharipov&lt;/p>
&lt;p>David Cavazos&lt;/p>
&lt;p>Dmitry Repin&lt;/p>
&lt;p>Doug Judd&lt;/p>
&lt;p>Elias Segundo Antonio&lt;/p>
&lt;p>Evan Galpin&lt;/p>
&lt;p>Evgeny Antyshev&lt;/p>
&lt;p>Heejong Lee&lt;/p>
&lt;p>Henrik Heggelund-Berg&lt;/p>
&lt;p>Israel Herraiz&lt;/p>
&lt;p>Jack McCluskey&lt;/p>
&lt;p>Jan Lukavský&lt;/p>
&lt;p>Janek Bevendorff&lt;/p>
&lt;p>Johanna Öjeling&lt;/p>
&lt;p>John J. Casey&lt;/p>
&lt;p>Jozef Vilcek&lt;/p>
&lt;p>Kanishk Karanawat&lt;/p>
&lt;p>Kenneth Knowles&lt;/p>
&lt;p>Kiley Sok&lt;/p>
&lt;p>Laksh&lt;/p>
&lt;p>Liam Miller-Cushon&lt;/p>
&lt;p>Luke Cwik&lt;/p>
&lt;p>MakarkinSAkvelon&lt;/p>
&lt;p>Minbo Bae&lt;/p>
&lt;p>Moritz Mack&lt;/p>
&lt;p>Nancy Xu&lt;/p>
&lt;p>Ning Kang&lt;/p>
&lt;p>Nivaldo Tokuda&lt;/p>
&lt;p>Oleh Borysevych&lt;/p>
&lt;p>Pablo Estrada&lt;/p>
&lt;p>Philippe Moussalli&lt;/p>
&lt;p>Pranav Bhandari&lt;/p>
&lt;p>Rebecca Szper&lt;/p>
&lt;p>Reuven Lax&lt;/p>
&lt;p>Rick Smit&lt;/p>
&lt;p>Ritesh Ghorse&lt;/p>
&lt;p>Robert Bradshaw&lt;/p>
&lt;p>Robert Burke&lt;/p>
&lt;p>Ryan Thompson&lt;/p>
&lt;p>Sam Whittle&lt;/p>
&lt;p>Sanil Jain&lt;/p>
&lt;p>Scott Strong&lt;/p>
&lt;p>Shubham Krishna&lt;/p>
&lt;p>Steven van Rossum&lt;/p>
&lt;p>Svetak Sundhar&lt;/p>
&lt;p>Thiago Nunes&lt;/p>
&lt;p>Tianyang Hu&lt;/p>
&lt;p>Trevor Gevers&lt;/p>
&lt;p>Valentyn Tymofieiev&lt;/p>
&lt;p>Vitaly Terentyev&lt;/p>
&lt;p>Vladislav Chunikhin&lt;/p>
&lt;p>Xinyu Liu&lt;/p>
&lt;p>Yi Hu&lt;/p>
&lt;p>Yichi Zhang&lt;/p>
&lt;p>AdalbertMemSQL&lt;/p>
&lt;p>agvdndor&lt;/p>
&lt;p>andremissaglia&lt;/p>
&lt;p>arne-alex&lt;/p>
&lt;p>bullet03&lt;/p>
&lt;p>camphillips22&lt;/p>
&lt;p>capthiron&lt;/p>
&lt;p>creste&lt;/p>
&lt;p>fab-jul&lt;/p>
&lt;p>illoise&lt;/p>
&lt;p>kn1kn1&lt;/p>
&lt;p>nancyxu123&lt;/p>
&lt;p>peridotml&lt;/p>
&lt;p>shinannegans&lt;/p>
&lt;p>smeet07&lt;/p></description></item><item><title>Blog: Apache Beam 2.43.0</title><link>/blog/beam-2.43.0/</link><pubDate>Thu, 17 Nov 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.43.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.43.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2430-2022-11-17">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.43.0, check out the &lt;a href="https://github.com/apache/beam/milestone/5?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Python 3.10 support in Apache Beam (&lt;a href="https://github.com/apache/beam/issues/21458">#21458&lt;/a>).&lt;/li>
&lt;li>An initial implementation of a runner that allows us to run Beam pipelines on Dask. Try it out and give us feedback! (Python) (&lt;a href="https://github.com/apache/beam/issues/18962">#18962&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Decreased TextSource CPU utilization by 2.3x (Java) (&lt;a href="https://github.com/apache/beam/issues/23193">#23193&lt;/a>).&lt;/li>
&lt;li>Fixed bug when using SpannerIO with RuntimeValueProvider options (Java) (&lt;a href="https://github.com/apache/beam/issues/22146">#22146&lt;/a>).&lt;/li>
&lt;li>Fixed an issue with Unicode rendering in WriteToBigQuery (&lt;a href="https://github.com/apache/beam/issues/22312">#22312&lt;/a>).&lt;/li>
&lt;li>Remove obsolete variants of BigQuery Read and Write, always using Beam-native variant
(&lt;a href="https://github.com/apache/beam/issues/23564">#23564&lt;/a> and &lt;a href="https://github.com/apache/beam/issues/23559">#23559&lt;/a>).&lt;/li>
&lt;li>Bumped google-cloud-spanner dependency version to 3.x for Python SDK (&lt;a href="https://github.com/apache/beam/issues/21198">#21198&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Dataframe wrapper added in Go SDK via Cross-Language (with automatic expansion service). (Go) (&lt;a href="https://github.com/apache/beam/issues/23384">#23384&lt;/a>).&lt;/li>
&lt;li>Name all Java threads to aid in debugging (&lt;a href="https://github.com/apache/beam/issues/23049">#23049&lt;/a>).&lt;/li>
&lt;li>An initial implementation of a runner that allows us to run Beam pipelines on Dask. (Python) (&lt;a href="https://github.com/apache/beam/issues/18962">#18962&lt;/a>).&lt;/li>
&lt;li>Allow configuring GCP OAuth scopes via pipeline options. This unblocks usages of Beam IOs that require additional scopes.
For example, this feature makes it possible to access Google Drive backed tables in BigQuery (&lt;a href="https://github.com/apache/beam/issues/23290">#23290&lt;/a>).&lt;/li>
&lt;li>An example for using Python RunInference from Java (&lt;a href="https://github.com/apache/beam/pull/23619">#23619&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The CoGroupByKey transform in the Python SDK has a changed output typehint. The typehint component representing grouped values changed from List to Iterable,
which more accurately reflects the nature of the arbitrarily large output collection (&lt;a href="https://github.com/apache/beam/issues/21556">#21556&lt;/a>). Beam users may see an error on transforms downstream from CoGroupByKey. Users must change methods expecting a List to expect an Iterable going forward. See this &lt;a href="https://docs.google.com/document/d/1RIzm8-g-0CyVsPb6yasjwokJQFoKHG4NjRUcKHKINu0">document&lt;/a> for information and fixes.&lt;/li>
&lt;li>The PortableRunner for Spark assumes Spark 3 as default Spark major version unless configured otherwise using &lt;code>--spark_version&lt;/code>.
Spark 2 support is deprecated and will be removed soon (&lt;a href="https://github.com/apache/beam/issues/23728">#23728&lt;/a>).&lt;/li>
&lt;/ul>
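&lt;p>As a minimal sketch of the kind of downstream change the CoGroupByKey typehint update may call for, the function below materializes each grouped value before calling &lt;code>len()&lt;/code> on it. The function name &lt;code>describe_contacts&lt;/code> and the tags &lt;code>emails&lt;/code> and &lt;code>phones&lt;/code> are hypothetical and chosen only for illustration; they are not part of the release.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">from typing import Dict, Iterable, List, Tuple


def describe_contacts(element: Tuple[str, Dict[str, Iterable[str]]]) -> str:
    # Hypothetical helper: formats one (key, grouped values) pair of the
    # shape produced by CoGroupByKey, without assuming the values are lists.
    name, grouped = element
    # Materialize each iterable once; len() or indexing on the raw
    # iterable is not guaranteed to work.
    emails: List[str] = sorted(grouped['emails'])
    phones: List[str] = sorted(grouped['phones'])
    return '{}: {} email(s), {} phone(s)'.format(name, len(emails), len(phones))
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The function operates on plain Python values, so it can be checked without running a pipeline; in a pipeline it would typically be applied with &lt;code>beam.Map&lt;/code> after the CoGroupByKey step.&lt;/p>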
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed an issue where the Python cross-language JDBC IO connector could not read or write rows containing Numeric/Decimal type values (&lt;a href="https://github.com/apache/beam/issues/19817">#19817&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.43.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud
AlexZMLyu
Alexey Romanenko
Anand Inguva
Andrew Pilloud
Andy Ye
Arnout Engelen
Benjamin Gonzalez
Bharath Kumarasubramanian
BjornPrime
Brian Hulette
Bruno Volpato
Chamikara Jayalath
Colin Versteeg
Damon
Daniel Smilkov
Daniela Martín
Danny McCormick
Darkhan Nausharipov
David Huntsperger
Denis Pyshev
Dmitry Repin
Evan Galpin
Evgeny Antyshev
Fernando Morales
Geddy05
Harshit Mehrotra
Iñigo San Jose Visiers
Ismaël Mejía
Israel Herraiz
Jan Lukavský
Juta Staes
Kanishk Karanawat
Kenneth Knowles
KevinGG
Kiley Sok
Liam Miller-Cushon
Luke Cwik
Mc
Melissa Pashniak
Moritz Mack
Ning Kang
Pablo Estrada
Philippe Moussalli
Pranav Bhandari
Rebecca Szper
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Thompson
Ryohei Nagao
Sam Rohde
Sam Whittle
Sanil Jain
Seunghwan Hong
Shane Hansen
Shubham Krishna
Shunsuke Otani
Steve Niemitz
Steven van Rossum
Svetak Sundhar
Thiago Nunes
Toran Sahu
Veronica Wasson
Vitaly Terentyev
Vladislav Chunikhin
Xinyu Liu
Yi Hu
Yixiao Shen
alexeyinkin
arne-alex
azhurkevich
bulat safiullin
bullet03
coldWater
dpcollins-google
egalpin
johnjcasey
liferoad
rvballada
shaojwu
tvalentyn&lt;/p></description></item><item><title>Blog: Apache Beam 2.42.0</title><link>/blog/beam-2.42.0/</link><pubDate>Mon, 17 Oct 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.42.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.42.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2420-2022-10-17">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.42.0, check out the &lt;a href="https://github.com/apache/beam/milestone/4?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Added support for stateful DoFns to the Go SDK.&lt;/li>
&lt;li>Added support for &lt;a href="/documentation/programming-guide/#batched-dofns">Batched
DoFns&lt;/a>
to the Python SDK.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added support for Zstd compression to the Python SDK.&lt;/li>
&lt;li>Added support for Google Cloud Profiler to the Go SDK.&lt;/li>
&lt;li>Added support for stateful DoFns to the Go SDK.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The Go SDK&amp;rsquo;s Row Coder now uses a different single-precision float encoding for float32 types to match Java&amp;rsquo;s behavior (&lt;a href="https://github.com/apache/beam/issues/22629">#22629&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed an issue where the Python cross-language JDBC IO connector could not read or write rows containing Timestamp type values (&lt;a href="https://github.com/apache/beam/issues/19817">#19817&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>Go SDK doesn&amp;rsquo;t yet support Slowly Changing Side Input pattern (&lt;a href="https://github.com/apache/beam/issues/23106">#23106&lt;/a>)&lt;/li>
&lt;li>See a full list of open &lt;a href="https://github.com/apache/beam/milestone/4">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.42.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abirdcfly
Ahmed Abualsaud
Alexander Zhuravlev
Alexey Inkin
Alexey Romanenko
Anand Inguva
Andrej Galad
Andrew Pilloud
Andy Ye
Balázs Németh
Brian Hulette
Bruno Volpato
bulat safiullin
bullet03
Chamikara Jayalath
ChangyuLi28
Clément Guillaume
Damon
Danny McCormick
Darkhan Nausharipov
David Huntsperger
dpcollins-google
Evgeny Antyshev
grufino
Heejong Lee
Ismaël Mejía
Jack McCluskey
johnjcasey
Jonathan Shen
Kenneth Knowles
Ke Wu
Kiley Sok
Liam Miller-Cushon
liferoad
Lucas Nogueira
Luke Cwik
MakarkinSAkvelon
Manit Gupta
masahitojp
Michael Hu
Michel Davit
Moritz Mack
Naireen Hussain
nancyxu123
Nikhil Nadig
oborysevych
Pablo Estrada
Pranav Bhandari
Rajat Bhatta
Rebecca Szper
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Thompson
Sam Whittle
Sergey Pronin
Shivam
Shunsuke Otani
Shunya Ueta
Steven Niemitz
Stuart
Svetak Sundhar
Valentyn Tymofieiev
Vitaly Terentyev
Vlad
Vladislav Chunikhin
Yichi Zhang
Yi Hu
Yixiao Shen&lt;/p></description></item><item><title>Blog: Apache Beam 2.41.0</title><link>/blog/beam-2.41.0/</link><pubDate>Tue, 23 Aug 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.41.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.41.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2410-2022-08-23">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.41.0, check out the &lt;a href="https://github.com/apache/beam/milestone/3?closed=1">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Projection Pushdown optimizer is now on by default for streaming, matching the behavior of batch pipelines since 2.38.0. If you encounter a bug with the optimizer, please file an issue and disable the optimizer using pipeline option &lt;code>--experiments=disable_projection_pushdown&lt;/code>.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>The Python SDK now supports per-module logging level overrides, a feature previously available in the Java SDK (&lt;a href="https://github.com/apache/beam/issues/18222">#18222&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Projection Pushdown optimizer may break Dataflow upgrade compatibility for optimized pipelines when it removes unused fields. If you need to upgrade and encounter a compatibility issue, disable the optimizer using pipeline option &lt;code>--experiments=disable_projection_pushdown&lt;/code>.&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Support for Spark 2.4.x is deprecated and will be dropped with the release of Beam 2.44.0 or soon after (Spark runner) (&lt;a href="https://github.com/apache/beam/issues/22094">#22094&lt;/a>).&lt;/li>
&lt;li>The modules &lt;a href="https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services">amazon-web-services&lt;/a> and
&lt;a href="https://github.com/apache/beam/tree/master/sdks/java/io/kinesis">kinesis&lt;/a> for AWS Java SDK v1 are deprecated
in favor of &lt;a href="https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services2">amazon-web-services2&lt;/a>
and will be eventually removed after a few Beam releases (Java) (&lt;a href="https://github.com/apache/beam/issues/21249">#21249&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed a condition where retrying queries would yield an incorrect cursor in the Java SDK Firestore Connector (&lt;a href="https://github.com/apache/beam/issues/22089">#22089&lt;/a>).&lt;/li>
&lt;li>Fixed plumbing of allowed lateness in the Go SDK. Previously, the user-set value was ignored and allowed lateness was always set to 0 (&lt;a href="https://github.com/apache/beam/issues/22474">#22474&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://github.com/apache/beam/milestone/3">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.41.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud
Ahmet Altay
akashorabek
Alexey Inkin
Alexey Romanenko
Anand Inguva
andoni-guzman
Andrew Pilloud
Andrey
Andy Ye
Balázs Németh
Benjamin Gonzalez
BjornPrime
Brian Hulette
bulat safiullin
bullet03
Byron Ellis
Chamikara Jayalath
Damon Douglas
Daniel Oliveira
Daniel Thevessen
Danny McCormick
David Huntsperger
Dheeraj Gharde
Etienne Chauchot
Evan Galpin
Fernando Morales
Heejong Lee
Jack McCluskey
johnjcasey
Kenneth Knowles
Ke Wu
Kiley Sok
Liam Miller-Cushon
Lucas Nogueira
Luke Cwik
MakarkinSAkvelon
Manu Zhang
Minbo Bae
Moritz Mack
Naireen Hussain
Ning Kang
Oleh Borysevych
Pablo Estrada
pablo rodriguez defino
Pranav Bhandari
Rebecca Szper
Red Daly
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Thompson
Sam Whittle
Steven Niemitz
Valentyn Tymofieiev
Vincent Marquez
Vitaly Terentyev
Vlad
Vladislav Chunikhin
Yichi Zhang
Yi Hu
yirutang
Yixiao Shen
Yu Feng&lt;/p></description></item><item><title>Blog: Apache Beam 2.40.0</title><link>/blog/beam-2.40.0/</link><pubDate>Sat, 25 Jun 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.40.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.40.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2400-2022-06-25">download page&lt;/a> for this
release.&lt;/p>
&lt;p>For more information on changes in 2.40.0, check out the &lt;a href="https://github.com/apache/beam/releases/tag/v2.40.0">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Added the &lt;a href="https://s.apache.org/inference-sklearn-pytorch">RunInference&lt;/a> API, a framework-agnostic transform for inference. With this release, PyTorch and Scikit-learn are supported by the transform.
See also the example at &lt;code>apache_beam/examples/inference/pytorch_image_classification.py&lt;/code>.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Upgraded to Hive 3.1.3 for HCatalogIO. Users can still provide their own version of Hive. (Java) (&lt;a href="https://github.com/apache/beam/issues/19554">Issue-19554&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Go SDK users can now use generic registration functions to optimize their DoFn execution. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14347">BEAM-14347&lt;/a>)&lt;/li>
&lt;li>Go SDK users may now write self-checkpointing Splittable DoFns to read from streaming sources. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11104">BEAM-11104&lt;/a>)&lt;/li>
&lt;li>Go SDK textio Reads have been moved to Splittable DoFns exclusively. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14489">BEAM-14489&lt;/a>)&lt;/li>
&lt;li>Pipeline drain support added for Go SDK has now been tested. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11106">BEAM-11106&lt;/a>)&lt;/li>
&lt;li>Go SDK users can now see heap usage, sideinput cache stats, and active process bundle stats in Worker Status. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13829">BEAM-13829&lt;/a>)&lt;/li>
&lt;li>The serialization (pickling) library for Python is dill==0.3.1.1 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11167">BEAM-11167&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The Go SDK now requires a minimum Go version of 1.18 in order to support generics (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14347">BEAM-14347&lt;/a>).&lt;/li>
&lt;li>synthetic.SourceConfig field types have changed to int64 from int for better compatibility with Flink&amp;rsquo;s use of Logical types in Schemas (Go) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14173">BEAM-14173&lt;/a>)&lt;/li>
&lt;li>Default coder updated to compress sources used with &lt;code>BoundedSourceAsSDFWrapperFn&lt;/code> and &lt;code>UnboundedSourceAsSDFWrapper&lt;/code>.&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed Java expansion service to allow specific files to stage (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14160">BEAM-14160&lt;/a>).&lt;/li>
&lt;li>Fixed Elasticsearch connection when using both ssl and username/password (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14000">BEAM-14000&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Python&amp;rsquo;s &lt;code>beam.FlatMap&lt;/code> will raise &lt;code>AttributeError: 'builtin_function_or_method' object has no attribute '__func__'&lt;/code> when
constructed with some
&lt;a href="https://docs.python.org/3/library/functions.html">built-ins&lt;/a>, like &lt;code>sum&lt;/code>
and &lt;code>len&lt;/code> (&lt;a href="https://github.com/apache/beam/issues/22091">#22091&lt;/a>).&lt;/li>
&lt;li>Java&amp;rsquo;s &lt;code>BigQueryIO.Write&lt;/code> can have an exception where it attempts to output a timestamp beyond the max timestamp range
&lt;code>Cannot output with timestamp 294247-01-10T04:00:54.776Z. Output timestamps must be no earlier than the timestamp of the current input or timer (294247-01-10T04:00:54.776Z) minus the allowed skew (0 milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the DoFn#getAllowedTimestampSkew() Javadoc for details on changing the allowed skew.&lt;/code>
This happens when a sink is idle, causing the idle timeout to trigger, or when a specific table is idle long enough when using dynamic destinations.
When this happens, the job is no longer able to be drained. This has been fixed for the 2.41 release.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.40.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud
Ahmet Altay
Aizhamal Nurmamat kyzy
Alejandro Rodriguez-Morantes
Alexander Zhuravlev
Alexey Romanenko
Anand Inguva
andoni-guzman
Andy Ye
Balázs Németh
Benjamin Gonzalez
Brian Hulette
bulat safiullin
bullet03
Chamikara Jayalath
Damon Douglas
Daniel Oliveira
Danny McCormick
Darkhan Nausharipov
David Huntsperger
Diego Gomez
dpcollins-google
Ekaterina Tatanova
Elias Segundo
Etienne Chauchot
Evan Galpin
fbeevikm
Fernando Morales
Heejong Lee
Igor Krasavin
Ilion Beyst
Israel Herraiz
Jack McCluskey
Jan Kuehle
Jan Lukavský
johnjcasey
Jonathan Lui
jrmccluskey
Julien Tournay
Kenneth Knowles
Kerry Donny-Clark
Kevin Puthusseri
Kiley Sok
Kyle Weaver
kynx
Lucas Nogueira
Luke Cwik
LuNing Wang
Marco Robles
masahitojp
Minbo Bae
Moritz Mack
Naireen Hussain
Nancy Xu
Niel Markwick
Ning Kang
nishant jain
nishantjain91
Oskar Firlej
Pablo Estrada
pablo rodriguez defino
Rebecca Szper
Red Daly
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Thompson
Sam Whittle
Thiago Nunes
Tom Stepp
vachan-shetty
Valentyn Tymofieiev
vikash2310
Vitaly Terentyev
Vladislav Chunikhin
Yichi Zhang
Yi Hu
Yiru Tang
yixiaoshen
zwestrick&lt;/p></description></item><item><title>Blog: Apache Beam 2.39.0</title><link>/blog/beam-2.39.0/</link><pubDate>Wed, 25 May 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.39.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.39.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2390-2022-05-25">download page&lt;/a> for this
release.&lt;/p>
&lt;p>For more information on changes in 2.39.0 check out the &lt;a href="https://issues.apache.org/jira/secure/ConfigureReleaseNote.jspa?projectId=12319527&amp;amp;version=12351170">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>JmsIO gains the ability to map any kind of input to any subclass of &lt;code>javax.jms.Message&lt;/code> (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-16308">BEAM-16308&lt;/a>).&lt;/li>
&lt;li>JmsIO introduces the ability to write to dynamic topics (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-16308">BEAM-16308&lt;/a>).
&lt;ul>
&lt;li>A &lt;code>topicNameMapper&lt;/code> must be set to extract the topic name from the input value.&lt;/li>
&lt;li>A &lt;code>valueMapper&lt;/code> must be set to convert the input value to JMS message.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Reduced the number of threads spawned by BigQueryIO StreamingInserts (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14283">BEAM-14283&lt;/a>).&lt;/li>
&lt;li>Implemented Apache PulsarIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8218">BEAM-8218&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added support for Flink with Scala 2.12, since most libraries support Scala 2.12 onwards (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14386">BEAM-14386&lt;/a>).&lt;/li>
&lt;li>&amp;lsquo;Manage Clusters&amp;rsquo; JupyterLab extension added for users to configure usage of Dataproc clusters managed by Interactive Beam (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14130">BEAM-14130&lt;/a>).&lt;/li>
&lt;li>Pipeline drain support added for Go SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11106">BEAM-11106&lt;/a>). &lt;strong>Note: this feature is not yet fully validated and should be treated as experimental in this release.&lt;/strong>&lt;/li>
&lt;li>&lt;code>DataFrame.unstack()&lt;/code>, &lt;code>DataFrame.pivot()&lt;/code>, and &lt;code>Series.unstack()&lt;/code>
implemented for DataFrame API (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13948">BEAM-13948&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13966">BEAM-13966&lt;/a>).&lt;/li>
&lt;li>Support for impersonation credentials added to dataflow runner in the Java and Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14014">BEAM-14014&lt;/a>).&lt;/li>
&lt;li>Implemented Jupyterlab extension for managing Dataproc clusters (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14130">BEAM-14130&lt;/a>).&lt;/li>
&lt;li>ExternalPythonTransform API added for easily invoking Python transforms from
Java (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14143">BEAM-14143&lt;/a>).&lt;/li>
&lt;li>Added support for Elasticsearch 8.x (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14003">BEAM-14003&lt;/a>).&lt;/li>
&lt;li>Shard-aware Kinesis record aggregation (AWS SDK v2) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14104">BEAM-14104&lt;/a>).&lt;/li>
&lt;li>Upgrade to ZetaSQL 2022.04.1 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14348">BEAM-14348&lt;/a>).&lt;/li>
&lt;li>Fixed an issue where ReadFromBigQuery could not be used with the interactive runner (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14112">BEAM-14112&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Unused functions &lt;code>ShallowCloneParDoPayload()&lt;/code>, &lt;code>ShallowCloneSideInput()&lt;/code>, and &lt;code>ShallowCloneFunctionSpec()&lt;/code> have been removed from the Go SDK&amp;rsquo;s pipelinex package (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13739">BEAM-13739&lt;/a>).&lt;/li>
&lt;li>JmsIO requires an explicit &lt;code>valueMapper&lt;/code> to be set (&lt;a href="https://issues.apache.org/jira/browse/BEAM-16308">BEAM-16308&lt;/a>). You can use the &lt;code>TextMessageMapper&lt;/code> to convert &lt;code>String&lt;/code> inputs to JMS &lt;code>TextMessage&lt;/code>s:&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-java" data-lang="java">&lt;span class="line">&lt;span class="cl"> &lt;span class="n">JmsIO&lt;/span>&lt;span class="o">.&amp;lt;&lt;/span>&lt;span class="n">String&lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="n">write&lt;/span>&lt;span class="o">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="na">withConnectionFactory&lt;/span>&lt;span class="o">(&lt;/span>&lt;span class="n">jmsConnectionFactory&lt;/span>&lt;span class="o">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="na">withValueMapper&lt;/span>&lt;span class="o">(&lt;/span>&lt;span class="k">new&lt;/span> &lt;span class="n">TextMessageMapper&lt;/span>&lt;span class="o">());&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>Coders in Python are expected to inherit from Coder. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14351">BEAM-14351&lt;/a>).&lt;/li>
&lt;li>New abstract method &lt;code>metadata()&lt;/code> added to io.filesystem.FileSystem in the
Python SDK. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14314">BEAM-14314&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Flink 1.11 is no longer supported (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14139">BEAM-14139&lt;/a>).&lt;/li>
&lt;li>Python 3.6 is no longer supported (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13657">BEAM-13657&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed Java Spanner IO NPE when ProjectID not specified in template executions (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14405">BEAM-14405&lt;/a>).&lt;/li>
&lt;li>Fixed potential NPE in BigQueryServicesImpl.getErrorInfo (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14133">BEAM-14133&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/browse/BEAM-14412?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.39.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.39.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed Abualsaud,
Ahmet Altay,
Aizhamal Nurmamat kyzy,
Alexander Zhuravlev,
Alexey Romanenko,
Anand Inguva,
Andrei Gurau,
Andrew Pilloud,
Andy Ye,
Arun Pandian,
Arwin Tio,
Aydar Farrakhov,
Aydar Zainutdinov,
AydarZaynutdinov,
Balázs Németh,
Benjamin Gonzalez,
Brian Hulette,
Buqian Zheng,
Chamikara Jayalath,
Chun Yang,
Daniel Oliveira,
Daniela Martín,
Danny McCormick,
David Huntsperger,
Deepak Nagaraj,
Denise Case,
Esun Kim,
Etienne Chauchot,
Evan Galpin,
Hector Miuler Malpica Gallegos,
Heejong Lee,
Hengfeng Li,
Ilango Rajagopal,
Ilion Beyst,
Israel Herraiz,
Jack McCluskey,
Kamil Bregula,
Kamil Breguła,
Ke Wu,
Kenneth Knowles,
KevinGG,
Kiley,
Kiley Sok,
Kyle Weaver,
Liam Miller-Cushon,
Luke Cwik,
Marco Robles,
Matt Casters,
Michael Li,
MiguelAnzoWizeline,
Milan Patel,
Minbo Bae,
Moritz Mack,
Nick Caballero,
Niel Markwick,
Ning Kang,
Oskar Firlej,
Pablo Estrada,
Pavel Avilov,
Reuven Lax,
Reza Rokni,
Ritesh Ghorse,
Robert Bradshaw,
Robert Burke,
Ryan Thompson,
Sam Whittle,
Steven Niemitz,
Thiago Nunes,
Tomo Suzuki,
Valentyn Tymofieiev,
Victor,
Yi Hu,
Yichi Zhang,
Yiru Tang,
ahmedabu98,
andoni-guzman,
brachipa,
bulat safiullin,
bullet03,
dannymartinm,
daria.malkova,
dpcollins-google,
egalpin,
emily,
fbeevikm,
johnjcasey,
kileys,
&lt;a href="mailto:msbukal@google.com">msbukal@google.com&lt;/a>,
nguyennk92,
pablo rodriguez defino,
rszper,
rvballada,
sachinag,
tvalentyn,
vachan-shetty,
yirutang&lt;/p></description></item><item><title>Blog: Apache Beam 2.38.0</title><link>/blog/beam-2.38.0/</link><pubDate>Wed, 20 Apr 2022 09:00:00 -0700</pubDate><guid>/blog/beam-2.38.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.38.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2380-2022-04-20">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.38.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12351169">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Introduce projection pushdown optimizer to the Java SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12976">BEAM-12976&lt;/a>). The optimizer currently only works on the &lt;a href="/documentation/io/built-in/google-bigquery/#storage-api">BigQuery Storage API&lt;/a>, but more I/Os will be added in future releases. If you encounter a bug with the optimizer, please file a JIRA and disable the optimizer using pipeline option &lt;code>--experiments=disable_projection_pushdown&lt;/code>.&lt;/li>
&lt;li>A new IO for Neo4j graph databases was added (&lt;a href="https://issues.apache.org/jira/browse/BEAM-1857">BEAM-1857&lt;/a>). It can update nodes and relationships using UNWIND statements and read data using Cypher statements with parameters.&lt;/li>
&lt;li>&lt;code>amazon-web-services2&lt;/code> has reached feature parity and is finally recommended over the earlier &lt;code>amazon-web-services&lt;/code> and &lt;code>kinesis&lt;/code> modules (Java). These will be deprecated in one of the next releases (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13174">BEAM-13174&lt;/a>).
&lt;ul>
&lt;li>Long outstanding write support for &lt;code>Kinesis&lt;/code> was added (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13175">BEAM-13175&lt;/a>).&lt;/li>
&lt;li>Configuration was simplified and made consistent across all IOs, including the usage of &lt;code>AwsOptions&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13563">BEAM-13563&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13663">BEAM-13663&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13587">BEAM-13587&lt;/a>).&lt;/li>
&lt;li>Additionally, there&amp;rsquo;s a long list of recent improvements and fixes to
&lt;code>S3&lt;/code> Filesystem (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13245">BEAM-13245&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13246">BEAM-13246&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13441">BEAM-13441&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13445">BEAM-13445&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-14011">BEAM-14011&lt;/a>),
&lt;code>DynamoDB&lt;/code> IO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13009">BEAM-13009&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13209">BEAM-13209&lt;/a>),
&lt;code>SQS&lt;/code> IO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13631">BEAM-13631&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-13510">BEAM-13510&lt;/a>) and others.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Pipeline dependencies supplied through &lt;code>--requirements_file&lt;/code> will now be staged to the runner using binary distributions (wheels) of the PyPI packages for linux_x86_64 platform (&lt;a href="https://issues.apache.org/jira/browse/BEAM-4032">BEAM-4032&lt;/a>). To restore the behavior to use source distributions, set pipeline option &lt;code>--requirements_cache_only_sources&lt;/code>. To skip staging the packages at submission time, set pipeline option &lt;code>--requirements_cache=skip&lt;/code> (Python).&lt;/li>
&lt;li>The Flink runner now supports Flink 1.14.x (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13106">BEAM-13106&lt;/a>).&lt;/li>
&lt;li>Interactive Beam now supports remotely executing Flink pipelines on Dataproc (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14071">BEAM-14071&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>(Python) Previously, &lt;code>DoFn.infer_output_type&lt;/code> was expected to return &lt;code>Iterable[element_type]&lt;/code>, where &lt;code>element_type&lt;/code> is the PCollection element type. It is now expected to return &lt;code>element_type&lt;/code>. Take care if you have overridden &lt;code>infer_output_type&lt;/code> in a &lt;code>DoFn&lt;/code> (this is not common). See &lt;a href="https://issues.apache.org/jira/browse/BEAM-13860">BEAM-13860&lt;/a>.&lt;/li>
&lt;li>(&lt;code>amazon-web-services2&lt;/code>) The types of &lt;code>awsRegion&lt;/code> / &lt;code>endpoint&lt;/code> in &lt;code>AwsOptions&lt;/code> changed from String to &lt;code>Region&lt;/code> / &lt;code>URI&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13563">BEAM-13563&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Beam 2.38.0 will be the last minor release to support Flink 1.11.&lt;/li>
&lt;li>(&lt;code>amazon-web-services2&lt;/code>) Client providers (&lt;code>withXYZClientProvider()&lt;/code>) as well as IO-specific &lt;code>RetryConfiguration&lt;/code>s are deprecated; instead, use &lt;code>withClientConfiguration()&lt;/code> or &lt;code>AwsOptions&lt;/code> to configure AWS IOs / clients.
Custom implementations of client providers shall be replaced with a respective &lt;code>ClientBuilderFactory&lt;/code> and configured through &lt;code>AwsOptions&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13563">BEAM-13563&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fix S3 copy for large objects (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14011">BEAM-14011&lt;/a>)&lt;/li>
&lt;li>Fix quadratic behavior of pipeline canonicalization (Go) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14128">BEAM-14128&lt;/a>)
&lt;ul>
&lt;li>This caused unnecessarily long pre-processing times before job submission for large complex pipelines.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Fix &lt;code>pyarrow&lt;/code> version parsing (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-14235">BEAM-14235&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.38.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.38.0 release. Thank you to all contributors!&lt;/p>
&lt;p>abhijeet-lele
Ahmet Altay
akustov
Alexander
Alexander Zhuravlev
Alexey Romanenko
AlikRodriguez
Anand Inguva
andoni-guzman
andreukus
Andy Ye
Ankur Goenka
ansh0l
Artur Khanin
Aydar Farrakhov
Aydar Zainutdinov
Benjamin Gonzalez
Brian Hulette
brucearctor
bulat safiullin
bullet03
Carl Mastrangelo
Chamikara Jayalath
Chun Yang
Daniela Martín
Daniel Oliveira
Danny McCormick
daria.malkova
David Cavazos
David Huntsperger
dmitryor
Dmytro Sadovnychyi
dpcollins-google
egalpin
Elias Segundo Antonio
emily
Etienne Chauchot
Hengfeng Li
Ismaël Mejía
Israel Herraiz
Jack McCluskey
Jakub Kukul
Janek Bevendorff
Jeff Klukas
Johan Sternby
Kamil Breguła
Kenneth Knowles
Ke Wu
Kiley
Kyle Weaver
laraschmidt
Lara Schmidt
LE QUELLEC Olivier
Luka Kalinovcic
Luke Cwik
Marcin Kuthan
masahitojp
Masato Nakamura
Matt Casters
Melissa Pashniak
Michael Li
Miguel Hernandez
Moritz Mack
mosche
nancyxu123
Nathan J Mehl
Niel Markwick
Ning Kang
Pablo Estrada
paul-tlh
Pavel Avilov
Rahul Iyer
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Skraba
Ryan Thompson
Sam Whittle
Seth Vargo
sp029619
Steven Niemitz
Thiago Nunes
Udi Meiri
Valentyn Tymofieiev
Victor
vitaly.terentyev
Yichi Zhang
Yi Hu
yirutang
Zachary Houfek
Zoe&lt;/p></description></item><item><title>Blog: Apache Beam 2.37.0</title><link>/blog/beam-2.37.0/</link><pubDate>Fri, 04 Mar 2022 08:30:00 -0800</pubDate><guid>/blog/beam-2.37.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.37.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2370-2022-03-04">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.37.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12351168">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Java 17 support for Dataflow (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12240">BEAM-12240&lt;/a>).
&lt;ul>
&lt;li>Users using Dataflow Runner V2 may see issues with state cache due to inaccurate object sizes (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13695">BEAM-13695&lt;/a>).&lt;/li>
&lt;li>ZetaSql is currently unsupported (&lt;a href="https://github.com/google/zetasql/issues/89">issue&lt;/a>).&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Python 3.9 support in Apache Beam (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12000">BEAM-12000&lt;/a>).
&lt;ul>
&lt;li>Dataflow support for Python 3.9 is expected to be available with 2.37.0,
but may not be fully available yet when the release is announced (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13864">BEAM-13864&lt;/a>).&lt;/li>
&lt;li>Users of Dataflow Runner V2 can run Python 3.9 pipelines with 2.37.0 release right away.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Go SDK now has wrappers for the following Cross Language Transforms from Java, along with automatic expansion service startup for each.
&lt;ul>
&lt;li>JDBCIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13293">BEAM-13293&lt;/a>).&lt;/li>
&lt;li>Debezium (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13761">BEAM-13761&lt;/a>).&lt;/li>
&lt;li>BeamSQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13683">BEAM-13683&lt;/a>).&lt;/li>
&lt;li>BigQuery (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13732">BEAM-13732&lt;/a>).&lt;/li>
&lt;li>KafkaIO now also has automatic expansion service startup (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13821">BEAM-13821&lt;/a>).&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>DataFrame API now supports pandas 1.4.x (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13605">BEAM-13605&lt;/a>).&lt;/li>
&lt;li>Go SDK DoFns can now observe trigger panes directly (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13757">BEAM-13757&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.37.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.37.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Aizhamal Nurmamat kyzy
Alexander
Alexander Chermenin
Alexandr Zhuravlev
Alexey Romanenko
Anand Inguva
andoni-guzman
andreukus
Andy Ye
Artur Khanin
Aydar Farrakhov
Aydar Zainutdinov
AydarZaynutdinov
Benjamin Gonzalez
Brian Hulette
Chamikara Jayalath
Daniel Oliveira
Danny McCormick
daria-malkova
daria.malkova
darshan-sj
David Huntsperger
dprieto91
emily
Etienne Chauchot
Fernando Morales
Heejong Lee
Ismaël Mejía
Jack McCluskey
Jan Lukavský
johnjcasey
Kamil Breguła
kellen
Kenneth Knowles
kileys
Kyle Weaver
Luke Cwik
Marcin Kuthan
Marco Robles
Matt Rudary
Miguel Hernandez
Milena Bukal
Moritz Mack
Mostafa Aghajani
Ning Kang
Pablo Estrada
Pavel Avilov
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Sam Whittle
Sandy Chapman
Sergey Kalinin
Thiago Nunes
thorbjorn444
Tim Robertson
Tomo Suzuki
Valentyn Tymofieiev
Victor
Victor Chen
Vitaly Ivanov
Yichi Zhang&lt;/p></description></item><item><title>Blog: Apache Beam 2.36.0</title><link>/blog/beam-2.36.0/</link><pubDate>Mon, 07 Feb 2022 10:11:00 -0800</pubDate><guid>/blog/beam-2.36.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.36.0 release of Apache Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2360-2022-02-07">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.36.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12350407">detailed release
notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Support for stopReadTime on KafkaIO SDF (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13171">BEAM-13171&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>💻 Support for ARM64 / Mac M1 out of the box (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11703">BEAM-11703&lt;/a>).&lt;/li>
&lt;li>Added support for cloudpickle as a pickling library for Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8123">BEAM-8123&lt;/a>). To use cloudpickle, set pipeline option &lt;code>--pickle_library=cloudpickle&lt;/code>.&lt;/li>
&lt;li>Added option to specify triggering frequency when streaming to BigQuery (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12865">BEAM-12865&lt;/a>).&lt;/li>
&lt;li>Added option to enable caching uploaded artifacts across job runs for Python Dataflow jobs (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13459">BEAM-13459&lt;/a>). To enable, set pipeline option &lt;code>--enable_artifact_caching&lt;/code>. This will be enabled by default in a future release.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Updated jedis from 3.x to 4.x in Java RedisIO. If you are using RedisIO and using jedis directly, please refer to &lt;a href="https://github.com/redis/jedis/blob/v4.0.0/docs/3to4.md">this page&lt;/a> to update it (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12092">BEAM-12092&lt;/a>).&lt;/li>
&lt;li>The datatype of timestamp fields in &lt;code>SqsMessage&lt;/code> for AWS IOs for SDK v2 was changed from &lt;code>String&lt;/code> to &lt;code>long&lt;/code>, and the visibility of all fields was changed from &lt;code>package private&lt;/code> to &lt;code>public&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13638">BEAM-13638&lt;/a>).&lt;/li>
&lt;li>Properly check output timestamps on elements output from DoFns, timers, and onWindowExpiration in Java &lt;a href="https://issues.apache.org/jira/browse/BEAM-12931">BEAM-12931&lt;/a>.&lt;/li>
&lt;li>Fixed a bug with DeferredDataFrame.xs when used with a non-tuple key
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-13421">BEAM-13421&lt;/a>).&lt;/li>
&lt;li>Beam Python now requires &lt;code>google-cloud-pubsub&amp;gt;=2.1.0&lt;/code>. The API surface for &lt;code>apache_beam.io.gcp.pubsub&lt;/code> has not changed, but code that uses the PubSub client directly may need to be updated.&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Users may encounter an unexpected java.lang.ArithmeticException when outputting a timestamp
for an element further than allowedSkew from an allowed DoFn skew set to a value more than
Integer.MAX_VALUE.&lt;/li>
&lt;li>S3 object metadata retrieval broken in Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13980">BEAM-13980&lt;/a>)&lt;/li>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.36.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.36.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ada Wong
Ahmet Altay
Alexander
Alexander Dahl
Alexandr Zhuravlev
Alexey Romanenko
AlikRodriguez
Anand Inguva
Andrew Pilloud
Andy Ye
Arkadiusz Gasiński
Artur Khanin
Arun Pandian
Aydar Farrakhov
Aydar Zainutdinov
AydarZaynutdinov
Benjamin Gonzalez
Brian Hulette
Chamikara Jayalath
Daniel Collins
Daniel Oliveira
Daniel Thevessen
Daniela Martín
David Hinkes
David Huntsperger
Emily Ye
Etienne Chauchot
Evan Galpin
Heejong Lee
Ilya
Ilya Kozyrev
In-Ho Yi
Jack McCluskey
Janek Bevendorff
Jarek Potiuk
Ke Wu
KevinGG
Kyle Hersey
Kyle Weaver
Luís Bianchin
Luke Cwik
Masato Nakamura
Matthias Baetens
Mehdi Drissi
Melissa Pashniak
Michel Davit
Miguel Hernandez
MiguelAnzoWizeline
Milena Bukal
Moritz Mack
Mostafa Aghajani
Nathan J Mehl
Niel Markwick
Ning Kang
Pablo Estrada
Pavel Avilov
Quentin Sommer
Reuben van Ammers
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Ryan Thompson
Sam Whittle
Sayat
Sergei Lebedev
Sergey Kalinin
Steve Niemitz
Talat Uyarer
Thiago Nunes
Tianyang Hu
Tim Robertson
Valentyn Tymofieiev
Vitaly Ivanov
Yichi Zhang
Yiru Tang
Yu Feng
Yu ISHIKAWA
Zachary Houfek
blais
daria-malkova
daria.malkova
darshan-sj
dpcollins-google
emily
ewianda
johnjcasey
kileys
lam206
laraschmidt
mosche
&lt;a href="mailto:msbukal@google.com">msbukal@google.com&lt;/a>
tvalentyn&lt;/p></description></item><item><title>Blog: Apache Beam 2.35.0</title><link>/blog/beam-2.35.0/</link><pubDate>Wed, 29 Dec 2021 10:11:00 -0800</pubDate><guid>/blog/beam-2.35.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.35.0 release of Apache Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2350-2021-12-29">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.35.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12350406">detailed release
notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>MultiMap side inputs are now supported by the Go SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-3293">BEAM-3293&lt;/a>).&lt;/li>
&lt;li>Side inputs are supported within Splittable DoFns for Dataflow Runner V1 and Dataflow Runner V2. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12522">BEAM-12522&lt;/a>).&lt;/li>
&lt;li>Upgraded the Log4j version used in test suites (Apache Beam testing environment only, not for end user consumption) to 2.17.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13434">BEAM-13434&lt;/a>).
Note that Apache Beam versions do not depend on the Log4j 2 dependency (log4j-core) impacted by &lt;a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">CVE-2021-44228&lt;/a>.
However, we urge users to update direct and indirect dependencies (if any) on Log4j 2 to the latest version by updating their build configuration and redeploying impacted pipelines.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>We changed the data type for ranges in &lt;code>JdbcIO.readWithPartitions&lt;/code> from &lt;code>int&lt;/code> to &lt;code>long&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13149">BEAM-13149&lt;/a>).
This is a relatively minor breaking change, which we&amp;rsquo;re implementing to improve the usability of the transform without increasing cruft.
This transform is relatively new, so we may implement other breaking changes in the future to improve its usability.&lt;/li>
&lt;li>Side inputs are supported within Splittable DoFns for Dataflow Runner V1 and Dataflow Runner V2. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12522">BEAM-12522&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added custom delimiters to Python TextIO reads (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12730">BEAM-12730&lt;/a>).&lt;/li>
&lt;li>Added escapechar parameter to Python TextIO reads (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13189">BEAM-13189&lt;/a>).&lt;/li>
&lt;li>Splittable reading is enabled by default while reading data with ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12070">BEAM-12070&lt;/a>).&lt;/li>
&lt;li>DoFn Execution Time metrics added to Go (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13001">BEAM-13001&lt;/a>).&lt;/li>
&lt;li>Cross-bundle side input caching is now available in the Go SDK for runners that support the feature by setting the EnableSideInputCache hook (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11097">BEAM-11097&lt;/a>).&lt;/li>
&lt;li>Upgraded the GCP Libraries BOM version to 24.0.0 and associated dependencies (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11205">BEAM-11205&lt;/a>). For Google Cloud client library versions set by this BOM,
see &lt;a href="https://storage.googleapis.com/cloud-opensource-java-dashboard/com.google.cloud/libraries-bom/24.0.0/artifact_details.html">this table&lt;/a>.&lt;/li>
&lt;li>Removed avro-python3 dependency in AvroIO. Fastavro has already been our Avro library of choice on Python 3. The boolean &lt;code>use_fastavro&lt;/code> is left for API compatibility, but will have no effect (&lt;a href="https://github.com/apache/beam/pull/15900">BEAM-13016&lt;/a>).&lt;/li>
&lt;li>MultiMap side inputs are now supported by the Go SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-3293">BEAM-3293&lt;/a>).&lt;/li>
&lt;li>Remote packages can now be downloaded from locations supported by apache_beam.io.filesystems. The files will be downloaded on Stager and uploaded to the staging location. For more information, see &lt;a href="https://issues.apache.org/jira/browse/BEAM-11275">BEAM-11275&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>A new URN convention was adopted for cross-language transforms and existing URNs were updated. This may break advanced use cases, for example, if a custom expansion service is used to connect different Beam Java and Python versions (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12047">BEAM-12047&lt;/a>).&lt;/li>
&lt;li>The upgrade to Calcite 1.28.0 introduces a breaking change in the SUBSTRING function in SqlTransform, when used with the Calcite dialect (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13099">BEAM-13099&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/CALCITE-4427">CALCITE-4427&lt;/a>).&lt;/li>
&lt;li>ListShards (with DescribeStreamSummary) is used instead of DescribeStream to list shards in Kinesis streams (AWS SDK v2). Due to this change, as mentioned in &lt;a href="https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListShards.html">AWS documentation&lt;/a>, for fine-grained IAM policies it is required to update them to allow calls to ListShards and DescribeStreamSummary APIs. For more information, see &lt;a href="https://docs.aws.amazon.com/streams/latest/dev/controlling-access.html">Controlling Access to Amazon Kinesis Data Streams&lt;/a> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13233">BEAM-13233&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Non-splittable reading is deprecated while reading data with ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12070">BEAM-12070&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Properly map main input windows to side input windows by default (Go)
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-11087">BEAM-11087&lt;/a>).&lt;/li>
&lt;li>Fixed data loss when writing to DynamoDB without setting deduplication key names (Java)
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-13009">BEAM-13009&lt;/a>).&lt;/li>
&lt;li>Go SDK Examples now have types and functions registered. (Go) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5378">BEAM-5378&lt;/a>)&lt;/li>
&lt;li>Fixed data loss when using Python WriteToFiles in streaming pipeline (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12950">BEAM-12950&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Users of beam-sdks-java-io-hcatalog (and beam-sdks-java-extensions-sql-hcatalog) must take care to override the transitive log4j dependency when they add a hive dependency (&lt;a href="https://issues.apache.org/jira/browse/BEAM-13499">BEAM-13499&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.35.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay
Alexandr Zhuravlev
Alexey Romanenko
AlikRodriguez
Anand Inguva
Andrew Pilloud
Ankur Goenka
Anthony Sottile
Artur Khanin
Aydar Farrakhov
Aydar Zainutdinov
Benjamin Gonzalez
brachipa
Brian Hulette
Calvin Leung
Chamikara Jayalath
Chris Gray
Damon Douglas
Daniel Collins
Daniel Oliveira
daria.malkova
darshan-sj
David Huntsperger
David Prieto Rivera
Dmitrii Kuzin
dpcollins-google
dprieto
egalpin
Etienne Chauchot
Eugene Nikolaiev
Fernando Morales
Hector Lagos
Heejong Lee
Ilya Kozyrev
Iñigo San Jose Visiers
Jack McCluskey
Jiayang Wu
jrhy
Kenneth Knowles
KevinGG
kileys
klmilam
Kyle Weaver
Luís Bianchin
Luke Cwik
Melissa Pashniak
Michael Luckey
Miguel Hernandez
Milena Bukal
Minbo Bae
minherz
Moritz Mack
mosche
Natalie
Ning Kang
Pablo Estrada
Pavel Avilov
Reuven Lax
Ritesh Ghorse
Robert Bradshaw
Robert Burke
Rogan Morrow
Ruslan Altynnikov
Sam Whittle
Sergey Kalinin
Slava Chernyak
Svetak Sundhar
Tianyang Hu
Tim Robertson
Tomo Suzuki
tuorhador
Udi Meiri
vachan-shetty
Valentyn Tymofieiev
Yichi Zhang
zhoufek&lt;/p></description></item><item><title>Blog: Apache Beam 2.34.0</title><link>/blog/beam-2.34.0/</link><pubDate>Thu, 11 Nov 2021 00:11:00 -0800</pubDate><guid>/blog/beam-2.34.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.34.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2340-2021-11-11">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.34.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12350405">detailed release
notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>The Beam Java API for Calcite SqlTransform is no longer experimental (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12680">BEAM-12680&lt;/a>).&lt;/li>
&lt;li>Python&amp;rsquo;s ParDo (Map, FlatMap, etc.) transforms now support a &lt;code>with_exception_handling&lt;/code> option for easily ignoring bad records and implementing the dead letter pattern.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>&lt;code>ReadFromBigQuery&lt;/code> and &lt;code>ReadAllFromBigQuery&lt;/code> now run queries with BATCH priority by default. A new &lt;code>query_priority&lt;/code> parameter on both transforms allows configuring the query priority (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12913">BEAM-12913&lt;/a>).&lt;/li>
&lt;li>[EXPERIMENTAL] Support for &lt;a href="https://cloud.google.com/bigquery/docs/reference/storage">BigQuery Storage Read API&lt;/a> added to &lt;code>ReadFromBigQuery&lt;/code>. The newly introduced &lt;code>method&lt;/code> parameter can be set to &lt;code>DIRECT_READ&lt;/code> to use the Storage Read API. The default is &lt;code>EXPORT&lt;/code>, which invokes a BigQuery export request. (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10917">BEAM-10917&lt;/a>).&lt;/li>
&lt;li>[EXPERIMENTAL] Added &lt;code>use_native_datetime&lt;/code> parameter to &lt;code>ReadFromBigQuery&lt;/code> to configure the return type of &lt;a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type">DATETIME&lt;/a> fields. This parameter can &lt;em>only&lt;/em> be used when &lt;code>method = DIRECT_READ&lt;/code> (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10917">BEAM-10917&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Upgrade to Calcite 1.26.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9379">BEAM-9379&lt;/a>).&lt;/li>
&lt;li>Added a new &lt;code>dataframe&lt;/code> extra to the Python SDK that tracks &lt;code>pandas&lt;/code> versions
we&amp;rsquo;ve verified compatibility with. We now recommend installing Beam with &lt;code>pip install apache-beam[dataframe]&lt;/code> when you intend to use the DataFrame API
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-12906">BEAM-12906&lt;/a>).&lt;/li>
&lt;li>Added an &lt;a href="https://github.com/cometta/python-apache-beam-spark">example&lt;/a> of deploying a Python Apache Beam job on a Spark cluster.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>SQL Rows are no longer flattened (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5505">BEAM-5505&lt;/a>).&lt;/li>
&lt;li>[Go SDK] beam.TryCrossLanguage&amp;rsquo;s signature now matches beam.CrossLanguage. Like other Try functions, it returns an error instead of panicking. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9918">BEAM-9918&lt;/a>).&lt;/li>
&lt;li>&lt;a href="https://jira.apache.org/jira/browse/BEAM-12925">BEAM-12925&lt;/a> was fixed. It used to silently pass incorrect null data read from JdbcIO. Pipelines affected by this will now start throwing failures instead of silently passing incorrect data.&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed error while writing multiple DeferredFrames to csv (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12701">BEAM-12701&lt;/a>).&lt;/li>
&lt;li>Fixed error when importing the DataFrame API with pandas 1.0.x installed (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12945">BEAM-12945&lt;/a>).&lt;/li>
&lt;li>Fixed top.SmallestPerKey implementation in the Go SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12946">BEAM-12946&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used).
This results in the error message: &lt;code>IllegalArgumentException: Attempting to access unknown side input&lt;/code>.
Please upgrade to a newer version (&amp;gt; 2.34.0) or use another write method (e.g. &lt;code>STORAGE_WRITE_API&lt;/code>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay,
Aizhamal Nurmamat kyzy,
Alex Amato,
Alexander Chermenin,
Alexey Romanenko,
AlikRodriguez,
Andrew Pilloud,
Andy Xu,
Ankur Goenka,
Aydar Farrakhov,
Aydar Zainutdinov,
Aydar Zaynutdinov,
AydarZaynutdinov,
Benjamin Gonzalez,
BenWhitehead,
Brachi Packter,
Brian Hulette,
Bu Sun Kim,
Chamikara Jayalath,
Chris Gray,
Chuck Yang,
Chun Yang,
Claire McGinty,
comet,
Daniel Collins,
Daniel Oliveira,
Daniel Thevessen,
daria.malkova,
David Cavazos,
David Huntsperger,
Dmytro Kozhevin,
dpcollins-google,
Eduardo Sánchez López,
Elias Djurfeldt,
emily,
Emily Ye,
Enis Sert,
Etienne Chauchot,
Fernando Morales,
Heejong Lee,
Ihor Indyk,
Ismaël Mejía,
Israel Herraiz,
Jack McCluskey,
Jonathan Hourany,
Judah Rand,
Kenneth Knowles,
KevinGG,
Ke Wu,
kileys,
Kyle Weaver,
Luke Cwik,
masahitojp,
MiguelAnzoWizeline,
Minbo Bae,
Niels Basjes,
Ning Kang,
Pablo Estrada,
pareshsarafmdb,
Paul Féraud,
Piotr Szczepanik,
Reuven Lax,
Ritesh Ghorse,
R. Miles McCain,
Robert Bradshaw,
Robert Burke,
Rogan Morrow,
Ruwan Lambrichts,
rvballada,
Ryan Thompson,
Sam Rohde,
Sam Whittle,
Ștefan Istrate,
Steve Niemitz,
Thomas Li Fredriksen,
Tomo Suzuki,
tvalentyn,
Udi Meiri,
Vachan,
Valentyn Tymofieiev,
Vincent Marquez,
WinsonT,
Yichi Zhang,
Yifan Mai,
Yilei &amp;ldquo;Dolee&amp;rdquo; Yang,
zhoufek&lt;/p></description></item><item><title>Blog: Apache Beam 2.33.0</title><link>/blog/beam-2.33.0/</link><pubDate>Thu, 07 Oct 2021 00:00:01 -0800</pubDate><guid>/blog/beam-2.33.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.33.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2330-2021-10-07">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.33.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12350404">detailed release
notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Go SDK is no longer experimental, and is officially part of the Beam release process.
&lt;ul>
&lt;li>Matching Go SDK containers are published on release.&lt;/li>
&lt;li>Batch usage is well supported, and tested on Flink, Spark, and the Python Portable Runner.
&lt;ul>
&lt;li>SDK Tests are also run against Google Cloud Dataflow, but this doesn&amp;rsquo;t indicate reciprocal support.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The SDK supports Splittable DoFns, Cross Language transforms, and most Beam Model basics.&lt;/li>
&lt;li>Go Modules are now used for dependency management.
&lt;ul>
&lt;li>This is a breaking change, see Breaking Changes for resolution.&lt;/li>
&lt;li>Easier path to contribute to the Go SDK, no need to set up a GOPATH.&lt;/li>
&lt;li>Minimum Go version is now Go v1.16&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>See the announcement blogpost for full information once published.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Projection pushdown in SchemaIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12609">BEAM-12609&lt;/a>).&lt;/li>
&lt;li>Upgrade Flink runner to Flink versions 1.13.2, 1.12.5 and 1.11.4 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10955">BEAM-10955&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>Since release 2.30.0, &amp;ldquo;The AvroCoder changes for BEAM-2303 [changed] the reader/writer from the Avro ReflectDatum* classes to the SpecificDatum* classes&amp;rdquo; (Java). This default behavior change has been reverted in this release. Use the &lt;code>useReflectApi&lt;/code> setting to control it (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12628">BEAM-12628&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>Python GBK will stop supporting unbounded PCollections that have global windowing and a default trigger in Beam 2.34. This can be overridden with &lt;code>--allow_unsafe_triggers&lt;/code>. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;li>Python GBK will start requiring safe triggers or the &lt;code>--allow_unsafe_triggers&lt;/code> flag starting with Beam 2.34. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Fixed UnsupportedOperationException when reading from BigQuery tables and converting
TableRows to Beam Rows (Java)
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-12479">BEAM-12479&lt;/a>).&lt;/li>
&lt;li>Fixed SDFBoundedSourceReader running much slower than the original
BoundedSource (Python)
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-12781">BEAM-12781&lt;/a>).&lt;/li>
&lt;li>Fixed crash when an ORDER BY column is not in SELECT (ZetaSQL)
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-12759">BEAM-12759&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>Spark 2.x users will need to update Spark&amp;rsquo;s Jackson runtime dependencies (&lt;code>spark.jackson.version&lt;/code>) to at least version 2.9.2, due to Beam updating its dependencies.&lt;/li>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;li>Go SDK jobs may produce &amp;ldquo;Failed to deduce Step from MonitoringInfo&amp;rdquo; messages following successful job execution. The messages are benign and don&amp;rsquo;t indicate job failure. These are due to not yet handling PCollection metrics.&lt;/li>
&lt;li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used).
This results in the error message: &lt;code>IllegalArgumentException: Attempting to access unknown side input&lt;/code>.
Please upgrade to a newer version (&amp;gt; 2.34.0) or use another write method (e.g. &lt;code>STORAGE_WRITE_API&lt;/code>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay,
Alex Amato,
Alexey Romanenko,
Andreas Bergmeier,
Andres Rodriguez,
Andrew Pilloud,
Andy Xu,
Ankur Goenka,
anthonyqzhu,
Benjamin Gonzalez,
Bhupinder Sindhwani,
Chamikara Jayalath,
Claire McGinty,
Daniel Mateus Pires,
Daniel Oliveira,
David Huntsperger,
Dylan Hercher,
emily,
Emily Ye,
Etienne Chauchot,
Eugene Nikolaiev,
Heejong Lee,
iindyk,
Iñigo San Jose Visiers,
Ismaël Mejía,
Jack McCluskey,
Jan Lukavský,
Jeff Ruane,
Jeremy Lewi,
KevinGG,
Ke Wu,
Kyle Weaver,
lostluck,
Luke Cwik,
Marwan Tammam,
masahitojp,
Mehdi Drissi,
Minbo Bae,
Ning Kang,
Pablo Estrada,
Pascal Gillet,
Pawas Chhokra,
Reuven Lax,
Ritesh Ghorse,
Robert Bradshaw,
Robert Burke,
Rodrigo Benenson,
Ryan Thompson,
Saksham Gupta,
Sam Rohde,
Sam Whittle,
Sayat,
Sayat Satybaldiyev,
Siyuan Chen,
Slava Chernyak,
Steve Niemitz,
Steven Niemitz,
tvalentyn,
Tyson Hamilton,
Udi Meiri,
vachan-shetty,
Venkatramani Rajgopal,
Yichi Zhang,
zhoufek&lt;/p></description></item><item><title>Blog: Apache Beam 2.32.0</title><link>/blog/beam-2.32.0/</link><pubDate>Wed, 25 Aug 2021 00:00:01 -0800</pubDate><guid>/blog/beam-2.32.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.32.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2320-2021-08-11">download page&lt;/a> for this release.&lt;/p>
&lt;!-- more -->
&lt;p>For more information on changes in 2.32.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349992">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>The &lt;a href="/documentation/dsls/dataframes/overview/">Beam DataFrame
API&lt;/a> is no
longer experimental! We&amp;rsquo;ve spent the time since the &lt;a href="/blog/dataframe-api-preview-available/">2.26.0 preview
announcement&lt;/a>
implementing the most frequently used pandas operations
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9547">BEAM-9547&lt;/a>), improving
&lt;a href="https://beam.apache.org/releases/pydoc/current/apache_beam.dataframe.html">documentation&lt;/a>
and &lt;a href="https://issues.apache.org/jira/browse/BEAM-12028">error messages&lt;/a>,
adding
&lt;a href="https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/dataframe">examples&lt;/a>,
integrating DataFrames with &lt;a href="https://beam.apache.org/releases/pydoc/current/apache_beam.runners.interactive.interactive_beam.html">interactive
Beam&lt;/a>,
and of course finding and fixing
&lt;a href="https://issues.apache.org/jira/issues/?jql=project%3DBEAM%20AND%20issuetype%3DBug%20AND%20status%3DResolved%20AND%20component%3Ddsl-dataframe">bugs&lt;/a>.
Leaving experimental just means that we now have high confidence in the API
and recommend its use for production workloads. We will continue to improve
the API, guided by your
&lt;a href="/community/contact-us/">feedback&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added ability to use JdbcIO.Write.withResults without statement and preparedStatementSetter. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12511">BEAM-12511&lt;/a>)&lt;/li>
&lt;li>Added ability to register URI schemes to use the S3 protocol via FileIO. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12435">BEAM-12435&lt;/a>).&lt;/li>
&lt;li>Respect number of shards set in SnowflakeWrite batch mode. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12715">BEAM-12715&lt;/a>)&lt;/li>
&lt;li>Java SDK: Update Google Cloud Healthcare IO connectors from using v1beta1 to using the GA version.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Add support to convert Beam Schema to Avro Schema for JDBC LogicalTypes:
&lt;code>VARCHAR&lt;/code>, &lt;code>NVARCHAR&lt;/code>, &lt;code>LONGVARCHAR&lt;/code>, &lt;code>LONGNVARCHAR&lt;/code>, &lt;code>DATE&lt;/code>, &lt;code>TIME&lt;/code>
(Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12385">BEAM-12385&lt;/a>).&lt;/li>
&lt;li>Reading from JDBC source by partitions (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12456">BEAM-12456&lt;/a>).&lt;/li>
&lt;li>PubsubIO can now write to a dead-letter topic after a parsing error (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12474">BEAM-12474&lt;/a>).&lt;/li>
&lt;li>New append-only option for Elasticsearch sink (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12601">BEAM-12601&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>ListShards (with DescribeStreamSummary) is now used instead of DescribeStream to list shards in Kinesis streams. As mentioned in the &lt;a href="https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListShards.html">AWS documentation&lt;/a>, fine-grained IAM policies must be updated to allow calls to the ListShards and DescribeStreamSummary APIs. For more information, see &lt;a href="https://docs.aws.amazon.com/streams/latest/dev/controlling-access.html">Controlling Access to Amazon Kinesis Data Streams&lt;/a> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12225">BEAM-12225&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Python GBK will stop supporting unbounded PCollections that have global windowing and a default trigger in Beam 2.33. This can be overridden with &lt;code>--allow_unsafe_triggers&lt;/code>. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;li>Python GBK will start requiring safe triggers or the &lt;code>--allow_unsafe_triggers&lt;/code> flag starting with Beam 2.33. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="bugfixes">Bugfixes&lt;/h2>
&lt;ul>
&lt;li>Fixed race condition in RabbitMqIO causing duplicate acks (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6516">BEAM-6516&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.32.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Ajo Thomas, Alex Amato, Alexey Romanenko, Alex Koay, allenpradeep, Anant Damle, Andrew Pilloud, Ankur Goenka, Ashwin Ramaswami, Benjamin Gonzalez, BenWhitehead, Blake Williams, Boyuan Zhang, Brian Hulette, Chamikara Jayalath, Daniel Oliveira, Daniel Thevessen, daria-malkova, David Cavazos, David Huntsperger, dennisylyung, Dennis Yung, dmkozh, egalpin, emily, Esun Kim, Gabriel Melo de Paula, Harch Vardhan, Heejong Lee, heidimhurst, hoshimura, Iñigo San Jose Visiers, Ismaël Mejía, Jack McCluskey, Jan Lukavský, Justin King, Kenneth Knowles, KevinGG, Ke Wu, kileys, Kyle Weaver, Luke Cwik, Maksym Skorupskyi, masahitojp, Matthew Ouyang, Matthias Baetens, Matt Rudary, MiguelAnzoWizeline, Miguel Hernandez, Nikita Petunin, Ning Ding, Ning Kang, odidev, Pablo Estrada, Pascal Gillet, rafal.ochyra, raphael.sanamyan, Reuven Lax, Robert Bradshaw, Robert Burke, roger-mike, Ryan McDowell, Sam Rohde, Sam Whittle, Siyuan Chen, Teng Qiu, Tianzi Cai, Tobias Hermann, Tomo Suzuki, tvalentyn, Tyson Hamilton, Udi Meiri, Valentyn Tymofieiev, Vitaly Terentyev, Yichi Zhang, Yifan Mai, yoshiki.obata, Yu Feng, YuqiHuai, yzhang559, Zachary Houfek, zhoufek&lt;/p></description></item><item><title>Blog: Apache Beam 2.31.0</title><link>/blog/beam-2.31.0/</link><pubDate>Thu, 08 Jul 2021 09:00:00 -0700</pubDate><guid>/blog/beam-2.31.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.31.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2310-2021-07-08">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.31.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349991">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Fixed bug in ReadFromBigQuery when a RuntimeValueProvider is used as value of table argument (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12514">BEAM-12514&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>&lt;code>CREATE FUNCTION&lt;/code> DDL statement added to Calcite SQL syntax. &lt;code>JAR&lt;/code> and &lt;code>AGGREGATE&lt;/code> are now reserved keywords. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12339">BEAM-12339&lt;/a>).&lt;/li>
&lt;li>Flink 1.13 is now supported by the Flink runner (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12277">BEAM-12277&lt;/a>).&lt;/li>
&lt;li>DatastoreIO: Write and delete operations now follow automatic gradual ramp-up,
in line with best practices (Java/Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12260">BEAM-12260&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-12272">BEAM-12272&lt;/a>).&lt;/li>
&lt;li>Python &lt;code>TriggerFn&lt;/code> has a new &lt;code>may_lose_data&lt;/code> method to signal potential data loss. The default behavior assumes the trigger is safe (necessary for backwards compatibility). See Deprecations for the potential impact of overriding this. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>Python Row objects are now sensitive to field order. So &lt;code>Row(x=3, y=4)&lt;/code> is no
longer considered equal to &lt;code>Row(y=4, x=3)&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11929">BEAM-11929&lt;/a>).&lt;/li>
&lt;li>Kafka Beam SQL tables now ascribe meaning to the LOCATION field; previously
it was ignored if provided.&lt;/li>
&lt;li>&lt;code>TopCombineFn&lt;/code> now disallows &lt;code>compare&lt;/code> as an argument (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7372">BEAM-7372&lt;/a>).&lt;/li>
&lt;li>Drop support for Flink 1.10 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12281">BEAM-12281&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>Python GBK will stop supporting unbounded PCollections that have global windowing and a default trigger in Beam 2.33. This can be overridden with &lt;code>--allow_unsafe_triggers&lt;/code>. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;li>Python GBK will start requiring safe triggers or the &lt;code>--allow_unsafe_triggers&lt;/code> flag starting with Beam 2.33. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9487">BEAM-9487&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.31.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to &lt;code>git shortlog&lt;/code>, the following people contributed to the 2.31.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, ajo thomas, Alan Myrvold, Alex Amato, Alexey Romanenko,
AlikRodriguez, Anant Damle, Andrew Pilloud, Benjamin Gonzalez, Boyuan Zhang,
Brian Hulette, Chamikara Jayalath, Daniel Oliveira, David Cavazos,
David Huntsperger, David Moravek, Dmytro Kozhevin, dpcollins-google, Emily Ye,
Ernesto Valentino, Evan Galpin, Fernando Morales, Heejong Lee, Ismaël Mejía,
Jan Lukavský, Josias Rico, jrynd, Kenneth Knowles, Ke Wu, kileys, Kyle Weaver,
masahitojp, Matthias Baetens, Maximilian Michels, Milena Bukal,
Nathan J. Mehl, Pablo Estrada, Peter Sobot, Reuven Lax, Robert Bradshaw,
Robert Burke, roger-mike, Sam Rohde, Sam Whittle, Stephan Hoyer, Tom Underhill,
tvalentyn, Uday Singh, Udi Meiri, Vitaly Terentyev, Xinyu Liu, Yichi Zhang,
Yifan Mai, yoshiki.obata, zhoufek&lt;/p></description></item><item><title>Blog: Apache Beam 2.30.0</title><link>/blog/beam-2.30.0/</link><pubDate>Wed, 09 Jun 2021 09:00:00 -0700</pubDate><guid>/blog/beam-2.30.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.30.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2300-2021-06-09">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.30.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349978">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>The legacy Read transform (non-SDF based Read) is used by default for non-FnAPI open source runners. Use the &lt;code>use_sdf_read&lt;/code> experimental flag to re-enable SDF based Read transforms (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10670">BEAM-10670&lt;/a>)&lt;/li>
&lt;li>Upgraded vendored gRPC dependency to 1.36.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11227">BEAM-11227&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Fixed an issue where WriteToBigQuery with batch file loads did not respect schema update options when there were multiple load jobs (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11277">BEAM-11277&lt;/a>)&lt;/li>
&lt;li>Fixed an issue where jobs did not retry properly because the BigQuery sink swallowed HttpErrors when performing streaming inserts (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12362">BEAM-12362&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Added capability to declare resource hints in Java and Python SDKs (&lt;a href="https://issues.apache.org/jira/browse/BEAM-2085">BEAM-2085&lt;/a>)&lt;/li>
&lt;li>Added Spanner IO Performance tests for read and write in Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10029">BEAM-10029&lt;/a>)&lt;/li>
&lt;li>Added support for accessing GCP PubSub Message ordering keys, message IDs and message publish timestamp in Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7819">BEAM-7819&lt;/a>)&lt;/li>
&lt;li>DataFrame API: Added support for collecting DataFrame objects in interactive Beam (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11855">BEAM-11855&lt;/a>)&lt;/li>
&lt;li>DataFrame API: Added &lt;a href="https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/dataframe">apache_beam.examples.dataframe&lt;/a> module (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12024">BEAM-12024&lt;/a>)&lt;/li>
&lt;li>Upgraded the GCP Libraries BOM version to 20.0.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11205">BEAM-11205&lt;/a>). For Google Cloud client library versions set by this BOM, see &lt;a href="https://storage.googleapis.com/cloud-opensource-java-dashboard/com.google.cloud/libraries-bom/20.0.0/artifact_details.html">this table&lt;/a>&lt;/li>
&lt;li>Added &lt;code>sdkContainerImage&lt;/code> flag to (eventually) replace &lt;code>workerHarnessContainerImage&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12212">BEAM-12212&lt;/a>)&lt;/li>
&lt;li>Added support for Dataflow update when schemas are used (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12198">BEAM-12198&lt;/a>)&lt;/li>
&lt;li>Fixed an issue where &lt;code>ZipFiles.zipDirectory&lt;/code> leaked native JVM memory (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12220">BEAM-12220&lt;/a>)&lt;/li>
&lt;li>Fixed an issue where &lt;code>Reshuffle.withNumBuckets&lt;/code> created &lt;code>(N*2)-1&lt;/code> buckets (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12361">BEAM-12361&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>Drop support for Flink 1.8 and 1.9 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11948">BEAM-11948&lt;/a>)&lt;/li>
&lt;li>MongoDbIO: Read.withFilter() and Read.withProjection() were removed; they had been deprecated since Beam 2.12.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12217">BEAM-12217&lt;/a>)&lt;/li>
&lt;li>RedisIO.readAll() was removed; it had been deprecated since Beam 2.13.0. Use RedisIO.readKeyPatterns() for the equivalent functionality (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12214">BEAM-12214&lt;/a>)&lt;/li>
&lt;li>MqttIO.create() with clientId was removed; it had been deprecated since Beam 2.13.0 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12216">BEAM-12216&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.30.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to &lt;code>git shortlog&lt;/code>, the following people contributed to the 2.30.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Anant Damle, Andreas Bergmeier, Andrew Pilloud, Ankur Goenka,
Anup D, Artur Khanin, Benjamin Gonzalez, Bipin Upadhyaya, Boyuan Zhang, Brian Hulette, Bulat Shakirzyanov,
Chamikara Jayalath, Chun Yang, Daniel Kulp, Daniel Oliveira, David Cavazos, Elliotte Rusty Harold, Emily Ye,
Eric Roshan-Eisner, Evan Galpin, Fabien Caylus, Fernando Morales, Heejong Lee, Iñigo San Jose Visiers,
Isidro Martínez, Ismaël Mejía, Ke Wu, Kenneth Knowles, KevinGG, Kyle Weaver, Ludovic Post, MATTHEW Ouyang (LCL),
Mackenzie Clark, Masato Nakamura, Matthias Baetens, Max, Nicholas Azar, Ning Kang, Pablo Estrada, Patrick McCaffrey,
Quentin Sommer, Reuven Lax, Robert Bradshaw, Robert Burke, Rui Wang, Sam Rohde, Sam Whittle, Shoaib Zafar,
Siyuan Chen, Sruthi Sree Kumar, Steve Niemitz, Sylvain Veyrié, Tomo Suzuki, Udi Meiri, Valentyn Tymofieiev,
Vitaly Terentyev, Wenbing, Xinyu Liu, Yichi Zhang, Yifan Mai, Yueyang Qiu, Yunqing Zhou, ajo thomas, brucearctor,
dmkozh, dpcollins-google, emily, jordan-moore, kileys, lostluck, masahitojp, roger-mike, sychen, tvalentyn,
vachan-shetty, yoshiki.obata&lt;/p></description></item><item><title>Blog: Apache Beam 2.29.0</title><link>/blog/beam-2.29.0/</link><pubDate>Thu, 29 Apr 2021 09:00:00 -0700</pubDate><guid>/blog/beam-2.29.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.29.0 release of Beam.
This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2290-2021-04-15">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.29.0, check out the &lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349629">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Spark Classic and Portable runners officially support Spark 3 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7093">BEAM-7093&lt;/a>).&lt;/li>
&lt;li>Official Java 11 support for most runners (Dataflow, Flink, Spark) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-2530">BEAM-2530&lt;/a>).&lt;/li>
&lt;li>DataFrame API now supports GroupBy.apply (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11628">BEAM-11628&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Added support for S3 filesystem on AWS SDK V2 (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7637">BEAM-7637&lt;/a>)&lt;/li>
&lt;li>GCP BigQuery sink (file loads) uses runner determined sharding for unbounded data (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11772">BEAM-11772&lt;/a>)&lt;/li>
&lt;li>KafkaIO now recognizes the &lt;code>partition&lt;/code> property in writing records (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11806">BEAM-11806&lt;/a>)&lt;/li>
&lt;li>Support for Hadoop configuration on ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11913">BEAM-11913&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>DataFrame API now supports pandas 1.2.x (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11531">BEAM-11531&lt;/a>).&lt;/li>
&lt;li>Multiple DataFrame API bugfixes (&lt;a href="https://issues.apache.org/jira/browse/BEAM-12071">BEAM-12071&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-11929">BEAM-11929&lt;/a>)&lt;/li>
&lt;li>DDL supported in SQL transforms (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11850">BEAM-11850&lt;/a>)&lt;/li>
&lt;li>Upgrade Flink runner to Flink version 1.12.2 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11941">BEAM-11941&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>Deterministic coding enforced for GroupByKey and Stateful DoFns. Previously non-deterministic coding was allowed, resulting in keys not properly being grouped in some cases. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11719">BEAM-11719&lt;/a>)
To restore the old behavior, one can register &lt;code>FakeDeterministicFastPrimitivesCoder&lt;/code> with
&lt;code>beam.coders.registry.register_fallback_coder(beam.coders.coders.FakeDeterministicFastPrimitivesCoder())&lt;/code>
or use the &lt;code>allow_non_deterministic_key_coders&lt;/code> pipeline option.&lt;/li>
&lt;/ul>
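&lt;p>To illustrate why deterministic key coding matters, here is a minimal standard-library sketch. It does not use Beam; the helper names &lt;code>encode_insertion_order&lt;/code>, &lt;code>encode_sorted&lt;/code>, and &lt;code>group_by_encoded_key&lt;/code> are hypothetical stand-ins, and it assumes grouping happens on the encoded bytes of each key, as a distributed runner typically does.&lt;/p>

```python
import json
from collections import defaultdict

# Two logically equal keys whose fields were inserted in a different order.
key_a = {"user": "alice", "region": "eu"}
key_b = {"region": "eu", "user": "alice"}


def encode_insertion_order(key):
    # Stand-in for a non-deterministic coder: the bytes depend on insertion order.
    return json.dumps(key).encode("utf-8")


def encode_sorted(key):
    # Stand-in for a deterministic coder: equal keys always yield equal bytes.
    return json.dumps(key, sort_keys=True).encode("utf-8")


def group_by_encoded_key(pairs, encode):
    # Group values by the encoded bytes of their key (assumed runner behavior).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[encode(key)].append(value)
    return dict(groups)


pairs = [(key_a, 1), (key_b, 2)]
split_groups = group_by_encoded_key(pairs, encode_insertion_order)
merged_groups = group_by_encoded_key(pairs, encode_sorted)

print(len(split_groups))   # equal keys end up in separate groups
print(len(merged_groups))  # equal keys end up in a single group
```

&lt;p>With the order-dependent encoding, the two equal keys are grouped separately; with the sorted encoding they are grouped together, which is the behavior the enforced check is meant to protect.&lt;/p>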
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>Support for Flink 1.8 and 1.9 will be removed in the next release (2.30.0) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11948">BEAM-11948&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.29.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to &lt;code>git shortlog&lt;/code>, the following people contributed to the 2.29.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alan Myrvold, Alex Amato, Alexander Chermenin, Alexey Romanenko,
Allen Pradeep Xavier, Amy Wu, Anant Damle, Andreas Bergmeier, Andrei Balici,
Andrew Pilloud, Andy Xu, Ankur Goenka, Bashir Sadjad, Benjamin Gonzalez, Boyuan
Zhang, Brian Hulette, Chamikara Jayalath, Chinmoy Mandayam, Chuck Yang,
dandy10, Daniel Collins, Daniel Oliveira, David Cavazos, David Huntsperger,
David Moravek, Dmytro Kozhevin, Emily Ye, Esun Kim, Evgeniy Belousov, Filip
Popić, Fokko Driesprong, Gris Cuevas, Heejong Lee, Ihor Indyk, Ismaël Mejía,
Jakub-Sadowski, Jan Lukavský, John Edmonds, Juan Sandoval, 谷口恵輔, Kenneth
Jung, Kenneth Knowles, KevinGG, Kiley Sok, Kyle Weaver, MabelYC, Mackenzie
Clark, Masato Nakamura, Milena Bukal, Miltos, Minbo Bae, Miraç Vuslat Başaran,
mynameborat, Nahian-Al Hasan, Nam Bui, Niel Markwick, Niels Basjes, Ning Kang,
Nir Gazit, Pablo Estrada, Ramazan Yapparov, Raphael Sanamyan, Reuven Lax, Rion
Williams, Robert Bradshaw, Robert Burke, Rui Wang, Sam Rohde, Sam Whittle,
Shehzaad Nakhoda, Siyuan Chen, Sonam Ramchand, Steve Niemitz,
sychen, Sylvain Veyrié, Tim Robertson, Tobias Kaymak, Tomasz Szerszeń,
Tomo Suzuki, Tyson Hamilton, Udi Meiri, Valentyn Tymofieiev, Yichi
Zhang, Yifan Mai, Yixing Zhang, Yoshiki Obata&lt;/p></description></item><item><title>Blog: Apache Beam 2.28.0</title><link>/blog/beam-2.28.0/</link><pubDate>Mon, 22 Feb 2021 12:00:00 -0800</pubDate><guid>/blog/beam-2.28.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.28.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2280-2021-02-22">download page&lt;/a> for this release.
For more information on changes in 2.28.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349499">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Many improvements related to Parquet support (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11460">BEAM-11460&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-8202">BEAM-8202&lt;/a>, and &lt;a href="https://issues.apache.org/jira/browse/BEAM-11526">BEAM-11526&lt;/a>)&lt;/li>
&lt;li>Hash Functions in BeamSQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10074">BEAM-10074&lt;/a>)&lt;/li>
&lt;li>Hash functions in ZetaSQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11624">BEAM-11624&lt;/a>)&lt;/li>
&lt;li>Create ApproximateDistinct using HLL Impl (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10324">BEAM-10324&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>SpannerIO supports using BigDecimal for Numeric fields (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11643">BEAM-11643&lt;/a>)&lt;/li>
&lt;li>Add Beam schema support to ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11526">BEAM-11526&lt;/a>)&lt;/li>
&lt;li>Support ParquetTable Writer (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8202">BEAM-8202&lt;/a>)&lt;/li>
&lt;li>GCP BigQuery sink (streaming inserts) uses runner determined sharding (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11408">BEAM-11408&lt;/a>)&lt;/li>
&lt;li>PubSub support types: TIMESTAMP, DATE, TIME, DATETIME (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11533">BEAM-11533&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>ParquetIO adds methods &lt;em>readGenericRecords&lt;/em> and &lt;em>readFilesGenericRecords&lt;/em> that can read files with an unknown schema. See &lt;a href="https://github.com/apache/beam/pull/13554">PR-13554&lt;/a> and &lt;a href="https://issues.apache.org/jira/browse/BEAM-11460">BEAM-11460&lt;/a>.&lt;/li>
&lt;li>Added support for thrift in KafkaTableProvider (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11482">BEAM-11482&lt;/a>)&lt;/li>
&lt;li>Added support for HadoopFormatIO to skip key/value clone (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11457">BEAM-11457&lt;/a>)&lt;/li>
&lt;li>Support Conversion to GenericRecords in Convert.to transform (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11571">BEAM-11571&lt;/a>).&lt;/li>
&lt;li>Support writes for Parquet Tables in Beam SQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8202">BEAM-8202&lt;/a>).&lt;/li>
&lt;li>Support reading Parquet files with unknown schema (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11460">BEAM-11460&lt;/a>)&lt;/li>
&lt;li>Support user configurable Hadoop Configuration flags for ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11527">BEAM-11527&lt;/a>)&lt;/li>
&lt;li>Expose commit_offset_in_finalize and timestamp_policy to ReadFromKafka (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11677">BEAM-11677&lt;/a>)&lt;/li>
&lt;li>S3 options were not provided to the boto3 client when using FlinkRunner and the Beam worker pool container (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11799">BEAM-11799&lt;/a>)&lt;/li>
&lt;li>HDFS not deduplicating identical configuration paths (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11329">BEAM-11329&lt;/a>)&lt;/li>
&lt;li>Hash Functions in BeamSQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10074">BEAM-10074&lt;/a>)&lt;/li>
&lt;li>Create ApproximateDistinct using HLL Impl (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10324">BEAM-10324&lt;/a>)&lt;/li>
&lt;li>Add Beam schema support to ParquetIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11526">BEAM-11526&lt;/a>)&lt;/li>
&lt;li>Add a Deque Encoder (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11538">BEAM-11538&lt;/a>)&lt;/li>
&lt;li>Hash functions in ZetaSQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11624">BEAM-11624&lt;/a>)&lt;/li>
&lt;li>Refactor ParquetTableProvider (&lt;a href="https://issues.apache.org/jira/browse/">&lt;/a>)&lt;/li>
&lt;li>Add JVM properties to JavaJobServer (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8344">BEAM-8344&lt;/a>)&lt;/li>
&lt;li>Single source of truth for supported Flink versions (&lt;a href="https://issues.apache.org/jira/browse/">&lt;/a>)&lt;/li>
&lt;li>Use metric for Python BigQuery streaming insert API latency logging (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11018">BEAM-11018&lt;/a>)&lt;/li>
&lt;li>Use metric for Java BigQuery streaming insert API latency logging (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11032">BEAM-11032&lt;/a>)&lt;/li>
&lt;li>Upgrade Flink runner to Flink versions 1.12.1 and 1.11.3 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11697">BEAM-11697&lt;/a>)&lt;/li>
&lt;li>Upgrade Beam base image to use Tensorflow 2.4.1 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11762">BEAM-11762&lt;/a>)&lt;/li>
&lt;li>Create Beam GCP BOM (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11665">BEAM-11665&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The Java artifacts &amp;ldquo;beam-sdks-java-io-kinesis&amp;rdquo;, &amp;ldquo;beam-sdks-java-io-google-cloud-platform&amp;rdquo;, and
&amp;ldquo;beam-sdks-java-extensions-sql-zetasql&amp;rdquo; declare Guava 30.1-jre dependency (It was 25.1-jre in Beam 2.27.0).
This new Guava version may introduce dependency conflicts if your project or dependencies rely
on removed APIs. If you are affected, make sure to use an appropriate Guava version via &lt;code>dependencyManagement&lt;/code> in Maven or
&lt;code>force&lt;/code> in Gradle.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.28.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Allen Pradeep Xavier, Anant Damle, Artur Khanin,
Boyuan Zhang, Brian Hulette, Chamikara Jayalath, Chris Roth, Costi Ciudatu, Damon Douglas,
Daniel Collins, Daniel Oliveira, David Cavazos, David Huntsperger, Elliotte Rusty Harold,
Emily Ye, Etienne Chauchot, Etta Rapp, Evan Palmer, Eyal, Filip Krakowski, Fokko Driesprong,
Heejong Lee, Ismaël Mejía, janeliulwq, Jan Lukavský, John Edmonds, Jozef Vilcek, Kenneth Knowles,
Ke Wu, kileys, Kyle Weaver, MabelYC, masahitojp, Masato Nakamura, Milena Bukal, Miraç Vuslat Başaran,
Nelson Osacky, Niel Markwick, Ning Kang, omarismail94, Pablo Estrada, Piotr Szuberski,
ramazan-yapparov, Reuven Lax, Reza Rokni, rHermes, Robert Bradshaw, Robert Burke, Robert Gruener,
Romster, Rui Wang, Sam Whittle, shehzaadn-vd, Siyuan Chen, Sonam Ramchand, Tobiasz Kędzierski,
Tomo Suzuki, tszerszen, tvalentyn, Tyson Hamilton, Udi Meiri, Xinbin Huang, Yichi Zhang,
Yifan Mai, yoshiki.obata, Yueyang Qiu, Yusaku Matsuki&lt;/p></description></item><item><title>Blog: Apache Beam 2.27.0</title><link>/blog/beam-2.27.0/</link><pubDate>Thu, 07 Jan 2021 12:00:00 -0800</pubDate><guid>/blog/beam-2.27.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.27.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2270-2020-12-22">download page&lt;/a> for this release.
For more information on changes in 2.27.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12349380">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Java 11 Containers are now published with all Beam releases.&lt;/li>
&lt;li>There is a new transform &lt;code>ReadAllFromBigQuery&lt;/code> that can receive multiple requests to read data from BigQuery at pipeline runtime. See &lt;a href="https://github.com/apache/beam/pull/13170">PR 13170&lt;/a>, and &lt;a href="https://issues.apache.org/jira/browse/BEAM-9650">BEAM-9650&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>ReadFromMongoDB can now be used with MongoDB Atlas (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11266">BEAM-11266&lt;/a>).&lt;/li>
&lt;li>ReadFromMongoDB/WriteToMongoDB will mask password in display_data (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11444">BEAM-11444&lt;/a>).&lt;/li>
&lt;li>There is a new transform &lt;code>ReadAllFromBigQuery&lt;/code> that can receive multiple requests to read data from BigQuery at pipeline runtime. See &lt;a href="https://github.com/apache/beam/pull/13170">PR 13170&lt;/a>, and &lt;a href="https://issues.apache.org/jira/browse/BEAM-9650">BEAM-9650&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Beam modules that depend on Hadoop are now tested for compatibility with Hadoop 3 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8569">BEAM-8569&lt;/a>). (Hive/HCatalog pending)&lt;/li>
&lt;li>Publishing Java 11 SDK container images now supported as part of Apache Beam release process. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8106">BEAM-8106&lt;/a>)&lt;/li>
&lt;li>Added Cloud Bigtable Provider extension to Beam SQL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11173">BEAM-11173&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-11373">BEAM-11373&lt;/a>)&lt;/li>
&lt;li>Added a schema provider for thrift data (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11338">BEAM-11338&lt;/a>)&lt;/li>
&lt;li>Added combiner packing pipeline optimization to Dataflow runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10641">BEAM-10641&lt;/a>)&lt;/li>
&lt;li>Added an example to ingest data from Apache Kafka to Google Pub/Sub. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11065">BEAM-11065&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>HBaseIO hbase-shaded-client dependency should be now provided by the users (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9278">BEAM-9278&lt;/a>).&lt;/li>
&lt;li>&lt;code>--region&lt;/code> flag in amazon-web-services2 was replaced by &lt;code>--awsRegion&lt;/code> (&lt;a href="https://issues.apache.org/jira/projects/BEAM/issues/BEAM-11331">BEAM-11331&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.27.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Aliraza Nagamia, Allen Pradeep Xavier,
Andrew Pilloud, andreyKaparulin, Ashwin Ramaswami, Boyuan Zhang, Brent Worden, Brian Hulette,
Carlos Marin, Chamikara Jayalath, Costi Ciudatu, Damon Douglas, Daniel Collins,
Daniel Oliveira, David Huntsperger, David Lu, David Moravek, David Wrede,
dennis, Dennis Yung, dpcollins-google, Emily Ye, emkornfield,
Esun Kim, Etienne Chauchot, Eugene Nikolaiev, Frank Zhao, Haizhou Zhao,
Hector Acosta, Heejong Lee, Ilya, Iñigo San Jose Visiers, InigoSJ,
Ismaël Mejía, janeliulwq, Jan Lukavský, Kamil Wasilewski, Kenneth Jung,
Kenneth Knowles, Ke Wu, kileys, Kyle Weaver, lostluck,
Matt Casters, Maximilian Michels, Michal Walenia, Mike Dewar, nehsyc,
Nelson Osacky, Niels Basjes, Ning Kang, Pablo Estrada, palmere-google,
Pawel Pasterz, Piotr Szuberski, purbanow, Reuven Lax, rHermes,
Robert Bradshaw, Robert Burke, Rui Wang, Sam Rohde, Sam Whittle,
Siyuan Chen, Tim Robertson, Tobiasz Kędzierski, tszerszen,
Valentyn Tymofieiev, Tyson Hamilton, Udi Meiri, vachan-shetty, Xinyu Liu,
Yichi Zhang, Yifan Mai, yoshiki.obata, Yueyang Qiu&lt;/p></description></item><item><title>Blog: Apache Beam 2.26.0</title><link>/blog/beam-2.26.0/</link><pubDate>Fri, 11 Dec 2020 12:00:00 -0800</pubDate><guid>/blog/beam-2.26.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.26.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2260-2020-12-11">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.26.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12348833">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Splittable DoFn is now the default for executing the Read transform for Java-based runners (Spark with bounded pipelines) in addition to existing runners from the 2.25.0 release (Direct, Flink, Jet, Samza, Twister2). The expected output of the Read transform is unchanged. Users can opt out using &lt;code>--experiments=use_deprecated_read&lt;/code>. The Apache Beam community is looking for feedback on this change because it plans to make the change permanent with no opt-out. If you run into an issue requiring the opt-out, please send an e-mail to &lt;a href="mailto:user@beam.apache.org">user@beam.apache.org&lt;/a> specifically referencing BEAM-10670 in the subject line and explaining why you needed to opt out. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10670">BEAM-10670&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Java BigQuery streaming inserts now have timeouts enabled by default. Pass &lt;code>--HTTPWriteTimeout=0&lt;/code> to revert to the old behavior. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6103">BEAM-6103&lt;/a>)&lt;/li>
&lt;li>Added support for Contextual Text IO (Java), a version of text IO that provides metadata about the records (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10124">BEAM-10124&lt;/a>). Support for this IO is currently experimental. Specifically, &lt;strong>there are no update-compatibility guarantees&lt;/strong> for streaming jobs with this IO between current and future versions of the Apache Beam SDK.&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Added support for avro payload format in Beam SQL Kafka Table (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10885">BEAM-10885&lt;/a>)&lt;/li>
&lt;li>Added support for json payload format in Beam SQL Kafka Table (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10893">BEAM-10893&lt;/a>)&lt;/li>
&lt;li>Added support for protobuf payload format in Beam SQL Kafka Table (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10892">BEAM-10892&lt;/a>)&lt;/li>
&lt;li>Added support for avro payload format in Beam SQL Pubsub Table (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5504">BEAM-5504&lt;/a>)&lt;/li>
&lt;li>Added option to disable unnecessary copying between operators in Flink Runner (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11146">BEAM-11146&lt;/a>)&lt;/li>
&lt;li>Added CombineFn.setup and CombineFn.teardown to Python SDK. These methods let you initialize the CombineFn&amp;rsquo;s state before any of the other methods of the CombineFn is executed and clean that state up later on. If you are using Dataflow, you need to enable Dataflow Runner V2 by passing &lt;code>--experiments=use_runner_v2&lt;/code> before using this feature. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-3736">BEAM-3736&lt;/a>)&lt;/li>
&lt;/ul>
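&lt;p>The lifecycle described for &lt;code>CombineFn.setup&lt;/code> and &lt;code>CombineFn.teardown&lt;/code> can be sketched in plain Python. This is an illustration only: &lt;code>MeanFn&lt;/code> is a stand-in that mirrors the CombineFn method names without importing Beam, and &lt;code>run_combine&lt;/code> is a hypothetical driver. A real runner decides when and how often each method is called.&lt;/p>

```python
class MeanFn:
    """Plain-Python stand-in that mirrors CombineFn method names; not a Beam class."""

    def setup(self):
        # Initialize state once, before any other method runs.
        self.calls = ["setup"]

    def create_accumulator(self):
        self.calls.append("create_accumulator")
        return (0.0, 0)

    def add_input(self, accumulator, element):
        total, count = accumulator
        return (total + element, count + 1)

    def merge_accumulators(self, accumulators):
        totals, counts = zip(*accumulators)
        return (sum(totals), sum(counts))

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("nan")

    def teardown(self):
        # Clean up whatever setup initialized.
        self.calls.append("teardown")


def run_combine(fn, bundles):
    # Hypothetical driver: setup first, teardown last, combining in between.
    fn.setup()
    try:
        partials = []
        for bundle in bundles:
            acc = fn.create_accumulator()
            for element in bundle:
                acc = fn.add_input(acc, element)
            partials.append(acc)
        return fn.extract_output(fn.merge_accumulators(partials))
    finally:
        fn.teardown()


mean_fn = MeanFn()
result = run_combine(mean_fn, [[1, 2], [3, 4, 5]])
print(result)
```

&lt;p>The &lt;code>try&lt;/code>/&lt;code>finally&lt;/code> in the driver is a design choice of this sketch so that cleanup still runs when combining fails.&lt;/p>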
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>BigQuery&amp;rsquo;s DATETIME type now maps to Beam logical type org.apache.beam.sdk.schemas.logicaltypes.SqlTypes.DATETIME&lt;/li>
&lt;li>Pandas 1.x is now required for dataframe operations.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.26.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Abhishek Yadav, AbhiY98, Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko,
Andrew Pilloud, Ankur Goenka, Boyuan Zhang, Brian Hulette, Chad Dombrova,
Chamikara Jayalath, Curtis &amp;ldquo;Fjord&amp;rdquo; Hawthorne, Damon Douglas, dandy10, Daniel Oliveira,
David Cavazos, dennis, Derrick Qin, dpcollins-google, Dylan Hercher, emily, Esun Kim,
Gleb Kanterov, Heejong Lee, Ismaël Mejía, Jan Lukavský, Jean-Baptiste Onofré, Jing,
Jozef Vilcek, Justin White, Kamil Wasilewski, Kenneth Knowles, kileys, Kyle Weaver,
lostluck, Luke Cwik, Mark, Maximilian Michels, Milan Cermak, Mohammad Hossein Sekhavat,
Nelson Osacky, Neville Li, Ning Kang, pabloem, Pablo Estrada, pawelpasterz,
Pawel Pasterz, Piotr Szuberski, PoojaChandak, purbanow, rarokni, Ravi Magham,
Reuben van Ammers, Reuven Lax, Reza Rokni, Robert Bradshaw, Robert Burke,
Romain Manni-Bucau, Rui Wang, rworley-monster, Sam Rohde, Sam Whittle, shollyman,
Simone Primarosa, Siyuan Chen, Steve Niemitz, Steven van Rossum, sychen, Teodor Spæren,
Tim Clemons, Tim Robertson, Tobiasz Kędzierski, tszerszen, Tudor Marian, tvalentyn,
Tyson Hamilton, Udi Meiri, Vasu Gupta, xasm83, Yichi Zhang, yichuan66, Yifan Mai,
yoshiki.obata, Yueyang Qiu, yukihira1992&lt;/p></description></item><item><title>Blog: Apache Beam 2.25.0</title><link>/blog/beam-2.25.0/</link><pubDate>Fri, 23 Oct 2020 14:00:00 -0800</pubDate><guid>/blog/beam-2.25.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.25.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2250-2020-10-23">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.25.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12347147">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Splittable DoFn is now the default for executing the Read transform for Java-based runners (Direct, Flink, Jet, Samza, Twister2). The expected output of the Read transform is unchanged. Users can opt out using &lt;code>--experiments=use_deprecated_read&lt;/code>. The Apache Beam community is looking for feedback on this change because it plans to make the change permanent with no opt-out. If you run into an issue requiring the opt-out, please send an e-mail to &lt;a href="mailto:user@beam.apache.org">user@beam.apache.org&lt;/a> specifically referencing BEAM-10670 in the subject line and explaining why you needed to opt out. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10670">BEAM-10670&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Added cross-language support to Java&amp;rsquo;s KinesisIO, now available in the Python module &lt;code>apache_beam.io.kinesis&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10138">BEAM-10138&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-10137">BEAM-10137&lt;/a>).&lt;/li>
&lt;li>Update Snowflake JDBC dependency for SnowflakeIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10864">BEAM-10864&lt;/a>)&lt;/li>
&lt;li>Added cross-language support to Java&amp;rsquo;s SnowflakeIO.Write, now available in the Python module &lt;code>apache_beam.io.snowflake&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9898">BEAM-9898&lt;/a>).&lt;/li>
&lt;li>Added delete function to Java&amp;rsquo;s &lt;code>ElasticsearchIO#Write&lt;/code>. Now, Java&amp;rsquo;s ElasticsearchIO can be used to selectively delete documents using &lt;code>withIsDeleteFn&lt;/code> function (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5757">BEAM-5757&lt;/a>).&lt;/li>
&lt;li>Java SDK: Added new IO connector for InfluxDB - InfluxDbIO (&lt;a href="https://issues.apache.org/jira/browse/BEAM-2546">BEAM-2546&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Support for repeatable fields in JSON decoder for &lt;code>ReadFromBigQuery&lt;/code> added. (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10524">BEAM-10524&lt;/a>)&lt;/li>
&lt;li>Added an opt-in, performance-driven runtime type checking system for the Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10549">BEAM-10549&lt;/a>).
More details will be in an upcoming &lt;a href="/blog/python-performance-runtime-type-checking/index.html">blog post&lt;/a>.&lt;/li>
&lt;li>Added support for Python 3 type annotations on PTransforms using typed PCollections (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10258">BEAM-10258&lt;/a>).
More details will be in an upcoming &lt;a href="/blog/python-improved-annotations/index.html">blog post&lt;/a>.&lt;/li>
&lt;li>Improved the Interactive Beam API: recording a streaming job now starts a long-running background recording job. Running ib.show() or ib.collect() samples from the recording (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10603">BEAM-10603&lt;/a>).&lt;/li>
&lt;li>In Interactive Beam, ib.show() and ib.collect() now have &amp;ldquo;n&amp;rdquo; and &amp;ldquo;duration&amp;rdquo; as parameters. They limit the read to at most &amp;ldquo;n&amp;rdquo; elements and at most &amp;ldquo;duration&amp;rdquo; seconds of data from the recording (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10603">BEAM-10603&lt;/a>).&lt;/li>
&lt;li>Initial preview of &lt;a href="https://s.apache.org/simpler-python-pipelines-2020#slide=id.g905ac9257b_1_21">Dataframes&lt;/a> support.
See also example at apache_beam/examples/wordcount_dataframe.py&lt;/li>
&lt;li>Fixed support for type hints on &lt;code>@ptransform_fn&lt;/code> decorators in the Python SDK.
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-4091">BEAM-4091&lt;/a>)
This is not enabled by default to preserve backwards compatibility; use the
&lt;code>--type_check_additional=ptransform_fn&lt;/code> flag to enable. It may be enabled by
default in future versions of Beam.&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Python 2 and Python 3.5 support dropped (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10644">BEAM-10644&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-9372">BEAM-9372&lt;/a>).&lt;/li>
&lt;li>Pandas 1.x is now allowed. Older versions of Pandas may still be used, but may not be as well tested.&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Python transform ReadFromSnowflake has been moved from &lt;code>apache_beam.io.external.snowflake&lt;/code> to &lt;code>apache_beam.io.snowflake&lt;/code>. The previous path will be removed in a future version.&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>Dataflow streaming timers are once again not strictly time ordered when set earlier mid-bundle, as the fix for &lt;a href="https://issues.apache.org/jira/browse/BEAM-8543">BEAM-8543&lt;/a> introduced more severe bugs and has been rolled back.&lt;/li>
&lt;li>Default compressor change breaks Dataflow Python streaming job update compatibility. Please use Python SDK version &amp;lt;= 2.23.0 or &amp;gt; 2.25.0 if job update is critical. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11113">BEAM-11113&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.25.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alan Myrvold, Aldair Coronel Ruiz, Alexey Romanenko, Andrew Pilloud, Ankur Goenka,
Ayoub ENNASSIRI, Bipin Upadhyaya, Boyuan Zhang, Brian Hulette, Brian Michalski, Chad Dombrova,
Chamikara Jayalath, Damon Douglas, Daniel Oliveira, David Cavazos, David Janicek, Doug Roeper, Eric
Roshan-Eisner, Etta Rapp, Eugene Kirpichov, Filipe Regadas, Heejong Lee, Ihor Indyk, Irvi Firqotul
Aini, Ismaël Mejía, Jan Lukavský, Jayendra, Jiadai Xia, Jithin Sukumar, Jozsef Bartok, Kamil
Gałuszka, Kamil Wasilewski, Kasia Kucharczyk, Kenneth Jung, Kenneth Knowles, Kevin Puthusseri, Kevin
Sijo Puthusseri, KevinGG, Kyle Weaver, Leiyi Zhang, Lourens Naudé, Luke Cwik, Matthew Ouyang,
Maximilian Michels, Michal Walenia, Milan Cermak, Monica Song, Nelson Osacky, Neville Li, Ning Kang,
Pablo Estrada, Piotr Szuberski, Qihang, Rehman, Reuven Lax, Robert Bradshaw, Robert Burke, Rui Wang,
Saavan Nanavati, Sam Bourne, Sam Rohde, Sam Whittle, Sergiy Kolesnikov, Sindy Li, Siyuan Chen, Steve
Niemitz, Terry Xian, Thomas Weise, Tobiasz Kędzierski, Truc Le, Tyson Hamilton, Udi Meiri, Valentyn
Tymofieiev, Yichi Zhang, Yifan Mai, Yueyang Qiu, annaqin418, danielxjd, dennis, dp, fuyuwei,
lostluck, nehsyc, odeshpande, odidev, pulasthi, purbanow, rworley-monster, sclukas77, terryxian78,
tvalentyn, yoshiki.obata&lt;/p></description></item><item><title>Blog: Apache Beam 2.24.0</title><link>/blog/beam-2.24.0/</link><pubDate>Fri, 18 Sep 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.24.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.24.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2240-2020-09-18">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.24.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12347146">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Apache Beam 2.24.0 is the last release with Python 2 and Python 3.5
support.&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>New overloads for BigtableIO.Read.withKeyRange() and BigtableIO.Read.withRowFilter()
methods that take ValueProvider as a parameter (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10283">BEAM-10283&lt;/a>).&lt;/li>
&lt;li>The WriteToBigQuery transform (Python) in Dataflow Batch no longer relies on BigQuerySink by default. It relies on
a new, fully-featured transform based on file loads into BigQuery. To revert the behavior to the old implementation,
you may use &lt;code>--experiments=use_legacy_bq_sink&lt;/code>.&lt;/li>
&lt;li>Add cross-language support to Java&amp;rsquo;s JdbcIO, now available in the Python module &lt;code>apache_beam.io.jdbc&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10135">BEAM-10135&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-10136">BEAM-10136&lt;/a>).&lt;/li>
&lt;li>Add support of AWS SDK v2 for KinesisIO.Read (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9702">BEAM-9702&lt;/a>).&lt;/li>
&lt;li>Add streaming support to SnowflakeIO in Java SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9896">BEAM-9896&lt;/a>)&lt;/li>
&lt;li>Support reading and writing to Google Healthcare DICOM APIs in Python SDK (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10601">BEAM-10601&lt;/a>)&lt;/li>
&lt;li>Add dispositions for SnowflakeIO.write (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10343">BEAM-10343&lt;/a>)&lt;/li>
&lt;li>Add cross-language support to SnowflakeIO.Read now available in the Python module &lt;code>apache_beam.io.external.snowflake&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9897">BEAM-9897&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Shared library for simplifying management of large shared objects added to Python SDK. Example use case is sharing a large TF model object across threads (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10417">BEAM-10417&lt;/a>).&lt;/li>
&lt;li>Dataflow streaming timers are not strictly time ordered when set earlier mid-bundle (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8543">BEAM-8543&lt;/a>).&lt;/li>
&lt;li>OnTimerContext should not create a new one when processing each element/timer in FnApiDoFnRunner (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9839">BEAM-9839&lt;/a>)&lt;/li>
&lt;li>Key should be available in @OnTimer methods (Spark Runner) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9850">BEAM-9850&lt;/a>)&lt;/li>
&lt;/ul>
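&lt;p>To illustrate the idea behind sharing one large object across threads, here is a minimal standard-library sketch. It is not Beam&amp;rsquo;s implementation: the class &lt;code>SharedHandle&lt;/code>, its &lt;code>acquire&lt;/code> method, and the &lt;code>Model&lt;/code> placeholder are assumptions made for this example only.&lt;/p>

```python
import threading
import weakref


class SharedHandle:
    """Hypothetical stand-in that hands out one shared instance per handle."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ref = None  # weak reference to the live object, if any

    def acquire(self, constructor):
        """Return the cached object, building it once if no live copy exists."""
        with self._lock:
            obj = self._ref() if self._ref is not None else None
            if obj is None:
                obj = constructor()
                self._ref = weakref.ref(obj)
            return obj


class Model:
    """Placeholder for a large object such as a loaded model."""

    def __init__(self):
        self.weights = list(range(1000))


build_count = 0


def load_model():
    """Count how many times the expensive constructor actually runs."""
    global build_count
    build_count += 1
    return Model()


handle = SharedHandle()
results = []
keep_alive = handle.acquire(load_model)  # strong reference keeps the model alive


def worker():
    results.append(handle.acquire(load_model))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(build_count)
```

&lt;p>The weak reference lets the object be garbage collected once no caller holds it, while the lock ensures that concurrent callers do not construct it twice.&lt;/p>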
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>WriteToBigQuery transforms now require a GCS location, provided through either
&lt;code>custom_gcs_temp_location&lt;/code> in the constructor of WriteToBigQuery or the fallback option
&lt;code>--temp_location&lt;/code>. Alternatively, pass &lt;code>method=&amp;quot;STREAMING_INSERTS&amp;quot;&lt;/code> to WriteToBigQuery (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6928">BEAM-6928&lt;/a>).&lt;/li>
&lt;li>Python SDK now understands &lt;code>typing.FrozenSet&lt;/code> type hints, which are not interchangeable with &lt;code>typing.Set&lt;/code>. You may need to update your pipelines if type checking fails. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10197">BEAM-10197&lt;/a>)&lt;/li>
&lt;/ul>
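&lt;p>The distinction between &lt;code>typing.FrozenSet&lt;/code> and &lt;code>typing.Set&lt;/code> can be seen with the standard library alone. This is a small sketch that does not use Beam; the function &lt;code>normalize_tags&lt;/code> is a hypothetical example of an annotated callable whose hints a type checker would compare.&lt;/p>

```python
from typing import FrozenSet, Set, get_type_hints


def normalize_tags(tags: FrozenSet[str]) -> FrozenSet[str]:
    """Lower-case every tag; input and output are immutable sets."""
    return frozenset(tag.lower() for tag in tags)


hints = get_type_hints(normalize_tags)

# The declared hint matches FrozenSet[str] and is distinct from Set[str].
print(hints["tags"] == FrozenSet[str])
print(hints["tags"] == Set[str])

# At runtime a frozenset is not an instance of set either.
print(isinstance(frozenset({"a"}), set))
```

&lt;p>If a pipeline previously annotated a frozenset-producing function with &lt;code>typing.Set&lt;/code>, updating the annotation to &lt;code>typing.FrozenSet&lt;/code> is the kind of change this note refers to.&lt;/p>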
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>The default compressor change breaks Dataflow Python streaming job update compatibility. Please use Python SDK version &amp;lt;= 2.23.0 or &amp;gt; 2.25.0 if job update is critical (&lt;a href="https://issues.apache.org/jira/browse/BEAM-11113">BEAM-11113&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.24.0 release. Thank you to all contributors!&lt;/p>
&lt;p>adesormi, Ahmet Altay, Alex Amato, Alexey Romanenko, Andrew Pilloud, Ashwin Ramaswami, Borzoo,
Boyuan Zhang, Brian Hulette, Brian M, Bu Sun Kim, Chamikara Jayalath, Colm O hEigeartaigh,
Corvin Deboeser, Damian Gadomski, Damon Douglas, Daniel Oliveira, Dariusz Aniszewski,
davidak09, David Cavazos, David Moravek, David Yan, dhodun, Doug Roeper, Emil Hessman, Emily Ye,
Etienne Chauchot, Etta Rapp, Eugene Kirpichov, fuyuwei, Gleb Kanterov,
Harrison Green, Heejong Lee, Henry Suryawirawan, InigoSJ, Ismaël Mejía, Israel Herraiz,
Jacob Ferriero, Jan Lukavský, Jayendra, jfarr, jhnmora000, Jiadai Xia, JIahao wu, Jie Fan,
Jiyong Jung, Julius Almeida, Kamil Gałuszka, Kamil Wasilewski, Kasia Kucharczyk, Kenneth Knowles,
Kevin Puthusseri, Kyle Weaver, Łukasz Gajowy, Luke Cwik, Mark-Zeng, Maximilian Michels,
Michal Walenia, Niel Markwick, Ning Kang, Pablo Estrada, pawel.urbanowicz, Piotr Szuberski,
Rafi Kamal, rarokni, Rehman Murad Ali, Reuben van Ammers, Reuven Lax, Ricardo Bordon,
Robert Bradshaw, Robert Burke, Robin Qiu, Rui Wang, Saavan Nanavati, sabhyankar, Sam Rohde,
Scott Lukas, Siddhartha Thota, Simone Primarosa, Sławomir Andrian,
Steve Niemitz, Tobiasz Kędzierski, Tomo Suzuki, Tyson Hamilton, Udi Meiri,
Valentyn Tymofieiev, viktorjonsson, Xinyu Liu, Yichi Zhang, Yixing Zhang, yoshiki.obata,
Yueyang Qiu, zijiesong&lt;/p></description></item><item><title>Blog: Apache Beam 2.23.0</title><link>/blog/beam-2.23.0/</link><pubDate>Wed, 29 Jul 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.23.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.23.0 release of Apache Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2230-2020-07-29">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.23.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12347145">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Twister2 Runner (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7304">BEAM-7304&lt;/a>).&lt;/li>
&lt;li>Python 3.8 support (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8494">BEAM-8494&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Support for reading from Snowflake added (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9722">BEAM-9722&lt;/a>).&lt;/li>
&lt;li>Support for writing to Splunk added (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8596">BEAM-8596&lt;/a>).&lt;/li>
&lt;li>Support for assume role added (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10335">BEAM-10335&lt;/a>).&lt;/li>
&lt;li>A new transform to read from BigQuery has been added: &lt;code>apache_beam.io.gcp.bigquery.ReadFromBigQuery&lt;/code>. This transform
is experimental. It reads data from BigQuery by exporting data to Avro files, and reading those files. It also supports
reading data by exporting to JSON files. This has small differences in behavior for Time and Date-related fields. See
Pydoc for more information.&lt;/li>
&lt;li>Add dispositions for SnowflakeIO.write (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10343">BEAM-10343&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>Update Snowflake JDBC dependency and add application=beam to connection URL (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10383">BEAM-10383&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>&lt;code>RowJson.RowJsonDeserializer&lt;/code>, &lt;code>JsonToRow&lt;/code>, and &lt;code>PubsubJsonTableProvider&lt;/code> now accept &amp;ldquo;implicit
nulls&amp;rdquo; by default when deserializing JSON (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-10220">BEAM-10220&lt;/a>).
Previously nulls could only be represented with explicit null values, as in
&lt;code>{&amp;quot;foo&amp;quot;: &amp;quot;bar&amp;quot;, &amp;quot;baz&amp;quot;: null}&lt;/code>, whereas an implicit null like &lt;code>{&amp;quot;foo&amp;quot;: &amp;quot;bar&amp;quot;}&lt;/code> would raise an
exception. Now both JSON strings will yield the same result by default. This behavior can be
overridden with &lt;code>RowJson.RowJsonDeserializer#withNullBehavior&lt;/code>.&lt;/li>
&lt;li>Fixed a bug in &lt;code>GroupIntoBatches&lt;/code> experimental transform in Python to actually group batches by key.
This changes the output type for this transform (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6696">BEAM-6696&lt;/a>).&lt;/li>
&lt;/ul>
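&lt;p>The implicit-null behavior described for the Java JSON deserializers can be illustrated with a short Python analogue. This sketch uses only the standard &lt;code>json&lt;/code> module; the function &lt;code>json_to_row&lt;/code> and its &lt;code>accept_implicit_nulls&lt;/code> flag are hypothetical and only mimic the idea, they are not Beam APIs.&lt;/p>

```python
import json


def json_to_row(payload, field_names, accept_implicit_nulls=True):
    """Map a JSON object onto a fixed list of fields.

    Hypothetical analogue of the behavior described above: a missing key
    is treated as null unless accept_implicit_nulls is False.
    """
    data = json.loads(payload)
    row = {}
    for name in field_names:
        if name in data:
            row[name] = data[name]
        elif accept_implicit_nulls:
            row[name] = None
        else:
            raise ValueError("missing field: " + name)
    return row


fields = ["foo", "baz"]
explicit = json_to_row('{"foo": "bar", "baz": null}', fields)
implicit = json_to_row('{"foo": "bar"}', fields)

# With implicit nulls accepted, both inputs produce the same row.
print(explicit == implicit)
```

&lt;p>Passing &lt;code>accept_implicit_nulls=False&lt;/code> in this sketch restores the stricter behavior, in which the second input raises an error.&lt;/p>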
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Remove Gearpump runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9999">BEAM-9999&lt;/a>)&lt;/li>
&lt;li>Remove Apex runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9999">BEAM-9999&lt;/a>)&lt;/li>
&lt;li>RedisIO.readAll() is deprecated and will be removed in 2 versions; users must use RedisIO.readKeyPatterns() instead (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9747">BEAM-9747&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.23.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Aaron, Abhishek Yadav, Ahmet Altay, aiyangar, Aizhamal Nurmamat kyzy, Ajo Thomas, Akshay-Iyangar, Alan Pryor, Alex Amato, Alexey Romanenko, Allen Pradeep Xavier, Andrew Crites, Andrew Pilloud, Ankur Goenka, Anna Qin, Ashwin Ramaswami, bntnam, Borzoo Esmailloo, Boyuan Zhang, Brian Hulette, Brian Michalski, brucearctor, Chamikara Jayalath, chi-chi weng, Chuck Yang, Chun Yang, Colm O hEigeartaigh, Corvin Deboeser, Craig Chambers, Damian Gadomski, Damon Douglas, Daniel Oliveira, Dariusz Aniszewski, darshanj, darshan jani, David Cavazos, David Moravek, David Yan, Esun Kim, Etienne Chauchot, Filipe Regadas, fuyuwei, Graeme Morgan, Hannah-Jiang, Harch Vardhan, Heejong Lee, Henry Suryawirawan, InigoSJ, Ismaël Mejía, Israel Herraiz, Jacob Ferriero, Jan Lukavský, Jie Fan, John Mora, Jozef Vilcek, Julien Phalip, Justine Koa, Kamil Gabryjelski, Kamil Wasilewski, Kasia Kucharczyk, Kenneth Jung, Kenneth Knowles, kevingg, Kevin Sijo Puthusseri, kshivvy, Kyle Weaver, Kyoungha Min, Kyungwon Jo, Luke Cwik, Mark Liu, Mark-Zeng, Matthias Baetens, Maximilian Michels, Michal Walenia, Mikhail Gryzykhin, Nam Bui, Nathan Fisher, Niel Markwick, Ning Kang, Omar Ismail, Pablo Estrada, paul fisher, Pawel Pasterz, perkss, Piotr Szuberski, pulasthi, purbanow, Rahul Patwari, Rajat Mittal, Rehman, Rehman Murad Ali, Reuben van Ammers, Reuven Lax, Reza Rokni, Rion Williams, Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, sabhyankar, Sam Rohde, Sam Whittle, sclukas77, Sebastian Graca, Shoaib Zafar, Sruthi Sree Kumar, Stephen O&amp;rsquo;Kennedy, Steve Koonce, Steve Niemitz, Steven van Rossum, Ted Romer, Tesio, Thinh Ha, Thomas Weise, Tobias Kaymak, tobiaslieber-cognitedata, Tobiasz Kędzierski, Tomo Suzuki, Tudor Marian, tvs, Tyson Hamilton, Udi Meiri, Valentyn Tymofieiev, Vasu Nori, xuelianhan, Yichi Zhang, Yifan Zou, Yixing Zhang, yoshiki.obata, Yueyang Qiu, Yu Feng, Yuwei Fu, Zhuo Peng, ZijieSong946.&lt;/p></description></item><item><title>Blog: Apache Beam 
2.22.0</title><link>/blog/beam-2.22.0/</link><pubDate>Mon, 08 Jun 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.22.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.22.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2220-2020-06-08">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.22.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12347144">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Basic Kafka read/write support for DataflowRunner (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8019">BEAM-8019&lt;/a>).&lt;/li>
&lt;li>Sources and sinks for Google Healthcare APIs (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9468">BEAM-9468&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>&lt;code>--workerCacheMB&lt;/code> flag is supported in Dataflow streaming pipeline (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9964">BEAM-9964&lt;/a>)&lt;/li>
&lt;li>&lt;code>--direct_num_workers=0&lt;/code> is supported for FnApi runner. It will set the number of threads/subprocesses to number of cores of the machine executing the pipeline (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9443">BEAM-9443&lt;/a>).&lt;/li>
&lt;li>Python SDK now has experimental support for SqlTransform (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8603">BEAM-8603&lt;/a>).&lt;/li>
&lt;li>Add OnWindowExpiration method to Stateful DoFn (&lt;a href="https://issues.apache.org/jira/browse/BEAM-1589">BEAM-1589&lt;/a>).&lt;/li>
&lt;li>Added PTransforms for Google Cloud DLP (Data Loss Prevention) services integration (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9723">BEAM-9723&lt;/a>):
&lt;ul>
&lt;li>Inspection of data,&lt;/li>
&lt;li>Deidentification of data,&lt;/li>
&lt;li>Reidentification of data.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Add a more complete I/O support matrix in the documentation site (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9916">BEAM-9916&lt;/a>).&lt;/li>
&lt;li>Upgrade Sphinx to 3.0.3 for building PyDoc.&lt;/li>
&lt;li>Added a PTransform for image annotation using Google Cloud AI image processing service
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9646">BEAM-9646&lt;/a>)&lt;/li>
&lt;/ul>
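&lt;p>The rule for &lt;code>--direct_num_workers=0&lt;/code> can be sketched as follows. This is an illustration of the described behavior, not the runner&amp;rsquo;s actual code; the function name &lt;code>resolve_num_workers&lt;/code> is hypothetical.&lt;/p>

```python
import os


def resolve_num_workers(direct_num_workers):
    """Return the worker count, treating 0 as 'one per available core'."""
    if direct_num_workers == 0:
        # os.cpu_count() can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    return direct_num_workers


print(resolve_num_workers(0))
print(resolve_num_workers(4))
```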
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>The Python SDK now requires &lt;code>--job_endpoint&lt;/code> to be set when using &lt;code>--runner=PortableRunner&lt;/code> (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9860">BEAM-9860&lt;/a>). Users seeking the old default behavior should set &lt;code>--runner=FlinkRunner&lt;/code> instead.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.22.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, aiyangar, Ajo Thomas, Akshay-Iyangar, Alan Pryor, Alexey Romanenko, Allen Pradeep Xavier, amaliujia, Andrew Pilloud, Ankur Goenka, Ashwin Ramaswami, bntnam, Borzoo Esmailloo, Boyuan Zhang, Brian Hulette, Chamikara Jayalath, Colm O hEigeartaigh, Craig Chambers, Damon Douglas, Daniel Oliveira, David Cavazos, David Moravek, Esun Kim, Etienne Chauchot, Filipe Regadas, Graeme Morgan, Hannah Jiang, Hannah-Jiang, Harch Vardhan, Heejong Lee, Henry Suryawirawan, Ismaël Mejía, Israel Herraiz, Jacob Ferriero, Jan Lukavský, John Mora, Kamil Wasilewski, Kenneth Jung, Kenneth Knowles, kevingg, Kyle Weaver, Kyoungha Min, Kyungwon Jo, Luke Cwik, Mark Liu, Matthias Baetens, Maximilian Michels, Michal Walenia, Mikhail Gryzykhin, Nam Bui, Niel Markwick, Ning Kang, Omar Ismail, omarismail94, Pablo Estrada, paul fisher, pawelpasterz, Pawel Pasterz, Piotr Szuberski, Rahul Patwari, rarokni, Rehman, Rehman Murad Ali, Reuven Lax, Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Sam Rohde, Sam Whittle, Sebastian Graca, Shoaib Zafar, Sruthi Sree Kumar, Stephen O&amp;rsquo;Kennedy, Steve Koonce, Steve Niemitz, Steven van Rossum, Tesio, Thomas Weise, tobiaslieber-cognitedata, Tomo Suzuki, Tudor Marian, tvalentyn, Tyson Hamilton, Udi Meiri, Valentyn Tymofieiev, Vasu Nori, xuelianhan, Yichi Zhang, Yifan Zou, yoshiki.obata, Yueyang Qiu, Zhuo Peng&lt;/p></description></item><item><title>Blog: Apache Beam 2.21.0</title><link>/blog/beam-2.21.0/</link><pubDate>Wed, 27 May 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.21.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.21.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2210-2020-05-27">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.21.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12347143">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="ios">I/Os&lt;/h2>
&lt;ul>
&lt;li>Python: Deprecated module &lt;code>apache_beam.io.gcp.datastore.v1&lt;/code> has been removed
as the client it uses is out of date and does not support Python 3
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9529">BEAM-9529&lt;/a>).
Please migrate your code to use
&lt;a href="https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.datastore.v1new.datastoreio.html">apache_beam.io.gcp.datastore.&lt;strong>v1new&lt;/strong>&lt;/a>.
See the updated
&lt;a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py">datastore_wordcount&lt;/a>
for example usage.&lt;/li>
&lt;li>Python SDK: Added integration tests and updated batch write functionality for Google Cloud Spanner transform (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8949">BEAM-8949&lt;/a>).&lt;/li>
&lt;/ul>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Python SDK will now use Python 3 type annotations as pipeline type hints.
(&lt;a href="https://github.com/apache/beam/pull/10717">#10717&lt;/a>)&lt;/p>
&lt;p>If you suspect that this feature is causing your pipeline to fail, calling
&lt;code>apache_beam.typehints.disable_type_annotations()&lt;/code> before pipeline creation
will disable it completely, and decorating specific functions (such as
&lt;code>process()&lt;/code>) with &lt;code>@apache_beam.typehints.no_annotations&lt;/code> will disable it
for that function.&lt;/p>
&lt;p>More details can be found in
&lt;a href="/documentation/sdks/python-type-safety/">Ensuring Python Type Safety&lt;/a>
and the Python SDK Typing Changes
&lt;a href="/blog/python-typing/">blog post&lt;/a>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Java SDK: Introducing the concept of options in Beam Schemas. These options add extra
context to fields and schemas. This replaces the current Beam metadata that is present
in a FieldType only; options are available in fields and row schemas. Schema options are
fully typed and can contain complex rows. &lt;em>Remark: Schema aware is still experimental.&lt;/em>
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9035">BEAM-9035&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Java SDK: The protobuf extension is fully schema aware and also includes protobuf option
conversion to beam schema options. &lt;em>Remark: Schema aware is still experimental.&lt;/em>
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9044">BEAM-9044&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Added ability to write to BigQuery via Avro file loads (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8841">BEAM-8841&lt;/a>)&lt;/p>
&lt;p>By default, file loads will be done using JSON, but it is possible to
specify the temp_file_format parameter to perform file exports with AVRO.
AVRO-based file loads work by exporting Python types into Avro types, so
to switch to Avro-based loads, you will need to change your data types
from JSON-compatible types (string-type dates and timestamps, long numeric
values as strings) into Python native types that are written to Avro
(Python&amp;rsquo;s date, datetime types, decimal, etc). For more information
see &lt;a href="https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#avro_conversions">https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#avro_conversions&lt;/a>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Added integration of Java SDK with Google Cloud AI VideoIntelligence service
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9147">BEAM-9147&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Added integration of Java SDK with Google Cloud AI natural language processing API
(&lt;a href="https://issues.apache.org/jira/browse/BEAM-9634">BEAM-9634&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The &lt;code>docker-pull-licenses&lt;/code> tag was introduced. Licenses/notices of third party dependencies will be added to the docker images when &lt;code>docker-pull-licenses&lt;/code> is set.
The files are added to &lt;code>/opt/apache/beam/third_party_licenses/&lt;/code>.
By default, no licenses/notices are added to the docker images. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9136">BEAM-9136&lt;/a>)&lt;/p>
&lt;/li>
&lt;/ul>
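&lt;p>For the Avro-based BigQuery file loads mentioned in the list above, rows need Python native types rather than JSON-style strings. The following is a minimal sketch of that conversion using only the standard library. The field names and the helper &lt;code>to_native_types&lt;/code> are hypothetical, and the sketch does not call Beam or BigQuery.&lt;/p>

```python
import datetime
import decimal


def to_native_types(row):
    """Convert string-encoded values into Python native types.

    The field names below are hypothetical; adapt them to your schema.
    """
    return {
        "name": row["name"],
        "signup_date": datetime.date.fromisoformat(row["signup_date"]),
        "last_seen": datetime.datetime.fromisoformat(row["last_seen"]),
        "balance": decimal.Decimal(row["balance"]),
    }


json_style_row = {
    "name": "example",
    "signup_date": "2020-05-27",
    "last_seen": "2020-05-27T13:45:00",
    "balance": "12345678901234567890.12",
}
native_row = to_native_types(json_style_row)
print(native_row)
```

&lt;p>Using &lt;code>decimal.Decimal&lt;/code> for long numeric values keeps their full precision, which a float conversion would lose.&lt;/p>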
&lt;h2 id="breaking-changes">Breaking Changes&lt;/h2>
&lt;ul>
&lt;li>Dataflow runner now requires the &lt;code>--region&lt;/code> option to be set, unless a default value is set in the environment (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9199">BEAM-9199&lt;/a>). See &lt;a href="https://cloud.google.com/dataflow/docs/concepts/regional-endpoints">here&lt;/a> for more details.&lt;/li>
&lt;li>HBaseIO.ReadAll now requires a PCollection of HBaseIO.Read objects instead of HBaseQuery objects (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9279">BEAM-9279&lt;/a>).&lt;/li>
&lt;li>ProcessContext.updateWatermark has been removed in favor of using a WatermarkEstimator (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9430">BEAM-9430&lt;/a>).&lt;/li>
&lt;li>Coder inference for PCollection of Row objects has been disabled (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9569">BEAM-9569&lt;/a>).&lt;/li>
&lt;li>Go SDK docker images are no longer released until further notice.&lt;/li>
&lt;/ul>
&lt;h2 id="deprecations">Deprecations&lt;/h2>
&lt;ul>
&lt;li>Java SDK: Beam Schema FieldType.getMetadata is now deprecated and is replaced by the Beam
Schema Options. It will be removed in version &lt;code>2.23.0&lt;/code>. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9704">BEAM-9704&lt;/a>)&lt;/li>
&lt;li>The &lt;code>--zone&lt;/code> option in the Dataflow runner is now deprecated. Please use &lt;code>--worker_zone&lt;/code> instead. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-9716">BEAM-9716&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.21.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Aaron Meihm, Adrian Eka, Ahmet Altay, AldairCoronel, Alex Van Boxel, Alexey Romanenko, Andrew Crites, Andrew Pilloud, Ankur Goenka, Badrul (Taki) Chowdhury, Bartok Jozsef, Boyuan Zhang, Brian Hulette, brucearctor, bumblebee-coming, Chad Dombrova, Chamikara Jayalath, Chie Hayashida, Chris Gorgolewski, Chuck Yang, Colm O hEigeartaigh, Curtis &amp;ldquo;Fjord&amp;rdquo; Hawthorne, Daniel Mills, Daniel Oliveira, David Yan, Elias Djurfeldt, Emiliano Capoccia, Etienne Chauchot, Fernando Diaz, Filipe Regadas, Gleb Kanterov, Hai Lu, Hannah Jiang, Harch Vardhan, Heejong Lee, Henry Suryawirawan, Hk-tang, Ismaël Mejía, Jacoby, Jan Lukavský, Jeroen Van Goey, jfarr, Jozef Vilcek, Kai Jiang, Kamil Wasilewski, Kenneth Knowles, KevinGG, Kyle Weaver, Kyoungha Min, Luke Cwik, Maximilian Michels, Michal Walenia, Ning Kang, Pablo Estrada, paul fisher, Piotr Szuberski, Reuven Lax, Robert Bradshaw, Robert Burke, Rose Nguyen, Rui Wang, Sam Rohde, Sam Whittle, Spoorti Kundargi, Steve Koonce, sunjincheng121, Ted Yun, Tesio, Thomas Weise, Tomo Suzuki, Udi Meiri, Valentyn Tymofieiev, Vasu Nori, Yichi Zhang, yoshiki.obata, Yueyang Qiu&lt;/p></description></item><item><title>Blog: Apache Beam 2.20.0</title><link>/blog/beam-2.20.0/</link><pubDate>Wed, 15 Apr 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.20.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.20.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2190-2020-02-04">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.20.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12346780">detailed release notes&lt;/a>.&lt;/p>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;p>Python SDK: . (#10223).&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8561">BEAM-8561&lt;/a> Adds support for Thrift encoded data via ThriftIO&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7310">BEAM-7310&lt;/a> KafkaIO supports schema resolution using Confluent Schema Registry&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7246">BEAM-7246&lt;/a> Support for Google Cloud Spanner. This is an experimental module for reading and writing data from Google Cloud Spanner&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8399">BEAM-8399&lt;/a> Adds support for standard HDFS URLs (with server name)&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9146">BEAM-9146&lt;/a> New AnnotateVideo &amp;amp; AnnotateVideoWithContext PTransform&amp;rsquo;s that integrates GCP Video Intelligence functionality&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9247">BEAM-9247&lt;/a> New AnnotateImage &amp;amp; AnnotateImageWithContext PTransform&amp;rsquo;s for element-wise &amp;amp; batch image annotation using Google Cloud Vision API&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9258">BEAM-9258&lt;/a> Added a PTransform for inspection and deidentification of text using Google Cloud DLP&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9248">BEAM-9248&lt;/a> New AnnotateText PTransform that integrates Google Cloud Natural Language functionality&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9305">BEAM-9305&lt;/a> ReadFromBigQuery now supports value providers for the query string&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8841">BEAM-8841&lt;/a> Added ability to write to BigQuery via Avro file loads&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9228">BEAM-9228&lt;/a> Direct runner for FnApi supports further parallelism&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8550">BEAM-8550&lt;/a> Support for @RequiresTimeSortedInput in Flink and Spark&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6857">BEAM-6857&lt;/a> Added support for dynamic timers&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-3453">BEAM-3453&lt;/a> Backwards incompatible change in ReadFromPubSub(topic=) in Python&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9310">BEAM-9310&lt;/a> SpannerAccessor in Java is now package-private to reduce API surface&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8616">BEAM-8616&lt;/a> ParquetIO hadoop dependency should be now provided by the users&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9063">BEAM-9063&lt;/a> Docker images will be deployed to apache/beam repositories from 2.20&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9579">BEAM-9579&lt;/a> Fixed numpy operators in ApproximateQuantiles&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9277">BEAM-9277&lt;/a> Fixed exception when running in IPython notebook&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-1833">BEAM-1833&lt;/a> Restructure Python pipeline construction to better follow the Runner API&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9225">BEAM-9225&lt;/a> Fixed Flink uberjar job termination bug&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9503">BEAM-9503&lt;/a> Fixed SyntaxError in process worker startup&lt;/li>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9322">BEAM-9322&lt;/a> Python SDK ignores manually set PCollection tags&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9445">BEAM-9445&lt;/a> Python SDK pre_optimize=all experiment may cause error&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9725">BEAM-9725&lt;/a> Python SDK performance regression for reshuffle transform&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.20.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Andrew Pilloud, Ankur Goenka, Anton Kedin, Boyuan Zhang, Brian Hulette, Brian Martin, Chamikara Jayalath,
Charles Chen, Craig Chambers, Daniel Oliveira, David Moravek, David Rieber, Dustin Rhodes, Etienne Chauchot, Gleb Kanterov, Hai Lu, Heejong Lee,
Ismaël Mejía, J Ross Thomson, Jan Lukavský, Jason Kuster, Jean-Baptiste Onofré, Jeff Klukas, João Cabrita, Juan Rael, Juta, Kasia Kucharczyk,
Kengo Seki, Kenneth Jung, Kenneth Knowles, Kyle Weaver, Kyle Winkelman, Lukas Drbal, Marek Simunek, Mark Liu, Maximilian Michels, Melissa Pashniak,
Michael Luckey, Michal Walenia, Mike Pedersen, Mikhail Gryzykhin, Niel Markwick, Pablo Estrada, Pascal Gula, Rehman Murad Ali, Reuven Lax, Rob, Robbe Sneyders,
Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Ryan Williams, Sam Rohde, Sam Whittle, Scott Wegner, Shoaib Zafar, Thomas Weise, Tianyang Hu, Tyler Akidau,
Udi Meiri, Valentyn Tymofieiev, Xinyu Liu, XuMingmin, ttanay, tvalentyn, Łukasz Gajowy&lt;/p></description></item><item><title>Blog: Apache Beam 2.19.0</title><link>/blog/beam-2.19.0/</link><pubDate>Tue, 04 Feb 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.19.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.19.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2190-2020-02-04">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.19.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12346582">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Multiple improvements made to the Python SDK harness:
&lt;a href="https://issues.apache.org/jira/browse/BEAM-8624">BEAM-8624&lt;/a>,
&lt;a href="https://issues.apache.org/jira/browse/BEAM-8623">BEAM-8623&lt;/a>,
&lt;a href="https://issues.apache.org/jira/browse/BEAM-7949">BEAM-7949&lt;/a>,
&lt;a href="https://issues.apache.org/jira/browse/BEAM-8935">BEAM-8935&lt;/a>,
&lt;a href="https://issues.apache.org/jira/browse/BEAM-8816">BEAM-8816&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-1440">BEAM-1440&lt;/a> Create a BigQuery source (that implements iobase.BoundedSource) for Python SDK&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-2572">BEAM-2572&lt;/a> Implement an S3 filesystem for Python SDK&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5192">BEAM-5192&lt;/a> Support Elasticsearch 7.x&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8745">BEAM-8745&lt;/a> More fine-grained controls for the size of a BigQuery Load job&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8801">BEAM-8801&lt;/a> PubsubMessageToRow should not check useFlatSchema() in processElement&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8953">BEAM-8953&lt;/a> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8946">BEAM-8946&lt;/a> Report collection size from MongoDBIOIT&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8978">BEAM-8978&lt;/a> Report saved data size from HadoopFormatIOIT&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6008">BEAM-6008&lt;/a> Improve error reporting in Java/Python PortableRunner&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8296">BEAM-8296&lt;/a> Containerize the Spark job server&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8746">BEAM-8746&lt;/a> Allow the local job service to work from inside docker&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8837">BEAM-8837&lt;/a> PCollectionVisualizationTest: possible bug&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8139">BEAM-8139&lt;/a> Execute portable Spark application jar&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9019">BEAM-9019&lt;/a> Improve Spark Encoders (wrappers of beam coders)&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9053">BEAM-9053&lt;/a> Improve error message when unable to get the correct filesystem for specified path in Python SDK&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9055">BEAM-9055&lt;/a> Unify the config names of Fn Data API across languages&lt;/li>
&lt;/ul>
&lt;h3 id="sql">SQL&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5690">BEAM-5690&lt;/a> Issue with GroupByKey in BeamSql using SparkRunner&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8993">BEAM-8993&lt;/a> [SQL] MongoDb should use predicate push-down&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8844">BEAM-8844&lt;/a> [SQL] Create performance tests for BigQueryTable&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9023">BEAM-9023&lt;/a> Upgrade to ZetaSQL 2019.12.1&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8989">BEAM-8989&lt;/a> Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8402">BEAM-8402&lt;/a> Backwards incompatible change related to how Environments are represented in Python &lt;code>DirectRunner&lt;/code>.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9218">BEAM-9218&lt;/a> Template staging broken on Beam 2.18.0&lt;/li>
&lt;/ul>
&lt;h3 id="dependency-changes">Dependency Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8696">BEAM-8696&lt;/a> Beam Dependency Update Request: com.google.protobuf:protobuf-java&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8701">BEAM-8701&lt;/a> Beam Dependency Update Request: commons-io:commons-io&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8716">BEAM-8716&lt;/a> Beam Dependency Update Request: org.apache.commons:commons-csv&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8717">BEAM-8717&lt;/a> Beam Dependency Update Request: org.apache.commons:commons-lang3&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8749">BEAM-8749&lt;/a> Beam Dependency Update Request: com.datastax.cassandra:cassandra-driver-mapping&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5546">BEAM-5546&lt;/a> Beam Dependency Update Request: commons-codec:commons-codec&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9123">BEAM-9123&lt;/a> HadoopResourceId returns wrong directory name&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8962">BEAM-8962&lt;/a> FlinkMetricContainer causes churn in the JobManager and lets the web frontend malfunction&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5495">BEAM-5495&lt;/a> PipelineResources algorithm is not working in most environments&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8025">BEAM-8025&lt;/a> Cassandra IO classMethod test is flaky&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8577">BEAM-8577&lt;/a> FileSystems may not have been initialized during ResourceId deserialization&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8582">BEAM-8582&lt;/a> Python SDK emits duplicate records for Default and AfterWatermark triggers&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8943">BEAM-8943&lt;/a> SDK harness servers don&amp;rsquo;t shut down properly when SDK harness environment cleanup fails&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8995">BEAM-8995&lt;/a> apache_beam.io.gcp.bigquery_read_it_test failing on Py3.5 PC with: TypeError: the JSON object must be str, not &amp;lsquo;bytes&amp;rsquo;&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8999">BEAM-8999&lt;/a> PGBKCVOperation does not respect timestamp combiners&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9050">BEAM-9050&lt;/a> Beam pickler doesn&amp;rsquo;t pickle classes that have &lt;code>__module__&lt;/code> set to None.&lt;/li>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.19.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Andrew Pilloud, Ankur Goenka, Anton Kedin, Boyuan Zhang, Brian Hulette, Brian Martin, Chamikara Jayalath, Charles Chen, Craig Chambers, Daniel Oliveira, David Moravek, David Rieber, Dustin Rhodes, Etienne Chauchot, Gleb Kanterov, Hai Lu, Heejong Lee, Ismaël Mejía, Jan Lukavský, Jason Kuster, Jean-Baptiste Onofré, Jeff Klukas, João Cabrita, J Ross Thomson, Juan Rael, Juta, Kasia Kucharczyk, Kengo Seki, Kenneth Jung, Kenneth Knowles, Kyle Weaver, Kyle Winkelman, Lukas Drbal, Łukasz Gajowy, Marek Simunek, Mark Liu, Maximilian Michels, Melissa Pashniak, Michael Luckey, Michal Walenia, Mike Pedersen, Mikhail Gryzykhin, Niel Markwick, Pablo Estrada, Pascal Gula, Reuven Lax, Rob, Robbe Sneyders, Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Ryan Williams, Sam Rohde, Sam Whittle, Scott Wegner, Thomas Weise, Tianyang Hu, ttanay, tvalentyn, Tyler Akidau, Udi Meiri, Valentyn Tymofieiev, Xinyu Liu, XuMingmin&lt;/p></description></item><item><title>Blog: Apache Beam 2.18.0</title><link>/blog/beam-2.18.0/</link><pubDate>Thu, 23 Jan 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.18.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.18.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2180-2020-01-23">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.18.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&amp;amp;projectId=12319527">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8470">BEAM-8470&lt;/a> - Create a new Spark runner based on Spark Structured streaming framework&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7636">BEAM-7636&lt;/a> - Added SqsIO v2 support.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8513">BEAM-8513&lt;/a> - RabbitMqIO: Allow reads from exchange-bound queue without declaring the exchange.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8540">BEAM-8540&lt;/a> - Fix CSVSink example in FileIO docs&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5878">BEAM-5878&lt;/a> - Added support for DoFns with keyword-only arguments in Python 3.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6756">BEAM-6756&lt;/a> - Improved support for lazy iterables in schemas (Java).&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-4776">BEAM-4776&lt;/a> AND &lt;a href="https://issues.apache.org/jira/browse/BEAM-4777">BEAM-4777&lt;/a> - Added metrics support to portable runners.&lt;/li>
&lt;li>Various improvements to Interactive Beam: &lt;a href="https://issues.apache.org/jira/browse/BEAM-7760">BEAM-7760&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-8379">BEAM-8379&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-8016">BEAM-8016&lt;/a>.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8658">BEAM-8658&lt;/a> - Optionally set artifact staging port in FlinkUberJarJobServer.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8660">BEAM-8660&lt;/a> - Override returned artifact staging endpoint&lt;/li>
&lt;/ul>
&lt;h3 id="sql">SQL&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8343">BEAM-8343&lt;/a> - [SQL] Add means for IO APIs to support predicate and/or project push-down when running SQL pipelines. See also &lt;a href="https://issues.apache.org/jira/browse/BEAM-8468">BEAM-8468&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-8365">BEAM-8365&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-8508">BEAM-8508&lt;/a>.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8427">BEAM-8427&lt;/a> - [SQL] Add support for MongoDB source.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8456">BEAM-8456&lt;/a> - Add pipeline option to control truncate of BigQuery data processed by Beam SQL.&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8814">BEAM-8814&lt;/a> - &lt;code>--no_auth&lt;/code> flag changed to boolean type.&lt;/li>
&lt;/ul>
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8252">BEAM-8252&lt;/a> AND &lt;a href="https://issues.apache.org/jira/browse/BEAM-8254">BEAM-8254&lt;/a> Add worker_region and worker_zone options. Deprecated &lt;code>--zone&lt;/code> flag and &lt;code>--worker_region&lt;/code> experiment argument.&lt;/li>
&lt;/ul>
&lt;h3 id="dependency-changes">Dependency Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7078">BEAM-7078&lt;/a> - com.amazonaws:amazon-kinesis-client updated to 1.13.0.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8822">BEAM-8822&lt;/a> - Upgrade Hadoop dependencies to version 2.8.&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7917">BEAM-7917&lt;/a> - Python datastore v1new fails on retry.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7981">BEAM-7981&lt;/a> - ParDo function wrapper doesn&amp;rsquo;t support Iterable output types.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8146">BEAM-8146&lt;/a> - SchemaCoder/RowCoder have no equals() function.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8347">BEAM-8347&lt;/a> - UnboundedRabbitMqReader can fail to advance watermark if no new data comes in.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8352">BEAM-8352&lt;/a> - Reading records in background may lead to OOM errors&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8480">BEAM-8480&lt;/a> - Explicitly set restriction coder for bounded reader wrapper SDF.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8515">BEAM-8515&lt;/a> - Ensure that ValueProvider types have equals/hashCode implemented for comparison reasons.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8579">BEAM-8579&lt;/a> - Strip UTF-8 BOM bytes (if present) in TextSource.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8657">BEAM-8657&lt;/a> - Not doing Combiner lifting for data-driven triggers.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8663">BEAM-8663&lt;/a> - BundleBasedRunner Stacked Bundles don&amp;rsquo;t respect PaneInfo.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8667">BEAM-8667&lt;/a> - Data channel should avoid unlimited buffering in Python SDK.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8802">BEAM-8802&lt;/a> - Timestamp combiner not respected across bundles in streaming mode.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8803">BEAM-8803&lt;/a> - Default behaviour for Python BQ Streaming inserts sink should be to retry always.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8825">BEAM-8825&lt;/a> - OOM when writing large numbers of &amp;lsquo;narrow&amp;rsquo; rows.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8835">BEAM-8835&lt;/a> - Artifact retrieval fails with FlinkUberJarJobServer&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8836">BEAM-8836&lt;/a> - ExternalTransform is not providing a unique name&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8884">BEAM-8884&lt;/a> - Python MongoDBIO TypeError when splitting.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9041">BEAM-9041&lt;/a> - SchemaCoder equals should not rely on from/toRowFunction equality.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9042">BEAM-9042&lt;/a> - AvroUtils.schemaCoder(schema) produces a non-serializable SchemaCoder.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9065">BEAM-9065&lt;/a> - Spark runner accumulates metrics (incorrectly) between runs.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6303">BEAM-6303&lt;/a> - Add .parquet extension to files in ParquetIO.&lt;/li>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8882">BEAM-8882&lt;/a> - Python: &lt;code>beam.Create&lt;/code> no longer preserves order unless &lt;code>reshuffle=False&lt;/code> is passed in as an argument.&lt;/p>
&lt;p>You may encounter this issue when using DirectRunner.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9065">BEAM-9065&lt;/a> - Spark runner accumulates metrics (incorrectly) between runs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9123">BEAM-9123&lt;/a> - HadoopResourceId returns wrong directory name&lt;/p>
&lt;/li>
&lt;li>
&lt;p>See a full list of open &lt;a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.18.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect&lt;/a> this version.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9144">BEAM-9144&lt;/a> - If you are using Avro 1.9.x with Beam you should not upgrade to this version. There is an issue with timestamp conversions. A fix will be available in the next release.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.18.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Aizhamal Nurmamat kyzy, Alan Myrvold, Alexey Romanenko, Alex Van Boxel, Andre Araujo, Andrew Crites, Andrew Pilloud, Aryan Naraghi, Boyuan Zhang, Brian Hulette, bumblebee-coming, Cerny Ondrej, Chad Dombrova, Chamikara Jayalath, Changming Ma, Chun Yang, cmachgodaddy, Colm O hEigeartaigh, Craig Chambers, Daniel Oliveira, Daniel Robert, David Cavazos, David Moravek, David Song, dependabot[bot], Derek, Dmytro Sadovnychyi, Elliotte Rusty Harold, Etienne Chauchot, Hai Lu, Henry Suryawirawan, Ismaël Mejía, Jack Whelpton, Jan Lukavský, Jean-Baptiste Onofré, Jeff Klukas, Jincheng Sun, Jing, Jing Chen, Joe Tsai, Jonathan Alvarez-Gutierrez, Kamil Wasilewski, KangZhiDong, Kasia Kucharczyk, Kenneth Knowles, kirillkozlov, Kirill Kozlov, Kyle Weaver, liumomo315, lostluck, Łukasz Gajowy, Luke Cwik, Mark Liu, Maximilian Michels, Michal Walenia, Mikhail Gryzykhin, Niel Markwick, Ning Kang, nlofeudo, pabloem, Pablo Estrada, Pankaj Gudlani, Piotr Szczepanik, Primevenn, Reuven Lax, Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, RusOr10n, Ryan Skraba, Saikat Maitra, sambvfx, Sam Rohde, Samuel Husso, Stefano, Steve Koonce, Steve Niemitz, sunjincheng121, Thomas Weise, Tianyang Hu, Tim Robertson, Tomo Suzuki, tvalentyn, Udi Meiri, Valentyn Tymofieiev, Viola Lyu, Wenjia Liu, Yichi Zhang, Yifan Zou, yoshiki.obata, Yueyang Qiu, ziel, 康智冬&lt;/p></description></item><item><title>Blog: Apache Beam 2.17.0</title><link>/blog/beam-2.17.0/</link><pubDate>Mon, 06 Jan 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.17.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.17.0 release of Beam. This release includes both improvements and new functionality.
Users of the MongoDbIO connector are encouraged to upgrade to this release to address a &lt;a href="/security/CVE-2020-1929/">security vulnerability&lt;/a>.&lt;/p>
&lt;p>See the &lt;a href="/get-started/downloads/#2170-2020-01-06">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.17.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12345970&amp;amp;projectId=12319527">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7962">BEAM-7962&lt;/a> - Drop support for Flink 1.5 and 1.6&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7635">BEAM-7635&lt;/a> - Migrate SnsIO to AWS SDK for Java 2&lt;/li>
&lt;li>Improved usability for portable Flink Runner
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8183">BEAM-8183&lt;/a> - Optionally bundle multiple pipelines into a single Flink jar.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8372">BEAM-8372&lt;/a> - Allow submission of Flink UberJar directly to flink cluster.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8471">BEAM-8471&lt;/a> - Flink native job submission for portable pipelines.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8312">BEAM-8312&lt;/a> - Flink portable pipeline jars do not need to stage artifacts remotely.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7730">BEAM-7730&lt;/a> - Add Flink 1.9 build target and make FlinkRunner compatible with Flink 1.9.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7990">BEAM-7990&lt;/a> - Add ability to read parquet files into PCollection of pyarrow.Table.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8355">BEAM-8355&lt;/a> - Make BooleanCoder a standard coder.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8394">BEAM-8394&lt;/a> - Add withDataSourceConfiguration() method in JdbcIO.ReadRows class.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5428">BEAM-5428&lt;/a> - Implement cross-bundle state caching.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5967">BEAM-5967&lt;/a> - Add handling of DynamicMessage in ProtoCoder.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7473">BEAM-7473&lt;/a> - Update RestrictionTracker within Python to not be required to be thread safe.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7920">BEAM-7920&lt;/a> - Added AvroTableProvider to Beam SQL.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8098">BEAM-8098&lt;/a> - Improve documentation on BigQueryIO.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8100">BEAM-8100&lt;/a> - Add exception handling to Json transforms in Java SDK.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8306">BEAM-8306&lt;/a> - Improve estimation of data byte size reading from source in ElasticsearchIO.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8351">BEAM-8351&lt;/a> - Support passing in arbitrary KV pairs to sdk worker via external environment config.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8396">BEAM-8396&lt;/a> - Default to LOOPBACK mode for local flink (spark, &amp;hellip;) runner.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8410">BEAM-8410&lt;/a> - JdbcIO should support setConnectionInitSqls in its DataSource.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8609">BEAM-8609&lt;/a> - Add HllCount to Java transform catalog.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8861">BEAM-8861&lt;/a> - Disallow self-signed certificates by default in ElasticsearchIO.&lt;/li>
&lt;/ul>
&lt;h3 id="dependency-changes">Dependency Changes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8285">BEAM-8285&lt;/a> - Upgrade ZetaSQL to 2019.09.1.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8392">BEAM-8392&lt;/a> - Upgrade pyarrow version bounds to &amp;gt;=0.15.1, &amp;lt;0.16.0.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5895">BEAM-5895&lt;/a> - Upgrade com.rabbitmq:amqp-client to 5.7.3.&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6896">BEAM-6896&lt;/a> - Upgrade PyYAML version bounds to &amp;gt;=3.12, &amp;lt;6.0.0.&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8819">BEAM-8819&lt;/a> - AvroCoder for SpecificRecords is not serialized correctly since 2.13.0&lt;/li>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8989">BEAM-8989&lt;/a> Apache Nemo
runner broken due to backwards incompatible change since 2.16.0.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.17.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alan Myrvold, Alexey Romanenko, Andre-Philippe Paquet, Andrew
Pilloud, angulartist, Ankit Jhalaria, Ankur Goenka, Anton Kedin, Aryan Naraghi,
Aurélien Geron, B M VISHWAS, Bartok Jozsef, Boyuan Zhang, Brian Hulette, Cerny
Ondrej, Chad Dombrova, Chamikara Jayalath, ChethanU, cmach, Colm O hEigeartaigh,
Cyrus Maden, Daniel Oliveira, Daniel Robert, Dante, David Cavazos, David
Moravek, David Yan, Enrico Canzonieri, Etienne Chauchot, gxercavins, Hai Lu,
Hannah Jiang, Ian Lance Taylor, Ismaël Mejía, Israel Herraiz, James Wen, Jan
Lukavský, Jean-Baptiste Onofré, Jeff Klukas, jesusrv1103, Jofre, Kai Jiang,
Kamil Wasilewski, Kasia Kucharczyk, Kenneth Knowles, Kirill Kozlov,
kirillkozlov, Kohki YAMAGIWA, Kyle Weaver, Leonardo Alves Miguel, lloigor,
lostluck, Luis Enrique Ortíz Ramirez, Luke Cwik, Mark Liu, Maximilian Michels,
Michal Walenia, Mikhail Gryzykhin, mrociorg, Nicolas Delsaux, Ning Kang, NING
KANG, Pablo Estrada, pabloem, Piotr Szczepanik, rahul8383, Rakesh Kumar, Renat
Nasyrov, Reuven Lax, Robert Bradshaw, Robert Burke, Rui Wang, Ruslan Altynnikov,
Ryan Skraba, Salman Raza, Saul Chavez, Sebastian Jambor, sunjincheng121, Tatu
Saloranta, tchiarato, Thomas Weise, Tomo Suzuki, Tudor Marian, tvalentyn, Udi
Meiri, Valentyn Tymofieiev, Viola Lyu, Vishwas, Yichi Zhang, Yifan Zou, Yueyang
Qiu, Łukasz Gajowy&lt;/p></description></item><item><title>Blog: Apache Beam 2.16.0</title><link>/blog/beam-2.16.0/</link><pubDate>Mon, 07 Oct 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.16.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.16.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2160-2019-10-07">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.16.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12345494">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Customizable Docker container images released and supported by Beam portable runners on Python 2.7, 3.5, 3.6, 3.7. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7907">BEAM-7907&lt;/a>)&lt;/li>
&lt;li>Integration improvements for Python Streaming on Dataflow including service features like autoscaling, drain, update, streaming engine and counter updates.&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7013">BEAM-7013&lt;/a>)&lt;/li>
&lt;li>Element counters in the Web UI graph representations for transforms for Python streaming jobs in Google Cloud Dataflow. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7045">BEAM-7045&lt;/a>)&lt;/li>
&lt;li>Add SetState in Python SDK. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7741">BEAM-7741&lt;/a>)&lt;/li>
&lt;li>Add hot key detection to Dataflow Runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7820">BEAM-7820&lt;/a>)&lt;/li>
&lt;li>Add ability to get the list of submitted jobs from gRPC JobService. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7927">BEAM-7927&lt;/a>)&lt;/li>
&lt;li>Portable Flink pipelines can now be bundled into executable jars. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7966">BEAM-7966&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-7967">BEAM-7967&lt;/a>)&lt;/li>
&lt;li>SQL join selection should be done in planner, not in expansion to PTransform. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6114">BEAM-6114&lt;/a>)&lt;/li>
&lt;li>A Python Sink for BigQuery with File Loads in Streaming. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6611">BEAM-6611&lt;/a>)&lt;/li>
&lt;li>Python BigQuery sink should be able to handle 15TB load job quota. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7588">BEAM-7588&lt;/a>)&lt;/li>
&lt;li>Spark portable runner: reuse SDK harness. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7600">BEAM-7600&lt;/a>)&lt;/li>
&lt;li>BigQuery File Loads to work well with load job size limits. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7742">BEAM-7742&lt;/a>)&lt;/li>
&lt;li>External environment with containerized worker pool. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7980">BEAM-7980&lt;/a>)&lt;/li>
&lt;li>Use OffsetRange as restriction for OffsetRestrictionTracker. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8014">BEAM-8014&lt;/a>)&lt;/li>
&lt;li>Get logs for SDK worker Docker containers. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8015">BEAM-8015&lt;/a>)&lt;/li>
&lt;li>PCollection boundedness is tracked and propagated in the Python SDK. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8088">BEAM-8088&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="dependency-changes">Dependency Changes&lt;/h3>
&lt;ul>
&lt;li>Upgrade &amp;ldquo;com.amazonaws:amazon-kinesis-producer&amp;rdquo; to version 0.13.1. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7894">BEAM-7894&lt;/a>)&lt;/li>
&lt;li>Upgrade to joda time 2.10.3 to get updated TZDB. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8161">BEAM-8161&lt;/a>)&lt;/li>
&lt;li>Upgrade Jackson to version 2.9.10. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8299">BEAM-8299&lt;/a>)&lt;/li>
&lt;li>Upgrade grpcio minimum required version to 1.12.1. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7986">BEAM-7986&lt;/a>)&lt;/li>
&lt;li>Upgrade funcsigs minimum required version to 1.0.2 in Python2. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7060">BEAM-7060&lt;/a>)&lt;/li>
&lt;li>Upgrade google-cloud-pubsub maximum required version to 1.0.0. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5539">BEAM-5539&lt;/a>)&lt;/li>
&lt;li>Upgrade google-cloud-bigtable maximum required version to 1.0.0. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5539">BEAM-5539&lt;/a>)&lt;/li>
&lt;li>Upgrade dill version to 0.3.0. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8324">BEAM-8324&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>Given that Python 2 will reach EOL on Jan 1 2020, Python 2 users of Beam will now receive a warning that new releases of Apache Beam will soon support Python 3 only.&lt;/li>
&lt;li>Filesystems not properly registered using FileIO.write in FlinkRunner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8303">BEAM-8303&lt;/a>)&lt;/li>
&lt;li>Performance regression in Java DirectRunner in streaming mode. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8363">BEAM-8363&lt;/a>)&lt;/li>
&lt;li>Can&amp;rsquo;t install the Python SDK on macOS 10.15. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8368">BEAM-8368&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.16.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Van Boxel, Alexey Romanenko, Alexey Strokach, Alireza Samadian,
Andre-Philippe Paquet, Andrew Pilloud, Ankur Goenka, Anton Kedin, Aryan Naraghi,
B M VISHWAS, Bartok Jozsef, Bill Neubauer, Boyuan Zhang, Brian Hulette, Bruno Volpato,
Chad Dombrova, Chamikara Jayalath, Charith Ellawala, Charles Chen, Claire McGinty,
Cyrus Maden, Daniel Oliveira, Dante, David Cavazos, David Moravek, David Yan,
Dominic Mitchell, Elias Djurfeldt, Enrico Canzonieri, Etienne Chauchot, Gleb Kanterov,
Hai Lu, Hannah Jiang, Heejong Lee, Ian Lance Taylor, Ismaël Mejía, Jack Whelpton,
James Wen, Jan Lukavský, Jean-Baptiste Onofré, Jofre, Kai Jiang, Kamil Wasilewski,
Kasia Kucharczyk, Kenneth Jung, Kenneth Knowles, Kirill Kozlov, Kohki YAMAGIWA,
Kyle Weaver, Kyle Winkelman, Ludovic Post, Luis Enrique Ortíz Ramirez, Luke Cwik,
Mark Liu, Maximilian Michels, Michal Walenia, Mike Kaplinskiy, Mikhail Gryzykhin,
NING KANG, Oliver Henlich, Pablo Estrada, Rakesh Kumar, Renat Nasyrov, Reuven Lax,
Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Ryan Skraba, Sahith Nallapareddy,
Salman Raza, Sam Rohde, Saul Chavez, Shoaib, Shoaib Zafar, Slava Chernyak, Tanay Tummalapalli,
Thinh Ha, Thomas Weise, Tianzi Cai, Tim van der Lippe, Tomer Zeltzer, Tudor Marian,
Udi Meiri, Valentyn Tymofieiev, Yichi Zhang, Yifan Zou, Yueyang Qiu, gxercavins,
jesusrv1103, lostluck, matt-darwin, mrociorg, ostrokach, parahul, rahul8383, rosetn,
sunjincheng121, the1plummie, ttanay, tvalentyn, venn001, yoshiki.obata, Łukasz Gajowy&lt;/p></description></item><item><title>Blog: Apache Beam 2.15.0</title><link>/blog/beam-2.15.0/</link><pubDate>Thu, 22 Aug 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.15.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.15.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2150-2019-08-22">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.15.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12345489">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Vendored Guava was upgraded to version 26.0.&lt;/li>
&lt;li>Support multi-process execution on the FnApiRunner for Python. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-3645">BEAM-3645&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Add AvroIO.sink for IndexedRecord (FileIO compatible). (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6480">BEAM-6480&lt;/a>)&lt;/li>
&lt;li>Add support for writing to BigQuery clustered tables. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5191">BEAM-5191&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Support ParquetTable in SQL. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7728">BEAM-7728&lt;/a>)&lt;/li>
&lt;li>Add hot key detection to Dataflow Runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7820">BEAM-7820&lt;/a>)&lt;/li>
&lt;li>Support schemas in the JDBC sink. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6675">BEAM-6675&lt;/a>)&lt;/li>
&lt;li>Report GCS throttling time to Dataflow autoscaler for better autoscaling. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7667">BEAM-7667&lt;/a>)&lt;/li>
&lt;li>Support transform_name_mapping option in Python SDK for &lt;code>--update&lt;/code> use. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7761">BEAM-7761&lt;/a>)&lt;/li>
&lt;li>Dependency: Upgrade Jackson databind to version 2.9.9.3 (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7880">BEAM-7880&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-7616">BEAM-7616&lt;/a> urlopen calls may get stuck. (Regression from 2.14.0)&lt;/li>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8111">BEAM-8111&lt;/a> SchemaCoder fails on Dataflow, preventing the use of SqlTransform and schema-aware transforms. (Regression from 2.14.0)&lt;/li>
&lt;li>(&lt;a href="https://issues.apache.org/jira/browse/BEAM-8368">BEAM-8368&lt;/a>) Can&amp;rsquo;t install the Python SDK on macOS 10.15.&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>The &lt;code>--region&lt;/code> flag will become required for Dataflow in a future release. This release adds a warning about the upcoming change. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7833">BEAM-7833&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.15.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alexey Romanenko, Alex Goos, Alireza Samadian, Andrew Pilloud, Ankur Goenka,
Anton Kedin, Aryan Naraghi, Bartok Jozsef, bmv126, B M VISHWAS, Boyuan Zhang,
Brian Hulette, brucearctor, Cade Markegard, Cam Mach, Chad Dombrova,
Chaim Turkel, Chamikara Jayalath, Charith Ellawala, Claire McGinty, Craig Chambers,
Daniel Oliveira, David Cavazos, David Moravek, Dominic Mitchell, Dustin Rhodes,
Etienne Chauchot, Filipe Regadas, Gleb Kanterov, Gunnar Schulze, Hannah Jiang,
Heejong Lee, Henry Suryawirawan, Ismaël Mejía, Ivo Galic, Jan Lukavský,
Jawad, Juta, Juta Staes, Kai Jiang, Kamil Wasilewski, Kasia Kucharczyk,
Kenneth Jung, Kenneth Knowles, Kyle Weaver, Lily Li, Logan HAUSPIE, lostluck,
Łukasz Gajowy, Luke Cwik, Mark Liu, Matt Helm, Maximilian Michels,
Michael Luckey, Mikhail Gryzykhin, Neville Li, Nicholas Rucci, pabloem,
Pablo Estrada, Paul King, Paul Suganthan, Raheel Khan, Rakesh Kumar,
Reza Rokni, Robert Bradshaw, Robert Burke, rosetn, Rui Wang, Ryan Skraba, RyanSkraba,
Sahith Nallapareddy, Sam Rohde, Sam Whittle, Steve Niemitz, Tanay Tummalapalli, Thomas Weise,
Tianyang Hu, ttanay, tvalentyn, Udi Meiri, Valentyn Tymofieiev, Wout Scheepers,
yanzhi, Yekut, Yichi Zhang, Yifan Zou, yoshiki.obata, Yueyang Qiu, Yunqing Zhou&lt;/p></description></item><item><title>Blog: Apache Beam 2.14.0</title><link>/blog/beam-2.14.0/</link><pubDate>Wed, 31 Jul 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.14.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.14.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2140-2019-08-01">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.14.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12345431">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;ul>
&lt;li>Python 3 support is extended to Python 3.6 and 3.7, in addition to various other Python 3 &lt;a href="https://issues.apache.org/jira/browse/BEAM-1251?focusedCommentId=16890504&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16890504">improvements&lt;/a>.&lt;/li>
&lt;li>Spark portable runner (batch) now &lt;a href="https://lists.apache.org/thread.html/c43678fc24c9a1dc9f48c51c51950aedcb9bc0fd3b633df16c3d595a@%3Cuser.beam.apache.org%3E">available&lt;/a> for Java, Python, Go.&lt;/li>
&lt;li>Added new runner: Hazelcast Jet Runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7305">BEAM-7305&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Schema support added to BigQuery reads. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6673">BEAM-6673&lt;/a>)&lt;/li>
&lt;li>Schema support added to JDBC source. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6674">BEAM-6674&lt;/a>)&lt;/li>
&lt;li>BigQuery support for &lt;code>bytes&lt;/code> is fixed. (Python 3) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6769">BEAM-6769&lt;/a>)&lt;/li>
&lt;li>Added DynamoDB IO. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7043">BEAM-7043&lt;/a>)&lt;/li>
&lt;li>Added support for unbounded reads with HCatalogIO. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7450">BEAM-7450&lt;/a>)&lt;/li>
&lt;li>Added BoundedSource wrapper for SDF. (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7443">BEAM-7443&lt;/a>)&lt;/li>
&lt;li>Added support for INCRBY/DECRBY operations in RedisIO. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7286">BEAM-7286&lt;/a>)&lt;/li>
&lt;li>Added support for ValueProvider-defined GCS location for WriteToBigQuery with File Loads. (Java) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7603">BEAM-7603&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Python SDK adds support for DoFn &lt;code>setup&lt;/code> and &lt;code>teardown&lt;/code> methods. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-562">BEAM-562&lt;/a>)&lt;/li>
&lt;li>Python SDK adds new transforms: &lt;a href="https://issues.apache.org/jira/browse/BEAM-6693">ApproximateUnique&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-6695">Latest&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-7019">Reify&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-7021">ToString&lt;/a>, &lt;a href="https://issues.apache.org/jira/browse/BEAM-7023">WithKeys&lt;/a>.&lt;/li>
&lt;li>Added hook for user-defined JVM initialization in workers. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6872">BEAM-6872&lt;/a>)&lt;/li>
&lt;li>Added support for SQL Row Estimation for BigQueryTable. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7513">BEAM-7513&lt;/a>)&lt;/li>
&lt;li>Auto sharding of streaming sinks in FlinkRunner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-5865">BEAM-5865&lt;/a>)&lt;/li>
&lt;li>Removed the Hadoop dependency from the external sorter. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7268">BEAM-7268&lt;/a>)&lt;/li>
&lt;li>Added option to expire portable SDK worker environments. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7348">BEAM-7348&lt;/a>)&lt;/li>
&lt;li>Beam does not relocate Guava anymore and depends only on its own vendored version of Guava. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6620">BEAM-6620&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>Deprecated set/getClientConfiguration in Jdbc IO. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7263">BEAM-7263&lt;/a>)&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Fixed reading of concatenated compressed files. (Python) (&lt;a href="https://issues.apache.org/jira/browse/BEAM-6952">BEAM-6952&lt;/a>)&lt;/li>
&lt;li>Fixed re-scaling issues on Flink &amp;gt;= 1.6 versions. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7144">BEAM-7144&lt;/a>)&lt;/li>
&lt;li>Fixed SQL EXCEPT DISTINCT behavior. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7194">BEAM-7194&lt;/a>)&lt;/li>
&lt;li>Fixed OOM issues with bounded Reads for Flink Runner. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7442">BEAM-7442&lt;/a>)&lt;/li>
&lt;li>Fixed HdfsFileSystem to correctly match directories. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7561">BEAM-7561&lt;/a>)&lt;/li>
&lt;li>Upgraded Spark runner to use Spark version 2.4.3. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7265">BEAM-7265&lt;/a>)&lt;/li>
&lt;li>Upgraded Jackson to version 2.9.9. (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7465">BEAM-7465&lt;/a>)&lt;/li>
&lt;li>Various other bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="known-issues">Known Issues&lt;/h3>
&lt;ul>
&lt;li>Do &lt;strong>NOT&lt;/strong> use Python MongoDB source in this release. Python MongoDB source &lt;a href="https://issues.apache.org/jira/browse/BEAM-5148">added&lt;/a> in this release has a known issue that can result in data loss. See (&lt;a href="https://issues.apache.org/jira/browse/BEAM-7866">BEAM-7866&lt;/a>) for details.&lt;/li>
&lt;li>Can&amp;rsquo;t install the Python SDK on macOS 10.15. See (&lt;a href="https://issues.apache.org/jira/browse/BEAM-8368">BEAM-8368&lt;/a>) for details.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.14.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Aizhamal Nurmamat kyzy, Ajo Thomas, Alex Amato, Alexey Romanenko,
Alexey Strokach, Alex Van Boxel, Alireza Samadian, Andrew Pilloud,
Ankit Jhalaria, Ankur Goenka, Anton Kedin, Aryan Naraghi, Bartok Jozsef,
Bora Kaplan, Boyuan Zhang, Brian Hulette, Cam Mach, Chamikara Jayalath,
Charith Ellawala, Charles Chen, Colm O hEigeartaigh, Cyrus Maden,
Daniel Mills, Daniel Oliveira, David Cavazos, David Moravek, David Yan,
Daniel Lescohier, Elwin Arens, Etienne Chauchot, Fábio Franco Uechi,
Finch Keung, Frederik Bode, Gregory Kovelman, Graham Polley, Hai Lu, Hannah Jiang,
Harshit Dwivedi, Harsh Vardhan, Heejong Lee, Henry Suryawirawan,
Ismaël Mejía, Jan Lukavský, Jean-Baptiste Onofré, Jozef Vilcek, Juta, Kai Jiang,
Kamil Wu, Kasia Kucharczyk, Kenneth Knowles, Kyle Weaver, Lara Schmidt,
Łukasz Gajowy, Luke Cwik, Manu Zhang, Mark Liu, Matthias Baetens,
Maximilian Michels, Melissa Pashniak, Michael Luckey, Michal Walenia,
Mikhail Gryzykhin, Ming Liang, Neville Li, Pablo Estrada, Paul Suganthan,
Peter Backx, Rakesh Kumar, Rasmi Elasmar, Reuven Lax, Reza Rokni, Robbe Sneyders,
Robert Bradshaw, Robert Burke, Rose Nguyen, Rui Wang, Ruoyun Huang,
Shoaib Zafar, Slava Chernyak, Steve Niemitz, Tanay Tummalapalli, Thomas Weise,
Tim Robertson, Tim van der Lippe, Udi Meiri, Valentyn Tymofieiev, Varun Dhussa,
Viktor Gerdin, Yichi Zhang, Yifan Mai, Yifan Zou, Yueyang Qiu.&lt;/p></description></item><item><title>Blog: Apache Beam 2.13.0</title><link>/blog/beam-2.13.0/</link><pubDate>Fri, 07 Jun 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.13.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.13.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2130-2019-05-21">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.13.0, check out the
&lt;a href="https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12345166">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Support reading query results with the BigQuery storage API.&lt;/li>
&lt;li>Support KafkaIO to be configured externally for use with other SDKs.&lt;/li>
&lt;li>BigQuery IO now supports BYTES datatype on Python 3.&lt;/li>
&lt;li>Avro IO support enabled on Python 3.&lt;/li>
&lt;li>For Python 3 pipelines, the default Avro library used by Beam AvroIO and Dataflow workers was switched from avro-python3 to fastavro.&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Flink 1.8 support added.&lt;/li>
&lt;li>Support to run word count on Portable Spark runner.&lt;/li>
&lt;li>ElementCount metrics in FnApi Dataflow Runner.&lt;/li>
&lt;li>Support to create BinaryCombineFn from lambdas.&lt;/li>
&lt;/ul>
&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
&lt;ul>
&lt;li>When writing the BYTES datatype to BigQuery with Beam BigQuery IO on the Python DirectRunner, users need to base64-encode bytes values before passing them to BigQuery IO. Accordingly, when reading bytes data from BigQuery, the IO will also return base64-encoded bytes. This change only affects BigQuery IO on the Python DirectRunner. The new DirectRunner behavior is consistent with the treatment of bytes by Beam Java BigQuery IO and the Python Dataflow Runner.&lt;/li>
&lt;/ul>
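&lt;p>The base64 round trip described in the breaking change above could look roughly like the following sketch. It uses only the Python standard library. The helper names &lt;code>encode_bytes_fields&lt;/code> and &lt;code>decode_bytes_fields&lt;/code> and the sample row are hypothetical and are not part of Beam; the sketch only illustrates the encoding step, not the BigQuery IO calls themselves.&lt;/p>

```python
import base64


def encode_bytes_fields(row, bytes_fields):
    """Return a copy of row with the named bytes fields base64-encoded.

    Hypothetical helper: apply it to each row before handing the row to the sink.
    """
    encoded = dict(row)
    for name in bytes_fields:
        encoded[name] = base64.b64encode(row[name])
    return encoded


def decode_bytes_fields(row, bytes_fields):
    """Return a copy of row with the named base64 fields decoded to raw bytes.

    Hypothetical helper: apply it to each row that comes back from a read.
    """
    decoded = dict(row)
    for name in bytes_fields:
        decoded[name] = base64.b64decode(row[name])
    return decoded


# Sample row with one BYTES column (assumed column names).
row = {"id": 1, "payload": b"\x00\x01beam"}

encoded = encode_bytes_fields(row, ["payload"])
restored = decode_bytes_fields(encoded, ["payload"])
```

&lt;p>In a pipeline, helpers like these would typically be applied per element, for example in a map step placed before the write and after the read, so that the rest of the pipeline keeps working with raw bytes.&lt;/p>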
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed to the 2.13.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Aaron Li, Ahmet Altay, Aizhamal Nurmamat kyzy, Alex Amato, Alexey Romanenko,
Andrew Pilloud, Ankur Goenka, Anton Kedin, apstndb, Boyuan Zhang, Brian Hulette,
Brian Quinlan, Chamikara Jayalath, Cyrus Maden, Daniel Chen, Daniel Oliveira,
David Cavazos, David Moravek, David Yan, EdgarLGB, Etienne Chauchot, frederik2,
Gleb Kanterov, Harshit Dwivedi, Harsh Vardhan, Heejong Lee, Hennadiy Leontyev,
Henri-Mayeul de Benque, Ismaël Mejía, Jae-woo Kim, Jamie Kirkpatrick, Jan Lukavský,
Jason Kuster, Jean-Baptiste Onofré, JohnZZGithub, Jozef Vilcek, Juta, Kenneth Jung,
Kenneth Knowles, Kyle Weaver, Łukasz Gajowy, Luke Cwik, Mark Liu, Mathieu Blanchard,
Maximilian Michels, Melissa Pashniak, Michael Luckey, Michal Walenia, Mike Kaplinskiy,
Mike Pedersen, Mikhail Gryzykhin, Mikhail-Ivanov, Niklas Hansson, pabloem,
Pablo Estrada, Pranay Nanda, Reuven Lax, Richard Moorhead, Robbe Sneyders,
Robert Bradshaw, Robert Burke, Roman van der Krogt, rosetn, Rui Wang, Ryan Yuan,
Sam Whittle, sudhan499, Sylwester Kardziejonek, Ted, Thomas Weise, Tim Robertson,
ttanay, tvalentyn, Udi Meiri, Valentyn Tymofieiev, Xinyu Liu, Yifan Zou,
yoshiki.obata, Yueyang Qiu&lt;/p></description></item><item><title>Blog: Apache Beam 2.12.0</title><link>/blog/beam-2.12.0/</link><pubDate>Thu, 25 Apr 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.12.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.12.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2120-2019-04-25">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.12.0, check out the
&lt;a href="https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12344944">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Add support for a BigQuery custom sink for Python SDK.&lt;/li>
&lt;li>Add support to specify a query in CassandraIO for Java SDK.&lt;/li>
&lt;li>Add experimental support for cross-language transforms,
please see &lt;a href="https://issues.apache.org/jira/browse/BEAM-6730">BEAM-6730&lt;/a>&lt;/li>
&lt;li>Add support in the Flink Runner for exactly-once Writes with KafkaIO&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Enable Bundle Finalization in Python SDK for portable runners.&lt;/li>
&lt;li>Add support to the Java SDK harness to merge windows.&lt;/li>
&lt;li>Add Kafka Sink EOS support on Flink runner.&lt;/li>
&lt;li>Added a dead letter queue to Python streaming BigQuery sink.&lt;/li>
&lt;li>Experimental Python 3.6 and 3.7 workloads enabled.
Beam 2.12 supports starting Dataflow pipelines under Python 3.6 and 3.7; however, 3.5 remains the only recommended minor version for the Dataflow runner. In addition to the announced 2.11 limitations, Beam typehint annotations are currently not supported on Python &amp;gt;= 3.6.&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed
to the 2.12.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmed El.Hussaini, Ahmet Altay, Alan Myrvold, Alex Amato, Alexander Savchenko,
Alexey Romanenko, Andrew Brampton, Andrew Pilloud, Ankit Jhalaria,
Ankur Goenka, Anton Kedin, Boyuan Zhang, Brian Hulette, Chamikara Jayalath,
Charles Chen, Colm O hEigeartaigh, Craig Chambers, Dan Duong, Daniel Mescheder,
Daniel Oliveira, David Moravek, David Rieber, David Yan, Eric Roshan-Eisner,
Etienne Chauchot, Gleb Kanterov, Heejong Lee, Ho Tien Vu, Ismaël Mejía,
Jan Lukavský, Jean-Baptiste Onofré, Jeff Klukas, Juta, Kasia Kucharczyk,
Kengo Seki, Kenneth Jung, Kenneth Knowles, kevin, Kyle Weaver, Kyle Winkelman,
Łukasz Gajowy, Mark Liu, Mathieu Blanchard, Max Charas, Maximilian Michels,
Melissa Pashniak, Michael Luckey, Michal Walenia, Mike Kaplinskiy,
Mikhail Gryzykhin, Niel Markwick, Pablo Estrada, Radoslaw Stankiewicz,
Reuven Lax, Robbe Sneyders, Robert Bradshaw, Robert Burke, Rui Wang,
Ruoyun Huang, Ryan Williams, Slava Chernyak, Shahar Frank, Sunil Pedapudi,
Thomas Weise, Tim Robertson, Tanay Tummalapalli, Udi Meiri,
Valentyn Tymofieiev, Xinyu Liu, Yifan Zou, Yueyang Qiu&lt;/p></description></item><item><title>Blog: Apache Beam 2.11.0</title><link>/blog/beam-2.11.0/</link><pubDate>Tue, 05 Mar 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.11.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.11.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2110-2019-02-26">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.11.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12344775">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;h3 id="dependency-upgradeschanges">Dependency Upgrades/Changes&lt;/h3>
&lt;ul>
&lt;li>Java: antlr: 4.7&lt;/li>
&lt;li>Java: antlr_runtime: 4.7&lt;/li>
&lt;li>Java: bigdataoss_gcsio: 1.9.16&lt;/li>
&lt;li>Java: bigdataoss_util: 1.9.16&lt;/li>
&lt;li>Java: bigtable_client_core: 1.8.0&lt;/li>
&lt;li>Java: cassandra-driver-core: 3.6.0&lt;/li>
&lt;li>Java: cassandra-driver-mapping: 3.6.0&lt;/li>
&lt;li>Java: commons-compress: 1.18&lt;/li>
&lt;li>Java: gax_grpc: 1.38.0&lt;/li>
&lt;li>Java: google_api_common: 1.7.0&lt;/li>
&lt;li>Java: google_api_services_dataflow: v1b3-rev20190126-1.27.0&lt;/li>
&lt;li>Java: google_cloud_bigquery_storage: 0.79.0-alpha&lt;/li>
&lt;li>Java: google_cloud_bigquery_storage_proto: 0.44.0&lt;/li>
&lt;li>Java: google_auth_library_credentials: 0.12.0&lt;/li>
&lt;li>Java: google_auth_library_oauth2_http: 0.12.0&lt;/li>
&lt;li>Java: google_cloud_core: 1.61.0&lt;/li>
&lt;li>Java: google_cloud_core_grpc: 1.61.0&lt;/li>
&lt;li>Java: google_cloud_spanner: 1.6.0&lt;/li>
&lt;li>Java: grpc_all: 1.17.1&lt;/li>
&lt;li>Java: grpc_auth: 1.17.1&lt;/li>
&lt;li>Java: grpc_core: 1.17.1&lt;/li>
&lt;li>Java: grpc_google_cloud_pubsub_v1: 1.17.1&lt;/li>
&lt;li>Java: grpc_protobuf: 1.17.1&lt;/li>
&lt;li>Java: grpc_protobuf_lite: 1.17.1&lt;/li>
&lt;li>Java: grpc_netty: 1.17.1&lt;/li>
&lt;li>Java: grpc_stub: 1.17.1&lt;/li>
&lt;li>Java: netty_handler: 4.1.30.Final&lt;/li>
&lt;li>Java: netty_tcnative_boringssl_static: 2.0.17.Final&lt;/li>
&lt;li>Java: netty_transport_native_epoll: 4.1.30.Final&lt;/li>
&lt;li>Java: proto_google_cloud_spanner_admin_database_v1: 1.6.0&lt;/li>
&lt;li>Java: zstd_jni: 1.3.8-3&lt;/li>
&lt;li>Python: futures&amp;gt;=3.2.0,&amp;lt;4.0.0; python_version &amp;lt; &amp;ldquo;3.0&amp;rdquo;&lt;/li>
&lt;li>Python: pyvcf&amp;gt;=0.6.8,&amp;lt;0.7.0; python_version &amp;lt; &amp;ldquo;3.0&amp;rdquo;&lt;/li>
&lt;li>Python: google-apitools&amp;gt;=0.5.26,&amp;lt;0.5.27&lt;/li>
&lt;li>Python: google-cloud-core==0.28.1&lt;/li>
&lt;li>Python: google-cloud-bigtable==0.31.1&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Portable Flink runner support for running cross-language transforms.&lt;/li>
&lt;li>Add Cloud KMS support to GCS copies.&lt;/li>
&lt;li>Add parameters for offsetConsumer in KafkaIO.read().&lt;/li>
&lt;li>Allow setting compression codec in ParquetIO write.&lt;/li>
&lt;li>Add kms_key to BigQuery transforms, pass to Dataflow.&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>Python 3 (experimental) support for DirectRunner and DataflowRunner.&lt;/li>
&lt;li>Add ZStandard compression support for Java SDK.&lt;/li>
&lt;li>Python: Add CombineFn.compact, similar to Java.&lt;/li>
&lt;li>SparkRunner: GroupByKey optimized for non-merging windows.&lt;/li>
&lt;li>SparkRunner: Add bundleSize parameter to control splitting of Spark sources.&lt;/li>
&lt;li>FlinkRunner: Portable runner savepoint / upgrade support.&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Various bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>Deprecate MongoDb &lt;code>withKeepAlive&lt;/code> because it is deprecated in the Mongo driver.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed
to the 2.11.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Andrew Pilloud, Ankur Goenka, Anton Kedin,
Boyuan Zhang, Brian Hulette, Brian Martin, Chamikara Jayalath, Charles Chen, Craig Chambers,
Daniel Oliveira, David Moravek, David Rieber, Dustin Rhodes, Etienne Chauchot, Gleb Kanterov,
Hai Lu, Heejong Lee, Ismaël Mejía, J Ross Thomson, Jan Lukavsky, Jason Kuster, Jean-Baptiste Onofré,
Jeff Klukas, João Cabrita, Juan Rael, Juta Staes, Kasia Kucharczyk, Kengo Seki, Kenneth Jung,
Kenneth Knowles, Kyle Weaver, Kyle Winkelman, Lukas Drbal, Marek Simunek, Mark Liu,
Maximilian Michels, Melissa Pashniak, Michael Luckey, Michal Walenia, Mike Pedersen,
Mikhail Gryzykhin, Niel Markwick, Pablo Estrada, Pascal Gula, Reuven Lax, Robbe Sneyders,
Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Ryan Williams, Sam Rohde, Sam Whittle,
Scott Wegner, Tanay Tummalapalli, Thomas Weise, Tianyang Hu, Tyler Akidau, Udi Meiri,
Valentyn Tymofieiev, Xinyu Liu, Xu Mingmin, Łukasz Gajowy.&lt;/p></description></item><item><title>Blog: Apache Beam 2.10.0</title><link>/blog/beam-2.10.0/</link><pubDate>Fri, 15 Feb 2019 00:00:01 -0800</pubDate><guid>/blog/beam-2.10.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.10.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#2100-2019-02-01">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.10.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12344540">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="highlights">Highlights&lt;/h2>
&lt;h3 id="dependency-upgradeschanges">Dependency Upgrades/Changes&lt;/h3>
&lt;ul>
&lt;li>FlinkRunner: Flink 1.5.x/1.6.x/1.7.x&lt;/li>
&lt;li>Java: AutoValue 1.6.3&lt;/li>
&lt;li>Java: Jackson 2.9.8&lt;/li>
&lt;li>Java: google_cloud_bigdataoss 1.9.13&lt;/li>
&lt;li>Java: Apache Commons Codec: 1.10&lt;/li>
&lt;li>Python: avro&amp;gt;=1.8.1,&amp;lt;2.0.0; python_version &amp;lt; &amp;ldquo;3.0&amp;rdquo;&lt;/li>
&lt;li>Python: avro-python3&amp;gt;=1.8.1,&amp;lt;2.0.0; python_version &amp;gt;= &amp;ldquo;3.0&amp;rdquo;&lt;/li>
&lt;li>Python: bigdataoss_gcsio 1.9.12&lt;/li>
&lt;li>Python: dill&amp;gt;=0.2.9,&amp;lt;0.2.10&lt;/li>
&lt;li>Python: gcsio 1.9.13&lt;/li>
&lt;li>Python: google-cloud-pubsub 0.39.0&lt;/li>
&lt;li>Python: pytz&amp;gt;=2018.3&lt;/li>
&lt;li>Python: pyyaml&amp;gt;=3.12,&amp;lt;4.0.0&lt;/li>
&lt;li>MongoDbIO: mongo client 3.9.1&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>ParquetIO for Python SDK&lt;/li>
&lt;li>HadoopOutputFormatIO: Add batching support&lt;/li>
&lt;li>HadoopOutputFormatIO: Add streaming support&lt;/li>
&lt;li>MongoDbIO: Add projections&lt;/li>
&lt;li>MongoDbIO: Add support for server with self signed SSL&lt;/li>
&lt;li>MongoDbIO: Add ordered option (inserts documents even if errors occur)&lt;/li>
&lt;li>KafkaIO: Add support to write to multiple topics&lt;/li>
&lt;li>KafkaIO: add writing support with ProducerRecord&lt;/li>
&lt;li>CassandraIO: Add ability to delete data&lt;/li>
&lt;li>JdbcIO: Add ValueProvider support for Statement in JdbcIO.write(), so it can be templatized&lt;/li>
&lt;/ul>
&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
&lt;ul>
&lt;li>FlinkRunner: support Flink config directory&lt;/li>
&lt;li>FlinkRunner: master url now supports IPv6 addresses&lt;/li>
&lt;li>FlinkRunner: portable runner savepoint / upgrade support&lt;/li>
&lt;li>FlinkRunner: can be built against different Flink versions&lt;/li>
&lt;li>FlinkRunner: Send metrics to Flink in portable runner&lt;/li>
&lt;li>Java: Migrate to vendored gRPC (no conflicts with user gRPC, smaller jars)&lt;/li>
&lt;li>Java: Migrate to vendored Guava (no conflicts with user Guava, smaller jars)&lt;/li>
&lt;li>SQL: support joining unbounded to bounded sources via side input (and is no longer sensitive to left vs right join)&lt;/li>
&lt;li>SQL: support table macro&lt;/li>
&lt;li>Schemas: support for Avro, with automatic schema registration&lt;/li>
&lt;li>Schemas: Automatic schema registration for AutoValue classes&lt;/li>
&lt;/ul>
&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
&lt;ul>
&lt;li>Watch PTransform fixed (affects FileIO)&lt;/li>
&lt;li>FlinkRunner: no longer fails if GroupByKey contains null values (streaming mode only)&lt;/li>
&lt;li>FlinkRunner: no longer prepares to-be-staged file too late&lt;/li>
&lt;li>FlinkRunner: sets number of shards for writes with runner determined sharding&lt;/li>
&lt;li>FlinkRunner: prevents CheckpointMarks from not getting acknowledged&lt;/li>
&lt;li>Schemas: Generated row object for POJOs, Avros, and JavaBeans should work if the wrapped class is package private&lt;/li>
&lt;li>Schemas: Nested collection types in schemas no longer cause NullPointerException when converting to a POJO&lt;/li>
&lt;li>BigQueryIO: now handles quotaExceeded errors properly&lt;/li>
&lt;li>BigQueryIO: now handles triggering correctly in certain very large load jobs&lt;/li>
&lt;li>FileIO and other file-based IOs: Beam LocalFilesystem now matches glob patterns on Windows&lt;/li>
&lt;li>SQL: joins no longer move timestamps to the end of the window&lt;/li>
&lt;li>SQL: was missing some transitive dependencies&lt;/li>
&lt;li>SQL: JDBC driver no longer breaks interactions with other JDBC sources&lt;/li>
&lt;li>pyarrow supported on Windows Python 2&lt;/li>
&lt;/ul>
&lt;h3 id="deprecations">Deprecations&lt;/h3>
&lt;ul>
&lt;li>Deprecate HadoopInputFormatIO&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed
to the 2.10.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Anton Kedin, Rui Wang,
Andrew Brampton, Andrew Pilloud, Ankur Goenka, Antonio D&amp;rsquo;souza, Bingfeng Shu,
Boyuan Zhang, brucearctor, Cade Markegard, Chaim Turkel, Chamikara Jayalath,
Charles Chen, Colm O hEigeartaigh, Cory, Craig Chambers, Cristian, Daniel
Mills, Daniel Oliveira, David Cavazos, David Hrbacek, David Moravek, Dawid
Wysakowicz, djhworld, Dustin Rhodes, Etienne Chauchot, Fabien Rousseau, Garrett
Jones, Gleb Kanterov, Heejong Lee, Ismaël Mejía, Jason Kuster, Jean-Baptiste
Onofré, Jeff Klukas, Joar Wandborg, Jozef Vilcek, Kadir Cetinkaya, Kasia
Kucharczyk, Kengo Seki, Kenneth Knowles, lcaggio, Lukasz Cwik, Łukasz Gajowy,
Manu Zhang, marek.simunek, Mark Daoust, Mark Liu, Maximilian Michels, Melissa
Pashniak, Michael Luckey, Mikhail Gryzykhin, mlotstein, morokosi, Niel
Markwick, Pablo Estrada, Prem Kumar Karunakaran, Reuven Lax, robbe, Robbe
Sneyders, Robert Bradshaw, Robert Burke, Ruoyun Huang, Ryan Williams, Sam
Whittle, Scott Wegner, Slava Chernyak, Theodore Siu, Thomas Weise, Udi Meiri,
&lt;a href="mailto:vaclav.plajt@gmail.com">vaclav.plajt@gmail.com&lt;/a>, Valentyn Tymofieiev, Won Wook SONG, Wout Scheepers,
Xinyu Liu, Yueyang Qiu, Zhuo Peng&lt;/p></description></item><item><title>Blog: Apache Beam 2.9.0</title><link>/blog/beam-2.9.0/</link><pubDate>Thu, 13 Dec 2018 00:00:01 -0800</pubDate><guid>/blog/beam-2.9.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.9.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#290-2018-12-13">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.9.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12344258">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;h3 id="dependency-upgrades">Dependency Upgrades&lt;/h3>
&lt;ul>
&lt;li>Update google-api-client libraries to 1.27.0.&lt;/li>
&lt;li>Update byte-buddy to 1.9.3&lt;/li>
&lt;li>Update Flink Runner to 1.5.5&lt;/li>
&lt;li>Upgrade google-apitools to 0.5.24&lt;/li>
&lt;/ul>
&lt;h3 id="portability">Portability&lt;/h3>
&lt;ul>
&lt;li>Added support for user state and timers to Flink runner.&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>I/O connector for RabbitMQ.&lt;/li>
&lt;li>Update SpannerIO to support unbounded writes.&lt;/li>
&lt;li>Add PFADD method to RedisIO.&lt;/li>
&lt;/ul>
&lt;h3 id="miscellaneous-fixes">Miscellaneous Fixes&lt;/h3>
&lt;ul>
&lt;li>Dataflow runner was updated to &lt;strong>not&lt;/strong> use &lt;a href="https://github.com/google/conscrypt">Conscrypt&lt;/a> as the default security provider.&lt;/li>
&lt;li>Support set/delete of timers by ID in Flink runner.&lt;/li>
&lt;li>Improvements to stabilize integration tests.&lt;/li>
&lt;li>Updates Spark runner to show Beam metrics in web UI&lt;/li>
&lt;li>Vendor gRPC and Protobuf separately from beam-model-* Java packages&lt;/li>
&lt;li>Avoid reshuffle for zero and one element creates&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed
to the 2.9.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Adam Horky, Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Andrea Foegler, Andrew Fulton, Andrew Pilloud, Ankur Goenka, Anton Kedin, Babu, Ben Song, Bingfeng Shu, Boyuan Zhang, Brian Martin, Brian Quinlan, Chamikara Jayalath, Charles Chen, Christian Schneider, Colm O hEigeartaigh, Cory Brzycki, CraigChambersG, Daniel Oliveira, David Moravek, Dusan Rychnovsky, Etienne Chauchot, Eugene Kirpichov, Fabien Rousseau, Gleb Kanterov, Heejong Lee, Henning Rohde, Ismaël Mejía, Jan Lukavský, Jaromir Vanek, Jason Kuster, Jean-Baptiste Onofré, Jeff Klukas, Jeroen Steggink, Julien Tournay, Jára Vaněk, Katarzyna Kucharczyk, Keisuke Kondo, Kenneth Knowles, Liam Miller-Cushon, Luke Cwik, Manu Zhang, Mark Liu, Maximilian Michels, Melissa Pashniak, Micah Wylde, Michael Luckey, Mike Pedersen, Mikhail Gryzykhin, Novotnik, Petr, Ondrej Kvasnicka, Pablo Estrada, Pavel Slechta, Raghu Angadi, Reuven Lax, Robbe Sneyders, Robert Bradshaw, Robert Burke, Ruoyu Liu, Ruoyun Huang, Sam Rohde, Sam sam, Scott Wegner, Simon Plovyt, Thomas Weise, Tim Robertson, Tomas Novak, Udi Meiri, Vaclav Plajt, Valentyn Tymofieiev, Varun Dhussa, Vojtech Janota, Wout Scheepers, Xinyu Liu, XuMingmin, Yifan Zou, Yueyang Qiu, akedin, amaliujia, connelloG, flyisland, huygaa11, jasonkuster, jglezt, kkpoon, mareksimunek, matthiasa4, melissa, mingmxu, nielm, reuvenlax, robbe, ruoyu90, splovyt, svXaverius, &lt;a href="mailto:vaclav.plajt@gmail.com">vaclav.plajt@gmail.com&lt;/a>, xinyuiscool, xitep, Łukasz Gajowy&lt;/p></description></item><item><title>Blog: Apache Beam 2.8.0</title><link>/blog/beam-2.8.0/</link><pubDate>Mon, 29 Oct 2018 00:00:01 -0800</pubDate><guid>/blog/beam-2.8.0/</guid><description>
&lt;!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
&lt;p>We are happy to present the new 2.8.0 release of Beam. This release includes both improvements and new functionality.
See the &lt;a href="/get-started/downloads/#280-2018-10-26">download page&lt;/a> for this release.&lt;/p>
&lt;p>For more information on changes in 2.8.0, check out the
&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12343985">detailed release notes&lt;/a>.&lt;/p>
&lt;h2 id="new-features--improvements">New Features / Improvements&lt;/h2>
&lt;h2 id="known-issues">Known Issues&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-4783">BEAM-4783&lt;/a> Performance degradations in certain situations when Spark runner is used.&lt;/li>
&lt;/ul>
&lt;h3 id="dependency-upgrades">Dependency Upgrades&lt;/h3>
&lt;ul>
&lt;li>Elastic Search dependency upgraded to 6.3.2&lt;/li>
&lt;li>google-cloud-pubsub dependency upgraded to 0.35.4&lt;/li>
&lt;li>google-api-client dependency upgraded to 1.24.1&lt;/li>
&lt;li>Updated Flink Runner to 1.5.3&lt;/li>
&lt;li>Updated Spark runner to Spark version 2.3.2&lt;/li>
&lt;/ul>
&lt;h3 id="sdks">SDKs&lt;/h3>
&lt;ul>
&lt;li>Python SDK added support for user state and timers.&lt;/li>
&lt;li>Go SDK added support for side inputs.&lt;/li>
&lt;/ul>
&lt;h3 id="portability">Portability&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="/roadmap/portability/#python-on-flink">Python on Flink MVP&lt;/a> completed.&lt;/li>
&lt;/ul>
&lt;h3 id="ios">I/Os&lt;/h3>
&lt;ul>
&lt;li>Fixes to RedisIO non-prefix read operations.&lt;/li>
&lt;/ul>
&lt;h2 id="miscellaneous-fixes">Miscellaneous Fixes&lt;/h2>
&lt;ul>
&lt;li>Several bug fixes and performance improvements.&lt;/li>
&lt;/ul>
&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
&lt;p>According to git shortlog, the following people contributed
to the 2.8.0 release. Thank you to all contributors!&lt;/p>
&lt;p>Adam Horky, Ahmet Altay, Alan Myrvold, Aleksandr Kokhaniukov,
Alex Amato, Alexey Romanenko, Aljoscha Krettek, Andrew Fulton,
Andrew Pilloud, Ankur Goenka, Anton Kedin, Babu, Batkhuyag Batsaikhan, Ben Song,
Bingfeng Shu, Boyuan Zhang, Chamikara Jayalath, Charles Chen,
Christian Schneider, Cody Schroeder, Colm O hEigeartaigh, Daniel Mills,
Daniel Oliveira, Dat Tran, David Moravek, Dusan Rychnovsky, Etienne Chauchot,
Eugene Kirpichov, Gleb Kanterov, Heejong Lee, Henning Rohde, Ismaël Mejía,
Jan Lukavský, Jaromir Vanek, Jean-Baptiste Onofré, Jeff Klukas, Joar Wandborg,
Jozef Vilcek, Julien Phalip, Julien Tournay, Juta, Jára Vaněk,
Katarzyna Kucharczyk, Kengo Seki, Kenneth Knowles, Kevin Si, Kirill Kozlov,
Kyle Winkelman, Logan HAUSPIE, Lukasz Cwik, Manu Zhang, Mark Liu,
Matthias Baetens, Matthias Feys, Maximilian Michels, Melissa Pashniak,
Micah Wylde, Michael Luckey, Mike Pedersen, Mikhail Gryzykhin, Novotnik,
Petr, Ondrej Kvasnicka, Pablo Estrada, PaulVelthuis93, Pavel Slechta,
Rafael Fernández, Raghu Angadi, Renat, Reuven Lax, Robbe Sneyders,
Robert Bradshaw, Robert Burke, Rodrigo Benenson, Rong Ou, Ruoyun Huang,
Ryan Williams, Sam Rohde, Scott Wegner, Shinsuke Sugaya, Shnitz, Simon P,
Sindy Li, Stephen Lumenta, Stijn Decubber, Thomas Weise, Tomas Novak,
Tomas Roos, Udi Meiri, Vaclav Plajt, Valentyn Tymofieiev, Vitalii Tverdokhlib,
Xinyu Liu, XuMingmin, Yifan Zou, Yuan, Yueyang Qiu, aalbatross, amaliujia,
cclauss, connelloG, daidokoro, deepyaman, djhworld, flyisland, huygaa11,
jasonkuster, jglezt, kkpoon, mareksimunek, nielm, svXaverius, timrobertson100,
&lt;a href="mailto:vaclav.plajt@gmail.com">vaclav.plajt@gmail.com&lt;/a>, vitaliytv, vvarma, xiliu, xinyuiscool, xitep,
Łukasz Gajowy.&lt;/p></description></item></channel></rss>