title: “Release Notes - Flink 1.16”

Release notes - Flink 1.16

These release notes discuss important aspects, such as configuration, behavior, or dependencies, that changed between Flink 1.15 and Flink 1.16. Please read these notes carefully if you are planning to upgrade your Flink version to 1.16.

Clusters & Deployment

Deprecate host/web-ui-port parameter of jobmanager.sh

FLINK-28735

The host/web-ui-port parameters of the jobmanager.sh script have been deprecated. These can (and should) be specified with the corresponding options as dynamic properties.

Table API & SQL

Remove string expression DSL

FLINK-26704

The deprecated String expression DSL has been removed from Java/Scala/Python Table API.

Support Retryable Lookup Join To Solve Delayed Updates Issue In External Systems

FLINK-28779

Adds retryable lookup join to support both async and sync lookups in order to solve the delayed updates issue in external systems.

Make `AsyncDataStream.OutputMode` configurable for table module

FLINK-27622

It is recommend to set the new option ‘table.exec.async-lookup.output-mode’ to ‘ALLOW_UNORDERED’ when no stritctly output order is needed, this will yield significant performance gains on append-only streams

Harden correctness for non-deterministic updates present in the changelog pipeline

FLINK-27849

For complex streaming jobs, now it's possible to detect and resolve potential correctness issues before running.

Connectors

Move Elasticsearch connector to external connector repository

FLINK-26884

The Elasticsearch connector has been copied from the Flink repository to its own individual repository at https://github.com/apache/flink-connector-elasticsearch. For this release, identical Elasticsearch connector artifacts will be available from both repositories but with different versions. For example, the first releases will be 1.16.0 and the externally versioned and maintained artifact 3.0.0. Developers are encouraged to move to the latter during this release cycle.

Drop support for Hive versions 1., 2.1. and 2.2.*

FLINK-27044

Support for Hive 1.*, 2.1.* and 2.2.* has been dropped from Flink. These Hive versions are no longer supported by the Hive community and therefore are also no longer supported by Flink.

Hive sink report statistics to Hive metastore

FLINK-28883

In batch mode, Hive sink now will report statistics to Hive metastore by default for written tables and partitions. This might be time-consuming when there are many written files. You can disable this feature by setting table.exec.hive.sink.statistic-auto-gather.enable to false.

Remove a number of Pulsar cursor APIs

FLINK-27399

A number of breaking changes were made to the Pulsar Connector cursor APIs:

CursorPosition#seekPosition() has been removed.
StartCursor#seekPosition() has been removed.
StopCursor#shouldStop now returns a StopCondition instead of a boolean.

Mark StreamingFileSink as deprecated

FLINK-27188

The StreamingFileSink has been deprecated in favor of the unified FileSink since Flink 1.12.

Flink generated Avro schemas can't be parsed using Python

FLINK-2596

Avro schemas generated by Flink now use the “org.apache.flink.avro.generated” namespace for compatibility with the Avro Python SDK.

Introduce configurable RateLimitingStrategy for Async Sink

FLINK-28487

Supports configurable RateLimitingStrategy for the AsyncSinkWriter. This change allows sink implementers to change the behaviour of an AsyncSink when requests fail, for a specific sink. If no RateLimitingStrategy is specified, it will use the current default of AIMDRateLimitingStrategy.

Runtime & Coordination

Deprecate reflection-based reporter instantiation

FLINK-27206

Configuring reporters by their class has been deprecated. Reporter implementations should provide a MetricReporterFactory, and all configurations should be migrated to such a factory.

If the reporter is loaded from the plugins directory, setting metrics.reporter.reporter_name.class no longer works.

Deprecate Datadog reporter ‘tags’ option

FLINK-29002

The ‘tags’ option from the DatadogReporter has been deprecated in favor of the generic ‘scope.variables.additional’ option.

Return 503 Service Unavailable if endpoint is not ready yet

FLINK-25269

The REST API now returns a 503 Service Unavailable error when a request is made but the backing component isn't ready yet. Previously this returned a 500 Internal Server error.

Make the job id distinct in application mode when HA is enabled.

FLINK-19358

The JobID in application mode is no longer 0000000000, but instead based on the cluster ID.

Checkpoints

Add the overdraft buffer in BufferPool to reduce unaligned checkpoint being blocked

FLINK-26762

New concept of overdraft network buffers was introduced to mitigate effects of uninterruptible blocking a subtask thread during back pressure. Starting from 1.16.0 Flink subtask can request by default up to 5 extra (overdraft) buffers over the regular configured amount(you can read more about this in the documentation: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/network_mem_tuning/#overdraft-buffers). This change can slightly increase memory consumption of the Flink Job. To restore the older behaviour you can set taskmanager.network.memory.max-overdraft-buffers-per-gate to zero.

Non-deterministic UID generation might cause issues during restore

FLINK-28861

1.15.0 and 1.15.1 generated non-deterministic UIDs for operators, which make it difficult/impossible to restore state or upgrade to next patch version. A new table.exec.uid.generation config option (with correct default behavior) disables setting a UID for new pipelines from non-compiled plans. Existing pipelines can set table.exec.uid.generation=ALWAYS if the 1.15.0/1 behavior was acceptable.

Python

Annotate Python3.6 as deprecated in PyFlink 1.16

FLINK-28195

Python 3.6 extended support ended on 23 December 2021. We plan that PyFlink 1.16 will be the last version support Python3.6.

Dependency upgrades

Update the Hadoop implementation for filesystems to 3.3.2

FLINK-27308

The Hadoop implementation used for Flink's filesystem implementation has been updated. This provides Flink users with the features that are listed under https://issues.apache.org/jira/browse/HADOOP-17566. One of these features is to enable client-side encryption of Flink state via https://issues.apache.org/jira/browse/HADOOP-13887.

Upgrade Kafka Client to 3.1.1

FLINK-28060

Kafka connector uses Kafka client 3.1.1 by default now.

Upgrade Hive 2.3 connector to version 2.3.9

FLINK-27063

Upgrade Hive 2.3 connector to version 2.3.9

Update dependency version for PyFlink

FLINK-25188

For support of Python3.9 and M1, PyFlink updates a series dependencies version: apache-beam==2.38.0 arrow==5.0.0 pemja==0.2.6

Update dependency version for system resources metrics

System resource metrics dependencies has been updated to the following:

com.github.oshi:oshi-core:6.1.5 (licensed under MIT license)
net.java.dev.jna:jna-platform:jar:5.10.0
net.java.dev.jna:jna:jar:5.10.0

title: “Release Notes - Flink 1.16”

Release notes - Flink 1.16

Clusters & Deployment

Deprecate host/web-ui-port parameter of jobmanager.sh

Table API & SQL

Remove string expression DSL

Support Retryable Lookup Join To Solve Delayed Updates Issue In External Systems

Make AsyncDataStream.OutputMode configurable for table module

Harden correctness for non-deterministic updates present in the changelog pipeline

Connectors

Move Elasticsearch connector to external connector repository

Drop support for Hive versions 1.*, 2.1.* and 2.2.*

Hive sink report statistics to Hive metastore

Remove a number of Pulsar cursor APIs

Mark StreamingFileSink as deprecated

Flink generated Avro schemas can't be parsed using Python

Introduce configurable RateLimitingStrategy for Async Sink

Runtime & Coordination

Deprecate reflection-based reporter instantiation

Deprecate Datadog reporter ‘tags’ option

Return 503 Service Unavailable if endpoint is not ready yet

Make the job id distinct in application mode when HA is enabled.

Checkpoints

Add the overdraft buffer in BufferPool to reduce unaligned checkpoint being blocked

Non-deterministic UID generation might cause issues during restore

Python

Annotate Python3.6 as deprecated in PyFlink 1.16

Dependency upgrades

Update the Hadoop implementation for filesystems to 3.3.2

Upgrade Kafka Client to 3.1.1

Upgrade Hive 2.3 connector to version 2.3.9

Update dependency version for PyFlink

Update dependency version for system resources metrics

Make `AsyncDataStream.OutputMode` configurable for table module

Drop support for Hive versions 1., 2.1. and 2.2.*