title: “Release Notes - Flink 2.1”

Release notes - Flink 2.1

These release notes discuss important aspects, such as configuration, behavior or dependencies, that changed between Flink 2.0 and Flink 2.1. Please read these notes carefully if you are planning to upgrade your Flink version to 2.1.

Table SQL / API

Model DDLs using Table API

FLINK-37548

Since Flink 2.0, we have introduced dedicated syntax for AI models, enabling users to define models as easily as creating catalog objects and invoke them like standard functions or table functions in SQL statements. In Flink 2.1, we also added Model DDLs Table API support, allowing users to define and manage AI models programmatically through the Table API in both Java and Python.

Realtime AI Function

FLINK-34992, FLINK-37777

Based on the AI model DDL, we also expanded the ML_PREDICT table-valued function (TVF) to perform realtime model inference in SQL queries, applying machine learning models to data streams seamlessly. The implementation supports both Flink builtin model providers (OpenAI) and interfaces for users to define custom model providers, accelerating Flink's evolution from a real-time data processing engine to a unified realtime AI platform. Looking ahead, we plan to introduce more AI functions to unlock end-to-end experience for real-time data processing, model training, and inference.

See more details about the capabilities and usages of Flink's Model Inference.

Variant Type

FLINK-37922

Variant is a new data type for semi-structured data(e.g. JSON), it supports storing any semi-structured data, including ARRAY, MAP(with STRING keys), and scalar types—while preserving field type information in a JSON-like structure. Unlike ROW and STRUCTURED types, VARIANT provides superior flexibility for handling deeply nested and evolving schemas.

Users can use PARSE_JSON orTRY_PARSE_JSON to convert JSON-formatted VARCHAR data to VARIANT. In addition, table formats like Apache Paimon now support the VARIANT type, this enable users to efficiently process semi-structured data in lakehouse using Flink SQL.

Structured Type Enhancements

FLINK-37861

Enabling declare user-defined objects via STRUCTURED TYPE directly in CREATE TABLE DDL statements, resolving critical type equivalence issues and significantly improving API usability.

Delta Join

FLINK-37836

Introduced a new DeltaJoin operator in stream processing jobs, along with optimizations for simple streaming join pipeline. Compared to traditional streaming join, delta join requires significantly less state, effectively mitigating issues related to large state, including resource bottlenecks, slow checkpointing, and lengthy job recovery times. This feature is enabled by default. More details can be found at Delta Join

Multiple Regular Joins

FLINK-37859

Streaming Flink jobs with multiple cascaded streaming joins often experience operational instability and performance degradation due to large state sizes. This release introduces a multi-join operator (StreamingMultiJoinOperator) that drastically reduces state size by eliminating intermediate results. The operator achieves this by processing joins across all input streams simultaneously within a single operator instance, storing only raw input records instead of propagated join output.

This “zero intermediate state” approach primarily targets state reduction, offering substantial benefits in resource consumption and operational stability. This feature is now available for pipelines with multiple INNER/LEFT joins that share at least one common join key, enable with SET 'table.optimizer.multi-join.enabled' = 'true'.

Async Lookup Join Enhancements

FLINK-37874

Support handling records in order based on upsert key (the unique key in the input stream deduced by planner) while allowing parallel processing of different keys to achieve better throughput when processing changelog data stream.

Sink Reuse

FLINK-37227

Within a single Flink job, when writing multiple INSERT INTO statements updating identical columns (different columns will be supported in next release) of a target table, the planner will optimize the execution plan and merge the sink nodes to achieve reuse. This will be a great usability improvement for users using partial-update features with data lake storages like Apache Paimon.

Support Smile Format for Compiled Plan Serialization

FLINK-37341

This release adds smile binary format support for compiled plans, providing a memory-efficient alternative to JSON for serialization/deserialization. By default JSON is used, in order to use smile format need to call CompiledPlan#asSmileBytes and PlanReference#fromSmileBytes method.

Runtime

Add Pluggable Batching for Async Sink

FLINK-37298

Introducing a pluggable batching mechanism for async sink that allows users to define custom batching write strategies tailored to specific requirements.

Split-level Watermark Metrics

FLINK-37410

We adds some split level watermark metrics, covering watermark progress and per-split state gauges to enhance the watermark observability:

  • currentWatermark: the last watermark this split has received.
  • activeTimeMsPerSecond: the time this split is active per second.
  • pausedTimeMsPerSecond: the time this split is paused due to watermark alignment per second.
  • idleTimeMsPerSecond: the time this split is marked idle by idleness detection per second.
  • accumulatedActiveTimeMs: accumulated time this split was active since registered.
  • accumulatedPausedTimeMs: accumulated time this split was paused since registered.
  • accumulatedIdleTimeMs: accumulated time this split was idle since registered.

Connectors

Introduce SQL Connector for Keyed State

FLINK-36929

In this release, we introduce a new connector for keyed state. This connector allows users to query keyed state directly from checkpoint or savepoint using Flink SQL, making it easier to inspect, debug, and validate the state of Flink jobs without custom tooling. This feature is especially useful for analyzing long-running jobs and validating state migrations.

Python

Adds support of Python 3.12 and removes support of Python 3.8

FLINK-37823, FLINK-37776

PyFlink 2.1 will support Python 3.12 and remove the support for Python 3.8.

Dependency upgrades

Upgrade flink-shaded version to 20.0

FLINK-37376

Bump flink-shaded version to 20.0 to support Smile format.

Upgrade Parquet version to 1.15.3

FLINK-37760

Bump parquet version to 1.15.3 to resolve parquet-avro module vulnerability found in CVE-2025-30065.