<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Flink Blog Feed</title>
<description>Flink Blog</description>
<link>https://flink.apache.org/blog</link>
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" />
<item>
<title>Apache Flink Kubernetes Operator 1.1.0 Release Announcement</title>
<description>&lt;p&gt;The community has continued to work hard on improving the Flink Kubernetes Operator capabilities since our &lt;a href=&quot;https://flink.apache.org/news/2022/06/05/release-kubernetes-operator-1.0.0.html&quot;&gt;first production-ready release&lt;/a&gt;, which we launched about two months ago.&lt;/p&gt;
&lt;p&gt;With the release of Flink Kubernetes Operator 1.1.0 we are proud to announce a number of exciting new features improving the overall experience of managing Flink resources and the operator itself in production environments.&lt;/p&gt;
&lt;h2 id=&quot;release-highlights&quot;&gt;Release Highlights&lt;/h2&gt;
&lt;p&gt;A non-exhaustive list of some of the more exciting features added in the release:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Kubernetes Events on application and job state changes&lt;/li&gt;
&lt;li&gt;New operator metrics&lt;/li&gt;
&lt;li&gt;Unified and more robust reconciliation flow&lt;/li&gt;
&lt;li&gt;Periodic savepoints&lt;/li&gt;
&lt;li&gt;Custom Flink Resource Listeners&lt;/li&gt;
&lt;li&gt;Dynamic watched namespaces&lt;/li&gt;
&lt;li&gt;New built-in examples for Flink SQL and PyFlink&lt;/li&gt;
&lt;li&gt;Experimental autoscaling support&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;kubernetes-events-for-application-and-job-state-changes&quot;&gt;Kubernetes Events for Application and Job State Changes&lt;/h3&gt;
&lt;p&gt;The operator now emits native Kubernetes Events on relevant Flink Deployment and Job changes. This includes status changes, custom resource specification changes, deployment failures, etc.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Submit 53m JobManagerDeployment Starting deployment
Normal StatusChanged 52m Job Job status changed from RECONCILING to CREATED
Normal StatusChanged 52m Job Job status changed from CREATED to RUNNING
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;new-operator-metrics&quot;&gt;New Operator Metrics&lt;/h3&gt;
&lt;p&gt;The first version of the operator only came with basic system level metrics to monitor the JVM process.&lt;/p&gt;
&lt;p&gt;In 1.1.0 we have introduced a wide range of additional metrics related to lifecycle management, Kubernetes API server access and the Java Operator SDK framework that the operator itself is built on. These metrics allow operator administrators to get a comprehensive view of what’s happening in the environment.&lt;/p&gt;
&lt;p&gt;For details check the list of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/metrics-logging/#metrics&quot;&gt;supported metrics&lt;/a&gt;.&lt;/p&gt;
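&lt;p&gt;Operator metrics are exposed through the standard Flink metric reporter mechanism configured in the operator configuration. As a minimal sketch, enabling the Slf4j reporter could look roughly like the snippet below; please verify the exact keys in the linked metrics documentation.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;kubernetes.operator.metrics.reporter.slf4j.factory.class: org.apache.flink.metrics.slf4j.Slf4jReporterFactory
kubernetes.operator.metrics.reporter.slf4j.interval: 5 MINUTE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;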
&lt;h3 id=&quot;unified-and-more-robust-reconciliation-flow&quot;&gt;Unified and more robust reconciliation flow&lt;/h3&gt;
&lt;p&gt;We have refactored and streamlined the core reconciliation flow responsible for executing and tracking resource upgrades, savepoints, rollbacks and other operations.&lt;/p&gt;
&lt;p&gt;In the process we made a number of important improvements to tolerate operator failures and temporary Kubernetes API outages more gracefully, which is critical in production environments.&lt;/p&gt;
&lt;h3 id=&quot;periodic-savepoints&quot;&gt;Periodic Savepoints&lt;/h3&gt;
&lt;p&gt;By popular demand we have introduced periodic savepoints for applications and session jobs using the following simple configuration option:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;flinkConfiguration:
...
kubernetes.operator.periodic.savepoint.interval: 6h
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Old savepoints are cleaned up automatically according to the user-configured policy:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;kubernetes.operator.savepoint.history.max.count: 5
kubernetes.operator.savepoint.history.max.age: 48h
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;custom-flink-resource-listeners&quot;&gt;Custom Flink Resource Listeners&lt;/h3&gt;
&lt;p&gt;The operator allows users to listen to events and status updates triggered for the Flink Resources managed by the operator.&lt;/p&gt;
&lt;p&gt;This feature enables tighter integration with the user’s own data platform. By implementing the &lt;code&gt;FlinkResourceListener&lt;/code&gt; interface users can listen to both events and status updates per resource type (&lt;code&gt;FlinkDeployment&lt;/code&gt; / &lt;code&gt;FlinkSessionJob&lt;/code&gt;). The interface methods will be called after the respective events have been triggered by the system.&lt;/p&gt;
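&lt;p&gt;As a rough illustration only, a custom listener is a class implementing &lt;code&gt;FlinkResourceListener&lt;/code&gt; that is packaged and loaded as an operator plugin. The skeleton below is a sketch: the method and context type names are assumptions based on the description above, so please take the actual signatures and imports from the &lt;code&gt;FlinkResourceListener&lt;/code&gt; interface in the operator codebase.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// Sketch only: method names and context types are assumptions; consult the
// FlinkResourceListener interface for the actual signatures and imports.
public class AuditingListener implements FlinkResourceListener {

    @Override
    public void onDeploymentEvent(FlinkDeploymentEventContext ctx) {
        // Forward FlinkDeployment events to an external audit / data platform.
    }

    @Override
    public void onDeploymentStatusUpdate(FlinkDeploymentStatusUpdateContext ctx) {
        // React to FlinkDeployment status changes.
    }

    @Override
    public void onSessionJobEvent(FlinkSessionJobEventContext ctx) {
        // Forward FlinkSessionJob events.
    }

    @Override
    public void onSessionJobStatusUpdate(FlinkSessionJobStatusUpdateContext ctx) {
        // React to FlinkSessionJob status changes.
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;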
&lt;h3 id=&quot;new-sql-and-python-job-examples&quot;&gt;New SQL and Python Job Examples&lt;/h3&gt;
&lt;p&gt;To demonstrate the power of the operator for all Flink use-cases, we have added examples showcasing how to deploy Flink SQL and Python jobs.&lt;/p&gt;
&lt;p&gt;We have also added a brief &lt;a href=&quot;https://github.com/apache/flink-kubernetes-operator/tree/main/examples&quot;&gt;README&lt;/a&gt; for the examples to make it easier for you to find what you are looking for.&lt;/p&gt;
&lt;h3 id=&quot;dynamic-watched-namespaces&quot;&gt;Dynamic watched namespaces&lt;/h3&gt;
&lt;p&gt;The operator can watch and manage custom resources in an arbitrary list of namespaces. The watched namespaces can be defined through the property &lt;code&gt;kubernetes.operator.watched.namespaces: ns1,ns2&lt;/code&gt;. The list of watched namespaces can be changed at any time in the corresponding ConfigMap; however, the operator ignores such changes unless dynamic namespace watching is enabled.&lt;/p&gt;
&lt;p&gt;This is controlled by the property &lt;code&gt;kubernetes.operator.dynamic.namespaces.enabled: true&lt;/code&gt;.&lt;/p&gt;
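&lt;p&gt;Putting the two options together, an operator configuration that watches two namespaces and allows the list to be changed at runtime could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;kubernetes.operator.watched.namespaces: ns1,ns2
kubernetes.operator.dynamic.namespaces.enabled: true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;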
&lt;h3 id=&quot;experimental-autoscaling-support&quot;&gt;Experimental autoscaling support&lt;/h3&gt;
&lt;p&gt;In this version we have taken the first steps toward enabling Kubernetes native autoscaling integration for the operator. The FlinkDeployment CRD now exposes the &lt;code&gt;scale&lt;/code&gt; subresource which allows us to create HPA policies directly in Kubernetes that will monitor the task manager pods.&lt;/p&gt;
&lt;p&gt;This integration is still very much experimental but we are planning to build on top of this in the upcoming releases to provide a reliable scaling mechanism.&lt;/p&gt;
&lt;p&gt;You can find an example scaling policy &lt;a href=&quot;https://github.com/apache/flink-kubernetes-operator/tree/main/examples#horizontal-pod-autoscaler&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
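&lt;p&gt;For illustration, a minimal HorizontalPodAutoscaler targeting a &lt;code&gt;FlinkDeployment&lt;/code&gt; could look roughly like the sketch below. The resource name, the CRD API version and the scaling thresholds are assumptions for this example; please refer to the linked scaling policy for an actual, tested configuration.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: basic-example-hpa
spec:
  scaleTargetRef:
    apiVersion: flink.apache.org/v1beta1  # assumed CRD API version
    kind: FlinkDeployment
    name: basic-example                   # assumed FlinkDeployment name
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;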
&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;
&lt;p&gt;In the coming months, our focus will be on the following key areas:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Standalone deployment mode support&lt;/li&gt;
&lt;li&gt;Hardening of rollback mechanism and stability conditions&lt;/li&gt;
&lt;li&gt;Scaling improvements&lt;/li&gt;
&lt;li&gt;Support for older Flink versions&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These features will allow the operator and users to benefit more from the recent advancements in Flink’s scheduling capabilities.&lt;/p&gt;
&lt;h2 id=&quot;upgrading-to-110&quot;&gt;Upgrading to 1.1.0&lt;/h2&gt;
&lt;p&gt;The new 1.1.0 release is backward compatible as long as you follow our &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/operations/upgrade/#normal-upgrade-process&quot;&gt;operator upgrade guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Please ensure that &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/operations/upgrade/#1-upgrading-the-crd&quot;&gt;CRDs are updated&lt;/a&gt; in order to enable some of the new features.&lt;/p&gt;
&lt;p&gt;The upgrade should not impact any currently deployed Flink resources.&lt;/p&gt;
&lt;h2 id=&quot;release-resources&quot;&gt;Release Resources&lt;/h2&gt;
&lt;p&gt;The source artifacts and helm chart are available on the Downloads page of the Flink website. You can easily try out the new features shipped in the official 1.1.0 release by following our &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/try-flink-kubernetes-operator/quick-start/&quot;&gt;quickstart guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can also find official Kubernetes Operator Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-kubernetes-operator&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/&quot;&gt;updated documentation&lt;/a&gt; and the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351723&quot;&gt;release notes&lt;/a&gt;. We encourage you to download the release and share your feedback with the community through the Flink mailing lists or JIRA.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;Aitozi, Biao Geng, Chethan, ConradJam, Dora Marsal, Gyula Fora, Hao Xin, Hector Miuler Malpica Gallegos, Jaganathan Asokan, Jeesmon Jacob, Jim Busche, Maksim Aniskov, Marton Balassi, Matyas Orhidi, Nicholas Jiang, Peng Yuan, Peter Vary, Thomas Weise, Xin Hao, Yang Wang&lt;/p&gt;
</description>
<pubDate>Mon, 25 Jul 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/07/25/release-kubernetes-operator-1.1.0.html</link>
<guid isPermaLink="true">/news/2022/07/25/release-kubernetes-operator-1.1.0.html</guid>
</item>
<item>
<title>Apache Flink ML 2.1.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is excited to announce the release of Flink ML 2.1.0!
This release focuses on improving Flink ML’s infrastructure, such as Python SDK,
memory management, and benchmark framework, to facilitate the development of
performant, memory-safe, and easy-to-use algorithm libraries. We validated the
enhanced infrastructure by implementing, benchmarking, and optimizing 10 new
algorithms in Flink ML, and confirmed that Flink ML can meet or exceed the
performance of selected algorithms from alternative popular ML libraries.
In addition, this release added example Python and Java programs for each
algorithm in the library to help users learn and use Flink ML.&lt;/p&gt;
&lt;p&gt;With the improvements and performance benchmarks made in this release, we
believe Flink ML’s infrastructure is ready for use by the interested developers
in the community to build performant pythonic machine learning libraries.&lt;/p&gt;
&lt;p&gt;We encourage you to &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;download the release&lt;/a&gt;
and share your feedback with the community through the Flink
&lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt; or
&lt;a href=&quot;https://issues.apache.org/jira/browse/flink&quot;&gt;JIRA&lt;/a&gt;! We hope you like the new
release and we’d be eager to learn about your experience with it.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#notable-features&quot; id=&quot;markdown-toc-notable-features&quot;&gt;Notable Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#api-and-infrastructure&quot; id=&quot;markdown-toc-api-and-infrastructure&quot;&gt;API and Infrastructure&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#supporting-fine-grained-per-operator-memory-management&quot; id=&quot;markdown-toc-supporting-fine-grained-per-operator-memory-management&quot;&gt;Supporting fine-grained per-operator memory management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improved-infrastructure-for-developing-online-learning-algorithms&quot; id=&quot;markdown-toc-improved-infrastructure-for-developing-online-learning-algorithms&quot;&gt;Improved infrastructure for developing online learning algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#algorithm-benchmark-framework&quot; id=&quot;markdown-toc-algorithm-benchmark-framework&quot;&gt;Algorithm benchmark framework&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#python-sdk&quot; id=&quot;markdown-toc-python-sdk&quot;&gt;Python SDK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#algorithm-library&quot; id=&quot;markdown-toc-algorithm-library&quot;&gt;Algorithm Library&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upgrade-notes&quot; id=&quot;markdown-toc-upgrade-notes&quot;&gt;Upgrade Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes-and-resources&quot; id=&quot;markdown-toc-release-notes-and-resources&quot;&gt;Release Notes and Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;notable-features&quot;&gt;Notable Features&lt;/h1&gt;
&lt;h2 id=&quot;api-and-infrastructure&quot;&gt;API and Infrastructure&lt;/h2&gt;
&lt;h3 id=&quot;supporting-fine-grained-per-operator-memory-management&quot;&gt;Supporting fine-grained per-operator memory management&lt;/h3&gt;
&lt;p&gt;Before this release, algorithm operators with internal states (e.g. the training
data to be replayed for each round of iteration) stored state data using the
state-backend API (e.g. ListState). Such an operator either needs to store all
data in memory, which risks OOM, or it needs to always store data on disk.
In the latter case, it needs to read and de-serialize all data from disk
repeatedly in each round of iteration even if the data can fit in RAM, leading
to sub-optimal performance when the training data size is small. This makes it
hard for developers to write performant and memory-safe operators.&lt;/p&gt;
&lt;p&gt;This release enhances the Flink ML infrastructure with the mechanism to specify
the amount of managed memory that an operator can consume. This allows algorithm
operators to write and read data from managed memory when the data size is below
the quota, and to automatically spill the data that exceeds the memory quota to
disk to avoid OOM. Algorithm developers can use this mechanism to achieve
optimal algorithm performance as input data size varies. Please feel free to
check out the implementation of the KMeans operator for example.&lt;/p&gt;
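&lt;p&gt;This mechanism builds on Flink’s operator-scope managed memory. The snippet below is a simplified sketch, not Flink ML’s actual internal API: it only shows how a transformation can declare that its operator wants a share of operator-scope managed memory, which the operator can then use to cache records and spill to disk once the quota is exceeded.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.core.memory.ManagedMemoryUseCase;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ManagedMemoryDeclarationSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A placeholder input; in Flink ML this would be the training data to cache.
        DataStream&amp;lt;Long&amp;gt; trainData = env.fromSequence(0, 1_000_000);

        // A placeholder for an algorithm-specific operator.
        DataStream&amp;lt;Long&amp;gt; cached = trainData.map(x -&amp;gt; x).returns(Types.LONG);

        // Declare that this operator wants operator-scope managed memory. At runtime
        // the operator can keep records in that memory and spill to disk once the
        // declared quota is exceeded.
        cached.getTransformation()
              .declareManagedMemoryUseCaseAtOperatorScope(ManagedMemoryUseCase.OPERATOR, 100);

        env.execute();
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;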
&lt;h3 id=&quot;improved-infrastructure-for-developing-online-learning-algorithms&quot;&gt;Improved infrastructure for developing online learning algorithms&lt;/h3&gt;
&lt;p&gt;A key objective of Flink ML is to facilitate the development of online learning
applications. In the last release, we enhanced the Flink ML API with
setModelData() and getModelData(), which allows users of online learning
algorithms to transmit and persist model data as unbounded data streams.
This release continues the effort by improving and validating the infrastructure
needed to develop online learning algorithms.&lt;/p&gt;
&lt;p&gt;Specifically, this release added two online learning algorithm prototypes (i.e.
OnlineKMeans and OnlineLogisticRegression) with tests covering the entire
lifecycle of using these algorithms. These two algorithms introduce concepts
such as global batch size and model version, together with metrics and APIs to
set and get those values. While the online algorithm prototypes have not been
optimized for prediction accuracy yet, this line of work is an important step
toward setting up best practices for building online learning algorithms in
Flink ML. We hope more contributors from the community can join this effort.&lt;/p&gt;
&lt;h3 id=&quot;algorithm-benchmark-framework&quot;&gt;Algorithm benchmark framework&lt;/h3&gt;
&lt;p&gt;An easy-to-use benchmark framework is critical to developing and maintaining
performant algorithm libraries in Flink ML. This release added a benchmark
framework that provides APIs to write pluggable and reusable data generators,
takes benchmark configuration in JSON format, and outputs benchmark results in
JSON format to enable custom analysis. An off-the-shelf script is provided to
visualize benchmark results using Matplotlib. Feel free to check out this
&lt;a href=&quot;https://github.com/apache/flink-ml/blob/release-2.1/flink-ml-benchmark/README.md&quot;&gt;README&lt;/a&gt;
for instructions on how to use this benchmark framework.&lt;/p&gt;
&lt;p&gt;The benchmark framework currently supports evaluating algorithm throughput.
In future releases, we plan to support evaluating algorithm latency and
accuracy.&lt;/p&gt;
&lt;h2 id=&quot;python-sdk&quot;&gt;Python SDK&lt;/h2&gt;
&lt;p&gt;This release enhances the Python SDK so that operators in the Flink ML Python
library can invoke the corresponding operators in the Java library. The Python
operator is a thin wrapper around the Java operator and delivers the same
performance as the Java operator during execution. This capability significantly
improves developer velocity by allowing algorithm developers to maintain both
the Python and the Java libraries of algorithms without having to implement
those algorithms twice.&lt;/p&gt;
&lt;h2 id=&quot;algorithm-library&quot;&gt;Algorithm Library&lt;/h2&gt;
&lt;p&gt;This release continues to extend the algorithm library in Flink ML, with the
focus on validating the functionalities and the performance of Flink ML
infrastructure using representative algorithms in different categories.&lt;/p&gt;
&lt;p&gt;Below are the lists of algorithms newly supported in this release, grouped by
their categories:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Feature engineering (MinMaxScaler, StringIndexer, VectorAssembler,
StandardScaler, Bucketizer)&lt;/li&gt;
&lt;li&gt;Online learning (OnlineKMeans, OnlineLogisticRegression)&lt;/li&gt;
&lt;li&gt;Regression (LinearRegression)&lt;/li&gt;
&lt;li&gt;Classification (LinearSVC)&lt;/li&gt;
&lt;li&gt;Evaluation (BinaryClassificationEvaluator)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Example Python and Java programs for these algorithms are provided on the Apache
Flink ML &lt;a href=&quot;https://nightlies.apache.org/flink/flink-ml-docs-release-2.1/&quot;&gt;website&lt;/a&gt; to
help users learn and try out Flink ML. We have also provided example benchmark
&lt;a href=&quot;https://github.com/apache/flink-ml/tree/release-2.1/flink-ml-benchmark/src/main/resources&quot;&gt;configuration files&lt;/a&gt;
in the repo for users to validate Flink ML performance. Feel free to check out
this &lt;a href=&quot;https://github.com/apache/flink-ml/blob/release-2.1/flink-ml-benchmark/README.md&quot;&gt;README&lt;/a&gt;
for instructions on how to run those benchmarks.&lt;/p&gt;
&lt;h1 id=&quot;upgrade-notes&quot;&gt;Upgrade Notes&lt;/h1&gt;
&lt;p&gt;Please review this note for a list of adjustments to make and issues to check
if you plan to upgrade to Flink ML 2.1.0.&lt;/p&gt;
&lt;p&gt;This note discusses any critical information about incompatibilities and
breaking changes, performance changes, and any other changes that might impact
your production deployment of Flink ML.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Flink dependency is changed from 1.14 to 1.15&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This change introduces all the breaking changes listed in the Flink 1.15
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/&quot;&gt;release notes&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes-and-resources&quot;&gt;Release Notes and Resources&lt;/h1&gt;
&lt;p&gt;Please take a look at the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351141&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features.&lt;/p&gt;
&lt;p&gt;The source artifacts are now available on the updated
&lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads page&lt;/a&gt; of the Flink website,
and the most recent distribution of Flink ML Python package is available on
&lt;a href=&quot;https://pypi.org/project/apache-flink-ml&quot;&gt;PyPI&lt;/a&gt;.&lt;/p&gt;
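&lt;p&gt;For example, the Python package for this release can be installed from PyPI with pip:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;python -m pip install apache-flink-ml==2.1.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;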
&lt;h1 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h1&gt;
&lt;p&gt;The Apache Flink community would like to thank each one of the contributors
that have made this release possible:&lt;/p&gt;
&lt;p&gt;Yunfeng Zhou, Zhipeng Zhang, huangxingbo, weibo, Dong Lin, Yun Gao, Jingsong Li
and mumuhhh.&lt;/p&gt;
</description>
<pubDate>Tue, 12 Jul 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/07/12/release-ml-2.1.0.html</link>
<guid isPermaLink="true">/news/2022/07/12/release-ml-2.1.0.html</guid>
</item>
<item>
<title>FLIP-147: Support Checkpoints After Tasks Finished - Part Two</title>
<description>&lt;p&gt;In the &lt;a href=&quot;/2022/07/11/final-checkpoint-part1.html&quot;&gt;first part&lt;/a&gt; of this blog,
we briefly introduced the work to support checkpoints after tasks finish and the revised
process of finishing. In this part we will present more details on the implementation,
including how we support checkpoints with finished tasks and the revised protocol of the finishing process.&lt;/p&gt;
&lt;h1 id=&quot;implementation-of-support-checkpointing-with-finished-tasks&quot;&gt;Implementation of Support for Checkpointing with Finished Tasks&lt;/h1&gt;
&lt;p&gt;As described in part one,
to support checkpoints after some tasks are finished, the core idea is to mark
the finished operators in checkpoints and skip executing these operators after recovery. To implement this idea,
we enhanced the checkpointing procedure to generate the flag and use the flag on recovery. This section presents
more details on the process of taking checkpoints with finished tasks and recovery from such checkpoints.&lt;/p&gt;
&lt;p&gt;Previously, checkpointing only worked when all tasks were running. As shown in Figure 1, in this case the
checkpoint coordinator first notifies all the source tasks, and the source tasks then notify the
downstream tasks to take snapshots via barrier events. Similarly, if there are finished tasks, we need to
find the new “source” tasks to initiate the checkpoint, namely those tasks that are still running but have
no running precedent tasks. The CheckpointCoordinator does this computation atomically on the JobManager side,
based on the latest states recorded in the execution graph.&lt;/p&gt;
&lt;p&gt;There might be race conditions when triggering tasks: when the checkpoint coordinator
decides to trigger one task and starts emitting the RPC, it is possible that the task has just finished and
is reporting the FINISHED status to the JobManager. In this case, the RPC message would fail and the checkpoint would be aborted.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/checkpoint_trigger.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em&quot;&gt;
Figure 1. The tasks chosen as the new sources when taking checkpoint with finished tasks. The principle is to
choose the running tasks whose precedent tasks are all finished.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;In order to keep track of the finish status of each operator, we need to extend the checkpoint format.
A checkpoint consists of the states of all the stateful operators, and the state of one operator consists of the
entries from all its parallel instances. Note that the concept of Task is not reflected in the checkpoint. Task
is more of a physical execution container that drives the behavior of operators. It is not well-defined across
multiple executions of the same job since job upgrades might modify the operators contained in one task.
Therefore, the finished status should also be attached to the operators.&lt;/p&gt;
&lt;p&gt;As shown in Figure 2, operators can be classified into three types according to their finished status:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Fully finished: If all the instances of an operator are finished, we could view the logic of the operators as
fully executed and we should skip the execution of the operator after recovery. We need to store a special flag for this
kind of operator.&lt;/li&gt;
&lt;li&gt;Partially finished: If only some instances of an operator are finished, then we still need to continue executing the
remaining logic of this operator. As a whole we could view the state of the operator as the set of entries collected from all the
running instances, which represents the remaining workload for this operator.&lt;/li&gt;
&lt;li&gt;No finished instances: In this case, the state of the operator is the same as the one taken when no tasks are finished.&lt;/li&gt;
&lt;/ol&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/checkpoint_format.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em&quot;&gt;
Figure 2. An illustration of the extended checkpoint format.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;If the job is later restored from a checkpoint taken with finished tasks, we would skip executing all the logic for fully
finished operators, and execute normally for the operators with no finished instances.&lt;/p&gt;
&lt;p&gt;However, this would be a bit complex for the partially finished operators. The state of partially finished operators would be
redistributed to all the instances, similar to rescaling when the parallelism is changed. Among all the types of states that
Flink offers, the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/#using-keyed-state&quot;&gt;keyed state&lt;/a&gt;
and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/#using-operator-state&quot;&gt;operator state&lt;/a&gt;
with even-split redistribution would work normally, but the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/#broadcast-state&quot;&gt;broadcast state&lt;/a&gt; and
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/#using-operator-state&quot;&gt;operator state with union redistribution&lt;/a&gt;
would be affected for the following reasons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The broadcast state always replicates the state of the first subtask to the other subtasks. If the first subtask is finished,
an empty state would be distributed and the operator would run from scratch, which is not correct.&lt;/li&gt;
&lt;li&gt;The operator state with union redistribution merges the states of all the subtasks and then sends the merged state to all the
subtasks. Based on this behavior, some operators may choose one subtask to store a shared value and after restarting this value will
be distributed to all the subtasks. However, if this chosen task is finished, the state would be lost.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These two issues would not occur when rescaling since there would be no finished tasks in that scenario. To address
these issues, for the broadcast state we instead choose one of the running subtasks to provide the current state. For the operator
state with union redistribution, we have to collect the states of all the subtasks to maintain the semantics. Thus, we currently
abort the checkpoint if only some of the subtasks have finished for operators using this kind of state.&lt;/p&gt;
&lt;p&gt;In principle, you should be able to modify your job (which changes the dataflow graph) and restore from a previous checkpoint. That said,
there are certain graph modifications that are not supported. These kinds of changes include adding a new operator as the precedent of a fully finished
one. Flink would check for such modifications and throw exceptions while restoring.&lt;/p&gt;
&lt;h1 id=&quot;the-revised-process-of-finishing&quot;&gt;The Revised Process of Finishing&lt;/h1&gt;
&lt;p&gt;As described in part one, based on the ability to take checkpoints with finished tasks, we revised the process of finishing
so that we could always commit all the data for two-phase-commit sinks. We’ll show the detailed protocol of the finishing process in this
section.&lt;/p&gt;
&lt;h2 id=&quot;how-did-jobs-in-flink-finish-before&quot;&gt;How did Jobs in Flink Finish Before?&lt;/h2&gt;
&lt;p&gt;A job might finish in two ways: all sources finish or users execute
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint&quot;&gt;&lt;code&gt;stop-with-savepoint [--drain]&lt;/code&gt;&lt;/a&gt;.
Let’s first have a look at the detailed process of finishing before FLIP-147.&lt;/p&gt;
&lt;h3 id=&quot;when-sources-finish&quot;&gt;When sources finish&lt;/h3&gt;
&lt;p&gt;If all the sources are bounded, the job will finish after all the input records are processed and all the results are
committed to external systems. In this case, the sources would first
emit a &lt;code&gt;MAX_WATERMARK&lt;/code&gt; (&lt;code&gt;Long.MAX_VALUE&lt;/code&gt;) and then start to terminate the task. On termination, a task would call &lt;code&gt;endOfInput()&lt;/code&gt;,
&lt;code&gt;close()&lt;/code&gt; and &lt;code&gt;dispose()&lt;/code&gt; for all the operators, then emit an &lt;code&gt;EndOfPartitionEvent&lt;/code&gt; to the downstream tasks. The intermediate tasks
would start terminating after receiving an &lt;code&gt;EndOfPartitionEvent&lt;/code&gt; from all the input channels, and this process will continue
until the last task is finished.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;1. Source operators emit MAX_WATERMARK
2. On received MAX_WATERMARK for non-source operators
a. Trigger all the event-time timers
b. Emit MAX_WATERMARK
3. Source tasks finished
a. endInput(inputId) for all the operators
b. close() for all the operators
c. dispose() for all the operators
d. Emit EndOfPartitionEvent
e. Task cleanup
4. On received EndOfPartitionEvent for non-source tasks
a. endInput(int inputId) for all the operators
b. close() for all the operators
c. dispose() for all the operators
d. Emit EndOfPartitionEvent
e. Task cleanup
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;when-users-execute-stop-with-savepoint---drain&quot;&gt;When users execute stop-with-savepoint [--drain]&lt;/h3&gt;
&lt;p&gt;Users can execute the command stop-with-savepoint [--drain] for both bounded and unbounded jobs to trigger jobs to finish.
In this case, Flink first triggers a synchronous savepoint and all the tasks would stall after seeing the synchronous
savepoint. If the savepoint succeeds, all the source operators would finish actively and the job would finish in the same way as in the above scenario.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;1. Trigger a savepoint
2. Sources received savepoint trigger RPC
a. If with --drain
i. source operators emit MAX_WATERMARK
b. Source emits savepoint barrier
3. On received MAX_WATERMARK for non-source operators
a. Trigger all the event-time timers
b. Emit MAX_WATERMARK
4. On received savepoint barrier for non-source operators
a. The task blocks till the savepoint succeeds
5. Finish the source tasks actively
a. If with --drain
i. endInput(inputId) for all the operators
b. close() for all the operators
c. dispose() for all the operators
d. Emit EndOfPartitionEvent
e. Task cleanup
6. On received EndOfPartitionEvent for non-source tasks
a. If with --drain
i. endInput(int inputId) for all the operators
b. close() for all the operators
c. dispose() for all the operators
d. Emit EndOfPartitionEvent
e. Task cleanup
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A parameter &lt;code&gt;--drain&lt;/code&gt; is supported with &lt;code&gt;stop-with-savepoint&lt;/code&gt;: if it is not specified, the job is expected to resume from this savepoint,
otherwise the job is expected to terminate permanently. Thus we only emit &lt;code&gt;MAX_WATERMARK&lt;/code&gt; to trigger all the event-time timers and call
&lt;code&gt;endInput()&lt;/code&gt; in the latter case.&lt;/p&gt;
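&lt;p&gt;For reference, stopping a job with a final savepoint from the Flink CLI looks roughly as follows; the savepoint directory and the job id are placeholders, and dropping &lt;code&gt;--drain&lt;/code&gt; keeps the job resumable from the savepoint:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;./bin/flink stop --savepointPath /tmp/flink-savepoints --drain $JOB_ID
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;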
&lt;h2 id=&quot;revise-the-finishing-steps&quot;&gt;Revise the Finishing Steps&lt;/h2&gt;
&lt;p&gt;As described in part one, after revising the process of finishing, we have decoupled the process of “finishing operator logic”
and “finishing task” by introducing a new &lt;code&gt;EndOfData&lt;/code&gt; event. After the revision, each task first
notifies its descendants with an &lt;code&gt;EndOfData&lt;/code&gt; event after executing all of its logic,
so that the descendants also have a chance to finish executing the operator logic; then
all the tasks can wait for the next checkpoint or the specified savepoint concurrently to commit all the remaining data.
This section will present the detailed protocol of the revised process. Since we have renamed
&lt;code&gt;close()&lt;/code&gt; / &lt;code&gt;dispose()&lt;/code&gt; to &lt;code&gt;finish()&lt;/code&gt; / &lt;code&gt;close()&lt;/code&gt;, we’ll stick to the new terminology in the following description.&lt;/p&gt;
&lt;p&gt;The revised process of finishing is shown as follows:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;1. Source tasks finished due to no more records or stop-with-savepoint.
a. if no more records or stop-with-savepoint --drain
i. source operators emit MAX_WATERMARK
ii. endInput(inputId) for all the operators
iii. finish() for all the operators
iv. emit EndOfData[isDrain = true] event
b. else if stop-with-savepoint
i. emit EndOfData[isDrain = false] event
c. Wait for the next checkpoint / the savepoint taken after the operators finished to complete
d. close() for all the operators
e. Emit EndOfPartitionEvent
f. Task cleanup
2. On received MAX_WATERMARK for non-source operators
a. Trigger all the event-time timers
b. Emit MAX_WATERMARK
3. On received EndOfData for non-source tasks
a. If isDrain
i. endInput(int inputId) for all the operators
ii. finish() for all the operators
b. Emit EndOfData[isDrain = the flag value of the received event]
4. On received EndOfPartitionEvent for non-source tasks
a. Wait for the next checkpoint / the savepoint taken after the operators finished to complete
b. close() for all the operators
c. Emit EndOfPartitionEvent
d. Task cleanup
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:60%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/example_job_finish.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em&quot;&gt;
Figure 3. An example job of the revised process of finishing.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;An example of the process of job finishing is shown in Figure 3.&lt;/p&gt;
&lt;p&gt;Let’s first have a look at the case where all the source tasks are bounded.
If Task &lt;code&gt;C&lt;/code&gt; finishes after processing all the records, it first emits the max-watermark, then finishes the operators and emits
the &lt;code&gt;EndOfData&lt;/code&gt; event. After that, it waits for the next checkpoint to complete and then emits the &lt;code&gt;EndOfPartitionEvent&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Task &lt;code&gt;D&lt;/code&gt; finishes all the operators right after receiving the &lt;code&gt;EndOfData&lt;/code&gt; event. Since any checkpoints taken after operators finish
can commit all the pending records and be the final checkpoint, Task &lt;code&gt;D&lt;/code&gt;’s final checkpoint would be the same as Task &lt;code&gt;C&lt;/code&gt;’s, because
the barrier must be emitted after the &lt;code&gt;EndOfData&lt;/code&gt; event.&lt;/p&gt;
&lt;p&gt;Task &lt;code&gt;E&lt;/code&gt; is a bit different in that it has two inputs. Task &lt;code&gt;A&lt;/code&gt; might continue to run for a while and, thus, Task &lt;code&gt;E&lt;/code&gt; needs to wait
until it receives an &lt;code&gt;EndOfData&lt;/code&gt; event from the other input as well before finishing its operators, so its final checkpoint might be a different one.&lt;/p&gt;
&lt;p&gt;On the other hand, when using &lt;code&gt;stop-with-savepoint [--drain]&lt;/code&gt;, the process is similar except that all the tasks need to wait for the exact
savepoint before finishing instead of just any checkpoints. Moreover, since both Task &lt;code&gt;C&lt;/code&gt; and Task &lt;code&gt;A&lt;/code&gt; would finish at the same time,
Task &lt;code&gt;E&lt;/code&gt; would also be able to wait for this particular savepoint before finishing.&lt;/p&gt;
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;In this part we have presented more details of how the checkpoints are taken with finished tasks and the revised process
of finishing. We hope these details provide more insight into the thinking and implementation behind this part of the work. Still, if you
have any questions, please feel free to start a discussion or report an issue in the dev or user mailing list.&lt;/p&gt;
</description>
<pubDate>Mon, 11 Jul 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/07/11/final-checkpoint-part2.html</link>
<guid isPermaLink="true">/2022/07/11/final-checkpoint-part2.html</guid>
</item>
<item>
<title>FLIP-147: Support Checkpoints After Tasks Finished - Part One</title>
<description>&lt;h1 id=&quot;motivation&quot;&gt;Motivation&lt;/h1&gt;
&lt;p&gt;Flink is a distributed processing engine for both unbounded and bounded streams of data. In recent versions,
Flink has unified the DataStream API and the Table / SQL API to support both streaming and batch cases.
Since most users require both types of data processing pipelines, the unification helps reduce the complexity of developing,
operating, and maintaining consistency between streaming and batch backfilling jobs, like
&lt;a href=&quot;https://www.ververica.com/blog/apache-flinks-stream-batch-unification-powers-alibabas-11.11-in-2020&quot;&gt;the case for Alibaba&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Flink provides two execution modes under the unified programming API: the streaming mode and the batch mode.
The streaming mode processes records incrementally based on state, thus it supports both bounded and unbounded sources.
The batch mode works with bounded sources and usually performs better for bounded jobs because it executes all the
tasks in topological order and avoids random state access by pre-sorting the input records. Although batch mode is often the
preferred mode to process bounded jobs, streaming mode is also required for various reasons. For example, users may want to deal
with records containing retraction or exploit the property that data is roughly sorted by event times in streaming mode
(like the case in &lt;a href=&quot;https://www.youtube.com/watch?t=666&amp;amp;v=4qSlsYogALo&amp;amp;feature=youtu.be&quot;&gt;Kappa+ Architecture&lt;/a&gt;). Moreover,
users often have mixed jobs involving both unbounded streams and bounded side-inputs, which also require streaming execution mode.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:70%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/stream_batch_cmp.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em;text-align:left;margin-top:-1em;margin-bottom: 4em&quot;&gt;
Figure 1. A comparison of the Streaming mode and Batch mode for the example Count operator. For streaming mode, the arrived
elements are not sorted, the operator would read / write the state corresponding to the element for computation.
For batch mode, the arrived elements are first sorted as a whole and then processed.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;In streaming mode, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/checkpointing/&quot;&gt;checkpointing&lt;/a&gt;
is the vital mechanism in supporting exactly-once guarantees. By periodically snapshotting the
aligned states of operators, Flink can recover from the latest checkpoint and continue execution when failover happens. However,
previously Flink could not take checkpoints if any task had finished. This would cause problems for jobs with both bounded and unbounded
sources: if there are no checkpoints after the bounded part finished, the unbounded part might need to reprocess a large amount of
records in case of a failure.&lt;/p&gt;
&lt;p&gt;Furthermore, being unable to take checkpoints with finished tasks is a problem for jobs using two-phase-commit sinks to achieve
&lt;a href=&quot;https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html&quot;&gt;end-to-end exactly-once processing&lt;/a&gt;.
The two-phase-commit sinks first write data to temporary files or external transactions,
and commit the data only after a checkpoint completes to ensure the data would not be replayed on failure. However, if a job
contains bounded sources, committing the results would not be possible after the bounded sources finish. For the same reason,
for bounded jobs there was no way to commit the last piece of data after the first source task finished, and previously bounded
jobs simply ignored the uncommitted data when finishing. These behaviors caused a lot of confusion and were frequently asked about on the user
mailing list.&lt;/p&gt;
&lt;p&gt;Therefore, to complete the support of streaming mode for jobs using bounded sources, it is important for us to&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Support taking checkpoints with finished tasks.&lt;/li&gt;
&lt;li&gt;Furthermore, revise the process of finishing so that all the data could always be committed.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The remainder of this blog briefly describes the changes we made to achieve the above goals. In the next blog,
we’ll share more details on how they are implemented.&lt;/p&gt;
&lt;h1 id=&quot;support-checkpointing-with-finished-tasks&quot;&gt;Support Checkpointing with Finished Tasks&lt;/h1&gt;
&lt;p&gt;The core idea of supporting checkpoints with finished tasks is to mark the finished operators in checkpoints and skip
executing these operators after recovery. As illustrated in Figure 2, a checkpoint is composed of the states of all
the operators. If all the subtasks of an operator have finished, we could mark it as fully finished and skip the
execution of this operator on startup. For other operators, their states are composed of the states of all the
running subtasks. The states will be repartitioned on restart and all the new subtasks will be restarted with the assigned states.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/checkpoint_format.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em;text-align:center;margin-top:-1em;margin-bottom: 4em&quot;&gt;
Figure 2. An illustration of the extended checkpoint format.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;To support creating such a checkpoint for jobs with finished tasks, we extended the checkpoint procedure.
Previously the checkpoint coordinator inside the JobManager first notifies all the sources to take snapshots,
then all the sources further notify their descendants via broadcasting barrier events. Since now the sources might
have already finished, the checkpoint coordinator would instead treat the running tasks that do not have any running
precedent tasks as “new sources”, and it notifies these tasks to initiate the checkpoints. Finally, if the subtasks of
an operator are either finished when the checkpoint is triggered or have finished processing all the data when snapshotting their states,
the operator would be marked as fully finished.&lt;/p&gt;
&lt;p&gt;The changes to the checkpoint procedure are transparent to users, except that for checkpoints that indeed contain
finished tasks, we disallow adding new operators as precedents of the fully finished ones, since that would give the fully
finished operators running precedents after restarting, which conflicts with the design that tasks finish
in topological order.&lt;/p&gt;
&lt;h1 id=&quot;revise-the-process-of-finishing&quot;&gt;Revise the Process of Finishing&lt;/h1&gt;
&lt;p&gt;Based on the ability to take checkpoints with finished tasks, we could then solve the issue that two-phase-commit
operators could not commit all the data when running in streaming mode. As background, Flink jobs
can finish in two ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;All sources are bounded and have processed all the input records. The job will finish after all the
input records are processed and all the results are committed to external systems.&lt;/li&gt;
&lt;li&gt;Users execute &lt;code&gt;stop-with-savepoint [--drain]&lt;/code&gt;. The job will take a savepoint and then finish. With &lt;code&gt;--drain&lt;/code&gt;, the job
will be stopped permanently and is also required to commit all the data. However, without &lt;code&gt;--drain&lt;/code&gt; the job might
be resumed from the savepoint later, thus not all data are required to be committed, as long as the state of the data could be
recovered from the savepoint.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let’s first have a look at the case of bounded sources. To achieve end-to-end exactly-once,
two-phase-commit operators only commit data after a checkpoint following this piece of data has succeeded.
However, previously there was no such opportunity for the data between the last periodic checkpoint and the end of the job,
and that data was finally lost. Note that it would also not be correct to simply commit the data when the job finishes, since
if there is a failover after that (e.g. due to other unfinished tasks failing), the data would be replayed and cause duplication.&lt;/p&gt;
&lt;p&gt;The case of &lt;code&gt;stop-with-savepoint --drain&lt;/code&gt; also has problems. The previous implementation first stalls the execution and
takes a savepoint. After the savepoint succeeds, all the source tasks would stop actively. Although the savepoint seems to
provide the opportunity to commit all the data, some processing logic is in fact executed while the job is being stopped,
and the records it produces would be discarded by mistake. For example, the &lt;code&gt;endInput()&lt;/code&gt; method of the operators is called during
the stopping phase, and some operators, like the async operator, might still emit new records in this method.&lt;/p&gt;
&lt;p&gt;Finally, although &lt;code&gt;stop-with-savepoint&lt;/code&gt; without draining is not required to commit all the data, we would like the job finishing process to
be unified for all the cases to keep the code clean.&lt;/p&gt;
&lt;p&gt;To fix the remaining issues, we need to modify the process of finishing to ensure all the data gets committed in the required cases.
An intuitive idea is to directly insert a step into the tasks’ lifecycle to wait for the next checkpoint, as shown in the left part
of Figure 3. However, this could not solve all the issues.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:90%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/finish_cmp.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em;text-align:left;margin-top:-1em;margin-bottom: 4em&quot;&gt;
Figure 3. A comparison of the two options to ensure tasks committed all the data before getting finished. The first
option directly inserts a step in the tasks’ lifecycle to wait for the next checkpoint, which disallows the tasks to wait
for the same checkpoint / savepoint. The second option decouples the notification of finishing operator logic and finishing tasks,
thus it allows all the tasks to first process all records, then they have the chance to wait for the same checkpoint / savepoint.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;For the case of bounded sources, the intuitive idea works, but it might have performance issues in some cases:
as exemplified in Figure 4, if there are multiple cascading tasks containing two-phase-commit sinks, each task would
wait for the next checkpoint separately, thus the job needs to wait for three more checkpoints during finishing,
which might prolong the total execution time considerably.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;20&quot; style=&quot;width:90%&quot; src=&quot;/img/blog/2022-07-11-final-checkpoint/example_job.png&quot; /&gt;
&lt;p style=&quot;font-size: 0.6em;text-align:center;margin-top:-1em;margin-bottom: 4em&quot;&gt;
Figure 4. An example job that contains a chain of tasks containing two-phase-commit operators.
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;For the case of &lt;code&gt;stop-with-savepoint [--drain]&lt;/code&gt;, the intuitive idea does not work since different tasks have to
wait for different checkpoints / savepoints, thus we could not finish the job with a specific savepoint.&lt;/p&gt;
&lt;p&gt;Therefore, we do not take the intuitive option. Instead, we decoupled &lt;em&gt;“finishing operator logic”&lt;/em&gt; and &lt;em&gt;“finishing tasks”&lt;/em&gt;:
all the tasks would first finish their execution logic as a whole, including calling lifecycle methods like &lt;code&gt;endInput()&lt;/code&gt;,
then each task could wait for the next checkpoint concurrently. Besides, for stop-with-savepoint we also revised the current
implementation similarly: all the tasks will first finish executing the operators’ logic, then they simply wait for the next savepoint
to happen before finishing. Therefore, in this way the finishing processes are unified and the data could be fully committed in all the cases.&lt;/p&gt;
&lt;p&gt;Based on this thought, as shown in the right part of Figure 3, to decouple the process of “finishing operator logic”
and “finishing tasks”, we introduced a new &lt;code&gt;EndOfData&lt;/code&gt; event. For each task, after executing all the operator logic it would first notify
the descendants with an &lt;code&gt;EndOfData&lt;/code&gt; event so that the descendants also have chances to finish executing the operator logic. Then all
the tasks could wait for the next checkpoint or the specified savepoint concurrently to commit all the remaining data before getting finished.&lt;/p&gt;
&lt;p&gt;Finally, it is also worth mentioning that we have clarified and renamed the &lt;code&gt;close()&lt;/code&gt; and &lt;code&gt;dispose()&lt;/code&gt; methods in the operators’ lifecycle.
The two methods are in fact different, since &lt;code&gt;close()&lt;/code&gt; is only called when the task finishes normally while &lt;code&gt;dispose()&lt;/code&gt; is called in both
cases of normal finishing and failover. However, this was not clear from their names. Therefore, we renamed the two methods to &lt;code&gt;finish()&lt;/code&gt; and &lt;code&gt;close()&lt;/code&gt; (see the sketch after the list below):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;finish()&lt;/code&gt; marks the termination of the operator and no more records are allowed after &lt;code&gt;finish()&lt;/code&gt; is called. It should
only be called when sources are finished or when the &lt;code&gt;--drain&lt;/code&gt; parameter is specified.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;close()&lt;/code&gt; is used to do cleanup and release all the held resources.&lt;/li&gt;
&lt;/ul&gt;
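&lt;p&gt;To illustrate the new contract, here is a minimal sketch (not taken from the Flink codebase) of a custom one-input operator that emits its buffered records in &lt;code&gt;finish()&lt;/code&gt; and releases resources in &lt;code&gt;close()&lt;/code&gt;; the buffering logic is purely illustrative:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.util.ArrayList;
import java.util.List;

import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

/** Buffers records and only emits them when the operator logic finishes. */
public class BufferingOperator extends AbstractStreamOperator&amp;lt;String&amp;gt;
        implements OneInputStreamOperator&amp;lt;String, String&amp;gt; {

    private transient List&amp;lt;String&amp;gt; buffer;

    @Override
    public void open() throws Exception {
        super.open();
        buffer = new ArrayList&amp;lt;&amp;gt;();
    }

    @Override
    public void processElement(StreamRecord&amp;lt;String&amp;gt; element) throws Exception {
        buffer.add(element.getValue());
    }

    @Override
    public void finish() throws Exception {
        // Called only on normal termination (end of input or stop-with-savepoint --drain):
        // flush everything that is still buffered; no more records may be emitted afterwards.
        for (String value : buffer) {
            output.collect(new StreamRecord&amp;lt;&amp;gt;(value));
        }
        super.finish();
    }

    @Override
    public void close() throws Exception {
        // Called on both normal termination and failover: release the held resources.
        buffer = null;
        super.close();
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;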
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;By supporting checkpoints after tasks have finished and revising the process of finishing, we can support checkpoints for jobs with
both bounded and unbounded sources, and ensure that bounded jobs get all output records committed before they finish. The motivation
behind this change is to ensure data consistency, result completeness, and failure recovery when there are bounded sources in the pipeline.
The final checkpoint mechanism was first implemented in Flink 1.14 and enabled by default in Flink 1.15. If you have any questions,
please feel free to start a discussion or report an issue in the dev or user mailing list.&lt;/p&gt;
</description>
<pubDate>Mon, 11 Jul 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/07/11/final-checkpoint-part1.html</link>
<guid isPermaLink="true">/2022/07/11/final-checkpoint-part1.html</guid>
</item>
<item>
<title>Apache Flink 1.15.1 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink Community is pleased to announce the first bug fix release of the Flink 1.15 series.&lt;/p&gt;
&lt;p&gt;This release includes 62 bug fixes, vulnerability fixes, and minor improvements for Flink 1.15.
Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351546&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.15.1.&lt;/p&gt;
&lt;h1 id=&quot;release-artifacts&quot;&gt;Release Artifacts&lt;/h1&gt;
&lt;h2 id=&quot;maven-dependencies&quot;&gt;Maven Dependencies&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.15.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.15.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.15.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;binaries&quot;&gt;Binaries&lt;/h2&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;docker-images&quot;&gt;Docker Images&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/_/flink?tab=tags&amp;amp;page=1&amp;amp;name=1.15.1&quot;&gt;library/flink&lt;/a&gt; (official images)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/r/apache/flink/tags?page=1&amp;amp;name=1.15.1&quot;&gt;apache/flink&lt;/a&gt; (ASF repository)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;pypi&quot;&gt;PyPI&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pypi.org/project/apache-flink/1.15.1/&quot;&gt;apache-flink==1.15.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h1&gt;
&lt;p&gt;The community is aware of 3 issues that were introduced with 1.15.0 that remain unresolved. Efforts are underway to fix these issues for Flink 1.15.2:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28861&quot;&gt;FLINK-28861&lt;/a&gt;] - Non-deterministic UID generation might cause issues during restore for Table/SQL API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28060&quot;&gt;FLINK-28060&lt;/a&gt;] - Kafka commit on checkpointing fails repeatedly after a broker restart
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28322&quot;&gt;FLINK-28322&lt;/a&gt;] - DataStreamScanProvider&#39;s new method is not compatible
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22984&quot;&gt;FLINK-22984&lt;/a&gt;] - UnsupportedOperationException when using Python UDF to generate watermark
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24491&quot;&gt;FLINK-24491&lt;/a&gt;] - ExecutionGraphInfo may not be archived when the dispatcher terminates
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24735&quot;&gt;FLINK-24735&lt;/a&gt;] - SQL client crashes with `Cannot add expression of different type to set`
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26645&quot;&gt;FLINK-26645&lt;/a&gt;] - Pulsar Source subscribe to a single topic partition will consume all partitions from that topic
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27041&quot;&gt;FLINK-27041&lt;/a&gt;] - KafkaSource in batch mode failing if any topic partition is empty
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27140&quot;&gt;FLINK-27140&lt;/a&gt;] - Move JobResultStore dirty entry creation into ioExecutor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27174&quot;&gt;FLINK-27174&lt;/a&gt;] - Non-null check for bootstrapServers field is incorrect in KafkaSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27218&quot;&gt;FLINK-27218&lt;/a&gt;] - Serializer in OperatorState has not been updated when new Serializers are NOT incompatible
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27223&quot;&gt;FLINK-27223&lt;/a&gt;] - State access doesn&amp;#39;t work as expected when cache size is set to 0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27247&quot;&gt;FLINK-27247&lt;/a&gt;] - ScalarOperatorGens.numericCasting is not compatible with legacy behavior
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27255&quot;&gt;FLINK-27255&lt;/a&gt;] - Flink-avro does not support serialization and deserialization of avro schema longer than 65535 characters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27282&quot;&gt;FLINK-27282&lt;/a&gt;] - Fix the bug of wrong positions mapping in RowCoder
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27367&quot;&gt;FLINK-27367&lt;/a&gt;] - SQL CAST between INT and DATE is broken
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27368&quot;&gt;FLINK-27368&lt;/a&gt;] - SQL CAST(&amp;#39; 1 &amp;#39; as BIGINT) returns wrong result
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27409&quot;&gt;FLINK-27409&lt;/a&gt;] - Cleanup stale slot allocation record when the resource requirement of a job is empty
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27418&quot;&gt;FLINK-27418&lt;/a&gt;] - Flink SQL TopN result is wrong
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27420&quot;&gt;FLINK-27420&lt;/a&gt;] - Suspended SlotManager fails to re-register metrics when started again
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27465&quot;&gt;FLINK-27465&lt;/a&gt;] - AvroRowDeserializationSchema.convertToTimestamp fails with negative nano seconds
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27487&quot;&gt;FLINK-27487&lt;/a&gt;] - KafkaMetricWrappers do incorrect cast
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27545&quot;&gt;FLINK-27545&lt;/a&gt;] - Update examples in PyFlink shell
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27563&quot;&gt;FLINK-27563&lt;/a&gt;] - Resource Providers - Yarn doc page has minor display error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27606&quot;&gt;FLINK-27606&lt;/a&gt;] - CompileException when using UDAF with merge() method
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27676&quot;&gt;FLINK-27676&lt;/a&gt;] - Output records from on_timer are behind the triggering watermark in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27683&quot;&gt;FLINK-27683&lt;/a&gt;] - Insert into (column1, column2) Values(.....) fails with SQL hints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27711&quot;&gt;FLINK-27711&lt;/a&gt;] - Correct the typo of set_topics_pattern by changing it to set_topic_pattern for Pulsar Connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27733&quot;&gt;FLINK-27733&lt;/a&gt;] - Rework on_timer output behind watermark bug fix
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27734&quot;&gt;FLINK-27734&lt;/a&gt;] - Not showing checkpoint interval properly in WebUI when checkpoint is disabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27760&quot;&gt;FLINK-27760&lt;/a&gt;] - NPE is thrown when executing PyFlink jobs in batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27762&quot;&gt;FLINK-27762&lt;/a&gt;] - Kafka WakeupException during handling splits changes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27797&quot;&gt;FLINK-27797&lt;/a&gt;] - PythonTableUtils.getCollectionInputFormat cannot correctly handle None values
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27848&quot;&gt;FLINK-27848&lt;/a&gt;] - ZooKeeperLeaderElectionDriver keeps writing leader information, using up zxid
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27881&quot;&gt;FLINK-27881&lt;/a&gt;] - The key(String) in PulsarMessageBuilder returns null
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27890&quot;&gt;FLINK-27890&lt;/a&gt;] - SideOutputExample.java fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27910&quot;&gt;FLINK-27910&lt;/a&gt;] - FileSink not enforcing rolling policy if started from scratch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27933&quot;&gt;FLINK-27933&lt;/a&gt;] - Savepoint status cannot be queried from standby jobmanager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27955&quot;&gt;FLINK-27955&lt;/a&gt;] - PyFlink installation failure on Windows OS
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27999&quot;&gt;FLINK-27999&lt;/a&gt;] - NoSuchMethodError when using Hive 3 dialect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28018&quot;&gt;FLINK-28018&lt;/a&gt;] - the start index to create empty splits in BinaryInputFormat#createInputSplits is inappropriate
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28019&quot;&gt;FLINK-28019&lt;/a&gt;] - Error in RetractableTopNFunction when retracting a stale record with state ttl enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-28114&quot;&gt;FLINK-28114&lt;/a&gt;] - The path of the Python client interpreter could not point to an archive file in distributed file system
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24586&quot;&gt;FLINK-24586&lt;/a&gt;] - SQL functions should return STRING instead of VARCHAR(2000)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26788&quot;&gt;FLINK-26788&lt;/a&gt;] - AbstractDeserializationSchema should add cause when throwing a FlinkRuntimeException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26909&quot;&gt;FLINK-26909&lt;/a&gt;] - Allow setting parallelism to -1 from CLI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27064&quot;&gt;FLINK-27064&lt;/a&gt;] - Centralize ArchUnit rules for production code
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27480&quot;&gt;FLINK-27480&lt;/a&gt;] - KafkaSources sharing the groupId might lead to InstanceAlreadyExistException warning
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27534&quot;&gt;FLINK-27534&lt;/a&gt;] - Apply scalafmt to 1.15 branch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27776&quot;&gt;FLINK-27776&lt;/a&gt;] - Throw exception when UDAF used in sliding window does not implement merge method in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27935&quot;&gt;FLINK-27935&lt;/a&gt;] - Add Pyflink example of create temporary view document
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Technical Debt
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25694&quot;&gt;FLINK-25694&lt;/a&gt;] - Upgrade Presto to resolve GSON/Alluxio Vulnerability
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26052&quot;&gt;FLINK-26052&lt;/a&gt;] - Update chinese documentation regarding FLIP-203
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26588&quot;&gt;FLINK-26588&lt;/a&gt;] - Translate the new SQL CAST documentation to Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27382&quot;&gt;FLINK-27382&lt;/a&gt;] - Make Job mode wait with cluster shutdown until the cleanup is done
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Wed, 06 Jul 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/07/06/release-1.15.1.html</link>
<guid isPermaLink="true">/news/2022/07/06/release-1.15.1.html</guid>
</item>
<item>
<title>Apache Flink 1.14.5 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink Community is pleased to announce another bug fix release for Flink 1.14.&lt;/p&gt;
&lt;p&gt;This release includes 67 bug fixes, vulnerability fixes and minor improvements for Flink 1.14.
Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351388&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.14.5.&lt;/p&gt;
&lt;h1 id=&quot;release-artifacts&quot;&gt;Release Artifacts&lt;/h1&gt;
&lt;h2 id=&quot;maven-dependencies&quot;&gt;Maven Dependencies&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;binaries&quot;&gt;Binaries&lt;/h2&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;docker-images&quot;&gt;Docker Images&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/_/flink?tab=tags&amp;amp;page=1&amp;amp;name=1.14.5&quot;&gt;library/flink&lt;/a&gt; (official images)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/r/apache/flink/tags?page=1&amp;amp;name=1.14.5&quot;&gt;apache/flink&lt;/a&gt; (ASF repository)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;pypi&quot;&gt;PyPi&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pypi.org/project/apache-flink/1.14.5/&quot;&gt;apache-flink==1.14.5&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h1&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25800&quot;&gt;FLINK-25800&lt;/a&gt;] - Update wrong links in the datastream/execution_mode.md page.
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22984&quot;&gt;FLINK-22984&lt;/a&gt;] - UnsupportedOperationException when using Python UDF to generate watermark
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24491&quot;&gt;FLINK-24491&lt;/a&gt;] - ExecutionGraphInfo may not be archived when the dispatcher terminates
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25227&quot;&gt;FLINK-25227&lt;/a&gt;] - Comparing the equality of the same (boxed) numeric values returns false
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25440&quot;&gt;FLINK-25440&lt;/a&gt;] - Apache Pulsar Connector Document description error about &#39;Starting Position&#39;.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25904&quot;&gt;FLINK-25904&lt;/a&gt;] - NullArgumentException when accessing checkpoint stats on standby JobManager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26016&quot;&gt;FLINK-26016&lt;/a&gt;] - FileSystemLookupFunction does not produce correct results when hive table uses columnar storage
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26018&quot;&gt;FLINK-26018&lt;/a&gt;] - Unnecessary late events when using the new KafkaSource
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26049&quot;&gt;FLINK-26049&lt;/a&gt;] - The tolerable-failed-checkpoints logic is invalid when checkpoint trigger failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26285&quot;&gt;FLINK-26285&lt;/a&gt;] - ZooKeeperStateHandleStore does not handle not existing nodes properly in getAllAndLock
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26334&quot;&gt;FLINK-26334&lt;/a&gt;] - When timestamp - offset + windowSize &amp;lt; 0, elements cannot be assigned to the correct window
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26381&quot;&gt;FLINK-26381&lt;/a&gt;] - Wrong document order of Chinese version
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26395&quot;&gt;FLINK-26395&lt;/a&gt;] - The description of RAND_INTEGER is wrong in SQL function documents
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26504&quot;&gt;FLINK-26504&lt;/a&gt;] - Fix the incorrect type error in unbounded Python UDAF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26536&quot;&gt;FLINK-26536&lt;/a&gt;] - PyFlink RemoteKeyedStateBackend#merge_namespaces bug
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26543&quot;&gt;FLINK-26543&lt;/a&gt;] - Fix the issue that exceptions generated in startup are missed in Python loopback mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26550&quot;&gt;FLINK-26550&lt;/a&gt;] - Correct the information of checkpoint failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26607&quot;&gt;FLINK-26607&lt;/a&gt;] - There are multiple MAX_LONG_VALUE value errors in pyflink code
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26629&quot;&gt;FLINK-26629&lt;/a&gt;] - Error in code comment for SubtaskStateMapper.RANGE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26645&quot;&gt;FLINK-26645&lt;/a&gt;] - Pulsar Source subscribe to a single topic partition will consume all partitions from that topic
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26708&quot;&gt;FLINK-26708&lt;/a&gt;] - TimestampsAndWatermarksOperator should not propagate WatermarkStatus
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26738&quot;&gt;FLINK-26738&lt;/a&gt;] - Default value of StateDescriptor is valid when enable state ttl config
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26775&quot;&gt;FLINK-26775&lt;/a&gt;] - PyFlink WindowOperator#process_element register wrong cleanup timer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26846&quot;&gt;FLINK-26846&lt;/a&gt;] - Gauge metrics doesn&#39;t work in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26855&quot;&gt;FLINK-26855&lt;/a&gt;] - ImportError: cannot import name &#39;environmentfilter&#39; from &#39;jinja2&#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26920&quot;&gt;FLINK-26920&lt;/a&gt;] - Job executes failed with &quot;The configured managed memory fraction for Python worker process must be within (0, 1], was: %s.&quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27108&quot;&gt;FLINK-27108&lt;/a&gt;] - State cache clean up doesn&#39;t work as expected
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27174&quot;&gt;FLINK-27174&lt;/a&gt;] - Non-null check for bootstrapServers field is incorrect in KafkaSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27223&quot;&gt;FLINK-27223&lt;/a&gt;] - State access doesn&#39;t work as expected when cache size is set to 0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27255&quot;&gt;FLINK-27255&lt;/a&gt;] - Flink-avro does not support serialization and deserialization of avro schema longer than 65535 characters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27315&quot;&gt;FLINK-27315&lt;/a&gt;] - Fix the demo of MemoryStateBackendMigration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27409&quot;&gt;FLINK-27409&lt;/a&gt;] - Cleanup stale slot allocation record when the resource requirement of a job is empty
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27442&quot;&gt;FLINK-27442&lt;/a&gt;] - Module flink-sql-avro-confluent-registry does not configure Confluent repo
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27545&quot;&gt;FLINK-27545&lt;/a&gt;] - Update examples in PyFlink shell
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27676&quot;&gt;FLINK-27676&lt;/a&gt;] - Output records from on_timer are behind the triggering watermark in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27733&quot;&gt;FLINK-27733&lt;/a&gt;] - Rework on_timer output behind watermark bug fix
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27751&quot;&gt;FLINK-27751&lt;/a&gt;] - Dependency resolution from repository.jboss.org fails on CI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27760&quot;&gt;FLINK-27760&lt;/a&gt;] - NPE is thrown when executing PyFlink jobs in batch mode
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26382&quot;&gt;FLINK-26382&lt;/a&gt;] - Add Chinese documents for flink-training exercises
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-5151&quot;&gt;FLINK-5151&lt;/a&gt;] - Add discussion about object mutations to heap-based state backend docs.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23843&quot;&gt;FLINK-23843&lt;/a&gt;] - Exceptions during &quot;SplitEnumeratorContext.runInCoordinatorThread()&quot; should cause Global Failure instead of Process Kill
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24274&quot;&gt;FLINK-24274&lt;/a&gt;] - Wrong parameter order in documentation of State Processor API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24384&quot;&gt;FLINK-24384&lt;/a&gt;] - Count checkpoints failed in trigger phase into numberOfFailedCheckpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26130&quot;&gt;FLINK-26130&lt;/a&gt;] - Document why and when user would like to increase network buffer size
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26575&quot;&gt;FLINK-26575&lt;/a&gt;] - Improve the info message when restoring keyed state backend
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26650&quot;&gt;FLINK-26650&lt;/a&gt;] - Avoid to print stack trace for checkpoint trigger failure if not all tasks are started
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26788&quot;&gt;FLINK-26788&lt;/a&gt;] - AbstractDeserializationSchema should add cause when throwing a FlinkRuntimeException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27088&quot;&gt;FLINK-27088&lt;/a&gt;] - The example of using StringDeserializer for deserializing Kafka message value as string has an error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27480&quot;&gt;FLINK-27480&lt;/a&gt;] - KafkaSources sharing the groupId might lead to InstanceAlreadyExistException warning
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-27776&quot;&gt;FLINK-27776&lt;/a&gt;] - Throw exception when UDAF used in sliding window does not implement merge method in PyFlink
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Technical Debt
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25694&quot;&gt;FLINK-25694&lt;/a&gt;] - Upgrade Presto to resolve GSON/Alluxio Vulnerability
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26352&quot;&gt;FLINK-26352&lt;/a&gt;] - Missing license header in WebUI source files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26961&quot;&gt;FLINK-26961&lt;/a&gt;] - Update multiple Jackson dependencies to v2.13.2 and v2.13.2.1
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Wed, 22 Jun 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/06/22/release-1.14.5.html</link>
<guid isPermaLink="true">/news/2022/06/22/release-1.14.5.html</guid>
</item>
<item>
<title>Adaptive Batch Scheduler: Automatically Decide Parallelism of Flink Batch Jobs</title>
<description>&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#get-started&quot; id=&quot;markdown-toc-get-started&quot;&gt;Get Started&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#configure-to-use-adaptive-batch-scheduler&quot; id=&quot;markdown-toc-configure-to-use-adaptive-batch-scheduler&quot;&gt;Configure to use adaptive batch scheduler&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#set-the-parallelism-of-operators-to--1&quot; id=&quot;markdown-toc-set-the-parallelism-of-operators-to--1&quot;&gt;Set the parallelism of operators to -1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#implementation-details&quot; id=&quot;markdown-toc-implementation-details&quot;&gt;Implementation Details&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#collect-sizes-of-consumed-datasets&quot; id=&quot;markdown-toc-collect-sizes-of-consumed-datasets&quot;&gt;Collect sizes of consumed datasets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#decide-proper-parallelisms-of-job-vertices&quot; id=&quot;markdown-toc-decide-proper-parallelisms-of-job-vertices&quot;&gt;Decide proper parallelisms of job vertices&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#limit-the-maximum-ratio-of-broadcast-bytes&quot; id=&quot;markdown-toc-limit-the-maximum-ratio-of-broadcast-bytes&quot;&gt;Limit the maximum ratio of broadcast bytes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#normalize-the-parallelism-to-the-closest-power-of-2&quot; id=&quot;markdown-toc-normalize-the-parallelism-to-the-closest-power-of-2&quot;&gt;Normalize the parallelism to the closest power of 2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#build-up-execution-graph-dynamically&quot; id=&quot;markdown-toc-build-up-execution-graph-dynamically&quot;&gt;Build up execution graph dynamically&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#create-execution-vertices-and-execution-edges-lazily&quot; id=&quot;markdown-toc-create-execution-vertices-and-execution-edges-lazily&quot;&gt;Create execution vertices and execution edges lazily&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flexible-subpartition-mapping&quot; id=&quot;markdown-toc-flexible-subpartition-mapping&quot;&gt;Flexible subpartition mapping&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#update-and-schedule-the-dynamic-execution-graph&quot; id=&quot;markdown-toc-update-and-schedule-the-dynamic-execution-graph&quot;&gt;Update and schedule the dynamic execution graph&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#future-improvement&quot; id=&quot;markdown-toc-future-improvement&quot;&gt;Future improvement&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#auto-rebalancing-of-workloads&quot; id=&quot;markdown-toc-auto-rebalancing-of-workloads&quot;&gt;Auto-rebalancing of workloads&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Deciding proper parallelisms of operators is not easy work for many users. For batch jobs, a small parallelism may result in long execution times and costly failover regressions, while an unnecessarily large parallelism may result in resource waste and more overhead in task deployment and network shuffling.&lt;/p&gt;
&lt;p&gt;To decide a proper parallelism, one needs to know how much data each operator needs to process. However, it can be hard to predict the data volume to be processed by a job because it can differ every day, and it can be even harder or impossible (due to complex operators or UDFs) to predict the data volume to be processed by each operator.&lt;/p&gt;
&lt;p&gt;To solve this problem, we introduced the adaptive batch scheduler in Flink 1.15. The adaptive batch scheduler can automatically decide parallelism of an operator according to the size of its consumed datasets. Here are the benefits the adaptive batch scheduler can bring:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Batch job users can be relieved from parallelism tuning.&lt;/li&gt;
&lt;li&gt;Parallelism tuning is fine-grained and applies to individual operators. This is particularly beneficial for SQL jobs, which previously could only be given a global parallelism.&lt;/li&gt;
&lt;li&gt;Parallelism tuning can better fit consumed datasets whose volume varies from day to day.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&quot;get-started&quot;&gt;Get Started&lt;/h1&gt;
&lt;p&gt;To automatically decide parallelism of operators, you need to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Configure to use adaptive batch scheduler.&lt;/li&gt;
&lt;li&gt;Set the parallelism of operators to -1.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&quot;configure-to-use-adaptive-batch-scheduler&quot;&gt;Configure to use adaptive batch scheduler&lt;/h2&gt;
&lt;p&gt;To use adaptive batch scheduler, you need to set configurations as below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set &lt;code&gt;jobmanager.scheduler: AdaptiveBatch&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Leave the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#execution-batch-shuffle-mode&quot;&gt;execution.batch-shuffle-mode&lt;/a&gt; unset or explicitly set it to &lt;code&gt;ALL-EXCHANGES-BLOCKING&lt;/code&gt; (default value). Currently, the adaptive batch scheduler only supports batch jobs whose shuffle mode is &lt;code&gt;ALL-EXCHANGES-BLOCKING&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
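&lt;p&gt;Put together, a minimal &lt;code&gt;flink-conf.yaml&lt;/code&gt; fragment for this could look like the following sketch (the shuffle mode entry is optional, since &lt;code&gt;ALL-EXCHANGES-BLOCKING&lt;/code&gt; is already the default):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# Enable the adaptive batch scheduler
jobmanager.scheduler: AdaptiveBatch
# Optional: ALL-EXCHANGES-BLOCKING is the default shuffle mode
execution.batch-shuffle-mode: ALL-EXCHANGES-BLOCKING
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;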
&lt;p&gt;In addition, there are several related configuration options to control the upper bounds and lower bounds of tuned parallelisms, to specify expected data volume to process by each operator, and to specify the default parallelism of sources. More details can be found in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/elastic_scaling/#configure-to-use-adaptive-batch-scheduler&quot;&gt;feature documentation page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;set-the-parallelism-of-operators-to--1&quot;&gt;Set the parallelism of operators to -1&lt;/h2&gt;
&lt;p&gt;The adaptive batch scheduler only automatically decides parallelism of operators whose parallelism is not set (which means the parallelism is -1). To leave parallelism unset, you should configure as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set &lt;code&gt;parallelism.default: -1&lt;/code&gt; for all jobs.&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;table.exec.resource.default-parallelism: -1&lt;/code&gt; for SQL jobs.&lt;/li&gt;
&lt;li&gt;Don’t call &lt;code&gt;setParallelism()&lt;/code&gt; for operators in DataStream/DataSet jobs.&lt;/li&gt;
&lt;li&gt;Don’t call &lt;code&gt;setParallelism()&lt;/code&gt; on &lt;code&gt;StreamExecutionEnvironment/ExecutionEnvironment&lt;/code&gt; in DataStream/DataSet jobs.&lt;/li&gt;
&lt;/ul&gt;
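&lt;p&gt;For illustration, a minimal batch DataStream job that leaves every parallelism unset could look like the sketch below (it assumes &lt;code&gt;parallelism.default: -1&lt;/code&gt; is set in the cluster configuration as described above):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative sketch: a batch DataStream job with no explicit parallelism,
// so the adaptive batch scheduler can decide the parallelism at runtime.
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AdaptiveBatchSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);
        // Note: no env.setParallelism(...) and no operator-level setParallelism(...).
        env.fromElements(&quot;flink&quot;, &quot;adaptive&quot;, &quot;batch&quot;, &quot;scheduler&quot;)
                .map(String::toUpperCase)
                .print();
        env.execute(&quot;adaptive-batch-sketch&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;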
&lt;h1 id=&quot;implementation-details&quot;&gt;Implementation Details&lt;/h1&gt;
&lt;p&gt;In this section, we will elaborate on the details of the implementation. Before that, we need to briefly introduce some concepts involved:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobVertex.java&quot;&gt;JobVertex&lt;/a&gt; and &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java&quot;&gt;JobGraph&lt;/a&gt;: A job vertex is an operator chain formed by chaining several operators together for better performance. The job graph is a data flow consisting of job vertices.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.java&quot;&gt;ExecutionVertex&lt;/a&gt; and &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java&quot;&gt;ExecutionGraph&lt;/a&gt;: An execution vertex represents a parallel subtask of a job vertex, which will eventually be instantiated as a physical task. For example, a job vertex with a parallelism of 100 will generate 100 execution vertices. The execution graph is the physical execution topology consisting of all execution vertices.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;More details about the above concepts can be found in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/internals/job_scheduling/#jobmanager-data-structures&quot;&gt;Flink documentation&lt;/a&gt;. Note that the adaptive batch scheduler decides the parallelism of operators by deciding the parallelism of the job vertices which consist of these operators. To automatically decide parallelism of job vertices, we introduced the following changes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Enabled the scheduler to collect sizes of finished datasets.&lt;/li&gt;
&lt;li&gt;Introduced a new component &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexParallelismDecider.java&quot;&gt;VertexParallelismDecider&lt;/a&gt; to compute proper parallelisms of job vertices according to the sizes of their consumed results.&lt;/li&gt;
&lt;li&gt;Enabled to dynamically build up execution graph to allow the parallelisms of job vertices to be decided lazily. The execution graph starts with an empty execution topology and then gradually attaches the vertices during job execution.&lt;/li&gt;
&lt;li&gt;Introduced the adaptive batch scheduler to update and schedule the dynamic execution graph.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The details will be introduced in the following sections.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/1-overall-structure.png&quot; width=&quot;60%&quot; /&gt;
&lt;br /&gt;
Fig. 1 - The overall structure of automatically deciding parallelism
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h2 id=&quot;collect-sizes-of-consumed-datasets&quot;&gt;Collect sizes of consumed datasets&lt;/h2&gt;
&lt;p&gt;The adaptive batch scheduler decides the parallelism of vertices based on the size of their input results, so the scheduler needs to know the sizes of the result partitions produced by tasks. We introduced a numBytesProduced counter to record the size of each produced result partition; the accumulated result of the counter is sent to the scheduler when a task finishes.&lt;/p&gt;
&lt;h2 id=&quot;decide-proper-parallelisms-of-job-vertices&quot;&gt;Decide proper parallelisms of job vertices&lt;/h2&gt;
&lt;p&gt;We introduced a new component &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexParallelismDecider.java&quot;&gt;VertexParallelismDecider&lt;/a&gt; to compute proper parallelisms of job vertices according to the sizes of their consumed results. The computation algorithm is as follows:&lt;/p&gt;
&lt;p&gt;Suppose&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;V&lt;/em&gt;&lt;/strong&gt; is the bytes of data the user expects to be processed by each task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;totalBytes&lt;sub&gt;non-broadcast&lt;/sub&gt;&lt;/em&gt;&lt;/strong&gt; is the sum of the non-broadcast result sizes consumed by this job vertex.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;totalBytes&lt;sub&gt;broadcast&lt;/sub&gt;&lt;/em&gt;&lt;/strong&gt; is the sum of the broadcast result sizes consumed by this job vertex.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;maxBroadcastRatio&lt;/em&gt;&lt;/strong&gt; is the maximum ratio of broadcast bytes that affects the parallelism calculation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;normalize(&lt;/em&gt;&lt;/strong&gt;x&lt;strong&gt;&lt;em&gt;)&lt;/em&gt;&lt;/strong&gt; is a function that rounds &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; to the closest power of 2.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;then the parallelism of this job vertex &lt;strong&gt;&lt;em&gt;P&lt;/em&gt;&lt;/strong&gt; will be:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/parallelism-formula.png&quot; width=&quot;60%&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;Note that we introduced two special treatments in the above formula:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#limit-the-maximum-ratio-of-broadcast-bytes&quot;&gt;Limit the maximum ratio of broadcast bytes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#normalize-the-parallelism-to-the-closest-power-of-2&quot;&gt;Normalize the parallelism to the closest power of 2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
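&lt;p&gt;Expressed as a small Java sketch (for illustration only; the actual logic lives in &lt;code&gt;VertexParallelismDecider&lt;/code&gt; and also respects the configured minimum and maximum parallelism), the calculation could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative sketch of the parallelism calculation described above;
// it is not the actual VertexParallelismDecider implementation.
public final class ParallelismSketch {

    // The normalize step: round to a nearby power of 2 (closest in log scale).
    static long normalize(double value) {
        long exponent = Math.round(Math.log(value) / Math.log(2));
        return Math.round(Math.pow(2, exponent));
    }

    // v:                 bytes the user expects each task to process
    // nonBroadcastBytes: sum of non-broadcast result sizes consumed by the job vertex
    // broadcastBytes:    sum of broadcast result sizes consumed by the job vertex
    // maxBroadcastRatio: cap on the broadcast share of v (0.5 by default)
    static long decideParallelism(
            long v, long nonBroadcastBytes, long broadcastBytes, double maxBroadcastRatio) {
        // Broadcast data is read fully by every task, so it is subtracted from the
        // per-task budget v, but at most maxBroadcastRatio * v of it counts.
        double broadcastBudget = Math.min(broadcastBytes, maxBroadcastRatio * v);
        double nonBroadcastBudget = v - broadcastBudget;
        double rawParallelism = nonBroadcastBytes / nonBroadcastBudget;
        return Math.max(1, normalize(rawParallelism));
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;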
&lt;p&gt;However, the above formula cannot be used to decide the parallelism of the source vertices, because the source vertices have no input. To solve this, we introduced the configuration option &lt;code&gt;jobmanager.adaptive-batch-scheduler.default-source-parallelism&lt;/code&gt; to allow users to manually configure the parallelism of source vertices. Note that not all data sources need this option, because some data sources can automatically infer parallelism (for example, HiveTableSource; see &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveParallelismInference.java&quot;&gt;HiveParallelismInference&lt;/a&gt; for more detail). For these sources, it is recommended to let them decide the parallelism by themselves.&lt;/p&gt;
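&lt;p&gt;For example, a default source parallelism could be configured as follows (the value 8 is purely illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# flink-conf.yaml, illustrative value only
jobmanager.adaptive-batch-scheduler.default-source-parallelism: 8
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;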
&lt;h3 id=&quot;limit-the-maximum-ratio-of-broadcast-bytes&quot;&gt;Limit the maximum ratio of broadcast bytes&lt;/h3&gt;
&lt;p&gt;As you can see, we limit the maximum ratio of broadcast bytes that affects the parallelism calculation to &lt;strong&gt;&lt;em&gt;maxBroadcastRatio&lt;/em&gt;&lt;/strong&gt;. That is, the non-broadcast bytes processed by each task are at least &lt;strong&gt;&lt;em&gt;(1-maxBroadcastRatio) * V&lt;/em&gt;&lt;/strong&gt;. Otherwise, when the total broadcast bytes are close to &lt;strong&gt;&lt;em&gt;V&lt;/em&gt;&lt;/strong&gt;, even a very small amount of non-broadcast bytes could lead to a large parallelism, which is unnecessary and may cause resource waste and large task deployment overhead.&lt;/p&gt;
&lt;p&gt;Generally, the broadcast dataset is relatively small compared to the other co-processed datasets, so we set the maximum ratio to 0.5 by default. The value is hard-coded in the first version; we may make it configurable later.&lt;/p&gt;
&lt;h3 id=&quot;normalize-the-parallelism-to-the-closest-power-of-2&quot;&gt;Normalize the parallelism to the closest power of 2&lt;/h3&gt;
&lt;p&gt;The normalization is meant to avoid introducing data skew. To better understand this section, we suggest you read the &lt;a href=&quot;#flexible-subpartition-mapping&quot;&gt;Flexible subpartition mapping&lt;/a&gt; section first.&lt;/p&gt;
&lt;p&gt;Taking Fig. 4 (b) as an example, A1/A2 produce 4 subpartitions, and the decided parallelism of B is 3. In this case, B1 will consume 1 subpartition, B2 will consume 1 subpartition, and B3 will consume 2 subpartitions. Assuming the subpartitions contain the same amount of data, B3 will consume twice as much data as the other tasks, so data skew is introduced by the subpartition mapping.&lt;/p&gt;
&lt;p&gt;To solve this problem, we need to make the subpartitions evenly consumed by downstream tasks, which means the number of subpartitions should be a multiple of the number of downstream tasks. For simplicity, we require the user-specified max parallelism to be 2&lt;sup&gt;N&lt;/sup&gt;, and then adjust the calculated parallelism to the closest 2&lt;sup&gt;M&lt;/sup&gt; (M &amp;lt;= N), so that we can guarantee that subpartitions will be evenly consumed by downstream tasks.&lt;/p&gt;
&lt;p&gt;Note that this is a temporary solution; the ultimate solution would be the &lt;a href=&quot;#auto-rebalancing-of-workloads&quot;&gt;Auto-rebalancing of workloads&lt;/a&gt;, which may come soon.&lt;/p&gt;
&lt;h2 id=&quot;build-up-execution-graph-dynamically&quot;&gt;Build up execution graph dynamically&lt;/h2&gt;
&lt;p&gt;Before the adaptive batch scheduler was introduced to Flink, the execution graph was fully built in a static way before starting scheduling. To allow parallelisms of job vertices to be decided lazily, the execution graph must be able to be built up dynamically.&lt;/p&gt;
&lt;h3 id=&quot;create-execution-vertices-and-execution-edges-lazily&quot;&gt;Create execution vertices and execution edges lazily&lt;/h3&gt;
&lt;p&gt;A dynamic execution graph means that a Flink job starts with an empty execution topology, and then gradually attaches vertices during job execution, as shown in Fig. 2.&lt;/p&gt;
&lt;p&gt;The execution topology consists of execution vertices and execution edges. The execution vertices will be created and attached to the execution topology only when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The parallelism of the corresponding job vertex is decided.&lt;/li&gt;
&lt;li&gt;All upstream execution vertices are already attached.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The parallelism of the job vertex needs to be decided first so that Flink knows how many execution vertices should be created. Upstream execution vertices need to be attached first so that Flink can connect the newly created execution vertices to the upstream vertices with execution edges.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/2-dynamic-graph.png&quot; width=&quot;90%&quot; /&gt;
&lt;br /&gt;
Fig. 2 - Build up execution graph dynamically
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h3 id=&quot;flexible-subpartition-mapping&quot;&gt;Flexible subpartition mapping&lt;/h3&gt;
&lt;p&gt;Before the adaptive batch scheduler was introduced to Flink, when deploying a task, Flink needed to know the parallelism of its consumer job vertex, because the consumer vertex parallelism is used to decide the number of subpartitions produced by each upstream task. The reason behind that is that, for one result partition, different subpartitions serve different consumer execution vertices. More specifically, one consumer execution vertex only consumes data from the subpartition with the same index.&lt;/p&gt;
&lt;p&gt;Taking Fig. 3 as an example, the parallelism of the consumer B is 2, so the result partition produced by A1/A2 should contain 2 subpartitions: the subpartition with index 0 serves B1, and the subpartition with index 1 serves B2.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/3-static-graph-subpartition-mapping.png&quot; width=&quot;30%&quot; /&gt;
&lt;br /&gt;
Fig. 3 - How subpartitions serve consumer execution vertices with static execution graph
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;But obviously, this doesn’t work for dynamic graphs, because when a job vertex is deployed, the parallelism of its consumer job vertices may not have been decided yet. To enable Flink to work in this case, we need a way to allow a job vertex to run without knowing the parallelism of its consumer job vertices.&lt;/p&gt;
&lt;p&gt;To achieve this goal, we can set the number of subpartitions to be the max parallelism of the consumer job vertex. Then, when the consumer execution vertices are deployed, they should be assigned a subpartition range to consume. Suppose N is the number of consumer execution vertices and P is the number of subpartitions. For the kth consumer execution vertex, the consumed subpartition range should be:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/range-formula.png&quot; width=&quot;55%&quot; /&gt;
&lt;/center&gt;
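&lt;p&gt;A small sketch of this range computation, assuming the usual floor-based division (which matches the Fig. 4 examples discussed next), could look like:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative sketch: compute the inclusive subpartition range for the k-th
// consumer execution vertex, where numConsumers is N and numSubpartitions is P.
static int[] subpartitionRange(int k, int numConsumers, int numSubpartitions) {
    int startInclusive = k * numSubpartitions / numConsumers;
    int endInclusive = (k + 1) * numSubpartitions / numConsumers - 1;
    return new int[] {startInclusive, endInclusive};
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;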
&lt;p&gt;Taking Fig. 4 as an example, the max parallelism of B is 4, so A1/A2 have 4 subpartitions. If the decided parallelism of B is 2, the subpartition mapping will be as shown in Fig. 4 (a); if the decided parallelism of B is 3, the subpartition mapping will be as shown in Fig. 4 (b).&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/4-dynamic-graph-subpartition-mapping.png&quot; width=&quot;75%&quot; /&gt;
&lt;br /&gt;
Fig. 4 - How subpartitions serve consumer execution vertices with dynamic graph
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h2 id=&quot;update-and-schedule-the-dynamic-execution-graph&quot;&gt;Update and schedule the dynamic execution graph&lt;/h2&gt;
&lt;p&gt;Scheduling with the adaptive batch scheduler is similar to the default scheduler; the only difference is that an empty dynamic execution graph is generated initially and vertices are attached later. Before handling any scheduling event, the scheduler will try to decide the parallelisms of job vertices, and then initialize them to generate execution vertices, connect execution edges, and update the execution graph.&lt;/p&gt;
&lt;p&gt;The scheduler will try to decide the parallelism of all job vertices before handling each scheduling event, and the parallelism decision will be made for each job vertex in topological order:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For source vertices, the parallelism should have been decided before starting scheduling.&lt;/li&gt;
&lt;li&gt;For non-source vertices, the parallelism can be decided only when all its consumed results are fully produced.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After deciding the parallelism, the scheduler will try to initialize the job vertices in topological order. A job vertex that can be initialized should meet the following conditions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The parallelism of the job vertex has been decided and the job vertex has not been initialized yet.&lt;/li&gt;
&lt;li&gt;All upstream job vertices have been initialized.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;future-improvement&quot;&gt;Future improvement&lt;/h1&gt;
&lt;h2 id=&quot;auto-rebalancing-of-workloads&quot;&gt;Auto-rebalancing of workloads&lt;/h2&gt;
&lt;p&gt;When running batch jobs, data skew may occur (a task needs to process much more data than other tasks), which leads to long-tail tasks and slows down job completion. Users usually hope that the system can solve this problem automatically.
One typical data skew case is that some subpartitions have a significantly larger amount of data than others. This case can be solved by finer-grained subpartitions and auto-rebalancing of the workload. The work on the adaptive batch scheduler can be considered a first step towards it, because the requirements of auto-rebalancing are similar to those of the adaptive batch scheduler: both need the support of dynamic graphs and the collection of result partition sizes.
Based on the implementation of the adaptive batch scheduler, we can solve the above problem by increasing the max parallelism (for finer-grained subpartitions) and changing the subpartition range division algorithm (for auto-rebalancing). In the current design, the subpartition range is divided according to the number of subpartitions; we can change it to divide according to the amount of data in the subpartitions, so that the amount of data within each subpartition range is approximately the same. In this way, the workloads of downstream tasks can be balanced.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-06-17-adaptive-batch-scheduler/5-auto-rebalance.png&quot; width=&quot;75%&quot; /&gt;
&lt;br /&gt;
Fig. 5 - Auto-rebalance with finer grained subpartitions
&lt;/center&gt;
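&lt;p&gt;As a purely hypothetical sketch of that direction (not part of the current scheduler), the range division could be driven by cumulative bytes instead of subpartition indices:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Hypothetical sketch only: split subpartitions into consecutive ranges so that
// each consumer receives roughly the same number of bytes. Consumer k reads the
// subpartitions from splitPoints[k - 1] (or 0 for k = 0) up to splitPoints[k],
// end index exclusive. A single oversized subpartition can still leave some
// consumers empty, which is why finer grained subpartitions are also needed.
static int[] splitPointsByBytes(long[] subpartitionBytes, int numConsumers) {
    long totalBytes = 0;
    for (long bytes : subpartitionBytes) {
        totalBytes += bytes;
    }
    int[] splitPoints = new int[numConsumers];
    long runningBytes = 0;
    int index = 0;
    for (int k = 0; k &amp;lt; numConsumers; k++) {
        // Target cumulative amount of data after consumer k.
        long target = totalBytes * (k + 1) / numConsumers;
        while (index &amp;lt; subpartitionBytes.length) {
            if (runningBytes + subpartitionBytes[index] &amp;gt; target) {
                break;
            }
            runningBytes += subpartitionBytes[index];
            index++;
        }
        splitPoints[k] = index;
    }
    // Make sure the last consumer covers any remaining (e.g. empty) subpartitions.
    splitPoints[numConsumers - 1] = subpartitionBytes.length;
    return splitPoints;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;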
</description>
<pubDate>Fri, 17 Jun 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/06/17/adaptive-batch-scheduler.html</link>
<guid isPermaLink="true">/2022/06/17/adaptive-batch-scheduler.html</guid>
</item>
<item>
<title>Apache Flink Kubernetes Operator 1.0.0 Release Announcement</title>
<description>&lt;p&gt;In the last two months since our &lt;a href=&quot;https://flink.apache.org/news/2022/04/03/release-kubernetes-operator-0.1.0.html&quot;&gt;initial preview release&lt;/a&gt; the community has been hard at work to stabilize and improve the core Flink Kubernetes Operator logic.
We are now proud to announce the first production ready release of the operator project.&lt;/p&gt;
&lt;h2 id=&quot;release-highlights&quot;&gt;Release Highlights&lt;/h2&gt;
&lt;p&gt;The Flink Kubernetes Operator 1.0.0 version brings numerous improvements and new features to almost every aspect of the operator.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;strong&gt;v1beta1&lt;/strong&gt; API version &amp;amp; compatibility guarantees&lt;/li&gt;
&lt;li&gt;Session Job Management support&lt;/li&gt;
&lt;li&gt;Support for Flink 1.13, 1.14 and 1.15&lt;/li&gt;
&lt;li&gt;Deployment recovery and rollback&lt;/li&gt;
&lt;li&gt;New Operator metrics&lt;/li&gt;
&lt;li&gt;Improved configuration management&lt;/li&gt;
&lt;li&gt;Custom validators&lt;/li&gt;
&lt;li&gt;Savepoint history and cleanup&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;new-api-version-and-compatibility-guarantees&quot;&gt;New API version and compatibility guarantees&lt;/h3&gt;
&lt;p&gt;The 1.0.0 release brings a new API version: &lt;strong&gt;v1beta1&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Don’t let the name confuse you: we consider v1beta1 the first production-ready API release, and we will maintain backward compatibility for your applications going forward.&lt;/p&gt;
&lt;p&gt;If you are already using the 0.1.0 preview release you can read about the upgrade process &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/docs/operations/upgrade/#upgrading-from-v1alpha1---v1beta1&quot;&gt;here&lt;/a&gt;, or check our detailed &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/docs/operations/compatibility/&quot;&gt;compatibility guarantees&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;session-job-management&quot;&gt;Session Job Management&lt;/h3&gt;
&lt;p&gt;One of the most exciting new features of 1.0.0 is the introduction of the FlinkSessionJob resource. In contrast with the FlinkDeployment that allows us to manage Application and Session Clusters, the FlinkSessionJob allows users to manage Flink jobs on a running Session deployment.&lt;/p&gt;
&lt;p&gt;This is extremely valuable in environments where users want to deploy Flink jobs quickly and iteratively and also allows cluster administrators to manage the session cluster independently of the running jobs.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;flink.apache.org/v1beta1&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;FlinkSessionJob&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;basic-session-job-example&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;deploymentName&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;basic-session-cluster&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;job&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;jarURI&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.15.0/flink-examples-streaming_2.12-1.15.0-TopSpeedWindowing.jar&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;parallelism&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;upgradeMode&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;stateless&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
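&lt;p&gt;Such a resource is managed like any other Kubernetes object; for example, assuming the manifest above is saved as &lt;code&gt;basic-session-job.yaml&lt;/code&gt;, it can be submitted with:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;kubectl apply -f basic-session-job.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;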
&lt;h3 id=&quot;multi-version-flink-support&quot;&gt;Multi-version Flink support&lt;/h3&gt;
&lt;p&gt;The Flink Kubernetes Operator now supports the following Flink versions out-of-the box:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Flink 1.15 (Recommended)&lt;/li&gt;
&lt;li&gt;Flink 1.14&lt;/li&gt;
&lt;li&gt;Flink 1.13&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Flink 1.15 comes with a set of features that allow deeper integration for the operator. We recommend using Flink 1.15 to get the best possible operational experience.&lt;/p&gt;
&lt;h3 id=&quot;deployment-recovery-and-rollbacks&quot;&gt;Deployment Recovery and Rollbacks&lt;/h3&gt;
&lt;p&gt;We have added two new features to make Flink cluster operations smoother when using the operator.&lt;/p&gt;
&lt;p&gt;Now the operator will try to recover Flink JobManager deployments that went missing for some reason. Maybe it was accidentally deleted by the user or another service in the cluster. As long as HA was enabled and the job did not fatally fail, the operator will try to restore the job from the latest available checkpoint.&lt;/p&gt;
&lt;p&gt;We also added experimental support for application upgrade rollbacks. With this feature the operator will monitor new application upgrades and if they don’t become stable (healthy &amp;amp; running) within a configurable period, they will be rolled back to the latest stable specification previously deployed.&lt;/p&gt;
&lt;p&gt;While this feature will likely see improvements and new settings in the coming versions, it already provides benefits in cases with a large number of jobs with strong uptime requirements, where it is better to roll back than to be stuck in a failing state.&lt;/p&gt;
&lt;h3 id=&quot;improved-operator-metrics&quot;&gt;Improved Operator Metrics&lt;/h3&gt;
&lt;p&gt;Beyond the existing JVM based system metrics, additional Operator specific metrics were added to the current release.&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Metrics&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Namespace&lt;/td&gt;
&lt;td&gt;FlinkDeployment.Count&lt;/td&gt;
&lt;td&gt;Number of managed FlinkDeployment instances per namespace&lt;/td&gt;
&lt;td&gt;Gauge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Namespace&lt;/td&gt;
&lt;td&gt;FlinkDeployment.&amp;lt;Status&amp;gt;.Count&lt;/td&gt;
&lt;td&gt;Number of managed FlinkDeployment resources per &amp;lt;Status&amp;gt; per namespace. &amp;lt;Status&amp;gt; can take values from: READY, DEPLOYED_NOT_READY, DEPLOYING, MISSING, ERROR&lt;/td&gt;
&lt;td&gt;Gauge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Namespace&lt;/td&gt;
&lt;td&gt;FlinkSessionJob.Count&lt;/td&gt;
&lt;td&gt;Number of managed FlinkSessionJob instances per namespace&lt;/td&gt;
&lt;td&gt;Gauge&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;
&lt;p&gt;Our intention is to advance further on the &lt;a href=&quot;https://operatorframework.io/operator-capabilities/&quot;&gt;Operator Maturity Model&lt;/a&gt; by adding more dynamic/automatic features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Standalone deployment mode support &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-225%3A+Implement+standalone+mode+support+in+the+kubernetes+operator&quot;&gt;FLIP-225&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Auto-scaling using Horizontal Pod Autoscaler&lt;/li&gt;
&lt;li&gt;Dynamic change of watched namespaces&lt;/li&gt;
&lt;li&gt;Pluggable Status and Event reporters (Making it easier to integrate with proprietary control planes)&lt;/li&gt;
&lt;li&gt;SQL jobs support&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;release-resources&quot;&gt;Release Resources&lt;/h2&gt;
&lt;p&gt;The source artifacts and helm chart are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://archive.apache.org/dist/flink/flink-kubernetes-operator-1.0.0/&quot;&gt;official 1.0.0 release archive&lt;/a&gt; doubles as a Helm repository that you can easily register locally:&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;helm repo add flink-kubernetes-operator-1.0.0 https://archive.apache.org/dist/flink/flink-kubernetes-operator-1.0.0/
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;helm install flink-kubernetes-operator flink-kubernetes-operator-1.0.0/flink-kubernetes-operator --set webhook.create&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;p&gt;You can also find official Kubernetes Operator Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-kubernetes-operator&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/&quot;&gt;updated documentation&lt;/a&gt; and the
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351500&quot;&gt;release notes&lt;/a&gt;.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Kubernetes%20Operator%22&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank each and every one of the contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Aitozi, Biao Geng, ConradJam, Fuyao Li, Gyula Fora, Jaganathan Asokan, James Busche, liuzhuo, Márton Balassi, Matyas Orhidi, Nicholas Jiang, Ted Chang, Thomas Weise, Xin Hao, Yang Wang, Zili Chen&lt;/p&gt;
</description>
<pubDate>Sun, 05 Jun 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/06/05/release-kubernetes-operator-1.0.0.html</link>
<guid isPermaLink="true">/news/2022/06/05/release-kubernetes-operator-1.0.0.html</guid>
</item>
<item>
<title>Improving speed and stability of checkpointing with generic log-based incremental checkpoints</title>
<description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;One of the most important characteristics of stream processing systems is end-to-end latency, i.e. the time it takes for the results of processing an input record to reach the outputs. In the case of Flink, end-to-end latency mostly depends on the checkpointing mechanism, because processing results should only become visible after the state of the stream is persisted to non-volatile storage (this is assuming exactly-once mode; in other modes, results can be published immediately).&lt;/p&gt;
&lt;p&gt;Furthermore, checkpoint duration also defines the reasonable interval with which checkpoints are made. A shorter interval provides the following advantages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Lower latency for transactional sinks: Transactional sinks commit on checkpoints, so faster checkpoints mean more frequent commits.&lt;/li&gt;
&lt;li&gt;More predictable checkpoint intervals: Currently, the duration of a checkpoint depends on the size of the artifacts that need to be persisted in the checkpoint storage.&lt;/li&gt;
&lt;li&gt;Less work on recovery: the more frequent the checkpoints, the fewer events need to be re-processed after recovery.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Following are the main factors affecting checkpoint duration in Flink:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Barrier travel time and alignment duration&lt;/li&gt;
&lt;li&gt;Time to take state snapshot and persist it onto the durable highly-available storage (such as S3)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Recent improvements such as &lt;a href=&quot;https://flink.apache.org/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html&quot;&gt;Unaligned checkpoints&lt;/a&gt; and &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment&quot;&gt; Buffer debloating &lt;/a&gt; try to address (1), especially in the presence of back-pressure. Previously, &lt;a href=&quot;https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html&quot;&gt; Incremental checkpoints &lt;/a&gt; were introduced to reduce the size of a snapshot, thereby reducing the time required to store it (2).&lt;/p&gt;
&lt;p&gt;However, there are still some cases where the checkpoint duration remains high:&lt;/p&gt;
&lt;h3 id=&quot;every-checkpoint-is-delayed-by-at-least-one-task-with-high-parallelism&quot;&gt;Every checkpoint is delayed by at least one task with high parallelism&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-30-changelog-state-backend/failing-task.png&quot; /&gt;
&lt;br /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;With the existing incremental checkpoint implementation of the RocksDB state backend, every subtask needs to periodically perform some form of compaction. That compaction results in new, relatively big files, which in turn increase the upload time (2). The probability of at least one node performing such compaction and thus slowing down the whole checkpoint grows proportionally to the number of nodes. In large deployments, almost every checkpoint becomes delayed by some node.&lt;/p&gt;
&lt;h3 id=&quot;unnecessary-delay-before-uploading-state-snapshot&quot;&gt;Unnecessary delay before uploading state snapshot&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-30-changelog-state-backend/checkpoint-timing.png&quot; /&gt;
&lt;br /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;State backends don’t start any snapshotting work until the task receives at least one checkpoint barrier, increasing the effective checkpoint duration. This is suboptimal if the upload time is comparable to the checkpoint interval; instead, a snapshot could be uploaded continuously throughout the interval.&lt;/p&gt;
&lt;p&gt;This work discusses the mechanism introduced in Flink 1.15 to address the above cases by continuously persisting state changes on non-volatile storage while performing materialization in the background. The basic idea is described in the following section, and then important implementation details are highlighted. Subsequent sections discuss benchmarking results, limitations, and future work.&lt;/p&gt;
&lt;h1 id=&quot;high-level-overview&quot;&gt;High-level Overview&lt;/h1&gt;
&lt;p&gt;The core idea is to introduce a state changelog (a log that records state changes); this changelog allows operators to persist state changes in a very fine-grained manner, as described below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Stateful operators write the state changes to the state changelog, in addition to applying them to the state tables in RocksDB or the in-memory hash table.&lt;/li&gt;
&lt;li&gt;An operator can acknowledge a checkpoint as soon as the changes in the log have reached the durable checkpoint storage.&lt;/li&gt;
&lt;li&gt;The state tables are persisted periodically as well, independent of the checkpoints. We call this procedure the materialization of the state on the durable checkpoint storage.&lt;/li&gt;
&lt;li&gt;Once the state is materialized on the checkpoint storage, the state changelog can be truncated to the point where the state is materialized.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This can be illustrated as follows:&lt;/p&gt;
&lt;center&gt;
&lt;div style=&quot;overflow-x: auto&quot;&gt;
&lt;div style=&quot;width:150%&quot;&gt;
&lt;img style=&quot;display:inline; max-width: 33%; max-height: 200px; margin-left: -1%&quot; src=&quot;/img/blog/2022-05-30-changelog-state-backend/log_checkpoints_1.png&quot; /&gt;
&lt;img style=&quot;display:inline; max-width: 33%; max-height: 200px; margin-left: -1%&quot; src=&quot;/img/blog/2022-05-30-changelog-state-backend/log_checkpoints_2.png&quot; /&gt;
&lt;img style=&quot;display:inline; max-width: 33%; max-height: 200px; margin-left: -1%&quot; src=&quot;/img/blog/2022-05-30-changelog-state-backend/log_checkpoints_3.png&quot; /&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;This approach mirrors what database systems do, adjusted to distributed checkpoints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Changes (inserts/updates/deletes) are written to the transaction log, and the transaction is considered durable once the log is synced to disk (or other durable storage).&lt;/li&gt;
&lt;li&gt;The changes are also materialized in the tables (so the database system can efficiently query the table). The tables are usually persisted asynchronously.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once all relevant parts of the changed tables have been persisted, the transaction log can be truncated, which is similar to the materialization procedure in our approach.&lt;/p&gt;
&lt;p&gt;Such a design makes a number of trade-offs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Increased use of network IO and remote storage space for changelog&lt;/li&gt;
&lt;li&gt;Increased memory usage to buffer state changes&lt;/li&gt;
&lt;li&gt;Increased time to replay state changes during the recovery process&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The last one may or may not be compensated by more frequent checkpoints: more frequent checkpoints mean less re-processing is needed after recovery.&lt;/p&gt;
&lt;h1 id=&quot;system-architecture&quot;&gt;System architecture&lt;/h1&gt;
&lt;h2 id=&quot;changelog-storage-dstl&quot;&gt;Changelog storage (DSTL)&lt;/h2&gt;
&lt;p&gt;The component that is responsible for actually storing state changes has the following requirements.&lt;/p&gt;
&lt;h3 id=&quot;durability&quot;&gt;Durability&lt;/h3&gt;
&lt;p&gt;Changelog constitutes a part of a checkpoint, and therefore the same durability guarantees as for checkpoints must be provided. However, the duration for which the changelog is stored is expected to be short (until the changes are materialized).&lt;/p&gt;
&lt;h3 id=&quot;workload&quot;&gt;Workload&lt;/h3&gt;
&lt;p&gt;The workload is write-heavy: the changelog is written continuously, and it is only read in case of failure. Once written, data cannot be modified.&lt;/p&gt;
&lt;h3 id=&quot;latency&quot;&gt;Latency&lt;/h3&gt;
&lt;p&gt;We target a checkpoint duration of 1s for 99% of checkpoints in the Flink 1.15 MVP. Therefore, an individual write request must complete within that duration or less; and because a checkpoint is only as fast as its slowest writer, at a parallelism of 100 roughly 99.99% of write requests must complete within 1s.&lt;/p&gt;
&lt;h3 id=&quot;consistency&quot;&gt;Consistency&lt;/h3&gt;
&lt;p&gt;Once a change is persisted (and acknowledged to JM), it must be available for replay to enable recovery (this can be achieved by using a single machine, quorum, or synchronous replication).&lt;/p&gt;
&lt;h3 id=&quot;concurrency&quot;&gt;Concurrency&lt;/h3&gt;
&lt;p&gt;Each task writes to its own changelog, which prevents concurrency issues across multiple tasks. However, when a task is restarted, it needs to write to the same log, which may cause concurrency issues. This is addressed by:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Using unique log segment identifiers while writing&lt;/li&gt;
&lt;li&gt;Fencing previous execution attempts on JM when handling checkpoint acknowledgments&lt;/li&gt;
&lt;li&gt;After closing the log, treating it as Flink state, which is read-only and is discarded by a single JM (leader)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To emphasize the difference in durability requirements and usage compared to other systems (durable, short-lived, append-only), the component is called &lt;strong&gt;“Durable Short-term Log” (DSTL)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;DSTL can be implemented in many ways, such as Distributed Log, Distributed File System* (DFS), or even a database. In the MVP version in Flink 1.15, we chose DFS because of the following reasons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;No additional external dependency; DFS is readily available in most environments and is already used to store checkpoints&lt;/li&gt;
&lt;li&gt;No additional stateful components to manage; using any other persistence medium would incur additional operational overhead&lt;/li&gt;
&lt;li&gt;DFS natively provides durability and consistency guarantees which need to be taken care of when implementing a new customized distributed log storage (in particular, when implementing replication)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;On the other hand, the DFS approach has the following disadvantages:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;it has higher latency than, for example, a distributed log writing to local disks&lt;/li&gt;
&lt;li&gt;its scalability is limited by the DFS (most storage providers start rate-limiting at some point)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;However, after some initial experimentation, we think the performance of popular DFS offerings can satisfy 80% of the use cases; more results for the MVP version are presented in a later section.
* “DFS” here makes no distinction between distributed file systems and object stores.&lt;/p&gt;
&lt;p&gt;Using RocksDB as an example, this approach can be illustrated at the high level as follows. State updates are replicated to both RocksDB and DSTL by the Changelog State Backend.
DSTL continuously writes state changes to DFS and flushes them periodically and on checkpoint. That way, checkpoint time only depends on the time to flush a small amount of data.
RocksDB, on the other hand, is still used for querying the state. Furthermore, its SSTables are periodically uploaded to DFS, a procedure called “materialization”. That upload is independent of, and much less frequent than, checkpointing; the default interval is 10 minutes.&lt;/p&gt;
&lt;center&gt;
&lt;img style=&quot;max-width: 80%&quot; src=&quot;/img/blog/2022-05-30-changelog-state-backend/changelog-simple.png&quot; /&gt;
&lt;br /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;There are a few more issues worth highlighting here:&lt;/p&gt;
&lt;h2 id=&quot;state-cleanup&quot;&gt;State cleanup&lt;/h2&gt;
&lt;p&gt;State changelog needs to be truncated once the corresponding changes are materialized. It becomes more complicated with re-scaling and sharing the underlying files across multiple operators. However, Flink already provides a mechanism called &lt;code&gt;SharedStateRegistry&lt;/code&gt; similar to file system reference counting. Log fragments can be viewed as shared state objects, and therefore can be tracked by this &lt;code&gt;SharedStateRegistry&lt;/code&gt; (please see &lt;a href=&quot;https://www.ververica.com/blog/managing-large-state-apache-flink-incremental-checkpointing-overview&quot;&gt; this &lt;/a&gt; article for more information on how &lt;code&gt;SharedStateRegistry&lt;/code&gt; was used previously).&lt;/p&gt;
&lt;h2 id=&quot;dfs-specific-issues&quot;&gt;DFS-specific issues&lt;/h2&gt;
&lt;h3 id=&quot;small-files-problem&quot;&gt;Small files problem&lt;/h3&gt;
&lt;p&gt;One issue with using a DFS is that many more, and likely smaller, files are created for each checkpoint, and with the increased checkpoint frequency there are more checkpoints overall.
To mitigate this, state changes belonging to the same job on a TaskManager are grouped into a single file.&lt;/p&gt;
&lt;h3 id=&quot;high-tail-latency&quot;&gt;High tail latency&lt;/h3&gt;
&lt;p&gt;DFS are known for high tail latencies, although this has been improving in recent years.
To address the high-tail-latency problem, write requests are retried when they fail to complete within a timeout, which is 1 second by default (but can be configured manually).&lt;/p&gt;
&lt;h1 id=&quot;benchmark-results&quot;&gt;Benchmark results&lt;/h1&gt;
&lt;p&gt;The improvement in checkpoint stability and speed after enabling the Changelog depends heavily on the factors below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The difference between the changelog diff size and the full state size (or incremental state size, if comparing changelog to incremental checkpoints).&lt;/li&gt;
&lt;li&gt;The ability to upload the updates continuously during the checkpoint (e.g. an operator might maintain state in memory and only update Flink state objects on checkpoint - in this case, changelog wouldn’t help much).&lt;/li&gt;
&lt;li&gt;The ability to group updates from multiple tasks (multiple tasks must be deployed on a single TM). Grouping the updates leads to fewer files being created thereby reducing the load on DFS, which improves the stability.&lt;/li&gt;
&lt;li&gt;The ability of the underlying backend to accumulate updates to the same key before flushing (This makes state change log potentially contain more updates compared to just the final value, leading to a larger incremental changelog state size)&lt;/li&gt;
&lt;li&gt;The speed of the underlying durable storage (the faster it is, the less significant the improvement)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following setup was used in the experiment:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Parallelism: 50&lt;/li&gt;
&lt;li&gt;Running time: 21h&lt;/li&gt;
&lt;li&gt;State backend: RocksDB (incremental checkpoint enabled)&lt;/li&gt;
&lt;li&gt;Storage: S3 (Presto plugin)&lt;/li&gt;
&lt;li&gt;Machine type: AWS m5.xlarge (4 slots per TM)&lt;/li&gt;
&lt;li&gt;Checkpoint interval: 10ms&lt;/li&gt;
&lt;li&gt;State Table materialization interval: 3m&lt;/li&gt;
&lt;li&gt;Input rate: 50K events per second&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;valuestate-workload&quot;&gt;ValueState workload&lt;/h2&gt;
&lt;p&gt;A workload that mostly updates new keys each time benefits the most.&lt;/p&gt;
&lt;table border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;&amp;nbsp;&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Changelog Disabled&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Changelog Enabled&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Records processed&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;3,808,629,616&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;3,810,508,130&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Checkpoints made&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;10,023&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;108,649&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Checkpoint duration, 90%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;6s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;664ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Checkpoint duration, 99.9%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;10s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;1s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Full checkpoint size *, 99%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;19.6GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;25.6GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;Recovery time (local recovery disabled)&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;20-21s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;35-65s (depending on the checkpoint)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As can be seen from the above table, the checkpoint duration at the 99.9th percentile is reduced by a factor of 10 (from 10s to 1s), while space usage increases by about 30% and recovery time increases by 66%-225%.&lt;/p&gt;
&lt;p&gt;More details about the checkpoints (Changelog Enabled / Changelog Disabled):&lt;/p&gt;
&lt;table border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Percentile&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;End to End Duration&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Checkpointed Data Size *&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Full Checkpoint Data Size *&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;50%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;311ms / 5s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;14.8MB / 3.05GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;24.2GB / 18.5GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;90%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;664ms / 6s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;23.5MB / 4.52GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;25.2GB / 19.3GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;99%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;1s / 7s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;36.6MB / 5.19GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;25.6GB / 19.6GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;99.9%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;1s / 10s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;52.8MB / 6.49GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;25.7GB / 19.8GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;* Checkpointed Data Size is the size of the data persisted after receiving the necessary number of checkpoint barriers, during the so-called synchronous and asynchronous checkpoint phases. Most of the data is persisted preemptively (i.e. after the previous checkpoint and before the current one), which is why this size is much lower when the Changelog is enabled.
&lt;br /&gt;
* Full checkpoint size is the total size of all the files comprising the checkpoint, including any files reused from previous checkpoints. Compared to a normal checkpoint, a checkpoint with a changelog is less compact, keeping all the historical values since the last materialization, and therefore consumes much more space.&lt;/p&gt;
&lt;h2 id=&quot;window-workload&quot;&gt;Window workload&lt;/h2&gt;
&lt;p&gt;This workload used a processing-time sliding window. As can be seen below, checkpoints are still faster, with durations about 3 times shorter; but storage amplification is much higher in this case (about 45 times more space consumed):&lt;/p&gt;
&lt;p&gt;Checkpoint Statistics for Window Workload with Changelog Enabled / Changelog Disabled&lt;/p&gt;
&lt;table border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Percentile&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;End to End Duration&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Checkpointed Data Size&lt;/th&gt;
&lt;th style=&quot;padding: 5px&quot;&gt;Full Checkpoint Data Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;50%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;791ms / 1s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;269MB / 1.18GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;85.5GB / 1.99GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;90%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;1s / 1s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;292MB / 1.36GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;97.4GB / 2.16GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;99%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;1s / 6s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;310MB / 1.67GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;103GB / 2.26GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;99.9%&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;2s / 6s&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;324MB / 1.87GB&lt;/td&gt;
&lt;td style=&quot;padding: 5px&quot;&gt;104GB / 2.30GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The increase in space consumption (Full Checkpoint Data Size) can be attributed to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Assigning each element to multiple sliding windows (and persisting the state changelog for each). While RocksDB and Heap have the same issue, with the changelog the impact is multiplied even further.&lt;/li&gt;
&lt;li&gt;As mentioned above, if the underlying state backend (i.e. RocksDB) is able to accumulate multiple state updates for the same key without flushing, the snapshot will be smaller than the changelog. In this particular case of a sliding window, the updates to its contents are eventually followed by purging the window. If those updates and the purge happen during the same checkpoint, the window is quite likely not included in the snapshot. This also implies that the faster the window is purged, the smaller the snapshot.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&quot;conclusion-and-future-work&quot;&gt;Conclusion and future work&lt;/h1&gt;
&lt;p&gt;Generic log-based incremental checkpointing was released as an MVP in Flink 1.15. This version demonstrates that solutions based on modern DFS can provide good enough latency. Furthermore, checkpointing time and stability are improved significantly by using the Changelog. However, some trade-offs must be weighed before using it (in particular, space amplification).
&lt;br /&gt;
In the next releases, we plan to enable more use cases for the Changelog, e.g., by reducing recovery time via local recovery and improving compatibility.&lt;/p&gt;
&lt;p&gt;Another direction is further reducing latency. This can be achieved by using faster storage, such as Apache BookKeeper or Apache Kafka.&lt;/p&gt;
&lt;p&gt;Besides that, we are investigating other applications of Changelog, such as WAL for sinks and queryable states.&lt;/p&gt;
&lt;p&gt;We encourage you to try out this feature and assess the pros and cons of using it in your setup. The simplest way to do this is to add the following to your flink-conf.yaml:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;state.backend.changelog.enabled: true
state.backend.changelog.storage: filesystem
dstl.dfs.base-path: &amp;lt;location similar to state.checkpoints.dir&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
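&lt;p&gt;If you would rather enable it per job in code, the following sketch (assuming Flink 1.15 and the DataStream API) shows the programmatic switch; the changelog storage settings still come from the configuration above:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// Sketch: enabling the changelog state backend programmatically.
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Wraps the configured state backend (e.g. RocksDB) with the changelog.
env.enableChangelogStateBackend(true);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;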
&lt;p&gt;Please see the full documentation &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#enabling-changelog&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;acknowledgments&quot;&gt;Acknowledgments&lt;/h1&gt;
&lt;p&gt;We thank Stephan Ewen for the initial idea of the project, and many other engineers including Piotr Nowojski, Yu Li and Yun Tang for design discussions and code reviews.&lt;/p&gt;
&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints&quot;&gt; FLIP-158 &lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#enabling-changelog&quot;&gt; generic log-based incremental checkpoints documentation &lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://flink.apache.org/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html&quot;&gt; Unaligned checkpoints &lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment&quot;&gt; Buffer debloating &lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html&quot;&gt; Incremental checkpoints &lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Mon, 30 May 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/30/changelog-state-backend.html</link>
<guid isPermaLink="true">/2022/05/30/changelog-state-backend.html</guid>
</item>
<item>
<title>Getting into Low-Latency Gears with Apache Flink - Part Two</title>
<description>&lt;p&gt;This series of blog posts present a collection of low-latency techniques in Flink. In &lt;a href=&quot;https://flink.apache.org/2022/05/18/latency-part1.html&quot;&gt;part one&lt;/a&gt;, we discussed the types of latency in Flink and the way we measure end-to-end latency and presented a few techniques that optimize latency directly. In this post, we will continue with a few more direct latency optimization techniques. Just like in part one, for each optimization technique, we will clarify what it is, when to use it, and what to keep in mind when using it. We will also show experimental results to support our statements.&lt;/p&gt;
&lt;h1 id=&quot;direct-latency-optimization&quot;&gt;Direct latency optimization&lt;/h1&gt;
&lt;h2 id=&quot;spread-work-across-time&quot;&gt;Spread work across time&lt;/h2&gt;
&lt;p&gt;When you use timers or do windowing in a job, timer or window firing may create load spikes due to heavy computation or state access. If the allocated resources cannot cope with these load spikes, timer or window firing will take a long time to finish. This often results in high latency.&lt;/p&gt;
&lt;p&gt;To avoid this situation, you should change your code to spread out the workload as much as possible such that you do not accumulate too much work to be done at a single point in time. In the case of windowing, you should consider using incremental window aggregation with &lt;code&gt;AggregateFunction&lt;/code&gt; or &lt;code&gt;ReduceFunction&lt;/code&gt;. In the case of timers in a &lt;code&gt;ProcessFunction&lt;/code&gt;, the operations executed in the &lt;code&gt;onTimer()&lt;/code&gt; method should be optimized such that the time spent there is reduced to a minimum. If you see latency spikes resulting from a global aggregation or if you need to collect events in a well-defined order to perform certain computations, you can consider adding a pre-aggregation phase in front of the current operator.&lt;/p&gt;
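&lt;p&gt;As a minimal sketch (not the &lt;code&gt;WindowingJob&lt;/code&gt; linked below; the input type, key, and window are assumptions for illustration), incremental window aggregation with &lt;code&gt;AggregateFunction&lt;/code&gt; keeps only a small accumulator per window instead of buffering all events until the window fires:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Hypothetical input: (sensorId, measurement) pairs.
DataStream&amp;lt;Tuple2&amp;lt;String, Double&amp;gt;&amp;gt; events =
        env.fromElements(Tuple2.of(&quot;sensor-1&quot;, 1.0), Tuple2.of(&quot;sensor-1&quot;, 2.5));

DataStream&amp;lt;Double&amp;gt; sums = events
        .keyBy(value -&amp;gt; value.f0)
        .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
        // The accumulator keeps only a running sum instead of buffering all events,
        // so window firing merely emits the pre-aggregated value.
        .aggregate(new AggregateFunction&amp;lt;Tuple2&amp;lt;String, Double&amp;gt;, Double, Double&amp;gt;() {
            @Override public Double createAccumulator() { return 0.0; }
            @Override public Double add(Tuple2&amp;lt;String, Double&amp;gt; value, Double acc) { return acc + value.f1; }
            @Override public Double getResult(Double acc) { return acc; }
            @Override public Double merge(Double a, Double b) { return a + b; }
        });
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;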
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if you are using timer-based processing (e.g., timers, windowing) and an efficient aggregation can be applied whenever an event arrives instead of waiting for timers to fire.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that when you spread work across time, you should consider not only computation but also state access, especially when using RocksDB. Spreading one type of work while accumulating the other may result in higher latencies.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt; already does incremental window aggregation with &lt;code&gt;AggregateFunction&lt;/code&gt;. To show the latency improvement of this technique, we compared &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt; with a variant that does not do incremental aggregation, &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJobNoAggregation.java&quot;&gt;WindowingJobNoAggregation&lt;/a&gt;, both running with the commonly used &lt;code&gt;rocksdb&lt;/code&gt; state backend. As the results below show, without incremental window aggregation, the latency would increase from 720 ms to 1.7 seconds.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-05-23-latency-part2/spread-work.png&quot; /&gt;
&lt;/center&gt;
&lt;h2 id=&quot;access-external-systems-efficiently&quot;&gt;Access external systems efficiently&lt;/h2&gt;
&lt;h3 id=&quot;using-async-io&quot;&gt;Using async I/O&lt;/h3&gt;
&lt;p&gt;When interacting with external systems (e.g., RDBMS, object stores, web services) in a Flink job for data enrichment, the latency in getting responses from external systems often dominates the overall latency of the job. With Flink’s &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html&quot;&gt;Async I/O API&lt;/a&gt; (e.g., &lt;code&gt;AsyncDataStream.unorderedWait()&lt;/code&gt; or &lt;code&gt;AsyncDataStream.orderedWait()&lt;/code&gt;), a single parallel function instance can handle many requests concurrently and receive responses asynchronously. This reduces latencies because the waiting time for responses is amortized over multiple requests.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-05-23-latency-part2/async-io.png&quot; /&gt;
&lt;/center&gt;
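&lt;p&gt;As a rough sketch (the input stream and the &lt;code&gt;externalClient&lt;/code&gt; returning a &lt;code&gt;CompletableFuture&lt;/code&gt; are hypothetical, not taken from the jobs linked below), an asynchronous lookup can look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

DataStream&amp;lt;String&amp;gt; enriched = AsyncDataStream.unorderedWait(
        events, // hypothetical stream of lookup keys
        new RichAsyncFunction&amp;lt;String, String&amp;gt;() {
            @Override
            public void asyncInvoke(String key, ResultFuture&amp;lt;String&amp;gt; resultFuture) {
                // Hypothetical non-blocking client returning a CompletableFuture.
                CompletableFuture&amp;lt;String&amp;gt; response = externalClient.lookupAsync(key);
                // Complete the Flink result when the response arrives; many such
                // requests can be in flight concurrently per parallel instance.
                response.thenAccept(value -&amp;gt;
                        resultFuture.complete(Collections.singleton(key + &quot;:&quot; + value)));
            }
        },
        1, TimeUnit.SECONDS, // per-request timeout
        100);                // max concurrent in-flight requests
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;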
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if the client of your external system supports asynchronous requests. If it does not, you can use a thread pool of multiple clients to handle synchronous requests in parallel. You can also use a cache to speed up lookups if the data in the external system is not changing frequently. A cache, however, comes at the cost of working with outdated data.&lt;/p&gt;
&lt;p&gt;In this experiment, we simulated an external system that returns responses within 1 to 6 ms randomly, and we keep the external system response in a cache in our job for 1s. The results below show the comparison between two jobs: &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/EnrichingJobSync.java&quot;&gt;EnrichingJobSync&lt;/a&gt; and &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/EnrichingJobAsync.java&quot;&gt;EnrichingJobAsync&lt;/a&gt;. By using async I/O, the latency was reduced from around 600 ms to 100 ms.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-05-23-latency-part2/enriching-with-async-io.png&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;using-a-streaming-join&quot;&gt;Using a streaming join&lt;/h3&gt;
&lt;p&gt;If you are enriching a stream of events with an external database where the data changes frequently, and the changes can be converted to a data stream, then you have another option to use &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/operators/overview/#datastreamdatastream-rarr-connectedstream&quot;&gt;connected streams&lt;/a&gt; and a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/operators/process_function/#low-level-joins&quot;&gt;CoProcessFunction&lt;/a&gt; to do a streaming join. This can usually achieve lower latencies than the per-record lookups approach. An alternative approach is to pre-load external data into the job but a full streaming join can usually achieve better accuracy because it does not work with stale data and takes event-time into account. Please refer to &lt;a href=&quot;https://www.youtube.com/watch?v=cJS18iKLUIY&quot;&gt;this webinar&lt;/a&gt; for more details on streaming joins.&lt;/p&gt;
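&lt;p&gt;The sketch below (with hypothetical &lt;code&gt;Tuple2&lt;/code&gt; streams; not code from the example repository) keeps the latest reference value in keyed state and enriches each incoming event from it:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical inputs: 'events' are (key, payload) pairs, 'referenceChanges' is
// a changelog of (key, referenceValue) pairs captured from the database.
DataStream&amp;lt;Tuple2&amp;lt;String, String&amp;gt;&amp;gt; joined = events
        .connect(referenceChanges)
        .keyBy(event -&amp;gt; event.f0, change -&amp;gt; change.f0)
        .process(new CoProcessFunction&amp;lt;Tuple2&amp;lt;String, String&amp;gt;, Tuple2&amp;lt;String, String&amp;gt;, Tuple2&amp;lt;String, String&amp;gt;&amp;gt;() {
            private transient ValueState&amp;lt;String&amp;gt; reference;

            @Override
            public void open(Configuration parameters) {
                reference = getRuntimeContext().getState(
                        new ValueStateDescriptor&amp;lt;&amp;gt;(&quot;reference&quot;, String.class));
            }

            @Override
            public void processElement1(Tuple2&amp;lt;String, String&amp;gt; event, Context ctx,
                                        Collector&amp;lt;Tuple2&amp;lt;String, String&amp;gt;&amp;gt; out) throws Exception {
                // Enrich each event with the latest reference value held in keyed state.
                out.collect(Tuple2.of(event.f1, reference.value()));
            }

            @Override
            public void processElement2(Tuple2&amp;lt;String, String&amp;gt; change, Context ctx,
                                        Collector&amp;lt;Tuple2&amp;lt;String, String&amp;gt;&amp;gt; out) throws Exception {
                // Keep the keyed state up to date as the reference data changes.
                reference.update(change.f1);
            }
        });
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;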
&lt;h2 id=&quot;tune-checkpointing&quot;&gt;Tune checkpointing&lt;/h2&gt;
&lt;p&gt;There are two aspects of checkpointing that impact latency: checkpoint alignment time, and checkpoint frequency and duration in the case of end-to-end exactly-once with transactional sinks.&lt;/p&gt;
&lt;h3 id=&quot;reduce-checkpoint-alignment-time&quot;&gt;Reduce checkpoint alignment time&lt;/h3&gt;
&lt;p&gt;During checkpoint alignment, operators block the event processing from the channels where checkpoint barriers have been received in order to wait for the checkpoint barriers from other channels. Longer alignment time will result in higher latencies.&lt;/p&gt;
&lt;p&gt;There are different ways to reduce checkpoint alignment time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Improve the throughput. Any improvement in throughput helps processing the buffers sitting in front of a checkpoint barrier faster.&lt;/li&gt;
&lt;li&gt;Scale up or scale out. This is the same as the technique of “allocate enough resources” described in &lt;a href=&quot;https://flink.apache.org/2022/05/18/latency-part1.html&quot;&gt;part one&lt;/a&gt;. Increased processing power helps reduce backpressure and checkpoint alignment time.&lt;/li&gt;
&lt;li&gt;Use unaligned checkpointing. In this case, checkpoint barriers do not wait until the data is processed but skip ahead and are passed on to the next operator immediately. Skipped-over data, however, has to be checkpointed as well in order to be consistent. Flink can also be configured to automatically switch from aligned to unaligned checkpointing after a certain alignment time has passed (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;Buffer less data. You can reduce the buffered data size by tuning the number of exclusive and floating buffers. With less data buffered in the network stack, the checkpoint barrier can arrive at operators quicker. However, reducing buffers has an adverse effect on throughput and is just mentioned here for completeness. Flink 1.14 improves buffer handling by introducing a feature called &lt;em&gt;buffer debloating&lt;/em&gt;. Buffer debloating can dynamically adjust buffer size based on the current throughput such that the buffered data can be worked off by the receiver within a configured fixed duration, e.g., 1 second. This reduces the buffered data during the alignment phase and can be used in combination with unaligned checkpointing to reduce the checkpoint alignment time.&lt;/li&gt;
&lt;/ul&gt;
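&lt;p&gt;A minimal sketch of enabling unaligned checkpointing programmatically (the automatic switch after an alignment timeout mentioned in the list is a separate setting; see the checkpointing documentation):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10_000); // checkpoint every 10 seconds
// Let checkpoint barriers overtake buffered in-flight records instead of waiting
// for alignment; the skipped-over data becomes part of the checkpoint.
env.getCheckpointConfig().enableUnalignedCheckpoints();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;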
&lt;h3 id=&quot;tune-checkpoint-duration-and-frequency&quot;&gt;Tune checkpoint duration and frequency&lt;/h3&gt;
&lt;p&gt;If you are working with transactional sinks with exactly-once semantics, the output events are committed to external systems (e.g., Kafka) &lt;em&gt;only&lt;/em&gt; upon checkpoint completion. In this case, tuning other options may not help if you do not tune checkpointing. Instead, you need to have fast and more frequent checkpointing.&lt;/p&gt;
&lt;p&gt;To have fast checkpointing, you need to reduce the checkpoint duration. To achieve that, you can, for example, turn on rocksdb incremental checkpointing, reduce the state stored in Flink, clean up state that is not needed anymore, do not put cache into managed state, store only necessary fields in state, optimize the serialization format, etc. You can also scale up or scale out, same as the technique of “allocate enough resources” described in &lt;a href=&quot;https://flink.apache.org/2022/05/18/latency-part1.html&quot;&gt;part one&lt;/a&gt;. This has two effects: it reduces backpressure because of the increased processing power, and with the increased parallelism, writing checkpoints to remote storage can finish quicker. You can also tune checkpoint alignment time, as described in the previous section, to reduce the checkpoint duration. If you use Flink 1.15 or later, you can enable &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/state_backends/#enabling-changelog&quot;&gt;the changelog feature&lt;/a&gt;. It may help to reduce the async duration of checkpointing.&lt;/p&gt;
&lt;p&gt;To have more frequent checkpointing, you can reduce the checkpoint interval, the minimum pause between checkpoints, or use concurrent checkpoints. But keep in mind that concurrent checkpoints introduce more runtime overhead.&lt;/p&gt;
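&lt;p&gt;A minimal sketch of these knobs on the &lt;code&gt;CheckpointConfig&lt;/code&gt; (the concrete values are placeholders):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(1_000); // trigger a checkpoint every second

CheckpointConfig checkpointConfig = env.getCheckpointConfig();
checkpointConfig.setMinPauseBetweenCheckpoints(500); // minimum pause between two checkpoints
checkpointConfig.setMaxConcurrentCheckpoints(2);     // allow two checkpoints in flight
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;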
&lt;p&gt;Another option is to not use exactly-once sinks but to switch to at-least-once sinks. As a result, you may get (correct but) duplicated output events, so the downstream application that consumes the output of your jobs may additionally need to perform deduplication.&lt;/p&gt;
&lt;h2 id=&quot;process-events-on-arrival&quot;&gt;Process events on arrival&lt;/h2&gt;
&lt;p&gt;In a stream processing pipeline, there often exists a delay between the time an event is received and the time the event can be processed (e.g., after having seen all events up to a certain point in event time). The amount of delay may be significant for those pipelines with very low latency requirements. For example, a fraud detection job usually requires a sub-second level of latency. In this case, you could process events with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/operators/process_function/&quot;&gt;ProcessFunction&lt;/a&gt; immediately when they arrive and deal with out-of-order events by yourself (in case of event-time processing) depending on your business requirements, e.g., drop or add them to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/side_output/&quot;&gt;side output&lt;/a&gt; for special processing. Please refer to &lt;a href=&quot;https://flink.apache.org/news/2020/07/30/demo-fraud-detection-3.html&quot;&gt;this Flink blog post&lt;/a&gt; for a great example of a low latency fraud detection job with implementation details.&lt;/p&gt;
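&lt;p&gt;A minimal sketch of this pattern (the input stream, key, and per-event logic are assumptions): each event is processed as soon as it arrives, and out-of-order events are routed to a side output:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

final OutputTag&amp;lt;String&amp;gt; outOfOrder = new OutputTag&amp;lt;String&amp;gt;(&quot;out-of-order&quot;){};

SingleOutputStreamOperator&amp;lt;String&amp;gt; results = events // hypothetical stream of strings
        .keyBy(value -&amp;gt; value)
        .process(new KeyedProcessFunction&amp;lt;String, String, String&amp;gt;() {
            @Override
            public void processElement(String value, Context ctx, Collector&amp;lt;String&amp;gt; out) {
                Long timestamp = ctx.timestamp();
                if (timestamp != null &amp;amp;&amp;amp; timestamp &amp;lt; ctx.timerService().currentWatermark()) {
                    // Out-of-order event: send it to a side output for special handling.
                    ctx.output(outOfOrder, value);
                } else {
                    // Process immediately on arrival instead of waiting for timers to fire.
                    out.collect(value.toUpperCase()); // hypothetical per-event logic
                }
            }
        });

DataStream&amp;lt;String&amp;gt; outOfOrderEvents = results.getSideOutput(outOfOrder);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;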
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if your job has a sub-second level latency requirement (e.g., hundreds of milliseconds) and the reduced watermarking interval still contributes a significant part of the latency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that this may change your job logic considerably since you have to deal with out-of-order events by yourself.&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;Following part one, this blog post presented a few more latency optimization techniques with a focus on direct latency optimization. In the next part, we will focus on techniques that optimize latency by increasing throughput. Stay tuned!&lt;/p&gt;
</description>
<pubDate>Mon, 23 May 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/23/latency-part2.html</link>
<guid isPermaLink="true">/2022/05/23/latency-part2.html</guid>
</item>
<item>
<title>Getting into Low-Latency Gears with Apache Flink - Part One</title>
<description>&lt;p&gt;Apache Flink is a stream processing framework well known for its low latency processing capabilities. It is generic and suitable for a wide range of use cases. As a Flink application developer or a cluster administrator, you need to find the right gear that is best for your application. In other words, you don’t want to be driving a luxury sports car while only using the first gear.&lt;/p&gt;
&lt;p&gt;In this multi-part series, we will present a collection of low-latency techniques in Flink. Part one starts with types of latency in Flink and the way we measure the end-to-end latency, followed by a few techniques that optimize latency directly. Part two continues with a few more direct latency optimization techniques. Further parts of this series will cover techniques that improve latencies by optimizing throughput. For each optimization technique, we will clarify what it is, when to use it, and what to keep in mind when using it. We will also show experimental results to support our statements.&lt;/p&gt;
&lt;p&gt;This series of blog posts is a write-up of &lt;a href=&quot;https://www.youtube.com/watch?v=4dwwokhQHwo&quot;&gt;our talk in Flink Forward Global 2021&lt;/a&gt; and includes additional latency optimization techniques and details.&lt;/p&gt;
&lt;h1 id=&quot;latency&quot;&gt;Latency&lt;/h1&gt;
&lt;h2 id=&quot;types-of-latency&quot;&gt;Types of latency&lt;/h2&gt;
&lt;p&gt;Latency can refer to different things. &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/metrics/#end-to-end-latency-tracking&quot;&gt;LatencyMarkers&lt;/a&gt; in Flink measure the time it takes for the markers to travel from each source operator to each downstream operator. As LatencyMarkers bypass user functions in operators, the measured latencies do not reflect the entire end-to-end latency but only a part of it. Flink also supports tracking the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/metrics/#state-access-latency-tracking&quot;&gt;state access latency&lt;/a&gt;, which measures the response latency when state is read/written. One can also manually measure the time taken by some operators, or get this data with profilers. However, what users usually care about is the end-to-end latency, including the time spent in user-defined functions, in the stream processing framework, and when state is accessed. End-to-end latency is what we will focus on.&lt;/p&gt;
&lt;h2 id=&quot;how-we-measure-end-to-end-latency&quot;&gt;How we measure end-to-end latency&lt;/h2&gt;
&lt;p&gt;There are two scenarios to consider. In the first scenario, a pipeline does a simple transformation, and there are no timers or any other complex event time logic. For example, a pipeline that produces one output event for each input event. In this case, we measure the processing delay as the latency, that is, &lt;code&gt;t2 - t1&lt;/code&gt; as shown in the diagram.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:40%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/scenario1-simple.png&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;The second scenario is where complex event time logic is involved (e.g., timers, aggregation, windowing). In this case, we measure the event time lag as the latency, that is, &lt;code&gt;current processing time - current watermark&lt;/code&gt;. The event time lag gives us the difference between the expected output time and the actual output time.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:40%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/scenario2-eventtime.png&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;In both scenarios, we capture a histogram and show the 99th percentile of the end-to-end latency. The latency we measure here includes the time an event stays in the source message queue (e.g., Kafka). The reason for this is that it covers the scenarios where a source operator in a pipeline is backpressured by other operators. The more the source operator is backpressured, the longer the messages stay in the message queue. So, including the time that events stay in the message queue in the measured latency gives a good indication of how slow or fast a pipeline is.&lt;/p&gt;
&lt;h1 id=&quot;low-latency-optimization-techniques&quot;&gt;Low-latency optimization techniques&lt;/h1&gt;
&lt;p&gt;We will discuss low-latency techniques in two groups: techniques that optimize latency directly and techniques that improve latency by optimizing throughput.
Each of these techniques can be as simple as a configuration change or may require code changes, or both. We have created a git repository containing the example jobs used in our experiments to support our statements. Keep in mind that all the experimental results we will show are specific to those jobs and the environment they run in. Your job may show different results depending on where the latency bottleneck is.&lt;/p&gt;
&lt;h2 id=&quot;direct-latency-optimization&quot;&gt;Direct latency optimization&lt;/h2&gt;
&lt;h3 id=&quot;allocate-enough-resources&quot;&gt;Allocate enough resources&lt;/h3&gt;
&lt;p&gt;An obvious but often forgotten low-latency technique is to allocate enough resources to your job. Flink has some metrics (e.g., &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/monitoring/back_pressure/#task-performance-metrics&quot;&gt;idleTimeMsPerSecond&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/monitoring/back_pressure/#task-performance-metrics&quot;&gt;busyTimeMsPerSecond&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/monitoring/back_pressure/#task-performance-metrics&quot;&gt;backPressureTimeMsPerSecond&lt;/a&gt;) to indicate whether an operator/subtask is busy or not. This can also be spotted easily in the job graph on Flink’s Web UI if you are using &lt;a href=&quot;https://flink.apache.org/2021/07/07/backpressure.html&quot;&gt;Flink 1.13 or later&lt;/a&gt;. If some operators in your job are 100% busy, they will backpressure upstream operators and the backpressure may propagate up to the source operators. Backpressure slows down the pipeline and results in high latency. If you scale up your job by adding more CPU/memory resources or scale out by increasing the parallelism, your job will be able to process events faster or process more events in parallel which leads to reduced latencies. We recommend having an average load below 70% under normal circumstances to accommodate load spikes that come from input data, timers, windowing, or other sources. You should adjust the threshold based on your job resource usage patterns and your latency requirements.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if your job or part of it is running at its total CPU/memory capacity and you have more resources that can be allocated to the job. In the case of scaling out with high parallelism, your streaming job must be able to make use of the additional resources. For example, the job should not have fixed parallelisms in the code, the job should not be bottlenecked on the source streams, and the input streams are partitionable by keys such that they can be processed in parallel and have no severe data skew, etc. In the case of scaling up by allocating more CPU cores, your streaming job must not be bottlenecked on a single thread or any other resources.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that allocating more resources may result in increased financial costs, especially when you are running jobs in the cloud.&lt;/p&gt;
&lt;p&gt;Below are the experimental results of &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt;. As you can see from the graph on the left, when the parallelism was 2, the two subtasks were often 100% busy. After we increased the parallelism to 3, the three subtasks were around 75% busy. As a result, the 99th percentile latency was reduced from around 3 seconds to 650 milliseconds.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:90%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/increase-parallelism.png&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;use-applicable-state-backends&quot;&gt;Use applicable state backends&lt;/h3&gt;
&lt;p&gt;When using the &lt;code&gt;filesystem&lt;/code&gt; (Flink 1.12 or earlier) or &lt;code&gt;hashmap&lt;/code&gt; (Flink 1.13 or later) state backend, the state objects are stored in memory and can be accessed directly. In contrast, when using the &lt;code&gt;rocksdb&lt;/code&gt; state backend, every state access has to go through a (de-)serialization process which may additionally involve disk accesses. So using the &lt;code&gt;filesystem/hashmap&lt;/code&gt; state backend can help reduce latency.&lt;/p&gt;
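&lt;p&gt;As a small sketch, the state backend can also be selected per job in code (Flink 1.13 or later):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Heap-based backend: state objects stay on the JVM heap and are accessed directly.
env.setStateBackend(new HashMapStateBackend());
// For state that does not fit into memory, switch back to RocksDB:
// env.setStateBackend(new org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend());
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;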
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if your state size is very small compared to the memory you can allocate to your job and your state size will not grow beyond your memory capacity. You can set the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/memory/mem_setup_tm/#managed-memory&quot;&gt;managed memory&lt;/a&gt; size to 0 if not needed. Since Flink 1.13, you can always start with the &lt;code&gt;hashmap&lt;/code&gt; state backend and seamlessly switch to the &lt;code&gt;rocksdb&lt;/code&gt; state backend via &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Binary+format+for+Keyed+State&quot;&gt;savepoints&lt;/a&gt; when the state increases to the size that is close to your memory capacity. Note that you should closely monitor the memory usage and perform the switch &lt;strong&gt;before&lt;/strong&gt; an out-of-memory happens. Please refer to &lt;a href=&quot;https://flink.apache.org/2021/01/18/rocksdb.html&quot;&gt;this Flink blog post&lt;/a&gt; for best practices when using the &lt;code&gt;rocksdb&lt;/code&gt; state backend.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that heap-based state backends use more memory compared with RocksDB due to their copy-on-write data structure and Java’s on-heap object representation. Heap-based state backends can be affected by the garbage collector which makes them less predictable and may lead to high tail latencies. Also, as of now, there is no support for incremental checkpointing (this is being developed in &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-151%3A+Incremental+snapshots+for+heap-based+state+backend&quot;&gt;FLIP-151&lt;/a&gt;). You should measure the difference before you make the switch.&lt;/p&gt;
&lt;p&gt;Our experiments with the previously mentioned &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt; after switching the state backend from &lt;code&gt;rocksdb&lt;/code&gt; to &lt;code&gt;hashmap&lt;/code&gt; show a further reduction of the latency down to 500ms. Depending on your job’s state access pattern, you may see larger or smaller improvements. The graph on the right shows the garbage collection’s impact on the latency.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:90%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/choose-state-backend.png&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;emit-watermarks-quickly&quot;&gt;Emit watermarks quickly&lt;/h3&gt;
&lt;p&gt;When using a periodic &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/&quot;&gt;watermark generator&lt;/a&gt;, Flink generates a watermark every 200 ms. This means that, by default, each parallel watermark generator does not produce watermark updates until 200 ms have passed. While this may be sufficient for many cases, if you are aiming for sub-second latencies, you could try reducing the interval even further, for example, to 100 ms.&lt;/p&gt;
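&lt;p&gt;A minimal sketch of lowering the interval in code, equivalent to setting &lt;code&gt;pipeline.auto-watermark-interval&lt;/code&gt; in the configuration:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Emit watermarks every 100 ms instead of the default 200 ms.
env.getConfig().setAutoWatermarkInterval(100);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;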
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if you use event time and a periodic watermark generator, and you are aiming for sub-second latencies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that watermark generation that is too frequent may also degrade performance because more watermarks must be processed by the framework. Moreover, even though watermarks are only created every 200 milliseconds, watermarks may arrive at much higher frequencies further downstream in your job because tasks may receive watermarks from multiple parallel watermark generators.&lt;/p&gt;
&lt;p&gt;We re-ran the previous &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt; experiment with the reduced watermark interval &lt;code&gt;pipeline.auto-watermark-interval: 100ms&lt;/code&gt; and reduced the latency further to 430ms.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/watermark-interval.png&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;flush-network-buffers-early&quot;&gt;Flush network buffers early&lt;/h3&gt;
&lt;p&gt;Flink uses buffers when sending data from one task to another over the network. Buffers are flushed and sent out when they are filled up or when the default timeout of 100ms has passed. Again, if you are aiming for sub-second latencies, you can lower the timeout to reduce latencies.&lt;/p&gt;
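&lt;p&gt;A minimal sketch of lowering the flush timeout in code, the same knob as &lt;code&gt;execution.buffer-timeout&lt;/code&gt; in the configuration:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Flush network buffers after at most 10 ms, even if they are not yet full.
env.setBufferTimeout(10);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;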
&lt;p&gt;&lt;strong&gt;You can apply this optimization&lt;/strong&gt; if you are aiming for sub-second latencies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Keep in mind&lt;/strong&gt; that network buffer timeout that is too low may reduce throughput.&lt;/p&gt;
&lt;p&gt;As seen in the following experiment results, by using &lt;code&gt;execution.buffer-timeout: 10 ms&lt;/code&gt; in &lt;a href=&quot;https://github.com/ververica/lab-flink-latency/blob/main/src/main/java/com/ververica/lablatency/job/WindowingJob.java&quot;&gt;WindowingJob&lt;/a&gt;, we again reduced the latency (now to 370ms).&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:50%&quot; src=&quot;/img/blog/2022-05-18-latency-part1/buffer-timeout.png&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In part one of this multi-part series, we discussed types of latency in Flink and the way we measure end-to-end latency. Then we presented a few latency optimization techniques with a focus on direct latency optimization. For each technique, we explained what it is, when to use it, and what to keep in mind when using it. Part two will continue with a few more direct latency optimization techniques. Stay tuned!&lt;/p&gt;
</description>
<pubDate>Wed, 18 May 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/18/latency-part1.html</link>
<guid isPermaLink="true">/2022/05/18/latency-part1.html</guid>
</item>
<item>
<title>Apache Flink Table Store 0.1.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is pleased to announce the preview release of the
&lt;a href=&quot;https://github.com/apache/flink-table-store&quot;&gt;Apache Flink Table Store&lt;/a&gt; (0.1.0).&lt;/p&gt;
&lt;p&gt;Please check out the full &lt;a href=&quot;https://nightlies.apache.org/flink/flink-table-store-docs-release-0.1/&quot;&gt;documentation&lt;/a&gt; for detailed information and user guides.&lt;/p&gt;
&lt;p&gt;Note: Flink Table Store is still in beta status and undergoing rapid development.
We do not recommend that you use it directly in a production environment.&lt;/p&gt;
&lt;h2 id=&quot;what-is-flink-table-store&quot;&gt;What is Flink Table Store&lt;/h2&gt;
&lt;p&gt;In the past years, thanks to our numerous contributors and users, Apache Flink has established
itself as one of the best distributed computing engines, especially for stateful stream processing
at large scale. However, there are still a few challenges people face when they try to obtain
insights from their data in real-time. Among these challenges, one prominent problem is the lack of
storage that caters to all the computing patterns.&lt;/p&gt;
&lt;p&gt;As of now it is quite common that people deploy a few storage systems to work with Flink for different
purposes. A typical setup is a message queue for stream processing, a scannable file system / object store
for batch processing and ad-hoc queries, and a K-V store for lookups. Such an architecture poses challenges
to data quality and system maintenance, due to its complexity and heterogeneity. This is becoming a major
issue that hurts the end-to-end user experience of the streaming and batch unification brought by Apache Flink.&lt;/p&gt;
&lt;p&gt;The goal of Flink Table Store is to address the above issues. It is an important step for the project,
extending Flink’s capability from computing into the storage domain, so that we can provide a better end-to-end
experience to our users.&lt;/p&gt;
&lt;p&gt;Flink Table Store aims to provide a unified storage abstraction, so users don’t have to build the hybrid
storage by themselves. More specifically, Table Store offers the following core capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Support storage of large datasets and allow reads / writes in both batch and streaming manner.&lt;/li&gt;
&lt;li&gt;Support streaming queries with minimum latency down to milliseconds.&lt;/li&gt;
&lt;li&gt;Support Batch/OLAP queries with minimum latency down to the second level.&lt;/li&gt;
&lt;li&gt;Support incremental snapshots for stream consumption by default, so users don’t need to solve the
problem of combining different stores by themselves.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/table-store/table-store-architecture.png&quot; width=&quot;100%&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;In this preview version, as shown in the architecture above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Users can use Flink to insert data into the Table Store, either by streaming the change log
captured from databases, or by loading the data in batches from other stores like data warehouses.&lt;/li&gt;
&lt;li&gt;Users can use Flink to query the table store in different ways, including streaming queries and
Batch/OLAP queries. It is also worth noting that users can use other engines such as Apache Hive to
query from the table store as well.&lt;/li&gt;
&lt;li&gt;Under the hood, Table Store uses a hybrid storage architecture, using a Lake Store to store historical data
and a Queue system (Apache Kafka integration is currently supported) to store incremental data. It provides
incremental snapshots for hybrid streaming reads.&lt;/li&gt;
&lt;li&gt;Table Store’s Lake Store stores data as columnar files on file system / object store, and uses the LSM Structure
to support a large amount of data updates and high-performance queries.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Many thanks to the following systems for their inspiration: &lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;Apache Iceberg&lt;/a&gt; and &lt;a href=&quot;http://rocksdb.org/&quot;&gt;RocksDB&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;getting-started&quot;&gt;Getting started&lt;/h2&gt;
&lt;p&gt;Please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-table-store-docs-release-0.1/docs/try-table-store/quick-start/&quot;&gt;getting started guide&lt;/a&gt; for more details.&lt;/p&gt;
&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;
&lt;p&gt;The community is currently working on hardening the core logic, stabilizing the storage format and adding the remaining bits for making the Flink Table Store production-ready.&lt;/p&gt;
&lt;p&gt;In the upcoming 0.2.0 release you can expect (at least) the following additional features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ecosystem: Support Flink Table Store Reader for Apache Hive Engine&lt;/li&gt;
&lt;li&gt;Core: Support adjusting the number of buckets&lt;/li&gt;
&lt;li&gt;Core: Support for append-only data, so Table Store is not limited to update scenarios&lt;/li&gt;
&lt;li&gt;Core: Full Schema Evolution&lt;/li&gt;
&lt;li&gt;Improvements based on feedback from the preview release&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the medium term, you can also expect:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ecosystem: Support Flink Table Store Reader for Trino, PrestoDB and Apache Spark&lt;/li&gt;
&lt;li&gt;Flink Table Store Service to accelerate updates and improve query performance&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Please give the preview release a try, share your feedback on the Flink mailing list and contribute to the project!&lt;/p&gt;
&lt;h2 id=&quot;release-resources&quot;&gt;Release Resources&lt;/h2&gt;
&lt;p&gt;The source artifacts and binaries are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website.&lt;/p&gt;
&lt;p&gt;We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Table%20Store%22&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank every one of the contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Jane Chan, Jiangjie (Becket) Qin, Jingsong Lee, Leonard Xu, Nicholas Jiang, Shen Zhu, tsreaper, Yubin Li&lt;/p&gt;
</description>
<pubDate>Wed, 11 May 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/05/11/release-table-store-0.1.0.html</link>
<guid isPermaLink="true">/news/2022/05/11/release-table-store-0.1.0.html</guid>
</item>
<item>
<title>The Generic Asynchronous Base Sink</title>
<description>&lt;p&gt;Flink sinks share a lot of similar behavior. Most sinks batch records according to user-defined buffering hints, sign requests, write them to the destination, retry unsuccessful or throttled requests, and participate in checkpointing.&lt;/p&gt;
&lt;p&gt;This is why for Flink 1.15 we have decided to create the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink&quot;&gt;&lt;code&gt;AsyncSinkBase&lt;/code&gt; (FLIP-171)&lt;/a&gt;, an abstract sink with a number of common functionalities extracted.&lt;/p&gt;
&lt;p&gt;This is a base implementation for asynchronous sinks, which you should use whenever you need to implement a sink that doesn’t offer transactional capabilities. Adding support for a new destination now only requires a lightweight shim that implements the specific interfaces of the destination using a client that supports async requests.&lt;/p&gt;
&lt;p&gt;This common abstraction will reduce the effort required to maintain individual sinks that extend from this abstract sink, with bug fixes and improvements to the sink core benefiting all implementations that extend it. The design of &lt;code&gt;AsyncSinkBase&lt;/code&gt; focuses on extensibility and a broad support of destinations. The core of the sink is kept generic and free of any connector-specific dependencies.&lt;/p&gt;
&lt;p&gt;The sink base is designed to participate in checkpointing to provide at-least-once semantics and can work directly with destinations that provide a client that supports asynchronous requests.&lt;/p&gt;
&lt;p&gt;In this post, we will go over the details of the AsyncSinkBase so that you can start using it to build your own concrete sink.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#adding-the-base-sink-as-a-dependency&quot; id=&quot;markdown-toc-adding-the-base-sink-as-a-dependency&quot;&gt;Adding the base sink as a dependency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-public-interfaces-of-asyncsinkbase&quot; id=&quot;markdown-toc-the-public-interfaces-of-asyncsinkbase&quot;&gt;The Public Interfaces of AsyncSinkBase&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#generic-types&quot; id=&quot;markdown-toc-generic-types&quot;&gt;Generic Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#element-converter-interface&quot; id=&quot;markdown-toc-element-converter-interface&quot;&gt;Element Converter Interface&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#sink-writer-interface&quot; id=&quot;markdown-toc-sink-writer-interface&quot;&gt;Sink Writer Interface&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#sink-interface&quot; id=&quot;markdown-toc-sink-interface&quot;&gt;Sink Interface&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#metrics&quot; id=&quot;markdown-toc-metrics&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#sink-behavior&quot; id=&quot;markdown-toc-sink-behavior&quot;&gt;Sink Behavior&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;adding-the-base-sink-as-a-dependency&quot;&gt;Adding the base sink as a dependency&lt;/h1&gt;
&lt;p&gt;In order to use the base sink, you will need to add the following dependency to your project. The example below follows the Maven syntax:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-connector-base&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;${flink.version}&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1 id=&quot;the-public-interfaces-of-asyncsinkbase&quot;&gt;The Public Interfaces of AsyncSinkBase&lt;/h1&gt;
&lt;h2 id=&quot;generic-types&quot;&gt;Generic Types&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;InputT&amp;gt;&lt;/code&gt; – type of elements in a DataStream that should be passed to the sink&lt;/p&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;RequestEntryT&amp;gt;&lt;/code&gt; – type of a payload containing the element and additional metadata that is required to submit a single element to the destination&lt;/p&gt;
&lt;h2 id=&quot;element-converter-interface&quot;&gt;Element Converter Interface&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15.0/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/ElementConverter.java&quot;&gt;ElementConverter&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ElementConverter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Serializable&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The concrete sink implementation should provide a way to convert from an element in the DataStream to the payload type that contains all the additional metadata required to submit that element to the destination by the sink. Ideally, this would be encapsulated from the end user since it allows concrete sink implementers to adapt to changes in the destination API without breaking end user code.&lt;/p&gt;
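&lt;p&gt;As an illustration, here is a minimal sketch of such a converter for a hypothetical destination. &lt;code&gt;MyRequestEntry&lt;/code&gt; is an assumed, serializable wrapper type and not part of the base sink API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.nio.charset.StandardCharsets;
import org.apache.flink.api.connector.sink2.SinkWriter;
import org.apache.flink.connector.base.sink.writer.ElementConverter;

// Converts each incoming String into a hypothetical MyRequestEntry payload.
public class MyElementConverter implements ElementConverter&amp;lt;String, MyRequestEntry&amp;gt; {
    @Override
    public MyRequestEntry apply(String element, SinkWriter.Context context) {
        // MyRequestEntry is an assumed wrapper around the serialized payload bytes.
        return new MyRequestEntry(element.getBytes(StandardCharsets.UTF_8));
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;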
&lt;h2 id=&quot;sink-writer-interface&quot;&gt;Sink Writer Interface&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15.0/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java&quot;&gt;AsyncSinkWriter&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There is a buffer in the sink writer that holds the request entries that have been sent to the sink but not yet written to the destination. An element of the buffer is a &lt;code&gt;RequestEntryWrapper&amp;lt;RequestEntryT&amp;gt;&lt;/code&gt; consisting of the &lt;code&gt;RequestEntryT&lt;/code&gt; along with the size of that record.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AsyncSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Serializable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;StatefulSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;protected&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;submitRequestEntries&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requestEntries&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Consumer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requestResult&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We will submit the &lt;code&gt;requestEntries&lt;/code&gt; asynchronously to the destination from here. Sink implementers should use the client libraries of the destination they intend to write to in order to perform this.&lt;/p&gt;
&lt;p&gt;Should any elements fail to be persisted, they will be requeued in the buffer for retry using &lt;code&gt;requestResult.accept(...list of failed entries...)&lt;/code&gt;. However, retrying an element that is known to be faulty and consistently failing will result in that element being requeued forever, so a sensible strategy for determining what should be retried is highly recommended. If no errors were returned, we must indicate this with &lt;code&gt;requestResult.accept(Collections.emptyList())&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If at any point, it is determined that a fatal error has occurred and that we should throw a runtime exception from the sink, we can call &lt;code&gt;getFatalExceptionCons().accept(...);&lt;/code&gt; from anywhere in the concrete sink writer.&lt;/p&gt;
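&lt;p&gt;Putting these pieces together, the following is a minimal sketch of how a concrete writer might implement &lt;code&gt;submitRequestEntries&lt;/code&gt;. The &lt;code&gt;client&lt;/code&gt;, its &lt;code&gt;putRecordsAsync&lt;/code&gt; call and &lt;code&gt;MyFatalException&lt;/code&gt; are assumed stand-ins for a destination-specific async client, not a real API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;@Override
protected void submitRequestEntries(
        List&amp;lt;MyRequestEntry&amp;gt; requestEntries, Consumer&amp;lt;List&amp;lt;MyRequestEntry&amp;gt;&amp;gt; requestResult) {
    // putRecordsAsync(...) stands in for the destination client's asynchronous batch call.
    client.putRecordsAsync(requestEntries).whenComplete((response, error) -&amp;gt; {
        if (error instanceof MyFatalException) {
            // Unrecoverable error: fail the job via the fatal exception consumer.
            getFatalExceptionCons().accept(new RuntimeException(error));
        } else if (error != null) {
            // Transient failure: requeue the whole batch for retry.
            requestResult.accept(requestEntries);
        } else {
            // Everything was persisted: signal success with an empty list.
            requestResult.accept(Collections.emptyList());
        }
    });
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;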
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AsyncSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Serializable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;StatefulSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;protected&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getSizeInBytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requestEntry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The async sink has a concept of the size of the elements in the buffer. This allows users to specify a byte size threshold beyond which elements will be flushed. However, the sink implementer is best positioned to determine what the most sensible measure of size for each &lt;code&gt;RequestEntryT&lt;/code&gt; is. If there is no way to determine the size of a record, then the value &lt;code&gt;0&lt;/code&gt; may be returned, and the sink will not flush based on record size triggers.&lt;/p&gt;
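&lt;p&gt;For example, if the assumed &lt;code&gt;MyRequestEntry&lt;/code&gt; from the sketch above exposes its payload bytes, the size calculation could be as simple as:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;@Override
protected long getSizeInBytes(MyRequestEntry requestEntry) {
    // getPayload() is a method of the hypothetical MyRequestEntry; return 0 if the size cannot be determined.
    return requestEntry.getPayload().length;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;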
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AsyncSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Serializable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;StatefulSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;AsyncSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ElementConverter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;elementConverter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Sink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;InitContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxBatchSize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxInFlightRequests&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxBufferedRequests&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxBatchSizeInBytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxTimeInBufferMS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxRecordSizeInBytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;cm&quot;&gt;/* ... */&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;By default, the method &lt;code&gt;snapshotState&lt;/code&gt; returns all the elements in the buffer to be saved for snapshots. Any elements that were previously removed from the buffer are guaranteed to be persisted in the destination by a preceding call to &lt;code&gt;AsyncWriter#flush(true)&lt;/code&gt;.
You may want to save additional state from the concrete sink. You can achieve this by overriding &lt;code&gt;snapshotState&lt;/code&gt;, and restoring from the saved state in the constructor. You will receive the saved state by overriding &lt;code&gt;restoreWriter&lt;/code&gt; in your concrete sink. In this method, you should construct a sink writer, passing in the recovered state.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MySinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AsyncSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MySinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ... &lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Collection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;initialStates&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;initialStates&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// restore concrete sink state from initialStates&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;snapshotState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;checkpointId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;snapshotState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;checkpointId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;sink-interface&quot;&gt;Sink Interface&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/flink/blob/release-1.15.0/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/AsyncSinkBase.java&quot;&gt;AsyncSinkBase&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MySink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AsyncSinkBase&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulSinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferedRequestState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestEntryT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;createWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InitContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MySinkWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;AsyncSinkBase implementations return their own extension of the &lt;code&gt;AsyncSinkWriter&lt;/code&gt; from &lt;code&gt;createWriter()&lt;/code&gt;.&lt;/p&gt;
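&lt;p&gt;To complement the example above, here is a minimal sketch (under the same assumptions as &lt;code&gt;MySinkWriter&lt;/code&gt;) of overriding &lt;code&gt;restoreWriter&lt;/code&gt; to hand the recovered buffered requests back to the writer:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;@Override
public StatefulSinkWriter&amp;lt;InputT, BufferedRequestState&amp;lt;RequestEntryT&amp;gt;&amp;gt; restoreWriter(
        InitContext context, Collection&amp;lt;BufferedRequestState&amp;lt;RequestEntryT&amp;gt;&amp;gt; recoveredState) {
    // Recreate the writer and pass the recovered buffered requests back in.
    return new MySinkWriter(context, recoveredState);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;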
&lt;p&gt;At the time of writing, the &lt;a href=&quot;https://github.com/apache/flink/tree/release-1.15.0/flink-connectors/flink-connector-aws-kinesis-streams&quot;&gt;Kinesis Data Streams sink&lt;/a&gt; and &lt;a href=&quot;https://github.com/apache/flink/tree/release-1.15.0/flink-connectors/flink-connector-aws-kinesis-firehose&quot;&gt;Kinesis Data Firehose sink&lt;/a&gt; are using this base sink.&lt;/p&gt;
&lt;h1 id=&quot;metrics&quot;&gt;Metrics&lt;/h1&gt;
&lt;p&gt;There are three metrics that exist automatically when you implement a sink on top of the base (and, thus, should not be implemented yourself).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;CurrentSendTime Gauge - returns the amount of time in milliseconds it took for the most recent request to write records to complete, whether successful or not.&lt;/li&gt;
&lt;li&gt;NumBytesOut Counter - counts the total number of bytes the sink has tried to write to the destination, using the method &lt;code&gt;getSizeInBytes&lt;/code&gt; to determine the size of each record. This will double count failures that may need to be retried.&lt;/li&gt;
&lt;li&gt;NumRecordsOut Counter - similar to above, this counts the total number of records the sink has tried to write to the destination. This will double count failures that may need to be retried.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;sink-behavior&quot;&gt;Sink Behavior&lt;/h1&gt;
&lt;p&gt;There are six sink configuration settings that control the buffering, flushing, and retry behavior of the sink.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;int maxBatchSize&lt;/code&gt; - maximum number of elements that may be passed in the list to submitRequestEntries to be written downstream.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;int maxInFlightRequests&lt;/code&gt; - maximum number of uncompleted calls to submitRequestEntries that the SinkWriter will allow at any given point. Once this limit has been reached, writes and callbacks to add elements to the buffer may block until one or more requests to submitRequestEntries complete.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;int maxBufferedRequests&lt;/code&gt; - maximum buffer length. Callbacks to add elements to the buffer and calls to write will block if this length has been reached and will only unblock if elements from the buffer have been removed for flushing.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;long maxBatchSizeInBytes&lt;/code&gt; - a flush will be attempted if the most recent call to write introduces an element to the buffer such that the total size of the buffer is greater than or equal to this threshold value.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;long maxTimeInBufferMS&lt;/code&gt; - maximum amount of time an element may remain in the buffer. In most cases elements are flushed as a result of the batch size (in bytes or number) being reached or during a snapshot. However, there are scenarios where an element may remain in the buffer forever or a long period of time. To mitigate this, a timer is constantly active in the buffer such that: while the buffer is not empty, it will flush every maxTimeInBufferMS milliseconds.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;long maxRecordSizeInBytes&lt;/code&gt; - maximum size in bytes allowed for a single record, as determined by &lt;code&gt;getSizeInBytes()&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
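&lt;p&gt;As an illustration only, a concrete writer might pass these settings to the &lt;code&gt;AsyncSinkWriter&lt;/code&gt; constructor shown earlier as follows (the values below are arbitrary examples, not recommendations):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;super(
    elementConverter,
    context,
    500,                // maxBatchSize
    16,                 // maxInFlightRequests
    10000,              // maxBufferedRequests
    4 * 1024 * 1024L,   // maxBatchSizeInBytes
    5000L,              // maxTimeInBufferMS
    1024 * 1024L);      // maxRecordSizeInBytes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;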
&lt;p&gt;Destinations typically have a defined throughput limit and will begin throttling or rejecting requests once that limit is approached. We employ &lt;a href=&quot;https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease&quot;&gt;Additive Increase Multiplicative Decrease (AIMD)&lt;/a&gt; as a strategy for selecting the optimal batch size.&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;The AsyncSinkBase is a new abstraction that makes creating and maintaining async sinks easier. This will be available in Flink 1.15 and we hope that you will try it out and give us feedback on it.&lt;/p&gt;
</description>
<pubDate>Fri, 06 May 2022 18:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/06/async-sink-base.html</link>
<guid isPermaLink="true">/2022/05/06/async-sink-base.html</guid>
</item>
<item>
<title>Exploring the thread mode in PyFlink</title>
<description>&lt;p&gt;PyFlink was introduced in Flink 1.9 with the purpose of bringing the power of Flink to Python users and allowing them to develop Flink jobs in the Python language.
The functionality has become more and more mature through the development in the past releases.&lt;/p&gt;
&lt;p&gt;Before Flink 1.15, Python user-defined functions were executed in separate Python processes (based on the &lt;a href=&quot;https://docs.google.com/document/d/1B9NmaBSKCnMJQp-ibkxvZ_U233Su67c1eYgBhrqWP24/edit#heading=h.khjybycus70&quot;&gt;Apache Beam Portability Framework&lt;/a&gt;).
This brings additional serialization/deserialization overhead as well as communication overhead. In scenarios where the data size is big, e.g. image processing,
this overhead becomes non-negligible. Besides, since it involves inter-process communication, the processing latency is also non-negligible,
which is unacceptable in scenarios where latency is critical, e.g. quantitative trading.&lt;/p&gt;
&lt;p&gt;In Flink 1.15, we have introduced a new execution mode named ‘thread’ mode (based on &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;) where the Python user-defined functions will be executed in the JVM as
a thread instead of a separate Python process. In this article, we will dig into the details about this execution mode and also share some benchmark data to
give users a basic understanding of how it works and which scenarios it’s applicable for.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#process-mode&quot; id=&quot;markdown-toc-process-mode&quot;&gt;Process Mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pemja&quot; id=&quot;markdown-toc-pemja&quot;&gt;PEMJA&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#pemja-architecture&quot; id=&quot;markdown-toc-pemja-architecture&quot;&gt;PEMJA Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#comparison-with-other-solutions&quot; id=&quot;markdown-toc-comparison-with-other-solutions&quot;&gt;Comparison with other solutions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#thread-mode&quot; id=&quot;markdown-toc-thread-mode&quot;&gt;Thread Mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#comparisons-between-process-mode-and-thread-mode&quot; id=&quot;markdown-toc-comparisons-between-process-mode-and-thread-mode&quot;&gt;Comparisons between process mode and thread mode&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#benefits-of-thread-mode&quot; id=&quot;markdown-toc-benefits-of-thread-mode&quot;&gt;Benefits of thread mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#limitations&quot; id=&quot;markdown-toc-limitations&quot;&gt;Limitations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#usage&quot; id=&quot;markdown-toc-usage&quot;&gt;Usage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#benchmarkhttpsgithubcomhuangxingbopyflink-benchmark&quot; id=&quot;markdown-toc-benchmarkhttpsgithubcomhuangxingbopyflink-benchmark&quot;&gt;&lt;a href=&quot;https://github.com/HuangXingBo/pyflink-benchmark&quot;&gt;Benchmark&lt;/a&gt;&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#test-environment&quot; id=&quot;markdown-toc-test-environment&quot;&gt;Test environment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#test-results&quot; id=&quot;markdown-toc-test-results&quot;&gt;Test results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary--future-work&quot; id=&quot;markdown-toc-summary--future-work&quot;&gt;Summary &amp;amp; Future work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;process-mode&quot;&gt;Process Mode&lt;/h2&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-architecture-overview.png&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Fig. 1 - PyFlink Architecture Overview&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;From Fig. 1, we can see the architecture of PyFlink. As shown on the left side of Fig. 1, users can use the PyFlink API (Python Table API &amp;amp; SQL or Python DataStream API) to declare the logic of their jobs,
which is finally translated into a JobGraph (the DAG of the job) that can be recognized by Flink’s execution framework. It should be noted that Python operators (Flink operators whose purpose is to
execute Python user-defined functions) will be used to execute the Python user-defined functions.&lt;/p&gt;
&lt;p&gt;The right side of Fig. 1 shows the details of the Python operators, where the Python user-defined functions are executed in separate Python processes.&lt;/p&gt;
&lt;p&gt;In order to communicate with the Python worker process, a series of communication services are required between the Python operator (which runs in the JVM) and the Python worker (which runs in the Python VM).
PyFlink has employed &lt;a href=&quot;https://docs.google.com/document/d/1B9NmaBSKCnMJQp-ibkxvZ_U233Su67c1eYgBhrqWP24/edit#heading=h.khjybycus70&quot;&gt;Apache Beam Portability framework&lt;/a&gt; to execute Python user-defined functions which provides the basic building blocks required for PyFlink.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-process-mode.png&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Fig. 2 - PyFlink Runtime in Process Mode&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Process mode runs stably and efficiently in most scenarios, and it is sufficient for most users. However, in some scenarios it doesn’t work well due to the additional serialization/deserialization overhead.
One of the most typical scenarios is image processing, where the input data size is often very big. Besides, since it involves inter-process communication, the processing latency is also non-negligible,
which is unacceptable in scenarios where latency is critical, e.g. quantitative trading. In order to overcome these problems, we have introduced a new execution mode (thread mode)
where Python user-defined functions will be executed in the JVM as a thread instead of in a separate Python process. In the following section, we will dig into the details of this new execution mode.&lt;/p&gt;
&lt;h2 id=&quot;pemja&quot;&gt;PEMJA&lt;/h2&gt;
&lt;p&gt;Before digging into the thread mode, let’s first introduce &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;, the library at the core of the thread mode architecture.&lt;/p&gt;
&lt;p&gt;As we all know, Java Native Interface (JNI) is a standard programming interface for writing Java native methods and embedding the Java virtual machine into native applications.
What’s more, CPython provides Python/C API to help embed Python in C Applications.&lt;/p&gt;
&lt;p&gt;So if we combine these two interfaces, we can embed Python in a Java application. Since this library solves the general problem of letting Python and Java call each other,
we have open sourced it as an independent project, and PyFlink has depended on &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; since Flink 1.15 to support thread mode.&lt;/p&gt;
&lt;h3 id=&quot;pemja-architecture&quot;&gt;PEMJA Architecture&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pemja.png&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Fig. 3 - PEMJA Architecture&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;As we can see from the architecture of &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; in Fig. 3, JVM and PVM can call each other in the same process through &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; Library.&lt;/p&gt;
&lt;p&gt;Firstly, &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; will start a daemon thread in JVM, which is responsible for initializing the Python Environment and creating a Python Main Interpreter owned by this process.
The reason why &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; uses a dedicated thread to initialize Python Environment is to avoid potential deadlocks in Python Interpreter.
Python Interpreter could deadlock when trying to acquire the GIL through methods such as &lt;a href=&quot;https://docs.python.org/3/c-api/init.html#c.PyGILState_Ensure&quot;&gt;PyGILState_*&lt;/a&gt; in Python/C API concurrently.
It should be noted that &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; doesn’t call those methods directly, however, it may happen that third-party libraries may call them, e.g. &lt;a href=&quot;https://numpy.org/&quot;&gt;numpy&lt;/a&gt;, etc.
To get around this, we use a dedicated thread to initialize the Python Environment.&lt;/p&gt;
&lt;p&gt;Then, each Java worker thread can invoke the Python functions through the Python &lt;a href=&quot;https://docs.python.org/3/c-api/init.html&quot;&gt;ThreadState&lt;/a&gt; created from Python Main Interpreter.&lt;/p&gt;
&lt;h3 id=&quot;comparison-with-other-solutions&quot;&gt;Comparison with other solutions&lt;/h3&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Framework&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Principle&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Limitations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;https://www.jython.org/&quot;&gt;Jython&lt;/a&gt;&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Python compiler implemented in Java&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Only support for Python2&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;https://www.graalvm.org/python/&quot;&gt;GraalVM&lt;/a&gt;&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Truffle framework&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Compatibility issues with various Python ecological libraries&lt;/li&gt;
&lt;li&gt;Works only with GraalVM&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;https://github.com/jpype-project/jpype&quot;&gt;JPype&lt;/a&gt;&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;JNI + Python/C API&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Don’t support Java calling Python&lt;/li&gt;
&lt;li&gt;Only support for CPython&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;https://github.com/ninia/jep&quot;&gt;Jep&lt;/a&gt;&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;JNI + Python/C API&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Difficult to integrate&lt;/li&gt;
&lt;li&gt;Performance is not good enough&lt;/li&gt;
&lt;li&gt;Only support for CPython&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;JNI + Python/C API&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Only support for CPython&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In the table above, we have made a basic comparison of the popular solutions for calling between Java and Python.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.jython.org/&quot;&gt;Jython&lt;/a&gt;: Jython is a Python interpreter implemented in the Java language. Because its implementation language is Java,
the interoperability between code written in Python syntax and Java code is very natural.
However, Jython does not support Python 3, and it is no longer actively maintained.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.graalvm.org/python/&quot;&gt;GraalVM&lt;/a&gt;: GraalVM makes use of the Truffle framework to support interoperability between Python and Java.
However, it has the limitation that not all Python libraries are supported, since many Python libraries rely on standard CPython to implement their C extensions.
The other problem is that it only works with GraalVM, which means high migration costs.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/jpype-project/jpype&quot;&gt;JPype&lt;/a&gt;: Similar to &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;,
JPype is also a framework built using JNI and Python/C API, but JPype only supports calling Java from Python.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/ninia/jep&quot;&gt;Jep&lt;/a&gt;: Similar to &lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;, Jep is also a framework built using JNI and the Python/C API, and it supports calling Python from Java.
However, it doesn’t provide a jar in the Maven repository, and the native packages to load need to be specified in advance through JVM parameters or environment variables when the JVM starts,
which makes it difficult to integrate. Furthermore, our benchmark shows that its performance is not very good.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt;: Similar to Jep and JPype, PEMJA is built on CPython, so it cannot support other Python interpreters, such as PyPy.
Since CPython is the reference and most widely used implementation of the Python runtime, most libraries in the Python ecosystem are built on the CPython runtime and therefore work with PEMJA naturally.&lt;/p&gt;
&lt;h2 id=&quot;thread-mode&quot;&gt;Thread Mode&lt;/h2&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-thread-mode.png&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Fig. 4 - PyFlink Runtime in Thread Mode&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;From the picture above, we can see that in thread mode, the Python user-defined function runs in the same process as the Python operator (which runs in the JVM).
&lt;a href=&quot;https://github.com/alibaba/pemja&quot;&gt;PEMJA&lt;/a&gt; is used as a bridge between the Java code and the Python code.&lt;/p&gt;
&lt;p&gt;Since the Python user-defined function runs in the JVM, each input record received from the upstream operators is passed to
the Python user-defined function directly instead of being buffered and passed to it in a batch.
Therefore, thread mode can have lower latency than process mode. Currently, if users want to achieve lower latency in process mode, they usually need to set
&lt;code&gt;python.fn-execution.bundle.size&lt;/code&gt; or &lt;code&gt;python.fn-execution.bundle.time&lt;/code&gt; to a lower value. However, since process mode involves inter-process communication,
the latency is still a little high in some scenarios, whereas this is not a problem in thread mode. Besides, setting &lt;code&gt;python.fn-execution.bundle.size&lt;/code&gt; or &lt;code&gt;python.fn-execution.bundle.time&lt;/code&gt; to
a lower value usually affects the overall performance of the job, which is also not a problem in thread mode.&lt;/p&gt;
&lt;h2 id=&quot;comparisons-between-process-mode-and-thread-mode&quot;&gt;Comparisons between process mode and thread mode&lt;/h2&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Execution Mode&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Benefits&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Limitations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Process Mode&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Better resource isolation&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;IPC overhead&lt;/li&gt;
&lt;li&gt;High implementation complexity&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Thread Mode&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Higher throughput&lt;/li&gt;
&lt;li&gt;Lower latency&lt;/li&gt;
&lt;li&gt;Less checkpoint time&lt;/li&gt;
&lt;li&gt;Less usage restrictions&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;td style=&quot;text-align: justify&quot;&gt;
&lt;ul&gt;
&lt;li&gt;Only support for CPython&lt;/li&gt;
&lt;li&gt;Multiple jobs cannot use different Python interpreters in session mode&lt;/li&gt;
&lt;li&gt;Performance is affected by the GIL&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&quot;benefits-of-thread-mode&quot;&gt;Benefits of thread mode&lt;/h3&gt;
&lt;p&gt;Since data is processed in batches in process mode, Python user-defined functions currently cannot be used in some scenarios,
e.g. in a Join (Table API &amp;amp; SQL) condition that takes columns from both the left table and the right table as inputs.
However, this is not a problem in thread mode because data is handled one record at a time instead of in batches.&lt;/p&gt;
&lt;p&gt;Unlike process mode, which sends and receives data asynchronously in batches, thread mode processes data synchronously, one record at a time.
So it usually has lower latency and also shorter checkpoint times. In terms of performance, since there is no inter-process communication,
it avoids the data serialization/deserialization and communication overhead, as well as the copying and context switching between kernel space and user space,
so thread mode usually delivers better performance.&lt;/p&gt;
&lt;h3 id=&quot;limitations&quot;&gt;Limitations&lt;/h3&gt;
&lt;p&gt;However, there are also some limitations for thread mode:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It only supports CPython which is also one of the most used Python interpreters.&lt;/li&gt;
&lt;li&gt;It doesn’t support session mode well and so it’s recommended that users only use thread mode in per-job or application deployments.
The reason is that it doesn’t support using different Python interpreters for jobs running in the same TaskManager.
This limitation comes from the fact that many Python libraries assume that they will only be initialized once in the process, so they use a lot of static variables.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;usage&quot;&gt;Usage&lt;/h2&gt;
&lt;p&gt;The execution mode could be configured via the configuration &lt;code&gt;python.execution-mode&lt;/code&gt;. It has two possible values:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;process&lt;/code&gt;: The Python user-defined functions will be executed in a separate Python process. (default)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;thread&lt;/code&gt;: The Python user-defined functions will be executed in the same process as Java operators.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example, you can configure it as follows in the Python Table API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;c&quot;&gt;# Specify `process` mode&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;table_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;python.execution-mode&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;process&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# Specify `thread` mode&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;table_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;python.execution-mode&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;thread&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It should be noted that since this is still the first release of ‘thread’ mode, there are currently still many limitations,
e.g. it only supports Python ScalarFunctions in the Python Table API &amp;amp; SQL. Jobs fall back to ‘process’ mode where ‘thread’ mode is not supported.
So it may happen that you configure a job to execute in thread mode, but it is actually executed in ‘process’ mode.&lt;/p&gt;
&lt;h2 id=&quot;benchmarkhttpsgithubcomhuangxingbopyflink-benchmark&quot;&gt;&lt;a href=&quot;https://github.com/HuangXingBo/pyflink-benchmark&quot;&gt;Benchmark&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id=&quot;test-environment&quot;&gt;Test environment&lt;/h3&gt;
&lt;p&gt;OS: Alibaba Cloud Linux (Aliyun Linux) release 2.1903 LTS (Hunting Beagle)&lt;/p&gt;
&lt;p&gt;CPU: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz&lt;/p&gt;
&lt;p&gt;Memory: 16G&lt;/p&gt;
&lt;p&gt;CPython: Python 3.7.3&lt;/p&gt;
&lt;p&gt;JDK: OpenJDK Runtime Environment (build 1.8.0_292-b10)&lt;/p&gt;
&lt;p&gt;PyFlink: 1.15.0&lt;/p&gt;
&lt;h3 id=&quot;test-results&quot;&gt;Test results&lt;/h3&gt;
&lt;p&gt;Here, we test JSON processing, which is a very common scenario for PyFlink users.&lt;/p&gt;
&lt;p&gt;The UDF implementations are as follows:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;c&quot;&gt;# python udf&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;general&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;json_value_lower&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;json&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// Java UDF&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;JsonValueLower&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ScalarFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ObjectMapper&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mapper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ObjectWriter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FunctionContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;mapper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ObjectMapper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;writer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mapper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;writerWithDefaultPrettyPrinter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StringObject&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;object&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mapper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;readValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StringObject&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;object&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;object&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toLowerCase&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;writeValueAsString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;object&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JsonProcessingException&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;RuntimeException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Failed to read json value&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;StringObject&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;setA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;toString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;StringObject{&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;a=&amp;#39;&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;&amp;#39;\&amp;#39;&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;&amp;#39;}&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The test results are as follows:&lt;/p&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Type (input data size)&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;QPS&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Latency&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Checkpoint Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Java UDF (100k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;900&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;100ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Java UDF (10k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;20us&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Java UDF (1k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;50,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1us&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Java UDF (100)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;280,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;200ns&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Process Mode (100k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;900&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;5s-10s&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Process Mode (10k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;7000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;5s-10s&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Process Mode (1k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;36,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3s&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Process Mode (100)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;120,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2s&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Thread Mode (100k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1200&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;100ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Thread Mode (10k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;12,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;20us&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Thread Mode (1k)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;50,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3us&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Thread Mode (100)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;120,000&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1us&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-performance.jpg&quot; /&gt;
&lt;/center&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-latency.jpg&quot; /&gt;
&lt;/center&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2022-05-06-pyflink-1.15-thread-mode/pyflink-checkpoint-time.jpg&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;As we can see from the test results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;If you care about latency and checkpoint time, thread mode is the better choice. The processing latency can be reduced from several seconds in process mode to microseconds in thread mode.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Thread mode brings better performance than process mode when the cost of data serialization/deserialization is not negligible relative to the UDF computation itself.
Compared to process mode, the benchmark shows that throughput can be increased by about 2x in common scenarios such as JSON processing.
However, if the UDF computation itself dominates the runtime, process mode is still recommended because it is more mature and offers better resource isolation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When the performance of a Python UDF is close to that of its Java counterpart, the end-to-end performance of thread mode will also be close to that of the Java UDF.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;summary--future-work&quot;&gt;Summary &amp;amp; Future work&lt;/h2&gt;
&lt;p&gt;In this article, we have introduced the ‘thread’ execution mode in PyFlink, a new feature introduced in Flink 1.15.
Compared with the ‘process’ execution mode, users get better performance, lower latency, and shorter checkpoint times in ‘thread’ mode.
However, ‘thread’ mode also comes with some limitations, e.g. limited support for the session deployment mode.&lt;/p&gt;
&lt;p&gt;It should be noted that since this is the first release of ‘thread’ mode, it still has quite a few limitations,
e.g. it only supports Python ScalarFunctions in the Python Table API &amp;amp; SQL. We plan to extend it to the other places where Python user-defined functions can be used in upcoming releases.&lt;/p&gt;
</description>
<pubDate>Fri, 06 May 2022 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/06/pyflink-1.15-thread-mode.html</link>
<guid isPermaLink="true">/2022/05/06/pyflink-1.15-thread-mode.html</guid>
</item>
<item>
<title>Improvements to Flink operations: Snapshots Ownership and Savepoint Formats</title>
<description>&lt;p&gt;Flink has become a well-established data streaming engine, and a
mature project requires shifting some priorities from thinking purely about new features
towards improving stability and operational simplicity. In the last couple of releases, the Flink community has tried to address
some known friction points, which include improvements to the
snapshotting process. Snapshotting takes a global, consistent image of the state of a Flink job and is integral to fault tolerance and exactly-once processing. Snapshots include savepoints and checkpoints.&lt;/p&gt;
&lt;p&gt;This post will outline the journey of improving snapshotting in past releases and the upcoming improvements in Flink 1.15, which includes making it possible to take savepoints in the native state backend specific format as well as clarifying snapshots ownership.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#past-improvements-to-the-snapshotting-process&quot; id=&quot;markdown-toc-past-improvements-to-the-snapshotting-process&quot;&gt;Past improvements to the snapshotting process&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-improvements-to-the-snapshotting-process&quot; id=&quot;markdown-toc-new-improvements-to-the-snapshotting-process&quot;&gt;New improvements to the snapshotting process&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#the-new-restore-modes&quot; id=&quot;markdown-toc-the-new-restore-modes&quot;&gt;The new restore modes&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#legacy-mode&quot; id=&quot;markdown-toc-legacy-mode&quot;&gt;LEGACY mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#noclaim-default-mode&quot; id=&quot;markdown-toc-noclaim-default-mode&quot;&gt;NO_CLAIM (default) mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#claim-mode&quot; id=&quot;markdown-toc-claim-mode&quot;&gt;CLAIM mode&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#savepoint-format&quot; id=&quot;markdown-toc-savepoint-format&quot;&gt;Savepoint format&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#capabilities-and-limitations&quot; id=&quot;markdown-toc-capabilities-and-limitations&quot;&gt;Capabilities and limitations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;past-improvements-to-the-snapshotting-process&quot;&gt;Past improvements to the snapshotting process&lt;/h1&gt;
&lt;p&gt;Flink 1.13 was the first release where we announced &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/concepts/stateful-stream-processing/#unaligned-checkpointing&quot;&gt;unaligned checkpoints&lt;/a&gt; to be production-ready. We
encouraged people to use them if their jobs are backpressured to a point where it causes issues for
checkpoints. We also &lt;a href=&quot;/news/2021/05/03/release-1.13.0.html#switching-state-backend-with-savepoints&quot;&gt;unified the binary format of savepoints&lt;/a&gt; across all
different state backends, which enables stateful switching of savepoints.&lt;/p&gt;
&lt;p&gt;Flink 1.14 also brought additional improvements. As an alternative and as a complement
to unaligned checkpoints, we introduced a feature called &lt;a href=&quot;/news/2021/09/29/release-1.14.0.html#buffer-debloating&quot;&gt;“buffer debloating”&lt;/a&gt;. This is built
around the concept of automatically adjusting the amount of in-flight data that needs to be aligned
while snapshotting. We also fixed another long-standing problem and made it
possible to &lt;a href=&quot;/news/2021/09/29/release-1.14.0.html#checkpointing-and-bounded-streams&quot;&gt;continue checkpointing even if there are finished tasks&lt;/a&gt; in a JobGraph.&lt;/p&gt;
&lt;h1 id=&quot;new-improvements-to-the-snapshotting-process&quot;&gt;New improvements to the snapshotting process&lt;/h1&gt;
&lt;p&gt;You can expect more improvements in Flink 1.15! We continue to be invested in making it easy
to operate Flink clusters and have tackled the following problems. :)&lt;/p&gt;
&lt;p&gt;Savepoints can be expensive
to take and restore from when they are taken for a very large state stored in the RocksDB state backend. In
order to circumvent this issue, we have seen users leverage externalized incremental checkpoints
instead of savepoints in order to benefit from the native RocksDB format. However, checkpoints and savepoints
serve different operational purposes. Thus, we have now made it possible to take savepoints in the
native, state-backend-specific format, while still maintaining some characteristics of savepoints (i.e. making them relocatable).&lt;/p&gt;
&lt;p&gt;Another issue reported with externalized checkpoints is that it is not clear who owns the
checkpoint files (Flink or the user?). This is especially problematic when it comes to incremental RocksDB checkpoints,
where you can easily end up in a situation where you do not know which checkpoints depend on which files,
which makes it tough to clean those files up. To solve this issue, we added explicit restore
modes (CLAIM, NO_CLAIM, and LEGACY) which clearly define whether Flink should take
care of cleaning up the snapshots or whether that remains the user’s responsibility.&lt;/p&gt;
&lt;h2 id=&quot;the-new-restore-modes&quot;&gt;The new restore modes&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;Restore Mode&lt;/code&gt; determines who takes ownership of the files that make up savepoints or
externalized checkpoints after they are restored. Snapshots, which are either checkpoints or savepoints
in this context, can be owned either by the user or by Flink itself. If a snapshot is owned by the user,
Flink will not delete its files and will not depend on the existence
of such files, since they might be deleted outside of Flink’s control.&lt;/p&gt;
&lt;p&gt;The restore modes are &lt;code&gt;CLAIM&lt;/code&gt;, &lt;code&gt;NO_CLAIM&lt;/code&gt;, and &lt;code&gt;LEGACY&lt;/code&gt; (for backwards compatibility). You can pass the restore mode like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;$ bin/flink run -s :savepointPath -restoreMode :mode -n [:runArgs]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;While each restore mode serves a specific purpose, we believe the default &lt;em&gt;NO_CLAIM&lt;/em&gt; mode is a good
tradeoff in most situations, as it provides clear ownership with a small price for the first
checkpoint after the restore.&lt;/p&gt;
&lt;p&gt;Let’s dig further into each of the modes.&lt;/p&gt;
&lt;h3 id=&quot;legacy-mode&quot;&gt;LEGACY mode&lt;/h3&gt;
&lt;p&gt;The legacy mode is how Flink dealt with snapshots until version 1.15. In this mode, Flink will never delete the initial
checkpoint. Unfortunately, at the same time, it is not clear whether the user can ever safely delete it either.
The problem here is that Flink might immediately build an incremental checkpoint on top of the
restored one. Therefore, subsequent checkpoints depend on the restored checkpoint. Overall, the
ownership is not well defined in this mode.&lt;/p&gt;
&lt;div style=&quot;text-align: center&quot;&gt;
&lt;img src=&quot;/img/blog/2022-05-06-restore-modes/restore-mode-legacy.svg&quot; alt=&quot;LEGACY restore mode&quot; width=&quot;70%&quot; /&gt;
&lt;/div&gt;
&lt;h3 id=&quot;noclaim-default-mode&quot;&gt;NO_CLAIM (default) mode&lt;/h3&gt;
&lt;p&gt;To fix the issue of files that no one can reliably claim ownership of, we introduced the &lt;code&gt;NO_CLAIM&lt;/code&gt;
mode as the new default. In this mode, Flink does not assume ownership of the snapshot: it leaves the files under
the user’s control and never deletes any of them. You can start multiple jobs from the
same snapshot in this mode.&lt;/p&gt;
&lt;p&gt;In order to make sure Flink does not depend on any of the files from that snapshot, it will force
the first (successful) checkpoint to be a full checkpoint as opposed to an incremental one. This
only makes a difference for &lt;code&gt;state.backend: rocksdb&lt;/code&gt;, because all other state backends always take
full checkpoints.&lt;/p&gt;
&lt;p&gt;Once the first full checkpoint completes, all subsequent checkpoints will be taken as
usual/configured. Consequently, once a checkpoint succeeds, you can manually delete the original
snapshot. You cannot do this earlier because, without any completed checkpoints, Flink will
try to recover from the initial snapshot upon failure.&lt;/p&gt;
&lt;div style=&quot;text-align: center&quot;&gt;
&lt;img src=&quot;/img/blog/2022-05-06-restore-modes/restore-mode-no_claim.svg&quot; alt=&quot;NO_CLAIM restore mode&quot; width=&quot;70%&quot; /&gt;
&lt;/div&gt;
&lt;h3 id=&quot;claim-mode&quot;&gt;CLAIM mode&lt;/h3&gt;
&lt;p&gt;If you do not want to sacrifice any performance while taking the first checkpoint, we suggest
looking into the &lt;code&gt;CLAIM&lt;/code&gt; mode. In this mode, Flink claims ownership of the snapshot
and essentially treats it like a checkpoint: it controls the lifecycle and might delete it if it is
not needed for recovery anymore. Hence, it is not safe to manually delete the snapshot or to start
two jobs from the same snapshot. Flink keeps around a configured number of checkpoints.&lt;/p&gt;
&lt;div style=&quot;text-align: center&quot;&gt;
&lt;img src=&quot;/img/blog/2022-05-06-restore-modes/restore-mode-claim.svg&quot; alt=&quot;CLAIM restore mode&quot; width=&quot;70%&quot; /&gt;
&lt;/div&gt;
&lt;h2 id=&quot;savepoint-format&quot;&gt;Savepoint format&lt;/h2&gt;
&lt;p&gt;You can now trigger savepoints in the native format of the state backend.
This has been introduced to combine two characteristics, one from savepoints and one from
checkpoints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;self-contained, relocatable, and owned by users&lt;/li&gt;
&lt;li&gt;lightweight (and thus fast to take and recover from)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In order to combine the two features in a single concept, we added a way for Flink to create a
savepoint in the (native) binary format of the state backend in use. This makes a significant difference
especially in combination with the &lt;code&gt;state.backend: rocksdb&lt;/code&gt; setting and incremental snapshots.&lt;/p&gt;
&lt;p&gt;That state backend can leverage RocksDB’s native on-disk data structures, which are usually referred to
as SST files. Incremental checkpoints already leverage those files: they are
collections of SST files with some additional metadata, which can be quickly reloaded
into the working directory of RocksDB upon restore.&lt;/p&gt;
&lt;p&gt;Native savepoints can use the same mechanism of uploading the SST files instead of dumping the
entire state into a canonical Flink format. There is one additional benefit over simply using
externalized incremental checkpoints: native savepoints are still relocatable and self-contained
in a single directory. This does not hold for checkpoints, because a single SST file can be
used by multiple checkpoints and is therefore put into a common shared directory. That is why they are
called incremental.&lt;/p&gt;
&lt;p&gt;You can choose the savepoint format when triggering the savepoint like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# take an intermediate savepoint
$ bin/flink savepoint --type [native/canonical] :jobId [:targetDirectory]
# stop the job with a savepoint
$ bin/flink stop --type [native/canonical] --savepointPath [:targetDirectory] :jobId
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;capabilities-and-limitations&quot;&gt;Capabilities and limitations&lt;/h3&gt;
&lt;p&gt;Unfortunately it is not possible to provide the same guarantees for all types of snapshots
(canonical or native savepoints and aligned or unaligned checkpoints). The main difference between
checkpoints and savepoints is that savepoints are still triggered and owned by users. Flink does not
create them automatically nor ever depends on their existence. Their main purpose is still for planned,
manual backups, whereas checkpoints are used for recovery. In database terms, savepoints are similar
to backups, whereas checkpoints are like recovery logs.&lt;/p&gt;
&lt;p&gt;Having additional dimensions of properties within each of the two main snapshot categories does not make
things easier, so we list what you can achieve with each type of snapshot.&lt;/p&gt;
&lt;p&gt;The following table gives an overview of capabilities and limitations for the various types of
savepoints and checkpoints.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;✓ - Flink fully supports this type of snapshot&lt;/li&gt;
&lt;li&gt;x - Flink doesn’t support this type of snapshot&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: left&quot;&gt;Operation&lt;/th&gt;
&lt;th style=&quot;text-align: left&quot;&gt;Canonical Savepoint&lt;/th&gt;
&lt;th style=&quot;text-align: left&quot;&gt;Native Savepoint&lt;/th&gt;
&lt;th style=&quot;text-align: left&quot;&gt;Aligned Checkpoint&lt;/th&gt;
&lt;th style=&quot;text-align: left&quot;&gt;Unaligned Checkpoint&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;State backend change&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;State Processor API (writing)&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;State Processor API (reading)&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Self-contained and relocatable&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Schema evolution&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Arbitrary job upgrade&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Non-arbitrary job upgrade&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Flink minor version upgrade&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Flink bug/patch version upgrade&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: left&quot;&gt;Rescaling&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;td style=&quot;text-align: left&quot;&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;ul&gt;
&lt;li&gt;State backend change - you can restore from the snapshot with a different state.backend than the
one for which the snapshot was taken&lt;/li&gt;
&lt;li&gt;State Processor API (writing) - The ability to create new snapshot via State Processor API.&lt;/li&gt;
&lt;li&gt;State Processor API (reading) - The ability to read state from the existing snapshot via State
Processor API.&lt;/li&gt;
&lt;li&gt;Self-contained and relocatable - One snapshot directory contains everything it needs for recovery.
You can move the directory around.&lt;/li&gt;
&lt;li&gt;Schema evolution - Changing the data type of the &lt;em&gt;state&lt;/em&gt; in your UDFs.&lt;/li&gt;
&lt;li&gt;Arbitrary job upgrade - Restoring the snapshot with a different partitioning type (rescale,
rebalance, map, etc.)
or with a different record type for an existing operator. In other words, you can add arbitrary
operators anywhere in your job graph.&lt;/li&gt;
&lt;li&gt;Non-arbitrary job upgrade - In contrast to the above, you should still be able to add new
operators, but certain limitations apply. You cannot change the partitioning of existing operators
or the data type of records being exchanged.&lt;/li&gt;
&lt;li&gt;Flink minor version upgrade - Restoring a snapshot which was taken for an older minor version of
Flink (1.x → 1.y).&lt;/li&gt;
&lt;li&gt;Flink bug/patch version upgrade - Restoring a snapshot which was taken for an older patch version
of Flink (1.14.x → 1.14.y).&lt;/li&gt;
&lt;li&gt;Rescaling - Restoring the snapshot with a different parallelism than was used during the snapshot
creation.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;We hope the changes we introduced over the last releases make it easier to operate Flink with respect
to snapshotting. We are eager to hear from you if any of the new features have helped you solve problems you’ve faced in the past.
At the same time, if you still struggle with an issue or you had to work around some obstacles, please let
us know! Maybe we will be able to incorporate your approach or find a different solution together.&lt;/p&gt;
</description>
<pubDate>Fri, 06 May 2022 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2022/05/06/restore-modes.html</link>
<guid isPermaLink="true">/2022/05/06/restore-modes.html</guid>
</item>
<item>
<title>Announcing the Release of Apache Flink 1.15</title>
<description>&lt;p&gt;Thanks to our well-organized and open community, Apache Flink continues
&lt;a href=&quot;https://www.apache.org/foundation/docs/FY2021AnnualReport.pdf&quot;&gt;to grow&lt;/a&gt; as a
technology and remain one of the most active projects in
the Apache community. With the release of Flink 1.15, we are proud to announce a number of
exciting changes.&lt;/p&gt;
&lt;p&gt;One of the main concepts that makes Apache Flink stand out is the unification of
batch (aka bounded) and stream (aka unbounded) data processing, which helps reduce the complexity of development. A lot of
effort went into this unification in the previous releases, and you can expect more efforts in this direction.&lt;/p&gt;
&lt;p&gt;Apache Flink is not only growing when it comes to contributions and users, but
also beyond its original use cases. We are seeing a trend towards more business/analytics
use cases implemented in low-/no-code. Flink SQL is the feature in the Flink ecosystem
that enables such use cases and this is why its popularity continues to grow.&lt;/p&gt;
&lt;p&gt;Apache Flink is an essential building block in data pipelines/architectures and
is used with many other technologies in order to drive all sorts of use cases. While new ideas/products
may appear in this domain, existing technologies continue to establish themselves as standards for solving
mission-critical problems. Knowing that we have such a wide reach and play a role in the success of many
projects, it is important that the experience of
integrating Apache Flink with the cloud infrastructures and existing systems is as seamless and easy as possible.&lt;/p&gt;
&lt;p&gt;In the 1.15 release the Apache Flink community made significant progress across all
these areas. Still, those are not the only things that made it into 1.15. The
contributors improved the experience of operating Apache Flink by making it much
easier and more transparent to handle checkpoints and savepoints and their ownership,
by making auto-scaling more seamless and complete (removing side effects of use cases
in which different data sources produce varying amounts of data), and, finally, by adding the
ability to upgrade SQL jobs without losing the state. By continuing to support
checkpoints after tasks finish and adding window table-valued functions in batch
mode, the experience of unified stream and batch processing was once more improved,
making hybrid use cases much easier. In the SQL space, not only has the first step towards
version upgrades been added, but also JSON functions that make it easier to import
and export structured data in SQL. Both will allow users to better rely on Flink SQL
for production use cases in the long term. To establish Apache Flink as part of the
data processing ecosystem we improved cloud interoperability and added more sink
connectors and formats. And yes, we enabled a Scala-free runtime
(&lt;a href=&quot;https://flink.apache.org/2022/02/22/scala-free.html&quot;&gt;the hype is real&lt;/a&gt;).&lt;/p&gt;
&lt;h2 id=&quot;operating-apache-flink-with-ease&quot;&gt;Operating Apache Flink with ease&lt;/h2&gt;
&lt;p&gt;Even Flink jobs that have been built and tuned by the best engineering teams still need to
be operated, usually on a long-term basis. The many deployment
patterns, APIs, tunable configs, and use cases covered by Apache Flink mean that operation
support is vital and can be burdensome.&lt;/p&gt;
&lt;p&gt;In this release, we listened to user feedback and made operating Flink much
easier. Handling checkpoints and savepoints and their ownership is now more transparent,
auto-scaling is more seamless and complete (by removing side effects of use cases
where different data sources produce varying amounts of data), and SQL jobs can be
upgraded without losing their state.&lt;/p&gt;
&lt;h3 id=&quot;clarification-of-checkpoint-and-savepoint-semantics&quot;&gt;Clarification of checkpoint and savepoint semantics&lt;/h3&gt;
&lt;p&gt;An essential cornerstone of Flink’s fault tolerance strategy is based on
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints/&quot;&gt;checkpoints&lt;/a&gt; and
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/&quot;&gt;savepoints&lt;/a&gt; (see &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints_vs_savepoints/&quot;&gt;the comparison&lt;/a&gt;).
The purpose of savepoints has always been to put transitions,
backups, and upgrades of Flink jobs in the control of users. Checkpoints, on
the other hand, are intended to be fully controlled by Flink and guarantee fault
tolerance through fast recovery, failover, etc. Both concepts are quite similar, and
the underlying implementation also shares aspects of the same ideas.&lt;/p&gt;
&lt;p&gt;However, both concepts grew apart by following specific feature requests and sometimes
neglecting the overarching idea and strategy. Based on user feedback, it became apparent that they should be
better aligned and harmonized and, above all, made clearer!&lt;/p&gt;
&lt;p&gt;There have been situations in which users relied on checkpoints to stop/restart jobs when savepoints would have
been the right way to go. It was also not clear that savepoints are slower since they don’t include
some of the features that make taking checkpoints so fast. In some cases, like
resuming from a retained checkpoint, where the checkpoint is somewhat treated like
a savepoint, it is unclear to the user when they can actually clean it up.&lt;/p&gt;
&lt;p&gt;With &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership&quot;&gt;FLIP-193 (Snapshots ownership)&lt;/a&gt;
the community aims to make ownership the only difference between savepoints and
checkpoints. In the 1.15 release the community has fixed some of those shortcomings
by supporting
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#savepoint-format&quot;&gt;native and incremental savepoints&lt;/a&gt;.
Savepoints always used to use the
canonical format, which made them slower. Writing full savepoints also takes
longer than writing them incrementally. With 1.15, if users take savepoints in the native format
and use the RocksDB state backend, savepoints will be
taken automatically in an incremental manner. The documentation has also been
clarified to provide a better overview and understanding of the differences between
checkpoints and savepoints. The semantics for
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#resuming-from-savepoints&quot;&gt;resuming from savepoint/retained checkpoint&lt;/a&gt;
have also been clarified introducing the CLAIM and NO_CLAIM mode. With
the CLAIM mode Flink takes over ownership of an existing snapshot, with NO_CLAIM it
creates its own copy and leaves the existing one up to the user. Please note that
NO_CLAIM mode is the new default behavior. The old semantic of resuming from
savepoint/retained checkpoint is still accessible but has to be manually selected by
choosing LEGACY mode.&lt;/p&gt;
&lt;h3 id=&quot;elastic-scaling-with-reactive-mode-and-the-adaptive-scheduler&quot;&gt;Elastic scaling with reactive mode and the adaptive scheduler&lt;/h3&gt;
&lt;p&gt;Driven by the increasing number of cloud services built on top of Apache Flink, the
project is becoming more and more cloud native which makes elastic
scaling even more important.&lt;/p&gt;
&lt;p&gt;This release improves metrics for the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#reactive-mode&quot;&gt;reactive mode&lt;/a&gt;, which is a job-scope mode where the JobManager will try to use all TaskManager resources available. To do this, we made all the metrics in
the Job scope work correctly when reactive mode is enabled.&lt;/p&gt;
&lt;p&gt;We also added an exception history for the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-160%3A+Adaptive+Scheduler&quot;&gt;adaptive scheduler&lt;/a&gt;, which is a new scheduler that first declares the required resources and waits for them before deciding on the parallelism with which to execute a job.&lt;/p&gt;
&lt;p&gt;Furthermore, downscaling is sped up significantly. The TaskManager now has a dedicated
shutdown code path, where it actively deregisters itself from the cluster instead
of relying on heartbeats, giving the JobManager a clear signal for downscaling.&lt;/p&gt;
&lt;h3 id=&quot;adaptive-batch-scheduler&quot;&gt;Adaptive batch scheduler&lt;/h3&gt;
&lt;p&gt;In 1.15, we introduced a new scheduler to Apache Flink: the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/elastic_scaling/#adaptive-batch-scheduler&quot;&gt;Adaptive Batch Scheduler&lt;/a&gt;.
The new scheduler can automatically decide parallelisms of job vertices for batch jobs,
according to the size of data volume each vertex needs to process.&lt;/p&gt;
&lt;p&gt;Major benefits of this scheduler include:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ease-of-use: Batch job users can be relieved from parallelism tuning.&lt;/li&gt;
&lt;li&gt;Adaptive: Automatically tuned parallelisms can better fit consumed datasets whose
volume varies from day to day.&lt;/li&gt;
&lt;li&gt;Fine-grained: Parallelism of each job vertex will be tuned individually. This allows
vertices of SQL batch jobs to be automatically assigned different proper parallelisms.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&quot;watermark-alignment-across-data-sources&quot;&gt;Watermark alignment across data sources&lt;/h3&gt;
&lt;p&gt;Having data sources that increase watermarks at different paces could lead to
problems with downstream operators. For example, some operators might need to buffer excessive
amounts of data which could lead to huge operator states. This is why we introduced watermark alignment
in this release.&lt;/p&gt;
&lt;p&gt;For sources based on the new source interface,
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment-_beta_&quot;&gt;watermark alignment&lt;/a&gt;
can be activated. Users can define
alignment groups to pause consuming from sources which are too far ahead of the others. The ideal scenario for aligned watermarks is when there are two or more
sources that produce watermarks at different speeds and when the source has the same
parallelism as the number of splits/shards/partitions.&lt;/p&gt;
&lt;h3 id=&quot;sql-version-upgrades&quot;&gt;SQL version upgrades&lt;/h3&gt;
&lt;p&gt;The execution plan of SQL queries and its resulting topology is based on optimization
rules and a cost model. This means that even minimal changes could introduce a completely
different topology. This dynamism makes guaranteeing snapshot compatibility
very challenging across different Flink versions. In the efforts of 1.15, the community focused
on keeping the same query (via the same topology) up and running even after upgrades.&lt;/p&gt;
&lt;p&gt;At the core of SQL upgrades are JSON plans
(&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/api/java/org/apache/flink/table/api/CompiledPlan.html&quot;&gt;please note that we only have documentation in our JavaDocs for now and are still working on updating the documentation&lt;/a&gt;), which are JSON representations of a query’s execution plan. JSON plans have already been used
internally in previous releases and are now exposed externally. Both the Table API
and SQL provide a way to compile and execute a plan which guarantees the same
topology across different versions. This feature will be released as an experimental MVP.
Users who want to give it a try already can create a JSON plan that can then be used
to restore a Flink job based on the old operator structure. The full feature can be expected
in Flink 1.16.&lt;/p&gt;
&lt;p&gt;Reliable upgrades make Flink SQL more dependable for production use cases in the long term.&lt;/p&gt;
&lt;h3 id=&quot;changelog-state-backend&quot;&gt;Changelog state backend&lt;/h3&gt;
&lt;p&gt;In Flink 1.15, we introduced the MVP feature of the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/state_backends/#enabling-changelog&quot;&gt;changelog state backend&lt;/a&gt;,
which aims at
making checkpoint intervals shorter and more predictable with the following advantages:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Shorter end-to-end latency: end-to-end latency mostly depends on the checkpointing
mechanism, especially for transactional sinks. Transactional sinks commit on
checkpoints, so faster checkpoints mean more frequent commits.&lt;/li&gt;
&lt;li&gt;More predictable checkpoint intervals: currently checkpointing time largely depends
on the size of the artifacts that need to be persisted on the checkpoint storage.
By keeping the size consistently small, checkpointing time becomes more predictable.&lt;/li&gt;
&lt;li&gt;Less work on recovery: with checkpoints taken more frequently, less data needs
to be re-processed after each recovery.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The changelog state backend helps achieve the above by continuously
persisting state changes on non-volatile storage while performing state materialization
in the background.&lt;/p&gt;
&lt;h3 id=&quot;repeatable-cleanup&quot;&gt;Repeatable cleanup&lt;/h3&gt;
&lt;p&gt;In previous releases of Flink, cleanup of job-related artifacts was attempted only once, which could leave abandoned artifacts behind in case of an error. In this version, Flink retries the cleanup to avoid leaving artifacts behind. By default, this retry mechanism runs until it succeeds. Users can change this behavior by configuring the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#retryable-cleanup&quot;&gt;repeatable cleanup options&lt;/a&gt;. Disabling the retry strategy makes Flink behave as in previous releases.&lt;/p&gt;
&lt;p&gt;There is still work in progress around cleaning up checkpoints, which is covered by
&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26606&quot;&gt;FLINK-26606&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;openapi&quot;&gt;OpenAPI&lt;/h3&gt;
&lt;p&gt;Flink is now providing an experimental REST API specification following the
&lt;a href=&quot;https://www.openapis.org&quot;&gt;OpenAPI&lt;/a&gt; standard.
This allows the REST API to be used with standard tools that are implementing the
OpenAPI standard.
You can find the specification &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobmanager&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;improvements-to-application-mode&quot;&gt;Improvements to application mode&lt;/h3&gt;
&lt;p&gt;When running Flink in &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/overview/&quot;&gt;application mode&lt;/a&gt;, it can now be guaranteed that jobs will take a savepoint after they are completed if they have been configured to do so
(&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#execution-shutdown-on-application-finish&quot;&gt;see execution.shutdown-on-application-finish&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The recovery and cleanup of jobs running in application mode have also been improved. The local
state can be persisted in the working directory, which makes recovering
from local storage easier.&lt;/p&gt;
&lt;h2 id=&quot;unification-of-stream-and-batch-processing---more-progress&quot;&gt;Unification of stream and batch processing - more progress&lt;/h2&gt;
&lt;p&gt;In the latest release, we picked up new efforts and continued some previous ones in the goal of unifying stream and batch processing.&lt;/p&gt;
&lt;h3 id=&quot;final-checkpoints&quot;&gt;Final checkpoints&lt;/h3&gt;
&lt;p&gt;In Flink 1.14, final checkpoints were added as a feature that had to be enabled manually.
Since the last release, we listened to user feedback and decided to enable them by default. For more
information and how to disable this feature, please refer to the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/fault-tolerance/checkpointing/#checkpointing-with-parts-of-the-graph-finished&quot;&gt;documentation&lt;/a&gt;.
This change in configuration can prolong the shutdown sequence of bounded
streaming jobs because jobs have to wait for a final checkpoint before being allowed to
finish.&lt;/p&gt;
&lt;h3 id=&quot;window-table-valued-functions&quot;&gt;Window table-valued functions&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/sql/queries/window-tvf/&quot;&gt;Window table-valued functions&lt;/a&gt;
have only been available for unbounded data streams.
With this release they will also be usable in BATCH mode. While working on this change,
window table-valued functions have also been improved in general by implementing
a dedicated operator, which no longer requires those window functions to be used with
aggregators.&lt;/p&gt;
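&lt;p&gt;As a rough sketch of what this enables, the snippet below runs a tumbling window TVF query in batch mode from the Python Table API. It assumes a table &lt;code&gt;Bid&lt;/code&gt; with a &lt;code&gt;TIMESTAMP&lt;/code&gt; column &lt;code&gt;bidtime&lt;/code&gt; and a numeric column &lt;code&gt;price&lt;/code&gt; is already registered; the table and column names are illustrative only.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import EnvironmentSettings, TableEnvironment

# Batch runtime mode; the same window TVF query also works on unbounded streams.
table_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())

# Tumbling window TVF over the (assumed) Bid table.
table_env.execute_sql(
    &amp;quot;SELECT window_start, window_end, SUM(price) AS total_price &amp;quot;
    &amp;quot;FROM TABLE(TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL &amp;#39;10&amp;#39; MINUTES)) &amp;quot;
    &amp;quot;GROUP BY window_start, window_end&amp;quot;
).print()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;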
&lt;h2 id=&quot;flink-sql&quot;&gt;Flink SQL&lt;/h2&gt;
&lt;p&gt;Community metrics indicate that Flink SQL is widely used and is becoming more popular every day. The community made several improvements, but we’d like to go into two of them in more detail.&lt;/p&gt;
&lt;h3 id=&quot;casttype-system-enhancements&quot;&gt;CAST/Type system enhancements&lt;/h3&gt;
&lt;p&gt;Data appears in all sorts and shapes but is often not in the type that you need
it to be, which is why
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/types/#casting&quot;&gt;casting&lt;/a&gt;
is one of the most common operations in SQL. In Flink
1.15, the default behavior of a failing CAST has changed from returning null to
returning an error, which makes it more compliant with the SQL standard. The old
casting behavior can still be used by calling the newly introduced TRY_CAST function
or restored via a configuration flag.&lt;/p&gt;
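&lt;p&gt;As a small, hedged sketch of the new behavior from the Python Table API (assuming an existing &lt;code&gt;TableEnvironment&lt;/code&gt; named &lt;code&gt;table_env&lt;/code&gt;; the option name &lt;code&gt;table.exec.legacy-cast-behaviour&lt;/code&gt; is our reading of the configuration flag mentioned above):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# A failing CAST now raises an error at runtime by default, while TRY_CAST
# keeps the old behavior and returns NULL for values that cannot be converted.
table_env.execute_sql(
    &amp;quot;SELECT TRY_CAST(&amp;#39;not-a-number&amp;#39; AS INT) AS parsed&amp;quot;
).print()

# Assumption: the legacy (null-returning) CAST behavior can also be restored
# globally via this configuration flag.
table_env.get_config().set(&amp;quot;table.exec.legacy-cast-behaviour&amp;quot;, &amp;quot;ENABLED&amp;quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;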
&lt;p&gt;In addition, many bugs have been fixed and improvements made to the casting
functionality, to ensure correct results.&lt;/p&gt;
&lt;h3 id=&quot;json-functions&quot;&gt;JSON functions&lt;/h3&gt;
&lt;p&gt;JSON is one of the most popular data formats and SQL users increasingly need to build
and read these data structures. Multiple
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/functions/systemfunctions/#json-functions&quot;&gt;JSON&lt;/a&gt;
functions have been added to Flink SQL
according to the SQL:2016 standard. They allow users to inspect, create, and modify JSON
strings using the Flink SQL dialect.&lt;/p&gt;
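&lt;p&gt;A short, illustrative sketch of a few of these functions from the Python Table API (again assuming an existing &lt;code&gt;table_env&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Build a JSON string with JSON_OBJECT and read a field back out of it with JSON_VALUE.
table_env.execute_sql(
    &amp;quot;SELECT &amp;quot;
    &amp;quot;  JSON_OBJECT(&amp;#39;engine&amp;#39; VALUE &amp;#39;flink&amp;#39;, &amp;#39;version&amp;#39; VALUE 15) AS doc, &amp;quot;
    &amp;quot;  JSON_VALUE(JSON_OBJECT(&amp;#39;version&amp;#39; VALUE 15), &amp;#39;$.version&amp;#39;) AS version&amp;quot;
).print()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;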
&lt;h2 id=&quot;community-enablement&quot;&gt;Community enablement&lt;/h2&gt;
&lt;p&gt;Enabling people to build streaming data pipelines to solve their use cases is our goal.
The community is well aware that a
technology like Apache Flink is never used on its own and will always be part of a
bigger architecture. Thus, it is important that Flink operates well in the cloud,
connects seamlessly to other systems, and continues to support programming languages like Java and Python.&lt;/p&gt;
&lt;h3 id=&quot;cloud-interoperability&quot;&gt;Cloud interoperability&lt;/h3&gt;
&lt;p&gt;There are users operating Flink deployments in cloud infrastructures from various
cloud providers. There are also services that offer to manage Flink deployments for
users on their platform.&lt;/p&gt;
&lt;p&gt;In Flink 1.15, a recoverable writer for Google Cloud Storage has
been added. We also organized the connectors in the Flink ecosystem and put some focus
on connectors for the AWS ecosystem (i.e.
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kinesis/&quot;&gt;KDS&lt;/a&gt;,
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/firehose/&quot;&gt;Firehose&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id=&quot;the-elasticsearch-sink&quot;&gt;The Elasticsearch sink&lt;/h3&gt;
&lt;p&gt;There was significant work on Flink’s overall connector ecosystem, but we want to highlight the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/elasticsearch/&quot;&gt;Elasticsearch sink&lt;/a&gt; because it was implemented with
the new connector interfaces, which offers asynchronous functionality coupled with end-to-end semantics.
This sink will act as a template in the future.&lt;/p&gt;
&lt;h3 id=&quot;a-scala-free-flink&quot;&gt;A Scala-free Flink&lt;/h3&gt;
&lt;p&gt;A detailed
&lt;a href=&quot;https://flink.apache.org/2022/02/22/scala-free.html&quot;&gt;blog post&lt;/a&gt;
already explains the ins and outs of why Scala users can now use the Flink
Java API with any Scala version (including Scala 3).&lt;/p&gt;
&lt;p&gt;In the end, removing Scala is just part of a larger effort of cleaning up and updating
various technologies from the Flink ecosystem.&lt;/p&gt;
&lt;p&gt;Starting in Flink 1.14, we removed the Mesos integration, isolated Akka, deprecated the
DataSet Java API, and hid the Table API behind an abstraction. There’s already a lot of traction in the community towards these endeavors.&lt;/p&gt;
&lt;h2 id=&quot;pyflink&quot;&gt;PyFlink&lt;/h2&gt;
&lt;p&gt;Before Flink 1.15, Python user-defined functions were executed in separate Python
processes, which caused additional serialization/deserialization and communication overhead.
In scenarios with large amounts of data, e.g. image processing, this overhead
becomes non-negligible. Moreover, since it involves inter-process
communication, the processing latency is also significant, which is unacceptable in
latency-critical scenarios such as quantitative trading. In Flink
1.15, we have introduced a new execution mode named ‘thread’ mode, in which Python
user-defined functions are executed in the JVM as a thread instead of in a separate
Python process. Benchmarks have shown that throughput can be increased by 2x in
common scenarios such as JSON processing, and that processing latency drops from
several seconds to microseconds. Note that since this is still the first
release of ‘thread’ mode, it currently only supports Python ScalarFunctions used
in the Python Table API &amp;amp; SQL. We plan to extend it to other areas in which Python
user-defined functions can be used in the next releases.&lt;/p&gt;
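&lt;p&gt;Below is a minimal sketch of how a job could opt into the new mode, assuming the &lt;code&gt;python.execution-mode&lt;/code&gt; configuration key and a hypothetical scalar function:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import DataTypes, EnvironmentSettings, TableEnvironment
from pyflink.table.udf import udf

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Switch Python scalar functions from the default 'process' mode to 'thread' mode,
# so they run inside the JVM instead of in a separate Python worker process.
t_env.get_config().get_configuration().set_string('python.execution-mode', 'thread')

@udf(result_type=DataTypes.STRING())
def shout(s):
    return s.upper()

t_env.create_temporary_function('shout', shout)
t_env.execute_sql('''SELECT shout('hello from thread mode')''').print()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;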
&lt;h3 id=&quot;other&quot;&gt;Other&lt;/h3&gt;
&lt;p&gt;Further work has been done on the
&lt;a href=&quot;https://github.com/PatrickRen/flink/tree/master/flink-test-utils-parent/flink-connector-testing&quot;&gt;connector testing framework&lt;/a&gt;. If you want to contribute a connector or improve on one, you should definitely have a
look.&lt;/p&gt;
&lt;p&gt;Some long-awaited features have been added, including the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/formats/csv/&quot;&gt;CSV format&lt;/a&gt;
and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/filesystem/#compaction&quot;&gt;small file compaction&lt;/a&gt;
in the unified sink interface.&lt;/p&gt;
&lt;p&gt;The sink API has been upgraded
to &lt;a href=&quot;https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/connector/sink2/StatefulSink.java&quot;&gt;version 2&lt;/a&gt;
and we encourage every connector maintainer to upgrade to this version.&lt;/p&gt;
&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;
&lt;p&gt;With this release, Apache Flink has become easier to operate, has made even more progress towards aligning stream and
batch processing, is more accessible thanks to improvements in the SQL components,
and integrates better with other technologies.&lt;/p&gt;
&lt;p&gt;It is also worth mentioning that
the community has set up a new home for
the &lt;a href=&quot;https://ververica.github.io/flink-cdc-connectors/release-2.1/index.html&quot;&gt;CDC connectors&lt;/a&gt;,
the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/Connectors&quot;&gt;connector repository&lt;/a&gt;
will be externalized
(&lt;a href=&quot;https://github.com/apache/flink-connector-elasticsearch/&quot;&gt;with the Elasticsearch sink as a first example&lt;/a&gt;),
and there is now a
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/&quot;&gt;Kubernetes operator&lt;/a&gt;
(&lt;a href=&quot;https://flink.apache.org/news/2022/04/03/release-kubernetes-operator-0.1.0.html&quot;&gt;announcement blogpost&lt;/a&gt;)
maintained by the community.&lt;/p&gt;
&lt;p&gt;Moving forward, the community will continue to focus on making Apache Flink a true
unified stream and batch processor and work on better integrating Flink into the cloud-native
ecosystem.&lt;/p&gt;
&lt;h2 id=&quot;upgrade-notes&quot;&gt;Upgrade Notes&lt;/h2&gt;
&lt;p&gt;While we aim to make upgrades as smooth as possible, some of the changes require users
to adjust some parts of the program when upgrading to Apache Flink 1.15. Please take a
look at the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/&quot;&gt;release notes&lt;/a&gt;
for a list of applicable adjustments and issues during
upgrades. The one big change worth highlighting when upgrading is that many dependencies
no longer carry a Scala version suffix.
&lt;a href=&quot;https://flink.apache.org/2022/02/22/scala-free.html&quot;&gt;Get the details here.&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank each and every one of the contributors that have
made this release possible:&lt;/p&gt;
&lt;p&gt;Ada Wong, Ahmed Hamdy, Aitozi, Alexander Fedulov, Alexander Preuß, Alexander Trushev, Ali Bahadir Zeybek,
Anton Kalashnikov, Arvid Heise, Bernard Joseph Jean Bruno, Bo Cui, Brian Zhou, Camile, ChangLi, Chengkai Yang,
Chesnay Schepler, Daisy T, Danny Cranmer, David Anderson, David Moravek, David N Perkins, Dawid Wysakowicz,
Denis-Cosmin Nutiu, Dian Fu, Dong Lin, Eelis Kostiainen, Etienne Chauchot, Fabian Paul, Francesco Guardiani,
Gabor Somogyi, Galen Warren, Gao Yun, Gen Luo, GitHub, Gyula Fora, Hang Ruan, Hangxiang Yu, Honnix, Horace Lee,
Ingo Bürk, JIN FENG, Jack, Jane Chan, Jark Wu, JianZhangYang, Jiangjie (Becket) Qin, JianzhangYang, Jiayi Liao,
Jing, Jing Ge, Jing Zhang, Jingsong Lee, JingsongLi, Jinzhong Li, Joao Boto, Joey Lee, John Karp, Jon Gillham,
Jun Qin, Junfan Zhang, Juntao Hu, Kexin, Kexin Hui, Kirill Listopad, Konstantin Knauf, LB-Yu, Leonard Xu, Lijie Wang,
Liu Jiangang, Maciej Bryński, Marios Trivyzas, MartijnVisser, Mason Chen, Matthias Pohl, Michal Ciesielczyk, Mika,
Mika Naylor, Mrart, Mulavar, Nick Burkard, Nico Kruber, Nicolas Raga, Nicolaus Weidner, Niklas Semmler, Nikolay,
Nuno Afonso, Oleg Smirnov, Paul Lin, Paul Zhang, PengFei Li, Piotr Nowojski, Px, Qingsheng Ren, Robert Metzger,
Roc Marshal, Roman, Roman Khachatryan, Ruanshubin, Rudi Kershaw, Rui Li, Ryan Scudellari, Ryan Skraba,
Sebastian Mattheis, Sergey, Sergey Nuyanzin, Shen Zhu, Shengkai, Shuo Cheng, Sike Bai, SteNicholas, Steffen Hausmann,
Stephan Ewen, Tartarus0zm, Thesharing, Thomas Weise, Till Rohrmann, Timo Walther, Tony Wei, Victor Xu,
Wenhao Ji, X-czh, Xianxun Ye, Xin Yu, Xinbin Huang, Xintong Song, Xuannan, Yang Wang, Yangze Guo, Yao Zhang,
Yi Tang, Yibo Wen, Yuan Mei, Yuanhao Tian, Yubin Li, Yuepeng Pan, Yufan Sheng, Yufei Zhang, Yuhao Bi, Yun Gao,
Yun Tang, Yuval Itzchakov, Yuxin Tan, Zakelly, Zhu Zhu, Zichen Liu, Zongwen Li, atptour2017, baisike, bgeng777,
camilesing, chenxyz707, chenzihao, chuixue, dengziming, dijkwxyz, fanrui, fengli, fenyi, fornaix, gaurav726,
godfrey he, godfreyhe, gongzhongqiang, haochenhao, hapihu, hehuiyuan, hongshuboy, huangxingbo, huweihua, iyupeng,
jiaoqingbo, jinfeng, jxjgsylsg, kevin.cyj, kylewang, lbb, liliwei, liming.1018, lincoln lee, liufangqi, liujiangang,
liushouwei, liuyongvs, lixiaobao14, lmagic233, lovewin99, lujiefsi, luoyuxia, lz, mans2singh, martijnvisser, mayue.fight,
nanmu42, oogetyboogety, paul8263, pusheng.li01, qianchutao, realdengziqi, ruanhang1993, sammieliu, shammon, shihong90,
shitou, shouweikun, shouzuo1, shuo.cs, siavash119, simenliuxing, sjwiesman, slankka, slinkydeveloper, snailHumming,
snuyanzin, sujun, sujun1, syhily, tsreaper, txdong-sz, unknown, vahmed-hamdy, wangfeifan, wangpengcheng, wangyang0918,
wangzhiwu, wangzhuo, wgzhao, wsz94, xiangqiao123, xmarker, xuyang, xuyu, xuzifu666, yangjunhan, yangze.gyz, ysymi,
yuxia Luo, zhang chaoming, zhangchaoming, zhangjiaogg, zhangjingcun, zhangjun02, zhangmang, zlzhang0122, zoucao, zp,
zzccctv, 周平, 子扬, 李锐, 蒋龙, 龙三, 庄天翼&lt;/p&gt;
</description>
<pubDate>Thu, 05 May 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/05/05/1.15-announcement.html</link>
<guid isPermaLink="true">/news/2022/05/05/1.15-announcement.html</guid>
</item>
<item>
<title>Apache Flink Kubernetes Operator 0.1.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink Community is pleased to announce the preview release of the Apache Flink Kubernetes Operator (0.1.0)&lt;/p&gt;
&lt;p&gt;The Flink Kubernetes Operator allows users to easily manage their Flink deployment lifecycle using native Kubernetes tooling.&lt;/p&gt;
&lt;p&gt;The operator takes care of submitting, savepointing, upgrading and generally managing Flink jobs using the built-in Flink Kubernetes integration.
This way users do not have to use the Flink clients (e.g. the CLI) or interact with the Flink jobs manually; they only have to declare the desired deployment specification and the operator will take care of the rest. It also makes it easier to integrate Flink job management with CI/CD tooling.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Core Features&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Deploy and monitor Flink Application and Session deployments&lt;/li&gt;
&lt;li&gt;Upgrade, suspend and delete Flink deployments&lt;/li&gt;
&lt;li&gt;Full logging and metrics integration&lt;/li&gt;
&lt;/ul&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2022-04-03-release-kubernetes-operator-0.1.0/overview.svg&quot; width=&quot;600px&quot; alt=&quot;Overview 1&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;h2 id=&quot;getting-started&quot;&gt;Getting started&lt;/h2&gt;
&lt;p&gt;For a detailed &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-0.1/docs/try-flink-kubernetes-operator/quick-start/&quot;&gt;getting started guide&lt;/a&gt; please check the documentation site.&lt;/p&gt;
&lt;h2 id=&quot;flinkdeployment-cr-overview&quot;&gt;FlinkDeployment CR overview&lt;/h2&gt;
&lt;p&gt;When using the operator, users create &lt;code&gt;FlinkDeployment&lt;/code&gt; objects to describe their Flink application and session clusters deployments.&lt;/p&gt;
&lt;p&gt;A minimal application deployment yaml would look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;flink.apache.org/v1alpha1&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;FlinkDeployment&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;default&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;basic-example&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;flink:1.14&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;flinkVersion&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;v1_14&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;flinkConfiguration&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;taskmanager.numberOfTaskSlots&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;2&amp;quot;&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;serviceAccount&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;flink&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;jobManager&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;replicas&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;resource&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;2048m&amp;quot;&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;cpu&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;taskManager&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;resource&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;2048m&amp;quot;&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;cpu&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;job&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;jarURI&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;local:///opt/flink/examples/streaming/StateMachineExample.jar&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;parallelism&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;upgradeMode&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;stateless&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once applied to the cluster using &lt;code&gt;kubectl apply -f your-deployment.yaml&lt;/code&gt;, the operator will spin up the application cluster for you.
If you would like to upgrade or make changes to your application, you can simply modify the yaml and submit it again; the operator will execute the necessary steps (savepoint, shutdown, redeploy, etc.) to upgrade your application.&lt;/p&gt;
&lt;p&gt;To stop and delete your application cluster you can simply call &lt;code&gt;kubectl delete -f your-deployment.yaml&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can read more about the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-0.1/docs/custom-resource/job-management/&quot;&gt;job management features&lt;/a&gt; on the documentation site.&lt;/p&gt;
&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;
&lt;p&gt;The community is currently working on hardening the core operator logic, stabilizing the APIs and adding the remaining bits for making the Flink Kubernetes Operator production ready.&lt;/p&gt;
&lt;p&gt;In the upcoming 1.0.0 release you can expect (at least) the following additional features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Support for Session Job deployments&lt;/li&gt;
&lt;li&gt;Job upgrade rollback strategies&lt;/li&gt;
&lt;li&gt;Pluggable validation logic&lt;/li&gt;
&lt;li&gt;Operator deployment customization&lt;/li&gt;
&lt;li&gt;Improvements based on feedback from the preview release&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the medium term you can also expect:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Support for standalone / reactive deployment modes&lt;/li&gt;
&lt;li&gt;Support for other job types such as SQL or Python&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Please give the preview release a try, share your feedback on the Flink mailing list and contribute to the project!&lt;/p&gt;
&lt;h2 id=&quot;release-resources&quot;&gt;Release Resources&lt;/h2&gt;
&lt;p&gt;The source artifacts and helm chart are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://archive.apache.org/dist/flink/flink-kubernetes-operator-0.1.0/&quot;&gt;official 0.1.0 release archive&lt;/a&gt; doubles as a Helm repository that you can easily register locally:&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;helm repo add flink-kubernetes-operator-0.1.0 https://archive.apache.org/dist/flink/flink-kubernetes-operator-0.1.0/
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;helm install flink-kubernetes-operator flink-kubernetes-operator-0.1.0/flink-kubernetes-operator --set webhook.create&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;p&gt;You can also find official Kubernetes Operator Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-kubernetes-operator&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-0.1/&quot;&gt;updated documentation&lt;/a&gt; and the
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351499&quot;&gt;release notes&lt;/a&gt;.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Kubernetes%20Operator%22&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank each and every one of the contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Aitozi, Biao Geng, Gyula Fora, Hao Xin, Jaegu Kim, Jaganathan Asokan, Junfan Zhang, Marton Balassi, Matyas Orhidi, Nicholas Jiang, Sandor Kelemen, Thomas Weise, Yang Wang, 愚鲤&lt;/p&gt;
</description>
<pubDate>Sun, 03 Apr 2022 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2022/04/03/release-kubernetes-operator-0.1.0.html</link>
<guid isPermaLink="true">/news/2022/04/03/release-kubernetes-operator-0.1.0.html</guid>
</item>
<item>
<title>Apache Flink 1.14.4 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink Community is pleased to announce another bug fix release for Flink 1.14.&lt;/p&gt;
&lt;p&gt;This release includes 51 bug and vulnerability fixes and minor improvements for Flink 1.14.
Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351074&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.14.4.&lt;/p&gt;
&lt;h1 id=&quot;release-artifacts&quot;&gt;Release Artifacts&lt;/h1&gt;
&lt;h2 id=&quot;maven-dependencies&quot;&gt;Maven Dependencies&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;binaries&quot;&gt;Binaries&lt;/h2&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;docker-images&quot;&gt;Docker Images&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/_/flink?tab=tags&amp;amp;page=1&amp;amp;name=1.14.4&quot;&gt;library/flink&lt;/a&gt; (official images)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/r/apache/flink/tags?page=1&amp;amp;name=1.14.4&quot;&gt;apache/flink&lt;/a&gt; (ASF repository)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;pypi&quot;&gt;PyPi&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pypi.org/project/apache-flink/1.14.4/&quot;&gt;apache-flink==1.14.4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h1&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21788&quot;&gt;FLINK-21788&lt;/a&gt;] - Throw PartitionNotFoundException if the partition file has been lost for blocking shuffle
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24954&quot;&gt;FLINK-24954&lt;/a&gt;] - Reset read buffer request timeout on buffer recycling for sort-shuffle
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25653&quot;&gt;FLINK-25653&lt;/a&gt;] - Move buffer recycle in SortMergeSubpartitionReader out of lock to avoid deadlock
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25654&quot;&gt;FLINK-25654&lt;/a&gt;] - Remove the redundant lock in SortMergeResultPartition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25879&quot;&gt;FLINK-25879&lt;/a&gt;] - Track used search terms in Matomo
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25880&quot;&gt;FLINK-25880&lt;/a&gt;] - Implement Matomo in Flink documentation
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21752&quot;&gt;FLINK-21752&lt;/a&gt;] - NullPointerException on restore in PojoSerializer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23946&quot;&gt;FLINK-23946&lt;/a&gt;] - Application mode fails fatally when being shut down
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24334&quot;&gt;FLINK-24334&lt;/a&gt;] - Configuration kubernetes.flink.log.dir not working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24407&quot;&gt;FLINK-24407&lt;/a&gt;] - Pulsar connector chinese document link to Pulsar document location incorrectly.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24607&quot;&gt;FLINK-24607&lt;/a&gt;] - SourceCoordinator may miss to close SplitEnumerator when failover frequently
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25171&quot;&gt;FLINK-25171&lt;/a&gt;] - When the DDL statement was executed, the column names of the Derived Columns were not validated
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25199&quot;&gt;FLINK-25199&lt;/a&gt;] - StreamEdges are not unique in self-union, which blocks propagation of watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25362&quot;&gt;FLINK-25362&lt;/a&gt;] - Incorrect dependencies in Table Confluent/Avro docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25407&quot;&gt;FLINK-25407&lt;/a&gt;] - Network stack deadlock when cancellation happens during initialisation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25466&quot;&gt;FLINK-25466&lt;/a&gt;] - TTL configuration could parse in StateTtlConfig#DISABLED
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25486&quot;&gt;FLINK-25486&lt;/a&gt;] - Perjob can not recover from checkpoint when zookeeper leader changes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25494&quot;&gt;FLINK-25494&lt;/a&gt;] - Duplicate element serializer during DefaultOperatorStateBackendSnapshotStrategy#syncPrepareResources
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25678&quot;&gt;FLINK-25678&lt;/a&gt;] - TaskExecutorStateChangelogStoragesManager.shutdown is not thread-safe
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25683&quot;&gt;FLINK-25683&lt;/a&gt;] - wrong result if table transfrom to DataStream then window process in batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25728&quot;&gt;FLINK-25728&lt;/a&gt;] - Potential memory leaks in StreamMultipleInputProcessor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25732&quot;&gt;FLINK-25732&lt;/a&gt;] - Dispatcher#requestMultipleJobDetails returns non-serialiable collection
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25827&quot;&gt;FLINK-25827&lt;/a&gt;] - Potential memory leaks in SourceOperator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25856&quot;&gt;FLINK-25856&lt;/a&gt;] - Fix use of UserDefinedType in from_elements
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25883&quot;&gt;FLINK-25883&lt;/a&gt;] - The value of DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S is too large
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25893&quot;&gt;FLINK-25893&lt;/a&gt;] - ResourceManagerServiceImpl&amp;#39;s lifecycle can lead to exceptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25952&quot;&gt;FLINK-25952&lt;/a&gt;] - Savepoint on S3 are not relocatable even if entropy injection is not enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26039&quot;&gt;FLINK-26039&lt;/a&gt;] - Incorrect value getter in map unnest table function
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26159&quot;&gt;FLINK-26159&lt;/a&gt;] - Pulsar Connector: should add description MAX_FETCH_RECORD in doc to explain slow consumption
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26160&quot;&gt;FLINK-26160&lt;/a&gt;] - Pulsar Connector: stopCursor description should be changed. Connector only stop when auto discovery is disabled.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26187&quot;&gt;FLINK-26187&lt;/a&gt;] - Chinese docs override english aliases
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-26304&quot;&gt;FLINK-26304&lt;/a&gt;] - GlobalCommitter can receive failed committables
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20188&quot;&gt;FLINK-20188&lt;/a&gt;] - Add Documentation for new File Source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21407&quot;&gt;FLINK-21407&lt;/a&gt;] - Clarify which sources and APIs support which formats
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20830&quot;&gt;FLINK-20830&lt;/a&gt;] - Add a type of HEADLESS_CLUSTER_IP for rest service type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24880&quot;&gt;FLINK-24880&lt;/a&gt;] - Error messages &amp;quot;OverflowError: timeout value is too large&amp;quot; shown when executing PyFlink jobs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25160&quot;&gt;FLINK-25160&lt;/a&gt;] - Make doc clear: tolerable-failed-checkpoints counts consecutive failures
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25611&quot;&gt;FLINK-25611&lt;/a&gt;] - Remove CoordinatorExecutorThreadFactory thread creation guards
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25650&quot;&gt;FLINK-25650&lt;/a&gt;] - Document unaligned checkpoints performance limitations (larger records/flat map/timers/...)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25767&quot;&gt;FLINK-25767&lt;/a&gt;] - Translation of page &amp;#39;Working with State&amp;#39; is incomplete
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25818&quot;&gt;FLINK-25818&lt;/a&gt;] - Add explanation how Kafka Source deals with idleness when parallelism is higher then the number of partitions
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Technical Debt
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25576&quot;&gt;FLINK-25576&lt;/a&gt;] - Update com.h2database:h2 to 2.0.206
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25785&quot;&gt;FLINK-25785&lt;/a&gt;] - Update com.h2database:h2 to 2.0.210
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 11 Mar 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2022/03/11/release-1.14.4.html</link>
<guid isPermaLink="true">/news/2022/03/11/release-1.14.4.html</guid>
</item>
<item>
<title>Scala Free in One Fifteen</title>
<description>&lt;p&gt;Flink 1.15 is right around the corner, and among the many improvements is a Scala free classpath.
Users can now leverage the Java API from any Scala version, including Scala 3!&lt;/p&gt;
&lt;figure style=&quot;margin-left:auto;margin-right:auto;display:block;padding-top: 20px;padding-bottom:20px;width:75%;&quot;&gt;
&lt;img src=&quot;/img/blog/2022-02-22-scala-free/flink-scala-3.jpeg&quot; /&gt;
&lt;figcaption style=&quot;padding-top: 10px;text-align:center&quot;&gt;&lt;b&gt;Fig.1&lt;/b&gt; Flink 1.15 Scala 3 Example&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This blog will discuss what has historically made supporting multiple Scala versions so complex, how we achieved this milestone, and the future of Scala in Apache Flink.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;b&gt;TLDR&lt;/b&gt;: All Scala dependencies are now isolated to the &lt;code&gt;flink-scala&lt;/code&gt; jar.
To remove Scala from the user-code classpath, remove this jar from the lib directory of the Flink distribution.
&lt;br /&gt;&lt;br /&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;rm flink-dist/lib/flink-scala*&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-classpath-and-scala&quot; id=&quot;markdown-toc-the-classpath-and-scala&quot;&gt;The Classpath and Scala&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#hiding-scala&quot; id=&quot;markdown-toc-hiding-scala&quot;&gt;Hiding Scala&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-future-of-scala-in-apache-flink&quot; id=&quot;markdown-toc-the-future-of-scala-in-apache-flink&quot;&gt;The Future of Scala in Apache Flink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;the-classpath-and-scala&quot;&gt;The Classpath and Scala&lt;/h2&gt;
&lt;p&gt;If you have worked with a JVM-based application, you have probably heard the term classpath.
The classpath defines where the JVM will search for a given classfile when it needs to be loaded.
There may only be one instance of a classfile on each classpath, forcing any dependency Flink exposes onto users.
That is why the Flink community works hard to keep our classpath “clean” - or free of unnecessary dependencies.
We achieve this through a combination of &lt;a href=&quot;https://github.com/apache/flink-shaded&quot;&gt;shaded dependencies&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/debugging/debugging_classloading/#inverted-class-loading-and-classloader-resolution-order&quot;&gt;child first class loading&lt;/a&gt;, and a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/&quot;&gt;plugins abstraction&lt;/a&gt; for optional components.&lt;/p&gt;
&lt;p&gt;The Apache Flink runtime is primarily written in Java but contains critical components that forced Scala on the default classpath.
And because Scala does not maintain binary compatibility across minor releases, this historically required cross-building components for all versions of Scala.
But due to many reasons - &lt;a href=&quot;https://github.com/scala/scala/releases/tag/v2.12.8&quot;&gt;breaking changes in the compiler&lt;/a&gt;, &lt;a href=&quot;https://www.scala-lang.org/news/2.13.0&quot;&gt;a new standard library&lt;/a&gt;, and &lt;a href=&quot;https://docs.scala-lang.org/scala3/guides/macros/macros.html&quot;&gt;a reworked macro system&lt;/a&gt; - this was easier said than done.&lt;/p&gt;
&lt;h2 id=&quot;hiding-scala&quot;&gt;Hiding Scala&lt;/h2&gt;
&lt;p&gt;As mentioned above, Flink uses Scala in a few key components: Mesos integration, the serialization stack, RPC, and the table planner.
Instead of removing these dependencies or finding ways to cross-build them, the community hid Scala.
It still exists in the codebase but no longer leaks into the user code classloader.&lt;/p&gt;
&lt;p&gt;In 1.14, we took our first steps in hiding Scala from our users.
We dropped the support for Apache Mesos, partially implemented in Scala, which Kubernetes very much eclipsed in terms of adoption.
Next, we isolated our RPC system into a dedicated classloader, including Akka.
With these changes, the runtime itself no longer relied on Scala (hence why flink-runtime lost its Scala suffix), but Scala was still ever-present in the API layer.&lt;/p&gt;
&lt;p&gt;These changes, and the ease with which we implemented them, started to make people wonder what else might be possible.
After all, we isolated Akka in less than a month, a task stuck in the backlog for years, thought to be too time-consuming.&lt;/p&gt;
&lt;p&gt;The next logical step was to decouple the DataStream / DataSet Java APIs from Scala.
This primarily entailed a few cleanups of some &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23967&quot;&gt;test&lt;/a&gt; &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23968&quot;&gt;classes&lt;/a&gt;, but also identifying code paths that are only relevant for the Scala API.
These paths were then migrated into the Scala API modules and only used if required.&lt;/p&gt;
&lt;p&gt;For example, the &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24017&quot;&gt;Kryo serializer&lt;/a&gt;, which we always extended to support certain Scala types, now only includes them if an application uses the Scala APIs.&lt;/p&gt;
&lt;p&gt;Finally, it was time to tackle the Table API, specifically the table planner, which contains 378,655 lines of Scala code at the time of writing.
The table planner provides parsing, planning, and optimization of SQL and Table API queries into highly optimized Java code.
It is the most extensive Scala codebase in Flink and it cannot be ported easily to Java.
Using what we learned from building dedicated classloaders for the RPC stack and conditional classloading for the serializers, we hid the planner behind an abstraction that does not expose any of its internals, including Scala.&lt;/p&gt;
&lt;h2 id=&quot;the-future-of-scala-in-apache-flink&quot;&gt;The Future of Scala in Apache Flink&lt;/h2&gt;
&lt;p&gt;While most of these changes happened behind the scenes, they resulted in one very user-facing change: removing many scala suffixes. You can find a list of all dependencies that lost their Scala suffix at the end of this post&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Additionally, changes to the Table API required several changes to the packaging and the distribution, which some power users relying on the planner internals might need to adapt to&lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Going forward, Flink will continue to support Scala packages for the DataStream and Table APIs compiled against Scala 2.12 while the Java API is now unlocked for users to leverage components from any Scala version.
We are already seeing new Scala 3 wrappers pop up in the community and are excited to see how users leverage these tools in their streaming pipelines&lt;sup id=&quot;fnref:4&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:5&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:6&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;!&lt;/p&gt;
&lt;hr /&gt;
&lt;div class=&quot;footnotes&quot;&gt;
&lt;ol&gt;
&lt;li id=&quot;fn:1&quot;&gt;
&lt;p&gt;flink-cep, flink-clients, flink-connector-elasticsearch-base, flink-connector-elasticsearch6, flink-connector-elasticsearch7, flink-connector-gcp-pubsub, flink-connector-hbase-1.4, flink-connector-hbase-2.2, flink-connector-hbase-base, flink-connector-jdbc, flink-connector-kafka, flink-connector-kinesis, flink-connector-nifi, flink-connector-pulsar, flink-connector-rabbitmq, flink-connector-testing, flink-connector-twitter, flink-connector-wikiedits, flink-container, flink-dstl-dfs, flink-gelly, flink-hadoop-bulk, flink-kubernetes, flink-runtime-web, flink-sql-connector-elasticsearch6, flink-sql-connector-elasticsearch7, flink-sql-connector-hbase-1.4, flink-sql-connector-hbase-2.2, flink-sql-connector-kafka, flink-sql-connector-kinesis, flink-sql-connector-rabbitmq, flink-state-processor-api, flink-statebackend-rocksdb, flink-streaming-java, flink-table-api-java-bridge, flink-test-utils, flink-yarn, flink-table-runtime, flink-table-api-java-bridge &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:2&quot;&gt;
&lt;p&gt;https://nightlies.apache.org/flink/flink-docs-master/docs/dev/configuration/overview/#which-dependencies-do-you-need &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:3&quot;&gt;
&lt;p&gt;https://nightlies.apache.org/flink/flink-docs-master/docs/dev/configuration/advanced/#anatomy-of-table-dependencies &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:4&quot;&gt;
&lt;p&gt;https://github.com/ariskk/flink4s &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:5&quot;&gt;
&lt;p&gt;https://github.com/findify/flink-adt &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:6&quot;&gt;
&lt;p&gt;https://github.com/sjwiesman/flink-scala-3 &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
<pubDate>Tue, 22 Feb 2022 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2022/02/22/scala-free.html</link>
<guid isPermaLink="true">/2022/02/22/scala-free.html</guid>
</item>
<item>
<title>Apache Flink 1.13.6 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink Community is pleased to announce another bug fix release for Flink 1.13.&lt;/p&gt;
&lt;p&gt;This release includes 99 bug and vulnerability fixes and minor improvements for Flink 1.13 including another upgrade of Apache Log4j (to 2.17.1).
Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351074&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.13.6.&lt;/p&gt;
&lt;h1 id=&quot;release-artifacts&quot;&gt;Release Artifacts&lt;/h1&gt;
&lt;h2 id=&quot;maven-dependencies&quot;&gt;Maven Dependencies&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;binaries&quot;&gt;Binaries&lt;/h2&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;docker-images&quot;&gt;Docker Images&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/_/flink?tab=tags&amp;amp;page=1&amp;amp;name=1.13.6&quot;&gt;library/flink&lt;/a&gt; (official images)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://hub.docker.com/r/apache/flink/tags?page=1&amp;amp;name=1.13.6&quot;&gt;apache/flink&lt;/a&gt; (ASF repository)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;pypi&quot;&gt;PyPi&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pypi.org/project/apache-flink/1.13.6/&quot;&gt;apache-flink==1.13.6&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h1&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15987&quot;&gt;FLINK-15987&lt;/a&gt;] - SELECT 1.0e0 / 0.0e0 throws NumberFormatException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17914&quot;&gt;FLINK-17914&lt;/a&gt;] - HistoryServer deletes cached archives if archive listing fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20195&quot;&gt;FLINK-20195&lt;/a&gt;] - Jobs endpoint returns duplicated jobs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20370&quot;&gt;FLINK-20370&lt;/a&gt;] - Result is wrong when sink primary key is not the same with query
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21289&quot;&gt;FLINK-21289&lt;/a&gt;] - Application mode ignores the pipeline.classpaths configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23919&quot;&gt;FLINK-23919&lt;/a&gt;] - PullUpWindowTableFunctionIntoWindowAggregateRule generates invalid Calc for Window TVF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24232&quot;&gt;FLINK-24232&lt;/a&gt;] - Archiving of suspended jobs prevents breaks subsequent archive attempts
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24255&quot;&gt;FLINK-24255&lt;/a&gt;] - Test Environment / Mini Cluster do not forward configuration.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24310&quot;&gt;FLINK-24310&lt;/a&gt;] - A bug in the BufferingSink example in the doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24318&quot;&gt;FLINK-24318&lt;/a&gt;] - Casting a number to boolean has different results between &amp;#39;select&amp;#39; fields and &amp;#39;where&amp;#39; condition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24334&quot;&gt;FLINK-24334&lt;/a&gt;] - Configuration kubernetes.flink.log.dir not working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24366&quot;&gt;FLINK-24366&lt;/a&gt;] - Unnecessary/misleading error message about failing restores when tasks are already canceled.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24401&quot;&gt;FLINK-24401&lt;/a&gt;] - TM cannot exit after Metaspace OOM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24465&quot;&gt;FLINK-24465&lt;/a&gt;] - Wrong javadoc and documentation for buffer timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24492&quot;&gt;FLINK-24492&lt;/a&gt;] - incorrect implicit type conversion between numeric and (var)char
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24506&quot;&gt;FLINK-24506&lt;/a&gt;] - checkpoint directory is not configurable through the Flink configuration passed into the StreamExecutionEnvironment
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24509&quot;&gt;FLINK-24509&lt;/a&gt;] - FlinkKafkaProducer example is not compiling due to incorrect constructer signature used
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24540&quot;&gt;FLINK-24540&lt;/a&gt;] - Fix Resource leak due to Files.list
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24543&quot;&gt;FLINK-24543&lt;/a&gt;] - Zookeeper connection issue causes inconsistent state in Flink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24563&quot;&gt;FLINK-24563&lt;/a&gt;] - Comparing timstamp_ltz with random string throws NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24597&quot;&gt;FLINK-24597&lt;/a&gt;] - RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24621&quot;&gt;FLINK-24621&lt;/a&gt;] - JobManager fails to recover 1.13.1 checkpoint due to InflightDataRescalingDescriptor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24662&quot;&gt;FLINK-24662&lt;/a&gt;] - PyFlink sphinx check failed with &amp;quot;node class &amp;#39;meta&amp;#39; is already registered, its visitors will be overridden&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24667&quot;&gt;FLINK-24667&lt;/a&gt;] - Channel state writer would fail the task directly if meeting exception previously
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24676&quot;&gt;FLINK-24676&lt;/a&gt;] - Schema does not match if explain insert statement with partial column
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24678&quot;&gt;FLINK-24678&lt;/a&gt;] - Correct the metric name of map state contains latency
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24708&quot;&gt;FLINK-24708&lt;/a&gt;] - `ConvertToNotInOrInRule` has a bug which leads to wrong result
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24728&quot;&gt;FLINK-24728&lt;/a&gt;] - Batch SQL file sink forgets to close the output stream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24761&quot;&gt;FLINK-24761&lt;/a&gt;] - Fix PartitionPruner code gen compile fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24846&quot;&gt;FLINK-24846&lt;/a&gt;] - AsyncWaitOperator fails during stop-with-savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24860&quot;&gt;FLINK-24860&lt;/a&gt;] - Fix the wrong position mappings in the Python UDTF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24885&quot;&gt;FLINK-24885&lt;/a&gt;] - ProcessElement Interface parameter Collector : java.lang.NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24922&quot;&gt;FLINK-24922&lt;/a&gt;] - Fix spelling errors in the word &amp;quot;parallism&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25022&quot;&gt;FLINK-25022&lt;/a&gt;] - ClassLoader leak with ThreadLocals on the JM when submitting a job through the REST API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25067&quot;&gt;FLINK-25067&lt;/a&gt;] - Correct the description of RocksDB&amp;#39;s background threads
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25084&quot;&gt;FLINK-25084&lt;/a&gt;] - Field names must be unique. Found duplicates
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25091&quot;&gt;FLINK-25091&lt;/a&gt;] - Official website document FileSink orc compression attribute reference error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25096&quot;&gt;FLINK-25096&lt;/a&gt;] - Issue in exceptions API(/jobs/:jobid/exceptions) in flink 1.13.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25199&quot;&gt;FLINK-25199&lt;/a&gt;] - StreamEdges are not unique in self-union, which blocks propagation of watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25362&quot;&gt;FLINK-25362&lt;/a&gt;] - Incorrect dependencies in Table Confluent/Avro docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25468&quot;&gt;FLINK-25468&lt;/a&gt;] - Local recovery fails if local state storage and RocksDB working directory are not on the same volume
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25486&quot;&gt;FLINK-25486&lt;/a&gt;] - Perjob can not recover from checkpoint when zookeeper leader changes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25494&quot;&gt;FLINK-25494&lt;/a&gt;] - Duplicate element serializer during DefaultOperatorStateBackendSnapshotStrategy#syncPrepareResources
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25513&quot;&gt;FLINK-25513&lt;/a&gt;] - CoFlatMapFunction requires both two flat_maps to yield something
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25559&quot;&gt;FLINK-25559&lt;/a&gt;] - SQL JOIN causes data loss
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25683&quot;&gt;FLINK-25683&lt;/a&gt;] - wrong result if table transfrom to DataStream then window process in batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25728&quot;&gt;FLINK-25728&lt;/a&gt;] - Potential memory leaks in StreamMultipleInputProcessor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25732&quot;&gt;FLINK-25732&lt;/a&gt;] - Dispatcher#requestMultipleJobDetails returns non-serialiable collection
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21407&quot;&gt;FLINK-21407&lt;/a&gt;] - Clarify which sources and APIs support which formats
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20443&quot;&gt;FLINK-20443&lt;/a&gt;] - ContinuousProcessingTimeTrigger doesn&amp;#39;t fire at the end of the window
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21467&quot;&gt;FLINK-21467&lt;/a&gt;] - Document possible recommended usage of Bounded{One/Multi}Input.endInput and emphasize that they could be called multiple times
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23842&quot;&gt;FLINK-23842&lt;/a&gt;] - Add log messages for reader registrations and split requests.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24631&quot;&gt;FLINK-24631&lt;/a&gt;] - Avoiding directly use the labels as selector for deployment and service
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24739&quot;&gt;FLINK-24739&lt;/a&gt;] - State requirements for Flink&amp;#39;s application mode in the documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24987&quot;&gt;FLINK-24987&lt;/a&gt;] - Enhance ExternalizedCheckpointCleanup enum
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25160&quot;&gt;FLINK-25160&lt;/a&gt;] - Make doc clear: tolerable-failed-checkpoints counts consecutive failures
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25415&quot;&gt;FLINK-25415&lt;/a&gt;] - implement retrial on connections to Cassandra container
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25611&quot;&gt;FLINK-25611&lt;/a&gt;] - Remove CoordinatorExecutorThreadFactory thread creation guards
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25818&quot;&gt;FLINK-25818&lt;/a&gt;] - Add explanation how Kafka Source deals with idleness when parallelism is higher than the number of partitions
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Technical Debt
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24740&quot;&gt;FLINK-24740&lt;/a&gt;] - Update testcontainers dependency to v1.16.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24796&quot;&gt;FLINK-24796&lt;/a&gt;] - Exclude javadocs / node[_modules] directories from CI compile artifact
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25472&quot;&gt;FLINK-25472&lt;/a&gt;] - Update to Log4j 2.17.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25375&quot;&gt;FLINK-25375&lt;/a&gt;] - Update Log4j to 2.17.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25576&quot;&gt;FLINK-25576&lt;/a&gt;] - Update com.h2database:h2 to 2.0.206
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 18 Feb 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2022/02/18/release-1.13.6.html</link>
<guid isPermaLink="true">/news/2022/02/18/release-1.13.6.html</guid>
</item>
<item>
<title>Stateful Functions 3.2.0 Release Announcement</title>
<description>&lt;p&gt;Stateful Functions is a cross-platform stack for building Stateful Serverless applications, making it radically simpler to develop scalable, consistent, and elastic distributed applications.
This new release brings various improvements to the StateFun runtime, a leaner way to specify StateFun module components, and a brand new JavaScript SDK!&lt;/p&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website, and the most recent Java SDK, Python SDK, GoLang SDK, and JavaScript SDK distributions are available on &lt;a href=&quot;https://search.maven.org/artifact/org.apache.flink/statefun-sdk-java/3.2.0/jar&quot;&gt;Maven&lt;/a&gt;, &lt;a href=&quot;https://pypi.org/project/apache-flink-statefun/&quot;&gt;PyPI&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/flink-statefun/tree/statefun-sdk-go/v3.2.0&quot;&gt;GitHub&lt;/a&gt;, and &lt;a href=&quot;https://www.npmjs.com/package/apache-flink-statefun&quot;&gt;npm&lt;/a&gt;, respectively.
You can also find official StateFun Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-statefun&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12350540&quot;&gt;release changelog&lt;/a&gt;
and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/&quot;&gt;updated documentation&lt;/a&gt;.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/browse/&quot;&gt;JIRA&lt;/a&gt;!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features&quot; id=&quot;markdown-toc-new-features&quot;&gt;New Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#a-brand-new-javascript-sdk-for-nodejs&quot; id=&quot;markdown-toc-a-brand-new-javascript-sdk-for-nodejs&quot;&gt;A brand new JavaScript SDK for NodeJS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#support-different-remote-functions-module-names&quot; id=&quot;markdown-toc-support-different-remote-functions-module-names&quot;&gt;Support different remote functions module names&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#allow-creating-custom-metrics&quot; id=&quot;markdown-toc-allow-creating-custom-metrics&quot;&gt;Allow creating custom metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upgraded-flink-dependency-to-1143&quot; id=&quot;markdown-toc-upgraded-flink-dependency-to-1143&quot;&gt;Upgraded Flink dependency to 1.14.3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;new-features&quot;&gt;New Features&lt;/h2&gt;
&lt;h3 id=&quot;a-brand-new-javascript-sdk-for-nodejs&quot;&gt;A brand new JavaScript SDK for NodeJS&lt;/h3&gt;
&lt;p&gt;Stateful Functions provides a unified model for building stateful applications across various programming languages and deployment environments.
The community is thrilled to release an official JavaScript SDK as part of the 3.2.0 release.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;&lt;span class=&quot;kr&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;http&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;messageBuilder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;StateFun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;apache-flink-statefun&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;StateFun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;typename&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;com.example.fns/greeter&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;asString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;seen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;messageBuilder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;typename&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;com.example.fns/inbox&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Hello ${name} for the ${seen}th time!&amp;quot;&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;specs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;seen&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;StateFun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;intType&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;createServer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;handler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;listen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As with the Python, Java and Go SDKs, the JavaScript SDK includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Address-scoped storage acting as a key-value store for a particular address.&lt;/li&gt;
&lt;li&gt;A unified cross-language way to send, receive, and store values.&lt;/li&gt;
&lt;li&gt;Dynamic &lt;code&gt;ValueSpec&lt;/code&gt; to describe the state name, type, and possibly expiration configuration at runtime.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can get started by adding the SDK to your project.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;npm install apache-flink-statefun@3.2.0&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;For a detailed SDK tutorial, we encourage you to visit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/docs/sdk/js/&quot;&gt;JavaScript SDK Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;support-different-remote-functions-module-names&quot;&gt;Support different remote functions module names&lt;/h3&gt;
&lt;p&gt;With the newly introduced configuration option &lt;code&gt;statefun.remote.module-name&lt;/code&gt;, it is possible to override the default remote module file name (&lt;code&gt;module.yaml&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;To provide a different name, for example &lt;code&gt;prod.yaml&lt;/code&gt; located at &lt;code&gt;/flink/usrlib/prod.yaml&lt;/code&gt;, one can add the following to one’s &lt;code&gt;flink-conf.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;statefun.remote.module-name: /flink/usrlib/prod.yaml&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For more information see &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25308&quot;&gt;FLINK-25308&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;allow-creating-custom-metrics&quot;&gt;Allow creating custom metrics&lt;/h3&gt;
&lt;p&gt;The embedded SDK now supports registering custom counters.
For more information see &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22533&quot;&gt;FLINK-22533&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;upgraded-flink-dependency-to-1143&quot;&gt;Upgraded Flink dependency to 1.14.3&lt;/h3&gt;
&lt;p&gt;The Stateful Functions 3.2.0 runtime uses Flink 1.14.3 underneath.
This means that Stateful Functions benefits from the latest improvements and stability fixes that went into Flink.
For more information see &lt;a href=&quot;https://flink.apache.org/news/2022/01/17/release-1.14.3.html&quot;&gt;Flink’s release announcement&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12350540&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features if you plan to upgrade your setup to Stateful Functions 3.2.0.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;Seth Wiesman, Igal Shilman, Till Rohrmann, Stephan Ewen, Tzu-Li (Gordon) Tai, Ingo Bürk, Evans Ye, neoXfire, Galen Warren&lt;/p&gt;
&lt;p&gt;If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Mon, 31 Jan 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2022/01/31/release-statefun-3.2.0.html</link>
<guid isPermaLink="true">/news/2022/01/31/release-statefun-3.2.0.html</guid>
</item>
<item>
<title>Pravega Flink Connector 101</title>
<description>&lt;p&gt;&lt;a href=&quot;https://cncf.pravega.io/&quot;&gt;Pravega&lt;/a&gt;, which is now a CNCF sandbox project, is a cloud-native storage system based on abstractions for both batch and streaming data consumption. Pravega streams (a new storage abstraction) are durable, consistent, and elastic, while natively supporting long-term data retention. In comparison, &lt;a href=&quot;https://flink.apache.org/&quot;&gt;Apache Flink&lt;/a&gt; is a popular real-time computing engine that provides unified batch and stream processing. Flink provides high-throughput, low-latency computation, as well as support for complex event processing and state management. Both Pravega and Flink share the same design philosophy and treat data streams as primitives. This makes them a great match when constructing storage+computing data pipelines which can unify batch and streaming use cases.&lt;/p&gt;
&lt;p&gt;That’s also the main reason why Pravega has chosen to use Flink as the first integrated execution engine among the various distributed computing engines on the market. With the help of Flink, users can use flexible APIs for windowing, complex event processing (CEP), or table abstractions to process streaming data easily and enrich the data being stored. Since its inception in 2016, Pravega has established communication with Flink PMC members and developed the connector together.&lt;/p&gt;
&lt;p&gt;In 2017, the Pravega Flink connector module started to move out of the Pravega main repository and has been maintained in a new separate &lt;a href=&quot;https://github.com/pravega/flink-connectors&quot;&gt;repository&lt;/a&gt; since then. During years of development, many features have been implemented, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;exactly-once processing guarantees for both Reader and Writer, supporting end-to-end exactly-once processing pipelines&lt;/li&gt;
&lt;li&gt;seamless integration with Flink’s checkpoints and savepoints&lt;/li&gt;
&lt;li&gt;parallel Readers and Writers supporting high throughput and low latency processing&lt;/li&gt;
&lt;li&gt;support for Batch, Streaming, and Table API to access Pravega Streams&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These key features make streaming pipeline applications easier to develop without worrying about performance and correctness, which are common pain points for many streaming use cases.&lt;/p&gt;
&lt;p&gt;In this blog post, we will discuss how to use this connector to read and write Pravega streams with the Flink DataStream API.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#basic-usages&quot; id=&quot;markdown-toc-basic-usages&quot;&gt;Basic usages&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#dependency&quot; id=&quot;markdown-toc-dependency&quot;&gt;Dependency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#api-introduction&quot; id=&quot;markdown-toc-api-introduction&quot;&gt;API introduction&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#configurations&quot; id=&quot;markdown-toc-configurations&quot;&gt;Configurations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#serializationdeserialization&quot; id=&quot;markdown-toc-serializationdeserialization&quot;&gt;Serialization/Deserialization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flinkpravegareader&quot; id=&quot;markdown-toc-flinkpravegareader&quot;&gt;&lt;code&gt;FlinkPravegaReader&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flinkpravegawriter&quot; id=&quot;markdown-toc-flinkpravegawriter&quot;&gt;&lt;code&gt;FlinkPravegaWriter&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#internals-of-reader-and-writer&quot; id=&quot;markdown-toc-internals-of-reader-and-writer&quot;&gt;Internals of reader and writer&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#checkpoint-integration&quot; id=&quot;markdown-toc-checkpoint-integration&quot;&gt;Checkpoint integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#end-to-end-exactly-once-semantics&quot; id=&quot;markdown-toc-end-to-end-exactly-once-semantics&quot;&gt;End-to-end exactly-once semantics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#future-plans&quot; id=&quot;markdown-toc-future-plans&quot;&gt;Future plans&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;basic-usages&quot;&gt;Basic usages&lt;/h1&gt;
&lt;h2 id=&quot;dependency&quot;&gt;Dependency&lt;/h2&gt;
&lt;p&gt;To use this connector in your application, add the dependency to your project:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;io.pravega&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;pravega-connectors-flink-1.13_2.12&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;0.10.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the above example,&lt;/p&gt;
&lt;p&gt;&lt;code&gt;1.13&lt;/code&gt; is the Flink major version, which is embedded in the middle of the artifact name. The Pravega Flink connector maintains compatibility for the &lt;em&gt;three&lt;/em&gt; most recent major versions of Flink.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;0.10.1&lt;/code&gt; is the connector version, which aligns with the Pravega version.&lt;/p&gt;
&lt;p&gt;You can find the latest release with a support matrix on the &lt;a href=&quot;https://github.com/pravega/flink-connectors/releases&quot;&gt;GitHub Releases page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;api-introduction&quot;&gt;API introduction&lt;/h2&gt;
&lt;h3 id=&quot;configurations&quot;&gt;Configurations&lt;/h3&gt;
&lt;p&gt;The connector provides a common top-level object &lt;code&gt;PravegaConfig&lt;/code&gt; for Pravega connection configurations. The config object automatically configures itself from &lt;em&gt;environment variables&lt;/em&gt;, &lt;em&gt;system properties&lt;/em&gt; and &lt;em&gt;program arguments&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The basic controller URI and the default scope can be set like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Environment Variable /&lt;br /&gt;System Property /&lt;br /&gt;Program Argument&lt;/th&gt;
&lt;th&gt;Default Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Controller URI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PRAVEGA_CONTROLLER_URI&lt;/code&gt;&lt;br /&gt;&lt;code&gt;pravega.controller.uri&lt;/code&gt;&lt;br /&gt;&lt;code&gt;--controller&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tcp://localhost:9090&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default Scope&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PRAVEGA_SCOPE&lt;/code&gt;&lt;br /&gt;&lt;code&gt;pravega.scope&lt;/code&gt;&lt;br /&gt;&lt;code&gt;--scope&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The recommended way to create an instance of &lt;code&gt;PravegaConfig&lt;/code&gt; is the following:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// From default environment&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromDefaults&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// From program arguments&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;params&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromArgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromParams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;params&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// From user specification&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromDefaults&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withControllerURI&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;tcp://...&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withDefaultScope&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;SCOPE-NAME&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withCredentials&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;credentials&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withHostnameValidation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;serializationdeserialization&quot;&gt;Serialization/Deserialization&lt;/h3&gt;
&lt;p&gt;Pravega has defined &lt;a href=&quot;http://pravega.io/docs/latest/javadoc/clients/io/pravega/client/stream/Serializer.html&quot;&gt;&lt;code&gt;io.pravega.client.stream.Serializer&lt;/code&gt;&lt;/a&gt; for the serialization/deserialization, while Flink has also defined standard interfaces for the purpose.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/SerializationSchema.html&quot;&gt;&lt;code&gt;org.apache.flink.api.common.serialization.SerializationSchema&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/DeserializationSchema.html&quot;&gt;&lt;code&gt;org.apache.flink.api.common.serialization.DeserializationSchema&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For interoperability with other Pravega client applications, the connector provides built-in adapters &lt;code&gt;PravegaSerializationSchema&lt;/code&gt; and &lt;code&gt;PravegaDeserializationSchema&lt;/code&gt; to support processing Pravega stream data produced by a non-Flink application.&lt;/p&gt;
&lt;p&gt;Here is the adapter for the Pravega Java serializer:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;io.pravega.client.stream.impl.JavaSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DeserializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;adapter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaDeserializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MyEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JavaSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
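&lt;p&gt;The write path can be adapted in the same way. The snippet below is only a sketch: it assumes that &lt;code&gt;PravegaSerializationSchema&lt;/code&gt; wraps a Pravega &lt;code&gt;Serializer&lt;/code&gt; directly in its constructor, and it reuses the placeholder &lt;code&gt;MyEvent&lt;/code&gt; type from the example above:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import io.pravega.client.stream.impl.JavaSerializer;
...
// Wrap the same Pravega Java serializer so that events written by Flink
// remain readable by non-Flink Pravega applications.
SerializationSchema&amp;lt;MyEvent&amp;gt; serializationAdapter = new PravegaSerializationSchema&amp;lt;&amp;gt;(
        new JavaSerializer&amp;lt;MyEvent&amp;gt;());&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;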
&lt;h3 id=&quot;flinkpravegareader&quot;&gt;&lt;code&gt;FlinkPravegaReader&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;FlinkPravegaReader&lt;/code&gt; is a Flink &lt;code&gt;SourceFunction&lt;/code&gt; implementation that supports parallel reads from one or more Pravega streams. Internally, it initiates a Pravega reader group and creates Pravega &lt;code&gt;EventStreamReader&lt;/code&gt; instances to read the data from the stream(s). It provides a builder-style API for construction and allows StreamCuts to mark the start and end of the read.&lt;/p&gt;
&lt;p&gt;You can use it like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Enable Flink checkpoint to make state fault tolerant&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;enableCheckpointing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;60000&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the Pravega configuration&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;params&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromArgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromParams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;params&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the event deserializer&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DeserializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;deserializer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the data stream&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FlinkPravegaReader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pravegaSource&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FlinkPravegaReader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(...)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withPravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withDeserializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;deserializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pravegaSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setParallelism&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;uid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;pravega-source&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;flinkpravegawriter&quot;&gt;&lt;code&gt;FlinkPravegaWriter&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;FlinkPravegaWriter&lt;/code&gt; is a Flink &lt;code&gt;SinkFunction&lt;/code&gt; implementation which supports parallel writes to Pravega streams.&lt;/p&gt;
&lt;p&gt;It supports three writer modes that relate to guarantees about the persistence of events emitted by the sink to a Pravega Stream:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Best-effort&lt;/strong&gt; - Any write failures will be ignored and there could be data loss.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;At-least-once&lt;/strong&gt; (default) - All events are persisted in Pravega. Duplicate events are possible due to retries or in case of failure and subsequent recovery.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Exactly-once&lt;/strong&gt; - All events are persisted in Pravega using a transactional approach integrated with the Flink checkpointing feature.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Internally, it will initiate several Pravega &lt;code&gt;EventStreamWriter&lt;/code&gt; or &lt;code&gt;TransactionalEventStreamWriter&lt;/code&gt; instances (depending on the writer mode) to write data to the stream. It provides a builder-style API for construction.&lt;/p&gt;
&lt;p&gt;A basic usage looks like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the Pravega configuration&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromParams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;params&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the event serializer&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SerializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;serializer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the event router for selecting the Routing Key&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PravegaEventRouter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;router&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Define the sink function&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FlinkPravegaWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pravegaSink&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FlinkPravegaWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(...)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withPravegaConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withSerializationSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;serializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withEventRouter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;router&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withWriterMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EXACTLY_ONCE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyClass&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pravegaSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setParallelism&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;uid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;pravega-sink&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
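&lt;p&gt;The event router used above decides the Pravega routing key for each record. The following is a minimal sketch (assuming &lt;code&gt;PravegaEventRouter&lt;/code&gt; exposes a single &lt;code&gt;getRoutingKey&lt;/code&gt; method, as in the connector samples, and that &lt;code&gt;MyClass&lt;/code&gt; has a hypothetical &lt;code&gt;getId()&lt;/code&gt; accessor):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Events that share the same routing key are written to the same Pravega segment,
// which preserves per-key ordering.
PravegaEventRouter&amp;lt;MyClass&amp;gt; router = new PravegaEventRouter&amp;lt;MyClass&amp;gt;() {
    @Override
    public String getRoutingKey(MyClass event) {
        return event.getId();
    }
};&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;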
&lt;p&gt;You can see some more examples &lt;a href=&quot;https://github.com/pravega/pravega-samples&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;internals-of-reader-and-writer&quot;&gt;Internals of reader and writer&lt;/h1&gt;
&lt;h2 id=&quot;checkpoint-integration&quot;&gt;Checkpoint integration&lt;/h2&gt;
&lt;p&gt;Flink has periodic checkpoints based on the Chandy-Lamport algorithm to make state in Flink fault-tolerant. By allowing state and the corresponding stream positions to be recovered, the application is given the same semantics as a failure-free execution.&lt;/p&gt;
&lt;p&gt;Pravega also has its own Checkpoint concept, which creates a consistent “point in time” persistence of the state of each Reader in the Reader Group by using a specialized Event (&lt;em&gt;Checkpoint Event&lt;/em&gt;) to signal each Reader to preserve its state. Once a Checkpoint has been completed, the application can use it to reset all the Readers in the Reader Group to the known consistent state represented by that Checkpoint.&lt;/p&gt;
&lt;p&gt;This means that our end-to-end recovery story differs from that of other messaging systems such as Kafka, which take a more tightly coupled approach: the offsets are persisted in the Flink task state and Flink does the coordination. Instead, Flink delegates Pravega source recovery completely to the Pravega server and uses only a lightweight hook to connect. We collaborated with the Flink community and added a new interface, &lt;code&gt;ExternallyInducedSource&lt;/code&gt; (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-6390&quot;&gt;FLINK-6390&lt;/a&gt;), to allow such external calls for checkpointing. The connector implements this interface to guarantee exactly-once semantics during failure recovery.&lt;/p&gt;
&lt;p&gt;The checkpoint mechanism works as a two-step process:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html&quot;&gt;master hook&lt;/a&gt; handler from the JobManager initiates the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html#triggerCheckpoint-long-long-java.util.concurrent.Executor-&quot;&gt;&lt;code&gt;triggerCheckpoint&lt;/code&gt;&lt;/a&gt; request to the &lt;code&gt;ReaderCheckpointHook&lt;/code&gt; that was registered with the JobManager during &lt;code&gt;FlinkPravegaReader&lt;/code&gt; source initialization. The &lt;code&gt;ReaderCheckpointHook&lt;/code&gt; handler notifies Pravega to checkpoint the current reader state. This is a non-blocking call that returns a &lt;code&gt;future&lt;/code&gt; once Pravega readers are done with the checkpointing. Once the &lt;code&gt;future&lt;/code&gt; completes, the Pravega checkpoint will be persisted in a “master state” of a Flink checkpoint.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A &lt;code&gt;Checkpoint&lt;/code&gt; event will be sent by Pravega as part of the data stream flow and, upon receiving the event, the &lt;code&gt;FlinkPravegaReader&lt;/code&gt; will initiate a &lt;a href=&quot;https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/checkpoint/ExternallyInducedSource.java#L73&quot;&gt;&lt;code&gt;triggerCheckpoint&lt;/code&gt;&lt;/a&gt; request to effectively let Flink continue and complete the checkpoint process.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;end-to-end-exactly-once-semantics&quot;&gt;End-to-end exactly-once semantics&lt;/h2&gt;
&lt;p&gt;In the early years of big data processing, results from real-time stream processing were always considered inaccurate, approximate, or speculative. However, correctness is extremely important for some use cases and industries, such as finance.&lt;/p&gt;
&lt;p&gt;This constraint stems mainly from two issues:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;unordered data source in event time&lt;/li&gt;
&lt;li&gt;end-to-end exactly-once semantics guarantee&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;During recent years of development, watermarking has been introduced as a tradeoff between correctness and latency, which is now considered a good solution for unordered data sources in event time.&lt;/p&gt;
&lt;p&gt;The guarantee of end-to-end exactly-once semantics is trickier. When we say “exactly-once semantics”, what we mean is that each incoming event affects the final results exactly once. Even in the event of a machine or software failure, there is no duplicate data and no data that goes unprocessed. This is quite difficult because of the demands of message acknowledgment and recovery during such fast processing, and it is also why some early distributed streaming engines like Storm (without Trident) chose to support “at-least-once” guarantees.&lt;/p&gt;
&lt;p&gt;Flink is one of the first streaming systems that was able to provide exactly-once semantics due to its delicate &lt;a href=&quot;https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink&quot;&gt;checkpoint mechanism&lt;/a&gt;. But to make it work end-to-end, the final stage needs to apply the semantic to external message system sinks that support commits and rollbacks.&lt;/p&gt;
&lt;p&gt;To work around this problem, Pravega introduced &lt;a href=&quot;https://cncf.pravega.io/docs/latest/transactions/&quot;&gt;transactional writes&lt;/a&gt;. A Pravega transaction allows an application to prepare a set of events that can be written “all at once” to a Stream. This allows an application to “commit” a bunch of events atomically. When writes are idempotent, it is possible to implement end-to-end exactly-once pipelines together with Flink.&lt;/p&gt;
&lt;p&gt;To build such an end-to-end solution requires coordination between Flink and the Pravega sink, which is still challenging. A common approach for coordinating commits and rollbacks in a distributed system is the two-phase commit protocol. We used this protocol and, together with the Flink community, implemented the sink function in a two-phase commit way coordinated with Flink checkpoints.&lt;/p&gt;
&lt;p&gt;The Flink community then extracted the common logic from the two-phase commit protocol and provided a general interface &lt;code&gt;TwoPhaseCommitSinkFunction&lt;/code&gt; (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-7210&quot;&gt;FLINK-7210&lt;/a&gt;) to make it possible to build end-to-end exactly-once applications with other message systems that have transaction support. This includes Apache Kafka versions 0.11 and above. There is an official Flink &lt;a href=&quot;https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html&quot;&gt;blog post&lt;/a&gt; that describes this feature in detail.&lt;/p&gt;
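&lt;p&gt;To make the shape of that interface concrete, here is a minimal, self-contained sketch of the &lt;code&gt;TwoPhaseCommitSinkFunction&lt;/code&gt; pattern. It is illustrative only and not the Pravega connector’s actual sink: the “transaction” is just an in-memory buffer keyed by a generated id, standing in for a transaction in an external system.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.typeutils.base.StringSerializer;
import org.apache.flink.api.common.typeutils.base.VoidSerializer;
import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
// Illustrative sketch only: records are buffered per transaction id in memory,
// standing in for a transaction in an external system such as Pravega or Kafka.
public class BufferingTwoPhaseCommitSink
        extends TwoPhaseCommitSinkFunction&amp;lt;String, String, Void&amp;gt; {
    // Stand-in for an external transactional system: transaction id -&amp;gt; buffered events.
    private static final Map&amp;lt;String, List&amp;lt;String&amp;gt;&amp;gt; OPEN_TXNS = new ConcurrentHashMap&amp;lt;&amp;gt;();
    public BufferingTwoPhaseCommitSink() {
        // Serializers for the transaction handle (a String id) and the unused context.
        super(StringSerializer.INSTANCE, VoidSerializer.INSTANCE);
    }
    @Override
    protected String beginTransaction() {
        // Called before each checkpoint period: open a fresh transaction.
        String txnId = UUID.randomUUID().toString();
        OPEN_TXNS.put(txnId, new CopyOnWriteArrayList&amp;lt;&amp;gt;());
        return txnId;
    }
    @Override
    protected void invoke(String txnId, String value, Context context) {
        // Records arriving between two checkpoints are written into the open transaction.
        OPEN_TXNS.get(txnId).add(value);
    }
    @Override
    protected void preCommit(String txnId) {
        // Phase 1, on checkpoint: flush so data is durable but not yet visible.
        // Nothing to do for this in-memory stand-in.
    }
    @Override
    protected void commit(String txnId) {
        // Phase 2, on checkpoint completion: atomically make the data visible.
        List&amp;lt;String&amp;gt; events = OPEN_TXNS.remove(txnId);
        if (events != null) {
            events.forEach(System.out::println); // stand-in for publishing downstream
        }
    }
    @Override
    protected void abort(String txnId) {
        // On failure: discard the transaction; its events were never exposed.
        OPEN_TXNS.remove(txnId);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Flink invokes &lt;code&gt;beginTransaction&lt;/code&gt; and &lt;code&gt;preCommit&lt;/code&gt; as part of each checkpoint and calls &lt;code&gt;commit&lt;/code&gt; only after the checkpoint has completed on all operators, so buffered records never become visible unless the checkpoint they belong to succeeded.&lt;/p&gt;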
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;The Pravega Flink connector enables Pravega to connect to Flink and allows Pravega to act as a key data store in a streaming pipeline. Both projects share a common design philosophy and can integrate well with each other. Pravega has its own concept of checkpointing and has implemented transactional writes to support end-to-end exactly-once guarantees.&lt;/p&gt;
&lt;h1 id=&quot;future-plans&quot;&gt;Future plans&lt;/h1&gt;
&lt;p&gt;&lt;code&gt;FlinkPravegaInputFormat&lt;/code&gt; and &lt;code&gt;FlinkPravegaOutputFormat&lt;/code&gt; are now provided to support batch reads and writes in Flink, but these are under the legacy DataSet API. Since Flink is now making efforts to unify batch and streaming, it is improving its APIs and providing new interfaces for the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface&quot;&gt;source&lt;/a&gt; and &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API&quot;&gt;sink&lt;/a&gt; APIs in the Flink 1.11 and 1.12 releases. We will continue to work with the Flink community and integrate with the new APIs.&lt;/p&gt;
&lt;p&gt;We will also put more effort into SQL / Table API support in order to provide a better user experience since it is simpler to understand and even more powerful to use in some cases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; the original blog post can be found &lt;a href=&quot;https://cncf.pravega.io/blog/2021/11/01/pravega-flink-connector-101/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Thu, 20 Jan 2022 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2022/01/20/pravega-connector-101.html</link>
<guid isPermaLink="true">/2022/01/20/pravega-connector-101.html</guid>
</item>
<item>
<title>Apache Flink 1.14.3 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix version of the Apache Flink 1.14 series.
The first bugfix release was 1.14.2, an emergency release addressing the Apache Log4j zero-day vulnerability (CVE-2021-44228); Flink 1.14.1 was abandoned.
That means this release is the first bugfix release of the Flink 1.14 series that contains bugfixes not related to the mentioned CVE.&lt;/p&gt;
&lt;p&gt;This release includes 164 fixes and minor improvements for Flink 1.14.0. The list below includes bugfixes and improvements. For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351075&amp;amp;projectId=12315522&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.14.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.14.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt; Release Notes - Flink - Version 1.14.3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24355&quot;&gt;FLINK-24355&lt;/a&gt;] - Expose the flag for enabling checkpoints after tasks finish in the Web UI
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15987&quot;&gt;FLINK-15987&lt;/a&gt;] - SELECT 1.0e0 / 0.0e0 throws NumberFormatException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17914&quot;&gt;FLINK-17914&lt;/a&gt;] - HistoryServer deletes cached archives if archive listing fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19142&quot;&gt;FLINK-19142&lt;/a&gt;] - Local recovery can be broken if slot hijacking happened during a full restart
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20195&quot;&gt;FLINK-20195&lt;/a&gt;] - Jobs endpoint returns duplicated jobs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20370&quot;&gt;FLINK-20370&lt;/a&gt;] - Result is wrong when sink primary key is not the same with query
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21289&quot;&gt;FLINK-21289&lt;/a&gt;] - Application mode ignores the pipeline.classpaths configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21345&quot;&gt;FLINK-21345&lt;/a&gt;] - NullPointerException LogicalCorrelateToJoinFromTemporalTableFunctionRule.scala:157
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22113&quot;&gt;FLINK-22113&lt;/a&gt;] - UniqueKey constraint is lost with multiple sources join in SQL
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22954&quot;&gt;FLINK-22954&lt;/a&gt;] - Don&amp;#39;t support consuming update and delete changes when use table function that does not contain table field
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23614&quot;&gt;FLINK-23614&lt;/a&gt;] - The resulting scale of TRUNCATE(DECIMAL, ...) is not correct
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23704&quot;&gt;FLINK-23704&lt;/a&gt;] - FLIP-27 sources are not generating LatencyMarkers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23827&quot;&gt;FLINK-23827&lt;/a&gt;] - Fix ModifiedMonotonicity inference for some node
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23919&quot;&gt;FLINK-23919&lt;/a&gt;] - PullUpWindowTableFunctionIntoWindowAggregateRule generates invalid Calc for Window TVF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24156&quot;&gt;FLINK-24156&lt;/a&gt;] - BlobServer crashes due to SocketTimeoutException in Java 11
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24232&quot;&gt;FLINK-24232&lt;/a&gt;] - Archiving of suspended jobs prevents breaks subsequent archive attempts
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24291&quot;&gt;FLINK-24291&lt;/a&gt;] - Decimal precision is lost when deserializing in test cases
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24310&quot;&gt;FLINK-24310&lt;/a&gt;] - A bug in the BufferingSink example in the doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24315&quot;&gt;FLINK-24315&lt;/a&gt;] - Cannot rebuild watcher thread while the K8S API server is unavailable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24318&quot;&gt;FLINK-24318&lt;/a&gt;] - Casting a number to boolean has different results between &amp;#39;select&amp;#39; fields and &amp;#39;where&amp;#39; condition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24331&quot;&gt;FLINK-24331&lt;/a&gt;] - PartiallyFinishedSourcesITCase fails with &amp;quot;No downstream received 0 from xxx;&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24336&quot;&gt;FLINK-24336&lt;/a&gt;] - PyFlink TableEnvironment executes the SQL randomly MalformedURLException with the configuration for &amp;#39;pipeline.classpaths&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24344&quot;&gt;FLINK-24344&lt;/a&gt;] - Handling of IOExceptions when triggering checkpoints doesn&amp;#39;t cause job failover
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24353&quot;&gt;FLINK-24353&lt;/a&gt;] - Bash scripts do not respect dynamic configurations when calculating memory sizes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24366&quot;&gt;FLINK-24366&lt;/a&gt;] - Unnecessary/misleading error message about failing restores when tasks are already canceled.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24371&quot;&gt;FLINK-24371&lt;/a&gt;] - Support SinkWriter preCommit without the need of a committer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24377&quot;&gt;FLINK-24377&lt;/a&gt;] - TM resource may not be properly released after heartbeat timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24380&quot;&gt;FLINK-24380&lt;/a&gt;] - Flink should handle the state transition of the pod from Pending to Failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24381&quot;&gt;FLINK-24381&lt;/a&gt;] - Table API exceptions may leak sensitive configuration values
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24401&quot;&gt;FLINK-24401&lt;/a&gt;] - TM cannot exit after Metaspace OOM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24407&quot;&gt;FLINK-24407&lt;/a&gt;] - Pulsar connector chinese document link to Pulsar document location incorrectly.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24408&quot;&gt;FLINK-24408&lt;/a&gt;] - org.codehaus.janino.InternalCompilerException: Compiling &amp;quot;StreamExecValues$200&amp;quot;: Code of method &amp;quot;nextRecord(Ljava/lang/Object;)Ljava/lang/Object;&amp;quot; of class &amp;quot;StreamExecValues$200&amp;quot; grows beyond 64 KB
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24409&quot;&gt;FLINK-24409&lt;/a&gt;] - Kafka topics with periods in their names generate a constant stream of errors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24431&quot;&gt;FLINK-24431&lt;/a&gt;] - [Kinesis][EFO] EAGER registration strategy does not work when job fails over
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24432&quot;&gt;FLINK-24432&lt;/a&gt;] - RocksIteratorWrapper.seekToLast() calls the wrong RocksIterator method
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24465&quot;&gt;FLINK-24465&lt;/a&gt;] - Wrong javadoc and documentation for buffer timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24467&quot;&gt;FLINK-24467&lt;/a&gt;] - Set min and max buffer size even if the difference less than threshold
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24468&quot;&gt;FLINK-24468&lt;/a&gt;] - NPE when notifyNewBufferSize
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24469&quot;&gt;FLINK-24469&lt;/a&gt;] - Incorrect calculation of the buffer size in case of channel data skew
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24480&quot;&gt;FLINK-24480&lt;/a&gt;] - EqualiserCodeGeneratorTest fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24488&quot;&gt;FLINK-24488&lt;/a&gt;] - KafkaRecordSerializationSchemaBuilder does not forward timestamp
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24492&quot;&gt;FLINK-24492&lt;/a&gt;] - incorrect implicit type conversion between numeric and (var)char
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24506&quot;&gt;FLINK-24506&lt;/a&gt;] - checkpoint directory is not configurable through the Flink configuration passed into the StreamExecutionEnvironment
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24540&quot;&gt;FLINK-24540&lt;/a&gt;] - Fix Resource leak due to Files.list
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24543&quot;&gt;FLINK-24543&lt;/a&gt;] - Zookeeper connection issue causes inconsistent state in Flink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24550&quot;&gt;FLINK-24550&lt;/a&gt;] - Can not access job information from a standby jobmanager UI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24551&quot;&gt;FLINK-24551&lt;/a&gt;] - BUFFER_DEBLOAT_SAMPLES property is taken from the wrong configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24552&quot;&gt;FLINK-24552&lt;/a&gt;] - Ineffective buffer debloat configuration in randomized tests
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24563&quot;&gt;FLINK-24563&lt;/a&gt;] - Comparing timestamp_ltz with random string throws NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24596&quot;&gt;FLINK-24596&lt;/a&gt;] - Bugs in sink.buffer-flush before upsert-kafka
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24597&quot;&gt;FLINK-24597&lt;/a&gt;] - RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24600&quot;&gt;FLINK-24600&lt;/a&gt;] - Duplicate 99th percentile displayed in checkpoint summary
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24608&quot;&gt;FLINK-24608&lt;/a&gt;] - Sinks built with the unified sink framework do not receive timestamps when used in Table API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24613&quot;&gt;FLINK-24613&lt;/a&gt;] - Documentation on orc supported data types is outdated
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24647&quot;&gt;FLINK-24647&lt;/a&gt;] - ClusterUncaughtExceptionHandler does not log the exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24654&quot;&gt;FLINK-24654&lt;/a&gt;] - NPE on RetractableTopNFunction when some records were cleared by state ttl
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24662&quot;&gt;FLINK-24662&lt;/a&gt;] - PyFlink sphinx check failed with &amp;quot;node class &amp;#39;meta&amp;#39; is already registered, its visitors will be overridden&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24667&quot;&gt;FLINK-24667&lt;/a&gt;] - Channel state writer would fail the task directly if meeting exception previously
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24676&quot;&gt;FLINK-24676&lt;/a&gt;] - Schema does not match if explain insert statement with partial column
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24678&quot;&gt;FLINK-24678&lt;/a&gt;] - Correct the metric name of map state contains latency
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24691&quot;&gt;FLINK-24691&lt;/a&gt;] - FLINK SQL SUM() causes a precision error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24704&quot;&gt;FLINK-24704&lt;/a&gt;] - Exception occurs when the input record loses monotonicity on the sort key field of UpdatableTopNFunction
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24706&quot;&gt;FLINK-24706&lt;/a&gt;] - AkkaInvocationHandler silently ignores deserialization errors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24708&quot;&gt;FLINK-24708&lt;/a&gt;] - `ConvertToNotInOrInRule` has a bug which leads to wrong result
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24728&quot;&gt;FLINK-24728&lt;/a&gt;] - Batch SQL file sink forgets to close the output stream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24733&quot;&gt;FLINK-24733&lt;/a&gt;] - Data loss in pulsar source when using shared mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24738&quot;&gt;FLINK-24738&lt;/a&gt;] - Fail during announcing buffer size to released local channel
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24761&quot;&gt;FLINK-24761&lt;/a&gt;] - Fix PartitionPruner code gen compile fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24773&quot;&gt;FLINK-24773&lt;/a&gt;] - KafkaCommitter should fail on unknown Exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24777&quot;&gt;FLINK-24777&lt;/a&gt;] - Processed (persisted) in-flight data description miss on Monitoring Checkpointing page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24789&quot;&gt;FLINK-24789&lt;/a&gt;] - IllegalStateException with CheckpointCleaner being closed already
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24792&quot;&gt;FLINK-24792&lt;/a&gt;] - OperatorCoordinatorSchedulerTest crashed JVM on AZP
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24835&quot;&gt;FLINK-24835&lt;/a&gt;] - &amp;quot;group by&amp;quot; in the interval join will throw a exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24846&quot;&gt;FLINK-24846&lt;/a&gt;] - AsyncWaitOperator fails during stop-with-savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24858&quot;&gt;FLINK-24858&lt;/a&gt;] - TypeSerializer version mismatch during eagerly restore
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24874&quot;&gt;FLINK-24874&lt;/a&gt;] - Dropdown menu is not properly shown in UI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24885&quot;&gt;FLINK-24885&lt;/a&gt;] - ProcessElement Interface parameter Collector : java.lang.NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24919&quot;&gt;FLINK-24919&lt;/a&gt;] - UnalignedCheckpointITCase hangs on Azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24922&quot;&gt;FLINK-24922&lt;/a&gt;] - Fix spelling errors in the word &amp;quot;parallism&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24937&quot;&gt;FLINK-24937&lt;/a&gt;] - &amp;quot;kubernetes application HA test&amp;quot; hangs on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24938&quot;&gt;FLINK-24938&lt;/a&gt;] - Checkpoint cleaner is closed before checkpoints are discarded
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25022&quot;&gt;FLINK-25022&lt;/a&gt;] - ClassLoader leak with ThreadLocals on the JM when submitting a job through the REST API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25067&quot;&gt;FLINK-25067&lt;/a&gt;] - Correct the description of RocksDB&amp;#39;s background threads
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25084&quot;&gt;FLINK-25084&lt;/a&gt;] - Field names must be unique. Found duplicates
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25091&quot;&gt;FLINK-25091&lt;/a&gt;] - Official website document FileSink orc compression attribute reference error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25096&quot;&gt;FLINK-25096&lt;/a&gt;] - Issue in exceptions API(/jobs/:jobid/exceptions) in flink 1.13.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25126&quot;&gt;FLINK-25126&lt;/a&gt;] - FlinkKafkaInternalProducer state is not reset if transaction finalization fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25132&quot;&gt;FLINK-25132&lt;/a&gt;] - KafkaSource cannot work with object-reusing DeserializationSchema
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25134&quot;&gt;FLINK-25134&lt;/a&gt;] - Unused RetryRule in KafkaConsumerTestBase swallows retries
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25222&quot;&gt;FLINK-25222&lt;/a&gt;] - Remove NetworkFailureProxy used for Kafka connector tests
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25271&quot;&gt;FLINK-25271&lt;/a&gt;] - ApplicationDispatcherBootstrapITCase. testDispatcherRecoversAfterLosingAndRegainingLeadership failed on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25294&quot;&gt;FLINK-25294&lt;/a&gt;] - Incorrect cloudpickle import
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25375&quot;&gt;FLINK-25375&lt;/a&gt;] - Update Log4j to 2.17.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25418&quot;&gt;FLINK-25418&lt;/a&gt;] - The dir_cache is specified in the flink task. When there is no network, you will still download the python third-party library
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25446&quot;&gt;FLINK-25446&lt;/a&gt;] - Avoid sanity check on read bytes on DataInputStream#read(byte[])
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25468&quot;&gt;FLINK-25468&lt;/a&gt;] - Local recovery fails if local state storage and RocksDB working directory are not on the same volume
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25477&quot;&gt;FLINK-25477&lt;/a&gt;] - The directory structure of the State Backends document is not standardized
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25513&quot;&gt;FLINK-25513&lt;/a&gt;] - CoFlatMapFunction requires both two flat_maps to yield something
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20443&quot;&gt;FLINK-20443&lt;/a&gt;] - ContinuousProcessingTimeTrigger doesn&amp;#39;t fire at the end of the window
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21467&quot;&gt;FLINK-21467&lt;/a&gt;] - Document possible recommended usage of Bounded{One/Multi}Input.endInput and emphasize that they could be called multiple times
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23519&quot;&gt;FLINK-23519&lt;/a&gt;] - Aggregate State Backend Latency by State Level
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23798&quot;&gt;FLINK-23798&lt;/a&gt;] - Avoid using reflection to get filter when partition filter is enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23842&quot;&gt;FLINK-23842&lt;/a&gt;] - Add log messages for reader registrations and split requests.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23914&quot;&gt;FLINK-23914&lt;/a&gt;] - Make connector testing framework more verbose on test failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24117&quot;&gt;FLINK-24117&lt;/a&gt;] - Remove unHandledErrorListener in ZooKeeperLeaderElectionDriver and ZooKeeperLeaderRetrievalDriver
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24148&quot;&gt;FLINK-24148&lt;/a&gt;] - Add bloom filter policy option in RocksDBConfiguredOptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24382&quot;&gt;FLINK-24382&lt;/a&gt;] - RecordsOut metric for sinks is inaccurate
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24437&quot;&gt;FLINK-24437&lt;/a&gt;] - Remove unhandled exception handler from CuratorFramework before closing it
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24460&quot;&gt;FLINK-24460&lt;/a&gt;] - Rocksdb Iterator Error Handling Improvement
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24481&quot;&gt;FLINK-24481&lt;/a&gt;] - Translate buffer debloat documentation to Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24529&quot;&gt;FLINK-24529&lt;/a&gt;] - flink sql job cannot use custom job name
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24631&quot;&gt;FLINK-24631&lt;/a&gt;] - Avoiding directly use the labels as selector for deployment and service
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24670&quot;&gt;FLINK-24670&lt;/a&gt;] - Restructure unaligned checkpoints documentation page to &amp;quot;Checkpointing under back pressure&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24690&quot;&gt;FLINK-24690&lt;/a&gt;] - Clarification of buffer size threshold calculation in BufferDebloater
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24695&quot;&gt;FLINK-24695&lt;/a&gt;] - Update how to configure unaligned checkpoints in the documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24739&quot;&gt;FLINK-24739&lt;/a&gt;] - State requirements for Flink&amp;#39;s application mode in the documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24813&quot;&gt;FLINK-24813&lt;/a&gt;] - Improve ImplicitTypeConversionITCase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24880&quot;&gt;FLINK-24880&lt;/a&gt;] - Error messages &amp;quot;OverflowError: timeout value is too large&amp;quot; shown when executing PyFlink jobs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24958&quot;&gt;FLINK-24958&lt;/a&gt;] - correct the example and link for temporal table function documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24987&quot;&gt;FLINK-24987&lt;/a&gt;] - Enhance ExternalizedCheckpointCleanup enum
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25092&quot;&gt;FLINK-25092&lt;/a&gt;] - Implement artifact cacher for Bash based Elasticsearch test
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Technical Debt
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24367&quot;&gt;FLINK-24367&lt;/a&gt;] - Add a fallback AkkaRpcSystemLoader for tests in the IDE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24445&quot;&gt;FLINK-24445&lt;/a&gt;] - Move RPC System packaging to package phase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24455&quot;&gt;FLINK-24455&lt;/a&gt;] - FallbackAkkaRpcSystemLoader should check for maven errors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24513&quot;&gt;FLINK-24513&lt;/a&gt;] - AkkaRpcSystemLoader must be an ITCase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24559&quot;&gt;FLINK-24559&lt;/a&gt;] - flink-rpc-akka-loader does not bundle flink-rpc-akka
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24609&quot;&gt;FLINK-24609&lt;/a&gt;] - flink-rpc-akka uses wrong Scala version property for parser-combinators
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24859&quot;&gt;FLINK-24859&lt;/a&gt;] - Document new File formats
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25472&quot;&gt;FLINK-25472&lt;/a&gt;] - Update to Log4j 2.17.1
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Mon, 17 Jan 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2022/01/17/release-1.14.3.html</link>
<guid isPermaLink="true">/news/2022/01/17/release-1.14.3.html</guid>
</item>
<item>
<title>Apache Flink ML 2.0.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is excited to announce the release of Flink ML
2.0.0! Flink ML is a library that provides APIs and infrastructure for building
stream-batch unified machine learning algorithms that are easy to use and
performant, with (near-) real-time latency.&lt;/p&gt;
&lt;p&gt;This release involves a major refactor of the earlier Flink ML library and
introduces major features that extend the Flink ML API and the iteration
runtime, such as supporting stages with multi-input multi-output, graph-based
stage composition, and a new stream-batch unified iteration library. Moreover,
we added five algorithm implementations in this release, which is the start of
a long-term initiative to provide a large number of off-the-shelf algorithms in
Flink ML with state-of-the-art performance.&lt;/p&gt;
&lt;p&gt;We believe this release is an important step towards extending Apache Flink to
a wide range of machine learning use cases, especially real-time machine
learning scenarios.&lt;/p&gt;
&lt;p&gt;We encourage you to &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;download the release&lt;/a&gt; and share your feedback with
the community through the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt; or
&lt;a href=&quot;https://issues.apache.org/jira/browse/flink&quot;&gt;JIRA&lt;/a&gt;! We hope you like the new
release and we’d be eager to learn about your experience with it.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#notable-features&quot; id=&quot;markdown-toc-notable-features&quot;&gt;Notable Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#api-and-infrastructure&quot; id=&quot;markdown-toc-api-and-infrastructure&quot;&gt;API and Infrastructure&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#supporting-stages-requiring-multi-input-multi-output&quot; id=&quot;markdown-toc-supporting-stages-requiring-multi-input-multi-output&quot;&gt;Supporting stages requiring multi-input multi-output&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#supporting-online-learning-with-apis-exposing-model-data&quot; id=&quot;markdown-toc-supporting-online-learning-with-apis-exposing-model-data&quot;&gt;Supporting online learning with APIs exposing model data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improved-usability-for-managing-parameters&quot; id=&quot;markdown-toc-improved-usability-for-managing-parameters&quot;&gt;Improved usability for managing parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#tools-for-composing-dag-of-stages-into-a-new-stage&quot; id=&quot;markdown-toc-tools-for-composing-dag-of-stages-into-a-new-stage&quot;&gt;Tools for composing DAG of stages into a new stage&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#stream-batch-unified-iteration-library&quot; id=&quot;markdown-toc-stream-batch-unified-iteration-library&quot;&gt;Stream-batch Unified Iteration Library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#python-sdk&quot; id=&quot;markdown-toc-python-sdk&quot;&gt;Python SDK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#algorithm-library&quot; id=&quot;markdown-toc-algorithm-library&quot;&gt;Algorithm Library&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#related-work&quot; id=&quot;markdown-toc-related-work&quot;&gt;Related Work&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-ml-project-moved-to-a-separate-repository&quot; id=&quot;markdown-toc-flink-ml-project-moved-to-a-separate-repository&quot;&gt;Flink ML project moved to a separate repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#github-organization-created-for-flink-ecosystem-projects&quot; id=&quot;markdown-toc-github-organization-created-for-flink-ecosystem-projects&quot;&gt;Github organization created for Flink ecosystem projects&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upgrade-notes&quot; id=&quot;markdown-toc-upgrade-notes&quot;&gt;Upgrade Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes-and-resources&quot; id=&quot;markdown-toc-release-notes-and-resources&quot;&gt;Release Notes and Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;notable-features&quot;&gt;Notable Features&lt;/h1&gt;
&lt;h2 id=&quot;api-and-infrastructure&quot;&gt;API and Infrastructure&lt;/h2&gt;
&lt;h3 id=&quot;supporting-stages-requiring-multi-input-multi-output&quot;&gt;Supporting stages requiring multi-input multi-output&lt;/h3&gt;
&lt;p&gt;Stages in a machine learning workflow might take multiple inputs and return
multiple outputs. For example, a graph embedding algorithm might need to read
two tables, which represent the edge and node of the graph respectively. And a
workflow might need a stage that splits the input dataset into two output
datasets, for training and testing respectively.&lt;/p&gt;
&lt;p&gt;With this capability, algorithm developers can assemble a machine learning
workflow as a directed acyclic graph (DAG) of pre-defined stages. And this
workflow can be configured and deployed without users knowing the
implementation details of this graph. This improvement could considerably
expand the applicability and usability of Flink ML.&lt;/p&gt;
&lt;h3 id=&quot;supporting-online-learning-with-apis-exposing-model-data&quot;&gt;Supporting online learning with APIs exposing model data&lt;/h3&gt;
&lt;p&gt;In a native online learning scenario, we have a long-running job that keeps
processing training data and updating a machine learning model. And we could
have multiple jobs deployed in web servers which do online inference. It is
necessary to transmit the latest model data from the training job to those
inference jobs with (near-) real-time latency.&lt;/p&gt;
&lt;p&gt;The traditional Estimator/Transformer paradigm does not provide APIs to expose
this model data in a streaming manner. Users have to repeatedly call fit() to
update model data. Although users might be able to update model data once every
few minutes, it is likely very inefficient, if not impossible, to update model
data once every few seconds with this approach.&lt;/p&gt;
&lt;p&gt;With
&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=184615783&quot;&gt;FLIP-173&lt;/a&gt;,
model data can be exposed as an unbounded stream via the getModelData() API.
Then algorithm users can transfer the model data to web servers in real-time
and use the up-to-date model data to do online inference. This feature could
significantly strengthen Flink ML’s capability to support online learning
applications.&lt;/p&gt;
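&lt;p&gt;As a minimal sketch of this pattern (assuming a &lt;code&gt;StreamTableEnvironment&lt;/code&gt; named &lt;code&gt;tEnv&lt;/code&gt; and a fitted Flink ML model named &lt;code&gt;model&lt;/code&gt; from the surrounding program; the &lt;code&gt;print()&lt;/code&gt; call merely stands in for whatever transport ships the model data to the inference jobs):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// FLIP-173: the fitted model exposes its model data as one or more tables,
// backed by a (possibly unbounded) stream; here we take the first table.
Table modelData = model.getModelData()[0];

// Convert the model data table into a DataStream and forward it to the
// inference side. print() is only a placeholder sink for this sketch.
tEnv.toDataStream(modelData).print();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;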
&lt;h3 id=&quot;improved-usability-for-managing-parameters&quot;&gt;Improved usability for managing parameters&lt;/h3&gt;
&lt;p&gt;We care a lot about usability and developer velocity in Flink ML. In this
release, we refactored and significantly simplified the experience of defining,
getting and setting parameters for algorithms.&lt;/p&gt;
&lt;p&gt;With
&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181311361&quot;&gt;FLIP-174&lt;/a&gt;,
parameters can be defined as static variables of an interface, and any
algorithm that implements the interface could inherit these variable
definitions without additional work. Commonly used parameter validators are
provided as part of the infrastructure.&lt;/p&gt;
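&lt;p&gt;As a short illustration, the following sketch shows what such a shared parameter definition looks like with FLIP-174. The interface name &lt;code&gt;HasStepSize&lt;/code&gt; is only an example we made up here; &lt;code&gt;Param&lt;/code&gt;, &lt;code&gt;DoubleParam&lt;/code&gt;, &lt;code&gt;ParamValidators&lt;/code&gt; and &lt;code&gt;WithParams&lt;/code&gt; are part of the Flink ML parameter infrastructure.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.ml.param.DoubleParam;
import org.apache.flink.ml.param.Param;
import org.apache.flink.ml.param.ParamValidators;
import org.apache.flink.ml.param.WithParams;

// The parameter is defined once as a static variable of an interface. Every
// stage that implements HasStepSize&amp;lt;T&amp;gt; inherits the definition, its default
// value and its validator without any additional work.
public interface HasStepSize&amp;lt;T&amp;gt; extends WithParams&amp;lt;T&amp;gt; {
    Param&amp;lt;Double&amp;gt; STEP_SIZE =
            new DoubleParam(&amp;quot;stepSize&amp;quot;, &amp;quot;Step size for an iteration.&amp;quot;, 0.1, ParamValidators.gt(0.0));

    default Double getStepSize() {
        return get(STEP_SIZE);
    }

    default T setStepSize(Double value) {
        return set(STEP_SIZE, value);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;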
&lt;h3 id=&quot;tools-for-composing-dag-of-stages-into-a-new-stage&quot;&gt;Tools for composing DAG of stages into a new stage&lt;/h3&gt;
&lt;p&gt;One of the most useful tools in existing ML libraries (e.g. Scikit-learn,
Flink, Spark) is
&lt;a href=&quot;https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html&quot;&gt;Pipeline&lt;/a&gt;,
which allows users to compose an estimator from an ordered list of estimators
and transformers, without having to explicitly implement the fit/transform for
the estimator/transformer.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181311363&quot;&gt;FLIP-175&lt;/a&gt;
extended this capability from pipeline to DAG. Users can now compose an
estimator from a DAG of estimators and transformers. This capability of
composition allows developers to slice a complex workflow into simpler modules
and re-use the modules across multiple workflows. We believe this capability
could significantly improve the experience of building and deploying complex
workflows using Flink ML.&lt;/p&gt;
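&lt;p&gt;For the simplest, linear case, composition looks roughly like the sketch below, which chains a one-hot encoder into a logistic regression. It assumes input tables with a numeric &lt;code&gt;category&lt;/code&gt; column and a &lt;code&gt;label&lt;/code&gt; column provided by the surrounding program; the general DAG case uses the graph-building APIs introduced in FLIP-175 in the same spirit.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.Arrays;
import java.util.List;

import org.apache.flink.ml.api.Stage;
import org.apache.flink.ml.builder.Pipeline;
import org.apache.flink.ml.builder.PipelineModel;
import org.apache.flink.ml.classification.logisticregression.LogisticRegression;
import org.apache.flink.ml.feature.onehotencoder.OneHotEncoder;
import org.apache.flink.table.api.Table;

// Compose two stages into a single estimator: the one-hot encoder turns the
// categorical column into a feature vector that the classifier then consumes.
List&amp;lt;Stage&amp;lt;?&amp;gt;&amp;gt; stages = Arrays.asList(
        new OneHotEncoder().setInputCols(&amp;quot;category&amp;quot;).setOutputCols(&amp;quot;features&amp;quot;),
        new LogisticRegression().setFeaturesCol(&amp;quot;features&amp;quot;).setLabelCol(&amp;quot;label&amp;quot;));
Pipeline pipeline = new Pipeline(stages);

// The composed estimator is fitted and used like any other stage, without the
// caller knowing its internal structure. trainTable and testTable are assumed
// to be provided by the surrounding program.
PipelineModel pipelineModel = pipeline.fit(trainTable);
Table predictions = pipelineModel.transform(testTable)[0];
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;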
&lt;h2 id=&quot;stream-batch-unified-iteration-library&quot;&gt;Stream-batch Unified Iteration Library&lt;/h2&gt;
&lt;p&gt;To train machine learning algorithms and adjust model parameters dynamically
based on prediction results, it is necessary to have native support for
processing data iteratively. Since Flink uses a DAG to describe its processing
logic, the iteration library needs to be provided on top of Flink separately.
Moreover, because both offline training and online training / adjustment need
to be supported, the iteration library should cover both streaming and batch
cases.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/x/hAEBCw&quot;&gt;FLIP-176&lt;/a&gt; implements a
stream-batch unified iteration library. It provides the ability to transmit
records back to preceding operators and to track the progress of rounds inside
the iteration. Users can directly use the DataStream API and Table API to
express the execution logic inside the iteration. The new iteration library
also extends Flink’s checkpointing mechanism to support exactly-once failover
for jobs using iterations.&lt;/p&gt;
&lt;h2 id=&quot;python-sdk&quot;&gt;Python SDK&lt;/h2&gt;
&lt;p&gt;Nowadays many machine learning practitioners are used to developing machine
learning workflows in Python due to its ease-of-use and excellent ecosystem. To
meet the needs of these users, a Python package dedicated to Flink ML has been
created starting from this release. The Python package currently provides APIs
similar to their Java counterparts for developing machine learning algorithms.&lt;/p&gt;
&lt;p&gt;Users can install the Flink ML Python package through pip using the following
command:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;pip install apache-flink-ml&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the future we will enhance the Python SDK to enable its interoperability
with Flink ML’s Java library, for example, allowing users to express machine
learning workflows in Python, where workflows consist of a mixture of stages
from the Flink ML Java library as well as stages implemented in Python (e.g. a
TensorFlow program).&lt;/p&gt;
&lt;h2 id=&quot;algorithm-library&quot;&gt;Algorithm Library&lt;/h2&gt;
&lt;p&gt;Now that the Flink ML API re-design is done, we have started the initiative to
add off-the-shelf algorithms to Flink ML. The release of Flink ML 2.0.0 is closely
related to the Alink project, an Apache Flink ecosystem project open sourced by
Alibaba. The connection between the Flink community and the developers of the Alink
project dates back to 2017. The Alink developers have made significant
contributions to designing the new Flink ML APIs and to refactoring, optimizing and
migrating algorithms from Alink to Flink. Our long-term goal is to provide a
library of performant algorithms that are easy to use, debug and customize for
your needs.&lt;/p&gt;
&lt;p&gt;We have implemented five algorithms in this release: logistic regression,
k-means, k-nearest neighbors, naive Bayes and one-hot encoder. For now these
algorithms focus on validating the APIs and the iteration runtime. In addition to
adding more algorithms, we will also stress test and optimize them to make sure
they deliver state-of-the-art performance.
Stay tuned!&lt;/p&gt;
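&lt;p&gt;As a minimal sketch, using one of the new algorithms (k-means) from the Java API looks roughly as follows. The tiny in-memory dataset is only for illustration, and the class names and setters follow the Flink ML 2.0 API.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.ml.clustering.kmeans.KMeans;
import org.apache.flink.ml.clustering.kmeans.KMeansModel;
import org.apache.flink.ml.linalg.DenseVector;
import org.apache.flink.ml.linalg.Vectors;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class KMeansExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // A tiny in-memory dataset with a single vector column named &amp;quot;features&amp;quot;.
        DataStream&amp;lt;DenseVector&amp;gt; points = env.fromElements(
                Vectors.dense(0.0, 0.0), Vectors.dense(0.3, 0.0),
                Vectors.dense(9.0, 0.0), Vectors.dense(9.6, 0.0));
        Table input = tEnv.fromDataStream(points).as(&amp;quot;features&amp;quot;);

        // Fit the estimator and use the resulting model for prediction.
        KMeans kmeans = new KMeans().setK(2).setFeaturesCol(&amp;quot;features&amp;quot;).setPredictionCol(&amp;quot;prediction&amp;quot;);
        KMeansModel model = kmeans.fit(input);
        Table output = model.transform(input)[0];

        output.execute().print();
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;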
&lt;h1 id=&quot;related-work&quot;&gt;Related Work&lt;/h1&gt;
&lt;h2 id=&quot;flink-ml-project-moved-to-a-separate-repository&quot;&gt;Flink ML project moved to a separate repository&lt;/h2&gt;
&lt;p&gt;To accelerate the development of Flink ML, the effort has moved to the new
repository &lt;a href=&quot;https://github.com/apache/flink-ml&quot;&gt;flink-ml&lt;/a&gt; under the Flink
project. Here we follow an approach similar to the Stateful Functions effort,
where a separate repository has helped speed up development by allowing
for more lightweight contribution workflows and separate release cycles.&lt;/p&gt;
&lt;h2 id=&quot;github-organization-created-for-flink-ecosystem-projects&quot;&gt;Github organization created for Flink ecosystem projects&lt;/h2&gt;
&lt;p&gt;To facilitate community collaboration on ecosystem projects that extend the
capabilities of Apache Flink, the Apache Flink PMC has granted permission to
use flink-extended as the name of this &lt;a href=&quot;https://github.com/flink-extended&quot;&gt;GitHub
organization&lt;/a&gt;, which provides a neutral
place to host the code of ecosystem projects.&lt;/p&gt;
&lt;p&gt;Two Flink ML related projects have been moved to this organization.
&lt;a href=&quot;https://github.com/flink-extended/dl-on-flink&quot;&gt;dl-on-flink&lt;/a&gt; provides the
capability to implement Flink ML stages using TensorFlow. And
&lt;a href=&quot;https://github.com/flink-extended/clink&quot;&gt;clink&lt;/a&gt; is a library that facilitates
the implementation of Flink ML stages using C++ in order to support e.g.
real-time feature engineering.&lt;/p&gt;
&lt;p&gt;We hope you can join this effort and share your Flink ecosystem projects in
this GitHub organization. Stay tuned for more updates on ecosystem
projects.&lt;/p&gt;
&lt;h1 id=&quot;upgrade-notes&quot;&gt;Upgrade Notes&lt;/h1&gt;
&lt;p&gt;Please review this note for a list of adjustments to make and issues to check
if you plan to upgrade to Flink ML 2.0.0.&lt;/p&gt;
&lt;p&gt;This note discusses any critical information about incompatibilities and
breaking changes, performance changes, and any other changes that might impact
your production deployment of Flink ML.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Module names are changed&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;We have replaced the &lt;code&gt;flink-ml-api&lt;/code&gt; module with the &lt;code&gt;flink-ml-core_2.12&lt;/code&gt;
module.&lt;/p&gt;
&lt;p&gt;If you have a dependency on &lt;code&gt;flink-ml-api&lt;/code&gt;, please replace it with
&lt;code&gt;flink-ml-core_2.12&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PipelineStage and its subclasses are changed&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=184615783&quot;&gt;FLIP-173&lt;/a&gt;
made major changes to PipelineStage and its subclasses. Changes include class
renames, method signature changes, method removals, etc.&lt;/p&gt;
&lt;p&gt;Users who use PipelineStage and its subclasses should use the new APIs
introduced in FLIP-173.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Param-related classes are changed&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181311361&quot;&gt;FLIP-174&lt;/a&gt;
made major changes to the param-related classes. Changes include class renames,
method signature changes, method removals, etc.&lt;/p&gt;
&lt;p&gt;Users who use classes such as Params and WithParams should use the new APIs
introduced in FLIP-174.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Flink dependency is changed from 1.12 to 1.14&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This change introduces all the breaking changes listed in the Flink 1.14
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/release-notes/flink-1.14&quot;&gt;release notes&lt;/a&gt;.
One major change is that the DataSet API is not supported anymore.&lt;/p&gt;
&lt;p&gt;Users who use DataSet::iterate should switch to using the DataStream-based
iteration API introduced in &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=184615300&quot;&gt;FLIP-176&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;release-notes-and-resources&quot;&gt;Release Notes and Resources&lt;/h1&gt;
&lt;p&gt;Please take a look at the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12351079&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features.&lt;/p&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated
&lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads page&lt;/a&gt; of the Flink website,
and the most recent distribution of Flink ML Python package is available on
&lt;a href=&quot;https://pypi.org/project/apache-flink-ml&quot;&gt;PyPI&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h1&gt;
&lt;p&gt;The Apache Flink community would like to thank each one of the contributors
that have made this release possible:&lt;/p&gt;
&lt;p&gt;Yun Gao, Dong Lin, Zhipeng Zhang, huangxingbo, Yunfeng Zhou, Jiangjie (Becket)
Qin, weibo, abdelrahman-ik.&lt;/p&gt;
</description>
<pubDate>Fri, 07 Jan 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2022/01/07/release-ml-2.0.0.html</link>
<guid isPermaLink="true">/news/2022/01/07/release-ml-2.0.0.html</guid>
</item>
<item>
<title>How We Improved Scheduler Performance for Large-scale Jobs - Part Two</title>
<description>&lt;p&gt;&lt;a href=&quot;/2022/01/04/scheduler-performance-part-one&quot;&gt;Part one&lt;/a&gt; of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#reducing-complexity-with-groups&quot; id=&quot;markdown-toc-reducing-complexity-with-groups&quot;&gt;Reducing complexity with groups&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#optimizations-related-to-task-deployment&quot; id=&quot;markdown-toc-optimizations-related-to-task-deployment&quot;&gt;Optimizations related to task deployment&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#the-problem&quot; id=&quot;markdown-toc-the-problem&quot;&gt;The problem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-solution&quot; id=&quot;markdown-toc-the-solution&quot;&gt;The solution&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#cache-shuffledescriptors&quot; id=&quot;markdown-toc-cache-shuffledescriptors&quot;&gt;Cache ShuffleDescriptors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#distribute-shuffledescriptors-via-the-blob-server&quot; id=&quot;markdown-toc-distribute-shuffledescriptors-via-the-blob-server&quot;&gt;Distribute ShuffleDescriptors via the blob server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#optimizations-when-building-pipelined-regions&quot; id=&quot;markdown-toc-optimizations-when-building-pipelined-regions&quot;&gt;Optimizations when building pipelined regions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;reducing-complexity-with-groups&quot;&gt;Reducing complexity with groups&lt;/h1&gt;
&lt;p&gt;A distribution pattern describes how consumer tasks are connected to producer tasks. Currently, there are two distribution patterns in Flink: pointwise and all-to-all. When the distribution pattern is pointwise between two vertices, the &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_O_notation&quot;&gt;computational complexity&lt;/a&gt; of traversing all edges is O(n). When the distribution pattern is all-to-all, the complexity of traversing all edges is O(n&lt;sup&gt;2&lt;/sup&gt;), which means that complexity increases rapidly when the scale goes up.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg&quot; width=&quot;75%&quot; /&gt;
&lt;br /&gt;
Fig. 1 - Two distribution patterns in Flink
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
In Flink 1.12, the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/runtime/executiongraph/ExecutionEdge.html&quot;&gt;ExecutionEdge&lt;/a&gt; class is used to store the information of connections between tasks. This means that for the all-to-all distribution pattern, there would be O(n&lt;sup&gt;2&lt;/sup&gt;) ExecutionEdges, which would take up a lot of memory for large-scale jobs. For two &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/JobVertex.html&quot;&gt;JobVertices&lt;/a&gt; connected with an all-to-all edge and a parallelism of 10K, it would take more than 4 GiB memory to store 100M ExecutionEdges. Since there can be multiple all-to-all connections between vertices in production jobs, the amount of memory required would increase rapidly.&lt;/p&gt;
&lt;p&gt;As we can see in Fig. 1, for two JobVertices connected with the all-to-all distribution pattern, all &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.html&quot;&gt;IntermediateResultPartitions&lt;/a&gt; produced by upstream &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.html&quot;&gt;ExecutionVertices&lt;/a&gt; are &lt;a href=&quot;https://en.wikipedia.org/wiki/Isomorphism&quot;&gt;isomorphic&lt;/a&gt;, which means that the downstream ExecutionVertices they connect to are exactly the same. The downstream ExecutionVertices belonging to the same JobVertex are also isomorphic, as the upstream IntermediateResultPartitions they connect to are the same too. Since every &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/JobEdge.html&quot;&gt;JobEdge&lt;/a&gt; has exactly one distribution type, we can divide vertices and result partitions into groups according to the distribution type of the JobEdge.&lt;/p&gt;
&lt;p&gt;For the all-to-all distribution pattern, since all downstream ExecutionVertices belonging to the same JobVertex are isomorphic and belong to a single group, all the result partitions they consume are connected to this group. This group is called &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumerVertexGroup.html&quot;&gt;ConsumerVertexGroup&lt;/a&gt;. Inversely, all the upstream result partitions are grouped into a single group, and all the consumer vertices are connected to this group. This group is called &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumedPartitionGroup.html&quot;&gt;ConsumedPartitionGroup&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The basic idea of our optimizations is to put all the vertices that consume the same result partitions into one ConsumerVertexGroup, and put all the result partitions with the same consumer vertices into one ConsumedPartitionGroup.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/2-groups.svg&quot; width=&quot;80%&quot; /&gt;
&lt;br /&gt;
Fig. 2 - How partitions and vertices are grouped w.r.t. distribution patterns
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
When scheduling tasks, Flink needs to iterate over all the connections between result partitions and consumer vertices. In the past, since there were O(n&lt;sup&gt;2&lt;/sup&gt;) edges in total, the overall complexity of the iteration was O(n&lt;sup&gt;2&lt;/sup&gt;). Now ExecutionEdge is replaced with ConsumerVertexGroup and ConsumedPartitionGroup. As all the isomorphic result partitions are connected to the same downstream ConsumerVertexGroup, when the scheduler iterates over all the connections, it just needs to iterate over the group once. The computational complexity decreases from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n).&lt;/p&gt;
&lt;p&gt;For the pointwise distribution pattern, one ConsumedPartitionGroup is connected to one ConsumerVertexGroup point-to-point. The number of groups is the same as the number of ExecutionEdges. Thus, the computational complexity of iterating over the groups is still O(n).&lt;/p&gt;
&lt;p&gt;For the example job we mentioned above, replacing ExecutionEdges with the groups can effectively reduce the memory usage of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.html&quot;&gt;ExecutionGraph&lt;/a&gt; from more than 4 GiB to about 12 MiB. Based on the concept of groups, we further optimized several procedures, including job initialization, scheduling tasks, failover, and partition releasing. These procedures are all involved with traversing all consumer vertices for all the partitions. After the optimization, their overall computational complexity decreases from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n).&lt;/p&gt;
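&lt;p&gt;The following deliberately simplified sketch (plain Java, not Flink&amp;#39;s actual runtime classes) illustrates the memory argument: with groups, an all-to-all connection is stored as two lists of size n instead of n&lt;sup&gt;2&lt;/sup&gt; individual edge objects.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the grouping idea described above;
// these are not Flink&amp;#39;s actual runtime classes.
public class GroupingSketch {

    static class ConsumedPartitionGroup { final List&amp;lt;Integer&amp;gt; partitionIds = new ArrayList&amp;lt;&amp;gt;(); }

    static class ConsumerVertexGroup { final List&amp;lt;Integer&amp;gt; vertexIds = new ArrayList&amp;lt;&amp;gt;(); }

    public static void main(String[] args) {
        int parallelism = 10_000;

        // All upstream result partitions share one group and all downstream
        // vertices share one group, so each element is stored exactly once.
        ConsumedPartitionGroup producedPartitions = new ConsumedPartitionGroup();
        ConsumerVertexGroup consumerVertices = new ConsumerVertexGroup();
        for (int i = 0; i &amp;lt; parallelism; i++) {
            producedPartitions.partitionIds.add(i);
            consumerVertices.vertexIds.add(i);
        }

        // 2 * n entries in total instead of n * n ExecutionEdge objects.
        System.out.println(producedPartitions.partitionIds.size() + consumerVertices.vertexIds.size());
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;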
&lt;h1 id=&quot;optimizations-related-to-task-deployment&quot;&gt;Optimizations related to task deployment&lt;/h1&gt;
&lt;h2 id=&quot;the-problem&quot;&gt;The problem&lt;/h2&gt;
&lt;p&gt;In Flink 1.12, it takes a long time to deploy tasks for large-scale jobs if they contain all-to-all edges. Furthermore, a heartbeat timeout may happen during or after task deployment, which makes the cluster unstable.&lt;/p&gt;
&lt;p&gt;Currently, task deployment includes the following steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A JobManager creates &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.html&quot;&gt;TaskDeploymentDescriptors&lt;/a&gt; for each task, which happens in the JobManager’s main thread;&lt;/li&gt;
&lt;li&gt;The JobManager serializes TaskDeploymentDescriptors asynchronously;&lt;/li&gt;
&lt;li&gt;The JobManager ships serialized TaskDeploymentDescriptors to TaskManagers via RPC messages;&lt;/li&gt;
&lt;li&gt;TaskManagers create new tasks based on the TaskDeploymentDescriptors and execute them.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A TaskDeploymentDescriptor (TDD) contains all the information required by TaskManagers to create a task. At the beginning of task deployment, a JobManager creates the TDDs for all tasks. Since this happens in the main thread, the JobManager cannot respond to any other requests. For large-scale jobs, the main thread may get blocked for a long time, heartbeat timeouts may happen, and a failover would be triggered.&lt;/p&gt;
&lt;p&gt;A JobManager can become a bottleneck during task deployment since all descriptors are transmitted from it to all TaskManagers. For large-scale jobs, these temporary descriptors would require a lot of heap memory and cause frequent, long garbage collection pauses.&lt;/p&gt;
&lt;p&gt;Thus, we need to speed up the creation of the TDDs. Furthermore, if the size of descriptors can be reduced, then they will be transmitted faster, which leads to faster task deployments.&lt;/p&gt;
&lt;h2 id=&quot;the-solution&quot;&gt;The solution&lt;/h2&gt;
&lt;h3 id=&quot;cache-shuffledescriptors&quot;&gt;Cache ShuffleDescriptors&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/shuffle/ShuffleDescriptor.html&quot;&gt;ShuffleDescriptor&lt;/a&gt;s are used to describe the information of result partitions that a task consumes and can be the largest part of a TaskDeploymentDescriptor. For an all-to-all edge, when the parallelisms of both upstream and downstream vertices are n, the number of ShuffleDescriptors for each downstream vertex is n, since they are connected to n upstream vertices. Thus, the total count of the ShuffleDescriptors for the vertices is n&lt;sup&gt;2&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;However, the ShuffleDescriptors for the downstream vertices are all the same since they all consume the same upstream result partitions. Therefore, Flink doesn’t need to create ShuffleDescriptors for each downstream vertex individually. Instead, it can create them once and cache them to be reused. This will decrease the overall complexity of creating TaskDeploymentDescriptors for tasks from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n).&lt;/p&gt;
&lt;p&gt;To decrease the size of RPC messages and reduce the transmission of replicated data over the network, the cached ShuffleDescriptors can be compressed. For the example job we mentioned above, if the parallelisms of vertices are both 10k, then each downstream vertex has 10k ShuffleDescriptors. After compression, the size of the serialized value would be reduced by 72%.&lt;/p&gt;
&lt;h3 id=&quot;distribute-shuffledescriptors-via-the-blob-server&quot;&gt;Distribute ShuffleDescriptors via the blob server&lt;/h3&gt;
&lt;p&gt;A &lt;a href=&quot;https://en.wikipedia.org/wiki/Binary_large_object&quot;&gt;blob&lt;/a&gt; (binary large object) is a collection of binary data used to store large files. Flink hosts a blob server to transport large-sized data between the JobManager and TaskManagers. When the JobManager decides to transmit a large file to TaskManagers, it first stores the file in the blob server (which also uploads the file to the distributed file system) and gets a token representing the blob, called the blob key. It then transmits the blob key instead of the blob content to TaskManagers. When TaskManagers receive the blob key, they retrieve the file from the distributed file system (DFS). The blobs are stored in the blob cache on TaskManagers so that each file only needs to be retrieved once.&lt;/p&gt;
&lt;p&gt;During task deployment, the JobManager is responsible for distributing the ShuffleDescriptors to TaskManagers via RPC messages. The messages will be garbage collected once they are sent. However, if the JobManager cannot send the messages as fast as they are created, these messages would take up a lot of space in heap memory and become a heavy burden for the garbage collector to deal with. This leads to more long stop-the-world garbage collections, which slow down task deployment.&lt;/p&gt;
&lt;p&gt;To solve this problem, the blob server can be used to distribute large ShuffleDescriptors. The JobManager first sends ShuffleDescriptors to the blob server, which stores ShuffleDescriptors in the DFS. TaskManagers request ShuffleDescriptors from the DFS once they begin to process TaskDeploymentDescriptors. With this change, the JobManager doesn’t need to keep all the copies of ShuffleDescriptors in heap memory until they are sent. Moreover, the frequency of garbage collections for large-scale jobs is significantly reduced. Also, task deployment will be faster since there will be no bottleneck during task deployment anymore, because the DFS provides multiple distributed nodes for TaskManagers to download the ShuffleDescriptors from.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg&quot; width=&quot;80%&quot; /&gt;
&lt;br /&gt;
Fig. 3 - How ShuffleDescriptors are distributed
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
To avoid running out of space on the local disk, the cache will be cleared when the related partitions are no longer valid and a size limit is added for ShuffleDescriptors in the blob cache on TaskManagers. If the overall size exceeds the limit, the least recently used cached value will be removed. This ensures that the local disks on the JobManager and TaskManagers won’t be filled up with ShuffleDescriptors, especially in session mode.&lt;/p&gt;
&lt;h1 id=&quot;optimizations-when-building-pipelined-regions&quot;&gt;Optimizations when building pipelined regions&lt;/h1&gt;
&lt;p&gt;In Flink, there are two types of data exchanges: pipelined and blocking. When using blocking data exchanges, result partitions are first fully produced and then consumed by the downstream vertices. The produced results are persisted and can be consumed multiple times. When using pipelined data exchanges, result partitions are produced and consumed concurrently. The produced results are not persisted and can be consumed only once.&lt;/p&gt;
&lt;p&gt;Since the pipelined data stream is produced and consumed simultaneously, Flink needs to make sure that the vertices connected via pipelined data exchanges execute at the same time. These vertices form a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/topology/PipelinedRegion.html&quot;&gt;pipelined region&lt;/a&gt;. The pipelined region is the basic unit of scheduling and failover by default. During scheduling, all vertices in a pipelined region will be scheduled together, and all pipelined regions in the graph will be scheduled one by one topologically.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg&quot; width=&quot;90%&quot; /&gt;
&lt;br /&gt;
Fig. 4 - The LogicalPipelinedRegion and the SchedulingPipelinedRegion
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
Currently, there are two types of pipelined regions in the scheduler: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/topology/LogicalPipelinedRegion.html&quot;&gt;LogicalPipelinedRegion&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/SchedulingPipelinedRegion.html&quot;&gt;SchedulingPipelinedRegion&lt;/a&gt;. The LogicalPipelinedRegion denotes the pipelined regions on the logical level. It consists of JobVertices and forms the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/JobGraph.html&quot;&gt;JobGraph&lt;/a&gt;. The SchedulingPipelinedRegion denotes the pipelined regions on the execution level. It consists of ExecutionVertices and forms the ExecutionGraph. Like ExecutionVertices are derived from a JobVertex, SchedulingPipelinedRegions are derived from a LogicalPipelinedRegion, as Fig. 4 shows.&lt;/p&gt;
&lt;p&gt;During the construction of pipelined regions, a problem arises: There may be cyclic dependencies between pipelined regions. A pipelined region can be scheduled if and only if all its dependencies have finished. However, if there are two pipelined regions with cyclic dependencies between each other, there will be a scheduling &lt;a href=&quot;https://en.wikipedia.org/wiki/Deadlock&quot;&gt;deadlock&lt;/a&gt;. They are both waiting for the other one to be scheduled first, and none of them can be scheduled. Therefore, &lt;a href=&quot;https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm&quot;&gt;Tarjan’s strongly connected components algorithm&lt;/a&gt; is adopted to discover the cyclic dependencies between regions and merge them into one pipelined region. It will traverse all the edges in the topology. For the all-to-all distribution pattern, the number of edges is O(n&lt;sup&gt;2&lt;/sup&gt;). Thus, the computational complexity of this algorithm is O(n&lt;sup&gt;2&lt;/sup&gt;), and it significantly slows down the initialization of the scheduler.&lt;/p&gt;
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg&quot; width=&quot;90%&quot; /&gt;
&lt;br /&gt;
Fig. 5 - The topology with scheduling deadlock
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
To speed up the construction of pipelined regions, the relevance between the logical topology and the scheduling topology can be leveraged. Since a SchedulingPipelinedRegion is derived from just one LogicalPipelinedRegion, Flink traverses all LogicalPipelinedRegions and converts them into SchedulingPipelinedRegions one by one. The conversion varies based on the distribution patterns of edges that connect vertices in the LogicalPipelinedRegion.&lt;/p&gt;
&lt;p&gt;If there are any all-to-all distribution patterns inside the region, the entire region can just be converted into one SchedulingPipelinedRegion directly. That’s because for an all-to-all edge with a pipelined data exchange, all the regions connected to this edge must execute simultaneously, which means they are merged into one region. An all-to-all edge with a blocking data exchange introduces cyclic dependencies, as Fig. 5 shows. All the regions it connects must be merged into one region to avoid a scheduling deadlock, as Fig. 6 shows. Since there’s no need to use Tarjan’s algorithm, the computational complexity in this case is O(n).&lt;/p&gt;
&lt;p&gt;If there are only pointwise distribution patterns inside a region, Tarjan’s strongly connected components algorithm is still used to ensure no cyclic dependencies. Since there are only pointwise distribution patterns, the number of edges in the topology is O(n), and the computational complexity of the algorithm will be O(n).&lt;/p&gt;
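&lt;p&gt;The conversion logic described in the last two paragraphs can be summarized with a short sketch. This is not Flink’s actual code; all type and method names below are illustrative placeholders:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative sketch only; all names are placeholders, not Flink's real API.
List&lt;SchedulingPipelinedRegion&gt; buildSchedulingRegions(List&lt;LogicalPipelinedRegion&gt; logicalRegions) {
    List&lt;SchedulingPipelinedRegion&gt; result = new ArrayList&lt;&gt;();
    for (LogicalPipelinedRegion logicalRegion : logicalRegions) {
        if (logicalRegion.containsAllToAllEdge()) {
            // Any all-to-all edge (pipelined or blocking) forces all execution vertices
            // derived from this logical region into a single scheduling region: O(n).
            result.add(mergeIntoOneRegion(logicalRegion.getExecutionVertices()));
        } else {
            // Only pointwise edges: the number of edges is O(n), so running Tarjan's
            // strongly connected components algorithm is also O(n).
            result.addAll(findStronglyConnectedComponents(logicalRegion.getExecutionVertices()));
        }
    }
    return result;
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;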
&lt;center&gt;
&lt;br /&gt;
&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg&quot; width=&quot;90%&quot; /&gt;
&lt;br /&gt;
Fig. 6 - How to convert a LogicalPipelinedRegion to SchedulingPipelinedRegions
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
After the optimization, the overall computational complexity of building pipelined regions decreases from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n). In our experiments, for the job which contains two vertices connected with a blocking all-to-all edge, when their parallelisms are both 10K, the time of building pipelined regions decreases by 99%, from 8,257 ms to 120 ms.&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;All in all, we’ve done several optimizations to improve the scheduler’s performance for large-scale jobs in Flink 1.13 and 1.14. The optimizations involve procedures including job initialization, scheduling, task deployment, and failover. If you have any questions about them, please feel free to start a discussion in the dev mailing list.&lt;/p&gt;
</description>
<pubDate>Tue, 04 Jan 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/2022/01/04/scheduler-performance-part-two.html</link>
<guid isPermaLink="true">/2022/01/04/scheduler-performance-part-two.html</guid>
</item>
<item>
<title>How We Improved Scheduler Performance for Large-scale Jobs - Part One</title>
<description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink tasks), Flink’s JobManager would require 30 GiB of heap memory and more than 4 minutes to deploy all of the tasks.&lt;/p&gt;
&lt;p&gt;Furthermore, task deployment may block the JobManager’s main thread for a long time and the JobManager will not be able to respond to any other requests from TaskManagers. This could lead to heartbeat timeouts that trigger a failover. In the worst case, this will render the Flink cluster unusable because it cannot deploy the job.&lt;/p&gt;
&lt;p&gt;To improve the performance of the scheduler for large-scale jobs, we’ve implemented several optimizations in Flink 1.13 and 1.14:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Introduce the concept of consuming groups to optimize procedures related to the complexity of topologies, including the initialization, scheduling, failover, and partition release. This also reduces the memory required to store the topology;&lt;/li&gt;
&lt;li&gt;Introduce a cache to optimize task deployment, which makes the process faster and requires less memory;&lt;/li&gt;
&lt;li&gt;Leverage characteristics of the logical topology and the scheduling topology to speed up the building of pipelined regions.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&quot;benchmarking-results&quot;&gt;Benchmarking Results&lt;/h1&gt;
&lt;p&gt;To estimate the effect of our optimizations, we conducted several experiments to compare the performance of Flink 1.12 (before the optimization) with Flink 1.14 (after the optimization). The job in our experiments contains two vertices connected with an all-to-all edge. The parallelisms of these vertices are both 10K. To ensure that temporary deployment descriptors are distributed via the blob server, we set the configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#blob-offload-minsize&quot;&gt;blob.offload.minsize&lt;/a&gt; to 100 KiB (down from the default value of 1 MiB). This means that blobs larger than the configured threshold will be distributed via the blob server; the deployment descriptors in our test job are about 270 KiB in size. The results of our experiments are shown in Table 1 below:&lt;/p&gt;
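&lt;p&gt;For reference, this is how the lowered threshold looks in flink-conf.yaml (100 KiB expressed in bytes):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# 100 KiB; blobs larger than this threshold are distributed via the blob server
blob.offload.minsize: 102400
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;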
&lt;center&gt;
Table 1 - The comparison of time cost between Flink 1.12 and 1.14
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Procedure&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;1.12&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;1.14&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Reduction(%)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Job Initialization&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;11,431ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;627ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;94.51%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Task Deployment&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;63,118ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;17,183ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;72.78%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Computing tasks to restart when failover&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;37,195ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;170ms&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;99.55%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
In addition to faster speeds, memory usage is significantly reduced. With Flink 1.12, the JobManager requires 30 GiB of heap memory to deploy the test job and keep it running stably, while with Flink 1.14 the minimum heap memory required by the JobManager is only 2 GiB.&lt;/p&gt;
&lt;p&gt;There are also fewer occurrences of long garbage collection pauses. When running the test job with Flink 1.12, a garbage collection pause that lasts more than 10 seconds occurs during both job initialization and task deployment. With Flink 1.14, since there are no long garbage collection pauses, there is also a decreased risk of heartbeat timeouts, which leads to better cluster stability.&lt;/p&gt;
&lt;p&gt;In our experiment, it took more than 4 minutes for the large-scale job with Flink 1.12 to transition to running (excluding the time spent on allocating resources). With Flink 1.14, it took no more than 30 seconds (excluding the time spent on allocating resources). The time cost is reduced by 87%. Thus, users who run large-scale jobs in production and want better scheduling performance should consider upgrading to Flink 1.14.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&quot;/2022/01/04/scheduler-performance-part-two&quot;&gt;part two&lt;/a&gt; of this blog post, we are going to talk about these improvements in detail.&lt;/p&gt;
</description>
<pubDate>Tue, 04 Jan 2022 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/2022/01/04/scheduler-performance-part-one.html</link>
<guid isPermaLink="true">/2022/01/04/scheduler-performance-part-one.html</guid>
</item>
<item>
<title>Apache Flink StateFun Log4j emergency release</title>
<description>&lt;p&gt;The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Functions 3.1.1.&lt;/p&gt;
&lt;p&gt;This release includes a version upgrade of Apache Flink to 1.13.5, which upgrades Log4j to address &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-44228&quot;&gt;CVE-2021-44228&lt;/a&gt; and &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-45046&quot;&gt;CVE-2021-45046&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to the latest patch release.&lt;/p&gt;
&lt;p&gt;You can find the source and binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;, and Docker images in the &lt;a href=&quot;https://hub.docker.com/r/apache/flink-statefun&quot;&gt;apache/flink-statefun&lt;/a&gt; dockerhub repository.&lt;/p&gt;
</description>
<pubDate>Wed, 22 Dec 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/12/22/log4j-statefun-release.html</link>
<guid isPermaLink="true">/news/2021/12/22/log4j-statefun-release.html</guid>
</item>
<item>
<title>Apache Flink Log4j emergency releases</title>
<description>&lt;p&gt;The Apache Flink community has released emergency bugfix versions of Apache Flink for the 1.11, 1.12, 1.13 and 1.14 series.&lt;/p&gt;
&lt;p&gt;These releases only include a version upgrade for Log4j to address &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-44228&quot;&gt;CVE-2021-44228&lt;/a&gt; and &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-45046&quot;&gt;CVE-2021-45046&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to the respective patch release.&lt;/p&gt;
&lt;p&gt;You can find the source and binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;, and Docker images in the &lt;a href=&quot;https://hub.docker.com/r/apache/flink&quot;&gt;apache/flink&lt;/a&gt; dockerhub repository.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;We are publishing this announcement earlier than usual to give users access to the updated source/binary releases as soon as possible.&lt;/p&gt;
&lt;p&gt;As a result, certain artifacts are not yet available:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Maven artifacts are currently being synced to Maven central and will become available over the next 24 hours.&lt;/li&gt;
&lt;li&gt;The 1.11.6/1.12.7 Python binaries will be published at a later date.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This post will be continuously updated to reflect the latest state.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;The newly released versions are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;1.14.2&lt;/li&gt;
&lt;li&gt;1.13.5&lt;/li&gt;
&lt;li&gt;1.12.7&lt;/li&gt;
&lt;li&gt;1.11.6&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To clarify and avoid confusion: The 1.14.1 / 1.13.4 / 1.12.6 / 1.11.5 releases, which were supposed to only contain a Log4j upgrade to 2.15.0, were &lt;em&gt;skipped&lt;/em&gt; because &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-45046&quot;&gt;CVE-2021-45046&lt;/a&gt; was discovered during the release publication. Some artifacts were published to Maven Central, but no source/binary releases nor Docker images are available for those versions.&lt;/p&gt;
&lt;/div&gt;
</description>
<pubDate>Thu, 16 Dec 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html</link>
<guid isPermaLink="true">/news/2021/12/16/log4j-patch-releases.html</guid>
</item>
<item>
<title>Advise on Apache Log4j Zero Day (CVE-2021-44228)</title>
<description>&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;Please see &lt;a href=&quot;/news/2021/12/16/log4j-patch-releases&quot;&gt;this&lt;/a&gt; for our updated recommendation regarding this CVE.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Yesterday, a new Zero Day for Apache Log4j was &lt;a href=&quot;https://www.cyberkendra.com/2021/12/apache-log4j-vulnerability-details-and.html&quot;&gt;reported&lt;/a&gt;.
It is by now tracked under &lt;a href=&quot;https://nvd.nist.gov/vuln/detail/CVE-2021-44228&quot;&gt;CVE-2021-44228&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Apache Flink is bundling a version of Log4j that is affected by this vulnerability.
We recommend users to follow the &lt;a href=&quot;https://logging.apache.org/log4j/2.x/security.html&quot;&gt;advisory&lt;/a&gt; of the Apache Log4j Community.
For Apache Flink this currently translates to setting the following property in your flink-conf.yaml:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;env.java.opts&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;-Dlog4j2.formatMsgNoLookups=true&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you are already setting &lt;code&gt;env.java.opts.jobmanager&lt;/code&gt;, &lt;code&gt;env.java.opts.taskmanager&lt;/code&gt;, &lt;code&gt;env.java.opts.client&lt;/code&gt;, or &lt;code&gt;env.java.opts.historyserver&lt;/code&gt; you should instead add the system property to those existing parameter lists.&lt;/p&gt;
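&lt;p&gt;For example, if you already configure &lt;code&gt;env.java.opts.taskmanager&lt;/code&gt;, the property is appended to the existing option list rather than set via &lt;code&gt;env.java.opts&lt;/code&gt;. A minimal sketch (the other JVM option shown is only a placeholder for whatever you already have configured):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# existing options are kept, the mitigation flag is appended
env.java.opts.taskmanager: -XX:+UseG1GC -Dlog4j2.formatMsgNoLookups=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;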
&lt;p&gt;As soon as Log4j has been upgraded to 2.15.0 in Apache Flink, this is not necessary anymore.
This effort is tracked in &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-25240&quot;&gt;FLINK-25240&lt;/a&gt;.
It will be included in Flink 1.15.0, Flink 1.14.1 and Flink 1.13.4.
We expect Flink 1.14.1 to be released in the next 1-2 weeks.
The other releases will follow in their regular cadence.&lt;/p&gt;
</description>
<pubDate>Fri, 10 Dec 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/12/10/log4j-cve.html</link>
<guid isPermaLink="true">/2021/12/10/log4j-cve.html</guid>
</item>
<item>
<title>Flink Backward - The Apache Flink Retrospective</title>
<description>&lt;p&gt;It has now been a month since the community released &lt;a href=&quot;https://flink.apache.org/downloads.html#apache-flink-1140&quot;&gt;Apache Flink 1.14&lt;/a&gt; into the wild. We had a comprehensive look at the enhancements, additions, and fixups in the release announcement blog post, and now we will look at the development cycle from a different angle. Based on feedback collected from contributors involved in this release, we will explore the experiences and processes behind it all.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#a-retrospective-on-the-release-cycle&quot; id=&quot;markdown-toc-a-retrospective-on-the-release-cycle&quot;&gt;A retrospective on the release cycle&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#problems-faced&quot; id=&quot;markdown-toc-problems-faced&quot;&gt;Problems faced&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#things-enjoyed&quot; id=&quot;markdown-toc-things-enjoyed&quot;&gt;Things enjoyed&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#what-we-want-to-achieve-through-process-changes&quot; id=&quot;markdown-toc-what-we-want-to-achieve-through-process-changes&quot;&gt;What we want to achieve through process changes&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#transparency---let-the-community-participate&quot; id=&quot;markdown-toc-transparency---let-the-community-participate&quot;&gt;Transparency - let the community participate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#stability---reduce-building-and-testing-pain&quot; id=&quot;markdown-toc-stability---reduce-building-and-testing-pain&quot;&gt;Stability - reduce building and testing pain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#documentation---make-it-user-friendly&quot; id=&quot;markdown-toc-documentation---make-it-user-friendly&quot;&gt;Documentation - make it user-friendly&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#api-consistency---a-timeless-joyful-experience&quot; id=&quot;markdown-toc-api-consistency---a-timeless-joyful-experience&quot;&gt;API consistency - a timeless, joyful experience&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#some-noteworthy-items&quot; id=&quot;markdown-toc-some-noteworthy-items&quot;&gt;Some noteworthy items&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;a-retrospective-on-the-release-cycle&quot;&gt;A retrospective on the release cycle&lt;/h1&gt;
&lt;p&gt;From the team, we collected emotions that have been attributed to points in time of the 1.14 release cycle:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-11-03-flink-backward/1.14-weather.png&quot; width=&quot;70%&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;The overall sentiment seems to be quite good. A ship crushed a robot two times, someone felt sick towards the end, an octopus causing negative emotions appeared in June…&lt;/p&gt;
&lt;p&gt;We looked at the origin of these emotions and analyzed what went well and what could be improved. We also incorporated some feedback gathered from the community.&lt;/p&gt;
&lt;h2 id=&quot;problems-faced&quot;&gt;Problems faced&lt;/h2&gt;
&lt;p&gt;No release is perfect, and the community is constantly looking to improve.&lt;/p&gt;
&lt;p&gt;Apache Flink has active contributors from around the globe, many of whom do not speak English as a first language. The community is still ironing out processes for delivering high-quality documentation and blog posts from a content perspective. It is a work in progress but we have contributors focusing on this component.&lt;/p&gt;
&lt;p&gt;Each Flink release is built with the help of hundreds of contributors, each working on different parts of the project. Changes to one module may affect others in ways that are not always obvious. To maintain quality, the community supports an expansive test suite. Invariably, some tests are found to be flaky. Whenever we discover a test issue, the community opens a blocker issue that we must resolve before the next release. In practice, this leads to contributors triaging most test instabilities towards the end of each release cycle. From now on, we want to be more mindful of these failures and prioritize them when discovered.&lt;/p&gt;
&lt;p&gt;Finally, the community pushed back the planned feature freeze for 1.14 by two weeks. Two weeks is an improvement from previous release cycles, but we hope to continue improving this metric for 1.15.&lt;/p&gt;
&lt;h2 id=&quot;things-enjoyed&quot;&gt;Things enjoyed&lt;/h2&gt;
&lt;p&gt;The implementation of some features, such as &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/memory/network_mem_tuning/#the-buffer-debloating-mechanism&quot;&gt;buffer debloating&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/finegrained_resource/&quot;&gt;fine-grained resource management&lt;/a&gt;, went smoothly. Though a few issues are now popping up as people begin using them in production, it is satisfying to see an engineering effort go according to plan.&lt;/p&gt;
&lt;p&gt;We also said goodbye to some components, the old table planner and integrated Mesos support. As any developer will tell you, there’s nothing better than deleting old code and reducing complexity.&lt;/p&gt;
&lt;h1 id=&quot;what-we-want-to-achieve-through-process-changes&quot;&gt;What we want to achieve through process changes&lt;/h1&gt;
&lt;h2 id=&quot;transparency---let-the-community-participate&quot;&gt;Transparency - let the community participate&lt;/h2&gt;
&lt;p&gt;When approaching a release, usually a couple of weeks after the previous release has been done, we set up bi-weekly meetings for the community to discuss any issues regarding the release. The usefulness of those meetings varied a lot, and so we started to &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/1.14+Release&quot;&gt;track the efforts&lt;/a&gt; in the Apache Flink Confluence wiki.&lt;/p&gt;
&lt;p&gt;We came up with a system to label the current states of each feature: “independent”, “won’t make it”, “very unlikely”, “will make it”, “done”, and “done done”. We introduced the “done done” state since we lacked a shared understanding of the definition of done. To qualify for “done done”, the feature is manually tested by someone not involved in the implementation. Additionally, there must exist comprehensive documentation that enables users to use the feature.&lt;/p&gt;
&lt;p&gt;After each meeting, we provided updates on the mailing list and created a corresponding burn-down chart. Those efforts have been well received by our contributors, although they might still require some improvements.&lt;/p&gt;
&lt;p&gt;The meeting used to only be for those driving the primary efforts, but we opened it up to the whole community for this release. While nobody ended up joining, we will continue to make the meetings open to everyone.&lt;/p&gt;
&lt;h2 id=&quot;stability---reduce-building-and-testing-pain&quot;&gt;Stability - reduce building and testing pain&lt;/h2&gt;
&lt;p&gt;At one point, as we were coming close to the feature freeze, the master branch became quite unstable. Although we have encountered this issue in the past, building and testing Flink under such conditions was not ideal.&lt;/p&gt;
&lt;p&gt;As a result, we focused on reducing stability issues, and the release managers have tried to organize and manage this effort. In future development cycles, the whole community needs to focus on the stability of the master branch. There are already improvements in the making, and they will hopefully enhance the experience of contributing significantly.&lt;/p&gt;
&lt;h2 id=&quot;documentation---make-it-user-friendly&quot;&gt;Documentation - make it user-friendly&lt;/h2&gt;
&lt;p&gt;Coming back to Apache traditions, most of the documentation (if any) was still being pushed after the feature freeze. As mentioned before, documentation is required to achieve the level of “done done”. Going forward, we will keep more of an eye on pushing documentation earlier in the development process. Apache Flink is an amazing piece of software that can solve so many problems, but we can do so much more in improving the user experience and introducing it to a wider audience.&lt;/p&gt;
&lt;h2 id=&quot;api-consistency---a-timeless-joyful-experience&quot;&gt;API consistency - a timeless, joyful experience&lt;/h2&gt;
&lt;p&gt;The issue of API consistency was not caused by the 1.14 release, but popped up during the development cycle nevertheless, including a bigger discussion on the mailing list. While we tried to be transparent about the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/Stability+Annotations&quot;&gt;stability guarantees of an API&lt;/a&gt; (there are no guarantees across major versions), this was not made very clear or easy to find. Since many users rely on PublicEvolving APIs (due to a lack of Public API additions), this resulted in problems for downstream projects.&lt;/p&gt;
&lt;p&gt;Moving forward, we will document more clearly what the guarantees are and introduce a process for promoting PublicEvolving APIs. This might involve generating a report on any removed/modified PublicEvolving APIs during the release cycle so that downstream projects can prepare for the changes.&lt;/p&gt;
&lt;h1 id=&quot;some-noteworthy-items&quot;&gt;Some noteworthy items&lt;/h1&gt;
&lt;p&gt;The first iteration for the buffer debloat feature was done in a Hackathon.&lt;/p&gt;
&lt;p&gt;Our &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/1.14+Release&quot;&gt;Apache Flink 1.14 Release wiki page&lt;/a&gt; has 167 historic versions. For comparison, &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-147%3A+Support+Checkpoints+After+Tasks+Finished&quot;&gt;FLIP 147&lt;/a&gt; (one of the most active FLIPs) has just 76.&lt;/p&gt;
&lt;p&gt;With &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-2491&quot;&gt;FLINK-2491&lt;/a&gt;, we closed the third most watched issue in the Apache Flink Jira. This makes sense since FLINK-2491 was created 6 years ago (August 6, 2015). The second oldest issue was created in 2017.&lt;/p&gt;
&lt;p&gt;:heart:&lt;/p&gt;
&lt;p&gt;An open source community is more than just working on software. Apache Flink is the perfect example of software that is collaborated on in all parts of the world. The active mailing list, the discussions on FLIPs, and the interactions on Jira tickets all document how people work together to build something great. We should never forget that.&lt;/p&gt;
&lt;p&gt;In the meantime, the community is already working towards Apache Flink 1.15. If you would like to become a contributor, please reach out via the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;dev mailing list&lt;/a&gt;. We are happy to help you find a ticket to get started on.&lt;/p&gt;
</description>
<pubDate>Wed, 03 Nov 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/11/03/flink-backward.html</link>
<guid isPermaLink="true">/2021/11/03/flink-backward.html</guid>
</item>
<item>
<title>Sort-Based Blocking Shuffle Implementation in Flink - Part Two</title>
<description>&lt;p&gt;&lt;a href=&quot;/2021/10/26/sort-shuffle-part1&quot;&gt;Part one&lt;/a&gt; of this blog post explained the motivation behind introducing sort-based blocking shuffle, presented benchmark results, and provided guidelines on how to use this new feature.&lt;/p&gt;
&lt;p&gt;Like sort-merge shuffle implemented by other distributed data processing frameworks, the whole sort-based shuffle process in Flink consists of several important stages, including collecting data in memory, sorting the collected data in memory, spilling the sorted data to files, and reading the shuffle data from these spilled files. However, Flink’s implementation has some core differences, including the multiple data region file structure, the removal of file merge, and IO scheduling.&lt;/p&gt;
&lt;p&gt;In part two of this blog post, we will give you insight into some core design considerations and implementation details of the sort-based blocking shuffle in Flink and list several ideas for future improvement.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#design-considerations&quot; id=&quot;markdown-toc-design-considerations&quot;&gt;Design considerations&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#produce-fewer-small-files&quot; id=&quot;markdown-toc-produce-fewer-small-files&quot;&gt;Produce fewer (small) files&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#open-fewer-files-concurrently&quot; id=&quot;markdown-toc-open-fewer-files-concurrently&quot;&gt;Open fewer files concurrently&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#create-more-sequential-disk-io&quot; id=&quot;markdown-toc-create-more-sequential-disk-io&quot;&gt;Create more sequential disk IO&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#have-less-disk-io-amplification&quot; id=&quot;markdown-toc-have-less-disk-io-amplification&quot;&gt;Have less disk IO amplification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#decouple-memory-consumption-from-parallelism&quot; id=&quot;markdown-toc-decouple-memory-consumption-from-parallelism&quot;&gt;Decouple memory consumption from parallelism&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#implementation-details&quot; id=&quot;markdown-toc-implementation-details&quot;&gt;Implementation details&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#in-memory-sort&quot; id=&quot;markdown-toc-in-memory-sort&quot;&gt;In-memory sort&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#storage-structure&quot; id=&quot;markdown-toc-storage-structure&quot;&gt;Storage structure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#io-scheduling&quot; id=&quot;markdown-toc-io-scheduling&quot;&gt;IO scheduling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#broadcast-optimization&quot; id=&quot;markdown-toc-broadcast-optimization&quot;&gt;Broadcast optimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#data-compression&quot; id=&quot;markdown-toc-data-compression&quot;&gt;Data compression&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#future-improvements&quot; id=&quot;markdown-toc-future-improvements&quot;&gt;Future improvements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;design-considerations&quot;&gt;Design considerations&lt;/h1&gt;
&lt;p&gt;There are several core objectives we want to achieve for the new sort-based blocking shuffle to be implemented in Flink:&lt;/p&gt;
&lt;h2 id=&quot;produce-fewer-small-files&quot;&gt;Produce fewer (small) files&lt;/h2&gt;
&lt;p&gt;As discussed above, the hash-based blocking shuffle would produce too many small files for large-scale batch jobs. Producing fewer files can help to improve both stability and performance. The sort-merge approach has been widely adopted to solve this problem. By first writing to the in-memory buffer and then sorting and spilling the data into a file after the in-memory buffer is full, the number of output files can be reduced, which becomes (total data size) / (in-memory buffer size). Then by merging the produced files together, the number of files can be further reduced and larger data blocks can provide better sequential reads.&lt;/p&gt;
&lt;p&gt;Flink’s sort-based blocking shuffle adopts a similar logic. A core difference is that data spilling will always append data to the same file so only one file will be spilled for each output, which means fewer files are produced.&lt;/p&gt;
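&lt;p&gt;As a hypothetical example of the difference (the numbers are made up purely for illustration):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;data spilled by one producer task: 8 GiB
in-memory sort buffer size:        512 MiB

plain sort-spill approach:    8 GiB / 512 MiB = 16 spill files (before merging)
Flink (single appended file): 1 file per result partition
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;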
&lt;h2 id=&quot;open-fewer-files-concurrently&quot;&gt;Open fewer files concurrently&lt;/h2&gt;
&lt;p&gt;The hash-based implementation will open all partition files when writing and reading data which will consume resources like file descriptors and native memory. Exhaustion of file descriptors will lead to stability issues like “too many open files”.&lt;/p&gt;
&lt;p&gt;By always writing/reading only one file per data result partition and sharing the same opened file channel among all the concurrent data reads from the downstream consumer tasks, Flink’s sort-based blocking shuffle implementation can greatly reduce the number of concurrently opened files.&lt;/p&gt;
&lt;h2 id=&quot;create-more-sequential-disk-io&quot;&gt;Create more sequential disk IO&lt;/h2&gt;
&lt;p&gt;Although the hash-based implementation writes and reads each output file sequentially, the large amount of writing and reading can cause random IO because of the large number of files being processed concurrently, which means that reducing the number of files can also achieve more sequential IO.&lt;/p&gt;
&lt;p&gt;In addition to producing larger files, there are some other optimizations implemented by Flink. In the data writing phase, by merging small output data together into larger batches and writing through the writev system call, more sequential write IO can be achieved. In the data reading phase, more sequential read IO is achieved through IO scheduling. In short, Flink always tries to read data in file offset order, which maximizes sequential reads. Please refer to the IO scheduling section for more information.&lt;/p&gt;
&lt;h2 id=&quot;have-less-disk-io-amplification&quot;&gt;Have less disk IO amplification&lt;/h2&gt;
&lt;p&gt;The sort-merge approach can reduce the number of files and produce larger data blocks by merging the spilled data files together. One downside of this approach is that it writes and reads the same data multiple times because of the data merging and, theoretically, it may also take up more storage space than the total size of the shuffle data.&lt;/p&gt;
&lt;p&gt;Flink’s implementation eliminates the data merging phase by spilling all data of one data result partition together into one file. As a result, the total amount of disk IO can be reduced, as well as the storage space. Without the data merging, however, the data blocks are not merged into larger ones. With the IO scheduling technique, Flink can still achieve good sequential reading and high disk IO throughput. The benchmark results from the &lt;a href=&quot;/2021/10/26/sort-shuffle-part1#benchmark-results-on-stability-and-performance&quot;&gt;first part&lt;/a&gt; show this.&lt;/p&gt;
&lt;h2 id=&quot;decouple-memory-consumption-from-parallelism&quot;&gt;Decouple memory consumption from parallelism&lt;/h2&gt;
&lt;p&gt;Similar to the sort-merge implementation in other distributed data processing systems, Flink’s implementation uses a fixed-size (configurable) in-memory buffer for data sorting. The buffer does not need to be enlarged when the task parallelism changes, though increasing its size may lead to better performance for large-scale batch jobs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This only decouples the memory consumption from the parallelism at the data producer side. On the data consumer side, there is an improvement which works for both streaming and batch jobs (see &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16428&quot;&gt;FLINK-16428&lt;/a&gt;).&lt;/p&gt;
&lt;h1 id=&quot;implementation-details&quot;&gt;Implementation details&lt;/h1&gt;
&lt;p&gt;Here are several core components and algorithms implemented in Flink’s sort-based blocking shuffle:&lt;/p&gt;
&lt;h2 id=&quot;in-memory-sort&quot;&gt;In-memory sort&lt;/h2&gt;
&lt;p&gt;In the sort-spill phase, data records are serialized to the in-memory sort buffer first. When the sort buffer is full or all output has been finished, the data in the sort buffer will be copied and spilled into the target data file in the specific order. The following is the sort buffer interface in Flink:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SortBuffer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;cm&quot;&gt;/** Appends data of the specified channel to this SortBuffer. */&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ByteBuffer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;targetChannel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;DataType&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IOException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;cm&quot;&gt;/** Copies data in this SortBuffer to the target MemorySegment. */&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BufferWithChannel&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;copyIntoSegment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MemorySegment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;target&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;numRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;numBytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;hasRemaining&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;finish&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isFinished&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;release&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isReleased&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Currently, Flink does not need to sort records by key on the data producer side, so the default implementation of the sort buffer only sorts data by subpartition index, which is achieved by binary bucket sort. More specifically, each data record is serialized and prefixed with a 16-byte binary header. Among the 16 bytes, 4 bytes are for the record length, 4 bytes are for the data type (event or data buffer) and 8 bytes are for a pointer to the next record belonging to the same subpartition, that is, the next record to be consumed by the same downstream data consumer. When reading data from the sort buffer, all records of the same subpartition are copied one by one by following the pointers in the record headers, which guarantees that, for each subpartition, records are read/spilled in the same order in which they were emitted by the producer task. The following picture shows the internal structure of the in-memory binary sort buffer:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-10-26-sort-shuffle/1.jpg&quot; width=&quot;70%&quot; /&gt;
&lt;/center&gt;
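&lt;p&gt;To make the header layout concrete, here is a small sketch of how such a 16-byte header could be written in front of a serialized record. It only illustrates the layout described above and is not Flink’s actual serialization code; the method and parameter names are made up for this example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative only: | 4 bytes length | 4 bytes data type | 8 bytes pointer to next record of the same subpartition |
void writeRecordHeader(java.nio.ByteBuffer buffer, int recordLength, int dataType, long nextRecordPointer) {
    buffer.putInt(recordLength);       // length of the serialized record that follows
    buffer.putInt(dataType);           // event or data buffer
    buffer.putLong(nextRecordPointer); // address of the next record belonging to the same subpartition
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;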
&lt;h2 id=&quot;storage-structure&quot;&gt;Storage structure&lt;/h2&gt;
&lt;p&gt;The data of each blocking result partition is stored as a physical data file on the disk. The data file consists of multiple data regions; each data spilling produces one data region. In each data region, the data is clustered by the subpartition ID (index) and each subpartition corresponds to one data consumer.&lt;/p&gt;
&lt;p&gt;The following picture shows the structure of a simple data file. This data file has three data regions (R1, R2, R3) and three consumers (C1, C2, C3). Data blocks B1.1, B2.1 and B3.1 will be consumed by C1, data blocks B1.2, B2.2 and B3.2 will be consumed by C2, and data blocks B1.3, B2.3 and B3.3 will be consumed by C3.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-10-26-sort-shuffle/2.jpg&quot; width=&quot;60%&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;In addition to the data file, for each result partition, there is also an index file which contains pointers to the data file. The index file has the same number of regions as the data file. In each region, there are n (equal to the number of subpartitions) index entries. Each index entry consists of two parts: the file offset of the target data in the data file and the data size. To reduce the disk IO caused by index data file access, Flink caches the index data using unmanaged heap memory if the index data file size is less than 4M. The following picture illustrates the relationship between the index file and the data file:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-10-26-sort-shuffle/4.jpg&quot; width=&quot;60%&quot; /&gt;
&lt;/center&gt;
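&lt;p&gt;Because every region of the index file contains exactly one index entry per subpartition, locating the data for a given consumer boils down to simple offset arithmetic. The following sketch assumes an entry layout of an 8-byte file offset followed by an 8-byte data size; the constant and the names are assumptions made for illustration, not Flink’s actual index format:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative sketch: look up the index entry for (regionIndex, subpartitionIndex).
static final int INDEX_ENTRY_SIZE = 16; // assumed: 8-byte file offset + 8-byte data size

static long[] readIndexEntry(java.nio.ByteBuffer indexData, int regionIndex, int subpartitionIndex, int numSubpartitions) {
    int position = (regionIndex * numSubpartitions + subpartitionIndex) * INDEX_ENTRY_SIZE;
    long fileOffset = indexData.getLong(position);   // where this subpartition's data starts in the data file
    long dataSize = indexData.getLong(position + 8); // how many bytes belong to this subpartition in this region
    return new long[] {fileOffset, dataSize};
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;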
&lt;h2 id=&quot;io-scheduling&quot;&gt;IO scheduling&lt;/h2&gt;
&lt;p&gt;Based on the storage structure described above, we introduced the IO scheduling technique to achieve more sequential reads for the sort-based blocking shuffle in Flink. The core idea behind IO scheduling is pretty simple. Just like the &lt;a href=&quot;https://en.wikipedia.org/wiki/Elevator_algorithm&quot;&gt;elevator algorithm&lt;/a&gt; for disk scheduling, the IO scheduling for sort-based blocking shuffle always tries to serve data read requests in the file offset order. More formally, we have n data regions indexed from 0 to n-1 in a result partition file. In each data region, there are m data subpartitions to be consumed by m downstream data consumers. These data consumers read data concurrently.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// let data_regions as the data region list indexed from 0 to n - 1&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// let data_readers as the concurrent downstream data readers queue indexed from 0 to m - 1&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_region&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_regions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data_reader&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;poll_reader_of_the_smallest_file_offset&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_readers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_reader&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;reading_buffers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request_reading_buffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reading_buffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;isEmpty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;read_data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_region&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_reader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reading_buffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;broadcast-optimization&quot;&gt;Broadcast optimization&lt;/h2&gt;
&lt;p&gt;Shuffle data broadcast in Flink refers to sending the same collection of data to all the downstream data consumers. Instead of copying and writing the same data multiple times, Flink optimizes this process by copying and spilling the broadcast data only once, which improves the data broadcast performance.&lt;/p&gt;
&lt;p&gt;More specifically, when broadcasting a data record to the sort buffer, the record will be copied and stored once. A similar thing happens when spilling the broadcast data into files. For index data, the only difference is that all the index entries for different downstream consumers point to the same data in the data file.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-10-26-sort-shuffle/5.jpg&quot; width=&quot;85%&quot; /&gt;
&lt;/center&gt;
&lt;h2 id=&quot;data-compression&quot;&gt;Data compression&lt;/h2&gt;
&lt;p&gt;Data compression is a simple but really useful technique to improve blocking shuffle performance. Similar to the data compression implementation of the hash-based blocking shuffle, data is compressed per buffer after it is copied from the in-memory sort buffer and before it is spilled to disk. If the data size becomes even larger after compression, the original uncompressed data buffer will be kept. Then the corresponding downstream data consumers are responsible for decompressing the received shuffle data when processing it. In fact, the sort-based blocking shuffle reuses those building blocks implemented for the hash-based blocking shuffle directly. The following picture illustrates the shuffle data compression process:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2021-10-26-sort-shuffle/3.jpg&quot; width=&quot;85%&quot; /&gt;
&lt;/center&gt;
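&lt;p&gt;The per-buffer decision can be sketched as follows. The compressor interface used here is only a stand-in for illustration, not Flink’s actual compression API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative only: compress one shuffle buffer, but keep the original
// bytes if compression does not actually reduce the size.
interface BlockCompressor {
    byte[] compress(byte[] data);
}

static byte[] maybeCompress(byte[] uncompressed, BlockCompressor compressor) {
    byte[] compressed = compressor.compress(uncompressed);
    // If the compressed form is not smaller, spill the uncompressed buffer instead.
    return compressed.length &lt; uncompressed.length ? compressed : uncompressed;
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;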
&lt;h1 id=&quot;future-improvements&quot;&gt;Future improvements&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;TCP Connection Reuse:&lt;/strong&gt; This improvement is also useful for streaming applications which can improve the network stability. There are already tickets opened for it: &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22643&quot;&gt;FLINK-22643&lt;/a&gt; and &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15455&quot;&gt;FLINK-15455&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Disks Load Balance:&lt;/strong&gt; In production environments, there are usually multiple disks per node. Better load balancing can lead to better performance; the relevant issues are &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21790&quot;&gt;FLINK-21790&lt;/a&gt; and &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21789&quot;&gt;FLINK-21789&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;External/Remote Shuffle Service:&lt;/strong&gt; Implementing an external/remote shuffle service can further improve the shuffle IO performance because, as a centralized service, it can collect more information and make more optimized decisions, for example, further merging of data destined for the same downstream task, better node-level load balancing, handling of stragglers, shared resources and so on. There are several relevant issues: &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13247&quot;&gt;FLINK-13247&lt;/a&gt;, &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22672&quot;&gt;FLINK-22672&lt;/a&gt;, &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19551&quot;&gt;FLINK-19551&lt;/a&gt; and &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10653&quot;&gt;FLINK-10653&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Enable the Choice of SSD/HDD:&lt;/strong&gt; In production environments, there is usually both SSD and HDD storage. Some jobs may prefer SSD for its faster speed, while others may prefer HDD for its larger capacity and lower price. Enabling the choice of SSD/HDD can improve the usability of Flink’s blocking shuffle.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
</description>
<pubDate>Tue, 26 Oct 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/10/26/sort-shuffle-part2.html</link>
<guid isPermaLink="true">/2021/10/26/sort-shuffle-part2.html</guid>
</item>
<item>
<title>Sort-Based Blocking Shuffle Implementation in Flink - Part One</title>
<description>&lt;p&gt;Part one of this blog post will explain the motivation behind introducing sort-based blocking shuffle, present benchmark results, and provide guidelines on how to use this new feature.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#how-data-gets-passed-around-between-operators&quot; id=&quot;markdown-toc-how-data-gets-passed-around-between-operators&quot;&gt;How data gets passed around between operators&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#motivation-behind-the-sort-based-implementation&quot; id=&quot;markdown-toc-motivation-behind-the-sort-based-implementation&quot;&gt;Motivation behind the sort-based implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#benchmark-results-on-stability-and-performance&quot; id=&quot;markdown-toc-benchmark-results-on-stability-and-performance&quot;&gt;Benchmark results on stability and performance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#how-to-use-this-new-feature&quot; id=&quot;markdown-toc-how-to-use-this-new-feature&quot;&gt;How to use this new feature&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#whats-next&quot; id=&quot;markdown-toc-whats-next&quot;&gt;What’s next?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;how-data-gets-passed-around-between-operators&quot;&gt;How data gets passed around between operators&lt;/h1&gt;
&lt;p&gt;Data shuffling is an important stage in batch processing applications and describes how data is sent from one operator to the next. In this phase, output data of the upstream operator will spill over to persistent storages like disk, then the downstream operator will read the corresponding data and process it. Blocking shuffle means that intermediate results from operator A are not sent immediately to operator B until operator A has completely finished.&lt;/p&gt;
&lt;p&gt;The hash-based and sort-based blocking shuffle are two main blocking shuffle implementations widely adopted by existing distributed data processing frameworks:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Hash-Based Approach:&lt;/strong&gt; The core idea behind the hash-based approach is to write data consumed by different consumer tasks to different files and each file can then serve as a natural boundary for the partitioned data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sort-Based Approach:&lt;/strong&gt; The core idea behind the sort-based approach is to write all the produced data together first and then leverage sorting to cluster data belonging to different data partitions or even keys.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The sort-based blocking shuffle was introduced in Flink 1.12 and further optimized and made production-ready in 1.13 for both stability and performance. We hope you enjoy the improvements and any feedback is highly appreciated.&lt;/p&gt;
&lt;h1 id=&quot;motivation-behind-the-sort-based-implementation&quot;&gt;Motivation behind the sort-based implementation&lt;/h1&gt;
&lt;p&gt;The hash-based blocking shuffle has been supported in Flink for a long time. However, compared to the sort-based approach, it can have several weaknesses:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Stability:&lt;/strong&gt; For batch jobs with high parallelism (tens of thousands of subtasks), the hash-based approach opens many files concurrently while writing or reading data, which can put high pressure on the file system (i.e. maintaining too much file metadata, exhausting inodes or file descriptors). We have encountered many stability issues when running large-scale batch jobs via the hash-based blocking shuffle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance:&lt;/strong&gt; For large-scale batch jobs, the hash-based approach can produce too many small files: for each data shuffle (or connection), the number of output files is (producer parallelism) * (consumer parallelism) and the average size of each file is (shuffle data size) / (number of files). The random IO caused by writing/reading these fragmented files can influence the shuffle performance a lot, especially on spinning disks. See the &lt;a href=&quot;#benchmark-results-on-stability-and-performance&quot;&gt;benchmark results&lt;/a&gt; section for more information. A hypothetical worked example of these formulas follows right after this list.&lt;/li&gt;
&lt;/ol&gt;
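&lt;p&gt;To put these formulas in perspective, here is a hypothetical example (the numbers are made up purely for illustration):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;producer parallelism: 1000
consumer parallelism: 1000
shuffle data size:    1 TiB

number of files   = 1000 * 1000 = 1,000,000
average file size = 1 TiB / 1,000,000 = ~1 MiB
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;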
&lt;p&gt;By introducing the sort-based blocking shuffle implementation, fewer data files will be created and opened, and more sequential reads are done. As a result, better stability and performance can be achieved.&lt;/p&gt;
&lt;p&gt;Moreover, the sort-based implementation can save network buffers for large-scale batch jobs. For the hash-based implementation, the network buffers needed for each output result partition are proportional to the consumers’ parallelism. For the sort-based implementation, the network memory consumption can be decoupled from the parallelism, which means that a fixed size of network memory can satisfy requests for all result partitions, though more network memory may lead to better performance.&lt;/p&gt;
&lt;h1 id=&quot;benchmark-results-on-stability-and-performance&quot;&gt;Benchmark results on stability and performance&lt;/h1&gt;
&lt;p&gt;Aside from the problem of consuming too many file descriptors and inodes mentioned in the above section, the hash-based blocking shuffle also has a known issue of creating too many files which blocks the TaskExecutor’s main thread (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21201&quot;&gt;FLINK-21201&lt;/a&gt;). In addition, some large-scale jobs like q78 and q80 of the tpc-ds benchmark failed to run on the hash-based blocking shuffle in our tests because of the “connection reset by peer” exception which is similar to the issue reported in &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19925&quot;&gt;FLINK-19925&lt;/a&gt; (reading shuffle data by Netty threads can influence network stability).&lt;/p&gt;
&lt;p&gt;We ran the tpc-ds test suite (10T scale with 1050 max parallelism) for both the hash-based and the sort-based blocking shuffle. The results show that the sort-based shuffle performs 2-6 times faster than the hash-based one on spinning disks. If we exclude the computation time, a speedup of up to 10 times can be achieved for some jobs. Here are some performance results of our tests:&lt;/p&gt;
&lt;center&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Jobs&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Time used for Sort-Shuffle (s)&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Time used for Hash-Shuffle (s)&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Speedup Factor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q4.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;986&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;5371&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;5.45&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q11.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;348&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;798&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q14b.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;883&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2129&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.51&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q17.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;269&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;781&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.90&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q23a.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;418&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1199&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.87&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q23b.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;376&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;843&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q25.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;413&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;873&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q29.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;354&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;1038&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.93&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q31.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;223&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;498&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q50.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;215&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;550&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q64.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;217&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;442&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.04&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q74.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;270&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;962&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3.56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q75.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;166&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;713&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;4.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;q93.sql&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;204&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;540&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;2.65&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
The throughput per disk of the new sort-based implementation can reach up to 160MB/s for both writing and reading on our testing nodes:&lt;/p&gt;
&lt;center&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Disk Name&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Disk SDI&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Disk SDJ&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Disk SDK&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Writing Speed (MB/s)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;189&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;173&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;186&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;Reading Speed (MB/s)&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;112&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;154&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;158&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;Note:&lt;/strong&gt; The following table shows the settings of our test cluster. Because each node has a large amount of memory available, jobs with a small shuffle size exchange their shuffle data purely through memory (the page cache). As a result, noticeable performance differences appear only for jobs that shuffle a large amount of data.&lt;/p&gt;
&lt;center&gt;
&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Number of Nodes&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Memory Size Per Node&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Cores Per Node&lt;/th&gt;
&lt;th style=&quot;text-align: center&quot;&gt;Disks Per Node&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center&quot;&gt;12&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;About 400G&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;96&lt;/td&gt;
&lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/center&gt;
&lt;h1 id=&quot;how-to-use-this-new-feature&quot;&gt;How to use this new feature&lt;/h1&gt;
&lt;p&gt;The sort-based blocking shuffle is introduced mainly for large-scale batch jobs but it also works well for batch jobs with low parallelism.&lt;/p&gt;
&lt;p&gt;The sort-based blocking shuffle is not enabled by default. You can enable it by lowering the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#taskmanager-network-sort-shuffle-min-parallelism&quot;&gt;taskmanager.network.sort-shuffle.min-parallelism&lt;/a&gt; config option: for parallelism smaller than this threshold the hash-based blocking shuffle is used, and at or above it the sort-based blocking shuffle is used (the option has no influence on streaming applications). Setting this option to 1 disables the hash-based blocking shuffle entirely.&lt;/p&gt;
&lt;p&gt;For spinning disks and large-scale batch jobs, you should use the sort-based blocking shuffle. For low parallelism (several hundred processes or fewer) on solid state drives, both implementations should be fine.&lt;/p&gt;
&lt;p&gt;There are several other config options that can have an impact on the performance of the sort-based blocking shuffle (an example configuration follows the list):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#taskmanager-network-blocking-shuffle-compression-enabled&quot;&gt;taskmanager.network.blocking-shuffle.compression.enabled&lt;/a&gt;: This enables shuffle data compression, which can reduce both the network and the disk IO with some CPU overhead. It is recommended to enable shuffle data compression unless the data compression ratio is low. It works for both sort-based and hash-based blocking shuffle.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#taskmanager-network-sort-shuffle-min-buffers&quot;&gt;taskmanager.network.sort-shuffle.min-buffers&lt;/a&gt;: This declares the minimum number of network buffers used as the in-memory sort buffer per result partition for data caching and sorting. Increasing the value of this option may improve the blocking shuffle performance. Several hundred megabytes of memory is usually enough for large-scale batch jobs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#taskmanager-memory-framework-off-heap-batch-shuffle-size&quot;&gt;taskmanager.memory.framework.off-heap.batch-shuffle.size&lt;/a&gt;: This configuration defines the maximum memory size that can be used for data reading by the sort-based blocking shuffle per task manager. Increasing the value of this option may improve the shuffle read performance; several hundred megabytes of memory is usually enough for large-scale batch jobs. Because this memory is cut from the framework off-heap memory, you may also need to increase &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#taskmanager-memory-framework-off-heap-size&quot;&gt;taskmanager.memory.framework.off-heap.size&lt;/a&gt; before you increase this value.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
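&lt;p&gt;As a rough orientation, a &lt;code&gt;flink-conf.yaml&lt;/code&gt; snippet combining these options might look as follows (the values are only illustrative starting points, not tuned recommendations):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# Use the sort-based blocking shuffle for every parallelism (1 disables the hash-based shuffle).
taskmanager.network.sort-shuffle.min-parallelism: 1

# Compress shuffle data to reduce disk and network IO (applies to both implementations).
taskmanager.network.blocking-shuffle.compression.enabled: true

# In-memory sort buffer per result partition, measured in network buffers (32 KB each by default).
taskmanager.network.sort-shuffle.min-buffers: 2048

# Memory used for shuffle data reading; it is cut from the framework off-heap memory,
# so the latter may need to grow as well.
taskmanager.memory.framework.off-heap.batch-shuffle.size: 512m
taskmanager.memory.framework.off-heap.size: 640m&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;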
&lt;p&gt;For more information about blocking shuffle in Flink, please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/batch/blocking_shuffle/&quot;&gt;official documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; From the optimization mechanism in &lt;a href=&quot;/2021/10/26/sort-shuffle-part2&quot;&gt;part two&lt;/a&gt;, we can see that the IO scheduling relies on concurrent data read requests from the downstream consumer tasks to achieve more sequential reads. As a result, if the downstream consumer tasks run one at a time (for example, because of limited resources), the advantage brought by IO scheduling disappears, which can hurt performance. We may further optimize this scenario in future versions.&lt;/p&gt;
&lt;h1 id=&quot;whats-next&quot;&gt;What’s next?&lt;/h1&gt;
&lt;p&gt;For details on the design and implementation of this feature, please refer to the &lt;a href=&quot;/2021/10/26/sort-shuffle-part2&quot;&gt;second part&lt;/a&gt; of this blog!&lt;/p&gt;
</description>
<pubDate>Tue, 26 Oct 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/10/26/sort-shuffle-part1.html</link>
<guid isPermaLink="true">/2021/10/26/sort-shuffle-part1.html</guid>
</item>
<item>
<title>Apache Flink 1.13.3 Released</title>
<description>&lt;p&gt;The Apache Flink community released the third bugfix version of the Apache Flink 1.13 series.&lt;/p&gt;
&lt;p&gt;This release includes 136 fixes and minor improvements for Flink 1.13.2. The list below includes bugfixes and improvements. For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12350329&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.13.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Below you can find more information on changes that might affect the behavior of Flink:&lt;/p&gt;
&lt;h2 id=&quot;propagate-unique-keys-for-fromchangelogstream-flink-24033httpsissuesapacheorgjirabrowseflink-24033&quot;&gt;Propagate unique keys for &lt;code&gt;fromChangelogStream&lt;/code&gt; (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-24033&quot;&gt;FLINK-24033&lt;/a&gt;)&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;StreamTableEnvironment.fromChangelogStream&lt;/code&gt; might produce a different stream because primary keys were not properly considered before.&lt;/p&gt;
&lt;h2 id=&quot;table-api-primary-key-feature-was-not-working-correctly-flink-23895httpsissuesapacheorgjirabrowseflink-23895-flink-20374httpsissuesapacheorgjirabrowseflink-20374&quot;&gt;Table API ‘Primary Key’ feature was not working correctly (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23895&quot;&gt;FLINK-23895&lt;/a&gt; &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20374&quot;&gt;FLINK-20374&lt;/a&gt;)&lt;/h2&gt;
&lt;p&gt;Various primary key issues have been fixed that effectively made it impossible to use this feature.
The change might affect savepoint backwards compatibility for affected pipelines.
Pipelines that were not affected should be able to restore from a savepoint without issues.
The resulting changelog stream might be different after these changes.&lt;/p&gt;
&lt;h2 id=&quot;clarify-sourcefunctioncancel-contract-about-interrupting-flink-23527httpsissuesapacheorgjirabrowseflink-23527&quot;&gt;Clarify &lt;code&gt;SourceFunction#cancel()&lt;/code&gt; contract about interrupting (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23527&quot;&gt;FLINK-23527&lt;/a&gt;)&lt;/h2&gt;
&lt;p&gt;The contract of the &lt;code&gt;SourceFunction#cancel()&lt;/code&gt; method with respect to interruptions has been clarified:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The source itself shouldn’t interrupt the source thread.&lt;/li&gt;
&lt;li&gt;The source can expect to not be interrupted during a clean cancellation procedure.&lt;/li&gt;
&lt;/ul&gt;
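&lt;p&gt;For illustration, a minimal source honoring this contract could look like the sketch below, where &lt;code&gt;cancel()&lt;/code&gt; only flips a flag and leaves any interruption to the framework:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

public class CounterSource extends RichSourceFunction&amp;lt;Long&amp;gt; {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext&amp;lt;Long&amp;gt; ctx) throws Exception {
        long counter = 0;
        while (running) {
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(counter++);
            }
            Thread.sleep(10); // stand-in for waiting on an external system
        }
    }

    @Override
    public void cancel() {
        // Only ask the run() loop to stop; the source must not interrupt its own thread.
        running = false;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;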
&lt;h2 id=&quot;taskmanagerslottimeout-falls-back-to-akkaasktimeout-flink-22002httpsissuesapacheorgjirabrowseflink-22002&quot;&gt;&lt;code&gt;taskmanager.slot.timeout&lt;/code&gt; falls back to &lt;code&gt;akka.ask.timeout&lt;/code&gt; (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22002&quot;&gt;FLINK-22002&lt;/a&gt;)&lt;/h2&gt;
&lt;p&gt;The config option &lt;code&gt;taskmanager.slot.timeout&lt;/code&gt; falls now back to &lt;code&gt;akka.ask.timeout&lt;/code&gt; if no value has been configured.&lt;/p&gt;
&lt;h2 id=&quot;increase-akkaasktimeout-for-tests-using-the-minicluster-flink-23906httpsissuesapacheorgjirabrowseflink-23962&quot;&gt;Increase &lt;code&gt;akka.ask.timeout&lt;/code&gt; for tests using the MiniCluster (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23962&quot;&gt;FLINK-23906&lt;/a&gt;)&lt;/h2&gt;
&lt;p&gt;The default &lt;code&gt;akka.ask.timeout&lt;/code&gt; used by the &lt;code&gt;MiniCluster&lt;/code&gt; has been increased to 5 minutes. If you want to use a smaller value, then you have to set it explicitly in the passed configuration.
The change is due to the fact that messages cannot get lost in a single-process minicluster, so this timeout (which otherwise helps to detect message loss in distributed setups) has no benefit here.
The increased timeout reduces the number of false-positive timeouts, for example during heavy tests on loaded CI/CD workers or during debugging.&lt;/p&gt;
</description>
<pubDate>Tue, 19 Oct 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/10/19/release-1.13.3.html</link>
<guid isPermaLink="true">/news/2021/10/19/release-1.13.3.html</guid>
</item>
<item>
<title>Apache Flink 1.14.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Software Foundation recently released its annual report and Apache Flink once again made
it on the list of the top 5 most active projects! This remarkable
activity also shows in the new 1.14.0 release. Once again, more than 200 contributors worked on
over 1,000 issues. We are proud of how this community is consistently moving the project forward.&lt;/p&gt;
&lt;p&gt;This release brings many new features and improvements in areas such as the SQL API, more connector support, checkpointing, and PyFlink.
A major area of changes in this release is the integrated streaming &amp;amp; batch experience. We believe
that, in practice, unbounded stream processing goes hand-in-hand with bounded and batch processing tasks,
because many use cases require processing historic data from various sources alongside streaming data.
Examples are data exploration when developing new applications, bootstrapping state for new applications, training
models to be applied in a streaming application, or re-processing data after fixes/upgrades.&lt;/p&gt;
&lt;p&gt;In Flink 1.14, we finally made it possible to &lt;strong&gt;mix bounded and unbounded streams in an application&lt;/strong&gt;:
Flink now supports taking checkpoints of applications that are partially running and partially finished (some
operators reached the end of the bounded inputs). Additionally, &lt;strong&gt;bounded streams now take a final checkpoint&lt;/strong&gt;
when reaching their end to ensure smooth committing of results in sinks.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;batch execution mode now supports programs that use a mixture of the DataStream API and the SQL/Table API&lt;/strong&gt;
(previously only pure Table/SQL or DataStream programs).&lt;/p&gt;
&lt;p&gt;The unified Source and Sink APIs have gotten an update, and we started &lt;strong&gt;consolidating the connector ecosystem around the unified APIs&lt;/strong&gt;. We added a new &lt;strong&gt;hybrid source&lt;/strong&gt; that can bridge between multiple storage systems.
You can now do things like read old data from Amazon S3 and then switch over to Apache Kafka.&lt;/p&gt;
&lt;p&gt;In addition, this release furthers our initiative in making Flink more self-tuning and
easier to operate, without necessarily requiring a lot of Stream-Processor-specific knowledge.
We started this initiative in the previous release with &lt;a href=&quot;/news/2021/05/03/release-1.13.0.html#reactive-scaling&quot;&gt;reactive scaling&lt;/a&gt;
and are now adding &lt;strong&gt;automatic network memory tuning&lt;/strong&gt; (&lt;em&gt;a.k.a. Buffer Debloating&lt;/em&gt;).
This feature speeds up checkpoints under high load while maintaining high throughput and without
increasing checkpoint size. The mechanism continuously adjusts the network buffers to ensure the best
throughput while having minimal in-flight data. See the &lt;a href=&quot;#buffer-debloating&quot;&gt;Buffer Debloating section&lt;/a&gt;
for more details.&lt;/p&gt;
&lt;p&gt;There are many more improvements and new additions throughout various components, as we discuss below.
We also had to say goodbye to some features that have been superseded by newer ones in recent releases,
most prominently we are &lt;strong&gt;removing the old SQL execution engine&lt;/strong&gt; and are
&lt;strong&gt;removing the active integration with Apache Mesos&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;We hope you like the new release, and we’d be eager to learn about your experience with it: which previously
unsolved problems it solves and which new use cases it unlocks for you.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-unified-batch-and-stream-processing-experience&quot; id=&quot;markdown-toc-the-unified-batch-and-stream-processing-experience&quot;&gt;The Unified Batch and Stream Processing Experience&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#checkpointing-and-bounded-streams&quot; id=&quot;markdown-toc-checkpointing-and-bounded-streams&quot;&gt;Checkpointing and Bounded Streams&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#batch-execution-for-mixed-datastream-and-tablesql-applications&quot; id=&quot;markdown-toc-batch-execution-for-mixed-datastream-and-tablesql-applications&quot;&gt;Batch Execution for mixed DataStream and Table/SQL Applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#hybrid-source&quot; id=&quot;markdown-toc-hybrid-source&quot;&gt;Hybrid Source&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#consolidating-sources-and-sink&quot; id=&quot;markdown-toc-consolidating-sources-and-sink&quot;&gt;Consolidating Sources and Sink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improvements-to-operations&quot; id=&quot;markdown-toc-improvements-to-operations&quot;&gt;Improvements to Operations&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#buffer-debloating&quot; id=&quot;markdown-toc-buffer-debloating&quot;&gt;Buffer debloating&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#fine-grained-resource-management&quot; id=&quot;markdown-toc-fine-grained-resource-management&quot;&gt;Fine-grained Resource Management&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#connectors&quot; id=&quot;markdown-toc-connectors&quot;&gt;Connectors&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#connector-metrics&quot; id=&quot;markdown-toc-connector-metrics&quot;&gt;Connector Metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pulsar-connector&quot; id=&quot;markdown-toc-pulsar-connector&quot;&gt;Pulsar Connector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pyflink&quot; id=&quot;markdown-toc-pyflink&quot;&gt;PyFlink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#performance-improvements-through-chaining&quot; id=&quot;markdown-toc-performance-improvements-through-chaining&quot;&gt;Performance Improvements through Chaining&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#loopback-mode-for-debugging&quot; id=&quot;markdown-toc-loopback-mode-for-debugging&quot;&gt;Loopback Mode for Debugging&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#miscellaneous-improvements&quot; id=&quot;markdown-toc-miscellaneous-improvements&quot;&gt;Miscellaneous Improvements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#goodbye-legacy-sql-engine-and-mesos-support&quot; id=&quot;markdown-toc-goodbye-legacy-sql-engine-and-mesos-support&quot;&gt;Goodbye Legacy SQL Engine and Mesos Support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upgrade-notes&quot; id=&quot;markdown-toc-upgrade-notes&quot;&gt;Upgrade Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-unified-batch-and-stream-processing-experience&quot;&gt;The Unified Batch and Stream Processing Experience&lt;/h1&gt;
&lt;p&gt;One of Flink’s unique characteristics is how it integrates stream- and batch processing,
using unified APIs and a runtime that supports multiple execution paradigms.&lt;/p&gt;
&lt;p&gt;As motivated in the introduction, we believe that stream- and batch processing always go hand in hand. This quote from
a &lt;a href=&quot;https://research.fb.com/wp-content/uploads/2016/11/realtime_data_processing_at_facebook.pdf&quot;&gt;report on Facebook’s streaming infrastructure&lt;/a&gt;
echoes this sentiment nicely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Streaming versus batch processing is not an either/or decision. Originally, all data warehouse
processing at Facebook was batch processing. We began developing Puma and Swift about five years
ago. As we showed in Section […], using a mix of streaming and batch processing can speed up
long pipelines by hours.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Having both the real-time and the historic computations in the same engine also ensures consistency
between semantics and makes results well comparable. Here is an &lt;a href=&quot;https://www.ververica.com/blog/apache-flinks-stream-batch-unification-powers-alibabas-11.11-in-2020&quot;&gt;article by Alibaba&lt;/a&gt;
about unifying business reporting with Apache Flink and getting consistent reports that way.&lt;/p&gt;
&lt;p&gt;While unified streaming &amp;amp; batch are already possible in earlier versions, this release brings
some features that unlock new use cases, as well as a series of quality-of-life improvements.&lt;/p&gt;
&lt;h2 id=&quot;checkpointing-and-bounded-streams&quot;&gt;Checkpointing and Bounded Streams&lt;/h2&gt;
&lt;p&gt;Flink’s checkpointing mechanism could originally only create checkpoints when all tasks in an application’s
DAG were running. This meant that applications using both bounded and unbounded data sources were not really possible.
In addition, applications on bounded inputs that were executed in a streaming way (not in a batch way)
stopped checkpointing towards the end of the processing, when some tasks finished. Without checkpoints, the
latest output data was not committed, resulting in lingering data for exactly-once sinks.&lt;/p&gt;
&lt;p&gt;With &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-147%3A+Support+Checkpoints+After+Tasks+Finished&quot;&gt;FLIP-147&lt;/a&gt;
Flink now supports checkpoints after tasks are finished, and takes a final checkpoint at the end of a
bounded stream, ensuring that all sink data is committed before the job ends (similar to how
&lt;em&gt;stop-with-savepoint&lt;/em&gt; behaves).&lt;/p&gt;
&lt;p&gt;To activate this feature, add &lt;code&gt;execution.checkpointing.checkpoints-after-tasks-finish.enabled: true&lt;/code&gt;
to your configuration. In keeping with the opt-in tradition for big and new features,
this is not activated by default in Flink 1.14. We expect it to become the default mode in the next release.&lt;/p&gt;
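&lt;p&gt;If you prefer to configure this per job rather than in &lt;code&gt;flink-conf.yaml&lt;/code&gt;, a small sketch of doing so programmatically could look like this (the checkpoint interval and the tiny bounded input are just example values):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointsAfterTasksFinishExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setBoolean(&quot;execution.checkpointing.checkpoints-after-tasks-finish.enabled&quot;, true);

        // Pass the flag to the environment; on a cluster the same key can simply
        // be added to flink-conf.yaml instead.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.enableCheckpointing(30_000); // checkpoint every 30 seconds (example value)

        env.fromElements(1, 2, 3).print(); // a bounded input that now also gets a final checkpoint
        env.execute(&quot;checkpoints-after-tasks-finish&quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;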
&lt;p&gt;Background: While the batch execution mode is often the preferable way to run applications over bounded streams,
there are various reasons to use streaming execution mode over bounded streams. For example, the sink being used
might only support streaming execution (e.g., the Kafka sink), or you may want to exploit the streaming-inherent
quasi-ordering-by-time in your application, such as motivated by the &lt;a href=&quot;https://youtu.be/4qSlsYogALo?t=666&quot;&gt;Kappa+ Architecture&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;batch-execution-for-mixed-datastream-and-tablesql-applications&quot;&gt;Batch Execution for mixed DataStream and Table/SQL Applications&lt;/h2&gt;
&lt;p&gt;SQL and the Table API are becoming the default starting points for new projects. The declarative
nature and richness of built-in types and operations make it easy to develop applications fast.
It is not uncommon, however, for developers to eventually hit the limit of SQL’s expressiveness for
certain types of event-driven business logic (or hit the point when it becomes grotesque to express
that logic in SQL).&lt;/p&gt;
&lt;p&gt;At that point, the natural step is to blend in a piece of stateful DataStream API logic, before
switching back to SQL again.&lt;/p&gt;
&lt;p&gt;In Flink 1.14, bounded batch-executed SQL/Table programs can convert their intermediate
Tables to a DataStream, apply some DataStream API operations, and convert them back to a Table.
Under the hood, Flink builds a dataflow DAG mixing declarative optimized SQL execution with batch-executed DataStream logic.
Check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/data_stream_api/#converting-between-datastream-and-table&quot;&gt;documentation&lt;/a&gt; for details.&lt;/p&gt;
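&lt;p&gt;A minimal sketch of this round trip could look like the following (the example table, field names, and filter condition are made up for illustration):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class MixedBatchExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.BATCH); // bounded, batch-executed program
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // A small bounded table standing in for a real bounded source.
        Table orders = tableEnv.fromValues(
                DataTypes.ROW(
                        DataTypes.FIELD(&quot;customer&quot;, DataTypes.STRING()),
                        DataTypes.FIELD(&quot;amount&quot;, DataTypes.INT())),
                Row.of(&quot;alice&quot;, 10), Row.of(&quot;bob&quot;, 25), Row.of(&quot;carol&quot;, 42));

        // Table -&amp;gt; DataStream: apply logic that is awkward to express in SQL ...
        DataStream&amp;lt;Row&amp;gt; bigOrders = tableEnv.toDataStream(orders)
                .filter(row -&amp;gt; (int) row.getField(1) &amp;gt;= 20);

        // ... and DataStream -&amp;gt; Table again, back under the SQL optimizer.
        tableEnv.createTemporaryView(&quot;BigOrders&quot;, tableEnv.fromDataStream(bigOrders));
        tableEnv.executeSql(&quot;SELECT customer, SUM(amount) AS total FROM BigOrders GROUP BY customer&quot;).print();
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;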
&lt;h2 id=&quot;hybrid-source&quot;&gt;Hybrid Source&lt;/h2&gt;
&lt;p&gt;The new &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/hybridsource/&quot;&gt;Hybrid Source&lt;/a&gt;
produces a combined stream from multiple sources, by reading those sources one after the other,
seamlessly switching over from one source to the other.&lt;/p&gt;
&lt;p&gt;The motivating use case for the Hybrid Source was to read streams from tiered storage setups as if there was one
stream that spans all tiers. For example, new data may land in Kafka and is eventually
migrated to S3 (typically in compressed columnar format, for cost efficiency and performance).
The Hybrid Source can read this as one contiguous logical stream, starting with the historic data on S3
and transitioning over to the more recent data in Kafka.&lt;/p&gt;
&lt;figure style=&quot;align-content: center&quot;&gt;
&lt;img src=&quot;/img/blog/2021-09-25-release-1.14.0/hybrid_source.png&quot; style=&quot;display: block; margin-left: auto; margin-right: auto; width: 600px&quot; /&gt;
&lt;/figure&gt;
&lt;p&gt;We believe that this is an exciting step in realizing the full promise of logs and the &lt;em&gt;Kappa Architecture.&lt;/em&gt;
Even if older parts of an event log are physically migrated to different storage
(for reasons such as cost, better compression, faster reads) you can still treat and process it as one
contiguous log.&lt;/p&gt;
&lt;p&gt;Flink 1.14 adds the core functionality of the Hybrid Source. Over the next releases, we expect to add more
utilities and patterns for typical switching strategies.&lt;/p&gt;
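&lt;p&gt;To give an idea of the API, a sketch of the S3-to-Kafka scenario described above might look like this (bucket, topic, and server addresses are placeholders, and the text-line format is just one possible record format):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineFormat;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HybridSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Historic data that has been migrated to S3.
        FileSource&amp;lt;String&amp;gt; historicData = FileSource
                .forRecordStreamFormat(new TextLineFormat(), new Path(&quot;s3://my-bucket/events/&quot;))
                .build();

        // Recent data still sitting in Kafka.
        KafkaSource&amp;lt;String&amp;gt; recentData = KafkaSource.&amp;lt;String&amp;gt;builder()
                .setBootstrapServers(&quot;kafka:9092&quot;)
                .setTopics(&quot;events&quot;)
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // Read the S3 files first, then switch over to the Kafka topic.
        HybridSource&amp;lt;String&amp;gt; hybridSource = HybridSource.builder(historicData)
                .addSource(recentData)
                .build();

        DataStream&amp;lt;String&amp;gt; events =
                env.fromSource(hybridSource, WatermarkStrategy.noWatermarks(), &quot;hybrid-source&quot;);
        events.print();
        env.execute(&quot;hybrid-source-example&quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;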
&lt;h2 id=&quot;consolidating-sources-and-sink&quot;&gt;Consolidating Sources and Sink&lt;/h2&gt;
&lt;p&gt;With the new unified (streaming/batch) source and sink APIs now being stable, we started the
big effort to consolidate all connectors around those APIs. At the same time, we are
better aligning connectors between DataStream and SQL/Table API. First are the &lt;em&gt;Kafka&lt;/em&gt; and
&lt;em&gt;File&lt;/em&gt; Sources and Sinks for the DataStream API.&lt;/p&gt;
&lt;p&gt;The result of this effort (that we expect to span at least 1-2 further releases) will be a much
smoother and more consistent experience for Flink users when connecting to external systems.&lt;/p&gt;
&lt;h1 id=&quot;improvements-to-operations&quot;&gt;Improvements to Operations&lt;/h1&gt;
&lt;h2 id=&quot;buffer-debloating&quot;&gt;Buffer debloating&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Buffer Debloating&lt;/em&gt; is a new technology in Flink that minimizes checkpoint latency and cost.
It does so by automatically tuning the usage of network memory to ensure high throughput,
while minimizing the amount of in-flight data.&lt;/p&gt;
&lt;p&gt;Apache Flink buffers a certain amount of data in its network stack to be able to utilize the
bandwidth of fast networks. A Flink application running with high throughput uses some (or
all) of that memory. Aligned checkpoints flow with the data through the network buffers in milliseconds.&lt;/p&gt;
&lt;p&gt;During (temporary) backpressure from a resource bottleneck such as an external system, data skew, or (temporarily)
increased load, Flink was buffering a lot more data inside its network buffers than necessary to utilize
enough network bandwidth for the application’s current – backpressured – throughput. This actually has an adverse
effect because more buffered data means that the checkpoints need to do more work. Aligned checkpoint barriers
need to wait for more data to be processed, and unaligned checkpoints need to persist more in-flight data.&lt;/p&gt;
&lt;p&gt;This is where &lt;em&gt;Buffer Debloating&lt;/em&gt; comes into play: It changes the network stack from keeping up to X bytes of data
to keeping data that is worth X milliseconds of receiver computing time. With the default setting
of 1000 milliseconds, that means the network stack will buffer as much data as the receiving task can
process in 1000 milliseconds. These values are constantly measured and adjusted, so the system keeps
this characteristic even under varying conditions. As a result, Flink can now provide
stable and predictable alignment times for aligned checkpoints under backpressure, and can vastly
reduce the amount of in-flight data stored in unaligned checkpoints under backpressure.&lt;/p&gt;
&lt;figure style=&quot;align-content: center&quot;&gt;
&lt;img src=&quot;/img/blog/2021-09-25-release-1.14.0/buffer_debloating.svg&quot; style=&quot;display: block; margin-left: auto; margin-right: auto; width: 600px&quot; /&gt;
&lt;/figure&gt;
&lt;p&gt;Buffer Debloating acts as a complementary feature, or even alternative, to unaligned checkpoints.
Check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/memory/network_mem_tuning/#the-buffer-debloating-mechanism&quot;&gt;documentation&lt;/a&gt;
to see how to activate this feature.&lt;/p&gt;
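&lt;p&gt;In essence, activation comes down to a couple of entries in &lt;code&gt;flink-conf.yaml&lt;/code&gt; (the target shown here is simply the default of one second):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# Automatically size the in-flight data to roughly one second of receiver processing time.
taskmanager.network.memory.buffer-debloat.enabled: true
taskmanager.network.memory.buffer-debloat.target: 1 s&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;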
&lt;h2 id=&quot;fine-grained-resource-management&quot;&gt;Fine-grained Resource Management&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Fine-grained resource management&lt;/em&gt; is an advanced new feature that increases the resource
utilization of large shared clusters.&lt;/p&gt;
&lt;p&gt;Flink clusters execute various data processing workloads. Different data processing steps typically need
different resources such as compute resources and memory. For example, most &lt;code&gt;map()&lt;/code&gt; functions are fairly
lightweight, but large windows with long retention can benefit from lots of memory.
By default, Flink manages resources in coarse-grained units called &lt;em&gt;slots&lt;/em&gt;, which are slices
of a TaskManager’s resources. Streaming pipelines fill a slot with one parallel
subtask of each operator, so each slot holds a pipeline of subtasks.
Through &lt;em&gt;‘slot sharing groups’&lt;/em&gt;, users can influence how subtasks are assigned to slots.&lt;/p&gt;
&lt;p&gt;With fine-grained resource management, TaskManager slots can now be dynamically sized.
Transformations and operators specify what resource profiles they would like (CPU size,
memory pools, disk space) and Flink’s Resource Manager and TaskManagers slice off that specific
part of a TaskManager’s total resources. You can think of it as a minimal lightweight resource orchestration
layer within Flink. The figure below illustrates the difference between the current default mode of shared
fixed-size slots and the new fine-grained resource management feature.&lt;/p&gt;
&lt;figure style=&quot;align-content: center&quot;&gt;
&lt;img src=&quot;/img/blog/2021-09-25-release-1.14.0/fine_grained_resource_management.svg&quot; style=&quot;display: block; margin-left: auto; margin-right: auto; width: 600px&quot; /&gt;
&lt;/figure&gt;
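&lt;p&gt;For a first impression of the API, attaching resource profiles to operators via slot sharing groups could look roughly like this (group names, resource sizes, and the toy pipeline are purely illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.operators.SlotSharingGroup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FineGrainedResourcesExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A small group for lightweight, stateless work ...
        SlotSharingGroup lightweight = SlotSharingGroup.newBuilder(&quot;lightweight&quot;)
                .setCpuCores(0.5)
                .setTaskHeapMemoryMB(128)
                .build();

        // ... and a larger group for the heavier, stateful part of the pipeline.
        SlotSharingGroup heavy = SlotSharingGroup.newBuilder(&quot;heavy&quot;)
                .setCpuCores(2.0)
                .setTaskHeapMemoryMB(1024)
                .build();

        env.fromElements(&quot;a&quot;, &quot;bb&quot;, &quot;ccc&quot;)
                .map(String::length).slotSharingGroup(lightweight)
                .keyBy(len -&amp;gt; len % 2)
                .reduce((a, b) -&amp;gt; a + b).slotSharingGroup(heavy)
                .print();

        env.execute(&quot;fine-grained-resources&quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;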
&lt;p&gt;You may be wondering why we implement such a feature in Flink, when we also integrate with full-fledged
resource orchestration frameworks like Kubernetes or YARN. There are several situations where the additional
resource management layer within Flink significantly increases the resource utilization:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For many small slots, the overhead of dedicated TaskManagers is very high (JVM overhead, Flink control data structures).
Slot-sharing implicitly works around this by sharing the slots between all operator types, which means
sharing resources between lightweight operators (which need small slots) and heavyweight operators (which need large slots).
However, this only works well when all operators share the same parallelism, which is not always optimal.
Furthermore, certain operators work better when run in isolation (for example ML training operators
that need dedicated GPU resources).&lt;/li&gt;
&lt;li&gt;Kubernetes and YARN often take quite some time to fulfill requests, especially on loaded clusters.
For many batch jobs, efficiency gets lost while waiting for the requests to be fulfilled.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So when should you use this feature? For most streaming and batch jobs, the default resource management mechanism
is perfectly suitable. Fine-grained resource management can help you increase resource efficiency if you have long-running
streaming jobs, or fast batch jobs, where different stages have different resource requirements, and you may
have already tuned the parallelism of different operators to different values.&lt;/p&gt;
&lt;p&gt;Alibaba’s internal Flink-based platform has used this mechanism for some time now and the resource utilization
of the cluster has improved significantly.&lt;/p&gt;
&lt;p&gt;Please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/finegrained_resource/&quot;&gt;Fine-grained Resource Management documentation&lt;/a&gt;
for details on how to use this feature.&lt;/p&gt;
&lt;h1 id=&quot;connectors&quot;&gt;Connectors&lt;/h1&gt;
&lt;h2 id=&quot;connector-metrics&quot;&gt;Connector Metrics&lt;/h2&gt;
&lt;p&gt;Metrics for connectors have been standardized in this release (see &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics&quot;&gt;FLIP-33&lt;/a&gt;).
The community will gradually pull metrics through all connectors, as we rework them
onto the new unified APIs over the next releases. In Flink 1.14, we cover the Kafka connector
and (partially) the FileSystem connectors.&lt;/p&gt;
&lt;p&gt;Connectors are the entry and exit points for data in a Flink job. If a job is not running as
expected, the connector telemetry is among the first parts to be checked. We believe this will become
a nice improvement when operating Flink applications in production.&lt;/p&gt;
&lt;h2 id=&quot;pulsar-connector&quot;&gt;Pulsar Connector&lt;/h2&gt;
&lt;p&gt;In this release, Flink added the &lt;a href=&quot;https://pulsar.apache.org/&quot;&gt;Apache Pulsar&lt;/a&gt; connector.
The Pulsar connector reads data from Pulsar topics and supports both streaming and batch execution modes.
With the support of the transaction functionality (introduced in Pulsar 2.8.0), the Pulsar connector provides
exactly-once delivery semantics to ensure that a message is delivered exactly once to a consumer,
even if a producer retries sending that message.&lt;/p&gt;
&lt;p&gt;To support the different message-ordering and scaling requirements of different use cases, the Pulsar
source connector exposes four subscription types:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pulsar.apache.org/docs/en/concepts-messaging/#exclusive&quot;&gt;Exclusive&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pulsar.apache.org/docs/en/concepts-messaging/#shared&quot;&gt;Shared&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pulsar.apache.org/docs/en/concepts-messaging/#failover&quot;&gt;Failover&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pulsar.apache.org/docs/en/concepts-messaging/#key_shared&quot;&gt;Key-Shared&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The connector currently supports the DataStream API. Table API/SQL bindings are expected to be
contributed in a future release. For details about how to use the Pulsar connector, see
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/pulsar/#apache-pulsar-connector&quot;&gt;Apache Pulsar Connector&lt;/a&gt;.&lt;/p&gt;
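&lt;p&gt;As a quick taste of the DataStream API usage, a sketch of a Pulsar source with an exclusive subscription could look roughly like this (service/admin URLs, topic, and subscription name are placeholders):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.pulsar.source.PulsarSource;
import org.apache.flink.connector.pulsar.source.enumerator.cursor.StartCursor;
import org.apache.flink.connector.pulsar.source.reader.deserializer.PulsarDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.pulsar.client.api.SubscriptionType;

public class PulsarSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        PulsarSource&amp;lt;String&amp;gt; source = PulsarSource.builder()
                .setServiceUrl(&quot;pulsar://localhost:6650&quot;)
                .setAdminUrl(&quot;http://localhost:8080&quot;)
                .setTopics(&quot;my-topic&quot;)
                .setStartCursor(StartCursor.earliest())
                .setDeserializationSchema(PulsarDeserializationSchema.flinkSchema(new SimpleStringSchema()))
                .setSubscriptionName(&quot;my-subscription&quot;)
                .setSubscriptionType(SubscriptionType.Exclusive)
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), &quot;pulsar-source&quot;).print();
        env.execute(&quot;pulsar-source-example&quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;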
&lt;h1 id=&quot;pyflink&quot;&gt;PyFlink&lt;/h1&gt;
&lt;h2 id=&quot;performance-improvements-through-chaining&quot;&gt;Performance Improvements through Chaining&lt;/h2&gt;
&lt;p&gt;Similar to how the Java APIs chain transformation functions/operators within a task to avoid
serialization overhead, PyFlink now chains Python functions. In PyFlink’s case, the
chaining not only eliminates serialization overhead, but also reduces RPC round trips
between the Java and Python processes. This provides a significant
boost to PyFlink’s overall performance.&lt;/p&gt;
&lt;p&gt;Python function chaining was already available for Python UDFs used in the Table API &amp;amp; SQL.
In Flink 1.14, chaining is also exploited for the cPython functions in the Python DataStream API.&lt;/p&gt;
&lt;h2 id=&quot;loopback-mode-for-debugging&quot;&gt;Loopback Mode for Debugging&lt;/h2&gt;
&lt;p&gt;Python functions are normally executed in a separate Python process next to Flink’s JVM.
This architecture makes it difficult to debug Python code.&lt;/p&gt;
&lt;p&gt;PyFlink 1.14 introduces a &lt;em&gt;loopback mode&lt;/em&gt;, which is activated by default for local deployments.
In this mode, user-defined Python functions will be executed in the Python process of the client,
which is the entry point process that starts the PyFlink program and contains the DataStream API and
Table API code that builds the dataflow DAG. Users can now easily debug their Python functions
by setting breakpoints in their IDEs when launching a PyFlink job locally.&lt;/p&gt;
&lt;h2 id=&quot;miscellaneous-improvements&quot;&gt;Miscellaneous Improvements&lt;/h2&gt;
&lt;p&gt;There are also many other improvements to PyFlink, such as support for executing
jobs in YARN application mode and support for compressed tgz files as Python archives.
Check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/overview/&quot;&gt;Python API documentation&lt;/a&gt;
for more details.&lt;/p&gt;
&lt;h1 id=&quot;goodbye-legacy-sql-engine-and-mesos-support&quot;&gt;Goodbye Legacy SQL Engine and Mesos Support&lt;/h1&gt;
&lt;p&gt;Maintaining an open source project also means sometimes saying good-bye to some beloved features.&lt;/p&gt;
&lt;p&gt;When we added the Blink SQL Engine to Flink more than two years ago, it was clear that it would
eventually replace the previous SQL engine. Blink was faster and more feature-complete.
For a year now, Blink has been the default SQL engine. With Flink 1.14 we finally remove all
code from the previous SQL engine. This allowed us to drop many outdated interfaces and reduce
confusion for users about which interfaces to use when implementing custom connectors or functions.
It will also help us in the future to make faster changes to the SQL engine.&lt;/p&gt;
&lt;p&gt;The active integration with Apache Mesos was also removed, because we saw little interest by
users in this feature and we could not gather enough contributors willing to help maintain this
part of the system. Flink 1.14 can no longer run on Mesos without the help of projects like Marathon,
and the Flink Resource Manager can no longer request and release resources from Mesos for workloads
with changing resource requirements.&lt;/p&gt;
&lt;h1 id=&quot;upgrade-notes&quot;&gt;Upgrade Notes&lt;/h1&gt;
&lt;p&gt;While we aim to make upgrades as smooth as possible, some of the changes require users
to adjust some parts of the program when upgrading to Apache Flink 1.14.
Please take a look at the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/release-notes/flink-1.14/&quot;&gt;release notes&lt;/a&gt;
for a list of adjustments to make and issues to check during upgrades.&lt;/p&gt;
&lt;h1 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h1&gt;
&lt;p&gt;The Apache Flink community would like to thank each one of the contributors that have made this
release possible:&lt;/p&gt;
&lt;p&gt;adavis9592, Ada Wong, aidenma, Aitozi, Ankush Khanna, anton, Anton Kalashnikov, Arvid Heise, Ashwin
Kolhatkar, Authuir, bgeng777, Brian Zhou, camile.sing, caoyingjie, Cemre Mengu, chennuo, Chesnay
Schepler, chuixue, CodeCooker17, comsir, Daisy T, Danny Cranmer, David Anderson, David Moravek,
Dawid Wysakowicz, dbgp2021, Dian Fu, Dong Lin, Edmondsky, Elphas Toringepi, Emre Kartoglu, ericliuk,
Eron Wright, est08zw, Etienne Chauchot, Fabian Paul, fangliang, fangyue1, fengli, Francesco
Guardiani, FuyaoLi2017, fuyli, Gabor Somogyi, gaoyajun02, Gen Luo, gentlewangyu, GitHub, godfrey he,
godfreyhe, gongzhongqiang, Guokuai Huang, GuoWei Ma, Gyula Fora, hackergin, hameizi, Hang Ruan, Han
Wei, hapihu, hehuiyuan, hstdream, Huachao Mao, HuangXiao, huangxingbo, huxixiang, Ingo Bürk,
Jacklee, Jan Brusch, Jane, Jane Chan, Jark Wu, JasonLee, Jiajie Zhong, Jiangjie (Becket) Qin,
Jianzhang Chen, Jiayi Liao, Jing, Jingsong Lee, JingsongLi, Jing Zhang, jinxing64, junfan.zhang, Jun
Qin, Jun Zhang, kanata163, Kevin Bohinski, kevin.cyj, Kevin Fan, Kurt Young, kylewang, Lars
Bachmann, lbb, LB Yu, LB-Yu, LeeJiangchuan, Leeviiii, leiyanfei, Leonard Xu, LightGHLi, Lijie Wang,
liliwei, lincoln lee, Linyu, liuyanpunk, lixiaobao14, luoyuxia, Lyn Zhang, lys0716, MaChengLong,
mans2singh, Marios Trivyzas, martijnvisser, Matthias Pohl, Mayi, mayue.fight, Michael Li, Michal
Ciesielczyk, Mika, Mika Naylor, MikuSugar, movesan, Mulan, Nico Kruber, Nicolas Raga, Nicolaus
Weidner, paul8263, Paul Lin, pierre xiong, Piotr Nowojski, Qingsheng Ren, Rainie Li, Robert Metzger,
Roc Marshal, Roman, Roman Khachatryan, Rui Li, sammieliu, sasukerui, Senbin Lin, Senhong Liu, Serhat
Soydan, Seth Wiesman, sharkdtu, Shengkai, Shen Zhu, shizhengchao, Shuo Cheng, shuo.cs, simenliuxing,
sjwiesman, Srinivasulu Punuru, Stefan Gloutnikov, SteNicholas, Stephan Ewen, sujun, sv3ndk, Svend
Vanderveken, syhily, Tartarus0zm, Terry Wang, Thesharing, Thomas Weise, tiegen, Till Rohrmann, Timo
Walther, tison, Tony Wei, trushev, tsreaper, TsReaper, Tzu-Li (Gordon) Tai, wangfeifan, wangwei1025,
wangxianghu, wangyang0918, weizheng92, Wenhao Ji, Wenlong Lyu, wenqiao, WilliamSong11, wuren,
wysstartgo, Xintong Song, yanchenyun, yangminghua, yangqu, Yang Wang, Yangyang ZHANG, Yangze Guo,
Yao Zhang, yfhanfei, yiksanchan, Yik San Chan, Yi Tang, yljee, Youngwoo Kim, Yuan Mei, Yubin Li,
Yufan Sheng, yulei0824, Yun Gao, Yun Tang, yuxia Luo, Zakelly, zhang chaoming, zhangjunfan,
zhangmang, zhangzhengqi3, zhao_wei_nan, zhaown, zhaoxing, ZhiJie Yang, Zhilong Hong, Zhiwen Sun, Zhu
Zhu, zlzhang0122, zoran, Zor X. LIU, zoucao, Zsombor Chikan, 子扬, 莫辞&lt;/p&gt;
</description>
<pubDate>Wed, 29 Sep 2021 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/09/29/release-1.14.0.html</link>
<guid isPermaLink="true">/news/2021/09/29/release-1.14.0.html</guid>
</item>
<item>
<title>Implementing a custom source connector for Table API and SQL - Part Two </title>
<description>&lt;p&gt;In &lt;a href=&quot;/2021/09/07/connector-table-sql-api-part1&quot;&gt;part one&lt;/a&gt; of this tutorial, you learned how to build a custom source connector for Flink. In part two, you will learn how to integrate the connector with a test email inbox through the IMAP protocol and filter out emails using Flink SQL.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#goals&quot; id=&quot;markdown-toc-goals&quot;&gt;Goals&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#prerequisites&quot; id=&quot;markdown-toc-prerequisites&quot;&gt;Prerequisites&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#understand-how-to-fetch-emails-via-the-imap-protocol&quot; id=&quot;markdown-toc-understand-how-to-fetch-emails-via-the-imap-protocol&quot;&gt;Understand how to fetch emails via the IMAP protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#add-configuration-options---server-information-and-credentials&quot; id=&quot;markdown-toc-add-configuration-options---server-information-and-credentials&quot;&gt;Add configuration options - server information and credentials&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#connect-to-the-source-email-server&quot; id=&quot;markdown-toc-connect-to-the-source-email-server&quot;&gt;Connect to the source email server&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#collect-incoming-emails&quot; id=&quot;markdown-toc-collect-incoming-emails&quot;&gt;Collect incoming emails&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#test-the-connector-with-a-real-mail-server-on-the-ververica-platform&quot; id=&quot;markdown-toc-test-the-connector-with-a-real-mail-server-on-the-ververica-platform&quot;&gt;Test the connector with a real mail server on the Ververica Platform&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;goals&quot;&gt;Goals&lt;/h1&gt;
&lt;p&gt;Part two of the tutorial will teach you how to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;integrate a source connector which connects to a mailbox using the IMAP protocol&lt;/li&gt;
&lt;li&gt;use &lt;a href=&quot;https://eclipse-ee4j.github.io/mail/&quot;&gt;Jakarta Mail&lt;/a&gt;, a Java library that can send and receive email via the IMAP protocol&lt;/li&gt;
&lt;li&gt;write &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/overview/&quot;&gt;Flink SQL&lt;/a&gt; and execute the queries in the &lt;a href=&quot;https://www.ververica.com/apache-flink-sql-on-ververica-platform&quot;&gt;Ververica Platform&lt;/a&gt; for a nicer visualization&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You are encouraged to follow along with the code in this &lt;a href=&quot;https://github.com/Airblader/blog-imap&quot;&gt;repository&lt;/a&gt;. It provides a boilerplate project that also comes with a bundled &lt;a href=&quot;https://docs.docker.com/compose/&quot;&gt;docker-compose&lt;/a&gt; setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.&lt;/p&gt;
&lt;h1 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h1&gt;
&lt;p&gt;This tutorial assumes that you have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;followed the steps outlined in &lt;a href=&quot;/2021/09/07/connector-table-sql-api-part1&quot;&gt;part one&lt;/a&gt; of this tutorial&lt;/li&gt;
&lt;li&gt;some familiarity with Java and object-oriented programming&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;understand-how-to-fetch-emails-via-the-imap-protocol&quot;&gt;Understand how to fetch emails via the IMAP protocol&lt;/h1&gt;
&lt;p&gt;Now that you have a working source connector that can run on Flink, it is time to connect to an email server via &lt;a href=&quot;https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol&quot;&gt;IMAP&lt;/a&gt; (an Internet protocol that allows email clients to retrieve messages from a mail server) so that Flink can process emails instead of test static data.&lt;/p&gt;
&lt;p&gt;You will use &lt;a href=&quot;https://eclipse-ee4j.github.io/mail/&quot;&gt;Jakarta Mail&lt;/a&gt;, a Java library that can be used to send and receive email via IMAP. For simplicity, authentication will use a plain username and password.&lt;/p&gt;
&lt;p&gt;This tutorial will focus more on how to implement a connector for Flink. If you want to learn more about the details of how IMAP or Jakarta Mail work, you are encouraged to explore a more extensive implementation at this &lt;a href=&quot;https://github.com/TNG/flink-connector-email&quot;&gt;repository&lt;/a&gt;. It offers a wide range of information to be read from emails, as well as options to ingest existing emails alongside new ones, connecting with SSL, and more. It also supports different formats for reading email content and implements some connector abilities such as &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/abilities/SupportsReadingMetadata.html&quot;&gt;reading metadata&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In order to fetch emails, you will need to connect to the email server, register a listener for new emails and collect them whenever they arrive, and enter a loop to keep the connector running.&lt;/p&gt;
&lt;h1 id=&quot;add-configuration-options---server-information-and-credentials&quot;&gt;Add configuration options - server information and credentials&lt;/h1&gt;
&lt;p&gt;In order to connect to your IMAP server, you will need at least the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;hostname (of the mail server)&lt;/li&gt;
&lt;li&gt;port number&lt;/li&gt;
&lt;li&gt;username&lt;/li&gt;
&lt;li&gt;password&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You will start by creating a class to encapsulate the configuration options. You will make use of &lt;a href=&quot;https://projectlombok.org&quot;&gt;Lombok&lt;/a&gt; to help with some boilerplate code. By adding the &lt;code&gt;@Data&lt;/code&gt; and &lt;code&gt;@SuperBuilder&lt;/code&gt; annotations, Lombok will generate these for all the fields of the immutable class.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lombok.Data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lombok.experimental.SuperBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;javax.annotation.Nullable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.io.Serializable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Data&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@SuperBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;toBuilder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Serializable&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;serialVersionUID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1L&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@Nullable&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@Nullable&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@Nullable&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now you can add an instance of this class to the &lt;code&gt;ImapSource&lt;/code&gt; and &lt;code&gt;ImapTableSource&lt;/code&gt; classes previously created (in part one) so it can be used there. Take note of the column names with which the table has been created. This will help later. You will also switch the source to be unbounded now as we will change the implementation in a bit to continuously listen for new emails.&lt;/p&gt;
&lt;div class=&quot;note&quot;&gt;
&lt;h5&gt;Hint&lt;/h5&gt;
&lt;p&gt;The column names would be &quot;subject&quot; and &quot;content&quot; with the SQL executed in part one:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE TABLE T (subject STRING, content STRING) WITH (&#39;connector&#39; = &#39;imap&#39;);&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.stream.Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;options&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;columnNames&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;String:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;toUpperCase&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.DynamicTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.ScanTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.SourceFunctionProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapTableSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ScanTableSource&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;options&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;columnNames&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ScanRuntimeProvider&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getScanRuntimeProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ScanContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bounded&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SourceFunctionProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bounded&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSource&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Finally, in the &lt;code&gt;ImapTableSourceFactory&lt;/code&gt; class, you need to create a &lt;code&gt;ConfigOption&amp;lt;&amp;gt;&lt;/code&gt; for the hostname, port number, username, and password. Then you need to report them to Flink. Host, user, and password are mandatory and can be added to &lt;code&gt;requiredOptions()&lt;/code&gt;; the port is optional and can be added to &lt;code&gt;optionalOptions()&lt;/code&gt; instead.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.configuration.ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.configuration.ConfigOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.factories.DynamicTableSourceFactory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapTableSourceFactory&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSourceFactory&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HOST&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;host&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stringType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;noDefaultValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PORT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;port&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;intType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;noDefaultValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stringType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;noDefaultValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PASSWORD&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConfigOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;password&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stringType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;noDefaultValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requiredOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HOST&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PASSWORD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optionalOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PORT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now take a look at the &lt;code&gt;createDynamicTableSource()&lt;/code&gt; function in the &lt;code&gt;ImapTableSourceFactory&lt;/code&gt; class. Recall that previously (in part one) you used a small helper utility, &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/factories/FactoryUtil.TableFactoryHelper.html&quot;&gt;TableFactoryHelper&lt;/a&gt;, which Flink offers to ensure that required options are set and that no unknown options are provided. You can now use it to automatically verify that the required options of hostname, username, and password are all provided when creating a table using this connector; the helper will throw an error if a required option is missing. You can also use it to access the provided options (&lt;code&gt;getOptions()&lt;/code&gt;), convert them into an instance of the &lt;code&gt;ImapSourceOptions&lt;/code&gt; class created earlier, and pass that instance to the table source:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.stream.Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.factories.DynamicTableSourceFactory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.factories.FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.catalog.Column&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapTableSourceFactory&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSourceFactory&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSource&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;createDynamicTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;TableFactoryHelper&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;createTableFactoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;validate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSourceOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HOST&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PORT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PASSWORD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getCatalogTable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getResolvedSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getColumns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;Column:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isPhysical&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;Column:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;note&quot;&gt;
&lt;h5&gt;Hint&lt;/h5&gt;
&lt;p&gt;
Ideally, you would use connector &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/overview/#metadata&quot;&gt;metadata&lt;/a&gt; instead of column names. You can refer again to the accompanying &lt;a href=&quot;https://github.com/TNG/flink-connector-email&quot;&gt;repository&lt;/a&gt; which does implement this using metadata fields.
&lt;/p&gt;
&lt;/div&gt;
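&lt;p&gt;To give a rough idea of that alternative, the sketch below lets the table source implement the &lt;code&gt;SupportsReadingMetadata&lt;/code&gt; interface and expose the email fields as metadata keys instead of deriving them from physical column names. It is only an illustration, not part of the code built in this tutorial: the metadata key names are an assumption here, and the accompanying repository remains the reference for a complete implementation.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.connector.source.ScanTableSource;
import org.apache.flink.table.connector.source.abilities.SupportsReadingMetadata;
import org.apache.flink.table.types.DataType;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// Sketch only: expose email fields as connector metadata rather than
// reading them from the physical column names of the table.
public class ImapTableSource implements ScanTableSource, SupportsReadingMetadata {
private List&amp;lt;String&amp;gt; metadataKeys = Collections.emptyList();
@Override
public Map&amp;lt;String, DataType&amp;gt; listReadableMetadata() {
// Declare which metadata keys this source can provide (key names assumed).
final Map&amp;lt;String, DataType&amp;gt; metadata = new HashMap&amp;lt;&amp;gt;();
metadata.put(&quot;subject&quot;, DataTypes.STRING());
metadata.put(&quot;content&quot;, DataTypes.STRING());
return metadata;
}
@Override
public void applyReadableMetadata(List&amp;lt;String&amp;gt; metadataKeys, DataType producedDataType) {
// Remember the metadata fields the query actually selected and pass them
// on to the runtime source, analogous to columnNames above.
this.metadataKeys = metadataKeys;
}
// …
}&lt;/code&gt;&lt;/pre&gt;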
&lt;p&gt;To test these new configuration options, run:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;testing/
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./build_and_run.sh&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once you see the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/&quot;&gt;Flink SQL client&lt;/a&gt; start up, execute the following statements to create a table with your connector:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;imap&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This time it will fail because the required options are not provided:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: One or more required options are missing.
Missing required options are:
host
password
user
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1 id=&quot;connect-to-the-source-email-server&quot;&gt;Connect to the source email server&lt;/h1&gt;
&lt;p&gt;Now that you have defined the options required for connecting to the email server, it is time to actually establish that connection.&lt;/p&gt;
&lt;p&gt;Going back to the &lt;code&gt;ImapSource&lt;/code&gt; class, you first need to convert the options given to the table source into a &lt;a href=&quot;https://docs.oracle.com/javase/tutorial/essential/environment/properties.html&quot;&gt;Properties&lt;/a&gt; object, which is what you can pass to the Jakarta library. You can set various other properties here as well (e.g. enabling SSL).&lt;/p&gt;
&lt;p&gt;The specific properties that the Jakarta library understands are documented &lt;a href=&quot;https://jakarta.ee/specifications/mail/1.6/apidocs/index.html?com/sun/mail/imap/package-summary.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.Properties&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Properties&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getSessionProperties&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Properties&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;props&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Properties&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;mail.store.protocol&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;imap&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;mail.imap.auth&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;mail.imap.host&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getHost&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getPort&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;mail.imap.port&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getPort&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
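&lt;p&gt;If your mail server required an encrypted connection, you could additionally enable SSL in this method through the corresponding Jakarta Mail property. The following is only a sketch of that variant; the GreenMail test setup used later in this tutorial speaks plain IMAP, so the tutorial code does not set this property.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: a variant of getSessionProperties() that also enables SSL.
// Not used with the local GreenMail test setup, which runs plain IMAP.
private Properties getSessionProperties() {
Properties props = new Properties();
props.put(&quot;mail.store.protocol&quot;, &quot;imap&quot;);
props.put(&quot;mail.imap.auth&quot;, true);
props.put(&quot;mail.imap.ssl.enable&quot;, true);
props.put(&quot;mail.imap.host&quot;, options.getHost());
if (options.getPort() != null) {
props.put(&quot;mail.imap.port&quot;, options.getPort());
}
return props;
}&lt;/code&gt;&lt;/pre&gt;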
&lt;p&gt;Now create a method (&lt;code&gt;connect()&lt;/code&gt;) which sets up the connection:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;jakarta.mail.*&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;com.sun.mail.imap.IMAPFolder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Store&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IMAPFolder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Session&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Session&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getSessionProperties&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getStore&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getUser&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getPassword&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Folder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;genericFolder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getFolder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;INBOX&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IMAPFolder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;genericFolder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;isOpen&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;READ_ONLY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can now use this method to connect to the mail server when the source is started. Create a loop that keeps the source running and periodically polls the email count. Lastly, implement methods to cancel and close the connection:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;jakarta.mail.*&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;volatile&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;running&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;running&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// TODO: Listen for new messages&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;running&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Trigger some IMAP request to force the server to send a notification&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getMessageCount&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Thread&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;sleep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;250&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;cancel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;running&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Every loop iteration triggers a request to the server. This is crucial, as it ensures that the server keeps sending notifications. A more sophisticated approach would be to make use of the IDLE protocol.&lt;/p&gt;
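&lt;p&gt;As a rough sketch of that idea (again, not part of the code built here), the polling loop in &lt;code&gt;run()&lt;/code&gt; could be replaced with calls to &lt;code&gt;IMAPFolder#idle()&lt;/code&gt;, which blocks until the server pushes a notification:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: use IMAP IDLE instead of polling the message count.
// folder.idle() blocks until the server sends a notification (e.g. new mail),
// so the loop only wakes up when something actually happened.
while (running) {
folder.idle();
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that with this approach, cancelling the source also needs a way to interrupt the blocking &lt;code&gt;idle()&lt;/code&gt; call, which the simple polling loop above sidesteps.&lt;/p&gt;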
&lt;div class=&quot;note&quot;&gt;
&lt;h5&gt;Note&lt;/h5&gt;
&lt;p&gt;Since the source is not checkpointable, no state fault tolerance will be possible.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;collect-incoming-emails&quot;&gt;Collect incoming emails&lt;/h2&gt;
&lt;p&gt;Now you need to listen for new emails arriving in the inbox folder and collect them. To begin, hardcode the schema and only return the email’s subject. Fortunately, Jakarta provides a simple hook (&lt;code&gt;addMessageCountListener()&lt;/code&gt;) to get notified when new messages arrive on the server. You can use this in place of the “TODO” comment above:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;jakarta.mail.*&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;jakarta.mail.event.MessageCountAdapter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;jakarta.mail.event.MessageCountEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addMessageCountListener&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MessageCountAdapter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;messagesAdded&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MessageCountEvent&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;collectMessages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getMessages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// …&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;collectMessages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;messages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;messages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getSubject&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MessagingException&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ignored&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now build the project again and start up the SQL client:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;testing/
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./build_and_run.sh&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This time, you will connect to a &lt;a href=&quot;https://greenmail-mail-test.github.io/greenmail/&quot;&gt;GreenMail server&lt;/a&gt; which is started as part of the &lt;a href=&quot;https://github.com/Airblader/blog-imap/blob/master/testing/docker-compose.yaml&quot;&gt;setup&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;imap&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;host&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;greenmail&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;port&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;3143&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;user&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;alice&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;password&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;alice&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The query above should now run continuously, but no rows will be produced yet since the test server starts out with an empty inbox. You first need to send an email to the server. If you have &lt;a href=&quot;https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mailx.html&quot;&gt;mailx&lt;/a&gt; installed, you can do so by executing in your terminal:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;This is the email body&amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; mailx -Sv15-compat &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-s&lt;span class=&quot;s2&quot;&gt;&amp;quot;Email Subject&amp;quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Smta&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;smtp://alice:alice@localhost:3025&amp;quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
alice@acme.org&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A row with the subject “Email Subject” should now appear in your output. Your source connector is working!&lt;/p&gt;
&lt;p&gt;However, since you are still hard-coding the schema produced by the source, defining the table with a different schema will produce errors. You want to be able to define which fields of an email interest you and then produce the data accordingly. To do this, you will use the list of column names from earlier and then look at it when you collect the emails.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.TimestampData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;collectMessages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;messages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;messages&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;collectMessage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MessagingException&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ignored&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;collectMessage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MessagingException&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GenericRowData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;row&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;switch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;columnNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;SUBJECT&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setField&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getSubject&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()));&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;SENT&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setField&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TimestampData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromInstant&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getSentDate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toInstant&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()));&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;RECEIVED&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setField&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TimestampData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromInstant&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getReceivedDate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toInstant&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()));&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You should now have a working source where you can select any of the columns that are supported. Try it out again in the SQL client, but this time specifying all the columns (“subject”, “sent”, “received”) supported above:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sent&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;received&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;imap&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;host&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;greenmail&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;port&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;3143&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;user&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;alice&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;password&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;alice&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Use the &lt;code&gt;mailx&lt;/code&gt; command from earlier to send emails to the GreenMail server and you should see them appear. You can also try selecting only some of the columns, or write more complex queries.&lt;/p&gt;
&lt;h1 id=&quot;test-the-connector-with-a-real-mail-server-on-the-ververica-platform&quot;&gt;Test the connector with a real mail server on the Ververica Platform&lt;/h1&gt;
&lt;p&gt;If you want to test the connector with a real mail server, you can import it into &lt;a href=&quot;https://www.ververica.com/getting-started&quot;&gt;Ververica Platform Community Edition&lt;/a&gt;. To begin, make sure that you have the Ververica Platform up and running.&lt;/p&gt;
&lt;p&gt;Since the example connector in this blog post is still a bit limited, you will use the finished connector in this &lt;a href=&quot;https://github.com/TNG/flink-connector-email&quot;&gt;repository&lt;/a&gt; instead. You can clone that repository and build it the same way to obtain the JAR file.&lt;/p&gt;
&lt;p&gt;For this example, let’s connect to a Gmail account. This requires SSL and comes with an additional caveat that you need to enable two-factor authentication and create an application password to use instead of your real password.&lt;/p&gt;
&lt;p&gt;First, head to SQL → Connectors. There you can create a new connector by uploading your JAR file. The platform will detect the connector options automatically. Afterwards, go back to the SQL Editor and you should now be able to use the connector.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-09-07-connector-table-sql-api/VVP-SQL-Editor.png&quot; alt=&quot;Ververica Platform - SQL Editor&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Ververica Platform - SQL Editor&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;Apache Flink is designed for easy extensibility and allows users to access many different external systems as data sources or sinks through a versatile set of connectors. It can read data from and write data to databases as well as local and distributed file systems.&lt;/p&gt;
&lt;p&gt;Flink also exposes APIs on top of which custom connectors can be built. In this two-part blog series, you explored some of these APIs and concepts and learned how to implement your own custom source connector that can read in data from an email inbox. You then used Flink to process incoming emails through the IMAP protocol and wrote some Flink SQL.&lt;/p&gt;
</description>
<pubDate>Tue, 07 Sep 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/09/07/connector-table-sql-api-part2.html</link>
<guid isPermaLink="true">/2021/09/07/connector-table-sql-api-part2.html</guid>
</item>
<item>
<title>Implementing a Custom Source Connector for Table API and SQL - Part One </title>
<description>&lt;p&gt;Part one of this tutorial will teach you how to build and run a custom source connector to be used with Table API and SQL, two high-level abstractions in Flink. The tutorial comes with a bundled &lt;a href=&quot;https://docs.docker.com/compose/&quot;&gt;docker-compose&lt;/a&gt; setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#prerequisites&quot; id=&quot;markdown-toc-prerequisites&quot;&gt;Prerequisites&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#understand-the-infrastructure-required-for-a-connector&quot; id=&quot;markdown-toc-understand-the-infrastructure-required-for-a-connector&quot;&gt;Understand the infrastructure required for a connector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#establish-the-runtime-implementation-of-the-connector&quot; id=&quot;markdown-toc-establish-the-runtime-implementation-of-the-connector&quot;&gt;Establish the runtime implementation of the connector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#create-and-configure-a-dynamic-table-source-for-the-data-stream&quot; id=&quot;markdown-toc-create-and-configure-a-dynamic-table-source-for-the-data-stream&quot;&gt;Create and configure a dynamic table source for the data stream&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#create-a-factory-class-for-the-connector-so-it-can-be-discovered-by-flink&quot; id=&quot;markdown-toc-create-a-factory-class-for-the-connector-so-it-can-be-discovered-by-flink&quot;&gt;Create a factory class for the connector so it can be discovered by Flink&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#test-the-custom-connector&quot; id=&quot;markdown-toc-test-the-custom-connector&quot;&gt;Test the custom connector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#next-steps&quot; id=&quot;markdown-toc-next-steps&quot;&gt;Next Steps&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Apache Flink is a data processing engine that aims to keep &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/state_backends/&quot;&gt;state&lt;/a&gt; locally in order to do computations efficiently. However, Flink does not “own” the data but relies on external systems to ingest and persist data. Connecting to external data input (&lt;strong&gt;sources&lt;/strong&gt;) and external data storage (&lt;strong&gt;sinks&lt;/strong&gt;) is usually summarized under the term &lt;strong&gt;connectors&lt;/strong&gt; in Flink.&lt;/p&gt;
&lt;p&gt;Since connectors are such important components, Flink ships with &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/overview/&quot;&gt;connectors for some popular systems&lt;/a&gt;. But sometimes you may need to read in an uncommon data format and what Flink provides is not enough. This is why Flink also provides extension points for building custom connectors if you want to connect to a system that is not supported by an existing connector.&lt;/p&gt;
&lt;p&gt;Once you have a source and a sink defined for Flink, you can use its declarative APIs (in the form of the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/overview/&quot;&gt;Table API and SQL&lt;/a&gt;) to execute queries for data analysis.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Table API&lt;/strong&gt; provides more programmatic access while &lt;strong&gt;SQL&lt;/strong&gt; is a more universal query language. It is named Table API because of its relational functions on tables: how to obtain a table, how to output a table, and how to perform query operations on the table.&lt;/p&gt;
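&lt;p&gt;To make that distinction a bit more concrete, here is a minimal sketch (not part of the connector built in this tutorial) that registers a throwaway table and queries it once via SQL and once via the Table API; the table name, column, and use of the built-in &lt;code&gt;datagen&lt;/code&gt; connector are assumptions chosen purely for illustration:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import static org.apache.flink.table.api.Expressions.$;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
public class TableApiVsSqlSketch {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());
        // A throwaway table backed by the built-in datagen connector, just for this comparison.
        tableEnv.executeSql(
                &amp;quot;CREATE TABLE Example (subject STRING) WITH (&amp;#39;connector&amp;#39; = &amp;#39;datagen&amp;#39;)&amp;quot;);
        // SQL: the universal, declarative form of the query.
        Table viaSql = tableEnv.sqlQuery(&amp;quot;SELECT subject FROM Example&amp;quot;);
        // Table API: the same query expressed programmatically.
        Table viaTableApi = tableEnv.from(&amp;quot;Example&amp;quot;).select($(&amp;quot;subject&amp;quot;));
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;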
&lt;p&gt;In this two-part tutorial, you will explore some of these APIs and concepts by implementing your own custom source connector for reading in data from an email inbox. You will then use Flink to process emails through the &lt;a href=&quot;https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol&quot;&gt;IMAP protocol&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Part one will focus on building a custom source connector and &lt;a href=&quot;/2021/09/07/connector-table-sql-api-part2&quot;&gt;part two&lt;/a&gt; will focus on integrating it.&lt;/p&gt;
&lt;h1 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h1&gt;
&lt;p&gt;This tutorial assumes that you have some familiarity with Java and object-oriented programming.&lt;/p&gt;
&lt;p&gt;You are encouraged to follow along with the code in this &lt;a href=&quot;https://github.com/Airblader/blog-imap&quot;&gt;repository&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It would also be useful to have &lt;a href=&quot;https://docs.docker.com/compose/install/&quot;&gt;docker-compose&lt;/a&gt; installed on your system in order to use the script included in the repository that builds and runs the connector.&lt;/p&gt;
&lt;h1 id=&quot;understand-the-infrastructure-required-for-a-connector&quot;&gt;Understand the infrastructure required for a connector&lt;/h1&gt;
&lt;p&gt;In order to create a connector which works with Flink, you need:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;A &lt;em&gt;factory class&lt;/em&gt; (a blueprint for creating other objects from string properties) that tells Flink with which identifier (in this case, “imap”) our connector can be addressed, which configuration options it exposes, and how the connector can be instantiated. Since Flink uses the Java Service Provider Interface (SPI) to discover factories located in different modules, you will also need to add some configuration details.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &lt;em&gt;table source&lt;/em&gt; object as a specific instance of the connector during the planning stage. It is responsible for back-and-forth communication with the optimizer during the planning stage and is like another factory for creating the connector’s runtime implementation. There are also more advanced features, such as &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/abilities/package-summary.html&quot;&gt;abilities&lt;/a&gt;, that can be implemented to improve connector performance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A &lt;em&gt;runtime implementation&lt;/em&gt; obtained from the connector during the planning stage. The runtime logic is implemented in Flink’s core connector interfaces and does the actual work of producing rows of dynamic table data. The runtime instances are shipped to the Flink cluster.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let us look at this sequence (factory class → table source → runtime implementation) in reverse order.&lt;/p&gt;
&lt;h1 id=&quot;establish-the-runtime-implementation-of-the-connector&quot;&gt;Establish the runtime implementation of the connector&lt;/h1&gt;
&lt;p&gt;You first need to have a source connector which can be used in Flink’s runtime system, defining how the data comes in and how it can be executed in the cluster. There are a few different interfaces available for implementing the actual source of the data and making it discoverable in Flink.&lt;/p&gt;
&lt;p&gt;For complex connectors, you may want to implement the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/api/connector/source/Source.html&quot;&gt;Source interface&lt;/a&gt; which gives you a lot of control. For simpler use cases, you can use the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.html&quot;&gt;SourceFunction interface&lt;/a&gt;. There are already a few different implementations of SourceFunction interfaces for common use cases such as the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/functions/source/FromElementsFunction.html&quot;&gt;FromElementsFunction&lt;/a&gt; class and the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/functions/source/RichSourceFunction.html&quot;&gt;RichSourceFunction&lt;/a&gt; class. You will use the latter.&lt;/p&gt;
&lt;div class=&quot;note&quot;&gt;
&lt;h5&gt;Hint&lt;/h5&gt;
&lt;p&gt;The Source interface is the new abstraction whereas the SourceFunction interface is slowly being phased out.
All connectors will eventually implement the Source interface.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;code&gt;RichSourceFunction&lt;/code&gt; is a base class for implementing a data source that has access to context information and some lifecycle methods. There is a &lt;code&gt;run()&lt;/code&gt; method inherited from the &lt;code&gt;SourceFunction&lt;/code&gt; interface that you need to implement. It is invoked once and can be used to produce the data either once for a bounded result or within a loop for an unbounded stream.&lt;/p&gt;
&lt;p&gt;For example, to create a bounded data source, you could implement this method so that it reads all existing emails and then closes. To create an unbounded source, you could look only at new emails coming in while the source is active. You can also combine these behaviors and expose them through configuration options.&lt;/p&gt;
&lt;p&gt;When you first create the class and implement the interface, it should look something like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;cancel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that internal data structures (&lt;code&gt;RowData&lt;/code&gt;) are used because that is required by the table runtime.&lt;/p&gt;
&lt;p&gt;In the &lt;code&gt;run()&lt;/code&gt; method, you get access to a &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.SourceContext.html&quot;&gt;context&lt;/a&gt; object inherited from the SourceFunction interface, which is a bridge to Flink and allows you to output data. Since the source does not produce any data yet, the next step is to make it produce some static data in order to test that the data flows correctly:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.functions.source.RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.data.StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichSourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SourceContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRowData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Subject 1&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StringData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Hello, World!&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;cancel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(){}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You do not need to implement the &lt;code&gt;cancel()&lt;/code&gt; method yet because the source finishes instantly.&lt;/p&gt;
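&lt;p&gt;For reference, once the source becomes unbounded later on, &lt;code&gt;run()&lt;/code&gt; and &lt;code&gt;cancel()&lt;/code&gt; typically cooperate through a shared flag. The class below is only a minimal sketch of that common pattern under a hypothetical name, not the connector code developed in this tutorial:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
public class PollingSourceSketch extends RichSourceFunction&amp;lt;RowData&amp;gt; {
    // Volatile because cancel() is invoked from a different thread than run().
    private volatile boolean running = true;
    @Override
    public void run(SourceContext&amp;lt;RowData&amp;gt; ctx) throws Exception {
        while (running) {
            // Poll the external system and emit whatever is new; just a placeholder row here.
            ctx.collect(GenericRowData.of(StringData.fromString(&amp;quot;placeholder&amp;quot;)));
            Thread.sleep(1_000L);
        }
    }
    @Override
    public void cancel() {
        running = false;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;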
&lt;h1 id=&quot;create-and-configure-a-dynamic-table-source-for-the-data-stream&quot;&gt;Create and configure a dynamic table source for the data stream&lt;/h1&gt;
&lt;p&gt;&lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/concepts/dynamic_tables/&quot;&gt;Dynamic tables&lt;/a&gt; are the core concept of Flink’s Table API and SQL support for streaming data and, as the name suggests, change over time. You can imagine a data stream being logically converted into a table that is constantly changing. For this tutorial, the emails that will be read in will be interpreted as a (source) table that is queryable. It can be viewed as a specific instance of a connector class.&lt;/p&gt;
&lt;p&gt;You will now implement a &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/DynamicTableSource.html&quot;&gt;DynamicTableSource&lt;/a&gt; interface. There are two types of dynamic table sources: &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/ScanTableSource.html&quot;&gt;ScanTableSource&lt;/a&gt; and &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/LookupTableSource.html&quot;&gt;LookupTableSource&lt;/a&gt;. Scan sources read the entire table on the external system while lookup sources look for specific rows based on keys. The former will fit the use case of this tutorial.&lt;/p&gt;
&lt;p&gt;This is what a scan table source implementation would look like:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.ChangelogMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.DynamicTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.ScanTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.SourceFunctionProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapTableSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ScanTableSource&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ChangelogMode&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getChangelogMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ChangelogMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;insertOnly&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ScanRuntimeProvider&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getScanRuntimeProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ScanContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bounded&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImapSource&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SourceFunctionProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bounded&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSource&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;asSummaryString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;IMAP Table Source&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/ChangelogMode.html&quot;&gt;ChangelogMode&lt;/a&gt; informs Flink of the kinds of changes that the planner can expect during runtime. For example, whether the source produces only new rows, also updates to existing ones, or whether it can remove previously produced rows. Our source will only produce (&lt;code&gt;insertOnly()&lt;/code&gt;) new rows.&lt;/p&gt;
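&lt;p&gt;For comparison only (the connector in this tutorial stays with &lt;code&gt;insertOnly()&lt;/code&gt;), a source that also emitted updated or deleted rows could declare that by building a changelog mode from the individual row kinds it may produce. The following is a hedged sketch, not part of the tutorial code:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.connector.ChangelogMode;
import org.apache.flink.types.RowKind;
public class ChangelogModeExamples {
    // Only new rows, as used in this tutorial.
    static final ChangelogMode INSERT_ONLY = ChangelogMode.insertOnly();
    // Example only: a mode for a source that also emits updated and deleted rows.
    static final ChangelogMode UPSERT_LIKE =
            ChangelogMode.newBuilder()
                    .addContainedKind(RowKind.INSERT)
                    .addContainedKind(RowKind.UPDATE_AFTER)
                    .addContainedKind(RowKind.DELETE)
                    .build();
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;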
&lt;p&gt;&lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/ScanTableSource.ScanRuntimeProvider.html&quot;&gt;ScanRuntimeProvider&lt;/a&gt; allows Flink to create the actual runtime implementation you established previously (for reading the data). Flink even provides utilities like &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/connector/source/SourceFunctionProvider.html&quot;&gt;SourceFunctionProvider&lt;/a&gt; to wrap it into an instance of &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.html&quot;&gt;SourceFunction&lt;/a&gt;, which is one of the base runtime interfaces.&lt;/p&gt;
&lt;p&gt;You will also need to indicate whether the source is bounded or not. Currently, the source is bounded, but you will have to change this later.&lt;/p&gt;
&lt;h1 id=&quot;create-a-factory-class-for-the-connector-so-it-can-be-discovered-by-flink&quot;&gt;Create a factory class for the connector so it can be discovered by Flink&lt;/h1&gt;
&lt;p&gt;You now have a working source connector, but in order to use it in Table API or SQL, it needs to be discoverable by Flink. You also need to define how the connector is addressable from a SQL statement when creating a source table.&lt;/p&gt;
&lt;p&gt;You need to implement a &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/factories/Factory.html&quot;&gt;Factory&lt;/a&gt;, which is a base interface that creates object instances from a list of key-value pairs in Flink’s Table API and SQL. A factory is uniquely identified by its class name and &lt;code&gt;factoryIdentifier()&lt;/code&gt;. For this tutorial, you will implement the more specific &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/factories/DynamicTableSourceFactory.html&quot;&gt;DynamicTableSourceFactory&lt;/a&gt;, which allows you to configure a dynamic table connector as well as create &lt;code&gt;DynamicTableSource&lt;/code&gt; instances.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;java.util.Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.configuration.ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.connector.source.DynamicTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.factories.DynamicTableSourceFactory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.table.factories.FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ImapTableSourceFactory&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSourceFactory&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;factoryIdentifier&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;imap&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requiredOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfigOption&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optionalOptions&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DynamicTableSource&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;createDynamicTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;TableFactoryHelper&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FactoryUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;createTableFactoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;factoryHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;validate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ImapTableSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There are currently no configuration options but they can be added and also validated within the &lt;code&gt;createDynamicTableSource()&lt;/code&gt; function. There is a small helper utility, &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/table/factories/FactoryUtil.TableFactoryHelper.html&quot;&gt;TableFactoryHelper&lt;/a&gt;, that Flink offers which ensures that required options are set and that no unknown options are provided.&lt;/p&gt;
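&lt;p&gt;To give an idea of what adding such an option could look like, the following sketch defines a hypothetical &lt;code&gt;host&lt;/code&gt; option and lists it as required so that the helper validates it; the option name and usage here are assumptions for illustration, not part of the connector built so far:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.HashSet;
import java.util.Set;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
public class ImapOptionsSketch {
    // Hypothetical option, for illustration only; the tutorial connector defines none yet.
    public static final ConfigOption&amp;lt;String&amp;gt; HOST =
            ConfigOptions.key(&amp;quot;host&amp;quot;)
                    .stringType()
                    .noDefaultValue()
                    .withDescription(&amp;quot;IMAP server host.&amp;quot;);
    // Listing the option in requiredOptions() lets the TableFactoryHelper enforce it during validate().
    public Set&amp;lt;ConfigOption&amp;lt;?&amp;gt;&amp;gt; requiredOptions() {
        final Set&amp;lt;ConfigOption&amp;lt;?&amp;gt;&amp;gt; options = new HashSet&amp;lt;&amp;gt;();
        options.add(HOST);
        return options;
    }
    // Inside createDynamicTableSource(Context ctx) the validated value could then be read with:
    //   final String host = factoryHelper.getOptions().get(HOST);
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;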
&lt;p&gt;Finally, you need to register your factory for Java’s Service Provider Interface (SPI). Classes that implement this interface can be discovered by Flink; to register yours, add the fully qualified class name of your factory to the file &lt;code&gt;src/main/resources/META-INF/services/org.apache.flink.table.factories.Factory&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// if you created your class in the package org.example.acme, it should be named the following:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;org&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;example&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;acme&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ImapTableSourceFactory&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1 id=&quot;test-the-custom-connector&quot;&gt;Test the custom connector&lt;/h1&gt;
&lt;p&gt;You should now have a working source connector. If you are following along with the provided repository, you can test it by running:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;testing/
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./build_and_run.sh&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This builds the connector, starts a Flink cluster, a &lt;a href=&quot;https://greenmail-mail-test.github.io/greenmail/&quot;&gt;test email server&lt;/a&gt; (which you will need later), and the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/&quot;&gt;SQL client&lt;/a&gt; (which is bundled in the regular Flink distribution) for you. If successful, you should see the SQL CLI:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-09-07-connector-table-sql-api/flink-sql-client.png&quot; alt=&quot;Flink SQL Client&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Flink SQL Client&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;You can now create a table (with a “subject” column and a “content” column) with your connector by executing the following statement with the SQL client:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;imap&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that the schema must be exactly as written since it is currently hardcoded into the connector.&lt;/p&gt;
&lt;p&gt;You should be able to see the static data you provided in your source connector earlier, which would be “Subject 1” and “Hello, World!”.&lt;/p&gt;
&lt;p&gt;Now that you have a working connector, the next step is to make it do something more useful than returning static data.&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In this tutorial, you looked into the infrastructure required for a connector and configured its runtime implementation to define how it should be executed in a cluster. You also defined a dynamic table source that reads the entire stream-converted table from the external source, made the connector discoverable by Flink by creating a factory class for it, and then tested it.&lt;/p&gt;
&lt;h1 id=&quot;next-steps&quot;&gt;Next Steps&lt;/h1&gt;
&lt;p&gt;In &lt;a href=&quot;/2021/09/07/connector-table-sql-api-part2&quot;&gt;part two&lt;/a&gt;, you will integrate this connector with an email inbox through the IMAP protocol.&lt;/p&gt;
</description>
<pubDate>Tue, 07 Sep 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/09/07/connector-table-sql-api-part1.html</link>
<guid isPermaLink="true">/2021/09/07/connector-table-sql-api-part1.html</guid>
</item>
<item>
<title>Stateful Functions 3.1.0 Release Announcement</title>
<description>&lt;p&gt;Stateful Functions is a cross-platform stack for building Stateful Serverless applications, making it radically simpler to develop scalable, consistent, and elastic distributed applications.
This new release brings various improvements to the StateFun runtime, a leaner way to specify StateFun module components, and a brand new GoLang SDK!&lt;/p&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website, and the most recent Java SDK, Python SDK, and GoLang SDK distributions are available on &lt;a href=&quot;https://search.maven.org/artifact/org.apache.flink/statefun-sdk-java/3.1.0/jar&quot;&gt;Maven&lt;/a&gt;, &lt;a href=&quot;https://pypi.org/project/apache-flink-statefun/&quot;&gt;PyPI&lt;/a&gt;, and &lt;a href=&quot;https://github.com/apache/flink-statefun/tree/statefun-sdk-go/v3.1.0&quot;&gt;GitHub&lt;/a&gt; respectively.
You can also find official StateFun Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-statefun&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350038&amp;amp;projectId=12315522&quot;&gt;release changelog&lt;/a&gt;
and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-3.0/&quot;&gt;updated documentation&lt;/a&gt;.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/browse/&quot;&gt;JIRA&lt;/a&gt;!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features&quot; id=&quot;markdown-toc-new-features&quot;&gt;New Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#delayed-message-cancellation&quot; id=&quot;markdown-toc-delayed-message-cancellation&quot;&gt;Delayed Message Cancellation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-way-to-specify-components&quot; id=&quot;markdown-toc-new-way-to-specify-components&quot;&gt;New way to specify components&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pluggable-transport-for-remote-function-invocations&quot; id=&quot;markdown-toc-pluggable-transport-for-remote-function-invocations&quot;&gt;Pluggable transport for remote function invocations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#asynchronous-non-blocking-remote-function-invocation-beta&quot; id=&quot;markdown-toc-asynchronous-non-blocking-remote-function-invocation-beta&quot;&gt;Asynchronous, non blocking remote function invocation (beta)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#a-brand-new-golang-sdk&quot; id=&quot;markdown-toc-a-brand-new-golang-sdk&quot;&gt;A brand new GoLang SDK&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes-1&quot; id=&quot;markdown-toc-release-notes-1&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;new-features&quot;&gt;New Features&lt;/h2&gt;
&lt;h3 id=&quot;delayed-message-cancellation&quot;&gt;Delayed Message Cancellation&lt;/h3&gt;
&lt;p&gt;Stateful Functions communicate by sending messages, but sometimes it is helpful for a function to send a message to itself.
For example, you may want to set a time limit for a customer onboarding flow to complete.
This can easily be implemented by sending a message with a delay.
But up until now, there was no way to tell the StateFun runtime that a particular delayed message is no longer needed (for example, because the customer has already completed their onboarding flow).
With StateFun 3.1, it is now possible to cancel a delayed message.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send_after&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timedelta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;days&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;message_builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;target_typename&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;fns/onboarding&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;target_id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;user-1234&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;str_value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;send a reminder email&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cancellation_token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;flow-1234&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To cancel the message at a later time, simply call&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cancel_delayed_message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;flow-1234&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Please note that a message cancellation occurs on a best-effort basis, as the message might have already been delivered or enqueued for immediate delivery on a remote worker’s mailbox.&lt;/p&gt;
&lt;h3 id=&quot;new-way-to-specify-components&quot;&gt;New way to specify components&lt;/h3&gt;
&lt;p&gt;StateFun applications consist of multiple configuration components, such as remote function endpoints and ingress and egress definitions, defined in a YAML format.
This release adds a new structure that treats each StateFun component as a standalone YAML document.
Thus, a &lt;code&gt;module.yaml&lt;/code&gt; file simply becomes a collection of components.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.endpoints.v2/http&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;functions&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.example/*&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;urlPathTemplate&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;https://bar.foo.com/{function.name}&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.kafka.v1/ingress&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.example/my-ingress&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;address&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;kafka-broker:9092&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;consumerGroupId&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;my-consumer-group&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;topics&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;p-Indicator&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;topic&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;message-topic&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;valueType&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.types/string&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;targets&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;p-Indicator&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.example/greeter&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.kafka.v1/egress&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.example/my-egress&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;address&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;kafka-broker:9092&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;deliverySemantic&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;exactly-once&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;transactionTimeout&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;15min&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;While this might seem like a minor cosmetic improvement, this change opens the door to more flexible configuration management options in future releases - such as managing each component as a custom K8s resource definition or even behind a REST API. StateFun 3.1 still supports the legacy 3.0 module format for backward compatibility, but users are encouraged to upgrade.
The community provides an &lt;a href=&quot;https://github.com/sjwiesman/statefun-module-upgrade&quot;&gt;automated migration tool&lt;/a&gt; to ease the transition.&lt;/p&gt;
&lt;h3 id=&quot;pluggable-transport-for-remote-function-invocations&quot;&gt;Pluggable transport for remote function invocations&lt;/h3&gt;
&lt;p&gt;Starting with this release, it is possible to plug in a custom mechanism for invoking remote stateful functions.
Users who wish to use a customized transport need to register it as an extension and then reference it directly from the endpoint component definition.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.endpoints.v2/http&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;functions&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.foo.bar/*&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;urlPathTemplate&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;https://{function.name}/&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;maxNumBatchRequests&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;10000&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;transport&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;com.foo.bar/pubsub&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;some_property1&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;some_value1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For a complete example of a custom transport, you can start exploring &lt;a href=&quot;https://github.com/apache/flink-statefun/blob/release-3.1.0/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/nettyclient/NettyTransportModule.java&quot;&gt;here&lt;/a&gt;,
along with a reference usage over &lt;a href=&quot;https://github.com/apache/flink-statefun/blob/release-3.1.0/statefun-e2e-tests/statefun-smoke-e2e-java/src/test/resources/remote-module/module.yaml#L21-L22&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;asynchronous-non-blocking-remote-function-invocation-beta&quot;&gt;Asynchronous, non-blocking remote function invocation (beta)&lt;/h3&gt;
&lt;p&gt;This release includes a new, opt-in transport implementation built on top of the asynchronous Netty framework.
This transport enables much higher resource utilization, higher throughput, and lower remote function invocation latency.&lt;/p&gt;
&lt;p&gt;To enable this new transport, set the transport type to &lt;code&gt;io.statefun.transports.v1/async&lt;/code&gt;,
as in the following example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.endpoints.v2/http&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;functions&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;fns/*&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;urlPathTemplate&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;https://api-gateway.foo.bar/{function.name}&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;maxNumBatchRequests&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;10000&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;transport&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;io.statefun.transports.v1/async&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;call&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;2m&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;20s&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Take it for a spin!&lt;/p&gt;
&lt;h3 id=&quot;a-brand-new-golang-sdk&quot;&gt;A brand new GoLang SDK&lt;/h3&gt;
&lt;p&gt;Stateful Functions provides a unified model for building stateful applications across various programming languages and deployment environments.
The community is thrilled to release an official GoLang SDK as part of the 3.1.0 release.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;fmt&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;github.com/apache/flink-statefun/statefun-sdk-go/v3/pkg/statefun&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;net/http&amp;quot;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Greeter&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;SeenCount&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ValueSpec&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;g&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Invoke&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;message&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;storage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Read the current value of the state&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// or zero value if no value is set&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int32&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;SeenCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Update the state which will&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// be made persistent by the runtime&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;SeenCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;AsString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;greeting&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Sprintf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Hello there %s at the %d-th time!\n&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;MessageBuilder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;Target&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Caller&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;greeting&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;greeter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;SeenCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ValueSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;seen_count&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;ValueType&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Int32Type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;builder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;StatefulFunctionsBuilder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;WithSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;StatefulFunctionSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;FunctionType&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;TypeNameFrom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;com.example.fns/greeter&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;States&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;statefun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ValueSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;SeenCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;/statefun&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;AsHandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ListenAndServe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;:8000&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As with the Python and Java SDKs, the Go SDK includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;An address-scoped storage acting as a key-value store for a particular address.&lt;/li&gt;
&lt;li&gt;A unified, cross-language way to send, receive, and store values.&lt;/li&gt;
&lt;li&gt;Dynamic &lt;code&gt;ValueSpec&lt;/code&gt; to describe the state name, type, and possibly expiration configuration at runtime (see the sketch below).&lt;/li&gt;
&lt;/ul&gt;
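&lt;p&gt;As an illustration of the last point, here is a minimal, hypothetical sketch of a &lt;code&gt;ValueSpec&lt;/code&gt; that also configures state expiration. The &lt;code&gt;ExpireAfterWrite&lt;/code&gt; helper and the chosen duration are assumptions made for illustration only; please refer to the SDK documentation for the exact expiration API.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;import (
    &amp;quot;time&amp;quot;

    &amp;quot;github.com/apache/flink-statefun/statefun-sdk-go/v3/pkg/statefun&amp;quot;
)

// A per-address counter that the runtime may drop if it is not
// written for a day (ExpireAfterWrite is an assumed expiration helper).
var seenCount = statefun.ValueSpec{
    Name:       &amp;quot;seen_count&amp;quot;,
    ValueType:  statefun.Int32Type,
    Expiration: statefun.ExpireAfterWrite(24 * time.Hour),
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;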
&lt;p&gt;You can get started by adding the SDK to your &lt;code&gt;go.mod&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;require github.com/apache/flink-statefun/statefun-sdk-go/v3 v3.1.0&lt;/code&gt;&lt;/p&gt;
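&lt;p&gt;For reference, a complete &lt;code&gt;go.mod&lt;/code&gt; for a small StateFun service might look like the following minimal sketch; the module name and Go version are placeholders.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;module example.com/statefun-greeter

go 1.16

require github.com/apache/flink-statefun/statefun-sdk-go/v3 v3.1.0&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;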
&lt;p&gt;For a detailed SDK tutorial, we encourage you to visit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink-statefun-playground/tree/release-3.1/go/showcase&quot;&gt;GoLang SDK Showcase&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink-statefun-playground/tree/release-3.1/go/greeter&quot;&gt;GoLang Greeter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1/docs/sdk/golang/&quot;&gt;GoLang SDK Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350038&amp;amp;projectId=12315522&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features if you plan to upgrade your setup to Stateful Functions 3.1.0.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;Evans Ye, George Birbilis, Igal Shilman, Konstantin Knauf, Seth Wiesman, Siddique Ahmad, Tzu-Li (Gordon) Tai, ariskk, austin ce&lt;/p&gt;
&lt;p&gt;If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Tue, 31 Aug 2021 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/08/31/release-statefun-3.1.0.html</link>
<guid isPermaLink="true">/news/2021/08/31/release-statefun-3.1.0.html</guid>
</item>
<item>
<title>Help us stabilize Apache Flink 1.14.0 RC0</title>
<description>&lt;div class=&quot;note&quot;&gt;
&lt;h5&gt;Hint&lt;/h5&gt;
&lt;p&gt;
Update 29th of September: Today
&lt;a href=&quot;https://flink.apache.org/news/2021/09/29/release-1.14.0.html&quot;&gt;Apache Flink 1.14&lt;/a&gt;
has been released. For sure we&#39;d still like to hear your feedback.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Dear Flink Community,&lt;/p&gt;
&lt;p&gt;We are excited to announce the first release candidate of Apache Flink 1.14. 🎉&lt;/p&gt;
&lt;p&gt;A lot of features and fixes went into this release, including improvements to the
unified batch and streaming experience, an increase in fault tolerance by reducing
in-flight data, and more developments on connectors and components.
It wouldn’t have been possible without your help.
Around 211 people have made contributions!&lt;/p&gt;
&lt;p&gt;Two weeks ago (August 16th) we declared a feature freeze. This means that from this point on, only a
few small, almost-ready features will go into the release.
We are now in the process of stabilizing the release and need your help! As you can
see on the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/1.14+Release&quot;&gt;1.14 release coordination page&lt;/a&gt;,
a lot of focus is on documentation and testing.&lt;/p&gt;
&lt;p&gt;If you would like to contribute to the squirrel community, a great way would be to
download the &lt;a href=&quot;https://dist.apache.org/repos/dist/dev/flink/flink-1.14.0-rc0/&quot;&gt;release candidate&lt;/a&gt;
and test it. You can run some existing Flink jobs or pick one of the
&lt;a href=&quot;https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=468&amp;amp;quickFilter=2115&quot;&gt;test issues&lt;/a&gt;.
We would greatly appreciate any feedback you can provide on the
&lt;a href=&quot;https://issues.apache.org/jira/projects/FLINK/summary&quot;&gt;JIRA tickets&lt;/a&gt; or on
the &lt;a href=&quot;https://flink.apache.org/gettinghelp.html#user-mailing-list&quot;&gt;mailing list&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We continue to be grateful and inspired by the community who believe in the project and want to help create a great user experience and product for all Flink users.&lt;/p&gt;
&lt;p&gt;Many thanks!&lt;/p&gt;
</description>
<pubDate>Tue, 31 Aug 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/08/31/release-1.14.0-rc0.html</link>
<guid isPermaLink="true">/news/2021/08/31/release-1.14.0-rc0.html</guid>
</item>
<item>
<title>Apache Flink 1.11.4 Released</title>
<description>&lt;p&gt;The Apache Flink community released the next bugfix version of the Apache Flink 1.11 series.&lt;/p&gt;
&lt;p&gt;This release includes 78 fixes and minor improvements for Flink 1.11.4. Below is a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.11.4.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt; Release Notes - Flink - Version 1.11.4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21070&quot;&gt;FLINK-21070&lt;/a&gt;] - Overloaded aggregate functions cause converter errors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21486&quot;&gt;FLINK-21486&lt;/a&gt;] - Add sanity check when switching from Rocks to Heap timers
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15262&quot;&gt;FLINK-15262&lt;/a&gt;] - kafka connector doesn&amp;#39;t read from beginning immediately when &amp;#39;connector.startup-mode&amp;#39; = &amp;#39;earliest-offset&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16443&quot;&gt;FLINK-16443&lt;/a&gt;] - Fix wrong fix for user-code CheckpointExceptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18438&quot;&gt;FLINK-18438&lt;/a&gt;] - TaskManager start failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19369&quot;&gt;FLINK-19369&lt;/a&gt;] - BlobClientTest.testGetFailsDuringStreamingForJobPermanentBlob hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19436&quot;&gt;FLINK-19436&lt;/a&gt;] - TPC-DS end-to-end test (Blink planner) failed during shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19771&quot;&gt;FLINK-19771&lt;/a&gt;] - NullPointerException when accessing null array from postgres in JDBC Connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20288&quot;&gt;FLINK-20288&lt;/a&gt;] - Correct documentation about savepoint self-contained
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20383&quot;&gt;FLINK-20383&lt;/a&gt;] - DataSet allround end-to-end test fails with NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20626&quot;&gt;FLINK-20626&lt;/a&gt;] - Canceling a job when it is failing will result in job hanging in CANCELING state
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20666&quot;&gt;FLINK-20666&lt;/a&gt;] - Fix the deserialized Row losing the field_name information in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20675&quot;&gt;FLINK-20675&lt;/a&gt;] - Asynchronous checkpoint failure would not fail the job anymore
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20680&quot;&gt;FLINK-20680&lt;/a&gt;] - Fails to call var-arg function with no parameters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20752&quot;&gt;FLINK-20752&lt;/a&gt;] - FailureRateRestartBackoffTimeStrategy allows one less restart than configured
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20793&quot;&gt;FLINK-20793&lt;/a&gt;] - Fix NamesTest due to code style refactor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20803&quot;&gt;FLINK-20803&lt;/a&gt;] - Version mismatch between spotless-maven-plugin and google-java-format plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20832&quot;&gt;FLINK-20832&lt;/a&gt;] - Deliver bootstrap resouces ourselves for website and documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20841&quot;&gt;FLINK-20841&lt;/a&gt;] - Fix compile error due to duplicated generated files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20913&quot;&gt;FLINK-20913&lt;/a&gt;] - Improve new HiveConf(jobConf, HiveConf.class)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20989&quot;&gt;FLINK-20989&lt;/a&gt;] - Functions in ExplodeFunctionUtil should handle null data to avoid NPE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21008&quot;&gt;FLINK-21008&lt;/a&gt;] - Residual HA related Kubernetes ConfigMaps and ZooKeeper nodes when cluster entrypoint received SIGTERM in shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21009&quot;&gt;FLINK-21009&lt;/a&gt;] - Can not disable certain options in Elasticsearch 7 connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21013&quot;&gt;FLINK-21013&lt;/a&gt;] - Blink planner does not ingest timestamp into StreamRecord
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21028&quot;&gt;FLINK-21028&lt;/a&gt;] - Streaming application didn&amp;#39;t stop properly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21030&quot;&gt;FLINK-21030&lt;/a&gt;] - Broken job restart for job with disjoint graph
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21071&quot;&gt;FLINK-21071&lt;/a&gt;] - Snapshot branches running against flink-docker dev-master branch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21132&quot;&gt;FLINK-21132&lt;/a&gt;] - BoundedOneInput.endInput is called when taking synchronous savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21138&quot;&gt;FLINK-21138&lt;/a&gt;] - KvStateServerHandler is not invoked with user code classloader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21148&quot;&gt;FLINK-21148&lt;/a&gt;] - YARNSessionFIFOSecuredITCase cannot connect to BlobServer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21208&quot;&gt;FLINK-21208&lt;/a&gt;] - pyarrow exception when using window with pandas udaf
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21213&quot;&gt;FLINK-21213&lt;/a&gt;] - e2e test fail with &amp;#39;As task is already not running, no longer decline checkpoint&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21215&quot;&gt;FLINK-21215&lt;/a&gt;] - Checkpoint was declined because one input stream is finished
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21216&quot;&gt;FLINK-21216&lt;/a&gt;] - StreamPandasConversionTests Fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21274&quot;&gt;FLINK-21274&lt;/a&gt;] - At per-job mode, during the exit of the JobManager process, if ioExecutor exits at the end, the System.exit() method will not be executed.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21289&quot;&gt;FLINK-21289&lt;/a&gt;] - Application mode ignores the pipeline.classpaths configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21312&quot;&gt;FLINK-21312&lt;/a&gt;] - SavepointITCase.testStopSavepointWithBoundedInputConcurrently is unstable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21323&quot;&gt;FLINK-21323&lt;/a&gt;] - Stop-with-savepoint is not supported by SourceOperatorStreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21453&quot;&gt;FLINK-21453&lt;/a&gt;] - BoundedOneInput.endInput is NOT called when doing stop with savepoint WITH drain
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21497&quot;&gt;FLINK-21497&lt;/a&gt;] - JobLeaderIdService completes leader future despite no leader being elected
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21550&quot;&gt;FLINK-21550&lt;/a&gt;] - ZooKeeperHaServicesTest.testSimpleClose fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21606&quot;&gt;FLINK-21606&lt;/a&gt;] - TaskManager connected to invalid JobManager leading to TaskSubmissionException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21609&quot;&gt;FLINK-21609&lt;/a&gt;] - SimpleRecoveryITCaseBase.testRestartMultipleTimes fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21654&quot;&gt;FLINK-21654&lt;/a&gt;] - YARNSessionCapacitySchedulerITCase.testStartYarnSessionClusterInQaTeamQueue fail because of NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21725&quot;&gt;FLINK-21725&lt;/a&gt;] - DataTypeExtractor extracts wrong fields ordering for Tuple12
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21753&quot;&gt;FLINK-21753&lt;/a&gt;] - Cycle references between memory manager and gc cleaner action
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21980&quot;&gt;FLINK-21980&lt;/a&gt;] - ZooKeeperRunningJobsRegistry creates an empty znode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21986&quot;&gt;FLINK-21986&lt;/a&gt;] - taskmanager native memory not release timely after restart
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22081&quot;&gt;FLINK-22081&lt;/a&gt;] - Entropy key not resolved if flink-s3-fs-hadoop is added as a plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22109&quot;&gt;FLINK-22109&lt;/a&gt;] - Misleading exception message if the number of arguments of a nested function is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22184&quot;&gt;FLINK-22184&lt;/a&gt;] - Rest client shutdown on failure runs in netty thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22424&quot;&gt;FLINK-22424&lt;/a&gt;] - Writing to already released buffers potentially causing data corruption during job failover/cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22489&quot;&gt;FLINK-22489&lt;/a&gt;] - subtask backpressure indicator shows value for entire job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22597&quot;&gt;FLINK-22597&lt;/a&gt;] - JobMaster cannot be restarted
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22815&quot;&gt;FLINK-22815&lt;/a&gt;] - Disable unaligned checkpoints for broadcast partitioning
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22946&quot;&gt;FLINK-22946&lt;/a&gt;] - Network buffer deadlock introduced by unaligned checkpoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23164&quot;&gt;FLINK-23164&lt;/a&gt;] - JobMasterTest.testMultipleStartsWork unstable on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23166&quot;&gt;FLINK-23166&lt;/a&gt;] - ZipUtils doesn&amp;#39;t handle properly for softlinks inside the zip file
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9844&quot;&gt;FLINK-9844&lt;/a&gt;] - PackagedProgram does not close URLClassLoader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18182&quot;&gt;FLINK-18182&lt;/a&gt;] - Upgrade AWS SDK in flink-connector-kinesis to include new region af-south-1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19415&quot;&gt;FLINK-19415&lt;/a&gt;] - Move Hive document to &amp;quot;Table &amp;amp; SQL Connectors&amp;quot; from &amp;quot;Table API &amp;amp; SQL&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20651&quot;&gt;FLINK-20651&lt;/a&gt;] - Use Spotless/google-java-format for code formatting/enforcement
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20770&quot;&gt;FLINK-20770&lt;/a&gt;] - Incorrect description for config option kubernetes.rest-service.exposed.type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20790&quot;&gt;FLINK-20790&lt;/a&gt;] - Generated classes should not be put under src/ directory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20792&quot;&gt;FLINK-20792&lt;/a&gt;] - Allow shorthand invocation of spotless
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20805&quot;&gt;FLINK-20805&lt;/a&gt;] - Blink runtime classes partially ignored by spotless
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20866&quot;&gt;FLINK-20866&lt;/a&gt;] - Add how to list jobs in Yarn deployment documentation when HA enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20906&quot;&gt;FLINK-20906&lt;/a&gt;] - Update copyright year to 2021 for NOTICE files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21020&quot;&gt;FLINK-21020&lt;/a&gt;] - Bump Jackson to 2.10.5[.1] / 2.12.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21123&quot;&gt;FLINK-21123&lt;/a&gt;] - Upgrade Beanutils 1.9.x to 1.9.4
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21164&quot;&gt;FLINK-21164&lt;/a&gt;] - Jar handlers don&amp;#39;t cleanup temporarily extracted jars
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21210&quot;&gt;FLINK-21210&lt;/a&gt;] - ApplicationClusterEntryPoints should explicitly close PackagedProgram
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21411&quot;&gt;FLINK-21411&lt;/a&gt;] - The components on which Flink depends may contain vulnerabilities. If yes, fix them.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21735&quot;&gt;FLINK-21735&lt;/a&gt;] - Harden JobMaster#updateTaskExecutionState()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22142&quot;&gt;FLINK-22142&lt;/a&gt;] - Remove console logging for Kafka connector for AZP runs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22208&quot;&gt;FLINK-22208&lt;/a&gt;] - Bump snappy-java to 1.1.5+
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22470&quot;&gt;FLINK-22470&lt;/a&gt;] - The root cause of the exception encountered during compiling the job was not exposed to users in certain cases
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23312&quot;&gt;FLINK-23312&lt;/a&gt;] - Use -Dfast for building e2e tests on AZP
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Mon, 09 Aug 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/08/09/release-1.11.4.html</link>
<guid isPermaLink="true">/news/2021/08/09/release-1.11.4.html</guid>
</item>
<item>
<title>Apache Flink 1.13.2 Released</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix version of the Apache Flink 1.13 series.&lt;/p&gt;
&lt;p&gt;This release includes 127 fixes and minor improvements for Flink 1.13.2. The list below includes bugfixes and improvements. For a complete list of all changes, see
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218&amp;amp;styleName=&amp;amp;projectId=12315522&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.13.2.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt; Release Notes - Flink - Version 1.13.2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22726&quot;&gt;FLINK-22726&lt;/a&gt;] - Hive GROUPING__ID returns different value in older versions
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20888&quot;&gt;FLINK-20888&lt;/a&gt;] - ContinuousFileReaderOperator should not close the output on close()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20975&quot;&gt;FLINK-20975&lt;/a&gt;] - HiveTableSourceITCase.testPartitionFilter fails on AZP
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21389&quot;&gt;FLINK-21389&lt;/a&gt;] - ParquetInputFormat should not need parquet schema as user input
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21445&quot;&gt;FLINK-21445&lt;/a&gt;] - Application mode does not set the configuration when building PackagedProgram
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21952&quot;&gt;FLINK-21952&lt;/a&gt;] - Make all the &amp;quot;Connection reset by peer&amp;quot; exception wrapped as RemoteTransportException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22045&quot;&gt;FLINK-22045&lt;/a&gt;] - Set log level for shaded zookeeper logger
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22195&quot;&gt;FLINK-22195&lt;/a&gt;] - YARNHighAvailabilityITCase.testClusterClientRetrieval because of TestTimedOutException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22203&quot;&gt;FLINK-22203&lt;/a&gt;] - KafkaChangelogTableITCase.testKafkaCanalChangelogSource fail due to ConcurrentModificationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22272&quot;&gt;FLINK-22272&lt;/a&gt;] - Some scenes can&amp;#39;t drop table by hive catalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22312&quot;&gt;FLINK-22312&lt;/a&gt;] - YARNSessionFIFOSecuredITCase&amp;gt;YARNSessionFIFOITCase.checkForProhibitedLogContents due to the heartbeat exception with Yarn RM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22376&quot;&gt;FLINK-22376&lt;/a&gt;] - SequentialChannelStateReaderImpl may recycle buffer twice
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22443&quot;&gt;FLINK-22443&lt;/a&gt;] - can not be execute an extreme long sql under batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22462&quot;&gt;FLINK-22462&lt;/a&gt;] - JdbcExactlyOnceSinkE2eTest.testInsert failed because of too many clients.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22464&quot;&gt;FLINK-22464&lt;/a&gt;] - OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure hangs with `AdaptiveScheduler`
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22492&quot;&gt;FLINK-22492&lt;/a&gt;] - KinesisTableApiITCase with wrong results
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22496&quot;&gt;FLINK-22496&lt;/a&gt;] - ClusterEntrypointTest.testCloseAsyncShouldBeExecutedInShutdownHook failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22545&quot;&gt;FLINK-22545&lt;/a&gt;] - JVM crashes when runing OperatorEventSendingCheckpointITCase.testOperatorEventAckLost
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22547&quot;&gt;FLINK-22547&lt;/a&gt;] - OperatorCoordinatorHolderTest. verifyCheckpointEventOrderWhenCheckpointFutureCompletesLate fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22613&quot;&gt;FLINK-22613&lt;/a&gt;] - FlinkKinesisITCase.testStopWithSavepoint fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22662&quot;&gt;FLINK-22662&lt;/a&gt;] - YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22683&quot;&gt;FLINK-22683&lt;/a&gt;] - The total Flink/process memory of memoryConfiguration in /taskmanagers can be null or incorrect value
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22686&quot;&gt;FLINK-22686&lt;/a&gt;] - Incompatible subtask mappings while resuming from unaligned checkpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22689&quot;&gt;FLINK-22689&lt;/a&gt;] - Table API Documentation Row-Based Operations Example Fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22698&quot;&gt;FLINK-22698&lt;/a&gt;] - RabbitMQ source does not stop unless message arrives in queue
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22725&quot;&gt;FLINK-22725&lt;/a&gt;] - SlotManagers should unregister metrics at the start of suspend()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22730&quot;&gt;FLINK-22730&lt;/a&gt;] - Lookup join condition with CURRENT_DATE fails to filter records
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22746&quot;&gt;FLINK-22746&lt;/a&gt;] - Links to connectors in docs are broken
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22759&quot;&gt;FLINK-22759&lt;/a&gt;] - Correct the applicability of RocksDB related options as per operator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22760&quot;&gt;FLINK-22760&lt;/a&gt;] - HiveParser::setCurrentTimestamp fails with hive-3.1.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22777&quot;&gt;FLINK-22777&lt;/a&gt;] - Restore lost sections in Try Flink DataStream API example
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22779&quot;&gt;FLINK-22779&lt;/a&gt;] - KafkaChangelogTableITCase.testKafkaDebeziumChangelogSource fail due to ConcurrentModificationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22786&quot;&gt;FLINK-22786&lt;/a&gt;] - sql-client can not create .flink-sql-history file
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22795&quot;&gt;FLINK-22795&lt;/a&gt;] - Throw better exception when executing remote SQL file in SQL Client
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22796&quot;&gt;FLINK-22796&lt;/a&gt;] - Update mem_setup_tm documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22814&quot;&gt;FLINK-22814&lt;/a&gt;] - New sources are not defining/exposing checkpointStartDelayNanos metric
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22815&quot;&gt;FLINK-22815&lt;/a&gt;] - Disable unaligned checkpoints for broadcast partitioning
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22819&quot;&gt;FLINK-22819&lt;/a&gt;] - YARNFileReplicationITCase fails with &amp;quot;The YARN application unexpectedly switched to state FAILED during deployment&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22820&quot;&gt;FLINK-22820&lt;/a&gt;] - Stopping Yarn session cluster will cause fatal error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22833&quot;&gt;FLINK-22833&lt;/a&gt;] - Source tasks (both old and new) are not reporting checkpointStartDelay via CheckpointMetrics
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22856&quot;&gt;FLINK-22856&lt;/a&gt;] - Move our Azure pipelines away from Ubuntu 16.04 by September
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22863&quot;&gt;FLINK-22863&lt;/a&gt;] - ArrayIndexOutOfBoundsException may happen when building rescale edges
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22884&quot;&gt;FLINK-22884&lt;/a&gt;] - Select view columns fail when store metadata with hive
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22886&quot;&gt;FLINK-22886&lt;/a&gt;] - Thread leak in RocksDBStateUploader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22890&quot;&gt;FLINK-22890&lt;/a&gt;] - Few tests fail in HiveTableSinkITCase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22894&quot;&gt;FLINK-22894&lt;/a&gt;] - Window Top-N should allow n=1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22898&quot;&gt;FLINK-22898&lt;/a&gt;] - HiveParallelismInference limit return wrong parallelism
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22908&quot;&gt;FLINK-22908&lt;/a&gt;] - FileExecutionGraphInfoStoreTest.testPutSuspendedJobOnClusterShutdown should wait until job is running
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22927&quot;&gt;FLINK-22927&lt;/a&gt;] - Exception on JobClient.get_job_status().result()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22945&quot;&gt;FLINK-22945&lt;/a&gt;] - StackOverflowException can happen when a large scale job is CANCELING/FAILING
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22946&quot;&gt;FLINK-22946&lt;/a&gt;] - Network buffer deadlock introduced by unaligned checkpoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22948&quot;&gt;FLINK-22948&lt;/a&gt;] - Scala example for toDataStream does not compile
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22952&quot;&gt;FLINK-22952&lt;/a&gt;] - docs_404_check fail on azure due to ruby version not available
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22961&quot;&gt;FLINK-22961&lt;/a&gt;] - Incorrect calculation of alignment timeout for LocalInputChannel
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22963&quot;&gt;FLINK-22963&lt;/a&gt;] - The description of taskmanager.memory.task.heap.size in the official document is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22964&quot;&gt;FLINK-22964&lt;/a&gt;] - Connector-base exposes dependency to flink-core.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22966&quot;&gt;FLINK-22966&lt;/a&gt;] - NPE in StateAssignmentOperation when rescaling
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22980&quot;&gt;FLINK-22980&lt;/a&gt;] - FileExecutionGraphInfoStoreTest hangs on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22982&quot;&gt;FLINK-22982&lt;/a&gt;] - java.lang.ClassCastException when using Python UDF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22987&quot;&gt;FLINK-22987&lt;/a&gt;] - Scala suffix check isn&amp;#39;t working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22993&quot;&gt;FLINK-22993&lt;/a&gt;] - CompactFileWriter won&amp;#39;t emit EndCheckpoint with Long.MAX_VALUE checkpointId
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23001&quot;&gt;FLINK-23001&lt;/a&gt;] - flink-avro-glue-schema-registry lacks scala suffix
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23003&quot;&gt;FLINK-23003&lt;/a&gt;] - Resource leak in RocksIncrementalSnapshotStrategy
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23010&quot;&gt;FLINK-23010&lt;/a&gt;] - HivePartitionFetcherContextBase::getComparablePartitionValueList can return partitions that don&amp;#39;t exist
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23018&quot;&gt;FLINK-23018&lt;/a&gt;] - State factories should handle extended state descriptors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23024&quot;&gt;FLINK-23024&lt;/a&gt;] - RPC result TaskManagerInfoWithSlots not serializable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23025&quot;&gt;FLINK-23025&lt;/a&gt;] - sink-buffer-max-rows and sink-buffer-flush-interval options produce a lot of duplicates
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23030&quot;&gt;FLINK-23030&lt;/a&gt;] - PartitionRequestClientFactory#createPartitionRequestClient should throw when network failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23034&quot;&gt;FLINK-23034&lt;/a&gt;] - NPE in JobDetailsDeserializer during the reading old version of ExecutionState
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23045&quot;&gt;FLINK-23045&lt;/a&gt;] - RunnablesTest.testExecutorService_uncaughtExceptionHandler fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23073&quot;&gt;FLINK-23073&lt;/a&gt;] - Fix space handling in Row CSV timestamp parser
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23074&quot;&gt;FLINK-23074&lt;/a&gt;] - There is a class conflict between flink-connector-hive and flink-parquet
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23092&quot;&gt;FLINK-23092&lt;/a&gt;] - Built-in UDAFs could not be mixed use with Python UDAF in group window
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23096&quot;&gt;FLINK-23096&lt;/a&gt;] - HiveParser could not attach the sessionstate of hive
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23119&quot;&gt;FLINK-23119&lt;/a&gt;] - Fix the issue that the exception that General Python UDAF is unsupported is not thrown in Compile Stage.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23120&quot;&gt;FLINK-23120&lt;/a&gt;] - ByteArrayWrapperSerializer.serialize should use writeInt to serialize the length
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23121&quot;&gt;FLINK-23121&lt;/a&gt;] - Fix the issue that the InternalRow as arguments in Python UDAF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23129&quot;&gt;FLINK-23129&lt;/a&gt;] - When cancelling any running job of multiple jobs in an application cluster, JobManager shuts down
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23133&quot;&gt;FLINK-23133&lt;/a&gt;] - The dependencies are not handled properly when mixing use of Python Table API and Python DataStream API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23151&quot;&gt;FLINK-23151&lt;/a&gt;] - KinesisTableApiITCase.testTableApiSourceAndSink fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23166&quot;&gt;FLINK-23166&lt;/a&gt;] - ZipUtils doesn&amp;#39;t handle properly for softlinks inside the zip file
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23182&quot;&gt;FLINK-23182&lt;/a&gt;] - Connection leak in RMQSource
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23184&quot;&gt;FLINK-23184&lt;/a&gt;] - CompileException Assignment conversion not possible from type &amp;quot;int&amp;quot; to type &amp;quot;short&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23188&quot;&gt;FLINK-23188&lt;/a&gt;] - Unsupported function definition: IFNULL. Only user defined functions are supported as inline functions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23196&quot;&gt;FLINK-23196&lt;/a&gt;] - JobMasterITCase fail on azure due to BindException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23201&quot;&gt;FLINK-23201&lt;/a&gt;] - The check on alignmentDurationNanos seems to be too strict
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23223&quot;&gt;FLINK-23223&lt;/a&gt;] - When flushAlways is enabled the subpartition may lose notification of data availability
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23233&quot;&gt;FLINK-23233&lt;/a&gt;] - OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23235&quot;&gt;FLINK-23235&lt;/a&gt;] - SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23248&quot;&gt;FLINK-23248&lt;/a&gt;] - SinkWriter is not closed when failing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23259&quot;&gt;FLINK-23259&lt;/a&gt;] - [DOCS]The &amp;#39;window&amp;#39; link on page docs/dev/datastream/operators/overview is failed and 404 is returned
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23260&quot;&gt;FLINK-23260&lt;/a&gt;] - [DOCS]The link on page docs/libs/gelly/overview is failed and 404 is returned
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23270&quot;&gt;FLINK-23270&lt;/a&gt;] - Impove description of Regular Joins section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23280&quot;&gt;FLINK-23280&lt;/a&gt;] - Python ExplainDetails does not have JSON_EXECUTION_PLAN option
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23306&quot;&gt;FLINK-23306&lt;/a&gt;] - FlinkRelMdUniqueKeys causes exception when used with new Schema
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23359&quot;&gt;FLINK-23359&lt;/a&gt;] - Fix the number of available slots in testResourceCanBeAllocatedForDifferentJobAfterFree
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23368&quot;&gt;FLINK-23368&lt;/a&gt;] - Fix the wrong mapping of state cache in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23429&quot;&gt;FLINK-23429&lt;/a&gt;] - State Processor API failed with FileNotFoundException when working with state files on Cloud Storage
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22770&quot;&gt;FLINK-22770&lt;/a&gt;] - Expose SET/RESET from the parser
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18182&quot;&gt;FLINK-18182&lt;/a&gt;] - Upgrade AWS SDK in flink-connector-kinesis to include new region af-south-1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20140&quot;&gt;FLINK-20140&lt;/a&gt;] - Add documentation of TableResult.collect for Python Table API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21229&quot;&gt;FLINK-21229&lt;/a&gt;] - Support ssl connection with schema registry format
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21393&quot;&gt;FLINK-21393&lt;/a&gt;] - Implement ParquetAvroInputFormat
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21411&quot;&gt;FLINK-21411&lt;/a&gt;] - The components on which Flink depends may contain vulnerabilities. If yes, fix them.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22528&quot;&gt;FLINK-22528&lt;/a&gt;] - Document latency tracking metrics for state accesses
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22638&quot;&gt;FLINK-22638&lt;/a&gt;] - Keep channels blocked on alignment timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22655&quot;&gt;FLINK-22655&lt;/a&gt;] - When using -i &amp;lt;init.sql&amp;gt; option to initialize SQL Client session It should be possible to annotate the script with --
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22722&quot;&gt;FLINK-22722&lt;/a&gt;] - Add Documentation for Kafka New Source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22747&quot;&gt;FLINK-22747&lt;/a&gt;] - Update commons-io to 2.8
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22766&quot;&gt;FLINK-22766&lt;/a&gt;] - Report metrics of KafkaConsumer in Kafka new source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22774&quot;&gt;FLINK-22774&lt;/a&gt;] - Update Kinesis SQL connector&amp;#39;s Guava to 27.0-jre
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22855&quot;&gt;FLINK-22855&lt;/a&gt;] - Translate the &amp;#39;Overview of Python API&amp;#39; page into Chinese.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22873&quot;&gt;FLINK-22873&lt;/a&gt;] - Add ToC to configuration documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22905&quot;&gt;FLINK-22905&lt;/a&gt;] - Fix missing comma in SQL example in &amp;quot;Versioned Table&amp;quot; page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22939&quot;&gt;FLINK-22939&lt;/a&gt;] - Generalize JDK switch in azure setup
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22996&quot;&gt;FLINK-22996&lt;/a&gt;] - The description about coalesce is wrong
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23009&quot;&gt;FLINK-23009&lt;/a&gt;] - Bump up Guava in Kinesis Connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23052&quot;&gt;FLINK-23052&lt;/a&gt;] - cron_snapshot_deployment_maven unstable on maven
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23138&quot;&gt;FLINK-23138&lt;/a&gt;] - Raise an exception if types other than PickledBytesTypeInfo are specified for state descriptor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23156&quot;&gt;FLINK-23156&lt;/a&gt;] - Change the reference of &amp;#39;docs/dev/table/sql/queries&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23157&quot;&gt;FLINK-23157&lt;/a&gt;] - Fix missing comma in SQL example in &amp;quot;Versioned View&amp;quot; page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23162&quot;&gt;FLINK-23162&lt;/a&gt;] - Create table uses time_ltz in the column name and it&amp;#39;s expression which results in exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23168&quot;&gt;FLINK-23168&lt;/a&gt;] - Catalog shouldn&amp;#39;t merge properties for alter DB operation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23178&quot;&gt;FLINK-23178&lt;/a&gt;] - Raise an error for writing stream data into partitioned hive tables without a partition committer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23200&quot;&gt;FLINK-23200&lt;/a&gt;] - Correct grammatical mistakes in &amp;#39;Table API&amp;#39; page of &amp;#39;Table API &amp;amp; SQL&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23226&quot;&gt;FLINK-23226&lt;/a&gt;] - Flink Chinese doc learn-flink/etl transformation.svg display issue
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23312&quot;&gt;FLINK-23312&lt;/a&gt;] - Use -Dfast for building e2e tests on AZP
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 06 Aug 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/08/06/release-1.13.2.html</link>
<guid isPermaLink="true">/news/2021/08/06/release-1.13.2.html</guid>
</item>
<item>
<title>Apache Flink 1.12.5 Released</title>
<description>&lt;p&gt;The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.&lt;/p&gt;
&lt;p&gt;This release includes 76 fixes and minor improvements for Flink 1.12.4. The list below contains a detailed overview of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.12.5.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.5&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Release Notes - Flink - Version 1.12.5&lt;/p&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19925&quot;&gt;FLINK-19925&lt;/a&gt;] - Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20321&quot;&gt;FLINK-20321&lt;/a&gt;] - Get NPE when using AvroDeserializationSchema to deserialize null input
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20888&quot;&gt;FLINK-20888&lt;/a&gt;] - ContinuousFileReaderOperator should not close the output on close()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21329&quot;&gt;FLINK-21329&lt;/a&gt;] - &amp;quot;Local recovery and sticky scheduling end-to-end test&amp;quot; does not finish within 600 seconds
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21445&quot;&gt;FLINK-21445&lt;/a&gt;] - Application mode does not set the configuration when building PackagedProgram
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21469&quot;&gt;FLINK-21469&lt;/a&gt;] - stop-with-savepoint --drain doesn&amp;#39;t advance watermark for sources chained to MultipleInputStreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21952&quot;&gt;FLINK-21952&lt;/a&gt;] - Make all the &amp;quot;Connection reset by peer&amp;quot; exception wrapped as RemoteTransportException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22015&quot;&gt;FLINK-22015&lt;/a&gt;] - SQL filter containing OR and IS NULL will produce an incorrect result.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22105&quot;&gt;FLINK-22105&lt;/a&gt;] - SubtaskCheckpointCoordinatorTest.testForceAlignedCheckpointResultingInPriorityEvents unstable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22157&quot;&gt;FLINK-22157&lt;/a&gt;] - Join &amp;amp; Select a part of composite primary key will cause ArrayIndexOutOfBoundsException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22312&quot;&gt;FLINK-22312&lt;/a&gt;] - YARNSessionFIFOSecuredITCase&amp;gt;YARNSessionFIFOITCase.checkForProhibitedLogContents due to the heartbeat exception with Yarn RM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22408&quot;&gt;FLINK-22408&lt;/a&gt;] - Flink Table Parsr Hive Drop Partitions Syntax unparse is Error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22419&quot;&gt;FLINK-22419&lt;/a&gt;] - testScheduleRunAsync fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22434&quot;&gt;FLINK-22434&lt;/a&gt;] - Dispatcher does not store suspended jobs in execution graph store
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22443&quot;&gt;FLINK-22443&lt;/a&gt;] - can not be execute an extreme long sql under batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22494&quot;&gt;FLINK-22494&lt;/a&gt;] - Avoid discarding checkpoints in case of failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22496&quot;&gt;FLINK-22496&lt;/a&gt;] - ClusterEntrypointTest.testCloseAsyncShouldBeExecutedInShutdownHook failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22502&quot;&gt;FLINK-22502&lt;/a&gt;] - DefaultCompletedCheckpointStore drops unrecoverable checkpoints silently
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22547&quot;&gt;FLINK-22547&lt;/a&gt;] - OperatorCoordinatorHolderTest. verifyCheckpointEventOrderWhenCheckpointFutureCompletesLate fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22564&quot;&gt;FLINK-22564&lt;/a&gt;] - Kubernetes-related ITCases do not fail even in case of failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22592&quot;&gt;FLINK-22592&lt;/a&gt;] - numBuffersInLocal is always zero when using unaligned checkpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22613&quot;&gt;FLINK-22613&lt;/a&gt;] - FlinkKinesisITCase.testStopWithSavepoint fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22683&quot;&gt;FLINK-22683&lt;/a&gt;] - The total Flink/process memory of memoryConfiguration in /taskmanagers can be null or incorrect value
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22698&quot;&gt;FLINK-22698&lt;/a&gt;] - RabbitMQ source does not stop unless message arrives in queue
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22704&quot;&gt;FLINK-22704&lt;/a&gt;] - ZooKeeperHaServicesTest.testCleanupJobData failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22721&quot;&gt;FLINK-22721&lt;/a&gt;] - Breaking HighAvailabilityServices interface by adding new method
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22733&quot;&gt;FLINK-22733&lt;/a&gt;] - Type mismatch thrown in DataStream.union if parameter is KeyedStream for Python DataStream API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22756&quot;&gt;FLINK-22756&lt;/a&gt;] - DispatcherTest.testJobStatusIsShownDuringTermination fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22788&quot;&gt;FLINK-22788&lt;/a&gt;] - Code of equals method grows beyond 64 KB
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22814&quot;&gt;FLINK-22814&lt;/a&gt;] - New sources are not defining/exposing checkpointStartDelayNanos metric
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22815&quot;&gt;FLINK-22815&lt;/a&gt;] - Disable unaligned checkpoints for broadcast partitioning
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22819&quot;&gt;FLINK-22819&lt;/a&gt;] - YARNFileReplicationITCase fails with &amp;quot;The YARN application unexpectedly switched to state FAILED during deployment&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22820&quot;&gt;FLINK-22820&lt;/a&gt;] - Stopping Yarn session cluster will cause fatal error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22833&quot;&gt;FLINK-22833&lt;/a&gt;] - Source tasks (both old and new) are not reporting checkpointStartDelay via CheckpointMetrics
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22856&quot;&gt;FLINK-22856&lt;/a&gt;] - Move our Azure pipelines away from Ubuntu 16.04 by September
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22886&quot;&gt;FLINK-22886&lt;/a&gt;] - Thread leak in RocksDBStateUploader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22898&quot;&gt;FLINK-22898&lt;/a&gt;] - HiveParallelismInference limit return wrong parallelism
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22908&quot;&gt;FLINK-22908&lt;/a&gt;] - FileExecutionGraphInfoStoreTest.testPutSuspendedJobOnClusterShutdown should wait until job is running
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22927&quot;&gt;FLINK-22927&lt;/a&gt;] - Exception on JobClient.get_job_status().result()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22946&quot;&gt;FLINK-22946&lt;/a&gt;] - Network buffer deadlock introduced by unaligned checkpoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22952&quot;&gt;FLINK-22952&lt;/a&gt;] - docs_404_check fail on azure due to ruby version not available
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22963&quot;&gt;FLINK-22963&lt;/a&gt;] - The description of taskmanager.memory.task.heap.size in the official document is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22964&quot;&gt;FLINK-22964&lt;/a&gt;] - Connector-base exposes dependency to flink-core.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22987&quot;&gt;FLINK-22987&lt;/a&gt;] - Scala suffix check isn&amp;#39;t working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23010&quot;&gt;FLINK-23010&lt;/a&gt;] - HivePartitionFetcherContextBase::getComparablePartitionValueList can return partitions that don&amp;#39;t exist
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23030&quot;&gt;FLINK-23030&lt;/a&gt;] - PartitionRequestClientFactory#createPartitionRequestClient should throw when network failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23045&quot;&gt;FLINK-23045&lt;/a&gt;] - RunnablesTest.testExecutorService_uncaughtExceptionHandler fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23074&quot;&gt;FLINK-23074&lt;/a&gt;] - There is a class conflict between flink-connector-hive and flink-parquet
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23076&quot;&gt;FLINK-23076&lt;/a&gt;] - DispatcherTest.testWaitingForJobMasterLeadership fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23119&quot;&gt;FLINK-23119&lt;/a&gt;] - Fix the issue that the exception that General Python UDAF is unsupported is not thrown in Compile Stage.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23120&quot;&gt;FLINK-23120&lt;/a&gt;] - ByteArrayWrapperSerializer.serialize should use writeInt to serialize the length
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23133&quot;&gt;FLINK-23133&lt;/a&gt;] - The dependencies are not handled properly when mixing use of Python Table API and Python DataStream API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23135&quot;&gt;FLINK-23135&lt;/a&gt;] - Flink SQL Error while applying rule AggregateReduceGroupingRule
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23164&quot;&gt;FLINK-23164&lt;/a&gt;] - JobMasterTest.testMultipleStartsWork unstable on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23166&quot;&gt;FLINK-23166&lt;/a&gt;] - ZipUtils doesn&amp;#39;t handle properly for softlinks inside the zip file
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23182&quot;&gt;FLINK-23182&lt;/a&gt;] - Connection leak in RMQSource
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23184&quot;&gt;FLINK-23184&lt;/a&gt;] - CompileException Assignment conversion not possible from type &amp;quot;int&amp;quot; to type &amp;quot;short&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23201&quot;&gt;FLINK-23201&lt;/a&gt;] - The check on alignmentDurationNanos seems to be too strict
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23223&quot;&gt;FLINK-23223&lt;/a&gt;] - When flushAlways is enabled the subpartition may lose notification of data availability
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23233&quot;&gt;FLINK-23233&lt;/a&gt;] - OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23248&quot;&gt;FLINK-23248&lt;/a&gt;] - SinkWriter is not closed when failing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23417&quot;&gt;FLINK-23417&lt;/a&gt;] - MiniClusterITCase.testHandleBatchJobsWhenNotEnoughSlot fails on Azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23429&quot;&gt;FLINK-23429&lt;/a&gt;] - State Processor API failed with FileNotFoundException when working with state files on Cloud Storage
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17857&quot;&gt;FLINK-17857&lt;/a&gt;] - Kubernetes and docker e2e tests could not run on Mac OS after migration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18182&quot;&gt;FLINK-18182&lt;/a&gt;] - Upgrade AWS SDK in flink-connector-kinesis to include new region af-south-1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20695&quot;&gt;FLINK-20695&lt;/a&gt;] - Zookeeper node under leader and leaderlatch is not deleted after job finished
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21229&quot;&gt;FLINK-21229&lt;/a&gt;] - Support ssl connection with schema registry format
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21411&quot;&gt;FLINK-21411&lt;/a&gt;] - The components on which Flink depends may contain vulnerabilities. If yes, fix them.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22708&quot;&gt;FLINK-22708&lt;/a&gt;] - Propagate savepoint settings from StreamExecutionEnvironment to StreamGraph
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22747&quot;&gt;FLINK-22747&lt;/a&gt;] - Update commons-io to 2.8
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22757&quot;&gt;FLINK-22757&lt;/a&gt;] - Update GCS documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22774&quot;&gt;FLINK-22774&lt;/a&gt;] - Update Kinesis SQL connector&amp;#39;s Guava to 27.0-jre
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22939&quot;&gt;FLINK-22939&lt;/a&gt;] - Generalize JDK switch in azure setup
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23009&quot;&gt;FLINK-23009&lt;/a&gt;] - Bump up Guava in Kinesis Connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23052&quot;&gt;FLINK-23052&lt;/a&gt;] - cron_snapshot_deployment_maven unstable on maven
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-23312&quot;&gt;FLINK-23312&lt;/a&gt;] - Use -Dfast for building e2e tests on AZP
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 06 Aug 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/08/06/release-1.12.5.html</link>
<guid isPermaLink="true">/news/2021/08/06/release-1.12.5.html</guid>
</item>
<item>
<title>How to identify the source of backpressure?</title>
<description>&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#what-is-backpressure&quot; id=&quot;markdown-toc-what-is-backpressure&quot;&gt;What is backpressure?&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#why-should-i-care-about-backpressure&quot; id=&quot;markdown-toc-why-should-i-care-about-backpressure&quot;&gt;Why should I care about backpressure?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#why-shouldnt-i-care-about-backpressure&quot; id=&quot;markdown-toc-why-shouldnt-i-care-about-backpressure&quot;&gt;Why shouldn’t I care about backpressure?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#how-to-detect-and-track-down-the-source-of-backpressure&quot; id=&quot;markdown-toc-how-to-detect-and-track-down-the-source-of-backpressure&quot;&gt;How to detect and track down the source of backpressure?&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#what-are-those-numbers&quot; id=&quot;markdown-toc-what-are-those-numbers&quot;&gt;What are those numbers?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#varying-load&quot; id=&quot;markdown-toc-varying-load&quot;&gt;Varying load&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#what-can-i-do-with-backpressure&quot; id=&quot;markdown-toc-what-can-i-do-with-backpressure&quot;&gt;What can I do with backpressure?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/animated.png&quot; alt=&quot;Backpressure monitoring in the web UI&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Backpressure monitoring in the web UI&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The backpressure topic was tackled from different angles over the last couple of years. However, when it comes
to identifying and analyzing sources of backpressure, things have changed quite a bit in the recent Flink releases
(especially with new additions to metrics and the web UI in Flink 1.13). This post will try to clarify some of
these changes and go into more detail about how to track down the source of backpressure, but first…&lt;/p&gt;
&lt;h2 id=&quot;what-is-backpressure&quot;&gt;What is backpressure?&lt;/h2&gt;
&lt;p&gt;This has been explained very well in an old, but still accurate, &lt;a href=&quot;https://www.ververica.com/blog/how-flink-handles-backpressure&quot;&gt;post by Ufuk Celebi&lt;/a&gt;.
I highly recommend reading it if you are not familiar with this concept. For a much deeper and low-level understanding of
the topic and how Flink’s network stack works, there is a more &lt;a href=&quot;https://alibabacloud.com/blog/analysis-of-network-flow-control-and-back-pressure-flink-advanced-tutorials_596632&quot;&gt;advanced explanation available here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At a high level, backpressure happens if some operator(s) in the Job Graph cannot process records at the
same rate as they are received. This fills up the input buffers of the subtask that is running this slow operator.
Once the input buffers are full, backpressure propagates to the output buffers of the upstream subtasks.
Once those are filled up, the upstream subtasks are also forced to slow down their records’ processing
rate to match the processing rate of the operator causing this bottleneck down the stream. Backpressure
further propagates up the stream until it reaches the source operators.&lt;/p&gt;
&lt;p&gt;As long as the load and available resources are static and none of the operators produce short bursts of
data (like windowing operators), those input/output buffers should only be in one of two states: almost empty
or almost full. If the downstream operator or subtask is able to keep up with the influx of data, the
buffers will be empty. If not, then the buffers will be full [&lt;sup&gt;1&lt;/sup&gt;]. In fact, checking the buffers’ usage metrics
was the basis of the previously recommended way of detecting and analyzing backpressure, described &lt;a href=&quot;https://flink.apache.org/2019/07/23/flink-network-stack-2.html#backpressure&quot;&gt;a couple
of years back by Nico Kruber&lt;/a&gt;.
As I mentioned in the beginning, Flink now offers much better tools to do the same job, but before we get to that,
there are two questions worth asking.&lt;/p&gt;
&lt;h3 id=&quot;why-should-i-care-about-backpressure&quot;&gt;Why should I care about backpressure?&lt;/h3&gt;
&lt;p&gt;Backpressure is an indicator that your machines or operators are overloaded. The buildup of backpressure
directly affects the end-to-end latency of the system, as records are waiting longer in the queues before
being processed. Secondly, aligned checkpointing takes longer with backpressure, while unaligned checkpoints
will be larger (you can read more about aligned and unaligned checkpoints &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/concepts/stateful-stream-processing/#checkpointing&quot;&gt;in the documentation&lt;/a&gt;).
If you are struggling with checkpoint barriers propagation times, taking care of backpressure would most
likely help to solve the problem. Lastly, you might just want to optimize your job in order to reduce
the costs of running the job.&lt;/p&gt;
&lt;p&gt;In order to address the problem for all cases, one needs to be aware of it, then locate and analyze it.&lt;/p&gt;
&lt;h3 id=&quot;why-shouldnt-i-care-about-backpressure&quot;&gt;Why shouldn’t I care about backpressure?&lt;/h3&gt;
&lt;p&gt;Frankly, you do not always have to care about the presence of backpressure. Almost by definition, lack
of backpressure means that your cluster is at least ever so slightly underutilized and over-provisioned.
If you want to minimize idling resources, you probably cannot avoid incurring some backpressure. This
is especially true for batch processing.&lt;/p&gt;
&lt;h2 id=&quot;how-to-detect-and-track-down-the-source-of-backpressure&quot;&gt;How to detect and track down the source of backpressure?&lt;/h2&gt;
&lt;p&gt;One way to detect backpressure is to use &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/metrics/#system-metrics&quot;&gt;metrics&lt;/a&gt;,
however, in Flink 1.13 it’s no longer necessary to dig so deep. In most cases, it should be enough to just
look at the job graph in the Web UI.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/simple-example.png&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The first thing to note in the example above is that different tasks have different colors. Those colors
represent a combination of two factors: under how much backpressure this task is and how busy it is. Idling
tasks will be blue, fully busy tasks will be red hot, and fully backpressured tasks will be black. Anything
in between will be a combination/shade of those three colors. With this knowledge, one can easily spot the
backpressured tasks (black). The busiest (red) task downstream of the backpressured tasks will most likely
be the source of the backpressure (the bottleneck).&lt;/p&gt;
&lt;p&gt;If you click on one particular task and go into the “BackPressure” tab, you will be able to further dissect
the problem and check the busy/backpressured/idle status of every subtask in that task. For example,
this is especially handy if there is a data skew and not all subtasks are equally utilized.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/subtasks.png&quot; alt=&quot;Backpressure among subtasks&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Backpressure among subtasks&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In the above example, we can clearly see which subtasks are idling, which are backpressured, and that
none of them are busy. And frankly, in a nutshell, that should be enough to quickly understand what is
happening with your Job :) However, there are a couple of more details worth explaining.&lt;/p&gt;
&lt;h3 id=&quot;what-are-those-numbers&quot;&gt;What are those numbers?&lt;/h3&gt;
&lt;p&gt;If you are curious how it works underneath, we can go a little deeper. At the base of this new mechanism
we have three &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/metrics/#io&quot;&gt;new metrics&lt;/a&gt;
that are exposed and calculated by each subtask:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;idleTimeMsPerSecond&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;busyTimeMsPerSecond&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;backPressuredTimeMsPerSecond&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of them measures the average time in milliseconds per second that the subtask spent being idle,
busy, or backpressured respectively. Apart from some rounding errors they should complement each other and
add up to 1000ms/s. In essence, they are quite similar to, for example, CPU usage metrics.&lt;/p&gt;
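&lt;p&gt;If you want to look at these numbers outside of the web UI (for example to feed your own dashboards), you can also read them through Flink’s REST API or any configured metrics reporter. Below is a minimal, hypothetical sketch using Java 11’s HttpClient; the job and vertex IDs are placeholders, and the exact path for aggregated subtask metrics is an assumption, so please double-check it against the REST API documentation of your Flink version.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BusyTimeProbe {

    public static void main(String[] args) throws Exception {
        // Placeholders: take the real IDs from the web UI or the /jobs endpoints.
        String jobId = &quot;&lt;job-id&gt;&quot;;
        String vertexId = &quot;&lt;vertex-id&gt;&quot;;

        // Assumed path for aggregated subtask metrics; verify it for your Flink version.
        String url = &quot;http://localhost:8081/jobs/&quot; + jobId + &quot;/vertices/&quot; + vertexId
                + &quot;/subtasks/metrics?get=idleTimeMsPerSecond,busyTimeMsPerSecond,backPressuredTimeMsPerSecond&quot;;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse&lt;String&gt; response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The three values should roughly add up to 1000 ms/s for each subtask.
        System.out.println(response.body());
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;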
&lt;p&gt;Another important detail is that they are being averaged over a short period of time (a couple of seconds)
and they take into account everything that is happening inside the subtask’s thread: operators, functions,
timers, checkpointing, records serialization/deserialization, network stack, and other Flink internal
overheads. A &lt;code&gt;WindowOperator&lt;/code&gt; that is busy firing timers and producing results will be reported as busy or backpressured.
A function doing some expensive computation in &lt;code&gt;CheckpointedFunction#snapshotState&lt;/code&gt; call, for instance flushing
internal buffers, will also be reported as busy.&lt;/p&gt;
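&lt;p&gt;To make that last point a bit more tangible, here is a minimal sketch (an illustration, not code from any Flink release) of a buffering sink that flushes its records inside &lt;code&gt;CheckpointedFunction#snapshotState&lt;/code&gt;. The flush runs in the main thread of the subtask, so the time it takes is reported as busy (or backpressured) time rather than idle time.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.util.ArrayList;
import java.util.List;

import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

/** A sink that buffers records and flushes them on every checkpoint. */
public class BufferingSink implements SinkFunction&lt;String&gt;, CheckpointedFunction {

    private final List&lt;String&gt; buffer = new ArrayList&lt;&gt;();

    @Override
    public void invoke(String value, Context context) {
        buffer.add(value);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // A potentially expensive flush that runs in the main thread of the subtask,
        // so it is accounted for as busy time, not idle time.
        flushToExternalSystem(buffer);
        buffer.clear();
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // State recovery is omitted to keep the sketch short.
    }

    private void flushToExternalSystem(List&lt;String&gt; records) {
        // Placeholder for a slow synchronous write to an external system.
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;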
&lt;p&gt;One limitation, however, is that &lt;code&gt;busyTimeMsPerSecond&lt;/code&gt; and &lt;code&gt;idleTimeMsPerSecond&lt;/code&gt; metrics are oblivious
to anything that is happening in separate threads, outside of the main subtask’s execution loop.
Fortunately, this is only relevant for two cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Custom threads that you manually spawn in your operators (a discouraged practice).&lt;/li&gt;
&lt;li&gt;Old-style sources that implement the deprecated &lt;code&gt;SourceFunction&lt;/code&gt; interface. Such sources will report &lt;code&gt;NaN&lt;/code&gt;/&lt;code&gt;N/A&lt;/code&gt;
as the value for &lt;code&gt;busyTimeMsPerSecond&lt;/code&gt; (a minimal sketch of such a source is shown after this list). For more information on the topic of Data Sources please
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/sources/&quot;&gt;take a look here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
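&lt;p&gt;For reference, the following is a minimal sketch (purely illustrative) of such an old-style source. Anything that still implements the deprecated &lt;code&gt;SourceFunction&lt;/code&gt; interface falls into this category and will not report its busy time:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.streaming.api.functions.source.SourceFunction;

/** A legacy source; subtasks running it report NaN/N/A for busyTimeMsPerSecond. */
public class LegacyCounterSource implements SourceFunction&lt;Long&gt; {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext&lt;Long&gt; ctx) throws Exception {
        long counter = 0;
        while (running) {
            // Emitting happens in a separate source thread, which the
            // busy/idle metrics of the main subtask loop cannot observe.
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(counter++);
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;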
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/source-task-busy.png&quot; alt=&quot;Old-style sources do not report busy time&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Old-style sources do not report busy time&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In order to present those raw numbers in the web UI, those metrics need to be aggregated from all subtasks
(on the job graph we are showing only tasks). This is why the web UI presents the maximal value from all
subtasks of a given task and why the aggregated maximal values of busy and backpressured may not add up to 100%.
One subtask can be backpressured at 60%, while another can be busy at 60%. This can result in a task that
is both backpressured and busy at 60%.&lt;/p&gt;
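&lt;p&gt;As a toy illustration of that aggregation (again, just a sketch and not part of Flink itself), taking the maximum of each metric independently over the subtasks is exactly what lets the displayed busy and backpressured values of a single task add up to more than 100%:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;public class TaskLevelAggregation {

    public static void main(String[] args) {
        // Hypothetical percentages for two subtasks of the same task.
        double[] busy          = {60.0, 30.0};
        double[] backPressured = {30.0, 60.0};

        // The web UI shows the maximum over all subtasks, per metric.
        System.out.println(&quot;task busy:          &quot; + max(busy) + &quot;%&quot;);          // 60.0%
        System.out.println(&quot;task backpressured: &quot; + max(backPressured) + &quot;%&quot;); // 60.0%
        // 60% busy + 60% backpressured &gt; 100% at the task level, even though
        // busy + backpressured + idle is ~100% for each individual subtask.
    }

    private static double max(double[] values) {
        double result = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            result = Math.max(result, v);
        }
        return result;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;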
&lt;h3 id=&quot;varying-load&quot;&gt;Varying load&lt;/h3&gt;
&lt;p&gt;There is one more thing. Do you remember that those metrics are measured and averaged over a couple of seconds?
Keep this in mind when analyzing jobs or tasks with varying load, such as (sub)tasks containing a &lt;code&gt;WindowOperator&lt;/code&gt;
that is firing periodically. Both the subtask with a constant load of 50% and the subtask that alternates every
second between being fully busy and fully idle will be reporting the same value of &lt;code&gt;busyTimeMsPerSecond&lt;/code&gt;
of 500ms/s.&lt;/p&gt;
&lt;p&gt;Furthermore, varying load and especially firing windows can move the bottleneck to a different place in
the job graph:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/bottleneck-zoom.png&quot; alt=&quot;Bottleneck alternating between two tasks&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Bottleneck alternating between two tasks&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-07-07-backpressure/sliding-window.png&quot; alt=&quot;SlidingWindowOperator&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;SlidingWindowOperator&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In this particular example, &lt;code&gt;SlidingWindowOperator&lt;/code&gt; was the bottleneck as long as it was accumulating records.
However, as soon as it starts to fire its windows (once every 10 seconds), the downstream task
&lt;code&gt;SlidingWindowCheckMapper -&amp;gt; Sink: SlidingWindowCheckPrintSink&lt;/code&gt; becomes the bottleneck and &lt;code&gt;SlidingWindowOperator&lt;/code&gt;
gets backpressured. As those busy/backpressured/idle metrics are averaging time over a couple of seconds,
this subtlety is not immediately visible and has to be read between the lines. On top of that, the web UI
is updating its state only once every 10 seconds, which makes spotting more frequent changes a bit more difficult.&lt;/p&gt;
&lt;h2 id=&quot;what-can-i-do-with-backpressure&quot;&gt;What can I do with backpressure?&lt;/h2&gt;
&lt;p&gt;In general this is a complex topic that is worthy of a dedicated blog post. It was, to a certain extent,
addressed in &lt;a href=&quot;https://flink.apache.org/2019/07/23/flink-network-stack-2.html#:~:text=this%20is%20unnecessary.-,What%20to%20do%20with%20Backpressure%3F,-Assuming%20that%20you&quot;&gt;previous blog posts&lt;/a&gt;.
In short, there are two high-level ways of dealing with backpressure. Either add more resources (more machines,
faster CPU, more RAM, better network, using SSDs…) or optimize usage of the resources you already have
(optimize the code, tune the configuration, avoid data skew). In either case, you first need to analyze
what is causing backpressure by:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Identifying the presence of backpressure.&lt;/li&gt;
&lt;li&gt;Locating which subtask(s) or machines are causing it.&lt;/li&gt;
&lt;li&gt;Digging deeper into what part of the code is causing it and which resource is scarce.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Backpressure monitoring improvements and metrics can help you with the first two points. To tackle the
last one, profiling the code can be the way to go. To help with profiling, also starting from Flink 1.13,
&lt;a href=&quot;http://www.brendangregg.com/flamegraphs.html&quot;&gt;Flame Graphs&lt;/a&gt; are &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs/&quot;&gt;integrated into Flink’s web UI&lt;/a&gt;.
Flame Graphs are a well-known profiling and visualization technique, and I encourage you to give them a try.&lt;/p&gt;
&lt;p&gt;But keep in mind that after locating where the bottleneck is, you can analyze it the same way you would
any other non-distributed application (by checking resource utilization, attaching a profiler, etc).
Usually there is no silver bullet for problems like this. You can try to scale up but sometimes it might
not be easy or practical to do.&lt;/p&gt;
&lt;p&gt;Anyway… The aforementioned improvements to backpressure monitoring allow us to easily detect the source of backpressure,
and Flame Graphs can help us to analyze why a particular subtask is causing problems. Together those two
features should make the previously quite tedious process of debugging and performance analysis of Flink
jobs that much easier! Please upgrade to Flink 1.13.x and try them out!&lt;/p&gt;
&lt;p&gt;[&lt;sup&gt;1&lt;/sup&gt;] There is a third possibility. In a rare case when network exchange is actually the bottleneck in your job,
the downstream task will have empty input buffers, while upstream output buffers will be full. &lt;a class=&quot;anchor&quot; id=&quot;1&quot;&gt;&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Wed, 07 Jul 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/07/07/backpressure.html</link>
<guid isPermaLink="true">/2021/07/07/backpressure.html</guid>
</item>
<item>
<title>Apache Flink 1.13.1 Released</title>
<description>&lt;p&gt;The Apache Flink community released the first bugfix version of the Apache Flink 1.13 series.&lt;/p&gt;
&lt;p&gt;This release includes 82 fixes and minor improvements for Flink 1.13.1. The list below includes bugfixes and improvements. For a complete list of all changes see:
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12350058&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.13.1.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.13.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt; Release Notes - Flink - Version 1.13.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22378&quot;&gt;FLINK-22378&lt;/a&gt;] - Type mismatch when declaring SOURCE_WATERMARK on TIMESTAMP_LTZ column
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22666&quot;&gt;FLINK-22666&lt;/a&gt;] - Make structured type&amp;#39;s fields more lenient during casting
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12351&quot;&gt;FLINK-12351&lt;/a&gt;] - AsyncWaitOperator should deep copy StreamElement when object reuse is enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17170&quot;&gt;FLINK-17170&lt;/a&gt;] - Cannot stop streaming job with savepoint which uses kinesis consumer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19449&quot;&gt;FLINK-19449&lt;/a&gt;] - LEAD/LAG cannot work correctly in streaming mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21181&quot;&gt;FLINK-21181&lt;/a&gt;] - Buffer pool is destroyed error when outputting data over a timer after cancellation.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21247&quot;&gt;FLINK-21247&lt;/a&gt;] - flink iceberg table map&amp;lt;string,string&amp;gt; cannot convert to datastream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21469&quot;&gt;FLINK-21469&lt;/a&gt;] - stop-with-savepoint --drain doesn&amp;#39;t advance watermark for sources chained to MultipleInputStreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21923&quot;&gt;FLINK-21923&lt;/a&gt;] - SplitAggregateRule will be abnormal, when the sum/count and avg in SQL at the same time
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22109&quot;&gt;FLINK-22109&lt;/a&gt;] - Misleading exception message if the number of arguments of a nested function is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22294&quot;&gt;FLINK-22294&lt;/a&gt;] - Hive reading fail when getting file numbers on different filesystem nameservices
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22355&quot;&gt;FLINK-22355&lt;/a&gt;] - Simple Task Manager Memory Model image does not show up
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22356&quot;&gt;FLINK-22356&lt;/a&gt;] - Filesystem/Hive partition file is not committed when watermark is applied on rowtime of TIMESTAMP_LTZ type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22408&quot;&gt;FLINK-22408&lt;/a&gt;] - Flink Table Parser: Hive DROP PARTITIONS syntax unparse is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22424&quot;&gt;FLINK-22424&lt;/a&gt;] - Writing to already released buffers potentially causing data corruption during job failover/cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22431&quot;&gt;FLINK-22431&lt;/a&gt;] - AdaptiveScheduler does not log failure cause when recovering
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22434&quot;&gt;FLINK-22434&lt;/a&gt;] - Dispatcher does not store suspended jobs in execution graph store
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22438&quot;&gt;FLINK-22438&lt;/a&gt;] - add numRecordsOut metric for Async IO
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22442&quot;&gt;FLINK-22442&lt;/a&gt;] - Using scala api to change the TimeCharacteristic of the PatternStream is invalid
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22463&quot;&gt;FLINK-22463&lt;/a&gt;] - IllegalArgumentException is thrown in WindowAttachedWindowingStrategy when two phase is enabled for distinct agg
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22479&quot;&gt;FLINK-22479&lt;/a&gt;] - [Kinesis][Consumer] Potential lock-up under error condition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22489&quot;&gt;FLINK-22489&lt;/a&gt;] - subtask backpressure indicator shows value for entire job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22494&quot;&gt;FLINK-22494&lt;/a&gt;] - Avoid discarding checkpoints in case of failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22502&quot;&gt;FLINK-22502&lt;/a&gt;] - DefaultCompletedCheckpointStore drops unrecoverable checkpoints silently
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22511&quot;&gt;FLINK-22511&lt;/a&gt;] - Fix the bug of non-composite result type in Python TableAggregateFunction
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22512&quot;&gt;FLINK-22512&lt;/a&gt;] - Can&amp;#39;t call current_timestamp with hive dialect for hive-3.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22522&quot;&gt;FLINK-22522&lt;/a&gt;] - BytesHashMap has many verbose logs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22523&quot;&gt;FLINK-22523&lt;/a&gt;] - TUMBLE TVF should throw helpful exception when specifying second interval parameter
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22525&quot;&gt;FLINK-22525&lt;/a&gt;] - The zone id in exception message should be GMT+08:00 instead of GMT+8:00
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22535&quot;&gt;FLINK-22535&lt;/a&gt;] - Resource leak would happen if exception thrown during AbstractInvokable#restore of task life
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22555&quot;&gt;FLINK-22555&lt;/a&gt;] - LGPL-2.1 files in flink-python jars
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22573&quot;&gt;FLINK-22573&lt;/a&gt;] - AsyncIO can timeout elements after completion
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22574&quot;&gt;FLINK-22574&lt;/a&gt;] - Adaptive Scheduler: Can not cancel restarting job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22592&quot;&gt;FLINK-22592&lt;/a&gt;] - numBuffersInLocal is always zero when using unaligned checkpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22596&quot;&gt;FLINK-22596&lt;/a&gt;] - Active timeout is not triggered if there were no barriers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22618&quot;&gt;FLINK-22618&lt;/a&gt;] - Fix incorrect free resource metrics of task managers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22654&quot;&gt;FLINK-22654&lt;/a&gt;] - SqlCreateTable toString()/unparse() lose CONSTRAINTS and watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22661&quot;&gt;FLINK-22661&lt;/a&gt;] - HiveInputFormatPartitionReader can return invalid data
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22688&quot;&gt;FLINK-22688&lt;/a&gt;] - Root Exception can not be shown on Web UI in Flink 1.13.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22706&quot;&gt;FLINK-22706&lt;/a&gt;] - Source NOTICE outdated regarding docs/
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22721&quot;&gt;FLINK-22721&lt;/a&gt;] - Breaking HighAvailabilityServices interface by adding new method
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22733&quot;&gt;FLINK-22733&lt;/a&gt;] - Type mismatch thrown in DataStream.union if parameter is KeyedStream for Python DataStream API
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18952&quot;&gt;FLINK-18952&lt;/a&gt;] - Add 10 minutes to DataStream API documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20695&quot;&gt;FLINK-20695&lt;/a&gt;] - Zookeeper node under leader and leaderlatch is not deleted after job finished
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22250&quot;&gt;FLINK-22250&lt;/a&gt;] - flink-sql-parser model Class ParserResource lack ParserResource.properties
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22301&quot;&gt;FLINK-22301&lt;/a&gt;] - Statebackend and CheckpointStorage type is not shown in the Web UI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22304&quot;&gt;FLINK-22304&lt;/a&gt;] - Refactor some interfaces for TVF based window to improve the extendability
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22470&quot;&gt;FLINK-22470&lt;/a&gt;] - The root cause of the exception encountered during compiling the job was not exposed to users in certain cases
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22560&quot;&gt;FLINK-22560&lt;/a&gt;] - Filter maven metadata from all jars
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22699&quot;&gt;FLINK-22699&lt;/a&gt;] - Make ConstantArgumentCount public API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22708&quot;&gt;FLINK-22708&lt;/a&gt;] - Propagate savepoint settings from StreamExecutionEnvironment to StreamGraph
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22725&quot;&gt;FLINK-22725&lt;/a&gt;] - SlotManagers should unregister metrics at the start of suspend()
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 28 May 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/05/28/release-1.13.1.html</link>
<guid isPermaLink="true">/news/2021/05/28/release-1.13.1.html</guid>
</item>
<item>
<title>Apache Flink 1.12.4 Released</title>
<description>&lt;p&gt;The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.&lt;/p&gt;
&lt;p&gt;This release includes 21 fixes and minor improvements for Flink 1.12.3. The list below details all of them.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.12.4.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.4&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Release Notes - Flink - Version 1.12.4&lt;/p&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17170&quot;&gt;FLINK-17170&lt;/a&gt;] - Cannot stop streaming job with savepoint which uses kinesis consumer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20114&quot;&gt;FLINK-20114&lt;/a&gt;] - Fix a few KafkaSource-related bugs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21181&quot;&gt;FLINK-21181&lt;/a&gt;] - Buffer pool is destroyed error when outputting data over a timer after cancellation.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22109&quot;&gt;FLINK-22109&lt;/a&gt;] - Misleading exception message if the number of arguments of a nested function is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22368&quot;&gt;FLINK-22368&lt;/a&gt;] - UnalignedCheckpointITCase hangs on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22424&quot;&gt;FLINK-22424&lt;/a&gt;] - Writing to already released buffers potentially causing data corruption during job failover/cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22438&quot;&gt;FLINK-22438&lt;/a&gt;] - add numRecordsOut metric for Async IO
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22442&quot;&gt;FLINK-22442&lt;/a&gt;] - Using scala api to change the TimeCharacteristic of the PatternStream is invalid
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22479&quot;&gt;FLINK-22479&lt;/a&gt;] - [Kinesis][Consumer] Potential lock-up under error condition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22489&quot;&gt;FLINK-22489&lt;/a&gt;] - subtask backpressure indicator shows value for entire job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22555&quot;&gt;FLINK-22555&lt;/a&gt;] - LGPL-2.1 files in flink-python jars
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22557&quot;&gt;FLINK-22557&lt;/a&gt;] - Japicmp fails on 1.12 branch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22573&quot;&gt;FLINK-22573&lt;/a&gt;] - AsyncIO can timeout elements after completion
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22577&quot;&gt;FLINK-22577&lt;/a&gt;] - KubernetesLeaderElectionAndRetrievalITCase is failing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22597&quot;&gt;FLINK-22597&lt;/a&gt;] - JobMaster cannot be restarted
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18952&quot;&gt;FLINK-18952&lt;/a&gt;] - Add 10 minutes to DataStream API documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20553&quot;&gt;FLINK-20553&lt;/a&gt;] - Add end-to-end test case for new Kafka source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22470&quot;&gt;FLINK-22470&lt;/a&gt;] - The root cause of the exception encountered during compiling the job was not exposed to users in certain cases
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22539&quot;&gt;FLINK-22539&lt;/a&gt;] - Restructure the Python dependency management documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22544&quot;&gt;FLINK-22544&lt;/a&gt;] - Add the missing documentation about the command line options for PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22560&quot;&gt;FLINK-22560&lt;/a&gt;] - Filter maven metadata from all jars
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 21 May 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/05/21/release-1.12.4.html</link>
<guid isPermaLink="true">/news/2021/05/21/release-1.12.4.html</guid>
</item>
<item>
<title>Scaling Flink automatically with Reactive Mode</title>
<description>&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#getting-started&quot; id=&quot;markdown-toc-getting-started&quot;&gt;Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#demo-on-kubernetes&quot; id=&quot;markdown-toc-demo-on-kubernetes&quot;&gt;Demo on Kubernetes&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#the-setup&quot; id=&quot;markdown-toc-the-setup&quot;&gt;The Setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#results&quot; id=&quot;markdown-toc-results&quot;&gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#lessons-learned-configuring-a-low-heartbeat-timeout-for-a-smooth-scale-down&quot; id=&quot;markdown-toc-lessons-learned-configuring-a-low-heartbeat-timeout-for-a-smooth-scale-down&quot;&gt;Lessons Learned: Configuring a low heartbeat timeout for a smooth scale down&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Streaming jobs which run for several days or longer usually experience variations in workload during their lifetime. These variations can originate from seasonal spikes, such as day vs. night, weekdays vs. weekend or holidays vs. non-holidays, sudden events or simply the growing popularity of your product. Although some of these variations are more predictable than others, in all cases there is a change in job resource demand that needs to be addressed if you want to ensure the same quality of service for your customers.&lt;/p&gt;
&lt;p&gt;A simple way of quantifying the mismatch between the required resources and the available resources is to measure the space between the actual load and the number of available workers. As pictured below, in the case of static resource allocation, you can see that there’s a big gap between the actual load and the available workers — hence, we are wasting resources. For elastic resource allocation, the gap between the red and black line is consistently small.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-04-reactive-mode/intro.svg&quot; width=&quot;640px&quot; alt=&quot;Reactive Mode Intro&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Manually rescaling&lt;/strong&gt; a Flink job has been possible since Flink 1.2 introduced &lt;a href=&quot;https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html&quot;&gt;rescalable state&lt;/a&gt;, which allows you to stop-and-restore a job with a different parallelism. For example, if your job is running with a parallelism of p=100 and your load increases, you can restart it with p=200 to cope with the additional data.&lt;/p&gt;
&lt;p&gt;The problem with this approach is that you have to orchestrate a rescale operation with custom tools by yourself, including error handling and similar tasks.&lt;/p&gt;
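&lt;p&gt;As a rough sketch, such a manual rescale with the Flink CLI could look like the following (the job id, savepoint directory and jar path are placeholders):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Stop the job and take a savepoint (paths and ids below are placeholders)
./bin/flink stop --savepointPath /tmp/savepoints $JOB_ID
# Resubmit the same job with a higher parallelism, restoring from that savepoint
./bin/flink run -p 200 -s /tmp/savepoints/savepoint-xxxx ./my-job.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;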
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/&quot;&gt;Reactive Mode&lt;/a&gt;, a new option introduced in Flink 1.13, removes that burden: you monitor your Flink cluster and add or remove resources depending on some metrics, and Flink does the rest. In Reactive Mode, the JobManager always tries to use all TaskManager resources available in the cluster.&lt;/p&gt;
&lt;p&gt;The big benefit of Reactive Mode is that you don’t need any specific knowledge to scale Flink anymore. Flink basically behaves like a fleet of servers (e.g. webservers, caches, batch processing) that you can expand or shrink as you wish. Since this is such a common pattern, there is a lot of infrastructure available for handling such cases: all major cloud providers offer utilities to monitor specific metrics and automatically scale a set of machines accordingly. For example, this would be provided through &lt;a href=&quot;https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html&quot;&gt;Auto Scaling groups&lt;/a&gt; in AWS, and &lt;a href=&quot;https://cloud.google.com/compute/docs/instance-groups&quot;&gt;Managed Instance groups&lt;/a&gt; in Google Cloud.
Similarly, Kubernetes provides &lt;a href=&quot;https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/&quot;&gt;Horizontal Pod Autoscalers&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What is interesting, as a side note, is that unlike most auto-scalable “fleets of servers”, Flink is a stateful system, often processing valuable data that requires strong correctness guarantees (comparable to a database). But, unlike many traditional databases, Flink is resilient enough (through checkpointing and state backups) to adjust to changing workloads by just adding or removing resources, with very few requirements (i.e., a simple blob store for state backups).&lt;/p&gt;
&lt;h2 id=&quot;getting-started&quot;&gt;Getting Started&lt;/h2&gt;
&lt;p&gt;If you want to try out Reactive Mode yourself locally, follow these steps using a Flink 1.13.0 distribution:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# These instructions assume you are in the root directory of a Flink distribution.&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# Put Job into usrlib/ directory&lt;/span&gt;
mkdir usrlib
cp ./examples/streaming/TopSpeedWindowing.jar usrlib/
&lt;span class=&quot;c&quot;&gt;# Submit Job in Reactive Mode&lt;/span&gt;
./bin/standalone-job.sh start -Dscheduler-mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;reactive -Dexecution.checkpointing.interval&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;10s&amp;quot;&lt;/span&gt; -j org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
&lt;span class=&quot;c&quot;&gt;# Start first TaskManager&lt;/span&gt;
./bin/taskmanager.sh start&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You have now started a Flink job in Reactive Mode. The &lt;a href=&quot;http://localhost:8081&quot;&gt;web interface&lt;/a&gt; shows that the job is running on one TaskManager. If you want to scale up the job, simply add another TaskManager to the cluster:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# Start additional TaskManager&lt;/span&gt;
./bin/taskmanager.sh start&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To scale down, remove a TaskManager instance:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# Remove a TaskManager&lt;/span&gt;
./bin/taskmanager.sh stop&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Reactive Mode also works when deploying &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/&quot;&gt;Flink on Docker&lt;/a&gt; or using the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/&quot;&gt;standalone Kubernetes deployment&lt;/a&gt; (both only as application clusters).&lt;/p&gt;
&lt;h2 id=&quot;demo-on-kubernetes&quot;&gt;Demo on Kubernetes&lt;/h2&gt;
&lt;p&gt;In this section, we want to demonstrate the new Reactive Mode in a real-world scenario. You can use this demo as a starting point for your own scalable deployment of Flink on Kubernetes, or as a template for building your own deployment using a different setup.&lt;/p&gt;
&lt;h3 id=&quot;the-setup&quot;&gt;The Setup&lt;/h3&gt;
&lt;p&gt;The central idea of this demo is to use a Kubernetes &lt;a href=&quot;https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/&quot;&gt;Horizontal Pod Autoscaler&lt;/a&gt;, which monitors the CPU load of all TaskManager pods and adjusts their replication factor accordingly. On high CPU load, the autoscaler should add more TaskManagers, distributing the load across more machines. On low load, it should stop TaskManagers to save resources.&lt;/p&gt;
&lt;p&gt;The whole setup is presented here:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-04-reactive-mode/arch.png&quot; width=&quot;640px&quot; alt=&quot;Reactive Mode Demo Architecture&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Let’s discuss the components:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Flink&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;JobManager&lt;/strong&gt; is deployed as a &lt;a href=&quot;https://kubernetes.io/docs/concepts/workloads/controllers/job/&quot;&gt;Kubernetes job&lt;/a&gt;. We are submitting a container that is based on the official Flink Docker image, but has the jar file of our job added to it. The Flink job simply reads data from a Kafka topic and does some expensive math operations per event received. We use these math operations to generate high CPU loads, without requiring a large Kafka deployment.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;TaskManager(s)&lt;/strong&gt; are deployed as a Kubernetes deployment, which is scaled through a &lt;a href=&quot;https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/&quot;&gt;Horizontal Pod Autoscaler&lt;/a&gt;. In this experiment, the autoscaler is monitoring the CPU load of the pods in the deployment. The number of pods is adjusted between 1 and 15 pods by the autoscaler.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Additional Components&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We have a &lt;strong&gt;Zookeeper&lt;/strong&gt; and &lt;strong&gt;Kafka&lt;/strong&gt; deployment (each with one pod) to provide a Kafka topic that serves as the input for the Flink job.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Data Generator&lt;/strong&gt; pod produces simple string messages at an adjustable rate to the Kafka topic. In this experiment, the rate follows a sine wave.&lt;/li&gt;
&lt;li&gt;For monitoring, we are deploying &lt;strong&gt;Prometheus&lt;/strong&gt; and &lt;strong&gt;Grafana&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The entire setup is &lt;a href=&quot;https://github.com/rmetzger/flink-reactive-mode-k8s-demo&quot;&gt;available on GitHub&lt;/a&gt; if you want to try this out yourself.&lt;/p&gt;
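&lt;p&gt;As a rough sketch, the Horizontal Pod Autoscaler for such a setup can be created with a single kubectl command; the deployment name and CPU threshold below are assumptions, the exact values used are in the linked repository:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Scale the TaskManager deployment between 1 and 15 pods based on CPU utilization
# (deployment name and target CPU percentage are assumptions for illustration)
kubectl autoscale deployment flink-taskmanager --min=1 --max=15 --cpu-percent=35&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;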
&lt;h3 id=&quot;results&quot;&gt;Results&lt;/h3&gt;
&lt;p&gt;We’ve deployed all the above components on a hosted Kubernetes cluster, running it for several days. The results are best examined based on the following Grafana dashboard:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-04-reactive-mode/result.png&quot; alt=&quot;Reactive Mode Demo Result&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Reactive Mode Experiment Results&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Let’s take a closer look at the dashboard:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;On the top left, you can see the &lt;strong&gt;Kafka consumer lag&lt;/strong&gt;, reported by Flink’s Kafka consumer (source), which reports the queue size of unprocessed messages. A high lag means that Flink is not processing messages as fast as they are produced: we need to scale up.&lt;/p&gt;
&lt;p&gt;The lag is usually following the throughput of data coming from Kafka. When the throughput is the highest, the reported lag is at ~75k messages. In low throughput times, it is basically at zero.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;On the top right, you’ll see the &lt;strong&gt;throughput&lt;/strong&gt;, measured in records per second, as reported by Flink. The throughput is roughly following a sine wave, peaking at 6k messages per second, and going down to almost zero.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The bottom left chart shows the &lt;strong&gt;CPU load&lt;/strong&gt; per TaskManager. We’ve added this metric to the dashboard because this is what the pod autoscaler in Kubernetes will use to decide on the replica count of the TaskManager deployment. You can see that, as soon as a certain CPU load is reached, additional TaskManagers are started.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the bottom right chart, you can see the &lt;strong&gt;TaskManager count&lt;/strong&gt; over time. When the throughput (and CPU load) is peaking, we’re running on 5 TaskManagers (with some peaks up to even 8). On low throughput, we’re running the minimal number of just one TaskManager. This chart showcases nicely that Reactive Mode is working as expected in this experiment: the number of TaskManagers is adjusting to the load on the system.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;lessons-learned-configuring-a-low-heartbeat-timeout-for-a-smooth-scale-down&quot;&gt;Lessons Learned: Configuring a low heartbeat timeout for a smooth scale down&lt;/h3&gt;
&lt;p&gt;When we initially started with the experiment, we noticed some anomalies in the behavior of Flink, depicted in this chart:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-04-reactive-mode/high-timeout.png&quot; alt=&quot;Reactive Mode Demo Lessons Learned&quot; /&gt;
&lt;p class=&quot;align-center&quot;&gt;Reactive Mode not scaling down properly&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In all the charts, we see sudden spikes or drops: the consumer lag goes up to 600k messages (that’s 8 times more than the usual 75k lag we observe at peak), and the throughput spikes and drops abruptly. On the “Number of TaskManagers” chart, we see that the number of TaskManagers is not following the throughput line very closely. We are wasting resources by allocating too many TaskManagers for the given rate.&lt;/p&gt;
&lt;p&gt;We see that these issues are only occurring when the load is decreasing, and Reactive Mode is supposed to scale down. So what is happening here?&lt;/p&gt;
&lt;p&gt;The Flink JobManager sends periodic heartbeats to the TaskManagers to check if they are still alive. These heartbeats have a default timeout of 50 seconds. This value might seem high, but in high-load scenarios, there might be network congestion, garbage collection pauses or other disruptions that cause slow heartbeats. We don’t want to consider a TaskManager dead just because of a temporary disruption.&lt;/p&gt;
&lt;p&gt;However, this default value is causing problems in this experiment: When the Kubernetes autoscaler notices that the CPU load has gone down, it will reduce the replica count of the TaskManager deployment, stopping at least one TaskManager instance. Flink will almost immediately stop processing messages, because of the connection loss in the data transport layer of Flink. However, the JobManager will wait for 50 seconds (the default heartbeat timeout) before the TaskManager is considered dead.&lt;/p&gt;
&lt;p&gt;During this waiting period, the throughput is at zero and messages are queuing in Kafka (causing spikes in the consumer lag). Once Flink is running again, Flink will try to catch up on the queued messages, causing a spike in CPU load. The autoscaler notices this load spike and allocates more TaskManagers.&lt;/p&gt;
&lt;p&gt;We only see this effect on scale down, because a scale down is much more disruptive than scaling up. Scaling up, which means adding TaskManagers, disrupts the processing only for the duration of a job restart (which is fast since our application state is just a few bytes for the Kafka offsets), while scaling down disrupts the processing for roughly 50 seconds.&lt;/p&gt;
&lt;p&gt;To mitigate this issue, we have reduced the &lt;code&gt;heartbeat.timeout&lt;/code&gt; in our experiment to 8 seconds. Additionally, we are looking into improving the behavior of the JobManager to detect TaskManager losses better and faster.&lt;/p&gt;
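&lt;p&gt;As a minimal sketch, the shorter timeout can be passed as a dynamic property when starting the application cluster (&lt;code&gt;heartbeat.timeout&lt;/code&gt; is configured in milliseconds; the job class below is the example used earlier in this post):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Start the standalone application cluster with a reduced heartbeat timeout (8 seconds)
./bin/standalone-job.sh start \
  -Dscheduler-mode=reactive \
  -Dheartbeat.timeout=8000 \
  -j org.apache.flink.streaming.examples.windowing.TopSpeedWindowing&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;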
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this blog post, we’ve introduced Reactive Mode, a big step forward in Flink’s ability to dynamically adjust to changing workloads, reducing resource utilization and overall costs. The blog post demonstrated Reactive Mode on Kubernetes, including some lessons learned.&lt;/p&gt;
&lt;p&gt;Reactive Mode is a new feature in Flink 1.13 and is currently in the &lt;a href=&quot;https://flink.apache.org/roadmap.html#feature-stages&quot;&gt;MVP (Minimum Viable Product) phase&lt;/a&gt; of product development. Before experimenting with it, or using it in production, please check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling&quot;&gt;documentation&lt;/a&gt;, in particular the current &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#limitations&quot;&gt;limitations&lt;/a&gt; section. In this phase, the biggest limitation is that only standalone application mode deployments are supported (i.e. no active resource managers or session clusters).&lt;/p&gt;
&lt;p&gt;The community is actively looking for feedback on this feature, to continue improving Flink’s resource elasticity. If you have any feedback, please reach out to the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;dev@ mailing list&lt;/a&gt; or to me personally on &lt;a href=&quot;https://twitter.com/rmetzger_&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Thu, 06 May 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2021/05/06/reactive-mode.html</link>
<guid isPermaLink="true">/2021/05/06/reactive-mode.html</guid>
</item>
<item>
<title>Apache Flink 1.13.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is excited to announce the release of Flink 1.13.0! More than 200
contributors worked on over 1,000 issues for this new version.&lt;/p&gt;
&lt;p&gt;The release brings us a big step forward in one of our major efforts: &lt;strong&gt;Making Stream Processing
Applications as natural and as simple to manage as any other application.&lt;/strong&gt; The new &lt;em&gt;reactive scaling&lt;/em&gt;
mode means that scaling streaming applications in and out now works like in any other application
by just changing the number of parallel processes.&lt;/p&gt;
&lt;p&gt;The release also prominently features a &lt;strong&gt;series of improvements that help users better understand the performance of
applications.&lt;/strong&gt; When the streams don’t flow as fast as you’d hope, these can help you to understand
why: Load and &lt;em&gt;backpressure visualization&lt;/em&gt; to identify bottlenecks, &lt;em&gt;CPU flame graphs&lt;/em&gt; to identify hot
code paths in your application, and &lt;em&gt;State Access Latencies&lt;/em&gt; to see how the State Backends are keeping
up.&lt;/p&gt;
&lt;p&gt;Beyond those features, the Flink community has added a ton of improvements all over the system,
some of which we discuss in this article. We hope you enjoy the new release and features.
Towards the end of the article, we describe changes to be aware of when upgrading
from earlier versions of Apache Flink.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#notable-features&quot; id=&quot;markdown-toc-notable-features&quot;&gt;Notable features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#reactive-scaling&quot; id=&quot;markdown-toc-reactive-scaling&quot;&gt;Reactive scaling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#analyzing-application-performance&quot; id=&quot;markdown-toc-analyzing-application-performance&quot;&gt;Analyzing application performance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#switching-state-backend-with-savepoints&quot; id=&quot;markdown-toc-switching-state-backend-with-savepoints&quot;&gt;Switching State Backend with savepoints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#user-specified-pod-templates-for-kubernetes-deployments&quot; id=&quot;markdown-toc-user-specified-pod-templates-for-kubernetes-deployments&quot;&gt;User-specified pod templates for Kubernetes deployments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#unaligned-checkpoints---production-ready&quot; id=&quot;markdown-toc-unaligned-checkpoints---production-ready&quot;&gt;Unaligned Checkpoints - production-ready&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#machine-learning-library-moving-to-a-separate-repository&quot; id=&quot;markdown-toc-machine-learning-library-moving-to-a-separate-repository&quot;&gt;Machine Learning Library moving to a separate repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#notable-sql--table-api-improvements&quot; id=&quot;markdown-toc-notable-sql--table-api-improvements&quot;&gt;Notable SQL &amp;amp; Table API improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#windows-via-table-valued-functions&quot; id=&quot;markdown-toc-windows-via-table-valued-functions&quot;&gt;Windows via Table-valued functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improved-interoperability-between-datastream-api-and-table-apisql&quot; id=&quot;markdown-toc-improved-interoperability-between-datastream-api-and-table-apisql&quot;&gt;Improved interoperability between DataStream API and Table API/SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#sql-client-init-scripts-and-statement-sets&quot; id=&quot;markdown-toc-sql-client-init-scripts-and-statement-sets&quot;&gt;SQL Client: Init scripts and Statement Sets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#hive-query-syntax-compatibility&quot; id=&quot;markdown-toc-hive-query-syntax-compatibility&quot;&gt;Hive query syntax compatibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improved-behavior-of-sql-time-functions&quot; id=&quot;markdown-toc-improved-behavior-of-sql-time-functions&quot;&gt;Improved behavior of SQL time functions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#notable-pyflink-improvements&quot; id=&quot;markdown-toc-notable-pyflink-improvements&quot;&gt;Notable PyFlink improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#stateful-operations-in-the-python-datastream-api&quot; id=&quot;markdown-toc-stateful-operations-in-the-python-datastream-api&quot;&gt;Stateful operations in the Python DataStream API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#user-defined-windows-in-the-pyflink-datastream-api&quot; id=&quot;markdown-toc-user-defined-windows-in-the-pyflink-datastream-api&quot;&gt;User-defined Windows in the PyFlink DataStream API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#row-based-operation-in-the-pyflink-table-api&quot; id=&quot;markdown-toc-row-based-operation-in-the-pyflink-table-api&quot;&gt;Row-based operation in the PyFlink Table API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#batch-execution-mode-for-pyflink-datastream-programs&quot; id=&quot;markdown-toc-batch-execution-mode-for-pyflink-datastream-programs&quot;&gt;Batch execution mode for PyFlink DataStream programs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements&quot; id=&quot;markdown-toc-other-improvements&quot;&gt;Other improvements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#changes-to-consider-when-upgrading-to-flink-113&quot; id=&quot;markdown-toc-changes-to-consider-when-upgrading-to-flink-113&quot;&gt;Changes to consider when upgrading to Flink 1.13&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#resources&quot; id=&quot;markdown-toc-resources&quot;&gt;Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;We encourage you to &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;download the release&lt;/a&gt; and share your
feedback with the community through
the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/projects/FLINK/summary&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;notable-features&quot;&gt;Notable features&lt;/h1&gt;
&lt;h2 id=&quot;reactive-scaling&quot;&gt;Reactive scaling&lt;/h2&gt;
&lt;p&gt;Reactive scaling is the latest piece in Flink’s initiative to make Stream Processing
Applications as natural and as simple to manage as any other application.&lt;/p&gt;
&lt;p&gt;Flink has a dual nature when it comes to resource management and deployments: You can deploy
Flink applications onto resource orchestrators like Kubernetes or Yarn in such a way that Flink actively manages
the resources and allocates and releases workers as needed. That is especially useful for jobs and
applications that rapidly change their required resources, like batch applications and ad-hoc SQL
queries. The application parallelism rules, the number of workers follows. In the context of Flink
applications, we call this &lt;em&gt;active scaling&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;For long-running streaming applications, it is often a nicer model to just deploy them like any
other long-running application: The application doesn’t really need to know that it runs on K8s,
EKS, Yarn, etc. and doesn’t try to acquire a specific amount of workers; instead, it just uses the
number of workers that are given to it. The number of workers rules, the application parallelism
adjusts to that. In the context of Flink, we call that &lt;em&gt;reactive scaling&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/concepts/flink-architecture/#flink-application-execution&quot;&gt;Application Deployment Mode&lt;/a&gt;
started this effort, making deployments more application-like (by avoiding two separate deployment
steps to (1) start a cluster and (2) submit an application). The reactive scaling mode completes this,
and you now don’t have to use extra tools (scripts, or a K8s operator) anymore to keep the number
of workers and the application parallelism settings in sync.&lt;/p&gt;
&lt;p&gt;You can now put an auto-scaler around Flink applications like around other typical applications — as
long as you are mindful about the cost of rescaling when configuring the autoscaler: Stateful
streaming applications must move state around when scaling.&lt;/p&gt;
&lt;p&gt;To try the reactive-scaling mode, add the &lt;code&gt;scheduler-mode: reactive&lt;/code&gt; config entry and deploy
an application cluster (&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/overview/#application-mode&quot;&gt;standalone&lt;/a&gt; or &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster&quot;&gt;Kubernetes&lt;/a&gt;). Check out &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/elastic_scaling/#reactive-mode&quot;&gt;the reactive scaling docs&lt;/a&gt; for more details.&lt;/p&gt;
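&lt;p&gt;A minimal local sketch, assuming a standalone application cluster and one of the bundled example jobs placed in &lt;code&gt;usrlib/&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Start an application cluster with the reactive scheduler mode enabled
./bin/standalone-job.sh start -Dscheduler-mode=reactive -j org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
# Rescale the job by simply adding (or stopping) TaskManagers
./bin/taskmanager.sh start&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;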
&lt;h2 id=&quot;analyzing-application-performance&quot;&gt;Analyzing application performance&lt;/h2&gt;
&lt;p&gt;Like for any application, analyzing and understanding the performance of a Flink application
is critical. Often even more critical, because Flink applications are typically data-intensive
(processing high volumes of data) and are at the same time expected to provide results within
(near-) real-time latencies.&lt;/p&gt;
&lt;p&gt;When an application doesn’t keep up with the data rate anymore, or an application takes more
resources than you’d expect it would, these new tools can help you track down the causes:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bottleneck detection, Back Pressure monitoring&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The first question during performance analysis is often: Which operation is the bottleneck?&lt;/p&gt;
&lt;p&gt;To help answer that, Flink exposes metrics about the degree to which tasks are &lt;em&gt;busy&lt;/em&gt; (doing work)
and &lt;em&gt;back-pressured&lt;/em&gt; (have the capacity to do work but cannot because their successor operators
cannot accept more results). Candidates for bottlenecks are the busy operators whose predecessors
are back-pressured.&lt;/p&gt;
&lt;p&gt;Flink 1.13 brings an improved back pressure metric system (using task mailbox timings rather than
thread stack sampling), and a reworked graphical representation of the job’s dataflow with color-coding
and ratios for busyness and backpressure.&lt;/p&gt;
&lt;figure style=&quot;align-content: center&quot;&gt;
&lt;img src=&quot;/img/blog/2021-05-03-release-1.13.0/bottleneck.png&quot; style=&quot;width: 900px&quot; /&gt;
&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;CPU flame graphs in Web UI&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The next question during performance analysis is typically: What part of work in the bottlenecked
operator is expensive?&lt;/p&gt;
&lt;p&gt;One visually effective means to investigate that is &lt;em&gt;Flame Graphs&lt;/em&gt;. They help answer questions like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Which methods are currently consuming CPU resources?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;How does one method’s CPU consumption compare to other methods?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Which series of calls on the stack led to executing a particular method?&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Flame Graphs are constructed by repeatedly sampling the thread stack traces. Every method call is
represented by a bar, where the length of the bar is proportional to the number of times it is present
in the samples. When enabled, the graphs are shown in a new UI component for the selected operator.&lt;/p&gt;
&lt;figure style=&quot;align-content: center&quot;&gt;
&lt;img src=&quot;/img/blog/2021-05-03-release-1.13.0/7.png&quot; style=&quot;display: block; margin-left: auto; margin-right: auto; width: 600px&quot; /&gt;
&lt;/figure&gt;
&lt;p&gt;The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs&quot;&gt;Flame Graphs documentation&lt;/a&gt;
contains more details and instructions for enabling this feature.&lt;/p&gt;
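&lt;p&gt;A minimal sketch of enabling it cluster-wide, assuming the &lt;code&gt;rest.flamegraph.enabled&lt;/code&gt; option introduced in Flink 1.13:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Enable the (experimental) flame graph feature for the whole cluster
echo &amp;quot;rest.flamegraph.enabled: true&amp;quot; &amp;gt;&amp;gt; conf/flink-conf.yaml&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;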
&lt;p&gt;&lt;strong&gt;Access Latency Metrics for State&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Another possible performance bottleneck can be the state backend, especially when your state is larger
than the main memory available to Flink and you are using the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend&quot;&gt;RocksDB state backend&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That’s not saying RocksDB is slow (we love RocksDB!), but it has some requirements to achieve
good performance. For example, it is easy to accidentally &lt;a href=&quot;https://www.ververica.com/blog/the-impact-of-disks-on-rocksdb-state-backend-in-flink-a-case-study&quot;&gt;starve RocksDB’s demand for IOPs on cloud setups with
the wrong type of disk resources&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;On top of the CPU flame graphs, the new &lt;em&gt;state backend latency metrics&lt;/em&gt; can help you understand whether
your state backend is responsive. For example, if you see that RocksDB state accesses start to take
milliseconds, you probably need to look into your memory and I/O configuration.
These metrics can be activated by setting the &lt;code&gt;state.backend.rocksdb.latency-track-enabled&lt;/code&gt; option.
The metrics are sampled, and their collection should have a marginal impact on the RocksDB state
backend performance.&lt;/p&gt;
&lt;h2 id=&quot;switching-state-backend-with-savepoints&quot;&gt;Switching State Backend with savepoints&lt;/h2&gt;
&lt;p&gt;You can now change the state backend of a Flink application when resuming from a savepoint.
That means the application’s state is no longer locked into the state backend that was used when
the application was initially started.&lt;/p&gt;
&lt;p&gt;This makes it possible, for example, to initially start with the HashMap State Backend (pure
in-memory in JVM Heap) and later switch to the RocksDB State Backend, once the state grows
too large.&lt;/p&gt;
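&lt;p&gt;A minimal sketch of such a switch (the savepoint path and job jar are placeholders; the new backend could equally be set in &lt;code&gt;flink-conf.yaml&lt;/code&gt; or in the application code):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Take a savepoint while the job is still running on the HashMap state backend
./bin/flink stop --savepointPath /tmp/savepoints $JOB_ID
# Resume from that savepoint, this time with the RocksDB state backend
./bin/flink run -s /tmp/savepoints/savepoint-xxxx -Dstate.backend=rocksdb ./my-job.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;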
&lt;p&gt;Under the hood, Flink now has a canonical savepoint format, which all state backends use when
creating a data snapshot for a savepoint.&lt;/p&gt;
&lt;h2 id=&quot;user-specified-pod-templates-for-kubernetes-deployments&quot;&gt;User-specified pod templates for Kubernetes deployments&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/&quot;&gt;native Kubernetes deployment&lt;/a&gt;
(where Flink actively talks to K8s to start and stop pods) now supports &lt;em&gt;custom pod templates&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;With those templates, users can set up and configure the JobManager and TaskManager pods in a
Kubernetes-y way, with flexibility beyond the configuration options that are directly built into
Flink’s Kubernetes integration.&lt;/p&gt;
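&lt;p&gt;A minimal sketch of pointing a native Kubernetes application deployment at such a template (the cluster id, template path and jar are placeholders; per-role variants of the option also exist):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Deploy an application cluster on native Kubernetes using a custom pod template
./bin/flink run-application \
  --target kubernetes-application \
  -Dkubernetes.cluster-id=my-flink-app \
  -Dkubernetes.pod-template-file=/path/to/pod-template.yaml \
  local:///opt/flink/usrlib/my-job.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;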
&lt;h2 id=&quot;unaligned-checkpoints---production-ready&quot;&gt;Unaligned Checkpoints - production-ready&lt;/h2&gt;
&lt;p&gt;Unaligned Checkpoints have matured to the point where we encourage all users to try them out,
if they see issues with their application under backpressure.&lt;/p&gt;
&lt;p&gt;In particular, these changes make Unaligned Checkpoints easier to use:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;You can now rescale applications from unaligned checkpoints. This comes in handy if your
application needs to be scaled from a retained checkpoint because you cannot (afford to) create
a savepoint.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Enabling unaligned checkpoints is cheaper for applications that are not back-pressured.
Unaligned checkpoints can now trigger adaptively with a timeout, meaning a checkpoint starts
as an aligned checkpoint (not storing any in-flight events) and falls back to an unaligned
checkpoint (storing some in-flight events), if the alignment phase takes longer than a certain
time.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Find out more about how to enable unaligned checkpoints in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints&quot;&gt;Checkpointing Documentation&lt;/a&gt;.&lt;/p&gt;
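&lt;p&gt;A minimal sketch of enabling this cluster-wide, assuming the &lt;code&gt;execution.checkpointing.unaligned&lt;/code&gt; and &lt;code&gt;execution.checkpointing.alignment-timeout&lt;/code&gt; options of Flink 1.13:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Enable unaligned checkpoints; with the timeout set, a checkpoint starts aligned
# and switches to unaligned only if the alignment phase takes longer than 30 seconds
echo &amp;quot;execution.checkpointing.unaligned: true&amp;quot; &amp;gt;&amp;gt; conf/flink-conf.yaml
echo &amp;quot;execution.checkpointing.alignment-timeout: 30 s&amp;quot; &amp;gt;&amp;gt; conf/flink-conf.yaml&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;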
&lt;h2 id=&quot;machine-learning-library-moving-to-a-separate-repository&quot;&gt;Machine Learning Library moving to a separate repository&lt;/h2&gt;
&lt;p&gt;To accelerate the development of Flink’s Machine Learning efforts (streaming, batch, and
unified machine learning), the effort has moved to the new repository &lt;a href=&quot;https://github.com/apache/flink-ml&quot;&gt;flink-ml&lt;/a&gt;
under the Flink project. We here follow a similar approach like the &lt;em&gt;Stateful Functions&lt;/em&gt; effort,
where a separate repository has helped to speed up the development by allowing for more light-weight
contribution workflows and separate release cycles.&lt;/p&gt;
&lt;p&gt;Stay tuned for more updates in the Machine Learning efforts, like the interplay with
&lt;a href=&quot;https://github.com/alibaba/Alink&quot;&gt;ALink&lt;/a&gt; (suite of many common Machine Learning Algorithms on Flink)
or the &lt;a href=&quot;https://github.com/alibaba/flink-ai-extended&quot;&gt;Flink &amp;amp; TensorFlow integration&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;notable-sql--table-api-improvements&quot;&gt;Notable SQL &amp;amp; Table API improvements&lt;/h1&gt;
&lt;p&gt;Like in previous releases, SQL and the Table API remain an area of big developments.&lt;/p&gt;
&lt;h2 id=&quot;windows-via-table-valued-functions&quot;&gt;Windows via Table-valued functions&lt;/h2&gt;
&lt;p&gt;Defining time windows is one of the most frequent operations in streaming SQL queries.
Flink 1.13 introduces a new way to define windows: via &lt;em&gt;Table-valued Functions&lt;/em&gt;.
This approach is both more expressive (lets you define new types of windows) and fully
in line with the SQL standard.&lt;/p&gt;
&lt;p&gt;Flink 1.13 supports &lt;em&gt;TUMBLE&lt;/em&gt; and &lt;em&gt;HOP&lt;/em&gt; windows in the new syntax, &lt;em&gt;SESSION&lt;/em&gt; windows will
follow in a subsequent release. To demonstrate the increased expressiveness, consider the two examples
below.&lt;/p&gt;
&lt;p&gt;A new &lt;em&gt;CUMULATE&lt;/em&gt; window function that assigns windows with an expanding step size until the maximum
window size is reached:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SUM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;price&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_price&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CUMULATE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Bid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bidtime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;2&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MINUTES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;10&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MINUTES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can reference the window start and window end time of the table-valued window functions,
making new types of constructs possible. Beyond regular windowed aggregations and windowed joins,
you can, for example, now express windowed Top-K aggregations:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ROW_NUMBER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OVER&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PARTITION&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_end&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_price&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rank&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rank&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;improved-interoperability-between-datastream-api-and-table-apisql&quot;&gt;Improved interoperability between DataStream API and Table API/SQL&lt;/h2&gt;
&lt;p&gt;This release radically simplifies mixing DataStream API and Table API programs.&lt;/p&gt;
&lt;p&gt;The Table API is a great way to develop applications, with its declarative nature and its
many built-in functions. But sometimes, you need to &lt;em&gt;escape&lt;/em&gt; to the DataStream API for its
expressiveness, flexibility, and explicit control over the state.&lt;/p&gt;
&lt;p&gt;The new methods &lt;code&gt;StreamTableEnvironment.toDataStream()/.fromDataStream()&lt;/code&gt; can model
a &lt;code&gt;DataStream&lt;/code&gt; from the DataStream API as a table source or sink.
Notable improvements include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Automatic type conversion between the DataStream and Table API type systems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Seamless integration of event time configurations; watermarks flow through boundaries for high
consistency.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &lt;code&gt;Row&lt;/code&gt; class (representing row events from the Table API) has received a major
overhaul (improving the behavior of &lt;code&gt;toString()&lt;/code&gt;/&lt;code&gt;hashCode()&lt;/code&gt;/&lt;code&gt;equals()&lt;/code&gt; methods) and now supports
accessing fields by name, with support for sparse representations.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tableEnv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromDataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;columnByMetadata&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;rowtime&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;TIMESTAMP(3)&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;watermark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;rowtime&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;SOURCE_WATERMARK()&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tableEnv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toDataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keyBy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getField&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(...);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;sql-client-init-scripts-and-statement-sets&quot;&gt;SQL Client: Init scripts and Statement Sets&lt;/h2&gt;
&lt;p&gt;The SQL Client is a convenient way to run and deploy SQL streaming and batch jobs directly from
the command line, or as part of a CI/CD workflow, without writing any code.&lt;/p&gt;
&lt;p&gt;This release vastly improves the functionality of the SQL Client. Almost all operations that
are available to Java applications (when programmatically launching queries from the
&lt;code&gt;TableEnvironment&lt;/code&gt;) are now supported in the SQL Client and as SQL scripts.
That means SQL users need much less glue code for their SQL deployments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Easier Configuration and Code Sharing&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The support of YAML files to configure the SQL Client will be discontinued. Instead, the client
accepts one or more &lt;em&gt;initialization scripts&lt;/em&gt; to configure a session before the main SQL script
gets executed.&lt;/p&gt;
&lt;p&gt;These init scripts would typically be shared across teams/deployments and could be used for
loading common catalogs, applying common configuration settings, or defining standard views.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;./sql-client.sh -i init1.sql init2.sql -f sqljob.sql
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;More config options&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A greater set of recognized config options and improved &lt;code&gt;SET&lt;/code&gt;/&lt;code&gt;RESET&lt;/code&gt; commands make it easier to
define and control the execution from within the SQL client and SQL scripts.&lt;/p&gt;
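&lt;p&gt;As a minimal sketch (the option values below are placeholders rather than recommendations), a session
can be adjusted and inspected directly from the client or from a script:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- override options for the current session
SET pipeline.name = ad_hoc_exploration;
SET parallelism.default = 4;
-- list all options that are currently set
SET;
-- revert the session configuration back to the defaults
RESET;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;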
&lt;p&gt;&lt;strong&gt;Multi-query Support with Statement Sets&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Multi-query execution lets you execute multiple SQL queries (or statements) as a single Flink job.
This is particularly useful for streaming SQL queries that run indefinitely.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Statement Sets&lt;/em&gt; are the mechanism to group the queries together that should be executed together.&lt;/p&gt;
&lt;p&gt;The following is an example of a SQL script that can be run via the SQL Client. It sets up and
configures the environment and executes multiple queries. The script captures the queries and all
environment setup and configuration work end-to-end, making it a self-contained deployment
artifact.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;c1&quot;&gt;-- set up a catalog&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive_catalog&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;hive&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive_catalog&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- or use temporary objects&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TEMPORARY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clicks&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;page_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;viewtime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;topic&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;clicks&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;properties.bootstrap.servers&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;...&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;avro&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- set the execution mode for jobs&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;execution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;streaming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- set the sync/async mode for INSERT INTOs&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dml&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- set the job&amp;#39;s parallelism&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parallelism&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- set the job name&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pipeline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;my_flink_job&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- restore state from the specific savepoint path&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;execution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savepoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savepoints&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savepoint&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb0dab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;BEGIN&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;STATEMENT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageview_pv_sink&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clicks&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageview_uv_sink&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;distinct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clicks&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;END&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;hive-query-syntax-compatibility&quot;&gt;Hive query syntax compatibility&lt;/h2&gt;
&lt;p&gt;You can now write SQL queries against Flink using the Hive SQL syntax.
In addition to Hive’s DDL dialect, Flink now also accepts the commonly-used Hive DML and DQL
dialects.&lt;/p&gt;
&lt;p&gt;To use the Hive SQL dialect, set &lt;code&gt;table.sql-dialect&lt;/code&gt; to &lt;code&gt;hive&lt;/code&gt; and load the &lt;code&gt;HiveModule&lt;/code&gt;.
The latter is important because Hive’s built-in functions are required for proper syntax and
semantics compatibility. The following example illustrates that:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;myhive&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;hive&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- setup HiveCatalog&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;myhive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;LOAD&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MODULE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- setup HiveModule&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MODULES&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;core&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dialect&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- enable Hive dialect&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CLUSTER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- run some Hive queries&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Please note that the Hive dialect no longer supports Flink’s SQL syntax for DML and DQL statements.
Switch back to the &lt;code&gt;default&lt;/code&gt; dialect for Flink’s syntax.&lt;/p&gt;
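&lt;p&gt;For example, to switch back to the default Flink dialect within the same session (mirroring the
&lt;code&gt;SET&lt;/code&gt; statement above):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SET table.sql-dialect = default; -- back to Flink&amp;#39;s default SQL syntax&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;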
&lt;h2 id=&quot;improved-behavior-of-sql-time-functions&quot;&gt;Improved behavior of SQL time functions&lt;/h2&gt;
&lt;p&gt;Working with time is a crucial element of any data processing. At the same time, handling different
time zones, dates, and times is an &lt;a href=&quot;https://xkcd.com/1883/&quot;&gt;incredibly delicate task&lt;/a&gt; when working with data.&lt;/p&gt;
&lt;p&gt;In Flink 1.13, we put much effort into simplifying the usage of time-related functions. We adjusted (made
more specific) the return types of functions such as &lt;code&gt;PROCTIME()&lt;/code&gt;, &lt;code&gt;CURRENT_TIMESTAMP&lt;/code&gt;, and &lt;code&gt;NOW()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Moreover, you can now also define an event time attribute on a &lt;em&gt;TIMESTAMP_LTZ&lt;/em&gt; column to gracefully
do window processing with the support of Daylight Saving Time.&lt;/p&gt;
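&lt;p&gt;As a short sketch (the table, its fields, and the connector are hypothetical and not part of this
announcement), such an event time attribute can be declared by converting an epoch-millisecond column
to &lt;em&gt;TIMESTAMP_LTZ&lt;/em&gt; and defining the watermark on it:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE TABLE user_actions (
  user_name STRING,
  action    STRING,
  ts_millis BIGINT,
  -- interpret the epoch milliseconds as a local-time-zone timestamp
  ts_ltz AS TO_TIMESTAMP_LTZ(ts_millis, 3),
  -- use the TIMESTAMP_LTZ column as the event time attribute
  WATERMARK FOR ts_ltz AS ts_ltz - INTERVAL &amp;#39;5&amp;#39; SECOND
) WITH (
  &amp;#39;connector&amp;#39; = &amp;#39;...&amp;#39;
);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;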
&lt;p&gt;Please see the release notes for a complete list of changes.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;notable-pyflink-improvements&quot;&gt;Notable PyFlink improvements&lt;/h1&gt;
&lt;p&gt;The general theme of this release in PyFlink is to bring the Python DataStream API and Table API
closer to feature parity with the Java/Scala APIs.&lt;/p&gt;
&lt;h3 id=&quot;stateful-operations-in-the-python-datastream-api&quot;&gt;Stateful operations in the Python DataStream API&lt;/h3&gt;
&lt;p&gt;With Flink 1.13, Python programmers now also get to enjoy the full potential of Apache Flink’s
stateful stream processing APIs. The rearchitected Python DataStream API, introduced in Flink 1.12,
now has full stateful capabilities, allowing users to remember information from events in the state
and act on it later.&lt;/p&gt;
&lt;p&gt;That stateful processing capability is the basis of many of the more sophisticated processing
operations, which need to remember information across individual events (for example, Windowing
Operations).&lt;/p&gt;
&lt;p&gt;This example shows a custom counting window implementation, using state:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CountWindowAverage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatMapFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;window_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window_size&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtime_context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RuntimeContext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;descriptor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueStateDescriptor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;average&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TUPLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LONG&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LONG&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()]))&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtime_context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;descriptor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;flat_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# update the count&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# if the count reaches window_size, emit the average and clear the state&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;window_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;//&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# type: DataStream&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flat_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CountWindowAverage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;user-defined-windows-in-the-pyflink-datastream-api&quot;&gt;User-defined Windows in the PyFlink DataStream API&lt;/h3&gt;
&lt;p&gt;Flink 1.13 adds support for user-defined windows to the PyFlink DataStream API. Programs can now use
windows beyond the standard window definitions.&lt;/p&gt;
&lt;p&gt;Because windows are at the heart of all programs that process unbounded streams (by splitting the
stream into “buckets” of bounded size), this greatly increases the expressiveness of the API.&lt;/p&gt;
&lt;h3 id=&quot;row-based-operation-in-the-pyflink-table-api&quot;&gt;Row-based operation in the PyFlink Table API&lt;/h3&gt;
&lt;p&gt;The Python Table API now supports row-based operations, i.e., custom transformation functions on rows.
These functions are an easy way to apply data transformations on tables beyond the built-in functions.&lt;/p&gt;
&lt;p&gt;This is an example of using a &lt;code&gt;map()&lt;/code&gt; operation in Python Table API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ROW&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIELD&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;c1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIELD&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;c2&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())]))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;increment_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# type: Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mapped_result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;increment_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In addition to &lt;code&gt;map()&lt;/code&gt;, the API also supports &lt;code&gt;flat_map()&lt;/code&gt;, &lt;code&gt;aggregate()&lt;/code&gt;, &lt;code&gt;flat_aggregate()&lt;/code&gt;,
and other row-based operations. This brings the Python Table API a big step closer to feature
parity with the Java Table API.&lt;/p&gt;
&lt;h3 id=&quot;batch-execution-mode-for-pyflink-datastream-programs&quot;&gt;Batch execution mode for PyFlink DataStream programs&lt;/h3&gt;
&lt;p&gt;The PyFlink DataStream API now also supports the batch execution mode for bounded streams,
which was introduced for the Java DataStream API in Flink 1.12.&lt;/p&gt;
&lt;p&gt;The batch execution mode simplifies operations and improves the performance of programs on bounded streams,
by exploiting the bounded stream nature to bypass state backends and checkpoints.&lt;/p&gt;
&lt;h1 id=&quot;other-improvements&quot;&gt;Other improvements&lt;/h1&gt;
&lt;p&gt;&lt;strong&gt;Flink Documentation via Hugo&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Flink Documentation has been migrated from Jekyll to Hugo. If you find something missing, please let us know.
We are also curious to hear if you like the new look &amp;amp; feel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exception histories in the Web UI&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Flink Web UI now presents the last &lt;em&gt;n&lt;/em&gt; exceptions that caused a job to fail.
This helps to debug scenarios where a root failure caused subsequent failures; the root failure
cause can be found in the exception history.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Better exception / failure-cause reporting for unsuccessful checkpoints&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Flink now provides statistics for checkpoints that failed or were aborted to make it easier
to determine the failure cause without having to analyze the logs.&lt;/p&gt;
&lt;p&gt;Prior versions of Flink reported metrics (e.g., size of persisted data, trigger time)
only if a checkpoint succeeded.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exactly-once JDBC sink&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Starting with 1.13, the JDBC sink can guarantee exactly-once delivery of results for XA-compliant databases
by committing results transactionally on checkpoints. The target database must have (or be linked
to) an XA Transaction Manager.&lt;/p&gt;
&lt;p&gt;The connector exists currently only for the &lt;em&gt;DataStream API&lt;/em&gt;, and can be created through the
&lt;code&gt;JdbcSink.exactlyOnceSink(...)&lt;/code&gt; method (or by instantiating the &lt;code&gt;JdbcXaSinkFunction&lt;/code&gt; directly).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PyFlink Table API supports User-Defined Aggregate Functions in Group Windows&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Group Windows in PyFlink’s Table API now support both general Python User-defined Aggregate
Functions (UDAFs) and Pandas UDAFs. Such functions are critical to many analysis and ML training
programs.&lt;/p&gt;
&lt;p&gt;Flink 1.13 improves upon previous releases, where these functions were only supported
in unbounded Group-by aggregations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Improved Sort-Merge Shuffle for Batch Execution&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Flink 1.13 improves the memory stability and performance of the &lt;em&gt;sort-merge blocking shuffle&lt;/em&gt;
for batch-executed programs, initially introduced in Flink 1.12 via &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-148%3A+Introduce+Sort-Merge+Based+Blocking+Shuffle+to+Flink&quot;&gt;FLIP-148&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Programs with higher parallelism (1000s) should no longer frequently trigger &lt;em&gt;OutOfMemoryError: Direct Memory&lt;/em&gt;.
The performance (especially on spinning disks) is improved through better I/O scheduling
and broadcast optimizations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HBase connector supports async lookup and lookup cache&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The HBase Lookup Table Source now supports an &lt;em&gt;async lookup mode&lt;/em&gt; and a lookup cache.
This greatly benefits the performance of Table/SQL jobs with lookup joins against HBase, while
reducing the I/O requests to HBase in the typical case.&lt;/p&gt;
&lt;p&gt;In prior versions, the HBase Lookup Source only communicated synchronously, resulting in lower
pipeline utilization and throughput.&lt;/p&gt;
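&lt;p&gt;As an illustrative sketch only (the connector identifier, tables, and option keys below follow the
HBase connector documentation and are assumptions, not quotes from this announcement), the new behavior
is enabled through table options on the HBase dimension table and then used in a regular lookup join:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- hypothetical HBase-backed dimension table with async lookups and a lookup cache
CREATE TABLE dim_users (
  rowkey STRING,
  cf ROW&amp;lt;age INT, city STRING&amp;gt;,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  &amp;#39;connector&amp;#39; = &amp;#39;hbase-2.2&amp;#39;,
  &amp;#39;table-name&amp;#39; = &amp;#39;users&amp;#39;,
  &amp;#39;zookeeper.quorum&amp;#39; = &amp;#39;...&amp;#39;,
  &amp;#39;lookup.async&amp;#39; = &amp;#39;true&amp;#39;,
  &amp;#39;lookup.cache.max-rows&amp;#39; = &amp;#39;10000&amp;#39;,
  &amp;#39;lookup.cache.ttl&amp;#39; = &amp;#39;10 min&amp;#39;
);

-- lookup join against the asynchronously queried, cached dimension table
SELECT o.order_id, u.cf.city
FROM orders AS o
JOIN dim_users FOR SYSTEM_TIME AS OF o.proc_time AS u
ON o.user_id = u.rowkey;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;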
&lt;h1 id=&quot;changes-to-consider-when-upgrading-to-flink-113&quot;&gt;Changes to consider when upgrading to Flink 1.13&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21709&quot;&gt;FLINK-21709&lt;/a&gt; - The old planner of the Table &amp;amp;
SQL API has been deprecated in Flink 1.13 and will be dropped in Flink 1.14.
The &lt;em&gt;Blink&lt;/em&gt; engine has been the default planner for some releases now and will be the only one going forward.
That means that both the &lt;code&gt;BatchTableEnvironment&lt;/code&gt; and SQL/DataSet interoperability are reaching
the end of life. Please use the unified &lt;code&gt;TableEnvironment&lt;/code&gt; for batch and stream processing going forward.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22352&quot;&gt;FLINK-22352&lt;/a&gt; - The community decided to deprecate
the Apache Mesos support for Apache Flink. It is subject to removal in the future. Users are
encouraged to switch to a different resource manager.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21935&quot;&gt;FLINK-21935&lt;/a&gt; - The &lt;code&gt;state.backend.async&lt;/code&gt;
option is deprecated. Snapshots are always asynchronous now (as they were by default before) and
there is no option to configure a synchronous snapshot anymore.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17012&quot;&gt;FLINK-17012&lt;/a&gt; - The tasks’ &lt;code&gt;RUNNING&lt;/code&gt; state was split
into two states: &lt;code&gt;INITIALIZING&lt;/code&gt; and &lt;code&gt;RUNNING&lt;/code&gt;. A task is &lt;code&gt;INITIALIZING&lt;/code&gt; while it loads the checkpointed state,
and, in the case of unaligned checkpoints, until the checkpointed in-flight data has been recovered.
This lets monitoring systems better determine when the tasks are really back to doing work by making
the phase for state restoring explicit.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21698&quot;&gt;FLINK-21698&lt;/a&gt; - The &lt;em&gt;CAST&lt;/em&gt; operation between the
NUMERIC type and the TIMESTAMP type is problematic and therefore no longer supported: Statements like
&lt;code&gt;CAST(numeric AS TIMESTAMP(3))&lt;/code&gt; will now fail. Please use &lt;code&gt;TO_TIMESTAMP(FROM_UNIXTIME(numeric))&lt;/code&gt; instead.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22133&quot;&gt;FLINK-22133&lt;/a&gt; - The unified source API for connectors
has a minor breaking change: The &lt;code&gt;SplitEnumerator.snapshotState()&lt;/code&gt; method was adjusted to accept the
&lt;em&gt;Checkpoint ID&lt;/em&gt; of the checkpoint for which the snapshot is created.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19463&quot;&gt;FLINK-19463&lt;/a&gt; - The old &lt;code&gt;StateBackend&lt;/code&gt; interfaces were deprecated
as they had overloaded semantics which many users found confusing. This is a pure API change and does not affect
runtime characteristics of applications.
For full details on how to update existing pipelines, please see the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#migrating-from-legacy-backends&quot;&gt;migration guide&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;resources&quot;&gt;Resources&lt;/h1&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;
of the Flink website, and the most recent distribution of PyFlink is available on &lt;a href=&quot;https://pypi.org/project/apache-flink/&quot;&gt;PyPI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/release-notes/flink-1.13&quot;&gt;release notes&lt;/a&gt;
carefully if you plan to upgrade your setup to Flink 1.13. This version is API-compatible with
previous 1.x releases for APIs annotated with the &lt;code&gt;@Public&lt;/code&gt; annotation.&lt;/p&gt;
&lt;p&gt;You can also check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12349287&quot;&gt;release changelog&lt;/a&gt;
and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.13/&quot;&gt;updated documentation&lt;/a&gt; for a detailed list of changes and new features.&lt;/p&gt;
&lt;h1 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h1&gt;
&lt;p&gt;The Apache Flink community would like to thank each one of the contributors that have
made this release possible:&lt;/p&gt;
&lt;p&gt;acqua.csq, AkisAya, Alexander Fedulov, Aljoscha Krettek, Ammar Al-Batool, Andrey Zagrebin, anlen321,
Anton Kalashnikov, appleyuchi, Arvid Heise, Austin Cawley-Edwards, austin ce, azagrebin, blublinsky,
Brian Zhou, bytesmithing, caozhen1937, chen qin, Chesnay Schepler, Congxian Qiu, Cristian,
cxiiiiiii, Danny Chan, Danny Cranmer, David Anderson, Dawid Wysakowicz, dbgp2021, Dian Fu,
DinoZhang, dixingxing, Dong Lin, Dylan Forciea, est08zw, Etienne Chauchot, fanrui03, Flora Tao,
FLRNKS, fornaix, fuyli, George, Giacomo Gamba, GitHub, godfrey he, GuoWei Ma, Gyula Fora,
hackergin, hameizi, Haoyuan Ge, Harshvardhan Chauhan, Haseeb Asif, hehuiyuan, huangxiao, HuangXiao,
huangxingbo, HuangXingBo, humengyu2012, huzekang, Hwanju Kim, Ingo Bürk, I. Raleigh, Ivan, iyupeng,
Jack, Jane, Jark Wu, Jerry Wang, Jiangjie (Becket) Qin, JiangXin, Jiayi Liao, JieFang.He, Jie Wang,
jinfeng, Jingsong Lee, JingsongLi, Jing Zhang, Joao Boto, JohnTeslaa, Jun Qin, kanata163, kevin.cyj,
KevinyhZou, Kezhu Wang, klion26, Kostas Kloudas, kougazhang, Kurt Young, laughing, legendtkl,
leiqiang, Leonard Xu, liaojiayi, Lijie Wang, liming.1018, lincoln lee, lincoln-lil, liushouwei,
liuyufei, LM Kang, lometheus, luyb, Lyn Zhang, Maciej Obuchowski, Maciek Próchniak, mans2singh,
Marek Sabo, Matthias Pohl, meijie, Mika Naylor, Miklos Gergely, Mohit Paliwal, Moritz Manner,
morsapaes, Mulan, Nico Kruber, openopen2, paul8263, Paul Lam, Peidian li, pengkangjing, Peter Huang,
Piotr Nowojski, Qinghui Xu, Qingsheng Ren, Raghav Kumar Gautam, Rainie Li, Ricky Burnett, Rion
Williams, Robert Metzger, Roc Marshal, Roman, Roman Khachatryan, Ruguo,
Ruguo Yu, Rui Li, Sebastian Liu, Seth Wiesman, sharkdtu, sharkdtu(涂小刚), Shengkai, shizhengchao,
shouweikun, Shuo Cheng, simenliuxing, SteNicholas, Stephan Ewen, Suo Lu, sv3ndk, Svend Vanderveken,
taox, Terry Wang, Thelgis Kotsos, Thesharing, Thomas Weise, Till Rohrmann, Timo Walther, Ting Sun,
totoro, totorooo, TsReaper, Tzu-Li (Gordon) Tai, V1ncentzzZ, vthinkxie, wangfeifan, wangpeibin,
wangyang0918, wangyemao-github, Wei Zhong, Wenlong Lyu, wineandcheeze, wjc, xiaoHoly, Xintong Song,
xixingya, xmarker, Xue Wang, Yadong Xie, yangsanity, Yangze Guo, Yao Zhang, Yuan Mei, yulei0824, Yu
Li, Yun Gao, Yun Tang, yuruguo, yushujun, Yuval Itzchakov, yuzhao.cyz, zck, zhangjunfan,
zhangzhengqi3, zhao_wei_nan, zhaown, zhaoxing, Zhenghua Gao, Zhenqiu Huang, zhisheng, zhongqishang,
zhushang, zhuxiaoshang, Zhu Zhu, zjuwangg, zoucao, zoudan, 左元, 星, 肖佳文, 龙三&lt;/p&gt;
</description>
<pubDate>Mon, 03 May 2021 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/05/03/release-1.13.0.html</link>
<guid isPermaLink="true">/news/2021/05/03/release-1.13.0.html</guid>
</item>
<item>
<title>Apache Flink 1.12.3 Released</title>
<description>&lt;p&gt;The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.&lt;/p&gt;
&lt;p&gt;This release includes 73 fixes and minor improvements for Flink 1.12.2. The list below contains a detailed overview of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.12.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18071&quot;&gt;FLINK-18071&lt;/a&gt;] - CoordinatorEventsExactlyOnceITCase.checkListContainsSequence fails on CI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20547&quot;&gt;FLINK-20547&lt;/a&gt;] - Batch job fails due to the exception in network stack
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20654&quot;&gt;FLINK-20654&lt;/a&gt;] - Unaligned checkpoint recovery may lead to corrupted data stream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20722&quot;&gt;FLINK-20722&lt;/a&gt;] - HiveTableSink should copy the record when converting RowData to Row
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20752&quot;&gt;FLINK-20752&lt;/a&gt;] - FailureRateRestartBackoffTimeStrategy allows one less restart than configured
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20761&quot;&gt;FLINK-20761&lt;/a&gt;] - Cannot read hive table/partition whose location path contains comma
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20977&quot;&gt;FLINK-20977&lt;/a&gt;] - USE DATABASE &amp;amp; USE CATALOG fails with quoted identifiers containing characters to be escaped in Flink SQL client
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21008&quot;&gt;FLINK-21008&lt;/a&gt;] - Residual HA related Kubernetes ConfigMaps and ZooKeeper nodes when cluster entrypoint received SIGTERM in shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21012&quot;&gt;FLINK-21012&lt;/a&gt;] - AvroFileFormatFactory uses non-deserializable lambda function
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21133&quot;&gt;FLINK-21133&lt;/a&gt;] - FLIP-27 Source does not work with synchronous savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21148&quot;&gt;FLINK-21148&lt;/a&gt;] - YARNSessionFIFOSecuredITCase cannot connect to BlobServer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21159&quot;&gt;FLINK-21159&lt;/a&gt;] - KafkaSourceEnumerator not sending NoMoreSplitsEvent to unassigned reader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21178&quot;&gt;FLINK-21178&lt;/a&gt;] - Task failure will not trigger master hook&amp;#39;s reset()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21289&quot;&gt;FLINK-21289&lt;/a&gt;] - Application mode ignores the pipeline.classpaths configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21387&quot;&gt;FLINK-21387&lt;/a&gt;] - DispatcherTest.testInvalidCallDuringInitialization times out on azp
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21388&quot;&gt;FLINK-21388&lt;/a&gt;] - Parquet DECIMAL logical type is not properly supported in ParquetSchemaConverter
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21431&quot;&gt;FLINK-21431&lt;/a&gt;] - UpsertKafkaTableITCase.testTemporalJoin hang
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21434&quot;&gt;FLINK-21434&lt;/a&gt;] - When UDAF return ROW type, and the number of fields is more than 14, the crash happend
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21497&quot;&gt;FLINK-21497&lt;/a&gt;] - JobLeaderIdService completes leader future despite no leader being elected
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21515&quot;&gt;FLINK-21515&lt;/a&gt;] - SourceStreamTaskTest.testStopWithSavepointShouldNotInterruptTheSource is failing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21518&quot;&gt;FLINK-21518&lt;/a&gt;] - CheckpointCoordinatorTest.testMinCheckpointPause fails fatally on AZP
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21523&quot;&gt;FLINK-21523&lt;/a&gt;] - ArrayIndexOutOfBoundsException occurs while run a hive streaming job with partitioned table source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21535&quot;&gt;FLINK-21535&lt;/a&gt;] - UnalignedCheckpointITCase.execute failed with &amp;quot;OutOfMemoryError: Java heap space&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21550&quot;&gt;FLINK-21550&lt;/a&gt;] - ZooKeeperHaServicesTest.testSimpleClose fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21552&quot;&gt;FLINK-21552&lt;/a&gt;] - The managed memory was not released if exception was thrown in createPythonExecutionEnvironment
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21606&quot;&gt;FLINK-21606&lt;/a&gt;] - TaskManager connected to invalid JobManager leading to TaskSubmissionException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21609&quot;&gt;FLINK-21609&lt;/a&gt;] - SimpleRecoveryITCaseBase.testRestartMultipleTimes fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21654&quot;&gt;FLINK-21654&lt;/a&gt;] - YARNSessionCapacitySchedulerITCase.testStartYarnSessionClusterInQaTeamQueue fail because of NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21661&quot;&gt;FLINK-21661&lt;/a&gt;] - SHARD_GETRECORDS_INTERVAL_MILLIS wrong use?
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21685&quot;&gt;FLINK-21685&lt;/a&gt;] - Flink JobManager failed to restart from checkpoint in kubernetes HA setup
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21691&quot;&gt;FLINK-21691&lt;/a&gt;] - KafkaSource fails with NPE when setting it up
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21707&quot;&gt;FLINK-21707&lt;/a&gt;] - Job is possible to hang when restarting a FINISHED task with POINTWISE BLOCKING consumers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21710&quot;&gt;FLINK-21710&lt;/a&gt;] - FlinkRelMdUniqueKeys gets incorrect result on TableScan after project push-down
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21725&quot;&gt;FLINK-21725&lt;/a&gt;] - DataTypeExtractor extracts wrong fields ordering for Tuple12
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21733&quot;&gt;FLINK-21733&lt;/a&gt;] - WatermarkAssigner incorrectly recomputing the rowtime index which may cause ArrayIndexOutOfBoundsException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21746&quot;&gt;FLINK-21746&lt;/a&gt;] - flink sql fields in row access error about scalarfunction
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21753&quot;&gt;FLINK-21753&lt;/a&gt;] - Cycle references between memory manager and gc cleaner action
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21817&quot;&gt;FLINK-21817&lt;/a&gt;] - New Kafka Source might break subtask and split assignment upon rescale
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21833&quot;&gt;FLINK-21833&lt;/a&gt;] - TemporalRowTimeJoinOperator.java will lead to the state expansion by short-life-cycle &amp;amp; huge RowData, although config idle.state.retention.time
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21889&quot;&gt;FLINK-21889&lt;/a&gt;] - source:canal-cdc , sink:upsert-kafka, print &amp;quot;select * from sinkTable&amp;quot;, throw NullException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21922&quot;&gt;FLINK-21922&lt;/a&gt;] - The method partition_by in Over doesn&amp;#39;t work for expression dsl
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21933&quot;&gt;FLINK-21933&lt;/a&gt;] - [kinesis][efo] EFO consumer treats interrupts as retryable exceptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21941&quot;&gt;FLINK-21941&lt;/a&gt;] - testSavepointRescalingOutPartitionedOperatorStateList fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21942&quot;&gt;FLINK-21942&lt;/a&gt;] - KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21944&quot;&gt;FLINK-21944&lt;/a&gt;] - AbstractArrowPythonAggregateFunctionOperator.dispose should consider whether arrowSerializer is null
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21969&quot;&gt;FLINK-21969&lt;/a&gt;] - PythonTimestampsAndWatermarksOperator emitted the Long.MAX_VALUE watermark before emitting all the data
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21980&quot;&gt;FLINK-21980&lt;/a&gt;] - ZooKeeperRunningJobsRegistry creates an empty znode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21986&quot;&gt;FLINK-21986&lt;/a&gt;] - taskmanager native memory not release timely after restart
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21992&quot;&gt;FLINK-21992&lt;/a&gt;] - Fix availability notification in UnionInputGate
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21996&quot;&gt;FLINK-21996&lt;/a&gt;] - Transient RPC failure without TaskManager failure can lead to split assignment loss
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22006&quot;&gt;FLINK-22006&lt;/a&gt;] - Could not run more than 20 jobs in a native K8s session when K8s HA enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22024&quot;&gt;FLINK-22024&lt;/a&gt;] - Maven: Entry has not been leased from this pool / fix for release 1.12
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22053&quot;&gt;FLINK-22053&lt;/a&gt;] - NumberSequenceSource causes fatal exception when less splits than parallelism.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22055&quot;&gt;FLINK-22055&lt;/a&gt;] - RPC main thread executor may schedule commands with wrong time unit of delay
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22061&quot;&gt;FLINK-22061&lt;/a&gt;] - The DEFAULT_NON_SPLITTABLE_FILE_ENUMERATOR defined in FileSource should points to NonSplittingRecursiveEnumerator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22081&quot;&gt;FLINK-22081&lt;/a&gt;] - Entropy key not resolved if flink-s3-fs-hadoop is added as a plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22082&quot;&gt;FLINK-22082&lt;/a&gt;] - Nested projection push down doesn&amp;#39;t work for data such as row(array(row))
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22124&quot;&gt;FLINK-22124&lt;/a&gt;] - The job finished without any exception if error was thrown during state access
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22172&quot;&gt;FLINK-22172&lt;/a&gt;] - Fix the bug of shared resource among Python Operators of the same slot is not released
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22184&quot;&gt;FLINK-22184&lt;/a&gt;] - Rest client shutdown on failure runs in netty thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22191&quot;&gt;FLINK-22191&lt;/a&gt;] - PyFlinkStreamUserDefinedFunctionTests.test_udf_in_join_condition_2 fail due to NPE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22327&quot;&gt;FLINK-22327&lt;/a&gt;] - NPE exception happens if it throws exception in finishBundle during job shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22339&quot;&gt;FLINK-22339&lt;/a&gt;] - Fix some encoding exceptions were not thrown in cython coders
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22345&quot;&gt;FLINK-22345&lt;/a&gt;] - CoordinatorEventsExactlyOnceITCase hangs on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22385&quot;&gt;FLINK-22385&lt;/a&gt;] - Type mismatch in NetworkBufferPool
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20533&quot;&gt;FLINK-20533&lt;/a&gt;] - Add histogram support to Datadog reporter
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21382&quot;&gt;FLINK-21382&lt;/a&gt;] - Standalone K8s documentation does not explain usage of standby JobManagers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21521&quot;&gt;FLINK-21521&lt;/a&gt;] - Pretty print K8s specifications
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21690&quot;&gt;FLINK-21690&lt;/a&gt;] - remove redundant tolerableCheckpointFailureNumber setting in CheckpointConfig
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21735&quot;&gt;FLINK-21735&lt;/a&gt;] - Harden JobMaster#updateTaskExecutionState()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22051&quot;&gt;FLINK-22051&lt;/a&gt;] - Better document the distinction between stop-with-savepoint and stop-with-savepoint-with-drain
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22142&quot;&gt;FLINK-22142&lt;/a&gt;] - Remove console logging for Kafka connector for AZP runs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22208&quot;&gt;FLINK-22208&lt;/a&gt;] - Bump snappy-java to 1.1.5+
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-22297&quot;&gt;FLINK-22297&lt;/a&gt;] - Perform early check to ensure that the length of the result is the same as the input for Pandas UDF
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Thu, 29 Apr 2021 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/04/29/release-1.12.3.html</link>
<guid isPermaLink="true">/news/2021/04/29/release-1.12.3.html</guid>
</item>
<item>
<title>Stateful Functions 3.0.0: Remote Functions Front and Center</title>
<description>&lt;p&gt;The Apache Flink community is happy to announce the release of Stateful Functions (StateFun) 3.0.0!
Stateful Functions is a cross-platform stack for building Stateful Serverless applications, making it radically simpler
to develop scalable, consistent, and elastic distributed applications.&lt;/p&gt;
&lt;p&gt;This new release brings &lt;strong&gt;remote functions to the front and center of StateFun&lt;/strong&gt;, making the disaggregated setup that
separates the application logic from the StateFun cluster the default. It is now easier, more efficient, and more
ergonomic to write applications that live in their own processes or containers. With the new Java SDK this is now also
possible for all JVM languages, in addition to Python.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#background&quot; id=&quot;markdown-toc-background&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-features&quot; id=&quot;markdown-toc-new-features&quot;&gt;New Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#unified-language-sdks&quot; id=&quot;markdown-toc-unified-language-sdks&quot;&gt;Unified Language SDKs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#cross-language-type-system&quot; id=&quot;markdown-toc-cross-language-type-system&quot;&gt;Cross-Language Type System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#dynamic-registration-of-state-and-functions&quot; id=&quot;markdown-toc-dynamic-registration-of-state-and-functions&quot;&gt;Dynamic Registration of State and Functions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-resources&quot; id=&quot;markdown-toc-release-resources&quot;&gt;Release Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;
&lt;p&gt;Starting with the first StateFun release, before the project was donated to the Apache Software Foundation, our focus was: &lt;strong&gt;making scalable stateful applications easy to build and run&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The first StateFun version introduced an SDK for writing stateful functions that together build up a StateFun application, packaged and deployed as a Flink job submitted to a Flink cluster. Having functions execute within the same JVM as Flink has some advantages, such as the deployment’s performance and immutability. However, it had a few limitations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;❌ ⠀Functions can be written only in a JVM-based language.&lt;/li&gt;
&lt;li&gt;❌ ⠀A blocking call or CPU-heavy task in one function can affect other functions and operations that need to complete in a timely manner, such as checkpointing.&lt;/li&gt;
&lt;li&gt;❌ ⠀Deploying a new version of the function required a stateful upgrade of the backing Flink job.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With StateFun 2.0.0, the debut official release after the project was donated to Apache Flink, the community introduced the concept of &lt;em&gt;remote functions&lt;/em&gt;, together with an additional SDK for the Python language.
A remote function is a function that executes in a separate process and is invoked via HTTP by the StateFun cluster processes.
Remote functions introduce a new and exciting capability: &lt;strong&gt;state and compute disaggregation&lt;/strong&gt; - allowing users to scale the functions independently of the StateFun cluster, which essentially plays the role of handling messaging and state in a consistent and fault-tolerant manner.&lt;/p&gt;
&lt;p&gt;While remote functions did address the limitations (1) and (2) mentioned above, we still had some room to improve:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;❌ ⠀A stateful restart of the StateFun processes is required to register a new function or to change the state definitions of an existing function.&lt;/li&gt;
&lt;li&gt;❌ ⠀The SDK had a few friction points around state and messaging ergonomics: it depended heavily on Google’s Protocol Buffers for its multi-language object representation.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As business requirements evolve, the application logic naturally evolves with them. For StateFun applications, this often means
typical changes such as adding new functions to the application or updating some existing functions to include new state to be persisted.
This is where the first limitation becomes an issue - such operations require a stateful restart of the StateFun cluster
in order for the changes to be discovered, meaning that &lt;em&gt;all&lt;/em&gt; functions of the application would have some downtime for
this to take effect. With remote functions being standalone instances that are meant to be independent of the StateFun cluster processes,
this is clearly not ideal. By making remote functions the default in StateFun, we aim to enable full flexibility
and ease of operations for application upgrades.&lt;/p&gt;
&lt;p&gt;The second limitation, around state and messaging ergonomics, had come up a few times in feedback from our users. Prior to this release,
all state values and message objects were strictly required to be Protobuf objects. This made it cumbersome to use common
types such as JSON or simple strings as state and messages.&lt;/p&gt;
&lt;p&gt;With the new StateFun 3.0.0 release, the community has enhanced the remote functions protocol (the protocol that describes how StateFun communicates with the remote function processes) to address all the issues mentioned above.
Building on the new protocol, we rewrote the Python SDK and introduced a brand new remote Java SDK.&lt;/p&gt;
&lt;h2 id=&quot;new-features&quot;&gt;New Features&lt;/h2&gt;
&lt;h3 id=&quot;unified-language-sdks&quot;&gt;Unified Language SDKs&lt;/h3&gt;
&lt;p&gt;One of the goals that we set out to achieve with the SDKs is a unified set of concepts across all the languages.
Having standard and unified SDK concepts across the board makes it straightforward for users to switch the languages their
functions are implemented in.&lt;/p&gt;
&lt;p&gt;Here is the same function written with the updated Python SDK and newly added Java SDK in StateFun 3.0.0:&lt;/p&gt;
&lt;h4 id=&quot;python&quot;&gt;Python&lt;/h4&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@functions.bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typename&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;example/greeter&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;specs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ValueSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;visits&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IntType&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;async&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# update the visit count.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# compute a greeting&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;greeting&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Hello there {name} at the {visits}th time!&amp;quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message_builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;target_typename&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;typename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;target_id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;str_value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;greeting&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id=&quot;java&quot;&gt;Java&lt;/h4&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Greeter&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueSpec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VISITS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueSpec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;named&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;visits&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withIntType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CompletableFuture&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Void&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// update the visits count&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;VISITS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;orElse&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;VISITS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// compute a greeting&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;asUtf8String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;greeting&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Hello there %s at the %d-th time!\n&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;visits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// reply to the caller with a greeting&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;caller&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MessageBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forAddress&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;caller&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;greeting&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;done&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Although there are some language-specific differences, the terms and concepts are the same:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;an address scoped storage acting as a key-value store for a particular address.&lt;/li&gt;
&lt;li&gt;a unified cross-language way to send, receive, and store values across languages (see also &lt;em&gt;Cross-Language Type System&lt;/em&gt; below).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ValueSpec&lt;/code&gt; to describe the state name, type and possibly expiration configuration. Please note that it is no longer necessary to declare the state ahead of time in a &lt;code&gt;module.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a detailed SDK tutorial, we encourage you to visit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink-statefun-playground/tree/release-3.0/java/showcase&quot;&gt;Java SDK showcase&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/flink-statefun-playground/tree/release-3.0/python/showcase&quot;&gt;Python SDK showcase&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;cross-language-type-system&quot;&gt;Cross-Language Type System&lt;/h3&gt;
&lt;p&gt;StateFun 3.0.0 introduces a new type system with cross-language support for common primitive types, such as boolean,
integer, long, etc. This is all transparent to the user, so you don’t need to worry about it. Functions
implemented in various languages (e.g. Java or Python) can message each other by directly sending supported primitive
values as message arguments. Moreover, the type system is used for state values as well, so a function
can safely read previously written state even after being reimplemented in a different language.&lt;/p&gt;
&lt;p&gt;The type system is also easily extensible to support custom message types, such as JSON or Protobuf messages.
StateFun provides builder utilities to help you create custom types, as sketched below.&lt;/p&gt;
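&lt;p&gt;As an illustration only (this sketch is not part of the release announcement), a JSON-backed custom type in the Java SDK could look roughly like the following, assuming Jackson for serialization; the typename and the &lt;code&gt;User&lt;/code&gt; POJO are hypothetical, and the helper names should be double-checked against the SDK documentation:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch of a custom JSON type: the typename &amp;quot;example/user&amp;quot; and the User POJO are
// made up for illustration, and the SimpleType/TypeName helpers are assumed from the Java SDK.
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.statefun.sdk.java.TypeName;
import org.apache.flink.statefun.sdk.java.types.SimpleType;
import org.apache.flink.statefun.sdk.java.types.Type;

public final class UserTypes {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  // Plain POJO that functions exchange as JSON and persist as state.
  public static final class User {
    public String name;
    public int visits;
  }

  // Serializer and deserializer are simple method references over the Jackson mapper.
  public static final Type&amp;lt;User&amp;gt; USER_JSON_TYPE =
      SimpleType.simpleImmutableTypeFrom(
          TypeName.typeNameFromString(&amp;quot;example/user&amp;quot;),
          MAPPER::writeValueAsBytes,
          bytes -&amp;gt; MAPPER.readValue(bytes, User.class));
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A type defined this way can then be used for both state values and messages through the custom-type variants of &lt;code&gt;ValueSpec&lt;/code&gt; and &lt;code&gt;MessageBuilder&lt;/code&gt;; see the SDK documentation for the exact builder methods.&lt;/p&gt;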
&lt;h3 id=&quot;dynamic-registration-of-state-and-functions&quot;&gt;Dynamic Registration of State and Functions&lt;/h3&gt;
&lt;p&gt;Starting with this release it is now possible to dynamically register new functions without going through a stateful upgrade cycle of the StateFun cluster (which entails the standard process of performing a stateful restart of a Flink job).
This is achieved with a new &lt;code&gt;endpoint&lt;/code&gt; definition that supports target URL templating.&lt;/p&gt;
&lt;p&gt;Consider the following definition:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;endpoints&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;p-Indicator&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;endpoint&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;meta&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;http&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;functions&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;example/*&lt;/span&gt;
&lt;span class=&quot;l-Scalar-Plain&quot;&gt;urlPathTemplate&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;l-Scalar-Plain&quot;&gt;https://loadbalancer.svc.cluster.local/{function.name}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With this definition, all messages being addressed to functions under the namespace &lt;code&gt;example&lt;/code&gt; will be forwarded to the specified templated URL.
For example, a message being addressed to a function of typename &lt;code&gt;example/greeter&lt;/code&gt; would be forwarded to &lt;code&gt;https://loadbalancer.svc.cluster.local/greeter&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This unlocks the possibility to dynamically introduce new functions into the topology without ever restarting the Stateful Functions application.&lt;/p&gt;
&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;
&lt;p&gt;With 3.0.0, we’ve brought remote functions to the front and center of StateFun. This is done by a new remote function protocol that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;✅ ⠀Allows registering new functions and changing the state definitions of existing functions dynamically, without any downtime, and&lt;/li&gt;
&lt;li&gt;✅ ⠀Provides a cross-language type system, which comes along with a few built-in primitive types, that can be used for messaging and state.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A new Java SDK was added for remote functions, extending the set of supported languages to include all JVM-based languages.
The language SDKs now share unified concepts and constructs in their APIs, so they will all feel familiar to work with
when switching between languages for your functions. In upcoming releases, the community is looking forward to
continuing to build on top of the new remote function protocol and to providing even more language SDKs, such as Golang.&lt;/p&gt;
&lt;h2 id=&quot;release-resources&quot;&gt;Release Resources&lt;/h2&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website, and the most recent Python SDK distribution is available on &lt;a href=&quot;https://pypi.org/project/apache-flink-statefun/&quot;&gt;PyPI&lt;/a&gt;.
You can also find official StateFun Docker images of the new version on &lt;a href=&quot;https://hub.docker.com/r/apache/flink-statefun&quot;&gt;Dockerhub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more details, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-3.0/&quot;&gt;updated documentation&lt;/a&gt; and the
&lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12348822&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features if you plan to upgrade your setup to Stateful Functions 3.0.0.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/browse/&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all contributors that have made this release possible:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;Authuir, Chesnay Schepler, David Anderson, Dian Fu, Frans King, Galen Warren, Guillaume Vauvert, Igal Shilman, Ismaël Mejía, Kartik Khare, Konstantin Knauf, Marta Paes Moreira, Patrick Lucas, Patrick Wiener, Rafi Aroch, Robert Metzger, RocMarshal, Seth Wiesman, Siddique Ahmad, SteNicholas, Stephan Ewen, Timothy Bess, Tymur Yarosh, Tzu-Li (Gordon) Tai, Ufuk Celebi, abc863377, billyrrr, congxianqiu, danp11, hequn8128, kaibo, klion26, morsapaes, slinkydeveloper, wangchao, wangzzu, winder
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Thu, 15 Apr 2021 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2021/04/15/release-statefun-3.0.0.html</link>
<guid isPermaLink="true">/news/2021/04/15/release-statefun-3.0.0.html</guid>
</item>
<item>
<title>A Rundown of Batch Execution Mode in the DataStream API</title>
<description>&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#which-api-and-execution-mode-should-i-use&quot; id=&quot;markdown-toc-which-api-and-execution-mode-should-i-use&quot;&gt;Which API and execution mode should I use?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#how-to-use-the-batch-execution&quot; id=&quot;markdown-toc-how-to-use-the-batch-execution&quot;&gt;How to use the &lt;em&gt;batch&lt;/em&gt; execution&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#hello-batch-mode&quot; id=&quot;markdown-toc-hello-batch-mode&quot;&gt;Hello &lt;em&gt;batch&lt;/em&gt; mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#example-two-input-operators&quot; id=&quot;markdown-toc-example-two-input-operators&quot;&gt;Example: Two input operators&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#looking-into-the-future&quot; id=&quot;markdown-toc-looking-into-the-future&quot;&gt;Looking into the future&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;Flink has been following the mantra that &lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;Batch is a Special Case of Streaming&lt;/a&gt; since the very early days. As the project evolved to address specific use cases, different core APIs ended up being implemented for &lt;em&gt;batch&lt;/em&gt; (DataSet API) and &lt;em&gt;streaming&lt;/em&gt; execution (DataStream API), but the higher-level Table API/SQL was subsequently designed following this mantra of &lt;em&gt;unification&lt;/em&gt;. With Flink 1.12, the community worked on bringing a similarly unified behaviour to the DataStream API, and took the first steps towards enabling efficient &lt;a href=&quot;https://cwiki.apache.org/confluence/x/4i94CQ&quot;&gt;batch execution in the DataStream API&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The idea behind making the DataStream API a unified abstraction for &lt;em&gt;batch&lt;/em&gt; and &lt;em&gt;streaming&lt;/em&gt; execution instead of maintaining separate APIs is two-fold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Reusability: efficient batch and stream processing under the same API would allow you to easily switch between both execution modes without rewriting any code. So, a job could be easily reused to process real-time and historical data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Operational simplicity: providing a unified API would mean using a single set of connectors, maintaining a single codebase and being able to easily implement mixed execution pipelines e.g. for use cases like backfilling.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;p&gt;The difference between BATCH and STREAMING vs BOUNDED and UNBOUNDED is subtle, and a common source of confusion — so, let’s start by clarifying that. These terms might seem mostly interchangeable, but in reality serve different purposes:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bounded&lt;/em&gt; and &lt;em&gt;unbounded&lt;/em&gt; refer to the &lt;strong&gt;characteristics&lt;/strong&gt; of the streams you want to process: whether or not they are known to have an end. The terms are also sometimes applied to the applications processing these streams: an application that only processes bounded streams is a &lt;em&gt;bounded&lt;/em&gt; stream processing application that eventually finishes; while an &lt;em&gt;unbounded&lt;/em&gt; stream processing application processes an unbounded stream and runs forever (or until canceled).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Batch&lt;/em&gt; and &lt;em&gt;streaming&lt;/em&gt; are &lt;strong&gt;execution modes&lt;/strong&gt;. Batch execution is only applicable to bounded streams/applications because it exploits the fact that it can process the whole data (e.g. from a partition) in a batch rather than event-by-event, and possibly execute different batches one after the other. Continuous streaming execution runs everything at the same time, continuously processes (small groups of) events and is applicable to both bounded and unbounded applications.&lt;/p&gt;
&lt;p&gt;Based on that differentiation, there are two main scenarios that result from the combination of these properties:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A &lt;em&gt;bounded&lt;/em&gt; Stream Processing Application that is executed in &lt;em&gt;batch&lt;/em&gt; mode, which you can call a Batch (Processing) Application.&lt;/li&gt;
&lt;li&gt;An &lt;em&gt;unbounded&lt;/em&gt; Stream Processing Application that is executed in &lt;em&gt;streaming&lt;/em&gt; mode. This is the combination that has been the primary use case for the DataStream API in Flink.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It’s also possible to have a &lt;em&gt;bounded&lt;/em&gt; Stream Processing Application that is executed in &lt;em&gt;streaming&lt;/em&gt; mode, but this combination is less significant and likely to be used e.g. in a test environment or in other rare corner cases.&lt;/p&gt;
&lt;h2 id=&quot;which-api-and-execution-mode-should-i-use&quot;&gt;Which API and execution mode should I use?&lt;/h2&gt;
&lt;p&gt;Before going into the choice of execution mode, try looking at your use case from a different angle: do you need to process structured data? Does your data have a schema of some sort? The Table API/SQL will most likely be the right choice. In fact, the majority of &lt;em&gt;batch&lt;/em&gt; use cases should be expressed with the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/table/&quot;&gt;Table API/SQL&lt;/a&gt;! Finite, bounded data can most often be organized, described with a schema and put into a catalog. This is where the SQL API shines, giving you a rich set of functions and operators out of the box with low-level optimizations and broad connector support, all supported by standard SQL. And it works for &lt;em&gt;streaming&lt;/em&gt; use cases, as well!&lt;/p&gt;
&lt;p&gt;However, if you need explicit control over the execution graph, you want to manually control the state of your operations, or you need to be able to upgrade Flink (which applies to &lt;em&gt;unbounded&lt;/em&gt; applications), the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/datastream_api.html&quot;&gt;DataStream API&lt;/a&gt; is the right choice.
If the DataStream API sounds like the best fit for your use cases, the next decision is what execution mode to run your program in.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When should you use the &lt;em&gt;batch&lt;/em&gt; mode, then?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The simple answer is: if you run your computation on &lt;em&gt;bounded&lt;/em&gt;, historic data. The &lt;em&gt;batch&lt;/em&gt; mode has a few benefits:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In &lt;em&gt;bounded&lt;/em&gt; data there is no such thing as late data. You do not need to think about how to adjust the watermarking logic that you use in your application. In a streaming case, you need to maintain the order in which the records were written, which is often not possible to recreate when reading from e.g. historic files. In &lt;em&gt;batch&lt;/em&gt; mode you don’t need to care about that, as the data will be sorted according to the timestamp and “perfect” watermarks will be injected automatically.&lt;/li&gt;
&lt;li&gt;The way streaming applications are scheduled and react upon failure has significant performance implications that can be optimized when dealing with &lt;em&gt;bounded&lt;/em&gt; data. We recommend reading through the blog posts on &lt;a href=&quot;https://flink.apache.org/2020/12/15/pipelined-region-sheduling.html&quot;&gt;pipelined region scheduling&lt;/a&gt; and &lt;a href=&quot;https://flink.apache.org/news/2021/01/11/batch-fine-grained-fault-tolerance.html&quot;&gt;fine-grained fault tolerance&lt;/a&gt; to better understand these performance implications.&lt;/li&gt;
&lt;li&gt;It can simplify the operational overhead of setting up and maintaining your pipelines. For example, there is no need to configure checkpointing, which otherwise requires things like choosing a state backend or setting up distributed storage for checkpoints.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&quot;how-to-use-the-batch-execution&quot;&gt;How to use the &lt;em&gt;batch&lt;/em&gt; execution&lt;/h2&gt;
&lt;p&gt;Once you have a good understanding of which execution mode is better suited to your use case, you can configure it via the &lt;code&gt;execution.runtime-mode&lt;/code&gt; setting. There are three possible values:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;STREAMING&lt;/code&gt;: The classic DataStream execution mode (default)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;BATCH&lt;/code&gt;: Batch-style execution on the DataStream API&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AUTOMATIC&lt;/code&gt;: Let the system decide based on the boundedness of the sources&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This can be configured via command line parameters of &lt;code&gt;bin/flink run ...&lt;/code&gt; when submitting a job:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;bin/flink run -Dexecution.runtime-mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;BATCH examples/streaming/WordCount.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Alternatively, you can set it programmatically when creating/configuring the &lt;code&gt;StreamExecutionEnvironment&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
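&lt;p&gt;The same setter accepts the other modes as well. For example, with &lt;code&gt;AUTOMATIC&lt;/code&gt; the runtime picks the execution mode based on the boundedness of the sources (a small illustrative snippet, not part of the original example):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Let Flink choose between BATCH and STREAMING depending on whether all sources are bounded.
env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;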
&lt;p&gt;We recommend passing the execution mode when submitting the job, in order to keep your code configuration-free and potentially be able to execute the same application in different execution modes.&lt;/p&gt;
&lt;h3 id=&quot;hello-batch-mode&quot;&gt;Hello &lt;em&gt;batch&lt;/em&gt; mode&lt;/h3&gt;
&lt;p&gt;Now that you know how to set the execution mode, let’s try to write a simple word count program and see how it behaves depending on the chosen mode. The program is a variation of a standard word count, where we count the number of orders placed
in a given currency. We derive the number in 1-day windows. We read the input data from a new &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/connector/file/src/FileSource.html&quot;&gt;unified file source&lt;/a&gt; and then apply a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/stream/operators/windows.html#windows&quot;&gt;window aggregation&lt;/a&gt;. Notice that we will be checking the side output for late arriving data, which can illustrate how watermarks behave differently in the two execution modes.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowWordCount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OutputTag&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LATE_DATA&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OutputTag&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;late-data&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BasicArrayTypeInfo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING_ARRAY_TYPE_INFO&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ParameterTool&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromArgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;path&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SingleOutputStreamOperator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FileSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forRecordStreamFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;TsvFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WatermarkStrategy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;forBoundedOutOfOrderness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Duration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ofDays&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withTimestampAssigner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;OrderTimestampAssigner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()),&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;Text file&amp;quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keyBy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// group by currency&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TumblingEventTimeWindows&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;days&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;sideOutputLateData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LATE_DATA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;aggregate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;CountFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// count number of orders in a given currency&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;CombineWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lateData&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getSideOutput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LATE_DATA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CloseableIterator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lateData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;executeAndCollect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;hasNext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;late&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Arrays&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;late&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Number of late records: &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CloseableIterator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;executeAndCollect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;hasNext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If we simply execute the above program with:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;bin/flink run examples/streaming/WindowWordCount.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;it will be executed in &lt;em&gt;streaming&lt;/em&gt; mode by default. Because of that, it will use the given watermarking strategy and produce windows based on it. In real-time scenarios, it might happen that records do not adhere to watermarks and
some records might actually be considered late, so you’ll get results like:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;...
[1431681, 130936, F, 135996.21, NOK, 2020-04-11 07:53:02.674, 2-HIGH, Clerk#000000922, 0, quests. slyly regular platelets cajole ironic deposits: blithely even depos]
[1431744, 143957, F, 36391.24, CHF, 2020-04-11 07:53:27.631, 2-HIGH, Clerk#000000406, 0, eans. blithely special instructions are quickly. q]
[1431812, 58096, F, 55292.05, CAD, 2020-04-11 07:54:16.956, 2-HIGH, Clerk#000000561, 0, , regular packages use. slyly even instr]
[1431844, 77335, O, 415443.20, CAD, 2020-04-11 07:54:40.967, 2-HIGH, Clerk#000000446, 0, unts across the courts wake after the accounts! ruthlessly]
[1431968, 122005, F, 44964.19, JPY, 2020-04-11 07:55:42.661, 1-URGENT, Clerk#000000001, 0, nal theodolites against the slyly special packages poach blithely special req]
[1432097, 26035, F, 42464.15, CAD, 2020-04-11 07:57:13.423, 5-LOW, Clerk#000000213, 0, l accounts hang blithely. carefully blithe dependencies ]
[1432193, 97537, F, 87856.63, NOK, 2020-04-11 07:58:06.862, 4-NOT SPECIFIED, Clerk#000000356, 0, furiously furiously brave foxes. bo]
[1432291, 112045, O, 114327.52, JPY, 2020-04-11 07:59:12.912, 1-URGENT, Clerk#000000732, 0, ding to the fluffily ironic requests haggle carefully alongsid]
Number of late records: 1514
(GBP,374,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(HKD,401,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CNY,402,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CAD,392,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(JPY,411,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CHF,371,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(NOK,370,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(RUB,365,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;However, if you execute the exact same code using the &lt;em&gt;batch&lt;/em&gt; execution mode:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;bin/flink run -Dexecution.runtime-mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;BATCH examples/streaming/WindowWordCount.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;you’ll see that there won’t be any late records.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;Number of late records: 0
(GBP,374,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(HKD,401,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CNY,402,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CAD,392,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(JPY,411,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(CHF,371,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(NOK,370,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
(RUB,365,2020-03-31T00:00:00Z,2020-04-01T00:00:00Z)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Also, if you compare the execution timelines of both runs, you’ll see that the jobs were scheduled differently. In the case of &lt;em&gt;batch&lt;/em&gt; execution, the two stages were executed one after the other:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/img/blog/2021-03-11-batch-execution-mode/batch-execution.png&quot;&gt;&lt;img src=&quot;/img/blog/2021-03-11-batch-execution-mode/batch-execution.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;whereas for &lt;em&gt;streaming&lt;/em&gt; both stages started at the same time.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/img/blog/2021-03-11-batch-execution-mode/stream-execution.png&quot;&gt;&lt;img src=&quot;/img/blog/2021-03-11-batch-execution-mode/stream-execution.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
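&lt;p&gt;As a side note, the execution mode does not have to be selected on the command line; it can also be set programmatically on the &lt;code&gt;StreamExecutionEnvironment&lt;/code&gt;. Below is only a minimal sketch (the class name and job name are purely illustrative and not part of the example above); keeping the mode out of the code, as in the commands above, is usually preferable so that the same jar can run in either mode:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RuntimeModeSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Force BATCH execution from code. Passing -Dexecution.runtime-mode=BATCH on the
        // command line (or via the configuration) keeps the same jar runnable in both modes.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // ... define sources, transformations and sinks here ...

        env.execute(&amp;quot;Example in BATCH mode&amp;quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;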
&lt;h3 id=&quot;example-two-input-operators&quot;&gt;Example: Two input operators&lt;/h3&gt;
&lt;p&gt;Operators that process data from multiple inputs can be executed in both execution modes as well. Let’s see how we might implement a join of two data sets on a common key. (Disclaimer: Make sure to consider first whether you &lt;a href=&quot;#which-api-and-execution-mode-should-i-use&quot;&gt;should use the Table API/SQL&lt;/a&gt; for your join!) We will enrich a stream of orders with information about the customers and run it in either of the two modes.&lt;/p&gt;
&lt;p&gt;For this particular use case, the DataStream API provides a &lt;code&gt;DataStream#join&lt;/code&gt; method that requires a window in which the join must happen; since we’ll process the data in bulk, we can use a &lt;code&gt;GlobalWindow&lt;/code&gt; (that would otherwise not be very useful on its own in an &lt;em&gt;unbounded&lt;/em&gt; case due to state size concerns):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;DataStreamSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FileSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forRecordStreamFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;TsvFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ordersPath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WatermarkStrategy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;noWatermarks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withTimestampAssigner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;Text file&amp;quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customersPath&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;customers&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStreamSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fromSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FileSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;forRecordStreamFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;TsvFormat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customersPath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WatermarkStrategy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;noWatermarks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withTimestampAssigner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;Text file&amp;quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;customers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;equalTo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;customer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// join on customer id&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GlobalWindows&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;trigger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ContinuousProcessingTimeTrigger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;seconds&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ProjectFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You might notice the &lt;code&gt;ContinuousProcessingTimeTrigger&lt;/code&gt;. It is there so that the application produces results in &lt;em&gt;streaming&lt;/em&gt; mode. In a &lt;em&gt;streaming&lt;/em&gt; application the &lt;code&gt;GlobalWindow&lt;/code&gt; never finishes, so we need to add a processing-time trigger to emit results from time to time. We see triggers as a way to control when to emit results, not as part of the logic of what to emit. Therefore we think it is safe to ignore them in &lt;em&gt;batch&lt;/em&gt; mode, and that is what we do: in &lt;em&gt;batch&lt;/em&gt; mode you will just get one final result for the join.&lt;/p&gt;
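&lt;p&gt;As the disclaimer above suggests, a relational join like this one is often easier to express with the Table API/SQL, where the planner picks an appropriate join strategy in both execution modes. The following is only a hypothetical sketch, assuming the two inputs have already been registered as tables named &lt;code&gt;Orders&lt;/code&gt; and &lt;code&gt;Customers&lt;/code&gt; (these table and column names are illustrative and not part of the example above):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SqlJoinSketch {
    public static void main(String[] args) {
        // inBatchMode() / inStreamingMode() selects the execution mode for the Table API.
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inBatchMode().build());

        // Assumes the Orders and Customers tables were registered beforehand,
        // e.g. with tEnv.executeSql(&amp;quot;CREATE TABLE ...&amp;quot;).
        Table enriched = tEnv.sqlQuery(
                &amp;quot;SELECT o.*, c.name &amp;quot; +
                &amp;quot;FROM Orders AS o &amp;quot; +
                &amp;quot;JOIN Customers AS c ON o.customer_id = c.customer_id&amp;quot;);

        enriched.execute().print();
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;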
&lt;h2 id=&quot;looking-into-the-future&quot;&gt;Looking into the future&lt;/h2&gt;
&lt;p&gt;Support for efficient &lt;em&gt;batch&lt;/em&gt; execution in the DataStream API was introduced in Flink 1.12 as a first step towards achieving a truly unified runtime for both batch and stream processing. This is not the end of the story yet! The community is still working on some optimizations and exploring more use cases that can be enabled with this new mode.&lt;/p&gt;
&lt;p&gt;One of the first efforts we want to finalize is providing world-class support for transactional sinks in both execution modes, for &lt;em&gt;bounded&lt;/em&gt; and &lt;em&gt;unbounded&lt;/em&gt; streams. An experimental API for &lt;a href=&quot;https://cwiki.apache.org/confluence/x/KEJ4CQ&quot;&gt;transactional sinks&lt;/a&gt; was already introduced in Flink 1.12, so we’re working on stabilizing it and would be happy to hear feedback about its current state!&lt;/p&gt;
&lt;p&gt;We are also thinking about how the two modes can be brought closer together and benefit from each other. A common pattern that we hear about from users is bootstrapping the state of a streaming job from a batch job. There are two somewhat different approaches we are considering here:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Having a mixed graph, where one of the branches would have only bounded sources and the other would reflect the unbounded part. You can think of such a graph as effectively two separate jobs: the bounded part would be executed first and would sink into the state of an operator shared by the two parts. This first job’s purpose would be to populate the state of that common operator. Once it is done, we could proceed to running the unbounded part.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Another approach is to run the exact same program first on the &lt;em&gt;bounded&lt;/em&gt; data. However, this time we wouldn’t assume completeness of the job; instead, we would produce the state of all operators up to a certain point in time and store it as a savepoint. Later on, we could use the savepoint to start the application on the &lt;em&gt;unbounded&lt;/em&gt; data.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Lastly, to achieve feature parity with the DataSet API (Flink’s legacy API for batch-style execution), we are looking into the topic of iterations and how to meet the different usage patterns depending on the mode. In STREAMING mode, iterations serve as a loopback edge, and we don’t necessarily need to keep track of the iteration step. In BATCH mode, on the other hand, keeping track of the iteration step (the generation) is vital for Machine Learning (ML) algorithms, which are the primary use case for iterations there.&lt;/p&gt;
&lt;p&gt;Have you tried the new BATCH execution mode in the DataStream API? How was your experience? We are happy to hear your feedback and stories!&lt;/p&gt;
</description>
<pubDate>Thu, 11 Mar 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/03/11/batch-execution-mode.html</link>
<guid isPermaLink="true">/2021/03/11/batch-execution-mode.html</guid>
</item>
<item>
<title>Apache Flink 1.12.2 Released</title>
<description>&lt;p&gt;The Apache Flink community released the next bugfix version of the Apache Flink 1.12 series.&lt;/p&gt;
&lt;p&gt;This release includes 83 fixes and minor improvements for Flink 1.12.1. The list below includes a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend all users to upgrade to Flink 1.12.2.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21070&quot;&gt;FLINK-21070&lt;/a&gt;] - Overloaded aggregate functions cause converter errors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21486&quot;&gt;FLINK-21486&lt;/a&gt;] - Add sanity check when switching from Rocks to Heap timers
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12461&quot;&gt;FLINK-12461&lt;/a&gt;] - Document binary compatibility situation with Scala beyond 2.12.8
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16443&quot;&gt;FLINK-16443&lt;/a&gt;] - Fix wrong fix for user-code CheckpointExceptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19771&quot;&gt;FLINK-19771&lt;/a&gt;] - NullPointerException when accessing null array from postgres in JDBC Connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20309&quot;&gt;FLINK-20309&lt;/a&gt;] - UnalignedCheckpointTestBase.execute is failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20462&quot;&gt;FLINK-20462&lt;/a&gt;] - MailboxOperatorTest.testAvoidTaskStarvation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20500&quot;&gt;FLINK-20500&lt;/a&gt;] - UpsertKafkaTableITCase.testTemporalJoin test failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20565&quot;&gt;FLINK-20565&lt;/a&gt;] - Fix typo in EXPLAIN Statements docs.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20580&quot;&gt;FLINK-20580&lt;/a&gt;] - Missing null value handling for SerializedValue&amp;#39;s getByteArray()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20654&quot;&gt;FLINK-20654&lt;/a&gt;] - Unaligned checkpoint recovery may lead to corrupted data stream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20663&quot;&gt;FLINK-20663&lt;/a&gt;] - Managed memory may not be released in time when operators use managed memory frequently
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20675&quot;&gt;FLINK-20675&lt;/a&gt;] - Asynchronous checkpoint failure would not fail the job anymore
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20680&quot;&gt;FLINK-20680&lt;/a&gt;] - Fails to call var-arg function with no parameters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20798&quot;&gt;FLINK-20798&lt;/a&gt;] - Using PVC as high-availability.storageDir could not work
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20832&quot;&gt;FLINK-20832&lt;/a&gt;] - Deliver bootstrap resouces ourselves for website and documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20848&quot;&gt;FLINK-20848&lt;/a&gt;] - Kafka consumer ID is not specified correctly in new KafkaSource
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20913&quot;&gt;FLINK-20913&lt;/a&gt;] - Improve new HiveConf(jobConf, HiveConf.class)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20921&quot;&gt;FLINK-20921&lt;/a&gt;] - Fix Date/Time/Timestamp in Python DataStream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20933&quot;&gt;FLINK-20933&lt;/a&gt;] - Config Python Operator Use Managed Memory In Python DataStream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20942&quot;&gt;FLINK-20942&lt;/a&gt;] - Digest of FLOAT literals throws UnsupportedOperationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20944&quot;&gt;FLINK-20944&lt;/a&gt;] - Launching in application mode requesting a ClusterIP rest service type results in an Exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20947&quot;&gt;FLINK-20947&lt;/a&gt;] - Idle source doesn&amp;#39;t work when pushing watermark into the source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20961&quot;&gt;FLINK-20961&lt;/a&gt;] - Flink throws NullPointerException for tables created from DataStream with no assigned timestamps and watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20992&quot;&gt;FLINK-20992&lt;/a&gt;] - Checkpoint cleanup can kill JobMaster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20998&quot;&gt;FLINK-20998&lt;/a&gt;] - flink-raw-1.12.jar does not exist
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21009&quot;&gt;FLINK-21009&lt;/a&gt;] - Can not disable certain options in Elasticsearch 7 connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21013&quot;&gt;FLINK-21013&lt;/a&gt;] - Blink planner does not ingest timestamp into StreamRecord
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21024&quot;&gt;FLINK-21024&lt;/a&gt;] - Dynamic properties get exposed to job&amp;#39;s main method if user parameters are passed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21028&quot;&gt;FLINK-21028&lt;/a&gt;] - Streaming application didn&amp;#39;t stop properly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21030&quot;&gt;FLINK-21030&lt;/a&gt;] - Broken job restart for job with disjoint graph
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21059&quot;&gt;FLINK-21059&lt;/a&gt;] - KafkaSourceEnumerator does not honor consumer properties
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21069&quot;&gt;FLINK-21069&lt;/a&gt;] - Configuration &amp;quot;parallelism.default&amp;quot; doesn&amp;#39;t take effect for TableEnvironment#explainSql
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21071&quot;&gt;FLINK-21071&lt;/a&gt;] - Snapshot branches running against flink-docker dev-master branch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21104&quot;&gt;FLINK-21104&lt;/a&gt;] - UnalignedCheckpointITCase.execute failed with &amp;quot;IllegalStateException&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21132&quot;&gt;FLINK-21132&lt;/a&gt;] - BoundedOneInput.endInput is called when taking synchronous savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21138&quot;&gt;FLINK-21138&lt;/a&gt;] - KvStateServerHandler is not invoked with user code classloader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21140&quot;&gt;FLINK-21140&lt;/a&gt;] - Extract zip file dependencies before adding to PYTHONPATH
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21144&quot;&gt;FLINK-21144&lt;/a&gt;] - KubernetesResourceManagerDriver#tryResetPodCreationCoolDown causes fatal error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21155&quot;&gt;FLINK-21155&lt;/a&gt;] - FileSourceTextLinesITCase.testBoundedTextFileSourceWithTaskManagerFailover does not pass
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21158&quot;&gt;FLINK-21158&lt;/a&gt;] - wrong jvm metaspace and overhead size show in taskmanager metric page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21163&quot;&gt;FLINK-21163&lt;/a&gt;] - Python dependencies specified via CLI should not override the dependencies specified in configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21169&quot;&gt;FLINK-21169&lt;/a&gt;] - Kafka flink-connector-base dependency should be scope compile
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21208&quot;&gt;FLINK-21208&lt;/a&gt;] - pyarrow exception when using window with pandas udaf
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21213&quot;&gt;FLINK-21213&lt;/a&gt;] - e2e test fail with &amp;#39;As task is already not running, no longer decline checkpoint&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21215&quot;&gt;FLINK-21215&lt;/a&gt;] - Checkpoint was declined because one input stream is finished
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21216&quot;&gt;FLINK-21216&lt;/a&gt;] - StreamPandasConversionTests Fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21225&quot;&gt;FLINK-21225&lt;/a&gt;] - OverConvertRule does not consider distinct
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21226&quot;&gt;FLINK-21226&lt;/a&gt;] - Reintroduce TableColumn.of for backwards compatibility
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21274&quot;&gt;FLINK-21274&lt;/a&gt;] - At per-job mode, during the exit of the JobManager process, if ioExecutor exits at the end, the System.exit() method will not be executed.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21277&quot;&gt;FLINK-21277&lt;/a&gt;] - SQLClientSchemaRegistryITCase fails to download testcontainers/ryuk:0.3.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21312&quot;&gt;FLINK-21312&lt;/a&gt;] - SavepointITCase.testStopSavepointWithBoundedInputConcurrently is unstable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21323&quot;&gt;FLINK-21323&lt;/a&gt;] - Stop-with-savepoint is not supported by SourceOperatorStreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21351&quot;&gt;FLINK-21351&lt;/a&gt;] - Incremental checkpoint data would be lost once a non-stop savepoint completed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21361&quot;&gt;FLINK-21361&lt;/a&gt;] - FlinkRelMdUniqueKeys matches on AbstractCatalogTable instead of CatalogTable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21412&quot;&gt;FLINK-21412&lt;/a&gt;] - pyflink DataTypes.DECIMAL is not available
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21452&quot;&gt;FLINK-21452&lt;/a&gt;] - FLIP-27 sources cannot reliably downscale
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21453&quot;&gt;FLINK-21453&lt;/a&gt;] - BoundedOneInput.endInput is NOT called when doing stop with savepoint WITH drain
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21490&quot;&gt;FLINK-21490&lt;/a&gt;] - UnalignedCheckpointITCase fails on azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21492&quot;&gt;FLINK-21492&lt;/a&gt;] - ActiveResourceManager swallows exception stack trace
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20359&quot;&gt;FLINK-20359&lt;/a&gt;] - Support adding Owner Reference to Job Manager in native kubernetes setup
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9844&quot;&gt;FLINK-9844&lt;/a&gt;] - PackagedProgram does not close URLClassLoader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20417&quot;&gt;FLINK-20417&lt;/a&gt;] - Handle &amp;quot;Too old resource version&amp;quot; exception in Kubernetes watch more gracefully
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20491&quot;&gt;FLINK-20491&lt;/a&gt;] - Support Broadcast Operation in BATCH execution mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20517&quot;&gt;FLINK-20517&lt;/a&gt;] - Support mixed keyed/non-keyed operations in BATCH execution mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20770&quot;&gt;FLINK-20770&lt;/a&gt;] - Incorrect description for config option kubernetes.rest-service.exposed.type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20907&quot;&gt;FLINK-20907&lt;/a&gt;] - Table API documentation promotes deprecated syntax
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21020&quot;&gt;FLINK-21020&lt;/a&gt;] - Bump Jackson to 20.10.5[.1] / 2.12.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21034&quot;&gt;FLINK-21034&lt;/a&gt;] - Rework jemalloc switch to use an environment variable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21035&quot;&gt;FLINK-21035&lt;/a&gt;] - Deduplicate copy_plugins_if_required calls
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21036&quot;&gt;FLINK-21036&lt;/a&gt;] - Consider removing automatic configuration fo number of slots from docker
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21037&quot;&gt;FLINK-21037&lt;/a&gt;] - Deduplicate configuration logic in docker entrypoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21042&quot;&gt;FLINK-21042&lt;/a&gt;] - Fix code example in &amp;quot;Aggregate Functions&amp;quot; section in Table UDF page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21048&quot;&gt;FLINK-21048&lt;/a&gt;] - Refactor documentation related to switch memory allocator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21123&quot;&gt;FLINK-21123&lt;/a&gt;] - Upgrade Beanutils 1.9.x to 1.9.4
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21164&quot;&gt;FLINK-21164&lt;/a&gt;] - Jar handlers don&amp;#39;t cleanup temporarily extracted jars
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21210&quot;&gt;FLINK-21210&lt;/a&gt;] - ApplicationClusterEntryPoints should explicitly close PackagedProgram
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21381&quot;&gt;FLINK-21381&lt;/a&gt;] - Kubernetes HA documentation does not state required service account and role
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20529&quot;&gt;FLINK-20529&lt;/a&gt;] - Publish Dockerfiles for release 1.12.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20534&quot;&gt;FLINK-20534&lt;/a&gt;] - Add Flink 1.12 MigrationVersion
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20536&quot;&gt;FLINK-20536&lt;/a&gt;] - Update migration tests in master to cover migration from release-1.12
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20960&quot;&gt;FLINK-20960&lt;/a&gt;] - Add warning in 1.12 release notes about potential corrupt data stream with unaligned checkpoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-21358&quot;&gt;FLINK-21358&lt;/a&gt;] - Missing snapshot version compatibility for 1.12
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Wed, 03 Mar 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/03/03/release-1.12.2.html</link>
<guid isPermaLink="true">/news/2021/03/03/release-1.12.2.html</guid>
</item>
<item>
<title>How to natively deploy Flink on Kubernetes with High-Availability (HA)</title>
<description>&lt;p&gt;Flink has supported resource management systems like YARN and Mesos since the early days; however, these were not designed for the fast-moving cloud-native architectures that are increasingly gaining popularity these days, or the growing need to support complex, mixed workloads (e.g. batch, streaming, deep learning, web services).
For these reasons, more and more users are using Kubernetes to automate the deployment, scaling and management of their Flink applications.&lt;/p&gt;
&lt;p&gt;From release to release, the Flink community has made significant progress in &lt;strong&gt;integrating natively with Kubernetes&lt;/strong&gt;, from active resource management to “Zookeeperless” High Availability (HA).
In this blogpost, we’ll recap the technical details of deploying Flink applications natively on Kubernetes, diving deeper into Flink’s Kubernetes HA architecture. We’ll then walk you through a &lt;a href=&quot;#example-application-cluster-with-ha&quot;&gt;&lt;strong&gt;hands-on example&lt;/strong&gt;&lt;/a&gt; of running a Flink &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/#application-mode&quot;&gt;application cluster&lt;/a&gt; on Kubernetes with HA enabled.
We’ll end with a conclusion covering the advantages of running Flink natively on Kubernetes, and an outlook into future work.&lt;/p&gt;
&lt;h1 id=&quot;native-flink-on-kubernetes-integration&quot;&gt;Native Flink on Kubernetes Integration&lt;/h1&gt;
&lt;p&gt;Before we dive into the technical details of how the Kubernetes-based HA service works, let us briefly explain what &lt;em&gt;native&lt;/em&gt; means in the context of Flink deployments on Kubernetes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Flink is &lt;strong&gt;self-contained&lt;/strong&gt;. There is an embedded Kubernetes client in the Flink client, so you will not need other external tools (&lt;em&gt;e.g.&lt;/em&gt; kubectl, Kubernetes dashboard) to create a Flink cluster on Kubernetes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The Flink client will contact the Kubernetes API server &lt;strong&gt;directly&lt;/strong&gt; to create the JobManager deployment. The configuration located on the client side will be shipped to the JobManager pod, as well as the log4j and Hadoop configurations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Flink’s ResourceManager will talk to the Kubernetes API server to &lt;strong&gt;allocate and release&lt;/strong&gt; the TaskManager pods dynamically &lt;strong&gt;on-demand&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All in all, this is similar to how Flink integrates with other resource management systems (&lt;em&gt;e.g.&lt;/em&gt; YARN, Mesos), so it should be somewhat straightforward to integrate with Kubernetes if you’ve managed such deployments before — and especially if you already had some internal deployer for the lifecycle management of multiple Flink jobs.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:75%&quot; src=&quot;/img/blog/2021-02-10-native-k8s-with-ha/native-k8s-architecture.png&quot; /&gt;
&lt;p&gt;
&lt;em&gt;&lt;b&gt;Fig. 1:&lt;/b&gt; Architecture of Flink&#39;s native Kubernetes integration.&lt;/em&gt;
&lt;/p&gt;
&lt;/center&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;kubernetes-high-availability-service&quot;&gt;Kubernetes High Availability Service&lt;/h1&gt;
&lt;p&gt;High Availability (HA) is a common requirement when bringing Flink to production: it helps prevent a single point of failure for Flink clusters.
Prior to the &lt;a href=&quot;https://flink.apache.org/news/2020/12/10/release-1.12.0.html&quot;&gt;1.12 release&lt;/a&gt;, Flink provided a Zookeeper HA service that has been widely used in production setups and that can be integrated into standalone, YARN, or Kubernetes deployments.
However, managing a Zookeeper cluster on Kubernetes for HA would require an additional operational cost that could be avoided because, in the end, Kubernetes also provides some public APIs for leader election and configuration storage (&lt;em&gt;i.e.&lt;/em&gt; ConfigMap).
From Flink 1.12, we leverage these features to make running a HA-configured Flink cluster on Kubernetes more convenient to users.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:72%&quot; src=&quot;/img/blog/2021-02-10-native-k8s-with-ha/native-k8s-ha-architecture.png&quot; /&gt;
&lt;p&gt;
&lt;em&gt;&lt;b&gt;Fig. 2:&lt;/b&gt; Architecture of Flink&#39;s Kubernetes High Availability (HA) service.&lt;/em&gt;
&lt;/p&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;
The above diagram shows the architecture of Flink’s Kubernetes HA service, which works as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;For the &lt;strong&gt;leader election&lt;/strong&gt;, a set of eligible JobManagers is identified. They all race to declare themselves as the leader, with one eventually becoming the active leader. The active JobManager then continually “heartbeats” to renew its position as the leader. In the meantime, all other standby JobManagers periodically make new attempts to become the leader, which ensures that the JobManager can &lt;strong&gt;fail over quickly&lt;/strong&gt;. Different components (&lt;em&gt;e.g.&lt;/em&gt; ResourceManager, JobManager, Dispatcher, RestEndpoint) have separate leader election services and ConfigMaps.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The active leader publishes its address to the ConfigMap. It’s important to note that Flink uses the same ConfigMap for contending for the lock and for storing the leader address. This ensures that &lt;strong&gt;no unexpected change&lt;/strong&gt; sneaks in during a periodic update.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The leader retrieval service is used to find the active leader’s address and allow the components to then &lt;strong&gt;register&lt;/strong&gt; themselves. For example, TaskManagers retrieve the address of ResourceManager and JobManager for registration and to offer slots. Flink uses a &lt;strong&gt;Kubernetes watch&lt;/strong&gt; in the leader retrieval service — once the content of ConfigMap changes, it usually means that the leader has changed, and so the listener can &lt;strong&gt;get the latest leader address immediately&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;All other meta information (&lt;em&gt;e.g.&lt;/em&gt; running jobs, job graphs, completed checkpoints and the checkpoint counter) will be directly stored in the corresponding ConfigMaps. Only the leader can update the ConfigMap. The HA data will only be &lt;strong&gt;cleaned up&lt;/strong&gt; once the Flink cluster reaches the global &lt;strong&gt;terminal state&lt;/strong&gt;. Please note that only the pointers are stored in the ConfigMap; the concrete data will be stored in the DistributedStorage. This level of indirection is necessary to keep the amount of data in the ConfigMap small (a ConfigMap is built for data less than 1MB, whereas state can grow to multiple GBs).&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&quot;example-application-cluster-with-ha&quot;&gt;Example: Application Cluster with HA&lt;/h1&gt;
&lt;p&gt;To follow along, you’ll need a running Kubernetes cluster and a properly set up &lt;code&gt;kubeconfig&lt;/code&gt;.
You can use &lt;code&gt;kubectl get nodes&lt;/code&gt; to verify that you’re all set!
In this blog post, we’re using &lt;a href=&quot;https://minikube.sigs.k8s.io/docs/start/&quot;&gt;minikube&lt;/a&gt; for local testing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Build a Docker image with the Flink job&lt;/strong&gt; (&lt;code&gt;my-flink-job.jar&lt;/code&gt;) &lt;strong&gt;baked in&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-dockerfile&quot;&gt;&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; flink:1.12.1
&lt;span class=&quot;k&quot;&gt;RUN&lt;/span&gt; mkdir -p &lt;span class=&quot;nv&quot;&gt;$FLINK_HOME&lt;/span&gt;/usrlib
COPY /path/of/my-flink-job.jar &lt;span class=&quot;nv&quot;&gt;$FLINK_HOME&lt;/span&gt;/usrlib/my-flink-job.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Use the above Dockerfile to build a user image (&lt;code&gt;&amp;lt;user-image&amp;gt;&lt;/code&gt;) and then push it to your remote image repository:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;docker build -t &amp;lt;user-image&amp;gt; .
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;docker push &amp;lt;user-image&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;2. Start a Flink Application Cluster&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./bin/flink run-application &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
--detached &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
--parallelism &lt;span class=&quot;m&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
--target kubernetes-application &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dkubernetes.cluster-id&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;k8s-ha-app-1 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dkubernetes.container.image&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;user-image&amp;gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dkubernetes.jobmanager.cpu&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0.5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dkubernetes.taskmanager.cpu&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0.5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dtaskmanager.numberOfTaskSlots&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dkubernetes.rest-service.exposed.type&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;NodePort &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dhigh-availability&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dhigh-availability.storageDir&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;s3://flink-bucket/flink-ha &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Drestart-strategy&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;fixed-delay &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Drestart-strategy.fixed-delay.attempts&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;flink-s3-fs-hadoop-1.12.1.jar &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
-Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;flink-s3-fs-hadoop-1.12.1.jar &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;local&lt;/span&gt;:///opt/flink/usrlib/my-flink-job.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;3. Access the Flink Web UI&lt;/strong&gt; (http://minikube-ip-address:node-port) &lt;strong&gt;and check that the job is running!&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;2021-02-05 17:26:13,403 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink application cluster k8s-ha-app-1 successfully, JobManager Web Interface: http://192.168.64.21:32388
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You should be able to find a similar log in the Flink client and get the JobManager web interface URL.&lt;/p&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;4. Kill the JobManager to simulate failure&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kubectl &lt;span class=&quot;nb&quot;&gt;exec&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;jobmanager_pod_name&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; -- /bin/sh -c &lt;span class=&quot;s2&quot;&gt;&amp;quot;kill 1&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;5. Verify that the job recovers from the latest successful checkpoint&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Refresh the Flink Web UI until the new JobManager is launched, and then search for the following JobManager logs to verify that the job recovers from the latest successful checkpoint:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;2021-02-05 09:44:01,636 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job 00000000000000000000000000000000 from Checkpoint 101 @ 1612518074802 for 00000000000000000000000000000000 located at &amp;lt;checkpoint-not-externally-addressable&amp;gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;
&lt;strong&gt;6. Cancel the job&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The job can be cancelled through the Flink Web UI, or using the following command:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;./bin/flink cancel --target kubernetes-application -Dkubernetes.cluster-id&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;ClusterID&amp;gt; &amp;lt;JobID&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When the job is cancelled, all the Kubernetes resources created by Flink (e.g. JobManager deployment, TaskManager pods, service, Flink configuration ConfigMap, leader-related ConfigMaps) will be deleted automatically.&lt;/p&gt;
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;The native Kubernetes integration was first introduced in Flink 1.10, abstracting a lot of the complexities of hosting, configuring, managing and operating Flink clusters in cloud-native environments.
After three major releases, the community has made great progress in supporting multiple deployment modes (i.e. session and application) and an alternative HA setup that doesn’t depend on Zookeeper.&lt;/p&gt;
&lt;p&gt;Compared with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/deployment/resource-providers/standalone/kubernetes.html&quot;&gt;standalone&lt;/a&gt; Kubernetes deployments, the native integration is more &lt;strong&gt;user-friendly&lt;/strong&gt; and requires less upfront knowledge about Kubernetes.
Given that Flink is now aware of the underlying Kubernetes cluster, it also can benefit from dynamic resource allocation and make &lt;strong&gt;more efficient use of Kubernetes cluster resources&lt;/strong&gt;.
The next building block to deepen Flink’s native integration with Kubernetes is the pod template (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15656&quot;&gt;FLINK-15656&lt;/a&gt;), which will greatly enhance the flexibility of using advanced Kubernetes features (&lt;em&gt;e.g.&lt;/em&gt; volumes, init container, sidecar container).
This work is already in progress and will be added in the upcoming 1.13 release!&lt;/p&gt;
</description>
<pubDate>Wed, 10 Feb 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/02/10/native-k8s-with-ha.html</link>
<guid isPermaLink="true">/2021/02/10/native-k8s-with-ha.html</guid>
</item>
<item>
<title>Apache Flink 1.10.3 Released</title>
<description>&lt;p&gt;The Apache Flink community released the third bugfix version of the Apache Flink 1.10 series.&lt;/p&gt;
&lt;p&gt;This release includes 36 fixes and minor improvements for Flink 1.10.2. The list below includes a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend all users to upgrade to Flink 1.10.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14087&quot;&gt;FLINK-14087&lt;/a&gt;] - throws java.lang.ArrayIndexOutOfBoundsException when emiting the data using RebalancePartitioner.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15170&quot;&gt;FLINK-15170&lt;/a&gt;] - WebFrontendITCase.testCancelYarn fails on travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15467&quot;&gt;FLINK-15467&lt;/a&gt;] - Should wait for the end of the source thread during the Task cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16246&quot;&gt;FLINK-16246&lt;/a&gt;] - Exclude &amp;quot;SdkMBeanRegistrySupport&amp;quot; from dynamically loaded AWS connectors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17341&quot;&gt;FLINK-17341&lt;/a&gt;] - freeSlot in TaskExecutor.closeJobManagerConnection cause ConcurrentModificationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17458&quot;&gt;FLINK-17458&lt;/a&gt;] - TaskExecutorSubmissionTest#testFailingScheduleOrUpdateConsumers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17677&quot;&gt;FLINK-17677&lt;/a&gt;] - FLINK_LOG_PREFIX recommended in docs is not always available
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18081&quot;&gt;FLINK-18081&lt;/a&gt;] - Fix broken links in &amp;quot;Kerberos Authentication Setup and Configuration&amp;quot; doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18196&quot;&gt;FLINK-18196&lt;/a&gt;] - flink throws `NullPointerException` when executeCheckpointing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18212&quot;&gt;FLINK-18212&lt;/a&gt;] - Init lookup join failed when use udf on lookup table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18832&quot;&gt;FLINK-18832&lt;/a&gt;] - BoundedBlockingSubpartition does not work with StreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18959&quot;&gt;FLINK-18959&lt;/a&gt;] - Fail to archiveExecutionGraph because job is not finished when dispatcher close
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19022&quot;&gt;FLINK-19022&lt;/a&gt;] - AkkaRpcActor failed to start but no exception information
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19109&quot;&gt;FLINK-19109&lt;/a&gt;] - Split Reader eats chained periodic watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19135&quot;&gt;FLINK-19135&lt;/a&gt;] - (Stream)ExecutionEnvironment.execute() should not throw ExecutionException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19237&quot;&gt;FLINK-19237&lt;/a&gt;] - LeaderChangeClusterComponentsTest.testReelectionOfJobMaster failed with &amp;quot;NoResourceAvailableException: Could not allocate the required slot within slot request timeout&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19401&quot;&gt;FLINK-19401&lt;/a&gt;] - Job stuck in restart loop due to excessive checkpoint recoveries which block the JobMaster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19557&quot;&gt;FLINK-19557&lt;/a&gt;] - Issue retrieving leader after zookeeper session reconnect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19675&quot;&gt;FLINK-19675&lt;/a&gt;] - The plan of is incorrect when Calc contains WHERE clause, composite fields access and Python UDF at the same time
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19901&quot;&gt;FLINK-19901&lt;/a&gt;] - Unable to exclude metrics variables for the last metrics reporter.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20013&quot;&gt;FLINK-20013&lt;/a&gt;] - BoundedBlockingSubpartition may leak network buffer if task is failed or canceled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20018&quot;&gt;FLINK-20018&lt;/a&gt;] - pipeline.cached-files option cannot escape &amp;#39;:&amp;#39; in path
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20033&quot;&gt;FLINK-20033&lt;/a&gt;] - Job fails when stopping JobMaster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20065&quot;&gt;FLINK-20065&lt;/a&gt;] - UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20076&quot;&gt;FLINK-20076&lt;/a&gt;] - DispatcherTest.testOnRemovedJobGraphDoesNotCleanUpHAFiles does not test the desired functionality
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20183&quot;&gt;FLINK-20183&lt;/a&gt;] - Fix the default PYTHONPATH is overwritten in client side
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20218&quot;&gt;FLINK-20218&lt;/a&gt;] - AttributeError: module &amp;#39;urllib&amp;#39; has no attribute &amp;#39;parse&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20875&quot;&gt;FLINK-20875&lt;/a&gt;] - [CVE-2020-17518] Directory traversal attack: remote file writing through the REST API
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16753&quot;&gt;FLINK-16753&lt;/a&gt;] - Exception from AsyncCheckpointRunnable should be wrapped in CheckpointException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18287&quot;&gt;FLINK-18287&lt;/a&gt;] - Correct the documentation of Python Table API in SQL pages
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19055&quot;&gt;FLINK-19055&lt;/a&gt;] - MemoryManagerSharedResourcesTest contains three tests running extraordinary long
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19105&quot;&gt;FLINK-19105&lt;/a&gt;] - Table API Sample Code Error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19252&quot;&gt;FLINK-19252&lt;/a&gt;] - Jaas file created under io.tmp.dirs - folder not created if not exists
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19339&quot;&gt;FLINK-19339&lt;/a&gt;] - Support Avro&amp;#39;s unions with logical types
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19523&quot;&gt;FLINK-19523&lt;/a&gt;] - Hide sensitive command-line configurations
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20906&quot;&gt;FLINK-20906&lt;/a&gt;] - Update copyright year to 2021 for NOTICE files
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 29 Jan 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/01/29/release-1.10.3.html</link>
<guid isPermaLink="true">/news/2021/01/29/release-1.10.3.html</guid>
</item>
<item>
<title>Apache Flink 1.12.1 Released</title>
<description>&lt;p&gt;The Apache Flink community released the first bugfix version of the Apache Flink 1.12 series.&lt;/p&gt;
&lt;p&gt;This release includes 79 fixes and minor improvements for Flink 1.12.0. Below you will find a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend all users upgrade to Flink 1.12.1.&lt;/p&gt;
&lt;div class=&quot;alert alert-danger small&quot;&gt;
&lt;p&gt;&lt;b&gt;Attention:&lt;/b&gt;
Using &lt;b&gt;unaligned checkpoints in Flink 1.12.0&lt;/b&gt; combined with tasks that have two or more inputs, or with union inputs for single-input tasks, can result in corrupted state.&lt;/p&gt;
&lt;p&gt;This can happen if a new checkpoint is triggered before recovery is fully completed. For state to be corrupted, a task with two or more input gates must receive a checkpoint barrier at exactly the same time it finishes recovering spilled in-flight data. In that case the new checkpoint can succeed with corrupted or missing in-flight data, which will result in various deserialisation/corrupted data stream errors when attempting to recover from such a corrupted checkpoint.&lt;/p&gt;
&lt;p&gt;Using &lt;b&gt;unaligned checkpoints in Flink 1.12.1&lt;/b&gt;, a corruption may occur in the checkpoint following a declined checkpoint.&lt;/p&gt;
&lt;p&gt;A late barrier of a canceled checkpoint may lead to buffers not being written into the successive checkpoint, so that recovery is not possible. This happens when the next checkpoint barrier arrives at a given operator before all previous barriers have arrived, which can only happen after cancellation in unaligned checkpoints.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18897&quot;&gt;FLINK-18897&lt;/a&gt;] - Add documentation for the maxwell-json format
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20352&quot;&gt;FLINK-20352&lt;/a&gt;] - Rework command line interface documentation page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20353&quot;&gt;FLINK-20353&lt;/a&gt;] - Rework logging documentation page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20354&quot;&gt;FLINK-20354&lt;/a&gt;] - Rework standalone deployment documentation page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20355&quot;&gt;FLINK-20355&lt;/a&gt;] - Rework K8s deployment documentation page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20356&quot;&gt;FLINK-20356&lt;/a&gt;] - Rework Mesos deployment documentation page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20422&quot;&gt;FLINK-20422&lt;/a&gt;] - Remove from .html files in flink documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20485&quot;&gt;FLINK-20485&lt;/a&gt;] - Map views are deserialized multiple times
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20601&quot;&gt;FLINK-20601&lt;/a&gt;] - Rework PyFlink CLI documentation
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19369&quot;&gt;FLINK-19369&lt;/a&gt;] - BlobClientTest.testGetFailsDuringStreamingForJobPermanentBlob hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19435&quot;&gt;FLINK-19435&lt;/a&gt;] - Deadlock when loading different driver classes concurrently using Class.forName
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19725&quot;&gt;FLINK-19725&lt;/a&gt;] - Logger cannot be initialized due to timeout: LoggerInitializationException is thrown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19880&quot;&gt;FLINK-19880&lt;/a&gt;] - Fix ignore-parse-errors not work for the legacy JSON format
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20213&quot;&gt;FLINK-20213&lt;/a&gt;] - Partition commit is delayed when records keep coming
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20221&quot;&gt;FLINK-20221&lt;/a&gt;] - DelimitedInputFormat does not restore compressed filesplits correctly leading to dataloss
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20273&quot;&gt;FLINK-20273&lt;/a&gt;] - Fix Table api Kafka connector Sink Partitioner Document Error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20372&quot;&gt;FLINK-20372&lt;/a&gt;] - Update Kafka SQL connector page to mention properties.* options
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20389&quot;&gt;FLINK-20389&lt;/a&gt;] - UnalignedCheckpointITCase failure caused by NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20404&quot;&gt;FLINK-20404&lt;/a&gt;] - ZooKeeper quorum fails to start due to missing log4j library
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20419&quot;&gt;FLINK-20419&lt;/a&gt;] - Insert fails due to failure to generate execution plan
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20428&quot;&gt;FLINK-20428&lt;/a&gt;] - ZooKeeperLeaderElectionConnectionHandlingTest.testConnectionSuspendedHandlingDuringInitialization failed with &amp;quot;No result is expected since there was no leader elected before stopping the server, yet&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20429&quot;&gt;FLINK-20429&lt;/a&gt;] - KafkaTableITCase.testKafkaTemporalJoinChangelog failed with unexpected results
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20433&quot;&gt;FLINK-20433&lt;/a&gt;] - UnalignedCheckpointTestBase.execute failed with &amp;quot;TestTimedOutException: test timed out after 300 seconds&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20464&quot;&gt;FLINK-20464&lt;/a&gt;] - Some Table examples are not built correctly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20467&quot;&gt;FLINK-20467&lt;/a&gt;] - Fix the Example in Python DataStream Doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20470&quot;&gt;FLINK-20470&lt;/a&gt;] - MissingNode can&amp;#39;t be casted to ObjectNode when deserializing JSON
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20476&quot;&gt;FLINK-20476&lt;/a&gt;] - New File Sink end-to-end test Failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20486&quot;&gt;FLINK-20486&lt;/a&gt;] - Hive temporal join should allow monitor interval smaller than 1 hour
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20492&quot;&gt;FLINK-20492&lt;/a&gt;] - The SourceOperatorStreamTask should implement cancelTask() and finishTask()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20493&quot;&gt;FLINK-20493&lt;/a&gt;] - SQLClientSchemaRegistryITCase failed with &amp;quot;Could not build the flink-dist image&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20521&quot;&gt;FLINK-20521&lt;/a&gt;] - Null result values are being swallowed by RPC system
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20525&quot;&gt;FLINK-20525&lt;/a&gt;] - StreamArrowPythonGroupWindowAggregateFunctionOperator doesn&amp;#39;t handle rowtime and proctime properly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20543&quot;&gt;FLINK-20543&lt;/a&gt;] - Fix typo in upsert kafka docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20554&quot;&gt;FLINK-20554&lt;/a&gt;] - The Checkpointed Data Size of the Latest Completed Checkpoint is incorrectly displayed on the Overview page of the UI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20582&quot;&gt;FLINK-20582&lt;/a&gt;] - Fix typos in `CREATE Statements` docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20607&quot;&gt;FLINK-20607&lt;/a&gt;] - a wrong example in udfs page.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20615&quot;&gt;FLINK-20615&lt;/a&gt;] - Local recovery and sticky scheduling end-to-end test timeout with &amp;quot;IOException: Stream Closed&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20626&quot;&gt;FLINK-20626&lt;/a&gt;] - Canceling a job when it is failing will result in job hanging in CANCELING state
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20630&quot;&gt;FLINK-20630&lt;/a&gt;] - [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20646&quot;&gt;FLINK-20646&lt;/a&gt;] - ReduceTransformation does not work with RocksDBStateBackend
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20648&quot;&gt;FLINK-20648&lt;/a&gt;] - Unable to restore job from savepoint when using Kubernetes based HA services
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20664&quot;&gt;FLINK-20664&lt;/a&gt;] - Support setting service account for TaskManager pod
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20665&quot;&gt;FLINK-20665&lt;/a&gt;] - FileNotFoundException when restore from latest Checkpoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20666&quot;&gt;FLINK-20666&lt;/a&gt;] - Fix the deserialized Row losing the field_name information in PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20669&quot;&gt;FLINK-20669&lt;/a&gt;] - Add the jzlib LICENSE file in flink-python module
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20703&quot;&gt;FLINK-20703&lt;/a&gt;] - HiveSinkCompactionITCase test timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20704&quot;&gt;FLINK-20704&lt;/a&gt;] - Some rel data type does not implement the digest correctly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20756&quot;&gt;FLINK-20756&lt;/a&gt;] - PythonCalcSplitConditionRule is not working as expected
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20764&quot;&gt;FLINK-20764&lt;/a&gt;] - BatchGroupedReduceOperator does not emit results for singleton inputs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20781&quot;&gt;FLINK-20781&lt;/a&gt;] - UnalignedCheckpointITCase failure caused by NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20784&quot;&gt;FLINK-20784&lt;/a&gt;] - .staging_xxx does not exist, when insert into hive
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20793&quot;&gt;FLINK-20793&lt;/a&gt;] - Fix NamesTest due to code style refactor
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20803&quot;&gt;FLINK-20803&lt;/a&gt;] - Version mismatch between spotless-maven-plugin and google-java-format plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20841&quot;&gt;FLINK-20841&lt;/a&gt;] - Fix compile error due to duplicated generated files
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19013&quot;&gt;FLINK-19013&lt;/a&gt;] - Log start/end of state restoration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19259&quot;&gt;FLINK-19259&lt;/a&gt;] - Use classloader release hooks with Kinesis producer to avoid metaspace leak
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19832&quot;&gt;FLINK-19832&lt;/a&gt;] - Improve handling of immediately failed physical slot in SlotSharingExecutionSlotAllocator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20055&quot;&gt;FLINK-20055&lt;/a&gt;] - Datadog API Key exposed in Flink JobManager logs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20168&quot;&gt;FLINK-20168&lt;/a&gt;] - Translate page &amp;#39;Flink Architecture&amp;#39; into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20209&quot;&gt;FLINK-20209&lt;/a&gt;] - Add missing checkpoint configuration to Flink UI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20298&quot;&gt;FLINK-20298&lt;/a&gt;] - Replace usage of in flink documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20468&quot;&gt;FLINK-20468&lt;/a&gt;] - Enable leadership control in MiniCluster to test JM failover
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20510&quot;&gt;FLINK-20510&lt;/a&gt;] - Enable log4j2 monitor interval by default
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20519&quot;&gt;FLINK-20519&lt;/a&gt;] - Extend HBase notice with transitively bundled dependencies
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20570&quot;&gt;FLINK-20570&lt;/a&gt;] - The `NOTE` tip style is different from the others in process_function page.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20588&quot;&gt;FLINK-20588&lt;/a&gt;] - Add docker-compose as appendix to Mesos documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20629&quot;&gt;FLINK-20629&lt;/a&gt;] - [Kinesis][EFO] Migrate from DescribeStream to DescribeStreamSummary
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20647&quot;&gt;FLINK-20647&lt;/a&gt;] - Use yield to generate output datas in ProcessFunction for Python DataStream
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20650&quot;&gt;FLINK-20650&lt;/a&gt;] - Mark &amp;quot;native-k8s&amp;quot; as deprecated in docker-entrypoint.sh
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20651&quot;&gt;FLINK-20651&lt;/a&gt;] - Use Spotless/google-java-format for code formatting/enforcement
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20682&quot;&gt;FLINK-20682&lt;/a&gt;] - Add configuration options related to hadoop
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20697&quot;&gt;FLINK-20697&lt;/a&gt;] - Correct the Type of &amp;quot;lookup.cache.ttl&amp;quot; in jdbc.md/jdbc.zh.md
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20790&quot;&gt;FLINK-20790&lt;/a&gt;] - Generated classes should not be put under src/ directory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20792&quot;&gt;FLINK-20792&lt;/a&gt;] - Allow shorthand invocation of spotless
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20805&quot;&gt;FLINK-20805&lt;/a&gt;] - Blink runtime classes partially ignored by spotless
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20822&quot;&gt;FLINK-20822&lt;/a&gt;] - Don&amp;#39;t check whether a function is generic in hive catalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20866&quot;&gt;FLINK-20866&lt;/a&gt;] - Add how to list jobs in Yarn deployment documentation when HA enabled
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20300&quot;&gt;FLINK-20300&lt;/a&gt;] - Create Flink 1.12 release notes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20906&quot;&gt;FLINK-20906&lt;/a&gt;] - Update copyright year to 2021 for NOTICE files
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 19 Jan 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/01/19/release-1.12.1.html</link>
<guid isPermaLink="true">/news/2021/01/19/release-1.12.1.html</guid>
</item>
<item>
<title>Using RocksDB State Backend in Apache Flink: When and How</title>
<description>&lt;p&gt;Stream processing applications are often stateful, “remembering” information from processed events and using it to influence further event processing. In Flink, the remembered information, i.e., state, is stored locally in the configured state backend. To prevent data loss in case of failures, the state backend periodically persists a snapshot of its contents to a pre-configured durable storage. The &lt;a href=&quot;https://rocksdb.org/&quot;&gt;RocksDB&lt;/a&gt; state backend (i.e., RocksDBStateBackend) is one of the three built-in state backends in Flink. This blog post will guide you through the benefits of using RocksDB to manage your application’s state, explain when and how to use it and also clear up a few common misconceptions. Having said that, this is &lt;strong&gt;not&lt;/strong&gt; a blog post to explain how RocksDB works in-depth or how to do advanced troubleshooting and performance tuning; if you need help with any of those topics, you can reach out to the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink User Mailing List&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;state-in-flink&quot;&gt;State in Flink&lt;/h1&gt;
&lt;p&gt;To best understand state and state backends in Flink, it’s important to distinguish between &lt;strong&gt;in-flight state&lt;/strong&gt; and &lt;strong&gt;state snapshots&lt;/strong&gt;. In-flight state, also known as working state, is the state a Flink job is working on. It is always stored locally in memory (with the possibility to spill to disk) and can be lost when jobs fail without impacting job recoverability. State snapshots, i.e., &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/checkpoints.html&quot;&gt;checkpoints&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/savepoints.html#what-is-a-savepoint-how-is-a-savepoint-different-from-a-checkpoint&quot;&gt;savepoints&lt;/a&gt;, are stored in a remote durable storage, and are used to restore the local state in the case of job failures. The appropriate state backend for a production deployment depends on scalability, throughput, and latency requirements.&lt;/p&gt;
&lt;h1 id=&quot;what-is-rocksdb&quot;&gt;What is RocksDB?&lt;/h1&gt;
&lt;p&gt;Thinking of RocksDB as a distributed database that needs to run on a cluster and to be managed by specialized administrators is a common misconception. RocksDB is an embeddable persistent key-value store for fast storage. It interacts with Flink via the Java Native Interface (JNI). The picture below shows where RocksDB fits in a Flink cluster node. Details are explained in the following sections.&lt;/p&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:60%&quot; src=&quot;/img/blog/2021-01-18-rocksdb/RocksDB-in-Flink.png&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;rocksdb-in-flink&quot;&gt;RocksDB in Flink&lt;/h1&gt;
&lt;p&gt;Everything you need to use RocksDB as a state backend is bundled in the Apache Flink distribution, including the native shared library:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;$ jar -tvf lib/flink-dist_2.12-1.12.0.jar| grep librocksdbjni-linux64
8695334 Wed Nov 27 02:27:06 CET 2019 librocksdbjni-linux64.so
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At runtime, RocksDB is embedded in the TaskManager processes. It runs in native threads and works with local files. For example, if you have a job configured with RocksDBStateBackend running in your Flink cluster, you’ll see something similar to the following, where 32513 is the TaskManager process ID.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;$ ps -T -p 32513 | grep -i rocksdb
32513 32633 ? 00:00:00 rocksdb:low0
32513 32634 ? 00:00:00 rocksdb:high0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
The command is for Linux only. For other operating systems, please refer to their documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;when-to-use-rocksdbstatebackend&quot;&gt;When to use RocksDBStateBackend&lt;/h1&gt;
&lt;p&gt;In addition to RocksDBStateBackend, Flink has two other built-in state backends: MemoryStateBackend and FsStateBackend. They are both heap-based, as in-flight state is stored in the JVM heap. For the time being, let’s ignore MemoryStateBackend, as it is intended only for &lt;strong&gt;local development&lt;/strong&gt; and &lt;strong&gt;debugging&lt;/strong&gt;, not for production use.&lt;/p&gt;
&lt;p&gt;With RocksDBStateBackend, in-flight state is first written into off-heap/native memory, and then flushed to local disks when a configured threshold is reached. This means that RocksDBStateBackend can support state larger than the total configured heap capacity. The amount of state that you can store in RocksDBStateBackend is only limited by the amount of &lt;strong&gt;disk space&lt;/strong&gt; available across the entire cluster. In addition, since RocksDBStateBackend doesn’t use the JVM heap to store in-flight state, it’s not affected by JVM Garbage Collection and therefore has predictable latency.&lt;/p&gt;
&lt;p&gt;On top of full, self-contained state snapshots, RocksDBStateBackend also supports &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/large_state_tuning.html#incremental-checkpoints&quot;&gt;incremental checkpointing&lt;/a&gt; as a performance tuning option. An incremental checkpoint stores only the changes that occurred since the latest completed checkpoint. This dramatically reduces checkpointing time in comparison to performing a full snapshot. RocksDBStateBackend is currently the only state backend that supports incremental checkpointing.&lt;/p&gt;
&lt;p&gt;RocksDB is a good option when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The state of your job is larger than can fit in local memory (e.g., long windows, large &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/stream/state/state.html&quot;&gt;keyed state&lt;/a&gt;);&lt;/li&gt;
&lt;li&gt;You’re looking into incremental checkpointing as a way to reduce checkpointing time;&lt;/li&gt;
&lt;li&gt;You expect to have more predictable latency without being impacted by JVM Garbage Collection.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Otherwise, if your application has &lt;strong&gt;small state&lt;/strong&gt; or requires &lt;strong&gt;very low latency&lt;/strong&gt;, you should consider &lt;strong&gt;FsStateBackend&lt;/strong&gt;. As a rule of thumb, RocksDBStateBackend is a few times slower than heap-based state backends, because it stores key/value pairs as serialized bytes. This means that any state access (reads/writes) needs to go through a de-/serialization process crossing the JNI boundary, which is more expensive than working directly with the on-heap representation of state. The upside is that, for the same amount of state, it has a &lt;strong&gt;lower memory footprint&lt;/strong&gt; than the corresponding on-heap representation.&lt;/p&gt;
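&lt;p&gt;For reference, switching a job to FsStateBackend is a one-line change. The snippet below is a minimal sketch only; the checkpoint path is a placeholder and should point to your own durable storage.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.runtime.state.filesystem.FsStateBackend;

// minimal sketch: &#39;env&#39; is the StreamExecutionEnvironment of the job;
// in-flight state stays on the JVM heap, snapshots go to the given placeholder path
env.setStateBackend(new FsStateBackend(&quot;hdfs:///flink-checkpoints&quot;));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;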
&lt;h1 id=&quot;how-to-use-rocksdbstatebackend&quot;&gt;How to use RocksDBStateBackend&lt;/h1&gt;
&lt;p&gt;RocksDB is fully embedded within and fully managed by the TaskManager process. RocksDBStateBackend can be configured at the cluster level as the default for the entire cluster, or at the job level for individual jobs. The job level configuration takes precedence over the cluster level configuration.&lt;/p&gt;
&lt;h2 id=&quot;cluster-level&quot;&gt;Cluster Level&lt;/h2&gt;
&lt;p&gt;Add the following configuration in &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html&quot;&gt;&lt;code&gt;conf/flink-conf.yaml&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: hdfs:///flink-checkpoints # location to store checkpoints
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;job-level&quot;&gt;Job Level&lt;/h2&gt;
&lt;p&gt;Add the following into your job’s code after StreamExecutionEnvironment is created:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// &#39;env&#39; is the created StreamExecutionEnvironment
// the second argument (&#39;true&#39;) enables incremental checkpointing
env.setStateBackend(new RocksDBStateBackend(&quot;hdfs:///flink-checkpoints&quot;, true));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
In addition to HDFS, you can also use other on-premises or cloud-based object stores if the corresponding dependencies are added under &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/filesystems/plugins.html&quot;&gt;FLINK_HOME/plugins&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;best-practices-and-advanced-configuration&quot;&gt;Best Practices and Advanced Configuration&lt;/h1&gt;
&lt;p&gt;We hope this overview helped you gain a better understanding of the role of RocksDB in Flink and how to successfully run a job with RocksDBStateBackend. To round it off, we’ll explore some best practices and a few reference points for further troubleshooting and performance tuning.&lt;/p&gt;
&lt;h2 id=&quot;state-location-in-rocksdb&quot;&gt;State Location in RocksDB&lt;/h2&gt;
&lt;p&gt;As mentioned earlier, in-flight state in RocksDBStateBackend is spilled to files on disk. These files are located under the directory specified by the Flink configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-localdir&quot;&gt;&lt;code&gt;state.backend.rocksdb.localdir&lt;/code&gt;&lt;/a&gt;. Because disk performance has a direct impact on RocksDB’s performance, it’s recommended that this directory be located on a &lt;strong&gt;local&lt;/strong&gt; disk. Configuring it to a remote, network-based location like NFS or HDFS is discouraged, as writing to remote disks is usually slower. Also, high availability is not a requirement for in-flight state. Local SSD disks are preferred if high disk throughput is required.&lt;/p&gt;
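&lt;p&gt;For illustration, a corresponding entry in &lt;code&gt;flink-conf.yaml&lt;/code&gt; might look like the sketch below; the path is a placeholder and should point to a local (ideally SSD-backed) directory on each TaskManager.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# placeholder path; use a fast local disk on every TaskManager
state.backend.rocksdb.localdir: /mnt/ssd1/flink/rocksdb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;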
&lt;p&gt;State snapshots are persisted to remote durable storage. During state snapshotting, TaskManagers take a snapshot of the in-flight state and store it remotely. Transferring the state snapshot to remote storage is handled purely by the TaskManager itself without the involvement of the state backend. So, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-checkpoints-dir&quot;&gt;&lt;code&gt;state.checkpoints.dir&lt;/code&gt;&lt;/a&gt; or the parameter you set in the code for a particular job can point to different locations, such as an on-premises &lt;a href=&quot;https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html&quot;&gt;HDFS&lt;/a&gt; cluster or a cloud-based object store like &lt;a href=&quot;https://aws.amazon.com/s3/&quot;&gt;Amazon S3&lt;/a&gt;, &lt;a href=&quot;https://azure.microsoft.com/en-us/services/storage/blobs/&quot;&gt;Azure Blob Storage&lt;/a&gt;, &lt;a href=&quot;https://cloud.google.com/storage&quot;&gt;Google Cloud Storage&lt;/a&gt;, &lt;a href=&quot;https://www.alibabacloud.com/product/oss&quot;&gt;Alibaba OSS&lt;/a&gt;, etc.&lt;/p&gt;
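&lt;p&gt;As an example, pointing snapshots at an object store only requires changing that one setting. The bucket name below is a placeholder, and the matching filesystem plugin (for S3, for instance, &lt;code&gt;flink-s3-fs-hadoop&lt;/code&gt; or &lt;code&gt;flink-s3-fs-presto&lt;/code&gt;) has to be enabled under FLINK_HOME/plugins.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# placeholder bucket; requires the corresponding filesystem plugin
state.checkpoints.dir: s3://my-bucket/flink-checkpoints
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;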
&lt;h2 id=&quot;troubleshooting-rocksdb&quot;&gt;Troubleshooting RocksDB&lt;/h2&gt;
&lt;p&gt;To check how RocksDB is behaving in production, you should look for the RocksDB log file named LOG. By default, this log file is located in the same directory as your data files, i.e., the directory specified by the Flink configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-localdir&quot;&gt;&lt;code&gt;state.backend.rocksdb.localdir&lt;/code&gt;&lt;/a&gt;. When enabled, &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#rocksdb-statistics&quot;&gt;RocksDB statistics&lt;/a&gt; are also logged there to help diagnose potential problems. For further information, check &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki/RocksDB-Troubleshooting-Guide&quot;&gt;RocksDB Troubleshooting Guide&lt;/a&gt; in &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki&quot;&gt;RocksDB Wiki&lt;/a&gt;. If you are interested in the RocksDB behavior trend over time, you can consider enabling &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#rocksdb-native-metrics&quot;&gt;RocksDB native metrics&lt;/a&gt; for your Flink job.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
From Flink 1.10, RocksDB logging was effectively disabled by &lt;a href=&quot;https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L64&quot;&gt;setting the log level to HEADER&lt;/a&gt;. To enable it, check &lt;a href=&quot;https://ververica.zendesk.com/hc/en-us/articles/360015933320-How-to-get-RocksDB-s-LOG-file-back-for-advanced-troubleshooting&quot;&gt;How to get RocksDB’s LOG file back for advanced troubleshooting&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;alert alert-warning&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-warning&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-warning-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Warning&lt;/span&gt;
Enabling RocksDB’s native metrics in Flink may have a negative performance impact on your job.&lt;/p&gt;
&lt;/div&gt;
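&lt;p&gt;If you do decide to enable native metrics, you can start with a single metric to keep the overhead small. The sketch below assumes the &lt;code&gt;estimate-num-keys&lt;/code&gt; metric option; check the native metrics documentation for the full list of available options.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# enables a single, relatively cheap RocksDB native metric
state.backend.rocksdb.metrics.estimate-num-keys: true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;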
&lt;h2 id=&quot;tuning-rocksdb&quot;&gt;Tuning RocksDB&lt;/h2&gt;
&lt;p&gt;Since Flink 1.10, Flink configures RocksDB’s memory allocation to the amount of managed memory of each task slot by default. The primary mechanism for improving memory-related performance issues is to increase Flink’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/memory/mem_setup_tm.html#managed-memory&quot;&gt;managed memory&lt;/a&gt; via the Flink configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#taskmanager-memory-managed-size&quot;&gt;&lt;code&gt;taskmanager.memory.managed.size&lt;/code&gt;&lt;/a&gt; or &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#taskmanager-memory-managed-fraction&quot;&gt;&lt;code&gt;taskmanager.memory.managed.fraction&lt;/code&gt;&lt;/a&gt;. For more fine-grained control, you should first disable the automatic memory management by setting &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-memory-managed&quot;&gt;&lt;code&gt;state.backend.rocksdb.memory.managed&lt;/code&gt;&lt;/a&gt; to &lt;code&gt;false&lt;/code&gt;, then start with the following Flink configuration: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-block-cache-size&quot;&gt;&lt;code&gt;state.backend.rocksdb.block.cache-size&lt;/code&gt;&lt;/a&gt; (corresponding to block_cache_size in RocksDB), &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-writebuffer-size&quot;&gt;&lt;code&gt;state.backend.rocksdb.writebuffer.size&lt;/code&gt;&lt;/a&gt; (corresponding to write_buffer_size in RocksDB), and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-writebuffer-count&quot;&gt;&lt;code&gt;state.backend.rocksdb.writebuffer.count&lt;/code&gt;&lt;/a&gt; (corresponding to max_write_buffer_number in RocksDB). For more details, check &lt;a href=&quot;https://www.ververica.com/blog/manage-rocksdb-memory-size-apache-flink&quot;&gt;this blog post&lt;/a&gt; on how to manage RocksDB memory size in Flink and the &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB&quot;&gt;RocksDB Memory Usage&lt;/a&gt; Wiki page.&lt;/p&gt;
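&lt;p&gt;Put together, a fine-grained (non-managed) memory setup in &lt;code&gt;flink-conf.yaml&lt;/code&gt; could look like the following sketch. The sizes are purely illustrative and need to be tuned for your workload.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;state.backend.rocksdb.memory.managed: false
# illustrative values only, tune for your workload
state.backend.rocksdb.block.cache-size: 64mb
state.backend.rocksdb.writebuffer.size: 64mb
state.backend.rocksdb.writebuffer.count: 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;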
&lt;p&gt;While data is being written or overwritten in RocksDB, flushing from memory to local disks and data compaction are managed in the background by RocksDB threads. On a machine with many CPU cores, you should increase the parallelism of background flushing and compaction by setting the Flink configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#state-backend-rocksdb-thread-num&quot;&gt;&lt;code&gt;state.backend.rocksdb.thread.num&lt;/code&gt;&lt;/a&gt; (corresponding to max_background_jobs in RocksDB). The default configuration is usually too small for a production setup. If your job reads frequently from RocksDB, you should consider enabling &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#bloom-filters&quot;&gt;bloom filters&lt;/a&gt;.&lt;/p&gt;
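&lt;p&gt;For example, on a machine with many CPU cores you might raise the number of background threads as sketched below; the value is illustrative and should match the available cores and disks.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# illustrative value; the default is usually too small for a production setup
state.backend.rocksdb.thread.num: 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;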
&lt;p&gt;For other RocksDBStateBackend configurations, check the Flink documentation on &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/deployment/config.html#advanced-rocksdb-state-backends-options&quot;&gt;Advanced RocksDB State Backends Options&lt;/a&gt;. For further tuning, check &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide&quot;&gt;RocksDB Tuning Guide&lt;/a&gt; in &lt;a href=&quot;https://github.com/facebook/rocksdb/wiki&quot;&gt;RocksDB Wiki&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;The &lt;a href=&quot;https://rocksdb.org/&quot;&gt;RocksDB&lt;/a&gt; state backend (i.e., RocksDBStateBackend) is one of the three state backends bundled in Flink, and can be a powerful choice when configuring your streaming applications. It enables scalable applications maintaining many terabytes of state with exactly-once processing guarantees. If the state of your Flink job is too large to fit on the JVM heap, you are interested in incremental checkpointing, or you expect to have predictable latency, you should use RocksDBStateBackend. Since RocksDB is embedded in TaskManager processes as native threads and works with files on local disks, RocksDBStateBackend is supported out-of-the-box without the need to set up and manage any external systems or processes.&lt;/p&gt;
</description>
<pubDate>Mon, 18 Jan 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/01/18/rocksdb.html</link>
<guid isPermaLink="true">/2021/01/18/rocksdb.html</guid>
</item>
<item>
<title>Exploring fine-grained recovery of bounded data sets on Flink</title>
<description>&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#how-does-fine-grained-recovery-work&quot; id=&quot;markdown-toc-how-does-fine-grained-recovery-work&quot;&gt;&lt;strong&gt;How does fine-grained recovery work?&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#experimenting-with-fine-grained-recovery&quot; id=&quot;markdown-toc-experimenting-with-fine-grained-recovery&quot;&gt;&lt;strong&gt;Experimenting with fine-grained recovery&lt;/strong&gt;&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#setup&quot; id=&quot;markdown-toc-setup&quot;&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-experiment&quot; id=&quot;markdown-toc-the-experiment&quot;&gt;&lt;strong&gt;The Experiment&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-results&quot; id=&quot;markdown-toc-the-results&quot;&gt;&lt;strong&gt;The Results&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;Apache Flink is a very versatile tool for all kinds of data processing workloads. It can process incoming data within a few milliseconds or crunch through petabytes of bounded datasets (also known as batch processing).&lt;/p&gt;
&lt;p&gt;Processing efficiency is not the only parameter users of data processing systems care about. In the real world, system outages due to hardware or software failure are expected to happen all the time. For unbounded (or streaming) workloads, Flink is using periodic checkpoints to allow for reliable and correct recovery. In case of bounded data sets, having a reliable recovery mechanism is mission critical — as users do not want to potentially lose many hours of intermediate processing results.&lt;/p&gt;
&lt;p&gt;Apache Flink 1.9 introduced &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures&quot;&gt;fine-grained recovery&lt;/a&gt; into its internal workload scheduler. The Flink APIs that are made for bounded workloads benefit from this change by individually recovering failed operators, re-using results from the previous processing step.&lt;/p&gt;
&lt;p&gt;In this blog post, we are going to give an overview over these changes, and we will experimentally validate their effectiveness.&lt;/p&gt;
&lt;h2 id=&quot;how-does-fine-grained-recovery-work&quot;&gt;&lt;strong&gt;How does fine-grained recovery work?&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;For streaming jobs (and in &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionMode.html&quot;&gt;pipelined mode&lt;/a&gt; for batch jobs), Flink uses a coarse-grained restart strategy: upon failure, the entire job is restarted (streaming jobs, however, have an entirely different fault-tolerance model based on &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/concepts/stateful-stream-processing.html#checkpointing&quot;&gt;checkpointing&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;For batch jobs, we can use a more sophisticated recovery strategy, by using cached intermediate results, thus only restarting parts of the pipeline.&lt;/p&gt;
&lt;p&gt;Let’s look at the topology below: Some connections are pipelined (between A1 and B1, as well as A2 and B2) – data is immediately streamed from operator A1 to B1.&lt;/p&gt;
&lt;p&gt;However, the output of B1 and B2 is cached on disk (indicated by the grey box). We call such connections blocking. If there’s a failure in the steps succeeding B1 and B2 and the results of B1 and B2 have already been produced, we don’t need to reprocess this part of the pipeline – we can reuse the cached result.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/batch-fine-grained-fault-tolerance/example.png&quot; width=&quot;320px&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Looking at the case of a failure (here of D2), we see that we do not need to restart the entire job. Restarting C2 and all dependent tasks is sufficient. This is possible because we can read the cached results of B1 and B2. We call this recovery mechanism “fine-grained”, as we only restart parts of the topology to recover from a failure – reducing the recovery time, resource consumption and overall job runtime.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/batch-fine-grained-fault-tolerance/recov.png&quot; width=&quot;640px&quot; /&gt;
&lt;/div&gt;
&lt;h2 id=&quot;experimenting-with-fine-grained-recovery&quot;&gt;&lt;strong&gt;Experimenting with fine-grained recovery&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;To validate the implementation, we’ve conducted a small experiment. The following sections will describe the setup, the experiment and the results.&lt;/p&gt;
&lt;h3 id=&quot;setup&quot;&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Hardware&lt;/strong&gt;: The experiment was performed on an idle MacBook Pro 2016 (16 GB of memory, SSD storage).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Test Job&lt;/strong&gt;: We used a &lt;a href=&quot;https://github.com/rmetzger/flip1-bench/blob/master/flip1-bench-jobs/src/main/java/com/ververica/TPCHQuery3.java&quot;&gt;modified version&lt;/a&gt; (for instrumentation only) of the &lt;a href=&quot;https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/relational/TPCHQuery3.java&quot;&gt;TPC-H Query 3&lt;/a&gt; example that is part of the Flink batch (DataSet API) examples, running on Flink 1.12.&lt;/p&gt;
&lt;p&gt;This is the topology of the query:&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/batch-fine-grained-fault-tolerance/job.png&quot; width=&quot;640px&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;It has many blocking data exchanges where we cache intermediate results, if executed in batch mode.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Test Data&lt;/strong&gt;: We generated a &lt;a href=&quot;http://www.tpc.org/tpch/&quot;&gt;TPC-H dataset&lt;/a&gt; of 150 GB as the input.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cluster&lt;/strong&gt;: We were running 4 TaskManagers with 2 slots each and 1 JobManager in standalone mode.&lt;/p&gt;
&lt;p&gt;Running this test job takes roughly 15 minutes with the given hardware and data.&lt;/p&gt;
&lt;p&gt;For &lt;strong&gt;inducing failures&lt;/strong&gt; into the job, we decided to randomly throw exceptions in the operators. This has a number of benefits compared to randomly killing entire TaskManagers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Killing a TaskManager would require starting and registering a new TaskManager — which introduces an uncontrollable factor into our benchmark: We don’t want to test how quickly Flink is reconciling a cluster.&lt;/li&gt;
&lt;li&gt;Killing an entire TaskManager would bring down on average 1/4th of all running operators. In larger production setups, a failure usually affects only a smaller fraction of all running operators. The differences between the execution modes would be less obvious if we killed entire TaskManagers.&lt;/li&gt;
&lt;li&gt;Keeping TaskManagers across failures helps to better simulate using an external shuffle service, as intermediate results are preserved despite a failure.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The failures are controlled by a central “&lt;a href=&quot;https://github.com/rmetzger/flip1-bench/blob/master/flip1-bench-jobs/src/main/java/com/ververica/utilz/KillerServer.java&quot;&gt;failure coordinator&lt;/a&gt;” which decides when to kill which operator.&lt;/p&gt;
&lt;p&gt;Failures are artificially triggered based on a configured mean failure frequency. The failures follow an &lt;a href=&quot;https://en.wikipedia.org/wiki/Exponential_distribution&quot;&gt;exponential distribution&lt;/a&gt;, which is suitable for simulating continuous and independent failures at a configured average rate.&lt;/p&gt;
&lt;h3 id=&quot;the-experiment&quot;&gt;&lt;strong&gt;The Experiment&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;We were running the job with two parameters which we varied in the benchmark:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/execution_configuration.html&quot;&gt;Execution Mode&lt;/a&gt;: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/api/common/ExecutionMode.html&quot;&gt;BATCH or PIPELINED&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In PIPELINED mode, all data exchanges are pipelined (i.e. upstream operators stream their results downstream), except for exchanges that are susceptible to deadlocks. A failure means that we have to restart the entire job and start the processing from scratch.&lt;/p&gt;
&lt;p&gt;In BATCH mode, all shuffles and broadcasts are persisted before downstream consumption. You can think of the execution as happening in steps. Since intermediate results are persisted in BATCH mode, we do not have to reprocess all upstream operators in case of an (induced) failure; we just have to restart the step that was in progress during the failure (see the configuration sketch after this list).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mean Failure Frequency: This parameter controls the frequency of failures induced into the running job. If the parameter is set to 5 minutes, a failure occurs every 5 minutes on average. The failures follow an exponential distribution. We’ve chosen values between 15 minutes and 20 seconds.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
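&lt;p&gt;For reference, selecting the execution mode for a DataSet job looks roughly like the sketch below (this is not the actual benchmark code):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.ExecutionMode;
import org.apache.flink.api.java.ExecutionEnvironment;

// minimal sketch: choose how data exchanges are executed for a DataSet job
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setExecutionMode(ExecutionMode.BATCH); // or ExecutionMode.PIPELINED
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;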
&lt;p&gt;Each configuration combination was executed at least 3 times. We report the average execution time. This is necessary due to the probabilistic behavior of the induced failures.&lt;/p&gt;
&lt;h3 id=&quot;the-results&quot;&gt;&lt;strong&gt;The Results&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The chart below shows the execution time in seconds for each batch and pipelined execution with different failure frequencies.&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/batch-fine-grained-fault-tolerance/result.png&quot; width=&quot;640px&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;We will now discuss some findings:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Execution time with rare failures&lt;/strong&gt;: The first few results on the left compare the behavior with mean failure frequencies of 15 (=900s), 10 (=600s), 9 (=540s), 8 (=480s) and 7 (=420s) minutes. The execution times are mostly the same, around 15 minutes. The batch execution time is usually lower and more predictable. This behavior is easy to explain: if an error occurs late in the execution, the pipelined mode needs to start from scratch, while the batch mode can re-use previous intermediate results. The variance in runtime can be explained by statistical effects: if an error happens to be induced close to the end of a pipelined-mode run, the entire job needs to rerun.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Execution time with frequent failures&lt;/strong&gt;: The results “in the middle”, with failure frequencies of 6, 5, 4, 3 and 2 minutes, show that the pipelined execution mode becomes infeasible at some point: if failures happen on average every 3 minutes, the average execution time reaches more than 60 minutes, and with failures every 2 minutes the time spikes to more than 120 minutes. The pipelined job can only produce the final result if it happens to hit a window where no failure is induced for 15 minutes. For more frequent failures, the pipelined mode did not manage to finish at all.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How many failures can the batch mode sustain?&lt;/strong&gt; The last numbers, with failure frequencies between 60 and 20 seconds, are probably a bit unrealistic for real-world scenarios. But we wanted to investigate how frequent failures can become before the batch mode becomes infeasible. With failures induced every 30 seconds, the average execution time is 30 minutes. In other words, even with two failures per minute, the execution time only doubles in this case. The batch mode is much more predictable and well-behaved when it comes to execution times.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Based on these results, it makes a lot of sense to use the batch execution mode for batch workloads, as the resource consumption and overall execution times are substantially lower compared to the pipelined execution mode.&lt;/p&gt;
&lt;p&gt;In general, we recommend conducting your own performance experiments on your own hardware and with your own workloads, as results might vary from what we’ve presented here. Despite the findings here, the pipelined mode probably has some performance advantages in environments with rare failures and slower I/O (for example when using spinning disks, or network attached disks). On the other hand, CPU intensive workloads might benefit from the batch mode even in slow I/O environments.&lt;/p&gt;
&lt;p&gt;We should also note that the caching (and subsequent reprocessing on failure) only works if the cached results are still present – this is currently only the case if the TaskManager survives a failure. However, this is an unrealistic assumption as many failures would cause the TaskManager process to die. To mitigate this limitation, data processing frameworks employ external shuffle services that persist the cached results independent of the data processing framework. Since Flink 1.9, there is support for a &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-31%3A+Pluggable+Shuffle+Service&quot;&gt;pluggable shuffle service&lt;/a&gt;, and there are tickets for adding implementations for YARN (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13247&quot;&gt;FLINK-13247&lt;/a&gt;) and Kubernetes (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13246&quot;&gt;FLINK-13246&lt;/a&gt;). Once these implementations are added, TaskManagers can recover cached results even if the process or machine got killed.&lt;/p&gt;
&lt;p&gt;Despite these considerations, we believe that fine-grained recovery is a great improvement for Flink’s batch capabilities, as it makes the framework much more efficient, even in unstable environments.&lt;/p&gt;
</description>
<pubDate>Mon, 11 Jan 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/01/11/batch-fine-grained-fault-tolerance.html</link>
<guid isPermaLink="true">/news/2021/01/11/batch-fine-grained-fault-tolerance.html</guid>
</item>
<item>
<title>What&#39;s New in the Pulsar Flink Connector 2.7.0</title>
<description>&lt;h2 id=&quot;about-the-pulsar-flink-connector&quot;&gt;About the Pulsar Flink Connector&lt;/h2&gt;
&lt;p&gt;In order for companies to access real-time data insights, they need unified batch and streaming capabilities. Apache Flink unifies batch and stream processing into one single computing engine with “streams” as the unified data representation. Although developers have done extensive work at the computing and API layers, very little work has been done at the data messaging and storage layers. In reality, data is segregated into data silos, created by various storage and messaging technologies. As a result, there is still no single source-of-truth and the overall operation for the developer teams poses significant challenges. To address such operational challenges, we need to store data in streams. Apache Pulsar (together with Apache BookKeeper) perfectly meets the criteria: data is stored as one copy (source-of-truth) and can be accessed in streams (via pub-sub interfaces) and segments (for batch processing). When Flink and Pulsar come together, the two open source technologies create a unified data architecture for real-time, data-driven businesses.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://github.com/streamnative/pulsar-flink/&quot;&gt;Pulsar Flink connector&lt;/a&gt; provides elastic data processing with &lt;a href=&quot;https://pulsar.apache.org/&quot;&gt;Apache Pulsar&lt;/a&gt; and &lt;a href=&quot;https://flink.apache.org/&quot;&gt;Apache Flink&lt;/a&gt;, allowing Apache Flink to read/write data from/to Apache Pulsar. The Pulsar Flink Connector enables you to concentrate on your business logic without worrying about the storage details.&lt;/p&gt;
&lt;h2 id=&quot;challenges&quot;&gt;Challenges&lt;/h2&gt;
&lt;p&gt;When we first developed the Pulsar Flink Connector, it received wide adoption from both the Flink and Pulsar communities. Leveraging the Pulsar Flink connector, &lt;a href=&quot;https://www.hpe.com/us/en/home.html&quot;&gt;Hewlett Packard Enterprise (HPE)&lt;/a&gt; built a real-time computing platform, &lt;a href=&quot;https://www.bigo.sg/&quot;&gt;BIGO&lt;/a&gt; built a &lt;a href=&quot;https://pulsar-summit.org/en/event/asia-2020/sessions/how-bigo-builds-real-time-message-system-with-apache-pulsar-and-flink&quot;&gt;real-time message processing system&lt;/a&gt;, and &lt;a href=&quot;https://www.zhihu.com/&quot;&gt;Zhihu&lt;/a&gt; is in the process of assessing the Connector’s fit for a real-time computing system.&lt;/p&gt;
&lt;p&gt;With more users adopting the Pulsar Flink Connector, it became clear that one of the common issues revolved around data formats, specifically serialization and deserialization. While the Pulsar Flink connector leverages Pulsar serialization, the previous connector versions did not support the Flink data format. As a result, users had to manually configure their setup in order to use the connector for real-time computing scenarios.&lt;/p&gt;
&lt;p&gt;To improve the user experience and make the Pulsar Flink connector easier-to-use, we built the capabilities to fully support the Flink data format, so users of the connector do not spend time on manual tuning and configuration.&lt;/p&gt;
&lt;h2 id=&quot;whats-new-in-the-pulsar-flink-connector-270&quot;&gt;What’s New in the Pulsar Flink Connector 2.7.0?&lt;/h2&gt;
&lt;p&gt;The Pulsar Flink Connector 2.7.0 supports features in Apache Pulsar 2.7.0 and Apache Flink 1.12 and is fully compatible with the Flink connector and Flink message format. With the latest version, you can use important features in Flink, such as exactly-once sink, upsert Pulsar mechanism, Data Definition Language (DDL) computed columns, watermarks, and metadata. You can also leverage the Key-Shared subscription in Pulsar, and conduct serialization and deserialization without much configuration. Additionally, you can easily customize the configuration based on your business requirements.&lt;/p&gt;
&lt;p&gt;Below, we provide more details about the key features in the Pulsar Flink Connector 2.7.0.&lt;/p&gt;
&lt;h3 id=&quot;ordered-message-queue-with-high-performance&quot;&gt;Ordered message queue with high performance&lt;/h3&gt;
&lt;p&gt;Previously, when users needed to strictly guarantee message ordering, only a single consumer was allowed to consume the messages, which had a severe impact on throughput. To address this, we designed the Key_Shared subscription model in Pulsar: it guarantees message ordering and improves throughput by attaching a key to each message and routing all messages with the same key hash to the same consumer.&lt;/p&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png&quot; width=&quot;640px&quot; alt=&quot;Apache Pulsar Key-Shared Subscription&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Pulsar Flink Connector 2.7.0 supports the Key_Shared subscription model. You can enable this feature by setting &lt;code&gt;enable-key-hash-range&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;. The key hash range processed by each consumer is determined by the parallelism of the tasks.&lt;/p&gt;
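&lt;p&gt;For illustration, the following is a minimal sketch of preparing the connector properties that enable the Key_Shared subscription. Only the &lt;code&gt;enable-key-hash-range&lt;/code&gt; option comes from this post; the class name and topic are placeholders, and the snippet deliberately stops short of constructing the source so that no connector API signatures are assumed.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.util.Properties;

// Minimal sketch: these properties are handed to the Pulsar source when it is built.
public class KeySharedConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(&quot;topic&quot;, &quot;persistent://public/default/ordered-topic&quot;); // placeholder topic
        // Enable the Key_Shared subscription; each source subtask then consumes
        // one key hash range, sized according to the task parallelism.
        props.setProperty(&quot;enable-key-hash-range&quot;, &quot;true&quot;);
        props.forEach((k, v) -&amp;gt; System.out.println(k + &quot; = &quot; + v));
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;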
&lt;h3 id=&quot;introducing-exactly-once-semantics-for-pulsar-sink-based-on-the-pulsar-transaction&quot;&gt;Introducing exactly-once semantics for Pulsar sink (based on the Pulsar transaction)&lt;/h3&gt;
&lt;p&gt;In previous versions, sink operators only supported at-least-once semantics, which could not fully meet the requirements for end-to-end consistency. Users had to deduplicate messages themselves, which was cumbersome and error-prone.&lt;/p&gt;
&lt;p&gt;Transactions are supported in Pulsar 2.7.0, which greatly improves the fault tolerance capability of the Flink sink. In the Pulsar Flink Connector 2.7.0, we designed exactly-once semantics for sink operators based on Pulsar transactions. Flink uses the two-phase commit protocol, implemented through TwoPhaseCommitSinkFunction, whose main lifecycle methods are beginTransaction(), preCommit(), commit(), abort(), recoverAndCommit() and recoverAndAbort().&lt;/p&gt;
&lt;p&gt;You can flexibly select the delivery semantics when creating a sink operator, and the internal logic changes remain transparent to you. Pulsar transactions map naturally onto Flink’s two-phase commit protocol, which greatly improves the reliability of the connector sink.&lt;/p&gt;
&lt;p&gt;Implementing beginTransaction and preCommit is straightforward: you only need to start a Pulsar transaction and persist its TID after the checkpoint. In the preCommit phase, you ensure that all messages are flushed to Pulsar; any pre-committed messages will eventually be committed.&lt;/p&gt;
&lt;p&gt;The interesting parts of the implementation are recoverAndCommit and recoverAndAbort. Limited by Kafka’s features, the Kafka connector has to resort to hacks for recoverAndCommit. Pulsar transactions do not depend on a specific producer, so it is easy to commit and abort transactions based on the TID alone.&lt;/p&gt;
&lt;p&gt;Pulsar transactions are highly efficient and flexible. Taking advantage of both Pulsar and Flink, the Pulsar Flink connector is even more powerful. We will continue to improve the transactional sink in the Pulsar Flink connector.&lt;/p&gt;
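&lt;p&gt;To make the lifecycle concrete, here is a deliberately simplified skeleton of a sink built on Flink’s &lt;code&gt;TwoPhaseCommitSinkFunction&lt;/code&gt;. It is not the connector’s actual implementation: the transaction handle is just a String TID and the Pulsar interactions are left as comments.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.typeutils.base.StringSerializer;
import org.apache.flink.api.common.typeutils.base.VoidSerializer;
import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;

// Simplified sketch only: the transaction handle is a String TID and the
// Pulsar interactions are placeholders, not the real connector sink.
public class SketchTwoPhaseCommitSink
        extends TwoPhaseCommitSinkFunction&amp;lt;String, String, Void&amp;gt; {

    public SketchTwoPhaseCommitSink() {
        super(StringSerializer.INSTANCE, VoidSerializer.INSTANCE);
    }

    @Override
    protected String beginTransaction() {
        // Start a Pulsar transaction and remember its TID (placeholder value here).
        return &quot;tid-&quot; + System.nanoTime();
    }

    @Override
    protected void invoke(String transaction, String value, Context context) {
        // Send the record within the open transaction.
    }

    @Override
    protected void preCommit(String transaction) {
        // Flush all pending messages to Pulsar before the checkpoint completes.
    }

    @Override
    protected void commit(String transaction) {
        // Commit the transaction identified by the persisted TID.
    }

    @Override
    protected void abort(String transaction) {
        // Abort the transaction identified by the persisted TID.
    }

    // recoverAndCommit / recoverAndAbort can be overridden to rebuild the
    // transaction handle from the persisted TID after a failure.
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;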
&lt;h3 id=&quot;introducing-upsert-pulsar-connector&quot;&gt;Introducing upsert-pulsar connector&lt;/h3&gt;
&lt;p&gt;Users in the Flink community expressed their need for an upsert Pulsar connector. After looking through mailing lists and issues, we summarized the following three reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Interpreting a Pulsar topic as a changelog stream that treats records with keys as upsert (aka insert/update) events.&lt;/li&gt;
&lt;li&gt;As a part of the real-time pipeline, joining multiple streams for enrichment and storing the results in a Pulsar topic for further calculation later. However, the result may contain update events.&lt;/li&gt;
&lt;li&gt;As a part of the real-time pipeline, aggregating on data streams and storing the results in a Pulsar topic for further calculation later. However, the result may contain update events.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Based on these requirements, we added support for upsert Pulsar. The upsert-pulsar connector allows reading data from and writing data to Pulsar topics in an upsert fashion; a DDL sketch follows the list below.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;As a source, the upsert-pulsar connector produces a changelog stream, where each data record represents an update or delete event. More precisely, the value in a data record is interpreted as an UPDATE of the last value for the same key, if any (if a corresponding key does not exist yet, the update will be considered an INSERT). Using the table analogy, a data record in a changelog stream is interpreted as an UPSERT (aka INSERT/UPDATE) because any existing row with the same key is overwritten. Also, null values are interpreted in a special way: a record with a null value represents a “DELETE”.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;As a sink, the upsert-pulsar connector can consume a changelog stream. It will write INSERT/UPDATE_AFTER data as normal Pulsar message values and write DELETE data as Pulsar messages with null values (indicating a tombstone for the key). Flink will guarantee the message ordering on the primary key by partitioning data on the values of the primary key columns, so the update/deletion messages on the same key will fall into the same partition.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
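&lt;p&gt;As referenced above, the following DDL sketch shows what an upsert-pulsar table could look like. The &lt;code&gt;upsert-pulsar&lt;/code&gt; connector identifier comes from this post; the key/value format option names, URLs and column names are illustrative assumptions, so check the connector documentation for the exact options.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;create table user_stats (
  `user_id` bigint,
  `pv` bigint,
  `uv` bigint,
  PRIMARY KEY (`user_id`) NOT ENFORCED
) with (
  &#39;connector&#39; = &#39;upsert-pulsar&#39;,
  &#39;topic&#39; = &#39;persistent://public/default/user_stats&#39;,
  &#39;service-url&#39; = &#39;pulsar://localhost:6650&#39;,
  &#39;admin-url&#39; = &#39;http://localhost:8080&#39;,
  -- the two format option names below are assumptions for illustration
  &#39;key.format&#39; = &#39;json&#39;,
  &#39;value.format&#39; = &#39;json&#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;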
&lt;h3 id=&quot;support-new-source-interface-and-table-api-introduced-in-flip-27httpscwikiapacheorgconfluencedisplayflinkflip-273arefactorsourceinterfaceflip27refactorsourceinterface-batchandstreamingunification-and-flip-95httpscwikiapacheorgconfluencedisplayflinkflip-953anewtablesourceandtablesinkinterfaces&quot;&gt;Support new source interface and Table API introduced in &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:RefactorSourceInterface-BatchandStreamingUnification&quot;&gt;FLIP-27&lt;/a&gt; and &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces&quot;&gt;FLIP-95&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This feature unifies the source for batch and streaming and optimizes the mechanism for task discovery and data reading. It is also the cornerstone of our implementation of Pulsar batch and streaming unification. The new Table API supports DDL computed columns, watermarks and metadata.&lt;/p&gt;
&lt;h3 id=&quot;support-sql-read-and-write-metadata-as-described-in-flip-107httpscwikiapacheorgconfluencedisplayflinkflip-1073ahandlingofmetadatainsqlconnectors&quot;&gt;Support SQL read and write metadata as described in &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors&quot;&gt;FLIP-107&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;FLIP-107 enables users to access connector metadata as a metadata column in table definitions. In real-time computing, users normally need additional information, such as eventTime or customized fields. The Pulsar Flink connector supports SQL read and write metadata, so it is flexible and easy for users to manage the metadata of Pulsar messages in the Pulsar Flink Connector 2.7.0. For details on the configuration, refer to &lt;a href=&quot;https://github.com/streamnative/pulsar-flink#pulsar-message-metadata-manipulation&quot;&gt;Pulsar Message metadata manipulation&lt;/a&gt;.&lt;/p&gt;
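&lt;p&gt;For illustration, here is a DDL sketch using the FLIP-107 metadata column syntax. The metadata key &lt;code&gt;eventTime&lt;/code&gt; is an assumption used only to show the syntax; the keys the connector actually exposes are listed in the documentation linked above.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;create table topic_with_meta (
  `uid` bigint,
  `client_ip` VARCHAR,
  -- &#39;eventTime&#39; is an assumed metadata key, shown only to illustrate the syntax
  `event_time` TIMESTAMP(3) METADATA FROM &#39;eventTime&#39;
) with (
  &#39;connector&#39; = &#39;pulsar&#39;,
  &#39;topic&#39; = &#39;persistent://public/default/test_flink_sql&#39;,
  &#39;service-url&#39; = &#39;pulsar://xxx&#39;,
  &#39;admin-url&#39; = &#39;http://xxx&#39;,
  &#39;format&#39; = &#39;json&#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;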
&lt;h3 id=&quot;add-flink-format-type-atomic-to-support-pulsar-primitive-types&quot;&gt;Add Flink format type &lt;code&gt;atomic&lt;/code&gt; to support Pulsar primitive types&lt;/h3&gt;
&lt;p&gt;In the Pulsar Flink Connector 2.7.0, we add the Flink format type &lt;code&gt;atomic&lt;/code&gt; to support Pulsar primitive types. When Flink processing requires a Pulsar primitive type, you can use &lt;code&gt;atomic&lt;/code&gt; as the connector format. You can find more information on Pulsar primitive types &lt;a href=&quot;https://pulsar.apache.org/docs/en/schema-understand/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
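&lt;p&gt;A minimal DDL sketch of reading a topic whose Pulsar schema is a primitive (a plain string here) with the &lt;code&gt;atomic&lt;/code&gt; format; the single value column name and URLs are assumptions used only for illustration.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;create table primitive_topic (
  `value` STRING
) with (
  &#39;connector&#39; = &#39;pulsar&#39;,
  &#39;topic&#39; = &#39;persistent://public/default/primitive_topic&#39;,
  &#39;service-url&#39; = &#39;pulsar://xxx&#39;,
  &#39;admin-url&#39; = &#39;http://xxx&#39;,
  &#39;format&#39; = &#39;atomic&#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;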
&lt;h2 id=&quot;migration&quot;&gt;Migration&lt;/h2&gt;
&lt;p&gt;If you’re using the previous Pulsar Flink Connector version, you need to adjust your SQL and API parameters accordingly. Below we provide details on each.&lt;/p&gt;
&lt;h2 id=&quot;sql&quot;&gt;SQL&lt;/h2&gt;
&lt;p&gt;In SQL, we’ve changed the Pulsar configuration parameters in the DDL declaration. The names of some parameters have changed, but their values have not:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Remove the &lt;code&gt;connector.&lt;/code&gt; prefix from the parameter names.&lt;/li&gt;
&lt;li&gt;Change the name of the &lt;code&gt;connector.type&lt;/code&gt; parameter to &lt;code&gt;connector&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Change the startup mode parameter name from &lt;code&gt;connector.startup-mode&lt;/code&gt; to &lt;code&gt;scan.startup.mode&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Adjust Pulsar properties to the form &lt;code&gt;properties.pulsar.reader.readername=testReaderName&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you use SQL in the Pulsar Flink Connector, you need to adjust your SQL configuration accordingly when migrating to Pulsar Flink Connector 2.7.0. The following sample shows the differences between previous versions and the 2.7.0 version for SQL.&lt;/p&gt;
&lt;p&gt;SQL in previous versions:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;create table topic1(
`rip` VARCHAR,
`rtime` VARCHAR,
`uid` bigint,
`client_ip` VARCHAR,
`day` as TO_DATE(rtime),
`hour` as date_format(rtime,&#39;HH&#39;)
) with (
&#39;connector.type&#39; =&#39;pulsar&#39;,
&#39;connector.version&#39; = &#39;1&#39;,
&#39;connector.topic&#39; =&#39;persistent://public/default/test_flink_sql&#39;,
&#39;connector.service-url&#39; =&#39;pulsar://xxx&#39;,
&#39;connector.admin-url&#39; =&#39;http://xxx&#39;,
&#39;connector.startup-mode&#39; =&#39;earliest&#39;,
&#39;connector.properties.0.key&#39; =&#39;pulsar.reader.readerName&#39;,
&#39;connector.properties.0.value&#39; =&#39;testReaderName&#39;,
&#39;format.type&#39; =&#39;json&#39;,
&#39;update-mode&#39; =&#39;append&#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;SQL in Pulsar Flink Connector 2.7.0:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;create table topic1(
`rip` VARCHAR,
`rtime` VARCHAR,
`uid` bigint,
`client_ip` VARCHAR,
`day` as TO_DATE(rtime),
`hour` as date_format(rtime,&#39;HH&#39;)
) with (
&#39;connector&#39; =&#39;pulsar&#39;,
&#39;topic&#39; =&#39;persistent://public/default/test_flink_sql&#39;,
&#39;service-url&#39; =&#39;pulsar://xxx&#39;,
&#39;admin-url&#39; =&#39;http://xxx&#39;,
&#39;scan.startup.mode&#39; =&#39;earliest&#39;,
&#39;properties.pulsar.reader.readername&#39; = &#39;testReaderName&#39;,
&#39;format&#39; =&#39;json&#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;api&quot;&gt;API&lt;/h2&gt;
&lt;p&gt;From an API perspective, we adjusted some classes and enabled easier customization.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;To solve serialization issues, we changed the signature of the &lt;code&gt;FlinkPulsarSink&lt;/code&gt; constructor and added &lt;code&gt;PulsarSerializationSchema&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;We removed the row-specific classes that no longer fit, such as &lt;code&gt;FlinkPulsarRowSink&lt;/code&gt; and &lt;code&gt;FlinkPulsarRowSource&lt;/code&gt;. If you need to deal with Row formats, you can use Apache Flink’s Row-related serialization components.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can build a &lt;code&gt;PulsarSerializationSchema&lt;/code&gt; by using &lt;code&gt;PulsarSerializationSchemaWrapper.Builder&lt;/code&gt;. &lt;code&gt;TopicKeyExtractor&lt;/code&gt; is moved into &lt;code&gt;PulsarSerializationSchemaWrapper&lt;/code&gt;. When you adjust your API usage, you can take the following sample as a reference.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// Wrap a plain Flink SerializationSchema for Pulsar; getTopic(str) stands for a
// user-defined function that selects the target topic for each record.
new PulsarSerializationSchemaWrapper.Builder&amp;lt;&amp;gt;(new SimpleStringSchema())
.setTopicExtractor(str -&amp;gt; getTopic(str))
.build();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;future-plan&quot;&gt;Future Plan&lt;/h2&gt;
&lt;p&gt;Future plans involve the design of a batch and stream solution integrated with Pulsar Source, based on the new Flink Source API (FLIP-27). The new solution will overcome the limitations of the current streaming source interface (SourceFunction) and simultaneously unify the source interfaces between the batch and streaming APIs.&lt;/p&gt;
&lt;p&gt;Pulsar offers a hierarchical architecture where data is divided into streaming, batch, and cold data, which enables Pulsar to provide infinite capacity. This makes Pulsar an ideal solution for unified batch and streaming.&lt;/p&gt;
&lt;p&gt;The batch and stream solution based on the new Flink Source API is divided into two simple parts: SplitEnumerator and Reader. The SplitEnumerator discovers and assigns partitions, and the Reader reads data from the assigned partitions.&lt;/p&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;div class=&quot;row front-graphic&quot;&gt;
&lt;img src=&quot;/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png&quot; width=&quot;640px&quot; alt=&quot;Batch and Stream Solution with Apache Pulsar and Apache Flink&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Apache Pulsar stores messages in ledger blocks. Users can locate the ledgers through the Pulsar admin interface, and then obtain broker partitions, BookKeeper partitions, Offloader partitions, and other information through different partitioning policies. For more details, you can refer to &lt;a href=&quot;https://github.com/streamnative/pulsar-flink/issues/187&quot;&gt;this issue&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The latest version of the Pulsar Flink Connector is now available and we encourage everyone to use/upgrade to the Pulsar Flink Connector 2.7.0. The new version provides significant user enhancements, enabled by various features in Pulsar 2.7 and Flink 1.12. We will be contributing the Pulsar Flink Connector 2.7.0 to the &lt;a href=&quot;https://github.com/apache/flink/&quot;&gt;Apache Flink repository&lt;/a&gt; soon. If you have any questions or concerns about the Pulsar Flink Connector, feel free to open issues in &lt;a href=&quot;https://github.com/streamnative/pulsar-flink/issues&quot;&gt;this repository&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Thu, 07 Jan 2021 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html</link>
<guid isPermaLink="true">/2021/01/07/pulsar-flink-connector-270.html</guid>
</item>
<item>
<title>Stateful Functions 2.2.2 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.2.&lt;/p&gt;
&lt;p&gt;The most important change of this bugfix release is upgrading Apache Flink to version 1.11.3. In addition to many stability
fixes to the Flink runtime itself, this also allows StateFun applications to safely use savepoints when upgrading from
versions older than StateFun 2.2.1. Previously, restoring from savepoints could have failed under
&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19741&quot;&gt;certain conditions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;We strongly recommend that all users upgrade to 2.2.2&lt;/i&gt;&lt;/b&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This release includes 5 fixes and minor improvements since StateFun 2.2.1. Below is a detailed list of all fixes and improvements:&lt;/p&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20699&quot;&gt;FLINK-20699&lt;/a&gt;] - Feedback invocation_id must not be constant.
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20161&quot;&gt;FLINK-20161&lt;/a&gt;] - Consider switching from Travis CI to Github Actions for flink-statefun&amp;#39;s CI workflows
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20189&quot;&gt;FLINK-20189&lt;/a&gt;] - Restored feedback events may be silently dropped if per key-group header bytes were not fully read
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20636&quot;&gt;FLINK-20636&lt;/a&gt;] - Require unaligned checkpoints to be disabled in StateFun applications
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20689&quot;&gt;FLINK-20689&lt;/a&gt;] - Upgrade StateFun to Flink 1.11.3
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Sat, 02 Jan 2021 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2021/01/02/release-statefun-2.2.2.html</link>
<guid isPermaLink="true">/news/2021/01/02/release-statefun-2.2.2.html</guid>
</item>
<item>
<title>Apache Flink 1.11.3 Released</title>
<description>&lt;p&gt;The Apache Flink community released the third bugfix version of the Apache Flink 1.11 series.&lt;/p&gt;
&lt;p&gt;This release includes 151 fixes and minor improvements for Flink 1.11.2. The list below includes a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.11.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17393&quot;&gt;FLINK-17393&lt;/a&gt;] - Improve the `FutureCompletingBlockingQueue` to wakeup blocking put() more elegantly.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18604&quot;&gt;FLINK-18604&lt;/a&gt;] - HBase ConnectorDescriptor can not work in Table API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18673&quot;&gt;FLINK-18673&lt;/a&gt;] - Calling ROW() in a UDF results in UnsupportedOperationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18680&quot;&gt;FLINK-18680&lt;/a&gt;] - Improve RecordsWithSplitIds API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18916&quot;&gt;FLINK-18916&lt;/a&gt;] - Add a &amp;quot;Operations&amp;quot; link(linked to dev/table/tableApi.md) under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18918&quot;&gt;FLINK-18918&lt;/a&gt;] - Add a &amp;quot;Connectors&amp;quot; document under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18922&quot;&gt;FLINK-18922&lt;/a&gt;] - Add a &amp;quot;Catalogs&amp;quot; link (linked to dev/table/catalogs.md) under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18926&quot;&gt;FLINK-18926&lt;/a&gt;] - Add a &amp;quot;Environment Variables&amp;quot; document under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19162&quot;&gt;FLINK-19162&lt;/a&gt;] - Allow Split Reader based sources to reuse record batches
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19205&quot;&gt;FLINK-19205&lt;/a&gt;] - SourceReaderContext should give access to Configuration and Hostname
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20397&quot;&gt;FLINK-20397&lt;/a&gt;] - Pass checkpointId to OperatorCoordinator.resetToCheckpoint().
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9992&quot;&gt;FLINK-9992&lt;/a&gt;] - FsStorageLocationReferenceTest#testEncodeAndDecode failed in CI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13733&quot;&gt;FLINK-13733&lt;/a&gt;] - FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15170&quot;&gt;FLINK-15170&lt;/a&gt;] - WebFrontendITCase.testCancelYarn fails on travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16246&quot;&gt;FLINK-16246&lt;/a&gt;] - Exclude &amp;quot;SdkMBeanRegistrySupport&amp;quot; from dynamically loaded AWS connectors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16268&quot;&gt;FLINK-16268&lt;/a&gt;] - Failed to run rank over window with Hive built-in functions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16768&quot;&gt;FLINK-16768&lt;/a&gt;] - HadoopS3RecoverableWriterITCase.testRecoverWithStateWithMultiPart hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17341&quot;&gt;FLINK-17341&lt;/a&gt;] - freeSlot in TaskExecutor.closeJobManagerConnection cause ConcurrentModificationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17458&quot;&gt;FLINK-17458&lt;/a&gt;] - TaskExecutorSubmissionTest#testFailingScheduleOrUpdateConsumers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17677&quot;&gt;FLINK-17677&lt;/a&gt;] - FLINK_LOG_PREFIX recommended in docs is not always available
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17825&quot;&gt;FLINK-17825&lt;/a&gt;] - HA end-to-end gets killed due to timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18128&quot;&gt;FLINK-18128&lt;/a&gt;] - CoordinatedSourceITCase.testMultipleSources gets stuck
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18196&quot;&gt;FLINK-18196&lt;/a&gt;] - flink throws `NullPointerException` when executeCheckpointing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18222&quot;&gt;FLINK-18222&lt;/a&gt;] - &amp;quot;Avro Confluent Schema Registry nightly end-to-end test&amp;quot; unstable with &amp;quot;Kafka cluster did not start after 120 seconds&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18815&quot;&gt;FLINK-18815&lt;/a&gt;] - AbstractCloseableRegistryTest.testClose unstable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18818&quot;&gt;FLINK-18818&lt;/a&gt;] - HadoopRenameCommitterHDFSTest.testCommitOneFile[Override: false] failed with &amp;quot;java.io.IOException: The stream is closed&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18836&quot;&gt;FLINK-18836&lt;/a&gt;] - Python UDTF doesn&amp;#39;t work well when the return type isn&amp;#39;t generator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18915&quot;&gt;FLINK-18915&lt;/a&gt;] - FIXED_PATH(dummy Hadoop Path) with WriterImpl may cause ORC writer OOM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19022&quot;&gt;FLINK-19022&lt;/a&gt;] - AkkaRpcActor failed to start but no exception information
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19121&quot;&gt;FLINK-19121&lt;/a&gt;] - Avoid accessing HDFS frequently in HiveBulkWriterFactory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19135&quot;&gt;FLINK-19135&lt;/a&gt;] - (Stream)ExecutionEnvironment.execute() should not throw ExecutionException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19138&quot;&gt;FLINK-19138&lt;/a&gt;] - Python UDF supports directly specifying input_types as DataTypes.ROW
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19140&quot;&gt;FLINK-19140&lt;/a&gt;] - Join with Table Function (UDTF) SQL example is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19151&quot;&gt;FLINK-19151&lt;/a&gt;] - Flink does not normalize container resource with correct configurations when Yarn FairScheduler is used
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19154&quot;&gt;FLINK-19154&lt;/a&gt;] - Application mode deletes HA data in case of suspended ZooKeeper connection
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19170&quot;&gt;FLINK-19170&lt;/a&gt;] - Parameter naming error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19201&quot;&gt;FLINK-19201&lt;/a&gt;] - PyFlink e2e tests is instable and failed with &amp;quot;Connection broken: OSError&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19227&quot;&gt;FLINK-19227&lt;/a&gt;] - The catalog is still created after opening failed in catalog registering
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19237&quot;&gt;FLINK-19237&lt;/a&gt;] - LeaderChangeClusterComponentsTest.testReelectionOfJobMaster failed with &amp;quot;NoResourceAvailableException: Could not allocate the required slot within slot request timeout&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19244&quot;&gt;FLINK-19244&lt;/a&gt;] - CSV format can&amp;#39;t deserialize null ROW field
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19250&quot;&gt;FLINK-19250&lt;/a&gt;] - SplitFetcherManager does not propagate errors correctly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19253&quot;&gt;FLINK-19253&lt;/a&gt;] - SourceReaderTestBase.testAddSplitToExistingFetcher hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19258&quot;&gt;FLINK-19258&lt;/a&gt;] - Fix the wrong example of &amp;quot;csv.line-delimiter&amp;quot; in CSV documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19280&quot;&gt;FLINK-19280&lt;/a&gt;] - The option &amp;quot;sink.buffer-flush.max-rows&amp;quot; for JDBC can&amp;#39;t be disabled by set to zero
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19281&quot;&gt;FLINK-19281&lt;/a&gt;] - LIKE cannot recognize full table path
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19291&quot;&gt;FLINK-19291&lt;/a&gt;] - Fix exception for AvroSchemaConverter#convertToSchema when RowType contains multiple row fields
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19295&quot;&gt;FLINK-19295&lt;/a&gt;] - YARNSessionFIFOITCase.checkForProhibitedLogContents found a log with prohibited string
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19300&quot;&gt;FLINK-19300&lt;/a&gt;] - Timer loss after restoring from savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19321&quot;&gt;FLINK-19321&lt;/a&gt;] - CollectSinkFunction does not define serialVersionUID
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19338&quot;&gt;FLINK-19338&lt;/a&gt;] - New source interface cannot unregister unregistered source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19361&quot;&gt;FLINK-19361&lt;/a&gt;] - Make HiveCatalog thread safe
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19398&quot;&gt;FLINK-19398&lt;/a&gt;] - Hive connector fails with IllegalAccessError if submitted as usercode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19401&quot;&gt;FLINK-19401&lt;/a&gt;] - Job stuck in restart loop due to excessive checkpoint recoveries which block the JobMaster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19423&quot;&gt;FLINK-19423&lt;/a&gt;] - Fix ArrayIndexOutOfBoundsException when executing DELETE statement in JDBC upsert sink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19433&quot;&gt;FLINK-19433&lt;/a&gt;] - An Error example of FROM_UNIXTIME function in document
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19448&quot;&gt;FLINK-19448&lt;/a&gt;] - CoordinatedSourceITCase.testEnumeratorReaderCommunication hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19535&quot;&gt;FLINK-19535&lt;/a&gt;] - SourceCoordinator should avoid fail job multiple times.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19557&quot;&gt;FLINK-19557&lt;/a&gt;] - Issue retrieving leader after zookeeper session reconnect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19585&quot;&gt;FLINK-19585&lt;/a&gt;] - UnalignedCheckpointCompatibilityITCase.test:97-&amp;gt;runAndTakeSavepoint: &amp;quot;Not all required tasks are currently running.&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19587&quot;&gt;FLINK-19587&lt;/a&gt;] - Error result when casting binary type as varchar
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19618&quot;&gt;FLINK-19618&lt;/a&gt;] - Broken link in docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19629&quot;&gt;FLINK-19629&lt;/a&gt;] - Fix NullPointException when deserializing map field with null value for Avro format
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19675&quot;&gt;FLINK-19675&lt;/a&gt;] - The plan is incorrect when Calc contains WHERE clause, composite fields access and Python UDF at the same time
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19695&quot;&gt;FLINK-19695&lt;/a&gt;] - Writing Table with RowTime Column of type TIMESTAMP(3) to Kafka fails with ClassCastException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19717&quot;&gt;FLINK-19717&lt;/a&gt;] - SourceReaderBase.pollNext may return END_OF_INPUT if SplitReader.fetch throws
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19740&quot;&gt;FLINK-19740&lt;/a&gt;] - Error in to_pandas for table containing event time: class java.time.LocalDateTime cannot be cast to class java.sql.Timestamp
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19741&quot;&gt;FLINK-19741&lt;/a&gt;] - InternalTimeServiceManager fails to restore due to corrupt reads if there are other users of raw keyed state streams
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19748&quot;&gt;FLINK-19748&lt;/a&gt;] - KeyGroupRangeOffsets#KeyGroupOffsetsIterator should skip key groups that don&amp;#39;t have a defined offset
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19750&quot;&gt;FLINK-19750&lt;/a&gt;] - Deserializer is not opened in Kafka consumer when restoring from state
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19755&quot;&gt;FLINK-19755&lt;/a&gt;] - Fix CEP documentation error of the example in &amp;#39;After Match Strategy&amp;#39; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19775&quot;&gt;FLINK-19775&lt;/a&gt;] - SystemProcessingTimeServiceTest.testImmediateShutdown is instable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19777&quot;&gt;FLINK-19777&lt;/a&gt;] - Fix NullPointException for WindowOperator.close()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19790&quot;&gt;FLINK-19790&lt;/a&gt;] - Writing MAP&amp;lt;STRING, STRING&amp;gt; to Kafka with JSON format produces incorrect data.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19806&quot;&gt;FLINK-19806&lt;/a&gt;] - Job may try to leave SUSPENDED state in ExecutionGraph#failJob()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19816&quot;&gt;FLINK-19816&lt;/a&gt;] - Flink restored from a wrong checkpoint (a very old one and not the last completed one)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19852&quot;&gt;FLINK-19852&lt;/a&gt;] - Managed memory released check can block IterativeTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19867&quot;&gt;FLINK-19867&lt;/a&gt;] - Validation fails for UDF that accepts var-args
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19894&quot;&gt;FLINK-19894&lt;/a&gt;] - Use iloc for positional slicing instead of direct slicing in from_pandas
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19901&quot;&gt;FLINK-19901&lt;/a&gt;] - Unable to exclude metrics variables for the last metrics reporter.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19906&quot;&gt;FLINK-19906&lt;/a&gt;] - Incorrect result when compare two binary fields
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19907&quot;&gt;FLINK-19907&lt;/a&gt;] - Channel state (upstream) can be restored after emission of new elements (watermarks)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19909&quot;&gt;FLINK-19909&lt;/a&gt;] - Flink application in attach mode could not terminate when the only job is canceled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19948&quot;&gt;FLINK-19948&lt;/a&gt;] - Calling NOW() function throws compile exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20013&quot;&gt;FLINK-20013&lt;/a&gt;] - BoundedBlockingSubpartition may leak network buffer if task is failed or canceled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20018&quot;&gt;FLINK-20018&lt;/a&gt;] - pipeline.cached-files option cannot escape &amp;#39;:&amp;#39; in path
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20033&quot;&gt;FLINK-20033&lt;/a&gt;] - Job fails when stopping JobMaster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20050&quot;&gt;FLINK-20050&lt;/a&gt;] - SourceCoordinatorProviderTest.testCheckpointAndReset failed with NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20063&quot;&gt;FLINK-20063&lt;/a&gt;] - File Source requests an additional split on every restore.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20064&quot;&gt;FLINK-20064&lt;/a&gt;] - Broken links in the documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20065&quot;&gt;FLINK-20065&lt;/a&gt;] - UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20068&quot;&gt;FLINK-20068&lt;/a&gt;] - KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20069&quot;&gt;FLINK-20069&lt;/a&gt;] - docs_404_check doesn&amp;#39;t work properly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20076&quot;&gt;FLINK-20076&lt;/a&gt;] - DispatcherTest.testOnRemovedJobGraphDoesNotCleanUpHAFiles does not test the desired functionality
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20079&quot;&gt;FLINK-20079&lt;/a&gt;] - Modified UnalignedCheckpointITCase...MassivelyParallel fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20081&quot;&gt;FLINK-20081&lt;/a&gt;] - ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20143&quot;&gt;FLINK-20143&lt;/a&gt;] - use `yarn.provided.lib.dirs` config deploy job failed in yarn per job mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20165&quot;&gt;FLINK-20165&lt;/a&gt;] - YARNSessionFIFOITCase.checkForProhibitedLogContents: Error occurred during initialization of boot layer java.lang.IllegalStateException: Module system already initialized
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20175&quot;&gt;FLINK-20175&lt;/a&gt;] - Avro Confluent Registry SQL format does not support adding nullable columns
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20183&quot;&gt;FLINK-20183&lt;/a&gt;] - Fix the default PYTHONPATH is overwritten in client side
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20193&quot;&gt;FLINK-20193&lt;/a&gt;] - SourceCoordinator should catch exception thrown from SplitEnumerator.start()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20194&quot;&gt;FLINK-20194&lt;/a&gt;] - KafkaSourceFetcherManager.commitOffsets() should handle the case when there is no split fetcher.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20200&quot;&gt;FLINK-20200&lt;/a&gt;] - SQL Hints are not supported in &amp;quot;Create View&amp;quot; syntax
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20213&quot;&gt;FLINK-20213&lt;/a&gt;] - Partition commit is delayed when records keep coming
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20221&quot;&gt;FLINK-20221&lt;/a&gt;] - DelimitedInputFormat does not restore compressed filesplits correctly leading to dataloss
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20222&quot;&gt;FLINK-20222&lt;/a&gt;] - The CheckpointCoordinator should reset the OperatorCoordinators when fail before the first checkpoint.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20223&quot;&gt;FLINK-20223&lt;/a&gt;] - The RecreateOnResetOperatorCoordinator and SourceCoordinator executor thread should use the user class loader.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20243&quot;&gt;FLINK-20243&lt;/a&gt;] - Remove useless words in documents
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20262&quot;&gt;FLINK-20262&lt;/a&gt;] - Building flink-dist docker image does not work without python2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20266&quot;&gt;FLINK-20266&lt;/a&gt;] - New Sources prevent JVM shutdown when running a job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20270&quot;&gt;FLINK-20270&lt;/a&gt;] - Fix the regression of missing ExternallyInducedSource support in FLIP-27 Source.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20277&quot;&gt;FLINK-20277&lt;/a&gt;] - flink-1.11.2 ContinuousFileMonitoringFunction cannot restore from failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20284&quot;&gt;FLINK-20284&lt;/a&gt;] - Error happens in TaskExecutor when closing JobMaster connection if there was a python UDF
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20285&quot;&gt;FLINK-20285&lt;/a&gt;] - LazyFromSourcesSchedulingStrategy is possible to schedule non-CREATED vertices
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20333&quot;&gt;FLINK-20333&lt;/a&gt;] - Flink standalone cluster throws metaspace OOM after submitting multiple PyFlink UDF jobs.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20351&quot;&gt;FLINK-20351&lt;/a&gt;] - Execution.transitionState does not properly log slot location
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20382&quot;&gt;FLINK-20382&lt;/a&gt;] - Exception thrown from JobMaster.startScheduling() may be ignored.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20396&quot;&gt;FLINK-20396&lt;/a&gt;] - Add &amp;quot;OperatorCoordinator.resetSubtask()&amp;quot; to fix order problems of &amp;quot;subtaskFailed()&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20404&quot;&gt;FLINK-20404&lt;/a&gt;] - ZooKeeper quorum fails to start due to missing log4j library
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20413&quot;&gt;FLINK-20413&lt;/a&gt;] - Sources should add splits back in &amp;quot;resetSubtask()&amp;quot;, rather than in &amp;quot;subtaskFailed()&amp;quot;.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20418&quot;&gt;FLINK-20418&lt;/a&gt;] - NPE in IteratorSourceReader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20442&quot;&gt;FLINK-20442&lt;/a&gt;] - Fix license documentation mistakes in flink-python.jar
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20492&quot;&gt;FLINK-20492&lt;/a&gt;] - The SourceOperatorStreamTask should implement cancelTask() and finishTask()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20554&quot;&gt;FLINK-20554&lt;/a&gt;] - The Checkpointed Data Size of the Latest Completed Checkpoint is incorrectly displayed on the Overview page of the UI
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19934&quot;&gt;FLINK-19934&lt;/a&gt;] - [FLIP-27 source] add new API: SplitEnumeratorContext.runInCoordinatorThread(Runnable)
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16753&quot;&gt;FLINK-16753&lt;/a&gt;] - Exception from AsyncCheckpointRunnable should be wrapped in CheckpointException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18139&quot;&gt;FLINK-18139&lt;/a&gt;] - Unaligned checkpoints checks wrong channels for inflight data.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18500&quot;&gt;FLINK-18500&lt;/a&gt;] - Make the legacy planner exception more clear when resolving computed columns types for schema
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18545&quot;&gt;FLINK-18545&lt;/a&gt;] - Sql api cannot specify flink job name
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18715&quot;&gt;FLINK-18715&lt;/a&gt;] - add cpu usage metric of jobmanager/taskmanager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19193&quot;&gt;FLINK-19193&lt;/a&gt;] - Recommend stop-with-savepoint in upgrade guidelines
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19225&quot;&gt;FLINK-19225&lt;/a&gt;] - Improve code and logging in SourceReaderBase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19245&quot;&gt;FLINK-19245&lt;/a&gt;] - Set default queue capacity for FLIP-27 source handover queue to 2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19251&quot;&gt;FLINK-19251&lt;/a&gt;] - Avoid confusing queue handling in &amp;quot;SplitReader.handleSplitsChanges()&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19252&quot;&gt;FLINK-19252&lt;/a&gt;] - Jaas file created under io.tmp.dirs - folder not created if not exists
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19265&quot;&gt;FLINK-19265&lt;/a&gt;] - Simplify handling of &amp;#39;NoMoreSplitsEvent&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19339&quot;&gt;FLINK-19339&lt;/a&gt;] - Support Avro&amp;#39;s unions with logical types
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19523&quot;&gt;FLINK-19523&lt;/a&gt;] - Hide sensitive command-line configurations
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19569&quot;&gt;FLINK-19569&lt;/a&gt;] - Upgrade ICU4J to 67.1+
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19677&quot;&gt;FLINK-19677&lt;/a&gt;] - TaskManager takes abnormally long time to register with JobManager on Kubernetes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19698&quot;&gt;FLINK-19698&lt;/a&gt;] - Add close() method and onCheckpointComplete() to the Source.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19892&quot;&gt;FLINK-19892&lt;/a&gt;] - Replace __metaclass__ field with metaclass keyword
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20049&quot;&gt;FLINK-20049&lt;/a&gt;] - Simplify handling of &amp;quot;request split&amp;quot;.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20055&quot;&gt;FLINK-20055&lt;/a&gt;] - Datadog API Key exposed in Flink JobManager logs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20142&quot;&gt;FLINK-20142&lt;/a&gt;] - Update the document for CREATE TABLE LIKE that source table from different catalog is supported
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20152&quot;&gt;FLINK-20152&lt;/a&gt;] - Document which execution.target values are supported
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20156&quot;&gt;FLINK-20156&lt;/a&gt;] - JavaDocs of WatermarkStrategy.withTimestampAssigner are wrong wrt Java 8
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20169&quot;&gt;FLINK-20169&lt;/a&gt;] - Move emitting MAX_WATERMARK out of SourceOperator processing loop
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20207&quot;&gt;FLINK-20207&lt;/a&gt;] - Improve the error message printed when submitting the pyflink jobs via &amp;#39;flink run&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20296&quot;&gt;FLINK-20296&lt;/a&gt;] - Explanation of keyBy was broken by find/replace of deprecated forms of keyBy
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Test
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18725&quot;&gt;FLINK-18725&lt;/a&gt;] - &amp;quot;Run Kubernetes test&amp;quot; failed with &amp;quot;30025: provided port is already allocated&amp;quot;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20455&quot;&gt;FLINK-20455&lt;/a&gt;] - Add check to LicenseChecker for top level /LICENSE files in shaded jars
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 18 Dec 2020 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/12/18/release-1.11.3.html</link>
<guid isPermaLink="true">/news/2020/12/18/release-1.11.3.html</guid>
</item>
<item>
<title>Improvements in task scheduling for batch workloads in Apache Flink 1.12</title>
<description>&lt;p&gt;The Flink community has been working for some time on making Flink a
&lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;truly unified batch and stream processing system&lt;/a&gt;.
Achieving this involves touching a lot of different components of the Flink stack, from the user-facing APIs all the way
to low-level operator processes such as task scheduling. In this blogpost, we’ll take a closer look at how far
the community has come in improving scheduling for batch workloads, why this matters and what you can expect in the
Flink 1.12 release with the new &lt;em&gt;pipelined region scheduler&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#towards-unified-scheduling&quot; id=&quot;markdown-toc-towards-unified-scheduling&quot;&gt;Towards unified scheduling&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#scheduling-strategies-in-flink-before-112&quot; id=&quot;markdown-toc-scheduling-strategies-in-flink-before-112&quot;&gt;Scheduling Strategies in Flink before 1.12&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#a-practical-example&quot; id=&quot;markdown-toc-a-practical-example&quot;&gt;A practical example&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-new-pipelined-region-scheduling&quot; id=&quot;markdown-toc-the-new-pipelined-region-scheduling&quot;&gt;The new pipelined region scheduling&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#pipelined-regions&quot; id=&quot;markdown-toc-pipelined-regions&quot;&gt;Pipelined regions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pipelined-region-scheduling-strategy&quot; id=&quot;markdown-toc-pipelined-region-scheduling-strategy&quot;&gt;Pipelined region scheduling strategy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#failover-strategy&quot; id=&quot;markdown-toc-failover-strategy&quot;&gt;Failover strategy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#benefits&quot; id=&quot;markdown-toc-benefits&quot;&gt;Benefits&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#appendix&quot; id=&quot;markdown-toc-appendix&quot;&gt;Appendix&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#what-is-scheduling&quot; id=&quot;markdown-toc-what-is-scheduling&quot;&gt;What is scheduling?&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#executiongraph&quot; id=&quot;markdown-toc-executiongraph&quot;&gt;ExecutionGraph&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#intermediate-results&quot; id=&quot;markdown-toc-intermediate-results&quot;&gt;Intermediate results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#slots-and-resources&quot; id=&quot;markdown-toc-slots-and-resources&quot;&gt;Slots and resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#scheduling-strategy&quot; id=&quot;markdown-toc-scheduling-strategy&quot;&gt;Scheduling strategy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;towards-unified-scheduling&quot;&gt;Towards unified scheduling&lt;/h1&gt;
&lt;p&gt;Flink has an internal &lt;a href=&quot;#what-is-scheduling&quot;&gt;scheduler&lt;/a&gt; to distribute work to all available cluster nodes, taking resource utilization, state locality and recovery into account.
How do you write a scheduler for a unified batch and streaming system? To answer this question,
let’s first have a look into the high-level differences between batch and streaming scheduling requirements.&lt;/p&gt;
&lt;h4 id=&quot;streaming&quot;&gt;Streaming&lt;/h4&gt;
&lt;p&gt;&lt;em&gt;Streaming&lt;/em&gt; jobs usually require that all &lt;em&gt;&lt;a href=&quot;#executiongraph&quot;&gt;operator subtasks&lt;/a&gt;&lt;/em&gt; are running in parallel at the same time, for an indefinite time.
Therefore, all the required resources to run these jobs have to be provided upfront, and all &lt;em&gt;operator subtasks&lt;/em&gt; must be deployed at once.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-12-02-pipelined-region-sheduling/streaming-job-example.png&quot; width=&quot;400px&quot; alt=&quot;Streaming job example:high&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Streaming job example&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Because there are no finite intermediate results, a &lt;em&gt;streaming job&lt;/em&gt; always has to be restarted fully from a checkpoint or a savepoint in case of failure.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
A &lt;em&gt;streaming job&lt;/em&gt; may generally consist of multiple disjoint pipelines which can be restarted independently.
Hence, the full job restart is not required in this case but you can think of each disjoint pipeline as if it were a separate job.&lt;/p&gt;
&lt;/div&gt;
&lt;h4 id=&quot;batch&quot;&gt;Batch&lt;/h4&gt;
&lt;p&gt;In contrast to &lt;em&gt;streaming&lt;/em&gt; jobs, &lt;em&gt;batch&lt;/em&gt; jobs usually consist of one or more stages that can have dependencies between them.
Each stage will only run for a finite amount of time and produce some finite output (i.e. at some point, the batch job will be &lt;em&gt;finished&lt;/em&gt;).
Independent stages can run in parallel to improve execution time, but for cases where there are dependencies between stages,
a stage may have to wait for upstream results to be produced before it can run.
These are called &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;blocking results&lt;/a&gt;&lt;/em&gt;, and in this case stages cannot run in parallel.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-12-02-pipelined-region-sheduling/batch-job-example.png&quot; width=&quot;600px&quot; alt=&quot;Batch job example:high&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Batch job example&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;As an example, in the figure above &lt;strong&gt;Stage 0&lt;/strong&gt; and &lt;strong&gt;Stage 1&lt;/strong&gt; can run simultaneously, as there is no dependency between them.
&lt;strong&gt;Stage 3&lt;/strong&gt;, on the other hand, can only be scheduled once both its inputs are available. There are a few implications from this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;(a)&lt;/strong&gt; You can use available resources more efficiently by only scheduling stages that have data to perform work;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;(b)&lt;/strong&gt; You can use this mechanism also for failover: if a stage fails, it can be restarted individually, without recomputing the results of other stages.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;scheduling-strategies-in-flink-before-112&quot;&gt;Scheduling Strategies in Flink before 1.12&lt;/h3&gt;
&lt;p&gt;Given these differences, a unified scheduler would have to be good at resource management for each individual stage,
be it finite (&lt;em&gt;batch&lt;/em&gt;) or infinite (&lt;em&gt;streaming&lt;/em&gt;), and also across multiple stages.
The existing &lt;a href=&quot;#scheduling-strategy&quot;&gt;scheduling strategies&lt;/a&gt; in older Flink versions up to 1.11 have been largely designed to address these concerns separately.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;“All at once (Eager)”&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This strategy is the simplest: Flink just tries to allocate resources and deploy all &lt;em&gt;subtasks&lt;/em&gt; at once.
Up to Flink 1.11, this is the scheduling strategy used for all &lt;em&gt;streaming&lt;/em&gt; jobs.
For &lt;em&gt;batch&lt;/em&gt; jobs, using “all at once” scheduling would lead to suboptimal resource utilization,
since it’s unlikely that such jobs would require all resources upfront, and any resources allocated to subtasks
that could not run at a given moment would be idle and therefore wasted.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;“Lazy from sources”&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To account for &lt;em&gt;blocking results&lt;/em&gt; and to make sure that no consumer is deployed before its respective producers have finished,
Flink provides a different scheduling strategy for &lt;em&gt;batch&lt;/em&gt; workloads.
“Lazy from sources” scheduling deploys subtasks only once all their inputs are ready.
This strategy operates on each &lt;em&gt;subtask&lt;/em&gt; individually; it does not identify all &lt;em&gt;subtasks&lt;/em&gt; which can (or have to) run at the same time.&lt;/p&gt;
&lt;h3 id=&quot;a-practical-example&quot;&gt;A practical example&lt;/h3&gt;
&lt;p&gt;Let’s take a closer look at the specific case of &lt;em&gt;batch&lt;/em&gt; jobs, using as motivation a simple SQL query:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-SQL&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customers&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;customerId&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;255&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;orderId&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;orderCustomerId&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;--fill tables with data&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customerId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customerId&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orderCustomerId&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Assume that two tables were created in some database: the &lt;code&gt;customers&lt;/code&gt; table is relatively small and fits into local memory (or at least on local disk). The &lt;code&gt;orders&lt;/code&gt; table is bigger, as it contains all orders created by customers, and doesn’t fit in memory. To enrich the orders with the customer name, you have to join these two tables. There are basically two stages in this &lt;em&gt;batch&lt;/em&gt; job:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Load the complete &lt;code&gt;customers&lt;/code&gt; table into a local map of &lt;code&gt;(customerId, name)&lt;/code&gt; entries, because this table is the smaller of the two;&lt;/li&gt;
&lt;li&gt;Process the &lt;code&gt;orders&lt;/code&gt; table record by record, enriching each record with the &lt;code&gt;name&lt;/code&gt; value from the map (see the sketch after this list).&lt;/li&gt;
&lt;/ol&gt;
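&lt;p&gt;To make these two stages more concrete, below is a minimal, plain-Java sketch of the pattern (a hash join): the small table is materialized into an in-memory map first, and the large table is then streamed through it record by record. The class name and the hard-coded records are invented for illustration and are not part of any Flink API.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class HashJoinSketch {

    public static void main(String[] args) {
        // Stage 1 (subtask A): load the small customers table fully into a local map.
        Map&amp;lt;Integer, String&amp;gt; customersById = new HashMap&amp;lt;&amp;gt;();
        customersById.put(1, &quot;Alice&quot;);
        customersById.put(2, &quot;Bob&quot;);

        // Stage 2 (subtasks B -&amp;gt; C): stream the orders table record by record
        // and enrich every order with the customer name from the map.
        Stream&amp;lt;int[]&amp;gt; orders = Stream.of(new int[] {100, 1}, new int[] {101, 2});
        orders.forEach(order -&amp;gt;
                System.out.println(&quot;order &quot; + order[0] + &quot; placed by &quot; + customersById.get(order[1])));
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;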
&lt;h4 id=&quot;executing-the-job&quot;&gt;Executing the job&lt;/h4&gt;
&lt;p&gt;The batch job described above will have three operators. For simplicity, each operator is represented with a parallelism of 1,
so the resulting &lt;em&gt;&lt;a href=&quot;#executiongraph&quot;&gt;ExecutionGraph&lt;/a&gt;&lt;/em&gt; will consist of three &lt;em&gt;subtasks&lt;/em&gt;: A, B and C.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;A&lt;/strong&gt;: load full &lt;code&gt;customers&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;B&lt;/strong&gt;: load &lt;code&gt;orders&lt;/code&gt; table record by record in a &lt;em&gt;streaming&lt;/em&gt; (pipelined) fashion&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;C&lt;/strong&gt;: join order table records with the loaded customer table&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This translates into &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;C&lt;/strong&gt; being connected with a &lt;em&gt;blocking&lt;/em&gt; data exchange,
because the &lt;code&gt;customers&lt;/code&gt; table needs to be fully loaded locally (&lt;strong&gt;A&lt;/strong&gt;) before the join (&lt;strong&gt;C&lt;/strong&gt;) can start processing the records of the &lt;code&gt;orders&lt;/code&gt; table.
&lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;C&lt;/strong&gt; are connected with a &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;pipelined&lt;/a&gt;&lt;/em&gt; data exchange,
because the consumer (&lt;strong&gt;C&lt;/strong&gt;) can run as soon as the first result records from &lt;strong&gt;B&lt;/strong&gt; have been produced.
You can think of &lt;strong&gt;B-&amp;gt;C&lt;/strong&gt; as a &lt;em&gt;finite streaming&lt;/em&gt; job. It’s then possible to identify two separate stages within the &lt;em&gt;ExecutionGraph&lt;/em&gt;: &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;B-&amp;gt;C&lt;/strong&gt;.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-12-02-pipelined-region-sheduling/sql-join-job-example.png&quot; width=&quot;450px&quot; alt=&quot;SQL Join job example:high&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: SQL Join job example&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h4 id=&quot;scheduling-limitations&quot;&gt;Scheduling Limitations&lt;/h4&gt;
&lt;p&gt;Imagine that the cluster this job will run in has only one &lt;em&gt;&lt;a href=&quot;#slots-and-resources&quot;&gt;slot&lt;/a&gt;&lt;/em&gt; and can therefore only execute one &lt;em&gt;subtask&lt;/em&gt;.
If Flink deploys &lt;strong&gt;B&lt;/strong&gt; &lt;em&gt;&lt;a href=&quot;#slots-and-resources&quot;&gt;chained&lt;/a&gt;&lt;/em&gt; with &lt;strong&gt;C&lt;/strong&gt; first into this one &lt;em&gt;slot&lt;/em&gt; (as &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;C&lt;/strong&gt; are connected with a &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;pipelined&lt;/a&gt;&lt;/em&gt; edge),
&lt;strong&gt;C&lt;/strong&gt; cannot run because &lt;strong&gt;A&lt;/strong&gt; has not produced its &lt;em&gt;blocking result&lt;/em&gt; yet. Flink will try to deploy &lt;strong&gt;A&lt;/strong&gt; and the job will fail, because there are no more &lt;em&gt;slots&lt;/em&gt;.
If there were two &lt;em&gt;slots&lt;/em&gt; available, Flink would be able to deploy &lt;strong&gt;A&lt;/strong&gt; and the job would eventually succeed.
Nonetheless, the resources of the first &lt;em&gt;slot&lt;/em&gt; occupied by &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;C&lt;/strong&gt; would be wasted while &lt;strong&gt;A&lt;/strong&gt; was running.&lt;/p&gt;
&lt;p&gt;Both scheduling strategies available as of Flink 1.11 (&lt;em&gt;“all at once”&lt;/em&gt; and &lt;em&gt;“lazy from source”&lt;/em&gt;) would be affected by these limitations.
What would be the optimal approach? In this case, if &lt;strong&gt;A&lt;/strong&gt; was deployed first, then &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;C&lt;/strong&gt; could also complete afterwards using the same &lt;em&gt;slot&lt;/em&gt;.
The job would succeed even if only a single &lt;em&gt;slot&lt;/em&gt; was available.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
If we could load the &lt;code&gt;orders&lt;/code&gt; table into local memory (making B -&amp;gt; C blocking), then the previous strategy would also succeed with one slot.
Nonetheless, we would have to allocate a lot of resources to accommodate the table locally, which may not be required.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Last but not least, let’s consider what happens in the case of &lt;em&gt;failover&lt;/em&gt;: if the processing of the &lt;code&gt;orders&lt;/code&gt; table fails (&lt;strong&gt;B-&amp;gt;C&lt;/strong&gt;),
then we do not have to reload the customer table (&lt;strong&gt;A&lt;/strong&gt;); we only need to restart &lt;strong&gt;B-&amp;gt;C&lt;/strong&gt;. This did not work prior to Flink 1.9.&lt;/p&gt;
&lt;p&gt;To satisfy the scheduling requirements for &lt;em&gt;batch&lt;/em&gt; and &lt;em&gt;streaming&lt;/em&gt; and overcome these limitations,
the Flink community has worked on a new unified scheduling and failover strategy that is suitable for both types of workloads: &lt;em&gt;pipelined region scheduling&lt;/em&gt;.&lt;/p&gt;
&lt;h1 id=&quot;the-new-pipelined-region-scheduling&quot;&gt;The new pipelined region scheduling&lt;/h1&gt;
&lt;p&gt;As you read in the previous introductory sections, an optimal &lt;a href=&quot;#what-is-scheduling&quot;&gt;scheduler&lt;/a&gt; should efficiently allocate resources
for the sub-stages of the pipeline, finite or infinite, running in a &lt;em&gt;streaming&lt;/em&gt; fashion. Those stages are called &lt;em&gt;pipelined regions&lt;/em&gt; in Flink.
In this section, we will take a deeper dive into &lt;em&gt;pipelined region scheduling and failover&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id=&quot;pipelined-regions&quot;&gt;Pipelined regions&lt;/h2&gt;
&lt;p&gt;The new scheduling strategy analyses the &lt;em&gt;&lt;a href=&quot;#executiongraph&quot;&gt;ExecutionGraph&lt;/a&gt;&lt;/em&gt; before starting the &lt;em&gt;subtask&lt;/em&gt; deployment in order to identify its &lt;em&gt;pipelined regions&lt;/em&gt;.
A &lt;em&gt;pipelined region&lt;/em&gt; is a subset of &lt;em&gt;subtasks&lt;/em&gt; in the &lt;em&gt;ExecutionGraph&lt;/em&gt; connected by &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;pipelined&lt;/a&gt;&lt;/em&gt; data exchanges.
&lt;em&gt;Subtasks&lt;/em&gt; from different &lt;em&gt;pipelined regions&lt;/em&gt; are connected only by &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;blocking&lt;/a&gt;&lt;/em&gt; data exchanges.
The depicted example of an &lt;em&gt;ExecutionGraph&lt;/em&gt; has four &lt;em&gt;pipelined regions&lt;/em&gt;, made up of the &lt;em&gt;subtasks&lt;/em&gt; A to H:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-12-02-pipelined-region-sheduling/pipelined-regions.png&quot; width=&quot;250px&quot; alt=&quot;Pipelined regions:high&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Pipelined regions&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Why do we need the &lt;em&gt;pipelined region&lt;/em&gt;? Within a &lt;em&gt;pipelined region&lt;/em&gt;, all consumers have to continuously consume the produced results
so that the producers are not blocked by backpressure. Hence, all &lt;em&gt;subtasks&lt;/em&gt; of a &lt;em&gt;pipelined region&lt;/em&gt; have to be scheduled, restarted in case of failure and run at the same time.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note (out of scope)&lt;/span&gt;
In certain cases the &lt;em&gt;subtasks&lt;/em&gt; can be connected by &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;blocking&lt;/a&gt;&lt;/em&gt; data exchanges within one region.
Check &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17330&quot;&gt;FLINK-17330&lt;/a&gt; for details.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;pipelined-region-scheduling-strategy&quot;&gt;Pipelined region scheduling strategy&lt;/h2&gt;
&lt;p&gt;Once the &lt;em&gt;pipelined regions&lt;/em&gt; are identified, each region is scheduled only when all the regions it depends on (i.e. its inputs),
have produced their &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;blocking&lt;/a&gt;&lt;/em&gt; results (for the depicted graph: R2 and R3 after R1; R4 after R2 and R3).
If the &lt;em&gt;JobManager&lt;/em&gt; has enough resources available, it will try to run as many schedulable &lt;em&gt;pipelined regions&lt;/em&gt; in parallel as possible.
The &lt;em&gt;subtasks&lt;/em&gt; of a &lt;em&gt;pipelined region&lt;/em&gt; are either successfully deployed all at once or none at all.
The job fails if there are not enough resources to run any of its &lt;em&gt;pipelined regions&lt;/em&gt;.
You can read more about this effort in the original &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling#FLIP119PipelinedRegionScheduling-BulkSlotAllocation&quot;&gt;FLIP-119 proposal&lt;/a&gt;.&lt;/p&gt;
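&lt;p&gt;The following is a highly simplified sketch of this idea, not Flink’s actual scheduler implementation: the &lt;code&gt;Region&lt;/code&gt; class and the scheduling loop are invented for illustration and mirror the depicted graph (R1 feeding R2 and R3, which both feed R4).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.ArrayList;
import java.util.List;

public class RegionSchedulingSketch {

    // Toy model of a pipelined region: it becomes schedulable only once all regions
    // it depends on via blocking edges have finished. Conceptual illustration only.
    static class Region {
        final String name;
        final List&amp;lt;Region&amp;gt; blockingInputs = new ArrayList&amp;lt;&amp;gt;();
        boolean finished = false;

        Region(String name) { this.name = name; }

        boolean isSchedulable() {
            return blockingInputs.stream().allMatch(input -&amp;gt; input.finished);
        }
    }

    public static void main(String[] args) {
        Region r1 = new Region(&quot;R1&quot;);
        Region r2 = new Region(&quot;R2&quot;);
        Region r3 = new Region(&quot;R3&quot;);
        Region r4 = new Region(&quot;R4&quot;);
        r2.blockingInputs.add(r1);
        r3.blockingInputs.add(r1);
        r4.blockingInputs.add(r2);
        r4.blockingInputs.add(r3);

        List&amp;lt;Region&amp;gt; pending = new ArrayList&amp;lt;&amp;gt;(List.of(r1, r2, r3, r4));
        while (!pending.isEmpty()) {
            // Deploy every region whose blocking inputs have produced their results.
            // With enough slots, R2 and R3 would run in parallel in the same round.
            for (Region region : new ArrayList&amp;lt;&amp;gt;(pending)) {
                if (region.isSchedulable()) {
                    System.out.println(&quot;deploying all subtasks of &quot; + region.name);
                    region.finished = true; // stand-in for the region running to completion
                    pending.remove(region);
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;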
&lt;h2 id=&quot;failover-strategy&quot;&gt;Failover strategy&lt;/h2&gt;
&lt;p&gt;As mentioned before, only certain regions are running at the same time. Others have already produced their &lt;em&gt;&lt;a href=&quot;#intermediate-results&quot;&gt;blocking&lt;/a&gt;&lt;/em&gt; results.
The results are stored locally in &lt;em&gt;TaskManagers&lt;/em&gt; where the corresponding &lt;em&gt;subtasks&lt;/em&gt; run.
If a currently running region fails, it gets restarted to consume its inputs again.
If some input results got lost (e.g. the hosting &lt;em&gt;TaskManager&lt;/em&gt; failed as well), Flink will rerun their producing regions.
You can read more about this effort in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/task_failure_recovery.html#failover-strategies&quot;&gt;user documentation&lt;/a&gt;
and the original &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures&quot;&gt;FLIP-1 proposal&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;benefits&quot;&gt;Benefits&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Run any batch job, possibly with limited resources&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;subtasks&lt;/em&gt; of a &lt;em&gt;pipelined region&lt;/em&gt; are deployed only when all necessary conditions for their success are fulfilled:
inputs are ready and all needed resources are allocated. Hence, the &lt;em&gt;batch&lt;/em&gt; job never gets stuck without notifying the user.
The job either eventually finishes or fails after a timeout.&lt;/p&gt;
&lt;p&gt;Depending on how the &lt;em&gt;subtasks&lt;/em&gt; are allowed to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/#task-chaining-and-resource-groups&quot;&gt;share slots&lt;/a&gt;,
it is often the case that the whole &lt;em&gt;pipelined region&lt;/em&gt; can run within one &lt;em&gt;slot&lt;/em&gt;,
making it generally possible to run the whole &lt;em&gt;batch&lt;/em&gt; job with only a single &lt;em&gt;slot&lt;/em&gt;.
At the same time, if the cluster provides more resources, Flink will run as many regions as possible in parallel to improve the overall job performance.&lt;/p&gt;
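&lt;p&gt;Slot sharing itself is controlled on the DataStream API level. As a rough, hedged illustration (the socket source and the operator logic are placeholders): all operators belong to the same slot sharing group by default, which is why a whole &lt;em&gt;pipelined region&lt;/em&gt; can often fit into one &lt;em&gt;slot&lt;/em&gt;, while assigning a dedicated group forces an operator into separate &lt;em&gt;slots&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream(&quot;localhost&quot;, 9999)
                // By default, all operators are in the &quot;default&quot; slot sharing group,
                // so the subtasks of one pipelined region can typically share one slot.
                .map(String::toUpperCase)
                // Placing a heavy operator into its own group keeps it in separate slots,
                // which increases the number of slots the region needs.
                .map(line -&amp;gt; line + &quot;!&quot;)
                .slotSharingGroup(&quot;heavy&quot;)
                .print();

        env.execute(&quot;Slot sharing example&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;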
&lt;p&gt;&lt;strong&gt;No resource waste&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As mentioned in the definition of &lt;em&gt;pipelined region&lt;/em&gt;, all its &lt;em&gt;subtasks&lt;/em&gt; have to run simultaneously.
The &lt;em&gt;subtasks&lt;/em&gt; of other regions either cannot or do not have to run at the same time.
This means that a &lt;em&gt;pipelined region&lt;/em&gt; is the minimum subgraph of a &lt;em&gt;batch&lt;/em&gt; job’s &lt;em&gt;ExecutionGraph&lt;/em&gt; that has to be scheduled at once.
There is no way to run the job with fewer resources than needed to run the largest region, and so there can be no resource waste.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note (out of scope)&lt;/span&gt;
The amount of resources required to run a region can be further optimized separately.
It depends on &lt;em&gt;co-location constraints&lt;/em&gt; and &lt;em&gt;slot sharing groups&lt;/em&gt; of the region’s &lt;em&gt;subtasks&lt;/em&gt;.
Check &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18689&quot;&gt;FLINK-18689&lt;/a&gt; for details.&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;Scheduling is a fundamental component of the Flink stack. In this blogpost, we recapped how scheduling affects resource utilization and failover as a part of the user experience.
We described the limitations of Flink’s old scheduler and introduced a new approach to tackle them: the &lt;em&gt;pipelined region scheduler&lt;/em&gt;, which ships with Flink 1.12.
The blogpost also explained how &lt;em&gt;pipelined region failover&lt;/em&gt; (introduced in Flink 1.11) works.&lt;/p&gt;
&lt;p&gt;Stay tuned for more improvements to scheduling in upcoming releases. If you have any suggestions or questions for the community,
we encourage you to sign up to the Apache Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt; and become part of the discussion.&lt;/p&gt;
&lt;h1 id=&quot;appendix&quot;&gt;Appendix&lt;/h1&gt;
&lt;h2 id=&quot;what-is-scheduling&quot;&gt;What is scheduling?&lt;/h2&gt;
&lt;h3 id=&quot;executiongraph&quot;&gt;ExecutionGraph&lt;/h3&gt;
&lt;p&gt;A Flink &lt;em&gt;job&lt;/em&gt; is a pipeline of connected &lt;em&gt;operators&lt;/em&gt; to process data.
Together, the operators form a &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/internals/job_scheduling.html#jobmanager-data-structures&quot;&gt;JobGraph&lt;/a&gt;&lt;/em&gt;.
Each &lt;em&gt;operator&lt;/em&gt; has a certain number of &lt;em&gt;subtasks&lt;/em&gt; executed in parallel. The &lt;em&gt;subtask&lt;/em&gt; is the actual execution unit in Flink.
Each subtask can consume user records from other subtasks (inputs), process them and produce records for further consumption by other &lt;em&gt;subtasks&lt;/em&gt; (outputs) down the stream.
There are &lt;em&gt;source subtasks&lt;/em&gt; without inputs and &lt;em&gt;sink subtasks&lt;/em&gt; without outputs. Hence, the &lt;em&gt;subtasks&lt;/em&gt; form the nodes of the
&lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/internals/job_scheduling.html#jobmanager-data-structures&quot;&gt;ExecutionGraph&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;
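&lt;p&gt;As a minimal sketch (the pipeline itself is arbitrary), the job below has three &lt;em&gt;operators&lt;/em&gt;. With the default parallelism set to 2, the map and the print sink each run as two parallel &lt;em&gt;subtasks&lt;/em&gt;, while the &lt;code&gt;fromElements&lt;/code&gt; source is non-parallel and runs as a single &lt;em&gt;subtask&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SubtaskExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2); // default parallelism for the operators below

        env.fromElements(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;d&quot;) // non-parallel source operator: 1 subtask
                .map(String::toUpperCase)        // map operator: 2 subtasks
                .print();                        // sink operator: 2 subtasks

        env.execute(&quot;Subtask example&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;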
&lt;h3 id=&quot;intermediate-results&quot;&gt;Intermediate results&lt;/h3&gt;
&lt;p&gt;There are also two major data-exchange types to produce and consume results by &lt;em&gt;operators&lt;/em&gt; and their &lt;em&gt;subtasks&lt;/em&gt;: &lt;em&gt;pipelined&lt;/em&gt; and &lt;em&gt;blocking&lt;/em&gt;.
They are basically types of edges in the &lt;em&gt;ExecutionGraph&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;pipelined&lt;/em&gt; result can be consumed record by record. This means that the consumer can already run once the first result records have been produced.
A &lt;em&gt;pipelined&lt;/em&gt; result can be a never ending output of records, e.g. in case of a &lt;em&gt;streaming job&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;blocking&lt;/em&gt; result can be consumed only when its &lt;em&gt;production&lt;/em&gt; is done. Hence, the &lt;em&gt;blocking&lt;/em&gt; result is always finite
and the consumer of the &lt;em&gt;blocking&lt;/em&gt; result can run only when the producer has finished its execution.&lt;/p&gt;
&lt;h3 id=&quot;slots-and-resources&quot;&gt;Slots and resources&lt;/h3&gt;
&lt;p&gt;A &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/flink-architecture.html#anatomy-of-a-flink-cluster&quot;&gt;TaskManager&lt;/a&gt;&lt;/em&gt;
instance has a certain number of virtual &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/flink-architecture.html#task-slots-and-resources&quot;&gt;slots&lt;/a&gt;&lt;/em&gt;.
Each &lt;em&gt;slot&lt;/em&gt; represents a certain part of the &lt;em&gt;TaskManager’s physical resources&lt;/em&gt; to run the operator &lt;em&gt;subtasks&lt;/em&gt;, and each &lt;em&gt;subtask&lt;/em&gt; is deployed into a &lt;em&gt;slot&lt;/em&gt; of the &lt;em&gt;TaskManager&lt;/em&gt;.
A &lt;em&gt;slot&lt;/em&gt; can run multiple &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/internals/job_scheduling.html#scheduling&quot;&gt;subtasks&lt;/a&gt;&lt;/em&gt; from different &lt;em&gt;operators&lt;/em&gt; at the same time, usually &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/flink-architecture.html#tasks-and-operator-chains&quot;&gt;chained&lt;/a&gt; together.&lt;/p&gt;
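&lt;p&gt;Whether adjacent &lt;em&gt;subtasks&lt;/em&gt; end up chained can also be influenced from the DataStream API. A small, hedged sketch (the operations themselves are arbitrary):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainingExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1, 2, 3)
                .map(value -&amp;gt; value * 2)    // chained with upstream operators where possible
                .map(value -&amp;gt; value + 1)
                .startNewChain()               // this operator starts a new chain
                .filter(value -&amp;gt; value &amp;gt; 2)
                .disableChaining()             // this operator is never chained
                .print();

        env.execute(&quot;Chaining example&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;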
&lt;h3 id=&quot;scheduling-strategy&quot;&gt;Scheduling strategy&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/internals/job_scheduling.html#scheduling&quot;&gt;Scheduling&lt;/a&gt;&lt;/em&gt;
in Flink is a process of searching for and allocating appropriate resources (&lt;em&gt;slots&lt;/em&gt;) from the &lt;em&gt;TaskManagers&lt;/em&gt; to run the &lt;em&gt;subtasks&lt;/em&gt; and produce results.
The &lt;em&gt;scheduling strategy&lt;/em&gt; reacts to scheduling events (such as the job starting, or a &lt;em&gt;subtask&lt;/em&gt; failing or finishing) to decide which &lt;em&gt;subtasks&lt;/em&gt; to deploy next.&lt;/p&gt;
&lt;p&gt;For instance, to avoid wasting resources, it does not make sense to schedule &lt;em&gt;subtasks&lt;/em&gt; whose inputs are not yet ready to be consumed.
Another example is scheduling &lt;em&gt;subtasks&lt;/em&gt; that are connected with &lt;em&gt;pipelined&lt;/em&gt; edges together, to avoid deadlocks caused by backpressure.&lt;/p&gt;
</description>
<pubDate>Tue, 15 Dec 2020 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/2020/12/15/pipelined-region-sheduling.html</link>
<guid isPermaLink="true">/2020/12/15/pipelined-region-sheduling.html</guid>
</item>
<item>
<title>Apache Flink 1.12.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is excited to announce the release of Flink 1.12.0! Close to 300 contributors worked on over 1k threads to bring significant improvements to usability as well as new features that simplify (and unify) Flink handling across the API stack.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Release Highlights&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The community has added support for &lt;strong&gt;efficient batch execution&lt;/strong&gt; in the DataStream API. This is the next major milestone towards achieving a truly unified runtime for both batch and stream processing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Kubernetes-based High Availability (HA)&lt;/strong&gt; was implemented as an alternative to ZooKeeper for highly available production setups.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The Kafka SQL connector has been extended to work in &lt;strong&gt;upsert mode&lt;/strong&gt;, supported by the ability to handle &lt;strong&gt;connector metadata&lt;/strong&gt; in SQL DDL. &lt;strong&gt;Temporal table joins&lt;/strong&gt; can now also be fully expressed in SQL, no longer depending on the Table API.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Support for the &lt;strong&gt;DataStream API in PyFlink&lt;/strong&gt; expands its usage to more complex scenarios that require fine-grained control over state and time, and it’s now possible to deploy PyFlink jobs natively on &lt;strong&gt;Kubernetes&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This blog post describes all major new features and improvements, important changes to be aware of and what to expect moving forward.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features-and-improvements&quot; id=&quot;markdown-toc-new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#batch-execution-mode-in-the-datastream-api&quot; id=&quot;markdown-toc-batch-execution-mode-in-the-datastream-api&quot;&gt;Batch Execution Mode in the DataStream API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-data-sink-api-beta&quot; id=&quot;markdown-toc-new-data-sink-api-beta&quot;&gt;New Data Sink API (Beta)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#kubernetes-high-availability-ha-service&quot; id=&quot;markdown-toc-kubernetes-high-availability-ha-service&quot;&gt;Kubernetes High Availability (HA) Service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements&quot; id=&quot;markdown-toc-other-improvements&quot;&gt;Other Improvements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-metadata-handling-in-sql-connectors&quot; id=&quot;markdown-toc-table-apisql-metadata-handling-in-sql-connectors&quot;&gt;Table API/SQL: Metadata Handling in SQL Connectors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-upsert-kafka-connector&quot; id=&quot;markdown-toc-table-apisql-upsert-kafka-connector&quot;&gt;Table API/SQL: Upsert Kafka Connector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-support-for-temporal-table-joins-in-sql&quot; id=&quot;markdown-toc-table-apisql-support-for-temporal-table-joins-in-sql&quot;&gt;Table API/SQL: Support for Temporal Table Joins in SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements-to-the-table-apisql&quot; id=&quot;markdown-toc-other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pyflink-python-datastream-api&quot; id=&quot;markdown-toc-pyflink-python-datastream-api&quot;&gt;PyFlink: Python DataStream API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements-to-pyflink&quot; id=&quot;markdown-toc-other-improvements-to-pyflink&quot;&gt;Other Improvements to PyFlink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#important-changes&quot; id=&quot;markdown-toc-important-changes&quot;&gt;Important Changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt; of the Flink website, and the most recent distribution of PyFlink is available on &lt;a href=&quot;https://pypi.org/project/apache-flink/&quot;&gt;PyPI&lt;/a&gt;. Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/release-notes/flink-1.12.html&quot;&gt;release notes&lt;/a&gt; carefully, and check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12348263&amp;amp;styleName=Html&amp;amp;projectId=12315522&quot;&gt;release changelog&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/&quot;&gt;updated documentation&lt;/a&gt; for more details.&lt;/p&gt;
&lt;p&gt;We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt; or &lt;a href=&quot;https://issues.apache.org/jira/projects/FLINK/summary&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/h2&gt;
&lt;h3 id=&quot;batch-execution-mode-in-the-datastream-api&quot;&gt;Batch Execution Mode in the DataStream API&lt;/h3&gt;
&lt;p&gt;Flink’s core APIs have developed organically over the lifetime of the project, and were initially designed with specific use cases in mind. And while the Table API/SQL already has unified operators, using lower-level abstractions still requires you to choose between two semantically different APIs for batch (DataSet API) and streaming (DataStream API). Since &lt;em&gt;a batch is a subset of an unbounded stream&lt;/em&gt;, there are some clear advantages to consolidating them under a single API:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reusability:&lt;/strong&gt; efficient batch and stream processing under the same API would allow you to easily switch between both execution modes without rewriting any code. So, a job could be easily reused to process real-time and historical data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Operational simplicity:&lt;/strong&gt; providing a unified API would mean using a single set of connectors, maintaining a single codebase and being able to easily implement mixed execution pipelines &lt;em&gt;e.g.&lt;/em&gt; for use cases like backfilling.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With these advantages in mind, the community has taken the first step towards the unification of the DataStream API: supporting efficient batch execution (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Batch+execution+for+the+DataStream+API&quot;&gt;FLIP-134&lt;/a&gt;). This means that, in the long run, the DataSet API will be deprecated and subsumed by the DataStream API and the Table API/SQL (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741&quot;&gt;FLIP-131&lt;/a&gt;). For an overview of the unification effort, refer to &lt;a href=&quot;https://youtu.be/z9ye4jzp4DQ&quot;&gt;this&lt;/a&gt; recent Flink Forward talk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Batch for Bounded Streams&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You could already use the DataStream API to process bounded streams (&lt;em&gt;e.g.&lt;/em&gt; files), with the limitation that the runtime is not “aware” that the job is bounded. To optimize the runtime for bounded input, the new &lt;code&gt;BATCH&lt;/code&gt; mode execution uses sort-based shuffles with aggregations purely in-memory and an improved scheduling strategy (&lt;em&gt;see &lt;a href=&quot;#pipelined-region-scheduling-flip-119&quot;&gt;Pipelined Region Scheduling&lt;/a&gt;&lt;/em&gt;). As a result, &lt;code&gt;BATCH&lt;/code&gt; mode execution in the DataStream API already comes very close to the performance of the DataSet API in Flink 1.12. For more details on the performance benchmark, check the original proposal (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-140%3A+Introduce+batch-style+execution+for+bounded+keyed+streams&quot;&gt;FLIP-140&lt;/a&gt;).&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-12-08-release-1.12.0/1.png&quot; width=&quot;600px&quot; /&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;In Flink 1.12, the default execution mode is &lt;code&gt;STREAMING&lt;/code&gt;. To configure a job to run in &lt;code&gt;BATCH&lt;/code&gt; mode, you can set the configuration when submitting a job:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;bin/flink run -Dexecution.runtime-mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;BATCH examples/streaming/WordCount.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or, you can configure it programmatically:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setRuntimeMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RuntimeExecutionMode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;BATCH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;div class=&quot;alert alert-info small&quot;&gt;
&lt;p&gt;&lt;b&gt;Note:&lt;/b&gt; Although the DataSet API has not been deprecated yet, we recommend that users give preference to the DataStream API with &lt;code&gt;BATCH&lt;/code&gt; execution mode for new batch jobs, and consider migrating existing DataSet jobs.&lt;/p&gt;
&lt;/div&gt;
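&lt;p&gt;As a minimal end-to-end sketch (the hard-coded input elements are placeholders), the same DataStream program can run in &lt;code&gt;BATCH&lt;/code&gt; mode as long as all of its sources are bounded:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BoundedWordCount {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // With bounded input, BATCH mode lets the runtime use blocking shuffles
        // and pipelined region scheduling instead of streaming-style execution.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        env.fromElements(&quot;flink&quot;, &quot;batch&quot;, &quot;flink&quot;)
                .map(word -&amp;gt; Tuple2.of(word, 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .keyBy(tuple -&amp;gt; tuple.f0)
                .sum(1)
                .print();

        env.execute(&quot;Bounded word count&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;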
&lt;h3 id=&quot;new-data-sink-api-beta&quot;&gt;New Data Sink API (Beta)&lt;/h3&gt;
&lt;p&gt;Ensuring that connectors can work for both execution modes has already been covered for data sources in the &lt;a href=&quot;https://flink.apache.org/news/2020/07/06/release-1.11.0.html#new-data-source-api-beta&quot;&gt;previous release&lt;/a&gt;, so in Flink 1.12 the community focused on implementing a unified Data Sink API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API&quot;&gt;FLIP-143&lt;/a&gt;). The new abstraction introduces a write/commit protocol and a more modular interface where the individual components are transparently exposed to the framework.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;Sink&lt;/em&gt; implementor will have to provide the &lt;strong&gt;what&lt;/strong&gt; and &lt;strong&gt;how&lt;/strong&gt;: a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/api/connector/sink/SinkWriter.html&quot;&gt;&lt;em&gt;SinkWriter&lt;/em&gt;&lt;/a&gt; that writes data and outputs what needs to be committed (i.e. committables); and a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/api/connector/sink/Committer.html&quot;&gt;&lt;em&gt;Committer&lt;/em&gt;&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/api/connector/sink/GlobalCommitter.html&quot;&gt;&lt;em&gt;GlobalCommitter&lt;/em&gt;&lt;/a&gt; that encapsulate how to handle the committables. The framework is responsible for the &lt;strong&gt;when&lt;/strong&gt; and &lt;strong&gt;where&lt;/strong&gt;: at what time and on which machine or process to commit.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-12-08-release-1.12.0/2.png&quot; width=&quot;700px&quot; /&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;This more modular abstraction made it possible to support different runtime implementations for the &lt;code&gt;BATCH&lt;/code&gt; and &lt;code&gt;STREAMING&lt;/code&gt; execution modes that are efficient for their intended purpose, while using just one, unified sink implementation. In Flink 1.12, the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/connectors/file_sink.html&quot;&gt;FileSink connector&lt;/a&gt; is the unified drop-in replacement for StreamingFileSink (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19758&quot;&gt;FLINK-19758&lt;/a&gt;). The remaining connectors will be ported to the new interfaces in future releases.&lt;/p&gt;
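&lt;p&gt;Conceptually, the contract can be pictured with the deliberately simplified stand-in interfaces below. They only illustrate the split between writing and committing and are &lt;em&gt;not&lt;/em&gt; the actual &lt;code&gt;org.apache.flink.api.connector.sink&lt;/code&gt; signatures, which carry additional methods, state handling and generic parameters.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.ArrayList;
import java.util.List;

public class SinkContractSketch {

    // Illustrative stand-in for the &quot;what&quot;: stage incoming elements and hand
    // committables back to the framework when asked.
    interface SimpleSinkWriter&amp;lt;IN, COMMITTABLE&amp;gt; {
        void write(IN element);
        List&amp;lt;COMMITTABLE&amp;gt; prepareCommit();
    }

    // Illustrative stand-in for the &quot;how&quot;: make previously staged data visible.
    interface SimpleCommitter&amp;lt;COMMITTABLE&amp;gt; {
        void commit(List&amp;lt;COMMITTABLE&amp;gt; committables);
    }

    public static void main(String[] args) {
        // A toy writer that stages strings and emits them as committables.
        List&amp;lt;String&amp;gt; staged = new ArrayList&amp;lt;&amp;gt;();
        SimpleSinkWriter&amp;lt;String, String&amp;gt; writer = new SimpleSinkWriter&amp;lt;String, String&amp;gt;() {
            public void write(String element) { staged.add(element); }
            public List&amp;lt;String&amp;gt; prepareCommit() { return new ArrayList&amp;lt;&amp;gt;(staged); }
        };
        SimpleCommitter&amp;lt;String&amp;gt; committer =
                committables -&amp;gt; System.out.println(&quot;committing &quot; + committables);

        // The framework owns the &quot;when&quot; and &quot;where&quot;: it asks for committables at a
        // checkpoint (STREAMING) or at the end of input (BATCH) and runs the committer.
        writer.write(&quot;record-1&quot;);
        writer.write(&quot;record-2&quot;);
        committer.commit(writer.prepareCommit());
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;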
&lt;h3 id=&quot;kubernetes-high-availability-ha-service&quot;&gt;Kubernetes High Availability (HA) Service&lt;/h3&gt;
&lt;p&gt;Kubernetes provides built-in functionalities that Flink can leverage for JobManager failover, instead of relying on &lt;a href=&quot;https://zookeeper.apache.org/&quot;&gt;ZooKeeper&lt;/a&gt;. To enable a “ZooKeeperless” HA setup, the community implemented a Kubernetes HA service in Flink 1.12 (&lt;a href=&quot;https://cwiki.apache.org/confluence/x/H0V4CQ&quot;&gt;FLIP-144&lt;/a&gt;). The service is built on the same &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/runtime/highavailability/HighAvailabilityServices.html&quot;&gt;base interface&lt;/a&gt; as the ZooKeeper implementation and uses Kubernetes’ &lt;a href=&quot;https://kubernetes.io/docs/concepts/configuration/configmap/&quot;&gt;ConfigMap&lt;/a&gt; objects to handle all the metadata needed to recover from a JobManager failure. For more details and examples on how to configure a highly available Kubernetes cluster, check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/deployment/ha/kubernetes_ha.html&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;alert alert-info small&quot;&gt;
&lt;p&gt;&lt;b&gt;Note:&lt;/b&gt; This does not mean that the ZooKeeper dependency will be dropped, just that there will be an alternative for users of Flink on Kubernetes.&lt;/p&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;other-improvements&quot;&gt;Other Improvements&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Migration of existing connectors to the new Data Source API&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The previous release introduced a new Data Source API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface&quot;&gt;FLIP-27&lt;/a&gt;), which makes it possible to implement connectors that work both as bounded (batch) and unbounded (streaming) sources. In Flink 1.12, the community started porting existing source connectors to the new interfaces, starting with the FileSystem connector (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19161&quot;&gt;FLINK-19161&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;alert alert-danger small&quot;&gt;
&lt;p&gt;&lt;b&gt;Attention:&lt;/b&gt; The unified source implementations will be completely separate connectors that are not snapshot-compatible with their legacy counterparts.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Pipelined Region Scheduling (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling#FLIP119PipelinedRegionScheduling-BulkSlotAllocation&quot;&gt;FLIP-119&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Flink’s scheduler has been largely designed to address batch and streaming workloads separately. This release introduces a &lt;strong&gt;unified&lt;/strong&gt; scheduling strategy that identifies blocking data exchanges to break down the execution graph into &lt;em&gt;pipelined regions&lt;/em&gt;. This makes it possible to schedule each region only when there’s data to process, to deploy it only once all the required resources are available, and to restart failed regions independently. In particular for batch jobs, the new strategy leads to more efficient resource utilization and eliminates deadlocks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Support for Sort-Merge Shuffles (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-148%3A+Introduce+Sort-Merge+Based+Blocking+Shuffle+to+Flink&quot;&gt;FLIP-148&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To improve the stability, performance and resource utilization of large-scale batch jobs, the community introduced sort-merge shuffle as an alternative to the original shuffle implementation that Flink already used. This approach can reduce shuffle time &lt;a href=&quot;https://www.mail-archive.com/dev@flink.apache.org/msg42472.html&quot;&gt;significantly&lt;/a&gt;, and uses fewer file handles and file write buffers (which is problematic for large-scale jobs). Further optimizations will be implemented in upcoming releases (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19614&quot;&gt;FLINK-19614&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;alert alert-danger small&quot;&gt;
&lt;p&gt;&lt;b&gt;Attention:&lt;/b&gt; This feature is experimental and not enabled by default. To enable sort-merge shuffles, you can configure a reasonable minimum parallelism threshold in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/deployment/config.html#taskmanager-network-sort-shuffle-min-parallelism&quot;&gt;TaskManager network configuration options&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Improvements to the Flink WebUI (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal&quot;&gt;FLIP-75&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As a continuation of the series of improvements to the Flink WebUI kicked off in the last release, the community worked on exposing JobManager’s memory-related metrics and configuration parameters on the WebUI (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-104%3A+Add+More+Metrics+to+Jobmanager&quot;&gt;FLIP-104&lt;/a&gt;). The TaskManager’s metrics page has also been updated to reflect the &lt;a href=&quot;https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html&quot;&gt;changes to the TaskManager memory model&lt;/a&gt; introduced in Flink 1.10 (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager&quot;&gt;FLIP-102&lt;/a&gt;), adding new metrics for Managed Memory, Network Memory and Metaspace.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;table-apisql-metadata-handling-in-sql-connectors&quot;&gt;Table API/SQL: Metadata Handling in SQL Connectors&lt;/h3&gt;
&lt;p&gt;Some sources (and formats) expose additional fields as metadata that can be valuable for users to process along with record data. A common example is Kafka, where you might want to &lt;em&gt;e.g.&lt;/em&gt; access offset, partition or topic information, read/write the record key or use embedded metadata timestamps for time-based operations.
With the new release, Flink SQL supports &lt;strong&gt;metadata columns&lt;/strong&gt; to read and write connector- and format-specific fields for every row of a table (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors&quot;&gt;FLIP-107&lt;/a&gt;). These columns are declared in the &lt;code&gt;CREATE TABLE&lt;/code&gt; statement using the &lt;code&gt;METADATA&lt;/code&gt; (reserved) keyword.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kafka_table&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;event_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;METADATA&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;timestamp&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- access Kafka &amp;#39;timestamp&amp;#39; metadata&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;headers&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;MAP&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BYTES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;METADATA&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- access Kafka &amp;#39;headers&amp;#39; metadata&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;topic&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;test-topic&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;avro&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In Flink 1.12, metadata is exposed for the &lt;strong&gt;Kafka&lt;/strong&gt; and &lt;strong&gt;Kinesis&lt;/strong&gt; connectors, with work on the FileSystem connector already planned (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19903&quot;&gt;FLINK-19903&lt;/a&gt;). Due to the more complex structure of Kafka records, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html#key-format&quot;&gt;new properties&lt;/a&gt; were also specifically implemented for the Kafka connector to control how to handle the key/value pairs. For a complete overview of metadata support in Flink SQL, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/&quot;&gt;documentation&lt;/a&gt; for each connector, as well as the motivating use cases in the original proposal.&lt;/p&gt;
&lt;h3 id=&quot;table-apisql-upsert-kafka-connector&quot;&gt;Table API/SQL: Upsert Kafka Connector&lt;/h3&gt;
&lt;p&gt;For some use cases, like interpreting compacted topics or writing out (updating) aggregated results, it’s necessary to handle Kafka record keys as &lt;em&gt;true&lt;/em&gt; primary keys that can determine what should be inserted, deleted or updated. To enable this, the community created a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/upsert-kafka.html&quot;&gt;dedicated upsert connector&lt;/a&gt; (&lt;code&gt;upsert-kafka&lt;/code&gt;) that extends the base implementation to work in &lt;em&gt;upsert&lt;/em&gt; mode (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19857&quot;&gt;FLIP-149&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The new &lt;code&gt;upsert-kafka&lt;/code&gt; connector can be used for sources and sinks, and provides the &lt;strong&gt;same base functionality&lt;/strong&gt; and &lt;strong&gt;persistence guarantees&lt;/strong&gt; as the existing Kafka connector, as it reuses most of its code under the hood. To use the &lt;code&gt;upsert-kafka&lt;/code&gt; connector, you must define a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/sql/create.html#primary-key&quot;&gt;primary key constraint&lt;/a&gt; on table creation, as well as specify the (de)serialization format for the key (&lt;code&gt;key.format&lt;/code&gt;) and value (&lt;code&gt;value.format&lt;/code&gt;).&lt;/p&gt;
&lt;h3 id=&quot;table-apisql-support-for-temporal-table-joins-in-sql&quot;&gt;Table API/SQL: Support for Temporal Table Joins in SQL&lt;/h3&gt;
&lt;p&gt;Instead of creating a temporal table function to look up against a table at a certain point in time, you can now simply use the standard SQL clause &lt;code&gt;FOR SYSTEM_TIME AS OF&lt;/code&gt; (SQL:2011) to express a &lt;strong&gt;temporal table join&lt;/strong&gt;. In addition, temporal joins are now supported against &lt;em&gt;any&lt;/em&gt; kind of table that has a time attribute and a primary key, and not just &lt;em&gt;append-only&lt;/em&gt; tables. This unlocks a new set of use cases, like performing temporal joins directly against Kafka compacted topics or database changelogs (e.g. from Debezium).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;c1&quot;&gt;-- Table backed by a Kafka topic&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;order_id&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;amount&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;30&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SECOND&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- Table backed by a Kafka compacted topic&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;latest_rates&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;currency_rate&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;DECIMAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;38&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;currency_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currency_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currency_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;5&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SECOND&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;PRIMARY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;KEY&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ENFORCED&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;upsert-kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- Event-time temporal table join&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;amount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency_rate&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;amount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;latest_rates&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SYSTEM_TIME&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OF&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The previous example also shows how you can take advantage of the new &lt;code&gt;upsert-kafka&lt;/code&gt; connector in the context of temporal table joins.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hive Tables in Temporal Table Joins&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You can also perform temporal table joins against Hive tables by either automatically reading the latest table partition as a temporal table (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19644&quot;&gt;FLINK-19644&lt;/a&gt;) or the whole table as a bounded stream tracking the latest version at execution time. Refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#temporal-table-join&quot;&gt;documentation&lt;/a&gt; for examples of using Hive tables in temporal table joins.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Kinesis Flink SQL Connector (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18858&quot;&gt;FLINK-18858&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;From Flink 1.12, Amazon Kinesis Data Streams (KDS) is natively supported as a source/sink also in the Table API/SQL. The new Kinesis SQL connector ships with support for Enhanced Fan-Out (EFO) and Sink Partitioning. For a complete overview of supported features, configuration options and exposed metadata, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/kinesis.html&quot;&gt;updated documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Streaming Sink Compaction in the FileSystem/Hive Connector (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19345&quot;&gt;FLINK-19345&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Many bulk formats, such as Parquet, are most efficient when written as large files; this is a challenge when frequent checkpointing is enabled, as too many small files are created (and need to be rolled on checkpoint). In Flink 1.12, the file sink supports &lt;strong&gt;file compaction&lt;/strong&gt;, allowing jobs to retain smaller checkpoint intervals without generating a large number of files. To enable file compaction, you can set &lt;code&gt;auto-compaction=true&lt;/code&gt; in the properties of the FileSystem connector, as described in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/filesystem.html#file-compaction&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
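&lt;p&gt;As a rough sketch of what this looks like in practice (shown through PyFlink’s SQL interface; the table name, schema, path and compaction target size are made-up values, and the complete set of compaction options is listed in the linked documentation), a filesystem sink with compaction enabled could be declared like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(
    EnvironmentSettings.new_instance().in_streaming_mode().use_blink_planner().build())

# Hypothetical example: a Parquet filesystem sink that merges the small files
# produced between checkpoints into larger ones.
t_env.execute_sql("""
    CREATE TABLE compacted_sink (
        user_id STRING,
        amount  DOUBLE,
        dt      STRING
    ) PARTITIONED BY (dt) WITH (
        'connector' = 'filesystem',
        'path' = 'file:///tmp/compacted-output',
        'format' = 'parquet',
        'auto-compaction' = 'true',
        'compaction.file-size' = '128MB'
    )
""")
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;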
&lt;p&gt;&lt;strong&gt;Watermark Pushdown in the Kafka Connector (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-20041&quot;&gt;FLINK-20041&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To ensure correctness when consuming from Kafka, it’s generally preferable to generate watermarks on a per-partition basis, since the out-of-orderness within a partition is usually lower than across all partitions. Flink will now push down watermark strategies to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html#source-per-partition-watermarks&quot;&gt;emit &lt;strong&gt;per-partition watermarks&lt;/strong&gt;&lt;/a&gt; from within the Kafka consumer. The output watermark of the source will be determined by the minimum watermark across the partitions it reads, leading to better (i.e. closer to real-time) watermarking. Watermark pushdown also lets you configure per-partition &lt;strong&gt;idleness detection&lt;/strong&gt; to prevent idle partitions from holding back the event time progress of the entire application.&lt;/p&gt;
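&lt;p&gt;To sketch how this is used (the table, topic and field names below are invented, and the 30-second idle timeout is an arbitrary value for the &lt;code&gt;table.exec.source.idle-timeout&lt;/code&gt; option described in the documentation), a watermark declared in the DDL of a Kafka source is now generated per partition, and idleness detection can be switched on via the table configuration:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(
    EnvironmentSettings.new_instance().in_streaming_mode().use_blink_planner().build())

# Partitions that stop emitting records for 30 s are marked idle, so they do
# not hold back the event-time progress of the whole source.
t_env.get_config().get_configuration().set_string(
    "table.exec.source.idle-timeout", "30 s")

# The watermark below is now generated inside the Kafka consumer, per
# partition; the source emits the minimum watermark across its partitions.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        click_time TIMESTAMP(3),
        WATERMARK FOR click_time AS click_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'demo',
        'format' = 'json'
    )
""")
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;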
&lt;p&gt;&lt;strong&gt;Newly Supported Formats&lt;/strong&gt;&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot; style=&quot;font-size:95%&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Supported Connectors&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/formats/avro-confluent.html&quot;&gt;Avro Schema Registry&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Read and write data serialized with the Confluent Schema Registry &lt;a href=&quot;https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-avro.html&quot;&gt;KafkaAvroSerializer&lt;/a&gt;.&lt;/td&gt;
&lt;td&gt;Kafka, Upsert Kafka&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/formats/debezium.html&quot;&gt;Debezium Avro&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Read and write Debezium records serialized with the Confluent Schema Registry KafkaAvroSerializer.&lt;/td&gt;
&lt;td&gt;Kafka&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/formats/maxwell.html&quot;&gt;Maxwell (CDC)&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Read and write Maxwell JSON records.&lt;/td&gt;
&lt;td&gt;
&lt;p&gt;Kafka&lt;/p&gt;
&lt;p&gt;FileSystem&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/formats/raw.html&quot;&gt;Raw&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Read and write raw (byte-based) values as a single column.&lt;/td&gt;
&lt;td&gt;
&lt;p&gt;Kafka, Upsert Kafka&lt;/p&gt;
&lt;p&gt;Kinesis&lt;/p&gt;
&lt;p&gt;FileSystem&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Multi-input Operator for Join Optimization (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19621&quot;&gt;FLINK-19621&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To eliminate unnecessary serialization and data spilling and improve the performance of batch and streaming Table API/SQL jobs, the default planner now leverages the N-ary stream operator introduced in the last release (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-92%3A+Add+N-Ary+Stream+Operator+in+Flink&quot;&gt;FLIP-92&lt;/a&gt;) to implement the “chaining” of operators connected by forward edges.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Type Inference for Table API UDAFs (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-65%3A+New+type+inference+for+Table+API+UDFs&quot;&gt;FLIP-65&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This release concluded the work started in Flink 1.9 on a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/types.html#data-types&quot;&gt;new data type system&lt;/a&gt; for the Table API, with the exposure of aggregate functions (UDAFs) to the new type system. From Flink 1.12, UDAFs behave similarly to scalar and table functions, and support all data types.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;pyflink-python-datastream-api&quot;&gt;PyFlink: Python DataStream API&lt;/h3&gt;
&lt;p&gt;To expand the usability of PyFlink, this release introduces a first version of the Python DataStream API (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298&quot;&gt;FLIP-130&lt;/a&gt;) with support for stateless operations (e.g. Map, FlatMap, Filter, KeyBy).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.common.typeinfo&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MapFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyMapFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MapFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data_stream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_collection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;type_info&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mapped_stream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyMapFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mapped_stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;datastream job&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To give the Python DataStream API a try, you can &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/python/installation.html#installation-of-pyflink&quot;&gt;install PyFlink&lt;/a&gt; and check out &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/python/datastream_tutorial.html&quot;&gt;this tutorial&lt;/a&gt; that guides you through building a simple streaming application.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;other-improvements-to-pyflink&quot;&gt;Other Improvements to PyFlink&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;PyFlink Jobs on Kubernetes (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17480&quot;&gt;FLINK-17480&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In addition to standalone and YARN deployments, PyFlink jobs can now also be deployed natively on Kubernetes. The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/deployment/resource-providers/native_kubernetes.html&quot;&gt;deployment documentation&lt;/a&gt; has detailed instructions on how to start a &lt;em&gt;session&lt;/em&gt; or &lt;em&gt;application&lt;/em&gt; cluster on Kubernetes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;User-defined Aggregate Functions (UDAFs)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;From Flink 1.12, you can define and register UDAFs in PyFlink (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API&quot;&gt;FLIP-139&lt;/a&gt;). In contrast to a normal UDF, which doesn’t handle state and operates on a single row at a time, a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/python/table-api-users-guide/udfs/python_udfs.html#aggregate-functions&quot;&gt;UDAF&lt;/a&gt; is stateful and can be used to compute custom aggregations over multiple input rows. To benefit from vectorization, you can also use &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/python/table-api-users-guide/udfs/vectorized_python_udfs.html#vectorized-aggregate-functions&quot;&gt;Pandas UDAFs&lt;/a&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink?src=jira&quot;&gt;FLIP-137&lt;/a&gt;) (up to 10x faster).&lt;/p&gt;
&lt;div class=&quot;alert alert-info small&quot;&gt;
&lt;p&gt;&lt;b&gt;Note:&lt;/b&gt; General UDAFs are only supported for group aggregations and in &lt;em&gt;streaming&lt;/em&gt; mode. For &lt;em&gt;batch&lt;/em&gt; mode or window aggregations, use Pandas UDAFs.&lt;/p&gt;
&lt;/div&gt;
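&lt;p&gt;Below is a minimal sketch of a vectorized (Pandas) UDAF following the pattern described above; the function simply averages its input column, and the table and column names in the usage comment are hypothetical:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import DataTypes
from pyflink.table.udf import udaf

# A vectorized aggregate function: the grouped column arrives as a
# pandas.Series and the function returns a single value per group.
mean_udaf = udaf(lambda v: v.mean(),
                 result_type=DataTypes.FLOAT(),
                 func_type="pandas")

# Once registered with a TableEnvironment, it can be used like any built-in
# aggregate, e.g. (hypothetical names):
#   SELECT user_id, mean_udaf(amount) FROM orders GROUP BY user_id
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;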
&lt;hr /&gt;
&lt;h2 id=&quot;important-changes&quot;&gt;Important Changes&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19319&quot;&gt;FLINK-19319&lt;/a&gt;] The default stream time characteristic has been changed to &lt;code&gt;EventTime&lt;/code&gt;, so you no longer need to call &lt;code&gt;StreamExecutionEnvironment.setStreamTimeCharacteristic()&lt;/code&gt; to enable event time support (see the short snippet after this list).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19278&quot;&gt;FLINK-19278&lt;/a&gt;] Flink now relies on Scala Macros 2.1.1, so Scala versions &amp;lt; 2.11.11 are no longer supported.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19152&quot;&gt;FLINK-19152&lt;/a&gt;] The Kafka 0.10.x and 0.11.x connectors have been removed with this release. If you’re still using these versions, please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/connectors/kafka.html&quot;&gt;documentation&lt;/a&gt; to learn how to upgrade to the universal Kafka connector.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18795&quot;&gt;FLINK-18795&lt;/a&gt;] The HBase connector has been upgraded to the latest stable version (2.2.3).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17877&quot;&gt;FLINK-17877&lt;/a&gt;] PyFlink now supports Python 3.8.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18738&quot;&gt;FLINK-18738&lt;/a&gt;] To align with &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management&quot;&gt;FLIP-53&lt;/a&gt;, managed memory is now the default also for Python workers. The configurations &lt;code&gt;python.fn-execution.buffer.memory.size&lt;/code&gt; and &lt;code&gt;python.fn-execution.framework.memory.size&lt;/code&gt; have been removed and will not take effect anymore.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
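&lt;p&gt;As a small illustration of the first item above (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19319&quot;&gt;FLINK-19319&lt;/a&gt;), here is a minimal PyFlink sketch: the explicit call that used to be required to enable event time is simply not needed anymore.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.datastream import StreamExecutionEnvironment, TimeCharacteristic

env = StreamExecutionEnvironment.get_execution_environment()

# Before Flink 1.12 the following call was required to enable event time;
# from 1.12 on, event time is already the default stream time characteristic:
# env.set_stream_time_characteristic(TimeCharacteristic.EventTime)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;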
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/release-notes/flink-1.12.html&quot;&gt;release notes&lt;/a&gt; carefully for a detailed list of changes and new features if you plan to upgrade your setup to Flink 1.12. This version is API-compatible with previous 1.x releases for APIs annotated with the @Public annotation.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank each and every one of the 300 contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Abhijit Shandilya, Aditya Agarwal, Alan Su, Alexander Alexandrov, Alexander Fedulov, Alexey Trenikhin, Aljoscha Krettek, Allen Madsen, Andrei Bulgakov, Andrey Zagrebin, Arvid Heise, Authuir, Bairos, Bartosz Krasinski, Benchao Li, Brandon, Brian Zhou, C08061, Canbin Zheng, Cedric Chen, Chesnay Schepler, Chris Nix, Congxian Qiu, DG-Wangtao, Da(Dash)Shen, Dan Hill, Daniel Magyar, Danish Amjad, Danny Chan, Danny Cranmer, David Anderson, Dawid Wysakowicz, Devin Thomson, Dian Fu, Dongxu Wang, Dylan Forciea, Echo Lee, Etienne Chauchot, Fabian Paul, Felipe Lolas, Fin-Chan, Fin-chan, Flavio Pompermaier, Flora Tao, Fokko Driesprong, Gao Yun, Gary Yao, Ghildiyal, GitHub, Grebennikov Roman, GuoWei Ma, Gyula Fora, Hequn Cheng, Herman, Hong Teoh, HuangXiao, HuangXingBo, Husky Zeng, Hyeonseop Lee, I. Raleigh, Ivan, Jacky Lau, Jark Wu, Jaskaran Bindra, Jeff Yang, Jeff Zhang, Jiangjie (Becket) Qin, Jiatao Tao, Jiayi Liao, Jiayi-Liao, Jiezhi.G, Jimmy.Zhou, Jindrich Vimr, Jingsong Lee, JingsongLi, Joey Echeverria, Juha Mynttinen, Jun Qin, Jörn Kottmann, Karim Mansour, Kevin Bohinski, Kezhu Wang, Konstantin Knauf, Kostas Kloudas, Kurt Young, Lee Do-Kyeong, Leonard Xu, Lijie Wang, Liu Jiangang, Lorenzo Nicora, LululuAlu, Luxios22, Marta Paes Moreira, Mateusz Sabat, Matthias Pohl, Maximilian Michels, Miklos Gergely, Milan Nikl, Nico Kruber, Niel Hu, Niels Basjes, Oleksandr Nitavskyi, Paul Lam, Peng, PengFei Li, PengchengLiu, Peter Huang, Piotr Nowojski, PoojaChandak, Qingsheng Ren, Qishang Zhong, Richard Deurwaarder, Richard Moorhead, Robert Metzger, Roc Marshal, Roey Shem Tov, Roman, Roman Khachatryan, Rong Rong, Rui Li, Seth Wiesman, Shawn Huang, ShawnHx, Shengkai, Shuiqiang Chen, Shuo Cheng, SteNicholas, Stephan Ewen, Steve Whelan, Steven Wu, Tartarus0zm, Terry Wang, Thesharing, Thomas Weise, Till Rohrmann, Timo Walther, TsReaper, Tzu-Li (Gordon) Tai, Ufuk Celebi, V1ncentzzZ, Vladimirs Kotovs, Wei Zhong, Weike DONG, XBaith, Xiaogang Zhou, Xiaoguang Sun, Xingcan Cui, Xintong Song, Xuannan, Yang Liu, Yangze Guo, Yichao Yang, Yikun Jiang, Yu Li, Yuan Mei, Yubin Li, Yun Gao, Yun Tang, Yun Wang, Zhenhua Yang, Zhijiang, Zhu Zhu, acesine, acqua.csq, austin ce, bigdata-ny, billyrrr, caozhen, caozhen1937, chaojianok, chenkai, chris, cpugputpu, dalong01.liu, darionyaphet, dijie, diohabara, dufeng1010, fangliang, felixzheng, gkrishna, gm7y8, godfrey he, godfreyhe, gsralex, haseeb1431, hequn.chq, hequn8128, houmaozheng, huangxiao, huangxingbo, huzekang, jPrest, jasonlee, jinfeng, jinhai, johnm, jxeditor, kecheng, kevin.cyj, kevinzwx, klion26, leiqiang, libenchao, lijiewang.wlj, liufangliang, liujiangang, liuyongvs, liuyufei9527, lsy, lzy3261944, mans2singh, molsionmo, openopen2, pengweibo, rinkako, sanshi@wwdz.onaliyun.com, secondChoice, seunjjs, shaokan.cao, shizhengchao, shizk233, shouweikun, spurthi chaganti, sujun, sunjincheng121, sxnan, tison, totorooo, venn, vthinkxie, wangsong2, wangtong, wangxiyuan, wangxlong, wangyang0918, wangzzu, weizheng92, whlwanghailong, wineandcheeze, wooplevip, wtog, wudi28, wxp, xcomp, xiaoHoly, xiaolong.wang, yangyichao-mango, yingshin, yushengnan, yushujun, yuzhao.cyz, zhangap, zhangmang, zhangzhanchum, zhangzhanchun, zhangzhanhua, zhangzp, zheyu, zhijiang, zhushang, zhuxiaoshang, zlzhang0122, zodo, zoudan, zouzhiye&lt;/p&gt;
</description>
<pubDate>Thu, 10 Dec 2020 09:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/12/10/release-1.12.0.html</link>
<guid isPermaLink="true">/news/2020/12/10/release-1.12.0.html</guid>
</item>
<item>
<title>Stateful Functions 2.2.1 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community released the first bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.1.&lt;/p&gt;
&lt;p&gt;This release fixes a critical bug that causes restoring the Stateful Functions cluster from snapshots (checkpoints or
savepoints) to fail under certain conditions. Starting from this release, StateFun creates snapshots in a more
robust format that allows them to be restored safely going forward.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;We strongly recommend all users to upgrade to 2.2.1&lt;/i&gt;&lt;/b&gt;. Please see the following sections on instructions and things to
keep in mind for this upgrade.&lt;/p&gt;
&lt;h2 id=&quot;for-new-users-just-starting-out-with-stateful-functions&quot;&gt;For new users just starting out with Stateful Functions&lt;/h2&gt;
&lt;p&gt;We strongly recommend skipping all previous versions and starting with StateFun 2.2.1.
This guarantees that failure recovery from checkpoints, as well as application upgrades using savepoints, will work as expected for you.&lt;/p&gt;
&lt;h2 id=&quot;for-existing-users-on-versions--220&quot;&gt;For existing users on versions &amp;lt;= 2.2.0&lt;/h2&gt;
&lt;p&gt;Users that are currently using older versions of StateFun may or may not be able to directly upgrade to 2.2.1 using
savepoints taken with the older versions. &lt;b&gt;The Flink community is working hard on a follow-up hotfix release, 2.2.2,
that would guarantee that you can perform the upgrade smoothly&lt;/b&gt;. In the meantime, you may still try to upgrade to 2.2.1
first, but may encounter &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19741&quot;&gt;FLINK-19741&lt;/a&gt; or
&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19748&quot;&gt;FLINK-19748&lt;/a&gt;. If you do encounter either of these issues, do not worry about data
loss; it simply means that the restore failed, and you’d have to wait until 2.2.2 is out in order to upgrade.&lt;/p&gt;
&lt;p&gt;The follow-up hotfix release 2.2.2 is expected to be ready within another 2~3 weeks, as it &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Apache-Flink-1-11-3-td45989.html&quot;&gt;requires a new hotfix release
from Flink core&lt;/a&gt;,
and ultimately an upgrade of the Flink dependency in StateFun. We’ll update the community via the Flink
mailing lists as soon as this is ready, so please subscribe to the mailing lists for important updates for this!&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This release includes 6 fixes and minor improvements since StateFun 2.2.0. Below is a detailed list of all fixes and improvements:&lt;/p&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19515&quot;&gt;FLINK-19515&lt;/a&gt;] - Async RequestReply handler concurrency bug
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19692&quot;&gt;FLINK-19692&lt;/a&gt;] - Can&amp;#39;t restore feedback channel from savepoint
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19866&quot;&gt;FLINK-19866&lt;/a&gt;] - FunctionsStateBootstrapOperator.createStateAccessor fails due to uninitialized runtimeContext
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19826&quot;&gt;FLINK-19826&lt;/a&gt;] - StateFun Dockerfile copies plugins with a specific version instead of a wildcard
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19827&quot;&gt;FLINK-19827&lt;/a&gt;] - Allow the harness to start with a user provided Flink configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19840&quot;&gt;FLINK-19840&lt;/a&gt;] - Add a rocksdb and heap timers configuration validation
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Wed, 11 Nov 2020 01:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/11/11/release-statefun-2.2.1.html</link>
<guid isPermaLink="true">/news/2020/11/11/release-statefun-2.2.1.html</guid>
</item>
<item>
<title>From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure</title>
<description>&lt;p&gt;Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. Because of that design, Flink unifies batch and stream processing, can easily scale to both &lt;a href=&quot;https://hal.inria.fr/hal-02463206/document&quot;&gt;very small&lt;/a&gt; and &lt;a href=&quot;https://102.alibaba.com/detail?id=35&quot;&gt;extremely large&lt;/a&gt; scenarios and provides support for many operational features like stateful upgrades with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/stream/state/schema_evolution.html&quot;&gt;state evolution&lt;/a&gt; or &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/savepoints.html&quot;&gt;roll-backs and time-travel&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Despite all these great properties, Flink’s checkpointing method has an Achilles Heel: the speed of a completed checkpoint is determined by the speed at which data flows through the application. When the application backpressures, the processing of checkpoints is backpressured as well (&lt;a href=&quot;#appendix-1---on-backpressure&quot;&gt;Appendix 1&lt;/a&gt; recaps what is backpressure and why it can be a good thing). In such cases, checkpoints may take longer to complete or even time out completely.&lt;/p&gt;
&lt;p&gt;In Flink 1.11, the community introduced a first version of a new feature called “&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/checkpoints.html#unaligned-checkpoints&quot;&gt;unaligned checkpoints&lt;/a&gt;” that aims at solving this issue, while Flink 1.12 plans to further expand its functionality. In this two-series blog post, we discuss how Flink’s checkpointing mechanism has been modified to support unaligned checkpoints, how unaligned checkpoints work, and how this new mode impacts Flink users. In the first of the two posts, we start with a recap of the original checkpointing process in Flink, its core properties and issues under backpressure.&lt;/p&gt;
&lt;h1 id=&quot;state-in-streaming-applications&quot;&gt;State in Streaming Applications&lt;/h1&gt;
&lt;p&gt;Simply put, state is the information that you need to remember across events. Even the most trivial streaming applications are typically stateful because of their need to “remember” the exact position they are processing data from, for example in the form of a Kafka Partition Offset or a File Offset.
In addition, many applications hold state internally as a way to support their internal operations, such as windows, aggregations, joins, or state machines.&lt;/p&gt;
&lt;p&gt;For the remainder of this article, we’ll use the following example showing a streaming application consisting of &lt;strong&gt;four operators&lt;/strong&gt;, each one holding some state.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:75%&quot; src=&quot;/img/blog/2020-10-15-from-aligned-to-unaligned-checkpoints-part-1/from-aligned-to-unaligned-checkpoints-part-1-1.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;h2 id=&quot;state-persistence-through-checkpoints&quot;&gt;State Persistence through Checkpoints&lt;/h2&gt;
&lt;p&gt;Streaming applications are long-lived. They inevitably experience hardware and software failures but should, ideally, look from the outside as if no failure ever happened. Since applications are long-lived — and can potentially accumulate very large state — recomputing partial results after failures can take quite some time, and so a way to persist and recover this (potentially very large) application state is necessary.&lt;/p&gt;
&lt;p&gt;Flink relies on its &lt;strong&gt;state checkpointing and recovery mechanism&lt;/strong&gt; to implement such behavior, as shown in the figure below. Periodic checkpoints store a snapshot of the application’s state on some Checkpoint Storage (commonly an Object Store or Distributed File System, like S3, HDFS, GCS, Azure Blob Storage, etc.). When a failure is detected, the affected parts of the application are reset to the state of the latest checkpoint (either by a local reset or by loading the state from the checkpoint storage).&lt;/p&gt;
&lt;div style=&quot;line-height:40%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:90%&quot; src=&quot;/img/blog/2020-10-15-from-aligned-to-unaligned-checkpoints-part-1/from-aligned-to-unaligned-checkpoints-part-1-2.png&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Flink’s checkpoint-based approach differs from the approach taken by other stream processing systems that keep state in a distributed database or write state changes to a log, for example. The checkpoint-based approach has some nice properties, described below, which make it a great option for Flink.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Checkpointing has very simple external dependencies&lt;/strong&gt;: An &lt;em&gt;Object Storage&lt;/em&gt; or a &lt;em&gt;Distributed FileSystem&lt;/em&gt; is probably the most available and easiest-to-administer service. Because these are available from all public cloud providers and are typically among the first systems provisioned on-premises, Flink is well-suited for a &lt;em&gt;cloud-native&lt;/em&gt; stack. In addition, these storage systems are cheaper by an order of magnitude (GB/month) when compared to distributed databases, key/value stores, or event brokers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Checkpoints are immutable and versioned&lt;/strong&gt;: Together with immutable and versioned inputs (as input streams are, by nature), checkpoints support storing immutable application snapshots that can be used for rollbacks, debugging, testing, or as a cheap alternative to analyze application state outside the production setup.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Checkpoints decouple the “stream transport” from the persistence mechanism&lt;/strong&gt;: “Stream transport” refers to how data is being exchanged between operators (e.g. during a shuffle). This decoupling is key to Flink’s batch &amp;lt;-&amp;gt; streaming unification in one system, because it allows Flink to implement a data transport that can take the shape of both a low-latency streaming exchange or a decoupled batch data exchange.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;the-checkpointing-mechanism&quot;&gt;The Checkpointing Mechanism&lt;/h1&gt;
&lt;p&gt;The fundamental challenge solved by the checkpointing algorithm (details in &lt;a href=&quot;https://pdfs.semanticscholar.org/6fa0/917417d3c213b0e130ae01b7b440b1868dde.pdf&quot;&gt;this paper&lt;/a&gt;) is drawing a snapshot out of the ever-changing state of a streaming application without suspending the continuous processing of events. Because there are always events in-flight (on the network, in I/O buffers, etc.), up- and downstream operators can be processing events from different times: the sink may write data from 11:04, while the source already ingests events from 11:06. Ideally, all snapshotted data should belong to the same point-in-time, as if the input was paused and we waited until all in-flight data was drained (i.e. the pipeline becoming idle) before taking the snapshot.&lt;/p&gt;
&lt;p&gt;To achieve that, Flink injects &lt;em&gt;checkpoint barriers&lt;/em&gt; into the streams at the sources, which travel through the entire topology and eventually reach the sinks. These barriers divide the stream into a &lt;em&gt;pre-checkpoint epoch&lt;/em&gt; (all events that are persisted in state or emitted into sinks) and a &lt;em&gt;post-checkpoint epoch&lt;/em&gt; (events not reflected in the state, to be re-processed when resuming from the checkpoint).&lt;/p&gt;
&lt;p&gt;The following figure shows what happens when a barrier reaches an operator.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:100%&quot; src=&quot;/img/blog/2020-10-15-from-aligned-to-unaligned-checkpoints-part-1/from-aligned-to-unaligned-checkpoints-part-1-3.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Operators need to make sure that they take the checkpoint exactly when all pre-checkpoint events are processed and no post-checkpoint events have yet been processed. When the first barrier reaches the head of the input buffer queue and is consumed by the operator, the operator starts the so-called &lt;em&gt;alignment phase&lt;/em&gt;. During that phase, the operator will not consume any data from the channels where it already received a barrier, until it has received a barrier from all input channels.&lt;/p&gt;
&lt;p&gt;Once all barriers are received, the operator snapshots its state, forwards the barrier to the output, and ends the alignment phase, which unblocks all inputs. An operator state snapshot is written into the checkpoint storage, typically asynchronously while data processing continues. Once all operators have successfully written their state snapshot to the checkpoint storage, the checkpoint is successfully completed and can be used for recovery.&lt;/p&gt;
&lt;p&gt;One important thing to note here is that the barriers flow with the events, strictly in line. In a healthy setup without backpressure, barriers flow and align in milliseconds. The checkpoint duration is dominated by the time it takes to write the state snapshots to the checkpoint storage, which becomes faster with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/state/large_state_tuning.html#incremental-checkpoints&quot;&gt;incremental checkpoints&lt;/a&gt;. If the events flow slowly under backpressure, so will the barriers. That means that barriers can take a long time to travel from sources to sinks, causing the alignment phase to take even longer to complete.&lt;/p&gt;
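&lt;p&gt;The following is a deliberately simplified, illustrative sketch (in Python, and not Flink’s actual implementation) of the alignment behavior described above, for a single operator with several input channels:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class AlignedCheckpointingOperator:
    """Illustrative-only sketch of barrier alignment in one operator."""

    def __init__(self, input_channels):
        self.input_channels = input_channels
        self.blocked = set()  # channels whose barrier has already arrived

    def on_element(self, channel, element):
        if channel in self.blocked:
            # Post-checkpoint data: must not be processed before the snapshot,
            # so the channel is effectively blocked until alignment finishes.
            channel.buffer(element)
        else:
            self.process(element)

    def on_barrier(self, channel, barrier):
        # Stop consuming from this channel until every input delivered the barrier.
        self.blocked.add(channel)
        if len(self.blocked) == len(self.input_channels):
            # Alignment complete: snapshot state, forward the barrier, unblock inputs.
            self.snapshot_state(barrier.checkpoint_id)
            self.forward_barrier(barrier)
            self.blocked.clear()

    def process(self, element):
        pass  # operator-specific logic

    def snapshot_state(self, checkpoint_id):
        pass  # written to the checkpoint storage, typically asynchronously

    def forward_barrier(self, barrier):
        pass  # emit the barrier to all downstream operators
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;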
&lt;h2 id=&quot;recovery&quot;&gt;Recovery&lt;/h2&gt;
&lt;p&gt;When operators restart from a checkpoint (automatically during recovery or manually during deployment from a savepoint), the operators first restore their state from the checkpoint storage before resuming the event stream processing.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:100%&quot; src=&quot;/img/blog/2020-10-15-from-aligned-to-unaligned-checkpoints-part-1/from-aligned-to-unaligned-checkpoints-part-1-4.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Since sources are bound to the offsets persisted in the checkpoint, recovery time is often calculated as the sum of the time of the recovery process — outlined in the previous figure — and any additional time needed to process any remaining data up to the point right before the system failure. When an application experiences backpressure, recovery time can also include the total time from the very start of the recovery process until backpressure is fully eliminated.&lt;/p&gt;
&lt;h2 id=&quot;consistency-guarantees&quot;&gt;Consistency Guarantees&lt;/h2&gt;
&lt;p&gt;The alignment phase is only necessary for checkpoints with &lt;strong&gt;exactly-once&lt;/strong&gt; processing semantics, which is the default setting in Flink. If an application runs with &lt;strong&gt;at-least-once&lt;/strong&gt; processing semantics, checkpoints will not block any channels with barriers during alignment, which has an additional cost from the duplication of the then-not-blocked events when recovering the operator.&lt;/p&gt;
&lt;p&gt;This is not to be confused with having at-least-once semantics only in the sinks — something that many Flink users choose over transactional sinks — because many sink operations are idempotent or converge to the same result (like inputs/outputs to key/value stores). Updates to intermediate operator state, however, are often not idempotent (for example, a simple count aggregation), and hence using exactly-once checkpoints is advisable for the majority of Flink users.&lt;/p&gt;
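&lt;p&gt;For reference, this is roughly how the checkpointing mode is chosen when enabling checkpoints, shown as a PyFlink sketch under the assumption that the Python checkpoint configuration mirrors its Java counterpart (the 10-second interval is an arbitrary example); exactly-once is already the default:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.datastream import CheckpointingMode, StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

# Take a checkpoint every 10 seconds. EXACTLY_ONCE (the default) performs
# barrier alignment; AT_LEAST_ONCE does not block channels during alignment,
# at the cost of possible duplicates on recovery.
env.enable_checkpointing(10000)
env.get_checkpoint_config().set_checkpointing_mode(CheckpointingMode.EXACTLY_ONCE)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;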
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;This blog post recaps how Flink’s fault tolerance mechanism (based on aligned checkpoints) works, and why checkpointing is a fitting mechanism for a fault-tolerant stream processor. The checkpointing mechanism has been optimized over time to make checkpoints faster and cheaper (with both asynchronous and incremental checkpoints) and faster-to-recover (local caching), but the basic concepts (barriers, alignment, operator state snapshots) are still the same as in the original version.&lt;/p&gt;
&lt;p&gt;The next part will dig into a major break with the original mechanism that avoids the alignment phase — the recently-introduced “unaligned checkpoints”. Stay tuned for the second part, which explains how unaligned checkpoints work and how they guarantee consistent checkpointing times under backpressure.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;appendix-1---on-backpressure&quot;&gt;Appendix 1 - On Backpressure&lt;/h2&gt;
&lt;p&gt;Backpressure refers to the behavior where a slow receiver (e.g. of data/requests) makes the senders slow down in order to not overwhelm the receiver, which could otherwise be forced to drop some of the data or requests it is processing. This is a crucial and very much desirable behavior for systems where completeness/correctness is important. Backpressure is implicitly implemented in many of the most basic building blocks of distributed communication, such as TCP Flow Control, bounded (blocking) I/O queues, poll-based consumers, etc.&lt;/p&gt;
&lt;p&gt;Apache Flink implements backpressure across the entire data flow graph. A sink that (temporarily) cannot keep up with the data rate will result in the source connectors slowing down and pulling data out of the source systems more slowly. We believe that this is a good and desirable behavior, because backpressure is not only necessary in order to avoid overwhelming the memory of a receiver (thread), but can also prevent different stages of the streaming application from drifting apart too far.&lt;/p&gt;
&lt;p&gt;Consider the example below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We have a source (let’s say reading data from Apache Kafka), parsing data, grouping and aggregating data by a key, and writing it to a sink system (some database).&lt;/li&gt;
&lt;li&gt;The application needs to re-group data by key between the parsing and the grouping/aggregation step.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:75%&quot; src=&quot;/img/blog/2020-10-15-from-aligned-to-unaligned-checkpoints-part-1/from-aligned-to-unaligned-checkpoints-part-1-5.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Let’s assume we use a &lt;strong&gt;non-backpressure approach&lt;/strong&gt;, like writing the data to a log/MQ for the data re-grouping over the network (the approach used by &lt;a href=&quot;https://docs.confluent.io/current/streams/architecture.html#backpressure&quot;&gt;Kafka Streams&lt;/a&gt;). If the sink is now slower than the remaining parts of the streaming application (which can easily happen), the first stage (source and parse) will still work as fast as possible to pull data out of the source, parse it, and put it into the log for the shuffle. That intermediate log will accumulate a lot of data, meaning it needs significant capacity so that, in a worst-case scenario, it can hold a full copy of the input data; otherwise, data is lost (when the drift is greater than the retention time).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;With backpressure&lt;/strong&gt;, the source/parse stage slows down to match the speed of the sink, keeping both parts of the application closer together in their progress through the data, and avoiding the need to provision a lot of intermediate storage capacity.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;We’d like to thank Marta Paes Moreira and Markos Sfikas for the wonderful review process.&lt;/p&gt;
</description>
<pubDate>Thu, 15 Oct 2020 05:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html</link>
<guid isPermaLink="true">/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html</guid>
</item>
<item>
<title>Stateful Functions Internals: Behind the scenes of Stateful Serverless</title>
<description>&lt;p&gt;Stateful Functions (StateFun) simplifies the building of distributed stateful applications by combining the best of two worlds:
the strong messaging and state consistency guarantees of stateful stream processing, and the elasticity and serverless experience of today’s cloud-native architectures and
popular event-driven FaaS platforms. Typical StateFun applications consist of functions deployed behind simple services
using these modern platforms, with a separate StateFun cluster playing the role of an “&lt;a href=&quot;https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html&quot;&gt;event-driven database&lt;/a&gt;”
that provides consistency and fault-tolerance for the functions’ state and messaging.&lt;/p&gt;
&lt;p&gt;But how exactly does StateFun achieve that? How does the StateFun cluster communicate with the functions?&lt;/p&gt;
&lt;p&gt;This blog dives deep into the internals of the StateFun runtime. The entire walkthrough is complemented by a
&lt;a href=&quot;https://github.com/tzulitai/statefun-aws-demo/&quot;&gt;demo application&lt;/a&gt; which can be completely deployed on AWS services.
Most significantly, in the demo, the stateful functions are deployed and serviced using &lt;a href=&quot;https://aws.amazon.com/lambda/&quot;&gt;AWS Lambda&lt;/a&gt;,
a popular FaaS platform among many others. The goal here is to allow readers to have a good grasp of the interaction between
the StateFun runtime and the functions, how that works cohesively to provide a Stateful Serverless experience, and how they can apply
what they’ve learnt to deploy their StateFun applications on other public cloud offerings such as GCP or Microsoft Azure.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#introducing-the-example-shopping-cart&quot; id=&quot;markdown-toc-introducing-the-example-shopping-cart&quot;&gt;Introducing the example: Shopping Cart&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#what-happens-in-the-stateful-functions-runtime&quot; id=&quot;markdown-toc-what-happens-in-the-stateful-functions-runtime&quot;&gt;What happens in the Stateful Functions runtime?&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#remote-invocation-request-reply-protocol&quot; id=&quot;markdown-toc-remote-invocation-request-reply-protocol&quot;&gt;Remote Invocation Request-Reply Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#function-state-consistency-and-fault-tolerance&quot; id=&quot;markdown-toc-function-state-consistency-and-fault-tolerance&quot;&gt;Function state consistency and fault-tolerance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#step-by-step-walkthrough-of-function-invocations&quot; id=&quot;markdown-toc-step-by-step-walkthrough-of-function-invocations&quot;&gt;Step-by-step walkthrough of function invocations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#stateful-serverless-in-the-cloud-with-faas-and-statefun&quot; id=&quot;markdown-toc-stateful-serverless-in-the-cloud-with-faas-and-statefun&quot;&gt;Stateful Serverless in the Cloud with FaaS and StateFun&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;introducing-the-example-shopping-cart&quot;&gt;Introducing the example: Shopping Cart&lt;/h2&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
You can find the full code &lt;a href=&quot;https://github.com/tzulitai/statefun-aws-demo/blob/master/app/shopping_cart.py&quot;&gt;here&lt;/a&gt;, which
uses StateFun’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/sdk/python.html&quot;&gt;Python SDK&lt;/a&gt;. Alternatively, if you are
unfamiliar with StateFun’s API, you can check out this &lt;a href=&quot;https://flink.apache.org/2020/08/19/statefun.html&quot;&gt;earlier blog&lt;/a&gt;
on modeling applications and stateful entities using &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/concepts/application-building-blocks.html&quot;&gt;StateFun’s programming constructs&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Let’s first take a look at a high-level overview of the motivating demo for this blog post: a shopping cart application.
The diagram below covers the functions that build up the application, the state that the functions would keep, and the messages
that flow between them. We’ll be referencing this example throughout the blog post.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-10-13-stateful-serverless-internals/shopping-cart-overview.png&quot; width=&quot;600px&quot; alt=&quot;Shopping cart application&quot; /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.1:&lt;/b&gt; An overly simplified shopping cart application.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The application consists of two function types: a &lt;code&gt;cart&lt;/code&gt; function and an &lt;code&gt;inventory&lt;/code&gt; function.
Each instance of the &lt;code&gt;cart&lt;/code&gt; function is associated with a single user entity, with its state being the items in the cart
for that user (&lt;code&gt;ItemsInCart&lt;/code&gt;). In the same way, each instance of the &lt;code&gt;inventory&lt;/code&gt; function represents a single inventory,
maintaining as state the number of items in stock (&lt;code&gt;NumInStock&lt;/code&gt;) as well as the number of items reserved across all
user carts (&lt;code&gt;NumReserved&lt;/code&gt;). Messages can be sent to function instances using their logical addresses, which consist
of the function type and the instance’s entity ID, e.g. &lt;code&gt;(cart:Kim)&lt;/code&gt; or &lt;code&gt;(inventory:socks)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There are two external messages being sent to and from the shopping cart application via ingresses and egresses:
&lt;code&gt;AddToCart&lt;/code&gt;, which is sent to the ingress when an item is added to a user’s cart (e.g. sent by a front-end web application),
and &lt;code&gt;AddToCartResult&lt;/code&gt;, which is sent back from our application to acknowledge the action.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;AddToCart&lt;/code&gt; messages are handled by the &lt;code&gt;cart&lt;/code&gt; function, which in turn invokes other functions to form the main logic of the application.
To keep things simple, only two messages between functions are demonstrated: &lt;code&gt;RequestItem&lt;/code&gt;, sent from the &lt;code&gt;cart&lt;/code&gt; function to the &lt;code&gt;inventory&lt;/code&gt;
function, representing a request to reserve an item, and &lt;code&gt;ItemReserved&lt;/code&gt;, which is a response from the inventory function to acknowledge the request.&lt;/p&gt;
&lt;h2 id=&quot;what-happens-in-the-stateful-functions-runtime&quot;&gt;What happens in the Stateful Functions runtime?&lt;/h2&gt;
&lt;p&gt;Now that we understand the business logic of the shopping cart application, let’s take a closer look at what keeps the state
of the functions and messages sent between them consistent and fault-tolerant: the StateFun cluster.&lt;/p&gt;
&lt;figure style=&quot;float:right;padding-left:1px;padding-top: 0px&quot;&gt;
&lt;img src=&quot;/img/blog/2020-10-13-stateful-serverless-internals/abstract-deployment.png&quot; width=&quot;400px&quot; /&gt;
&lt;figcaption style=&quot;padding-top: 10px;text-align:center&quot;&gt;&lt;i&gt;&lt;b&gt;Fig.2:&lt;/b&gt; Simplified view of a StateFun app deployment.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The StateFun runtime is built on-top of Apache Flink, and applies the same battle-tested technique that Flink uses as the
basis for strongly consistent stateful streaming applications - &lt;i&gt;&lt;b&gt;co-location of state and messaging&lt;/b&gt;&lt;/i&gt;.
In a StateFun application, all messages are routed through the StateFun cluster, including messages sent from ingresses,
messages sent between functions, and messages sent from functions to egresses. Moreover, the state of all functions is
maintained in the StateFun cluster as well. Like Flink, the message streams flowing through the StateFun cluster and
function state are co-partitioned so that compute has local state access, and any updates to the state can be handled
atomically with computed side-effects, i.e. messages to send to other functions.&lt;/p&gt;
&lt;p&gt;In more solid terms, take for example a message that is targeted for the logical address &lt;code&gt;(cart, &quot;Kim&quot;)&lt;/code&gt; being routed
through StateFun. Logical addresses are used in StateFun as the partitioning key for both message streams and state, so
that the resulting StateFun cluster partition that this message ends up in will have the state for &lt;code&gt;(cart, &quot;Kim&quot;)&lt;/code&gt;
locally available.&lt;/p&gt;
&lt;p&gt;The only difference here for StateFun, compared to Flink, is that the actual compute doesn’t happen within the StateFun
cluster partitions - &lt;i&gt;&lt;b&gt;computation happens remotely in the function services&lt;/b&gt;&lt;/i&gt;. So how does StateFun
route input messages to the remote function services and provide them with state access, all the while
preserving the same consistency guarantees as if state and compute were co-located?&lt;/p&gt;
&lt;h3 id=&quot;remote-invocation-request-reply-protocol&quot;&gt;Remote Invocation Request-Reply Protocol&lt;/h3&gt;
&lt;p&gt;A StateFun cluster partition communicates with the functions using a slim and well-defined request-reply protocol, as
illustrated in &lt;strong&gt;Fig. 3&lt;/strong&gt;. Upon receiving an input message, the cluster partition invokes the target functions via their
HTTP service endpoint. The service request body carries input events and current state for the functions, retrieved from
local state. Any outgoing messages and state mutations as a result of invocations are sent back through StateFun as part
of the service response. When the StateFun cluster partition receives the response, all state mutations are written back
to local state and outgoing messages are routed to other cluster partitions, which in turn invoke other function
services.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-10-13-stateful-serverless-internals/request-reply-protocol.png&quot; width=&quot;750px&quot; /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.3:&lt;/b&gt; The remote invocation request/reply protocol.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Under the hood, StateFun SDKs like the Python SDK and other &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/sdk/external.html&quot;&gt;3rd party SDKs for other languages&lt;/a&gt;
all implement this protocol. From the user’s perspective, they are programming with state local to their function deployment,
whereas in reality, state is maintained in StateFun and supplied through this protocol. It is easy to add more language SDKs,
as long as the language can handle HTTP requests and responses.&lt;/p&gt;
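&lt;p&gt;To make the protocol less abstract, here is a heavily simplified sketch of how a function is bound with the Python SDK; the function type name is a placeholder, and state access and messaging are only hinted at in comments rather than copied from the linked demo:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from statefun import StatefulFunctions, RequestReplyHandler

functions = StatefulFunctions()

@functions.bind("example/cart")
def cart(context, message):
    # The state of this (cart, user) instance arrives embedded in the HTTP
    # request, so reading it here looks like a local access. Any state
    # mutation and any message sent through the context is encoded into the
    # HTTP response and applied/routed by the StateFun cluster partition.
    pass

# The handler decodes a request body into function invocations and encodes
# all side effects into the response body; it can be served from any
# HTTP-capable endpoint, e.g. a Flask app or an AWS Lambda function.
handler = RequestReplyHandler(functions)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;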
&lt;h3 id=&quot;function-state-consistency-and-fault-tolerance&quot;&gt;Function state consistency and fault-tolerance&lt;/h3&gt;
&lt;p&gt;The runtime makes sure that only one invocation per function instance (e.g. &lt;code&gt;(cart, &quot;Kim&quot;)&lt;/code&gt;) is ongoing at any point in
time (i.e. invocations per entity are serial). If an invocation is ongoing while new messages for the same function
instance arrive, the messages are buffered in state and sent as a single batch as soon as the ongoing invocation completes.&lt;/p&gt;
&lt;p&gt;In addition, since each request happens in complete isolation and all relevant information is encapsulated in each
request and response, &lt;i&gt;&lt;b&gt;function invocations are effectively idempotent&lt;/b&gt;&lt;/i&gt;
(i.e. results depend purely on the provided context of the invocation) and can be retried. This naturally avoids
violating consistency in case any function service hiccups occur.&lt;/p&gt;
&lt;p&gt;For fault tolerance, all function state managed in the StateFun cluster is periodically and asynchronously checkpointed
to a blob storage (e.g. HDFS, S3, GCS) using Flink’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/concepts/stateful-stream-processing.html#checkpointing&quot;&gt;original distributed snapshot mechanism&lt;/a&gt;.
These checkpoints contain &lt;i&gt;&lt;b&gt;a globally consistent view of state across all functions of the application&lt;/b&gt;&lt;/i&gt;,
including the offset positions in ingresses and the ongoing transaction state in egresses. In the case of an abrupt failure,
the system may restore from the latest available checkpoint: all function states will be restored and all events between
the checkpoint and the crash will be re-processed (and the functions re-invoked) with identical routing, all as if the failure
never happened.&lt;/p&gt;
&lt;h3 id=&quot;step-by-step-walkthrough-of-function-invocations&quot;&gt;Step-by-step walkthrough of function invocations&lt;/h3&gt;
&lt;p&gt;Let’s conclude this section by going through the actual messages that flow between StateFun and the functions in our shopping
cart demo application!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Customer “Kim” puts 2 socks into his shopping cart (Fig. 4):&lt;/strong&gt;&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-10-13-stateful-serverless-internals/protocol-walkthrough-1.png&quot; width=&quot;750px&quot; /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.4:&lt;/b&gt; Message flow walkthrough.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;An event &lt;code&gt;AddToCart(&quot;Kim&quot;, &quot;socks&quot;, 2)&lt;/code&gt; comes through one of the ingress partitions &lt;b&gt;&lt;code&gt;(1)&lt;/code&gt;&lt;/b&gt;. The ingress event router is
configured to route &lt;code&gt;AddToCart&lt;/code&gt; events to the function type &lt;code&gt;cart&lt;/code&gt;, taking the user ID (&lt;code&gt;&quot;Kim&quot;&lt;/code&gt;) as the instance ID. The
function type and instance ID together define the logical address of the target function instance for the event &lt;code&gt;(cart:Kim)&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Let’s assume the event is read by StateFun partition B, but the &lt;code&gt;(cart:Kim)&lt;/code&gt; address is owned by partition A.
The event is thus routed to partition A &lt;b&gt;&lt;code&gt;(2)&lt;/code&gt;&lt;/b&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;StateFun Partition A receives the event and processes it:
&lt;ul&gt;
&lt;li&gt;First, the runtime fetches the state for &lt;code&gt;(cart:Kim)&lt;/code&gt; from the local state store, i.e. the existing items in Kim’s cart &lt;b&gt;&lt;code&gt;(3)&lt;/code&gt;&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Next, it marks &lt;code&gt;(cart:Kim)&lt;/code&gt; as &lt;i&gt;“busy”&lt;/i&gt;, meaning an invocation is happening. This buffers other messages targeted for
&lt;code&gt;(cart:Kim)&lt;/code&gt; in state until this invocation is completed.&lt;/li&gt;
&lt;li&gt;The runtime grabs a free HTTP client connection and sends a request to the &lt;code&gt;cart&lt;/code&gt; function type’s service endpoint.
The request contains the &lt;code&gt;AddToCart(&quot;Kim&quot;, &quot;socks&quot;, 2)&lt;/code&gt; message and the current state for &lt;code&gt;(cart:Kim)&lt;/code&gt; &lt;b&gt;&lt;code&gt;(4)&lt;/code&gt;&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;The remote &lt;code&gt;cart&lt;/code&gt; function service receives the event and attempts to reserve socks with the &lt;code&gt;inventory&lt;/code&gt; function.
To do so, it replies to the invocation with a new message &lt;code&gt;RequestItem(&quot;socks&quot;, 2)&lt;/code&gt; targeted at the address &lt;code&gt;(inventory:socks)&lt;/code&gt;.
Any state modifications will be included in the response as well, but in this case there aren’t any state modifications yet
(i.e. Kim’s cart is still empty until a reservation acknowledgement is received from the inventory service) &lt;b&gt;&lt;code&gt;(5)&lt;/code&gt;&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;The StateFun runtime receives the response, routes the &lt;code&gt;RequestItem&lt;/code&gt; message to other partitions,
and marks &lt;code&gt;(cart:Kim)&lt;/code&gt; as &lt;i&gt;“available”&lt;/i&gt; again for invocation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Assuming that the &lt;code&gt;(inventory:socks)&lt;/code&gt; address is owned by partition B, the message is routed to partition B &lt;b&gt;&lt;code&gt;(6)&lt;/code&gt;&lt;/b&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;Once partition B receives the &lt;code&gt;RequestItem&lt;/code&gt; message, the runtime invokes the function &lt;code&gt;(inventory:socks)&lt;/code&gt; in the same
way as described above, and receives a reply with a modification of the inventory state (the number of reserved socks is increased by 2).
&lt;code&gt;(inventory:socks)&lt;/code&gt; also wants to acknowledge the reservation of 2 socks for Kim, so an &lt;code&gt;ItemReserved(&quot;socks&quot;, 2)&lt;/code&gt;
message targeted at &lt;code&gt;(cart:Kim)&lt;/code&gt; is included in the response &lt;b&gt;&lt;code&gt;(7)&lt;/code&gt;&lt;/b&gt;, which will again be routed by the StateFun runtime.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;stateful-serverless-in-the-cloud-with-faas-and-statefun&quot;&gt;Stateful Serverless in the Cloud with FaaS and StateFun&lt;/h2&gt;
&lt;p&gt;We’d like to wrap up this post by re-emphasizing how well the StateFun runtime works with cloud-native
architectures, and by providing an overview of what a complete StateFun application deployment would look like
using public cloud services.&lt;/p&gt;
&lt;p&gt;As you’ve already learnt in previous sections, invocation requests themselves are stateless, with all necessary information
for an invocation included in the HTTP request (i.e. input events and state access), and all side-effects of the invocation
included in the HTTP response (i.e. outgoing messages and state modifications).&lt;/p&gt;
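&lt;p&gt;To make this concrete, below is a purely illustrative Python sketch of the information that travels in a single invocation round-trip. It is &lt;i&gt;not&lt;/i&gt; the actual StateFun wire format (the real protocol is Protobuf-based); the class and field names are made up solely for this illustration.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Illustrative only: the *logical* content of one remote invocation.
# The real protocol is Protobuf-based; these names are hypothetical.

@dataclass
class Address:
    function_type: str   # e.g. cart
    instance_id: str     # e.g. Kim

@dataclass
class InvocationRequest:           # StateFun runtime -&amp;gt; function service
    target: Address                # e.g. (cart:Kim)
    state: Dict[str, bytes]        # current persisted state of the target
    inputs: List[bytes]            # input messages, e.g. AddToCart(Kim, socks, 2)

@dataclass
class OutgoingMessage:
    target: Address                # e.g. (inventory:socks)
    payload: bytes                 # e.g. RequestItem(socks, 2)

@dataclass
class InvocationResponse:          # function service -&amp;gt; StateFun runtime
    state_mutations: Dict[str, Optional[bytes]] = field(default_factory=dict)
    outgoing_messages: List[OutgoingMessage] = field(default_factory=list)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;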
&lt;figure style=&quot;float:right;padding-left:1px;padding-top: 0px&quot;&gt;
&lt;img src=&quot;/img/blog/2020-10-13-stateful-serverless-internals/aws-deployment.png&quot; width=&quot;450px&quot; /&gt;
&lt;figcaption style=&quot;padding-top: 10px;text-align:center&quot;&gt;&lt;i&gt;&lt;b&gt;Fig.5:&lt;/b&gt; Complete deployment example on AWS.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;A natural consequence of this characteristic is that there are no session-related dependencies between individual HTTP
requests, which makes it very easy to horizontally scale the function deployments. It also makes it straightforward to deploy your
stateful functions on FaaS platforms, allowing them to rapidly scale out, scale to zero, or be upgraded
with zero downtime.&lt;/p&gt;
&lt;p&gt;In our complementary demo code, you can find &lt;a href=&quot;https://github.com/tzulitai/statefun-aws-demo/blob/master/app/shopping_cart.py#L49&quot;&gt;here&lt;/a&gt;
the exact code for exposing and serving StateFun functions through AWS Lambda. The same is possible with any other
FaaS platform that supports triggering functions via HTTP endpoints (and, in the future, other transports as well).&lt;/p&gt;
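&lt;p&gt;As a rough, simplified sketch of what the linked demo does, the snippet below wraps a Python SDK &lt;code&gt;RequestReplyHandler&lt;/code&gt; in an AWS Lambda handler. The function binding and the base64 handling of the binary request body are assumptions made for this sketch, not a copy of the demo code.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import base64

from statefun import StatefulFunctions, RequestReplyHandler

functions = StatefulFunctions()

@functions.bind(&amp;quot;example/cart&amp;quot;)
def cart(context, message):
    ...  # application logic, as in the shopping cart demo

handler = RequestReplyHandler(functions)

def lambda_handler(event, context):
    # API Gateway delivers the binary StateFun request base64-encoded;
    # hand it to the SDK handler and return the binary response the same way.
    request_bytes = base64.b64decode(event[&amp;quot;body&amp;quot;])
    response_bytes = handler(request_bytes)
    return {
        &amp;quot;statusCode&amp;quot;: 200,
        &amp;quot;headers&amp;quot;: {&amp;quot;Content-Type&amp;quot;: &amp;quot;application/octet-stream&amp;quot;},
        &amp;quot;isBase64Encoded&amp;quot;: True,
        &amp;quot;body&amp;quot;: base64.b64encode(response_bytes).decode(&amp;quot;utf-8&amp;quot;),
    }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;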
&lt;p&gt;&lt;strong&gt;Fig. 5&lt;/strong&gt; on the right illustrates what a complete AWS deployment of a StateFun application would look like, with functions
serviced via AWS Lambda, AWS Kinesis streams as ingresses and egresses, an AWS EKS managed Kubernetes cluster running the
StateFun cluster, and an AWS S3 bucket storing the periodic checkpoints. You can also follow the
&lt;a href=&quot;https://github.com/tzulitai/statefun-aws-demo#running-the-demo&quot;&gt;instructions&lt;/a&gt; in the demo code to try it out and deploy this yourself right away!&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;If you’d like to learn more about Stateful Functions, head over to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/&quot;&gt;official documentation&lt;/a&gt;, where you can also find more hands-on tutorials to try out yourself!&lt;/p&gt;
</description>
<pubDate>Tue, 13 Oct 2020 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/10/13/stateful-serverless-internals.html</link>
<guid isPermaLink="true">/news/2020/10/13/stateful-serverless-internals.html</guid>
</item>
<item>
<title>Stateful Functions 2.2.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is happy to announce the release of Stateful Functions (StateFun) 2.2.0! This release
introduces major features that extend the SDKs, such as support for asynchronous functions in the Python SDK, new
persisted state constructs, and a new SDK that allows embedding StateFun functions within a Flink DataStream job.
Moreover, we’ve also included important changes that improve out-of-the-box stability for common workloads,
as well as increased observability for operational purposes.&lt;/p&gt;
&lt;p&gt;We’ve also seen new 3rd party SDKs for StateFun being developed since the last release. While they are not part of the
release artifacts, it’s great seeing these community-driven additions! We’ve highlighted these efforts below
in this announcement.&lt;/p&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt;
page of the Flink website, and the most recent Python SDK distribution is available on &lt;a href=&quot;https://pypi.org/project/apache-flink-statefun/&quot;&gt;PyPI&lt;/a&gt;.
For more details, check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12348350&quot;&gt;release changelog&lt;/a&gt;
and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/&quot;&gt;updated documentation&lt;/a&gt;.
We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt;
or &lt;a href=&quot;https://issues.apache.org/jira/browse/&quot;&gt;JIRA&lt;/a&gt;!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features&quot; id=&quot;markdown-toc-new-features&quot;&gt;New Features&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#asynchronous-functions-in-python-sdk&quot; id=&quot;markdown-toc-asynchronous-functions-in-python-sdk&quot;&gt;Asynchronous functions in Python SDK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-datastream-integration-sdk&quot; id=&quot;markdown-toc-flink-datastream-integration-sdk&quot;&gt;Flink DataStream Integration SDK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#construct-for-dynamic-state-registration&quot; id=&quot;markdown-toc-construct-for-dynamic-state-registration&quot;&gt;Construct for Dynamic State Registration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improvements&quot; id=&quot;markdown-toc-improvements&quot;&gt;Improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#remote-functions-communication-stability&quot; id=&quot;markdown-toc-remote-functions-communication-stability&quot;&gt;Remote Functions Communication Stability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#operational-observability-of-a-statefun-application&quot; id=&quot;markdown-toc-operational-observability-of-a-statefun-application&quot;&gt;Operational observability of a StateFun Application&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#fine-grained-control-over-remote-connection-lifecycle&quot; id=&quot;markdown-toc-fine-grained-control-over-remote-connection-lifecycle&quot;&gt;Fine-grained control over remote connection lifecycle&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#rd-party-sdks&quot; id=&quot;markdown-toc-rd-party-sdks&quot;&gt;3rd Party SDKs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#important-patch-notes&quot; id=&quot;markdown-toc-important-patch-notes&quot;&gt;Important Patch Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;new-features&quot;&gt;New Features&lt;/h2&gt;
&lt;h3 id=&quot;asynchronous-functions-in-python-sdk&quot;&gt;Asynchronous functions in Python SDK&lt;/h3&gt;
&lt;p&gt;This release enables registering asynchronous Python functions as stateful functions by introducing a new handler
in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/sdk/python.html&quot;&gt;Python SDK&lt;/a&gt;: &lt;code&gt;AsyncRequestReplyHandler&lt;/code&gt;.
This allows serving StateFun functions with Python web frameworks that support asynchronous IO natively (for example,
&lt;a href=&quot;https://pypi.org/project/aiohttp/&quot;&gt;aiohttp&lt;/a&gt;):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;statefun&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulFunctions&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;statefun&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AsyncRequestReplyHandler&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;functions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulFunctions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@functions.bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;example/greeter&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;async&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;greeter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;html&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;await&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;http://....&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack_and_reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SomeProtobufMessage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;html&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# expose this handler via an async web framework&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;handler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AsyncRequestReplyHandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;functions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
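&lt;p&gt;The resulting handler can then be served from any asyncio-compatible web server. Below is a minimal sketch using &lt;a href=&quot;https://pypi.org/project/aiohttp/&quot;&gt;aiohttp&lt;/a&gt;; the route path and port are arbitrary choices for this example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from aiohttp import web
from statefun import StatefulFunctions, AsyncRequestReplyHandler

functions = StatefulFunctions()
# ... bind async functions here, e.g. the greeter shown above ...

handler = AsyncRequestReplyHandler(functions)

async def handle(request):
    request_bytes = await request.read()
    response_bytes = await handler(request_bytes)
    return web.Response(body=response_bytes, content_type=&amp;#39;application/octet-stream&amp;#39;)

app = web.Application()
app.add_routes([web.post(&amp;#39;/statefun&amp;#39;, handle)])

if __name__ == &amp;#39;__main__&amp;#39;:
    web.run_app(app, port=5000)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;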
&lt;p&gt;For more details, please see the docs on &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/sdk/python.html#exposing-functions&quot;&gt;exposing Python functions&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;flink-datastream-integration-sdk&quot;&gt;Flink DataStream Integration SDK&lt;/h3&gt;
&lt;p&gt;Using this SDK, you may combine pipelines written with the Flink &lt;code&gt;DataStream&lt;/code&gt; API or higher-level libraries
(such as Table API, CEP etc., basically anything that can consume or produce a &lt;code&gt;DataStream&lt;/code&gt;) with the programming constructs
provided by Stateful Functions, as demonstrated below:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RoutableMessage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;namesIngress&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StatefulFunctionEgressStreams&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;egresses&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StatefulFunctionDataStreamBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;example&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withDataStreamAsIngress&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;namesIngress&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withRequestReplyRemoteFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;RequestReplyFunctionBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;requestReplyFunctionBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;REMOTE_GREET&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;URI&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;http://...&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withPersistedState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;seen_count&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withFunctionProvider&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GREET&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unused&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;withEgressId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GREETINGS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;responsesEgress&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getDataStreamForEgressId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GREETINGS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Events from &lt;code&gt;DataStream&lt;/code&gt; ingresses are routed to the bound functions, and events sent to
egresses are obtained back as &lt;code&gt;DataStream&lt;/code&gt; egress streams. This opens up the possibility of building complex streaming
applications.&lt;/p&gt;
&lt;h3 id=&quot;construct-for-dynamic-state-registration&quot;&gt;Construct for Dynamic State Registration&lt;/h3&gt;
&lt;p&gt;Prior to this release, the persisted state constructs in the Java SDK, such as &lt;code&gt;PersistedValue&lt;/code&gt;, &lt;code&gt;PersistedTable&lt;/code&gt; etc.,
had to be eagerly defined in a stateful function’s class. In certain scenarios, what state a function requires is not
known in advance, and may only be dynamically registered at runtime (e.g., when a function is invoked).&lt;/p&gt;
&lt;p&gt;This release enables that by providing a new &lt;code&gt;PersistedStateRegistry&lt;/code&gt; construct:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyFunction&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StatefulFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Persisted&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PersistedStateRegistry&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;registry&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;PersistedStateRegistry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PersistedValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;myValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;invoke&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;myValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;registry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;registerValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PersistedValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;my-value&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;improvements&quot;&gt;Improvements&lt;/h2&gt;
&lt;h3 id=&quot;remote-functions-communication-stability&quot;&gt;Remote Functions Communication Stability&lt;/h3&gt;
&lt;p&gt;Based on observations of common workloads, a few configurations for communicating with remote functions were adjusted for better
out-of-the-box connection stability. This includes the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The underlying connection pool was tuned for low-latency, high-throughput workloads. This allows StateFun to reuse
existing connections much more aggressively and avoid re-establishing a connection for each request.&lt;/li&gt;
&lt;li&gt;StateFun applies backpressure once the total number of uncompleted requests reaches a per-JVM threshold (&lt;code&gt;statefun.async.max-per-task&lt;/code&gt;),
but when observing typical workloads we discovered that the default value was set too high. In this release the default
was reduced to improve stability and resource consumption in the face of a slow-responding remote function (see the configuration sketch after this list).&lt;/li&gt;
&lt;/ul&gt;
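&lt;p&gt;For reference, this threshold is exposed as a regular configuration option. A minimal sketch, assuming the option is set in the StateFun cluster’s flink-conf.yaml like other configuration options (1024 is the new default, as listed in the patch notes below):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# flink-conf.yaml (illustrative; 1024 is the new default value)
statefun.async.max-per-task: 1024
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;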
&lt;h3 id=&quot;operational-observability-of-a-statefun-application&quot;&gt;Operational observability of a StateFun Application&lt;/h3&gt;
&lt;p&gt;One major goal of this release was to take a necessary step towards supporting auto-scaling of remote functions. Towards that end,
we’ve exposed several metrics related to the workload of remote functions and the resulting backpressure applied by the function
dispatchers. This includes the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Invocation duration / latency histograms, per function type&lt;/li&gt;
&lt;li&gt;Backlog size, per function type&lt;/li&gt;
&lt;li&gt;Number of in-flight invocations, per JVM (StateFun worker) and per function type&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The full list of metrics and their descriptions can be found &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/deployment-and-operations/metrics.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;fine-grained-control-over-remote-connection-lifecycle&quot;&gt;Fine-grained control over remote connection lifecycle&lt;/h3&gt;
&lt;p&gt;With this release, it’s possible to set individual timeouts for the overall duration as well as for the individual read and write IO operations
of HTTP requests to remote functions. You can find the corresponding function spec fields that define
these timeout values &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/sdk/index.html#defining-functions&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;rd-party-sdks&quot;&gt;3rd Party SDKs&lt;/h2&gt;
&lt;p&gt;Since the last release, we’ve seen new 3rd party SDKs for different languages being implemented on top of StateFun’s
remote function HTTP request-reply protocol, including &lt;a href=&quot;https://github.com/sjwiesman/statefun-go/&quot;&gt;Go&lt;/a&gt; and &lt;a href=&quot;https://github.com/aljoscha/statefun-rust&quot;&gt;Rust&lt;/a&gt; implementations. While these SDKs are not
endorsed or maintained by the Apache Flink PMC and are currently not part of the release artifacts, it is great to see these
new additions that demonstrate the extensibility of the framework.&lt;/p&gt;
&lt;p&gt;For that reason, we’ve added
a new &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.2/sdk/external.html&quot;&gt;page in the documentation&lt;/a&gt;
to list the 3rd party SDKs that the community is aware of. If you’ve also worked on a new language SDK for StateFun that
is stable and that you plan to continue maintaining, please consider letting the community know of your work by
submitting a pull request to add your project to the list!&lt;/p&gt;
&lt;h2 id=&quot;important-patch-notes&quot;&gt;Important Patch Notes&lt;/h2&gt;
&lt;p&gt;Below is a list of user-facing interface and configuration changes, dependency version upgrades, and removals of supported versions that are
important to be aware of when upgrading your StateFun applications to this version:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18812&quot;&gt;FLINK-18812&lt;/a&gt;] The Flink version in StateFun 2.2 has been upgraded to 1.11.1.&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19203&quot;&gt;FLINK-19203&lt;/a&gt;] Upgraded Scala version to 2.12, and dropped support for 2.11.&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19190&quot;&gt;FLINK-19190&lt;/a&gt;] All existing metric names have been renamed to be camel-cased instead of snake-cased, to conform with the Flink metric naming conventions. &lt;strong&gt;This breaks existing deployments if you depended on previous metrics&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19192&quot;&gt;FLINK-19192&lt;/a&gt;] The connection pool size for remote function HTTP requests has been increased to 1024, with a stale TTL of 1 minute.&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19191&quot;&gt;FLINK-19191&lt;/a&gt;] The default max number of asynchronous operations per JVM (StateFun worker) has been decreased to 1024.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12348350&quot;&gt;release notes&lt;/a&gt;
for a detailed list of changes and new features if you plan to upgrade your setup to Stateful Functions 2.2.0.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;abc863377, Authuir, Chesnay Schepler, Congxian Qiu, David Anderson, Dian Fu, Francesco Guardiani, Igal Shilman, Marta Paes Moreira, Patrick Wiener, Rafi Aroch, Seth Wiesman, Stephan Ewen, Tzu-Li (Gordon) Tai, Ufuk Celebi&lt;/p&gt;
&lt;p&gt;If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Mon, 28 Sep 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/09/28/release-statefun-2.2.0.html</link>
<guid isPermaLink="true">/news/2020/09/28/release-statefun-2.2.0.html</guid>
</item>
<item>
<title>Apache Flink 1.11.2 Released</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix version of the Apache Flink 1.11 series.&lt;/p&gt;
&lt;p&gt;This release includes 96 fixes and minor improvements for Flink 1.11.1. Below is a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.11.2.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16087&quot;&gt;FLINK-16087&lt;/a&gt;] - Translate &amp;quot;Detecting Patterns&amp;quot; page of &amp;quot;Streaming Concepts&amp;quot; into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18264&quot;&gt;FLINK-18264&lt;/a&gt;] - Translate the &amp;quot;External Resource Framework&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18628&quot;&gt;FLINK-18628&lt;/a&gt;] - Invalid error message for overloaded methods with same parameter name
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18801&quot;&gt;FLINK-18801&lt;/a&gt;] - Add a &amp;quot;10 minutes to Table API&amp;quot; document under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18910&quot;&gt;FLINK-18910&lt;/a&gt;] - Create the new document structure for Python documentation according to FLIP-133
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18912&quot;&gt;FLINK-18912&lt;/a&gt;] - Add a Table API tutorial link(linked to try-flink/python_table_api.md) under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;GettingStart&amp;quot; -&amp;gt; &amp;quot;Tutorial&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18913&quot;&gt;FLINK-18913&lt;/a&gt;] - Add a &amp;quot;TableEnvironment&amp;quot; document under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18917&quot;&gt;FLINK-18917&lt;/a&gt;] - Add a &amp;quot;Built-in Functions&amp;quot; link (linked to dev/table/functions/systemFunctions.md) under the &amp;quot;Python API&amp;quot; -&amp;gt; &amp;quot;User Guide&amp;quot; -&amp;gt; &amp;quot;Table API&amp;quot; section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19110&quot;&gt;FLINK-19110&lt;/a&gt;] - Flatten current PyFlink documentation structure
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14087&quot;&gt;FLINK-14087&lt;/a&gt;] - throws java.lang.ArrayIndexOutOfBoundsException when emiting the data using RebalancePartitioner.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15467&quot;&gt;FLINK-15467&lt;/a&gt;] - Should wait for the end of the source thread during the Task cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16510&quot;&gt;FLINK-16510&lt;/a&gt;] - Task manager safeguard shutdown may not be reliable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16827&quot;&gt;FLINK-16827&lt;/a&gt;] - StreamExecTemporalSort should require a distribution trait in StreamExecTemporalSortRule
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18081&quot;&gt;FLINK-18081&lt;/a&gt;] - Fix broken links in &amp;quot;Kerberos Authentication Setup and Configuration&amp;quot; doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18212&quot;&gt;FLINK-18212&lt;/a&gt;] - Init lookup join failed when use udf on lookup table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18341&quot;&gt;FLINK-18341&lt;/a&gt;] - Building Flink Walkthrough Table Java 0.1 COMPILATION ERROR
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18421&quot;&gt;FLINK-18421&lt;/a&gt;] - Elasticsearch (v6.3.1) sink end-to-end test instable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18468&quot;&gt;FLINK-18468&lt;/a&gt;] - TaskExecutorITCase.testJobReExecutionAfterTaskExecutorTermination fails with DuplicateJobSubmissionException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18552&quot;&gt;FLINK-18552&lt;/a&gt;] - Update migration tests in master to cover migration from release-1.11
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18581&quot;&gt;FLINK-18581&lt;/a&gt;] - Cannot find GC cleaner with java version previous jdk8u72(-b01)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18588&quot;&gt;FLINK-18588&lt;/a&gt;] - hive ddl create table should support &amp;#39;if not exists&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18595&quot;&gt;FLINK-18595&lt;/a&gt;] - Deadlock during job shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18600&quot;&gt;FLINK-18600&lt;/a&gt;] - Kerberized YARN per-job on Docker test failed to download JDK 8u251
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18608&quot;&gt;FLINK-18608&lt;/a&gt;] - CustomizedConvertRule#convertCast drops nullability
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18612&quot;&gt;FLINK-18612&lt;/a&gt;] - WordCount example failure when setting relative output path
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18632&quot;&gt;FLINK-18632&lt;/a&gt;] - RowData&amp;#39;s row kind do not assigned from input row data when sink code generate and physical type info is pojo type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18639&quot;&gt;FLINK-18639&lt;/a&gt;] - Error messages from BashJavaUtils are eaten
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18641&quot;&gt;FLINK-18641&lt;/a&gt;] - &amp;quot;Failure to finalize checkpoint&amp;quot; error in MasterTriggerRestoreHook
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18646&quot;&gt;FLINK-18646&lt;/a&gt;] - Managed memory released check can block RPC thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18650&quot;&gt;FLINK-18650&lt;/a&gt;] - The description of dispatcher in Flink Architecture document is not accurate
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18655&quot;&gt;FLINK-18655&lt;/a&gt;] - Set failOnUnableToExtractRepoInfo to false for git-commit-id-plugin in module flink-runtime
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18656&quot;&gt;FLINK-18656&lt;/a&gt;] - Start Delay metric is always zero for unaligned checkpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18659&quot;&gt;FLINK-18659&lt;/a&gt;] - FileNotFoundException when writing Hive orc tables
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18663&quot;&gt;FLINK-18663&lt;/a&gt;] - RestServerEndpoint may prevent server shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18665&quot;&gt;FLINK-18665&lt;/a&gt;] - Filesystem connector should use TableSchema exclude computed columns
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18672&quot;&gt;FLINK-18672&lt;/a&gt;] - Fix Scala code examples for UDF type inference annotations
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18677&quot;&gt;FLINK-18677&lt;/a&gt;] - ZooKeeperLeaderRetrievalService does not invalidate leader in case of SUSPENDED connection
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18682&quot;&gt;FLINK-18682&lt;/a&gt;] - Vector orc reader cannot read Hive 2.0.0 table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18697&quot;&gt;FLINK-18697&lt;/a&gt;] - Adding flink-table-api-java-bridge_2.11 to a Flink job kills the IDE logging
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18700&quot;&gt;FLINK-18700&lt;/a&gt;] - Debezium-json format throws Exception when PG table&amp;#39;s IDENTITY config is not FULL
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18705&quot;&gt;FLINK-18705&lt;/a&gt;] - Debezium-JSON throws NPE when tombstone message is received
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18708&quot;&gt;FLINK-18708&lt;/a&gt;] - The links of the connector sql jar of Kafka 0.10 and 0.11 are extinct
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18710&quot;&gt;FLINK-18710&lt;/a&gt;] - ResourceProfileInfo is not serializable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18748&quot;&gt;FLINK-18748&lt;/a&gt;] - Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18749&quot;&gt;FLINK-18749&lt;/a&gt;] - Correct dependencies in Kubernetes pom
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18750&quot;&gt;FLINK-18750&lt;/a&gt;] - SqlValidatorException thrown when select from a view which contains a UDTF call
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18769&quot;&gt;FLINK-18769&lt;/a&gt;] - MiniBatch doesn&amp;#39;t work with FLIP-95 source
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18821&quot;&gt;FLINK-18821&lt;/a&gt;] - Netty client retry mechanism may cause PartitionRequestClientFactory#createPartitionRequestClient to wait infinitely
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18832&quot;&gt;FLINK-18832&lt;/a&gt;] - BoundedBlockingSubpartition does not work with StreamTask
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18856&quot;&gt;FLINK-18856&lt;/a&gt;] - CheckpointCoordinator ignores checkpointing.min-pause
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18859&quot;&gt;FLINK-18859&lt;/a&gt;] - ExecutionGraphNotEnoughResourceTest.testRestartWithSlotSharingAndNotEnoughResources failed with &amp;quot;Condition was not met in given timeout.&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18862&quot;&gt;FLINK-18862&lt;/a&gt;] - Fix LISTAGG throws BinaryRawValueData cannot be cast to StringData exception in runtime
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18867&quot;&gt;FLINK-18867&lt;/a&gt;] - Generic table stored in Hive catalog is incompatible between 1.10 and 1.11
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18900&quot;&gt;FLINK-18900&lt;/a&gt;] - HiveCatalog should error out when listing partitions with an invalid spec
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18902&quot;&gt;FLINK-18902&lt;/a&gt;] - Cannot serve results of asynchronous REST operations in per-job mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18941&quot;&gt;FLINK-18941&lt;/a&gt;] - There are some typos in &amp;quot;Set up JobManager Memory&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18942&quot;&gt;FLINK-18942&lt;/a&gt;] - HiveTableSink shouldn&amp;#39;t try to create BulkWriter factory when using MR writer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18956&quot;&gt;FLINK-18956&lt;/a&gt;] - StreamTask.invoke should catch Throwable instead of Exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18959&quot;&gt;FLINK-18959&lt;/a&gt;] - Fail to archiveExecutionGraph because job is not finished when dispatcher close
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18992&quot;&gt;FLINK-18992&lt;/a&gt;] - Table API renameColumns method annotation error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18993&quot;&gt;FLINK-18993&lt;/a&gt;] - Invoke sanityCheckTotalFlinkMemory method incorrectly in JobManagerFlinkMemoryUtils.java
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18994&quot;&gt;FLINK-18994&lt;/a&gt;] - There is one typo in &amp;quot;Set up TaskManager Memory&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19040&quot;&gt;FLINK-19040&lt;/a&gt;] - SourceOperator is not closing SourceReader
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19061&quot;&gt;FLINK-19061&lt;/a&gt;] - HiveCatalog fails to get partition column stats if partition value contains special characters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19094&quot;&gt;FLINK-19094&lt;/a&gt;] - Revise the description of watermark strategy in Flink Table document
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19108&quot;&gt;FLINK-19108&lt;/a&gt;] - Stop expanding the identifiers with scope aliased by the system with &amp;#39;EXPR$&amp;#39; prefix
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19109&quot;&gt;FLINK-19109&lt;/a&gt;] - Split Reader eats chained periodic watermarks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19121&quot;&gt;FLINK-19121&lt;/a&gt;] - Avoid accessing HDFS frequently in HiveBulkWriterFactory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19133&quot;&gt;FLINK-19133&lt;/a&gt;] - User provided kafka partitioners are not initialized correctly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19148&quot;&gt;FLINK-19148&lt;/a&gt;] - Table crashed in Flink Table API &amp;amp; SQL Docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19166&quot;&gt;FLINK-19166&lt;/a&gt;] - StreamingFileWriter should register Listener before the initialization of buckets
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16619&quot;&gt;FLINK-16619&lt;/a&gt;] - Misleading SlotManagerImpl logging for slot reports of unknown task manager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17075&quot;&gt;FLINK-17075&lt;/a&gt;] - Add task status reconciliation between TM and JM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17285&quot;&gt;FLINK-17285&lt;/a&gt;] - Translate &amp;quot;Python Table API&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17503&quot;&gt;FLINK-17503&lt;/a&gt;] - Make memory configuration logging more user-friendly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18598&quot;&gt;FLINK-18598&lt;/a&gt;] - Add instructions for asynchronous execute in PyFlink doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18618&quot;&gt;FLINK-18618&lt;/a&gt;] - Docker e2e tests are failing on CI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18619&quot;&gt;FLINK-18619&lt;/a&gt;] - Update training to use WatermarkStrategy
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18635&quot;&gt;FLINK-18635&lt;/a&gt;] - Typo in &amp;#39;concepts/timely stream processing&amp;#39; part of the website
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18643&quot;&gt;FLINK-18643&lt;/a&gt;] - Migrate Jenkins jobs to ci-builds.apache.org
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18644&quot;&gt;FLINK-18644&lt;/a&gt;] - Remove obsolete doc for hive connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18730&quot;&gt;FLINK-18730&lt;/a&gt;] - Remove Beta tag from SQL Client docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18772&quot;&gt;FLINK-18772&lt;/a&gt;] - Hide submit job web ui elements when running in per-job/application mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18793&quot;&gt;FLINK-18793&lt;/a&gt;] - Fix Typo for api.common.eventtime.WatermarkStrategy Description
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18797&quot;&gt;FLINK-18797&lt;/a&gt;] - docs and examples use deprecated forms of keyBy
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18816&quot;&gt;FLINK-18816&lt;/a&gt;] - Correct API usage in Pyflink Dependency Management page
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18831&quot;&gt;FLINK-18831&lt;/a&gt;] - Improve the Python documentation about the operations in Table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18839&quot;&gt;FLINK-18839&lt;/a&gt;] - Add documentation about how to use catalog in Python Table API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18847&quot;&gt;FLINK-18847&lt;/a&gt;] - Add documentation about data types in Python Table API
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18849&quot;&gt;FLINK-18849&lt;/a&gt;] - Improve the code tabs of the Flink documents
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18881&quot;&gt;FLINK-18881&lt;/a&gt;] - Modify the Access Broken Link
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19055&quot;&gt;FLINK-19055&lt;/a&gt;] - MemoryManagerSharedResourcesTest contains three tests running extraordinary long
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-19105&quot;&gt;FLINK-19105&lt;/a&gt;] - Table API Sample Code Error
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18666&quot;&gt;FLINK-18666&lt;/a&gt;] - Update japicmp configuration for 1.11.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18667&quot;&gt;FLINK-18667&lt;/a&gt;] - Data Types documentation misunderstand users
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18678&quot;&gt;FLINK-18678&lt;/a&gt;] - Hive connector fails to create vector orc reader if user specifies incorrect hive version
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Thu, 17 Sep 2020 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/09/17/release-1.11.2.html</link>
<guid isPermaLink="true">/news/2020/09/17/release-1.11.2.html</guid>
</item>
<item>
<title>Flink Community Update - August&#39;20</title>
<description>&lt;p&gt;Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/&quot;&gt;Flink Stateful Functions&lt;/a&gt; 2.2 release and a look into how far Flink has come in comparison to 2019.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-past-month-in-flink&quot; id=&quot;markdown-toc-the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-releases&quot; id=&quot;markdown-toc-flink-releases&quot;&gt;Flink Releases&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#getting-ready-for-flink-stateful-functions-22&quot; id=&quot;markdown-toc-getting-ready-for-flink-stateful-functions-22&quot;&gt;Getting Ready for Flink Stateful Functions 2.2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-1102&quot; id=&quot;markdown-toc-flink-1102&quot;&gt;Flink 1.10.2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-flink-improvement-proposals-flips&quot; id=&quot;markdown-toc-new-flink-improvement-proposals-flips&quot;&gt;New Flink Improvement Proposals (FLIPs)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers-and-pmc-members&quot; id=&quot;markdown-toc-new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#new-pmc-members&quot; id=&quot;markdown-toc-new-pmc-members&quot;&gt;New PMC Members&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers&quot; id=&quot;markdown-toc-new-committers&quot;&gt;New Committers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-bigger-picture&quot; id=&quot;markdown-toc-the-bigger-picture&quot;&gt;The Bigger Picture&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-in-2019-the-aftermath&quot; id=&quot;markdown-toc-flink-in-2019-the-aftermath&quot;&gt;Flink in 2019: the Aftermath&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#google-season-of-docs-2020-results&quot; id=&quot;markdown-toc-google-season-of-docs-2020-results&quot;&gt;Google Season of Docs 2020 Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upcoming-events-and-more&quot; id=&quot;markdown-toc-upcoming-events-and-more&quot;&gt;Upcoming Events (and More!)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/h1&gt;
&lt;h2 id=&quot;flink-releases&quot;&gt;Flink Releases&lt;/h2&gt;
&lt;h3 id=&quot;getting-ready-for-flink-stateful-functions-22&quot;&gt;Getting Ready for Flink Stateful Functions 2.2&lt;/h3&gt;
&lt;p&gt;The details of the next release of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/&quot;&gt;Stateful Functions&lt;/a&gt; are under discussion in &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Next-Stateful-Functions-Release-td44063.html&quot;&gt;this @dev mailing list thread&lt;/a&gt;, and the feature freeze is set for &lt;strong&gt;September 10th&lt;/strong&gt; — so, you can expect Stateful Functions 2.2 to be released soon after! Some of the most relevant features in the upcoming release are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DataStream API interoperability&lt;/strong&gt;, allowing users to embed Stateful Functions pipelines in regular &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/datastream_api.html&quot;&gt;DataStream API&lt;/a&gt; programs with &lt;code&gt;DataStream&lt;/code&gt; ingress/egress.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fine-grained control over state&lt;/strong&gt; for remote functions, including the ability to configure different state expiration modes for each individual function.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As the community around StateFun grows, the release cycle will follow this pattern of smaller and more frequent releases to incorporate user feedback and allow for faster iteration. If you’d like to get involved, we’re always looking for &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;new contributors&lt;/a&gt;!&lt;/p&gt;
&lt;h3 id=&quot;flink-1102&quot;&gt;Flink 1.10.2&lt;/h3&gt;
&lt;p&gt;The community has announced the second patch version to cover some outstanding issues in Flink 1.10. You can find a detailed list with all the improvements and bugfixes that went into Flink 1.10.2 in the &lt;a href=&quot;https://flink.apache.org/news/2020/08/25/release-1.10.2.html&quot;&gt;announcement blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;new-flink-improvement-proposals-flips&quot;&gt;New Flink Improvement Proposals (FLIPs)&lt;/h2&gt;
&lt;p&gt;The number of FLIPs being created and discussed in the @dev mailing list is growing week over week, as the Flink 1.12 release takes shape and some longer-term efforts are kicked off. Below are some of the new FLIPs to keep an eye out for!&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;center&gt;#&lt;/center&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741&quot;&gt;FLIP-131&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Consolidate User-Facing APIs and Deprecate the DataSet API&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;The community proposes to deprecate the DataSet API in favor of the Table API/SQL and the DataStream API, in the long run. For this to be feasible, both APIs first need to be &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741#FLIP131:ConsolidatetheuserfacingDataflowSDKs/APIs(anddeprecatetheDataSetAPI)-ProposedChanges&quot;&gt;adapted and expanded&lt;/a&gt; to support the additional use cases currently covered by the DataSet API.&lt;/p&gt;
&lt;p&gt; The first discussion to branch out of this &quot;umbrella&quot; FLIP is around support for a batch execution mode in the DataStream API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Semantics+of+Bounded+Applications+on+the+DataStream+API&quot;&gt;FLIP-134&lt;/a&gt;).&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-135+Approximate+Task-Local+Recovery&quot;&gt;FLIP-135&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Approximate Task-Local Recovery&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;To better accommodate recovery scenarios where a certain amount of data loss is tolerable, but a full pipeline restart is not desirable, the community plans to introduce a new failover strategy that restarts only the failed task(s). Approximate task-local recovery will allow users to trade consistency for fast failure recovery, which is handy for use cases like online training.&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-136%3A++Improve+interoperability+between+DataStream+and+Table+API&quot;&gt;FLIP-136&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Improve the interoperability between DataStream and Table API&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;The Table API has seen a great deal of refactoring and new features in recent releases, but the interfaces to and from the DataStream API haven&#39;t been updated accordingly. The work in this FLIP will cover multiple known gaps to improve interoperability and expose important functionality also to the DataStream API (e.g. changelog handling).&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-139%3A+General+Python+User-Defined+Aggregate+Function+Support+on+Table+API&quot;&gt;FLIP-139&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Support Stateful Python UDFs&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;Python UDFs have been supported in PyFlink &lt;a href=&quot;https://flink.apache.org/news/2020/02/11/release-1.10.0.html#pyflink-support-for-native-user-defined-functions-udfs&quot;&gt;since 1.10&lt;/a&gt;, but were so far limited to stateless functions. The community is now looking to introduce stateful aggregate functions (UDAFs) in the Python Table API.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Note: &lt;/b&gt;Pandas UDAFs are covered in a separate proposal (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink&quot;&gt;FLIP-137&lt;/a&gt;).&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For a complete overview of the development threads coming up in the project, check the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/1.12+Release&quot;&gt;Flink 1.12 Release Wiki&lt;/a&gt; and follow the feature discussions in the &lt;a href=&quot;https://lists.apache.org/list.html?dev@flink.apache.org&quot;&gt;@dev mailing list&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/h2&gt;
&lt;p&gt;The Apache Flink community has welcomed &lt;strong&gt;1 new PMC Member&lt;/strong&gt; and &lt;strong&gt;1 new Committer&lt;/strong&gt; since the last update. Congratulations!&lt;/p&gt;
&lt;h3 id=&quot;new-pmc-members&quot;&gt;New PMC Members&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars3.githubusercontent.com/u/5466492?s=400&amp;amp;u=7e01cfb0dd0e0dc57d181b986a379027bba48ec4&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/dianfu&quot;&gt;Dian Fu&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&quot;new-committers&quot;&gt;New Committers&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars3.githubusercontent.com/u/43608?s=400&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/alpinegizmo&quot;&gt;David Anderson&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;the-bigger-picture&quot;&gt;The Bigger Picture&lt;/h1&gt;
&lt;h2 id=&quot;flink-in-2019-the-aftermath&quot;&gt;Flink in 2019: the Aftermath&lt;/h2&gt;
&lt;p&gt;Roughly a year ago, we did a &lt;a href=&quot;https://flink.apache.org/news/2019/09/10/community-update.html#the-bigger-picture&quot;&gt;roundup of community stats&lt;/a&gt; to understand how far Flink (and the Flink community) had come in 2019. Where does Flink stand now? What changed?&lt;/p&gt;
&lt;p&gt;Perhaps the most impressive result this time around is the surge in activity on the @user-zh mailing list. What started as an effort to better support Chinese-speaking users early in 2019 is now even &lt;strong&gt;exceeding&lt;/strong&gt; the level of activity of the (already very active) main @user mailing list. The @dev list&lt;sup&gt;1&lt;/sup&gt; also registered its highest-ever activity peaks in the months leading up to the release of Flink 1.11!&lt;/p&gt;
&lt;p&gt;For what it’s worth, the Flink GitHub repository is now headed to &lt;strong&gt;15k stars&lt;/strong&gt;, after reaching the 10k milestone last year. If you consider some other numbers we gathered previously on &lt;a href=&quot;https://flink.apache.org/news/2020/04/01/community-update.html#a-look-into-the-flink-repository&quot;&gt;repository activity&lt;/a&gt; and &lt;a href=&quot;https://flink.apache.org/news/2020/07/27/community-update.html#a-look-into-the-evolution-of-flink-releases&quot;&gt;releases over time&lt;/a&gt;, 2020 is looking like one for the books in the Flink community.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-09-04-community-update/2020-09-04-community-update_1.png&quot; width=&quot;1000px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;sup&gt;1. Excluding messages from “jira@apache.org”.&lt;/sup&gt;&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;To put these numbers into perspective, the report for the financial year of 2020 from the Apache Software Foundation (ASF) features Flink as &lt;strong&gt;one of the most active open source projects&lt;/strong&gt;, with mentions for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Most Active Sources: Visits (#2)&lt;/li&gt;
&lt;li&gt;Top Repositories by Number of Commits (#2)&lt;/li&gt;
&lt;li&gt;Top Most Active Apache Mailing Lists (@user (#1) and @dev (#2))&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For more details on where Flink and other open source projects stand in the bigger ASF picture, check out the &lt;a href=&quot;https://www.apache.org/foundation/docs/FY2020AnnualReport.pdf&quot;&gt;full report&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;google-season-of-docs-2020-results&quot;&gt;Google Season of Docs 2020 Results&lt;/h2&gt;
&lt;p&gt;In a &lt;a href=&quot;https://flink.apache.org/news/2020/06/11/community-update.html#google-season-of-docs-2020&quot;&gt;previous update&lt;/a&gt;, we announced that Flink had been selected for &lt;a href=&quot;https://developers.google.com/season-of-docs&quot;&gt;Google Season of Docs (GSoD)&lt;/a&gt; 2020, an initiative to pair technical writers with mentors to work on documentation for open source projects. Today, we’d like to welcome the two technical writers that will be working with the Flink community to improve the Table API/SQL documentation: &lt;strong&gt;Kartik Khare&lt;/strong&gt; and &lt;strong&gt;Muhammad Haseeb Asif&lt;/strong&gt;!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/KKcorps&quot;&gt;Kartik&lt;/a&gt; is a software engineer at Walmart Labs and a regular contributor to multiple Apache projects. He is also a prolific writer on &lt;a href=&quot;https://medium.com/@kharekartik&quot;&gt;Medium&lt;/a&gt; and has previously published on the &lt;a href=&quot;https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html&quot;&gt;Flink blog&lt;/a&gt;. Last year, he contributed to Apache Airflow as part of GSoD and he’s currently revamping the Apache Pinot documentation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.linkedin.com/in/haseebasif/&quot;&gt;Muhammad&lt;/a&gt; is a dual-degree master&#39;s student at KTH and TU Berlin, with a focus on distributed systems and data-intensive processing (in particular, performance optimization of state backends). He writes frequently about Flink on &lt;a href=&quot;https://medium.com/@haseeb1431&quot;&gt;Medium&lt;/a&gt; and you can catch him at &lt;a href=&quot;https://www.flink-forward.org/global-2020/conference-program#flinkndb---skyrocketing-stateful-capabilities-of-apache-flink&quot;&gt;Flink Forward&lt;/a&gt; later this year!&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We’re looking forward to the next 3 months of collaboration, and would like to thank, once again, all the applicants who invested time in their applications for GSoD with Flink.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;upcoming-events-and-more&quot;&gt;Upcoming Events (and More!)&lt;/h1&gt;
&lt;p&gt;With conference season in full swing, we’re glad to see some great Flink content coming up in September! Here, we highlight some of the Flink talks happening soon in virtual events.&lt;/p&gt;
&lt;p&gt;As usual, we also leave you with some resources to read and explore.&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-console&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Events&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;b&gt;ODSC Europe (Sep. 17-19)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://odsc.com/speakers/snakes-on-a-plane-interactive-data-exploration-with-pyflink-and-zeppelin-notebooks/&quot;&gt;Snakes on a Plane: Interactive Data Exploration with PyFlink and Zeppelin Notebooks&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;Big Data LDN (Sep. 23-24)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://bigdataldn.com/&quot;&gt;Flink SQL: From Real-Time Pattern Detection to Online View Maintenance&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;ApacheCon @Home (Sep. 29-Oct.1)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/bigdata-1.html&quot;&gt;Integrate Apache Flink with Cloud Native Ecosystem&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/bigdata-1.html&quot;&gt;Snakes on a Plane: Interactive Data Exploration with PyFlink and Zeppelin Notebooks&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/bigdata-1.html&quot;&gt;Interactive Streaming Data Analytics via Flink on Zeppelin&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/bigdata-2.html&quot;&gt;Flink SQL in 2020: Time to show off!&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/streaming.html&quot;&gt;Change Data Capture with Flink SQL and Debezium&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/streaming.html&quot;&gt;Real-Time Stock Processing With Apache NiFi, Apache Flink and Apache Kafka&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.apachecon.com/acah2020/tracks/iot.html&quot;&gt;Using the Mm FLaNK Stack for Edge AI (Apache MXNet, Apache Flink, Apache NiFi, Apache Kafka, Apache Kudu)&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon-fire&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Blogposts&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;b&gt;Flink 1.11 Series&lt;/b&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/08/20/flink-docker.html&quot;&gt;The State of Flink on Docker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/08/06/external-resource.html&quot;&gt;Accelerating your workload with GPU and other external resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/2020/08/04/pyflink-pandas-udf-support-flink.html&quot;&gt;PyFlink: The integration of Pandas into PyFlink&lt;/a&gt;&lt;/li&gt;
&lt;p&gt;&lt;/p&gt;
&lt;b&gt;Other&lt;/b&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/2020/08/19/statefun.html&quot;&gt;Monitoring and Controlling Networks of IoT Devices with Flink Stateful Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/07/30/demo-fraud-detection-3.html&quot;&gt;Advanced Flink Application Patterns Vol.3: Custom Window Processing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-certificate&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Flink Packages&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;&lt;p&gt;&lt;a href=&quot;https://flink-packages.org/&quot;&gt;Flink Packages&lt;/a&gt; is a website where you can explore (and contribute to) the Flink &lt;br /&gt; ecosystem of connectors, extensions, APIs, tools and integrations. &lt;b&gt;New in:&lt;/b&gt; &lt;/p&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/cdc-connectors&quot;&gt; Flink CDC Connectors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/streaming-flink-file-source&quot;&gt;Flink File Source&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/streaming-flink-dynamodb-connector&quot;&gt;Flink DynamoDB Connector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;p&gt;If you’d like to keep a closer eye on what’s happening in the community, subscribe to the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt; to get fine-grained weekly updates, upcoming event announcements and more.&lt;/p&gt;
</description>
<pubDate>Fri, 04 Sep 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/09/04/community-update.html</link>
<guid isPermaLink="true">/news/2020/09/04/community-update.html</guid>
</item>
<item>
<title>Memory Management improvements for Flink’s JobManager in Apache Flink 1.11</title>
<description>&lt;p&gt;Apache Flink 1.11 comes with significant changes to the memory model of Flink’s JobManager and configuration options for your Flink clusters.
These recently-introduced changes make Flink adaptable to all kinds of deployment environments (e.g. Kubernetes, Yarn, Mesos),
providing better control over its memory consumption.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html&quot;&gt;previous blog post&lt;/a&gt;
focused on the memory model of the TaskManagers and how it was improved in Flink 1.10. This post addresses the same topic but for the JobManager instead.
Flink 1.11 unifies the memory model of Flink’s processes. The newly-introduced memory model of the JobManager follows a similar approach to that of the TaskManagers;
it is simpler and has fewer components and tuning knobs. This post might consequently seem very similar to our previous story on Flink’s memory
but aims at providing a complete overview of Flink’s JobManager memory model as of Flink 1.11. Read on for a full list of updates and changes below!&lt;/p&gt;
&lt;h2 id=&quot;introduction-to-flinks-process-memory-model&quot;&gt;Introduction to Flink’s process memory model&lt;/h2&gt;
&lt;p&gt;Having a clear understanding of Apache Flink’s process memory model allows you to manage resources for the various workloads more efficiently.
The following diagram illustrates the main memory components of a Flink process:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-09-01-flink-1.11-memory-management-improvements/total-process-memory-flink-1.11.png&quot; width=&quot;400px&quot; alt=&quot;Backpressure sampling:high&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Total Process Memory&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The JobManager process is a JVM process. On a high level, its memory consists of the JVM Heap and Off-Heap memory.
These types of memory are consumed by Flink directly or by the JVM for its specific purposes (i.e. metaspace).
There are two major memory consumers within the JobManager process: the framework itself, which consumes memory for internal data structures, network communication, etc.,
and the user code which runs within the JobManager process, e.g. in certain batch sources or during checkpoint completion callbacks.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Please note that the user code has direct access to all memory types: JVM Heap, Direct and Native memory. Therefore, Flink cannot really control its allocation and usage.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;how-to-set-up-jobmanager-memory&quot;&gt;How to set up JobManager memory&lt;/h2&gt;
&lt;p&gt;With the release of Flink 1.11, and in order to provide a better user experience, the Flink community introduced three alternatives for setting up memory in JobManagers.&lt;/p&gt;
&lt;p&gt;The first two alternatives, which are also the simplest, are to configure one of the following options for the total memory available to the JVM process of the JobManager:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Total Process Memory:&lt;/em&gt;&lt;/strong&gt; total memory consumed by the Flink Java application (including user code) and by the JVM to run the whole process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Total Flink Memory:&lt;/em&gt;&lt;/strong&gt; only the memory consumed by the Flink Java application, including user code but excluding any memory allocated by the JVM to run it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is advisable to configure the &lt;em&gt;Total Flink Memory&lt;/em&gt; for standalone deployments where explicitly declaring how much memory is given to Flink is a common practice,
while the outer JVM overhead is of little interest. When deploying Flink in containerized environments
(such as &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/deployment/kubernetes.html&quot;&gt;Kubernetes&lt;/a&gt;,
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/deployment/yarn_setup.html&quot;&gt;Yarn&lt;/a&gt; or
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/deployment/mesos.html&quot;&gt;Mesos&lt;/a&gt;),
the &lt;em&gt;Total Process Memory&lt;/em&gt; option is recommended instead, because it defines the total memory of the requested container.
Containerized environments usually strictly enforce this memory limit.&lt;/p&gt;
&lt;p&gt;If you want more fine-grained control over the size of the &lt;em&gt;JVM Heap&lt;/em&gt;, there is also the third alternative of configuring it directly.
This alternative gives a clear separation between the heap memory and any other memory types.&lt;/p&gt;
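&lt;p&gt;As a rough sketch, each of the three alternatives corresponds to a single option in &lt;code&gt;flink-conf.yaml&lt;/code&gt;. You would typically set only one of them; the sizes below are arbitrary placeholder values, not recommendations:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Alternative 1: Total Process Memory (recommended for containerized deployments)
jobmanager.memory.process.size: 1600m
# Alternative 2: Total Flink Memory (common for standalone deployments)
# jobmanager.memory.flink.size: 1280m
# Alternative 3: configure the JVM Heap directly
# jobmanager.memory.heap.size: 1024m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;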
&lt;p&gt;The remaining memory components will be automatically adjusted either based on their default values or additionally-configured parameters.
Apache Flink also checks the overall consistency. You can find more information about the different memory components in the corresponding
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html&quot;&gt;documentation&lt;/a&gt;.
You can try different configuration options with the &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE/edit#gid=605121894&quot;&gt;configuration spreadsheet&lt;/a&gt;
(you have to make a copy of the spreadsheet to edit it) of &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-116%3A+Unified+Memory+Configuration+for+Job+Managers&quot;&gt;FLIP-116&lt;/a&gt;
and check the corresponding results for your individual case.&lt;/p&gt;
&lt;p&gt;If you are migrating from a Flink version older than 1.11, we suggest following the steps in the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_migration.html#migrate-job-manager-memory-configuration&quot;&gt;migration guide&lt;/a&gt; of the Flink documentation.&lt;/p&gt;
&lt;p&gt;Additionally, you can configure separately the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html#configure-off-heap-memory&quot;&gt;Off-heap memory&lt;/a&gt;
(&lt;em&gt;JVM direct and non-direct memory&lt;/em&gt;) as well as the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html#detailed-configuration&quot;&gt;JVM metaspace &amp;amp; overhead&lt;/a&gt;.
The &lt;em&gt;JVM overhead&lt;/em&gt; is a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup.html#capped-fractionated-components&quot;&gt;fraction&lt;/a&gt; of the &lt;em&gt;Total Process Memory&lt;/em&gt;.
The &lt;em&gt;JVM overhead&lt;/em&gt; can be configured in a similar way as other fractions described in &lt;a href=&quot;https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html#fractions-of-the-total-flink-memory&quot;&gt;our previous blog post&lt;/a&gt;
about the TaskManager’s memory model.&lt;/p&gt;
&lt;h2 id=&quot;more-hints-to-control-the-container-memory-limit&quot;&gt;More hints to control the container memory limit&lt;/h2&gt;
&lt;p&gt;The heap and direct memory usage are managed by the JVM. There are also many other possible sources of native memory consumption in Apache Flink or its user applications
which are not managed directly by Flink or the JVM. Controlling their limits is often difficult, which complicates the debugging of potential memory leaks.
If Flink’s process allocates too much memory in an unmanaged way, it can result in its container being killed in containerized environments.
In this case, it may be hard to understand which type of memory consumption has exceeded its limit and how to resolve the problem.
Flink 1.11 introduces some specific tuning options to clearly represent such components for the JobManager’s process.
Although Flink cannot always enforce strict limits and borders among them, the idea here is to explicitly plan the memory usage.
Below we provide some examples of how the memory setup can prevent containers from exceeding their memory limit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html#configure-off-heap-memory&quot;&gt;User code or its dependencies consume significant off-heap memory&lt;/a&gt;.&lt;/strong&gt;
Tuning the &lt;em&gt;Off-heap&lt;/em&gt; option can assign additional direct or native memory to the user code or any of its dependencies.
Flink cannot control native allocations but it sets the limit for &lt;em&gt;JVM Direct&lt;/em&gt; memory allocations. The Direct memory limit is enforced by the JVM.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html#detailed-configuration&quot;&gt;JVM metaspace requires additional memory&lt;/a&gt;.&lt;/strong&gt;
If you encounter &lt;code&gt;OutOfMemoryError: Metaspace&lt;/code&gt;, Flink provides an option to increase its default limit and the JVM will ensure that it is not exceeded.
The metaspace size of a Flink JVM process is always explicitly set in contrast to the default JVM settings where it is not limited.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html#detailed-configuration&quot;&gt;JVM requires more internal memory&lt;/a&gt;.&lt;/strong&gt;
There is no direct control over certain types of JVM process allocations but Flink provides &lt;em&gt;JVM Overhead&lt;/em&gt; options.
The &lt;em&gt;JVM Overhead&lt;/em&gt; options allow declaring an additional amount of memory, anticipated for those allocations and not covered by other options.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
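&lt;p&gt;For illustration, the scenarios above map to the following options in &lt;code&gt;flink-conf.yaml&lt;/code&gt;. The sizes shown are placeholder values, not recommendations; see the linked documentation for the actual defaults and semantics:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Extra direct/native memory for user code and its dependencies
jobmanager.memory.off-heap.size: 256m
# Larger limit for the JVM metaspace (helps with OutOfMemoryError: Metaspace)
jobmanager.memory.jvm-metaspace.size: 512m
# More room for other JVM-internal allocations, as a fraction of the Total Process Memory
jobmanager.memory.jvm-overhead.fraction: 0.2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;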
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The latest Flink release (&lt;a href=&quot;https://flink.apache.org/downloads.html#apache-flink-1111&quot;&gt;Flink 1.11&lt;/a&gt;) introduces some notable changes to the memory configuration of Flink’s JobManager,
making its memory management significantly easier than before. Stay tuned for more additions and features in upcoming releases.
If you have any suggestions or questions for the Flink community, we encourage you to sign up to the Apache Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt;
and become part of the discussion.&lt;/p&gt;
</description>
<pubDate>Tue, 01 Sep 2020 17:30:00 +0200</pubDate>
<link>https://flink.apache.org/2020/09/01/flink-1.11-memory-management-improvements.html</link>
<guid isPermaLink="true">/2020/09/01/flink-1.11-memory-management-improvements.html</guid>
</item>
<item>
<title>Apache Flink 1.10.2 Released</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix version of the Apache Flink 1.10 series.&lt;/p&gt;
&lt;p&gt;This release includes 73 fixes and minor improvements for Flink 1.10.1. The list below includes a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend all users to upgrade to Flink 1.10.2.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
After FLINK-18242, the deprecated &lt;code&gt;OptionsFactory&lt;/code&gt; and &lt;code&gt;ConfigurableOptionsFactory&lt;/code&gt; classes are removed (not applicable for release-1.10); please use &lt;code&gt;RocksDBOptionsFactory&lt;/code&gt; and &lt;code&gt;ConfigurableRocksDBOptionsFactory&lt;/code&gt; instead. Please also recompile your application code if it contains any class extending &lt;code&gt;DefaultConfigurableOptionsFactory&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
After FLINK-17800, we set &lt;code&gt;setTotalOrderSeek&lt;/code&gt; to true by default for RocksDB’s &lt;code&gt;ReadOptions&lt;/code&gt;, to prevent users from misusing &lt;code&gt;optimizeForPointLookup&lt;/code&gt;. At the same time, we now support customizing &lt;code&gt;ReadOptions&lt;/code&gt; through &lt;code&gt;RocksDBOptionsFactory&lt;/code&gt;. Please set &lt;code&gt;setTotalOrderSeek&lt;/code&gt; back to false if you observe any performance regression (this normally won’t happen, according to our testing).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15836&quot;&gt;FLINK-15836&lt;/a&gt;] - Throw fatal error in KubernetesResourceManager when the pods watcher is closed with exception
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16160&quot;&gt;FLINK-16160&lt;/a&gt;] - Schema#proctime and Schema#rowtime don&amp;#39;t work in TableEnvironment#connect code path
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13689&quot;&gt;FLINK-13689&lt;/a&gt;] - Rest High Level Client for Elasticsearch6.x connector leaks threads if no connection could be established
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14369&quot;&gt;FLINK-14369&lt;/a&gt;] - KafkaProducerAtLeastOnceITCase&amp;gt;KafkaProducerTestBase.testOneToOneAtLeastOnceCustomOperator fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14836&quot;&gt;FLINK-14836&lt;/a&gt;] - Unable to set yarn container number for scala shell in yarn mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14894&quot;&gt;FLINK-14894&lt;/a&gt;] - HybridOffHeapUnsafeMemorySegmentTest#testByteBufferWrap failed on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15758&quot;&gt;FLINK-15758&lt;/a&gt;] - Investigate potential out-of-memory problems due to managed unsafe memory allocation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15849&quot;&gt;FLINK-15849&lt;/a&gt;] - Update SQL-CLIENT document from type to data-type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16309&quot;&gt;FLINK-16309&lt;/a&gt;] - ElasticSearch 7 connector is missing in SQL connector list
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16346&quot;&gt;FLINK-16346&lt;/a&gt;] - BlobsCleanupITCase.testBlobServerCleanupCancelledJob fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16432&quot;&gt;FLINK-16432&lt;/a&gt;] - Building Hive connector gives problems
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16451&quot;&gt;FLINK-16451&lt;/a&gt;] - Fix IndexOutOfBoundsException for DISTINCT AGG with constants
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16510&quot;&gt;FLINK-16510&lt;/a&gt;] - Task manager safeguard shutdown may not be reliable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17092&quot;&gt;FLINK-17092&lt;/a&gt;] - Pyflink test BlinkStreamDependencyTests is instable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17322&quot;&gt;FLINK-17322&lt;/a&gt;] - Enable latency tracker would corrupt the broadcast state
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17420&quot;&gt;FLINK-17420&lt;/a&gt;] - Cannot alias Tuple and Row fields when converting DataStream to Table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17466&quot;&gt;FLINK-17466&lt;/a&gt;] - toRetractStream doesn&amp;#39;t work correctly with Pojo conversion class
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17555&quot;&gt;FLINK-17555&lt;/a&gt;] - docstring of pyflink.table.descriptors.FileSystem:1:duplicate object description of pyflink.table.descriptors.FileSystem
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17558&quot;&gt;FLINK-17558&lt;/a&gt;] - Partitions are released in TaskExecutor Main Thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17562&quot;&gt;FLINK-17562&lt;/a&gt;] - POST /jars/:jarid/plan is not working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17578&quot;&gt;FLINK-17578&lt;/a&gt;] - Union of 2 SideOutputs behaviour incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17639&quot;&gt;FLINK-17639&lt;/a&gt;] - Document which FileSystems are supported by the StreamingFileSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17643&quot;&gt;FLINK-17643&lt;/a&gt;] - LaunchCoordinatorTest fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17700&quot;&gt;FLINK-17700&lt;/a&gt;] - The callback client of JavaGatewayServer should run in a daemon thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17744&quot;&gt;FLINK-17744&lt;/a&gt;] - StreamContextEnvironment#execute cannot be call JobListener#onJobExecuted
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17763&quot;&gt;FLINK-17763&lt;/a&gt;] - No log files when starting scala-shell
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17788&quot;&gt;FLINK-17788&lt;/a&gt;] - scala shell in yarn mode is broken
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17800&quot;&gt;FLINK-17800&lt;/a&gt;] - RocksDB optimizeForPointLookup results in missing time windows
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17801&quot;&gt;FLINK-17801&lt;/a&gt;] - TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17809&quot;&gt;FLINK-17809&lt;/a&gt;] - BashJavaUtil script logic does not work for paths with spaces
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17822&quot;&gt;FLINK-17822&lt;/a&gt;] - Nightly Flink CLI end-to-end test failed with &amp;quot;JavaGcCleanerWrapper$PendingCleanersRunner cannot access class jdk.internal.misc.SharedSecrets&amp;quot; in Java 11
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17870&quot;&gt;FLINK-17870&lt;/a&gt;] - dependent jars are missing to be shipped to cluster in scala shell
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17891&quot;&gt;FLINK-17891&lt;/a&gt;] - FlinkYarnSessionCli sets wrong execution.target type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17959&quot;&gt;FLINK-17959&lt;/a&gt;] - Exception: &amp;quot;CANCELLED: call already cancelled&amp;quot; is thrown when run python udf
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18008&quot;&gt;FLINK-18008&lt;/a&gt;] - HistoryServer does not log environment information on startup
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18012&quot;&gt;FLINK-18012&lt;/a&gt;] - Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18035&quot;&gt;FLINK-18035&lt;/a&gt;] - Executors#newCachedThreadPool could not work as expected
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18045&quot;&gt;FLINK-18045&lt;/a&gt;] - Fix Kerberos credentials checking to unblock Flink on secured MapR
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18048&quot;&gt;FLINK-18048&lt;/a&gt;] - &amp;quot;--host&amp;quot; option could not take effect for standalone application cluster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18097&quot;&gt;FLINK-18097&lt;/a&gt;] - History server doesn&amp;#39;t clean all job json files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18168&quot;&gt;FLINK-18168&lt;/a&gt;] - Error results when use UDAF with Object Array return type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18223&quot;&gt;FLINK-18223&lt;/a&gt;] - AvroSerializer does not correctly instantiate GenericRecord
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18241&quot;&gt;FLINK-18241&lt;/a&gt;] - Custom OptionsFactory in user code not working when configured via flink-conf.yaml
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18242&quot;&gt;FLINK-18242&lt;/a&gt;] - Custom OptionsFactory settings seem to have no effect on RocksDB
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18297&quot;&gt;FLINK-18297&lt;/a&gt;] - SQL client: setting execution.type to invalid value shuts down the session
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18329&quot;&gt;FLINK-18329&lt;/a&gt;] - Dist NOTICE issues
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18352&quot;&gt;FLINK-18352&lt;/a&gt;] - org.apache.flink.core.execution.DefaultExecutorServiceLoader not thread safe
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18517&quot;&gt;FLINK-18517&lt;/a&gt;] - kubernetes session test failed with &amp;quot;java.net.SocketException: Broken pipe&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18539&quot;&gt;FLINK-18539&lt;/a&gt;] - StreamExecutionEnvironment#addSource(SourceFunction, TypeInformation) doesn&amp;#39;t use the user defined type information
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18595&quot;&gt;FLINK-18595&lt;/a&gt;] - Deadlock during job shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18646&quot;&gt;FLINK-18646&lt;/a&gt;] - Managed memory released check can block RPC thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18663&quot;&gt;FLINK-18663&lt;/a&gt;] - RestServerEndpoint may prevent server shutdown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18677&quot;&gt;FLINK-18677&lt;/a&gt;] - ZooKeeperLeaderRetrievalService does not invalidate leader in case of SUSPENDED connection
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18702&quot;&gt;FLINK-18702&lt;/a&gt;] - Flink elasticsearch connector leaks threads and classloaders thereof
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18815&quot;&gt;FLINK-18815&lt;/a&gt;] - AbstractCloseableRegistryTest.testClose unstable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18821&quot;&gt;FLINK-18821&lt;/a&gt;] - Netty client retry mechanism may cause PartitionRequestClientFactory#createPartitionRequestClient to wait infinitely
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18859&quot;&gt;FLINK-18859&lt;/a&gt;] - ExecutionGraphNotEnoughResourceTest.testRestartWithSlotSharingAndNotEnoughResources failed with &amp;quot;Condition was not met in given timeout.&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18902&quot;&gt;FLINK-18902&lt;/a&gt;] - Cannot serve results of asynchronous REST operations in per-job mode
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17844&quot;&gt;FLINK-17844&lt;/a&gt;] - Activate japicmp-maven-plugin checks for @PublicEvolving between bug fix releases (x.y.u -&amp;gt; x.y.v)
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16217&quot;&gt;FLINK-16217&lt;/a&gt;] - SQL Client crashed when any uncatched exception is thrown
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16225&quot;&gt;FLINK-16225&lt;/a&gt;] - Metaspace Out Of Memory should be handled as Fatal Error in TaskManager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16619&quot;&gt;FLINK-16619&lt;/a&gt;] - Misleading SlotManagerImpl logging for slot reports of unknown task manager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16717&quot;&gt;FLINK-16717&lt;/a&gt;] - Use headless service for rpc and blob port when flink on K8S
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17248&quot;&gt;FLINK-17248&lt;/a&gt;] - Make the thread nums of io executor of ClusterEntrypoint and MiniCluster configurable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17503&quot;&gt;FLINK-17503&lt;/a&gt;] - Make memory configuration logging more user-friendly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17819&quot;&gt;FLINK-17819&lt;/a&gt;] - Yarn error unhelpful when forgetting HADOOP_CLASSPATH
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17920&quot;&gt;FLINK-17920&lt;/a&gt;] - Add the Python example of Interval Join in Table API doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17945&quot;&gt;FLINK-17945&lt;/a&gt;] - Improve error reporting of Python CI tests
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17970&quot;&gt;FLINK-17970&lt;/a&gt;] - Increase default value of IO pool executor to 4 * #cores
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18010&quot;&gt;FLINK-18010&lt;/a&gt;] - Add more logging to HistoryServer
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18501&quot;&gt;FLINK-18501&lt;/a&gt;] - Mapping of Pluggable Filesystems to scheme is not properly logged
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18644&quot;&gt;FLINK-18644&lt;/a&gt;] - Remove obsolete doc for hive connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18772&quot;&gt;FLINK-18772&lt;/a&gt;] - Hide submit job web ui elements when running in per-job/application mode
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 25 Aug 2020 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/08/25/release-1.10.2.html</link>
<guid isPermaLink="true">/news/2020/08/25/release-1.10.2.html</guid>
</item>
<item>
<title>The State of Flink on Docker</title>
<description>&lt;p&gt;With over 50 million downloads from Docker Hub, the Flink docker images are a very popular deployment option.&lt;/p&gt;
&lt;p&gt;The Flink community recently put some effort into improving the Docker experience for our users with the goal to reduce confusion and improve usability.&lt;/p&gt;
&lt;p&gt;Let’s quickly break down the recent improvements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Reduce confusion: Flink used to have 2 Dockerfiles and a 3rd file maintained outside of the official repository — all with different features and varying stability. Now, we have one central place for all images: &lt;a href=&quot;https://github.com/apache/flink-docker&quot;&gt;apache/flink-docker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here, we keep all the Dockerfiles for the different releases. Check out the &lt;a href=&quot;https://github.com/apache/flink-docker/blob/master/README.md&quot;&gt;detailed readme&lt;/a&gt; of that repository for further explanation on the different branches, as well as the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification&quot;&gt;Flink Improvement Proposal (FLIP-111)&lt;/a&gt; that contains the detailed planning.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;apache/flink-docker&lt;/code&gt; repository also seeds the &lt;a href=&quot;https://hub.docker.com/_/flink&quot;&gt;official Flink image on Docker Hub&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Improve Usability: The Dockerfiles are used for various purposes: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html&quot;&gt;Native Docker deployments&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/native_kubernetes.html&quot;&gt;Flink on Kubernetes&lt;/a&gt;, the (unofficial) &lt;a href=&quot;https://github.com/docker-flink/examples&quot;&gt;Flink helm example&lt;/a&gt; and the project’s &lt;a href=&quot;https://github.com/apache/flink/tree/master/flink-end-to-end-tests&quot;&gt;internal end to end tests&lt;/a&gt;. With one unified image, all these consumers of the images benefit from the same set of features, documentation and testing.&lt;/p&gt;
&lt;p&gt;The new images support &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html#configure-options&quot;&gt;passing configuration variables&lt;/a&gt; via a &lt;code&gt;FLINK_PROPERTIES&lt;/code&gt; environment variable. Users can &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html#using-plugins&quot;&gt;enable default plugins&lt;/a&gt; with the &lt;code&gt;ENABLE_BUILT_IN_PLUGINS&lt;/code&gt; environment variable. The images also allow loading custom jar paths and configuration files. A short example of both environment variables is shown after this list.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
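&lt;p&gt;For illustration, a minimal JobManager container that sets a configuration property and enables one of the bundled filesystem plugins could be started roughly like this (the plugin jar name depends on the Flink version in the image):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker run \
--env FLINK_PROPERTIES=&quot;jobmanager.rpc.address: jobmanager&quot; \
--env ENABLE_BUILT_IN_PLUGINS=&quot;flink-s3-fs-presto-1.11.1.jar&quot; \
flink:1.11.1 jobmanager
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;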
&lt;p&gt;Looking into the future, there are already some interesting potential improvements lined up:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16260&quot;&gt;Java 11 Docker images&lt;/a&gt; (already completed)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15793&quot;&gt;Use vanilla docker-entrypoint with flink-kubernetes&lt;/a&gt; (in progress)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17167&quot;&gt;History server support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15587&quot;&gt;Support for OpenShift&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;how-do-i-get-started&quot;&gt;How do I get started?&lt;/h2&gt;
&lt;p&gt;This is a short tutorial on &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html#start-a-session-cluster&quot;&gt;how to start a Flink Session Cluster&lt;/a&gt; with Docker.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;Flink Session cluster&lt;/em&gt; can be used to run multiple jobs. Each job needs to be submitted to the cluster after it has been deployed. To deploy a &lt;em&gt;Flink Session cluster&lt;/em&gt; with Docker, you need to start a &lt;em&gt;JobManager&lt;/em&gt; container. To enable communication between the containers, we first set a required Flink configuration property and create a network:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;FLINK_PROPERTIES=&quot;jobmanager.rpc.address: jobmanager&quot;
docker network create flink-network
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then we launch the JobManager:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker run \
--rm \
--name=jobmanager \
--network flink-network \
-p 8081:8081 \
--env FLINK_PROPERTIES=&quot;${FLINK_PROPERTIES}&quot; \
flink:1.11.1 jobmanager
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;and one or more &lt;em&gt;TaskManager&lt;/em&gt; containers:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker run \
--rm \
--name=taskmanager \
--network flink-network \
--env FLINK_PROPERTIES=&quot;${FLINK_PROPERTIES}&quot; \
flink:1.11.1 taskmanager
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You now have a fully functional Flink cluster running! You can access the web front end here: &lt;a href=&quot;http://localhost:8081/&quot;&gt;localhost:8081&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let’s now submit one of Flink’s example jobs:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# 1: (optional) Download the Flink distribution, and unpack it&lt;/span&gt;
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar xf flink-1.11.1-bin-scala_2.12.tgz
&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;flink-1.11.1
&lt;span class=&quot;c&quot;&gt;# 2: Start the Flink job&lt;/span&gt;
./bin/flink run ./examples/streaming/TopSpeedWindowing.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The main steps of the tutorial are also recorded in this short screencast:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/flink-docker/flink-docker.gif&quot; width=&quot;882px&quot; height=&quot;730px&quot; alt=&quot;Demo video&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;strong&gt;Next steps&lt;/strong&gt;: Now that you’ve successfully completed this tutorial, we recommend checking out the full &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html&quot;&gt;Flink on Docker documentation&lt;/a&gt; for implementing more advanced deployment scenarios, such as Job Clusters, Docker Compose or our native Kubernetes integration.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;We encourage all readers to try out Flink on Docker to provide the community with feedback to further improve the experience.
Please use the user@flink.apache.org mailing list (&lt;a href=&quot;https://flink.apache.org/community.html#how-to-subscribe-to-a-mailing-list&quot;&gt;remember to subscribe first&lt;/a&gt;) for general questions, and our &lt;a href=&quot;https://issues.apache.org/jira/issues/?jql=project+%3D+FLINK+AND+component+%3D+flink-docker&quot;&gt;issue tracker&lt;/a&gt; for specific bugs, improvements, or &lt;a href=&quot;https://flink.apache.org/contributing/how-to-contribute.html&quot;&gt;ideas for contributions&lt;/a&gt;!&lt;/p&gt;
</description>
<pubDate>Thu, 20 Aug 2020 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/08/20/flink-docker.html</link>
<guid isPermaLink="true">/news/2020/08/20/flink-docker.html</guid>
</item>
<item>
<title>Monitoring and Controlling Networks of IoT Devices with Flink Stateful Functions</title>
<description>&lt;p&gt;In this blog post, we’ll take a look at a class of use cases that is a natural fit for &lt;a href=&quot;https://flink.apache.org/stateful-functions.html&quot;&gt;Flink Stateful Functions&lt;/a&gt;: monitoring and controlling networks of connected devices (often called the “Internet of Things” (IoT)).&lt;/p&gt;
&lt;p&gt;IoT networks are composed of many individual, but interconnected components, which makes it non-trivial to get any kind of high-level insight into the status, problems, or optimization opportunities in these networks. Each individual device “sees” only its own state, which means that the status of groups of devices, or even the network as a whole, is often a complex aggregation of the individual devices’ state. Diagnosing, controlling, or optimizing these groups of devices thus requires distributed logic that analyzes the “bigger picture” and then acts upon it.&lt;/p&gt;
&lt;p&gt;A powerful approach to implement this is using &lt;em&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Digital_twin&quot;&gt;digital twins&lt;/a&gt;&lt;/em&gt;: each device has a corresponding virtual entity (i.e. the digital twin), which also captures their relationships and interactions. The digital twins track the status of their corresponding devices and send updates to other twins, representing groups (such as geographical regions) of devices. Those, in turn, handle the logic to obtain the network’s aggregated view, or this “bigger picture” we mentioned before.&lt;/p&gt;
&lt;h1 id=&quot;our-scenario-datacenter-monitoring-and-alerting&quot;&gt;Our Scenario: Datacenter Monitoring and Alerting&lt;/h1&gt;
&lt;figure style=&quot;float:right;padding-left:1px;padding-top: 20px&quot;&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/rack.png&quot; width=&quot;350px&quot; /&gt;
&lt;figcaption style=&quot;padding-top: 10px;text-align:center&quot;&gt;&lt;b&gt;Fig.1&lt;/b&gt; An oversimplified view of a data center.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;There are many examples of the digital twins approach in the real world, such as &lt;a href=&quot;https://www.infoq.com/presentations/tesla-vpp/&quot;&gt;smart grids of batteries&lt;/a&gt;, &lt;a href=&quot;https://www.alibabacloud.com/solutions/intelligence-brain/city&quot;&gt;smart cities&lt;/a&gt;, or &lt;a href=&quot;https://www.youtube.com/watch?v=9y27FJgz5-M&quot;&gt;monitoring infrastructure software clusters&lt;/a&gt;. In this blogpost, we’ll use the example of data center monitoring and alert correlation implemented with Stateful Functions.&lt;/p&gt;
&lt;p&gt;Consider a very simplified view of a data center, consisting of many thousands of commodity servers arranged in server racks. Each server rack typically contains up to 40 servers, with a ToR (Top of the Rack) network switch connected to each server. The switches from all the racks connect through a larger switch (&lt;strong&gt;Fig. 1&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;In this datacenter, many things can go wrong: a disk in a server can stop working, network cards can start dropping packets, or ToR switches might cease to function. The entire data center might also be affected by power supply degradation, causing servers to operate at reduced capacity. On-site engineers must be able to identify these incidents quickly and fix them promptly.&lt;/p&gt;
&lt;p&gt;Diagnosing individual server failures is rather straightforward: take a recent history of metric reports from that particular server, analyse it and pinpoint the anomaly. On the other hand, other incidents only make sense “together”, because they share a common root cause. Diagnosing or predicting causes of networking degradation at a rack or datacenter level requires an aggregate view of metrics (such as package drop rates) from the individual machines and racks, and possibly some prediction model or diagnosis code that runs under certain conditions.&lt;/p&gt;
&lt;h2 id=&quot;monitoring-a-virtual-datacenter-via-digital-twins&quot;&gt;Monitoring a Virtual Datacenter via Digital Twins&lt;/h2&gt;
&lt;p&gt;For the sake of this blog post, our oversimplified data center has some servers and racks, each with a unique ID. Each server has a metrics-collecting daemon that publishes metrics to a message queue, and there is a provisioning service that operators use to request the commissioning and decommissioning of servers.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/1.png&quot; width=&quot;550px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Our application will consume these server metrics and commission/decommission events, and produce server/rack/datacenter alerts. There will also be an operator consuming any alerts triggered by the monitoring system. In the next section, we’ll show how this use case can be naturally modeled with Stateful Functions (StateFun).&lt;/p&gt;
&lt;h2 id=&quot;implementing-the-use-case-with-flink-statefun&quot;&gt;Implementing the use case with Flink StateFun&lt;/h2&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;You can find the code for this example at: &lt;a href=&quot;https://github.com/igalshilman/iot-statefun-blogpost&quot;&gt;https://github.com/igalshilman/iot-statefun-blogpost&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The basic building block for modeling a StateFun application is a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/concepts/application-building-blocks.html#stateful-functions&quot;&gt;&lt;em&gt;stateful function&lt;/em&gt;&lt;/a&gt;, which has the following properties:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;It has a unique logical address, and persisted, fault-tolerant state scoped to that address.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It can &lt;em&gt;react&lt;/em&gt; to messages, both internal (i.e. sent from other stateful functions) and external (e.g. a message from Kafka).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Invocations of a specific function are applied serially, so messages sent to a specific address are &lt;strong&gt;not&lt;/strong&gt; executed concurrently.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;There can be many billions of function instances in a single StateFun cluster.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
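&lt;p&gt;To make these properties more concrete, the following is a minimal, hypothetical sketch of such a function written against the embedded Java SDK. The type names (&lt;code&gt;demo/server&lt;/code&gt;, &lt;code&gt;demo/rack&lt;/code&gt;), the counter state and the server-to-rack mapping are made up for illustration and are not part of the original example:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedValue;

/** A sketch of a stateful function: one logical instance exists per server id. */
public class ServerFunSketch implements StatefulFunction {

    public static final FunctionType TYPE = new FunctionType(&amp;quot;demo&amp;quot;, &amp;quot;server&amp;quot;);
    public static final FunctionType RACK_TYPE = new FunctionType(&amp;quot;demo&amp;quot;, &amp;quot;rack&amp;quot;);

    // Persisted, fault-tolerant state, scoped to this function&amp;#39;s address (the server id).
    @Persisted
    private final PersistedValue&amp;lt;Integer&amp;gt; reportsSeen =
            PersistedValue.of(&amp;quot;reports-seen&amp;quot;, Integer.class);

    @Override
    public void invoke(Context context, Object input) {
        // React to a message; invocations for the same address are never run concurrently.
        int seen = reportsSeen.getOrDefault(0) + 1;
        reportsSeen.set(seen);

        // Send a message to another stateful function, e.g. the rack this server belongs to.
        String rackId = &amp;quot;rack-0&amp;quot;; // hypothetical server-to-rack mapping
        context.send(RACK_TYPE, rackId, &amp;quot;reports seen so far: &amp;quot; + seen);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The three functions described below can be thought of as variations of this shape, each with its own address space and state.&lt;/p&gt;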
&lt;p&gt;To model our use case, we’ll define three functions: &lt;strong&gt;ServerFun&lt;/strong&gt;, &lt;strong&gt;RackFun&lt;/strong&gt; and &lt;strong&gt;DataCenterFun&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ServerFun&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Each physical server is represented with its &lt;em&gt;digital twin&lt;/em&gt; stateful function. This function is responsible for:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Maintaining a sliding window of incoming metrics.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Applying a model that decides whether or not to trigger an alert.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Alerting if metrics are missing for too long.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Notifying its containing &lt;strong&gt;RackFun&lt;/strong&gt; about any open incidents.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;RackFun&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;While the &lt;em&gt;ServerFun&lt;/em&gt; is responsible for identifying server-local incidents, we need a function that correlates incidents happening on the different servers deployed in the same rack and:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Collects open incidents reported by the &lt;strong&gt;ServerFun&lt;/strong&gt; functions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Maintains a histogram of currently open incidents on this rack.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Applies a correlation model to the individual incidents sent by the &lt;strong&gt;ServerFun&lt;/strong&gt;, and reports high-level, related incidents as a single incident to the &lt;strong&gt;DataCenterFun&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;DataCenterFun&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This function maintains a view of incidents across different racks in our datacenter.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/2.png&quot; width=&quot;600px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;To summarize our plan:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Leaf functions ingest raw metric data (&lt;span style=&quot;color:blue&quot;&gt;blue&lt;/span&gt; lines), and apply localized logic to trigger an alert.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Intermediate functions operate on already summarized events (&lt;span style=&quot;color:orange&quot;&gt;orange&lt;/span&gt; lines) and correlate them into high-level events.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A root function correlates the high-level events from the intermediate functions into a single &lt;em&gt;healthy/not healthy&lt;/em&gt; value.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;how-does-it-really-look&quot;&gt;How does it really look?&lt;/h2&gt;
&lt;h3 id=&quot;serverfun&quot;&gt;ServerFun&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/3_1.png&quot; width=&quot;600px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;This section associates a behaviour with every message type that the function expects to be invoked with.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;metricsHistory&lt;/code&gt; buffer is our sliding window of the last 15 minutes’ worth of &lt;code&gt;ServerMetricReports&lt;/code&gt;. Note that this buffer is configured to expire entries 15 minutes after they were written (a rough sketch of such a declaration follows this list).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;serverHealthState&lt;/code&gt; represents the current physical server state, open incidents and so on.&lt;/li&gt;
&lt;/ol&gt;
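&lt;p&gt;As a rough illustration (not the post’s actual code), a buffer with such an expiration policy could be declared along these lines in the embedded Java SDK, here with &lt;code&gt;String&lt;/code&gt; as a stand-in for the actual &lt;code&gt;ServerMetricReport&lt;/code&gt; type:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.time.Duration;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.Expiration;
import org.apache.flink.statefun.sdk.state.PersistedAppendingBuffer;

class ServerFunStateSketch {
    // Sliding window of the last 15 minutes of metric reports:
    // entries expire 15 minutes after they were written.
    @Persisted
    final PersistedAppendingBuffer&amp;lt;String&amp;gt; metricsHistory =
            PersistedAppendingBuffer.of(
                    &amp;quot;metricsHistory&amp;quot;,
                    String.class,
                    Expiration.expireAfterWriting(Duration.ofMinutes(15)));
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;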
&lt;p&gt;Let’s take a look at what happens when a &lt;code&gt;ServerMetricReport&lt;/code&gt; message arrives:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/3_2.png&quot; width=&quot;600px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;Retrieve the previously computed &lt;code&gt;serverHealthState&lt;/code&gt; that is kept in state.&lt;/li&gt;
&lt;li&gt;Evaluate a model on the sliding window of previous metric reports, the current metric report, and the previously computed server state to obtain an assessment of the current server health.&lt;/li&gt;
&lt;li&gt;If the server is not believed to be healthy, emit an alert via an alerts topic, and also send a message to our containing rack with all the open incidents that this server currently has.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&quot;alert alert-warning&quot;&gt;
&lt;p&gt;We’ll omit the other handlers for brevity, but it’s important to mention that &lt;b&gt;onTimer&lt;/b&gt; makes sure that metric reports keep coming in periodically; otherwise it triggers an alert stating that we haven’t heard from that server for too long.&lt;/p&gt;
&lt;/div&gt;
&lt;h3 id=&quot;rackfun&quot;&gt;RackFun&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/5.png&quot; width=&quot;650px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;This function keeps a mapping between a &lt;code&gt;ServerId&lt;/code&gt; and a set of open incidents on that server.&lt;/li&gt;
&lt;li&gt;When new alerts are received, this function tries to correlate the alert with any other open alerts on that rack. If a correlated rack alert is present, this function notifies the &lt;strong&gt;DataCenterFun&lt;/strong&gt; about it.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&quot;datacenterfun&quot;&gt;DataCenterFun&lt;/h3&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/6.png&quot; width=&quot;650px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;A persisted mapping between a &lt;code&gt;RackId&lt;/code&gt; and the latest alert that rack reported.&lt;/li&gt;
&lt;li&gt;Through the use of ingress/egress pairs, this function can report back its current view of which racks are currently known to be unhealthy.&lt;/li&gt;
&lt;li&gt;An operator (via a front-end) can send a &lt;code&gt;GetUnhealthyRacks&lt;/code&gt; message addressed to that &lt;strong&gt;DataCenterFun&lt;/strong&gt; and wait for the corresponding &lt;code&gt;UnhealthyRacks&lt;/code&gt; response message. Whenever a rack reports &lt;em&gt;OK&lt;/em&gt;, it’ll be removed from the unhealthy racks map (a rough sketch of this interaction follows this list).&lt;/li&gt;
&lt;/ol&gt;
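&lt;p&gt;A very rough, hypothetical sketch of this bookkeeping (all names invented for illustration, alerts simplified to plain strings) might look as follows:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.io.EgressIdentifier;
import org.apache.flink.statefun.sdk.state.PersistedTable;

public class DataCenterFunSketch implements StatefulFunction {

    // Hypothetical egress that the operator front-end consumes.
    static final EgressIdentifier&amp;lt;String&amp;gt; ALERTS_EGRESS =
            new EgressIdentifier&amp;lt;&amp;gt;(&amp;quot;demo&amp;quot;, &amp;quot;alerts&amp;quot;, String.class);

    // Latest alert reported by each rack id.
    @Persisted
    private final PersistedTable&amp;lt;String, String&amp;gt; latestRackAlert =
            PersistedTable.of(&amp;quot;latest-rack-alert&amp;quot;, String.class, String.class);

    @Override
    public void invoke(Context context, Object input) {
        if (&amp;quot;GetUnhealthyRacks&amp;quot;.equals(input)) {
            // Answer the operator query with the current view of racks that have an open alert.
            context.send(ALERTS_EGRESS, String.join(&amp;quot;,&amp;quot;, latestRackAlert.keys()));
        } else if (&amp;quot;OK&amp;quot;.equals(input)) {
            // The calling rack reported OK: drop it from the unhealthy racks map.
            latestRackAlert.remove(context.caller().id());
        } else if (input instanceof String) {
            // The calling rack reported a correlated alert: remember the latest one per rack.
            latestRackAlert.set(context.caller().id(), (String) input);
        }
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;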
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This pattern — where each layer of functions performs a stateful aggregation of events sent from the previous layer (or the input) — is useful for a whole class of problems. And, although we used connected devices to motivate this use case, it’s not limited to the IoT domain.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-18-statefun/7.png&quot; width=&quot;500px&quot; alt=&quot;&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Stateful Functions provides the building blocks necessary for building complex distributed applications (here the digital twins that support analysis and interactions of the physical entities), while removing common complexities of distributed systems like service discovery, retries, circuit breakers, state management, scalability and similar challenges. If you’d like to learn more about Stateful Functions, head over to the official &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/&quot;&gt;documentation&lt;/a&gt;, where you can also find more hands-on tutorials to try out yourself!&lt;/p&gt;
</description>
<pubDate>Wed, 19 Aug 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/08/19/statefun.html</link>
<guid isPermaLink="true">/2020/08/19/statefun.html</guid>
</item>
<item>
<title>Accelerating your workload with GPU and other external resources</title>
<description>&lt;p&gt;Apache Flink 1.11 introduces a new &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/external_resources.html&quot;&gt;External Resource Framework&lt;/a&gt;,
which allows you to request external resources from the underlying resource management systems (e.g., Kubernetes) and accelerate your workload with
those resources. As Flink provides a first-party GPU plugin at the moment, we will take GPU as an example and show how it affects Flink applications
in the AI field. Other external resources (e.g. RDMA and SSD) can also be supported &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/external_resources.html#implement-a-plugin-for-your-custom-resource-type&quot;&gt;in a pluggable manner&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;end-to-end-real-time-ai-with-gpu&quot;&gt;End-to-end real-time AI with GPU&lt;/h1&gt;
&lt;p&gt;Recently, AI and Machine Learning have gained additional popularity and have been widely used in various scenarios, such
as personalized recommendation and image recognition. &lt;a href=&quot;https://flink.apache.org/&quot;&gt;Flink&lt;/a&gt;, with the ability to support GPU
allocation, can be used to build an end-to-end real-time AI workflow.&lt;/p&gt;
&lt;h2 id=&quot;why-flink&quot;&gt;Why Flink&lt;/h2&gt;
&lt;p&gt;Typical AI workloads fall into two categories: training and inference.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-06-accelerate-with-external-resources/ai-workflow.png&quot; width=&quot;800px&quot; alt=&quot;Typical AI Workflow&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Typical AI Workflow&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The training workload is usually a batch task, in which we train a model from a bounded dataset. On the other hand, the inference
workload tends to be a streaming job. It consumes an unbounded data stream, which contains image data, for example, and uses a model
to produce the output of predictions. Both workloads need to do data preprocessing first. Flink, as a
&lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;unified batch and stream processing engine&lt;/a&gt;, can be used to build an end-to-end AI workflow naturally.&lt;/p&gt;
&lt;p&gt;In many cases, both training and inference workloads can benefit a lot from leveraging GPUs. &lt;a href=&quot;https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/&quot;&gt;Research&lt;/a&gt;
shows that a GPU cluster outperforms a CPU cluster of similar cost by about 400 percent. As training datasets
get bigger and models become more complex, GPU support has become essential for running AI workloads.&lt;/p&gt;
&lt;p&gt;With the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/external_resources.html&quot;&gt;External Resource Framework&lt;/a&gt;
and its &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/external_resources.html#plugin-for-gpu-resources&quot;&gt;GPU plugin&lt;/a&gt;, Flink
can now request GPU resources from the external resource management system and expose GPU information to operators. With this
feature, users can now easily build end-to-end training and real-time inference pipelines with GPU support on Flink.&lt;/p&gt;
&lt;h2 id=&quot;example-mnist-inference-with-flink&quot;&gt;Example: MNIST Inference with Flink&lt;/h2&gt;
&lt;p&gt;We take the MNIST inference task as an example to show how to use the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/external_resources.html&quot;&gt;External Resource Framework&lt;/a&gt;
and how to leverage GPUs in Flink. MNIST is a database of handwritten digits and is usually viewed as the &quot;Hello World&quot; of AI.
The goal is to recognize a 28px*28px image of a digit from 0 to 9.&lt;/p&gt;
&lt;p&gt;First, you need to set configurations for the external resource framework to enable GPU support:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;external-resources: gpu
&lt;span class=&quot;c&quot;&gt;# Define the driver factory class of gpu resource.&lt;/span&gt;
external-resource.gpu.driver-factory.class: org.apache.flink.externalresource.gpu.GPUDriverFactory
&lt;span class=&quot;c&quot;&gt;# Define the amount of gpu resource per TaskManager.&lt;/span&gt;
external-resource.gpu.amount: 1
&lt;span class=&quot;c&quot;&gt;# Enable the coordination mode if you run it in standalone mode&lt;/span&gt;
external-resource.gpu.param.discovery-script.args: --enable-coordination
&lt;span class=&quot;c&quot;&gt;# If you run it on Yarn&lt;/span&gt;
external-resource.gpu.yarn.config-key: yarn.io/gpu
&lt;span class=&quot;c&quot;&gt;# If you run it on Kubernetes&lt;/span&gt;
external-resource.gpu.kubernetes.config-key: nvidia.com/gpu&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For more details of the configuration, please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/external_resources.html#configurations-1&quot;&gt;official documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the MNIST inference task, we first need to read the images and do data preprocessing. You can download &lt;a href=&quot;http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz&quot;&gt;training&lt;/a&gt;
or &lt;a href=&quot;http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz&quot;&gt;testing&lt;/a&gt; data from &lt;a href=&quot;http://yann.lecun.com/exdb/mnist/&quot;&gt;this site&lt;/a&gt;.
We provide a simple &lt;a href=&quot;https://github.com/KarmaGYZ/flink-mnist/blob/master/src/main/java/org/apache/flink/MNISTReader.java&quot;&gt;MNISTReader&lt;/a&gt;.
It will read the image data located in the provided file path and transform each image into a list of floating point numbers.&lt;/p&gt;
&lt;p&gt;Then, we need a classifier to recognize those images. A one-layer pre-trained neural network, whose prediction accuracy is 92.14%,
is used in our classifier operator. To leverage GPUs to accelerate the matrix-vector multiplication, we use &lt;a href=&quot;https://github.com/jcuda/jcuda&quot;&gt;JCuda&lt;/a&gt;
to call the native Cuda API. The prediction logic of the &lt;a href=&quot;https://github.com/KarmaGYZ/flink-mnist/blob/master/src/main/java/org/apache/flink/MNISTClassifier.java&quot;&gt;MNISTClassifier&lt;/a&gt; is shown below.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MNISTClassifier&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichMapFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Configuration&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Get the GPU information and select the first GPU.&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ExternalResourceInfo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;externalResourceInfos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getRuntimeContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getExternalResourceInfos&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resourceName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Optional&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;firstIndexOptional&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;externalResourceInfos&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;iterator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;index&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Initialize JCublas with the selected GPU&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;JCuda&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;cudaSetDevice&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;parseInt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;firstIndexOptional&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()));&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;JCublas&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;cublasInit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Performs multiplication using JCublas. The matrixPointer points to our pre-trained model.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;JCublas&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;cublasSgemv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&amp;#39;n&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DIMENSIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;f1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DIMENSIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;f0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.0f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;matrixPointer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DIMENSIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;f1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inputPointer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;outputPointer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Read the result back from GPU.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;JCublas&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;cublasGetVector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DIMENSIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;f1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Sizeof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;FLOAT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;outputPointer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DIMENSIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;f1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The complete MNIST inference project can be found &lt;a href=&quot;https://github.com/KarmaGYZ/flink-mnist&quot;&gt;here&lt;/a&gt;. In this project, we simply
print the inference result to &lt;strong&gt;STDOUT&lt;/strong&gt;. In an actual production environment, you could also write the result to Elasticsearch or Kafka, for example.&lt;/p&gt;
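&lt;p&gt;For instance, a hypothetical sketch of writing the predictions to a Kafka topic instead of STDOUT (topic name and broker address are placeholders) could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class KafkaSinkSketch {

    /** Writes the classifier output to a Kafka topic instead of printing it. */
    static void writePredictionsToKafka(DataStream&amp;lt;Integer&amp;gt; predictions) {
        Properties props = new Properties();
        props.setProperty(&amp;quot;bootstrap.servers&amp;quot;, &amp;quot;localhost:9092&amp;quot;); // placeholder broker address

        predictions
                .map(String::valueOf)
                .addSink(new FlinkKafkaProducer&amp;lt;&amp;gt;(&amp;quot;mnist-predictions&amp;quot;, new SimpleStringSchema(), props));
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;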
&lt;p&gt;The MNIST inference task is just a simple case that shows you how the external resource framework works and what Flink can
do with GPU support. With Flink’s open source extension &lt;a href=&quot;https://github.com/alibaba/Alink&quot;&gt;Alink&lt;/a&gt;, which contains a lot of
pre-built algorithms based on Flink, and &lt;a href=&quot;https://github.com/alibaba/flink-ai-extended&quot;&gt;Tensorflow on Flink&lt;/a&gt;, more complex
AI workloads, such as online learning or real-time inference services, can be implemented easily as well.&lt;/p&gt;
&lt;h1 id=&quot;other-external-resources&quot;&gt;Other external resources&lt;/h1&gt;
&lt;p&gt;In addition to GPU support, there are many other external resources that can be used to accelerate jobs in some specific scenarios.
For example, FPGAs, which are useful for AI workloads, are supported by both Yarn and Kubernetes. Some low-latency network devices, like RDMA and Solarflare, also
provide device plugins for Kubernetes. Currently, Yarn supports GPUs and FPGAs, while the list of Kubernetes device plugins can be found &lt;a href=&quot;https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#examples&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;With the external resource framework, you only need to implement a plugin that enables the operator to get the information
for these external resources; see &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/external_resources.html#implement-a-plugin-for-your-custom-resource-type&quot;&gt;Custom Plugin&lt;/a&gt;
for more details. If you just want to ensure that an external resource exists in the TaskManager, then you only need to find the
configuration key of that resource in the underlying resource management system and configure the external resource framework accordingly.&lt;/p&gt;
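&lt;p&gt;For example, a purely hypothetical configuration for an FPGA resource on Kubernetes would follow the same pattern as the GPU configuration shown earlier (the config key depends on the vendor’s device plugin and is a placeholder here):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;external-resources: fpga
# Define the amount of fpga resource per TaskManager.
external-resource.fpga.amount: 1
# The resource name understood by the Kubernetes device plugin (vendor specific, placeholder value).
external-resource.fpga.kubernetes.config-key: example.com/fpga&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;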
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;In the latest Flink release (Flink 1.11), an external resource framework has been introduced to support requesting various types of
resources from the underlying resource management systems, and supply all the necessary information for using these resources to the
operators. The first-party GPU plugin expands the application prospects of Flink in the AI domain. Different resource types can be supported
in a pluggable way. You can also implement your own plugins for custom resource types.&lt;/p&gt;
&lt;p&gt;Future developments in this area include implementing operator level resource isolation and fine-grained external resource scheduling.
The community may kick this work off once &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation&quot;&gt;FLIP-56&lt;/a&gt;
is finished. If you have any suggestions or questions for the community, we encourage you to sign up to the Apache Flink
&lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt; and join the discussion there.&lt;/p&gt;
</description>
<pubDate>Thu, 06 Aug 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/08/06/external-resource.html</link>
<guid isPermaLink="true">/news/2020/08/06/external-resource.html</guid>
</item>
<item>
<title>PyFlink: The integration of Pandas into PyFlink</title>
<description>&lt;p&gt;Python has evolved into one of the most important programming languages for many fields of data processing. Its popularity has grown so much that it has pretty much become the default data processing language for data scientists. On top of that, there is a plethora of Python-based data processing tools, such as NumPy, Pandas, and Scikit-learn, that have gained additional popularity due to their flexibility and powerful functionality.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-04-pyflink-pandas/python-scientific-stack.png&quot; width=&quot;450px&quot; alt=&quot;Python Scientific Stack&quot; /&gt;
&lt;/center&gt;
&lt;center&gt;
&lt;a href=&quot;https://speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science?slide=52&quot;&gt;Pic source: VanderPlas 2017, slide 52.&lt;/a&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;In an effort to meet these user needs and demands, the Flink community hopes to leverage and make better use of these tools. Along this direction, the community put great effort into integrating Pandas into PyFlink with the latest Flink version 1.11. Some of the added features include &lt;strong&gt;support for Pandas UDFs&lt;/strong&gt; and the &lt;strong&gt;conversion between Pandas DataFrame and Table&lt;/strong&gt;. Pandas UDFs not only greatly improve the execution performance of Python UDFs, but also make it more convenient for users to leverage libraries such as Pandas and NumPy in their Python UDFs. Additionally, support for the conversion between Pandas DataFrame and Table enables users to switch processing engines seamlessly without the need for an intermediate connector. In the remainder of this article, we will introduce how these functionalities work and how to use them with a step-by-step example.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Currently, only Scalar Pandas UDFs are supported in PyFlink.&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;pandas-udf-in-flink-111&quot;&gt;Pandas UDF in Flink 1.11&lt;/h1&gt;
&lt;p&gt;Using scalar Python UDF was already possible in Flink 1.10 as described in a &lt;a href=&quot;https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html&quot;&gt;previous article on the Flink blog&lt;/a&gt;. Scalar Python UDFs work based on three primary steps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;the Java operator serializes one input row to bytes and sends them to the Python worker;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;the Python worker deserializes the input row and evaluates the Python UDF with it;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;the resulting row is serialized and sent back to the Java operator&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While providing support for Python UDFs in PyFlink greatly improved the user experience, it had some drawbacks, namely resulting in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;High serialization/deserialization overhead&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Difficulty when leveraging popular Python libraries used by data scientists — such as Pandas or NumPy — that provide high-performance data structure and functions.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Pandas UDFs were introduced to address these drawbacks. For a Pandas UDF, a batch of rows is transferred between the JVM and the PVM in a columnar format (the &lt;a href=&quot;https://arrow.apache.org/docs/format/Columnar.html&quot;&gt;Arrow memory format&lt;/a&gt;). The batch of rows is converted into a collection of Pandas Series and handed to the Pandas UDF, which can then leverage popular Python libraries (such as Pandas or NumPy) for its implementation.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-04-pyflink-pandas/vm-communication.png&quot; width=&quot;550px&quot; alt=&quot;VM Communication&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;The performance of vectorized UDFs is usually much higher than that of normal Python UDFs, as the serialization/deserialization overhead is minimized by relying on &lt;a href=&quot;https://arrow.apache.org/&quot;&gt;Apache Arrow&lt;/a&gt;, while handling &lt;code&gt;pandas.Series&lt;/code&gt; as input/output allows us to take full advantage of the Pandas and NumPy libraries. This makes vectorized UDFs a popular solution for parallelizing Machine Learning and other large-scale, distributed data science workloads (e.g. feature engineering, distributed model application).&lt;/p&gt;
&lt;h1 id=&quot;conversion-between-pyflink-table-and-pandas-dataframe&quot;&gt;Conversion between PyFlink Table and Pandas DataFrame&lt;/h1&gt;
&lt;p&gt;Pandas DataFrame is the de-facto standard for working with tabular data in the Python community while PyFlink Table is Flink’s representation of the tabular data in Python. Enabling the conversion between PyFlink Table and Pandas DataFrame allows switching between PyFlink and Pandas seamlessly when processing data in Python. Users can process data by utilizing one execution engine and switch to a different one effortlessly. For example, in case users already have a Pandas DataFrame at hand and want to perform some expensive transformation, they can easily convert it to a PyFlink Table and leverage the power of the Flink engine. On the other hand, users can also convert a PyFlink Table to a Pandas DataFrame and perform the same transformation with the rich functionalities provided by the Pandas ecosystem.&lt;/p&gt;
&lt;h1 id=&quot;examples&quot;&gt;Examples&lt;/h1&gt;
&lt;p&gt;Using Python in Apache Flink requires installing PyFlink, which is available on &lt;a href=&quot;https://pypi.org/project/apache-flink/&quot;&gt;PyPI&lt;/a&gt; and can be easily installed using &lt;code&gt;pip&lt;/code&gt;. Before installing PyFlink, check the working version of Python running in your system using:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python --version
Python 3.7.6&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Please note that Python 3.5 or higher is required to install and run PyFlink&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python -m pip install apache-flink&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;using-pandas-udf&quot;&gt;Using Pandas UDF&lt;/h2&gt;
&lt;p&gt;Pandas UDFs take &lt;code&gt;pandas.Series&lt;/code&gt; as input and return a &lt;code&gt;pandas.Series&lt;/code&gt; of the same length as output. Pandas UDFs can be used in exactly the same places where non-Pandas functions are currently being utilized. To mark a UDF as a Pandas UDF, you only need to add an extra parameter &lt;code&gt;udf_type=&amp;#39;pandas&amp;#39;&lt;/code&gt; to the &lt;code&gt;udf&lt;/code&gt; decorator:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;pandas&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;interpolate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# takes id: pandas.Series and temperature: pandas.Series as input&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# use interpolate() to interpolate the missing temperature&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;groupby&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;limit_direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;both&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# output temperature: pandas.Series&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The Pandas UDF above uses the Pandas &lt;code&gt;dataframe.interpolate()&lt;/code&gt; function to interpolate the missing temperature data for each equipment id. This is a common IoT scenario in which each equipment/device reports its id and temperature to be analyzed, but the temperature field may be null for various reasons.
With the function defined, you can register and use it in the same way as a &lt;a href=&quot;https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html&quot;&gt;normal Python UDF&lt;/a&gt;. Below is a complete example of how to use the Pandas UDF in PyFlink.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table.udf&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pandas&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pd&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_parallelism&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_configuration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_boolean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;python.fn-execution.memory.managed&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;pandas&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;interpolate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# takes id: pandas.Series and temperature: pandas.Series as input&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# use interpolate() to interpolate the missing temperature&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;groupby&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;limit_direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;both&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# output temperature: pandas.Series&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;register_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;interpolate&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;my_source_ddl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; create table mySource (&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; id INT,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; temperature FLOAT &lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; ) with (&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;connector.type&amp;#39; = &amp;#39;filesystem&amp;#39;,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;format.type&amp;#39; = &amp;#39;csv&amp;#39;,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;connector.path&amp;#39; = &amp;#39;/tmp/input&amp;#39;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; )&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;my_sink_ddl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; create table mySink (&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; id INT,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; temperature FLOAT &lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; ) with (&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;connector.type&amp;#39; = &amp;#39;filesystem&amp;#39;,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;format.type&amp;#39; = &amp;#39;csv&amp;#39;,&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; &amp;#39;connector.path&amp;#39; = &amp;#39;/tmp/output&amp;#39;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt; )&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute_sql&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_source_ddl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute_sql&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_sink_ddl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySource&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;\
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;id, interpolate(id, temperature) as temperature&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert_into&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySink&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;pandas_udf_demo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To submit the job:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Firstly, you need to prepare the input data in the “/tmp/input” file. For example,&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; -e &lt;span class=&quot;s2&quot;&gt;&amp;quot;1,98.0\n1,\n1,100.0\n2,99.0&amp;quot;&lt;/span&gt; &amp;gt; /tmp/input&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Next, you can run this example on the command line,&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python pandas_udf_demo.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The command builds and runs the Python Table API program in a local mini-cluster. You can also submit the Python Table API program to a remote cluster using different command lines, see more details &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/cli.html#job-submission-examples&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Finally, you can see the execution result on the command line. As you can see, all the temperature data with an empty value has been interpolated:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt; cat /tmp/output
1,98.0
1,99.0
1,100.0
2,99.0&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;conversion-between-pyflink-table-and-pandas-dataframe-1&quot;&gt;Conversion between PyFlink Table and Pandas DataFrame&lt;/h2&gt;
&lt;p&gt;You can use the &lt;code&gt;from_pandas()&lt;/code&gt; method to create a PyFlink Table from a Pandas DataFrame or use the &lt;code&gt;to_pandas()&lt;/code&gt; method to convert a PyFlink Table to a Pandas DataFrame.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pandas&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pd&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;np&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# Create a PyFlink Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_pandas&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;a&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;b&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;a &amp;gt; 0.5&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# Convert the PyFlink Table to a Pandas DataFrame&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_pandas&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1 id=&quot;conclusion--upcoming-work&quot;&gt;Conclusion &amp;amp; Upcoming work&lt;/h1&gt;
&lt;p&gt;In this article, we introduced the integration of Pandas in Flink 1.11, covering Pandas UDFs and the conversion between PyFlink Tables and Pandas DataFrames. In fact, the latest Apache Flink release adds many more excellent features to PyFlink, such as support for user-defined table functions and user-defined metrics for Python UDFs. What’s more, from Flink 1.11 you can build PyFlink with Cython support and “Cythonize” your Python UDFs to substantially improve code execution speed (up to 30x faster than Python UDFs in Flink 1.10).&lt;/p&gt;
&lt;p&gt;Future work by the community will focus on adding more features and bringing additional optimizations with follow up releases. Such optimizations and additions include a Python DataStream API and more integration with the Python ecosystem, such as support for distributed Pandas in Flink. Stay tuned for more information and updates with the upcoming releases!&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-08-04-pyflink-pandas/mission-of-pyFlink.gif&quot; width=&quot;600px&quot; alt=&quot;Mission of PyFlink&quot; /&gt;
&lt;/center&gt;
</description>
<pubDate>Tue, 04 Aug 2020 02:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/08/04/pyflink-pandas-udf-support-flink.html</link>
<guid isPermaLink="true">/2020/08/04/pyflink-pandas-udf-support-flink.html</guid>
</item>
<item>
<title>Advanced Flink Application Patterns Vol.3: Custom Window Processing</title>
<description>&lt;style type=&quot;text/css&quot;&gt;
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{padding:10px 10px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;}
.tg th{padding:10px 10px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;background-color:#eff0f1;}
.tg .tg-wide{padding:10px 30px;}
.tg .tg-top{vertical-align:top}
.tg .tg-topcenter{text-align:center;vertical-align:top}
.tg .tg-center{text-align:center;vertical-align:center}
&lt;/style&gt;
&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the previous articles of the series, we described how you can achieve
flexible stream partitioning based on dynamically-updated configurations
(a set of fraud-detection rules) and how you can utilize Flink&#39;s
Broadcast mechanism to distribute processing configuration at runtime
among the relevant operators. &lt;/p&gt;
&lt;p&gt;Following up directly where we left the discussion of the end-to-end
solution last time, in this article we will describe how you can use the
&quot;Swiss knife&quot; of Flink - the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/process_function.html&quot;&gt;&lt;em&gt;Process Function&lt;/em&gt;&lt;/a&gt; to create an
implementation that is tailor-made to match your streaming business
logic requirements. Our discussion will continue in the context of the
&lt;a href=&quot;/news/2020/01/15/demo-fraud-detection.html#fraud-detection-demo&quot;&gt;Fraud Detection engine&lt;/a&gt;. We will also demonstrate how you can
implement your own &lt;strong&gt;custom replacement for time windows&lt;/strong&gt; for cases
where the out-of-the-box windowing available from the DataStream API
does not satisfy your requirements. In particular, we will look at the
trade-offs that you can make when designing a solution which requires
low-latency reactions to individual events.&lt;/p&gt;
&lt;p&gt;This article will describe some high-level concepts that can be applied
independently, but it is recommended that you review the material in
&lt;a href=&quot;/news/2020/01/15/demo-fraud-detection.html&quot;&gt;part one&lt;/a&gt; and
&lt;a href=&quot;/news/2020/03/24/demo-fraud-detection-2.html&quot;&gt;part two&lt;/a&gt; of the series as well as check out the &lt;a href=&quot;https://github.com/afedulov/fraud-detection-demo&quot;&gt;code
base&lt;/a&gt; in order to make
it easier to follow along.&lt;/p&gt;
&lt;h2 id=&quot;processfunction-as-a-window&quot;&gt;ProcessFunction as a “Window”&lt;/h2&gt;
&lt;h3 id=&quot;low-latency&quot;&gt;Low Latency&lt;/h3&gt;
&lt;p&gt;Let’s start with a reminder of the type of fraud detection rule that we
would like to support:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;“Whenever the &lt;strong&gt;sum&lt;/strong&gt; of  &lt;strong&gt;payments&lt;/strong&gt; from the same &lt;strong&gt;payer&lt;/strong&gt; to the
same &lt;strong&gt;beneficiary&lt;/strong&gt; within &lt;strong&gt;a 24 hour
period&lt;/strong&gt; is &lt;strong&gt;greater&lt;/strong&gt; than &lt;strong&gt;200 000 $&lt;/strong&gt; - trigger an alert.”&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In other words, given a stream of transactions partitioned by a key that
combines the payer and the beneficiary fields, we would like to look
back in time and determine, for each incoming transaction, if the sum of
all previous payments between the two specific participants exceeds the
defined threshold. In effect, the computation window is always moved
along to the position of the last observed event for a particular data
partitioning key.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/time-windows.png&quot; width=&quot;600px&quot; alt=&quot;Figure 1: Time Windows&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 1: Time Windows&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;One of the common key requirements for a fraud detection system is &lt;em&gt;low
response time&lt;/em&gt;. The sooner the fraudulent action gets detected, the
higher the chances that it can be blocked and its negative consequences
mitigated. This requirement is especially prominent in the financial
domain, where you have one important constraint - any time spent
evaluating a fraud detection model is time that a law-abiding user of
your system will spend waiting for a response. Swiftness of processing
often becomes a competitive advantage between various payment systems
and the time limit for producing an alert could lie as low as &lt;em&gt;300-500
ms&lt;/em&gt;. This is all the time you get from the moment of ingestion of a
transaction event into a fraud detection system until an alert has to
become available to downstream systems. &lt;/p&gt;
&lt;p&gt;As you might know, Flink provides a powerful &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/windows.html&quot;&gt;Window
API&lt;/a&gt;
that is applicable for a wide range of use cases. However, if you go
over all of the available types of supported windows, you will realize
that none of them exactly match our main requirement for this use case -
the low-latency evaluation of &lt;em&gt;each&lt;/em&gt; incoming transaction. There is
no type of window in Flink that can express the &lt;em&gt;“x minutes/hours/days
back from the &lt;u&gt;current event&lt;/u&gt;”&lt;/em&gt; semantic. In the Window API, events
fall into windows (as defined by the window
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/windows.html#window-assigners&quot;&gt;assigners&lt;/a&gt;),
but they cannot themselves individually control the creation and
evaluation of windows*. As described above, our goal for the fraud
detection engine is to achieve immediate evaluation of the previous
relevant data points as soon as the new event is received. This raises
the question of whether applying the Window API is feasible in this case. The Window API offers some options for defining custom triggers, evictors, and window assigners, which might achieve the required result. However, it is usually difficult to get this right (and easy to break). Moreover, this approach does not provide access to broadcast state, which is required for implementing dynamic reconfiguration of business rules.&lt;/p&gt;
&lt;p&gt;*) apart from the session windows, but they are limited to assignments
based on the session &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/windows.html#session-windows&quot;&gt;gaps&lt;/a&gt;&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/evaluation-delays.png&quot; width=&quot;600px&quot; alt=&quot;Figure 2: Evaluation Delays&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 2: Evaluation Delays&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Let’s take an example of using a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/windows.html#sliding-windows&quot;&gt;sliding
window&lt;/a&gt;
from Flink’s Window API. Using sliding windows with the slide of &lt;em&gt;S&lt;/em&gt;
translates into an expected value of evaluation delay equal to &lt;em&gt;S/2.&lt;/em&gt;
This means that you would need to define a window slide of 600-1000 ms
to fulfill the low-latency requirement of 300-500 ms delay, even before
taking any actual computation time into account. The fact that Flink
stores a separate window state for each sliding window pane renders this
approach unfeasible under any moderately high load conditions.&lt;/p&gt;
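&lt;p&gt;A quick back-of-the-envelope calculation illustrates the scale of the problem (this snippet is only an illustration and is not part of the demo project):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;public class SlidingWindowMath {
  public static void main(String[] args) {
    long windowMillis = 24 * 60 * 60 * 1000L; // 24 hour rule window
    long slideMillis = 600;                   // slide giving an expected evaluation delay of ~300 ms (S/2)
    long panesPerKey = windowMillis / slideMillis;
    // each of these panes is backed by its own window state
    System.out.println(&amp;quot;Overlapping panes per key: &amp;quot; + panesPerKey); // prints 144000
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;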
&lt;p&gt;In order to satisfy the requirements, we need to create our own
low-latency window implementation. Luckily, Flink gives us all the tools
required to do so. &lt;code&gt;ProcessFunction&lt;/code&gt; is a low-level, but powerful
building block in Flink&#39;s API. It has a simple contract:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SomeProcessFunction&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;KeyType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;InputType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OutputType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InputType&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OutputType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;onTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OnTimerContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OutputType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Configuration&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;processElement()&lt;/code&gt; receives input events one by one. You can react to
each input by producing one or more output events to the next
operator by calling &lt;code&gt;out.collect(someOutput)&lt;/code&gt;. You can also pass data
to a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/side_output.html&quot;&gt;side
output&lt;/a&gt;
or ignore a particular input altogether.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;onTimer()&lt;/code&gt; is called by Flink when a previously-registered timer
fires. Both event time and processing time timers are supported.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;open()&lt;/code&gt; is equivalent to a constructor. It is called inside of the
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/glossary.html#flink-taskmanager&quot;&gt;TaskManager’s&lt;/a&gt;
JVM, and is used for initialization, such as registering
Flink-managed state. It is also the right place to initialize fields
that are not serializable and cannot be transferred from the
JobManager’s JVM.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most importantly, &lt;code&gt;ProcessFunction&lt;/code&gt; also has access to the fault-tolerant
state, handled by Flink. This combination, together with Flink&#39;s
message processing and delivery guarantees, makes it possible to build
resilient event-driven applications with almost arbitrarily
sophisticated business logic. This includes creation and processing of
custom windows with state.&lt;/p&gt;
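&lt;p&gt;To make the role of &lt;code&gt;open()&lt;/code&gt; more concrete, here is a minimal sketch of how the &lt;code&gt;MapState&lt;/code&gt; used in the implementation below could be registered (the descriptor name is an illustrative assumption; the state API calls are standard Flink):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch only: registering Flink-managed state inside open().
// Uses org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
// and org.apache.flink.api.common.typeinfo.{Types, TypeInformation, TypeHint}.
private transient MapState&amp;lt;Long, Set&amp;lt;Transaction&amp;gt;&amp;gt; windowState;

@Override
public void open(Configuration parameters) {
  MapStateDescriptor&amp;lt;Long, Set&amp;lt;Transaction&amp;gt;&amp;gt; descriptor =
      new MapStateDescriptor&amp;lt;&amp;gt;(
          &amp;quot;windowState&amp;quot;,
          Types.LONG,
          TypeInformation.of(new TypeHint&amp;lt;Set&amp;lt;Transaction&amp;gt;&amp;gt;() {}));
  windowState = getRuntimeContext().getMapState(descriptor);
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;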
&lt;h3 id=&quot;implementation&quot;&gt;Implementation&lt;/h3&gt;
&lt;h4 id=&quot;state-and-clean-up&quot;&gt;State and Clean-up&lt;/h4&gt;
&lt;p&gt;In order to be able to process time windows, we need to keep track of
data belonging to the window inside of our program. To ensure that this
data is fault-tolerant and can survive failures in a distributed system,
we should store it inside of Flink-managed state. As time progresses, we
do not need to keep all previous transactions. According to the sample
rule, all events that are older than 24 hours become irrelevant. We are
looking at a window of data that constantly moves and from which stale
transactions need to be continuously moved out of scope (in other words,
cleaned up from state).&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/window-clean-up.png&quot; width=&quot;400px&quot; alt=&quot;Figure 3: Window Clean-up&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 3: Window Clean-up&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;We will
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/state/state.html#using-keyed-state&quot;&gt;use&lt;/a&gt;
&lt;code&gt;MapState&lt;/code&gt; to store the individual events of the window. In order to allow
efficient clean-up of the out-of-scope events, we will utilize event
timestamps as the &lt;code&gt;MapState&lt;/code&gt; keys.&lt;/p&gt;
&lt;p&gt;In the general case, we have to take into account the fact that
different events might share exactly the same timestamp; therefore,
instead of storing an individual &lt;code&gt;Transaction&lt;/code&gt; per key (timestamp), we will store sets.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;MapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Side Note &lt;/span&gt;
when any Flink-managed state is used inside a
&lt;code&gt;KeyedProcessFunction&lt;/code&gt;, the data returned by the &lt;code&gt;state.value()&lt;/code&gt; call is
automatically scoped by the key of the &lt;em&gt;currently-processed event&lt;/em&gt;
- see Figure 4. If &lt;code&gt;MapState&lt;/code&gt; is used, the same principle applies, with
the difference that a &lt;code&gt;Map&lt;/code&gt; is returned instead of &lt;code&gt;MyObject&lt;/code&gt;. If you are
compelled to do something like
&lt;code&gt;mapState.value().get(inputEvent.getKey())&lt;/code&gt;, you should probably be using
&lt;code&gt;ValueState&lt;/code&gt; instead of the &lt;code&gt;MapState&lt;/code&gt;. As we want to store &lt;em&gt;multiple values
per event key&lt;/em&gt;, in our case, &lt;code&gt;MapState&lt;/code&gt; is the right choice.&lt;/p&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/keyed-state-scoping.png&quot; width=&quot;800px&quot; alt=&quot;Figure 4: Keyed State Scoping&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 4: Keyed State Scoping&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;/div&gt;
&lt;p&gt;As described in the &lt;a href=&quot;/news/2020/01/15/demo-fraud-detection.html&quot;&gt;first blog of the series&lt;/a&gt;, we are dispatching events based on the keys
specified in the active fraud detection rules. Multiple distinct rules
can be based on the same grouping key. This means that our alerting
function can potentially receive transactions scoped by the same key
(e.g. &lt;code&gt;{payerId=25;beneficiaryId=12}&lt;/code&gt;), but destined to be evaluated
according to different rules, which implies potentially different
lengths of the time windows. This raises the question of how we can best
store fault-tolerant window state within the &lt;code&gt;KeyedProcessFunction&lt;/code&gt;. One
approach would be to create and manage separate &lt;code&gt;MapStates&lt;/code&gt; per rule. Such
an approach, however, would be wasteful - we would separately hold state
for overlapping time windows, and therefore unnecessarily store
duplicate events. A better approach is to always store just enough data
to be able to estimate all currently active rules which are scoped by
the same key. In order to achieve that, whenever a new rule is added, we
will determine whether its time window has the largest span and, if so,
store that rule in the broadcast state under the special reserved &lt;code&gt;WIDEST_RULE_KEY&lt;/code&gt;. This
information will later be used during the state clean-up procedure, as
described later in this section.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processBroadcastElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;updateWidestWindowRule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;updateWidestWindowRule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;widestWindowRule&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WIDEST_RULE_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;widestWindowRule&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WIDEST_RULE_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;widestWindowRule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWindowMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWindowMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WIDEST_RULE_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let’s now look at the implementation of the main method,
&lt;code&gt;processElement()&lt;/code&gt;, in some detail.&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&quot;/news/2020/01/15/demo-fraud-detection.html#dynamic-data-partitioning&quot;&gt;previous blog post&lt;/a&gt;, we described how &lt;code&gt;DynamicKeyFunction&lt;/code&gt; allowed
us to perform dynamic data partitioning based on the &lt;code&gt;groupingKeyNames&lt;/code&gt;
parameter in the rule definition. The subsequent description is focused
around the &lt;code&gt;DynamicAlertFunction&lt;/code&gt;, which makes use of the remaining rule
settings.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/sample-rule-definition.png&quot; width=&quot;700px&quot; alt=&quot;Figure 5: Sample Rule Definition&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 5: Sample Rule Definition&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;As described in the previous parts of the blog post
series, our alerting process function receives events of type
&lt;code&gt;Keyed&amp;lt;Transaction, String, Integer&amp;gt;&lt;/code&gt;, where &lt;code&gt;Transaction&lt;/code&gt; is the main
“wrapped” event, &lt;code&gt;String&lt;/code&gt; is the key (&lt;em&gt;payer #x - beneficiary #y&lt;/em&gt; in
Figure 1), and &lt;code&gt;Integer&lt;/code&gt; is the ID of the rule that caused the dispatch of
this event. This rule was previously &lt;a href=&quot;/news/2020/03/24/demo-fraud-detection-2.html#broadcast-state-pattern&quot;&gt;stored in the broadcast state&lt;/a&gt; and has to be retrieved from that state by the ID. Here is the
outline of the implementation:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;DynamicAlertFunction&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedBroadcastProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReadOnlyContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Add Transaction to state&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWrapped&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (1)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;addToStateValuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWrapped&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Calculate the aggregate value&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getBroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Descriptors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;rulesDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (2)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowStartForEvent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWindowStartTimestampFor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (3)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SimpleAccumulator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BigDecimal&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RuleHelper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getAggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (4)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isStateValueInWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowStartForEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;aggregateValuesInState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Evaluate the rule and trigger an alert if violated&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BigDecimal&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregateResult&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getLocalValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (5)&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;isRuleViolated&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aggregateResult&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isRuleViolated&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;decisionTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;currentTimeMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRuleId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getKey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;decisionTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getWrapped&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;aggregateResult&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Register timers to ensure state cleanup&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cleanupTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// &amp;lt;--- (6)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;timerService&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;registerEventTimeTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cleanupTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;
Here are the details of the steps:&lt;br /&gt;
1) We first add each new event to our window state:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;addToStateValuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;K&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;V&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HashSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mapState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;valuesSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;2) Next, we retrieve the previously-broadcasted rule, according to
which the incoming transaction needs to be evaluated.&lt;/p&gt;
&lt;p&gt;3) Given the window span defined in the rule and the current transaction
timestamp, &lt;code&gt;getWindowStartTimestampFor&lt;/code&gt; determines how far back in
time our evaluation should reach.&lt;/p&gt;
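&lt;p&gt;A minimal sketch of such a helper (the field name is an illustrative assumption; the actual implementation lives in the &lt;code&gt;Rule&lt;/code&gt; class of the demo project):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch only: the evaluation window reaches back windowMillis from the current event.
public long getWindowStartTimestampFor(long currentEventTime) {
  return currentEventTime - windowMillis; // windowMillis is the window span configured in the rule
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;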
&lt;p&gt;4) The aggregate value is calculated by iterating over all window state
entries and applying an aggregate function. It could be an &lt;em&gt;average,
max, min&lt;/em&gt; or, as in the example rule from the beginning of this
section, a &lt;em&gt;sum&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isStateValueInWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowStartForEvent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowStartForEvent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;aggregateValuesInState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleAccumulator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BigDecimal&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inWindow&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BigDecimal&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aggregatedValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FieldsExtractor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getBigDecimalByName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getAggregateFieldName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;aggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aggregatedValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;5) Having an aggregate value, we can compare it to the threshold value
that is specified in the rule definition and fire an alert, if
necessary.&lt;/p&gt;
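&lt;p&gt;Conceptually, &lt;code&gt;rule.apply(aggregateResult)&lt;/code&gt; boils down to comparing the aggregate with the limit configured in the rule. A simplified sketch (the operator and field names are assumptions made for illustration):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch only: evaluate the aggregate against the configured limit.
public boolean apply(BigDecimal aggregateResult) {
  switch (limitOperator) {
    case GREATER:       return aggregateResult.compareTo(limit) &amp;gt; 0;
    case GREATER_EQUAL: return aggregateResult.compareTo(limit) &amp;gt;= 0;
    case LESS:          return aggregateResult.compareTo(limit) &amp;lt; 0;
    default:            return aggregateResult.compareTo(limit) &amp;lt;= 0; // LESS_EQUAL
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;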
&lt;p&gt;6) At the end, we register a clean-up timer using
&lt;code&gt;ctx.timerService().registerEventTimeTimer()&lt;/code&gt;. This timer will be
responsible for removing the current transaction once it moves out of
scope.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note &lt;/span&gt;
Notice the rounding during timer creation. It is an important technique
which enables a reasonable trade-off between the precision with which
the timers will be triggered, and the number of timers being used.
Timers are stored in Flink’s fault-tolerant state, and managing them
with millisecond-level precision can be wasteful. In our case, with this
rounding, we will create at most one timer per key in any given second. Flink documentation provides some additional &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/operators/process_function.html#timer-coalescing&quot;&gt;&lt;u&gt;details&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;7) The &lt;code&gt;onTimer&lt;/code&gt; method will trigger the clean-up of the window state.&lt;/p&gt;
&lt;p&gt;As previously described, we are always keeping as many events in the
state as required for the evaluation of an active rule with the widest
window span. This means that during the clean-up, we only need to remove
the state which is out of scope of this widest window.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/widest-window.png&quot; width=&quot;800px&quot; alt=&quot;Figure 6: Widest Window&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 6: Widest Window&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;This is how the clean-up procedure can be implemented:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;onTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OnTimerContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;widestWindowRule&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getBroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Descriptors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;rulesDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WIDEST_RULE_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Optional&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cleanupEventTimeWindow&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Optional&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ofNullable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;widestWindowRule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;Rule:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getWindowMillis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Optional&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cleanupEventTimeThreshold&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cleanupEventTimeWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;window&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Remove events that are older than (timestamp - widestWindowSpan)ms&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cleanupEventTimeThreshold&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ifPresent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;evictOutOfScopeElementsFromWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;evictOutOfScopeElementsFromWindow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;threshold&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Iterator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;windowState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;iterator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;hasNext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stateEventTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;threshold&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;RuntimeException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
You might be wondering why we did not use &lt;code&gt;ListState&lt;/code&gt;, given that we always
iterate over all of the values of the window state. This is in fact
an optimization for the case when &lt;code&gt;RocksDBStateBackend&lt;/code&gt;
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/state/state_backends.html#the-rocksdbstatebackend&quot;&gt;is used&lt;/a&gt;. Iterating over a &lt;code&gt;ListState&lt;/code&gt; would cause all of the &lt;code&gt;Transaction&lt;/code&gt;
objects to be deserialized. Using &lt;code&gt;MapState&lt;/code&gt;&#39;s keys iterator only causes
deserialization of the keys (of type &lt;code&gt;long&lt;/code&gt;), and therefore reduces the
computational overhead.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This concludes the description of the implementation details. Our
approach triggers evaluation of a time window as soon as a new
transaction arrives. It therefore fulfills the main requirement that we
have targeted - low delay for potentially issuing an alert. For the
complete implementation, please have a look at
&lt;a href=&quot;https://github.com/afedulov/fraud-detection-demo&quot;&gt;the project on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;improvements-and-optimizations&quot;&gt;Improvements and Optimizations&lt;/h2&gt;
&lt;p&gt;What are the pros and cons of the described approach?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Low latency capabilities&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Tailored solution with potential use-case specific optimizations&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Efficient state reuse (shared state for the rules with the same key)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Cannot make use of potential future optimizations in the existing
Window API&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;No late event handling, which is available out of the box in the
Window API&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Quadratic computation complexity and potentially large state&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s now look at the latter two drawbacks and see if we can address
them.&lt;/p&gt;
&lt;h4 id=&quot;late-events&quot;&gt;Late events:&lt;/h4&gt;
&lt;p&gt;Processing late events poses a certain question: is it still meaningful
to re-evaluate the window when a late event arrives? If it is, you would
need to extend the widest window used for the
clean-up by your maximum expected out-of-orderness. This avoids
having potentially incomplete time window data for such late firings
(see Figure 7).&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/late-events.png&quot; width=&quot;500px&quot; alt=&quot;Figure 7: Late Events Handling&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 7: Late Events Handling&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;It can be argued, however, that for a use case that puts emphasis on low
latency processing, such late triggering would be meaningless. In this
case, we could keep track of the most recent timestamp that we have
observed so far and, for events that do not advance this
value, only add them to the state and skip the aggregate calculation and
the alert triggering logic, as shown in the sketch below.&lt;/p&gt;
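&lt;p&gt;Here is a minimal sketch of this idea as a fragment of the process function. The
&lt;code&gt;latestEventTimeState&lt;/code&gt; handle and the helper methods are hypothetical names used
for illustration and are not part of the demo code:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// Hypothetical fragment of processElement(): late events are stored but not evaluated.
long eventTime = transaction.getEventTime();     // assumed accessor on the event type
Long latestSeen = latestEventTimeState.value();  // ValueState&amp;lt;Long&amp;gt;, hypothetical handle

// Always keep the event so that it is available for subsequent window evaluations.
addTransactionToWindowState(eventTime, transaction); // hypothetical helper

if (latestSeen == null || eventTime &amp;gt; latestSeen) {
    latestEventTimeState.update(eventTime);
    evaluateWindowAndMaybeAlert(eventTime);      // hypothetical helper with the existing logic
} else {
    // Late event: kept in state, but aggregation and alert triggering are skipped.
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;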
&lt;h4 id=&quot;redundant-re-computations-and-state-size&quot;&gt;Redundant Re-computations and State Size:&lt;/h4&gt;
&lt;p&gt;In the described implementation we keep individual transactions in state
and iterate over them to recalculate the aggregate on every new
event. This is clearly suboptimal, as it wastes computational
resources on repeated calculations.&lt;/p&gt;
&lt;p&gt;What is the main reason to keep the individual transactions in state?
The granularity of stored events directly corresponds to the precision
of the time window calculation. Because we store transactions
individually, we can precisely discard individual transactions as soon as
they leave the exact 2592000000 ms time window (30 days in ms). At this
point, it is worth raising the question: do we really need this
millisecond precision when estimating such a long time window, or is it
OK to accept potential false positives in exceptional cases? If the
answer for your use case is that such precision is not needed, you could
implement an additional optimization based on bucketing and
pre-aggregation. The idea of this optimization can be broken down as
follows (a rough sketch of the bucketing step is shown after Figure 8):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Instead of storing individual events, create a parent class that can
either contain the fields of a single transaction or combined values
calculated by applying an aggregate function to a set of
transactions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Instead of using timestamps in milliseconds as &lt;code&gt;MapState&lt;/code&gt; keys, round
them to the level of “resolution” that you are willing to accept
(for instance, a full minute). Each entry therefore represents a
bucket.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Whenever a window is evaluated, append the new transaction’s data to
the bucket aggregate instead of storing individual data points per
transaction.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/pre-aggregation.png&quot; width=&quot;700px&quot; alt=&quot;Figure 8: Pre-aggregation&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 8: Pre-aggregation&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
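&lt;p&gt;A minimal sketch of the bucketing step could look as follows. The one-minute
resolution, the &lt;code&gt;BucketAggregate&lt;/code&gt; holder and the &lt;code&gt;getPaymentAmount()&lt;/code&gt; accessor are
illustrative assumptions and not part of the demo code:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;// One-minute resolution for the buckets.
final long bucketSizeMs = 60_000L;

// Round the event timestamp down to its bucket key.
long bucketKey = eventTime - (eventTime % bucketSizeMs);

// Merge the new transaction into the pre-aggregated bucket instead of storing it.
// bucketState is assumed to be a MapState&amp;lt;Long, BucketAggregate&amp;gt;.
BucketAggregate bucket = bucketState.get(bucketKey);
if (bucket == null) {
    bucket = new BucketAggregate();                 // hypothetical aggregate holder
}
bucket.add(transaction.getPaymentAmount());         // e.g. keeps a running sum and count
bucketState.put(bucketKey, bucket);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;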
&lt;h4 id=&quot;state-data-and-serializers&quot;&gt;State Data and Serializers&lt;/h4&gt;
&lt;p&gt;Another question that we can ask ourselves in order to further optimize
the implementation is how probable it is to get different events with
exactly the same timestamp. In the described implementation, we
demonstrated one way of approaching this question by storing sets of
transactions per timestamp in &lt;code&gt;MapState&amp;lt;Long, Set&amp;lt;Transaction&amp;gt;&amp;gt;&lt;/code&gt;. Such
a choice, however, might have a more significant effect on performance
than anticipated. The reason is that Flink does not currently
provide a native &lt;code&gt;Set&lt;/code&gt; serializer and will enforce a fallback to the less
efficient &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/types_serialization.html#general-class-types&quot;&gt;Kryo
serializer&lt;/a&gt;
instead
(&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16729&quot;&gt;FLINK-16729&lt;/a&gt;). A
meaningful alternative strategy is to assume that, in a normal scenario,
no two distinct events can have exactly the same timestamp and to turn
the window state into a &lt;code&gt;MapState&amp;lt;Long, Transaction&amp;gt;&lt;/code&gt; type. You can use
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/side_output.html&quot;&gt;side-outputs&lt;/a&gt;
to collect and monitor any unexpected occurrences that contradict this
assumption. During performance optimizations, I generally recommend
&lt;a href=&quot;https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html#disabling-kryo&quot;&gt;disabling the fallback to
Kryo&lt;/a&gt;
and verifying where your application can be further optimized by ensuring
that &lt;a href=&quot;https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html#performance-comparison&quot;&gt;more efficient
serializers&lt;/a&gt;
are being used.&lt;/p&gt;
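&lt;p&gt;A minimal sketch of these two adjustments, assuming a &lt;code&gt;Transaction&lt;/code&gt; POJO as the event type, could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Fail fast instead of silently falling back to the Kryo serializer for generic types.
env.getConfig().disableGenericTypes();

// One transaction per timestamp: avoids the Set type and the associated Kryo fallback.
MapStateDescriptor&amp;lt;Long, Transaction&amp;gt; windowStateDescriptor =
        new MapStateDescriptor&amp;lt;&amp;gt;(&quot;windowState&quot;, Long.class, Transaction.class);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;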
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Tip:&lt;/span&gt;
You can quickly determine which serializer is going to be
used for your classes by setting a breakpoint and verifying the type of
the returned &lt;code&gt;TypeInformation&lt;/code&gt;.
&lt;br /&gt;&lt;/p&gt;
&lt;center&gt;
&lt;table class=&quot;tg&quot;&gt;
&lt;tr&gt;
&lt;td class=&quot;tg-topcenter&quot;&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/type-pojo.png&quot; alt=&quot;POJO&quot; /&gt;&lt;/td&gt;
&lt;td class=&quot;tg-topcenter&quot;&gt;
&lt;i&gt;PojoTypeInfo&lt;/i&gt; indicates that an efficient Flink POJO serializer will be used.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&quot;tg-top&quot;&gt;
&lt;img src=&quot;/img/blog/patterns-blog-3/type-kryo.png&quot; alt=&quot;Kryo&quot; /&gt;&lt;/td&gt;
&lt;td class=&quot;tg-topcenter&quot;&gt;
&lt;i&gt;GenericTypeInfo&lt;/i&gt; indicates the fallback to a Kryo serializer.&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
&lt;/center&gt;
&lt;/div&gt;
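&lt;p&gt;If you prefer not to attach a debugger, a rough equivalent of this check can also be done
programmatically. The following fragment is only a sketch; substitute your own event class
for &lt;code&gt;Transaction&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.GenericTypeInfo;

TypeInformation&amp;lt;Transaction&amp;gt; typeInfo = TypeInformation.of(Transaction.class);
// PojoTypeInfo means the efficient POJO serializer will be used;
// GenericTypeInfo means Flink falls back to Kryo.
System.out.println(typeInfo.getClass().getSimpleName());
if (typeInfo instanceof GenericTypeInfo) {
    System.out.println(&quot;Transaction will be serialized with Kryo&quot;);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;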
&lt;p&gt;&lt;strong&gt;Event pruning&lt;/strong&gt;: instead of storing complete events and putting
additional stress on the ser/de machinery, we can reduce each
event&#39;s data to only the relevant information. This would potentially require
“unpacking” individual events into fields and storing those fields in a
generic &lt;code&gt;Map&amp;lt;String, Object&amp;gt;&lt;/code&gt; data structure, based on the
configurations of active rules.&lt;/p&gt;
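&lt;p&gt;A minimal sketch of such pruning could look like this, assuming hypothetical accessors on the
&lt;code&gt;Transaction&lt;/code&gt; class and a set of field names derived from the active rules:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Keep only the fields that are actually referenced by the currently active rules.
Map&amp;lt;String, Object&amp;gt; prune(Transaction event, Set&amp;lt;String&amp;gt; fieldsUsedByActiveRules) {
    Map&amp;lt;String, Object&amp;gt; slim = new HashMap&amp;lt;&amp;gt;();
    if (fieldsUsedByActiveRules.contains(&quot;paymentAmount&quot;)) {
        slim.put(&quot;paymentAmount&quot;, event.getPaymentAmount()); // hypothetical accessor
    }
    if (fieldsUsedByActiveRules.contains(&quot;beneficiaryId&quot;)) {
        slim.put(&quot;beneficiaryId&quot;, event.getBeneficiaryId()); // hypothetical accessor
    }
    return slim; // store this map in the window state instead of the full event
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;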
&lt;p&gt;While this adjustment could potentially produce significant improvements
for objects of large size, it should not be your first pick as it can
easily turn into a premature optimization.&lt;/p&gt;
&lt;h2 id=&quot;summary&quot;&gt;Summary:&lt;/h2&gt;
&lt;p&gt;This article concludes the description of the implementation of the
fraud detection engine that we started in &lt;a href=&quot;/news/2020/01/15/demo-fraud-detection.html&quot;&gt;part one&lt;/a&gt;. In this blog
post we demonstrated how &lt;code&gt;ProcessFunction&lt;/code&gt; can be utilized to
&quot;impersonate&quot; a window with sophisticated custom logic. We have
discussed the pros and cons of such an approach and elaborated on how custom
use-case-specific optimizations can be applied - something that would
not be directly possible with the Window API.&lt;/p&gt;
&lt;p&gt;The goal of this blog post was to illustrate the power and flexibility
of Apache Flink’s APIs. At its core are the pillars of Flink, which
spare you, as a developer, significant amounts of work and
generalize well to a wide range of use cases by providing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Efficient data exchange in a distributed cluster&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Horizontal scalability via data partitioning&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Fault-tolerant state with quick, local access&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Convenient abstraction for working with this state, which is as simple as using a
local variable&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Multi-threaded, parallel execution engine. &lt;code&gt;ProcessFunction&lt;/code&gt; code runs
in a single thread, without the need for synchronization. Flink
handles all the parallel execution aspects and correct access to the
shared state, without you, as a developer, having to think about it
(concurrency is hard).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All these aspects make it possible to build applications with Flink that
go well beyond trivial streaming ETL use cases and enable the implementation
of arbitrarily sophisticated, distributed event-driven applications.
With Flink, you can rethink approaches to a wide range of use cases
that would normally rely on stateless parallel execution nodes
and “pushing” the concerns of state fault tolerance to a database, an
approach that is often destined to run into scalability issues in the
face of ever-increasing data volumes.&lt;/p&gt;
</description>
<pubDate>Thu, 30 Jul 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/07/30/demo-fraud-detection-3.html</link>
<guid isPermaLink="true">/news/2020/07/30/demo-fraud-detection-3.html</guid>
</item>
<item>
<title>Flink SQL Demo: Building an End-to-End Streaming Application</title>
<description>&lt;p&gt;Apache Flink 1.11 comes with many exciting new features, including many developments in Flink SQL, which is evolving at a fast pace. This article takes a closer look at how to quickly build streaming applications with Flink SQL from a practical point of view.&lt;/p&gt;
&lt;p&gt;In the following sections, we describe how to integrate Kafka, MySQL, Elasticsearch, and Kibana with Flink SQL to analyze e-commerce user behavior in real-time. All exercises in this blogpost are performed in the Flink SQL CLI, and the entire process uses standard SQL syntax, without a single line of Java/Scala code or IDE installation. The final result of this demo is shown in the following figure:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image1.gif&quot; width=&quot;650px&quot; alt=&quot;Demo Overview&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h1 id=&quot;preparation&quot;&gt;Preparation&lt;/h1&gt;
&lt;p&gt;Prepare a Linux or macOS computer with Docker installed.&lt;/p&gt;
&lt;h2 id=&quot;starting-the-demo-environment&quot;&gt;Starting the Demo Environment&lt;/h2&gt;
&lt;p&gt;The components required in this demo are all managed in containers, so we will use &lt;code&gt;docker-compose&lt;/code&gt; to start them. First, download the &lt;code&gt;docker-compose.yml&lt;/code&gt; file that defines the demo environment, for example by running the following commands:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;mkdir flink-sql-demo; cd flink-sql-demo;
wget https://raw.githubusercontent.com/wuchong/flink-sql-demo/v1.11-EN/docker-compose.yml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The Docker Compose environment consists of the following containers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Flink SQL CLI:&lt;/strong&gt; used to submit queries and visualize their results.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flink Cluster:&lt;/strong&gt; a Flink JobManager and a Flink TaskManager container to execute queries.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MySQL:&lt;/strong&gt; MySQL 5.7 and a pre-populated &lt;code&gt;category&lt;/code&gt; table in the database. The &lt;code&gt;category&lt;/code&gt; table will be joined with data in Kafka to enrich the real-time data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kafka:&lt;/strong&gt; mainly used as a data source. The DataGen component automatically writes data into a Kafka topic.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Zookeeper:&lt;/strong&gt; this component is required by Kafka.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Elasticsearch:&lt;/strong&gt; mainly used as a data sink.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kibana:&lt;/strong&gt; used to visualize the data in Elasticsearch.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DataGen:&lt;/strong&gt; the data generator. After the container is started, user behavior data is automatically generated and sent to the Kafka topic. By default, 2000 data entries are generated each second for about 1.5 hours. You can modify DataGen’s &lt;code&gt;speedup&lt;/code&gt; parameter in &lt;code&gt;docker-compose.yml&lt;/code&gt; to adjust the generation rate (which takes effect after Docker Compose is restarted).&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;alert alert-danger&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-danger&quot; style=&quot;display: inline-block&quot;&gt; Note &lt;/span&gt;
Before starting the containers, we recommend configuring Docker so that sufficient resources are available and the environment does not become unresponsive. We suggest running Docker at 3-4 GB memory and 3-4 CPU cores.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;To start all containers, run the following command in the directory that contains the &lt;code&gt;docker-compose.yml&lt;/code&gt; file.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command automatically starts all the containers defined in the Docker Compose configuration in a detached mode. Run &lt;code&gt;docker ps&lt;/code&gt; to check whether the 9 containers are running properly. You can also visit &lt;a href=&quot;http://localhost:5601/&quot;&gt;http://localhost:5601/&lt;/a&gt; to see if Kibana is running normally.&lt;/p&gt;
&lt;p&gt;Don’t forget to run the following command to stop all containers after you have finished the tutorial:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker-compose down
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;entering-the-flink-sql-cli-client&quot;&gt;Entering the Flink SQL CLI client&lt;/h2&gt;
&lt;p&gt;To enter the SQL CLI client run:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;docker-compose &lt;span class=&quot;nb&quot;&gt;exec &lt;/span&gt;sql-client ./sql-client.sh&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The command starts the SQL CLI client in the container.
You should see the welcome screen of the CLI client.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image3.png&quot; width=&quot;500px&quot; alt=&quot;Flink SQL CLI welcome page&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h2 id=&quot;creating-a-kafka-table-using-ddl&quot;&gt;Creating a Kafka table using DDL&lt;/h2&gt;
&lt;p&gt;The DataGen container continuously writes events into the Kafka &lt;code&gt;user_behavior&lt;/code&gt; topic. This data contains the user behavior on the day of November 27, 2017 (behaviors include “click”, “like”, “purchase” and “add to shopping cart” events). Each row represents a user behavior event, with the user ID, product ID, product category ID, event type, and timestamp in JSON format. Note that the dataset is from the &lt;a href=&quot;https://tianchi.aliyun.com/dataset/dataDetail?dataId=649&quot;&gt;Alibaba Cloud Tianchi public dataset&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the directory that contains &lt;code&gt;docker-compose.yml&lt;/code&gt;, run the following command to view the first 10 data entries generated in the Kafka topic:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;docker-compose exec kafka bash -c &#39;kafka-console-consumer.sh --topic user_behavior --bootstrap-server kafka:9094 --from-beginning --max-messages 10&#39;
{&quot;user_id&quot;: &quot;952483&quot;, &quot;item_id&quot;:&quot;310884&quot;, &quot;category_id&quot;: &quot;4580532&quot;, &quot;behavior&quot;: &quot;pv&quot;, &quot;ts&quot;: &quot;2017-11-27T00:00:00Z&quot;}
{&quot;user_id&quot;: &quot;794777&quot;, &quot;item_id&quot;:&quot;5119439&quot;, &quot;category_id&quot;: &quot;982926&quot;, &quot;behavior&quot;: &quot;pv&quot;, &quot;ts&quot;: &quot;2017-11-27T00:00:00Z&quot;}
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In order to make the events in the Kafka topic accessible to Flink SQL, we run the following DDL statement in SQL CLI to create a table that connects to the topic in the Kafka cluster:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_behavior&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;item_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;category_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;behavior&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;proctime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PROCTIME&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- generates processing-time attribute using computed column&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;5&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SECOND&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- defines watermark on ts column, marks ts as event-time attribute&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- using kafka connector&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;topic&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;user_behavior&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- kafka topic&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;scan.startup.mode&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;earliest-offset&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- reading from the beginning&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;properties.bootstrap.servers&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka:9094&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- kafka broker address&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;json&amp;#39;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- the data format is json&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The above snippet declares five fields based on the data format. In addition, it uses the computed column syntax and the built-in &lt;code&gt;PROCTIME()&lt;/code&gt; function to declare a virtual column that generates the processing-time attribute. It also uses the &lt;code&gt;WATERMARK&lt;/code&gt; syntax to declare the watermark strategy on the &lt;code&gt;ts&lt;/code&gt; field (tolerating events that are up to 5 seconds out-of-order). Therefore, the &lt;code&gt;ts&lt;/code&gt; field becomes an event-time attribute. For more information about time attributes and DDL syntax, see the following official documentation pages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/streaming/time_attributes.html&quot;&gt;Time attributes in Flink’s Table API &amp;amp; SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sql/create.html#create-table&quot;&gt;DDL Syntax in Flink SQL&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After creating the &lt;code&gt;user_behavior&lt;/code&gt; table in the SQL CLI, run &lt;code&gt;SHOW TABLES;&lt;/code&gt; and &lt;code&gt;DESCRIBE user_behavior;&lt;/code&gt; to see registered tables and table details. Also, run the command &lt;code&gt;SELECT * FROM user_behavior;&lt;/code&gt; directly in the SQL CLI to preview the data (press &lt;code&gt;q&lt;/code&gt; to exit).&lt;/p&gt;
&lt;p&gt;Next, we discover more about Flink SQL through three real-world scenarios.&lt;/p&gt;
&lt;h1 id=&quot;hourly-trading-volume&quot;&gt;Hourly Trading Volume&lt;/h1&gt;
&lt;h2 id=&quot;creating-an-elasticsearch-table-using-ddl&quot;&gt;Creating an Elasticsearch table using DDL&lt;/h2&gt;
&lt;p&gt;Let’s create an Elasticsearch result table in the SQL CLI. We need two columns in this case: &lt;code&gt;hour_of_day&lt;/code&gt; and &lt;code&gt;buy_cnt&lt;/code&gt; (trading volume).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buy_cnt_per_hour&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;hour_of_day&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;buy_cnt&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;elasticsearch-7&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- using elasticsearch connector&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;hosts&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;http://elasticsearch:9200&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- elasticsearch address&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;index&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;buy_cnt_per_hour&amp;#39;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- elasticsearch index name, similar to database table name&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There is no need to create the &lt;code&gt;buy_cnt_per_hour&lt;/code&gt; index in Elasticsearch in advance since Elasticsearch will automatically create the index if it does not exist.&lt;/p&gt;
&lt;h2 id=&quot;submitting-a-query&quot;&gt;Submitting a Query&lt;/h2&gt;
&lt;p&gt;The hourly trading volume is the number of “buy” behaviors completed each hour. Therefore, we can use a &lt;code&gt;TUMBLE&lt;/code&gt; window function to assign data into hourly windows. Then, we count the number of “buy” records in each window. To implement this, we can filter out the “buy” data first and then apply &lt;code&gt;COUNT(*)&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buy_cnt_per_hour&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HOUR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TUMBLE_START&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HOUR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_behavior&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;behavior&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;buy&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TUMBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HOUR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here, we use the built-in &lt;code&gt;HOUR&lt;/code&gt; function to extract the value for each hour in the day from a &lt;code&gt;TIMESTAMP&lt;/code&gt; column. Use &lt;code&gt;INSERT INTO&lt;/code&gt; to start a Flink SQL job that continuously writes results into the Elasticsearch &lt;code&gt;buy_cnt_per_hour&lt;/code&gt; index. The Elasticsearch result table can be seen as a materialized view of the query. You can find more information about Flink’s window aggregation in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sql/queries.html#group-windows&quot;&gt;Apache Flink documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;After running the previous query in the Flink SQL CLI, we can observe the submitted task on the &lt;a href=&quot;http://localhost:8081&quot;&gt;Flink Web UI&lt;/a&gt;. This task is a streaming task and therefore runs continuously.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image4.jpg&quot; width=&quot;800px&quot; alt=&quot;Flink Dashboard&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h2 id=&quot;using-kibana-to-visualize-results&quot;&gt;Using Kibana to Visualize Results&lt;/h2&gt;
&lt;p&gt;Access Kibana at &lt;a href=&quot;http://localhost:5601&quot;&gt;http://localhost:5601&lt;/a&gt;. First, configure an index pattern by clicking “Management” in the left-side toolbar and find “Index Patterns”. Next, click “Create Index Pattern” and enter the full index name &lt;code&gt;buy_cnt_per_hour&lt;/code&gt; to create the index pattern. After creating the index pattern, we can explore data in Kibana.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note &lt;/span&gt;
Since we are using a TUMBLE window of one hour here, it might take about four minutes from the time the containers start until the first row is emitted. Until then, the index does not exist and Kibana is unable to find it.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Click “Discover” in the left-side toolbar. Kibana lists the content of the created index.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image5.jpg&quot; width=&quot;800px&quot; alt=&quot;Kibana Discover&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Next, create a dashboard to display various views. Click “Dashboard” on the left side of the page to create a dashboard named “User Behavior Analysis”. Then, click “Create New” to create a new view. Select “Area” (area graph), then select the &lt;code&gt;buy_cnt_per_hour&lt;/code&gt; index, and draw the trading volume area chart as illustrated in the configuration on the left side of the following diagram. Apply the changes by clicking the “▶” play button. Then, save it as “Hourly Trading Volume”.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image6.jpg&quot; width=&quot;800px&quot; alt=&quot;Hourly Trading Volume&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;You can see that during the early morning hours the number of transactions has its lowest value of the entire day.&lt;/p&gt;
&lt;p&gt;As real-time data is added into the indices, you can enable auto-refresh in Kibana to see real-time visualization changes and updates. You can do so by clicking the time picker and entering a refresh interval (e.g. 3 seconds) in the “Refresh every” field.&lt;/p&gt;
&lt;h1 id=&quot;cumulative-number-of-unique-visitors-every-10-min&quot;&gt;Cumulative number of Unique Visitors every 10-min&lt;/h1&gt;
&lt;p&gt;Another interesting visualization is the cumulative number of unique visitors (UV). For example, the number of UV at 10:00 represents the total number of UV from 00:00 to 10:00. Therefore, the curve is monotonically increasing.&lt;/p&gt;
&lt;p&gt;Let’s create another Elasticsearch table in the SQL CLI to store the UV results. This table contains 3 columns: date, time and cumulative UVs.
The &lt;code&gt;date_str&lt;/code&gt; and &lt;code&gt;time_str&lt;/code&gt; columns are defined as the primary key; the Elasticsearch sink will use them to calculate the document ID and work in upsert mode to update the UV values under that document ID.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cumulative_uv&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;date_str&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;time_str&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;uv&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;PRIMARY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;KEY&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;date_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ENFORCED&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;elasticsearch-7&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;hosts&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;http://elasticsearch:9200&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;index&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;cumulative_uv&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can extract the date and time using the &lt;code&gt;DATE_FORMAT&lt;/code&gt; function based on the &lt;code&gt;ts&lt;/code&gt; field. As the section title describes, we only need to report every 10 minutes. So, we can use &lt;code&gt;SUBSTR&lt;/code&gt; and the string concat function &lt;code&gt;||&lt;/code&gt; to convert the time value into a 10-minute interval time string, such as &lt;code&gt;12:00&lt;/code&gt; or &lt;code&gt;12:10&lt;/code&gt;.
Next, we group data by &lt;code&gt;date_str&lt;/code&gt; and perform a &lt;code&gt;COUNT DISTINCT&lt;/code&gt; aggregation on &lt;code&gt;user_id&lt;/code&gt; to get the current cumulative UV for this day. Additionally, we perform a &lt;code&gt;MAX&lt;/code&gt; aggregation on the &lt;code&gt;time_str&lt;/code&gt; field to get the current stream time: the maximum event time observed so far.
As the maximum time is also a part of the primary key of the sink, the final result is that we insert a new point into Elasticsearch every 10 minutes, and the latest point is updated continuously until the next 10-minute point is generated.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cumulative_uv&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;MAX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;DISTINCT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;uv&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DATE_FORMAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;yyyy-MM-dd&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SUBSTR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DATE_FORMAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;HH:mm&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;0&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_behavior&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;date_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After submitting this query, we create a &lt;code&gt;cumulative_uv&lt;/code&gt; index pattern in Kibana. We then create a “Line” (line graph) on the dashboard, by selecting the &lt;code&gt;cumulative_uv&lt;/code&gt; index, and drawing the cumulative UV curve according to the configuration on the left side of the following figure before finally saving the curve.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image7.jpg&quot; width=&quot;800px&quot; alt=&quot;Cumulative Unique Visitors every 10-min&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h1 id=&quot;top-categories&quot;&gt;Top Categories&lt;/h1&gt;
&lt;p&gt;The last visualization represents the category rankings to inform us about the most popular categories on our e-commerce site. Since our data source contains events for more than 5,000 categories, which is too fine-grained to add value to our analytics, we would like to reduce them so that only the top-level categories are included. We will use the data in our MySQL database by joining it as a dimension table with our Kafka events to map sub-categories to top-level categories.&lt;/p&gt;
&lt;p&gt;Create a table in the SQL CLI to make the data in MySQL accessible to Flink SQL.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;category_dim&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sub_category_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;parent_category_name&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;jdbc&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;url&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;jdbc:mysql://mysql:3306/flink&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;table-name&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;category&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;username&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;root&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;password&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;123456&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;lookup.cache.max-rows&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;5000&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;lookup.cache.ttl&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;10min&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The underlying JDBC connector implements the &lt;code&gt;LookupTableSource&lt;/code&gt; interface, so the created JDBC table &lt;code&gt;category_dim&lt;/code&gt; can be used as a temporal table (i.e. lookup table) out-of-the-box in the data enrichment.&lt;/p&gt;
&lt;p&gt;In addition, create an Elasticsearch table to store the category statistics.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;top_category&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;category_name&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;PRIMARY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;KEY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ENFORCED&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;buy_cnt&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;elasticsearch-7&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;hosts&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;http://elasticsearch:9200&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;index&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;top_category&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In order to enrich the category names, we use Flink SQL’s temporal table joins to join a dimension table. You can access more information about &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/streaming/joins.html#join-with-a-temporal-table&quot;&gt;temporal joins&lt;/a&gt; in the Flink documentation.&lt;/p&gt;
&lt;p&gt;Additionally, we use the &lt;code&gt;CREATE VIEW&lt;/code&gt; syntax to register the query as a logical view, allowing us to easily reference this query in subsequent queries and simplify nested queries. Please note that creating a logical view does not trigger the execution of the job and the view results are not persisted. Therefore, this statement is lightweight and does not have additional overhead.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;VIEW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rich_user_behavior&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;behavior&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent_category_name&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;category_name&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_behavior&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;LEFT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;category_dim&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SYSTEM_TIME&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OF&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proctime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;C&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;category_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub_category_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Finally, we group the enriched view by category name to count the number of &lt;code&gt;buy&lt;/code&gt; events and write the result to Elasticsearch’s &lt;code&gt;top_category&lt;/code&gt; index.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;top_category&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;category_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buy_cnt&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rich_user_behavior&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;behavior&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;buy&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;category_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After submitting the query, we create a &lt;code&gt;top_category&lt;/code&gt; index pattern in Kibana. We then create a “Horizontal Bar” (bar graph) on the dashboard, by selecting the &lt;code&gt;top_category&lt;/code&gt; index and drawing the category ranking according to the configuration on the left side of the following diagram before finally saving the list.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-28-flink-sql-demo/image8.jpg&quot; width=&quot;800px&quot; alt=&quot;Top Categories&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;As illustrated in the diagram, the clothing and shoes categories exceed all other categories on the e-commerce website by far.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;We have now implemented three practical applications and created charts for them. We can now return to the dashboard page and drag-and-drop each view to give our dashboard a more formal and intuitive style, as illustrated in the beginning of the blogpost. Of course, Kibana also provides a rich set of graphics and visualization features, and the user_behavior logs contain a lot more interesting information to explore. Using Flink SQL, you can analyze data in more dimensions, while using Kibana allows you to display more views and observe real-time changes in its charts!&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In the previous sections, we described how to use Flink SQL to integrate Kafka, MySQL, Elasticsearch, and Kibana to quickly build a real-time analytics application. The entire process can be completed using standard SQL syntax, without a line of Java or Scala code. We hope that this article provides some clear and practical examples of the convenience and power of Flink SQL, featuring an easy connection to various external systems, native support for event time and out-of-order handling, dimension table joins and a wide range of built-in functions. We hope you have fun following the examples in this blogpost!&lt;/p&gt;
</description>
<pubDate>Tue, 28 Jul 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html</link>
<guid isPermaLink="true">/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html</guid>
</item>
<item>
<title>Flink Community Update - July&#39;20</title>
<description>&lt;p&gt;As July draws to an end, we look back at a monthful of activity in the Flink community, including two releases (!) and some work around improving the first-time contribution experience in the project.&lt;/p&gt;
&lt;p&gt;Also, events are starting to pick up again, so we’ve put together a list of some great ones you can (virtually) attend in August!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-past-month-in-flink&quot; id=&quot;markdown-toc-the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-releases&quot; id=&quot;markdown-toc-flink-releases&quot;&gt;Flink Releases&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-111&quot; id=&quot;markdown-toc-flink-111&quot;&gt;Flink 1.11&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-1111&quot; id=&quot;markdown-toc-flink-1111&quot;&gt;Flink 1.11.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#gearing-up-for-flink-112&quot; id=&quot;markdown-toc-gearing-up-for-flink-112&quot;&gt;Gearing up for Flink 1.12&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers-and-pmc-members&quot; id=&quot;markdown-toc-new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#new-pmc-members&quot; id=&quot;markdown-toc-new-pmc-members&quot;&gt;New PMC Members&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-bigger-picture&quot; id=&quot;markdown-toc-the-bigger-picture&quot;&gt;The Bigger Picture&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#a-look-into-the-evolution-of-flink-releases&quot; id=&quot;markdown-toc-a-look-into-the-evolution-of-flink-releases&quot;&gt;A Look Into the Evolution of Flink Releases&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#first-time-contributor-guide&quot; id=&quot;markdown-toc-first-time-contributor-guide&quot;&gt;First-time Contributor Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#replacing-charged-words-in-the-flink-repo&quot; id=&quot;markdown-toc-replacing-charged-words-in-the-flink-repo&quot;&gt;Replacing “charged” words in the Flink repo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upcoming-events-and-more&quot; id=&quot;markdown-toc-upcoming-events-and-more&quot;&gt;Upcoming Events (and More!)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/h1&gt;
&lt;h2 id=&quot;flink-releases&quot;&gt;Flink Releases&lt;/h2&gt;
&lt;h3 id=&quot;flink-111&quot;&gt;Flink 1.11&lt;/h3&gt;
&lt;p&gt;A couple of weeks ago, Flink 1.11 was announced in what was (again) the biggest Flink release to date (&lt;em&gt;see &lt;a href=&quot;#a-look-into-the-evolution-of-flink-releases&quot;&gt;“A Look Into the Evolution of Flink Releases”&lt;/a&gt;&lt;/em&gt;)! The new release brought significant improvements to usability as well as new features to Flink users across the API stack. Some highlights of Flink 1.11 are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Unaligned checkpoints to cope with high backpressure scenarios;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The new source API, that simplifies and unifies the implementation of (custom) sources;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Support for Change Data Capture (CDC) and other common use cases in the Table API/SQL;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pandas UDFs and other performance optimizations in PyFlink, making it more powerful for data science and ML workloads.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a more detailed look into the release, you can recap the &lt;a href=&quot;https://flink.apache.org/news/2020/07/06/release-1.11.0.html&quot;&gt;announcement blogpost&lt;/a&gt; and join the upcoming meetup on &lt;a href=&quot;https://www.meetup.com/seattle-flink/events/271922632/&quot;&gt;“What’s new in Flink 1.11?”&lt;/a&gt;, where you’ll be able to ask anything release-related to Aljoscha Krettek (Flink PMC Member). The community has also been working on a series of blogposts that deep-dive into the most significant features and improvements in 1.11, so keep an eye on the &lt;a href=&quot;https://flink.apache.org/blog/&quot;&gt;Flink blog&lt;/a&gt;!&lt;/p&gt;
&lt;h3 id=&quot;flink-1111&quot;&gt;Flink 1.11.1&lt;/h3&gt;
&lt;p&gt;Shortly after releasing Flink 1.11, the community announced the first patch version to cover some outstanding issues in the major release. This version is &lt;strong&gt;particularly important for users of the Table API/SQL&lt;/strong&gt;, as it addresses known limitations that affect the usability of new features like changelog sources and support for JDBC catalogs.&lt;/p&gt;
&lt;p&gt;You can find a detailed list with all the improvements and bugfixes that went into Flink 1.11.1 in the &lt;a href=&quot;https://flink.apache.org/news/2020/07/21/release-1.11.1.html&quot;&gt;announcement blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;gearing-up-for-flink-112&quot;&gt;Gearing up for Flink 1.12&lt;/h2&gt;
&lt;p&gt;The Flink 1.12 release cycle was kicked off last week, and a discussion about which features will go into the upcoming release is underway in &lt;a href=&quot;https://lists.apache.org/thread.html/rb01160c7c9c26304a7665f9a252d4ed1583173620df307015c095fcf%40%3Cdev.flink.apache.org%3E&quot;&gt;this @dev Mailing List thread&lt;/a&gt;. While we wait for more of these ideas to turn into proposals and JIRA issues, here are some recent FLIPs that are already being discussed in the Flink community:&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FLIP&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298&quot;&gt;FLIP-130&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Support Python DataStream API&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;Python support in Flink has so far been limited to the Table API/SQL. These APIs are high-level and convenient, but have some limitations for more complex stream processing use cases. To expand the usability of PyFlink to a broader set of use cases, FLIP-130 proposes supporting it in the DataStream API as well, starting with stateless operations.&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL&quot;&gt;FLIP-132&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Temporal Table DDL&lt;/b&gt;&lt;/li&gt;
&lt;p&gt;Flink SQL users can&#39;t currently create temporal tables using SQL DDL, which forces them to change context frequently for use cases that require them. FLIP-132 proposes to extend the DDL syntax to support temporal tables, which in turn will also make it possible to bring &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/table/streaming/joins.html#join-with-a-temporal-table&quot;&gt;temporal joins&lt;/a&gt; with changelog sources to Flink SQL.&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/h2&gt;
&lt;p&gt;The Apache Flink community has welcomed &lt;strong&gt;2 new PMC Members&lt;/strong&gt; since the last update. Congratulations!&lt;/p&gt;
&lt;h3 id=&quot;new-pmc-members&quot;&gt;New PMC Members&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars0.githubusercontent.com/u/8957547?s=400&amp;amp;u=4560f775da9ebc5f3aa2e1563f57cdad03862ce8&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/PiotrNowojski&quot;&gt;Piotr Nowojski&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars0.githubusercontent.com/u/6239804?s=460&amp;amp;u=6cd81b1ab38fcc6a5736fcfa957c51093bf060e2&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/LiyuApache&quot;&gt;Yu Li&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;the-bigger-picture&quot;&gt;The Bigger Picture&lt;/h1&gt;
&lt;h2 id=&quot;a-look-into-the-evolution-of-flink-releases&quot;&gt;A Look Into the Evolution of Flink Releases&lt;/h2&gt;
&lt;p&gt;It’s &lt;a href=&quot;https://flink.apache.org/news/2020/04/01/community-update.html#a-look-into-the-flink-repository&quot;&gt;been a while&lt;/a&gt; since we had a look at community numbers, so this time we’d like to shed some light on the evolution of contributors and, well, work across releases. Let’s have a look at some &lt;em&gt;git&lt;/em&gt; data:&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-29-community-update/2020-07-29_releases.png&quot; width=&quot;600px&quot; alt=&quot;Flink Releases&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;If we consider Flink 1.8 (Apr. 2019) as the baseline, the Flink community more than &lt;strong&gt;tripled&lt;/strong&gt; the number of implemented and/or resolved issues in a single release with the support of an &lt;strong&gt;additional ~100 contributors&lt;/strong&gt; in Flink 1.11. This is pretty impressive on its own, and even more so if you consider that Flink contributors are distributed around the globe, working across different locations and timezones!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;first-time-contributor-guide&quot;&gt;First-time Contributor Guide&lt;/h2&gt;
&lt;p&gt;Flink has an extensive guide for &lt;a href=&quot;https://flink.apache.org/contributing/how-to-contribute.html&quot;&gt;code and non-code contributions&lt;/a&gt; that helps new community members navigate the project and get familiar with existing contribution guidelines. In particular for code contributions, knowing where to start can be difficult, given the sheer size of the Flink codebase and the pace of development of the project.&lt;/p&gt;
&lt;p&gt;To better guide new contributors, a brief section was added to the guide on &lt;a href=&quot;https://flink.apache.org/contributing/contribute-code.html#looking-for-what-to-contribute&quot;&gt;how to look for what to contribute&lt;/a&gt; and the &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18704?filter=12349196&quot;&gt;&lt;em&gt;starter&lt;/em&gt; label&lt;/a&gt; has been revived in Jira to highlight issues that are suitable for first-time contributors.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note &lt;/span&gt;
As a reminder, you no longer need to ask for contributor permissions to start contributing to Flink. Once you’ve found something you’d like to work on, read the &lt;a href=&quot;https://flink.apache.org/contributing/contribute-code.html&quot;&gt;contribution guide&lt;/a&gt; carefully and reach out to a Flink Committer, who will be able to help you get started.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;replacing-charged-words-in-the-flink-repo&quot;&gt;Replacing “charged” words in the Flink repo&lt;/h2&gt;
&lt;p&gt;The community is working on gradually replacing words that are outdated and carry a negative connotation in the Flink codebase, such as “master/slave” and “whitelist/blacklist”. The progress of this work can be tracked in &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18209&quot;&gt;FLINK-18209&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;upcoming-events-and-more&quot;&gt;Upcoming Events (and More!)&lt;/h1&gt;
&lt;p&gt;We’re happy to see the “high season” of virtual events approaching, with a lot of great conferences taking place in the coming month, as well as some meetups. Here, we highlight some of the Flink talks happening in those events, but we recommend checking out the complete event programs!&lt;/p&gt;
&lt;p&gt;As usual, we also leave you with some resources to read and explore.&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-console&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Events&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;b&gt;Virtual Flink Meetup (Jul. 29)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://www.meetup.com/seattle-flink/events/271922632/&quot;&gt;What’s new in Flink 1.11? + Q&amp;amp;A with Aljoscha Krettek&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;DC Thursday (Jul. 30)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://www.eventbrite.com/e/dc-thurs-apache-flink-w-stephan-ewen-tickets-112137488246?utm_campaign=Events%20%26%20Talks&amp;amp;utm_content=135006406&amp;amp;utm_medium=social&amp;amp;utm_source=twitter&amp;amp;hss_channel=tw-2581958070&quot;&gt;Interview and Community Q&amp;amp;A with Stephan Ewen&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;KubeCon + CloudNativeCon Europe (Aug. 17-20)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://kccnceu20.sched.com/event/ZelA/stateful-serverless-and-the-elephant-in-the-room-stephan-ewen-ververica&quot;&gt;Stateful Serverless and the Elephant in the Room&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;DataEngBytes (Aug. 20-21)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://dataengconf.com.au/&quot;&gt;Change Data Capture with Flink SQL and Debezium&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://dataengconf.com.au/&quot;&gt;Sweet Streams are Made of These: Data Driven Development with Stream Processing&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;b&gt;Beam Summit (Aug. 24-29)&lt;/b&gt;
&lt;p&gt;&lt;a href=&quot;https://2020.beamsummit.org/sessions/streaming-fast-slow/&quot;&gt;Streaming, Fast and Slow&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://2020.beamsummit.org/sessions/building-stateful-streaming-pipelines/&quot;&gt;Building Stateful Streaming Pipelines With Beam&lt;/a&gt;&lt;/p&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon-fire&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Blogposts&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;b&gt;Flink 1.11 Series&lt;/b&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/07/14/application-mode.html&quot;&gt;Application Deployment in Flink: Current State and the new Application Mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/2020/07/23/catalogs.html&quot;&gt;Sharing is caring - Catalogs in Flink SQL (Tutorial)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html&quot;&gt;Flink SQL Demo: Building an End-to-End Streaming Application (Tutorial)&lt;/a&gt;&lt;/li&gt;
&lt;p&gt;&lt;/p&gt;
&lt;b&gt;Other&lt;/b&gt;
&lt;li&gt;&lt;a href=&quot;https://blogs.oracle.com/javamagazine/streaming-analytics-with-java-and-apache-flink?source=:em:nw:mt::RC_WWMK200429P00043:NSL400072808&amp;amp;elq_mid=167902&amp;amp;sh=162609181316181313222609291604350235&amp;amp;cmid=WWMK200429P00043C0004&quot;&gt;Streaming analytics with Java and Apache Flink (Tutorial)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.ververica.com/blog/flink-for-online-machine-learning-and-real-time-processing-at-weibo&quot;&gt;Flink for online Machine Learning and real-time processing at Weibo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.ververica.com/blog/data-driven-matchmaking-at-azar-with-apache-flink&quot;&gt;Data-driven Matchmaking at Azar with Apache Flink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-certificate&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Flink Packages&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;&lt;p&gt;&lt;a href=&quot;https://flink-packages.org/&quot;&gt;Flink Packages&lt;/a&gt; is a website where you can explore (and contribute to) the Flink &lt;br /&gt; ecosystem of connectors, extensions, APIs, tools and integrations. &lt;b&gt;New in:&lt;/b&gt; &lt;/p&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/flink-metrics-signalfx&quot;&gt; SignalFx Metrics Reporter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/yauaa&quot;&gt;Yauaa: Yet Another UserAgent Analyzer&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;p&gt;If you’d like to keep a closer eye on what’s happening in the community, subscribe to the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt; to get fine-grained weekly updates, upcoming event announcements and more.&lt;/p&gt;
</description>
<pubDate>Mon, 27 Jul 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/07/27/community-update.html</link>
<guid isPermaLink="true">/news/2020/07/27/community-update.html</guid>
</item>
<item>
<title>Sharing is caring - Catalogs in Flink SQL</title>
<description>&lt;p&gt;With an ever-growing number of people working with data, it’s a common practice for companies to build self-service platforms with the goal of democratizing their access across different teams and — especially — to enable users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data formats and similar cumbersome tasks.&lt;/p&gt;
&lt;p&gt;In this blog post, we want to give you a high level overview of catalogs in Flink. We’ll describe why you should consider using them and what you can achieve with one in place. To round it up, we’ll also showcase how simple it is to combine catalogs and Flink, in the form of an end-to-end example that you can try out yourself.&lt;/p&gt;
&lt;h2 id=&quot;why-do-i-need-a-catalog&quot;&gt;Why do I need a catalog?&lt;/h2&gt;
&lt;p&gt;Frequently, companies start building a data platform with a metastore, catalog, or schema registry of some sort already in place. Those let you clearly separate making the data available from consuming it. That separation has a few benefits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Improved productivity&lt;/strong&gt; - The most obvious one. Making data reusable and shifting the focus to building new models/pipelines rather than data cleansing and discovery.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt; - You can control the access to certain features of the data. For example, you can make the schema of the dataset publicly available, but limit the actual access to the underlying data only to particular teams.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance&lt;/strong&gt; - If you have all the metadata in a central entity, it’s much easier to ensure compliance with GDPR and similar regulations and legal requirements.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;what-is-stored-in-a-catalog&quot;&gt;What is stored in a catalog?&lt;/h2&gt;
&lt;p&gt;Almost all data sets can be described by certain properties that must be known in order to consume them. Those include the following (the sketch after this list shows how the first three typically surface in a table definition):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Schema&lt;/strong&gt; - It describes the actual contents of the data: what columns it has, what constraints (e.g. keys) updates should be performed on, which fields can act as time attributes, what the rules for watermark generation are, and so on.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Location&lt;/strong&gt; - Does the data come from Kafka or a file in a filesystem? How do you connect to the external system? Which topic or file name do you use?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Format&lt;/strong&gt; - Is the data serialized as JSON, CSV, or maybe Avro records?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Statistics&lt;/strong&gt; - You can also store additional information that can be useful when creating an execution plan of your query. For example, you can choose the best join algorithm, based on the number of rows in joined datasets.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
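&lt;p&gt;To make these properties more tangible, here is a minimal, hypothetical DDL sketch (the table name, topic, and broker address are made up for illustration): the column list and watermark declaration capture the &lt;em&gt;schema&lt;/em&gt;, the connector options describe the &lt;em&gt;location&lt;/em&gt;, and the format option declares the serialization &lt;em&gt;format&lt;/em&gt;. Statistics, by contrast, are usually maintained by the catalog itself rather than declared in the DDL.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- hypothetical example table; names and connection details are illustrative only
CREATE TABLE orders (
  -- schema: columns, time attribute and watermark strategy
  order_id   BIGINT,
  price      DOUBLE,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL &amp;#39;5&amp;#39; SECOND
) WITH (
  -- location: which external system and topic the data lives in
  &amp;#39;connector&amp;#39; = &amp;#39;kafka&amp;#39;,
  &amp;#39;topic&amp;#39; = &amp;#39;orders&amp;#39;,
  &amp;#39;properties.bootstrap.servers&amp;#39; = &amp;#39;kafka:9092&amp;#39;,
  -- format: how the records are serialized
  &amp;#39;format&amp;#39; = &amp;#39;json&amp;#39;
);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;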
&lt;p&gt;Catalogs don’t have to be limited to the metadata of datasets. You can usually store other objects that can be reused in different scenarios, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Functions&lt;/strong&gt; - It’s very common to have domain specific functions that can be helpful in different use cases. Instead of having to create them in each place separately, you can just create them once and share them with others.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Queries&lt;/strong&gt; - Those can be useful when you don’t want to persist a data set, but want to provide a recipe for creating it from other sources instead (a short sketch of both follows this list).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
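&lt;p&gt;As a rough sketch of what such shared objects can look like in Flink SQL (the function class and the aggregation below are purely illustrative), you can register them once and then reference them from any job that uses the same catalog:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- register a scalar function backed by a (hypothetical) implementation class
CREATE FUNCTION normalize_currency AS &amp;#39;com.example.udf.NormalizeCurrency&amp;#39; LANGUAGE JAVA;

-- store a query as a view, i.e. a reusable recipe rather than a persisted data set
CREATE VIEW order_counts_by_priority AS
SELECT o_orderpriority, COUNT(o_orderkey) AS number_of_orders
FROM dev_orders
GROUP BY o_orderpriority;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;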
&lt;h2 id=&quot;catalogs-support-in-flink-sql&quot;&gt;Catalogs support in Flink SQL&lt;/h2&gt;
&lt;p&gt;Starting from version 1.9, Flink has a set of Catalog APIs that allow integrating Flink with various catalog implementations. With the help of those APIs, you can query tables in Flink that were created in your external catalogs (e.g. Hive Metastore). Additionally, depending on the catalog implementation, you can create new objects such as tables or views from Flink, reuse them across different jobs, and possibly even use them in other tools compatible with that catalog. In other words, you can see catalogs as having a two-fold purpose:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Provide an out-of-the-box integration with ecosystems such as RDBMSs or Hive that allows you to query external objects like tables, views, or functions with no additional connector configuration. The connector properties are automatically derived from the catalog itself.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Act as a persistent store for Flink-specific metadata. In this mode, we additionally store connector properties alongside the logical metadata (e.g. schema, object name). That approach enables you to, for example, store in Hive a full definition of a Kafka-backed table with Avro-serialized records that can later be used by Flink. However, as it incorporates Flink-specific properties, it cannot be used by other tools that leverage Hive Metastore.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As of Flink 1.11, there are two catalog implementations supported by the community:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;A comprehensive Hive catalog&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A Postgres catalog (preview, read-only, for now)&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Flink does not store data at rest; it is a compute engine and requires other systems to consume input from and write its output to. This means that Flink does not own the lifecycle of the data. Integration with Catalogs does not change that. Flink uses catalogs for metadata management only.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;All you need to do to start querying your tables defined in either of these metastores is to create the corresponding catalogs with connection parameters. Once this is done, you can use them the way you would in any relational database management system.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;c1&quot;&gt;-- create a catalog which gives access to the backing Postgres installation&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;postgres&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;type&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;jdbc&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;property-version&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;base-url&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;jdbc:postgresql://postgres:5432/&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;default-database&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;postgres&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;username&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;postgres&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;password&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;example&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- create a catalog which gives access to the backing Hive installation&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;type&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;hive&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;property-version&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;hive-version&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;2.3.6&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;hive-conf-dir&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;/opt/hive-conf&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After creating the catalogs, you can confirm that they are available to Flink and also list the databases or tables in each of these catalogs:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;catalogs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;default_catalog&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;postgres&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- switch the default catalog to Hive&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catalog&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;databases&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- hive&amp;#39;s default database&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tables&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;dev_orders&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catalog&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;postgres&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tables&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;prod_customer&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;prod_nation&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;prod_rates&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;prod_region&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;region_stats&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- describe the schema of a table in Postgres, the Postgres types are automatically mapped to&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- Flink&amp;#39;s type system&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;describe&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_customer&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_custkey: INT NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_name: VARCHAR(25) NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_address: VARCHAR(40) NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_nationkey: INT NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_phone: CHAR(15) NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_acctbal: DOUBLE NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_mktsegment: CHAR(10) NOT NULL&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;-- c_comment: VARCHAR(117) NOT NULL&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now that you know which tables are available, you can write your first query.
In this scenario, we keep customer orders in Hive (&lt;code&gt;dev_orders&lt;/code&gt;) because of their volume, and reference customer data in Postgres (&lt;code&gt;prod_customer&lt;/code&gt;) to be able to easily update it. Let’s write a query that shows customers and their orders by region and order priority for a specific day.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;postgres&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;r_name&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;region&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;o_orderpriority&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;priority&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;DISTINCT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c_custkey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number_of_customers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;o_orderkey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number_of_orders&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dev_orders&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- we need to fully qualify the table in hive because we set the&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- current catalog to Postgres&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_customer&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o_custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c_custkey&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_nation&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c_nationkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n_nationkey&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_region&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n_regionkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r_regionkey&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FLOOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;o_ordertime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TO&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DAY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;2020-04-01 0:00:00.000&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o_orderpriority&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;4-NOT SPECIFIED&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o_orderpriority&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o_orderpriority&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Flink’s catalog support also covers storing Flink-specific objects in external catalogs that might not be fully usable by the corresponding external tools. The most notable use case for this is storing a table that describes a Kafka topic in a Hive catalog. Take the following DDL statement, which contains a watermark declaration as well as a set of connector properties that are not recognizable by Hive. You won’t be able to query the table with Hive, but it will be persisted and can be reused by different Flink jobs.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_lineitem&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_orderkey&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTEGER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_partkey&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTEGER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_suppkey&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTEGER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_linenumber&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTEGER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_quantity&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOUBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_extendedprice&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOUBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_discount&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOUBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_tax&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOUBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_currency&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_returnflag&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_linestatus&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_ordertime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_shipinstruct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_shipmode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_comment&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_proctime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PROCTIME&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_ordertime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_ordertime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;5&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SECONDS&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;topic&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;lineitem&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;scan.startup.mode&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;earliest-offset&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;properties.bootstrap.servers&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka:9092&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;properties.group.id&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;testGroup&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;csv&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;csv.field-delimiter&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;|&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With &lt;code&gt;prod_lineitem&lt;/code&gt; stored in Hive, you can now write a query that will enrich the incoming stream with static data kept in Postgres. To illustrate how this works, let’s calculate the item prices based on the current currency rates:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;CATALOG&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;postgres&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_proctime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;querytime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_orderkey&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_linenumber&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linenumber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_currency&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;rs_rate&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cur_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;l_extendedprice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_discount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_tax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rs_rate&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;open_in_euro&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prod_lineitem&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prod_rates&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SYSTEM_TIME&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OF&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_proctime&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rs_symbol&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;l_currency&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;l_linestatus&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;O&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The query above uses a &lt;code&gt;FOR SYSTEM_TIME AS OF&lt;/code&gt; &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/streaming/temporal_tables.html#temporal-table&quot;&gt;clause&lt;/a&gt; to execute a temporal join. If you’d like to learn more about the different kinds of joins you can do in Flink, I highly encourage you to check &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sql/queries.html#joins&quot;&gt;this documentation page&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Catalogs can be extremely powerful when building data platforms aimed at reusing the work of different teams in an organization. Centralizing the metadata is a common practice for improving productivity, security, and compliance when working with data.&lt;/p&gt;
&lt;p&gt;Flink provides flexible metadata management capabilities that aim at reducing the cumbersome, repetitive work needed before querying the data, such as defining schemas, connection properties, etc. As of version 1.11, Flink provides a native, comprehensive integration with Hive Metastore and a read-only version for Postgres catalogs.&lt;/p&gt;
&lt;p&gt;You can get started with Flink and catalogs by reading &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/catalogs.html&quot;&gt;the docs&lt;/a&gt;. If you want to play around with Flink SQL (e.g. try out how catalogs work in Flink yourself), you can check &lt;a href=&quot;https://github.com/fhueske/flink-sql-demo&quot;&gt;this demo&lt;/a&gt; prepared by our colleagues Fabian and Timo — it runs in a dockerized environment, and we used it for the examples in this blog post.&lt;/p&gt;
</description>
<pubDate>Thu, 23 Jul 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/07/23/catalogs.html</link>
<guid isPermaLink="true">/2020/07/23/catalogs.html</guid>
</item>
<item>
<title>Apache Flink 1.11.1 Released</title>
<description>&lt;p&gt;The Apache Flink community released the first bugfix version of the Apache Flink 1.11 series.&lt;/p&gt;
&lt;p&gt;This release includes 44 fixes and minor improvements for Flink 1.11.0. The list below provides a detailed overview of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.11.1.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.11.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15794&quot;&gt;FLINK-15794&lt;/a&gt;] - Rethink default value of kubernetes.container.image
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18324&quot;&gt;FLINK-18324&lt;/a&gt;] - Translate updated data type and function page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18387&quot;&gt;FLINK-18387&lt;/a&gt;] - Translate &amp;quot;BlackHole SQL Connector&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18388&quot;&gt;FLINK-18388&lt;/a&gt;] - Translate &amp;quot;CSV Format&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18391&quot;&gt;FLINK-18391&lt;/a&gt;] - Translate &amp;quot;Avro Format&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18395&quot;&gt;FLINK-18395&lt;/a&gt;] - Translate &amp;quot;ORC Format&amp;quot; page into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18469&quot;&gt;FLINK-18469&lt;/a&gt;] - Add Application Mode to release notes.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18524&quot;&gt;FLINK-18524&lt;/a&gt;] - Scala varargs cause exception for new inference
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15414&quot;&gt;FLINK-15414&lt;/a&gt;] - KafkaITCase#prepare failed in travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16181&quot;&gt;FLINK-16181&lt;/a&gt;] - IfCallGen will throw NPE for primitive types in blink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16572&quot;&gt;FLINK-16572&lt;/a&gt;] - CheckPubSubEmulatorTest is flaky on Azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17543&quot;&gt;FLINK-17543&lt;/a&gt;] - Rerunning failed azure jobs fails when uploading logs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17636&quot;&gt;FLINK-17636&lt;/a&gt;] - SingleInputGateTest.testConcurrentReadStateAndProcessAndClose: Trying to read from released RecoveredInputChannel
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18097&quot;&gt;FLINK-18097&lt;/a&gt;] - History server doesn&amp;#39;t clean all job json files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18419&quot;&gt;FLINK-18419&lt;/a&gt;] - Can not create a catalog from user jar
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18434&quot;&gt;FLINK-18434&lt;/a&gt;] - Can not select fields with JdbcCatalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18440&quot;&gt;FLINK-18440&lt;/a&gt;] - ROW_NUMBER function: ROW/RANGE not allowed with RANK, DENSE_RANK or ROW_NUMBER functions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18461&quot;&gt;FLINK-18461&lt;/a&gt;] - Changelog source can&amp;#39;t be insert into upsert sink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18470&quot;&gt;FLINK-18470&lt;/a&gt;] - Tests RocksKeyGroupsRocksSingleStateIteratorTest#testMergeIteratorByte &amp;amp; RocksKeyGroupsRocksSingleStateIteratorTest#testMergeIteratorShort fail locally
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18471&quot;&gt;FLINK-18471&lt;/a&gt;] - flink-runtime lists &amp;quot;org.uncommons.maths:uncommons-maths:1.2.2a&amp;quot; as a bundled dependency, but it isn&amp;#39;t
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18477&quot;&gt;FLINK-18477&lt;/a&gt;] - ChangelogSocketExample does not work
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18478&quot;&gt;FLINK-18478&lt;/a&gt;] - AvroDeserializationSchema does not work with types generated by avrohugger
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18485&quot;&gt;FLINK-18485&lt;/a&gt;] - Kerberized YARN per-job on Docker test failed during unzip jce_policy-8.zip
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18519&quot;&gt;FLINK-18519&lt;/a&gt;] - Propagate exception to client when execution fails for REST submission
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18520&quot;&gt;FLINK-18520&lt;/a&gt;] - New Table Function type inference fails
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18529&quot;&gt;FLINK-18529&lt;/a&gt;] - Query Hive table and filter by timestamp partition can fail
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18539&quot;&gt;FLINK-18539&lt;/a&gt;] - StreamExecutionEnvironment#addSource(SourceFunction, TypeInformation) doesn&amp;#39;t use the user defined type information
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18573&quot;&gt;FLINK-18573&lt;/a&gt;] - InfluxDB reporter cannot be loaded as plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18583&quot;&gt;FLINK-18583&lt;/a&gt;] - The _id field is incorrectly set to index in Elasticsearch6 DynamicTableSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18585&quot;&gt;FLINK-18585&lt;/a&gt;] - Dynamic index can not work in new DynamicTableSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18591&quot;&gt;FLINK-18591&lt;/a&gt;] - Fix the format issue for metrics web page
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18186&quot;&gt;FLINK-18186&lt;/a&gt;] - Various updates on Kubernetes standalone document
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18422&quot;&gt;FLINK-18422&lt;/a&gt;] - Update Prefer tag in documentation &amp;#39;Fault Tolerance training lesson&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18457&quot;&gt;FLINK-18457&lt;/a&gt;] - Fix invalid links in &amp;quot;Detecting Patterns&amp;quot; page of &amp;quot;Streaming Concepts&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18472&quot;&gt;FLINK-18472&lt;/a&gt;] - Local Installation Getting Started Guide
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18484&quot;&gt;FLINK-18484&lt;/a&gt;] - RowSerializer arity error does not provide specific information about the mismatch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18501&quot;&gt;FLINK-18501&lt;/a&gt;] - Mapping of Pluggable Filesystems to scheme is not properly logged
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18526&quot;&gt;FLINK-18526&lt;/a&gt;] - Add the configuration of Python UDF using Managed Memory in the doc of Pyflink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18532&quot;&gt;FLINK-18532&lt;/a&gt;] - Remove Beta tag from MATCH_RECOGNIZE docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18561&quot;&gt;FLINK-18561&lt;/a&gt;] - Build manylinux1 with better compatibility instead of manylinux2014 Python Wheel Packages
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18593&quot;&gt;FLINK-18593&lt;/a&gt;] - Hive bundle jar URLs are broken
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Test
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18534&quot;&gt;FLINK-18534&lt;/a&gt;] - KafkaTableITCase.testKafkaDebeziumChangelogSource failed with &amp;quot;Topic &amp;#39;changelog_topic&amp;#39; already exists&amp;quot;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18502&quot;&gt;FLINK-18502&lt;/a&gt;] - Add the page &amp;#39;legacySourceSinks.zh.md&amp;#39; into the directory &amp;#39;docs/dev/table&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18505&quot;&gt;FLINK-18505&lt;/a&gt;] - Correct the content of &amp;#39;sourceSinks.zh.md&amp;#39;
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 21 Jul 2020 20:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/07/21/release-1.11.1.html</link>
<guid isPermaLink="true">/news/2020/07/21/release-1.11.1.html</guid>
</item>
<item>
<title>Application Deployment in Flink: Current State and the new Application Mode</title>
<description>&lt;p&gt;With the rise of stream processing and real-time analytics as a critical tool for modern
businesses, an increasing number of organizations build platforms with Apache Flink at their
core and offer it internally as a service. Many talks on related topics from companies
like &lt;a href=&quot;https://www.youtube.com/watch?v=VX3S9POGAdU&quot;&gt;Uber&lt;/a&gt;, &lt;a href=&quot;https://www.youtube.com/watch?v=VX3S9POGAdU&quot;&gt;Netflix&lt;/a&gt;
and &lt;a href=&quot;https://www.youtube.com/watch?v=cH9UdK0yYjc&quot;&gt;Alibaba&lt;/a&gt; in the latest editions of Flink Forward further
illustrate this trend.&lt;/p&gt;
&lt;p&gt;These platforms aim at simplifying application submission internally by lifting all the
operational burden from the end user. To submit Flink applications, these platforms
usually expose only a centralized or low-parallelism endpoint (&lt;em&gt;e.g.&lt;/em&gt; a Web frontend)
for application submission that we will call the &lt;em&gt;Deployer&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;One of the roadblocks that platform developers and maintainers often mention is that the
Deployer can be a heavy resource consumer that is difficult to provision for. Provisioning
for average load can lead to the Deployer service being overwhelmed with deployment
requests (in the worst case, for all production applications in a short period of time),
while provisioning for peak load leads to unnecessary costs. Building on this observation,
Flink 1.11 introduces the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/deployment/#application-mode&quot;&gt;Application Mode&lt;/a&gt;
as a deployment option, which allows for a lightweight, more scalable application
submission process that spreads the application deployment load more evenly
across the nodes in the cluster.&lt;/p&gt;
&lt;p&gt;In order to understand the problem and how the Application Mode solves it, we start by
briefly describing the current status of application execution in Flink, before
describing the architectural changes introduced by the new deployment mode and how to
leverage them.&lt;/p&gt;
&lt;h1 id=&quot;application-execution-in-flink&quot;&gt;Application Execution in Flink&lt;/h1&gt;
&lt;p&gt;The execution of an application in Flink mainly involves three entities: the &lt;em&gt;Client&lt;/em&gt;,
the &lt;em&gt;JobManager&lt;/em&gt; and the &lt;em&gt;TaskManagers&lt;/em&gt;. The Client is responsible for submitting the application to the
cluster, the JobManager is responsible for the necessary bookkeeping during execution,
and the TaskManagers are the ones doing the actual computation. For more details please
refer to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/concepts/flink-architecture.html&quot;&gt;Flink’s Architecture&lt;/a&gt;
documentation page.&lt;/p&gt;
&lt;h2 id=&quot;current-deployment-modes&quot;&gt;Current Deployment Modes&lt;/h2&gt;
&lt;p&gt;Before the introduction of the Application Mode in version 1.11, Flink allowed users to execute an application either on a
&lt;em&gt;Session&lt;/em&gt; or a &lt;em&gt;Per-Job Cluster&lt;/em&gt;. The differences between the two have to do with the cluster
lifecycle and the resource isolation guarantees they provide.&lt;/p&gt;
&lt;h3 id=&quot;session-mode&quot;&gt;Session Mode&lt;/h3&gt;
&lt;p&gt;Session Mode assumes an already running cluster and uses the resources of that cluster to
execute any submitted application. Applications executed in the same (session) cluster use,
and consequently compete for, the same resources. This has the advantage that you do not
pay the resource overhead of spinning up a full cluster for every submitted job. But, if
one of the jobs misbehaves or brings down a TaskManager, then all jobs running on that
TaskManager will be affected by the failure. Apart from a negative impact on the job that
caused the failure, this implies a potential massive recovery process with all the
restarting jobs accessing the file system concurrently and making it unavailable to other
services. Additionally, having a single cluster running multiple jobs implies more load
for the JobManager, which is responsible for the bookkeeping of all the jobs in the
cluster. This mode is ideal for short jobs where startup latency is of high importance,
&lt;em&gt;e.g.&lt;/em&gt; interactive queries.&lt;/p&gt;
&lt;h3 id=&quot;per-job-mode&quot;&gt;Per-Job Mode&lt;/h3&gt;
&lt;p&gt;In Per-Job Mode, the available cluster manager framework (&lt;em&gt;e.g.&lt;/em&gt; YARN or Kubernetes) is
used to spin up a Flink cluster for each submitted job, which is available to that job
only. When the job finishes, the cluster is shut down and any lingering resources
(&lt;em&gt;e.g.&lt;/em&gt; files) are cleaned up. This mode allows for better resource isolation, as a
misbehaving job cannot affect any other job. In addition, it spreads the load of
bookkeeping across multiple entities, as each application has its own JobManager.
Given the aforementioned resource isolation concerns of the Session Mode, users often
opt for the Per-Job Mode for long-running jobs which are willing to accept some increase
in startup latency in favor of resilience.&lt;/p&gt;
&lt;p&gt;To summarize, in Session Mode, the cluster lifecycle is independent of any job running on
the cluster and all jobs running on the cluster share its resources. The per-job mode
chooses to pay the price of spinning up a cluster for every submitted job, in order to
provide better resource isolation guarantees as the resources are not shared across jobs.
In this case, the lifecycle of the cluster is bound to that of the job.&lt;/p&gt;
&lt;h2 id=&quot;application-submission&quot;&gt;Application Submission&lt;/h2&gt;
&lt;p&gt;Flink application execution consists of two stages: &lt;em&gt;pre-flight&lt;/em&gt;, when the users’ &lt;code&gt;main()&lt;/code&gt;
method is called; and &lt;em&gt;runtime&lt;/em&gt;, which is triggered as soon as the user code calls &lt;code&gt;execute()&lt;/code&gt;.
The &lt;code&gt;main()&lt;/code&gt; method constructs the user program using one of Flink’s APIs
(DataStream API, Table API, DataSet API). When the &lt;code&gt;main()&lt;/code&gt; method calls &lt;code&gt;env.execute()&lt;/code&gt;,
the user-defined pipeline is translated into a form that Flink’s runtime can understand,
called the &lt;em&gt;job graph&lt;/em&gt;, and it is shipped to the cluster.&lt;/p&gt;
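&lt;p&gt;To make the two stages concrete, the snippet below is a minimal sketch of such an application; the class name and the tiny in-memory source are purely illustrative.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MyApplication {
    public static void main(String[] args) throws Exception {
        // Pre-flight: main() assembles the user-defined pipeline.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1, 2, 3, 4)        // illustrative in-memory source
           .filter(i -&amp;gt; i % 2 == 0)
           .print();

        // Runtime: execute() translates the pipeline into the job graph
        // and ships it to the cluster for execution.
        env.execute(&quot;my-application&quot;);
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;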
&lt;p&gt;Despite their differences, both session and per-job modes execute the application’s &lt;code&gt;main()&lt;/code&gt;
method, &lt;em&gt;i.e.&lt;/em&gt; the &lt;em&gt;pre-flight&lt;/em&gt; phase, on the client side.&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;This is usually not a problem for individual users who already have all the dependencies
of their jobs locally, and then submit their applications through a client running on
their machine. But in the case of submission through a remote entity like the Deployer,
this process includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;downloading the application’s dependencies locally,&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;executing the &lt;code&gt;main()&lt;/code&gt; method to extract the job graph,&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;shipping the job graph and its dependencies to the cluster for execution, and&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;potentially, waiting for the result.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This makes the Client a heavy resource consumer as it may need substantial network
bandwidth to download dependencies and ship binaries to the cluster, and CPU cycles to
execute the &lt;code&gt;main()&lt;/code&gt; method. This problem is even more pronounced as more users share
the same Client.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-07-14-application-mode/session-per-job.png&quot; width=&quot;75%&quot; alt=&quot;Session and Per-Job Mode&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;The figure above illustrates the two deployment modes using 3 applications depicted in
&lt;span style=&quot;color:red&quot;&gt;red&lt;/span&gt;, &lt;span style=&quot;color:blue&quot;&gt;blue&lt;/span&gt; and &lt;span style=&quot;color:green&quot;&gt;green&lt;/span&gt;.
Each one has a parallelism of 3. The black rectangles represent
different processes: TaskManagers, JobManagers and the Deployer; and we assume a single
Deployer process in all scenarios. The colored triangles represent the load of the
submission process, while the colored rectangles represent the load of the TaskManager
and JobManager processes. As shown in the figure, the Deployer in both per-job and
session mode shares the same load. Their difference lies in the distribution of the
tasks and the JobManager load. In the Session Mode, there is a single JobManager for
all the jobs in the cluster while in the per-job mode, there is one for each job. In
addition, tasks in Session Mode are assigned randomly to TaskManagers while in Per-Job
Mode, each TaskManager can only have tasks of a single job.&lt;/p&gt;
&lt;h1 id=&quot;application-mode&quot;&gt;Application Mode&lt;/h1&gt;
&lt;p&gt;&lt;img style=&quot;float: right;margin-left:10px;margin-right: 15px;&quot; src=&quot;/img/blog/2020-07-14-application-mode/application.png&quot; width=&quot;320px&quot; alt=&quot;Application Mode&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The Application Mode builds on the above observations and tries to combine the resource
isolation of the per-job mode with a lightweight and scalable application submission
process. To achieve this, it creates a cluster &lt;em&gt;per submitted application&lt;/em&gt;, but this
time, the &lt;code&gt;main()&lt;/code&gt; method of the application is executed on the JobManager.&lt;/p&gt;
&lt;p&gt;Creating a cluster per application can be seen as creating a session cluster shared
only among the jobs of a particular application and torn down when the application
finishes. With this architecture, the Application Mode provides the same resource
isolation and load balancing guarantees as the Per-Job Mode, but at the granularity of
a whole application. This makes sense, as jobs belonging to the same application are
expected to be correlated and treated as a unit.&lt;/p&gt;
&lt;p&gt;Executing the &lt;code&gt;main()&lt;/code&gt; method on the JobManager saves not only the CPU cycles required
for extracting the job graph, but also the bandwidth required on the client for
downloading the dependencies locally and shipping the job graph and its dependencies
to the cluster. Furthermore, it spreads the network load more evenly, as there is one
JobManager per application. This is illustrated in the figure above, where we have the
same scenario as in the session and per-job deployment mode section, but this time
the client load has shifted to the JobManager of each application.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
In the Application Mode, the main() method is executed on the cluster and not on the Client, as in the other modes.
This may have implications for your code as, for example, any paths you register in your
environment using registerCachedFile() must be accessible by the JobManager of
your application.&lt;/p&gt;
&lt;/div&gt;
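&lt;p&gt;As a small illustration, the sketch below registers a file in the distributed cache; the paths and the cache entry name are purely hypothetical.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

// In Application Mode, main() runs on the JobManager, so this path must be
// reachable from the cluster (e.g. a DFS location), not only from the
// machine that submits the application.
env.registerCachedFile(&quot;hdfs://myhdfs/data/lookup-data.csv&quot;, &quot;lookupData&quot;);

// A path that exists only on the submitting machine would work when main()
// runs on the Client, but not when it is executed on the cluster:
// env.registerCachedFile(&quot;file:///home/alice/lookup-data.csv&quot;, &quot;lookupData&quot;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;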
&lt;p&gt;Compared to the Per-Job Mode, the Application Mode allows the submission of applications
consisting of multiple jobs. The order of job execution is not affected by the deployment
mode but by the call used to launch the job. Using the blocking &lt;code&gt;execute()&lt;/code&gt; method
establishes an order and will lead to the execution of the “next” job being postponed
until “this” job finishes. In contrast, the non-blocking &lt;code&gt;executeAsync()&lt;/code&gt; method will
immediately continue to submit the “next” job as soon as the current job is submitted.&lt;/p&gt;
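&lt;p&gt;The following sketch, with illustrative job names and sources, shows how the two calls affect the ordering of jobs inside a single application.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

// Job 1: the blocking execute() call returns only when this job finishes,
// so anything submitted afterwards is postponed until then.
env.fromElements(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;).print();
env.execute(&quot;first-job&quot;);

// Job 2: executeAsync() submits the job and returns immediately with a
// JobClient, so main() continues while the job is still running.
env.fromElements(&quot;d&quot;, &quot;e&quot;, &quot;f&quot;).print();
JobClient secondJob = env.executeAsync(&quot;second-job&quot;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;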
&lt;h2 id=&quot;reducing-network-requirements&quot;&gt;Reducing Network Requirements&lt;/h2&gt;
&lt;p&gt;As described above, by executing the application’s &lt;code&gt;main()&lt;/code&gt; method on the JobManager,
the Application Mode manages to save a lot of the resources previously required during
job submission. But there is still room for improvement.&lt;/p&gt;
&lt;p&gt;Focusing on YARN, which already supports all the optimizations mentioned here&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;,
we note that even with the Application Mode in place, the Client is still required to send the user
jar to the JobManager. In addition, &lt;em&gt;for each application&lt;/em&gt;, the Client has to ship to
the cluster the “flink-dist” directory which contains the binaries of the framework
itself, including the &lt;code&gt;flink-dist.jar&lt;/code&gt;, &lt;code&gt;lib/&lt;/code&gt; and &lt;code&gt;plugin/&lt;/code&gt; directories. These two can
account for a substantial amount of bandwidth on the client side. Furthermore, shipping
the same flink-dist binaries on every submission is a waste of both bandwidth and
storage space, which can be alleviated by simply allowing applications to share the
same binaries.&lt;/p&gt;
&lt;p&gt;In Flink 1.11, we introduce options that allow the user to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Specify a remote path to a directory where YARN can find the Flink distribution binaries, and&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Specify a remote path where YARN can find the user jar.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For 1., we leverage YARN’s distributed cache and allow applications to share these
binaries. So, if an application happens to find copies of Flink on the local storage
of its TaskManager due to a previous application that was executed on the same
TaskManager, it will not even have to download them again.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Both optimizations are available to all deployment modes on YARN, and not only the Application Mode.&lt;/p&gt;
&lt;/div&gt;
&lt;h1 id=&quot;example-application-mode-on-yarn&quot;&gt;Example: Application Mode on Yarn&lt;/h1&gt;
&lt;p&gt;For a full description, please refer to the official Flink documentation and more
specifically to the page that refers to your cluster management framework, &lt;em&gt;e.g.&lt;/em&gt;
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/deployment/yarn_setup.html#run-an-application-in-application-mode&quot;&gt;YARN&lt;/a&gt;
or &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/deployment/native_kubernetes.html#flink-kubernetes-application&quot;&gt;Kubernetes&lt;/a&gt;.
Here we will give some examples around YARN, where all the above features are available.&lt;/p&gt;
&lt;p&gt;To launch an application in Application Mode, you can use:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;b&gt;./bin/flink run-application -t yarn-application&lt;/b&gt; ./MyApplication.jar&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With this command, all configuration parameters, such as the path to a savepoint to
be used to bootstrap the application’s state or the required JobManager/TaskManager
memory sizes, can be specified by their configuration option, prefixed by &lt;code&gt;-D&lt;/code&gt;. For
a catalog of the available configuration options, please refer to Flink’s
&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/ops/config.html&quot;&gt;configuration page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As an example, the command to specify the memory sizes of the JobManager and the
TaskManager would look like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./bin/flink run-application -t yarn-application \
&lt;b&gt;-Djobmanager.memory.process.size=2048m&lt;/b&gt; \
&lt;b&gt;-Dtaskmanager.memory.process.size=4096m&lt;/b&gt; \
./MyApplication.jar
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As discussed earlier, the above will make sure that your application’s &lt;code&gt;main()&lt;/code&gt; method
will be executed on the JobManager.&lt;/p&gt;
&lt;p&gt;To further save the bandwidth of shipping the Flink distribution to the cluster, consider
pre-uploading the Flink distribution to a location accessible by YARN and using the
&lt;code&gt;yarn.provided.lib.dirs&lt;/code&gt; configuration option, as shown below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./bin/flink run-application -t yarn-application \
-Djobmanager.memory.process.size=2048m \
-Dtaskmanager.memory.process.size=4096m \
&lt;b&gt;-Dyarn.provided.lib.dirs=&quot;hdfs://myhdfs/remote-flink-dist-dir&quot;&lt;/b&gt; \
./MyApplication.jar
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, in order to further save the bandwidth required to submit your application jar,
you can pre-upload it to HDFS, and specify the remote path that points to
&lt;code&gt;./MyApplication.jar&lt;/code&gt;, as shown below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;./bin/flink run-application -t yarn-application \
-Djobmanager.memory.process.size=2048m \
-Dtaskmanager.memory.process.size=4096m \
-Dyarn.provided.lib.dirs=&quot;hdfs://myhdfs/remote-flink-dist-dir&quot; \
&lt;b&gt;hdfs://myhdfs/jars/MyApplication.jar&lt;/b&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will make the job submission extra lightweight as the needed Flink jars and the
application jar are going to be picked up from the specified remote locations rather
than be shipped to the cluster by the Client. The only thing the Client will ship to
the cluster is the configuration of your application which includes all the
aforementioned paths.&lt;/p&gt;
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;We hope that this discussion helped you understand the differences between the various
deployment modes offered by Flink and will help you to make informed decisions about
which one is suitable in your own setup. Feel free to play around with them and report
any issues you may find. If you have any questions or requests, do not hesitate to post
them in the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt;
and, hopefully, see you (virtually) at one of our conferences or meetups soon!&lt;/p&gt;
&lt;div class=&quot;footnotes&quot;&gt;
&lt;ol&gt;
&lt;li id=&quot;fn:1&quot;&gt;
&lt;p&gt;The only exceptions are the Web Submission and the Standalone per-job implementation. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:2&quot;&gt;
&lt;p&gt;Support for Kubernetes will come soon. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
<pubDate>Tue, 14 Jul 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/07/14/application-mode.html</link>
<guid isPermaLink="true">/news/2020/07/14/application-mode.html</guid>
</item>
<item>
<title>Apache Flink 1.11.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is proud to announce the release of Flink 1.11.0! More than 200 contributors worked on over 1.3k issues to bring significant improvements to usability as well as new features to Flink users across the whole API stack. Some highlights that we’re particularly excited about are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The core engine is introducing &lt;strong&gt;unaligned checkpoints&lt;/strong&gt;, a major change to Flink’s fault tolerance mechanism that improves checkpointing performance under heavy backpressure.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A &lt;strong&gt;new Source API&lt;/strong&gt; that simplifies the implementation of (custom) sources by unifying batch and streaming execution, as well as offloading internals such as event-time handling, watermark generation or idleness detection to Flink.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Flink SQL is introducing &lt;strong&gt;Support for Change Data Capture (CDC)&lt;/strong&gt; to easily consume and interpret database changelogs from tools like Debezium. The renewed &lt;strong&gt;FileSystem Connector&lt;/strong&gt; also expands the set of use cases and formats supported in the Table API/SQL, enabling scenarios like streaming data directly from Kafka to Hive.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Multiple performance optimizations to PyFlink, including support for &lt;strong&gt;vectorized User-defined Functions (Pandas UDFs)&lt;/strong&gt;. This improves interoperability with libraries like Pandas and NumPy, making Flink more powerful for data science and ML workloads.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Read on for all major new features and improvements, important changes to be aware of and what to expect moving forward!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features-and-improvements&quot; id=&quot;markdown-toc-new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#unaligned-checkpoints-beta&quot; id=&quot;markdown-toc-unaligned-checkpoints-beta&quot;&gt;Unaligned Checkpoints (Beta)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#unified-watermark-generators&quot; id=&quot;markdown-toc-unified-watermark-generators&quot;&gt;Unified Watermark Generators&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-data-source-api-beta&quot; id=&quot;markdown-toc-new-data-source-api-beta&quot;&gt;New Data Source API (Beta)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#application-mode-deployments&quot; id=&quot;markdown-toc-application-mode-deployments&quot;&gt;Application Mode Deployments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements&quot; id=&quot;markdown-toc-other-improvements&quot;&gt;Other Improvements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-support-for-change-data-capture-cdc&quot; id=&quot;markdown-toc-table-apisql-support-for-change-data-capture-cdc&quot;&gt;Table API/SQL: Support for Change Data Capture (CDC)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-jdbc-catalog-interface-and-postgres-catalog&quot; id=&quot;markdown-toc-table-apisql-jdbc-catalog-interface-and-postgres-catalog&quot;&gt;Table API/SQL: JDBC Catalog Interface and Postgres Catalog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-filesystem-connector-with-support-for-avro-orc-and-parquet&quot; id=&quot;markdown-toc-table-apisql-filesystem-connector-with-support-for-avro-orc-and-parquet&quot;&gt;Table API/SQL: FileSystem Connector with Support for Avro, ORC and Parquet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-support-for-python-udfs&quot; id=&quot;markdown-toc-table-apisql-support-for-python-udfs&quot;&gt;Table API/SQL: Support for Python UDFs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements-to-the-table-apisql&quot; id=&quot;markdown-toc-other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pyflink-support-for-pandas-udfs&quot; id=&quot;markdown-toc-pyflink-support-for-pandas-udfs&quot;&gt;PyFlink: Support for Pandas UDFs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements-to-pyflink&quot; id=&quot;markdown-toc-other-improvements-to-pyflink&quot;&gt;Other Improvements to PyFlink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#important-changes&quot; id=&quot;markdown-toc-important-changes&quot;&gt;Important Changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt; of the Flink website, and the most recent distribution of PyFlink is available on &lt;a href=&quot;https://pypi.org/project/apache-flink/&quot;&gt;PyPI&lt;/a&gt;. Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/release-notes/flink-1.11.html&quot;&gt;release notes&lt;/a&gt; carefully, and check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346364&amp;amp;styleName=Html&amp;amp;projectId=12315522&quot;&gt;release changelog&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/flink-docs-release-1.11/&quot;&gt;updated documentation&lt;/a&gt; for more details.&lt;/p&gt;
&lt;p&gt;We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt; or &lt;a href=&quot;https://issues.apache.org/jira/projects/FLINK/summary&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/h2&gt;
&lt;h3 id=&quot;unaligned-checkpoints-beta&quot;&gt;Unaligned Checkpoints (Beta)&lt;/h3&gt;
&lt;p&gt;Triggering a checkpoint in Flink will cause a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/internals/stream_checkpointing.html#barriers&quot;&gt;checkpoint barrier&lt;/a&gt; to flow from the sources of your topology all the way towards the sinks. For operators that receive more than one input stream, the barriers flowing through each channel need to be aligned before the operator can snapshot its state and forward the checkpoint barrier — typically, this alignment will take just a few milliseconds to complete, but it can become a bottleneck in backpressured pipelines as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Checkpoint barriers will flow much slower through backpressured channels, effectively blocking the remaining channels and their upstream operators during checkpointing;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Slow checkpoint barrier propagation leads to longer checkpointing times and can, worst case, result in little to no progress in the application.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To improve the performance of checkpointing under backpressure scenarios, the community is rolling out the first iteration of unaligned checkpoints (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints&quot;&gt;FLIP-76&lt;/a&gt;) with Flink 1.11. Compared to the original checkpointing mechanism (Fig. 1), this approach doesn’t wait for barrier alignment across input channels, instead allowing barriers to overtake in-flight records (i.e., data stored in buffers) and forwarding them downstream before the synchronous part of the checkpoint takes place (Fig. 2).&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-6&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-07-06-release-1.11.0/image1.gif&quot; width=&quot;600px&quot; alt=&quot;Aligned Checkpoints&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.1:&lt;/b&gt; Aligned Checkpoints&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-6&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-07-06-release-1.11.0/image2.png&quot; width=&quot;600px&quot; alt=&quot;Unaligned Checkpoints&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.2:&lt;/b&gt; Unaligned Checkpoints&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Because in-flight records have to be persisted as part of the snapshot, unaligned checkpoints will lead to increased checkpoint sizes. On the upside, &lt;strong&gt;checkpointing times are heavily reduced&lt;/strong&gt;, so users will see more progress (even in unstable environments) as more up-to-date checkpoints will lighten the recovery process. You can learn more about the current limitations of unaligned checkpoints in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/state/checkpoints.html#unaligned-checkpoints&quot;&gt;documentation&lt;/a&gt;, and track the improvement work planned for this feature in &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14551&quot;&gt;FLINK-14551&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As with any beta feature, we appreciate early feedback that you might want to share with the community after giving unaligned checkpoints a try!&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot;&gt;Info&lt;/span&gt; To enable this feature, you need to configure the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/environment/CheckpointConfig.html&quot;&gt;&lt;code&gt;enableUnalignedCheckpoints&lt;/code&gt;&lt;/a&gt; option in your &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing&quot;&gt;checkpoint config&lt;/a&gt;. Please note that unaligned checkpoints can only be enabled if &lt;code&gt;checkpointingMode&lt;/code&gt; is set to &lt;code&gt;CheckpointingMode.EXACTLY_ONCE&lt;/code&gt;.&lt;/p&gt;
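&lt;p&gt;As a minimal sketch (the checkpoint interval below is only an illustrative value), enabling the feature from the DataStream API could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

// Unaligned checkpoints require exactly-once checkpointing mode.
env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

// Opt in to the new (beta) behaviour.
env.getCheckpointConfig().enableUnalignedCheckpoints();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;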
&lt;h3 id=&quot;unified-watermark-generators&quot;&gt;Unified Watermark Generators&lt;/h3&gt;
&lt;p&gt;So far, watermark generation (prev. also called &lt;em&gt;assignment&lt;/em&gt;) relied on two different interfaces: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/functions/AssignerWithPunctuatedWatermarks.html&quot;&gt;&lt;code&gt;AssignerWithPunctuatedWatermarks&lt;/code&gt;&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/functions/AssignerWithPeriodicWatermarks.html&quot;&gt;&lt;code&gt;AssignerWithPeriodicWatermarks&lt;/code&gt;&lt;/a&gt;; that were closely intertwined with timestamp extraction. This made it difficult to implement long-requested features like support for idleness detection, besides leading to code duplication and maintenance burden. With &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners&quot;&gt;FLIP-126&lt;/a&gt;, the legacy watermark assigners are unified into a single interface: the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkGenerator.html&quot;&gt;&lt;code&gt;WatermarkGenerator&lt;/code&gt;&lt;/a&gt;; and detached from the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/TimestampAssigner.html&quot;&gt;&lt;code&gt;TimestampAssigner&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This gives users more control over watermark emission and simplifies the implementation of new connectors that need to support watermark assignment and timestamp extraction at the source (see &lt;em&gt;&lt;a href=&quot;#new-data-source-api-beta&quot;&gt;New Data Source API&lt;/a&gt;&lt;/em&gt;). Multiple &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11//dev/event_timestamps_watermarks.html#introduction-to-watermark-strategies&quot;&gt;strategies for watermarking&lt;/a&gt; are available out-of-the-box as convenience methods in Flink 1.11 (e.g. &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#forBoundedOutOfOrderness-java.time.Duration-&quot;&gt;&lt;code&gt;forBoundedOutOfOrderness&lt;/code&gt;&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#forMonotonousTimestamps--&quot;&gt;&lt;code&gt;forMonotonousTimestamps&lt;/code&gt;&lt;/a&gt;), though you can also choose to customize your own.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Support for Watermark Idleness Detection&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#withIdleness-java.time.Duration-&quot;&gt;&lt;code&gt;WatermarkStrategy.withIdleness()&lt;/code&gt;&lt;/a&gt; method allows you to mark a stream as idle if no events arrive within a configured time (i.e. a timeout duration), which in turn allows handling event time skew properly and preventing idle partitions from holding back the event time progress of the entire application. Users can already benefit from &lt;strong&gt;per-partition idleness detection&lt;/strong&gt; in the Kafka connector, which has been adapted to use the new interfaces (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17669&quot;&gt;FLINK-17669&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot;&gt;Note&lt;/span&gt; &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners&quot;&gt;FLIP-126&lt;/a&gt; introduces no breaking changes, but we recommend that users give preference to the new &lt;code&gt;WatermarkGenerator&lt;/code&gt; interface moving forward, in preparation for the deprecation of the legacy watermark assigners in future releases.&lt;/p&gt;
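&lt;p&gt;As an illustration of the new interface, the sketch below combines a bounded-out-of-orderness strategy with idleness detection, assuming an existing event stream &lt;code&gt;events&lt;/code&gt; of a hypothetical &lt;code&gt;MyEvent&lt;/code&gt; type; the timestamp accessor and the chosen durations are also illustrative.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;DataStream&amp;lt;MyEvent&amp;gt; withTimestampsAndWatermarks = events
    .assignTimestampsAndWatermarks(
        WatermarkStrategy
            // tolerate up to 5 seconds of out-of-orderness
            .&amp;lt;MyEvent&amp;gt;forBoundedOutOfOrderness(Duration.ofSeconds(5))
            // mark the stream as idle if no events arrive for 1 minute
            .withIdleness(Duration.ofMinutes(1))
            // extract the event-time timestamp from the record itself
            .withTimestampAssigner((event, recordTimestamp) -&amp;gt; event.getTimestamp()));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;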
&lt;h3 id=&quot;new-data-source-api-beta&quot;&gt;New Data Source API (Beta)&lt;/h3&gt;
&lt;p&gt;Up to this point, writing a production-grade source connector for Flink was a non-trivial task that required users to be somewhat familiar with Flink internals and account for implementation details like event time assignment, watermark generation or idleness detection in their code. Flink 1.11 introduces a new Data Source API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface&quot;&gt;FLIP-27&lt;/a&gt;) to overcome these limitations, as well as the need to rewrite separate code for batch and streaming execution.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-07-06-release-1.11.0/image3.png&quot; width=&quot;600px&quot; alt=&quot;Data Source API&quot; /&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Separating the work of split discovery and the actual reading of the consumed data (i.e. the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/sources.html#data-source-concepts&quot;&gt;&lt;em&gt;splits&lt;/em&gt;&lt;/a&gt;) in different components — resp. the &lt;code&gt;SplitEnumerator&lt;/code&gt; and &lt;code&gt;SourceReader&lt;/code&gt; — allows mixing and matching different enumeration strategies and split readers.&lt;/p&gt;
&lt;p&gt;As an example, the existing Kafka connector has multiple strategies for partition discovery that are intermingled with the rest of the code. With the new interfaces in place, it would only need a single reader implementation and there could be several split enumerators for the different partition discovery strategies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Batch and Streaming Unification&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Source connectors implemented using the Data Source API will be able to work both as a bounded (&lt;em&gt;batch&lt;/em&gt;) and unbounded (&lt;em&gt;streaming&lt;/em&gt;) source. The difference between both cases is minimal: for bounded input, the &lt;code&gt;SplitEnumerator&lt;/code&gt; will generate a fixed set of splits and each split is finite; for unbounded input, either the splits are not finite or the &lt;code&gt;SplitEnumerator&lt;/code&gt; keeps generating new splits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Implicit Watermark and Event Time Handling&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;TimestampAssigner&lt;/code&gt; and &lt;code&gt;WatermarkGenerator&lt;/code&gt; run transparently as part of the &lt;code&gt;SourceReader&lt;/code&gt; component, so users also don’t have to implement any timestamp extraction or watermark generation code.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot;&gt;Note&lt;/span&gt; The existing source connectors have not yet been reimplemented using the Data Source API — this is planned for upcoming releases. If you’re looking to implement a new source, please refer to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/sources.html#data-sources&quot;&gt;Data Source documentation&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/stream/sources.html#the-data-source-api&quot;&gt;the tips on source development&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;application-mode-deployments&quot;&gt;Application Mode Deployments&lt;/h3&gt;
&lt;p&gt;Prior to Flink 1.11, jobs in a Flink application could either be submitted to a long-running &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/flink-architecture.html#flink-session-cluster&quot;&gt;Flink Session Cluster&lt;/a&gt; (&lt;em&gt;session mode&lt;/em&gt;) or a dedicated &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/concepts/flink-architecture.html#flink-job-cluster&quot;&gt;Flink Job Cluster&lt;/a&gt; (&lt;em&gt;job mode&lt;/em&gt;). For both these modes, the &lt;code&gt;main()&lt;/code&gt; method of user programs runs on the &lt;em&gt;client&lt;/em&gt; side. This presents some challenges: on one hand, if the client is part of a large installation, it can easily become a bottleneck for &lt;code&gt;JobGraph&lt;/code&gt; generation; and on the other, it’s not a good fit for containerized environments like Docker or Kubernetes.&lt;/p&gt;
&lt;p&gt;From this release on, Flink gets an additional deployment mode: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/deployment/#application-mode&quot;&gt;Application Mode&lt;/a&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode&quot;&gt;FLIP-85&lt;/a&gt;); where the &lt;code&gt;main()&lt;/code&gt; method runs on the cluster, rather than the client. The job submission becomes a one-step process: you package your application logic and dependencies into an executable job JAR and the cluster entrypoint (&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/client/deployment/application/ApplicationClusterEntryPoint.html&quot;&gt;&lt;code&gt;ApplicationClusterEntryPoint&lt;/code&gt;&lt;/a&gt;) is responsible for calling the &lt;code&gt;main()&lt;/code&gt; method to extract the &lt;code&gt;JobGraph&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In Flink 1.11, the community also worked on supporting &lt;em&gt;application mode&lt;/em&gt; in Kubernetes (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10934&quot;&gt;FLINK-10934&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id=&quot;other-improvements&quot;&gt;Other Improvements&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Unified Memory Configuration for JobManagers (&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-16614&quot;&gt;FLIP-116&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Following the work started in Flink 1.10 to improve memory management and configuration, this release introduces a new memory model that aligns the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/memory/mem_setup_jobmanager.html&quot;&gt;JobManagers’ configuration options&lt;/a&gt; and terminology with that introduced in &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors&quot;&gt;FLIP-49&lt;/a&gt; for TaskManagers. This affects all deployment types: standalone, YARN, Mesos and the new active Kubernetes integration.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-danger&quot;&gt;Attention&lt;/span&gt; Reusing a previous Flink configuration without any adjustments can result in differently computed memory parameters for the JVM and, as a result, performance changes or even failures. Make sure to check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/memory/mem_migration.html#migrate-job-manager-memory-configuration&quot;&gt;migration guide&lt;/a&gt; if you’re planning to update to the latest version.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Improvements to the Flink WebUI (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal&quot;&gt;FLIP-75&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In Flink 1.11, the community kicked off a series of improvements to the Flink WebUI. The first to roll out are better TaskManager and JobManager log display (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=147427143&quot;&gt;FLIP-103&lt;/a&gt;), as well as a new thread dump utility (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14816&quot;&gt;FLINK-14816&lt;/a&gt;). Some additional work planned for upcoming releases includes better backpressure detection, more flexible and configurable exception display and support for displaying the history of subtask failure attempts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Docker Image Unification (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification&quot;&gt;FLIP-111&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With this release, all Docker-related resources have been consolidated into &lt;a href=&quot;https://github.com/apache/flink-docker&quot;&gt;apache/flink-docker&lt;/a&gt; and the entry point script has been extended to allow users to run the default Docker image in different modes without the need to create a custom image. The &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/ops/deployment/docker.html#customize-flink-image&quot;&gt;updated documentation&lt;/a&gt; describes in detail how to use and customize the official Flink Docker image for different environments and deployment modes.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;table-apisql-support-for-change-data-capture-cdc&quot;&gt;Table API/SQL: Support for Change Data Capture (CDC)&lt;/h3&gt;
&lt;p&gt;Change Data Capture (CDC) has become a popular pattern to capture committed changes from a database and propagate those changes to downstream consumers, for example to keep multiple datastores in sync and avoid common pitfalls such as &lt;a href=&quot;https://thorben-janssen.com/dual-writes/&quot;&gt;dual writes&lt;/a&gt;. Being able to easily ingest and interpret these changelogs into the Table API/SQL has been a highly demanded feature in the Flink community — and it’s now possible with Flink 1.11.&lt;/p&gt;
&lt;p&gt;To extend the scope of the Table API/SQL to use cases like CDC, Flink 1.11 introduces new table source and sink interfaces with &lt;strong&gt;changelog mode&lt;/strong&gt; (see &lt;em&gt;&lt;a href=&quot;#other-improvements-to-the-table-apisql&quot;&gt;New TableSource and TableSink Interfaces&lt;/a&gt;&lt;/em&gt;) and support for the &lt;a href=&quot;https://debezium.io/&quot;&gt;Debezium&lt;/a&gt; and &lt;a href=&quot;https://github.com/alibaba/canal&quot;&gt;Canal&lt;/a&gt; formats (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=147427289&quot;&gt;FLIP-105&lt;/a&gt;). This means that dynamic table sources are no longer limited to append-only operations and can ingest these external changelogs (&lt;code&gt;INSERT&lt;/code&gt; events), interpret them into change operations (&lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt; events) and emit them downstream with the change type.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-07-06-release-1.11.0/image4.png&quot; width=&quot;500px&quot; alt=&quot;CDC&quot; /&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Users have to specify either &lt;code&gt;“format=debezium-json”&lt;/code&gt; or &lt;code&gt;“format=canal-json”&lt;/code&gt; in their &lt;code&gt;CREATE TABLE&lt;/code&gt; statement to consume changelogs using SQL DDL.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;my_table&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;...&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- e.g. &amp;#39;kafka&amp;#39;&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;debezium-json&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;debezium-json.schema-include&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;true&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- default: false (Debezium can be configured to include or exclude the message schema)&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;debezium-json.ignore-parse-errors&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;true&amp;#39;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- default: false&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Out of the box, Flink 1.11 only supports Kafka as a changelog source and only JSON-encoded changelogs, with Avro (Debezium) and Protobuf (Canal) planned for future releases. There are also plans to support MySQL binlogs and Kafka compacted topics as sources, as well as to extend changelog support to batch execution.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-danger&quot;&gt;Attention&lt;/span&gt; There is a known issue (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18461&quot;&gt;FLINK-18461&lt;/a&gt;) that prevents changelog sources from being used to write to upsert sinks (e.g. MySQL, HBase, Elasticsearch). This will be fixed in the next patch release (1.11.1).&lt;/p&gt;
&lt;h3 id=&quot;table-apisql-jdbc-catalog-interface-and-postgres-catalog&quot;&gt;Table API/SQL: JDBC Catalog Interface and Postgres Catalog&lt;/h3&gt;
&lt;p&gt;Flink 1.11 introduces a generic JDBC catalog interface (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-93%3A+JDBC+catalog+and+Postgres+catalog&quot;&gt;FLIP-93&lt;/a&gt;) that enables users of the Table API/SQL to &lt;strong&gt;derive table schemas automatically&lt;/strong&gt; from connections to relational databases over &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/connect.html#jdbc-connector&quot;&gt;JDBC&lt;/a&gt;. This eliminates the previous need for manual schema definition and type conversion, and also allows to check for schema errors at compile time instead of runtime.&lt;/p&gt;
&lt;p&gt;The first implementation, rolling out with the new release, is the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/catalogs.html#postgrescatalog&quot;&gt;Postgres catalog&lt;/a&gt;.&lt;/p&gt;
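&lt;p&gt;A hedged sketch of how such a catalog might be registered from the Table API is shown below; the catalog name, default database and connection details are purely illustrative.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Connection details are illustrative.
String name            = &quot;mypg&quot;;
String defaultDatabase = &quot;postgres&quot;;
String username        = &quot;flink&quot;;
String password        = &quot;secret&quot;;
String baseUrl         = &quot;jdbc:postgresql://localhost:5432/&quot;;

// Derive table schemas directly from the relational database.
JdbcCatalog catalog = new JdbcCatalog(name, defaultDatabase, username, password, baseUrl);
tableEnv.registerCatalog(name, catalog);
tableEnv.useCatalog(name);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;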
&lt;h3 id=&quot;table-apisql-filesystem-connector-with-support-for-avro-orc-and-parquet&quot;&gt;Table API/SQL: FileSystem Connector with Support for Avro, ORC and Parquet&lt;/h3&gt;
&lt;p&gt;To improve the user experience for end-to-end streaming ETL use cases, the Flink community worked on a new FileSystem Connector for the Table API/SQL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table&quot;&gt;FLIP-115&lt;/a&gt;). The implementation is based on Flink’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/ops/filesystems/index.html&quot;&gt;FileSystem abstraction&lt;/a&gt; and reuses &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html&quot;&gt;StreamingFileSink&lt;/a&gt; to ensure the same set of capabilities and consistent behaviour with the DataStream API.&lt;/p&gt;
&lt;p&gt;This also means that Table API/SQL users can now make use of all formats already supported by StreamingFileSink, like (Avro) Parquet, as well as the new formats introduced with this release, like Avro (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11395&quot;&gt;FLINK-11395&lt;/a&gt;) and Orc (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10114&quot;&gt;FLINK-10114&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;my_table&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;column_name1&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;column_name2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;part_name1&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;part_name2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PARTITIONED&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part_name1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;part_name2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;filesystem&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;path&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;file:///path/to/file&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;...&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;-- supported formats: Avro, ORC, Parquet, CSV, JSON&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The new all-rounder FileSystem Connector transparently handles batch and streaming execution, provides exactly-once guarantees and has full partition support, greatly expanding the scope of usage of the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/connect.html#file-system-connector&quot;&gt;legacy connector&lt;/a&gt;. This allows users to easily implement common use cases like &lt;strong&gt;directly streaming data from Kafka to Hive&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;You can track the upcoming improvements to the FileSystem Connector in &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17778&quot;&gt;FLINK-17778&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;table-apisql-support-for-python-udfs&quot;&gt;Table API/SQL: Support for Python UDFs&lt;/h3&gt;
&lt;p&gt;Prior to this release, users of the Table API/SQL were limited to defining UDFs in either Java or Scala. In Flink 1.11, the community worked on expanding the usage scope of the Python language beyond PyFlink and providing support for Python UDFs in the SQL DDL syntax (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Function+DDL&quot;&gt;FLIP-106&lt;/a&gt;), as well as the SQL Client (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-114%3A+Support+Python+UDF+in+SQL+Client&quot;&gt;FLIP-114&lt;/a&gt;). Users can also register Python UDFs in the system catalog via SQL DDL or the Java Catalog API, so that functions can be shared between jobs.&lt;/p&gt;
&lt;h3 id=&quot;other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;DDL and DML Compatibility for the Hive Connector (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-123%3A+DDL+and+DML+compatibility+for+Hive+connector&quot;&gt;FLIP-123&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Starting from Flink 1.11, users can write SQL statements directly using Hive syntax (HiveQL) in the Table API/SQL and the SQL Client. For this purpose, an additional dialect was introduced and users can now dynamically switch between Flink (&lt;code&gt;default&lt;/code&gt;) and Hive (&lt;code&gt;hive&lt;/code&gt;) on a per-statement basis. For a complete list of supported DDL and DML statements, check the Hive dialect &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/hive/hive_dialect.html#hive-dialect&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
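&lt;p&gt;For example, the dialect can be switched programmatically on an existing &lt;code&gt;TableEnvironment&lt;/code&gt; before submitting a statement (a minimal sketch):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Switch to the Hive dialect for statements written in HiveQL ...
tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);

// ... and back to the default Flink dialect afterwards.
tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;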
&lt;p&gt;&lt;strong&gt;Extensions and Improvements to the Flink SQL Syntax&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Flink 1.11 introduces the concept of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sql/create.html#create-table&quot;&gt;primary key constraints&lt;/a&gt; to leverage runtime optimizations in Flink SQL DDL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP+87%3A+Primary+key+constraints+in+Table+API&quot;&gt;FLIP-87&lt;/a&gt;);&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;View objects are now fully supported in SQL DDL using the &lt;code&gt;CREATE&lt;/code&gt;/&lt;code&gt;ALTER&lt;/code&gt;/&lt;code&gt;DROP VIEW&lt;/code&gt; statements (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-71%3A+E2E+View+support+in+FLINK+SQL&quot;&gt;FLIP-71&lt;/a&gt;);&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Users can now specify or override table options in their DQL/DML statements using &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/dev/table/sql/hints.html#dynamic-table-options&quot;&gt;dynamic table options&lt;/a&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL&quot;&gt;FLIP-113&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;To make connector properties less verbose and improve exception handling, some key properties have been refactored (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory&quot;&gt;FLIP-122&lt;/a&gt;). This change does not break compatibility, so users can still use the old property keys.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;New TableSource and TableSink Interfaces (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces&quot;&gt;FLIP-95&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Flink 1.11 introduces new table source and sink interfaces (resp. &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/connector/source/DynamicTableSource.html&quot;&gt;&lt;code&gt;DynamicTableSource&lt;/code&gt;&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/connector/sink/DynamicTableSink.html&quot;&gt;&lt;code&gt;DynamicTableSink&lt;/code&gt;&lt;/a&gt;) that unify batch and streaming execution, provide more efficient data processing with the Blink planner and offer support for handling changelogs (see &lt;em&gt;&lt;a href=&quot;#table-apisql-support-for-change-data-capture-cdc&quot;&gt;Support for Change Data Capture (CDC)&lt;/a&gt;&lt;/em&gt;). The new interfaces also make it easier for users to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sourceSinks.html#full-stack-example&quot;&gt;implement custom connectors&lt;/a&gt; or modify existing ones. For an end-to-end example on how to implement a custom scan table source with a decoding format that supports changelog semantics, check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/sourceSinks.html#full-stack-example&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot;&gt;Note&lt;/span&gt; Although compatibility is not immediately affected, we recommend that Table API/SQL users update any sources and sinks to the new interface stack.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Refactored TableEnvironment Interface (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878&quot;&gt;FLIP-84&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The semantics to describe similar behaviours in the &lt;code&gt;TableEnvironment&lt;/code&gt; and &lt;code&gt;Table&lt;/code&gt; interfaces have diverged over time, leading to an inconsistent and sometimes unclear user experience. To improve this and make programming more fluent in the Table API/SQL, Flink 1.11 introduces new methods that unify behaviours like execution triggering (e.g. &lt;code&gt;executeSql()&lt;/code&gt;) and result representation (e.g. &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/api/TableResult.html#print--&quot;&gt;&lt;code&gt;print()&lt;/code&gt;&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/api/TableResult.html#collect--&quot;&gt;&lt;code&gt;collect()&lt;/code&gt;&lt;/a&gt;), and also lay the groundwork for important features like &lt;a href=&quot;https://lists.apache.org/thread.html/r076e63bf6c8ed42d1b9ed2b406029696274a3a90cc360bc3a03e65d2%40%3Cdev.flink.apache.org%3E&quot;&gt;multi-statement execution support&lt;/a&gt; in future releases.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot;&gt;Note&lt;/span&gt; The methods deprecated with FLIP-84 will not be immediately removed, but we recommend that users adopt the newly introduced methods. For a complete list of new and deprecated methods, check the “Summary” section of &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878&quot;&gt;FLIP-84&lt;/a&gt;.&lt;/p&gt;
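&lt;p&gt;A minimal, hedged PyFlink sketch of the new unified methods (the &lt;code&gt;orders&lt;/code&gt; table is a placeholder assumed to be registered, and &lt;code&gt;print()&lt;/code&gt; mirrors the Java &lt;code&gt;TableResult&lt;/code&gt; API linked above):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# DDL, DML and queries now share a single entry point: execute_sql().
# 't_env' is an existing TableEnvironment; 'orders' is a placeholder table.
table_result = t_env.execute_sql(
    'SELECT product, SUM(amount) AS total FROM orders GROUP BY product')

# The returned TableResult unifies how results are represented;
# print() writes the rows to stdout (collect() is the iterator-style alternative).
table_result.print()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;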
&lt;p&gt;&lt;strong&gt;New Type Inference for Table API UDFs (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-65%3A+New+type+inference+for+Table+API+UDFs&quot;&gt;FLIP-65&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In Flink 1.9, the community started working on a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/types.html#data-types&quot;&gt;new data type system&lt;/a&gt; for the Table API to improve its compliance with the SQL standard (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System&quot;&gt;FLIP-37&lt;/a&gt;). This work is now close to being completed in Flink 1.11, with the exposure of Table API UDFs to the new type system (scalar and table functions, with aggregate functions planned for the next release).&lt;/p&gt;
&lt;hr /&gt;
&lt;h3 id=&quot;pyflink-support-for-pandas-udfs&quot;&gt;PyFlink: Support for Pandas UDFs&lt;/h3&gt;
&lt;p&gt;Up to this release, Python UDFs in PyFlink only supported scalar values of standard Python types. This presented some limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;High serialization/deserialization overhead in the process of transferring data between the JVM and the Python processes;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Hard to integrate with common Python libraries for high-performance numerical processing like pandas and NumPy.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To overcome these limitations, the community introduced support for (scalar) &lt;strong&gt;vectorized Python UDFs&lt;/strong&gt; based on &lt;a href=&quot;https://pandas.pydata.org/pandas-docs/stable/getting_started/overview.html&quot;&gt;pandas&lt;/a&gt; in Flink 1.11 (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink&quot;&gt;FLIP-97&lt;/a&gt;). The performance of vectorized UDFs is usually much higher, as the serialization/deserialization overhead is minimized by relying on &lt;a href=&quot;https://arrow.apache.org/&quot;&gt;Apache Arrow&lt;/a&gt;, and handling &lt;code&gt;pandas.Series&lt;/code&gt; as input/output allows users to take full advantage of the pandas and NumPy libraries. This makes Pandas UDFs a popular choice for parallelizing Machine Learning and other large-scale, distributed data science workloads (e.g. feature engineering, distributed model application).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;pandas&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To mark a UDF as a Pandas UDF, you only need to add an extra parameter &lt;code&gt;udf_type=&quot;pandas&quot;&lt;/code&gt; to the &lt;code&gt;udf&lt;/code&gt; decorator, as described in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/dev/table/python/vectorized_python_udfs.html#vectorized-user-defined-functions&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
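&lt;p&gt;As a hypothetical usage example, the vectorized &lt;code&gt;add&lt;/code&gt; function above can be registered and called from SQL like any other UDF (the table and column names are placeholders):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Register the Pandas UDF defined above and use it in a query.
# 't_env' is an existing TableEnvironment; 'my_source' and its columns a, b are placeholders.
t_env.register_function('add', add)
result = t_env.sql_query('SELECT add(a, b) AS a_plus_b FROM my_source')&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;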
&lt;h3 id=&quot;other-improvements-to-pyflink&quot;&gt;Other Improvements to PyFlink&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Conversion fromPandas/toPandas (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame&quot;&gt;FLIP-120&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Arrow is also supported as an optimization to convert between PyFlink tables and &lt;a href=&quot;https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html&quot;&gt;&lt;code&gt;pandas.DataFrames&lt;/code&gt;&lt;/a&gt;, enabling users to switch processing engines seamlessly without the need for an intermediate connector. For examples on how to use the new &lt;code&gt;fromPandas()&lt;/code&gt; and &lt;code&gt;toPandas()&lt;/code&gt; methods in PyFlink, check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/python/conversion_of_pandas.html#conversions-between-pyflink-table-and-pandas-dataframe&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
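&lt;p&gt;A short, hedged sketch of the round trip (note that the PyFlink method names follow the snake_case convention, &lt;code&gt;from_pandas&lt;/code&gt;/&lt;code&gt;to_pandas&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np
import pandas as pd

# Turn a pandas DataFrame into a PyFlink Table (Arrow is used under the hood);
# 't_env' is an existing TableEnvironment.
pdf = pd.DataFrame(np.random.rand(100, 2), columns=['a', 'b'])
table = t_env.from_pandas(pdf, ['a', 'b'])

# ... apply Table API / SQL operations on 'table' ...

# Convert the (bounded) result back into a pandas DataFrame.
result_pdf = table.to_pandas()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;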
&lt;p&gt;&lt;strong&gt;Support for User-defined Table Functions (UDTFs) (&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-14500&quot;&gt;FLINK-14500&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;From Flink 1.11, you can define and register custom &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/python/python_udfs.html#table-functions&quot;&gt;UDTFs&lt;/a&gt; in PyFlink. Similar to a Python UDF, a UDTF takes zero, one or multiple scalar values as input, but can return an arbitrary number of rows as output instead of a single value.&lt;/p&gt;
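&lt;p&gt;A minimal sketch of a Python UDTF, assuming the &lt;code&gt;udtf&lt;/code&gt; decorator described in the linked documentation (the split logic, table and column names are illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import DataTypes
from pyflink.table.udf import udtf

# A UDTF that splits a string column and emits one row per token.
@udtf(input_types=DataTypes.STRING(), result_types=DataTypes.STRING())
def split(line):
    for token in line.split(' '):
        yield token

# Register it and expand the rows with a lateral join ('t_env', 'my_source' are placeholders).
t_env.register_function('split', split)
t_env.sql_query('SELECT word FROM my_source, LATERAL TABLE(split(line)) AS T(word)')&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;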
&lt;p&gt;&lt;strong&gt;Cython Performance Optimization for UDFs (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-121%3A+Support+Cython+Optimizing+Python+User+Defined+Function&quot;&gt;FLIP-121&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://cython.readthedocs.io/en/latest/src/quickstart/cythonize.html&quot;&gt;Cython&lt;/a&gt; is a compiled superset of the Python language that is often used to improve the performance of large-scale numeric processing in Python, as it optimizes execution to machine code-level speed and pairs well with popular C-based libraries like NumPy. From Flink 1.11, you can build &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/flinkDev/building.html#build-pyflink&quot;&gt;PyFlink with Cython support&lt;/a&gt; and “Cythonize” your Python UDFs to substantially improve code execution speed (up to 30x faster, compared to Python UDFs in Flink 1.10).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;User-defined Metrics in Python UDFs (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-112%3A+Support+User-Defined+Metrics+in++Python+UDF&quot;&gt;FLIP-112&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To make it easier for users to monitor and debug the execution of Python UDFs, PyFlink now allows gathering and exposing metrics to external systems, as well as defining user scopes and variables. You can access the metrics system from a UDF by calling &lt;code&gt;function_context.get_metric_group()&lt;/code&gt; in the open method, as described in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-master/dev/table/python/metrics.html#registering-metrics&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
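&lt;p&gt;A hedged sketch of registering a counter from the &lt;code&gt;open()&lt;/code&gt; method of a class-based Python UDF (the metric and function names are illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from pyflink.table import DataTypes
from pyflink.table.udf import udf, ScalarFunction

class CountedUpper(ScalarFunction):
    def open(self, function_context):
        # Register a user-defined counter via the metric group exposed in open().
        self.counter = function_context.get_metric_group().counter('invocations')

    def eval(self, s):
        self.counter.inc()
        return s.upper()

# Wrap the class in udf() so it can be registered and used in queries ('t_env' is a TableEnvironment).
upper = udf(CountedUpper(), input_types=[DataTypes.STRING()], result_type=DataTypes.STRING())
t_env.register_function('counted_upper', upper)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;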
&lt;hr /&gt;
&lt;h2 id=&quot;important-changes&quot;&gt;Important Changes&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-17339&quot;&gt;FLINK-17339&lt;/a&gt;] The Blink planner is the &lt;strong&gt;default&lt;/strong&gt; in the Table API/SQL starting from Flink 1.11. This was already the case for the SQL Client since Flink 1.10. The old Flink planner is still supported, but not actively developed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-5763&quot;&gt;FLINK-5763&lt;/a&gt;] Savepoints now contain all their state inside a single directory (both metadata and program state). This makes it straightforward to figure out which files make up the state of a savepoint and allows users to &lt;strong&gt;relocate savepoints&lt;/strong&gt; by simply moving a directory.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16408&quot;&gt;FLINK-16408&lt;/a&gt;] To reduce pressure on the JVM metaspace, the user code class loader is being reused by a &lt;code&gt;TaskExecutor&lt;/code&gt; as long as there is at least a single slot allocated for the respective job. This changes Flink’s recovery behaviour slightly, so that it will not reload static fields.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11086&quot;&gt;FLINK-11086&lt;/a&gt;] Flink now supports Hadoop versions above &lt;strong&gt;Hadoop 3.0.0&lt;/strong&gt;. Note that the Flink project does not provide any updated “flink-shaded-hadoop-*” jars. Users need to provide Hadoop dependencies through the &lt;code&gt;HADOOP_CLASSPATH&lt;/code&gt; environment variable (recommended) or the lib/ folder.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16963&quot;&gt;FLINK-16963&lt;/a&gt;] All &lt;code&gt;MetricReporters&lt;/code&gt; that come with Flink have been converted to plugins. These should no longer be placed into &lt;code&gt;/lib&lt;/code&gt; (which may result in dependency conflicts), but &lt;code&gt;/plugins/&amp;lt;some_directory&amp;gt;&lt;/code&gt; instead.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12639&quot;&gt;FLINK-12639&lt;/a&gt;] The Flink &lt;strong&gt;documentation&lt;/strong&gt; is undergoing some &lt;strong&gt;rework&lt;/strong&gt;, so you might notice that the navigation and organization of content look slightly different starting from Flink 1.11.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.11/release-notes/flink-1.11.html&quot;&gt;release notes&lt;/a&gt; carefully for a detailed list of changes and new features if you plan to upgrade your setup to Flink 1.11. This version is API-compatible with previous 1.x releases for APIs annotated with the @Public annotation.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all the 200+ contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Aitozi, Alexander Fedulov, Alexey Trenikhin, Aljoscha Krettek, Andrey Zagrebin, Arvid Heise, Ayush Saxena, Bairos, Bartosz Krasinski, Benchao Li, Benoit Hanotte, Benoît Paris, Bhagavan Das, Canbin Zheng, Cedric Chen, Chesnay Schepler, Colm O hEigeartaigh, Congxian Qiu, CrazyTomatoOo, Danish Amjad, Danny Chan, David Anderson, Dawid Wysakowicz, Dian Fu, Dominik Wosiński, Echo Lee, Ethan Marsh, Etienne Chauchot, Fabian Hueske, Fabian Paul, Flavio Pompermaier, Gao Yun, Gary Yao, Ghildiyal, Grebennikov Roman, GuoWei Ma, Guru Prasad, Gyula Fora, Hequn Cheng, Hu Guang, HuFeiHu, HuangXingBo, Igal Shilman, Ismael Juma, Jacob Sevart, Jark Wu, Jaskaran Bindra, Jason K, Jeff Yang, Jeff Zhang, Jerry Wang, Jiangjie (Becket) Qin, Jiayi, Jiayi Liao, Jiayi-Liao, Jincheng Sun, Jing Zhang, Jingsong Lee, JingsongLi, Jun Qin, JunZhang, Jörn Kottmann, Kevin Bohinski, Konstantin Knauf, Kostas Kloudas, Kurt Young, Leonard Xu, Lining Jing, Liupengcheng, LululuAlu, Marta Paes Moreira, Matt Welke, Max Kuklinski, Maximilian Michels, Nico Kruber, Niels Basjes, Oleksandr Nitavskyi, Paul Lam, Paul Lin, PengFei Li, PengchengLiu, Piotr Nowojski, Prem Santosh, Qingsheng Ren, Rafi Aroch, Raymond Farrelly, Richard Deurwaarder, Robert Metzger, RocMarshal, Roey Shem Tov, Roman, Roman Khachatryan, Rong Rong, RoyRuan, Rui Li, Seth Wiesman, Shaobin.Ou, Shengkai, Shuiqiang Chen, Shuo Cheng, Sivaprasanna, Sivaprasanna S, SteNicholas, Stefan Richter, Stephan Ewen, Steve OU, Steve Whelan, Tartarus, Terry Wang, Thomas Weise, Till Rohrmann, Timo Walther, TsReaper, Tzu-Li (Gordon) Tai, Victor Wong, Wei Zhong, Weike DONG, Xiaogang Zhou, Xintong Song, Xu Bai, Xuannan, Yadong Xie, Yang Wang, Yangze Guo, Yichao Yang, Ying, Yu Li, Yuan Mei, Yun Gao, Yun Tang, Yuval Itzchakov, Zakelly, Zhao, Zhenghua Gao, Zhijiang, Zhu Zhu, acqua.csq, austin ce, azagrebin, bdine, bowen.li, caoyingjie, caozhen, caozhen1937, chaojianok, chen, chendonglin, comsir, cpugputpu, czhang2, dianfu, edu05, eduardowt, fangliang, felixzheng, fmyblack, gauss, gk0916, godfrey he, godfreyhe, guliziduo, guowei.mgw, hehuiyuan, hequn8128, hpeter, huangxingbo, huzheng, ifndef-SleePy, jingwen-ywb, jrthe42, kevin.cyj, klion26, lamber-ken, leesf, libenchao, lijiewang.wlj, liuyongvs, lsy, lumen, machinedoll, mans2singh, molsionmo, oliveryunchang, openinx, paul8263, ptmagic, qqibrow, sev7e0, shuai-xu, shuai.xu, shuiqiangchen, snuyanzin, spafka, sunhaibotb, sunjincheng121, testfixer, tison, vinoyang, vthinkxie, wangtong, wangxianghu, wangxiyuan, wangxlong, wangyang0918, wenlong.lwl, whlwanghailong, william, windWheel, wooplevip, wuxuyang, xushiwei, xuyang1706, yanghua, yangyichao-mango, yuzhao.cyz, zentol, zhanglibing, zhangmang, zhangzhanchun, zhengcanbin, zhengshuli, zhenxianyimeng, zhijiang, zhongyong jin, zhule, zhuxiaoshang, zjuwangg, zoudan, zoudaokoulife, zzchun, “lzh576177775”, 骚sir, 厉颖, 张军, 曹建华, 漫步云端&lt;/p&gt;
</description>
<pubDate>Mon, 06 Jul 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/07/06/release-1.11.0.html</link>
<guid isPermaLink="true">/news/2020/07/06/release-1.11.0.html</guid>
</item>
<item>
<title>Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2</title>
<description>&lt;p&gt;In a previous post, we introduced the basics of Flink on Zeppelin and how to do Streaming ETL. In this second part of the “Flink on Zeppelin” series of posts, I will share how to
perform streaming data visualization via Flink on Zeppelin and how to use Apache Flink UDFs in Zeppelin.&lt;/p&gt;
&lt;h1 id=&quot;streaming-data-visualization&quot;&gt;Streaming Data Visualization&lt;/h1&gt;
&lt;p&gt;With &lt;a href=&quot;https://zeppelin.apache.org/&quot;&gt;Zeppelin&lt;/a&gt;, you can build a real-time streaming dashboard without writing a single line of JavaScript/HTML/CSS code.&lt;/p&gt;
&lt;p&gt;Overall, Zeppelin supports 3 kinds of streaming data analytics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Single Mode&lt;/li&gt;
&lt;li&gt;Update Mode&lt;/li&gt;
&lt;li&gt;Append Mode&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;single-mode&quot;&gt;Single Mode&lt;/h3&gt;
&lt;p&gt;Single mode is used for cases where the result of a SQL statement is always a single row, as in the following example.
The output is rendered as HTML, and you can specify a paragraph-local property &lt;code&gt;template&lt;/code&gt; to define the layout of the final output.
Within the template, &lt;code&gt;{i}&lt;/code&gt; serves as a placeholder for the i-th column of the result.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-23-flink-on-zeppelin-part2/flink_single_mode.gif&quot; width=&quot;80%&quot; alt=&quot;Single Mode&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;update-mode&quot;&gt;Update Mode&lt;/h3&gt;
&lt;p&gt;Update mode is suitable for cases where the output consists of more than one row
and is continuously updated. Here’s one example where we use &lt;code&gt;GROUP BY&lt;/code&gt;.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-23-flink-on-zeppelin-part2/flink_update_mode.gif&quot; width=&quot;80%&quot; alt=&quot;Update Mode&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;append-mode&quot;&gt;Append Mode&lt;/h3&gt;
&lt;p&gt;Append mode is suitable for cases where output rows are only ever appended.
For instance, the example below uses a tumbling window.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-23-flink-on-zeppelin-part2/flink_append_mode.gif&quot; width=&quot;80%&quot; alt=&quot;Append Mode&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;udf&quot;&gt;UDF&lt;/h1&gt;
&lt;p&gt;SQL is a very powerful language, especially for expressing data flow. But most of the time, you also need to handle complicated business logic that cannot be expressed in SQL.
In these cases UDFs (user-defined functions) come in particularly handy. In Zeppelin, you can write Scala or Python UDFs, and you can also import existing Scala, Python and Java UDFs.
Here are two examples of Scala and Python UDFs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scala UDF&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot;&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flink&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ScalaUpper&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ScalarFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;toUpperCase&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;btenv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;registerFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;scala_upper&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ScalaUpper&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Python UDF&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pyflink&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PythonUpper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ScalarFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;bt_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;register_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;python_upper&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PythonUpper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After you define the UDFs, you can use them directly in SQL:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use Scala UDF in SQL&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-23-flink-on-zeppelin-part2/flink_scala_udf.png&quot; width=&quot;100%&quot; alt=&quot;Scala UDF&quot; /&gt;
&lt;/center&gt;
&lt;ul&gt;
&lt;li&gt;Use Python UDF in SQL&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-23-flink-on-zeppelin-part2/flink_python_udf.png&quot; width=&quot;100%&quot; alt=&quot;Python UDF&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In this post, we explained how to perform streaming data visualization via Flink on Zeppelin and how to use UDFs.
Beyond that, you can do much more with Flink in Zeppelin, such as batch processing and Hive integration.
Check the following articles for more details; there is also a list of &lt;a href=&quot;https://www.youtube.com/watch?v=YxPo0Fosjjg&amp;amp;list=PL4oy12nnS7FFtg3KV1iS5vDb0pTz12VcX&quot;&gt;Flink on Zeppelin tutorial videos&lt;/a&gt; for your reference.&lt;/p&gt;
&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://zeppelin.apache.org&quot;&gt;Apache Zeppelin official website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-1-get-started-2591aaa6aa47&quot;&gt;Part 1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-2-batch-711731df5ad9&quot;&gt;Part 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-3-streaming-5fca1e16754&quot;&gt;Part 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-4-advanced-usage-998b74908cd9&quot;&gt;Part 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=YxPo0Fosjjg&amp;amp;list=PL4oy12nnS7FFtg3KV1iS5vDb0pTz12VcX&quot;&gt;Flink on Zeppelin tutorial videos&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 23 Jun 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/ecosystem/2020/06/23/flink-on-zeppelin-part2.html</link>
<guid isPermaLink="true">/ecosystem/2020/06/23/flink-on-zeppelin-part2.html</guid>
</item>
<item>
<title>Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 1</title>
<description>&lt;p&gt;The latest release of &lt;a href=&quot;https://zeppelin.apache.org/&quot;&gt;Apache Zeppelin&lt;/a&gt; comes with a redesigned interpreter for Apache Flink (only Flink 1.10+ is supported moving forward)
that allows developers to use Flink directly in Zeppelin notebooks for interactive data analysis. I wrote two posts about how to use Flink in Zeppelin. This is part 1, where I explain how the Flink interpreter in Zeppelin works
and provide a tutorial for running streaming ETL with Flink on Zeppelin.&lt;/p&gt;
&lt;h1 id=&quot;the-flink-interpreter-in-zeppelin-09&quot;&gt;The Flink Interpreter in Zeppelin 0.9&lt;/h1&gt;
&lt;p&gt;The Flink interpreter can be accessed and configured from Zeppelin’s interpreter settings page.
The interpreter has been refactored so that Flink users can now take advantage of Zeppelin to write Flink applications in three languages,
namely Scala, Python (PyFlink) and SQL (for both batch &amp;amp; streaming execution).
Zeppelin 0.9 now comes with the Flink interpreter group, consisting of the following five interpreters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;%flink - Provides a Scala environment&lt;/li&gt;
&lt;li&gt;%flink.pyflink - Provides a Python environment&lt;/li&gt;
&lt;li&gt;%flink.ipyflink - Provides an IPython environment&lt;/li&gt;
&lt;li&gt;%flink.ssql - Provides a streaming SQL environment&lt;/li&gt;
&lt;li&gt;%flink.bsql - Provides a batch SQL environment&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Not only has the interpreter been extended to support writing Flink applications in three languages, but it has also extended the available execution modes for Flink that now include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Running Flink in Local Mode&lt;/li&gt;
&lt;li&gt;Running Flink in Remote Mode&lt;/li&gt;
&lt;li&gt;Running Flink in Yarn Mode&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can find more information about how to get started with Zeppelin and about all the execution modes for Flink applications in the &lt;a href=&quot;https://github.com/apache/zeppelin/tree/master/notebook/Flink%20Tutorial&quot;&gt;Zeppelin notebooks&lt;/a&gt; referenced in this post.&lt;/p&gt;
&lt;h1 id=&quot;flink-on-zeppelin-for-stream-processing&quot;&gt;Flink on Zeppelin for Stream processing&lt;/h1&gt;
&lt;p&gt;Performing stream processing jobs with Apache Flink on Zeppelin allows you to cover most common streaming use cases,
such as streaming ETL and real-time data analytics, using Flink SQL and custom UDFs.
Below we showcase how you can execute streaming ETL using Flink on Zeppelin:&lt;/p&gt;
&lt;p&gt;You can use Flink SQL to perform streaming ETL by following the steps below
(for the full tutorial, please refer to the &lt;a href=&quot;https://github.com/apache/zeppelin/blob/master/notebook/Flink%20Tutorial/4.%20Streaming%20ETL_2EYD56B9B.zpln&quot;&gt;Flink Tutorial/Streaming ETL tutorial&lt;/a&gt; of the Zeppelin distribution):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Step 1. Create a source table to represent the source data.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-15-flink-on-zeppelin/create_source.png&quot; width=&quot;80%&quot; alt=&quot;Create Source Table&quot; /&gt;
&lt;/center&gt;
&lt;ul&gt;
&lt;li&gt;Step 2. Create a sink table to represent the processed data.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-15-flink-on-zeppelin/create_sink.png&quot; width=&quot;80%&quot; alt=&quot;Create Sink Table&quot; /&gt;
&lt;/center&gt;
&lt;ul&gt;
&lt;li&gt;Step 3. After creating the source and sink tables, we can write an INSERT INTO statement to trigger the stream processing job, as follows:&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-15-flink-on-zeppelin/etl.png&quot; width=&quot;80%&quot; alt=&quot;ETL&quot; /&gt;
&lt;/center&gt;
&lt;ul&gt;
&lt;li&gt;Step 4. After initiating the streaming job, you can use another SQL statement to query the sink table and verify the results of your job. Here you can see the top 10 records, which are refreshed every 3 seconds.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-15-flink-on-zeppelin/preview.png&quot; width=&quot;80%&quot; alt=&quot;Preview&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In this post, we explained how the redesigned Flink interpreter works in Zeppelin 0.9.0 and provided some examples for performing streaming ETL jobs with
Flink and Zeppelin. In the next post, I will talk about how to do streaming data visualization via Flink on Zeppelin.
Besides that, you can find an additional &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-2-batch-711731df5ad9&quot;&gt;tutorial for batch processing with Flink on Zeppelin&lt;/a&gt;, as well as guidance on using Flink on Zeppelin for
more advanced operations like resource isolation, job concurrency &amp;amp; parallelism, and multiple Hadoop &amp;amp; Hive environments, in our series of posts on Medium.
And here’s a list of &lt;a href=&quot;https://www.youtube.com/watch?v=YxPo0Fosjjg&amp;amp;list=PL4oy12nnS7FFtg3KV1iS5vDb0pTz12VcX&quot;&gt;Flink on Zeppelin tutorial videos&lt;/a&gt; for your reference.&lt;/p&gt;
&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://zeppelin.apache.org&quot;&gt;Apache Zeppelin official website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-1-get-started-2591aaa6aa47&quot;&gt;Part 1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-2-batch-711731df5ad9&quot;&gt;Part 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-3-streaming-5fca1e16754&quot;&gt;Part 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Flink on Zeppelin tutorials - &lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-4-advanced-usage-998b74908cd9&quot;&gt;Part 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=YxPo0Fosjjg&amp;amp;list=PL4oy12nnS7FFtg3KV1iS5vDb0pTz12VcX&quot;&gt;Flink on Zeppelin tutorial videos&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Mon, 15 Jun 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/06/15/flink-on-zeppelin-part1.html</link>
<guid isPermaLink="true">/news/2020/06/15/flink-on-zeppelin-part1.html</guid>
</item>
<item>
<title>Flink Community Update - June&#39;20</title>
<description>&lt;p&gt;And suddenly it’s June. The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.&lt;/p&gt;
&lt;p&gt;To top it off, a piece of good news: &lt;a href=&quot;https://www.flink-forward.org/global-2020&quot;&gt;Flink Forward&lt;/a&gt; is back on October 19-22 as a free virtual event!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-past-month-in-flink&quot; id=&quot;markdown-toc-the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-stateful-functions-21-release&quot; id=&quot;markdown-toc-flink-stateful-functions-21-release&quot;&gt;Flink Stateful Functions 2.1 Release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#testing-is-on-for-flink-111&quot; id=&quot;markdown-toc-testing-is-on-for-flink-111&quot;&gt;Testing is ON for Flink 1.11&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-minor-releases&quot; id=&quot;markdown-toc-flink-minor-releases&quot;&gt;Flink Minor Releases&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-1101&quot; id=&quot;markdown-toc-flink-1101&quot;&gt;Flink 1.10.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers-and-pmc-members&quot; id=&quot;markdown-toc-new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers&quot; id=&quot;markdown-toc-new-committers&quot;&gt;New Committers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-bigger-picture&quot; id=&quot;markdown-toc-the-bigger-picture&quot;&gt;The Bigger Picture&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-forward-global-virtual-conference-2020&quot; id=&quot;markdown-toc-flink-forward-global-virtual-conference-2020&quot;&gt;Flink Forward Global Virtual Conference 2020&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#google-season-of-docs-2020&quot; id=&quot;markdown-toc-google-season-of-docs-2020&quot;&gt;Google Season of Docs 2020&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/h1&gt;
&lt;h2 id=&quot;flink-stateful-functions-21-release&quot;&gt;Flink Stateful Functions 2.1 Release&lt;/h2&gt;
&lt;p&gt;It might seem like &lt;a href=&quot;https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html&quot;&gt;Stateful Functions 2.0 was announced&lt;/a&gt; only a handful of weeks ago (and it was!), but the Flink community has just released Stateful Functions 2.1! This release introduces two new features: state expiration for any kind of persisted state and support for UNIX Domain Sockets (UDS) to improve the performance of inter-container communication in co-located deployments; as well as other important changes that improve the overall stability and testability of the project. You can read the &lt;a href=&quot;https://flink.apache.org/news/2020/06/09/release-statefun-2.1.0.html&quot;&gt;announcement blogpost&lt;/a&gt; for more details on the release!&lt;/p&gt;
&lt;p&gt;As the community around StateFun grows, the release cycle will follow this pattern of smaller and more frequent releases to incorporate user feedback and allow for faster iteration. If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt; — especially around SDKs for other languages (e.g. Go, Rust, Javascript).&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;testing-is-on-for-flink-111&quot;&gt;Testing is ON for Flink 1.11&lt;/h2&gt;
&lt;p&gt;Things have been pretty quiet in the Flink community, as all efforts shifted to testing the newest features shipping with Flink 1.11. While we wait for a voting Release Candidate (RC) to be out, you can check the progress of testing in &lt;a href=&quot;https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=364&amp;amp;projectKey=FLINK&quot;&gt;this JIRA burndown board&lt;/a&gt; and learn more about some of the &lt;a href=&quot;https://flink.apache.org/news/2020/05/07/community-update.html#warming-up-for-flink-111&quot;&gt;upcoming features&lt;/a&gt; in these Flink Forward videos:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=ssEmeLcL5Uk&quot;&gt;Rethinking of fault tolerance in Flink: what lies ahead?&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=t7fAN3xNJ3Q&quot;&gt;It’s finally here: Python on Flink &amp;amp; Flink on Zeppelin&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=KDD8e4GE12w&quot;&gt;A deep dive into Flink SQL&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=4ce1H9CRyEc&quot;&gt;Production-Ready Flink and Hive Integration - what story you can tell now?&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We encourage the wider community to also get involved in testing once the voting RC is out. Keep an eye on the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@dev mailing list&lt;/a&gt; for updates!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;flink-minor-releases&quot;&gt;Flink Minor Releases&lt;/h2&gt;
&lt;h3 id=&quot;flink-1101&quot;&gt;Flink 1.10.1&lt;/h3&gt;
&lt;p&gt;The community released Flink 1.10.1, covering some outstanding bugs in Flink 1.10. You can find more in the &lt;a href=&quot;https://flink.apache.org/news/2020/05/12/release-1.10.1.html&quot;&gt;announcement blogpost&lt;/a&gt;!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/h2&gt;
&lt;p&gt;The Apache Flink community has welcomed &lt;strong&gt;2 new Committers&lt;/strong&gt; since the last update. Congratulations!&lt;/p&gt;
&lt;h3 id=&quot;new-committers&quot;&gt;New Committers&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars3.githubusercontent.com/u/4471524?s=400&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;Benchao Li&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars0.githubusercontent.com/u/6509172?s=400&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;Xintong Song&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;the-bigger-picture&quot;&gt;The Bigger Picture&lt;/h1&gt;
&lt;h2 id=&quot;flink-forward-global-virtual-conference-2020&quot;&gt;Flink Forward Global Virtual Conference 2020&lt;/h2&gt;
&lt;p&gt;After a first successful &lt;a href=&quot;https://www.youtube.com/playlist?list=PLDX4T_cnKjD0ngnBSU-bYGfgVv17MiwA7&quot;&gt;virtual conference&lt;/a&gt; last April, Flink Forward will be hosting a second free virtual edition on October 19-22. This time around, the conference will feature two days of hands-on training and two full days of conference talks!&lt;/p&gt;
&lt;p&gt;Got a Flink story to share? Maybe your recent adventures with Stateful Functions? The &lt;a href=&quot;https://www.flink-forward.org/global-2020/call-for-presentations&quot;&gt;Call for Presentations is now open&lt;/a&gt; and accepting submissions from the community until &lt;strong&gt;June 19th, 11:59 PM CEST&lt;/strong&gt;.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-06-10-community-update/FlinkForward_Banner_CFP_Global_2020.png&quot; width=&quot;600px&quot; alt=&quot;Flink Forward Global 2020&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;google-season-of-docs-2020&quot;&gt;Google Season of Docs 2020&lt;/h2&gt;
&lt;p&gt;In the last update, we announced that Flink was applying to &lt;a href=&quot;https://developers.google.com/season-of-docs&quot;&gt;Google Season of Docs (GSoD)&lt;/a&gt; again this year. The good news: we’ve made it into the shortlist of accepted projects! This represents an invaluable opportunity for the Flink community to collaborate with technical writers to improve the Table API &amp;amp; SQL documentation. We’re honored to have seen a great number of people reach out over the last couple of weeks, and look forward to receiving applications from this week on!&lt;/p&gt;
&lt;p&gt;If you’re interested in learning more about our project idea or want to get involved in GSoD as a technical writer, check out the &lt;a href=&quot;https://flink.apache.org/news/2020/05/04/season-of-docs.html&quot;&gt;announcement blogpost&lt;/a&gt; and &lt;a href=&quot;https://developers.google.com/season-of-docs/docs/tech-writer-application-hints&quot;&gt;submit your application&lt;/a&gt;. The deadline for GSoD applications is &lt;strong&gt;July 9th, 18:00 UTC&lt;/strong&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;If you’d like to keep a closer eye on what’s happening in the community, subscribe to the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt; to get fine-grained weekly updates, upcoming event announcements and more.&lt;/p&gt;
</description>
<pubDate>Thu, 11 Jun 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/06/11/community-update.html</link>
<guid isPermaLink="true">/news/2020/06/11/community-update.html</guid>
</item>
<item>
<title>Stateful Functions 2.1.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is happy to announce the release of Stateful Functions (StateFun) 2.1.0! This release introduces new features around state expiration and performance improvements for co-located deployments, as well as other important changes that improve the stability and testability of the project. As the community around StateFun grows, the release cycle will follow this pattern of smaller and more frequent releases to incorporate user feedback and allow for faster iteration.&lt;/p&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;https://flink.apache.org/downloads.html&quot;&gt;Downloads&lt;/a&gt; page of the Flink website, and the most recent Python SDK distribution is available on &lt;a href=&quot;https://pypi.org/project/apache-flink-statefun/&quot;&gt;PyPI&lt;/a&gt;. For more details, check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12347861&quot;&gt;release changelog&lt;/a&gt; and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/&quot;&gt;updated documentation&lt;/a&gt;. We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt; or &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-18016?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Stateful%20Functions%22%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC&quot;&gt;JIRA&lt;/a&gt;!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features-and-improvements&quot; id=&quot;markdown-toc-new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#support-for-state-time-to-live-ttl&quot; id=&quot;markdown-toc-support-for-state-time-to-live-ttl&quot;&gt;Support for State Time-To-Live (TTL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improved-performance-with-unix-domain-sockets-uds&quot; id=&quot;markdown-toc-improved-performance-with-unix-domain-sockets-uds&quot;&gt;Improved Performance with UNIX Domain Sockets (UDS)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#important-changes&quot; id=&quot;markdown-toc-important-changes&quot;&gt;Important Changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/h2&gt;
&lt;h3 id=&quot;support-for-state-time-to-live-ttl&quot;&gt;Support for State Time-To-Live (TTL)&lt;/h3&gt;
&lt;p&gt;Being able to define state expiration and a state cleanup strategy is a useful feature for stateful applications — for example, to keep state size from growing indefinitely or to work with sensitive data. In previous StateFun versions, users could implement this behavior manually using &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.0/sdk/java.html#sending-delayed-messages&quot;&gt;delayed messages&lt;/a&gt; as state expiration callbacks. For StateFun 2.1, the community has worked on enabling users to configure any persisted state to expire and be purged after a given duration (i.e. the state time-to-live) (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17644&quot;&gt;FLINK-17644&lt;/a&gt;, &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17875&quot;&gt;FLINK-17875&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Persisted state can be configured to expire after the last &lt;em&gt;write&lt;/em&gt; operation (&lt;code&gt;AFTER_WRITE&lt;/code&gt;) or after the last &lt;em&gt;read or write&lt;/em&gt; operation (&lt;code&gt;AFTER_READ_AND_WRITE&lt;/code&gt;). For the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/sdk/java.html#state-expiration&quot;&gt;Java SDK&lt;/a&gt;, users can configure State TTL in the definition of their persisted fields:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Persisted&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PersistedValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PersistedValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;my-value&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Expiration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;expireAfterWriting&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Duration&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ofHours&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)));&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/concepts/distributed_architecture.html#remote-functions&quot;&gt;remote functions&lt;/a&gt; using e.g. the Python SDK, users can configure State TTL in their &lt;code&gt;module.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;functions:
- function:
states:
- name: xxxx
expireAfter: 5min # optional key
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;b&gt;Note:&lt;/b&gt;
The state expiration mode for remote functions is currently restricted to AFTER_READ_AND_WRITE, and the actual TTL being set is the longest duration across all registered state, not for each individual state entry. This is planned to be improved in upcoming releases (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17954&quot;&gt;FLINK-17954&lt;/a&gt;).
&lt;/div&gt;
&lt;h3 id=&quot;improved-performance-with-unix-domain-sockets-uds&quot;&gt;Improved Performance with UNIX Domain Sockets (UDS)&lt;/h3&gt;
&lt;p&gt;Stateful functions can be &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/concepts/distributed_architecture.html#deployment-styles-for-functions&quot;&gt;deployed in multiple ways&lt;/a&gt;, even within the same application. For deployments where functions are &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.1/concepts/distributed_architecture.html#co-located-functions&quot;&gt;co-located&lt;/a&gt; with the Flink StateFun workers, it’s common to use Kubernetes to deploy pods consisting of a Flink StateFun container and the function sidecar container, communicating via the pod-local network. To improve the performance of such deployments, StateFun 2.1 allows using &lt;a href=&quot;https://troydhanson.github.io/network/Unix_domain_sockets.html&quot;&gt;Unix Domain Sockets&lt;/a&gt; (UDS) to communicate between containers in the same pod (i.e. the same machine) (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17611&quot;&gt;FLINK-17611&lt;/a&gt;), which drastically reduces the overhead of going through the network stack.&lt;/p&gt;
&lt;p&gt;Users can &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-master/sdk/modules.html#defining-functions&quot;&gt;enable transport via UDS&lt;/a&gt; in a remote module by specifying the following in their &lt;code&gt;module.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;functions:
- function:
spec:
- endpoint: http(s)+unix://&amp;lt;socket-file-path&amp;gt;/&amp;lt;serve-url-path&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;important-changes&quot;&gt;Important Changes&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17712&quot;&gt;FLINK-17712&lt;/a&gt;] The Flink version in StateFun 2.1 has been upgraded to 1.10.1, the most recent patch version.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17533&quot;&gt;FLINK-17533&lt;/a&gt;] StateFun 2.1 now supports concurrent checkpoints, which means applications will no longer fail on savepoints that are triggered concurrently to a checkpoint.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16928&quot;&gt;FLINK-16928&lt;/a&gt;] StateFun 2.0 was using the Flink legacy scheduler due to a &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16927&quot;&gt;bug in Flink 1.10&lt;/a&gt;. In 2.1, this change is reverted to using the new Flink scheduler again.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17516&quot;&gt;FLINK-17516&lt;/a&gt;] The coverage for end-to-end StateFun tests has been extended to also include exactly-once semantics verification (with failure recovery).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12347861&quot;&gt;release notes&lt;/a&gt; for a detailed list of changes and new features if you plan to upgrade your setup to Stateful Functions 2.1.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;abc863377, Authuir, Chesnay Schepler, Congxian Qiu, David Anderson, Dian Fu, Francesco Guardiani, Igal Shilman, Marta Paes Moreira, Patrick Wiener, Rafi Aroch, Seth Wiesman, Stephan Ewen, Tzu-Li (Gordon) Tai&lt;/p&gt;
&lt;p&gt;If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt; — especially around SDKs for other languages like Go, Rust or Javascript.&lt;/p&gt;
</description>
<pubDate>Tue, 09 Jun 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/06/09/release-statefun-2.1.0.html</link>
<guid isPermaLink="true">/news/2020/06/09/release-statefun-2.1.0.html</guid>
</item>
<item>
<title>Apache Flink 1.10.1 Released</title>
<description>&lt;p&gt;The Apache Flink community released the first bugfix version of the Apache Flink 1.10 series.&lt;/p&gt;
&lt;p&gt;This release includes 158 fixes and minor improvements for Flink 1.10.0. A detailed list of all fixes and improvements can be found below.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.10.1.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
FLINK-16684 changed the builders of the StreamingFileSink to make them compilable in Scala. This change is source compatible but binary incompatible. If using the StreamingFileSink, please recompile your user code against 1.10.1 before upgrading.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
FLINK-16683 Flink no longer supports starting clusters with .bat scripts. Users should instead use environments like WSL or Cygwin and work with the .sh scripts.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.10.1&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14126&quot;&gt;FLINK-14126&lt;/a&gt;] - Elasticsearch Xpack Machine Learning doesn&amp;#39;t support ARM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15143&quot;&gt;FLINK-15143&lt;/a&gt;] - Create document for FLIP-49 TM memory model and configuration guide
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15561&quot;&gt;FLINK-15561&lt;/a&gt;] - Unify Kerberos credentials checking
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15790&quot;&gt;FLINK-15790&lt;/a&gt;] - Make FlinkKubeClient and its implementations asynchronous
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15817&quot;&gt;FLINK-15817&lt;/a&gt;] - Kubernetes Resource leak while deployment exception happens
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16049&quot;&gt;FLINK-16049&lt;/a&gt;] - Remove outdated &amp;quot;Best Practices&amp;quot; section from Application Development Section
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16131&quot;&gt;FLINK-16131&lt;/a&gt;] - Translate &amp;quot;Amazon S3&amp;quot; page of &amp;quot;File Systems&amp;quot; into Chinese
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16389&quot;&gt;FLINK-16389&lt;/a&gt;] - Bump Kafka 0.10 to 0.10.2.2
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-2336&quot;&gt;FLINK-2336&lt;/a&gt;] - ArrayIndexOufOBoundsException in TypeExtractor when mapping
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10918&quot;&gt;FLINK-10918&lt;/a&gt;] - incremental Keyed state with RocksDB throws cannot create directory error in windows
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11193&quot;&gt;FLINK-11193&lt;/a&gt;] - Rocksdb timer service factory configuration option is not settable per job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13483&quot;&gt;FLINK-13483&lt;/a&gt;] - PrestoS3FileSystemITCase.testDirectoryListing fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14038&quot;&gt;FLINK-14038&lt;/a&gt;] - ExecutionGraph deploy failed due to akka timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14311&quot;&gt;FLINK-14311&lt;/a&gt;] - Streaming File Sink end-to-end test failed on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14316&quot;&gt;FLINK-14316&lt;/a&gt;] - Stuck in &amp;quot;Job leader ... lost leadership&amp;quot; error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15417&quot;&gt;FLINK-15417&lt;/a&gt;] - Remove the docker volume or mount when starting Mesos e2e cluster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15669&quot;&gt;FLINK-15669&lt;/a&gt;] - SQL client can&amp;#39;t cancel flink job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15772&quot;&gt;FLINK-15772&lt;/a&gt;] - Shaded Hadoop S3A with credentials provider end-to-end test fails on travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15811&quot;&gt;FLINK-15811&lt;/a&gt;] - StreamSourceOperatorWatermarksTest.testNoMaxWatermarkOnAsyncCancel fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15812&quot;&gt;FLINK-15812&lt;/a&gt;] - HistoryServer archiving is done in Dispatcher main thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15838&quot;&gt;FLINK-15838&lt;/a&gt;] - Dangling CountDownLatch.await(timeout)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15852&quot;&gt;FLINK-15852&lt;/a&gt;] - Job is submitted to the wrong session cluster
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15904&quot;&gt;FLINK-15904&lt;/a&gt;] - Make Kafka Consumer work with activated &amp;quot;disableGenericTypes()&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15936&quot;&gt;FLINK-15936&lt;/a&gt;] - TaskExecutorTest#testSlotAcceptance deadlocks
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15953&quot;&gt;FLINK-15953&lt;/a&gt;] - Job Status is hard to read for some Statuses
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16013&quot;&gt;FLINK-16013&lt;/a&gt;] - List and map config options could not be parsed correctly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16014&quot;&gt;FLINK-16014&lt;/a&gt;] - S3 plugin ClassNotFoundException SAXParser
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16025&quot;&gt;FLINK-16025&lt;/a&gt;] - Service could expose blob server port mismatched with JM Container
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16026&quot;&gt;FLINK-16026&lt;/a&gt;] - Travis failed due to python setup
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16047&quot;&gt;FLINK-16047&lt;/a&gt;] - Blink planner produces wrong aggregate results with state clean up
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16067&quot;&gt;FLINK-16067&lt;/a&gt;] - Flink&amp;#39;s CalciteParser swallows error position information
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16068&quot;&gt;FLINK-16068&lt;/a&gt;] - table with keyword-escaped columns and computed_column_expression columns
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16070&quot;&gt;FLINK-16070&lt;/a&gt;] - Blink planner can not extract correct unique key for UpsertStreamTableSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16108&quot;&gt;FLINK-16108&lt;/a&gt;] - StreamSQLExample is failed if running in blink planner
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16111&quot;&gt;FLINK-16111&lt;/a&gt;] - Kubernetes deployment does not respect &amp;quot;taskmanager.cpu.cores&amp;quot;.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16113&quot;&gt;FLINK-16113&lt;/a&gt;] - ExpressionReducer shouldn&amp;#39;t escape the reduced string value
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16115&quot;&gt;FLINK-16115&lt;/a&gt;] - Aliyun oss filesystem could not work with plugin mechanism
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16139&quot;&gt;FLINK-16139&lt;/a&gt;] - Co-location constraints are not reset on task recovery in DefaultScheduler
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16161&quot;&gt;FLINK-16161&lt;/a&gt;] - Statistics zero should be unknown in HiveCatalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16170&quot;&gt;FLINK-16170&lt;/a&gt;] - SearchTemplateRequest ClassNotFoundException when use flink-sql-connector-elasticsearch7
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16220&quot;&gt;FLINK-16220&lt;/a&gt;] - JsonRowSerializationSchema throws cast exception : NullNode cannot be cast to ArrayNode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16231&quot;&gt;FLINK-16231&lt;/a&gt;] - Hive connector is missing jdk.tools exclusion against Hive 2.x.x
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16234&quot;&gt;FLINK-16234&lt;/a&gt;] - Fix unstable cases in StreamingJobGraphGeneratorTest
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16241&quot;&gt;FLINK-16241&lt;/a&gt;] - Remove the license and notice file in flink-ml-lib module on release-1.10 branch
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16242&quot;&gt;FLINK-16242&lt;/a&gt;] - BinaryGeneric serialization error cause checkpoint failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16262&quot;&gt;FLINK-16262&lt;/a&gt;] - Class loader problem with FlinkKafkaProducer.Semantic.EXACTLY_ONCE and usrlib directory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16269&quot;&gt;FLINK-16269&lt;/a&gt;] - Generic type can not be matched when convert table to stream.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16281&quot;&gt;FLINK-16281&lt;/a&gt;] - parameter &amp;#39;maxRetryTimes&amp;#39; can not work in JDBCUpsertTableSink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16301&quot;&gt;FLINK-16301&lt;/a&gt;] - Annoying &amp;quot;Cannot find FunctionDefinition&amp;quot; messages with SQL for f_proctime or =
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16308&quot;&gt;FLINK-16308&lt;/a&gt;] - SQL connector download links are broken
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16313&quot;&gt;FLINK-16313&lt;/a&gt;] - flink-state-processor-api: surefire execution unstable on Azure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16331&quot;&gt;FLINK-16331&lt;/a&gt;] - Remove source licenses for old WebUI
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16345&quot;&gt;FLINK-16345&lt;/a&gt;] - Computed column can not refer time attribute column
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16360&quot;&gt;FLINK-16360&lt;/a&gt;] - connector on hive 2.0.1 don&amp;#39;t support type conversion from STRING to VARCHAR
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16371&quot;&gt;FLINK-16371&lt;/a&gt;] - HadoopCompressionBulkWriter fails with &amp;#39;java.io.NotSerializableException&amp;#39;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16373&quot;&gt;FLINK-16373&lt;/a&gt;] - EmbeddedLeaderService: IllegalStateException: The RPC connection is already closed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16413&quot;&gt;FLINK-16413&lt;/a&gt;] - Reduce hive source parallelism when limit push down
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16414&quot;&gt;FLINK-16414&lt;/a&gt;] - create udaf/udtf function using sql casuing ValidationException: SQL validation failed. null
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16433&quot;&gt;FLINK-16433&lt;/a&gt;] - TableEnvironmentImpl doesn&amp;#39;t clear buffered operations when it fails to translate the operation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16435&quot;&gt;FLINK-16435&lt;/a&gt;] - Replace since decorator with versionadd to mark the version an API was introduced
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16467&quot;&gt;FLINK-16467&lt;/a&gt;] - MemorySizeTest#testToHumanReadableString() is not portable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16526&quot;&gt;FLINK-16526&lt;/a&gt;] - Fix exception when computed column expression references a keyword column name
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16541&quot;&gt;FLINK-16541&lt;/a&gt;] - Document of table.exec.shuffle-mode is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16550&quot;&gt;FLINK-16550&lt;/a&gt;] - HadoopS3* tests fail with NullPointerException exceptions
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16560&quot;&gt;FLINK-16560&lt;/a&gt;] - Forward Configuration in PackagedProgramUtils#getPipelineFromProgram
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16567&quot;&gt;FLINK-16567&lt;/a&gt;] - Get the API error of the StreamQueryConfig on Page &amp;quot;Query Configuration&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16573&quot;&gt;FLINK-16573&lt;/a&gt;] - Kinesis consumer does not properly shutdown RecordFetcher threads
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16576&quot;&gt;FLINK-16576&lt;/a&gt;] - State inconsistency on restore with memory state backends
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16626&quot;&gt;FLINK-16626&lt;/a&gt;] - Prevent REST handler from being closed more than once
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16632&quot;&gt;FLINK-16632&lt;/a&gt;] - SqlDateTimeUtils#toSqlTimestamp(String, String) may yield incorrect result
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16635&quot;&gt;FLINK-16635&lt;/a&gt;] - Incompatible okio dependency in flink-metrics-influxdb module
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16646&quot;&gt;FLINK-16646&lt;/a&gt;] - flink read orc file throw a NullPointerException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16647&quot;&gt;FLINK-16647&lt;/a&gt;] - Miss file extension when inserting to hive table with compression
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16652&quot;&gt;FLINK-16652&lt;/a&gt;] - BytesColumnVector should init buffer in Hive 3.x
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16662&quot;&gt;FLINK-16662&lt;/a&gt;] - Blink Planner failed to generate JobGraph for POJO DataStream converting to Table (Cannot determine simple type name)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16664&quot;&gt;FLINK-16664&lt;/a&gt;] - Unable to set DataStreamSource parallelism to default (-1)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16675&quot;&gt;FLINK-16675&lt;/a&gt;] - TableEnvironmentITCase. testClearOperation fails on travis nightly build
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16684&quot;&gt;FLINK-16684&lt;/a&gt;] - StreamingFileSink builder does not work with Scala
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16696&quot;&gt;FLINK-16696&lt;/a&gt;] - Savepoint trigger documentation is insufficient
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16703&quot;&gt;FLINK-16703&lt;/a&gt;] - AkkaRpcActor state machine does not record transition to terminating state.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16705&quot;&gt;FLINK-16705&lt;/a&gt;] - LocalExecutor tears down MiniCluster before client can retrieve JobResult
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16718&quot;&gt;FLINK-16718&lt;/a&gt;] - KvStateServerHandlerTest leaks Netty ByteBufs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16727&quot;&gt;FLINK-16727&lt;/a&gt;] - Fix cast exception when having time point literal as parameters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16732&quot;&gt;FLINK-16732&lt;/a&gt;] - Failed to call Hive UDF with constant return value
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16740&quot;&gt;FLINK-16740&lt;/a&gt;] - OrcSplitReaderUtil::logicalTypeToOrcType fails to create decimal type with precision &amp;lt; 10
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16759&quot;&gt;FLINK-16759&lt;/a&gt;] - HiveModuleTest failed to compile on release-1.10
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16767&quot;&gt;FLINK-16767&lt;/a&gt;] - Failed to read Hive table with RegexSerDe
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16771&quot;&gt;FLINK-16771&lt;/a&gt;] - NPE when filtering by decimal column
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16821&quot;&gt;FLINK-16821&lt;/a&gt;] - Run Kubernetes test failed with invalid named &amp;quot;minikube&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16822&quot;&gt;FLINK-16822&lt;/a&gt;] - The config set by SET command does not work
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16825&quot;&gt;FLINK-16825&lt;/a&gt;] - PrometheusReporterEndToEndITCase should rely on path returned by DownloadCache
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16836&quot;&gt;FLINK-16836&lt;/a&gt;] - Losing leadership does not clear rpc connection in JobManagerLeaderListener
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16860&quot;&gt;FLINK-16860&lt;/a&gt;] - Failed to push filter into OrcTableSource when upgrading to 1.9.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16888&quot;&gt;FLINK-16888&lt;/a&gt;] - Re-add jquery license file under &amp;quot;/licenses&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16901&quot;&gt;FLINK-16901&lt;/a&gt;] - Flink Kinesis connector NOTICE should have contents of AWS KPL&amp;#39;s THIRD_PARTY_NOTICES file manually merged in
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16913&quot;&gt;FLINK-16913&lt;/a&gt;] - ReadableConfigToConfigurationAdapter#getEnum throws UnsupportedOperationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16916&quot;&gt;FLINK-16916&lt;/a&gt;] - The logic of NullableSerializer#copy is wrong
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16944&quot;&gt;FLINK-16944&lt;/a&gt;] - Compile error in. DumpCompiledPlanTest and PreviewPlanDumpTest
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16980&quot;&gt;FLINK-16980&lt;/a&gt;] - Python UDF doesn&amp;#39;t work with protobuf 3.6.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16981&quot;&gt;FLINK-16981&lt;/a&gt;] - flink-runtime tests are crashing the JVM on Java11 because of PowerMock
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17062&quot;&gt;FLINK-17062&lt;/a&gt;] - Fix the conversion from Java row type to Python row type
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17066&quot;&gt;FLINK-17066&lt;/a&gt;] - Update pyarrow version bounds less than 0.14.0
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17093&quot;&gt;FLINK-17093&lt;/a&gt;] - Python UDF doesn&amp;#39;t work when the input column is from composite field
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17107&quot;&gt;FLINK-17107&lt;/a&gt;] - CheckpointCoordinatorConfiguration#isExactlyOnce() is inconsistent with StreamConfig#getCheckpointMode()
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17114&quot;&gt;FLINK-17114&lt;/a&gt;] - When the pyflink job runs in local mode and the command &amp;quot;python&amp;quot; points to Python 2.7, the startup of the Python UDF worker will fail.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17124&quot;&gt;FLINK-17124&lt;/a&gt;] - The PyFlink Job runs into infinite loop if the Python UDF imports job code
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17152&quot;&gt;FLINK-17152&lt;/a&gt;] - FunctionDefinitionUtil generate wrong resultType and acc type of AggregateFunctionDefinition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17308&quot;&gt;FLINK-17308&lt;/a&gt;] - ExecutionGraphCache cachedExecutionGraphs not cleanup cause OOM Bug
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17313&quot;&gt;FLINK-17313&lt;/a&gt;] - Validation error when insert decimal/varchar with precision into sink using TypeInformation of row
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17334&quot;&gt;FLINK-17334&lt;/a&gt;] - Flink does not support HIVE UDFs with primitive return types
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17338&quot;&gt;FLINK-17338&lt;/a&gt;] - LocalExecutorITCase.testBatchQueryCancel test timeout
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17359&quot;&gt;FLINK-17359&lt;/a&gt;] - Entropy key is not resolved if flink-s3-fs-hadoop is added as a plugin
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17403&quot;&gt;FLINK-17403&lt;/a&gt;] - Fix invalid classpath in BashJavaUtilsITCase
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17471&quot;&gt;FLINK-17471&lt;/a&gt;] - Move LICENSE and NOTICE files to root directory of python distribution
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17483&quot;&gt;FLINK-17483&lt;/a&gt;] - Update flink-sql-connector-elasticsearch7 NOTICE file to correctly reflect bundled dependencies
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17496&quot;&gt;FLINK-17496&lt;/a&gt;] - Performance regression with amazon-kinesis-producer 0.13.1 in Flink 1.10.x
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17499&quot;&gt;FLINK-17499&lt;/a&gt;] - LazyTimerService used to register timers via State Processing API incorrectly mixes event time timers with processing time timers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17514&quot;&gt;FLINK-17514&lt;/a&gt;] - TaskCancelerWatchdog does not kill TaskManager
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; New Feature
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17275&quot;&gt;FLINK-17275&lt;/a&gt;] - Add core training exercises
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9656&quot;&gt;FLINK-9656&lt;/a&gt;] - Environment java opts for flink run
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15094&quot;&gt;FLINK-15094&lt;/a&gt;] - Warning about using private constructor of java.nio.DirectByteBuffer in Java 11
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15584&quot;&gt;FLINK-15584&lt;/a&gt;] - Give nested data type of ROWs in ValidationException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15616&quot;&gt;FLINK-15616&lt;/a&gt;] - Move boot error messages from python-udf-boot.log to taskmanager&amp;#39;s log file
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15989&quot;&gt;FLINK-15989&lt;/a&gt;] - Rewrap OutOfMemoryError in allocateUnpooledOffHeap with better message
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16018&quot;&gt;FLINK-16018&lt;/a&gt;] - Improve error reporting when submitting batch job (instead of AskTimeoutException)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16125&quot;&gt;FLINK-16125&lt;/a&gt;] - Make zookeeper.connect optional for Kafka connectors
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16167&quot;&gt;FLINK-16167&lt;/a&gt;] - Update documentation about python shell execution
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16191&quot;&gt;FLINK-16191&lt;/a&gt;] - Improve error message on Windows when RocksDB Paths are too long
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16280&quot;&gt;FLINK-16280&lt;/a&gt;] - Fix sample code errors in the documentation about elasticsearch connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16288&quot;&gt;FLINK-16288&lt;/a&gt;] - Setting the TTL for discarding task pods on Kubernetes.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16293&quot;&gt;FLINK-16293&lt;/a&gt;] - Document using plugins in Kubernetes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16343&quot;&gt;FLINK-16343&lt;/a&gt;] - Improve exception message when reading an unbounded source in batch mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16406&quot;&gt;FLINK-16406&lt;/a&gt;] - Increase default value for JVM Metaspace to minimise its OutOfMemoryError
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16538&quot;&gt;FLINK-16538&lt;/a&gt;] - Restructure Python Table API documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16604&quot;&gt;FLINK-16604&lt;/a&gt;] - Column key in JM configuration is too narrow
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16683&quot;&gt;FLINK-16683&lt;/a&gt;] - Remove scripts for starting Flink on Windows
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16697&quot;&gt;FLINK-16697&lt;/a&gt;] - Disable JMX rebinding
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16763&quot;&gt;FLINK-16763&lt;/a&gt;] - Should not use BatchTableEnvironment for Python UDF in the document of flink-1.10
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16772&quot;&gt;FLINK-16772&lt;/a&gt;] - Bump derby to 10.12.1.1+ or exclude it
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16790&quot;&gt;FLINK-16790&lt;/a&gt;] - enables the interpretation of backslash escapes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16862&quot;&gt;FLINK-16862&lt;/a&gt;] - Remove example url in quickstarts
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16874&quot;&gt;FLINK-16874&lt;/a&gt;] - Respect the dynamic options when calculating memory options in taskmanager.sh
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16942&quot;&gt;FLINK-16942&lt;/a&gt;] - ES 5 sink should allow users to select netty transport client
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17065&quot;&gt;FLINK-17065&lt;/a&gt;] - Add documentation about the Python versions supported for PyFlink
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17125&quot;&gt;FLINK-17125&lt;/a&gt;] - Add a Usage Notes Page to Answer Common Questions Encountered by PyFlink Users
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17254&quot;&gt;FLINK-17254&lt;/a&gt;] - Improve the PyFlink documentation and examples to use SQL DDL for source/sink definition
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17276&quot;&gt;FLINK-17276&lt;/a&gt;] - Add checkstyle to training exercises
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17277&quot;&gt;FLINK-17277&lt;/a&gt;] - Apply IntelliJ recommendations to training exercises
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17278&quot;&gt;FLINK-17278&lt;/a&gt;] - Add Travis to the training exercises
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17279&quot;&gt;FLINK-17279&lt;/a&gt;] - Use gradle build scans for training exercises
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17316&quot;&gt;FLINK-17316&lt;/a&gt;] - Have HourlyTips solutions use TumblingEventTimeWindows.of
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15741&quot;&gt;FLINK-15741&lt;/a&gt;] - Fix TTL docs after enabling RocksDB compaction filter by default (needs Chinese translation)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15933&quot;&gt;FLINK-15933&lt;/a&gt;] - update content of how generic table schema is stored in hive via HiveCatalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15991&quot;&gt;FLINK-15991&lt;/a&gt;] - Create Chinese documentation for FLIP-49 TM memory model
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16004&quot;&gt;FLINK-16004&lt;/a&gt;] - Exclude flink-rocksdb-state-memory-control-test jars from the dist
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16454&quot;&gt;FLINK-16454&lt;/a&gt;] - Update the copyright year in NOTICE files
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16530&quot;&gt;FLINK-16530&lt;/a&gt;] - Add documentation about &amp;quot;GROUPING SETS&amp;quot; and &amp;quot;CUBE&amp;quot; support in streaming mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16592&quot;&gt;FLINK-16592&lt;/a&gt;] - The doc of Streaming File Sink has a mistake of grammar
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 12 May 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/05/12/release-1.10.1.html</link>
<guid isPermaLink="true">/news/2020/05/12/release-1.10.1.html</guid>
</item>
<item>
<title>Flink Community Update - May&#39;20</title>
<description>&lt;p&gt;Can you smell it? It’s release month! It took a while, but now that we’re &lt;a href=&quot;https://flink.apache.org/news/2020/04/01/community-update.html&quot;&gt;all caught up with the past&lt;/a&gt;, the Community Update is here to stay. This time around, we’re warming up for Flink 1.11 and peeping back to the month of April in the Flink community — with the release of Stateful Functions 2.0, a new self-paced Flink training and some efforts to improve the Flink documentation experience.&lt;/p&gt;
&lt;p&gt;Last month also marked the debut of Flink Forward Virtual Conference 2020: what did you think? If you missed it altogether or just want to recap some of the sessions, the &lt;a href=&quot;https://www.youtube.com/playlist?list=PLDX4T_cnKjD0ngnBSU-bYGfgVv17MiwA7&quot;&gt;videos&lt;/a&gt; and &lt;a href=&quot;https://www.slideshare.net/FlinkForward&quot;&gt;slides&lt;/a&gt; are now available!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-past-month-in-flink&quot; id=&quot;markdown-toc-the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-stateful-functions-20-is-out&quot; id=&quot;markdown-toc-flink-stateful-functions-20-is-out&quot;&gt;Flink Stateful Functions 2.0 is out!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#warming-up-for-flink-111&quot; id=&quot;markdown-toc-warming-up-for-flink-111&quot;&gt;Warming up for Flink 1.11&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-minor-releases&quot; id=&quot;markdown-toc-flink-minor-releases&quot;&gt;Flink Minor Releases&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-193&quot; id=&quot;markdown-toc-flink-193&quot;&gt;Flink 1.9.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-1101&quot; id=&quot;markdown-toc-flink-1101&quot;&gt;Flink 1.10.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers-and-pmc-members&quot; id=&quot;markdown-toc-new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#new-pmc-members&quot; id=&quot;markdown-toc-new-pmc-members&quot;&gt;New PMC Members&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers&quot; id=&quot;markdown-toc-new-committers&quot;&gt;New Committers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-bigger-picture&quot; id=&quot;markdown-toc-the-bigger-picture&quot;&gt;The Bigger Picture&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#a-new-self-paced-apache-flink-training&quot; id=&quot;markdown-toc-a-new-self-paced-apache-flink-training&quot;&gt;A new self-paced Apache Flink training&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#google-season-of-docs-2020&quot; id=&quot;markdown-toc-google-season-of-docs-2020&quot;&gt;Google Season of Docs 2020&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#and-something-to-read&quot; id=&quot;markdown-toc-and-something-to-read&quot;&gt;…and something to read!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-past-month-in-flink&quot;&gt;The Past Month in Flink&lt;/h1&gt;
&lt;h2 id=&quot;flink-stateful-functions-20-is-out&quot;&gt;Flink Stateful Functions 2.0 is out!&lt;/h2&gt;
&lt;p&gt;In the beginning of April, the Flink community announced the &lt;a href=&quot;https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html&quot;&gt;release of Stateful Functions 2.0&lt;/a&gt; — the first as part of the Apache Flink project. From this release, you can use Flink as the base of a (stateful) serverless platform with out-of-the-box consistent and scalable state, and efficient messaging between functions. You can even run your stateful functions on platforms like AWS Lambda, as Gordon (&lt;a href=&quot;https://twitter.com/tzulitai&quot;&gt;@tzulitai&lt;/a&gt;) demonstrated in &lt;a href=&quot;https://www.youtube.com/watch?v=tuSylBadNSo&amp;amp;list=PLDX4T_cnKjD0ngnBSU-bYGfgVv17MiwA7&amp;amp;index=27&amp;amp;t=8s&quot;&gt;his Flink Forward talk&lt;/a&gt;.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-05-06-community-update/2020-05-06-community-update_2.png&quot; width=&quot;550px&quot; alt=&quot;Stateful Functions&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;It’s been encouraging to see so many questions about Stateful Functions popping up in the &lt;a href=&quot;https://lists.apache.org/list.html?user@flink.apache.org:lte=3M:statefun&quot;&gt;mailing list&lt;/a&gt; and Stack Overflow! If you’d like to get involved, we’re always &lt;a href=&quot;https://github.com/apache/flink-statefun#contributing&quot;&gt;looking for new contributors&lt;/a&gt; — especially around SDKs for other languages like Go, Javascript and Rust.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;warming-up-for-flink-111&quot;&gt;Warming up for Flink 1.11&lt;/h2&gt;
&lt;p&gt;The final preparations for the release of Flink 1.11 are well underway, with the feature freeze scheduled for May 15th, and there are a lot of new features and improvements to look out for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;On the &lt;strong&gt;usability&lt;/strong&gt; side, you can expect a big focus on smoothing data ingestion with contributions like support for Change Data Capture (CDC) in the Table API/SQL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-105%3A+Support+to+Interpret+and+Emit+Changelog+in+Flink+SQL&quot;&gt;FLIP-105&lt;/a&gt;), easy streaming data ingestion into Apache Hive (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table&quot;&gt;FLIP-115&lt;/a&gt;) or support for Pandas DataFrames in PyFlink (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame&quot;&gt;FLIP-120&lt;/a&gt;). A great deal of effort has also gone into maturing PyFlink, with the introduction of user defined metrics in Python UDFs (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-112%3A+Support+User-Defined+Metrics+in++Python+UDF&quot;&gt;FLIP-112&lt;/a&gt;) and the extension of Python UDF support beyond the Python Table API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Function+DDL&quot;&gt;FLIP-106&lt;/a&gt;,&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-114%3A+Support+Python+UDF+in+SQL+Client&quot;&gt;FLIP-114&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;On the &lt;strong&gt;operational&lt;/strong&gt; side, the much-anticipated new Source API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface&quot;&gt;FLIP-27&lt;/a&gt;) will unify batch and streaming sources, and improve out-of-the-box event-time behavior; while unaligned checkpoints (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints&quot;&gt;FLIP-76&lt;/a&gt;) and changes to network memory management will make it possible to speed up checkpointing under backpressure — this is part of a bigger effort to rethink fault tolerance that will introduce many other non-trivial changes to Flink. You can learn more about it in &lt;a href=&quot;https://youtu.be/ssEmeLcL5Uk&quot;&gt;this&lt;/a&gt; recent Flink Forward talk!&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Throw into the mix improvements around type systems, the WebUI, metrics reporting, supported formats and…we can’t wait! To get an overview of the ongoing developments, have a look at &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Development-progress-of-Apache-Flink-1-11-tp40718.html&quot;&gt;this thread&lt;/a&gt;. We encourage the community to get involved in testing once an RC (Release Candidate) is out. Keep an eye on the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@dev mailing list&lt;/a&gt; for updates!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;flink-minor-releases&quot;&gt;Flink Minor Releases&lt;/h2&gt;
&lt;h3 id=&quot;flink-193&quot;&gt;Flink 1.9.3&lt;/h3&gt;
&lt;p&gt;The community released Flink 1.9.3, covering some outstanding bugs from Flink 1.9! You can find more in the &lt;a href=&quot;https://flink.apache.org/news/2020/04/24/release-1.9.3.html&quot;&gt;announcement blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;flink-1101&quot;&gt;Flink 1.10.1&lt;/h3&gt;
&lt;p&gt;Also in the pipeline is the release of Flink 1.10.1, already in the &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-1-release-candidate-2-td41019.html&quot;&gt;RC voting&lt;/a&gt; phase. So, you can expect Flink 1.10.1 to be released soon!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/h2&gt;
&lt;p&gt;The Apache Flink community has welcomed &lt;strong&gt;3 PMC Members&lt;/strong&gt; and &lt;strong&gt;2 new Committers&lt;/strong&gt; since the last update. Congratulations!&lt;/p&gt;
&lt;h3 id=&quot;new-pmc-members&quot;&gt;New PMC Members&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars2.githubusercontent.com/u/6242259?s=400&amp;amp;u=6e39f4fdbabc8ce4ccde9125166f791957d3ae80&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/dwysakowicz&quot;&gt;Dawid Wysakowicz&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars1.githubusercontent.com/u/4971479?s=400&amp;amp;u=49d4f217e26186606ab13a17a23a038b62b86682&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/HequnC&quot;&gt;Hequn Cheng&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars3.githubusercontent.com/u/12387855?s=400&amp;amp;u=37edbfccb6908541f359433f420f9f1bc25bc714&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;Zhijiang Wang&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&quot;new-committers&quot;&gt;New Committers&lt;/h3&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars3.githubusercontent.com/u/11538663?s=400&amp;amp;u=f4643f1981e2a8f8a1962c34511b0d32a31d9502&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/snntrable&quot;&gt;Konstantin Knauf&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-3&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;img class=&quot;img-circle&quot; src=&quot;https://avatars1.githubusercontent.com/u/1891970?s=400&amp;amp;u=b7718355ceb1f4a8d1e554c3ae7221e2f32cc8e0&amp;amp;v=4&quot; width=&quot;90&quot; height=&quot;90&quot; /&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/sjwiesman&quot;&gt;Seth Wiesman&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;the-bigger-picture&quot;&gt;The Bigger Picture&lt;/h1&gt;
&lt;h2 id=&quot;a-new-self-paced-apache-flink-training&quot;&gt;A new self-paced Apache Flink training&lt;/h2&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;This week, the Flink website received the invaluable contribution of a self-paced training course curated by David (&lt;a href=&quot;https://twitter.com/alpinegizmo&quot;&gt;@alpinegizmo&lt;/a&gt;) — or, what used to be the entire training materials under &lt;a href=&quot;https://training.ververica.com&quot;&gt;training.ververica.com&lt;/a&gt;. The new materials guide you through the very basics of Flink and the DataStream API, and round off every concepts section with hands-on exercises to help you better assimilate what you learned.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-05-06-community-update/2020-05-06-community-update_1.png&quot; width=&quot;1000px&quot; alt=&quot;Self-paced Flink Training&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:140%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Whether you’re new to Flink or just looking to strengthen your foundations, this training is the most comprehensive way to get started and is now completely open source: &lt;a href=&quot;https://flink.apache.org/training.html&quot;&gt;https://flink.apache.org/training.html&lt;/a&gt;. For now, the materials are only available in English, but the community intends to also provide a Chinese translation in the future.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id=&quot;google-season-of-docs-2020&quot;&gt;Google Season of Docs 2020&lt;/h2&gt;
&lt;p&gt;Google Season of Docs (GSoD) is a great initiative organized by &lt;a href=&quot;https://opensource.google.com/&quot;&gt;Google Open Source&lt;/a&gt; to pair technical writers with mentors to work on documentation for open source projects. Last year, the Flink community submitted &lt;a href=&quot;https://flink.apache.org/news/2019/04/17/sod.html&quot;&gt;an application&lt;/a&gt; that unfortunately didn’t make the cut — but we are trying again! This time, the proposal is to improve the Table API &amp;amp; SQL documentation:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1) Restructure the Table API &amp;amp; SQL Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Reworking the current documentation structure would make it possible to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Lower the entry barrier to Flink for non-programmatic (i.e. SQL) users.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Make the available features more easily discoverable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Improve the flow and logical correlation of topics.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685&quot;&gt;FLIP-60&lt;/a&gt; contains a detailed proposal on how to reorganize the existing documentation, which can be used as a starting point.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2) Extend the Table API &amp;amp; SQL Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Some areas of the documentation have insufficient detail or are not &lt;a href=&quot;https://flink.apache.org/contributing/docs-style.html#general-guiding-principles&quot;&gt;accessible&lt;/a&gt; for new Flink users. Examples of topics and sections that require attention are: planners, built-in functions, connectors, overview and concepts sections. There is a lot of work to be done and the technical writer could choose what areas to focus on — these improvements could then be added to the documentation rework umbrella issue (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12639&quot;&gt;FLINK-12639&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;If you’re interested in learning more about this project idea or want to get involved in GSoD as a technical writer, check out the &lt;a href=&quot;https://flink.apache.org/news/2020/05/04/season-of-docs.html&quot;&gt;announcement blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1 id=&quot;and-something-to-read&quot;&gt;…and something to read!&lt;/h1&gt;
&lt;p&gt;Events across the globe have pretty much come to a halt, so we’ll leave you with some interesting resources to read and explore instead. In addition to this written content, you can also recap the sessions from the &lt;a href=&quot;https://www.youtube.com/playlist?list=PLDX4T_cnKjD0ngnBSU-bYGfgVv17MiwA7&quot;&gt;Flink Forward Virtual Conference&lt;/a&gt;!&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Links&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-bookmark&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Blogposts&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://medium.com/@abdelkrim.hadjidj/event-driven-supply-chain-for-crisis-with-flinksql-be80cb3ad4f9&quot;&gt;Event-Driven Supply Chain for Crisis with FlinkSQL and Zeppelin&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html&quot;&gt;Memory Management Improvements with Apache Flink 1.10&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html&quot;&gt;Flink Serialization Tuning Vol. 1: Choosing your Serializer — if you can&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon-console&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Tutorials&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html&quot;&gt;PyFlink: Introducing Python Support for UDFs in Flink&#39;s Table API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://dev.to/morsapaes/flink-stateful-functions-where-to-start-2j39&quot;&gt;Flink Stateful Functions: where to start?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-certificate&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Flink Packages&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;&lt;p&gt;&lt;a href=&quot;https://flink-packages.org/&quot;&gt;Flink Packages&lt;/a&gt; is a website where you can explore (and contribute to) the Flink &lt;br /&gt; ecosystem of connectors, extensions, APIs, tools and integrations. &lt;b&gt;New in:&lt;/b&gt; &lt;/p&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/spillable-state-backend-for-flink&quot;&gt;Spillable State Backend for Flink&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/flink-memory-calculator&quot;&gt;Flink Memory Calculator&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink-packages.org/packages/ververica-platform-community-edition&quot;&gt;Ververica Platform Community Edition&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If you’d like to keep a closer eye on what’s happening in the community, subscribe to the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt; to get fine-grained weekly updates, upcoming event announcements and more.&lt;/p&gt;
</description>
<pubDate>Thu, 07 May 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/05/07/community-update.html</link>
<guid isPermaLink="true">/news/2020/05/07/community-update.html</guid>
</item>
<item>
<title>Applying to Google Season of Docs 2020</title>
<description>&lt;p&gt;The Flink community is thrilled to share that the project is applying again to &lt;a href=&quot;https://developers.google.com/season-of-docs/&quot;&gt;Google Season of Docs&lt;/a&gt; (GSoD) this year! If you’re unfamiliar with the program, GSoD is a great initiative organized by &lt;a href=&quot;https://opensource.google.com/&quot;&gt;Google Open Source&lt;/a&gt; to pair technical writers with mentors to work on documentation for open source projects. The &lt;a href=&quot;https://developers.google.com/season-of-docs/docs/2019/participants&quot;&gt;first edition&lt;/a&gt; supported over 40 projects, including some other cool Apache Software Foundation (ASF) members like Apache Airflow and Apache Cassandra.&lt;/p&gt;
&lt;h1 id=&quot;why-apply&quot;&gt;Why Apply?&lt;/h1&gt;
&lt;p&gt;As one of the most active projects in the ASF, Flink is experiencing a boom in contributions and some major changes to its codebase. And, while the project has also seen a significant increase in activity when it comes to writing, reviewing and translating documentation, it’s hard to keep up with the pace.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-05-04-season-of-docs/2020-04-30-season-of-docs_1.png&quot; width=&quot;650px&quot; alt=&quot;GitHub 1&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Since last year, the community has been working on &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation&quot;&gt;FLIP-42&lt;/a&gt; to improve the documentation experience and bring a more accessible information architecture to Flink. After &lt;a href=&quot;https://www.mail-archive.com/dev@flink.apache.org/msg36987.html&quot;&gt;some discussion&lt;/a&gt;, we agreed that GSoD would be a valuable opportunity to double down on this effort and collaborate with someone who is passionate about technical writing…and Flink!&lt;/p&gt;
&lt;h1 id=&quot;how-can-you-contribute&quot;&gt;How can you contribute?&lt;/h1&gt;
&lt;p&gt;If working shoulder to shoulder with the Flink community on documentation sounds exciting, we’d love to hear from you! You can read more about our idea for this year’s project below and, depending on whether it is accepted, &lt;a href=&quot;https://developers.google.com/season-of-docs/docs/tech-writer-guide&quot;&gt;apply&lt;/a&gt; as a technical writer. If you have any questions or just want to know more about the project idea, ping us at &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;dev@flink.apache.org&lt;/a&gt;!&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
Please &lt;a href=&quot;mailto:dev-subscribe@flink.apache.org&quot;&gt;subscribe&lt;/a&gt; to the Apache Flink mailing list before reaching out.
If you are not subscribed then responses to your message will not go through.
You can always &lt;a href=&quot;mailto:dev-unsubscribe@flink.apache.org&quot;&gt;unsubscribe&lt;/a&gt; at any time.
&lt;/div&gt;
&lt;h2 id=&quot;project-improve-the-table-api--sql-documentation&quot;&gt;Project: Improve the Table API &amp;amp; SQL Documentation&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://flink.apache.org/&quot;&gt;Apache Flink&lt;/a&gt; is a stateful stream processor supporting a broad set of use cases and featuring APIs at different levels of abstraction that allow users to trade off expressiveness and usability, as well as work with their language of choice (Java/Scala, SQL or Python). The Table API &amp;amp; SQL are Flink’s high-level relational abstractions and focus on data analytics use cases. A core principle is that either API can be used to process static (batch) and continuous (streaming) data with the same syntax and yielding the same results.&lt;/p&gt;
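&lt;p&gt;As a minimal, hypothetical sketch of that principle (the table name and schema are invented, and the Table API dependencies are assumed to be on the classpath), the query below is plain SQL that could equally run over a bounded or an unbounded clicks table; only the environment setup differs between the two modes.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot;&gt;import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.scala.StreamTableEnvironment

object ClickCounts {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Streaming mode is chosen here; a batch environment would run the same query unchanged.
    val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    val tEnv = StreamTableEnvironment.create(env, settings)

    // Assumes a table named &amp;quot;clicks&amp;quot; (with a user_name column) was registered
    // beforehand, e.g. via DDL or a connector descriptor.
    val counts = tEnv.sqlQuery(
      &amp;quot;SELECT user_name, COUNT(*) AS click_cnt FROM clicks GROUP BY user_name&amp;quot;)

    // Print the plan to show the query was parsed and optimized.
    println(tEnv.explain(counts))
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;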
&lt;p&gt;As the Flink community works on extending the scope of the Table API &amp;amp; SQL, a lot of new features are being added and some underlying structures are also being refactored. At the same time, the documentation for these APIs is growing into a somewhat unruly structure and has potential for improvement in some areas.&lt;/p&gt;
&lt;p&gt;The project has two main workstreams: restructuring and extending the Table API &amp;amp; SQL documentation. These can be worked on by one person as a bigger effort or assigned to different technical writers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1) Restructure the Table API &amp;amp; SQL Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Reworking the current documentation structure would make it possible to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Lower the entry barrier to Flink for non-programmatic (i.e. SQL) users.&lt;/li&gt;
&lt;li&gt;Make the available features more easily discoverable.&lt;/li&gt;
&lt;li&gt;Improve the flow and logical correlation of topics.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685&quot;&gt;FLIP-60&lt;/a&gt; contains a detailed proposal on how to reorganize the existing documentation, which can be used as a starting point.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2) Extend the Table API &amp;amp; SQL Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Some areas of the documentation have insufficient detail or are not &lt;a href=&quot;https://flink.apache.org/contributing/docs-style.html#general-guiding-principles&quot;&gt;accessible&lt;/a&gt; for new Flink users. Examples of topics and sections that require attention are: planners, built-in functions, connectors, overview and concepts sections. There is a lot of work to be done and the technical writer could choose what areas to focus on — these improvements could then be added to the documentation rework umbrella issue (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12639&quot;&gt;FLINK-12639&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id=&quot;project-mentors&quot;&gt;Project Mentors&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://twitter.com/aljoscha&quot;&gt;Aljoscha Krettek&lt;/a&gt; (Apache Flink and Apache Beam PMC Member)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://twitter.com/sjwiesman&quot;&gt;Seth Wiesman&lt;/a&gt; (Apache Flink Committer)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;related-resources&quot;&gt;Related Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;FLIP-60: &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685&quot;&gt;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Table API &amp;amp; SQL Documentation: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/&quot;&gt;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;How to Contribute Documentation: &lt;a href=&quot;https://flink.apache.org/contributing/contribute-documentation.html&quot;&gt;https://flink.apache.org/contributing/contribute-documentation.html&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Documentation Style Guide: &lt;a href=&quot;https://flink.apache.org/contributing/docs-style.html&quot;&gt;https://flink.apache.org/contributing/docs-style.html&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We look forward to receiving feedback on this GSoD application and to continuing to improve the documentation experience for Flink users. Join us!&lt;/p&gt;
</description>
<pubDate>Mon, 04 May 2020 08:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/05/04/season-of-docs.html</link>
<guid isPermaLink="true">/news/2020/05/04/season-of-docs.html</guid>
</item>
<item>
<title>Apache Flink 1.9.3 Released</title>
<description>&lt;p&gt;The Apache Flink community released the third bugfix version of the Apache Flink 1.9 series.&lt;/p&gt;
&lt;p&gt;This release includes 38 fixes and minor improvements for Flink 1.9.2. The list below contains a detailed overview of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.9.3.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15143&quot;&gt;FLINK-15143&lt;/a&gt;] - Create document for FLIP-49 TM memory model and configuration guide
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16389&quot;&gt;FLINK-16389&lt;/a&gt;] - Bump Kafka 0.10 to 0.10.2.2
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11193&quot;&gt;FLINK-11193&lt;/a&gt;] - Rocksdb timer service factory configuration option is not settable per job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14316&quot;&gt;FLINK-14316&lt;/a&gt;] - Stuck in &amp;quot;Job leader ... lost leadership&amp;quot; error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14560&quot;&gt;FLINK-14560&lt;/a&gt;] - The value of taskmanager.memory.size in flink-conf.yaml is set to zero will cause taskmanager not to work
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15010&quot;&gt;FLINK-15010&lt;/a&gt;] - Temp directories flink-netty-shuffle-* are not cleaned up
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15085&quot;&gt;FLINK-15085&lt;/a&gt;] - HistoryServer dashboard config json out of sync
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15386&quot;&gt;FLINK-15386&lt;/a&gt;] - SingleJobSubmittedJobGraphStore.putJobGraph has a logic error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15575&quot;&gt;FLINK-15575&lt;/a&gt;] - Azure Filesystem Shades Wrong Package &amp;quot;httpcomponents&amp;quot;
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15638&quot;&gt;FLINK-15638&lt;/a&gt;] - releasing/create_release_branch.sh does not set version in flink-python/pyflink/version.py
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15812&quot;&gt;FLINK-15812&lt;/a&gt;] - HistoryServer archiving is done in Dispatcher main thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15844&quot;&gt;FLINK-15844&lt;/a&gt;] - Removal of JobWithJars.buildUserCodeClassLoader method without Configuration breaks backwards compatibility
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15863&quot;&gt;FLINK-15863&lt;/a&gt;] - Fix docs stating that savepoints are relocatable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16047&quot;&gt;FLINK-16047&lt;/a&gt;] - Blink planner produces wrong aggregate results with state clean up
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16242&quot;&gt;FLINK-16242&lt;/a&gt;] - BinaryGeneric serialization error cause checkpoint failure
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16308&quot;&gt;FLINK-16308&lt;/a&gt;] - SQL connector download links are broken
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16373&quot;&gt;FLINK-16373&lt;/a&gt;] - EmbeddedLeaderService: IllegalStateException: The RPC connection is already closed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16573&quot;&gt;FLINK-16573&lt;/a&gt;] - Kinesis consumer does not properly shutdown RecordFetcher threads
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16576&quot;&gt;FLINK-16576&lt;/a&gt;] - State inconsistency on restore with memory state backends
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16696&quot;&gt;FLINK-16696&lt;/a&gt;] - Savepoint trigger documentation is insufficient
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16703&quot;&gt;FLINK-16703&lt;/a&gt;] - AkkaRpcActor state machine does not record transition to terminating state.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16836&quot;&gt;FLINK-16836&lt;/a&gt;] - Losing leadership does not clear rpc connection in JobManagerLeaderListener
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16860&quot;&gt;FLINK-16860&lt;/a&gt;] - Failed to push filter into OrcTableSource when upgrading to 1.9.2
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16916&quot;&gt;FLINK-16916&lt;/a&gt;] - The logic of NullableSerializer#copy is wrong
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-17062&quot;&gt;FLINK-17062&lt;/a&gt;] - Fix the conversion from Java row type to Python row type
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14278&quot;&gt;FLINK-14278&lt;/a&gt;] - Pass in ioExecutor into AbstractDispatcherResourceManagerComponentFactory
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15908&quot;&gt;FLINK-15908&lt;/a&gt;] - Add description of support &amp;#39;pip install&amp;#39; to 1.9.x documents
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15909&quot;&gt;FLINK-15909&lt;/a&gt;] - Add PyPI release process into the subsequent release of 1.9.x
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15938&quot;&gt;FLINK-15938&lt;/a&gt;] - Idle state not cleaned in StreamingJoinOperator and StreamingSemiAntiJoinOperator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16018&quot;&gt;FLINK-16018&lt;/a&gt;] - Improve error reporting when submitting batch job (instead of AskTimeoutException)
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16031&quot;&gt;FLINK-16031&lt;/a&gt;] - Improve the description in the README file of PyFlink 1.9.x
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16167&quot;&gt;FLINK-16167&lt;/a&gt;] - Update documentation about python shell execution
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16280&quot;&gt;FLINK-16280&lt;/a&gt;] - Fix sample code errors in the documentation about elasticsearch connector
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16697&quot;&gt;FLINK-16697&lt;/a&gt;] - Disable JMX rebinding
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16862&quot;&gt;FLINK-16862&lt;/a&gt;] - Remove example url in quickstarts
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16942&quot;&gt;FLINK-16942&lt;/a&gt;] - ES 5 sink should allow users to select netty transport client
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11767&quot;&gt;FLINK-11767&lt;/a&gt;] - Introduce new TypeSerializerUpgradeTestBase, new PojoSerializerUpgradeTest
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-16454&quot;&gt;FLINK-16454&lt;/a&gt;] - Update the copyright year in NOTICE files
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Fri, 24 Apr 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/04/24/release-1.9.3.html</link>
<guid isPermaLink="true">/news/2020/04/24/release-1.9.3.html</guid>
</item>
<item>
<title>Memory Management Improvements with Apache Flink 1.10</title>
<description>&lt;p&gt;Apache Flink 1.10 comes with significant changes to the memory model of the Task Managers and configuration options for your Flink applications. These recently-introduced changes make Flink more adaptable to all kinds of deployment environments (e.g. Kubernetes, Yarn, Mesos), providing strict control over its memory consumption. In this post, we describe Flink’s memory model, as it stands in Flink 1.10, how to set up and manage memory consumption of your Flink applications and the recent changes the community implemented in the latest Apache Flink release.&lt;/p&gt;
&lt;h2 id=&quot;introduction-to-flinks-memory-model&quot;&gt;Introduction to Flink’s memory model&lt;/h2&gt;
&lt;p&gt;Having a clear understanding of Apache Flink’s memory model allows you to manage resources for the various workloads more efficiently. The following diagram illustrates the main memory components in Flink:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-21-memory-management-improvements-flink-1.10/total-process-memory.svg&quot; width=&quot;400px&quot; alt=&quot;Flink: Total Process Memory&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Total Process Memory&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The Task Manager process is a JVM process. On a high level, its memory consists of the &lt;em&gt;JVM Heap&lt;/em&gt; and &lt;em&gt;Off-Heap&lt;/em&gt; memory. These types of memory are consumed by Flink directly or by the JVM for its specific purposes (e.g. metaspace). There are two major memory consumers within Flink: the user code of job operator tasks and the framework itself, which consumes memory for internal data structures, network buffers, etc.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Please note that&lt;/strong&gt; the user code has direct access to all memory types: &lt;em&gt;JVM Heap, Direct&lt;/em&gt; and &lt;em&gt;Native memory&lt;/em&gt;. Therefore, Flink cannot really control its allocation and usage. There are however two types of Off-Heap memory which are consumed by tasks and controlled explicitly by Flink:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Managed Memory&lt;/em&gt; (Off-Heap)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Network Buffers&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The latter is part of the &lt;em&gt;JVM Direct Memory&lt;/em&gt;, allocated for user record data exchange between operator tasks.&lt;/p&gt;
&lt;h2 id=&quot;how-to-set-up-flink-memory&quot;&gt;How to set up Flink memory&lt;/h2&gt;
&lt;p&gt;With the release of Flink 1.10, and in order to provide a better user experience, the framework comes with both high-level and fine-grained tuning of memory components. There are essentially three alternatives to setting up memory in Task Managers.&lt;/p&gt;
&lt;p&gt;The first two — and simplest — alternatives are to configure one of the following two options for the total memory available to the JVM process of the Task Manager:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Total Process Memory&lt;/em&gt;: total memory consumed by the Flink Java application (including user code) and by the JVM to run the whole process.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Total Flink Memory&lt;/em&gt;: only the memory consumed by the Flink Java application, including user code but excluding any memory allocated by the JVM to run it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is advisable to configure the &lt;em&gt;Total Flink Memory&lt;/em&gt; for standalone deployments, where explicitly declaring how much memory is given to Flink is common practice, while the outer &lt;em&gt;JVM overhead&lt;/em&gt; is of little interest. When deploying Flink in containerized environments (such as &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html&quot;&gt;Kubernetes&lt;/a&gt;, &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/deployment/yarn_setup.html&quot;&gt;Yarn&lt;/a&gt; or &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/deployment/mesos.html&quot;&gt;Mesos&lt;/a&gt;), the &lt;em&gt;Total Process Memory&lt;/em&gt; option is recommended instead, because it effectively becomes the total memory size of the requested container. Containerized environments usually strictly enforce this memory limit.&lt;/p&gt;
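&lt;p&gt;As an illustration, a minimal &lt;code&gt;flink-conf.yaml&lt;/code&gt; sketch of these two high-level alternatives could look like the following (the sizes are placeholder values, and only one of the two options should be set):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Alternative 1: total memory of the whole Task Manager process,
# e.g. for containerized deployments (placeholder value):
taskmanager.memory.process.size: 4096m

# Alternative 2: only the memory used by Flink itself,
# e.g. for standalone deployments (placeholder value):
# taskmanager.memory.flink.size: 3072m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;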
&lt;p&gt;If you want more fine-grained control over the size of the &lt;em&gt;JVM Heap&lt;/em&gt; and &lt;em&gt;Managed Memory&lt;/em&gt; (Off-Heap), there is also a third alternative: configuring both the &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#task-operator-heap-memory&quot;&gt;Task Heap&lt;/a&gt;&lt;/em&gt; and the &lt;em&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory&quot;&gt;Managed Memory&lt;/a&gt;&lt;/em&gt; explicitly. This alternative gives a clear separation between the heap memory and any other memory types.&lt;/p&gt;
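&lt;p&gt;A minimal sketch of this fine-grained alternative, again with placeholder sizes in &lt;code&gt;flink-conf.yaml&lt;/code&gt;, could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Fine-grained alternative: declare Task Heap and Managed Memory directly
# (placeholder values):
taskmanager.memory.task.heap.size: 2048m
taskmanager.memory.managed.size: 1024m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;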
&lt;p&gt;In line with the community’s efforts to &lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;unify batch and stream processing&lt;/a&gt;, this model works universally for both scenarios. It allows sharing the &lt;em&gt;JVM Heap&lt;/em&gt; memory between the user code of operator tasks in any workload and the heap state backend in stream processing scenarios. In a similar way, the &lt;em&gt;Managed Memory&lt;/em&gt; can be used for batch spilling and for the RocksDB state backend in streaming.&lt;/p&gt;
&lt;p&gt;The remaining memory components are automatically adjusted either based on their default values or additionally configured parameters. Flink also checks the overall consistency. You can find more information about the different memory components in the corresponding &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_detail.html&quot;&gt;documentation&lt;/a&gt;. Additionally, you can try different configuration options with the &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE/edit#gid=0&quot;&gt;configuration spreadsheet&lt;/a&gt; of &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors&quot;&gt;FLIP-49&lt;/a&gt; and check the corresponding results for your individual case.&lt;/p&gt;
&lt;p&gt;If you are migrating from a Flink version older than 1.10, we suggest following the steps in the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_migration.html&quot;&gt;migration guide&lt;/a&gt; of the Flink documentation.&lt;/p&gt;
&lt;h2 id=&quot;other-components&quot;&gt;Other components&lt;/h2&gt;
&lt;p&gt;While configuring Flink’s memory, the size of the different memory components can either be fixed with the value of the respective option or tuned using multiple options. Below we provide some more insight into the memory setup.&lt;/p&gt;
&lt;h3 id=&quot;fractions-of-the-total-flink-memory&quot;&gt;Fractions of the Total Flink Memory&lt;/h3&gt;
&lt;p&gt;This method allows a proportional breakdown of the &lt;em&gt;Total Flink Memory&lt;/em&gt; where the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory&quot;&gt;Managed Memory&lt;/a&gt; (if not set explicitly) and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#capped-fractionated-components&quot;&gt;Network Buffers&lt;/a&gt; can take certain fractions of it. The remaining memory is then assigned to the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#task-operator-heap-memory&quot;&gt;Task Heap&lt;/a&gt; (if not set explicitly) and other fixed &lt;em&gt;JVM Heap&lt;/em&gt; and &lt;em&gt;Off-Heap components&lt;/em&gt;. The following picture represents an example of such a setup:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-21-memory-management-improvements-flink-1.10/flink-memory-setup.svg&quot; width=&quot;800px&quot; alt=&quot;Flink: Example of Memory Setup&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Flink: Example of Memory Setup&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Please note that&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Flink will verify that the size of the derived &lt;em&gt;Network Memory&lt;/em&gt; is between its minimum and maximum value, otherwise Flink’s startup will fail. The maximum and minimum limits have default values which can be overwritten by the respective configuration options.&lt;/li&gt;
&lt;li&gt;In general, the configured fractions are treated by Flink as hints. Under certain scenarios, the derived value might not match the fraction. For example, if the &lt;em&gt;Total Flink Memory&lt;/em&gt; and the &lt;em&gt;Task Heap&lt;/em&gt; are configured to fixed values, the &lt;em&gt;Managed Memory&lt;/em&gt; will get a certain fraction and the &lt;em&gt;Network Memory&lt;/em&gt; will get the remaining memory which might not exactly match its fraction.&lt;/li&gt;
&lt;/ul&gt;
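&lt;p&gt;As an example of the options involved, the fractions and the limits of the derived memory components can be tuned along the following lines (a sketch showing the default values as of Flink 1.10):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Fraction of Total Flink Memory used as Managed Memory (if not set explicitly):
taskmanager.memory.managed.fraction: 0.4

# Fraction and min/max limits of the Network Memory:
taskmanager.memory.network.fraction: 0.1
taskmanager.memory.network.min: 64mb
taskmanager.memory.network.max: 1gb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;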
&lt;h3 id=&quot;more-hints-to-control-the-container-memory-limit&quot;&gt;More hints to control the container memory limit&lt;/h3&gt;
&lt;p&gt;The heap and direct memory usage are managed by the JVM. There are also many other possible sources of native memory consumption in Apache Flink or its user applications which are not managed by Flink or the JVM. Controlling their limits is often difficult, which complicates the debugging of potential memory leaks. If Flink’s process allocates too much memory in an unmanaged way, it can often result in Task Manager containers being killed in containerized environments. In this case, it may be hard to understand which type of memory consumption has exceeded its limit. Flink 1.10 introduces some specific tuning options to clearly represent such components. Although Flink cannot always enforce strict limits and borders among them, the idea here is to explicitly plan the memory usage. Below we provide some examples of how the memory setup can prevent containers from exceeding their memory limit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_tuning.html#rocksdb-state-backend&quot;&gt;RocksDB state cannot grow too big&lt;/a&gt;. The memory consumption of RocksDB state backend is accounted for in the &lt;em&gt;Managed Memory&lt;/em&gt;. RocksDB respects its limit by default (only since Flink 1.10). You can increase the &lt;em&gt;Managed Memory&lt;/em&gt; size to improve RocksDB’s performance or decrease it to save resources.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#configure-off-heap-memory-direct-or-native&quot;&gt;User code or its dependencies consume significant off-heap memory&lt;/a&gt;. Tuning the &lt;em&gt;Task Off-Heap&lt;/em&gt; option can assign additional direct or native memory to the user code or any of its dependencies. Flink cannot control native allocations but it sets the limit for &lt;em&gt;JVM Direct&lt;/em&gt; memory allocations. The &lt;em&gt;Direct&lt;/em&gt; memory limit is enforced by the JVM.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#jvm-parameters&quot;&gt;JVM metaspace requires additional memory&lt;/a&gt;. If you encounter &lt;code&gt;OutOfMemoryError: Metaspace&lt;/code&gt;, Flink provides an option to increase its limit and the JVM will ensure that it is not exceeded.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#capped-fractionated-components&quot;&gt;JVM requires more internal memory&lt;/a&gt;. There is no direct control over certain types of JVM process allocations but Flink provides &lt;em&gt;JVM Overhead&lt;/em&gt; options. The options allow declaring an additional amount of memory, anticipated for those allocations and not covered by other options.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
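&lt;p&gt;The options mentioned in the examples above can be combined in &lt;code&gt;flink-conf.yaml&lt;/code&gt;, for instance along the following lines (the sizes are placeholder values that only illustrate which component each option targets):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;# Additional direct/native memory budget for user code and its dependencies:
taskmanager.memory.task.off-heap.size: 512m

# Limit for the JVM metaspace (enforced by the JVM):
taskmanager.memory.jvm-metaspace.size: 256m

# Additional head room for other internal JVM allocations:
taskmanager.memory.jvm-overhead.fraction: 0.1
taskmanager.memory.jvm-overhead.min: 192mb
taskmanager.memory.jvm-overhead.max: 1gb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;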
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The latest Flink release (Flink 1.10) introduces some significant changes to Flink’s memory configuration, making it possible to manage your application memory and debug Flink significantly better than before. Future developments in this area also include adopting a similar memory model for the job manager process in &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers&quot;&gt;FLIP-116&lt;/a&gt;, so stay tuned for more additions and features in upcoming releases. If you have any suggestions or questions for the community, we encourage you to sign up to the Apache Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;mailing lists&lt;/a&gt; and become part of the discussion there.&lt;/p&gt;
</description>
<pubDate>Tue, 21 Apr 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html</link>
<guid isPermaLink="true">/news/2020/04/21/memory-management-improvements-flink-1.10.html</guid>
</item>
<item>
<title>Flink Serialization Tuning Vol. 1: Choosing your Serializer — if you can</title>
<description>&lt;p&gt;Almost every Flink job has to exchange data between its operators, and since these records may not only be sent to another instance in the same JVM but also to a separate process, they need to be serialized to bytes first. Similarly, Flink’s off-heap state backend is based on a local embedded RocksDB instance, which is implemented in native C++ code and thus also needs transformation into bytes on every state access. Wire and state serialization alone can easily cost a lot of your job’s performance if not executed correctly and thus, whenever you look into the profiler output of your Flink job, you will most likely see serialization among the top consumers of CPU cycles.&lt;/p&gt;
&lt;p&gt;Since serialization is so crucial to your Flink job, we would like to highlight Flink’s serialization stack in a series of blog posts starting with looking at the different ways Flink can serialize your data types.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#recap-flink-serialization&quot; id=&quot;markdown-toc-recap-flink-serialization&quot;&gt;Recap: Flink Serialization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#choice-of-serializer&quot; id=&quot;markdown-toc-choice-of-serializer&quot;&gt;Choice of Serializer&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#pojoserializer&quot; id=&quot;markdown-toc-pojoserializer&quot;&gt;PojoSerializer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#tuple-data-types&quot; id=&quot;markdown-toc-tuple-data-types&quot;&gt;Tuple Data Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#row-data-types&quot; id=&quot;markdown-toc-row-data-types&quot;&gt;Row Data Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#avro&quot; id=&quot;markdown-toc-avro&quot;&gt;Avro&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#avro-specific&quot; id=&quot;markdown-toc-avro-specific&quot;&gt;Avro Specific&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#avro-generic&quot; id=&quot;markdown-toc-avro-generic&quot;&gt;Avro Generic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#avro-reflect&quot; id=&quot;markdown-toc-avro-reflect&quot;&gt;Avro Reflect&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#kryo&quot; id=&quot;markdown-toc-kryo&quot;&gt;Kryo&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#disabling-kryo&quot; id=&quot;markdown-toc-disabling-kryo&quot;&gt;Disabling Kryo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#apache-thrift-via-kryo&quot; id=&quot;markdown-toc-apache-thrift-via-kryo&quot;&gt;Apache Thrift (via Kryo)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#protobuf-via-kryo&quot; id=&quot;markdown-toc-protobuf-via-kryo&quot;&gt;Protobuf (via Kryo)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#state-schema-evolution&quot; id=&quot;markdown-toc-state-schema-evolution&quot;&gt;State Schema Evolution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#performance-comparison&quot; id=&quot;markdown-toc-performance-comparison&quot;&gt;Performance Comparison&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;recap-flink-serialization&quot;&gt;Recap: Flink Serialization&lt;/h1&gt;
&lt;p&gt;Flink handles &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/types_serialization.html&quot;&gt;data types and serialization&lt;/a&gt; with its own type descriptors, generic type extraction, and type serialization framework. We recommend reading through the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/types_serialization.html&quot;&gt;documentation&lt;/a&gt; first in order to be able to follow the arguments we present below. In essence, Flink tries to infer information about your job’s data types for wire and state serialization, and to be able to use grouping, joining, and aggregation operations by referring to individual field names, e.g.
&lt;code&gt;stream.keyBy(&quot;ruleId&quot;)&lt;/code&gt; or
&lt;code&gt;dataSet.join(another).where(&quot;name&quot;).equalTo(&quot;personName&quot;)&lt;/code&gt;. It also allows optimizations in the serialization format as well as reducing unnecessary de/serializations (mainly in certain Batch operations as well as in the SQL/Table APIs).&lt;/p&gt;
&lt;h1 id=&quot;choice-of-serializer&quot;&gt;Choice of Serializer&lt;/h1&gt;
&lt;p&gt;Apache Flink’s out-of-the-box serialization can be roughly divided into the following groups:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Flink-provided special serializers&lt;/strong&gt; for basic types (Java primitives and their boxed form), arrays, composite types (tuples, Scala case classes, Rows), and a few auxiliary types (Option, Either, Lists, Maps, …),&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;POJOs&lt;/strong&gt;; a public, standalone class with a public no-argument constructor and all non-static, non-transient fields in the class hierarchy either public or with a public getter- and a setter-method; see &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types&quot;&gt;POJO Rules&lt;/a&gt;,&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generic types&lt;/strong&gt;; user-defined data types that are not recognized as a POJO and are therefore serialized via &lt;a href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Alternatively, you can also register &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/custom_serializers.html&quot;&gt;custom serializers&lt;/a&gt; for user-defined data types. This includes writing your own serializers or integrating other serialization systems like &lt;a href=&quot;https://developers.google.com/protocol-buffers/&quot;&gt;Google Protobuf&lt;/a&gt; or &lt;a href=&quot;https://thrift.apache.org/&quot;&gt;Apache Thrift&lt;/a&gt; via &lt;a href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt;. Overall, this gives quite a number of different options for serializing user-defined data types, and we will elaborate on seven of them in the sections below.&lt;/p&gt;
&lt;h2 id=&quot;pojoserializer&quot;&gt;PojoSerializer&lt;/h2&gt;
&lt;p&gt;As outlined above, if your data type is not covered by a specialized serializer but follows the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types&quot;&gt;POJO Rules&lt;/a&gt;, it will be serialized with the &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java&quot;&gt;PojoSerializer&lt;/a&gt; which uses Java reflection to access an object’s fields. It is fast, generic, Flink-specific, and supports &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/stream/state/schema_evolution.html&quot;&gt;state schema evolution&lt;/a&gt; out of the box. If a composite data type cannot be serialized as a POJO, you will find the following message (or similar) in your cluster logs:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;15:45:51,460 INFO org.apache.flink.api.java.typeutils.TypeExtractor - Class … cannot be used as a POJO type because not all fields are valid POJO fields, and must be processed as GenericType. Please read the Flink documentation on “Data Types &amp;amp; Serialization” for details of the effect on performance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This means that the PojoSerializer will not be used; instead, Flink will fall back to Kryo for serialization (see below). We will have a more detailed look into a few (more) situations that can lead to unexpected Kryo fallbacks in the second part of this blog post series.&lt;/p&gt;
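&lt;p&gt;To make the rules above a bit more tangible, here is a minimal sketch of a class that would be handled by the PojoSerializer (class and field names are made up for illustration):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;public class Person {
    // public field: a valid POJO field
    public int id;

    // non-public field with a public getter and setter: also a valid POJO field
    private String name;

    // public no-argument constructor
    public Person() {}

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;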
&lt;h2 id=&quot;tuple-data-types&quot;&gt;Tuple Data Types&lt;/h2&gt;
&lt;p&gt;Flink comes with a predefined set of tuple types which all have a fixed length and contain a set of strongly-typed fields of potentially different types. There are implementations for &lt;code&gt;Tuple0&lt;/code&gt;, &lt;code&gt;Tuple1&amp;lt;T0&amp;gt;&lt;/code&gt;, …, &lt;code&gt;Tuple25&amp;lt;T0, T1, ..., T24&amp;gt;&lt;/code&gt; and they may serve as easy-to-use wrappers that spare the creation of POJOs for each and every combination of objects you need to pass between computations. With the exception of &lt;code&gt;Tuple0&lt;/code&gt;, these are serialized and deserialized with the &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/TupleSerializer.java&quot;&gt;TupleSerializer&lt;/a&gt; and the according fields’ serializers. Since tuple classes are completely under the control of Flink, both actions can be performed without reflection by accessing the appropriate fields directly. This certainly is a (performance) advantage when working with tuples instead of POJOs. Tuples, however, are not as flexible and certainly less descriptive in code.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Since &lt;code&gt;Tuple0&lt;/code&gt; does not contain any data and therefore is probably a bit special anyway, it will use a special serializer implementation: &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/Tuple0Serializer.java&quot;&gt;Tuple0Serializer&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
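&lt;p&gt;As a minimal sketch with made-up data, working with tuples could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;DataStream&amp;lt;Tuple2&amp;lt;String, Integer&amp;gt;&amp;gt; counts = env
    .fromElements(Tuple2.of(&quot;hello&quot;, 1), Tuple2.of(&quot;world&quot;, 1))
    .keyBy(0)  // key by the first tuple field
    .sum(1);   // sum up the second tuple field
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;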
&lt;h2 id=&quot;row-data-types&quot;&gt;Row Data Types&lt;/h2&gt;
&lt;p&gt;Row types are mainly used by the Table and SQL APIs of Flink. A &lt;code&gt;Row&lt;/code&gt; groups an arbitrary number of objects together similar to the tuples above. These fields are not strongly typed and may all be of different types. Because field types are missing, Flink’s type extraction cannot automatically extract type information and users of a &lt;code&gt;Row&lt;/code&gt; need to manually tell Flink about the row’s field types. The &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/RowSerializer.java&quot;&gt;RowSerializer&lt;/a&gt; will then make use of these types for efficient serialization.&lt;/p&gt;
&lt;p&gt;Row type information can be provided in two ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;you can have your source or operator implement &lt;code&gt;ResultTypeQueryable&amp;lt;Row&amp;gt;&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RowSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ResultTypeQueryable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeInformation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getProducedType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ROW&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;OBJECT_ARRAY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;you can provide the types when building the job graph by using &lt;code&gt;SingleOutputStreamOperator#returns()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sourceStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;RowSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;returns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;ROW&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;INT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;OBJECT_ARRAY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)));&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-warning&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-warning&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-warning-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Warning&lt;/span&gt;
If you fail to provide the type information for a &lt;code&gt;Row&lt;/code&gt;, Flink identifies that &lt;code&gt;Row&lt;/code&gt; is not a valid POJO type according to the rules above and falls back to Kryo serialization (see below) which you will also see in the logs as:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;13:10:11,148 INFO org.apache.flink.api.java.typeutils.TypeExtractor - Class class org.apache.flink.types.Row cannot be used as a POJO type because not all fields are valid POJO fields, and must be processed as GenericType. Please read the Flink documentation on &quot;Data Types &amp;amp; Serialization&quot; for details of the effect on performance.&lt;/code&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;avro&quot;&gt;Avro&lt;/h2&gt;
&lt;p&gt;Flink offers built-in support for the &lt;a href=&quot;http://avro.apache.org/&quot;&gt;Apache Avro&lt;/a&gt; serialization framework (currently using version 1.8.2) by adding the &lt;code&gt;org.apache.flink:flink-avro&lt;/code&gt; dependency into your job. Flink’s &lt;a href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSerializer.java&quot;&gt;AvroSerializer&lt;/a&gt; can then use Avro’s specific, generic, and reflective data serialization and make use of Avro’s performance and flexibility, especially in terms of &lt;a href=&quot;https://avro.apache.org/docs/current/spec.html#Schema+Resolution&quot;&gt;evolving the schema&lt;/a&gt; when the classes change over time.&lt;/p&gt;
&lt;h3 id=&quot;avro-specific&quot;&gt;Avro Specific&lt;/h3&gt;
&lt;p&gt;Avro specific records will be automatically detected by checking that the given type’s type hierarchy contains the &lt;code&gt;SpecificRecordBase&lt;/code&gt; class. You can either specify your concrete Avro type, or—if you want to be more generic and allow different types in your operator—use the &lt;code&gt;SpecificRecordBase&lt;/code&gt; type (or a subtype) in your user functions, in &lt;code&gt;ResultTypeQueryable#getProducedType()&lt;/code&gt;, or in &lt;code&gt;SingleOutputStreamOperator#returns()&lt;/code&gt;. Since specific records use generated Java code, they are strongly typed and allow direct access to the fields via known getters and setters.&lt;/p&gt;
&lt;div class=&quot;alert alert-warning&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-warning&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-warning-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Warning&lt;/span&gt; If you specify the Flink type as &lt;code&gt;SpecificRecord&lt;/code&gt; and not &lt;code&gt;SpecificRecordBase&lt;/code&gt;, Flink will not see this as an Avro type. Instead, it will use Kryo to de/serialize any objects which may be considerably slower.&lt;/p&gt;
&lt;/div&gt;
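&lt;p&gt;As a hedged sketch of the first variant, assume a hypothetical Avro-generated class &lt;code&gt;MyAvroRecord&lt;/code&gt; (extending &lt;code&gt;SpecificRecordBase&lt;/code&gt;) and a hypothetical source emitting it:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// MyAvroRecord and MyAvroRecordSource are hypothetical, for illustration only.
DataStream&amp;lt;MyAvroRecord&amp;gt; records = env
    .addSource(new MyAvroRecordSource())
    // declaring the concrete specific-record type lets Flink pick the AvroSerializer
    .returns(MyAvroRecord.class);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;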
&lt;h3 id=&quot;avro-generic&quot;&gt;Avro Generic&lt;/h3&gt;
&lt;p&gt;Avro’s &lt;code&gt;GenericRecord&lt;/code&gt; types cannot, unfortunately, be used automatically since they require the user to &lt;a href=&quot;https://avro.apache.org/docs/1.8.2/gettingstartedjava.html#Serializing+and+deserializing+without+code+generation&quot;&gt;specify a schema&lt;/a&gt; (either manually or by retrieving it from some schema registry). With that schema, you can provide the right type information by either of the following options just like for the Row Types above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;implement &lt;code&gt;ResultTypeQueryable&amp;lt;GenericRecord&amp;gt;&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AvroGenericSource&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ResultTypeQueryable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;producedType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;AvroGenericSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;producedType&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeInformation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getProducedType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;producedType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;provide type information when building the job graph by using &lt;code&gt;SingleOutputStreamOperator#returns()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sourceStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;AvroGenericSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;returns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Without this type information, Flink will fall back to Kryo for serialization which would serialize the schema into every record, over and over again. As a result, the serialized form will be bigger and more costly to create.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Since Avro’s &lt;code&gt;Schema&lt;/code&gt; class is not serializable, it cannot be sent around as is. You can work around this by converting it to a String and parsing it back when needed. If you only do this once on initialization, there is practically no difference to sending it directly.&lt;/p&gt;
&lt;/div&gt;
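&lt;p&gt;A small sketch of this workaround using plain Avro APIs (for example during the initialization of a function) could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Ship the schema as a String because Schema itself is not serializable ...
String schemaString = schema.toString();

// ... and parse it back once, e.g. on initialization:
Schema parsedSchema = new Schema.Parser().parse(schemaString);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;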
&lt;h3 id=&quot;avro-reflect&quot;&gt;Avro Reflect&lt;/h3&gt;
&lt;p&gt;The third way of using Avro is to exchange Flink’s PojoSerializer (for POJOs according to the rules above) for Avro’s reflection-based serializer. This can be enabled by calling&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;enableForceAvro&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;kryo&quot;&gt;Kryo&lt;/h2&gt;
&lt;p&gt;Any class or object which does not fall into the categories above and is not covered by a Flink-provided special serializer is de/serialized with a fallback to &lt;a href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt; (currently version 2.24.0) which is a powerful and generic serialization framework in Java. Flink calls such a type a &lt;em&gt;generic type&lt;/em&gt; and you may stumble upon &lt;code&gt;GenericTypeInfo&lt;/code&gt; when debugging code. If you are using Kryo serialization, make sure to register your types with Kryo:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;registerKryoType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Registering types adds them to an internal map of classes to tags so that, during serialization, Kryo does not have to add the fully qualified class names as a prefix into the serialized form. Instead, Kryo uses these (integer) tags to identify the underlying classes and reduce serialization overhead.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Flink will store Kryo serializer mappings from type registrations in its checkpoints and savepoints and will retain them across job (re)starts.&lt;/p&gt;
&lt;/div&gt;
&lt;h3 id=&quot;disabling-kryo&quot;&gt;Disabling Kryo&lt;/h3&gt;
&lt;p&gt;If desired, you can disable the Kryo fallback, i.e. the ability to serialize generic types, by calling&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;disableGenericTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is mostly useful for finding out where these fallbacks are applied and replacing them with better serializers. If your job has any generic types with this configuration, it will fail with&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Exception in thread “main” java.lang.UnsupportedOperationException: Generic types have been disabled in the ExecutionConfig and type … is treated as a generic type.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you cannot immediately see from the type where it is being used, this log message also gives you a stacktrace that can be used to set breakpoints and find out more details in your IDE.&lt;/p&gt;
&lt;h2 id=&quot;apache-thrift-via-kryo&quot;&gt;Apache Thrift (via Kryo)&lt;/h2&gt;
&lt;p&gt;In addition to the variants above, Flink also allows you to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program&quot;&gt;register other type serialization frameworks&lt;/a&gt; with Kryo. After adding the appropriate dependencies from the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program&quot;&gt;documentation&lt;/a&gt; (&lt;code&gt;com.twitter:chill-thrift&lt;/code&gt; and &lt;code&gt;org.apache.thrift:libthrift&lt;/code&gt;), you can use &lt;a href=&quot;https://thrift.apache.org/&quot;&gt;Apache Thrift&lt;/a&gt; like the following:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addDefaultKryoSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBaseSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This only works if generic types are not disabled and &lt;code&gt;MyCustomType&lt;/code&gt; is a Thrift-generated data type. If the data type is not generated by Thrift, Flink will fail at runtime with an exception like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;java.lang.ClassCastException: class MyCustomType cannot be cast to class org.apache.thrift.TBase (MyCustomType and org.apache.thrift.TBase are in unnamed module of loader ‘app’)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Please note that &lt;code&gt;TBaseSerializer&lt;/code&gt; can be registered as a default Kryo serializer as above (and as specified in &lt;a href=&quot;https://github.com/twitter/chill/blob/v0.7.6/chill-thrift/src/main/java/com/twitter/chill/thrift/TBaseSerializer.java&quot;&gt;its documentation&lt;/a&gt;) or via &lt;code&gt;registerTypeWithKryoSerializer&lt;/code&gt;. In practice, we found both ways to work. We also saw no difference when the Thrift classes were registered in addition to the call above. This may be different in your scenario.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;protobuf-via-kryo&quot;&gt;Protobuf (via Kryo)&lt;/h2&gt;
&lt;p&gt;In a way similar to Apache Thrift, &lt;a href=&quot;https://developers.google.com/protocol-buffers/&quot;&gt;Google Protobuf&lt;/a&gt; may be &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program&quot;&gt;registered as a custom serializer&lt;/a&gt; after adding the right dependencies (&lt;code&gt;com.twitter:chill-protobuf&lt;/code&gt; and &lt;code&gt;com.google.protobuf:protobuf-java&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;registerTypeWithKryoSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ProtobufSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will work as long as generic types have not been disabled (this would disable Kryo for good). If &lt;code&gt;MyCustomType&lt;/code&gt; is not a Protobuf-generated class, your Flink job will fail at runtime with the following exception:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;java.lang.ClassCastException: class &lt;code&gt;MyCustomType&lt;/code&gt; cannot be cast to class com.google.protobuf.Message (&lt;code&gt;MyCustomType&lt;/code&gt; and com.google.protobuf.Message are in unnamed module of loader ‘app’)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Please note that &lt;code&gt;ProtobufSerializer&lt;/code&gt; can be registered as a default Kryo serializer (as specified in &lt;a href=&quot;https://github.com/twitter/chill/blob/v0.7.6/chill-protobuf/src/main/java/com/twitter/chill/protobuf/ProtobufSerializer.java&quot;&gt;its documentation&lt;/a&gt;) or via &lt;code&gt;registerTypeWithKryoSerializer&lt;/code&gt; (as presented here). In practice, we found both approaches to work. We also saw no difference when the Protobuf classes were registered in addition to the call above. Your scenario may differ.&lt;/p&gt;
&lt;/div&gt;
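&lt;p&gt;Both registrations above only take effect while Kryo’s generic-type fallback is enabled. As a minimal sketch of the switch in question: with generic types disabled, any type that would otherwise fall back to Kryo (including the Thrift and Protobuf types registered above) makes the job fail fast instead of silently taking the slow path.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// With this switch, Flink throws an exception as soon as a data type would be
// serialized by the generic Kryo fallback instead of accepting the slow path.
env.getConfig().disableGenericTypes();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;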
&lt;h1 id=&quot;state-schema-evolution&quot;&gt;State Schema Evolution&lt;/h1&gt;
&lt;p&gt;Before taking a closer look at the performance of each of the serializers described above, we would like to emphasize that performance is not everything that counts inside a real-world Flink job. Types for storing state, for example, should be able to evolve their schema (add/remove/change fields) throughout the lifetime of the job without losing previous state. This is what Flink calls &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/stream/state/schema_evolution.html&quot;&gt;State Schema Evolution&lt;/a&gt;. Currently, as of Flink 1.10, there are only two serializers that support out-of-the-box schema evolution: POJO and Avro. For anything else, if you want to change the state schema, you will have to either implement your own &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/stream/state/custom_serialization.html&quot;&gt;custom serializers&lt;/a&gt; or use the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/libs/state_processor_api.html&quot;&gt;State Processor API&lt;/a&gt; to modify your state for the new code.&lt;/p&gt;
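&lt;p&gt;As a minimal sketch (our own example, not taken from the docs or the benchmarks), the following POJO illustrates what such an evolution looks like: the job originally ran with only &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt;, and &lt;code&gt;email&lt;/code&gt; was added in a later version. When the new version is restored from a savepoint, the added field is simply initialized to &lt;code&gt;null&lt;/code&gt;; removing fields works analogously, while changing a field’s declared type does not.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;public class Customer {
    public long id;
    public String name;

    // Added in a later version of the job: POJO schema evolution restores state
    // written before this field existed and initializes it to null.
    public String email;

    // Flink POJOs need a public no-argument constructor and public fields
    // (or getters/setters) so that the PojoSerializer can access them.
    public Customer() {}
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;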
&lt;h1 id=&quot;performance-comparison&quot;&gt;Performance Comparison&lt;/h1&gt;
&lt;p&gt;With so many options for serialization, it is actually not easy to make the right choice. We already saw some technical advantages and disadvantages of each of them outlined above. Since serializers are at the core of your Flink jobs and usually also sit on the hot path (per record invocations), let us actually take a deeper look into their performance with the help of the Flink benchmarks project at &lt;a href=&quot;https://github.com/dataArtisans/flink-benchmarks&quot;&gt;https://github.com/dataArtisans/flink-benchmarks&lt;/a&gt;. This project adds a few micro-benchmarks on top of Flink (some more low-level than others) to track performance regressions and improvements. Flink’s continuous benchmarks for monitoring the serialization stack’s performance are implemented in &lt;a href=&quot;https://github.com/dataArtisans/flink-benchmarks/blob/master/src/main/java/org/apache/flink/benchmark/SerializationFrameworkMiniBenchmarks.java&quot;&gt;SerializationFrameworkMiniBenchmarks.java&lt;/a&gt;. This is only a subset of all available serialization benchmarks though and you will find the complete set in &lt;a href=&quot;https://github.com/dataArtisans/flink-benchmarks/blob/master/src/main/java/org/apache/flink/benchmark/full/SerializationFrameworkAllBenchmarks.java&quot;&gt;SerializationFrameworkAllBenchmarks.java&lt;/a&gt;. All of these use the same definition of a small POJO that may cover average use cases. Essentially (without constructors, getters, and setters), these are the data types that it uses for evaluating performance:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyPojo&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;operationNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MyOperation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;operations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherId1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherId2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherId3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;someObject&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyOperation&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;protected&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is mapped to tuples, rows, Avro specific records, Thrift and Protobuf representations appropriately and sent through a simple Flink job at parallelism 4 where the data type is used during network communication like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setParallelism&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;PojoSource&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RECORDS_PER_INVOCATION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;rebalance&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DiscardingSink&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After running this through the &lt;a href=&quot;http://openjdk.java.net/projects/code-tools/jmh/&quot;&gt;jmh&lt;/a&gt; micro-benchmarks defined in &lt;a href=&quot;https://github.com/dataArtisans/flink-benchmarks/blob/master/src/main/java/org/apache/flink/benchmark/full/SerializationFrameworkAllBenchmarks.java&quot;&gt;SerializationFrameworkAllBenchmarks.java&lt;/a&gt;, I retrieved the following performance results for Flink 1.10 on my machine (in number of operations per millisecond):
&lt;br /&gt;&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-15-flink-serialization-performance-results.svg&quot; width=&quot;800px&quot; alt=&quot;Serializer performance comparison for Flink 1.10 (operations per millisecond)&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;A few takeaways from these numbers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The default fallback from POJO to Kryo reduces performance by 75%.&lt;br /&gt;
Registering the types with Kryo significantly improves its performance, leaving it at only 64% fewer operations than the POJO serializer.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Avro GenericRecord and SpecificRecord are serialized at roughly the same speed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Avro Reflect serialization is even slower than Kryo default (-45%).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Tuples are the fastest, closely followed by Rows. Both leverage fast specialized serialization code based on direct access without Java reflection.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Using a (nested) Tuple instead of a POJO may speed up your job by 42% (but is less flexible!); see the sketch after this list for one possible mapping.
Having code-generation for the PojoSerializer (&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-3599&quot;&gt;FLINK-3599&lt;/a&gt;) may actually close that gap (or at least move closer to the RowSerializer). If you feel like giving the implementation a go, please let the Flink community know and we will see whether we can make that happen.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you cannot use POJOs, try to define your data type with one of the serialization frameworks that generate specific code for it: Protobuf, Avro, Thrift (in that order, performance-wise).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
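&lt;p&gt;To make the Tuple comparison above a bit more concrete, here is a rough sketch of how &lt;code&gt;MyPojo&lt;/code&gt; could be carried as a nested Tuple. The field order and the use of &lt;code&gt;Tuple2&lt;/code&gt; for &lt;code&gt;MyOperation&lt;/code&gt; are our own illustrative choices, not necessarily the exact mapping used in the benchmarks project:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustration only: a nested Tuple carrying the same data as MyPojo.
// The raw Tuple2[] keeps the example short; a typed wrapper would be cleaner.
Tuple2&amp;lt;Integer, String&amp;gt; operation = Tuple2.of(1, &amp;quot;op-1&amp;quot;);

Tuple8&amp;lt;Integer, String, String[], Tuple2[], Integer, Integer, Integer, Object&amp;gt; asTuple =
    Tuple8.of(42, &amp;quot;some name&amp;quot;, new String[] {&amp;quot;op-1&amp;quot;}, new Tuple2[] {operation}, 1, 2, 3, null);

// Fields are accessed by position (asTuple.f0, asTuple.f1, ...), which is what makes the
// TupleSerializer fast but also less readable and less flexible than a POJO.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;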
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt; As with all benchmarks, please bear in mind that these numbers only give a hint about Flink’s serializer performance in a specific scenario. They may be different with your data types but the rough classification is probably the same. If you want to be sure, please verify the results with your data types. You should be able to copy from &lt;code&gt;SerializationFrameworkAllBenchmarks.java&lt;/code&gt; to set up your own micro-benchmarks (a minimal sketch follows below) or integrate different serialization benchmarks into your own tooling.&lt;/p&gt;
&lt;/div&gt;
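&lt;p&gt;For orientation, a minimal sketch of such a micro-benchmark could look as follows. It assumes JMH on the classpath and reuses the &lt;code&gt;PojoSource&lt;/code&gt; from the flink-benchmarks project; the real benchmarks additionally configure warm-up iterations, forks and a shared Flink environment, so treat this as a starting point rather than the exact setup:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;public class MySerializationBenchmark {

    // Assumed value: large enough that serialization dominates the measurement.
    private static final int RECORDS_PER_INVOCATION = 300_000;

    @Benchmark
    @OperationsPerInvocation(RECORDS_PER_INVOCATION)
    public void serializePojo() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4);

        env.addSource(new PojoSource(RECORDS_PER_INVOCATION, 10)) // source from flink-benchmarks
            .rebalance()                                          // forces de/serialization between subtasks
            .addSink(new DiscardingSink&amp;lt;&amp;gt;());

        env.execute();
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;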
&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;In the sections above, we looked at how Flink performs serialization for different sorts of data types and elaborated the technical advantages and disadvantages. For data types used in Flink state, you probably want to leverage either POJO or Avro types which, currently, are the only ones supporting state evolution out of the box and allow your stateful application to develop over time. POJOs are usually faster in the de/serialization while Avro may support more flexible schema evolution and may integrate better with external systems. Please note, however, that you can use different serializers for external vs. internal components or even state vs. network communication.&lt;/p&gt;
&lt;p&gt;The fastest de/serialization is achieved with Flink’s internal tuple and row serializers which can access these types’ fields directly without going via reflection. With roughly 30% decreased throughput as compared to tuples, Protobuf and POJO types do not perform too badly on their own and are more flexible and maintainable. Avro (specific and generic) records as well as Thrift data types further reduce performance by 20% and 30%, respectively. You definitely want to avoid Kryo as that reduces throughput further by around 50% and more!&lt;/p&gt;
&lt;p&gt;The next article in this series will use this finding as a starting point to look into a few common pitfalls and obstacles of avoiding Kryo, how to get the most out of the PojoSerializer, and a few more tuning techniques with respect to serialization. Stay tuned for more.&lt;/p&gt;
</description>
<pubDate>Wed, 15 Apr 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html</link>
<guid isPermaLink="true">/news/2020/04/15/flink-serialization-tuning-vol-1.html</guid>
</item>
<item>
<title>PyFlink: Introducing Python Support for UDFs in Flink&#39;s Table API</title>
<description>&lt;p&gt;Flink 1.9 introduced the Python Table API, allowing developers and data engineers to write Python Table API jobs for Table transformations and analysis, such as Python ETL or aggregate jobs. However, Python users faced some limitations when it came to support for Python UDFs in Flink 1.9, preventing them from extending the system’s built-in functionality.&lt;/p&gt;
&lt;p&gt;In Flink 1.10, the community further extended the support for Python by adding Python UDFs in PyFlink. Additionally, both the Python UDF environment and dependency management are now supported, allowing users to import third-party libraries in the UDFs, leveraging Python’s rich set of third-party libraries.&lt;/p&gt;
&lt;h1 id=&quot;python-support-for-udfs-in-flink-110&quot;&gt;Python Support for UDFs in Flink 1.10&lt;/h1&gt;
&lt;p&gt;Before diving into how you can define and use Python UDFs, we explain the motivation and background behind how UDFs work in PyFlink and provide some additional context about the implementation of our approach. Below we give a brief introduction to the PyFlink architecture, from job submission all the way to executing the Python UDF.&lt;/p&gt;
&lt;p&gt;The PyFlink architecture mainly includes two parts — local and cluster — as shown in the architecture visual below. The local phase is the compilation of the job, and the cluster is the execution of the job.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-09-pyflink-udfs/pyflink-udf-architecture.png&quot; width=&quot;600px&quot; alt=&quot;PyFlink UDF Architecture&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;For the local part, the Python API is a mapping of the Java API: each time Python executes a method in the figure above, it synchronously calls the corresponding Java method through Py4J and finally generates a Java JobGraph before submitting it to the cluster.&lt;/p&gt;
&lt;p&gt;For the cluster part, just like ordinary Java jobs, the JobMaster schedules tasks to TaskManagers. Tasks in a TaskManager that include a Python UDF involve the execution of both Java and Python operators. In the Python UDF operator, various gRPC services are used to provide different communications between the Java VM and the Python VM, such as DataService for data transmissions, StateService for state requirements, and Logging and Metrics Services. These services are built on Beam’s Fn API. While currently only Process mode is supported for Python workers, support for Docker mode and External service mode is also considered for future Flink releases.&lt;/p&gt;
&lt;h1 id=&quot;how-to-use-pyflink-with-udfs-in-flink-110&quot;&gt;How to use PyFlink with UDFs in Flink 1.10&lt;/h1&gt;
&lt;p&gt;This section provides some Python user defined function (UDF) examples, including how to install PyFlink, how to define/register/invoke UDFs in PyFlink and how to execute the job.&lt;/p&gt;
&lt;h2 id=&quot;install-pyflink&quot;&gt;Install PyFlink&lt;/h2&gt;
&lt;p&gt;Using Python in Apache Flink requires installing PyFlink. PyFlink is available through PyPI and can be easily installed using pip:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python -m pip install apache-flink&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
Please note that Python 3.5 or higher is required to install and run PyFlink.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h2 id=&quot;define-a-python-udf&quot;&gt;Define a Python UDF&lt;/h2&gt;
&lt;p&gt;There are many ways to define a Python scalar function, besides extending the base class &lt;code&gt;ScalarFunction&lt;/code&gt;. The following example shows the different ways of defining a Python scalar function that takes two columns of &lt;code&gt;BIGINT&lt;/code&gt; as input parameters and returns the sum of them as the result.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;c&quot;&gt;# option 1: extending the base class `ScalarFunction`&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ScalarFunction&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# option 2: Python function&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# option 3: lambda function&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# option 4: callable function&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CallableAdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__call__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CallableAdd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# option 5: partial function&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partial_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;functools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partial_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;register-a-python-udf&quot;&gt;Register a Python UDF&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;c&quot;&gt;# register the Python function&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;table_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;register_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;add&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;invoke-a-python-udf&quot;&gt;Invoke a Python UDF&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;c&quot;&gt;# use the function in Python Table API&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;my_table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;add(a, b)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Below, you can find a complete example of using Python UDF.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table.descriptors&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OldCsv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FileSystem&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pyflink.table.udf&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_parallelism&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;register_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;add&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FileSystem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;/tmp/input&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;with_format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OldCsv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;b&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;with_schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;b&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create_temporary_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySource&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FileSystem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;/tmp/output&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;with_format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OldCsv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;sum&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;with_schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;sum&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create_temporary_table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySink&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySource&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;\
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;add(a, b)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; \
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert_into&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;mySink&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;tutorial_job&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;submit-the-job&quot;&gt;Submit the job&lt;/h2&gt;
&lt;p&gt;Firstly, you need to prepare the input data in the “/tmp/input” file. For example,&lt;/p&gt;
&lt;p&gt;&lt;code&gt;$ echo &quot;1,2&quot; &amp;gt; /tmp/input&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Next, you can run this example on the command line,&lt;/p&gt;
&lt;p&gt;&lt;code&gt;$ python python_udf_sum.py&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The command builds and runs the Python Table API program in a local mini-cluster. You can also submit the Python Table API program to a remote cluster using different command-line options (see more details &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/cli.html#job-submission-examples&quot;&gt;here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Finally, you can see the execution result on the command line:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;$ cat /tmp/output
3&lt;/code&gt;&lt;/p&gt;
&lt;h2 id=&quot;python-udf-dependency-management&quot;&gt;Python UDF dependency management&lt;/h2&gt;
&lt;p&gt;In many cases, you would like to import third-party dependencies in the Python UDF. The example below provides detailed guidance on how to manage such dependencies.&lt;/p&gt;
&lt;p&gt;Suppose you want to use the &lt;code&gt;mpmath&lt;/code&gt; library to perform the addition from the example above. The Python UDF may look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;mpmath&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fadd&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# add third-party dependency&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fadd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To make it available on worker nodes that do not already contain the dependency, you can specify the dependencies with the following commands and API:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; /tmp
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;mpmath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;1.1.0 &amp;gt; requirements.txt
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;pip download -d cached_dir -r requirements.txt --no-binary :all:&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_python_requirements&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;/tmp/requirements.txt&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/tmp/cached_dir&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A &lt;code&gt;requirements.txt&lt;/code&gt; file that defines the third-party dependencies is used. If the dependencies cannot be accessed in the cluster, then you can specify a directory containing the installation packages of these dependencies by using the parameter “&lt;code&gt;requirements_cached_dir&lt;/code&gt;”, as illustrated in the example above. The dependencies will be uploaded to the cluster and installed offline.&lt;/p&gt;
&lt;h1 id=&quot;conclusion--upcoming-work&quot;&gt;Conclusion &amp;amp; Upcoming work&lt;/h1&gt;
&lt;p&gt;In this blog post, we introduced the architecture of Python UDFs in PyFlink and provided some examples on how to define, register and invoke UDFs. Flink 1.10 brings Python support in the framework to new levels, allowing Python users to write even more magic with their preferred language. The community is actively working towards continuously improving the functionality and performance of PyFlink. Future work in upcoming releases will introduce support for Pandas UDFs in scalar and aggregate functions, add support to use Python UDFs through the SQL client to further expand the usage scope of Python UDFs, provide support for a Python ML Pipeline API and finally work towards even more performance improvements. The picture below provides more details on the roadmap for succeeding releases.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-09-pyflink-udfs/roadmap-of-pyflink.png&quot; width=&quot;600px&quot; alt=&quot;Roadmap of PyFlink&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
</description>
<pubDate>Thu, 09 Apr 2020 14:00:00 +0200</pubDate>
<link>https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html</link>
<guid isPermaLink="true">/2020/04/09/pyflink-udf-support-flink.html</guid>
</item>
<item>
<title>Stateful Functions 2.0 - An Event-driven Database on Apache Flink</title>
<description>&lt;p&gt;Today, we are announcing the release of Stateful Functions (StateFun) 2.0 — the first release of Stateful Functions as part of the Apache Flink project.
This release marks a big milestone: Stateful Functions 2.0 is not only an API update, but the &lt;strong&gt;first version of an event-driven database&lt;/strong&gt; that is built on Apache Flink.&lt;/p&gt;
&lt;p&gt;Stateful Functions 2.0 makes it possible to combine StateFun’s powerful approach to state and composition with the elasticity, rapid scaling/scale-to-zero and rolling upgrade capabilities of FaaS implementations like AWS Lambda and modern resource orchestration frameworks like Kubernetes.&lt;/p&gt;
&lt;p&gt;With these features, Stateful Functions 2.0 addresses &lt;a href=&quot;https://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-3.pdf&quot;&gt;two of the most cited shortcomings&lt;/a&gt; of many FaaS setups today: consistent state and efficient messaging between functions.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#an-event-driven-database&quot; id=&quot;markdown-toc-an-event-driven-database&quot;&gt;An Event-driven Database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#event-driven-database-vs-requestresponse-database&quot; id=&quot;markdown-toc-event-driven-database-vs-requestresponse-database&quot;&gt;“Event-driven Database” vs. “Request/Response Database”&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#state-and-consistency&quot; id=&quot;markdown-toc-state-and-consistency&quot;&gt;State and Consistency&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#remote-co-located-or-embedded-functions&quot; id=&quot;markdown-toc-remote-co-located-or-embedded-functions&quot;&gt;Remote, Co-located or Embedded Functions&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#remote-functions&quot; id=&quot;markdown-toc-remote-functions&quot;&gt;Remote Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#co-located-functions&quot; id=&quot;markdown-toc-co-located-functions&quot;&gt;Co-located Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#embedded-functions&quot; id=&quot;markdown-toc-embedded-functions&quot;&gt;Embedded Functions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#loading-data-into-the-database&quot; id=&quot;markdown-toc-loading-data-into-the-database&quot;&gt;Loading Data into the Database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#try-it-out-and-get-involved&quot; id=&quot;markdown-toc-try-it-out-and-get-involved&quot;&gt;Try it out and get involved!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#thank-you&quot; id=&quot;markdown-toc-thank-you&quot;&gt;Thank you!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;an-event-driven-database&quot;&gt;An Event-driven Database&lt;/h2&gt;
&lt;p&gt;When Stateful Functions joined Apache Flink at the beginning of this year, the project had started as a library on top of Flink to build general-purpose event-driven applications. Users would implement &lt;em&gt;functions&lt;/em&gt; that receive and send messages, and maintain state in persistent variables. Flink provided the runtime with efficient exactly-once state and messaging. Stateful Functions 1.0 was a FaaS-inspired mix between stream processing and actor programming — on steroids.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image2.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 1&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.1:&lt;/b&gt; A ride-sharing app as a Stateful Functions example.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;In version 2.0, Stateful Functions now physically decouples the functions from Flink and the JVM, to invoke them through simple services. That makes it possible to execute functions on a FaaS platform, a Kubernetes deployment or behind a (micro) service.&lt;/p&gt;
&lt;p&gt;Flink invokes the functions through a service endpoint via HTTP or gRPC based on incoming events, and supplies state access. The system makes sure that only one invocation per entity (&lt;code&gt;type&lt;/code&gt;+&lt;code&gt;ID&lt;/code&gt;) is ongoing at any point in time, thus guaranteeing consistency through isolation.
By supplying state access as part of the function invocation, the functions themselves behave like stateless applications and can be managed with the same simplicity and benefits: rapid scalability, scale-to-zero, rolling/zero-downtime upgrades and so on.&lt;/p&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image5.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 2&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.2:&lt;/b&gt; In Stateful Functions 2.0, functions are stateless and state access is part of the function invocation.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;The functions can be implemented in any programming language that can handle HTTP requests or bring up a gRPC server. The &lt;a href=&quot;https://github.com/apache/flink-statefun&quot;&gt;StateFun project&lt;/a&gt; includes a very slim SDK for Python, taking requests and dispatching them to annotated functions. We aim to provide similar SDKs for other languages, such as Go, JavaScript or Rust. Users do not need to write any Flink code (or JVM code) at all; data ingresses/egresses and function endpoints can be defined in a compact YAML spec.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-lg-6&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image3.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 3&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.3:&lt;/b&gt; A module declaring a remote endpoint and a function type.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;col-lg-6&quot;&gt;
&lt;div class=&quot;text-center&quot;&gt;
&lt;figure&gt;
&lt;div style=&quot;line-height:540%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image10.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 4&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.4:&lt;/b&gt; A Python implementation of a simple classifier function.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;The Flink processes (and the JVM) do not execute any user code at all, although this is possible for performance reasons (see &lt;a href=&quot;#embedded-functions&quot;&gt;Embedded Functions&lt;/a&gt;). Rather than running application-specific dataflows, Flink here stores the state of the functions and provides the dynamic messaging plane through which functions message each other, carefully dispatching messages/invocations to the event-driven functions/services to maintain consistency guarantees.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Effectively, Flink takes the role of the database, but tailored towards event-driven functions and services.
It integrates state storage with the messaging between (and the invocations of) functions and services.
Because of this, Stateful Functions 2.0 can be thought of as an “Event-driven Database” on Apache Flink.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;event-driven-database-vs-requestresponse-database&quot;&gt;“Event-driven Database” vs. “Request/Response Database”&lt;/h2&gt;
&lt;p&gt;In the case of a traditional database or key/value store (let’s call them request/response databases), the application issues queries to the database (e.g. SQL via JDBC, GET/PUT via HTTP). In contrast, an event-driven database like StateFun &lt;strong&gt;&lt;em&gt;inverts&lt;/em&gt;&lt;/strong&gt; that relationship between database and application: the database invokes the functions/services based on arriving messages. This fits very naturally with FaaS and many event-driven application architectures.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image7.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 5&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.5:&lt;/b&gt; Stateful Functions 2.0 inverts the relationship between database and application.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;In the case of applications built on request/response databases, the database is responsible only for the state. Communication between different functions/services is a separate concern handled within the application layer. In contrast to that, an event-driven database takes care of both state storage and message transport, in a tightly integrated manner.&lt;/p&gt;
&lt;p&gt;Similar to &lt;a href=&quot;https://www.brianstorti.com/the-actor-model/&quot;&gt;Actor Programming&lt;/a&gt;, Stateful Functions uses the idea of &lt;em&gt;addressable entities&lt;/em&gt; - here, the entity is a function &lt;code&gt;type&lt;/code&gt; with an invocation scoped to an &lt;code&gt;ID&lt;/code&gt;. These addressable entities own the state and are the targets of messages. The difference to actor systems is that the application logic is external and the addressable entities are not physical objects in memory (i.e. actors), but rows in Flink’s managed state, together with the entities’ mailboxes.&lt;/p&gt;
&lt;h3 id=&quot;state-and-consistency&quot;&gt;State and Consistency&lt;/h3&gt;
&lt;p&gt;Besides matching the needs of serverless applications and FaaS well, the event-driven database approach also helps with simplifying consistent state management.&lt;/p&gt;
&lt;p&gt;Consider the example below, with two entities of an application — for example two microservices (&lt;em&gt;Service 1&lt;/em&gt;, &lt;em&gt;Service 2&lt;/em&gt;). &lt;em&gt;Service 1&lt;/em&gt; is invoked, updates the state in the database, and sends a request to &lt;em&gt;Service 2&lt;/em&gt;. Assume that this request fails. There is, in general, no way for &lt;em&gt;Service 1&lt;/em&gt; to know whether &lt;em&gt;Service 2&lt;/em&gt; processed the request and updated its state or not (c.f. &lt;a href=&quot;https://en.wikipedia.org/wiki/Two_Generals%27_Problem&quot;&gt;Two Generals Problem&lt;/a&gt;). To work around that, many techniques exist — making requests idempotent and retrying, commit/rollback protocols, or external transaction coordinators, for example. Solving this in the application layer is complex enough, and including the database into these approaches only adds more complexity.&lt;/p&gt;
&lt;p&gt;In the scenario where the event-driven database takes care of state and messaging, we have a much easier problem to solve. Assume one shard of the database receives the initial message, updates its state, invokes &lt;em&gt;Service 1&lt;/em&gt;, and routes the message produced by the function to another shard, to be delivered to &lt;em&gt;Service 2&lt;/em&gt;. Now assume the message transport errored — it may have failed or not, we cannot know for certain. Because the database is in charge of state and messaging, it can offer a generic solution to make sure that either both go through or none does, for example through transactions or &lt;a href=&quot;https://dl.acm.org/doi/abs/10.14778/3137765.3137777&quot;&gt;consistent snapshots&lt;/a&gt;. The application functions are stateless and their invocations have no side effects, which means they can be re-invoked without implications for consistency.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;figure&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image8.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 6&quot; /&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;figcaption&gt;&lt;i&gt;&lt;b&gt;Fig.6:&lt;/b&gt; The event-driven database integrates state access and messaging, guaranteeing consistency.&lt;/i&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;That is the big lesson we learned from working on stream processing technology in the past years: &lt;strong&gt;state access/updates and messaging need to be integrated&lt;/strong&gt;. This gives you consistency and scalable behavior, and applies backpressure based on both state access and compute bottlenecks.&lt;/p&gt;
&lt;p&gt;Despite state and computation being physically separated here, the scheduling/dispatching of function invocations is still integrated and physically co-located with state access, preserving the consistency guarantees given by physical state/compute co-location.&lt;/p&gt;
&lt;h2 id=&quot;remote-co-located-or-embedded-functions&quot;&gt;Remote, Co-located or Embedded Functions&lt;/h2&gt;
&lt;p&gt;Functions can be deployed in various ways that trade off loose coupling and independent scaling against performance overhead. Each module of functions can be of a different kind, so some functions can run remotely, while others run embedded.&lt;/p&gt;
&lt;h3 id=&quot;remote-functions&quot;&gt;Remote Functions&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;Remote Functions&lt;/em&gt; are the mechanism described so far, where functions are deployed separately from the Flink StateFun cluster. The state/messaging tier (i.e. the Flink processes) and the function tier can be deployed and scaled independently. All function invocations are remote and have to go through the endpoint service.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image6.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 7&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;In a similar way as databases are accessed via a standardized protocol (e.g. ODBC/JDBC for relational databases, REST for many key/value stores), StateFun 2.0 invokes functions and services through a standardized protocol: HTTP or gRPC with data in a well-defined ProtoBuf schema.&lt;/p&gt;
&lt;h3 id=&quot;co-located-functions&quot;&gt;Co-located Functions&lt;/h3&gt;
&lt;p&gt;An alternative way of deploying functions is &lt;em&gt;co-location&lt;/em&gt; with the Flink JVM processes. In such a setup, each Flink TaskManager would talk to one function process sitting “next to it”. A common way to do this is to use a system like Kubernetes and deploy pods consisting of a Flink container and the function container that communicate via the pod-local network.&lt;/p&gt;
&lt;p&gt;This mode supports different languages while avoiding the need to route invocations through a Service/Gateway/LoadBalancer, but it cannot scale the state and compute parts independently.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image9.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 8&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;This style of deployment is similar to how &lt;a href=&quot;https://beam.apache.org/roadmap/portability/&quot;&gt;Apache Beam’s portability layer&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/tutorials/python_table_api.html&quot;&gt;Flink’s Python API&lt;/a&gt; deploy their non-JVM language SDKs.&lt;/p&gt;
&lt;h3 id=&quot;embedded-functions&quot;&gt;Embedded Functions&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;Embedded Functions&lt;/em&gt; are the mode of Stateful Functions 1.0 and Flink’s Java/Scala stream processing APIs. Functions are deployed into the JVM and are invoked directly with the messages and state access. This is the most performant option, though at the cost of supporting only JVM languages.&lt;/p&gt;
&lt;div style=&quot;line-height:60%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-04-07-release-statefun-2.0.0/image11.png&quot; width=&quot;600px&quot; alt=&quot;Statefun 9&quot; /&gt;
&lt;/center&gt;
&lt;div style=&quot;line-height:150%;&quot;&gt;
&lt;br /&gt;
&lt;/div&gt;
&lt;p&gt;Following the database analogy, embedded functions are a bit like &lt;em&gt;stored procedures&lt;/em&gt;, but in a principled way: the functions here are normal Java/Scala/Kotlin functions implementing standard interfaces and can be developed or tested in any IDE.&lt;/p&gt;
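&lt;p&gt;To make this concrete, here is a minimal sketch of an embedded function, assuming the embedded Java SDK’s &lt;code&gt;StatefulFunction&lt;/code&gt; interface and its &lt;code&gt;PersistedValue&lt;/code&gt; state primitive; the &lt;code&gt;GreeterFunction&lt;/code&gt; name and the state handle are illustrative, not taken from the release:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedValue;

public class GreeterFunction implements StatefulFunction {

  // Durable, per-address state that Flink manages for this function type + ID.
  @Persisted
  private final PersistedValue&amp;lt;Integer&amp;gt; seenCount =
      PersistedValue.of(&amp;quot;seen-count&amp;quot;, Integer.class);

  @Override
  public void invoke(Context context, Object input) {
    int seen = seenCount.getOrDefault(0) + 1;
    seenCount.set(seen);
    // Reply to the function (address) that sent this message.
    context.reply(&amp;quot;Hello, message #&amp;quot; + seen);
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the state is declared on the function itself, it lives in Flink’s managed state and is snapshotted together with the messaging, as discussed above.&lt;/p&gt;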
&lt;h2 id=&quot;loading-data-into-the-database&quot;&gt;Loading Data into the Database&lt;/h2&gt;
&lt;p&gt;When building a new stateful application, you usually don’t start from a completely blank slate. Often, the application has initial state, such as initial “bootstrap” state, or state from previous versions of the application. When using a database, one could simply bulk load the data to prepare the application.&lt;/p&gt;
&lt;p&gt;The equivalent step for Flink would be to write a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/ops/state/savepoints.html&quot;&gt;savepoint&lt;/a&gt; that contains the initial state. Savepoints are snapshots of the state of the distributed stream processing application and can be passed to Flink to start processing from that state. Think of them as a database dump, but of a distributed streaming database. In the case of StateFun, the savepoint would contain the state of the functions.&lt;/p&gt;
&lt;p&gt;To create a savepoint for a Stateful Functions program, check out the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.0/deployment-and-operations/state-bootstrap.html&quot;&gt;State Bootstrapping API&lt;/a&gt; that is part of StateFun 2.0. The State Bootstrapping API uses Flink’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.15/dev/batch/&quot;&gt;DataSet API&lt;/a&gt;, but we plan to expand this to use SQL in the next versions.&lt;/p&gt;
&lt;h2 id=&quot;try-it-out-and-get-involved&quot;&gt;Try it out and get involved!&lt;/h2&gt;
&lt;p&gt;We hope that we could convey some of the excitement we feel about Stateful Functions. If we managed to pique your curiosity, try it out — for example, starting with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.0/getting-started/python_walkthrough.html&quot;&gt;this walkthrough&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The project is still in a comparatively early stage, so if you want to get involved, there is lots to work on: SDKs for other languages (e.g. Go, JavaScript, Rust), ingresses/egresses and tools for testing, among others.&lt;/p&gt;
&lt;p&gt;To follow the project and learn more, please check out these resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Code: &lt;a href=&quot;https://github.com/apache/flink-statefun&quot;&gt;https://github.com/apache/flink-statefun&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.0/&quot;&gt;https://nightlies.apache.org/flink/flink-statefun-docs-release-2.0/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Apache Flink project site: &lt;a href=&quot;https://flink.apache.org/&quot;&gt;https://flink.apache.org/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Apache Flink on Twitter: &lt;a href=&quot;https://twitter.com/apacheflink&quot;&gt;@ApacheFlink&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stateful Functions Webpage: &lt;a href=&quot;https://statefun.io&quot;&gt;https://statefun.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stateful Functions on Twitter: &lt;a href=&quot;https://twitter.com/statefun_io&quot;&gt;@StateFun_IO&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;thank-you&quot;&gt;Thank you!&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;David Anderson, Dian Fu, Igal Shilman, Seth Wiesman, Stephan Ewen, Tzu-Li (Gordon) Tai, hequn8128&lt;/p&gt;
</description>
<pubDate>Tue, 07 Apr 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html</link>
<guid isPermaLink="true">/news/2020/04/07/release-statefun-2.0.0.html</guid>
</item>
<item>
<title>Flink Community Update - April&#39;20</title>
<description>&lt;p&gt;While things slow down around us, the Apache Flink community is privileged to remain as active as ever. This blogpost combs through the past few months to give you an update on the state of things in Flink — from core releases to Stateful Functions; from some good old community stats to a new development blog.&lt;/p&gt;
&lt;p&gt;And since now it’s more important than ever to keep up the spirits, we’d like to invite you to join the &lt;a href=&quot;https://www.flink-forward.org/sf-2020&quot;&gt;Flink Forward Virtual Conference&lt;/a&gt;, on April 22-24 (see &lt;a href=&quot;#upcoming-events&quot;&gt;Upcoming Events&lt;/a&gt;). Hope to see you there!&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#the-year-so-far-in-flink&quot; id=&quot;markdown-toc-the-year-so-far-in-flink&quot;&gt;The Year (so far) in Flink&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-110-release&quot; id=&quot;markdown-toc-flink-110-release&quot;&gt;Flink 1.10 Release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#stateful-functions-contribution-and-20-release&quot; id=&quot;markdown-toc-stateful-functions-contribution-and-20-release&quot;&gt;Stateful Functions Contribution and 2.0 Release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#building-up-to-flink-111&quot; id=&quot;markdown-toc-building-up-to-flink-111&quot;&gt;Building up to Flink 1.11&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers-and-pmc-members&quot; id=&quot;markdown-toc-new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#new-pmc-members&quot; id=&quot;markdown-toc-new-pmc-members&quot;&gt;New PMC Members&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#new-committers&quot; id=&quot;markdown-toc-new-committers&quot;&gt;New Committers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-bigger-picture&quot; id=&quot;markdown-toc-the-bigger-picture&quot;&gt;The Bigger Picture&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#a-look-into-the-flink-repository&quot; id=&quot;markdown-toc-a-look-into-the-flink-repository&quot;&gt;A Look into the Flink Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-community-packages&quot; id=&quot;markdown-toc-flink-community-packages&quot;&gt;Flink Community Packages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-engine-room&quot; id=&quot;markdown-toc-flink-engine-room&quot;&gt;Flink “Engine Room”&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#upcoming-events&quot; id=&quot;markdown-toc-upcoming-events&quot;&gt;Upcoming Events&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#flink-forward-virtual-conference&quot; id=&quot;markdown-toc-flink-forward-virtual-conference&quot;&gt;Flink Forward Virtual Conference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#others&quot; id=&quot;markdown-toc-others&quot;&gt;Others&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h1 id=&quot;the-year-so-far-in-flink&quot;&gt;The Year (so far) in Flink&lt;/h1&gt;
&lt;h2 id=&quot;flink-110-release&quot;&gt;Flink 1.10 Release&lt;/h2&gt;
&lt;p&gt;To kick off the new year, the Flink community &lt;a href=&quot;https://flink.apache.org/news/2020/02/11/release-1.10.0.html&quot;&gt;released Flink 1.10&lt;/a&gt; with the record contribution of over 200 engineers. This release introduced significant improvements to the overall performance and stability of Flink jobs, a preview of native Kubernetes integration and advances in Python support (PyFlink). Flink 1.10 also marked the completion of the &lt;a href=&quot;https://flink.apache.org/news/2019/08/22/release-1.9.0.html#preview-of-the-new-blink-sql-query-processor&quot;&gt;Blink integration&lt;/a&gt;, hardening streaming SQL and bringing mature batch processing to Flink with production-ready Hive integration and TPC-DS coverage.&lt;/p&gt;
&lt;p&gt;The community is now discussing the &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-10-1-td38689.html#a38690&quot;&gt;release of Flink 1.10.1&lt;/a&gt;, covering some outstanding bugs from Flink 1.10.&lt;/p&gt;
&lt;h2 id=&quot;stateful-functions-contribution-and-20-release&quot;&gt;Stateful Functions Contribution and 2.0 Release&lt;/h2&gt;
&lt;p&gt;Last January, the first version of Stateful Functions (&lt;a href=&quot;https://statefun.io/&quot;&gt;statefun.io&lt;/a&gt;) code was pushed to the &lt;a href=&quot;https://github.com/apache/flink-statefun&quot;&gt;Flink repository&lt;/a&gt;. Stateful Functions started out as an API to build general purpose event-driven applications on Flink, taking advantage of its advanced state management mechanism to cut the “middleman” that usually handles state coordination in such applications (e.g. a database).&lt;/p&gt;
&lt;p&gt;In a &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Update-on-Flink-Stateful-Functions-what-are-the-next-steps-tp38646.html&quot;&gt;recent update&lt;/a&gt;, some new features were announced, like multi-language support (including a Python SDK), function unit testing and Stateful Functions’ own flavor of the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/libs/state_processor_api.html&quot;&gt;State Processor API&lt;/a&gt;. The release cycle will be independent from core Flink releases and the Release Candidate (RC) has been created — so, &lt;strong&gt;you can expect Stateful Functions 2.0 to be released very soon!&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id=&quot;building-up-to-flink-111&quot;&gt;Building up to Flink 1.11&lt;/h2&gt;
&lt;p&gt;Amidst the usual outpouring of discussion threads, JIRA tickets and FLIPs, the community is working at full steam on bringing Flink 1.11 to life in the next few months. The feature freeze is currently scheduled for late April, so the release is expected around mid-May.
The upcoming release will focus on new features and integrations that broaden the scope of Flink use cases, as well as core runtime enhancements to streamline the operations of complex deployments.&lt;/p&gt;
&lt;p&gt;Some of the plans on the use case side include support for changelog streams in the Table API/SQL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-105%3A+Support+to+Interpret+and+Emit+Changelog+in+Flink+SQL&quot;&gt;FLIP-105&lt;/a&gt;), easy streaming data ingestion into Apache Hive (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table&quot;&gt;FLIP-115&lt;/a&gt;) and support for Pandas DataFrames in PyFlink. On the operational side, the much anticipated new Source API (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface&quot;&gt;FLIP-27&lt;/a&gt;) will unify batch and streaming sources, and improve out-of-the-box event-time behavior; while unaligned checkpoints (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints&quot;&gt;FLIP-76&lt;/a&gt;) and some changes to network memory management will allow to speed up checkpointing under backpressure.&lt;/p&gt;
&lt;p&gt;Throw in improvements around the type system, the WebUI, metrics reporting and supported formats, and this release is bound to keep the community busy. For a complete overview of the ongoing development, check &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-of-Apache-Flink-1-11-td38724.html#a38793&quot;&gt;this discussion&lt;/a&gt; and follow the weekly updates on the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;new-committers-and-pmc-members&quot;&gt;New Committers and PMC Members&lt;/h2&gt;
&lt;p&gt;The Apache Flink community has welcomed &lt;strong&gt;1 PMC (Project Management Committee) Member&lt;/strong&gt; and &lt;strong&gt;5 new Committers&lt;/strong&gt; since the last update (September 2019):&lt;/p&gt;
&lt;h3 id=&quot;new-pmc-members&quot;&gt;New PMC Members&lt;/h3&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;Jark Wu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;new-committers&quot;&gt;New Committers&lt;/h3&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;Zili Chen, Jingsong Lee, Yu Li, Dian Fu, Zhu Zhu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Congratulations to all, and thank you for your hard work and commitment to Flink!&lt;/p&gt;
&lt;h1 id=&quot;the-bigger-picture&quot;&gt;The Bigger Picture&lt;/h1&gt;
&lt;h2 id=&quot;a-look-into-the-flink-repository&quot;&gt;A Look into the Flink Repository&lt;/h2&gt;
&lt;p&gt;In the &lt;a href=&quot;https://flink.apache.org/news/2019/09/10/community-update.html&quot;&gt;last update&lt;/a&gt;, we shared some numbers around Flink releases and mailing list activity. This time, we’re looking into the activity in the Flink repository and how it’s evolving.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-03-30-flink-community-update/2020-03-30-flink-community-update_1.png&quot; width=&quot;725px&quot; alt=&quot;GitHub 1&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;There is a clear upward trend in the number of contributions to the repository, based on the number of commits. This reflects the &lt;strong&gt;fast pace of development&lt;/strong&gt; the project is experiencing and also the &lt;strong&gt;successful integration of the China-based Flink contributors&lt;/strong&gt; started early last year. To complement these observations, the repository registered a &lt;strong&gt;1.5x increase in the number of individual contributors in 2019&lt;/strong&gt;, compared to the previous year.&lt;/p&gt;
&lt;p&gt;But did this increase in capacity produce any other measurable benefits?&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-03-30-flink-community-update/2020-03-30-flink-community-update_2.png&quot; width=&quot;725px&quot; alt=&quot;GitHub 2&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;If we look at the average time of Pull Request (PR) “resolution”, it seems like it did: &lt;strong&gt;the average time it takes to close a PR these days has been steadily decreasing&lt;/strong&gt; since last year, sitting at 5 to 6 days for the past few months.&lt;/p&gt;
&lt;p&gt;These are great indicators of the health of Flink as an open source project!&lt;/p&gt;
&lt;h2 id=&quot;flink-community-packages&quot;&gt;Flink Community Packages&lt;/h2&gt;
&lt;p&gt;If you missed the launch of &lt;a href=&quot;http://flink-packages.org/&quot;&gt;flink-packages.org&lt;/a&gt;, here’s a reminder! Ververica has &lt;a href=&quot;https://www.ververica.com/blog/announcing-flink-community-packages&quot;&gt;created (and open sourced)&lt;/a&gt; a website that showcases the work of the community to push forward the ecosystem surrounding Flink. There, you can explore existing packages (like the Pravega and Pulsar Flink connectors, or the Flink Kubernetes operators developed by Google and Lyft) and also submit your own contributions to the ecosystem.&lt;/p&gt;
&lt;h2 id=&quot;flink-engine-room&quot;&gt;Flink “Engine Room”&lt;/h2&gt;
&lt;p&gt;The community has recently launched the &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewrecentblogposts.action?key=FLINK&quot;&gt;“Engine Room”&lt;/a&gt;, a dedicated space in Flink’s Wiki for knowledge sharing between contributors. The goal of this initiative is to make ongoing development on Flink internals more transparent across different work streams, and also to help new contributors get on board with best practices. The first blogpost is already up and sheds light on the &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/2020/03/22/Migrating+Flink%27s+CI+Infrastructure+from+Travis+CI+to+Azure+Pipelines&quot;&gt;migration of Flink’s CI infrastructure from Travis to Azure Pipelines&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;upcoming-events&quot;&gt;Upcoming Events&lt;/h1&gt;
&lt;h2 id=&quot;flink-forward-virtual-conference&quot;&gt;Flink Forward Virtual Conference&lt;/h2&gt;
&lt;p&gt;The organization of Flink Forward had to make the hard decision of cancelling this year’s event in San Francisco. But all is not lost! &lt;strong&gt;Flink Forward SF will be held online on April 22-24 and you can register (for free)&lt;/strong&gt; &lt;a href=&quot;https://www.flink-forward.org/sf-2020&quot;&gt;here&lt;/a&gt;. Join the community for interactive talks and Q&amp;amp;A sessions with core Flink contributors and companies like Splunk, Lyft, Netflix or Google.&lt;/p&gt;
&lt;h2 id=&quot;others&quot;&gt;Others&lt;/h2&gt;
&lt;p&gt;Events across the globe have come to a halt due to the growing concerns around COVID-19, so this time we’ll leave you with some interesting content to read instead. In addition to this written content, you can also recap last year’s sessions from &lt;a href=&quot;https://www.youtube.com/playlist?list=PLDX4T_cnKjD207Aa8b5CsZjc7Z_KRezGz&quot;&gt;Flink Forward Berlin&lt;/a&gt; and &lt;a href=&quot;https://www.youtube.com/playlist?list=PLDX4T_cnKjD3ANoNinSx3Au-poZTHvbF5&quot;&gt;Flink Forward China&lt;/a&gt;!&lt;/p&gt;
&lt;table class=&quot;table table-bordered&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Links&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon glyphicon-bookmark&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Blogposts&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://medium.com/bird-engineering/replayable-process-functions-in-flink-time-ordering-and-timers-28007a0210e1&quot;&gt;Replayable Process Functions: Time, Ordering, and Timers @Bird&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://engineering.salesforce.com/application-log-intelligence-performance-insights-at-salesforce-using-flink-92955f30573f&quot;&gt;Application Log Intelligence &amp;amp; Performance Insights at Salesforce Using Flink @Salesforce&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html&quot;&gt;State Unlocked: Interacting with State in Apache Flink&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html&quot;&gt;Advanced Flink Application Patterns Vol.1: Case Study of a Fraud Detection System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html&quot;&gt;Advanced Flink Application Patterns Vol.2: Dynamic Updates of Application Logic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html&quot;&gt;Apache Beam: How Beam Runs on Top of Flink&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/features/2020/03/27/flink-for-data-warehouse.html&quot;&gt;Flink as Unified Engine for Modern Data Warehousing: Production-Ready Hive Integration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span class=&quot;glyphicon glyphicon-console&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Tutorials&lt;/td&gt;
&lt;td&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://medium.com/@zjffdu/flink-on-zeppelin-part-3-streaming-5fca1e16754&quot;&gt;Flink on Zeppelin — (Part 3). Streaming&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics&quot;&gt;Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/02/20/ddl.html&quot;&gt;No Java Required: Configuring Sources and Sinks in SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html&quot;&gt;A Guide for Unit Testing in Apache Flink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If you’d like to keep a closer eye on what’s happening in the community, subscribe to the Flink &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;@community mailing list&lt;/a&gt; to get fine-grained weekly updates, upcoming event announcements and more.&lt;/p&gt;
</description>
<pubDate>Wed, 01 Apr 2020 10:00:00 +0200</pubDate>
<link>https://flink.apache.org/news/2020/04/01/community-update.html</link>
<guid isPermaLink="true">/news/2020/04/01/community-update.html</guid>
</item>
<item>
<title>Flink as Unified Engine for Modern Data Warehousing: Production-Ready Hive Integration</title>
<description>&lt;p&gt;In this blog post, you will learn our motivation behind the Flink-Hive integration, and how Flink 1.10 can help modernize your data warehouse.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#flink-and-its-integration-with-hive-comes-into-the-scene&quot; id=&quot;markdown-toc-flink-and-its-integration-with-hive-comes-into-the-scene&quot;&gt;Flink and Its Integration With Hive Comes into the Scene&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#unified-metadata-management&quot; id=&quot;markdown-toc-unified-metadata-management&quot;&gt;Unified Metadata Management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#stream-processing&quot; id=&quot;markdown-toc-stream-processing&quot;&gt;Stream Processing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#compatible-with-more-hive-versions&quot; id=&quot;markdown-toc-compatible-with-more-hive-versions&quot;&gt;Compatible with More Hive Versions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#reuse-hive-user-defined-functions-udfs&quot; id=&quot;markdown-toc-reuse-hive-user-defined-functions-udfs&quot;&gt;Reuse Hive User Defined Functions (UDFs)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#enhanced-read-and-write-on-hive-data&quot; id=&quot;markdown-toc-enhanced-read-and-write-on-hive-data&quot;&gt;Enhanced Read and Write on Hive Data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#formats&quot; id=&quot;markdown-toc-formats&quot;&gt;Formats&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#more-data-types&quot; id=&quot;markdown-toc-more-data-types&quot;&gt;More Data Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#roadmap&quot; id=&quot;markdown-toc-roadmap&quot;&gt;Roadmap&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;What are some of the latest requirements for your data warehouse and data infrastructure in 2020?&lt;/p&gt;
&lt;p&gt;We’ve come up with some for you.&lt;/p&gt;
&lt;p&gt;Firstly, today’s business is shifting to a more real-time fashion, and thus demands the ability to process online streaming data with low latency for near-real-time or even real-time analytics. People are becoming less and less tolerant of delays between when data is generated and when it arrives at their hands, ready to use. Hours or even days of delay are not acceptable anymore. Users expect minutes, or even seconds, of end-to-end latency for data in their warehouse, to get quicker-than-ever insights.&lt;/p&gt;
&lt;p&gt;Secondly, the infrastructure should be able to handle both offline batch data for offline analytics and exploration, and online streaming data for more timely analytics. Both are indispensable as they both have very valid use cases. Apart from the real-time processing mentioned above, batch processing would still exist as it’s good for ad hoc queries, explorations, and full-size calculations. A modern infrastructure should not force users to choose between one or the other; it should offer both options for a world-class data infrastructure.&lt;/p&gt;
&lt;p&gt;Thirdly, the data players, including data engineers, data scientists, analysts, and operations, are calling for a more unified infrastructure than ever before, for easier ramp-up and higher working efficiency. The big data landscape has been fragmented for years - companies may have one set of infrastructure for real-time processing, one set for batch, one set for OLAP, etc. That, oftentimes, comes as a result of the legacy of the lambda architecture, which was popular in the era when stream processors were not as mature as today and users had to periodically run batch processing as a way to correct streaming pipelines. Well, it’s a different era now! As stream processing becomes mainstream and dominant, end users no longer want to learn scattered sets of skills and maintain many moving parts with all kinds of tools and pipelines. Instead, what they really need is a unified analytics platform that can be mastered easily and that simplifies any operational complexity.&lt;/p&gt;
&lt;p&gt;If any of these resonate with you, you just found the right post to read: by strengthening Flink’s integration with Hive to production grade, we have never been closer to that vision.&lt;/p&gt;
&lt;h2 id=&quot;flink-and-its-integration-with-hive-comes-into-the-scene&quot;&gt;Flink and Its Integration With Hive Comes into the Scene&lt;/h2&gt;
&lt;p&gt;Apache Flink has proven itself as a scalable system for handling extremely high volumes of streaming data at very low latency at many giant tech companies.&lt;/p&gt;
&lt;p&gt;Despite its huge success in the real time processing domain, at its deep root, Flink has been faithfully following its inborn philosophy of being &lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;a unified data processing engine for both batch and streaming&lt;/a&gt;, and taking a streaming-first approach in its architecture to do batch processing. By making batch a special case for streaming, Flink really leverages its cutting edge streaming capabilities and applies them to batch scenarios to gain the best offline performance. Flink’s batch performance has been quite outstanding in the early days and has become even more impressive, as the community started merging Blink, Alibaba’s fork of Flink, back to Flink in 1.9 and finished it in 1.10.&lt;/p&gt;
&lt;p&gt;On the other hand, Apache Hive has established itself as a focal point of the data warehousing ecosystem. It serves not only as a SQL engine for big data analytics and ETL, but also as a data management platform, where data is discovered and defined. As businesses evolve, they put new requirements on the data warehouse.&lt;/p&gt;
&lt;p&gt;Thus we started integrating Flink and Hive as a beta version in Flink 1.9. Over the past few months, we have been listening to users’ requests and feedback, extensively enhancing our product, and running rigorous benchmarks (which will be published soon separately). I’m glad to announce that the integration between Flink and Hive is at production grade in &lt;a href=&quot;https://flink.apache.org/news/2020/02/11/release-1.10.0.html&quot;&gt;Flink 1.10&lt;/a&gt; and we can’t wait to walk you through the details.&lt;/p&gt;
&lt;h3 id=&quot;unified-metadata-management&quot;&gt;Unified Metadata Management&lt;/h3&gt;
&lt;p&gt;Hive Metastore has evolved into the de facto metadata hub over the years in the Hadoop, or even the cloud, ecosystem. Many companies have a single Hive Metastore service instance in production to manage all of their schemas, either Hive or non-Hive metadata, as the single source of truth.&lt;/p&gt;
&lt;p&gt;In 1.9 we introduced Flink’s &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_catalog.html&quot;&gt;HiveCatalog&lt;/a&gt;, connecting Flink to users’ rich metadata pool. The meaning of &lt;code&gt;HiveCatalog&lt;/code&gt; is two-fold here. First, it allows Apache Flink users to utilize Hive Metastore to store and manage Flink’s metadata, including tables, UDFs, and statistics of data. Second, it enables Flink to access Hive’s existing metadata, so that Flink itself can read and write Hive tables.&lt;/p&gt;
&lt;p&gt;In Flink 1.10, users can store Flink’s own tables, views, UDFs, and statistics in Hive Metastore on all of the compatible Hive versions mentioned below. &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_catalog.html#example&quot;&gt;Here’s an end-to-end example&lt;/a&gt; of how to store a Flink Kafka source table in Hive Metastore and later query the table in Flink SQL.&lt;/p&gt;
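&lt;p&gt;As a rough sketch of what this can look like from the Table API side (assuming a &lt;code&gt;HiveCatalog&lt;/code&gt; constructor taking a catalog name, default database and Hive configuration directory; depending on your setup you may also need to pass the Hive version, and the catalog name and paths below are purely illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class HiveCatalogExample {
  public static void main(String[] args) {
    TableEnvironment tableEnv = TableEnvironment.create(
        EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

    // Register a catalog backed by an existing Hive Metastore and make it the current one.
    HiveCatalog hiveCatalog = new HiveCatalog(&amp;quot;myhive&amp;quot;, &amp;quot;default&amp;quot;, &amp;quot;/opt/hive-conf&amp;quot;);
    tableEnv.registerCatalog(&amp;quot;myhive&amp;quot;, hiveCatalog);
    tableEnv.useCatalog(&amp;quot;myhive&amp;quot;);

    // Tables created from here on (for example a Kafka source table defined in SQL)
    // are persisted in the Hive Metastore and can be queried later from Flink SQL.
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;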
&lt;h3 id=&quot;stream-processing&quot;&gt;Stream Processing&lt;/h3&gt;
&lt;p&gt;The Hive integration feature in Flink 1.10 empowers users to re-imagine what they can accomplish with their Hive data and unlock stream processing use cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;join real-time streaming data in Flink with offline Hive data for more complex data processing&lt;/li&gt;
&lt;li&gt;backfill Hive data with Flink directly in a unified fashion&lt;/li&gt;
&lt;li&gt;leverage Flink to move real-time data into Hive more quickly, greatly shortening the end-to-end latency between when data is generated and when it arrives at your data warehouse for analytics, from hours — or even days — to minutes&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;compatible-with-more-hive-versions&quot;&gt;Compatible with More Hive Versions&lt;/h3&gt;
&lt;p&gt;In Flink 1.10, we brought full coverage to most Hive versions including 1.0, 1.1, 1.2, 2.0, 2.1, 2.2, 2.3, and 3.1. Take a look &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/#supported-hive-versions&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;reuse-hive-user-defined-functions-udfs&quot;&gt;Reuse Hive User Defined Functions (UDFs)&lt;/h3&gt;
&lt;p&gt;Users can &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_functions.html#hive-user-defined-functions&quot;&gt;reuse all kinds of Hive UDFs in Flink&lt;/a&gt; since Flink 1.9.&lt;/p&gt;
&lt;p&gt;This is a great win for Flink users with a history in the Hive ecosystem, as they may have developed custom business logic in their Hive UDFs. Being able to run these functions without any rewrite saves users a lot of time and brings them a much smoother experience when they migrate to Flink.&lt;/p&gt;
&lt;p&gt;To take it a step further, Flink 1.10 introduces &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/hive_functions.html#use-hive-built-in-functions-via-hivemodule&quot;&gt;compatibility of Hive built-in functions via HiveModule&lt;/a&gt;. Over the years, the Hive community has developed a few hundred built-in functions that are super handy for users. For those built-in functions that don’t exist in Flink yet, users are now able to leverage the existing Hive built-in functions that they are familiar with and complete their jobs seamlessly.&lt;/p&gt;
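&lt;p&gt;As a hedged sketch, loading these functions could look roughly like the following, assuming the module-loading API introduced with Flink 1.10 and a &lt;code&gt;HiveModule&lt;/code&gt; constructed with the Hive version in use (the module name and version string here are illustrative):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.module.hive.HiveModule;

public class HiveModuleExample {
  public static void main(String[] args) {
    TableEnvironment tableEnv = TableEnvironment.create(
        EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

    // Make Hive built-in functions resolvable in addition to the functions built into Flink.
    tableEnv.loadModule(&amp;quot;hive&amp;quot;, new HiveModule(&amp;quot;2.3.4&amp;quot;));
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;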
&lt;h3 id=&quot;enhanced-read-and-write-on-hive-data&quot;&gt;Enhanced Read and Write on Hive Data&lt;/h3&gt;
&lt;p&gt;Flink 1.10 extends its read and write capabilities on Hive data to all the common use cases with better performance.&lt;/p&gt;
&lt;p&gt;On the reading side, Flink can now read Hive regular tables, partitioned tables, and views. Lots of optimization techniques have been developed around reading, including partition pruning and projection pushdown to transport less data from file storage, limit pushdown for faster experimentation and exploration, and a vectorized reader for ORC files.&lt;/p&gt;
&lt;p&gt;On the writing side, Flink 1.10 introduces “INSERT INTO” and “INSERT OVERWRITE” to its syntax, and can write to not only Hive’s regular tables, but also partitioned tables with either static or dynamic partitions.&lt;/p&gt;
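&lt;p&gt;Assuming a &lt;code&gt;TableEnvironment&lt;/code&gt; with a &lt;code&gt;HiveCatalog&lt;/code&gt; registered (as in the earlier sketch), writing into Hive tables could look roughly like this in Flink 1.10, using &lt;code&gt;sqlUpdate&lt;/code&gt; to submit the statements; the table, column and partition names are illustrative:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// tableEnv is the TableEnvironment with the HiveCatalog registered, as in the sketch above.

// Append into a regular Hive table.
tableEnv.sqlUpdate(&amp;quot;INSERT INTO orders_archive SELECT * FROM orders_staging&amp;quot;);

// Overwrite a static partition of a partitioned Hive table.
tableEnv.sqlUpdate(
    &amp;quot;INSERT OVERWRITE orders PARTITION (dt='2020-03-27') &amp;quot; +
    &amp;quot;SELECT order_id, amount FROM orders_staging WHERE dt='2020-03-27'&amp;quot;);

tableEnv.execute(&amp;quot;write-to-hive&amp;quot;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;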
&lt;h3 id=&quot;formats&quot;&gt;Formats&lt;/h3&gt;
&lt;p&gt;Your engine should be able to handle all common types of file formats to give you the freedom of choosing one over another in order to fit your business needs. It’s no exception for Flink. We have tested the following table storage formats: text, csv, SequenceFile, ORC, and Parquet.&lt;/p&gt;
&lt;h3 id=&quot;more-data-types&quot;&gt;More Data Types&lt;/h3&gt;
&lt;p&gt;In Flink 1.10, we added support for a few more frequently-used Hive data types that were not covered by Flink 1.9. Flink users now should have a full, smooth experience to query and manipulate Hive data from Flink.&lt;/p&gt;
&lt;h3 id=&quot;roadmap&quot;&gt;Roadmap&lt;/h3&gt;
&lt;p&gt;Integration between any two systems is a never-ending story.&lt;/p&gt;
&lt;p&gt;We are constantly improving Flink itself, and the Flink-Hive integration keeps improving as we collect user feedback and work with folks in this vibrant community.&lt;/p&gt;
&lt;p&gt;After careful consideration and prioritization of the feedback we received, we have prioritized many of the requests below for the next Flink release, 1.11.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hive streaming sink so that Flink can stream data into Hive tables, bringing a real streaming experience to Hive&lt;/li&gt;
&lt;li&gt;Native Parquet reader for better performance&lt;/li&gt;
&lt;li&gt;Additional interoperability - support creating Hive tables, views, functions in Flink&lt;/li&gt;
&lt;li&gt;Better out-of-the-box experience with built-in dependencies, including documentation&lt;/li&gt;
&lt;li&gt;JDBC driver so that users can reuse their existing toolings to run SQL jobs on Flink&lt;/li&gt;
&lt;li&gt;Hive syntax and semantic compatible mode&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you have more feature requests or discover bugs, please reach out to the community through the mailing list and JIRA.&lt;/p&gt;
&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;
&lt;p&gt;Data warehousing is shifting to a more real-time fashion, and Apache Flink can make a difference for your organization in this space.&lt;/p&gt;
&lt;p&gt;Flink 1.10 brings production-ready Hive integration and empowers users to achieve more in both metadata management and unified streaming/batch data processing.&lt;/p&gt;
&lt;p&gt;We encourage all our users to get their hands on Flink 1.10. You are very welcome to join the community in development, discussions, and all other kinds of collaborations in this topic.&lt;/p&gt;
</description>
<pubDate>Fri, 27 Mar 2020 03:30:00 +0100</pubDate>
<link>https://flink.apache.org/features/2020/03/27/flink-for-data-warehouse.html</link>
<guid isPermaLink="true">/features/2020/03/27/flink-for-data-warehouse.html</guid>
</item>
<item>
<title>Advanced Flink Application Patterns Vol.2: Dynamic Updates of Application Logic</title>
<description>&lt;p&gt;In the &lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html&quot;&gt;first article&lt;/a&gt; of the series, we gave a high-level description of the objectives and required functionality of a Fraud Detection engine. We also described how to make data partitioning in Apache Flink customizable based on modifiable rules instead of using a hardcoded &lt;code&gt;KeysExtractor&lt;/code&gt; implementation.&lt;/p&gt;
&lt;p&gt;We intentionally omitted details of how the applied rules are initialized and what possibilities exist for updating them at runtime. In this post, we will address exactly these details. You will learn how the approach to data partitioning described in &lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html&quot;&gt;Part 1&lt;/a&gt; can be applied in combination with a dynamic configuration. These two patterns, when used together, can eliminate the need to recompile the code and redeploy your Flink job for a wide range of modifications of the business logic.&lt;/p&gt;
&lt;h2 id=&quot;rules-broadcasting&quot;&gt;Rules Broadcasting&lt;/h2&gt;
&lt;p&gt;Let’s first have a look at the &lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html#dynamic-data-partitioning&quot;&gt;previously-defined&lt;/a&gt; data-processing pipeline:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;alerts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;transactions&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;DynamicKeyFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keyBy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getKey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;DynamicAlertFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;DynamicKeyFunction&lt;/code&gt; provides dynamic data partitioning while &lt;code&gt;DynamicAlertFunction&lt;/code&gt; is responsible for executing the main logic of processing transactions and sending alert messages according to defined rules.&lt;/p&gt;
&lt;p&gt;Vol.1 of this series simplified the use case and assumed that the applied set of rules is pre-initialized and accessible via the &lt;code&gt;List&amp;lt;Rule&amp;gt;&lt;/code&gt; within &lt;code&gt;DynamicKeyFunction&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;DynamicKeyFunction&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;cm&quot;&gt;/* Simplified */&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rules&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;cm&quot;&gt;/* Rules that are initialized somehow.*/&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Adding rules to this list is obviously possible directly inside the code of the Flink Job at the stage of its initialization (create a &lt;code&gt;List&lt;/code&gt; object and use its &lt;code&gt;add&lt;/code&gt; method). A major drawback of doing so is that it will require recompilation of the job with each rule modification. In a real Fraud Detection system, rules are expected to change on a frequent basis, making this approach unacceptable from the point of view of business and operational requirements. A different approach is needed.&lt;/p&gt;
&lt;p&gt;Next, let’s take a look at a sample rule definition that we introduced in the previous post of the series:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/rule-dsl.png&quot; width=&quot;800px&quot; alt=&quot;Figure 1: Rule definition&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 1: Rule definition&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The previous post covered use of &lt;code&gt;groupingKeyNames&lt;/code&gt; by &lt;code&gt;DynamicKeyFunction&lt;/code&gt; to extract message keys. Parameters from the second part of this rule are used by &lt;code&gt;DynamicAlertFunction&lt;/code&gt;: they define the actual logic of the performed operations and their parameters (such as the alert-triggering limit). This means that the same rule must be present in both &lt;code&gt;DynamicKeyFunction&lt;/code&gt; and &lt;code&gt;DynamicAlertFunction&lt;/code&gt;. To achieve this result, we will use the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/stream/state/broadcast_state.html&quot;&gt;broadcast data distribution mechanism&lt;/a&gt; of Apache Flink.&lt;/p&gt;
&lt;p&gt;Figure 2 presents the final job graph of the system that we are building:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/job-graph.png&quot; width=&quot;800px&quot; alt=&quot;Figure 2: Job Graph of the Fraud Detection Flink Job&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 2: Job Graph of the Fraud Detection Flink Job&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The main blocks of the Transactions processing pipeline are:&lt;br /&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Transaction Source&lt;/strong&gt; that consumes transaction messages from Kafka partitions in parallel. &lt;br /&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Key Function&lt;/strong&gt; that performs data enrichment with a dynamic key. The subsequent &lt;code&gt;keyBy&lt;/code&gt; hashes this dynamic key and partitions the data accordingly among all parallel instances of the following operator.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Alert Function&lt;/strong&gt; that accumulates a data window and creates Alerts based on it.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;data-exchange-inside-apache-flink&quot;&gt;Data Exchange inside Apache Flink&lt;/h2&gt;
&lt;p&gt;The job graph above also indicates various data exchange patterns between the operators. In order to understand how the broadcast pattern works, let’s take a short detour and discuss what methods of message propagation exist in Apache Flink’s distributed runtime.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;FORWARD&lt;/strong&gt; connection after the Transaction Source means that all data consumed by one of the parallel instances of the Transaction Source operator is transferred to exactly one instance of the subsequent &lt;code&gt;DynamicKeyFunction&lt;/code&gt; operator. It also indicates the same level of parallelism of the two connected operators (12 in the above case). This communication pattern is illustrated in Figure 3. Orange circles represent transactions, and dotted rectangles depict parallel instances of the conjoined operators.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/forward.png&quot; width=&quot;800px&quot; alt=&quot;Figure 3: FORWARD message passing across operator instances&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 3: FORWARD message passing across operator instances&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;HASH&lt;/strong&gt; connection between &lt;code&gt;DynamicKeyFunction&lt;/code&gt; and &lt;code&gt;DynamicAlertFunction&lt;/code&gt; means that for each message a hash code is calculated and messages are evenly distributed among available parallel instances of the next operator. Such a connection needs to be explicitly “requested” from Flink by using &lt;code&gt;keyBy&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/hash.png&quot; width=&quot;800px&quot; alt=&quot;Figure 4: HASHED message passing across operator instances (via `keyBy`)&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 4: HASHED message passing across operator instances (via `keyBy`)&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;REBALANCE&lt;/strong&gt; distribution is either caused by an explicit call to &lt;code&gt;rebalance()&lt;/code&gt; or by a change of parallelism (12 -&amp;gt; 1 in the case of the job graph from Figure 2). Calling &lt;code&gt;rebalance()&lt;/code&gt; causes data to be repartitioned in a round-robin fashion and can help to mitigate data skew in certain scenarios.&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/rebalance.png&quot; width=&quot;800px&quot; alt=&quot;Figure 5: REBALANCE message passing across operator instances&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 5: REBALANCE message passing across operator instances&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;The Fraud Detection job graph in Figure 2 contains an additional data source: &lt;em&gt;Rules Source&lt;/em&gt;. It also consumes from Kafka. Rules are “mixed into” the main processing data flow through the &lt;strong&gt;BROADCAST&lt;/strong&gt; channel. Unlike other methods of transmitting data between operators, such as &lt;code&gt;forward&lt;/code&gt;, &lt;code&gt;hash&lt;/code&gt; or &lt;code&gt;rebalance&lt;/code&gt; that make each message available for processing in only one of the parallel instances of the receiving operator, &lt;code&gt;broadcast&lt;/code&gt; makes each message available at the input of all of the parallel instances of the operator to which the &lt;em&gt;broadcast stream&lt;/em&gt; is connected. This makes &lt;code&gt;broadcast&lt;/code&gt; applicable to a wide range of tasks that need to affect the processing of all messages, regardless of their key or source partition.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/patterns-blog-2/broadcast.png&quot; width=&quot;800px&quot; alt=&quot;Figure 6: BROADCAST message passing across operator instances&quot; /&gt;
&lt;br /&gt;
&lt;i&gt;&lt;small&gt;Figure 6: BROADCAST message passing across operator instances&lt;/small&gt;&lt;/i&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
&lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
There are actually a few more specialized data partitioning schemes in Flink which we did not mention here. If you want to find out more, please refer to Flink’s documentation on &lt;strong&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/stream/operators/#physical-partitioning&quot;&gt;stream partitioning&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;broadcast-state-pattern&quot;&gt;Broadcast State Pattern&lt;/h2&gt;
&lt;p&gt;In order to make use of the Rules Source, we need to “connect” it to the main data stream:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// Streams setup&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;transactions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[...]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rulesUpdateStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[...]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BroadcastStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rulesStream&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rulesUpdateStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;broadcast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RULES_STATE_DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Processing pipeline setup&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Alert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;alerts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;transactions&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rulesStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;DynamicKeyFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;keyBy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getKey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rulesStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;DynamicAlertFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, the broadcast stream can be created from any regular stream by calling the &lt;code&gt;broadcast&lt;/code&gt; method and specifying a state descriptor. Flink assumes that broadcasted data needs to be stored and retrieved while processing events of the main data flow and, therefore, always automatically creates a corresponding &lt;em&gt;broadcast state&lt;/em&gt; from this state descriptor. This is different from other Apache Flink state types, which you need to initialize in the &lt;code&gt;open()&lt;/code&gt; method of the processing function. Also note that broadcast state always has a key-value format (&lt;code&gt;MapState&lt;/code&gt;).&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MapStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RULES_STATE_DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MapStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;rules&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Connecting to &lt;code&gt;rulesStream&lt;/code&gt; causes some changes in the signature of the processing functions. The previous article presented it in a slightly simplified way as a &lt;code&gt;ProcessFunction&lt;/code&gt;. However, &lt;code&gt;DynamicKeyFunction&lt;/code&gt; is actually a &lt;code&gt;BroadcastProcessFunction&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;BroadcastProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IN1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IN2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IN1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ReadOnlyContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processBroadcastElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IN2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The difference is the addition of the &lt;code&gt;processBroadcastElement&lt;/code&gt; method through which messages of the rules stream will arrive. The following new version of &lt;code&gt;DynamicKeyFunction&lt;/code&gt; allows modifying the list of data-distribution keys at runtime through this stream:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;DynamicKeyFunction&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BroadcastProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processBroadcastElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getBroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RULES_STATE_DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;broadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRuleId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ReadOnlyContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ReadOnlyBroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rulesState&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getBroadcastState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RULES_STATE_DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Entry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;entry&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rulesState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;immutableEntries&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rule&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;entry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Keyed&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeysExtractor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getKey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getGroupingKeyNames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRuleId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the above code, &lt;code&gt;processElement()&lt;/code&gt; receives Transactions, and &lt;code&gt;processBroadcastElement()&lt;/code&gt; receives Rule updates. When a new rule is created, it is distributed as depicted in Figure 6 and saved in all parallel instances of the operator by &lt;code&gt;processBroadcastElement()&lt;/code&gt;. We use a Rule’s ID as the key to store and reference individual rules. Instead of iterating over a hardcoded &lt;code&gt;List&amp;lt;Rule&amp;gt;&lt;/code&gt;, we iterate over the entries in the dynamically-updated broadcast state.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;DynamicAlertFunction&lt;/code&gt; follows the same logic with respect to storing the rules in the broadcast &lt;code&gt;MapState&lt;/code&gt;. As described in &lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html&quot;&gt;Part 1&lt;/a&gt;, each message in the &lt;code&gt;processElement&lt;/code&gt; input is intended to be processed by one specific rule and comes “pre-marked” with a corresponding ID by &lt;code&gt;DynamicKeyFunction&lt;/code&gt;. All we need to do is retrieve the definition of the corresponding rule from &lt;code&gt;BroadcastState&lt;/code&gt; by using the provided ID and process it according to the logic required by that rule. At this stage, we will also add messages to the internal function state in order to perform calculations on the required time window of data. We will consider how this is done in the &lt;a href=&quot;/news/2020/07/30/demo-fraud-detection-3.html&quot;&gt;final blog&lt;/a&gt; of the series about Fraud Detection.&lt;/p&gt;
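&lt;p&gt;To make this more tangible, below is a minimal sketch (not the exact implementation from the sample project) of how such a function could look up its rule from the broadcast state. It assumes that the &lt;code&gt;Keyed&lt;/code&gt; wrapper exposes the rule ID via a &lt;code&gt;getId()&lt;/code&gt; method; the actual rule evaluation is omitted.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Minimal sketch only: everything except the Flink API is an illustrative assumption.
public class DynamicAlertFunction
    extends KeyedBroadcastProcessFunction&amp;lt;String, Keyed&amp;lt;Transaction, String, Integer&amp;gt;, Rule, Alert&amp;gt; {

  @Override
  public void processBroadcastElement(Rule rule, Context ctx, Collector&amp;lt;Alert&amp;gt; out) throws Exception {
    // Same pattern as in DynamicKeyFunction: store the rule under its ID.
    ctx.getBroadcastState(RULES_STATE_DESCRIPTOR).put(rule.getRuleId(), rule);
  }

  @Override
  public void processElement(Keyed&amp;lt;Transaction, String, Integer&amp;gt; value,
                             ReadOnlyContext ctx,
                             Collector&amp;lt;Alert&amp;gt; out) throws Exception {
    // The element is pre-marked with the ID of the rule it should be evaluated against.
    Rule rule = ctx.getBroadcastState(RULES_STATE_DESCRIPTOR).get(value.getId());
    if (rule != null) {
      // Evaluate the rule over the windowed function state and emit an Alert if it fires.
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;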
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In this blog post, we continued our investigation of the use case of a Fraud Detection System built with Apache Flink. We looked into different ways in which data can be distributed between parallel operator instances and, most importantly, examined broadcast state. We demonstrated how dynamic partitioning — a pattern described in the &lt;a href=&quot;https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html&quot;&gt;first part&lt;/a&gt; of the series — can be combined and enhanced by the functionality provided by the broadcast state pattern. The ability to send dynamic updates at runtime is a powerful feature of Apache Flink that is applicable in a variety of other use cases, such as controlling state (cleanup/insert/fix), running A/B experiments or executing updates of ML model coefficients.&lt;/p&gt;
</description>
<pubDate>Tue, 24 Mar 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html</link>
<guid isPermaLink="true">/news/2020/03/24/demo-fraud-detection-2.html</guid>
</item>
<item>
<title>Apache Beam: How Beam Runs on Top of Flink</title>
<description>&lt;p&gt;Note: This blog post is based on the talk &lt;a href=&quot;https://www.youtube.com/watch?v=hxHGLrshnCY&quot;&gt;“Beam on Flink: How Does It Actually Work?”&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://flink.apache.org/&quot;&gt;Apache Flink&lt;/a&gt; and &lt;a href=&quot;https://beam.apache.org/&quot;&gt;Apache Beam&lt;/a&gt; are open-source frameworks for parallel, distributed data processing at scale. Unlike Flink, Beam does not come with a full-blown execution engine of its own but plugs into other execution engines, such as Apache Flink, Apache Spark, or Google Cloud Dataflow. In this blog post we discuss the reasons to use Flink together with Beam for your batch and stream processing needs. We also take a closer look at how Beam works with Flink to provide an idea of the technical aspects of running Beam pipelines with Flink. We hope you find some useful information on how and why the two frameworks can be utilized in combination. For more information, you can refer to the corresponding &lt;a href=&quot;https://beam.apache.org/documentation/runners/flink/&quot;&gt;documentation&lt;/a&gt; on the Beam website or contact the community through the &lt;a href=&quot;https://beam.apache.org/community/contact-us/&quot;&gt;Beam mailing list&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;what-is-apache-beam&quot;&gt;What is Apache Beam&lt;/h1&gt;
&lt;p&gt;&lt;a href=&quot;https://beam.apache.org/&quot;&gt;Apache Beam&lt;/a&gt; is an open-source, unified model for defining batch and streaming data-parallel processing pipelines. It is unified in the sense that you use a single API, in contrast to using separate APIs for batch and streaming, as is the case in Flink. Beam was originally developed by Google, which released it in 2014 as the Cloud Dataflow SDK. In 2016, it was donated to &lt;a href=&quot;https://www.apache.org/&quot;&gt;the Apache Software Foundation&lt;/a&gt; under the name Beam. It has been developed by the open-source community ever since. With Apache Beam, developers can write data processing jobs, also known as pipelines, in multiple languages, e.g. Java, Python, Go, SQL. A pipeline is then executed by one of Beam’s Runners. A Runner is responsible for translating Beam pipelines such that they can run on an execution engine. Every supported execution engine has a Runner. The following Runners are available: Apache Flink, Apache Spark, Apache Samza, Hazelcast Jet, Google Cloud Dataflow, and others.&lt;/p&gt;
&lt;p&gt;The execution model of Apache Beam, as well as its API, is similar to Flink’s. Both frameworks are inspired by the &lt;a href=&quot;https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf&quot;&gt;MapReduce&lt;/a&gt;, &lt;a href=&quot;https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41378.pdf&quot;&gt;MillWheel&lt;/a&gt;, and &lt;a href=&quot;https://research.google/pubs/pub43864/&quot;&gt;Dataflow&lt;/a&gt; papers. Like Flink, Beam is designed for parallel, distributed data processing. Both have similar transformations, support for windowing, event/processing time, watermarks, timers, triggers, and much more. However, since Beam is not a full runtime itself, it focuses on providing the framework for building portable, multi-language batch and stream processing pipelines that can run across several execution engines. The idea is that you write your pipeline once and feed it with either batch or streaming data. When you run it, you just pick one of the supported backends to execute. A large integration test suite in Beam called “ValidatesRunner” ensures that the results will be the same, regardless of which backend you choose for execution.&lt;/p&gt;
&lt;p&gt;One of the most exciting developments in the Beam technology is the framework’s support for multiple programming languages including Java, Python, Go, Scala and SQL. Essentially, developers can write their applications in a programming language of their choice. Beam, with the help of the Runners, translates the program to one of the execution engines, as shown in the diagram below.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/flink-runner-beam-beam-vision.png&quot; width=&quot;600px&quot; alt=&quot;The vision of Apache Beam&quot; /&gt;
&lt;/center&gt;
&lt;h1 id=&quot;reasons-to-use-beam-with-flink&quot;&gt;Reasons to use Beam with Flink&lt;/h1&gt;
&lt;p&gt;Why would you want to use Beam with Flink instead of directly using Flink? Ultimately, Beam and Flink complement each other and provide additional value to the user. The main reasons for using Beam with Flink are the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Beam provides a unified API for both batch and streaming scenarios.&lt;/li&gt;
&lt;li&gt;Beam comes with native support for different programming languages, like Python or Go with all their libraries like Numpy, Pandas, Tensorflow, or TFX.&lt;/li&gt;
&lt;li&gt;You get the power of Apache Flink like its exactly-once semantics, strong memory management and robustness.&lt;/li&gt;
&lt;li&gt;Beam programs run on your existing Flink infrastructure or infrastructure for other supported Runners, like Spark or Google Cloud Dataflow.&lt;/li&gt;
&lt;li&gt;You get additional features like side inputs and cross-language pipelines that are not supported natively in Flink but only supported when using Beam with Flink.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;the-flink-runner-in-beam&quot;&gt;The Flink Runner in Beam&lt;/h1&gt;
&lt;p&gt;The Flink Runner in Beam translates Beam pipelines into Flink jobs. The translation can be parameterized using Beam’s pipeline options, which control settings such as the job name, parallelism, checkpointing, or metrics reporting.&lt;/p&gt;
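&lt;p&gt;As a rough, hedged illustration (the job name, master address and parallelism below are placeholders, not prescribed values), these options can be set programmatically before creating the pipeline:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: configuring the Flink Runner through Beam pipeline options.
FlinkPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
    .withValidation()
    .as(FlinkPipelineOptions.class);
options.setRunner(FlinkRunner.class);
options.setJobName(&amp;quot;beam-on-flink-example&amp;quot;); // illustrative job name
options.setParallelism(4);                      // placeholder parallelism
options.setFlinkMaster(&amp;quot;localhost:8081&amp;quot;);      // placeholder Flink master address

Pipeline pipeline = Pipeline.create(options);
// ... apply transforms here ...
pipeline.run().waitUntilFinish();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;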
&lt;p&gt;If you are familiar with a DataSet or a DataStream, you will have no problems understanding what a PCollection is. PCollection stands for parallel collection in Beam and is exactly what DataSet/DataStream would be in Flink. Due to Beam’s unified API, there is only one result type for all transformations: the PCollection.&lt;/p&gt;
&lt;p&gt;Beam pipelines are composed of transforms. Transforms are like operators in Flink and come in two flavors: primitive and composite transforms. The beauty of all this is that Beam only comes with a small set of primitive transforms which are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Source&lt;/code&gt; (for loading data)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ParDo&lt;/code&gt; (think of a flat map operator on steroids)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GroupByKey&lt;/code&gt; (think of keyBy() in Flink)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AssignWindows&lt;/code&gt; (windows can be assigned at any point in time in Beam)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Flatten&lt;/code&gt; (like a union() operation in Flink)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Composite transforms are built by combining the above primitive transforms. For example, &lt;code&gt;Combine = GroupByKey + ParDo&lt;/code&gt;.&lt;/p&gt;
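&lt;p&gt;To illustrate what that means in practice, here is a hedged sketch of a composite transform that sums values per key using nothing but the primitives listed above (&lt;code&gt;GroupByKey&lt;/code&gt; and &lt;code&gt;ParDo&lt;/code&gt;); the class name is purely illustrative:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative composite transform built only from primitive transforms.
static class SumPerKey
    extends PTransform&amp;lt;PCollection&amp;lt;KV&amp;lt;String, Long&amp;gt;&amp;gt;, PCollection&amp;lt;KV&amp;lt;String, Long&amp;gt;&amp;gt;&amp;gt; {

  @Override
  public PCollection&amp;lt;KV&amp;lt;String, Long&amp;gt;&amp;gt; expand(PCollection&amp;lt;KV&amp;lt;String, Long&amp;gt;&amp;gt; input) {
    return input
        .apply(GroupByKey.&amp;lt;String, Long&amp;gt;create())
        .apply(ParDo.of(new DoFn&amp;lt;KV&amp;lt;String, Iterable&amp;lt;Long&amp;gt;&amp;gt;, KV&amp;lt;String, Long&amp;gt;&amp;gt;() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            long sum = 0;
            for (Long value : c.element().getValue()) {
              sum += value;
            }
            c.output(KV.of(c.element().getKey(), sum));
          }
        }));
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;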
&lt;h1 id=&quot;flink-runner-internals&quot;&gt;Flink Runner Internals&lt;/h1&gt;
&lt;p&gt;Although using the Flink Runner in Beam does not require understanding its internals, we provide more details of how the Flink runner works to show how the two frameworks integrate and work together to provide state-of-the-art streaming data pipelines.&lt;/p&gt;
&lt;p&gt;The Flink Runner has two translation paths. Depending on whether we execute in batch or streaming mode, the Runner either translates into Flink’s DataSet or into Flink’s DataStream API. Since multi-language support has been added to Beam, another two translation paths have been added. To summarize the four modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The Classic Flink Runner for batch jobs:&lt;/strong&gt; Executes batch Java pipelines&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Classic Flink Runner for streaming jobs:&lt;/strong&gt; Executes streaming Java pipelines&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Portable Flink Runner for batch jobs:&lt;/strong&gt; Executes Java as well as Python, Go and other supported SDK pipelines for batch scenarios&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Portable Flink Runner for streaming jobs:&lt;/strong&gt; Executes Java as well as Python, Go and other supported SDK pipelines for streaming scenarios&lt;/li&gt;
&lt;/ol&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/flink-runner-beam-runner-translation-paths.png&quot; width=&quot;300px&quot; alt=&quot;The 4 translation paths in the Beam&#39;s Flink Runner&quot; /&gt;
&lt;/center&gt;
&lt;h2 id=&quot;the-classic-flink-runner-in-beam&quot;&gt;The “Classic” Flink Runner in Beam&lt;/h2&gt;
&lt;p&gt;The classic Flink Runner was the initial version of the Runner, hence the “classic” name. Beam pipelines are represented as a graph in Java, composed of the aforementioned composite and primitive transforms. Beam provides translators which traverse the graph in topological order, meaning that we start from all the sources as we iterate through the graph. For each transform encountered in the graph, the Flink Runner generates the API calls you would normally make when writing a Flink job.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/classic-flink-runner-beam.png&quot; width=&quot;600px&quot; alt=&quot;The Classic Flink Runner in Beam&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;While Beam and Flink share very similar concepts, there are enough differences between the two frameworks that Beam pipelines cannot be translated 1:1 into a Flink program. The following sections present the key differences:&lt;/p&gt;
&lt;h3 id=&quot;serializers-vs-coders&quot;&gt;Serializers vs Coders&lt;/h3&gt;
&lt;p&gt;When data is transferred over the wire in Flink, it has to be turned into bytes. This is done with the help of serializers. Flink has a type system to instantiate the correct serializer for a given type, e.g. &lt;code&gt;StringTypeSerializer&lt;/code&gt; for a String. Apache Beam also has its own type system, which is similar to Flink’s but uses slightly different interfaces. Serializers are called Coders in Beam. In order to make a Beam Coder run in Flink, we have to make the two serializer types compatible. This is done by creating a special Flink type information that looks like a native one but delegates to the appropriate Beam coder. That way, we can use Beam’s coders although we are executing the Beam job with Flink. Flink operators expect a TypeInformation, e.g. &lt;code&gt;StringTypeInformation&lt;/code&gt;, for which Beam supplies a &lt;code&gt;CoderTypeInformation&lt;/code&gt;. Where the type information would normally return a Flink serializer, it returns a &lt;code&gt;CoderTypeSerializer&lt;/code&gt;, which calls the underlying Beam Coder.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/flink-runner-beam-serializers-coders.png&quot; width=&quot;300px&quot; alt=&quot;Serializers vs Coders&quot; /&gt;
&lt;/center&gt;
&lt;h3 id=&quot;read&quot;&gt;Read&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;Read&lt;/code&gt; transform provides a way to read data into your pipeline in Beam. The Read transform is supported by two wrappers in Beam, the &lt;code&gt;SourceInputFormat&lt;/code&gt; for batch processing and the &lt;code&gt;UnboundedSourceWrapper&lt;/code&gt; for stream processing.&lt;/p&gt;
&lt;h3 id=&quot;pardo&quot;&gt;ParDo&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ParDo&lt;/code&gt; is the swiss army knife of Beam and can be compared to a &lt;code&gt;RichFlatMapFunction&lt;/code&gt; in Flink, with additional features such as &lt;code&gt;SideInputs&lt;/code&gt;, &lt;code&gt;SideOutputs&lt;/code&gt;, State and Timers. For batch processing, the Flink runner translates &lt;code&gt;ParDo&lt;/code&gt; using the &lt;code&gt;FlinkDoFnFunction&lt;/code&gt;, or the &lt;code&gt;FlinkStatefulDoFnFunction&lt;/code&gt; when state is involved. For streaming scenarios, the translation uses the &lt;code&gt;DoFnOperator&lt;/code&gt;, which takes care of checkpointing and buffering of data during checkpoints, watermark emission, and maintenance of state and timers. All of this is driven by a Beam interface called the &lt;code&gt;DoFnRunner&lt;/code&gt; that encapsulates Beam-specific execution logic, like retrieving state, executing state and timers, or reporting metrics.&lt;/p&gt;
&lt;h3 id=&quot;side-inputs&quot;&gt;Side Inputs&lt;/h3&gt;
&lt;p&gt;In addition to the main input, ParDo transforms can have a number of side inputs. A side input can be a static set of data that you want to have available at all parallel instances. However, it is more flexible than that. You can have keyed and even windowed side inputs, which are updated per window. This is a very powerful concept which does not exist in Flink but is added on top of Flink using Beam.&lt;/p&gt;
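&lt;p&gt;As a hedged sketch (the &lt;code&gt;Payment&lt;/code&gt; type and its accessors are hypothetical), a side input is first turned into a &lt;code&gt;PCollectionView&lt;/code&gt; and then attached to a &lt;code&gt;ParDo&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: broadcasting a map of exchange rates as a side input.
final PCollectionView&amp;lt;Map&amp;lt;String, Double&amp;gt;&amp;gt; ratesView =
    rates.apply(View.&amp;lt;String, Double&amp;gt;asMap()); // rates is a PCollection&amp;lt;KV&amp;lt;String, Double&amp;gt;&amp;gt;

PCollection&amp;lt;String&amp;gt; enriched = payments.apply(
    ParDo.of(new DoFn&amp;lt;Payment, String&amp;gt;() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        // The side input is read at processing time, per window of the main input.
        Double rate = c.sideInput(ratesView).get(c.element().getCurrency());
        c.output(c.element().getId() + &amp;quot;: &amp;quot; + rate);
      }
    }).withSideInputs(ratesView));&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;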
&lt;h3 id=&quot;assignwindows&quot;&gt;AssignWindows&lt;/h3&gt;
&lt;p&gt;In Flink, windows are assigned by the &lt;code&gt;WindowOperator&lt;/code&gt; when you use &lt;code&gt;window()&lt;/code&gt; in the API. In Beam, windows can be assigned at any point in time. Any element is implicitly part of a window. If no window is assigned explicitly, the element is part of the &lt;code&gt;GlobalWindow&lt;/code&gt;. Window information is stored for each element in a wrapper called &lt;code&gt;WindowedValue&lt;/code&gt;. The window information is only used once we issue a &lt;code&gt;GroupByKey&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&quot;groupbykey&quot;&gt;GroupByKey&lt;/h3&gt;
&lt;p&gt;Most of the time it is useful to partition the data by a key. In Flink, this is done via the &lt;code&gt;keyBy()&lt;/code&gt; API call. In Beam, the &lt;code&gt;GroupByKey&lt;/code&gt; transform can only be applied if the input is of the form &lt;code&gt;KV&amp;lt;Key, Value&amp;gt;&lt;/code&gt;. Unlike Flink, where the key can even be nested inside the data, Beam requires the key to always be explicit. The &lt;code&gt;GroupByKey&lt;/code&gt; transform then groups the data by key and by window, which is similar to what &lt;code&gt;keyBy(..).window(..)&lt;/code&gt; would give us in Flink. Beam has its own libraries to do that because it also has its own window functions and triggers. Essentially, GroupByKey is very similar to what the WindowOperator does in Flink.&lt;/p&gt;
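&lt;p&gt;A hedged sketch of the Beam side of that comparison (placeholder names; the input is assumed to already be a &lt;code&gt;PCollection&lt;/code&gt; of &lt;code&gt;KV&lt;/code&gt; pairs):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: explicit KV keys, window assignment, then GroupByKey.
// The grouping happens by key AND by the previously assigned window,
// roughly what keyBy(..).window(..) gives us in Flink.
PCollection&amp;lt;KV&amp;lt;String, Iterable&amp;lt;Long&amp;gt;&amp;gt;&amp;gt; grouped =
    input
        .apply(Window.&amp;lt;KV&amp;lt;String, Long&amp;gt;&amp;gt;into(FixedWindows.of(Duration.standardMinutes(5))))
        .apply(GroupByKey.&amp;lt;String, Long&amp;gt;create());&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;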
&lt;h3 id=&quot;flatten&quot;&gt;Flatten&lt;/h3&gt;
&lt;p&gt;The Flatten operator takes multiple DataSet/DataStreams, called P[arallel]Collections in Beam, and combines them into one collection. This is equivalent to Flink’s &lt;code&gt;union()&lt;/code&gt; operation.&lt;/p&gt;
&lt;h2 id=&quot;the-portable-flink-runner-in-beam&quot;&gt;The “Portable” Flink Runner in Beam&lt;/h2&gt;
&lt;p&gt;The portable Flink Runner in Beam is the evolution of the classic Runner. Classic Runners are tied to the JVM ecosystem, but the Beam community wanted to move past this and also execute Python, Go and other languages. This adds another dimension to Beam in terms of portability because, as previously mentioned, Beam already had portability across execution engines. It was necessary to change the translation logic of the Runner to be able to support language portability.&lt;/p&gt;
&lt;p&gt;There are two important building blocks for portable Runners:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A common pipeline format across all the languages: The Runner API&lt;/li&gt;
&lt;li&gt;A common interface during execution for the communication between the Runner and the code written in any language: The Fn API&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Runner API provides a universal representation of the pipeline as Protobuf which contains the transforms, types, and user code. Protobuf was chosen as the format because every language has libraries available for it. Similarly, for the execution part, Beam introduced the Fn API interface to handle the communication between the Runner/execution engine and the user code that may be written in a different language and executes in a different process. Fn API is pronounced “fun API”; you may guess why.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/flink-runner-beam-language-portability.png&quot; width=&quot;600px&quot; alt=&quot;Language Portability in Apache Beam&quot; /&gt;
&lt;/center&gt;
&lt;h2 id=&quot;how-are-beam-programs-translated-in-language-portability&quot;&gt;How Are Beam Programs Translated In Language Portability?&lt;/h2&gt;
&lt;p&gt;Users write their Beam pipelines in one language, but they may get executed in an environment based on a completely different language. How does that work? To explain that, let’s follow the lifecycle of a pipeline. Let’s suppose we use the Python SDK to write the pipeline. Before submitting the pipeline via the Job API to Beam’s JobServer, Beam would convert it to the Runner API, the language-agnostic format we described before. The JobServer is also a Beam component that handles the staging of the required dependencies during execution. The JobServer then kicks off the translation, which is similar to that of the classic Runner. However, an important change is the so-called &lt;code&gt;ExecutableStage&lt;/code&gt; transform. It is essentially the ParDo transform that we already know, but designed to hold language-dependent code. Beam tries to combine as many of these transforms as possible into one “executable stage”. The result again is a Flink program which is then sent to the Flink cluster and executed there. The major difference compared to the classic Runner is that during execution we will start &lt;em&gt;environments&lt;/em&gt; to execute the aforementioned &lt;em&gt;ExecutableStages&lt;/em&gt;. The following environments are available:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Docker-based (the default)&lt;/li&gt;
&lt;li&gt;Process-based (a simple process is started)&lt;/li&gt;
&lt;li&gt;Externally-provided (K8s or other schedulers)&lt;/li&gt;
&lt;li&gt;Embedded (intended for testing and only works with Java)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Environments hold the &lt;em&gt;SDK Harness&lt;/em&gt;, which is the code that handles the execution and the communication with the Runner over the Fn API. For example, when Flink executes Python code, it sends the data to the Python environment containing the Python SDK Harness. Sending data to an external process involves a minor overhead; we have measured it to be 5-10% slower than the classic Java pipelines. However, Beam fuses transforms that share the same input or output, so that as many of them as possible execute in the same environment. That’s why in real-world scenarios the overhead could be much lower.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-02-22-beam-on-flink/flink-runner-beam-language-portability-architecture.png&quot; width=&quot;600px&quot; alt=&quot;Language Portability Architecture in beam&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;Environments can be present for many languages. This opens up an entirely new type of pipelines: cross-language pipelines. In cross-language pipelines we can combine transforms of two or more languages, e.g. a machine learning pipeline with the feature generation written in Java and the learning written in Python. All this can be run on top of Flink.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Using Apache Beam with Apache Flink combines (a.) the power of Flink with (b.) the flexibility of Beam. All it takes to run Beam is a Flink cluster, which you may already have. Apache Beam’s fully-fledged Python API is probably the most compelling argument for using Beam with Flink, but the unified API which allows you to “write-once” and “execute-anywhere” is also very appealing to Beam users. On top of this, features like side inputs and a rich connector ecosystem are also reasons why people like Beam.&lt;/p&gt;
&lt;p&gt;With the introduction of schemas, a new format for handling type information, Beam is heading in a similar direction as Flink, whose type system is essential for the Table API and SQL. Speaking of which, the next Flink release will include a Python version of the Table API that is based on the language portability of Beam. Looking ahead, the Beam community plans to extend support for interactive programs like notebooks. TFX, which is built with Beam, is a very powerful way to solve many problems around training and validating machine learning models.&lt;/p&gt;
&lt;p&gt;For many years, Beam and Flink have inspired and learned from each other. With Flink’s Python support being based on Beam, the two only seem to be getting closer. That’s all the better for the community, and users have more options and functionality to choose from.&lt;/p&gt;
</description>
<pubDate>Sat, 22 Feb 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html</link>
<guid isPermaLink="true">/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html</guid>
</item>
<item>
<title>No Java Required: Configuring Sources and Sinks in SQL</title>
<description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;The recent &lt;a href=&quot;https://flink.apache.org/news/2020/02/11/release-1.10.0.html&quot;&gt;Apache Flink 1.10 release&lt;/a&gt; includes many exciting features.
In particular, it marks the end of the community’s year-long effort to merge in the &lt;a href=&quot;https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html&quot;&gt;Blink SQL contribution&lt;/a&gt; from Alibaba.
The reason the community chose to spend so much time on the contribution is that SQL works.
It allows Flink to offer a truly unified interface over batch and streaming and makes stream processing accessible to a broad audience of developers and analysts.
Best of all, Flink SQL is ANSI-SQL compliant, which means if you’ve ever used a database in the past, you already know it&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;!&lt;/p&gt;
&lt;p&gt;A lot of work focused on improving runtime performance and progressively extending its coverage of the SQL standard.
Flink now supports the full TPC-DS query set for batch queries, reflecting the readiness of its SQL engine to address the needs of modern data warehouse-like workloads.
Its streaming SQL supports an almost equal set of features - those that are well defined on a streaming runtime - including &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/streaming/joins.html&quot;&gt;complex joins&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-stable/dev/table/streaming/match_recognize.html&quot;&gt;MATCH_RECOGNIZE&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As important as this work is, the community also strives to make these features generally accessible to the broadest audience possible.
That is why the Flink community is excited in 1.10 to offer production-ready DDL syntax (e.g., &lt;code&gt;CREATE TABLE&lt;/code&gt;, &lt;code&gt;DROP TABLE&lt;/code&gt;) and a refactored catalog interface.&lt;/p&gt;
&lt;h1 id=&quot;accessing-your-data-where-it-lives&quot;&gt;Accessing Your Data Where It Lives&lt;/h1&gt;
&lt;p&gt;Flink does not store data at rest; it is a compute engine that reads its input from and writes its output to other systems.
Those that have used Flink’s &lt;code&gt;DataStream&lt;/code&gt; API in the past will be familiar with connectors that allow for interacting with external systems.
Flink has a vast connector ecosystem that includes all major message queues, filesystems, and databases.&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
If your favorite system does not have a connector maintained in the central Apache Flink repository, check out the &lt;a href=&quot;https://flink-packages.org&quot;&gt;flink packages website&lt;/a&gt;, which has a growing number of community-maintained components.
&lt;/div&gt;
&lt;p&gt;While these connectors are battle-tested and production-ready, they are written in Java and configured in code, which means they are not amenable to pure SQL or Table applications.
For a holistic SQL experience, not only the queries but also the table definitions need to be written in SQL.&lt;/p&gt;
&lt;h1 id=&quot;create-table-statements&quot;&gt;CREATE TABLE Statements&lt;/h1&gt;
&lt;p&gt;While Flink SQL has long provided table abstractions atop some of Flink’s most popular connectors, configurations were not always so straightforward.
Beginning in 1.10, Flink supports defining tables through &lt;code&gt;CREATE TABLE&lt;/code&gt; statements.
With this feature, users can now create logical tables, backed by various external systems, in pure SQL.&lt;/p&gt;
&lt;p&gt;By defining tables in SQL, developers can write queries against logical schemas that are abstracted away from the underlying physical data store. Coupled with Flink SQL’s unified approach to batch and stream processing, Flink provides a straight line from discovery to production.&lt;/p&gt;
&lt;p&gt;Users can define tables over static data sets, anything from a local CSV file to a full-fledged data lake or even Hive.
Leveraging Flink’s efficient batch processing capabilities, they can perform ad-hoc queries searching for exciting insights.
Once something interesting is identified, businesses can gain real-time and continuous insights by merely altering the table so that it is powered by a message queue such as Kafka.
Because Flink guarantees SQL queries have unified semantics over batch and streaming, users can be confident that redeploying this query as a continuous streaming application over a message queue will output identical results.&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;c1&quot;&gt;-- Define a table called orders that is backed by a Kafka topic&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- The definition includes all relevant Kafka properties,&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- the underlying format (JSON) and even defines a&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- watermarking algorithm based on one of the fields&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- so that this table can be used with event time.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;product&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;5&amp;#39;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SECONDS&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;kafka&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.version&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;universal&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.topic&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;orders&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.startup-mode&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;earliest-offset&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.properties.bootstrap.servers&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;localhost:9092&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;format.type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;json&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- Define a table called product_analysis&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- on top of ElasticSearch 7 where we &lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- can write the results of our query. &lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;product_analysis&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;product&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;tracking_time&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TIMESTAMP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;units_sold&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BIGINT&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;elasticsearch&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.version&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;7&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.hosts&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;localhost:9200&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.index&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;ProductAnalysis&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;s1&quot;&gt;&amp;#39;connector.document.type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;analysis&amp;#39;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- A simple query that analyzes order data&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- from Kafka and writes results into &lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- ElasticSearch. &lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;product_analysis&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;product&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TUMBLE_START&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DAY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tracking_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;units_sold&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;product&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TUMBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;INTERVAL&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DAY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;h1 id=&quot;catalogs&quot;&gt;Catalogs&lt;/h1&gt;
&lt;p&gt;While being able to create tables is important, it often isn’t enough.
A business analyst, for example, shouldn’t have to know what properties to set for Kafka, or even have to know what the underlying data source is, to be able to write a query.&lt;/p&gt;
&lt;p&gt;To solve this problem, Flink 1.10 also ships with a revamped catalog system for managing metadata about tables and user defined functions.
With catalogs, users can create tables once and reuse them across Jobs and Sessions.
Now, the team managing a data set can create a table and immediately make it accessible to other groups within their organization.&lt;/p&gt;
&lt;p&gt;The most notable catalog that Flink integrates with today is Hive Metastore.
The Hive catalog allows Flink to fully interoperate with Hive and serve as a more efficient query engine.
Flink supports reading and writing Hive tables, using Hive UDFs, and even leveraging Hive’s metastore catalog to persist Flink-specific metadata.&lt;/p&gt;
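&lt;p&gt;For illustration, registering a Hive catalog from the Table API looks roughly like the sketch below (hedged against the 1.10 APIs; the catalog name, database, configuration directory and Hive version are placeholders):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: persisting table definitions in the Hive Metastore so they can be
// reused across jobs and sessions. All values below are placeholders.
EnvironmentSettings settings =
    EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
TableEnvironment tableEnv = TableEnvironment.create(settings);

HiveCatalog hive = new HiveCatalog(&amp;quot;myhive&amp;quot;, &amp;quot;default&amp;quot;, &amp;quot;/opt/hive-conf&amp;quot;, &amp;quot;2.3.4&amp;quot;);
tableEnv.registerCatalog(&amp;quot;myhive&amp;quot;, hive);
tableEnv.useCatalog(&amp;quot;myhive&amp;quot;);

// DDL statements executed now are stored in the Hive Metastore, e.g.:
// tableEnv.sqlUpdate(&amp;quot;CREATE TABLE ...&amp;quot;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;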
&lt;h1 id=&quot;looking-ahead&quot;&gt;Looking Ahead&lt;/h1&gt;
&lt;p&gt;Flink SQL has made enormous strides to democratize stream processing, and 1.10 marks a significant milestone in that development.
However, we are not ones to rest on our laurels, and the community is committed to raising the bar on standards while lowering the barriers to entry.
The community is looking to add more catalogs, such as JDBC and Apache Pulsar.
We encourage you to sign up for the &lt;a href=&quot;https://flink.apache.org/community.html&quot;&gt;mailing list&lt;/a&gt; and stay on top of the announcements and new features in upcoming releases.&lt;/p&gt;
&lt;hr /&gt;
&lt;div class=&quot;footnotes&quot;&gt;
&lt;ol&gt;
&lt;li id=&quot;fn:1&quot;&gt;
&lt;p&gt;My colleague Timo, who has worked on Flink SQL from the beginning, has the entire SQL standard printed on his desk and references it before any changes are merged. It’s enormous. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
<pubDate>Thu, 20 Feb 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/02/20/ddl.html</link>
<guid isPermaLink="true">/news/2020/02/20/ddl.html</guid>
</item>
<item>
<title>Apache Flink 1.10.0 Release Announcement</title>
<description>&lt;p&gt;The Apache Flink community is excited to hit the double digits and announce the release of Flink 1.10.0! As a result of the biggest community effort to date, with over 1.2k issues implemented and more than 200 contributors, this release introduces significant improvements to the overall performance and stability of Flink jobs, a preview of native Kubernetes integration and great advances in Python support (PyFlink).&lt;/p&gt;
&lt;p&gt;Flink 1.10 also marks the completion of the &lt;a href=&quot;https://flink.apache.org/news/2019/08/22/release-1.9.0.html#preview-of-the-new-blink-sql-query-processor&quot;&gt;Blink integration&lt;/a&gt;, hardening streaming SQL and bringing mature batch processing to Flink with production-ready Hive integration and TPC-DS coverage. This blog post describes all major new features and improvements, important changes to be aware of and what to expect moving forward.&lt;/p&gt;
&lt;div class=&quot;page-toc&quot;&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;#new-features-and-improvements&quot; id=&quot;markdown-toc-new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/a&gt; &lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#improved-memory-management-and-configuration&quot; id=&quot;markdown-toc-improved-memory-management-and-configuration&quot;&gt;Improved Memory Management and Configuration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#unified-logic-for-job-submission&quot; id=&quot;markdown-toc-unified-logic-for-job-submission&quot;&gt;Unified Logic for Job Submission&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#native-kubernetes-integration-beta&quot; id=&quot;markdown-toc-native-kubernetes-integration-beta&quot;&gt;Native Kubernetes Integration (Beta)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#table-apisql-production-ready-hive-integration&quot; id=&quot;markdown-toc-table-apisql-production-ready-hive-integration&quot;&gt;Table API/SQL: Production-ready Hive Integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-improvements-to-the-table-apisql&quot; id=&quot;markdown-toc-other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pyflink-support-for-native-user-defined-functions-udfs&quot; id=&quot;markdown-toc-pyflink-support-for-native-user-defined-functions-udfs&quot;&gt;PyFlink: Support for Native User Defined Functions (UDFs)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#important-changes&quot; id=&quot;markdown-toc-important-changes&quot;&gt;Important Changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#release-notes&quot; id=&quot;markdown-toc-release-notes&quot;&gt;Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#list-of-contributors&quot; id=&quot;markdown-toc-list-of-contributors&quot;&gt;List of Contributors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;The binary distribution and source artifacts are now available on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt; of the Flink website. For more details, check the complete &lt;a href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12345845&quot;&gt;release changelog&lt;/a&gt; and the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/&quot;&gt;updated documentation&lt;/a&gt;. We encourage you to download the release and share your feedback with the community through the &lt;a href=&quot;https://flink.apache.org/community.html#mailing-lists&quot;&gt;Flink mailing lists&lt;/a&gt; or &lt;a href=&quot;https://issues.apache.org/jira/projects/FLINK/summary&quot;&gt;JIRA&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;new-features-and-improvements&quot;&gt;New Features and Improvements&lt;/h2&gt;
&lt;h3 id=&quot;improved-memory-management-and-configuration&quot;&gt;Improved Memory Management and Configuration&lt;/h3&gt;
&lt;p&gt;The current &lt;code&gt;TaskExecutor&lt;/code&gt; memory configuration in Flink has some shortcomings that make it hard to reason about or optimize resource utilization, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Different configuration models for memory footprint in Streaming and Batch execution;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Complex and user-dependent configuration of off-heap state backends (i.e. RocksDB) in Streaming execution.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To make memory options more explicit and intuitive to users, Flink 1.10 introduces significant changes to the &lt;code&gt;TaskExecutor&lt;/code&gt; memory model and configuration logic (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors&quot;&gt;FLIP-49&lt;/a&gt;). These changes make Flink more adaptable to all kinds of deployment environments (e.g. Kubernetes, Yarn, Mesos), giving users strict control over its memory consumption.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Managed Memory Extension&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Managed memory was extended to also account for memory usage of &lt;code&gt;RocksDBStateBackend&lt;/code&gt;. While batch jobs can use either on-heap or off-heap memory, streaming jobs with &lt;code&gt;RocksDBStateBackend&lt;/code&gt; can use off-heap memory only. Therefore, to allow users to switch between Streaming and Batch execution without having to modify cluster configurations, managed memory is now always off-heap.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simplified RocksDB Configuration&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Configuring an off-heap state backend like RocksDB used to involve a good deal of manual tuning, like decreasing the JVM heap size or setting Flink to use off-heap memory. This can now be achieved through Flink’s out-of-box configuration, and adjusting the memory budget for &lt;code&gt;RocksDBStateBackend&lt;/code&gt; is as simple as resizing the managed memory size.&lt;/p&gt;
&lt;p&gt;Another important improvement was to allow Flink to bound the native memory usage of RocksDB (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-7289&quot;&gt;FLINK-7289&lt;/a&gt;), preventing it from exceeding its total memory budget — this is especially relevant in containerized environments like Kubernetes. For details on how to enable and tune this feature, refer to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/state/large_state_tuning.html#tuning-rocksdb&quot;&gt;Tuning RocksDB&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-danger&quot;&gt;Note&lt;/span&gt; FLIP-49 changes the process of cluster resource configuration, which may require tuning your clusters when upgrading from previous Flink versions. For a comprehensive overview of the changes introduced and tuning guidance, consult the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/memory/mem_setup.html&quot;&gt;memory setup documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;unified-logic-for-job-submission&quot;&gt;Unified Logic for Job Submission&lt;/h3&gt;
&lt;p&gt;Prior to this release, job submission was part of the duties of the Execution Environments and closely tied to the different deployment targets (e.g. Yarn, Kubernetes, Mesos). This led to a poor separation of concerns and, over time, to a growing number of customized environments that users needed to configure and manage separately.&lt;/p&gt;
&lt;p&gt;In Flink 1.10, job submission logic is abstracted into the generic &lt;code&gt;Executor&lt;/code&gt; interface (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-73%3A+Introducing+Executors+for+job+submission&quot;&gt;FLIP-73&lt;/a&gt;). The addition of the &lt;code&gt;ExecutorCLI&lt;/code&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=133631524&quot;&gt;FLIP-81&lt;/a&gt;) introduces a unified way to specify configuration parameters for &lt;strong&gt;any&lt;/strong&gt; &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/cli.html#deployment-targets&quot;&gt;execution target&lt;/a&gt;. To round out this effort, the process of result retrieval was also decoupled from job submission with the introduction of a &lt;code&gt;JobClient&lt;/code&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-74%3A+Flink+JobClient+API&quot;&gt;FLIP-74&lt;/a&gt;), responsible for fetching the &lt;code&gt;JobExecutionResult&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:100%&quot; src=&quot;/img/blog/2020-02-11-release-1.10.0/flink_1.10_zeppelin.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;In particular, these changes make it much easier to programmatically use Flink in downstream frameworks — for example, Apache Beam or Zeppelin interactive notebooks — by providing users with a unified entry point to Flink. For users working with Flink across multiple target environments, the transition to a configuration-based execution process also significantly reduces boilerplate code and maintainability overhead.&lt;/p&gt;
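&lt;p&gt;As a minimal sketch of what this can look like from user code, assuming the non-blocking &lt;code&gt;executeAsync()&lt;/code&gt; entry point and the &lt;code&gt;JobClient&lt;/code&gt; handle introduced with this effort, a job could be submitted and then inspected without blocking the submitting process:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AsyncSubmissionExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(&amp;quot;hello&amp;quot;, &amp;quot;flink&amp;quot;).print();

        // executeAsync() hands the job to the configured Executor and returns a
        // JobClient immediately instead of blocking until the job finishes.
        JobClient jobClient = env.executeAsync(&amp;quot;async-submission-example&amp;quot;);
        System.out.println(&amp;quot;Submitted job &amp;quot; + jobClient.getJobID());
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;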
&lt;h3 id=&quot;native-kubernetes-integration-beta&quot;&gt;Native Kubernetes Integration (Beta)&lt;/h3&gt;
&lt;p&gt;For users looking to get started with Flink on a containerized environment, deploying and managing a standalone cluster on top of Kubernetes requires some upfront knowledge about containers, operators and environment-specific tools like &lt;code&gt;kubectl&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In Flink 1.10, we rolled out the first phase of &lt;strong&gt;Active Kubernetes Integration&lt;/strong&gt; (&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-9953&quot;&gt;FLINK-9953&lt;/a&gt;) with support for session clusters (with per-job planned). In this context, “active” means that Flink’s ResourceManager (&lt;code&gt;K8sResMngr&lt;/code&gt;) natively communicates with Kubernetes to allocate new pods on-demand, similar to Flink’s Yarn and Mesos integration. Users can also leverage namespaces to launch Flink clusters for multi-tenant environments with limited aggregate resource consumption. RBAC roles and service accounts with enough permission should be configured beforehand.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; style=&quot;width:75%&quot; src=&quot;/img/blog/2020-02-11-release-1.10.0/flink_1.10_nativek8s.png&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;As introduced in &lt;a href=&quot;#unified-logic-for-job-submission&quot;&gt;Unified Logic For Job Submission&lt;/a&gt;, all command-line options in Flink 1.10 are mapped to a unified configuration. For this reason, users can simply refer to the Kubernetes config options and submit a job to an existing Flink session on Kubernetes in the CLI using:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;./bin/flink run -d -e kubernetes-session -Dkubernetes.cluster-id&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;ClusterId&amp;gt; examples/streaming/WindowJoin.jar&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you want to try out this preview feature, we encourage you to walk through the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/deployment/native_kubernetes.html&quot;&gt;Native Kubernetes setup&lt;/a&gt;, play around with it and share feedback with the community.&lt;/p&gt;
&lt;h3 id=&quot;table-apisql-production-ready-hive-integration&quot;&gt;Table API/SQL: Production-ready Hive Integration&lt;/h3&gt;
&lt;p&gt;Hive integration was announced as a preview feature in Flink 1.9. This preview allowed users to persist Flink-specific metadata (e.g. Kafka tables) in Hive Metastore using SQL DDL, call UDFs defined in Hive and use Flink for reading and writing Hive tables. Flink 1.10 rounds out this effort with further developments that bring production-ready Hive integration to Flink with full compatibility with &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/#supported-hive-versions&quot;&gt;most Hive versions&lt;/a&gt;.&lt;/p&gt;
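&lt;p&gt;As a rough illustration of what this looks like from the Table API, assuming the &lt;code&gt;HiveCatalog&lt;/code&gt; class shipped with the Hive connector (the catalog name, database, configuration directory and Hive version below are placeholders), a Hive Metastore can be registered and used as a catalog:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class HiveCatalogExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        // Placeholder values: catalog name, default database, hive-site.xml directory, Hive version
        HiveCatalog hiveCatalog = new HiveCatalog(&amp;quot;myhive&amp;quot;, &amp;quot;default&amp;quot;, &amp;quot;/opt/hive-conf&amp;quot;, &amp;quot;2.3.4&amp;quot;);
        tableEnv.registerCatalog(&amp;quot;myhive&amp;quot;, hiveCatalog);

        // Tables and functions stored in the Hive Metastore are now visible to Flink
        tableEnv.useCatalog(&amp;quot;myhive&amp;quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;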
&lt;h4 id=&quot;native-partition-support-for-batch-sql&quot;&gt;Native Partition Support for Batch SQL&lt;/h4&gt;
&lt;p&gt;So far, only writes to non-partitioned Hive tables were supported. In Flink 1.10, the Flink SQL syntax has been extended with &lt;code&gt;INSERT OVERWRITE&lt;/code&gt; and &lt;code&gt;PARTITION&lt;/code&gt; (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support&quot;&gt;FLIP-63&lt;/a&gt;), enabling users to write into both static and dynamic partitions in Hive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Static Partition Writing&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OVERWRITE&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tablename1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PARTITION&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;partcol1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partcol2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...)]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;select_statement1&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from_statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Partition Writing&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OVERWRITE&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tablename1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;select_statement1&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from_statement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Fully supporting partitioned tables allows users to take advantage of partition pruning on read, which significantly increases the performance of these operations by reducing the amount of data that needs to be scanned.&lt;/p&gt;
&lt;h4 id=&quot;further-optimizations&quot;&gt;Further Optimizations&lt;/h4&gt;
&lt;p&gt;Besides partition pruning, Flink 1.10 introduces more &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/hive/read_write_hive.html#optimizations&quot;&gt;read optimizations&lt;/a&gt; to Hive integration, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Projection pushdown:&lt;/strong&gt; Flink leverages projection pushdown to minimize data transfer between Flink and Hive tables by omitting unnecessary fields from table scans. This is especially beneficial for tables with a large number of columns.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LIMIT pushdown:&lt;/strong&gt; for queries with the &lt;code&gt;LIMIT&lt;/code&gt; clause, Flink will limit the number of output records wherever possible to minimize the amount of data transferred across the network.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ORC Vectorization on Read:&lt;/strong&gt; to boost read performance for ORC files, Flink now uses the native ORC Vectorized Reader by default for Hive versions above 2.0.0 and columns with non-complex data types.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;pluggable-modules-as-flink-system-objects-beta&quot;&gt;Pluggable Modules as Flink System Objects (Beta)&lt;/h4&gt;
&lt;p&gt;Flink 1.10 introduces a generic mechanism for pluggable modules in the Flink table core, with a first focus on system functions (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules&quot;&gt;FLIP-68&lt;/a&gt;). With modules, users can extend Flink’s system objects — for example use Hive built-in functions that behave like Flink system functions. This release ships with a pre-implemented &lt;code&gt;HiveModule&lt;/code&gt;, supporting multiple Hive versions, but users are also given the possibility to &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/modules.html&quot;&gt;write their own pluggable modules&lt;/a&gt;.&lt;/p&gt;
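&lt;p&gt;For example, loading the &lt;code&gt;HiveModule&lt;/code&gt; next to Flink’s core module could look like the following sketch (the module name, Hive version, and the table and column used in the query are illustrative values):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.module.hive.HiveModule;

public class HiveModuleExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        // Load Hive built-in functions as additional system functions;
        // the module name and the Hive version are illustrative values.
        tableEnv.loadModule(&amp;quot;hive&amp;quot;, new HiveModule(&amp;quot;2.3.4&amp;quot;));

        // A Hive built-in such as get_json_object can now be called like a Flink
        // system function (illustrative table and column names).
        tableEnv.sqlQuery(&amp;quot;SELECT get_json_object(properties, '$.name') FROM my_hive_table&amp;quot;);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;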
&lt;h3 id=&quot;other-improvements-to-the-table-apisql&quot;&gt;Other Improvements to the Table API/SQL&lt;/h3&gt;
&lt;h4 id=&quot;watermarks-and-computed-columns-in-sql-ddl&quot;&gt;Watermarks and Computed Columns in SQL DDL&lt;/h4&gt;
&lt;p&gt;Flink 1.10 supports stream-specific syntax extensions to define time attributes and watermark generation in Flink SQL DDL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-66%3A+Support+Time+Attribute+in+SQL+DDL&quot;&gt;FLIP-66&lt;/a&gt;). This allows time-based operations, like windowing, and the definition of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/sql/create.html#create-table&quot;&gt;watermark strategies&lt;/a&gt; on tables created using DDL statements.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table_name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;WATERMARK&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FOR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columnName&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;watermark_strategy_expression&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This release also introduces support for virtual computed columns (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-70%3A+Flink+SQL+Computed+Column+Design&quot;&gt;FLIP-70&lt;/a&gt;) that can be derived based on other columns in the same table or deterministic expressions (i.e. literal values, UDFs and built-in functions). In Flink, computed columns are useful to define time attributes &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/sql/create.html#create-table&quot;&gt;upon table creation&lt;/a&gt;.&lt;/p&gt;
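&lt;p&gt;A hedged sketch of how a computed column can derive an event-time attribute and pair it with a watermark: the table, columns and the abridged &lt;code&gt;WITH&lt;/code&gt; clause below are made up for illustration, and the DDL is issued through a &lt;code&gt;TableEnvironment&lt;/code&gt; named &lt;code&gt;tableEnv&lt;/code&gt; as in the earlier sketches.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Illustrative DDL: ts_millis is a BIGINT epoch column delivered by the source, and
// event_time is a virtual computed column that serves as the event-time attribute.
// The WITH clause is abridged; a real table needs the full connector properties.
tableEnv.sqlUpdate(
    &amp;quot;CREATE TABLE user_actions (\n&amp;quot; +
    &amp;quot;  user_id STRING,\n&amp;quot; +
    &amp;quot;  ts_millis BIGINT,\n&amp;quot; +
    &amp;quot;  event_time AS TO_TIMESTAMP(FROM_UNIXTIME(ts_millis / 1000)),\n&amp;quot; +
    &amp;quot;  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND\n&amp;quot; +
    &amp;quot;) WITH (\n&amp;quot; +
    &amp;quot;  'connector.type' = 'kafka'\n&amp;quot; +
    &amp;quot;)&amp;quot;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;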
&lt;h4 id=&quot;additional-extensions-to-sql-ddl&quot;&gt;Additional Extensions to SQL DDL&lt;/h4&gt;
&lt;p&gt;There is now a clear distinction between temporary/persistent and system/catalog functions (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog&quot;&gt;FLIP-57&lt;/a&gt;). This not only eliminates ambiguity in function reference, but also allows for deterministic function resolution order (i.e. in case of naming collision, system functions will precede catalog functions, with temporary functions taking precedence over persistent functions for both dimensions).&lt;/p&gt;
&lt;p&gt;Following the groundwork in FLIP-57, we extended the SQL DDL syntax to support the creation of catalog functions, temporary functions and temporary system functions (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-79+Flink+Function+DDL+Support&quot;&gt;FLIP-79&lt;/a&gt;):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;TEMPORARY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;TEMPORARY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SYSTEM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FUNCTION&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IF&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NOT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;EXISTS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;catalog_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.][&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;function_name&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;identifier&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;LANGUAGE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JAVA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SCALA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For a complete overview of the current state of DDL support in Flink SQL, check the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/table/sql/&quot;&gt;updated documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;label label-danger&quot;&gt;Note&lt;/span&gt; In order to correctly handle and guarantee a consistent behavior across meta-objects (tables, views, functions) in the future, some object declaration methods in the Table API have been deprecated in favor of methods that are closer to standard SQL DDL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module&quot;&gt;FLIP-64&lt;/a&gt;).&lt;/p&gt;
&lt;h4 id=&quot;full-tpc-ds-coverage-for-batch&quot;&gt;Full TPC-DS Coverage for Batch&lt;/h4&gt;
&lt;p&gt;TPC-DS is a widely used industry-standard decision support benchmark to evaluate and measure the performance of SQL-based data processing engines. In Flink 1.10, all TPC-DS queries are supported end-to-end (&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11491&quot;&gt;FLINK-11491&lt;/a&gt;), reflecting the readiness of its SQL engine to address the needs of modern data warehouse-like workloads.&lt;/p&gt;
&lt;h3 id=&quot;pyflink-support-for-native-user-defined-functions-udfs&quot;&gt;PyFlink: Support for Native User Defined Functions (UDFs)&lt;/h3&gt;
&lt;p&gt;A preview of PyFlink was introduced in the previous release, making headway towards the goal of full Python support in Flink. For this release, the focus was to enable users to register and use Python User-Defined Functions (UDF, with UDTF/UDAF planned) in the Table API/SQL (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table&quot;&gt;FLIP-58&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
&lt;center&gt;
&lt;img vspace=&quot;8&quot; hspace=&quot;100&quot; style=&quot;width:75%&quot; src=&quot;/img/blog/2020-02-11-release-1.10.0/flink_1.10_pyflink.gif&quot; /&gt;
&lt;/center&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;If you are interested in the underlying implementation — leveraging Apache Beam’s &lt;a href=&quot;https://beam.apache.org/roadmap/portability/&quot;&gt;Portability Framework&lt;/a&gt; — refer to the “Architecture” section of FLIP-58 and also to &lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management&quot;&gt;FLIP-78&lt;/a&gt;. This work lays the required foundation for Pandas support and for PyFlink to eventually reach the DataStream API.&lt;/p&gt;
&lt;p&gt;From Flink 1.10, users can also easily install PyFlink through &lt;code&gt;pip&lt;/code&gt; using:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;pip install apache-flink&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For a preview of other improvements planned for PyFlink, check &lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14500&quot;&gt;FLINK-14500&lt;/a&gt; and get involved in the &lt;a href=&quot;http://apache-flink.147419.n8.nabble.com/Re-DISCUSS-What-parts-of-the-Python-API-should-we-focus-on-next-td1285.html&quot;&gt;discussion&lt;/a&gt; for requested user features.&lt;/p&gt;
&lt;h2 id=&quot;important-changes&quot;&gt;Important Changes&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10725&quot;&gt;FLINK-10725&lt;/a&gt;] Flink can now be compiled and run on Java 11.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://jira.apache.org/jira/browse/FLINK-15495&quot;&gt;FLINK-15495&lt;/a&gt;] The Blink planner is now the default in the SQL Client, so that users can benefit from all the latest features and improvements. The switch from the old planner in the Table API is also planned for the next release, so we recommend that users start getting familiar with the Blink planner.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13025&quot;&gt;FLINK-13025&lt;/a&gt;] There is a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/connectors/elasticsearch.html#elasticsearch-connector&quot;&gt;new Elasticsearch sink connector&lt;/a&gt;, fully supporting Elasticsearch 7.x versions; a short usage sketch follows this list.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15115&quot;&gt;FLINK-15115&lt;/a&gt;] The connectors for Kafka 0.8 and 0.9 have been marked as deprecated and will no longer be actively supported. If you are still using these versions or have any other related concerns, please reach out to the @dev mailing list.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14516&quot;&gt;FLINK-14516&lt;/a&gt;] The non-credit-based network flow control code was removed, along with the configuration option &lt;code&gt;taskmanager.network.credit.model&lt;/code&gt;. Moving forward, Flink will always use credit-based flow control.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12122&quot;&gt;FLINK-12122&lt;/a&gt;] &lt;a href=&quot;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077&quot;&gt;FLIP-6&lt;/a&gt; was rolled out with Flink 1.5.0 and introduced a code regression related to the way slots are allocated from &lt;code&gt;TaskManagers&lt;/code&gt;. To use a scheduling strategy that is closer to the pre-FLIP behavior, where Flink tries to spread out the workload across all currently available &lt;code&gt;TaskManagers&lt;/code&gt;, users can set &lt;code&gt;cluster.evenly-spread-out-slots: true&lt;/code&gt; in the &lt;code&gt;flink-conf.yaml&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11956&quot;&gt;FLINK-11956&lt;/a&gt;] &lt;code&gt;s3-hadoop&lt;/code&gt; and &lt;code&gt;s3-presto&lt;/code&gt; filesystems no longer use class relocations and should be loaded through &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/ops/filesystems/#pluggable-file-systems&quot;&gt;plugins&lt;/a&gt;; they now also seamlessly integrate with all credential providers. We strongly recommend using other filesystems only as plugins as well, since we will continue to remove relocations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Flink 1.9 shipped with a refactored Web UI, with the legacy one being kept around as backup in case something wasn’t working as expected. No issues have been reported so far, so &lt;a href=&quot;http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Remove-old-WebUI-td35218.html&quot;&gt;the community voted&lt;/a&gt; to drop the legacy Web UI in Flink 1.10.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
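&lt;p&gt;To give a feel for the new Elasticsearch connector mentioned above, the following sketch wires a &lt;code&gt;DataStream&lt;/code&gt; into an Elasticsearch 7 sink. It assumes the &lt;code&gt;flink-connector-elasticsearch7&lt;/code&gt; dependency is on the classpath, and the host, port and index name are placeholders:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
import org.apache.http.HttpHost;
import org.elasticsearch.client.Requests;

public class Elasticsearch7SinkSketch {

    public static void attachSink(DataStream&amp;lt;String&amp;gt; input) {
        // Placeholder address of the Elasticsearch cluster
        List&amp;lt;HttpHost&amp;gt; httpHosts = new ArrayList&amp;lt;&amp;gt;();
        httpHosts.add(new HttpHost(&amp;quot;127.0.0.1&amp;quot;, 9200, &amp;quot;http&amp;quot;));

        ElasticsearchSink.Builder&amp;lt;String&amp;gt; esSinkBuilder = new ElasticsearchSink.Builder&amp;lt;&amp;gt;(
            httpHosts,
            new ElasticsearchSinkFunction&amp;lt;String&amp;gt;() {
                @Override
                public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
                    Map&amp;lt;String, String&amp;gt; json = new HashMap&amp;lt;&amp;gt;();
                    json.put(&amp;quot;data&amp;quot;, element);
                    // Elasticsearch 7 no longer uses mapping types, so only the index is specified
                    indexer.add(Requests.indexRequest().index(&amp;quot;my-index&amp;quot;).source(json));
                }
            });

        input.addSink(esSinkBuilder.build());
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;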
&lt;h2 id=&quot;release-notes&quot;&gt;Release Notes&lt;/h2&gt;
&lt;p&gt;Please review the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.10/release-notes/flink-1.10.html&quot;&gt;release notes&lt;/a&gt; carefully for a detailed list of changes and new features if you plan to upgrade your setup to Flink 1.10. This version is API-compatible with previous 1.x releases for APIs annotated with the @Public annotation.&lt;/p&gt;
&lt;h2 id=&quot;list-of-contributors&quot;&gt;List of Contributors&lt;/h2&gt;
&lt;p&gt;The Apache Flink community would like to thank all contributors that have made this release possible:&lt;/p&gt;
&lt;p&gt;Achyuth Samudrala, Aitozi, Alberto Romero, Alec.Ch, Aleksey Pak, Alexander Fedulov, Alice Yan, Aljoscha Krettek, Aloys, Andrey Zagrebin, Arvid Heise, Benchao Li, Benoit Hanotte, Benoît Paris, Bhagavan Das, Biao Liu, Chesnay Schepler, Congxian Qiu, Cyrille Chépélov, César Soto Valero, David Anderson, David Hrbacek, David Moravek, Dawid Wysakowicz, Dezhi Cai, Dian Fu, Dyana Rose, Eamon Taaffe, Fabian Hueske, Fawad Halim, Fokko Driesprong, Frey Gao, Gabor Gevay, Gao Yun, Gary Yao, GatsbyNewton, GitHub, Grebennikov Roman, GuoWei Ma, Gyula Fora, Haibo Sun, Hao Dang, Henvealf, Hongtao Zhang, HuangXingBo, Hwanju Kim, Igal Shilman, Jacob Sevart, Jark Wu, Jeff Martin, Jeff Yang, Jeff Zhang, Jiangjie (Becket) Qin, Jiayi, Jiayi Liao, Jincheng Sun, Jing Zhang, Jingsong Lee, JingsongLi, Joao Boto, John Lonergan, Kaibo Zhou, Konstantin Knauf, Kostas Kloudas, Kurt Young, Leonard Xu, Ling Wang, Lining Jing, Liupengcheng, LouisXu, Mads Chr. Olesen, Marco Zühlke, Marcos Klein, Matyas Orhidi, Maximilian Bode, Maximilian Michels, Nick Pavlakis, Nico Kruber, Nicolas Deslandes, Pablo Valtuille, Paul Lam, Paul Lin, PengFei Li, Piotr Nowojski, Piotr Przybylski, Piyush Narang, Ricco Chen, Richard Deurwaarder, Robert Metzger, Roman, Roman Grebennikov, Roman Khachatryan, Rong Rong, Rui Li, Ryan Tao, Scott Kidder, Seth Wiesman, Shannon Carey, Shaobin.Ou, Shuo Cheng, Stefan Richter, Stephan Ewen, Steve OU, Steven Wu, Terry Wang, Thesharing, Thomas Weise, Till Rohrmann, Timo Walther, Tony Wei, TsReaper, Tzu-Li (Gordon) Tai, Victor Wong, WangHengwei, Wei Zhong, WeiZhong94, Wind (Jiayi Liao), Xintong Song, XuQianJin-Stars, Xuefu Zhang, Xupingyong, Yadong Xie, Yang Wang, Yangze Guo, Yikun Jiang, Ying, YngwieWang, Yu Li, Yuan Mei, Yun Gao, Yun Tang, Zhanchun Zhang, Zhenghua Gao, Zhijiang, Zhu Zhu, a-suiniaev, azagrebin, beyond1920, biao.liub, blueszheng, bowen.li, caoyingjie, catkint, chendonglin, chenqi, chunpinghe, cyq89051127, danrtsey.wy, dengziming, dianfu, eskabetxe, fanrui, forideal, gentlewang, godfrey he, godfreyhe, haodang, hehuiyuan, hequn8128, hpeter, huangxingbo, huzheng, ifndef-SleePy, jiemotongxue, joe, jrthe42, kevin.cyj, klion26, lamber-ken, libenchao, liketic, lincoln-lil, lining, liuyongvs, liyafan82, lz, mans2singh, mojo, openinx, ouyangwulin, shining-huang, shuai-xu, shuo.cs, stayhsfLee, sunhaibotb, sunjincheng121, tianboxiu, tianchen, tianchen92, tison, tszkitlo40, unknown, vinoyang, vthinkxie, wangpeibin, wangxiaowei, wangxiyuan, wangxlong, wangyang0918, whlwanghailong, xuchao0903, xuyang1706, yanghua, yangjf2019, yongqiang chai, yuzhao.cyz, zentol, zhangzhanchum, zhengcanbin, zhijiang, zhongyong jin, zhuzhu.zz, zjuwangg, zoudaokoulife, 砚田, 谢磊, 张志豪, 曹建华&lt;/p&gt;
</description>
<pubDate>Tue, 11 Feb 2020 03:30:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/02/11/release-1.10.0.html</link>
<guid isPermaLink="true">/news/2020/02/11/release-1.10.0.html</guid>
</item>
<item>
<title>A Guide for Unit Testing in Apache Flink</title>
<description>&lt;p&gt;Writing unit tests is one of the essential tasks of designing a production-grade application. Without tests, a single change in code can result in cascades of failure in production. Thus unit tests should be written for all types of applications, be it a simple job cleaning data and training a model or a complex multi-tenant, real-time data processing system. In the following sections, we provide a guide for unit testing of Apache Flink applications.
Apache Flink provides a robust unit testing framework to make sure your applications behave in production as expected during development. You need to include the following dependencies to utilize the provided framework.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-test-utils_${scala.binary.version}&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;${flink.version}&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class=&quot;nt&quot;&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-runtime_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.0&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class=&quot;nt&quot;&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;classifier&amp;gt;&lt;/span&gt;tests&lt;span class=&quot;nt&quot;&gt;&amp;lt;/classifier&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.0&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class=&quot;nt&quot;&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;classifier&amp;gt;&lt;/span&gt;tests&lt;span class=&quot;nt&quot;&gt;&amp;lt;/classifier&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The strategy of writing unit tests differs for various operators. You can break down the strategy into the following three buckets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Stateless Operators&lt;/li&gt;
&lt;li&gt;Stateful Operators&lt;/li&gt;
&lt;li&gt;Timed Process Operators&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;stateless-operators&quot;&gt;Stateless Operators&lt;/h1&gt;
&lt;p&gt;Writing unit tests for a stateless operator is a breeze. You need to follow the basic norm of writing a test case, i.e., create an instance of the function class and test the appropriate methods. Let’s take an example of a simple &lt;code&gt;Map&lt;/code&gt; operator.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyStatelessMap&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MapFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The test case for the above operator should look like the following:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MyStatelessMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;statelessMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyStatelessMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;statelessMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Pretty simple, right? Let’s take a look at one for the &lt;code&gt;FlatMap&lt;/code&gt; operator.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyStatelessFlatMap&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FlatMapFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;flatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;FlatMap&lt;/code&gt; operators require a &lt;code&gt;Collector&lt;/code&gt; object along with the input. For the test case, we have two options:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Mock the &lt;code&gt;Collector&lt;/code&gt; object using Mockito&lt;/li&gt;
&lt;li&gt;Use the &lt;code&gt;ListCollector&lt;/code&gt; provided by Flink&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I prefer the second method, as it requires fewer lines of code and is suitable for most cases.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MyStatelessFlatMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;statelessFlatMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyStatelessFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ListCollector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;listCollector&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ListCollector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;statelessFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;flatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;listCollector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Lists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1 id=&quot;stateful-operators&quot;&gt;Stateful Operators&lt;/h1&gt;
&lt;p&gt;Writing test cases for stateful operators requires more effort. You need to check whether the operator state is updated correctly and if it is cleaned up properly along with the output of the operator.&lt;/p&gt;
&lt;p&gt;Let’s take an example of a stateful &lt;code&gt;FlatMap&lt;/code&gt; function:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;StatefulFlatMap&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RichFlatMapFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ValueState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Configuration&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getRuntimeContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;previousInput&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;flatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;){&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot; &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The intricate part of writing tests for the above class is to mock the configuration as well as the runtime context of the application. Flink provides TestHarness classes so that users don’t have to create the mock objects themselves. Using the &lt;code&gt;KeyedOneInputStreamOperatorTestHarness&lt;/code&gt;, the test looks like:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.api.operators.StreamFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.runtime.streamrecord.StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StatefulFlatMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;statefulFlatMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;StatefulFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// OneInputStreamOperatorTestHarness takes the input and output types as type parameters &lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;OneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// KeyedOneInputStreamOperatorTestHarness takes three arguments:&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Flink operator object, key selector and key type&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedOneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;statefulFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;1&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// test first record&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ValueState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;statefulFlatMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getRuntimeContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;previousInput&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Lists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;extractOutputStreamRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stateValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// test second record&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;parallel&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Lists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello parallel world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;extractOutputStreamRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;parallel&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousInput&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The test harness provides many helper methods, three of which are being used here:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;open&lt;/code&gt;: calls the &lt;code&gt;open&lt;/code&gt; method of the &lt;code&gt;FlatMap&lt;/code&gt; function with the relevant parameters. It also initializes the context.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;processElement&lt;/code&gt;: allows users to pass an input element as well as the timestamp associated with the element.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;extractOutputStreamRecords&lt;/code&gt;: gets the output records along with their timestamps from the &lt;code&gt;Collector&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The test harness simplifies unit testing of stateful functions to a large extent.&lt;/p&gt;
&lt;p&gt;You might also need to check whether the state value is being set correctly. You can get the state value directly from the operator using a mechanism similar to the one used while creating the state. This is also demonstrated in the previous example.&lt;/p&gt;
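&lt;p&gt;As a minimal sketch of that mechanism (assuming the stateful function instance from the test above is called &lt;code&gt;statefulFlatMap&lt;/code&gt; and registers its state under the name &lt;code&gt;previousInput&lt;/code&gt;; both names are illustrative), the state can be read back from the function once the harness has been opened and an element has been processed:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: read the keyed ValueState back from the function under test.
// The descriptor name and type must match what the function registers in open().
ValueState&amp;lt;String&amp;gt; previousInput =
    statefulFlatMap.getRuntimeContext().getState(
        new ValueStateDescriptor&amp;lt;&amp;gt;(&amp;quot;previousInput&amp;quot;, Types.STRING));
String stateValue = previousInput.value();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;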
&lt;h1 id=&quot;timed-process-operators&quot;&gt;Timed Process Operators&lt;/h1&gt;
&lt;p&gt;Writing tests for process functions that work with time is quite similar to writing tests for stateful functions, because you can also use the test harness.
However, you need to take care of another aspect: providing timestamps for events and controlling the current time of the application. By setting the current (processing or event) time, you can trigger registered timers, which call the &lt;code&gt;onTimer&lt;/code&gt; method of the function.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyProcessFunction&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;timerService&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;registerProcessingTimeTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello &amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;onTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;long&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OnTimerContext&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Collector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Timer triggered at timestamp %d&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We need to test both methods of the &lt;code&gt;KeyedProcessFunction&lt;/code&gt;, i.e., &lt;code&gt;processElement&lt;/code&gt; as well as &lt;code&gt;onTimer&lt;/code&gt;. Using a test harness, we can control the current time of the function and thus trigger the timer at will rather than waiting for a specific time.&lt;/p&gt;
&lt;p&gt;Let’s take a look at the test cases.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testProcessElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MyProcessFunction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;myProcessFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;OneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedOneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedProcessOperator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;1&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Function time is initialized to 0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Lists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;extractOutputStreamRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testOnTimer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MyProcessFunction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;myProcessFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;OneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedOneInputStreamOperatorTestHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedProcessOperator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;1&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;STRING&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;processElement&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;numProcessingTimeTimers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Function time is set to 50&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProcessingTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Assert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;assertEquals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Lists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;newArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;StreamRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Timer triggered at timestamp 50&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)),&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;testHarness&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;extractOutputStreamRecords&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The mechanism for testing multi-input stream operators, such as CoProcess functions, is similar to the one described in this article. You should use the TwoInput variant of the harness for these operators, such as &lt;code&gt;TwoInputStreamOperatorTestHarness&lt;/code&gt;.&lt;/p&gt;
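&lt;p&gt;As a rough sketch (here &lt;code&gt;MyCoFlatMap&lt;/code&gt; is a hypothetical &lt;code&gt;CoFlatMapFunction&amp;lt;String, String, String&amp;gt;&lt;/code&gt;, not part of the original example), wiring a two-input function into such a harness could look like this:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: exercising a hypothetical two-input function with the two-input harness.
TwoInputStreamOperatorTestHarness&amp;lt;String, String, String&amp;gt; testHarness =
    new TwoInputStreamOperatorTestHarness&amp;lt;&amp;gt;(new CoStreamFlatMap&amp;lt;&amp;gt;(new MyCoFlatMap()));
testHarness.open();
// each input has its own processElement method
testHarness.processElement1(new StreamRecord&amp;lt;&amp;gt;(&amp;quot;left&amp;quot;, 10));
testHarness.processElement2(new StreamRecord&amp;lt;&amp;gt;(&amp;quot;right&amp;quot;, 20));
// the collected output can then be inspected as in the single-input tests
testHarness.getOutput();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;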
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;In the previous sections we showcased how unit testing in Apache Flink works for stateless, stateful and time-aware operators. We hope you found the steps easy to follow and execute while developing your Flink applications. If you have any questions or feedback, you can reach out to me &lt;a href=&quot;https://www.kharekartik.dev/about/&quot;&gt;here&lt;/a&gt; or contact the community on the &lt;a href=&quot;https://flink.apache.org/community.html&quot;&gt;Apache Flink user mailing list&lt;/a&gt;.&lt;/p&gt;
</description>
<pubDate>Fri, 07 Feb 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html</link>
<guid isPermaLink="true">/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html</guid>
</item>
<item>
<title>Apache Flink 1.9.2 Released</title>
<description>&lt;p&gt;The Apache Flink community released the second bugfix version of the Apache Flink 1.9 series.&lt;/p&gt;
&lt;p&gt;This release includes 117 fixes and minor improvements for Flink 1.9.1. Below is a detailed list of all fixes and improvements.&lt;/p&gt;
&lt;p&gt;We highly recommend that all users upgrade to Flink 1.9.2.&lt;/p&gt;
&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.9.2&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;List of resolved issues:&lt;/p&gt;
&lt;h2&gt; Sub-task
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12122&quot;&gt;FLINK-12122&lt;/a&gt;] - Spread out tasks evenly across all available registered TaskManagers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13360&quot;&gt;FLINK-13360&lt;/a&gt;] - Add documentation for HBase connector for Table API &amp;amp; SQL
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13361&quot;&gt;FLINK-13361&lt;/a&gt;] - Add documentation for JDBC connector for Table API &amp;amp; SQL
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13723&quot;&gt;FLINK-13723&lt;/a&gt;] - Use liquid-c for faster doc generation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13724&quot;&gt;FLINK-13724&lt;/a&gt;] - Remove unnecessary whitespace from the docs&amp;#39; sidenav
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13725&quot;&gt;FLINK-13725&lt;/a&gt;] - Use sassc for faster doc generation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13726&quot;&gt;FLINK-13726&lt;/a&gt;] - Build docs with jekyll 4.0.0.pre.beta1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13791&quot;&gt;FLINK-13791&lt;/a&gt;] - Speed up sidenav by using group_by
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13817&quot;&gt;FLINK-13817&lt;/a&gt;] - Expose whether web submissions are enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13818&quot;&gt;FLINK-13818&lt;/a&gt;] - Check whether web submission are enabled
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14535&quot;&gt;FLINK-14535&lt;/a&gt;] - Cast exception is thrown when count distinct on decimal fields
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14735&quot;&gt;FLINK-14735&lt;/a&gt;] - Improve batch schedule check input consumable performance
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Bug
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10377&quot;&gt;FLINK-10377&lt;/a&gt;] - Remove precondition in TwoPhaseCommitSinkFunction.notifyCheckpointComplete
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10435&quot;&gt;FLINK-10435&lt;/a&gt;] - Client sporadically hangs after Ctrl + C
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11120&quot;&gt;FLINK-11120&lt;/a&gt;] - TIMESTAMPADD function handles TIME incorrectly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11835&quot;&gt;FLINK-11835&lt;/a&gt;] - ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange failed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12342&quot;&gt;FLINK-12342&lt;/a&gt;] - Yarn Resource Manager Acquires Too Many Containers
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12399&quot;&gt;FLINK-12399&lt;/a&gt;] - FilterableTableSource does not use filters on job run
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13184&quot;&gt;FLINK-13184&lt;/a&gt;] - Starting a TaskExecutor blocks the YarnResourceManager&amp;#39;s main thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13589&quot;&gt;FLINK-13589&lt;/a&gt;] - DelimitedInputFormat index error on multi-byte delimiters with whole file input splits
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13702&quot;&gt;FLINK-13702&lt;/a&gt;] - BaseMapSerializerTest.testDuplicate fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13708&quot;&gt;FLINK-13708&lt;/a&gt;] - Transformations should be cleared because a table environment could execute multiple job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13740&quot;&gt;FLINK-13740&lt;/a&gt;] - TableAggregateITCase.testNonkeyedFlatAggregate failed on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13749&quot;&gt;FLINK-13749&lt;/a&gt;] - Make Flink client respect classloading policy
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13758&quot;&gt;FLINK-13758&lt;/a&gt;] - Failed to submit JobGraph when registered hdfs file in DistributedCache
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13799&quot;&gt;FLINK-13799&lt;/a&gt;] - Web Job Submit Page displays stream of error message when web submit is disables in the config
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13827&quot;&gt;FLINK-13827&lt;/a&gt;] - Shell variable should be escaped in start-scala-shell.sh
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13862&quot;&gt;FLINK-13862&lt;/a&gt;] - Update Execution Plan docs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13945&quot;&gt;FLINK-13945&lt;/a&gt;] - Instructions for building flink-shaded against vendor repository don&amp;#39;t work
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13969&quot;&gt;FLINK-13969&lt;/a&gt;] - Resuming Externalized Checkpoint (rocks, incremental, scale down) end-to-end test fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13995&quot;&gt;FLINK-13995&lt;/a&gt;] - Fix shading of the licence information of netty
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13999&quot;&gt;FLINK-13999&lt;/a&gt;] - Correct the documentation of MATCH_RECOGNIZE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14066&quot;&gt;FLINK-14066&lt;/a&gt;] - Pyflink building failure in master and 1.9.0 version
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14074&quot;&gt;FLINK-14074&lt;/a&gt;] - MesosResourceManager can&amp;#39;t create new taskmanagers in Session Cluster Mode.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14175&quot;&gt;FLINK-14175&lt;/a&gt;] - Upgrade KPL version in flink-connector-kinesis to fix application OOM
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14200&quot;&gt;FLINK-14200&lt;/a&gt;] - Temporal Table Function Joins do not work on Tables (only TableSources) on the query side
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14235&quot;&gt;FLINK-14235&lt;/a&gt;] - Kafka010ProducerITCase&amp;gt;KafkaProducerTestBase.testOneToOneAtLeastOnceCustomOperator fails on travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14315&quot;&gt;FLINK-14315&lt;/a&gt;] - NPE with JobMaster.disconnectTaskManager
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14337&quot;&gt;FLINK-14337&lt;/a&gt;] - HistoryServer does not handle NPE on corruped archives properly
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14347&quot;&gt;FLINK-14347&lt;/a&gt;] - YARNSessionFIFOITCase.checkForProhibitedLogContents found a log with prohibited string
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14355&quot;&gt;FLINK-14355&lt;/a&gt;] - Example code in state processor API docs doesn&amp;#39;t compile
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14370&quot;&gt;FLINK-14370&lt;/a&gt;] - KafkaProducerAtLeastOnceITCase&amp;gt;KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14382&quot;&gt;FLINK-14382&lt;/a&gt;] - Incorrect handling of FLINK_PLUGINS_DIR on Yarn
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14398&quot;&gt;FLINK-14398&lt;/a&gt;] - Further split input unboxing code into separate methods
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14413&quot;&gt;FLINK-14413&lt;/a&gt;] - Shade-plugin ApacheNoticeResourceTransformer uses platform-dependent encoding
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14434&quot;&gt;FLINK-14434&lt;/a&gt;] - Dispatcher#createJobManagerRunner should not start JobManagerRunner
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14445&quot;&gt;FLINK-14445&lt;/a&gt;] - Python module build failed when making sdist
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14447&quot;&gt;FLINK-14447&lt;/a&gt;] - Network metrics doc table render confusion
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14459&quot;&gt;FLINK-14459&lt;/a&gt;] - Python module build hangs
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14524&quot;&gt;FLINK-14524&lt;/a&gt;] - PostgreSQL JDBC sink generates invalid SQL in upsert mode
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14547&quot;&gt;FLINK-14547&lt;/a&gt;] - UDF cannot be in the join condition in blink planner
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14561&quot;&gt;FLINK-14561&lt;/a&gt;] - Don&amp;#39;t write FLINK_PLUGINS_DIR ENV variable to Flink configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14562&quot;&gt;FLINK-14562&lt;/a&gt;] - RMQSource leaves idle consumer after closing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14574&quot;&gt;FLINK-14574&lt;/a&gt;] - flink-s3-fs-hadoop doesn&amp;#39;t work with plugins mechanism
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14589&quot;&gt;FLINK-14589&lt;/a&gt;] - Redundant slot requests with the same AllocationID leads to inconsistent slot table
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14641&quot;&gt;FLINK-14641&lt;/a&gt;] - Fix description of metric `fullRestarts`
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14673&quot;&gt;FLINK-14673&lt;/a&gt;] - Shouldn&amp;#39;t expect HMS client to throw NoSuchObjectException for non-existing function
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14683&quot;&gt;FLINK-14683&lt;/a&gt;] - RemoteStreamEnvironment&amp;#39;s construction function has a wrong method
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14701&quot;&gt;FLINK-14701&lt;/a&gt;] - Slot leaks if SharedSlotOversubscribedException happens
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14784&quot;&gt;FLINK-14784&lt;/a&gt;] - CsvTableSink miss delimiter when row start with null member
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14817&quot;&gt;FLINK-14817&lt;/a&gt;] - &amp;quot;Streaming Aggregation&amp;quot; document contains misleading code examples
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14846&quot;&gt;FLINK-14846&lt;/a&gt;] - Correct the default writerbuffer size documentation of RocksDB
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14910&quot;&gt;FLINK-14910&lt;/a&gt;] - DisableAutoGeneratedUIDs fails on keyBy
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14930&quot;&gt;FLINK-14930&lt;/a&gt;] - OSS Filesystem Uses Wrong Shading Prefix
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14949&quot;&gt;FLINK-14949&lt;/a&gt;] - Task cancellation can be stuck against out-of-thread error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14951&quot;&gt;FLINK-14951&lt;/a&gt;] - State TTL backend end-to-end test fail when taskManager has multiple slot
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14953&quot;&gt;FLINK-14953&lt;/a&gt;] - Parquet table source should use schema type to build FilterPredicate
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14960&quot;&gt;FLINK-14960&lt;/a&gt;] - Dependency shading of table modules test fails on Travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14976&quot;&gt;FLINK-14976&lt;/a&gt;] - Cassandra Connector leaks Semaphore on Throwable; hangs on close
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15001&quot;&gt;FLINK-15001&lt;/a&gt;] - The digest of sub-plan reuse should contain retraction traits for stream physical nodes
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15013&quot;&gt;FLINK-15013&lt;/a&gt;] - Flink (on YARN) sometimes needs too many slots
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15030&quot;&gt;FLINK-15030&lt;/a&gt;] - Potential deadlock for bounded blocking ResultPartition.
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15036&quot;&gt;FLINK-15036&lt;/a&gt;] - Container startup error will be handled out side of the YarnResourceManager&amp;#39;s main thread
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15063&quot;&gt;FLINK-15063&lt;/a&gt;] - Input group and output group of the task metric are reversed
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15065&quot;&gt;FLINK-15065&lt;/a&gt;] - RocksDB configurable options doc description error
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15076&quot;&gt;FLINK-15076&lt;/a&gt;] - Source thread should be interrupted during the Task cancellation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15234&quot;&gt;FLINK-15234&lt;/a&gt;] - Hive table created from flink catalog table shouldn&amp;#39;t have null properties in parameters
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15240&quot;&gt;FLINK-15240&lt;/a&gt;] - is_generic key is missing for Flink table stored in HiveCatalog
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15259&quot;&gt;FLINK-15259&lt;/a&gt;] - HiveInspector.toInspectors() should convert Flink constant to Hive constant
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15266&quot;&gt;FLINK-15266&lt;/a&gt;] - NPE in blink planner code gen
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15361&quot;&gt;FLINK-15361&lt;/a&gt;] - ParquetTableSource should pass predicate in projectFields
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15412&quot;&gt;FLINK-15412&lt;/a&gt;] - LocalExecutorITCase#testParameterizedTypes failed in travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15413&quot;&gt;FLINK-15413&lt;/a&gt;] - ScalarOperatorsTest failed in travis
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15418&quot;&gt;FLINK-15418&lt;/a&gt;] - StreamExecMatchRule not set FlinkRelDistribution
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15421&quot;&gt;FLINK-15421&lt;/a&gt;] - GroupAggsHandler throws java.time.LocalDateTime cannot be cast to java.sql.Timestamp
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15435&quot;&gt;FLINK-15435&lt;/a&gt;] - ExecutionConfigTests.test_equals_and_hash in pyFlink fails when cpu core numbers is 6
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15443&quot;&gt;FLINK-15443&lt;/a&gt;] - Use JDBC connector write FLOAT value occur ClassCastException
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15478&quot;&gt;FLINK-15478&lt;/a&gt;] - FROM_BASE64 code gen type wrong
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15489&quot;&gt;FLINK-15489&lt;/a&gt;] - WebUI log refresh not working
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15522&quot;&gt;FLINK-15522&lt;/a&gt;] - Misleading root cause exception when cancelling the job
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15523&quot;&gt;FLINK-15523&lt;/a&gt;] - ConfigConstants generally excluded from japicmp
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15543&quot;&gt;FLINK-15543&lt;/a&gt;] - Apache Camel not bundled but listed in flink-dist NOTICE
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15549&quot;&gt;FLINK-15549&lt;/a&gt;] - Integer overflow in SpillingResettableMutableObjectIterator
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15577&quot;&gt;FLINK-15577&lt;/a&gt;] - WindowAggregate RelNodes missing Window specs in digest
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15615&quot;&gt;FLINK-15615&lt;/a&gt;] - Docs: wrong guarantees stated for the file sink
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt; Improvement
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11135&quot;&gt;FLINK-11135&lt;/a&gt;] - Reorder Hadoop config loading in HadoopUtils
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-12848&quot;&gt;FLINK-12848&lt;/a&gt;] - Method equals() in RowTypeInfo should consider fieldsNames
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-13729&quot;&gt;FLINK-13729&lt;/a&gt;] - Update website generation dependencies
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14008&quot;&gt;FLINK-14008&lt;/a&gt;] - Auto-generate binary licensing
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14104&quot;&gt;FLINK-14104&lt;/a&gt;] - Bump Jackson to 2.10.1
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14123&quot;&gt;FLINK-14123&lt;/a&gt;] - Lower the default value of taskmanager.memory.fraction
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14206&quot;&gt;FLINK-14206&lt;/a&gt;] - Let fullRestart metric count fine grained restarts as well
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14215&quot;&gt;FLINK-14215&lt;/a&gt;] - Add Docs for TM and JM Environment Variable Setting
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14251&quot;&gt;FLINK-14251&lt;/a&gt;] - Add FutureUtils#forward utility
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14334&quot;&gt;FLINK-14334&lt;/a&gt;] - ElasticSearch docs refer to non-existent ExceptionUtils.containsThrowable
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14335&quot;&gt;FLINK-14335&lt;/a&gt;] - ExampleIntegrationTest in testing docs is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14408&quot;&gt;FLINK-14408&lt;/a&gt;] - In OldPlanner, UDF open method can not be invoke when SQL is optimized
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14557&quot;&gt;FLINK-14557&lt;/a&gt;] - Clean up the package of py4j
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14639&quot;&gt;FLINK-14639&lt;/a&gt;] - Metrics User Scope docs refer to wrong class
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14646&quot;&gt;FLINK-14646&lt;/a&gt;] - Check non-null for key in KeyGroupStreamPartitioner
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14825&quot;&gt;FLINK-14825&lt;/a&gt;] - Rework state processor api documentation
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-14995&quot;&gt;FLINK-14995&lt;/a&gt;] - Kinesis NOTICE is incorrect
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15113&quot;&gt;FLINK-15113&lt;/a&gt;] - fs.azure.account.key not hidden from global configuration
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15554&quot;&gt;FLINK-15554&lt;/a&gt;] - Bump jetty-util-ajax to 9.3.24
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15657&quot;&gt;FLINK-15657&lt;/a&gt;] - Fix the python table api doc link in Python API tutorial
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15700&quot;&gt;FLINK-15700&lt;/a&gt;] - Improve Python API Tutorial doc
&lt;/li&gt;
&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-15726&quot;&gt;FLINK-15726&lt;/a&gt;] - Fixing error message in StreamExecTableSourceScan
&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Thu, 30 Jan 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/01/30/release-1.9.2.html</link>
<guid isPermaLink="true">/news/2020/01/30/release-1.9.2.html</guid>
</item>
<item>
<title>State Unlocked: Interacting with State in Apache Flink</title>
<description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
&lt;p&gt;With stateful stream-processing becoming the norm for complex event-driven applications and real-time analytics, &lt;a href=&quot;https://flink.apache.org/&quot;&gt;Apache Flink&lt;/a&gt; is often the backbone for running business logic and managing an organization’s most valuable asset — its data — as application state in Flink.&lt;/p&gt;
&lt;p&gt;In order to provide a state-of-the-art experience to Flink developers, the Apache Flink community makes significant efforts to provide the safety and future-proof guarantees organizations need while managing state in Flink. In particular, Flink developers should have sufficient means to access and modify their state, and bootstrapping state with existing data from external systems should be a piece of cake. These efforts span multiple Flink major releases and consist of the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Evolvable state schema in Apache Flink&lt;/li&gt;
&lt;li&gt;Flexibility in swapping state backends, and&lt;/li&gt;
&lt;li&gt;The State processor API, an offline tool to read, write and modify state in Flink&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This post discusses the community’s efforts related to state management in Flink, provides some practical examples of how the different features and APIs can be utilized and covers some future ideas for new and improved ways of managing state in Apache Flink.&lt;/p&gt;
&lt;h1 id=&quot;stream-processing-what-is-state&quot;&gt;Stream processing: What is State?&lt;/h1&gt;
&lt;p&gt;To set the tone for the remainder of the post, let us first try to explain the very definition of state in stream processing. When it comes to stateful stream processing, state comprises the information that an application or stream processing engine will remember across events and streams as more realtime (unbounded) and/or offline (bounded) data flows through the system. Even the most trivial applications are inherently stateful; consider the example of a simple COUNT operation, where, in order to count up to 10, you essentially need to remember that you have already counted up to 9.&lt;/p&gt;
&lt;p&gt;To better understand how Flink manages state, one can think of Flink as a three-layered state abstraction, as illustrated in the diagram below.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-01-29-state-unlocked-interacting-with-state-in-apache-flink/managing-state-in-flink-visual-1.png&quot; width=&quot;600px&quot; alt=&quot;State in Apache Flink&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;On the top layer sits the Flink user code, for example, a &lt;code&gt;KeyedProcessFunction&lt;/code&gt; that contains some value state. This is a simple variable whose registration as value state makes it automatically fault-tolerant, re-scalable and queryable by the runtime. These variables are backed by the configured state backend, which sits either on-heap or on-disk (RocksDB State Backend) and provides data locality, proximity to the computation and speed when it comes to per-record computations. Finally, when it comes to upgrades, the introduction of new features or bug fixes, savepoints are what keep your existing state intact.&lt;/p&gt;
&lt;p&gt;A savepoint is a snapshot of the distributed, global state of an application at a logical point in time and is stored in an external distributed file system or blob storage such as HDFS or S3. Upon upgrading an application or implementing a code change — such as adding a new operator or changing a field — the Flink job can restart by re-loading the application state from the savepoint into the state backend, making it local and available for the computation, and continue processing as if nothing had ever happened.&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-01-29-state-unlocked-interacting-with-state-in-apache-flink/managing-state-in-flink-visual-2.png&quot; width=&quot;600px&quot; alt=&quot;State in Apache Flink&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;div class=&quot;alert alert-info&quot;&gt;
It is important to remember here that &lt;b&gt;state is one of the most valuable components of a Flink application&lt;/b&gt; carrying all the information about both where you are now and where you are going. State is among the most long-lived components in a Flink service since it can be carried across jobs, operators, configurations, new features and bug fixes.
&lt;/div&gt;
&lt;h1 id=&quot;schema-evolution-with-apache-flink&quot;&gt;Schema Evolution with Apache Flink&lt;/h1&gt;
&lt;p&gt;In the previous section, we explained how state is stored and persisted in a Flink application. Let’s now take a look at what happens when evolving state in a stateful Flink streaming application becomes necessary.&lt;/p&gt;
&lt;p&gt;Imagine an Apache Flink application that implements a &lt;code&gt;KeyedProcessFunction&lt;/code&gt; and contains some &lt;code&gt;ValueState&lt;/code&gt;. As illustrated below, when registering the type within the state descriptor, Flink users specify their &lt;code&gt;TypeInformation&lt;/code&gt;, which informs Flink how to serialize the bytes and represents Flink’s internal type system, used to serialize data when it is shipped across the network or stored in state backends. Flink’s type system has built-in support for all the basic types such as longs, strings, doubles, arrays and basic collection types like lists and maps. Additionally, Flink supports most of the major composite types including Tuples, POJOs, Scala Case Classes and Apache Avro&lt;sup&gt;Ⓡ&lt;/sup&gt;. Finally, if an application’s type does not match any of the above, developers can plug in their own serializer; otherwise Flink falls back to Kryo.&lt;/p&gt;
&lt;h2 id=&quot;state-registration-with-built-in-serialization-in-apache-flink&quot;&gt;State registration with built-in serialization in Apache Flink&lt;/h2&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyFunction&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KeyedProcessFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Input&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;transient&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;valueState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Configuration&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ValueStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;descriptor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueStateDescriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;my-state&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeInformation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MyState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;valueState&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getRuntimeContext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getState&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;descriptor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Typically, evolving the schema of an application’s state happens because of some business logic change (adding or dropping fields or changing data types). In all cases, the schema is determined by means of its serializer, and can be thought of in terms of an ALTER TABLE statement in a database. When a state variable is first introduced it is like running a &lt;code&gt;CREATE_TABLE&lt;/code&gt; command: there is a lot of freedom with its execution. However, once the table contains data (registered rows), developers are limited in what they can do and what rules they must follow in order to make updates or changes with an &lt;code&gt;ALTER_TABLE&lt;/code&gt; statement. Schema migration in Apache Flink follows a similar principle, since the framework is essentially running an &lt;code&gt;ALTER_TABLE&lt;/code&gt; statement across savepoints.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://flink.apache.org/downloads.html#apache-flink-182&quot;&gt;Flink 1.8&lt;/a&gt; comes with built-in support for &lt;a href=&quot;https://avro.apache.org/&quot;&gt;Apache Avro&lt;/a&gt; (specifically the &lt;a href=&quot;https://avro.apache.org/docs/1.7.7/spec.html&quot;&gt;1.7.7 specification&lt;/a&gt;) and evolves state schema according to Avro specifications by adding and removing types or even by swapping between generic and specific Avro record types.&lt;/p&gt;
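&lt;p&gt;As a minimal sketch of what this looks like in user code (&lt;code&gt;MyAvroRecord&lt;/code&gt; stands in for a hypothetical Avro-generated record class), registering state with an Avro type is typically enough for Flink to pick its built-in Avro serializer, provided flink-avro is on the classpath:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;// Sketch: state registered with a hypothetical Avro-generated type, inside the
// open() method of a rich function. With flink-avro on the classpath, the
// built-in Avro serializer handles schema evolution for this state across savepoints.
ValueStateDescriptor&amp;lt;MyAvroRecord&amp;gt; descriptor =
    new ValueStateDescriptor&amp;lt;&amp;gt;(&amp;quot;my-avro-state&amp;quot;, MyAvroRecord.class);
ValueState&amp;lt;MyAvroRecord&amp;gt; avroState = getRuntimeContext().getState(descriptor);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;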
&lt;p&gt;In &lt;a href=&quot;https://flink.apache.org/downloads.html#apache-flink-191&quot;&gt;Flink 1.9&lt;/a&gt; the community added support for schema evolution for POJOs, including the ability to remove existing fields from POJO types or add new fields. The POJO schema evolution tends to be less flexible — when compared to Avro — since it is not possible to change either the declared field types or the class name of a POJO type, including its namespace.&lt;/p&gt;
&lt;p&gt;With the community’s efforts related to schema evolution, Flink developers can now expect out-of-the-box support for both Avro and POJO formats, with backwards compatibility for all Flink state backends. Future work revolves around adding support for Scala Case Classes, Tuples and other formats. Make sure to subscribe to the &lt;a href=&quot;https://flink.apache.org/community.html&quot;&gt;Flink mailing list&lt;/a&gt; to contribute and stay on top of any upcoming additions in this space.&lt;/p&gt;
&lt;h2 id=&quot;peeking-under-the-hood&quot;&gt;Peeking Under the Hood&lt;/h2&gt;
&lt;p&gt;Now that we have explained how schema evolution in Flink works, let’s describe the challenges of performing schema serialization with Flink under the hood. Flink considers state a core part of its API stability, meaning that developers should always be able to take a savepoint from one version of Flink and restart it on the next. With schema evolution, every migration needs to be backwards compatible and also compatible with the different state backends. While in the Flink code the state backends are represented as interfaces detailing how to store and retrieve bytes, in practice they behave vastly differently, something that adds extra complexity to how schema evolution is executed in Flink.&lt;/p&gt;
&lt;p&gt;For instance, the heap state backend supports lazy serialization and eager deserialization, meaning that the per-record code path always works with Java objects, with serialization happening on a background thread. When restoring, Flink will eagerly deserialize all the data and then start the user code. If a developer plugs in a new serializer, the deserialization happens before Flink ever receives the information.&lt;/p&gt;
&lt;p&gt;The RocksDB state backend behaves in the exact opposite manner: it supports eager serialization — because items are stored on disk and RocksDB only consumes byte arrays. RocksDB provides lazy deserialization simply by downloading files to the local disk, making Flink unaware of what the bytes mean until a serializer is registered.&lt;/p&gt;
&lt;p&gt;An additional challenge stems from the fact that different versions of the user code contain different classes on their classpath, making the serializer that was used to write a savepoint potentially unavailable at runtime.&lt;/p&gt;
&lt;p&gt;To overcome the previously mentioned challenges, we introduced what we call the &lt;code&gt;TypeSerializerSnapshot&lt;/code&gt;. The &lt;code&gt;TypeSerializerSnapshot&lt;/code&gt; stores the configuration of the writer serializer in the snapshot. When restoring, Flink will use that configuration to read back the previous state and check its compatibility with the current version. Such an operation allows Flink to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Read the configuration used to write out a snapshot&lt;/li&gt;
&lt;li&gt;Consume the new user code&lt;/li&gt;
&lt;li&gt;Check if both items above are compatible&lt;/li&gt;
&lt;li&gt;Consume the bytes from the snapshot and move forward or alert the user otherwise&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;TypeSerializerSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getCurrentVersion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;writeSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataOutputView&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IOException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;readSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readVersion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataInputView&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ClassLoader&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;userCodeClassLoader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IOException&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TypeSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;restoreSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TypeSerializerSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolveSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TypeSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;newSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
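&lt;p&gt;Putting these pieces together, the compatibility check and the decision that follows it conceptually boil down to something like the sketch below. This is a simplified illustration rather than the actual Flink restore code, and the helper method is our own; only the &lt;code&gt;TypeSerializerSnapshot&lt;/code&gt; and &lt;code&gt;TypeSerializerSchemaCompatibility&lt;/code&gt; calls are part of the public API.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.api.common.typeutils.TypeSerializerSchemaCompatibility;
import org.apache.flink.api.common.typeutils.TypeSerializerSnapshot;

class RestoreSketch {

    // Decide which serializer should read the old bytes, given the snapshot
    // restored from the savepoint and the serializer from the new user code.
    static &amp;lt;T&amp;gt; TypeSerializer&amp;lt;T&amp;gt; serializerForRestore(
            TypeSerializerSnapshot&amp;lt;T&amp;gt; snapshotFromSavepoint,
            TypeSerializer&amp;lt;T&amp;gt; newSerializer) {

        TypeSerializerSchemaCompatibility&amp;lt;T&amp;gt; compatibility =
                snapshotFromSavepoint.resolveSchemaCompatibility(newSerializer);

        if (compatibility.isCompatibleAsIs()) {
            // Nothing changed: the new serializer can consume the old bytes directly.
            return newSerializer;
        } else if (compatibility.isCompatibleAfterMigration()) {
            // Read the old bytes with the restore serializer, then re-serialize
            // the objects with the new serializer to migrate the state.
            return snapshotFromSavepoint.restoreSerializer();
        } else {
            // Alert the user: the state cannot be restored with the new code as is.
            throw new IllegalStateException(&amp;quot;State schema is not compatible with the new serializer&amp;quot;);
        }
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;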
&lt;h2 id=&quot;implementing-apache-avro-serialization-in-flink&quot;&gt;Implementing Apache Avro Serialization in Flink&lt;/h2&gt;
&lt;p&gt;Apache Avro is a data serialization format that has very well-defined schema migration semantics and supports both reader and writer schemas. During normal Flink execution the reader and writer schemas will be the same. However, when upgrading an application they may be different and with schema evolution, Flink will be able to migrate objects with their schemas.&lt;/p&gt;
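&lt;p&gt;As a quick illustration of reader and writer schemas, consider adding a new optional field to a record. The record and field names below are made up for this example, but the Avro &lt;code&gt;SchemaCompatibility&lt;/code&gt; check is the same one the serializer snapshot relies on later in this post.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;

class AvroEvolutionExample {

    public static void main(String[] args) {
        // Schema the previous version of the job used to write its state (the writer schema).
        Schema writerSchema = SchemaBuilder.record(&amp;quot;Account&amp;quot;).fields()
                .requiredLong(&amp;quot;id&amp;quot;)
                .endRecord();

        // Schema of the upgraded application (the reader schema): a new optional
        // field with a null default has been added.
        Schema readerSchema = SchemaBuilder.record(&amp;quot;Account&amp;quot;).fields()
                .requiredLong(&amp;quot;id&amp;quot;)
                .optionalString(&amp;quot;mailingAddress&amp;quot;)
                .endRecord();

        // Because the added field has a default value, Avro can resolve records
        // written with the old schema against the new one, i.e. the state can be migrated.
        SchemaPairCompatibility compatibility =
                SchemaCompatibility.checkReaderWriterCompatibility(readerSchema, writerSchema);

        System.out.println(compatibility.getType());
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;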
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AvroSerializerSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeSerializerSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Class&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@SuppressWarnings&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;WeakerAccess&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;AvroSerializerSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;AvroSerializerSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Schema&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;runtimeSchema&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is a sketch of our Avro serializer snapshot. It uses the provided schemas and delegates to Apache Avro for all (de)serialization. Let’s take a look at one possible implementation of a &lt;code&gt;TypeSerializerSnapshot&lt;/code&gt; that supports schema migration for Avro.&lt;/p&gt;
&lt;h1 id=&quot;writing-out-the-snapshot&quot;&gt;Writing out the snapshot&lt;/h1&gt;
&lt;p&gt;When writing out the snapshot, the snapshot configuration will write two pieces of information: the current snapshot configuration version and the serializer’s configuration.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt; &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;getCurrentVersion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;writeSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataOutputView&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IOException&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;writeUTF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toString&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The version is used to version the snapshot configuration object itself, while the &lt;code&gt;writeSnapshot&lt;/code&gt; method writes out all the information we need to understand the current format: the runtime schema.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt; &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;readSnapshot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readVersion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DataInputView&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ClassLoader&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;userCodeClassLoader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IOException&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;readVersion&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousSchemaDefinition&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;readUTF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;previousSchema&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parseAvroSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previousSchemaDefinition&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;runtimeType&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;findClassOrFallbackToGeneric&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;userCodeClassLoader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getFullName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;runtimeSchema&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tryExtractAvroSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;userCodeClassLoader&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, when Flink restores, it is able to read back the writer schema that was used to serialize the data. The current runtime schema is discovered on the classpath using a bit of Java reflection.&lt;/p&gt;
&lt;p&gt;Once we have both of them, we can compare them for compatibility. Perhaps nothing has changed and the schemas are compatible as-is.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt; &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeSerializerSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolveSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;TypeSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;newSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(!(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;newSerializer&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;instanceof&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AvroSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeSerializerSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;incompatible&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Objects&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;equals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeSerializerSchemaCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;compatibleAsIs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Otherwise, the schemas are compared using Avro’s compatibility checks, and they may either be compatible after a migration or incompatible.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SchemaPairCompatibility&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compatibility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SchemaCompatibility&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;checkReaderWriterCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;avroCompatibilityToFlinkCompatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compatibility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If they are compatible after migration, Flink will restore a serializer that reads data written with the old schema and deserializes it into the new runtime type, which is in effect a migration.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt; &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;restoreSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AvroSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtimeType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previousSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AvroSerializer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtimeType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;runtimeSchema&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
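&lt;p&gt;For completeness, the snapshot reaches Flink through the serializer itself: every &lt;code&gt;TypeSerializer&lt;/code&gt; exposes its snapshot via &lt;code&gt;snapshotConfiguration()&lt;/code&gt;. The fragment below only sketches that wiring; it is declared abstract so that the unrelated per-record (de)serialization methods can be left out, and the class name is made up, whereas the real &lt;code&gt;AvroSerializer&lt;/code&gt; in &lt;code&gt;flink-avro&lt;/code&gt; implements the full interface.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.avro.Schema;
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.api.common.typeutils.TypeSerializerSnapshot;

// Sketch only: the abstract class lets us omit the per-record
// serialization methods that a full serializer must implement.
public abstract class AvroSerializerSketch&amp;lt;T&amp;gt; extends TypeSerializer&amp;lt;T&amp;gt; {

    private final Schema runtimeSchema;

    protected AvroSerializerSketch(Schema runtimeSchema) {
        this.runtimeSchema = runtimeSchema;
    }

    // Called by Flink when a savepoint is taken: the returned snapshot is stored
    // next to the state bytes and read back on restore.
    @Override
    public TypeSerializerSnapshot&amp;lt;T&amp;gt; snapshotConfiguration() {
        return new AvroSerializerSnapshot&amp;lt;&amp;gt;(runtimeSchema);
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;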
&lt;h1 id=&quot;the-state-processor-api-reading-writing-and-modifying-flink-state&quot;&gt;The State Processor API: Reading, writing and modifying Flink state&lt;/h1&gt;
&lt;p&gt;The State Processor API allows reading from and writing to Flink savepoints. Some of the interesting use cases it can be used for are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Analyzing state for interesting patterns&lt;/li&gt;
&lt;li&gt;Troubleshooting or auditing jobs by checking for state discrepancies&lt;/li&gt;
&lt;li&gt;Bootstrapping state for new applications&lt;/li&gt;
&lt;li&gt;Modifying savepoints such as:
&lt;ul&gt;
&lt;li&gt;Changing the maximum parallelism of a savepoint after deploying a Flink job&lt;/li&gt;
&lt;li&gt;Introducing breaking schema updates to a Flink application&lt;/li&gt;
&lt;li&gt;Correcting invalid state in a Flink savepoint&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In a &lt;a href=&quot;https://flink.apache.org/feature/2019/09/13/state-processor-api.html&quot;&gt;previous blog post&lt;/a&gt;, we discussed the State Processor API in detail, the community’s motivation behind introducing the feature in Flink 1.9, what you can use the API for, and how you can use it. Essentially, the State Processor API is based around a relational model of mapping your Flink job state to a database, as illustrated in the diagrams below. We encourage you to &lt;a href=&quot;https://flink.apache.org/feature/2019/09/13/state-processor-api.html&quot;&gt;read the previous post&lt;/a&gt; for more information on the API and how to use it. In a follow-up post, we will provide detailed tutorials on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reading Keyed and Operator State with the State Processor API and&lt;/li&gt;
&lt;li&gt;Writing and Bootstrapping Keyed and Operator State with the State Processor API&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Stay tuned for more details and guidance around this feature of Flink.&lt;/p&gt;
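&lt;p&gt;In the meantime, here is a minimal sketch of what reading operator state from an existing savepoint can look like with the batch-based API introduced in Flink 1.9. The savepoint path, operator uid and state name below are placeholders you would replace with the values of your own job.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;

public class ReadSavepointSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();

        // Load an existing savepoint; the path is a placeholder, and the state
        // backend is used to interpret the savepoint&amp;#39;s contents.
        ExistingSavepoint savepoint = Savepoint.load(
                bEnv, &amp;quot;hdfs:///flink/savepoints/savepoint-123&amp;quot;, new MemoryStateBackend());

        // Read an operator list state registered under the uid &amp;quot;my-operator&amp;quot;
        // with the state name &amp;quot;elements&amp;quot; as a regular DataSet.
        DataSet&amp;lt;Long&amp;gt; elements =
                savepoint.readListState(&amp;quot;my-operator&amp;quot;, &amp;quot;elements&amp;quot;, Types.LONG);

        elements.print();
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;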
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-01-29-state-unlocked-interacting-with-state-in-apache-flink/managing-state-in-flink-state-processor-api-visual-1.png&quot; width=&quot;600px&quot; alt=&quot;State Processor API in Apache Flink&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;center&gt;
&lt;img src=&quot;/img/blog/2020-01-29-state-unlocked-interacting-with-state-in-apache-flink/managing-state-in-flink-state-processor-api-visual-2.png&quot; width=&quot;600px&quot; alt=&quot;State Processor API in Apache Flink&quot; /&gt;
&lt;/center&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;h1 id=&quot;looking-ahead-more-ways-to-interact-with-state-in-flink&quot;&gt;Looking ahead: More ways to interact with State in Flink&lt;/h1&gt;
&lt;p&gt;There is a lot of discussion happening in the community related to extending the way Flink developers interact with state in their Flink applications. Regarding the State Processor API, some thoughts revolve around further broadening the API’s scope beyond its current ability to read from and write to both keyed and operator state. In upcoming releases, the State Processor API will be extended to support reading from and writing to windows, and to have a first-class integration with Flink’s Table API and SQL.&lt;/p&gt;
&lt;p&gt;Beyond widening the scope of the State Processor API, the Flink community is discussing a few additional ways to improve how developers interact with state in Flink. One of them is the proposal for a Unified Savepoint Format (&lt;a href=&quot;https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Binary+format+for+Keyed+State&quot;&gt;FLIP-41&lt;/a&gt;) for all keyed state backends. Such an improvement aims at introducing a unified binary format across savepoints of all keyed state backends, which would drastically reduce the overhead of swapping the state backend in a Flink application. It would allow developers to take a savepoint of their application and restart it with a different state backend, for example moving it from the heap to disk (the RocksDB state backend) and back, depending on the scalability and evolution of the application at different points in time.&lt;/p&gt;
&lt;p&gt;The community is also discussing the ability to have upgradability dry runs in upcoming Flink releases. Having such functionality in Flink would allow developers to detect incompatible updates offline, without the need to start a new Flink job from scratch. For example, Flink users would be able to uncover topology or schema incompatibilities when upgrading a Flink job, without having to load the state back into a running job in the first place. Additionally, with upgradability dry runs, Flink users would be able to get information about the registered state through the streaming graph, without needing to access the state in the state backend.&lt;/p&gt;
&lt;p&gt;With all the exciting new functionality added in Flink 1.9, as well as some solid ideas and discussions around bringing state in Flink to the next level, the community is committed to making state in Apache Flink a fundamental element of the framework: something that is ever-present across versions and upgrades of your application and a true first-class citizen in Apache Flink. We encourage you to sign up for the &lt;a href=&quot;https://flink.apache.org/community.html&quot;&gt;mailing list&lt;/a&gt; and stay on top of the announcements and new features in upcoming releases.&lt;/p&gt;
</description>
<pubDate>Wed, 29 Jan 2020 13:00:00 +0100</pubDate>
<link>https://flink.apache.org/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html</link>
<guid isPermaLink="true">/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html</guid>
</item>
</channel>
</rss>