The examples are variations of word count that illustrate end-to-end exactly-once processing. They incorporate integration with external systems, an aspect that must be taken into account when developing real-world pipelines:
The examples combine the three properties that are required for end-to-end exactly-once results:
The test cases show how the applications can be configured to run in embedded mode (including Kafka).
This application shows exactly-once output to JDBC through transactions. The JDBC output operator stores the streaming window ID together with the counts, so that windows replayed during recovery are not written a second time. This is an example of continuously updating results in the database, which the transactions make possible.
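The window-tracking idea can be sketched as follows. This is a minimal illustration, not the example's actual JDBC operator: `WindowTrackingStore` and its methods are hypothetical names, and an in-memory map stands in for the database so the sketch runs without a JDBC driver. In a real database, the count updates and the window ID update would have to be committed in a single transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a database table of counts
// plus a metadata row holding the last committed window ID.
final class WindowTrackingStore {
    private final Map<String, Long> counts = new HashMap<>();
    private long committedWindowId = -1;

    /**
     * Applies the incremental counts of one streaming window.
     * Returns false and leaves the state unchanged if the window
     * was already committed, which is the case on replay after recovery.
     */
    synchronized boolean commitWindow(long windowId, Map<String, Long> deltas) {
        if (windowId <= committedWindowId) {
            return false; // replayed window: skip to avoid a duplicate write
        }
        // Assumption: in a real database, these updates and the window ID
        // update below belong to one transaction, so they succeed or fail together.
        for (Map.Entry<String, Long> e : deltas.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Long::sum);
        }
        committedWindowId = windowId;
        return true;
    }

    synchronized long count(String word) {
        return counts.getOrDefault(word, 0L);
    }

    synchronized long committedWindowId() {
        return committedWindowId;
    }
}
```

Calling `commitWindow` twice with the same window ID changes the totals only once, which is what makes replay during recovery safe while still allowing the totals to be updated after every window.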
This application shows exactly-once output to files through an atomic file operation. In contrast to the JDBC example, output can only occur once the final count is computed. This implies batching at the sink and therefore higher latency.
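One common way to make file output atomic is to write to a temporary file and then rename it into place. The sketch below illustrates that pattern with the standard `java.nio.file` API. It is an assumption about how such a sink could work, not the example's actual file operator, and `AtomicCountFileWriter` and `writeFinalCounts` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: publishes the final counts as a single file
// that becomes visible only once it is complete.
final class AtomicCountFileWriter {

    static void writeFinalCounts(Path target, Map<String, Long> counts) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Write into the same directory so the rename stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            StringBuilder sb = new StringBuilder();
            // Sort by word so the output is deterministic.
            for (Map.Entry<String, Long> e : new TreeMap<>(counts).entrySet()) {
                sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
            }
            Files.write(tmp, sb.toString().getBytes(StandardCharsets.UTF_8));
            // Readers see either no file or the complete file, never a partial one.
            // Whether an existing target is replaced is file-system specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the leftover on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the complete result must exist before the rename, the counts have to be batched until they are final, which is the source of the added latency compared with the JDBC example.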