Apache Flink Playgrounds

This repository provides playgrounds to quickly and easily explore Apache Flink's features.

The playgrounds are based on docker-compose environments. Each subfolder of this repository contains the docker-compose setup of one playground, except for the ./docker folder, which contains code and configuration to build custom Docker images for the playgrounds.
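
Since every playground is a docker-compose setup in its own folder, working with one comes down to running docker-compose commands from inside that folder. The following is a minimal sketch of that idea, not part of the repository: the helper names (`compose_command`, `run_playground`) and the action names are hypothetical, and it assumes the standalone `docker-compose` binary is on the PATH (newer Docker installations may use `docker compose` instead). Only the folder names come from this README. By default it performs a dry run and only returns what it would execute.

```python
import subprocess
from pathlib import Path

# Folder names taken from this README; everything else is illustrative.
PLAYGROUND_FOLDERS = (
    "operations-playground",
    "table-walkthrough",
    "pyflink-walkthrough",
)


def compose_command(action):
    """Map a hypothetical action name to a docker-compose argument list."""
    commands = {
        "build": ["docker-compose", "build"],
        "start": ["docker-compose", "up", "-d"],
        "status": ["docker-compose", "ps"],
        "stop": ["docker-compose", "down"],
    }
    if action not in commands:
        raise ValueError(f"unknown action: {action!r}")
    return commands[action]


def run_playground(repo_root, folder, action, dry_run=True):
    """Run (or, by default, only describe) a docker-compose action in a playground folder."""
    if folder not in PLAYGROUND_FOLDERS:
        raise ValueError(f"not a playground folder: {folder!r}")
    workdir = Path(repo_root) / folder
    command = compose_command(action)
    if not dry_run:
        # Requires Docker and a local checkout of the repository.
        subprocess.run(command, cwd=workdir, check=True)
    return workdir, command


if __name__ == "__main__":
    workdir, command = run_playground(".", "operations-playground", "start")
    print(f"would run {' '.join(command)} in {workdir}")
```

Passing `dry_run=False` would actually invoke docker-compose, so that path needs Docker running and should be called from a checkout of this repository.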

Available Playgrounds

Currently, the following playgrounds are available:

  • The Flink Operations Playground (in the operations-playground folder) lets you explore Flink's features for managing and operating stream processing jobs. You can observe how Flink recovers a job from a failure, upgrade and rescale a job, and query job metrics. The playground consists of a Flink cluster, a Kafka cluster, and an example Flink job. The playground is presented in detail in “Flink Operations Playground”, which is part of the Try Flink section of the Flink documentation.

  • The Table Walkthrough (in the table-walkthrough folder) shows how to use the Table API to build an analytics pipeline that reads streaming data from Kafka and writes results to MySQL, along with a real-time dashboard in Grafana. The walkthrough is presented in detail in “Real Time Reporting with the Table API”, which is part of the Try Flink section of the Flink documentation.

  • The PyFlink Walkthrough (in the pyflink-walkthrough folder) provides a complete example that uses the Python API and guides you through the steps needed to run and manage PyFlink jobs. The pipeline used in this walkthrough reads data from Kafka, performs aggregations, and writes the results to Elasticsearch, where they are visualized with Kibana. This walkthrough is presented in detail in the pyflink-walkthrough README.

About

Apache Flink is an open source project of The Apache Software Foundation (ASF).

Flink is a distributed data processing framework with powerful stream and batch processing capabilities. Learn more about Flink at https://flink.apache.org/