title: "Ecosystem"

Connectors

Running an application that uses one of these connectors usually requires installing and starting additional third-party components, e.g., the servers for the message queues. Further instructions can be found in the corresponding subsections.

Third-Party Projects

This is a list of third-party packages (i.e., libraries, system extensions, or examples) built on Flink. The Flink community collects links to these packages but does not maintain them. They are therefore not part of the Apache Flink project, and the community cannot provide support for them. Is your project missing? Please let us know on the [user/dev mailing list]({{ site.baseurl }}/community.html#mailing-lists).

Apache Zeppelin

Apache Zeppelin is a web-based notebook that enables interactive data analytics and can be used with Flink as an execution engine. See also Jim Dowling's Flink Forward talk about Zeppelin on Flink.

Apache Mahout

Apache Mahout is a machine learning library that will feature Flink as an execution engine soon. Check out Sebastian Schelter's Flink Forward talk about the Mahout-Samsara DSL.

Cascading

Cascading enables a user to build complex workflows easily on Flink and other execution engines. Cascading on Flink is built by dataArtisans and Driven, Inc. See Fabian Hueske's Flink Forward talk for more details.

Apache Beam

Apache Beam is an open-source, unified programming model that you can use to create a data processing pipeline. Flink is one of the back-ends supported by the Beam programming model.

GRADOOP

GRADOOP enables scalable graph analytics on top of Flink and is developed at Leipzig University. Check out Martin Junghanns’ Flink Forward talk.

BigPetStore

BigPetStore is a benchmarking suite that includes a data generator and will be available for Flink soon. See Suneel Marthi's Flink Forward talk for a preview.

FastR

FastR is an implementation of the R language in Java. FastR Flink executes R workloads on top of Flink.

Apache SAMOA

Apache SAMOA (incubating) is a streaming ML library that will feature Flink as an execution engine soon. Albert Bifet introduced SAMOA on Flink in his Flink Forward talk.

Alluxio

Alluxio is an open-source, memory-speed virtual distributed storage system that enables applications to efficiently share and access data across different storage systems in a unified namespace. Here is an example of using Flink to access data through Alluxio.

Python Examples on Flink

A collection of examples using Apache Flink's Python API.

WordCount Example in Clojure

A small WordCount example showing how to write a Flink program in Clojure.

Anomaly Detection and Prediction in Flink

flink-htm is a library for anomaly detection and prediction in Apache Flink. The algorithms are based on Hierarchical Temporal Memory (HTM) as implemented by the Numenta Platform for Intelligent Computing (NuPIC).

Apache Ignite

Apache Ignite is a high-performance, integrated, and distributed in-memory platform for computing and transacting on large-scale data sets in real time. See the Flink streaming sink connector for injecting data into an Ignite cache.

Tink temporal graph library

Tink is a temporal graph library built on top of Flink. It supports temporal graph analytics, such as different interpretations of the shortest temporal path algorithm, and metrics such as temporal betweenness and temporal closeness. The library is the result of Wouter Ligtenberg's thesis.

FlinkK8sOperator

FlinkK8sOperator is a Kubernetes operator that manages Flink applications on Kubernetes. The operator acts as a control plane to manage the complete deployment lifecycle of the application.