Machine learning library of Apache Flink

Flink ML is a library that provides machine learning (ML) APIs and infrastructure to simplify building ML pipelines. Users can implement ML algorithms with the standard ML APIs and then use this infrastructure to build ML pipelines for both training and inference jobs.
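
To make the pipeline idea concrete, here is a minimal conceptual sketch in plain Python. It is not the Flink ML API: every class name below (Transformer, Estimator, MeanCenter, Scale, Pipeline, PipelineModel) is a hypothetical stand-in, and plain lists take the place of Flink tables. It only illustrates the general pattern of fitting a chain of stages during training and reusing the fitted chain for inference; see the quick starts for the real APIs.

```python
from typing import List, Sequence


class Transformer:
    """Hypothetical stage that maps input rows to output rows."""

    def transform(self, rows: Sequence[float]) -> List[float]:
        raise NotImplementedError


class Estimator:
    """Hypothetical stage that learns from data and returns a fitted Transformer."""

    def fit(self, rows: Sequence[float]) -> Transformer:
        raise NotImplementedError


class MeanCenterModel(Transformer):
    """Fitted stage: subtracts the mean learned during training."""

    def __init__(self, mean: float) -> None:
        self.mean = mean

    def transform(self, rows: Sequence[float]) -> List[float]:
        return [x - self.mean for x in rows]


class MeanCenter(Estimator):
    """Trainable stage: learns the mean of the training rows."""

    def fit(self, rows: Sequence[float]) -> Transformer:
        return MeanCenterModel(sum(rows) / len(rows))


class Scale(Transformer):
    """Stateless stage: multiplies every row by a fixed factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def transform(self, rows: Sequence[float]) -> List[float]:
        return [x * self.factor for x in rows]


class PipelineModel(Transformer):
    """Fitted pipeline: applies each fitted stage in order."""

    def __init__(self, stages: Sequence[Transformer]) -> None:
        self.stages = list(stages)

    def transform(self, rows: Sequence[float]) -> List[float]:
        for stage in self.stages:
            rows = stage.transform(rows)
        return list(rows)


class Pipeline(Estimator):
    """Unfitted pipeline: fits trainable stages one after another."""

    def __init__(self, stages: Sequence[object]) -> None:
        self.stages = list(stages)

    def fit(self, rows: Sequence[float]) -> Transformer:
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(rows)    # train this stage on the current data
            rows = stage.transform(rows)   # feed its output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


def demo() -> List[float]:
    pipeline = Pipeline([MeanCenter(), Scale(2.0)])
    model = pipeline.fit([1.0, 2.0, 3.0])  # training step
    return model.transform([4.0])          # inference step
```

In this toy version the pipeline is itself an estimator and the fitted pipeline is itself a transformer, so the same object built for training can be reused unchanged for inference.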

Flink ML is developed under the umbrella of Apache Flink.

Getting Started

Follow the Python quick start and the Java quick start to get hands-on experience with the Flink ML Python and Java APIs, respectively.

Building the Project

Run the mvn clean package command.

The build produces a JAR file that contains your application, plus any libraries that you added as dependencies to the application: target/<artifact-id>-<version>.jar.

Benchmark

Flink ML provides functionality to benchmark its machine learning algorithms. For detailed information, see the Benchmark Getting Started guide.

Documentation

The Flink ML documentation is available on the website at https://nightlies.apache.org/flink/flink-ml-docs-master/ and in the docs/ directory of the source code.

Contributing

You can learn more about how to contribute on the Apache Flink website. For code contributions, please read the Contributing Code section carefully for an overview of ongoing community work.

License

The code in this repository is licensed under the Apache License, Version 2.0.