blob: 6f98d3e143392e634f52379123543063b49252bf [file] [log] [blame] [view]
---
layout: page
title: Gluten For Flink with Velox Backend
nav_order: 1
---
# Supported Version
| Type | Version |
|-------|------------------------------|
| Flink | 1.19.2 |
| OS | Ubuntu20.04/22.04, Centos7/8 |
| jdk | openjdk11/jdk17 |
| scala | 2.12 |
# Prerequisite
Currently, with static build Gluten+Flink+Velox backend supports all the Linux OSes, but is only tested on **Ubuntu20.04**. With dynamic build, Gluten+Velox backend support **Ubuntu20.04/Ubuntu22.04/Centos7/Centos8** and their variants.
Currently, the officially supported Flink version is 1.19.2.
We need to set up the `JAVA_HOME` env. Currently, Gluten supports **java 11** and **java 17**.
**For x86_64**
```bash
## make sure jdk11 is used
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
```
**For aarch64**
```bash
## make sure jdk11 is used
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-arm64
export PATH=$JAVA_HOME/bin:$PATH
```
**Get Velox4j**
Gluten for Flink depends on [Velox4j](https://github.com/velox4j/velox4j) to call velox. This is an experimental feature.
You need to get the Velox4j code, and compile it first.
As some features have not been committed to upstream, you have to use the following fork to run it first.
```bash
## fetch velox4j code
git clone -b gluten-0530 https://github.com/bigo-sg/velox4j.git
cd velox4j
git reset --hard 0180528e9b98fad22bc9da8a3864d2929ef73eec
mvn clean install -DskipTests -Dgpg.skip -Dspotless.skip=true
```
**Get gluten**
```bash
## config maven, like proxy in ~/.m2/settings.xml
## fetch gluten code
git clone https://github.com/apache/incubator-gluten.git
```
# Build Gluten Flink with Velox Backend
```
cd /path/to/gluten/gluten-flink
mvn clean package -Dmaven.test.skip=true
```
# Run Unit Tests
**Get Nexmark**
```shell
git clone https://github.com/nexmark/nexmark.git
cd nexmark
mvn clean install -DskipTests
```
**Run Tests**
```shell
cd /path/to/gluten/gluten-flink
mvn test
```
## Submit the Flink SQL job
Submit test script from `flink run`. You can use the `StreamSQLExample` as an example.
### Flink local cluster
After deploying Flink binaries, please configure the paths of gluten-flink jars and the dependency jars for Flink to use, as follows:
```shell
# notice: first set your own specified project home, you may set it
# mannualy in you .bash_profile so that it can auto take effect.
export VELOX4J_HOME=
export GLUTEN_FLINK_HOME=
export FLINK_HOME=
cd $FLINK_HOME
mkdir -p gluten_lib
ln -s $VELOX4J_HOME/target/velox4j-0.1.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar
ln -s $GLUTEN_FLINK_HOME/runtime/target/gluten-flink-runtime-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar
ln -s $GLUTEN_FLINK_HOME/loader/target/gluten-flink-loader-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar
```
And make them loaded before flink libraries.
#### How to make sure gluten classes loaded first in Flink?
Gluten classes need to be loaded first in Flink,
you can modify the constructFlinkClassPath function in `$FLINK_HOME/bin/config.sh` like this:
```
GLUTEN_JAR="$FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar:$FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar:$FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar:"
echo "$GLUTEN_JAR""$FLINK_CLASSPATH""$FLINK_DIST"
```
Then you can go to flink binary path and use the below scripts to
submit the example job.
```bash
cd $FLINK_HOME
bin/start-cluster.sh
bin/flink run examples/table/StreamSQLExample.jar
```
Then you can get the result in `log/flink-*-taskexecutor-*.out`.
And you can see an operator named `gluten-cal` from the web frontend of your flink job.
**Notice: current this example will cause npe until [issue-10315](https://github.com/apache/incubator-gluten/issues/10315) get resolved.**
#### All operators executed by native
Another example supports all operators executed by native.
You can use the data-generator.sql under dev directory.
```bash
bin/sql-client.sh -f data-generator.sql
```
### Flink Yarn per job mode
TODO
## Performance
We are working on supporting the [Nexmark](https://github.com/nexmark/nexmark) benchmark for Flink.
Now the q0 has been supported.
Results show that running with gluten can be 2.x times faster than Flink.
Result using gluten (will support TPS metric soon):
```
-------------------------------- Nexmark Results --------------------------------
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
| Query| Events Num | Cores | Time(s) | Cores * Time(s) | Throughput | Throughput/Cores|
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
|q0 |100,000,000 |NaN |161.428 |NaN |619.47 K/s |0/s |
|Total |100,000,000 |NaN |161.428 |NaN |619.47 K/s |0/s |
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
```
Result using Flink:
```
-------------------------------- Nexmark Results --------------------------------
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
| Query| Events Num | Cores | Time(s) | Cores * Time(s) | Throughput | Throughput/Cores|
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
|q0 |100,000,000 |1.21 |462.069 |558.210 |216.42 K/s |179.14/s |
|Total |100,000,000 |1.208 |462.069 |558.210 |216.42 K/s |179.14/s |
+------+-----------------+--------+----------+-----------------+--------------+-----------------+
```
We are still optimizing it.
## Notes:
Now both Gluten for Flink and Velox4j have not a bundled jar including all jars depends on.
So you may have to add these jars by yourself, which may including guava-33.4.0-jre.jar, jackson-core-2.18.0.jar,
jackson-databind-2.18.0.jar, jackson-datatype-jdk8-2.18.0.jar, jackson-annotations-2.18.0.jar, arrow-memory-core-18.1.0.jar,
arrow-memory-unsafe-18.1.0.jar, arrow-vector-18.1.0.jar, flatbuffers-java-24.3.25.jar, arrow-format-18.1.0.jar, arrow-c-data-18.1.0.jar.
We will supply bundled jars soon.