gluten-flink/docs/Flink.md - incubator-gluten - Git at Google

 ---
 layout: page
 title: Gluten For Flink with Velox Backend
 nav_order: 1
 ---

 # Supported Version

 | Type  | Version                      |
 |-------|------------------------------|
 | Flink | 1.19.2                       |
 | OS    | Ubuntu20.04/22.04, Centos7/8 |
 | jdk   | openjdk11/jdk17              |
 | scala | 2.12                         |

 # Prerequisite

 Currently, with static build Gluten+Flink+Velox backend supports all the Linux OSes, but is only tested on **Ubuntu20.04**. With dynamic build, Gluten+Velox backend support **Ubuntu20.04/Ubuntu22.04/Centos7/Centos8** and their variants.

 Currently, the officially supported Flink version is 1.19.2.

 We need to set up the `JAVA_HOME` env. Currently, Gluten supports **java 11** and **java 17**.

 **For x86_64**

 ```bash
 ## make sure jdk11 is used
 export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
 export PATH=$JAVA_HOME/bin:$PATH
 ```

 **For aarch64**

 ```bash
 ## make sure jdk11 is used
 export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-arm64
 export PATH=$JAVA_HOME/bin:$PATH
 ```

 **Get Velox4j**

 Gluten for Flink depends on [Velox4j](https://github.com/velox4j/velox4j) to call velox. This is an experimental feature.
 You need to get the Velox4j code, and compile it first.

 As some features have not been committed to upstream, you have to use the following fork to run it first.

 ```bash
 ## fetch velox4j code
 git clone -b gluten-0530 https://github.com/bigo-sg/velox4j.git
 cd velox4j
 git reset --hard 0180528e9b98fad22bc9da8a3864d2929ef73eec
 mvn clean install -DskipTests -Dgpg.skip -Dspotless.skip=true
 ```
 **Get gluten**

 ```bash
 ## config maven, like proxy in ~/.m2/settings.xml

 ## fetch gluten code
 git clone https://github.com/apache/incubator-gluten.git
 ```

 # Build Gluten Flink with Velox Backend

 ```
 cd /path/to/gluten/gluten-flink
 mvn clean package -Dmaven.test.skip=true
 ```

 # Run Unit Tests
 **Get Nexmark**
 ```shell
 git clone https://github.com/nexmark/nexmark.git
 cd nexmark
 mvn clean install -DskipTests
 ```
 **Run Tests**
 ```shell
 cd /path/to/gluten/gluten-flink
 mvn test
 ```

 ## Submit the Flink SQL job

 Submit test script from `flink run`. You can use the `StreamSQLExample` as an example.

 ### Flink local cluster

 After deploying Flink binaries, please configure the paths of gluten-flink jars and the dependency jars for Flink to use, as follows:

 ```shell
 # notice: first set your own specified project home, you may set it
 # mannualy in you .bash_profile so that it can auto take effect.
 export VELOX4J_HOME=
 export GLUTEN_FLINK_HOME=
 export FLINK_HOME=

 cd $FLINK_HOME
 mkdir -p gluten_lib
 ln -s $VELOX4J_HOME/target/velox4j-0.1.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar
 ln -s $GLUTEN_FLINK_HOME/runtime/target/gluten-flink-runtime-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar
 ln -s $GLUTEN_FLINK_HOME/loader/target/gluten-flink-loader-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar
 ```

 And make them loaded before flink libraries.

 #### How to make sure gluten classes loaded first in Flink?

 Gluten classes need to be loaded first in Flink,
 you can modify the constructFlinkClassPath function in `$FLINK_HOME/bin/config.sh` like this:

 ```
 GLUTEN_JAR="$FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar:$FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar:$FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar:"
 echo "$GLUTEN_JAR""$FLINK_CLASSPATH""$FLINK_DIST"
 ```

 Then you can go to flink binary path and use the below scripts to
 submit the example job.

 ```bash
 cd $FLINK_HOME
 bin/start-cluster.sh
 bin/flink run examples/table/StreamSQLExample.jar
 ```

 Then you can get the result in `log/flink-*-taskexecutor-*.out`.
 And you can see an operator named `gluten-cal` from the web frontend of your flink job.

 **Notice: current this example will cause npe until  [issue-10315](https://github.com/apache/incubator-gluten/issues/10315) get resolved.**

 #### All operators executed by native
 Another example supports all operators executed by native.
 You can use the data-generator.sql under dev directory.

 ```bash
 bin/sql-client.sh -f data-generator.sql
 ```

 ### Flink Yarn per job mode

 TODO

 ## Performance
 We are working on supporting the [Nexmark](https://github.com/nexmark/nexmark) benchmark for Flink.
 Now the q0 has been supported.

 Results show that running with gluten can be 2.x times faster than Flink.

 Result using gluten (will support TPS metric soon):
 ```
 -------------------------------- Nexmark Results --------------------------------

 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 | Query| Events Num      | Cores  | Time(s)  | Cores * Time(s) | Throughput   | Throughput/Cores|
 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 |q0    |100,000,000      |NaN     |161.428   |NaN              |619.47 K/s    |0/s              |
 |Total |100,000,000      |NaN     |161.428   |NaN              |619.47 K/s    |0/s              |
 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 ```

 Result using Flink:
 ```
 -------------------------------- Nexmark Results --------------------------------

 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 | Query| Events Num      | Cores  | Time(s)  | Cores * Time(s) | Throughput   | Throughput/Cores|
 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 |q0    |100,000,000      |1.21    |462.069   |558.210          |216.42 K/s    |179.14/s         |
 |Total |100,000,000      |1.208   |462.069   |558.210          |216.42 K/s    |179.14/s         |
 +------+-----------------+--------+----------+-----------------+--------------+-----------------+
 ```
 We are still optimizing it.

 ## Notes:
 Now both Gluten for Flink and Velox4j have not a bundled jar including all jars depends on.
 So you may have to add these jars by yourself, which may including guava-33.4.0-jre.jar, jackson-core-2.18.0.jar,
 jackson-databind-2.18.0.jar, jackson-datatype-jdk8-2.18.0.jar, jackson-annotations-2.18.0.jar, arrow-memory-core-18.1.0.jar,
 arrow-memory-unsafe-18.1.0.jar, arrow-vector-18.1.0.jar, flatbuffers-java-24.3.25.jar, arrow-format-18.1.0.jar, arrow-c-data-18.1.0.jar.
 We will supply bundled jars soon.
	---
	layout: page
	title: Gluten For Flink with Velox Backend
	nav_order: 1
	---

	# Supported Version

	\| Type \| Version \|
	\|-------\|------------------------------\|
	\| Flink \| 1.19.2 \|
	\| OS \| Ubuntu20.04/22.04, Centos7/8 \|
	\| jdk \| openjdk11/jdk17 \|
	\| scala \| 2.12 \|

	# Prerequisite

	Currently, with static build Gluten+Flink+Velox backend supports all the Linux OSes, but is only tested on Ubuntu20.04. With dynamic build, Gluten+Velox backend support Ubuntu20.04/Ubuntu22.04/Centos7/Centos8 and their variants.

	Currently, the officially supported Flink version is 1.19.2.

	We need to set up the `JAVA_HOME` env. Currently, Gluten supports java 11 and java 17.

	For x86_64

	```bash
	## make sure jdk11 is used
	export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
	export PATH=$JAVA_HOME/bin:$PATH
	```

	For aarch64

	```bash
	## make sure jdk11 is used
	export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-arm64
	export PATH=$JAVA_HOME/bin:$PATH
	```

	Get Velox4j

	Gluten for Flink depends on [Velox4j](https://github.com/velox4j/velox4j) to call velox. This is an experimental feature.
	You need to get the Velox4j code, and compile it first.

	As some features have not been committed to upstream, you have to use the following fork to run it first.

	```bash
	## fetch velox4j code
	git clone -b gluten-0530 https://github.com/bigo-sg/velox4j.git
	cd velox4j
	git reset --hard 0180528e9b98fad22bc9da8a3864d2929ef73eec
	mvn clean install -DskipTests -Dgpg.skip -Dspotless.skip=true
	```
	Get gluten

	```bash
	## config maven, like proxy in ~/.m2/settings.xml

	## fetch gluten code
	git clone https://github.com/apache/incubator-gluten.git
	```

	# Build Gluten Flink with Velox Backend

	```
	cd /path/to/gluten/gluten-flink
	mvn clean package -Dmaven.test.skip=true
	```

	# Run Unit Tests
	Get Nexmark
	```shell
	git clone https://github.com/nexmark/nexmark.git
	cd nexmark
	mvn clean install -DskipTests
	```
	Run Tests
	```shell
	cd /path/to/gluten/gluten-flink
	mvn test
	```

	## Submit the Flink SQL job

	Submit test script from `flink run`. You can use the `StreamSQLExample` as an example.

	### Flink local cluster

	After deploying Flink binaries, please configure the paths of gluten-flink jars and the dependency jars for Flink to use, as follows:

	```shell
	# notice: first set your own specified project home, you may set it
	# mannualy in you .bash_profile so that it can auto take effect.
	export VELOX4J_HOME=
	export GLUTEN_FLINK_HOME=
	export FLINK_HOME=

	cd $FLINK_HOME
	mkdir -p gluten_lib
	ln -s $VELOX4J_HOME/target/velox4j-0.1.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar
	ln -s $GLUTEN_FLINK_HOME/runtime/target/gluten-flink-runtime-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar
	ln -s $GLUTEN_FLINK_HOME/loader/target/gluten-flink-loader-1.6.0-SNAPSHOT.jar $FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar
	```

	And make them loaded before flink libraries.

	#### How to make sure gluten classes loaded first in Flink?

	Gluten classes need to be loaded first in Flink,
	you can modify the constructFlinkClassPath function in `$FLINK_HOME/bin/config.sh` like this:

	```
	GLUTEN_JAR="$FLINK_HOME/gluten_lib/gluten-flink-loader-1.6.0.jar:$FLINK_HOME/gluten_lib/velox4j-0.1.0-SNAPSHOT.jar:$FLINK_HOME/gluten_lib/gluten-flink-runtime-1.6.0.jar:"
	echo "$GLUTEN_JAR""$FLINK_CLASSPATH""$FLINK_DIST"
	```

	Then you can go to flink binary path and use the below scripts to
	submit the example job.

	```bash
	cd $FLINK_HOME
	bin/start-cluster.sh
	bin/flink run examples/table/StreamSQLExample.jar
	```

	Then you can get the result in `log/flink--taskexecutor-.out`.
	And you can see an operator named `gluten-cal` from the web frontend of your flink job.

	Notice: current this example will cause npe until [issue-10315](https://github.com/apache/incubator-gluten/issues/10315) get resolved.

	#### All operators executed by native
	Another example supports all operators executed by native.
	You can use the data-generator.sql under dev directory.

	```bash
	bin/sql-client.sh -f data-generator.sql
	```

	### Flink Yarn per job mode

	TODO

	## Performance
	We are working on supporting the [Nexmark](https://github.com/nexmark/nexmark) benchmark for Flink.
	Now the q0 has been supported.

	Results show that running with gluten can be 2.x times faster than Flink.

	Result using gluten (will support TPS metric soon):
	```
	-------------------------------- Nexmark Results --------------------------------

	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	\| Query\| Events Num \| Cores \| Time(s) \| Cores * Time(s) \| Throughput \| Throughput/Cores\|
	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	\|q0 \|100,000,000 \|NaN \|161.428 \|NaN \|619.47 K/s \|0/s \|
	\|Total \|100,000,000 \|NaN \|161.428 \|NaN \|619.47 K/s \|0/s \|
	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	```

	Result using Flink:
	```
	-------------------------------- Nexmark Results --------------------------------

	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	\| Query\| Events Num \| Cores \| Time(s) \| Cores * Time(s) \| Throughput \| Throughput/Cores\|
	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	\|q0 \|100,000,000 \|1.21 \|462.069 \|558.210 \|216.42 K/s \|179.14/s \|
	\|Total \|100,000,000 \|1.208 \|462.069 \|558.210 \|216.42 K/s \|179.14/s \|
	+------+-----------------+--------+----------+-----------------+--------------+-----------------+
	```
	We are still optimizing it.

	## Notes:
	Now both Gluten for Flink and Velox4j have not a bundled jar including all jars depends on.
	So you may have to add these jars by yourself, which may including guava-33.4.0-jre.jar, jackson-core-2.18.0.jar,
	jackson-databind-2.18.0.jar, jackson-datatype-jdk8-2.18.0.jar, jackson-annotations-2.18.0.jar, arrow-memory-core-18.1.0.jar,
	arrow-memory-unsafe-18.1.0.jar, arrow-vector-18.1.0.jar, flatbuffers-java-24.3.25.jar, arrow-format-18.1.0.jar, arrow-c-data-18.1.0.jar.
	We will supply bundled jars soon.