To build Auron, please follow the steps below:
The native execution library is written in Rust, so you need to install Rust (nightly) before compiling. We recommend installing it with rustup.
Auron has been well tested on JDK 8, 11, and 17, and should also work with higher versions.
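Before cloning, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of the Auron repository; the function names `have_tool` and `check_prereqs` are hypothetical helpers introduced here for illustration.

```shell
# Hypothetical helpers (not shipped with Auron) for checking build prerequisites.

# Succeed if the named tool can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool; return non-zero if any tool is missing.
check_prereqs() {
  prereq_missing=0
  for prereq_tool in "$@"; do
    if have_tool "$prereq_tool"; then
      echo "ok: $prereq_tool"
    else
      echo "missing: $prereq_tool"
      prereq_missing=1
    fi
  done
  return "$prereq_missing"
}
```

For example, `check_prereqs git rustup cargo java` reports which of the four tools are available. If rustup is present but the nightly toolchain is not, `rustup toolchain install nightly` installs it.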
git clone git@github.com:kwai/auron.git
cd auron
Specify the shims package for the Spark version you want to run on.
The currently supported shims are spark-3.0, spark-3.1, spark-3.2, spark-3.3, spark-3.4, and spark-3.5.
You can build Auron either in pre mode for debugging or in release mode to unlock its full potential.
SHIM=spark-3.5 # or spark-3.0/spark-3.1/spark-3.2/spark-3.3/spark-3.4
MODE=release # or pre
JDK=jdk-8
./build/mvn package -P"${SHIM}" -P"${MODE}" -P"${JDK}"
After the build finishes, a fat JAR package that contains all the dependencies is generated in the target directory.
You can use the following command to build a CentOS 7-compatible release:
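Because SHIM and MODE are passed to Maven as profile names, a typo only surfaces once Maven starts. A small wrapper can check the values first. This is a sketch under assumptions: `is_supported_shim`, `is_supported_mode`, and `build_command` are hypothetical names, the accepted values are only those named in this document, and the JDK profile is passed through unchecked because only `jdk-8` is shown here.

```shell
# Hypothetical wrapper helpers; the accepted values mirror the ones listed in this README.

# Succeed if $1 is one of the shim profiles named above.
is_supported_shim() {
  case "$1" in
    spark-3.0|spark-3.1|spark-3.2|spark-3.3|spark-3.4|spark-3.5) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if $1 is one of the two build modes named above.
is_supported_mode() {
  case "$1" in
    pre|release) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the Maven command for <shim> <mode> <jdk>, or fail with a message on stderr.
build_command() {
  if ! is_supported_shim "$1"; then
    echo "unsupported SHIM: $1" >&2
    return 1
  fi
  if ! is_supported_mode "$2"; then
    echo "unsupported MODE: $2" >&2
    return 1
  fi
  echo "./build/mvn package -P$1 -P$2 -P$3"
}
```

Calling `build_command spark-3.5 release jdk-8` prints the command instead of running it, so you can review it before starting a long build.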
SHIM=spark-3.5 MODE=release ./release-docker.sh
This section describes how to submit and configure a Spark Job with Auron support.
Move the Auron JAR package to the Spark client classpath (normally spark-xx.xx.xx/jars/).
Add the following settings to the Spark configuration in spark-xx.xx.xx/conf/spark-defaults.conf:
spark.auron.enable true
spark.sql.extensions org.apache.spark.sql.auron.AuronSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.auron.shuffle.AuronShuffleManager
spark.memory.offHeap.enabled false
# suggested executor memory configuration
spark.executor.memory 4g
spark.executor.memoryOverhead 4096
Submit a query with spark-sql, for example:
spark-sql -f tpcds/q01.sql
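If you prefer not to edit the shared configuration file, the same settings can be passed per job through spark-sql's `--conf key=value` option. The sketch below converts whitespace-separated `key value` lines into such flags. The function name `conf_to_flags` and the file name `auron.conf` are hypothetical, and the sketch assumes values contain no spaces and lines carry no trailing comments.

```shell
# Hypothetical helper: read "key value" lines on stdin and print one
# "--conf key=value" flag per setting, skipping blank lines and # comments.
conf_to_flags() {
  while read -r conf_key conf_value; do
    case "$conf_key" in
      ''|'#'*) continue ;;
    esac
    printf '%s %s=%s\n' '--conf' "$conf_key" "$conf_value"
  done
}
```

With the settings above saved in a file such as `auron.conf`, a job could then be submitted as `spark-sql $(conf_to_flags < auron.conf) -f tpcds/q01.sql`. The command substitution is left unquoted on purpose so that each flag becomes a separate argument.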