Apache Auron (Incubating)

The Auron accelerator for big data engines (e.g., Spark, Flink) leverages native vectorized execution to accelerate query processing. It combines the power of the Apache DataFusion library with the scale of the distributed computing framework.

Auron takes a fully optimized physical plan from the distributed computing framework, maps it into a DataFusion execution plan, and performs the plan computation natively.

The key capabilities of Auron include:

  • Native execution: Implemented in Rust, eliminating JVM overhead and enabling predictable performance.
  • Vectorized computation: Built on Apache Arrow's columnar format, fully leveraging SIMD instructions for batch processing.
  • Pluggable architecture: Seamlessly integrates with Apache Spark while being designed for future extensibility to other engines.
  • Production-hardened optimizations: Multi-level memory management, compacted shuffle formats, and adaptive execution strategies developed through large-scale deployment.

Building on DataFusion's well-defined extensibility, Auron can be easily extended to support:

  • Various object stores.
  • Operators.
  • Simple and Aggregate functions.
  • File formats.

We encourage you to extend DataFusion's capabilities directly and then add support in Auron with simple modifications to plan-serde and the extension translation.

Build from source

To build Auron from source, follow the steps below:

  1. Install Rust

Auron's native execution library is written in Rust. You need to install Rust (nightly) before compiling.

We recommend using rustup for installation.
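
A minimal sketch of this step, assuming a Unix-like shell with curl available. The function name install_rust_nightly is illustrative and not part of Auron; the installer URL is the one published by the rustup project. This README does not say whether a specific nightly is pinned, so the sketch simply installs the current nightly toolchain.

```shell
# Hypothetical helper: install rustup if it is missing, then add the nightly toolchain.
install_rust_nightly() {
  if ! command -v rustup >/dev/null 2>&1; then
    # Official rustup installer; -y accepts the default options non-interactively.
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    # Make cargo and rustup visible in the current shell session.
    . "$HOME/.cargo/env"
  fi
  rustup toolchain install nightly
}
```

Call install_rust_nightly once, then confirm the result with rustup toolchain list. If the build does not select the toolchain on its own, rustup default nightly makes it the default.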

  2. Install JDK

Auron has been well tested with JDK 8, 11, and 17.

Make sure JAVA_HOME is properly set and points to your desired version.
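
One way to verify this before building is a small shell check. This is a sketch only: check_java_home is a hypothetical helper name, and it assumes the conventional JDK layout in which the java launcher lives under $JAVA_HOME/bin.

```shell
# Hypothetical helper: fail early if JAVA_HOME is unset or does not contain a java launcher.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "No java executable under $JAVA_HOME/bin" >&2
    return 1
  fi
  # Print the version so it can be compared against the tested JDKs (8, 11, 17).
  "$JAVA_HOME/bin/java" -version
}
```

Running check_java_home prints the version of the JDK that JAVA_HOME points to, or returns a non-zero status with a short message when the variable is missing or wrong.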

  3. Check out the source code.

  4. Build the project.

You can build Auron either locally or inside Docker with CentOS7 using a unified script: auron-build.sh.

Run ./auron-build.sh --help to see all available options.

After the build completes, a fat JAR with all dependencies will be generated in either the target/ directory (for local builds) or target-docker/ directory (for Docker builds), depending on the selected build mode.
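
To locate the build output described above, a sketch along these lines may help. The helper name find_auron_jars is hypothetical, and the exact JAR file name depends on the build options, so the function only lists whatever .jar files sit directly in target/ or target-docker/.

```shell
# Hypothetical helper: list JARs produced by a local or Docker build.
# Pass the repository root as the first argument (defaults to the current directory).
find_auron_jars() {
  repo_root="${1:-.}"
  for dir in "$repo_root/target" "$repo_root/target-docker"; do
    [ -d "$dir" ] || continue
    for jar in "$dir"/*.jar; do
      # An unmatched glob stays literal, so check that the path is a real file.
      if [ -f "$jar" ]; then
        echo "$jar"
      fi
    done
  done
}
```

A typical session would run ./auron-build.sh --help to choose the options, run the build, and then call find_auron_jars from the repository root to see which JAR was produced.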

Run Spark Job with Auron Accelerator

This section describes how to submit and configure a Spark Job with Auron support.

  1. Move the Auron JAR to the Spark client classpath (normally spark-xx.xx.xx/jars/).

  2. Add the following configs to the Spark configuration in spark-xx.xx.xx/conf/spark-defaults.conf:

spark.auron.enable true
spark.sql.extensions org.apache.spark.sql.auron.AuronSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.auron.shuffle.AuronShuffleManager
spark.memory.offHeap.enabled false

# suggested executor memory configuration
spark.executor.memory 4g
spark.executor.memoryOverhead 4096

  3. Submit a query with spark-sql, or with other tools such as spark-thriftserver:

spark-sql -f tpcds/q01.sql
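
The configuration step above can also be scripted. The following is a sketch, not an official tool: append_auron_conf is a hypothetical helper that appends the four required settings from this section to a Spark properties file, assuming the file uses whitespace-separated key/value lines. The suggested executor memory settings are left out because they depend on the cluster.

```shell
# Hypothetical helper: append the required Auron settings to a Spark properties file.
# Keys that the file already defines (as whitespace-separated lines) are left untouched.
append_auron_conf() {
  conf_file="$1"
  touch "$conf_file"
  while read -r key value; do
    [ -n "$key" ] || continue
    # Exact match on the first field, so existing values are never overwritten.
    if ! awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$conf_file"; then
      printf '%s %s\n' "$key" "$value" >> "$conf_file"
    fi
  done <<'EOF'
spark.auron.enable true
spark.sql.extensions org.apache.spark.sql.auron.AuronSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.auron.shuffle.AuronShuffleManager
spark.memory.offHeap.enabled false
EOF
}
```

For example, after copying the Auron JAR into the Spark jars directory, append_auron_conf "$SPARK_HOME/conf/spark-defaults.conf" adds the missing keys, and running it a second time changes nothing. Because existing keys are skipped rather than rewritten, a file that already sets one of these keys to a different value needs to be edited by hand.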

Performance

TPC-DS 1TB Benchmark Results:

[Figure: TPC-DS 1TB benchmark results chart (tpcds-benchmark-echarts.png)]

For methodology and additional results, please refer to the benchmark documentation.

We also encourage you to benchmark Auron and share the results with us. 🤗

Community

Subscribe to Mailing Lists

Mailing lists are the most recognized form of communication in the Apache community. Contact us through the following mailing list.

| Name | Scope | | |
|---|---|---|---|
| dev@auron.apache.org | Development-related discussions | Subscribe | Unsubscribe |

Report Issues or Submit a Pull Request

If you have questions or run into a problem, contact us, or fix the issue by submitting a Pull Request.

License

Auron is licensed under the Apache 2.0 License. A copy of the license can be found here.