Blaze


The Blaze project aims to provide Spark SQL with a high-performance, low-cost native execution layer.

We aim to remove several performance bottlenecks in Spark SQL's current JVM-based task execution, such as performance fluctuations caused by GC, high memory overhead, and the inability to accelerate computation directly with SIMD instructions.

This repo is under active development and is not ready for production (or even development) use, but stay tuned for updates! ☺️

Overview

How fast we are, compared to Vanilla Spark

How to run it

1. Build and Run

Blaze can be built with:

./gradlew -Pmode=[debug|release] build

Once Blaze has been built successfully, it can be submitted with the bin/spark-submit or bin/spark-sql script:

./bin/spark-submit \
  --jars target/blaze-engine-${VERSION}.jar \
  ....

or

./bin/spark-sql \
  --jars target/blaze-engine-${VERSION}.jar \
  ....
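
The build and submit steps above could be wrapped in a small helper script. The following is only a sketch: the function names (`blaze_jar`, `check_mode`, `build_blaze`) and the default `VERSION` value are hypothetical and not part of the Blaze repository; only the `-Pmode` values and the jar path pattern come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper; VERSION and MODE defaults are placeholders, not values from the Blaze repo.
VERSION="${VERSION:-0.1.0}"   # assumed version string
MODE="${MODE:-release}"       # debug or release, as accepted by -Pmode

# Print the engine jar path, following the target/blaze-engine-${VERSION}.jar pattern.
blaze_jar() {
  printf 'target/blaze-engine-%s.jar\n' "$VERSION"
}

# Accept only the two build modes named in the build command.
check_mode() {
  case "$1" in
    debug|release) return 0 ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Validate the mode, then run the Gradle build from the repository root.
build_blaze() {
  check_mode "$MODE" || return 1
  ./gradlew -Pmode="$MODE" build
}
```

After sourcing the script, `build_blaze` runs the build and `$(blaze_jar)` can be passed to `--jars` in either submit command; the remaining Spark arguments still have to be supplied by hand.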

2. Run using Docker

TBD

For developers

Are we TPC-DS yet?

  • [ ] Q95