
Blaze


The Blaze project aims to provide Spark SQL with a high-performance, low-cost native execution layer.

We seek to solve a series of performance bottlenecks in the current JVM-based Task execution of Spark SQL, such as high fluctuations in performance due to GC, high memory overhead, and inability to accelerate computation directly with SIMD instructions.

This repo is under active development and is not ready for production (or even development) use, but stay tuned for updates! ☺️

Overview

How fast we are, compared to Vanilla Spark

How to run it

1. Build and Run

Build Blaze with the Gradle wrapper, choosing either the debug or the release mode:

./gradlew -Pmode=[debug|release] build

Once Blaze is built, pass the resulting jar to Spark's bin/spark-submit or bin/spark-sql script.

./bin/spark-submit \
  --jars target/blaze-engine-${VERSION}.jar \
  ....

or

./bin/spark-sql \
  --jars target/blaze-engine-${VERSION}.jar \
  ....
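
The build mode and the jar path are the two values that vary between runs, so it may help to generate them rather than type them by hand. The following is a minimal sketch, not part of Blaze: the function names `blaze_build_cmd` and `blaze_jar_path` and the version string `0.1.0` are hypothetical and chosen only for illustration. The functions print the command and path without executing anything, so the output can be checked first.

```shell
#!/bin/sh
# Hypothetical helpers; the function names and the example version
# below are assumptions for illustration, not part of Blaze.

# Print the Gradle build command for a mode, rejecting anything
# other than the two modes listed above (debug, release).
blaze_build_cmd() {
  case "$1" in
    debug|release) printf './gradlew -Pmode=%s build\n' "$1" ;;
    *) echo "mode must be debug or release" >&2; return 1 ;;
  esac
}

# Print the path of the built jar for a given version string.
blaze_jar_path() {
  printf 'target/blaze-engine-%s.jar\n' "$1"
}

blaze_build_cmd release
blaze_jar_path 0.1.0
```

The printed jar path is what `${VERSION}` expands to in the commands above, assuming `VERSION` is set to the same string.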

2. Run using Docker

TBD

For developers

Are we TPC-DS yet?

  • [ ] Q95