Gluten supports Intel® QuickAssist Technology (QAT) for compressing data during Spark shuffle. Compression and decompression are offloaded to the QAT hardware, and Gzip is used as the compression format because its higher compression ratio reduces the pressure on disks and network transmission.
This feature is based on the QAT driver library and the QATzip library. Download the QAT driver for your system manually, then follow its README to build and install it on all Driver and Worker nodes: Intel® QuickAssist Technology Driver for Linux* – HW Version 2.0.
echo "export ICP_ROOT=/path/to/QAT_driver" >> ~/.bashrc
source ~/.bashrc

# Also set for root if running as non-root user
sudo su -
echo "export ICP_ROOT=/path/to/QAT_driver" >> ~/.bashrc
exit
sudo su -
usermod -aG qat username # need relogin to take effect

# To set 500MB add a line like this in /etc/security/limits.conf
echo "@qat - memlock 500000" >> /etc/security/limits.conf

exit
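After logging in again, it can be useful to confirm that the new memlock limit is in effect. The following is a minimal sketch, assuming that `ulimit -l` reports the limit in kilobytes (as bash does) or the word `unlimited`; `memlock_ok` is a hypothetical helper name, and the 500000 threshold is taken from the limits.conf line above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a memlock limit (a number of KB, or
# "unlimited") is at least the required number of KB.
memlock_ok() {
  local current="$1" required_kb="$2"
  if [ "$current" = "unlimited" ]; then
    return 0
  fi
  [ "$current" -ge "$required_kb" ]
}

# Demonstration with fixed values rather than the live limit.
if memlock_ok 500000 500000; then
  echo "memlock limit is sufficient"
fi
```

On a configured node, the live value can be checked with `memlock_ok "$(ulimit -l)" 500000`, run as the user that was added to the qat group.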
sudo su -

cat << EOF > /usr/local/bin/qat_startup.sh
#!/bin/bash
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
rmmod usdm_drv
insmod $ICP_ROOT/build/usdm_drv.ko max_huge_pages=1024 max_huge_pages_per_process=32
EOF

chmod +x /usr/local/bin/qat_startup.sh

cat << EOF > /etc/systemd/system/qat_startup.service
[Unit]
Description=Configure QAT

[Service]
ExecStart=/usr/local/bin/qat_startup.sh

[Install]
WantedBy=multi-user.target
EOF

systemctl enable qat_startup.service
systemctl start qat_startup.service # setup immediately
systemctl status qat_startup.service

exit
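Once the service has started, the hugepage reservation can be verified by reading back the same sysfs file that the startup script writes. This is a sketch under that assumption; `check_hugepages` is a hypothetical helper, and the demonstration uses a temporary file so it runs on any machine.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the page count stored in FILE is at
# least EXPECTED.
check_hugepages() {
  local file="$1" expected="$2" actual
  actual="$(cat "$file")" || return 1
  [ "$actual" -ge "$expected" ]
}

# Demonstration against a temporary file standing in for the sysfs entry.
demo_file="$(mktemp)"
echo 1024 > "$demo_file"
if check_hugepages "$demo_file" 1024; then
  echo "hugepages reserved as expected"
fi
rm -f "$demo_file"
```

On a real node the call would be `check_hugepages /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages 1024`. With 1024 pages of 2048 kB each, the reservation amounts to 2 GiB of memory.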
cd /path/to/gluten

## The script builds four jars for spark 3.2.2, 3.3.1, 3.4.3 and 3.5.1.
./dev/buildbundle-veloxbe.sh --enable_qat=ON
## run as root
## Overwrite QAT configuration file.
cd /etc
for i in {0..7}; do echo "4xxx_dev$i.conf"; done | xargs -i cp -f /path/to/gluten/docs/qat/4x16.conf {}
## Restart QAT after updating configuration files.
adf_ctl restart
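The loop above assumes eight QAT devices (dev0 through dev7). On a system with a different device count, the list of target file names could be generated from a parameter instead. The sketch below only prints the names, so it can serve as a dry run before anything under /etc is overwritten; `qat_conf_names` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the 4xxx configuration file name for each of
# COUNT devices, one per line.
qat_conf_names() {
  local count="$1" i
  for (( i = 0; i < count; i++ )); do
    echo "4xxx_dev${i}.conf"
  done
}

# Dry run for a two-device system: show the copy commands without running them.
qat_conf_names 2 | while read -r name; do
  echo "cp -f /path/to/gluten/docs/qat/4x16.conf /etc/${name}"
done
```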
adf_ctl status
The output should look like this:
Checking status of all devices.
There is 8 QAT acceleration device(s) in the system:
 qat_dev0 - type: 4xxx, inst_id: 0, node_id: 0, bsf: 0000:6b:00.0, #accel: 1 #engines: 9 state: up
 qat_dev1 - type: 4xxx, inst_id: 1, node_id: 1, bsf: 0000:70:00.0, #accel: 1 #engines: 9 state: up
 qat_dev2 - type: 4xxx, inst_id: 2, node_id: 2, bsf: 0000:75:00.0, #accel: 1 #engines: 9 state: up
 qat_dev3 - type: 4xxx, inst_id: 3, node_id: 3, bsf: 0000:7a:00.0, #accel: 1 #engines: 9 state: up
 qat_dev4 - type: 4xxx, inst_id: 4, node_id: 4, bsf: 0000:e8:00.0, #accel: 1 #engines: 9 state: up
 qat_dev5 - type: 4xxx, inst_id: 5, node_id: 5, bsf: 0000:ed:00.0, #accel: 1 #engines: 9 state: up
 qat_dev6 - type: 4xxx, inst_id: 6, node_id: 6, bsf: 0000:f2:00.0, #accel: 1 #engines: 9 state: up
 qat_dev7 - type: 4xxx, inst_id: 7, node_id: 7, bsf: 0000:f7:00.0, #accel: 1 #engines: 9 state: up
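To check this output in a script, one option is to count the devices that report `state: up`. The sketch below assumes the line format shown above; `count_qat_up` is a hypothetical helper name, not part of the QAT tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `adf_ctl status` output on stdin and print the
# number of lines that report "state: up".
count_qat_up() {
  # grep -c exits non-zero when the count is 0; `|| true` keeps the
  # function from failing in that case while still printing 0.
  grep -c 'state: up' || true
}

# Demonstration with two lines copied from the sample output.
printf '%s\n' \
  ' qat_dev0 - type: 4xxx, inst_id: 0, node_id: 0, bsf: 0000:6b:00.0, #accel: 1 #engines: 9 state: up' \
  ' qat_dev1 - type: 4xxx, inst_id: 1, node_id: 1, bsf: 0000:70:00.0, #accel: 1 #engines: 9 state: up' \
  | count_qat_up
```

On a node with the driver installed, `adf_ctl status | count_qat_up` would be expected to print 8 for the configuration described here.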
--conf spark.gluten.sql.columnar.shuffle.codec=gzip # Valid options are gzip and zstd
--conf spark.gluten.sql.columnar.shuffle.codecBackend=qat
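When these options are assembled in a launch script, a small wrapper can reject an unsupported codec before the job is submitted. This is a sketch only: `qat_shuffle_confs` is a hypothetical helper, and the accepted values (gzip and zstd) are taken from the comment above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the two QAT shuffle options for the given
# codec (default gzip), or fail for a codec not listed as valid above.
qat_shuffle_confs() {
  local codec="${1:-gzip}"
  case "$codec" in
    gzip|zstd) ;;
    *) echo "unsupported codec: $codec" >&2; return 1 ;;
  esac
  echo "--conf spark.gluten.sql.columnar.shuffle.codec=${codec}"
  echo "--conf spark.gluten.sql.columnar.shuffle.codecBackend=qat"
}

# Show the options that would be passed for the default codec.
qat_shuffle_confs
```

The output can be spliced into a Spark command line with an unquoted substitution such as `$(qat_shuffle_confs gzip)`, alongside whatever other options the job already uses.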
while :; do cat /sys/kernel/debug/qat_4xxx_0000:6b:00.0/fw_counters; sleep 1; done
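The loop above watches a single device (0000:6b:00.0). To cover every device, the same file could be read from each matching debugfs directory. The sketch below assumes that all devices expose `fw_counters` under a `qat_4xxx_*` directory, as the path above suggests; `dump_fw_counters` is a hypothetical helper, and reading debugfs normally requires root.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print fw_counters for every qat_4xxx_* directory
# under BASE (default /sys/kernel/debug), each preceded by its path.
dump_fw_counters() {
  local base="${1:-/sys/kernel/debug}" f
  for f in "$base"/qat_4xxx_*/fw_counters; do
    # Skip unreadable entries, including the unexpanded glob when no
    # device directory matches.
    [ -r "$f" ] || continue
    echo "== $f"
    cat "$f"
  done
}
```

It can replace the `cat` in the loop above: `while :; do dump_fw_counters; sleep 1; done`.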
Documentation
README Text Files (README_QAT20.L.1.0.0-00021.txt)
Release Notes
Check out the Intel® QuickAssist Technology Software for Linux* - Release Notes for the latest changes in this release.
Getting Started Guide
Check out the Intel® QuickAssist Technology Software for Linux* - Getting Started Guide for detailed installation instructions.
Programmer's Guide
Check out the Intel® QuickAssist Technology Software for Linux* - Programmer's Guide for software usage guidelines.
For more Intel® QuickAssist Technology resources go to Intel® QuickAssist Technology (Intel® QAT)