---
layout: page
title: Profile memory consumption of Gluten
nav_order: 8
has_children: true
parent: /developer-overview/
---
Gluten offloads most of Spark SQL execution to a native engine. We can use [gperftools](https://github.com/gperftools/gperftools) or [jemalloc](https://github.com/jemalloc/jemalloc)
to analyze off-heap memory usage and CPU profiles.
# Profile with gperftools
`gperftools` bundles a high-performance multi-threaded
`malloc()` implementation (tcmalloc) with a set of performance analysis
tools. See https://github.com/gperftools/gperftools/wiki for details.
## Build and install gperftools
Download `gperftools` from https://github.com/gperftools/gperftools/releases, build and install.
```bash
wget https://github.com/gperftools/gperftools/releases/download/gperftools-<version>/gperftools-<version>.tar.gz
tar xzvf gperftools-<version>.tar.gz
cd gperftools-<version>
./configure
make && make install
```
Then we can find the tcmalloc libraries in `$GPERFTOOLS_HOME/.libs` (the build tree) or under the install prefix, `/usr/local/lib` by default.
## Run Gluten with gperftools
Ship the library to the executors with `--files` or `spark.files`:
```
--files /path/to/gperftools/libtcmalloc_and_profiler.so
or
spark.files /path/to/gperftools/libtcmalloc_and_profiler.so
```
Use `LD_PRELOAD` to preload the tcmalloc library, and enable heap profiling with `HEAPPROFILE` or CPU profiling with `CPUPROFILE`.
Example of enabling heap profiling in the Spark executor:
```
spark.executorEnv.LD_PRELOAD ./libtcmalloc_and_profiler.so
# Specifies the dump file path prefix. ${CONTAINER_ID} only serves to distinguish the result files when running on YARN.
spark.executorEnv.HEAPPROFILE /tmp/gluten_heap_perf_${CONTAINER_ID}
```
Finally, profiling files prefixed with `/tmp/gluten_heap_perf_${CONTAINER_ID}` will be generated for each Spark executor.
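The settings above can also be passed on the `spark-submit` command line. The sketch below is only an illustration: `gperftools_heap_profile_args` is a hypothetical helper name, not part of Gluten or Spark, and it assumes the library is shipped with `--files` so that it lands in the executor's working directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints, one per line, the spark-submit options that
# correspond to the spark.files / spark.executorEnv.* settings shown above.
gperftools_heap_profile_args() {
  local lib_path="$1"     # path to libtcmalloc_and_profiler.so on the submitting host
  local dump_prefix="$2"  # prefix of the dump files written on each executor host
  local lib_name
  lib_name="$(basename "$lib_path")"
  printf '%s\n' \
    "--files" "$lib_path" \
    "--conf" "spark.executorEnv.LD_PRELOAD=./$lib_name" \
    "--conf" "spark.executorEnv.HEAPPROFILE=$dump_prefix"
}

# Example call with the placeholder paths used in this document.
gperftools_heap_profile_args /path/to/gperftools/libtcmalloc_and_profiler.so /tmp/gluten_heap_perf
```

In bash 4 or later, the output can be read into an array with `mapfile -t args < <(gperftools_heap_profile_args ...)` and then passed as `spark-submit "${args[@]}"` followed by the usual application arguments.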
## Analyze profiling output
Prepare the required native libraries. Assuming Gluten is statically built, there are no other shared library dependencies.
```bash
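# `jar xf` keeps the in-jar directory layout, so the libraries are extracted under relative/path/to/
jar xf gluten-velox-bundle-spark3.5_2.12-centos_7_x86_64-1.2.0.jar relative/path/to/libvelox.so relative/path/to/libgluten.so
mv relative/path/to/libvelox.so relative/path/to/libgluten.so /path/to/gluten_lib_prefix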
```
Generate a GIF of the analysis result:
```bash
# `/usr/bin/java` is the binary that runs the Spark executor
pprof --show_bytes --gif --lib_prefix=/path/to/gluten_lib_prefix /usr/bin/java /path/to/gluten_heap_perf_XXX > result.gif
```
The result looks like:
<img src="../image/velox_profile_memory_gif.gif" width="200" />
Or display the analysis result as text:
```bash
pprof --text --lib_prefix=/path/to/gluten_lib_prefix /usr/bin/java /path/to/gluten_heap_perf_XXX
```
The result looks like:
<img src="../image/velox_profile_memory_text.png" width="400" />
For more help, see https://github.com/gperftools/gperftools/wiki#documentation.
# Profile with jemalloc
`jemalloc` is a general purpose malloc(3) implementation that emphasizes fragmentation
avoidance and scalable concurrency support. We can also use it to analyze Gluten performance.
Getting Started with `jemalloc`: https://github.com/jemalloc/jemalloc/wiki/Getting-Started.
## Build and install jemalloc
Download `jemalloc` from https://github.com/jemalloc/jemalloc/releases, build and install.
```
cd /path/to/jemalloc
./autogen.sh --enable-prof
make && make install
```
Then we can find the jemalloc library in `$JEMALLOC_HOME/lib` (the build tree) or under the install prefix, `/usr/local/lib` by default.
## Run Gluten with jemalloc
Ship the library to the executors with `--files` or `spark.files`:
```
--files /path/to/jemalloc/libjemalloc.so
or
spark.files /path/to/jemalloc/libjemalloc.so
```
Example of enabling heap profiling in the Spark executor:
```
spark.executorEnv.LD_PRELOAD ./libjemalloc.so
spark.executorEnv.MALLOC_CONF prof:true,lg_prof_interval:30,prof_prefix:/tmp/gluten_heap_perf
```
Finally, profiling files prefixed with `/tmp/gluten_heap_perf.${PID}` will be generated for each Spark executor.
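The `lg_prof_interval` option is a base-2 logarithm: jemalloc writes a dump on average once every 2^N bytes of allocation activity. The interval dumps are named `<prefix>.<pid>.<seq>.i<iseq>.heap`. A small sketch of the arithmetic, where `lg_prof_interval_bytes` is a hypothetical helper name used only for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: converts an lg_prof_interval value into the average
# number of allocated bytes between two profile dumps (2^N).
lg_prof_interval_bytes() {
  local lg="$1"
  echo $(( 1 << lg ))
}

# lg_prof_interval:30 from the example above means one dump per 1 GiB allocated.
lg_prof_interval_bytes 30   # prints 1073741824
```

A smaller value produces more frequent dumps and more files, so it is worth picking the interval according to how fast the executor allocates memory.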
## Memory dump on spark executor exit
Sometimes native memory is not managed by Gluten, or a memory leak causes the Spark executor to be killed for exceeding its memory limit.
In these cases it is enough to trigger a memory dump when the executor exits.
To enable this feature, follow these steps:
1. Build Gluten with `--enable_jemalloc_stats=ON` to enable jemalloc stats.
2. Enable memory dump on exit, and add Spark executor environment variables to load the jemalloc library and activate memory profiling.
```
spark.gluten.monitor.memoryDumpOnExit=true
spark.executorEnv.LD_PRELOAD=/path/to/libjemalloc.so
spark.executorEnv.MALLOC_CONF=prof:true,prof_prefix:/tmp/gluten_heap_perf
```
## Analyze profiling output
Prepare the required native libraries. Assuming Gluten is statically built, there are no other shared library dependencies.
```bash
jar xf gluten-velox-bundle-spark3.5_2.12-centos_7_x86_64-1.2.0.jar relative/path/to/libvelox.so relative/path/to/libgluten.so
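# `jar xf` keeps the in-jar directory layout, so move the files from their extracted location
mv relative/path/to/libvelox.so relative/path/to/libgluten.so /path/to/gluten_lib_prefix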
```
Generate a GIF of the analysis result:
```bash
# `/usr/bin/java` is the binary that runs the Spark executor
jeprof --show_bytes --gif --lib_prefix=/path/to/gluten_lib_prefix /usr/bin/java /path/to/gluten_heap_perf_XXX > result.gif
```
Or display the analysis result as text:
```bash
jeprof --text --lib_prefix=/path/to/gluten_lib_prefix /usr/bin/java /path/to/gluten_heap_perf_XXX
```
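When hunting a leak, it can help to compare an early dump with a late one. `jeprof` accepts a `--base=<profile>` option that subtracts the base profile before display. The sketch below picks the first and last dump of one executor by jemalloc's sequence number. `first_and_last_dump` is a hypothetical helper name, and the sketch assumes the `<prefix>.<pid>.<seq>...heap` naming and a prefix base name that contains no dots.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the oldest and the newest dump file name
# (ordered by jemalloc sequence number) for one PID, one per line.
# Assumes names like gluten_heap_perf.<pid>.<seq>.i<iseq>.heap.
first_and_last_dump() {
  local dir="$1" prefix="$2" pid="$3" sorted
  # The third dot-separated field is the sequence number; sort it numerically.
  sorted="$(cd "$dir" 2>/dev/null && ls -- "$prefix.$pid".*.heap 2>/dev/null | sort -t. -k3,3n)"
  [ -n "$sorted" ] || return 1
  printf '%s\n' "$sorted" | head -n 1
  printf '%s\n' "$sorted" | tail -n 1
}

# Usage sketch (paths and PID are placeholders):
#   first_and_last_dump /tmp gluten_heap_perf 12345
#   jeprof --text --lib_prefix=/path/to/gluten_lib_prefix \
#     --base=/tmp/<first dump> /usr/bin/java /tmp/<last dump>
```

Allocations that show up in the difference are those that grew between the two dumps, which narrows down where to look for a leak.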
For more help, see https://jemalloc.net/jemalloc.3.html.