docs/developers/UsingGperftoolsInCH.md - gluten - Git at Google

 ---
 layout: page
 title: Debug for CH Backend with gpertools Tool
 nav_order: 11
 has_children: true
 parent: /developer-overview/
 ---
 We need using gpertools to find the memory or CPU issue. That's what this document is about.

 ## Install gperftools
 Install gperftools as described in https://github.com/gperftools/gperftools.
 We get the library and the command line tools.

 ## Compiler libch.so
 Disable jemalloc `-DENABLE_JEMALLOC=OFF` in cpp-ch/CMakeLists.txt, and recompile libch.so.

 ## Run Gluten with gperftools
 For Spark on Yarn, we can change the submit script to run Gluten with gperftools.
 Add the following to the submit script:
 ```
 export tcmalloc_path=/data2/zzb/gperftools-2.10/.libs/libtcmalloc_and_profiler.so # the path to the tcmalloc library
 export LD_PRELOAD=$tcmalloc_path,libch.so # load the library in the driver
 --files $tcmalloc_path # upload the library to the cluster
 --conf spark.executorEnv.LD_PRELOAD=./libtcmalloc_and_profiler.so,libch.so # load the library in the executor
 --conf spark.executorEnv.HEAPPROFILE=/tmp/gluten_heap_perf # set the heap profile path, you can change to CPUPROFILE for CPU profiling
 ```

 For thrift server on local machine, note using `export LD_PRELOAD="$tcmalloc_path libch.so" # load the library in the driver` to preload dynamic libraries.

 ## Analyze the result
 We can get the result in the path we set in the previous step. For example, we can get the result in `/tmp/gluten_heap_perf`. We can use the following website to analyze the result:
 https://gperftools.github.io/gperftools/heapprofile.html
 https://gperftools.github.io/gperftools/cpuprofile.html
	---
	layout: page
	title: Debug for CH Backend with gpertools Tool
	nav_order: 11
	has_children: true
	parent: /developer-overview/
	---
	We need using gpertools to find the memory or CPU issue. That's what this document is about.

	## Install gperftools
	Install gperftools as described in https://github.com/gperftools/gperftools.
	We get the library and the command line tools.

	## Compiler libch.so
	Disable jemalloc `-DENABLE_JEMALLOC=OFF` in cpp-ch/CMakeLists.txt, and recompile libch.so.

	## Run Gluten with gperftools
	For Spark on Yarn, we can change the submit script to run Gluten with gperftools.
	Add the following to the submit script:
	```
	export tcmalloc_path=/data2/zzb/gperftools-2.10/.libs/libtcmalloc_and_profiler.so # the path to the tcmalloc library
	export LD_PRELOAD=$tcmalloc_path,libch.so # load the library in the driver
	--files $tcmalloc_path # upload the library to the cluster
	--conf spark.executorEnv.LD_PRELOAD=./libtcmalloc_and_profiler.so,libch.so # load the library in the executor
	--conf spark.executorEnv.HEAPPROFILE=/tmp/gluten_heap_perf # set the heap profile path, you can change to CPUPROFILE for CPU profiling
	```

	For thrift server on local machine, note using `export LD_PRELOAD="$tcmalloc_path libch.so" # load the library in the driver` to preload dynamic libraries.

	## Analyze the result
	We can get the result in the path we set in the previous step. For example, we can get the result in `/tmp/gluten_heap_perf`. We can use the following website to analyze the result:
	https://gperftools.github.io/gperftools/heapprofile.html
	https://gperftools.github.io/gperftools/cpuprofile.html