For Hexagon, lightweight profiling (LWP) can be used to get function- and loop-level processor cycle counts. This is done by instrumenting the code with profiling builtin calls using a TIR pass. During codegen, these builtin calls are replaced with calls to a Hexagon-specific handler, which records the runtime information into a buffer. This buffer is written into a JSON file (‘lwp.json’), which is then processed to produce function- and loop-level profiling information as a CSV file.
Note: During codegen, the profiling builtin calls are ignored for other targets.
The TIR pass offers several config flags to control the level of instrumentation as mentioned below:
lwp_disable_func_prof: To disable function level profiling. By default, it is set to ‘False’, i.e., the function level profiling is enabled.
instr_siblings: When enabled, only loops with siblings are instrumented and the rest are ignored. The innermost loops are always excluded from instrumentation unless overridden using lwp_min_height. This is done to minimize the adverse effect of instrumentation on actual performance. By default, it is set to ‘True’.
lwp_max_depth: To instrument loops up to a certain depth. This flag is effective only when instr_siblings is disabled. By default, it is set to 0.
lwp_min_height: To exclude inner loops up to a certain height from instrumentation. By default, it is set to 1.
For additional usage information on various config flags, please refer to the tests in tests/python/tir-transform/test_tir_transform_profiling_instr.py
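As a rough illustration of how the flags above fit together, the following sketch builds a configuration dictionary of the kind accepted by a pass context. It is only a sketch: `make_lwp_config` and `LWP_DEFAULTS` are hypothetical helpers, and the `tir.` prefix on every key other than `tir.instrument_lwp` is an assumption made here, so check the tests referenced above for the exact key names.

```python
from typing import Any, Dict

# Defaults as documented above. The "tir." prefix on these keys is an
# assumption of this sketch; only "tir.instrument_lwp" is named in the text.
LWP_DEFAULTS: Dict[str, Any] = {
    "tir.lwp_disable_func_prof": False,
    "tir.instr_siblings": True,
    "tir.lwp_max_depth": 0,
    "tir.lwp_min_height": 1,
}


def make_lwp_config(**overrides: Any) -> Dict[str, Any]:
    """Return a config dict that enables LWP plus any overridden flags.

    Keyword names are the flag names without the "tir." prefix, e.g.
    make_lwp_config(instr_siblings=False, lwp_max_depth=2).
    """
    config: Dict[str, Any] = {"tir.instrument_lwp": True}
    for name, value in overrides.items():
        key = f"tir.{name}"
        if key not in LWP_DEFAULTS:
            raise KeyError(f"unknown LWP flag: {name}")
        config[key] = value

    # lwp_max_depth is documented as effective only when instr_siblings is
    # disabled; this helper chooses to reject the combination outright.
    siblings = config.get("tir.instr_siblings", LWP_DEFAULTS["tir.instr_siblings"])
    if "tir.lwp_max_depth" in config and siblings:
        raise ValueError("lwp_max_depth requires instr_siblings=False")
    return config
```

The resulting dictionary could then be passed as `tvm.transform.PassContext(config=make_lwp_config(...))` around the build call. Rejecting `lwp_max_depth` while `instr_siblings` is enabled is a design choice of this helper, not a behavior of the TIR pass.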
tests/python/contrib/test_hexagon/test_launcher.py contains two tests, test_lwp and test_lwp_multiple_conv2d, to demonstrate lightweight profiling usage.
The steps involved are as follows:
While building a model, set tir.instrument_lwp to True. By default, the builtin calls are inserted only for loops with siblings; this can be changed using the LWP config options described above.
Create a HexagonProfiler object.
Run the model and get the profiling data as a CSV file. This is done by post-processing the ‘lwp.json’ file generated at runtime.
graph_mod.run(**inputs)
# Get lightweight profiling output as a CSV file
profiler.get_profile_output(hexagon_launcher, hexagon_session, hexagon_server_process)
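To give a feel for what the post-processing step does conceptually, here is a minimal, self-contained sketch that pairs start and end records and sums cycle counts per instrumented region. The JSON layout used here (an `entries` list with `id`, `event` and `cycles` fields) is purely hypothetical and is not the actual ‘lwp.json’ schema; `summarize_lwp` is likewise an illustrative name, not part of the profiler.

```python
import csv
import io
import json
from collections import defaultdict


def summarize_lwp(json_text: str) -> str:
    """Pair start/end records per region id and return a CSV summary."""
    # Hypothetical schema: {"entries": [{"id", "event", "cycles"}, ...]}
    records = json.loads(json_text)["entries"]

    open_starts = defaultdict(list)       # id -> stack of start cycle counts
    totals = defaultdict(lambda: [0, 0])  # id -> [total cycles, call count]

    for rec in records:
        if rec["event"] == "start":
            open_starts[rec["id"]].append(rec["cycles"])
        else:
            # Match this end record with the most recent unmatched start.
            start = open_starts[rec["id"]].pop()
            totals[rec["id"]][0] += rec["cycles"] - start
            totals[rec["id"]][1] += 1

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "count", "total_cycles"])
    for region_id in sorted(totals):
        cycles, count = totals[region_id]
        writer.writerow([region_id, count, cycles])
    return out.getvalue()
```

Keeping a stack of start values per id lets the same region be entered repeatedly (for example a loop nested inside a function that is called several times) without mixing up the pairs.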
Note:
Helpful Hints:
Pass --hexagon-debug to pytest, for example:
python -m pytest --hexagon-debug tests/python/contrib/test_hexagon/test_launcher.py::test_lwp