Version 0.9 brings a number of important features and changes, including a back-end refactor to adopt the NNVM framework, a profiler for analyzing performance, a fast image IO and augmentation module that bypasses the GIL, and various other changes.
NNVM is a library for neural network graph construction, optimization, and operator registration. It serves as an intermediary layer between the front-end (MXNet user API) and the back-end (computation on the device). As of version 0.9, MXNet fully adopts the NNVM framework, which makes it easier to create operators. You can also register "passes" that process and optimize the graph when bind is called on the symbol. For more discussion on how to create operators with NNVM, please refer to How to Create New Operators.
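To make the idea of a graph pass concrete, here is a toy sketch in plain Python. It is not the NNVM API: the PASSES registry, the register_pass decorator, the list-of-operator-names graph, and the remove_identity pass are all hypothetical names chosen for illustration. It only shows the general shape of the mechanism, in which named passes are registered once and then applied in order to a graph before it is executed.

```python
# Toy illustration of registering and applying graph passes.
# NOT the NNVM API; every name below is hypothetical.

PASSES = {}  # maps pass name -> function that takes a graph and returns a graph


def register_pass(name):
    """Decorator that stores a pass function under the given name."""
    def decorator(fn):
        PASSES[name] = fn
        return fn
    return decorator


@register_pass("remove_identity")
def remove_identity(graph):
    # The "graph" is simplified to a linear chain of operator names.
    return [op for op in graph if op != "identity"]


def apply_passes(graph, pass_names):
    """Run the named passes in order, feeding each result to the next."""
    for name in pass_names:
        graph = PASSES[name](graph)
    return graph


optimized = apply_passes(["conv", "identity", "relu"], ["remove_identity"])
print(optimized)
```

A real pass operates on a full computation graph rather than a flat list, but the register-then-apply structure is the part this sketch is meant to convey.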
Other changes brought by NNVM include:
mx.nd.Activation(x, act_type='relu') works now.
Run make cython to activate the Cython API for accelerated communication with the back-end.
MXNet now provides a native profiler for analyzing the performance of operators. This feature complements general profiling tools like nvprof and gprof by summarizing at the operator level, instead of the function, kernel, or instruction level.
To use this feature, first set USE_PROFILER = 1 in config.mk and rebuild MXNet. Then add three lines at the beginning and end of the section of your program you want to profile:
mx.profiler.profiler_set_config(mode=scope, filename=fname)
mx.profiler.profiler_set_state('run')
# do computation ...
mx.profiler.profiler_set_state('stop')
scope can be 'symbolic' (to only include symbolic operations) or 'all' (to include all operations), and fname is the path to save profiler output.
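One way to keep the start and stop calls paired is to wrap them in a context manager. The sketch below is only a convenience wrapper around the two calls shown above. The name profiled is hypothetical, and the profiler module is passed in as an argument so the helper does not itself import MXNet.

```python
from contextlib import contextmanager


@contextmanager
def profiled(profiler, mode, filename):
    """Profile the enclosed block using the given profiler module.

    `profiler` is expected to expose profiler_set_config and
    profiler_set_state, as mx.profiler does in the text above.
    """
    profiler.profiler_set_config(mode=mode, filename=filename)
    profiler.profiler_set_state('run')
    try:
        yield
    finally:
        # Stop profiling even if the computation raises an exception.
        profiler.profiler_set_state('stop')
```

With a profiler-enabled build, this would be used as "with profiled(mx.profiler, 'symbolic', 'profile.json'):" followed by the computation to measure, where profile.json is an arbitrary example file name.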
After the program finishes, navigate to chrome://tracing in a Chrome browser and load the profiler output file to see the results.
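If a text summary is more convenient than the browser view, the output file can also be post-processed. The sketch below assumes the file follows the Chrome Trace Event Format (a JSON object with a traceEvents list, or a bare list of events, with timestamps in microseconds), and that begin and end events carry the same name. That is an assumption about the file layout, not something stated above. The function names and the sample events are hypothetical.

```python
import json
from collections import defaultdict


def summarize_trace(trace):
    """Return the total duration in microseconds for each event name."""
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals = defaultdict(int)
    open_events = defaultdict(list)  # (pid, tid, name) -> stack of begin timestamps
    for ev in events:
        name = ev.get("name")
        key = (ev.get("pid"), ev.get("tid"), name)
        phase = ev.get("ph")
        if phase == "X":
            # A "complete" event carries its own duration.
            totals[name] += ev.get("dur", 0)
        elif phase == "B":
            open_events[key].append(ev["ts"])
        elif phase == "E" and open_events[key]:
            # Pair the end event with the most recent matching begin event.
            totals[name] += ev["ts"] - open_events[key].pop()
    return dict(totals)


def summarize_file(path):
    """Load a trace file from disk and summarize it."""
    with open(path) as f:
        return summarize_trace(json.load(f))


# Hand-written sample events, for illustration only.
sample = {"traceEvents": [
    {"name": "Convolution", "ph": "B", "ts": 100, "pid": 0, "tid": 1},
    {"name": "Convolution", "ph": "E", "ts": 350, "pid": 0, "tid": 1},
    {"name": "Activation", "ph": "X", "ts": 400, "dur": 40, "pid": 0, "tid": 1},
]}
summary = summarize_trace(sample)
print(summary)
```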
MXNet already has mx.io.ImageRecordIter for loading and preprocessing images. However, some tasks need a more flexible image processing API. Detection, for example, requires transforming labels together with images. Usually, people write custom data iterators in Python to handle this. But due to the infamous Global Interpreter Lock (GIL), Python scripts cannot use multithreading to speed up processing.
mx.image provides a set of fast image processing APIs that leverage the MXNet Engine to automatically parallelize processing. You can write
imgs = [mx.image.imdecode(open(f, 'rb').read()) for f in img_paths]
and decoding will be automatically run in parallel.
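To illustrate what "transforming labels together with images" means for detection, here is a minimal sketch in plain Python. It does not use mx.image. The function name flip_horizontal, the nested-list image, and the normalized (xmin, ymin, xmax, ymax) box format are assumptions made for the example. The point is that a geometric augmentation must be applied to the bounding boxes as well as to the pixels.

```python
def flip_horizontal(image, boxes):
    """Mirror an image left-to-right and adjust its boxes to match.

    image: list of rows, each row a list of pixel values.
    boxes: list of (xmin, ymin, xmax, ymax) tuples normalized to [0, 1].
    """
    flipped_image = [row[::-1] for row in image]
    # Mirroring swaps the roles of the left and right edges of each box.
    flipped_boxes = [(1.0 - xmax, ymin, 1.0 - xmin, ymax)
                     for (xmin, ymin, xmax, ymax) in boxes]
    return flipped_image, flipped_boxes


image = [[1, 2, 3],
         [4, 5, 6]]
boxes = [(0.0, 0.25, 0.5, 0.75)]  # a box covering the left half
new_image, new_boxes = flip_horizontal(image, boxes)
print(new_image)
print(new_boxes)
```

After the flip, the box that covered the left half of the image covers the right half, which is the kind of joint transformation a label-unaware image pipeline cannot express.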
Use mx.io.DataDesc(..., layout='NHWC') in provide_data to specify the data layout. Use mx.sym.YourSymbol(..., __layout__='NHWC') to specify the output layout. The layout option is now available for the Convolution layer.
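Layout strings such as 'NCHW' and 'NHWC' name the order of the batch (N), channel (C), height (H), and width (W) axes. The helper below is a hypothetical, MXNet-independent sketch that shows how the same tensor shape reads under each layout. The function name convert_shape and the example shape are chosen only for illustration.

```python
def convert_shape(shape, src_layout, dst_layout):
    """Reorder a shape tuple from one layout string to another."""
    if len(shape) != len(src_layout) or sorted(src_layout) != sorted(dst_layout):
        raise ValueError("layouts must name the same axes as the shape")
    # For each axis in the target layout, look up its size in the source shape.
    return tuple(shape[src_layout.index(axis)] for axis in dst_layout)


nchw = (32, 3, 224, 224)  # 32 images, 3 channels, 224 x 224 pixels
nhwc = convert_shape(nchw, "NCHW", "NHWC")
print(nhwc)
```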