Version 0.9 brings a number of important features and changes, including a back-end refactor to adopt the NNVM framework, a profiler for analyzing performance, a fast image IO and augmentation module that bypasses the GIL, and various other changes.
NNVM is a library for neural network graph construction, optimization, and operator registration. It serves as an intermediary layer between the front-end (MXNet user API) and the back-end (computation on the device). As of version 0.9, MXNet fully adopts the NNVM framework, which makes it easier to create operators. You can also register "passes" that process and optimize the graph when bind is called on the symbol. For more discussion on how to create operators with NNVM, please refer to How to Create New Operators.
Other changes brought by NNVM include:
mx.nd.Activation(x, act_type='relu') works now.
Use make cython to activate it for accelerated communication with the back-end.
MXNet now provides a native profiler for analyzing the performance of operators. This feature complements general profiling tools like nvprof and gprof by summarizing at the operator level, instead of the function, kernel, or instruction level.
To use this feature, first set USE_PROFILER = 1 in config.mk and rebuild MXNet. Then add three lines at the beginning and end of the section of your program you want to profile:
    mx.profiler.profiler_set_config(mode=scope, filename=fname)
    mx.profiler.profiler_set_state('run')
    # do computation ...
    mx.profiler.profiler_set_state('stop')
scope can be 'symbolic' (to only include symbolic operations) or 'all' (to include all operations), and fname is the path to save profiler output.
After the program finishes, navigate to chrome://tracing in a Chrome browser and load the profiler output to see the results.
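Besides viewing the output in the browser, the same file can be summarized from a script. The following is a minimal sketch, not part of MXNet, and it assumes the profiler output follows the Trace Event format that chrome://tracing reads, with each operator recorded either as a named begin/end pair ("B"/"E") or as a complete event ("X"). The helper names summarize_trace and load_and_summarize are hypothetical, and the sample events are made up for illustration.

```python
import json
from collections import defaultdict


def summarize_trace(trace):
    """Return total duration (in microseconds) per event name.

    Handles complete events (ph == "X", with a "dur" field) and
    begin/end pairs (ph == "B" / "E") matched per (pid, tid, name).
    Assumes end events carry the same name as their begin event.
    """
    totals = defaultdict(float)
    open_events = defaultdict(list)  # (pid, tid, name) -> stack of begin timestamps

    # The Trace Event format allows either a bare list of events or an
    # object holding the list under "traceEvents".
    events = trace["traceEvents"] if isinstance(trace, dict) else trace

    # Stable sort by timestamp keeps the file order for equal timestamps.
    for ev in sorted(events, key=lambda e: e.get("ts", 0)):
        phase = ev.get("ph")
        key = (ev.get("pid"), ev.get("tid"), ev.get("name"))
        if phase == "X":
            totals[ev["name"]] += ev.get("dur", 0)
        elif phase == "B":
            open_events[key].append(ev["ts"])
        elif phase == "E" and open_events[key]:
            totals[ev["name"]] += ev["ts"] - open_events[key].pop()
    return dict(totals)


def load_and_summarize(path):
    """Load a trace file (hypothetical path) and summarize it."""
    with open(path) as f:
        return summarize_trace(json.load(f))


# Made-up sample standing in for real profiler output.
sample = {
    "traceEvents": [
        {"name": "Convolution", "ph": "B", "ts": 0, "pid": 0, "tid": 1},
        {"name": "Convolution", "ph": "E", "ts": 120, "pid": 0, "tid": 1},
        {"name": "Activation", "ph": "B", "ts": 120, "pid": 0, "tid": 1},
        {"name": "Activation", "ph": "E", "ts": 150, "pid": 0, "tid": 1},
        {"name": "Convolution", "ph": "B", "ts": 150, "pid": 0, "tid": 1},
        {"name": "Convolution", "ph": "E", "ts": 260, "pid": 0, "tid": 1},
    ]
}

summary = summarize_trace(sample)
for name, total_us in sorted(summary.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total_us:.0f} us")
# Convolution: 230 us
# Activation: 30 us
```

For a real run, load_and_summarize(fname) would be called with the same path that was passed to the profiler configuration. If the file uses event phases other than the three handled here, those events are ignored.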
MXNet already has mx.io.ImageRecordIter for loading and preprocessing images. However, some tasks need a more flexible image processing API. Detection, for example, requires transforming labels together with images. Usually, people write custom data iterators in Python to handle this. But due to the infamous Global Interpreter Lock (GIL), Python scripts cannot use multithreading to speed up processing.
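To make the detection case concrete, here is a small sketch of why labels must be transformed together with the image: when an image is mirrored, every bounding box has to be mirrored too. This is plain Python for illustration only, not the mx.image API; the function name flip_horizontal and the box convention (pixel coordinates with exclusive xmax and ymax) are assumptions.

```python
def flip_horizontal(image, boxes):
    """Mirror an image left-to-right and update its bounding boxes.

    image: list of rows, each row a list of pixel values (H x W).
    boxes: list of (xmin, ymin, xmax, ymax) in pixel coordinates,
           with xmax and ymax exclusive (an assumed convention).
    """
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # A box spanning columns [xmin, xmax) lands on [width - xmax, width - xmin).
    flipped_boxes = [(width - xmax, ymin, width - xmin, ymax)
                     for (xmin, ymin, xmax, ymax) in boxes]
    return flipped_image, flipped_boxes


# A 2 x 4 image whose "object" is the right-most column of 9s.
image = [[0, 0, 0, 9],
         [0, 0, 0, 9]]
boxes = [(3, 0, 4, 2)]

new_image, new_boxes = flip_horizontal(image, boxes)
print(new_image)  # [[9, 0, 0, 0], [9, 0, 0, 0]]
print(new_boxes)  # [(0, 0, 1, 2)]
```

A fixed pipeline that only augments the image would leave the box pointing at the old location, which is why such tasks have tended to need custom iterators.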
mx.image provides a set of fast image processing APIs that leverage the MXNet engine to automatically parallelize processing. You can write
    imgs = [mx.image.imdecode(open(f, 'rb').read()) for f in img_paths]
and decoding will be automatically run in parallel.
Use mx.io.DataDesc(..., layout='NHWC') in provide_data to specify the data layout. Use mx.sym.YourSymbol(..., __layout__='NHWC') to specify the output layout.
The layout option is now available for the Convolution layer.
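As a reminder of what the layout strings mean, each letter names one axis: N is the batch, C the channels, H the height, and W the width. The sketch below shows how the same batch shape reads under NCHW and NHWC. The helper convert_shape is hypothetical and not part of MXNet; it only reorders a shape tuple and does not touch any data.

```python
def convert_shape(shape, src_layout, dst_layout):
    """Reorder a shape tuple from one layout string to another."""
    if sorted(src_layout) != sorted(dst_layout) or len(shape) != len(src_layout):
        raise ValueError("layouts must use the same axes and match the shape rank")
    # For each axis of the target layout, look up its size in the source shape.
    return tuple(shape[src_layout.index(axis)] for axis in dst_layout)


nchw_shape = (32, 3, 224, 160)  # batch, channels, height, width
nhwc_shape = convert_shape(nchw_shape, "NCHW", "NHWC")
print(nhwc_shape)  # (32, 224, 160, 3)
```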