The artifact repository is an S3 bucket accessible only to restricted Jenkins nodes. It stores compiled MXNet artifacts that downstream CD pipelines can use to package the compiled libraries for different delivery channels (e.g. DockerHub, PyPI, Maven). The S3 object keys for the uploaded files are prefixed with the following distinguishing characteristics of the binary: branch, commit id, operating system, variant and dependency linking strategy (static or dynamic). For instance: s3://bucket/73b29fa90d3eac0b1fae403b7583fdd1529942dc/ubuntu16.04/cu102mkl/static/libmxnet.so
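As a minimal sketch of how such a key could be assembled, the snippet below joins the characteristics in the order shown in the example key above. The function names are hypothetical and are not taken from artifact_repository.py. The branch component is left out because the example key does not show where it would go.

```python
import posixpath


def s3_key_prefix(git_sha, os_name, variant, linking):
    """Join the artifact characteristics into an S3 key prefix.

    Hypothetical helper: the component order mirrors the example key
    in the text (commit id / operating system / variant / linking).
    """
    if linking not in ("static", "dynamic"):
        raise ValueError("linking must be 'static' or 'dynamic'")
    # S3 keys always use forward slashes, so posixpath is used
    # instead of os.path to stay correct on Windows hosts.
    return posixpath.join(git_sha, os_name, variant, linking)


def s3_key(prefix, filename):
    """Build the full object key for one file of the artifact."""
    return posixpath.join(prefix, filename)


prefix = s3_key_prefix(
    "73b29fa90d3eac0b1fae403b7583fdd1529942dc",
    "ubuntu16.04",
    "cu102mkl",
    "static",
)
print(s3_key(prefix, "libmxnet.so"))
```

Run as is, this prints the key portion of the example URL, i.e. everything after s3://bucket/.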
An MXNet artifact is defined as the following set of files:
The artifact_repository.py script automates the upload and download of the specified files with the appropriate S3 object keys by taking explicitly set, or automatically derived, values for the different characteristics of the artifact.
A compiled MXNet library, or artifact for our purposes, is identified by the following distinguishing characteristics: commit id, operating system, variant, and library type (statically or dynamically linked dependencies). When a characteristic is not explicitly stated, the artifact_repository.py script will, as far as possible, ascertain it from the environment.
Commit Id
Manually configured through the --git-sha argument.
If not set, derived by:
Operating System
Manually configured through the --os argument.
If not set, derived through the value of sys.platform (https://docs.python.org/3/library/sys.html#sys.platform). That is:
Variant
Manually configured through the --variant argument. The current variants are: cpu, native, cu101, cu102, cu110, cu112.
As long as the tool is being run from the MXNet code base, the runtime feature detection tool (https://github.com/larroy/mxnet/blob/dd432b7f241c9da2c96bcb877c2dc84e6a1f74d4/docs/api/python/libinfo/libinfo.md) can be used to detect whether the library has been compiled with oneDNN (library has oneDNN feature enabled) and/or CUDA support (compiled with CUDA feature enabled).
If it has been compiled with CUDA support, the output of /usr/local/cuda/bin/nvcc --version can be parsed for the exact CUDA version (e.g. 8.0, 9.0).
By knowing which features are enabled on the binary and, if necessary, which CUDA version is installed on the machine, the value for the variant argument can be calculated. For example, if CUDA features are enabled and nvcc reports CUDA version 10.2, then the variant would be cu102. If neither oneDNN nor CUDA features are enabled, the variant would be native.
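The derivation described above can be sketched roughly as follows. The function names are hypothetical. The sketch takes the feature flags and the nvcc output as plain arguments, so it makes no claim about how the script obtains them. It assumes the nvcc output contains a "release X.Y" fragment, and it assumes a oneDNN-only build maps to the cpu variant, which the text does not state explicitly.

```python
import re


def parse_cuda_version(nvcc_output):
    """Return (major, minor) strings found in `nvcc --version` output.

    Assumes the output contains a fragment such as "release 10.2".
    """
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number")
    return match.group(1), match.group(2)


def derive_variant(cuda_enabled, onednn_enabled, nvcc_output=None):
    """Map enabled features (and the CUDA version) to a variant name."""
    if cuda_enabled:
        if nvcc_output is None:
            raise ValueError("nvcc output is required for CUDA builds")
        major, minor = parse_cuda_version(nvcc_output)
        return "cu{}{}".format(major, minor)
    if onednn_enabled:
        # Assumption: the text only says that "native" means neither
        # feature is enabled, so oneDNN alone is mapped to "cpu" here.
        return "cpu"
    return "native"


sample = "Cuda compilation tools, release 10.2, V10.2.89"
print(derive_variant(True, True, sample))
print(derive_variant(False, False))
```

In practice the nvcc text would come from running /usr/local/cuda/bin/nvcc --version, for instance through subprocess, and the two flags from the runtime feature detection tool mentioned above.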
Dependency Linking
The library dependencies can be either statically or dynamically linked. This property must be set manually by the user through either the --static or the --dynamic argument. There is no foolproof, programmatic way (that I could find) to easily discern whether the library dependencies are statically or dynamically linked.
To push an artifact, the user must specify the path to libmxnet.so and may also specify any license files and any dependencies. The latter two are optional.
Example:
./artifact_repository.py --push --static --libmxnet /path/to/libmxnet.so --licenses path/to/license1.txt /path/to/other_licenses/*.txt --dependencies /path/to/dependencies/*.so
./artifact_repository.py --push --dynamic --libmxnet /path/to/libmxnet.so
NOTE: There is nothing stopping the user from uploading licenses and dependencies for dynamically linked libraries.
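For illustration, a command line accepting the flags that appear in this document could be declared with argparse roughly as below. This is a sketch under assumptions, not the script's actual parser: the flag names come from the text and examples, while the grouping, defaults, help strings, and the `directory` positional name are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser covering the flags used in this document."""
    parser = argparse.ArgumentParser(
        description="Push or pull MXNet artifacts (illustrative sketch)")

    # Exactly one action must be chosen.
    action = parser.add_mutually_exclusive_group(required=True)
    action.add_argument("--push", action="store_true")
    action.add_argument("--pull", action="store_true")

    # Dependency linking cannot be detected, so it is always required.
    linking = parser.add_mutually_exclusive_group(required=True)
    linking.add_argument("--static", dest="linking",
                         action="store_const", const="static")
    linking.add_argument("--dynamic", dest="linking",
                         action="store_const", const="dynamic")

    # Files that make up the artifact (used when pushing).
    parser.add_argument("--libmxnet", help="path to libmxnet.so")
    parser.add_argument("--licenses", nargs="+", default=[])
    parser.add_argument("--dependencies", nargs="+", default=[])

    # Characteristics that are derived when not given explicitly.
    parser.add_argument("--git-sha")
    parser.add_argument("--os")
    parser.add_argument("--variant")

    # Download destination (used when pulling).
    parser.add_argument("directory", nargs="?")
    return parser


args = build_parser().parse_args(
    ["--push", "--dynamic", "--libmxnet", "/path/to/libmxnet.so"])
print(args.linking, args.libmxnet, args.licenses)
```

Because --licenses and --dependencies are plain optional lists here, nothing prevents them from being combined with --dynamic, which matches the note above.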
To pull an artifact, the user must specify the directory to which it should be downloaded. The user will also need to specify the variant, since more than one variant may be compatible with the host operating system.
Example:
./artifact_repository.py --pull --static --variant=cu102 ./dist
This would result in the following directory structure:
dist
 |-----> libmxnet.so
 |-----> libmxnet.meta
 |-----> licenses
           |-----> MKL_LICENSE.txt
           |-----> CUP_LICENSE.txt
           |-----> ...
 |-----> dependencies
           |-----> libxxx.so
           |-----> libyyy.so
           |-----> ...
The libmxnet.meta file will include the characteristics of the artifact (i.e. library type, variant, git commit id, etc.) in a “property” file format.
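To make the idea of a property file concrete, here is a small sketch that serializes and parses key=value lines. The exact syntax and the key names used below (git_sha, library_type, os, variant) are assumptions for illustration; the text does not specify them.

```python
def format_meta(properties):
    """Serialize a dict as sorted key=value lines (assumed format)."""
    lines = ["{}={}".format(key, properties[key])
             for key in sorted(properties)]
    return "\n".join(lines) + "\n"


def parse_meta(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a key=value line: {!r}".format(line))
        properties[key.strip()] = value.strip()
    return properties


# Hypothetical key names; the real file may use different ones.
meta = {
    "git_sha": "73b29fa90d3eac0b1fae403b7583fdd1529942dc",
    "library_type": "static",
    "os": "ubuntu16.04",
    "variant": "cu102",
}
text = format_meta(meta)
print(text, end="")
```

A downstream pipeline could read the file back with parse_meta to confirm that the pulled artifact has the variant and library type it expects.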