Inference and model training are two important topics in machine learning. Thanks to TVM and WebAssembly Executor, Teaclave is now able to run the former—inference tasks. TVM can convert a model (or computation graph) to an intermediate representation (IR) defined by TVM, and compile the binary of this model from the IR. Since TVM recruits LLVM to emit binary code and LLVM support WebAssembly as backend, Teaclave‘s WebAssembly Executor can then execute the model’s binary with additional lightweight runtime provided by TVM.
Although TVM has already provided an wasm-standalone example app, we still cannot copy and run it in Teaclave due to lack of WASI support and specific context file interface. This document mainly focuses on the what's different in Teaclave and we will finally build a MNIST inference function for Teaclave.
All the dependencies has been installed or built in our docker image. If you do not want to waste time on this step, you can skip this section with our image prepared.
TVM provides detailed build instruction in the document. Besides the dependencies listed on their website, we also need to install (e.g. on Ubuntu 18.04) these packages to build the example.
sudo apt install protobuf-compiler libprotoc-dev llvm-10 clang-10 pip3 install onnx==1.9.0 numpy decorator attrs spicy
::: tip NOTE At the time of writing this document, latest onnx
cannot work because it depends on a higher version protobuf
, which is not provided by Ubuntu 18.04. We tested TVM with commit hash df06c5848f59108a8e6e7dffb997b4b659b573a7
. Later versions may work, but commits older than this one hardly work. :::
TVM offers a set of Python APIs for downloading, building, and testing the model. Specifically, to compile a graph into binary, we need to:
llvm-ar
) to a static libraryAfter completing these steps, we will generate a static library with the PackedFunc
exported for inference task.
The complete example build script can be found here.
Although the library is in WebAssembly, we can not use it directly in Teaclave because it lacks parameters, and the interfaces is also not compatible with Teaclave. So we need a wrapper program which contains a small runtime for the compiled computation graph. This wrapper should:
Our wrapper is dependent on TVM's Rust APIs. We use GraphExecutor
to achieve calling to the graph library. Detailed mechanisms are explained in TVM's example. our example can be found here.
::: tip NOTE To compile a Teaclave-compatible WASM binary, please make sure your Rust version > 1.53. We tested on 1.54 stable. :::
Just like any other Teaclave function, users need to prepare a simple Python script to pass the function and data to Teaclave, and then get the result back. The script of this example is here.
::: tip NOTE To compile a Teaclave-compatible WASM binary, please make sure your Rust version > 1.53. We tested on 1.54 stable. :::