Now TVM comes with a brand-new OpenGL/WebGL backend! This blog post explains what it is, and what you can achieve with it.
TVM already targets many backends covering a variety of platforms: CPU, GPU, mobile devices, and more. This time we are adding another backend: OpenGL/WebGL.
OpenGL/WebGL enables us to leverage the GPU in an environment that does not have CUDA installed. For the time being, it is also the only way of using the GPU inside a browser.
This new backend allows us to use OpenGL/WebGL in 3 different ways:

- **Local OpenGL**: We can compile a deep learning model into OpenGL and directly run it on the local machine, entirely bypassing CUDA.
- **WebGL with RPC**: We can compile a deep learning model into WebGL and send it to a browser through the RPC mechanism, where it runs inside the browser.
- **WebGL with static library**: We can compile a deep learning model into WebGL, link it with the TVM JavaScript runtime, and deploy the whole package on a website.

We rely on Emscripten and its fastcomp LLVM backend to generate the JavaScript host code.
{:center: style="text-align: center"} {: width="65%"}
Figure 1 {:center}
See here for examples of all three of them.
Running a neural network in a browser isn't an entirely new thing. Andrej Karpathy's ConvNetJS and Google's deeplearn.js are examples of that.
So what's unique about TVM with WebGL? The big difference is that the op kernels in TVM are automatically compiled, not handwritten. As shown in Figure 2, TVM uses a unified AST to define kernels and compiles them into code for the different platforms.
{:center: style="text-align: center"} {: width="50%"}
Figure 2 {:center}
This means that you don't have to write GLSL shaders by hand: each kernel is defined once as a computation, and TVM generates the OpenGL code (along with the code for every other backend) automatically. Adding a new kernel therefore becomes much cheaper, and optimizations apply across all targets.
Here we benchmark a typical workload: image classification using ResNet18.
The benchmark machine is a 5-year-old laptop with a 4-core (8-thread) Intel® Core™ i7-3610QM and a GTX 650M.
In this benchmark, we download a ResNet18 model from the Gluon model zoo and perform end-to-end classification on a cat image. We only measure the model execution time (without model/input/parameter loading), and each model is run 100 times to get an average. The results are shown in Figure 3.
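The measurement methodology above can be sketched as a simple timing loop: run the model many times and report the mean wall-clock time, excluding loading and setup. Here `run_once` is a hypothetical placeholder for one forward pass of the network.

```python
import time

def average_runtime(run_once, warmup=1, repeats=100):
    """Mean wall-clock time of `run_once` over `repeats` runs."""
    for _ in range(warmup):
        run_once()              # warm caches / JIT before measuring
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    return (time.perf_counter() - start) / repeats
```

For example, `average_runtime(lambda: model(image))` would time just the inference call, leaving model and input loading outside the measured region.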
{:center: style="text-align: center"}
Figure 3 {:center}
The benchmark is run in 4 different settings:

- **CPU (LLVM)**: The model is compiled into LLVM IR and runs entirely on the CPU.
- **OpenCL**: The kernels are compiled into OpenCL; the host code is in LLVM.
- **OpenGL**: Same as OpenCL, but with the kernels compiled into OpenGL.
- **WebGL**: The host code is compiled into JavaScript with Emscripten, the kernels are WebGL shaders, and everything runs inside Firefox.
From the results above we can observe that the TVM OpenGL backend performs similarly to OpenCL. More interestingly, the WebGL version inside the browser isn't significantly slower than desktop OpenGL. Considering that the host code is JavaScript, this is quite surprising. This might be because Emscripten generates asm.js, which enables dramatic optimizations in Firefox.
This is a first step toward automatic compilation of deep learning models into the web browser. We expect more performance improvements as we bring further optimizations into the TVM stack.
We thank the developers of Emscripten for providing the fastcomp toolchain and for their help during development.