{"nbformat": 4, "cells": [
{"source": "# Hybrid - Faster training and easy deployment\n\n*Note: a newer version is available [here](http://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html).*\n\nDeep learning frameworks can be roughly divided into two categories: declarative\nand imperative. With declarative frameworks (including TensorFlow, Theano, etc.)\nusers first declare a fixed computation graph and then execute it end-to-end.\nThe benefit of a fixed computation graph is that it is portable and runs more\nefficiently. However, it is less flexible, because any logic must be encoded\ninto the graph as special operators like `scan`, `while_loop` and `cond`.\nIt is also harder to debug.\n\nImperative frameworks (including PyTorch, Chainer, etc.) are just the opposite:\nthey execute commands one by one, just like old-fashioned MATLAB and NumPy.\nThis style is more flexible and easier to debug, but less efficient.\n\n`HybridBlock` seamlessly combines declarative programming and imperative programming\nto offer the benefits of both. Users can quickly develop and debug models with\nimperative programming and switch to efficient declarative execution by simply\ncalling `HybridBlock.hybridize()`.\n\n## HybridBlock\n\n`HybridBlock` is very similar to `Block` but has a few restrictions:\n\n- All child layers of a `HybridBlock` must also be `HybridBlock`s.\n- Only methods that are implemented for both `NDArray` and `Symbol` can be used.\n For example, you cannot use `.asnumpy()`, `.shape`, etc.\n- Operations cannot change from run to run. For example, you cannot do `if x:`\n if `x` differs on each iteration.\n\nTo use hybrid support, we subclass `HybridBlock`:", "cell_type": "markdown", "metadata": {}},
{"source": "import mxnet as mx\nfrom mxnet import gluon\nfrom mxnet.gluon import nn\n\nclass Net(gluon.HybridBlock):\n    def __init__(self, **kwargs):\n        super(Net, self).__init__(**kwargs)\n        with self.name_scope():\n            # layers created in name_scope will inherit name space\n            # from parent layer.\n            self.conv1 = nn.Conv2D(6, kernel_size=5)\n            self.pool1 = nn.MaxPool2D(pool_size=2)\n            self.conv2 = nn.Conv2D(16, kernel_size=5)\n            self.pool2 = nn.MaxPool2D(pool_size=2)\n            self.fc1 = nn.Dense(120)\n            self.fc2 = nn.Dense(84)\n            # You can use a Dense layer for fc3 but we do the dot product\n            # manually here for illustration purposes.\n            self.fc3_weight = self.params.get('fc3_weight', shape=(10, 84))\n\n    def hybrid_forward(self, F, x, fc3_weight):\n        # Here `F` can be either mx.nd or mx.sym, x is the input data,\n        # and fc3_weight is either self.fc3_weight.data() or\n        # self.fc3_weight.var() depending on whether x is NDArray or Symbol\n        print(x)\n        x = self.pool1(F.relu(self.conv1(x)))\n        x = self.pool2(F.relu(self.conv2(x)))\n        # 0 means copy over size from corresponding dimension.\n        # -1 means infer size from the rest of dimensions.\n        x = x.reshape((0, -1))\n        x = F.relu(self.fc1(x))\n        x = F.relu(self.fc2(x))\n        x = F.dot(x, fc3_weight, transpose_b=True)\n        return x", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},
{"source": "## Hybridize\n\nBy default, `HybridBlock` runs just like a standard `Block`. Each time a layer\nis called, its `hybrid_forward` will be run:", "cell_type": "markdown", "metadata": {}},
{"source": "net = Net()\nnet.collect_params().initialize()\nx = mx.nd.random_normal(shape=(16, 1, 28, 28))\nnet(x)\nx = mx.nd.random_normal(shape=(16, 1, 28, 28))\nnet(x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},
{"source": "Hybrid execution can be activated by simply calling `.hybridize()` on the\ntop-level layer. The first forward call after activation will try to build a\ncomputation graph from `hybrid_forward` and cache it. On subsequent forward\ncalls the cached graph, instead of `hybrid_forward`, will be invoked:", "cell_type": "markdown", "metadata": {}},
{"source": "net.hybridize()\nx = mx.nd.random_normal(shape=(16, 1, 28, 28))\nnet(x)\nx = mx.nd.random_normal(shape=(16, 1, 28, 28))\nnet(x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},
{"source": "Note that before hybridizing, `print(x)` printed out an NDArray on every forward\npass, but after hybridizing, only the first forward pass printed out a Symbol. On\nsubsequent forward passes `hybrid_forward` is not called, so nothing is printed.\n\nHybridizing will speed up execution and save memory. If the top-level layer is\nnot a `HybridBlock`, you can still call `.hybridize()` on it and Gluon will try\nto hybridize its child layers instead.\n\n## Serializing a trained model for deployment\n\nModels implemented as `HybridBlock`s can be easily serialized for deployment\nwith other language front-ends like C, C++ and Scala. To this end, we simply\nforward the model with symbolic variables instead of NDArrays and save the\noutput Symbol(s):", "cell_type": "markdown", "metadata": {}},
{"source": "x = mx.sym.var('data')\ny = net(x)\nprint(y)\ny.save('model.json')\nnet.collect_params().save('model.params')", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},
{"source": "If your network outputs more than one value, you can use `mx.sym.Group` to\ncombine them into a grouped Symbol and then save it, as sketched below. The saved\njson and params files can then be loaded with the C, C++ and Scala interfaces for\nprediction.", "cell_type": "markdown", "metadata": {}},
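{"source": "The following is a minimal, hypothetical sketch of grouping two outputs before\nsaving. `out1` and `out2` here are stand-ins for the Symbols that a multi-output\nnetwork would return when called with a symbolic input:", "cell_type": "markdown", "metadata": {}},
{"source": "# Hypothetical sketch: out1 and out2 stand in for the Symbols that a\n# multi-output network would return from its hybrid_forward.\na = mx.sym.var('data')\nout1 = mx.sym.FullyConnected(a, num_hidden=10, name='out1')\nout2 = mx.sym.FullyConnected(a, num_hidden=2, name='out2')\n# Group both outputs into a single Symbol so they are saved together.\ngrouped = mx.sym.Group([out1, out2])\ngrouped.save('multi_output_model.json')", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},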
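{"source": "Before handing the files to another front-end, it can be handy to verify the\nround trip in Python. The following is a minimal sketch, assuming your MXNet\nversion provides `gluon.SymbolBlock`; it reloads the `model.json` and\n`model.params` files saved above and runs a forward pass:", "cell_type": "markdown", "metadata": {}},
{"source": "# A minimal sketch of loading the serialized model back for prediction.\n# The same files can be consumed by the C, C++ and Scala front-ends.\nsym = mx.sym.load('model.json')\ndeserialized_net = gluon.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))\ndeserialized_net.collect_params().load('model.params', ctx=mx.cpu())\nout = deserialized_net(mx.nd.random_normal(shape=(16, 1, 28, 28)))\nprint(out.shape)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}},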
{"source": "<!-- INSERT SOURCE DOWNLOAD BUTTONS -->\n", "cell_type": "markdown", "metadata": {}}], "metadata": {"display_name": "", "name": "", "language": "python"}, "nbformat_minor": 2}