| { |
| "cells": [ |
| { |
| "cell_type": "markdown", |
| "id": "43291342", |
| "metadata": {}, |
| "source": [ |
| "<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n", |
| "<!--- or more contributor license agreements. See the NOTICE file -->\n", |
| "<!--- distributed with this work for additional information -->\n", |
| "<!--- regarding copyright ownership. The ASF licenses this file -->\n", |
| "<!--- to you under the Apache License, Version 2.0 (the -->\n", |
| "<!--- \"License\"); you may not use this file except in compliance -->\n", |
| "<!--- with the License. You may obtain a copy of the License at -->\n", |
| "\n", |
| "<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n", |
| "\n", |
| "<!--- Unless required by applicable law or agreed to in writing, -->\n", |
| "<!--- software distributed under the License is distributed on an -->\n", |
| "<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n", |
| "<!--- KIND, either express or implied. See the License for the -->\n", |
| "<!--- specific language governing permissions and limitations -->\n", |
| "<!--- under the License. -->\n", |
| "\n", |
| "# Parameter and Block Naming\n", |
| "\n", |
| "In gluon, each Parameter or Block has a name. Parameter names and Block names can be automatically created.\n", |
| "\n", |
| "In this tutorial we talk about the best practices on naming. First, let's import MXNet and Gluon:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 1, |
| "id": "d0f2b48c", |
| "metadata": {}, |
| "outputs": [], |
| "source": [ |
| "from __future__ import print_function\n", |
| "import mxnet as mx\n", |
| "from mxnet import gluon" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "131d42d5", |
| "metadata": {}, |
| "source": [ |
| "## Naming Blocks\n", |
| "\n", |
| "When creating a block, you can simply do as follows:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 2, |
| "id": "495370e3", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "Dense\n" |
| ] |
| } |
| ], |
| "source": [ |
| "mydense = gluon.nn.Dense(100)\n", |
| "print(mydense.__class__.__name__)" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "299b0f42", |
| "metadata": {}, |
| "source": [ |
| "When you create more Blocks of the same kind, they will be named with incrementing suffixes to avoid collision:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 3, |
| "id": "bbc49978", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "Dense\n" |
| ] |
| } |
| ], |
| "source": [ |
| "dense1 = gluon.nn.Dense(100)\n", |
| "print(dense1.__class__.__name__)" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "18c6df08", |
| "metadata": {}, |
| "source": [ |
| "## Naming Parameters\n", |
| "\n", |
| "Parameters will be named automatically by a unique name in the format of `param_{uuid4}_{name}`:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 4, |
| "id": "7a2666bf", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "bias\n" |
| ] |
| } |
| ], |
| "source": [ |
| "param = gluon.Parameter(name = 'bias')\n", |
| "print(param.name)" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "74c0b12c", |
| "metadata": {}, |
| "source": [ |
| "`param.name` is used as the name of a parameter's symbol representation. And it can not be changed once the parameter is created.\n", |
| "\n", |
| "When getting parameters within a Block, you should use the structure based name as the key:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 5, |
| "id": "25b970a8", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "{'weight': Parameter (shape=(100, -1), dtype=float32), 'bias': Parameter (shape=(100,), dtype=float32)}\n" |
| ] |
| } |
| ], |
| "source": [ |
| "print(dense1.collect_params())" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "f3ec261a", |
| "metadata": {}, |
| "source": [ |
| "## Nested Blocks\n", |
| "\n", |
| "In MXNet 2, we don't have to define children blocks within a `name_scope` any more. Let's demonstrate this by defining and initiating a simple neural net:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 6, |
| "id": "5d53cf1f", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stderr", |
| "output_type": "stream", |
| "text": [ |
| "[02:44:06] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU\n" |
| ] |
| }, |
| { |
| "data": { |
| "text/plain": [ |
| "array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", |
| " 0., 0., 0., 0.]])" |
| ] |
| }, |
| "execution_count": 6, |
| "metadata": {}, |
| "output_type": "execute_result" |
| } |
| ], |
| "source": [ |
| "class Model(gluon.HybridBlock):\n", |
| " def __init__(self):\n", |
| " super(Model, self).__init__()\n", |
| " self.dense0 = gluon.nn.Dense(20)\n", |
| " self.dense1 = gluon.nn.Dense(20)\n", |
| " self.mydense = gluon.nn.Dense(20)\n", |
| "\n", |
| " def forward(self, x):\n", |
| " x = mx.npx.relu(self.dense0(x))\n", |
| " x = mx.npx.relu(self.dense1(x))\n", |
| " return mx.npx.relu(self.mydense(x))\n", |
| "\n", |
| "model0 = Model()\n", |
| "model0.initialize()\n", |
| "model0.hybridize()\n", |
| "model0(mx.np.zeros((1, 20)))" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "de7d2b26", |
| "metadata": {}, |
| "source": [ |
| "The same principle also applies to container blocks like Sequential. We can simply do as follows:" |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 7, |
| "id": "7adf65db", |
| "metadata": {}, |
| "outputs": [], |
| "source": [ |
| "net = gluon.nn.Sequential()\n", |
| "net.add(gluon.nn.Dense(20))\n", |
| "net.add(gluon.nn.Dense(20))" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "cd2f0662", |
| "metadata": {}, |
| "source": [ |
| "## Saving and loading\n", |
| "\n", |
| "\n", |
| "For `HybridBlock`, we use `save_parameters`/`load_parameters`, which uses model structure, instead of parameter name, to match parameters." |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 8, |
| "id": "07ef1f9d", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "dict_keys(['dense0.bias', 'dense0.weight', 'dense1.bias', 'dense1.weight', 'mydense.bias', 'mydense.weight'])\n" |
| ] |
| } |
| ], |
| "source": [ |
| "model1 = Model()\n", |
| "model0.save_parameters('model.params')\n", |
| "model1.load_parameters('model.params')\n", |
| "print(mx.npx.load('model.params').keys())" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "217a9c6c", |
| "metadata": {}, |
| "source": [ |
| "For `SymbolBlock.imports`, we use `export`, which uses parameter name `param.name`, to save parameters." |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 9, |
| "id": "ab2c4dd1", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stderr", |
| "output_type": "stream", |
| "text": [ |
| "/work/mxnet/python/mxnet/gluon/block.py:1943: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:\n", |
| "\tdata: None\n", |
| " input_sym_arg_type = in_param.infer_type()[0]\n" |
| ] |
| } |
| ], |
| "source": [ |
| "model0.export('model0')\n", |
| "model2 = gluon.SymbolBlock.imports('model0-symbol.json', ['data'], 'model0-0000.params')" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "3b7e0914", |
| "metadata": {}, |
| "source": [ |
| "## Replacing Blocks from networks and fine-tuning\n", |
| "\n", |
| "Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.\n", |
| "\n", |
| "For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.\n", |
| "\n", |
| "To see how to do this, we first load a pretrained ResNet.\n", |
| "\n", |
| "- In Gluon model zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.\n", |
| "- Note that the output layer is a dense block with 1000 dimension outputs." |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 10, |
| "id": "9c64cb7d", |
| "metadata": {}, |
| "outputs": [ |
| { |
| "name": "stdout", |
| "output_type": "stream", |
| "text": [ |
| "Dense(2048 -> 1000, linear)\n" |
| ] |
| } |
| ], |
| "source": [ |
| "resnet = gluon.model_zoo.vision.resnet50_v2()\n", |
| "print(resnet.output)" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "c327e071", |
| "metadata": {}, |
| "source": [ |
| "To change the output to 100 dimension, we replace it with a new block." |
| ] |
| }, |
| { |
| "cell_type": "code", |
| "execution_count": 11, |
| "id": "45a8013c", |
| "metadata": {}, |
| "outputs": [], |
| "source": [ |
| "resnet.output = gluon.nn.Dense(100)\n", |
| "resnet.output.initialize()" |
| ] |
| } |
| ], |
| "metadata": { |
| "language_info": { |
| "name": "python" |
| } |
| }, |
| "nbformat": 4, |
| "nbformat_minor": 5 |
| } |