| { |
| "cells": [ |
| { |
| "cell_type": "markdown", |
| "id": "20f2bafd", |
| "metadata": {}, |
| "source": [ |
| "<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n", |
| "<!--- or more contributor license agreements. See the NOTICE file -->\n", |
| "<!--- distributed with this work for additional information -->\n", |
| "<!--- regarding copyright ownership. The ASF licenses this file -->\n", |
| "<!--- to you under the Apache License, Version 2.0 (the -->\n", |
| "<!--- \"License\"); you may not use this file except in compliance -->\n", |
| "<!--- with the License. You may obtain a copy of the License at -->\n", |
| "\n", |
| "<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n", |
| "\n", |
| "<!--- Unless required by applicable law or agreed to in writing, -->\n", |
| "<!--- software distributed under the License is distributed on an -->\n", |
| "<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n", |
| "<!--- KIND, either express or implied. See the License for the -->\n", |
| "<!--- specific language governing permissions and limitations -->\n", |
| "<!--- under the License. -->\n", |
| "\n", |
| "# Parameter and Block Naming\n", |
| "\n", |
| "In gluon, each Parameter or Block has a name (and prefix). Parameter names are specified by users and Block names can be either specified by users or automatically created.\n", |
| "\n", |
| "In this tutorial we talk about the best practices on naming. First, let's import MXNet and Gluon:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "7fd10a34", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "from __future__ import print_function\n", |
| "import mxnet as mx\n", |
| "from mxnet import gluon\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "2842b87c", |
| "metadata": {}, |
| "source": [ |
| "## Naming Blocks\n", |
| "\n", |
| "When creating a block, you can assign a prefix to it:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "8c33eee6", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "mydense = gluon.nn.Dense(100, prefix='mydense_')\n", |
| "print(mydense.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "afa6b54b", |
| "metadata": {}, |
| "source": [ |
| "When no prefix is given, Gluon will automatically generate one:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "78de3f9a", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "dense0 = gluon.nn.Dense(100)\n", |
| "print(dense0.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "373295b0", |
| "metadata": {}, |
| "source": [ |
| "When you create more Blocks of the same kind, they will be named with incrementing suffixes to avoid collision:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "a6e16794", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "dense1 = gluon.nn.Dense(100)\n", |
| "print(dense1.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "0dbb6c6a", |
| "metadata": {}, |
| "source": [ |
| "## Naming Parameters\n", |
| "\n", |
| "Parameters within a Block will be named by prepending the prefix of the Block to the name of the Parameter:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "bf4341de", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "print(dense0.collect_params())\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "b641c2d7", |
| "metadata": {}, |
| "source": [ |
| "## Name scopes\n", |
| "\n", |
| "To manage the names of nested Blocks, each Block has a `name_scope` attached to it. All Blocks created within a name scope will have its parent Block's prefix prepended to its name.\n", |
| "\n", |
| "Let's demonstrate this by first defining a simple neural net:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "541e6c01", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "class Model(gluon.Block):\n", |
| " def __init__(self, **kwargs):\n", |
| " super(Model, self).__init__(**kwargs)\n", |
| " with self.name_scope():\n", |
| " self.dense0 = gluon.nn.Dense(20)\n", |
| " self.dense1 = gluon.nn.Dense(20)\n", |
| " self.mydense = gluon.nn.Dense(20, prefix='mydense_')\n", |
| "\n", |
| " def forward(self, x):\n", |
| " x = mx.nd.relu(self.dense0(x))\n", |
| " x = mx.nd.relu(self.dense1(x))\n", |
| " return mx.nd.relu(self.mydense(x))\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "1a6ee6d5", |
| "metadata": {}, |
| "source": [ |
| "Now let's instantiate our neural net.\n", |
| "\n", |
| "- Note that `model0.dense0` is named as `model0_dense0_` instead of `dense0_`.\n", |
| "\n", |
| "- Also note that although we specified `mydense_` as prefix for `model.mydense`, its parent's prefix is automatically prepended to generate the prefix `model0_mydense_`." |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "cf50c1a0", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "model0 = Model()\n", |
| "model0.initialize()\n", |
| "model0(mx.nd.zeros((1, 20)))\n", |
| "print(model0.prefix)\n", |
| "print(model0.dense0.prefix)\n", |
| "print(model0.dense1.prefix)\n", |
| "print(model0.mydense.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "7a2d3419", |
| "metadata": {}, |
| "source": [ |
| "If we instantiate `Model` again, it will be given a different name like shown before for `Dense`.\n", |
| "\n", |
| "- Note that `model1.dense0` is still named as `dense0_` instead of `dense2_`, following dense layers in previously created `model0`. This is because each instance of model's name scope is independent of each other." |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "91e3ebfd", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "model1 = Model()\n", |
| "print(model1.prefix)\n", |
| "print(model1.dense0.prefix)\n", |
| "print(model1.dense1.prefix)\n", |
| "print(model1.mydense.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "bd2275e9", |
| "metadata": {}, |
| "source": [ |
| "**It is recommended that you manually specify a prefix for the top level Block, i.e. `model = Model(prefix='mymodel_')`, to avoid potential confusions in naming.**\n", |
| "\n", |
| "The same principle also applies to container blocks like Sequential. `name_scope` can be used inside `__init__` as well as out side of `__init__`:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "15743ba0", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "net = gluon.nn.Sequential()\n", |
| "with net.name_scope():\n", |
| " net.add(gluon.nn.Dense(20))\n", |
| " net.add(gluon.nn.Dense(20))\n", |
| "print(net.prefix)\n", |
| "print(net[0].prefix)\n", |
| "print(net[1].prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "fa91cfd5", |
| "metadata": {}, |
| "source": [ |
| "`gluon.model_zoo` also behaves similarly:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "fa5bcff0", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "net = gluon.nn.Sequential()\n", |
| "with net.name_scope():\n", |
| " net.add(gluon.model_zoo.vision.alexnet(pretrained=True))\n", |
| " net.add(gluon.model_zoo.vision.alexnet(pretrained=True))\n", |
| "print(net.prefix, net[0].prefix, net[1].prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "1bd3762c", |
| "metadata": {}, |
| "source": [ |
| "## Saving and loading\n", |
| "\n", |
| "Because model0 and model1 have different prefixes, their parameters also have different names:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "ca688934", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "print(model0.collect_params(), '\\n')\n", |
| "print(model1.collect_params())\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "b6558305", |
| "metadata": {}, |
| "source": [ |
| "As a result, if you try to save parameters from model0 and load it with model1, you'll get an error due to unmatching names:" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "c71c794d", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "model0.collect_params().save('model.params')\n", |
| "try:\n", |
| " model1.collect_params().load('model.params', mx.cpu())\n", |
| "except Exception as e:\n", |
| " print(e)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "e90b8df6", |
| "metadata": {}, |
| "source": [ |
| "To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. `save_parameters` uses model structure, instead of parameter name, to match parameters." |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "1d1e24f0", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "model0.save_parameters('model.params')\n", |
| "model1.load_parameters('model.params')\n", |
| "print(mx.nd.load('model.params').keys())\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "a547845e", |
| "metadata": {}, |
| "source": [ |
| "## Replacing Blocks from networks and fine-tuning\n", |
| "\n", |
| "Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.\n", |
| "\n", |
| "For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.\n", |
| "\n", |
| "To see how to do this, we first load a pretrained AlexNet.\n", |
| "\n", |
| "- In Gluon model zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.\n", |
| "- Note that the output layer is a dense block with 1000 dimension outputs." |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "fbdde321", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "alexnet = gluon.model_zoo.vision.alexnet(pretrained=True)\n", |
| "print(alexnet.output)\n", |
| "print(alexnet.output.prefix)\n", |
| "```\n" |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "d00cdd50", |
| "metadata": {}, |
| "source": [ |
| "To change the output to 100 dimension, we replace it with a new block." |
| ] |
| }, |
| { |
| "cell_type": "markdown", |
| "id": "a4047cfe", |
| "metadata": {}, |
| "source": [ |
| "```python\n", |
| "with alexnet.name_scope():\n", |
| " alexnet.output = gluon.nn.Dense(100)\n", |
| "alexnet.output.initialize()\n", |
| "print(alexnet.output)\n", |
| "print(alexnet.output.prefix)\n", |
| "```\n" |
| ] |
| } |
| ], |
| "metadata": { |
| "language_info": { |
| "name": "python" |
| } |
| }, |
| "nbformat": 4, |
| "nbformat_minor": 5 |
| } |