{
"cells": [
{
"cell_type": "markdown",
"id": "20f2bafd",
"metadata": {},
"source": [
"<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n",
"<!--- or more contributor license agreements. See the NOTICE file -->\n",
"<!--- distributed with this work for additional information -->\n",
"<!--- regarding copyright ownership. The ASF licenses this file -->\n",
"<!--- to you under the Apache License, Version 2.0 (the -->\n",
"<!--- \"License\"); you may not use this file except in compliance -->\n",
"<!--- with the License. You may obtain a copy of the License at -->\n",
"\n",
"<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n",
"\n",
"<!--- Unless required by applicable law or agreed to in writing, -->\n",
"<!--- software distributed under the License is distributed on an -->\n",
"<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n",
"<!--- KIND, either express or implied. See the License for the -->\n",
"<!--- specific language governing permissions and limitations -->\n",
"<!--- under the License. -->\n",
"\n",
"# Parameter and Block Naming\n",
"\n",
"In gluon, each Parameter or Block has a name (and prefix). Parameter names are specified by users and Block names can be either specified by users or automatically created.\n",
"\n",
"In this tutorial we talk about the best practices on naming. First, let's import MXNet and Gluon:"
]
},
{
"cell_type": "markdown",
"id": "7fd10a34",
"metadata": {},
"source": [
"```python\n",
"from __future__ import print_function\n",
"import mxnet as mx\n",
"from mxnet import gluon\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "2842b87c",
"metadata": {},
"source": [
"## Naming Blocks\n",
"\n",
"When creating a block, you can assign a prefix to it:"
]
},
{
"cell_type": "markdown",
"id": "8c33eee6",
"metadata": {},
"source": [
"```python\n",
"mydense = gluon.nn.Dense(100, prefix='mydense_')\n",
"print(mydense.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "afa6b54b",
"metadata": {},
"source": [
"When no prefix is given, Gluon will automatically generate one:"
]
},
{
"cell_type": "markdown",
"id": "78de3f9a",
"metadata": {},
"source": [
"```python\n",
"dense0 = gluon.nn.Dense(100)\n",
"print(dense0.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "373295b0",
"metadata": {},
"source": [
"When you create more Blocks of the same kind, they will be named with incrementing suffixes to avoid collision:"
]
},
{
"cell_type": "markdown",
"id": "a6e16794",
"metadata": {},
"source": [
"```python\n",
"dense1 = gluon.nn.Dense(100)\n",
"print(dense1.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "0dbb6c6a",
"metadata": {},
"source": [
"## Naming Parameters\n",
"\n",
"Parameters within a Block will be named by prepending the prefix of the Block to the name of the Parameter:"
]
},
{
"cell_type": "markdown",
"id": "bf4341de",
"metadata": {},
"source": [
"```python\n",
"print(dense0.collect_params())\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "b641c2d7",
"metadata": {},
"source": [
"## Name scopes\n",
"\n",
"To manage the names of nested Blocks, each Block has a `name_scope` attached to it. All Blocks created within a name scope will have its parent Block's prefix prepended to its name.\n",
"\n",
"Let's demonstrate this by first defining a simple neural net:"
]
},
{
"cell_type": "markdown",
"id": "541e6c01",
"metadata": {},
"source": [
"```python\n",
"class Model(gluon.Block):\n",
" def __init__(self, **kwargs):\n",
" super(Model, self).__init__(**kwargs)\n",
" with self.name_scope():\n",
" self.dense0 = gluon.nn.Dense(20)\n",
" self.dense1 = gluon.nn.Dense(20)\n",
" self.mydense = gluon.nn.Dense(20, prefix='mydense_')\n",
"\n",
" def forward(self, x):\n",
" x = mx.nd.relu(self.dense0(x))\n",
" x = mx.nd.relu(self.dense1(x))\n",
" return mx.nd.relu(self.mydense(x))\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "1a6ee6d5",
"metadata": {},
"source": [
"Now let's instantiate our neural net.\n",
"\n",
"- Note that `model0.dense0` is named as `model0_dense0_` instead of `dense0_`.\n",
"\n",
"- Also note that although we specified `mydense_` as prefix for `model.mydense`, its parent's prefix is automatically prepended to generate the prefix `model0_mydense_`."
]
},
{
"cell_type": "markdown",
"id": "cf50c1a0",
"metadata": {},
"source": [
"```python\n",
"model0 = Model()\n",
"model0.initialize()\n",
"model0(mx.nd.zeros((1, 20)))\n",
"print(model0.prefix)\n",
"print(model0.dense0.prefix)\n",
"print(model0.dense1.prefix)\n",
"print(model0.mydense.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "7a2d3419",
"metadata": {},
"source": [
"If we instantiate `Model` again, it will be given a different name like shown before for `Dense`.\n",
"\n",
"- Note that `model1.dense0` is still named as `dense0_` instead of `dense2_`, following dense layers in previously created `model0`. This is because each instance of model's name scope is independent of each other."
]
},
{
"cell_type": "markdown",
"id": "91e3ebfd",
"metadata": {},
"source": [
"```python\n",
"model1 = Model()\n",
"print(model1.prefix)\n",
"print(model1.dense0.prefix)\n",
"print(model1.dense1.prefix)\n",
"print(model1.mydense.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "bd2275e9",
"metadata": {},
"source": [
"**It is recommended that you manually specify a prefix for the top level Block, i.e. `model = Model(prefix='mymodel_')`, to avoid potential confusions in naming.**\n",
"\n",
"The same principle also applies to container blocks like Sequential. `name_scope` can be used inside `__init__` as well as out side of `__init__`:"
]
},
{
"cell_type": "markdown",
"id": "15743ba0",
"metadata": {},
"source": [
"```python\n",
"net = gluon.nn.Sequential()\n",
"with net.name_scope():\n",
" net.add(gluon.nn.Dense(20))\n",
" net.add(gluon.nn.Dense(20))\n",
"print(net.prefix)\n",
"print(net[0].prefix)\n",
"print(net[1].prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "fa91cfd5",
"metadata": {},
"source": [
"`gluon.model_zoo` also behaves similarly:"
]
},
{
"cell_type": "markdown",
"id": "fa5bcff0",
"metadata": {},
"source": [
"```python\n",
"net = gluon.nn.Sequential()\n",
"with net.name_scope():\n",
" net.add(gluon.model_zoo.vision.alexnet(pretrained=True))\n",
" net.add(gluon.model_zoo.vision.alexnet(pretrained=True))\n",
"print(net.prefix, net[0].prefix, net[1].prefix)\n",
"```\n"
]
},
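{
"cell_type": "markdown",
"id": "c3d9a7e1",
"metadata": {},
"source": [
"The naming behaviour described in this section can be summarised with a minimal pure-Python sketch. It is an illustration only, not Gluon's actual implementation: `ToyScope` and `make_prefix` are hypothetical names, and the sketch assumes one counter per name hint in each scope. It shows automatic suffixes, independent scopes, and an explicit top-level prefix:"
]
},
{
"cell_type": "markdown",
"id": "5b8e02f4",
"metadata": {},
"source": [
"```python\n",
"from collections import defaultdict\n",
"\n",
"\n",
"class ToyScope:\n",
"    '''Hypothetical stand-in for a Block's name scope (not Gluon's real class).'''\n",
"\n",
"    def __init__(self, prefix=''):\n",
"        self.prefix = prefix\n",
"        # One counter per hint ('dense', 'model', ...), private to this scope.\n",
"        self.counters = defaultdict(int)\n",
"\n",
"    def make_prefix(self, hint, prefix=None):\n",
"        '''Return a child prefix: this scope's prefix plus a given or generated name.'''\n",
"        if prefix is None:\n",
"            # Auto-generate 'dense0_', 'dense1_', ... within this scope only.\n",
"            prefix = '{}{}_'.format(hint, self.counters[hint])\n",
"            self.counters[hint] += 1\n",
"        return self.prefix + prefix\n",
"\n",
"\n",
"top = ToyScope()  # stands in for the global (top-level) scope\n",
"model0 = ToyScope(top.make_prefix('model'))\n",
"model1 = ToyScope(top.make_prefix('model'))\n",
"\n",
"print(model0.make_prefix('dense'))                     # model0_dense0_\n",
"print(model0.make_prefix('dense'))                     # model0_dense1_\n",
"print(model0.make_prefix('dense', prefix='mydense_'))  # model0_mydense_\n",
"print(model1.make_prefix('dense'))                     # model1_dense0_\n",
"\n",
"# An explicit top-level prefix avoids depending on creation order.\n",
"mymodel = ToyScope('mymodel_')\n",
"print(mymodel.make_prefix('dense'))                    # mymodel_dense0_\n",
"```\n"
]
},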
{
"cell_type": "markdown",
"id": "1bd3762c",
"metadata": {},
"source": [
"## Saving and loading\n",
"\n",
"Because model0 and model1 have different prefixes, their parameters also have different names:"
]
},
{
"cell_type": "markdown",
"id": "ca688934",
"metadata": {},
"source": [
"```python\n",
"print(model0.collect_params(), '\\n')\n",
"print(model1.collect_params())\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "b6558305",
"metadata": {},
"source": [
"As a result, if you try to save parameters from model0 and load it with model1, you'll get an error due to unmatching names:"
]
},
{
"cell_type": "markdown",
"id": "c71c794d",
"metadata": {},
"source": [
"```python\n",
"model0.collect_params().save('model.params')\n",
"try:\n",
" model1.collect_params().load('model.params', mx.cpu())\n",
"except Exception as e:\n",
" print(e)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "e90b8df6",
"metadata": {},
"source": [
"To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. `save_parameters` uses model structure, instead of parameter name, to match parameters."
]
},
{
"cell_type": "markdown",
"id": "1d1e24f0",
"metadata": {},
"source": [
"```python\n",
"model0.save_parameters('model.params')\n",
"model1.load_parameters('model.params')\n",
"print(mx.nd.load('model.params').keys())\n",
"```\n"
]
},
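{
"cell_type": "markdown",
"id": "9e41d6b7",
"metadata": {},
"source": [
"The difference between the two matching strategies can be sketched in plain Python. This is an illustration under simplifying assumptions, not Gluon's implementation: the dictionaries stand in for parameter files, the values are placeholders, and `structural_key` is a hypothetical helper that keys a parameter by its position in the model (assumed here to look like `dense0.weight`) rather than by its prefixed name:"
]
},
{
"cell_type": "markdown",
"id": "2f70c5a8",
"metadata": {},
"source": [
"```python\n",
"# Parameters as a name-based save would store them (values are placeholders).\n",
"model0_by_name = {'model0_dense0_weight': 'W0', 'model0_dense0_bias': 'b0'}\n",
"# Names that a second instance with a different prefix would look for.\n",
"model1_expected = ['model1_dense0_weight', 'model1_dense0_bias']\n",
"\n",
"# Name-based matching: none of model1's names exist in the saved file.\n",
"missing = [name for name in model1_expected if name not in model0_by_name]\n",
"print(missing)\n",
"\n",
"\n",
"def structural_key(name, prefix):\n",
"    '''Hypothetical helper: swap the model prefix for an attribute-style path.'''\n",
"    block, param = name[len(prefix):].rsplit('_', 1)\n",
"    return '{}.{}'.format(block, param)\n",
"\n",
"\n",
"# Structure-based matching: both models agree on keys such as 'dense0.weight'.\n",
"saved = {structural_key(n, 'model0_'): v for n, v in model0_by_name.items()}\n",
"wanted = {structural_key(n, 'model1_'): n for n in model1_expected}\n",
"loaded = {wanted[key]: saved[key] for key in wanted}\n",
"print(sorted(saved))\n",
"print(loaded)\n",
"```\n"
]
},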
{
"cell_type": "markdown",
"id": "a547845e",
"metadata": {},
"source": [
"## Replacing Blocks from networks and fine-tuning\n",
"\n",
"Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.\n",
"\n",
"For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.\n",
"\n",
"To see how to do this, we first load a pretrained AlexNet.\n",
"\n",
"- In Gluon model zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.\n",
"- Note that the output layer is a dense block with 1000 dimension outputs."
]
},
{
"cell_type": "markdown",
"id": "fbdde321",
"metadata": {},
"source": [
"```python\n",
"alexnet = gluon.model_zoo.vision.alexnet(pretrained=True)\n",
"print(alexnet.output)\n",
"print(alexnet.output.prefix)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "d00cdd50",
"metadata": {},
"source": [
"To change the output to 100 dimension, we replace it with a new block."
]
},
{
"cell_type": "markdown",
"id": "a4047cfe",
"metadata": {},
"source": [
"```python\n",
"with alexnet.name_scope():\n",
" alexnet.output = gluon.nn.Dense(100)\n",
"alexnet.output.initialize()\n",
"print(alexnet.output)\n",
"print(alexnet.output.prefix)\n",
"```\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}