blob: 53077bb57e77d23e645f3d27a0412c66ef6db016 [file] [log] [blame]
{
"cells": [
{
"cell_type": "markdown",
"id": "cf6792bc",
"metadata": {},
"source": [
"<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n",
"<!--- or more contributor license agreements. See the NOTICE file -->\n",
"<!--- distributed with this work for additional information -->\n",
"<!--- regarding copyright ownership. The ASF licenses this file -->\n",
"<!--- to you under the Apache License, Version 2.0 (the -->\n",
"<!--- \"License\"); you may not use this file except in compliance -->\n",
"<!--- with the License. You may obtain a copy of the License at -->\n",
"\n",
"<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n",
"\n",
"<!--- Unless required by applicable law or agreed to in writing, -->\n",
"<!--- software distributed under the License is distributed on an -->\n",
"<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n",
"<!--- KIND, either express or implied. See the License for the -->\n",
"<!--- specific language governing permissions and limitations -->\n",
"<!--- under the License. -->\n",
"\n",
"# Parameter and Block Naming\n",
"\n",
"In gluon, each Parameter or Block has a name. Parameter names and Block names can be automatically created.\n",
"\n",
"In this tutorial we talk about the best practices on naming. First, let's import MXNet and Gluon:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "2b54b8c3",
"metadata": {},
"outputs": [],
"source": [
"from __future__ import print_function\n",
"import mxnet as mx\n",
"from mxnet import gluon"
]
},
{
"cell_type": "markdown",
"id": "e9593c5e",
"metadata": {},
"source": [
"## Naming Blocks\n",
"\n",
"When creating a block, you can simply do as follows:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9a13f298",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dense\n"
]
}
],
"source": [
"mydense = gluon.nn.Dense(100)\n",
"print(mydense.__class__.__name__)"
]
},
{
"cell_type": "markdown",
"id": "f5697139",
"metadata": {},
"source": [
"When you create more Blocks of the same kind, they will be named with incrementing suffixes to avoid collision:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bb0aef08",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dense\n"
]
}
],
"source": [
"dense1 = gluon.nn.Dense(100)\n",
"print(dense1.__class__.__name__)"
]
},
{
"cell_type": "markdown",
"id": "4e0ad6ad",
"metadata": {},
"source": [
"## Naming Parameters\n",
"\n",
"Parameters will be named automatically by a unique name in the format of `param_{uuid4}_{name}`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "007ca34b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bias\n"
]
}
],
"source": [
"param = gluon.Parameter(name = 'bias')\n",
"print(param.name)"
]
},
{
"cell_type": "markdown",
"id": "5a2d7bff",
"metadata": {},
"source": [
"`param.name` is used as the name of a parameter's symbol representation. And it can not be changed once the parameter is created.\n",
"\n",
"When getting parameters within a Block, you should use the structure based name as the key:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a463cd50",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'weight': Parameter (shape=(100, -1), dtype=float32), 'bias': Parameter (shape=(100,), dtype=float32)}\n"
]
}
],
"source": [
"print(dense1.collect_params())"
]
},
{
"cell_type": "markdown",
"id": "483e25cf",
"metadata": {},
"source": [
"## Nested Blocks\n",
"\n",
"In MXNet 2, we don't have to define children blocks within a `name_scope` any more. Let's demonstrate this by defining and initiating a simple neural net:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4deb679f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[04:46:00] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU\n"
]
},
{
"data": {
"text/plain": [
"array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
" 0., 0., 0., 0.]])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"class Model(gluon.HybridBlock):\n",
" def __init__(self):\n",
" super(Model, self).__init__()\n",
" self.dense0 = gluon.nn.Dense(20)\n",
" self.dense1 = gluon.nn.Dense(20)\n",
" self.mydense = gluon.nn.Dense(20)\n",
"\n",
" def forward(self, x):\n",
" x = mx.npx.relu(self.dense0(x))\n",
" x = mx.npx.relu(self.dense1(x))\n",
" return mx.npx.relu(self.mydense(x))\n",
"\n",
"model0 = Model()\n",
"model0.initialize()\n",
"model0.hybridize()\n",
"model0(mx.np.zeros((1, 20)))"
]
},
{
"cell_type": "markdown",
"id": "f16d59ae",
"metadata": {},
"source": [
"The same principle also applies to container blocks like Sequential. We can simply do as follows:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "117bba74",
"metadata": {},
"outputs": [],
"source": [
"net = gluon.nn.Sequential()\n",
"net.add(gluon.nn.Dense(20))\n",
"net.add(gluon.nn.Dense(20))"
]
},
{
"cell_type": "markdown",
"id": "52d7cbd2",
"metadata": {},
"source": [
"## Saving and loading\n",
"\n",
"\n",
"For `HybridBlock`, we use `save_parameters`/`load_parameters`, which uses model structure, instead of parameter name, to match parameters."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "29c1f4c3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['dense0.bias', 'dense0.weight', 'dense1.bias', 'dense1.weight', 'mydense.bias', 'mydense.weight'])\n"
]
}
],
"source": [
"model1 = Model()\n",
"model0.save_parameters('model.params')\n",
"model1.load_parameters('model.params')\n",
"print(mx.npx.load('model.params').keys())"
]
},
{
"cell_type": "markdown",
"id": "eb50e4ee",
"metadata": {},
"source": [
"For `SymbolBlock.imports`, we use `export`, which uses parameter name `param.name`, to save parameters."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "69b5b4f7",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/work/mxnet/python/mxnet/gluon/block.py:1943: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:\n",
"\tdata: None\n",
" input_sym_arg_type = in_param.infer_type()[0]\n"
]
}
],
"source": [
"model0.export('model0')\n",
"model2 = gluon.SymbolBlock.imports('model0-symbol.json', ['data'], 'model0-0000.params')"
]
},
{
"cell_type": "markdown",
"id": "4f8a8514",
"metadata": {},
"source": [
"## Replacing Blocks from networks and fine-tuning\n",
"\n",
"Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.\n",
"\n",
"For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.\n",
"\n",
"To see how to do this, we first load a pretrained ResNet.\n",
"\n",
"- In Gluon model zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.\n",
"- Note that the output layer is a dense block with 1000 dimension outputs."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "0c8c813a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dense(2048 -> 1000, linear)\n"
]
}
],
"source": [
"resnet = gluon.model_zoo.vision.resnet50_v2()\n",
"print(resnet.output)"
]
},
{
"cell_type": "markdown",
"id": "66b347a2",
"metadata": {},
"source": [
"To change the output to 100 dimension, we replace it with a new block."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "4e712d87",
"metadata": {},
"outputs": [],
"source": [
"resnet.output = gluon.nn.Dense(100)\n",
"resnet.output.initialize()"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}