{
"cells": [
{
"cell_type": "markdown",
"id": "6a14f615",
"metadata": {},
"source": [
"<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n",
"<!--- or more contributor license agreements. See the NOTICE file -->\n",
"<!--- distributed with this work for additional information -->\n",
"<!--- regarding copyright ownership. The ASF licenses this file -->\n",
"<!--- to you under the Apache License, Version 2.0 (the -->\n",
"<!--- \"License\"); you may not use this file except in compliance -->\n",
"<!--- with the License. You may obtain a copy of the License at -->\n",
"\n",
"<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n",
"\n",
"<!--- Unless required by applicable law or agreed to in writing, -->\n",
"<!--- software distributed under the License is distributed on an -->\n",
"<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n",
"<!--- KIND, either express or implied. See the License for the -->\n",
"<!--- specific language governing permissions and limitations -->\n",
"<!--- under the License. -->\n",
"\n",
"# Step 3: Automatic differentiation with autograd\n",
"\n",
"In this step, you learn how to use the MXNet `autograd` package to perform\n",
"gradient calculations.\n",
"\n",
"## Basic use\n",
"\n",
"To get started, import the `autograd` package with the following code."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a41a2302",
"metadata": {},
"outputs": [],
"source": [
"from mxnet import np, npx\n",
"from mxnet import autograd\n",
"npx.set_np()"
]
},
{
"cell_type": "markdown",
"id": "cfb4d4cf",
"metadata": {},
"source": [
"As an example, you could differentiate a function $f(x) = 2 x^2$ with respect to\n",
"parameter $x$. For Autograd, you can start by assigning an initial value of $x$,\n",
"as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a55c84e6",
"metadata": {},
"outputs": [],
"source": [
"x = np.array([[1, 2], [3, 4]])\n",
"x"
]
},
{
"cell_type": "markdown",
"id": "008821e4",
"metadata": {},
"source": [
"After you compute the gradient of $f(x)$ with respect to $x$, you need a place\n",
"to store it. In MXNet, you can tell a ndarray that you plan to store a gradient\n",
"by invoking its `attach_grad` method, as shown in the following example."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db3eff4d",
"metadata": {},
"outputs": [],
"source": [
"x.attach_grad()"
]
},
{
"cell_type": "markdown",
"id": "5f1fecaa",
"metadata": {},
"source": [
"Next, define the function $y=f(x)$. To let MXNet store $y$, so that you can\n",
"compute gradients later, use the following code to put the definition inside an\n",
"`autograd.record()` scope."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db23322a",
"metadata": {},
"outputs": [],
"source": [
"with autograd.record():\n",
" y = 2 * x * x"
]
},
{
"cell_type": "markdown",
"id": "9069f608",
"metadata": {},
"source": [
"You can invoke back propagation (backprop) by calling `y.backward()`. When $y$\n",
"has more than one entry, `y.backward()` is equivalent to `y.sum().backward()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "13350863",
"metadata": {},
"outputs": [],
"source": [
"y.backward()"
]
},
{
"cell_type": "markdown",
"id": "59753b31",
"metadata": {},
"source": [
"Next, verify whether this is the expected output. Note that $y=2x^2$ and\n",
"$\\frac{dy}{dx} = 4x$, which should be `[[4, 8],[12, 16]]`. Check the\n",
"automatically computed results."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6f6e7317",
"metadata": {},
"outputs": [],
"source": [
"x.grad"
]
},
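{
"cell_type": "markdown",
"id": "f3a1c2d4",
"metadata": {},
"source": [
"As a quick sanity check, you can also compare the computed gradient with the\n",
"analytical gradient $4x$ elementwise:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7e9b0a2",
"metadata": {},
"outputs": [],
"source": [
"# Elementwise comparison with the analytical gradient 4 * x\n",
"x.grad == 4 * x"
]
},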
{
"cell_type": "markdown",
"id": "813eec0c",
"metadata": {},
"source": [
"Now you get to dive into `y.backward()` by first discussing a bit on gradients. As\n",
"alluded to earlier `y.backward()` is equivalent to `y.sum().backward()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e7fb1d5",
"metadata": {},
"outputs": [],
"source": [
"with autograd.record():\n",
" y = np.sum(2 * x * x)\n",
"y.backward()\n",
"x.grad"
]
},
{
"cell_type": "markdown",
"id": "fb985663",
"metadata": {},
"source": [
"Additionally, you can only run backward once. Unless you use the flag\n",
"`retain_graph` to be `True`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c621c976",
"metadata": {},
"outputs": [],
"source": [
"with autograd.record():\n",
" y = np.sum(2 * x * x)\n",
"y.backward(retain_graph=True)\n",
"print(x.grad)\n",
"print(\"Since you have retained your previous graph you can run backward again\")\n",
"y.backward()\n",
"print(x.grad)\n",
"\n",
"try:\n",
" y.backward()\n",
"except:\n",
" print(\"However, you can't do backward twice unless you retain the graph.\")"
]
},
{
"cell_type": "markdown",
"id": "e289a1db",
"metadata": {},
"source": [
"## Custom MXNet ndarray operations\n",
"\n",
"In order to understand the `backward()` method it is beneficial to first\n",
"understand how you can create custom operations. MXNet operators are classes\n",
"with a forward and backward method. Where the number of args in `backward()`\n",
"must equal the number of items returned in the `forward()` method. Additionally,\n",
"the number of arguments in the `forward()` method must match the number of\n",
"output arguments from `backward()`. You can modify the gradients in backward to\n",
"return custom gradients. For instance, below you can return a different gradient then\n",
"the actual derivative."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5583ca32",
"metadata": {},
"outputs": [],
"source": [
"class MyFirstCustomOperation(autograd.Function):\n",
" def __init__(self):\n",
" super().__init__()\n",
"\n",
" def forward(self,x,y):\n",
" return 2 * x, 2 * x * y, 2 * y\n",
"\n",
" def backward(self, dx, dxy, dy):\n",
" \"\"\"\n",
" The input number of arguments must match the number of outputs from forward.\n",
" Furthermore, the number of output arguments must match the number of inputs from forward.\n",
" \"\"\"\n",
" return x, y"
]
},
{
"cell_type": "markdown",
"id": "309142cc",
"metadata": {},
"source": [
"Now you can use the first custom operation you have built."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f820dd8",
"metadata": {},
"outputs": [],
"source": [
"x = np.random.uniform(-1, 1, (2, 3)) \n",
"y = np.random.uniform(-1, 1, (2, 3))\n",
"x.attach_grad()\n",
"y.attach_grad()\n",
"with autograd.record():\n",
" z = MyFirstCustomOperation()\n",
" z1, z2, z3 = z(x, y)\n",
" out = z1 + z2 + z3 \n",
"out.backward()\n",
"print(np.array_equiv(x.asnumpy(), x.asnumpy()))\n",
"print(np.array_equiv(y.asnumpy(), y.asnumpy()))"
]
},
{
"cell_type": "markdown",
"id": "6c08fbaf",
"metadata": {},
"source": [
"Alternatively, you may want to have a function which is different depending on\n",
"if you are training or not."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "57d1d283",
"metadata": {},
"outputs": [],
"source": [
"def my_first_function(x):\n",
" if autograd.is_training(): # Return something else when training\n",
" return(4 * x)\n",
" else:\n",
" return(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "706fc5b0",
"metadata": {},
"outputs": [],
"source": [
"y = my_first_function(x)\n",
"print(np.array_equiv(y.asnumpy(), x.asnumpy()))\n",
"with autograd.record(train_mode=False):\n",
" y = my_first_function(x)\n",
"y.backward()\n",
"print(x.grad)\n",
"with autograd.record(train_mode=True): # train_mode = True by default\n",
" y = my_first_function(x)\n",
"y.backward()\n",
"print(x.grad)"
]
},
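{
"cell_type": "markdown",
"id": "d4f8a1b3",
"metadata": {},
"source": [
"Besides `is_training()`, the `autograd` package also provides `autograd.is_recording()`\n",
"and the `autograd.pause()` scope. The following minimal sketch shows how you can check\n",
"whether operations are currently being recorded and temporarily pause recording:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2c6d9f0",
"metadata": {},
"outputs": [],
"source": [
"print(autograd.is_recording())  # False: nothing is being recorded here\n",
"with autograd.record():\n",
"    print(autograd.is_recording())  # True: operations in this scope are recorded\n",
"    with autograd.pause():\n",
"        # Recording is temporarily paused, e.g. for logging or metrics\n",
"        print(autograd.is_recording())\n",
"print(autograd.is_recording())  # False again outside the record scope"
]
},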
{
"cell_type": "markdown",
"id": "ae3a7bc5",
"metadata": {},
"source": [
"You could create functions with `autograd.record()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2703873f",
"metadata": {},
"outputs": [],
"source": [
"def my_second_function(x):\n",
" with autograd.record():\n",
" return(2 * x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1ba7abb4",
"metadata": {},
"outputs": [],
"source": [
"y = my_second_function(x)\n",
"y.backward()\n",
"print(x.grad)"
]
},
{
"cell_type": "markdown",
"id": "343fa056",
"metadata": {},
"source": [
"You can also combine multiple functions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d38f4f7b",
"metadata": {},
"outputs": [],
"source": [
"y = my_second_function(x)\n",
"with autograd.record():\n",
" z = my_second_function(y) + 2\n",
"z.backward()\n",
"print(x.grad)"
]
},
{
"cell_type": "markdown",
"id": "ac2db18a",
"metadata": {},
"source": [
"Additionally, MXNet records the execution trace and computes the gradient\n",
"accordingly. The following function `f` doubles the inputs until its `norm`\n",
"reaches 1000. Then it selects one element depending on the sum of its elements."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "880a5bb8",
"metadata": {},
"outputs": [],
"source": [
"def f(a):\n",
" b = a * 2\n",
" while np.abs(b).sum() < 1000:\n",
" b = b * 2\n",
" if b.sum() >= 0:\n",
" c = b[0]\n",
" else:\n",
" c = b[1]\n",
" return c"
]
},
{
"cell_type": "markdown",
"id": "8967c555",
"metadata": {},
"source": [
"In this example, you record the trace and feed in a random value."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d7b9e99",
"metadata": {},
"outputs": [],
"source": [
"a = np.random.uniform(size=2)\n",
"a.attach_grad()\n",
"with autograd.record():\n",
" c = f(a)\n",
"c.backward()"
]
},
{
"cell_type": "markdown",
"id": "8759dc87",
"metadata": {},
"source": [
"You can see that `b` is a linear function of `a`, and `c` is chosen from `b`.\n",
"The gradient with respect to `a` be will be either `[c/a[0], 0]` or `[0,\n",
"c/a[1]]`, depending on which element from `b` is picked. You see the results of\n",
"this example with this code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "526bd860",
"metadata": {},
"outputs": [],
"source": [
"a.grad == c / a"
]
},
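{
"cell_type": "markdown",
"id": "a9d3e5c7",
"metadata": {},
"source": [
"To see this more concretely, you can print `a.grad` next to `c / a`; the gradient\n",
"is nonzero only at the position of the selected element of `b`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1f4a8e6",
"metadata": {},
"outputs": [],
"source": [
"# a.grad is nonzero only at the position of the selected element of b\n",
"print(a.grad)\n",
"print(c / a)"
]
},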
{
"cell_type": "markdown",
"id": "85e05094",
"metadata": {},
"source": [
"As you can notice there are 3 values along the dimension 0, so taking a `mean`\n",
"along this axis is the same as summing that axis and multiplying by `1/3`.\n",
"\n",
"## Advanced MXNet ndarray operations with Autograd\n",
"\n",
"You can control gradients for different ndarray operations. For instance,\n",
"perhaps you want to check that the gradients are propagating properly?\n",
"the `attach_grad()` method automatically detaches itself from the gradient.\n",
"Therefore, the input up until y will no longer look like it has `x`. To\n",
"illustrate this notice that `x.grad` and `y.grad` is not the same in the second\n",
"example."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47d5f509",
"metadata": {},
"outputs": [],
"source": [
"with autograd.record():\n",
" y = 3 * x\n",
" y.attach_grad()\n",
" z = 4 * y + 2 * x\n",
"z.backward()\n",
"print(x.grad)\n",
"print(y.grad)"
]
},
{
"cell_type": "markdown",
"id": "d07133bf",
"metadata": {},
"source": [
"Is not the same as:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3b454a0",
"metadata": {},
"outputs": [],
"source": [
"with autograd.record():\n",
" y = 3 * x\n",
" z = 4 * y + 2 * x\n",
"z.backward()\n",
"print(x.grad)\n",
"print(y.grad)"
]
},
{
"cell_type": "markdown",
"id": "fc85bec1",
"metadata": {},
"source": [
"## Next steps\n",
"\n",
"Learn how to initialize weights, choose loss function, metrics and optimizers for training your neural network [Step 4: Necessary components\n",
"to train the neural network](./4-components.ipynb)."
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}