| <!DOCTYPE html> |
| |
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
| <head> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> |
| <meta http-equiv="x-ua-compatible" content="ie=edge"> |
| <style> |
| .dropdown { |
| position: relative; |
| display: inline-block; |
| } |
| |
| .dropdown-content { |
| display: none; |
| position: absolute; |
| background-color: #f9f9f9; |
| min-width: 160px; |
| box-shadow: 0px 8px 16px 0px rgba(0,0,0,0.2); |
| padding: 12px 16px; |
| z-index: 1; |
| text-align: left; |
| } |
| |
| .dropdown:hover .dropdown-content { |
| display: block; |
| } |
| |
| .dropdown-option:hover { |
| color: #FF4500; |
| } |
| |
| .dropdown-option-active { |
| color: #FF4500; |
| font-weight: lighter; |
| } |
| |
| .dropdown-option { |
| color: #000000; |
| font-weight: lighter; |
| } |
| |
| .dropdown-header { |
| color: #FFFFFF; |
| display: inline-flex; |
| } |
| |
| .dropdown-caret { |
| width: 18px; |
| height: 54px; |
| } |
| |
| .dropdown-caret-path { |
| fill: #FFFFFF; |
| } |
| </style> |
| |
| <title>mxnet.optimizer — Apache MXNet documentation</title> |
| |
| <link rel="stylesheet" href="../../_static/basic.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> |
| <link rel="stylesheet" type="text/css" href="../../_static/mxnet.css" /> |
| <link rel="stylesheet" href="../../_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/sphinx_materialdesign_theme.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/fontawesome/all.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/fonts.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/feedback.css" type="text/css" /> |
| <script id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script> |
| <script src="../../_static/jquery.js"></script> |
| <script src="../../_static/underscore.js"></script> |
| <script src="../../_static/doctools.js"></script> |
| <script src="../../_static/language_data.js"></script> |
| <script src="../../_static/matomo_analytics.js"></script> |
| <script src="../../_static/autodoc.js"></script> |
| <script crossorigin="anonymous" integrity="sha256-Ae2Vz/4ePdIu6ZyI/5ZGsYnb+m0JlOmKPjt6XZ9JJkA=" src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.4/require.min.js"></script> |
| <script async="async" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script> |
| <script type="text/x-mathjax-config">MathJax.Hub.Config({"tex2jax": {"inlineMath": [["$", "$"], ["\\(", "\\)"]], "processEscapes": true, "ignoreClass": "document", "processClass": "math|output_area"}})</script> |
| <script src="../../_static/sphinx_materialdesign_theme.js"></script> |
| <link rel="shortcut icon" href="../../_static/mxnet-icon.png"/> |
| <link rel="index" title="Index" href="../../genindex.html" /> |
| <link rel="search" title="Search" href="../../search.html" /> |
| <link rel="next" title="mxnet.lr_scheduler" href="../lr_scheduler/index.html" /> |
| <link rel="prev" title="mxnet.initializer" href="../initializer/index.html" /> |
| </head> |
| <body><header class="site-header" role="banner"> |
| <div class="wrapper"> |
<a class="site-title" rel="author" href="/"><img
          src="../../_static/mxnet_logo.png" class="site-header-logo" alt="Apache MXNet"></a>
| <nav class="site-nav"> |
| <input type="checkbox" id="nav-trigger" class="nav-trigger"/> |
| <label for="nav-trigger"> |
| <span class="menu-icon"> |
| <svg viewBox="0 0 18 15" width="18px" height="15px"> |
| <path d="M18,1.484c0,0.82-0.665,1.484-1.484,1.484H1.484C0.665,2.969,0,2.304,0,1.484l0,0C0,0.665,0.665,0,1.484,0 h15.032C17.335,0,18,0.665,18,1.484L18,1.484z M18,7.516C18,8.335,17.335,9,16.516,9H1.484C0.665,9,0,8.335,0,7.516l0,0 c0-0.82,0.665-1.484,1.484-1.484h15.032C17.335,6.031,18,6.696,18,7.516L18,7.516z M18,13.516C18,14.335,17.335,15,16.516,15H1.484 C0.665,15,0,14.335,0,13.516l0,0c0-0.82,0.665-1.483,1.484-1.483h15.032C17.335,12.031,18,12.695,18,13.516L18,13.516z"/> |
| </svg> |
| </span> |
| </label> |
| |
| <div class="trigger"> |
| <a class="page-link" href="/get_started">Get Started</a> |
| <a class="page-link" href="/features">Features</a> |
| <a class="page-link" href="/ecosystem">Ecosystem</a> |
| <a class="page-link page-current" href="/api">Docs & Tutorials</a> |
| <a class="page-link" href="/trusted_by">Trusted By</a> |
| <a class="page-link" href="https://github.com/apache/incubator-mxnet">GitHub</a> |
| <div class="dropdown" style="min-width:100px"> |
| <span class="dropdown-header">Apache |
<svg class="dropdown-caret icon icon-caret-bottom" viewBox="0 0 32 32" aria-hidden="true"><path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path></svg>
| </span> |
| <div class="dropdown-content" style="min-width:250px"> |
| <a href="https://www.apache.org/foundation/">Apache Software Foundation</a> |
| <a href="https://incubator.apache.org/">Apache Incubator</a> |
| <a href="https://www.apache.org/licenses/">License</a> |
| <a href="/versions/1.9.1/api/faq/security.html">Security</a> |
| <a href="https://privacy.apache.org/policies/privacy-policy-public.html">Privacy</a> |
| <a href="https://www.apache.org/events/current-event">Events</a> |
| <a href="https://www.apache.org/foundation/sponsorship.html">Sponsorship</a> |
| <a href="https://www.apache.org/foundation/thanks.html">Thanks</a> |
| </div> |
| </div> |
| <div class="dropdown"> |
| <span class="dropdown-header">master |
<svg class="dropdown-caret icon icon-caret-bottom" viewBox="0 0 32 32" aria-hidden="true"><path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path></svg>
| </span> |
| <div class="dropdown-content"> |
| <a class="dropdown-option-active" href="/versions/master/">master</a><br> |
| <a class="dropdown-option" href="/versions/1.9.1/">1.9.1</a><br> |
| <a class="dropdown-option" href="/versions/1.8.0/">1.8.0</a><br> |
| <a class="dropdown-option" href="/versions/1.7.0/">1.7.0</a><br> |
| <a class="dropdown-option" href="/versions/1.6.0/">1.6.0</a><br> |
| <a class="dropdown-option" href="/versions/1.5.0/">1.5.0</a><br> |
| <a class="dropdown-option" href="/versions/1.4.1/">1.4.1</a><br> |
| <a class="dropdown-option" href="/versions/1.3.1/">1.3.1</a><br> |
| <a class="dropdown-option" href="/versions/1.2.1/">1.2.1</a><br> |
| <a class="dropdown-option" href="/versions/1.1.0/">1.1.0</a><br> |
| <a class="dropdown-option" href="/versions/1.0.0/">1.0.0</a><br> |
| <a class="dropdown-option" href="/versions/0.12.1/">0.12.1</a><br> |
| <a class="dropdown-option" href="/versions/0.11.0/">0.11.0</a> |
| </div> |
| </div> |
| </div> |
| </nav> |
| </div> |
| </header> |
| <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header mdl-layout--fixed-drawer"><header class="mdl-layout__header mdl-layout__header--waterfall "> |
| <div class="mdl-layout__header-row"> |
| |
| <nav class="mdl-navigation breadcrumb"> |
| <a class="mdl-navigation__link" href="../index.html">Python API</a><i class="material-icons">navigate_next</i> |
| <a class="mdl-navigation__link is-active">mxnet.optimizer</a> |
| </nav> |
| <div class="mdl-layout-spacer"></div> |
| <nav class="mdl-navigation"> |
| |
| <form class="form-inline pull-sm-right" action="../../search.html" method="get"> |
| <div class="mdl-textfield mdl-js-textfield mdl-textfield--expandable mdl-textfield--floating-label mdl-textfield--align-right"> |
| <label id="quick-search-icon" class="mdl-button mdl-js-button mdl-button--icon" for="waterfall-exp"> |
| <i class="material-icons">search</i> |
| </label> |
| <div class="mdl-textfield__expandable-holder"> |
| <input class="mdl-textfield__input" type="text" name="q" id="waterfall-exp" placeholder="Search" /> |
| <input type="hidden" name="check_keywords" value="yes" /> |
| <input type="hidden" name="area" value="default" /> |
| </div> |
| </div> |
| <div class="mdl-tooltip" data-mdl-for="quick-search-icon"> |
| Quick search |
| </div> |
| </form> |
| |
| <a id="button-show-github" |
| href="https://github.com/apache/mxnet/edit/master/docs/python_docs/python/api/optimizer/index.rst" class="mdl-button mdl-js-button mdl-button--icon"> |
| <i class="material-icons">edit</i> |
| </a> |
| <div class="mdl-tooltip" data-mdl-for="button-show-github"> |
| Edit on Github |
| </div> |
| </nav> |
| </div> |
| <div class="mdl-layout__header-row header-links"> |
| <div class="mdl-layout-spacer"></div> |
| <nav class="mdl-navigation"> |
| </nav> |
| </div> |
| </header><header class="mdl-layout__drawer"> |
| |
| <div class="globaltoc"> |
| <span class="mdl-layout-title toc">Table Of Contents</span> |
| |
| |
| |
| <nav class="mdl-navigation"> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../tutorials/index.html">Python Tutorials</a><ul> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/getting-started/index.html">Getting Started</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/crash-course/index.html">Crash Course</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/0-introduction.html">Introduction</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/1-nparray.html">Step 1: Manipulate data with NP on MXNet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/2-create-nn.html">Step 2: Create a neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/3-autograd.html">Step 3: Automatic differentiation with autograd</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/4-components.html">Step 4: Necessary components that are not in the network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html">Step 5: <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s and <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#Using-own-data-with-included-Datasets">Using own data with included <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#Using-your-own-data-with-custom-Datasets">Using your own data with custom <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#New-in-MXNet-2.0:-faster-C++-backend-dataloaders">New in MXNet 2.0: faster C++ backend dataloaders</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/6-train-nn.html">Step 6: Train a Neural Network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/7-use-gpus.html">Step 7: Load and Run a NN using GPU</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/index.html">Moving to MXNet from Other Frameworks</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/pytorch.html">PyTorch vs Apache MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_from_experiment_to_deployment.html">Gluon: from experiment to deployment</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_migration_guide.html">Gluon2.0: Migration Guide</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/logistic_regression_explained.html">Logistic regression explained</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/image/mnist.html">MNIST</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/packages/index.html">Packages</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/autograd/index.html">Automatic Differentiation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/gluon/index.html">Gluon</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/index.html">Blocks</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom-layer.html">Custom Layers</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/hybridize.html">Hybridize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/init.html">Initialization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/naming.html">Parameter and Block Naming</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/nn.html">Layers and Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/parameters.html">Parameter Management</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/save_load_params.html">Saving and Loading Gluon Models</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/activations/activations.html">Activation Blocks</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/data/index.html">Data Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/data_augmentation.html">Image Augmentation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html">Gluon <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s and <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Using-own-data-with-included-Datasets">Using own data with included <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Using-own-data-with-custom-Datasets">Using own data with custom <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Appendix:-Upgrading-from-Module-DataIter-to-Gluon-DataLoader">Appendix: Upgrading from Module <code class="docutils literal notranslate"><span class="pre">DataIter</span></code> to Gluon <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/image/index.html">Image Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/info_gan.html">Image similarity search with InfoGAN</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/mnist.html">Handwritten Digit Recognition</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/loss/index.html">Losses</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/custom-loss.html">Custom Loss Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/kl_divergence.html">Kullback-Leibler (KL) Divergence</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/loss.html">Loss functions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/text/index.html">Text Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/gnmt.html">Google Neural Machine Translation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/transformer.html">Machine Translation with Transformer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/training/index.html">Training</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/fit_api_tutorial.html">MXNet Gluon Fit API</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/trainer.html">Trainer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/index.html">Learning Rates</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_finder.html">Learning Rate Finder</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules.html">Learning Rate Schedules</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules_advanced.html">Advanced Learning Rate Schedules</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/normalization/index.html">Normalization Blocks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/kvstore/index.html">KVStore</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/kvstore/kvstore.html">Distributed Key-Value Store</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/legacy/index.html">Legacy</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/index.html">NDArray</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/01-ndarray-intro.html">An Intro: Manipulate Data the MXNet Way with NDArray</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/02-ndarray-operations.html">NDArray Operations</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/03-ndarray-contexts.html">NDArray Contexts</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/gotchas_numpy_in_mxnet.html">Gotchas using NumPy in Apache MXNet</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/index.html">Tutorials</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/csr.html">CSRNDArray - NDArray in Compressed Sparse Row Storage Format</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/row_sparse.html">RowSparseNDArray - NDArray for Sparse Gradient Updates</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/np/index.html">What is NP on MXNet</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/np/cheat-sheet.html">The NP on MXNet cheat sheet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/np/np-vs-numpy.html">Differences between NP on MXNet and NumPy</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/onnx/index.html">ONNX</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/fine_tuning_gluon.html">Fine-tuning an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/inference_on_onnx_model.html">Running inference on MXNet/Gluon from an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/deploy/export/onnx.html">Export ONNX Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/optimizer/index.html">Optimizers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/viz/index.html">Visualization</a><ul> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/visualize_graph">Visualize networks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/performance/index.html">Performance</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/compression/index.html">Compression</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/compression/int8.html">Deploy with int-8</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/float16">Float16</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/gradient_compression">Gradient Compression</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html">GluonCV with Quantized Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/backend/index.html">Accelerated Backend Tools</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/index.html">oneDNN</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_readme.html">Install MXNet with oneDNN</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_quantization.html">oneDNN Quantization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_quantization_inc.html">Improving accuracy with Intel® Neural Compressor</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tvm.html">Use TVM</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/profiler.html">Profiling MXNet Models</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/amp.html">Using AMP: Automatic Mixed Precision</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/deploy/index.html">Deployment</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/export/index.html">Export</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/export/onnx.html">Exporting to ONNX format</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/export_network.html">Export Gluon CV Models</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/blocks/save_load_params.html">Save / Load Parameters</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/inference/index.html">Inference</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/cpp.html">Deploy into C++</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/image_classification_jetson.html">Image Classication using pretrained ResNet-50 model on Jetson module</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/index.html">Run on AWS</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_ec2.html">Run on an EC2 Instance</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_sagemaker.html">Run on Amazon SageMaker</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/cloud.html">MXNet on the Cloud</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/extend/index.html">Extend</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/customop.html">Custom Numpy Operators</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/new_op">New Operator Creation</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/add_op_in_backend">New Operator in MXNet Backend</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/using_rtc">Using RTC for CUDA kernels</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Python API</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../np/index.html">mxnet.np</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../np/arrays.html">Array objects</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../np/arrays.ndarray.html">The N-dimensional array (<code class="xref py py-class docutils literal notranslate"><span class="pre">ndarray</span></code>)</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/arrays.indexing.html">Indexing</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../np/routines.html">Routines</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.array-creation.html">Array creation routines</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.eye.html">mxnet.np.eye</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.empty.html">mxnet.np.empty</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.full.html">mxnet.np.full</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.identity.html">mxnet.np.identity</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ones.html">mxnet.np.ones</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ones_like.html">mxnet.np.ones_like</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.zeros.html">mxnet.np.zeros</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.zeros_like.html">mxnet.np.zeros_like</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.array.html">mxnet.np.array</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.copy.html">mxnet.np.copy</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arange.html">mxnet.np.arange</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linspace.html">mxnet.np.linspace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.logspace.html">mxnet.np.logspace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.meshgrid.html">mxnet.np.meshgrid</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tril.html">mxnet.np.tril</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.array-manipulation.html">Array manipulation routines</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reshape.html">mxnet.np.reshape</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ravel.html">mxnet.np.ravel</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.flatten.html">mxnet.np.ndarray.flatten</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.swapaxes.html">mxnet.np.swapaxes</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.T.html">mxnet.np.ndarray.T</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.transpose.html">mxnet.np.transpose</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.moveaxis.html">mxnet.np.moveaxis</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rollaxis.html">mxnet.np.rollaxis</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.expand_dims.html">mxnet.np.expand_dims</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.squeeze.html">mxnet.np.squeeze</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.broadcast_to.html">mxnet.np.broadcast_to</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.broadcast_arrays.html">mxnet.np.broadcast_arrays</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_1d.html">mxnet.np.atleast_1d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_2d.html">mxnet.np.atleast_2d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_3d.html">mxnet.np.atleast_3d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.concatenate.html">mxnet.np.concatenate</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.stack.html">mxnet.np.stack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dstack.html">mxnet.np.dstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vstack.html">mxnet.np.vstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.column_stack.html">mxnet.np.column_stack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hstack.html">mxnet.np.hstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.split.html">mxnet.np.split</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hsplit.html">mxnet.np.hsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vsplit.html">mxnet.np.vsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.array_split.html">mxnet.np.array_split</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dsplit.html">mxnet.np.dsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tile.html">mxnet.np.tile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.repeat.html">mxnet.np.repeat</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.unique.html">mxnet.np.unique</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.delete.html">mxnet.np.delete</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.insert.html">mxnet.np.insert</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.append.html">mxnet.np.append</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.resize.html">mxnet.np.resize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trim_zeros.html">mxnet.np.trim_zeros</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reshape.html">mxnet.np.reshape</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flip.html">mxnet.np.flip</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.roll.html">mxnet.np.roll</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rot90.html">mxnet.np.rot90</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fliplr.html">mxnet.np.fliplr</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flipud.html">mxnet.np.flipud</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.io.html">Input and output</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.genfromtxt.html">mxnet.np.genfromtxt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.tolist.html">mxnet.np.ndarray.tolist</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.set_printoptions.html">mxnet.np.set_printoptions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.linalg.html">Linear algebra (<code class="xref py py-mod docutils literal notranslate"><span class="pre">numpy.linalg</span></code>)</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dot.html">mxnet.np.dot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vdot.html">mxnet.np.vdot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.inner.html">mxnet.np.inner</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.outer.html">mxnet.np.outer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tensordot.html">mxnet.np.tensordot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.einsum.html">mxnet.np.einsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.multi_dot.html">mxnet.np.linalg.multi_dot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.matmul.html">mxnet.np.matmul</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.matrix_power.html">mxnet.np.linalg.matrix_power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.kron.html">mxnet.np.kron</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.svd.html">mxnet.np.linalg.svd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.cholesky.html">mxnet.np.linalg.cholesky</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.qr.html">mxnet.np.linalg.qr</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eig.html">mxnet.np.linalg.eig</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigh.html">mxnet.np.linalg.eigh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigvals.html">mxnet.np.linalg.eigvals</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigvalsh.html">mxnet.np.linalg.eigvalsh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.norm.html">mxnet.np.linalg.norm</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trace.html">mxnet.np.trace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.cond.html">mxnet.np.linalg.cond</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.det.html">mxnet.np.linalg.det</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.matrix_rank.html">mxnet.np.linalg.matrix_rank</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.slogdet.html">mxnet.np.linalg.slogdet</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.solve.html">mxnet.np.linalg.solve</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.tensorsolve.html">mxnet.np.linalg.tensorsolve</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.lstsq.html">mxnet.np.linalg.lstsq</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.inv.html">mxnet.np.linalg.inv</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.pinv.html">mxnet.np.linalg.pinv</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.tensorinv.html">mxnet.np.linalg.tensorinv</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.math.html">Mathematical functions</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sin.html">mxnet.np.sin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cos.html">mxnet.np.cos</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tan.html">mxnet.np.tan</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arcsin.html">mxnet.np.arcsin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arccos.html">mxnet.np.arccos</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctan.html">mxnet.np.arctan</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.degrees.html">mxnet.np.degrees</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.radians.html">mxnet.np.radians</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hypot.html">mxnet.np.hypot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctan2.html">mxnet.np.arctan2</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.deg2rad.html">mxnet.np.deg2rad</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rad2deg.html">mxnet.np.rad2deg</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.unwrap.html">mxnet.np.unwrap</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sinh.html">mxnet.np.sinh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cosh.html">mxnet.np.cosh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tanh.html">mxnet.np.tanh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arcsinh.html">mxnet.np.arcsinh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arccosh.html">mxnet.np.arccosh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctanh.html">mxnet.np.arctanh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rint.html">mxnet.np.rint</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fix.html">mxnet.np.fix</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.floor.html">mxnet.np.floor</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ceil.html">mxnet.np.ceil</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trunc.html">mxnet.np.trunc</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.around.html">mxnet.np.around</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.round_.html">mxnet.np.round_</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sum.html">mxnet.np.sum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.prod.html">mxnet.np.prod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cumsum.html">mxnet.np.cumsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanprod.html">mxnet.np.nanprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nansum.html">mxnet.np.nansum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cumprod.html">mxnet.np.cumprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nancumprod.html">mxnet.np.nancumprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nancumsum.html">mxnet.np.nancumsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.diff.html">mxnet.np.diff</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ediff1d.html">mxnet.np.ediff1d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cross.html">mxnet.np.cross</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trapz.html">mxnet.np.trapz</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.exp.html">mxnet.np.exp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.expm1.html">mxnet.np.expm1</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log.html">mxnet.np.log</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log10.html">mxnet.np.log10</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log2.html">mxnet.np.log2</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log1p.html">mxnet.np.log1p</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.logaddexp.html">mxnet.np.logaddexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.i0.html">mxnet.np.i0</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ldexp.html">mxnet.np.ldexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.signbit.html">mxnet.np.signbit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.copysign.html">mxnet.np.copysign</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.frexp.html">mxnet.np.frexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.spacing.html">mxnet.np.spacing</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.lcm.html">mxnet.np.lcm</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.gcd.html">mxnet.np.gcd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.add.html">mxnet.np.add</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reciprocal.html">mxnet.np.reciprocal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.negative.html">mxnet.np.negative</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.divide.html">mxnet.np.divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.power.html">mxnet.np.power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.subtract.html">mxnet.np.subtract</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.mod.html">mxnet.np.mod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.multiply.html">mxnet.np.multiply</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.true_divide.html">mxnet.np.true_divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.remainder.html">mxnet.np.remainder</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.positive.html">mxnet.np.positive</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.float_power.html">mxnet.np.float_power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmod.html">mxnet.np.fmod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.modf.html">mxnet.np.modf</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.divmod.html">mxnet.np.divmod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.floor_divide.html">mxnet.np.floor_divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.clip.html">mxnet.np.clip</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sqrt.html">mxnet.np.sqrt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cbrt.html">mxnet.np.cbrt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.square.html">mxnet.np.square</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.absolute.html">mxnet.np.absolute</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sign.html">mxnet.np.sign</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.maximum.html">mxnet.np.maximum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.minimum.html">mxnet.np.minimum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fabs.html">mxnet.np.fabs</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.heaviside.html">mxnet.np.heaviside</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmax.html">mxnet.np.fmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmin.html">mxnet.np.fmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nan_to_num.html">mxnet.np.nan_to_num</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.interp.html">mxnet.np.interp</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/random/index.html">np.random</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.choice.html">mxnet.np.random.choice</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.shuffle.html">mxnet.np.random.shuffle</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.normal.html">mxnet.np.random.normal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.uniform.html">mxnet.np.random.uniform</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.rand.html">mxnet.np.random.rand</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.randint.html">mxnet.np.random.randint</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.beta.html">mxnet.np.random.beta</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.chisquare.html">mxnet.np.random.chisquare</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.exponential.html">mxnet.np.random.exponential</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.f.html">mxnet.np.random.f</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.gamma.html">mxnet.np.random.gamma</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.gumbel.html">mxnet.np.random.gumbel</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.laplace.html">mxnet.np.random.laplace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.logistic.html">mxnet.np.random.logistic</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.lognormal.html">mxnet.np.random.lognormal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.multinomial.html">mxnet.np.random.multinomial</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.multivariate_normal.html">mxnet.np.random.multivariate_normal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.pareto.html">mxnet.np.random.pareto</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.power.html">mxnet.np.random.power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.rayleigh.html">mxnet.np.random.rayleigh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.weibull.html">mxnet.np.random.weibull</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.sort.html">Sorting, searching, and counting</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.sort.html">mxnet.np.ndarray.sort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sort.html">mxnet.np.sort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.lexsort.html">mxnet.np.lexsort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argsort.html">mxnet.np.argsort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.msort.html">mxnet.np.msort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.partition.html">mxnet.np.partition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argpartition.html">mxnet.np.argpartition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argmax.html">mxnet.np.argmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argmin.html">mxnet.np.argmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanargmax.html">mxnet.np.nanargmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanargmin.html">mxnet.np.nanargmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argwhere.html">mxnet.np.argwhere</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nonzero.html">mxnet.np.nonzero</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flatnonzero.html">mxnet.np.flatnonzero</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.where.html">mxnet.np.where</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.searchsorted.html">mxnet.np.searchsorted</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.extract.html">mxnet.np.extract</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.count_nonzero.html">mxnet.np.count_nonzero</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.statistics.html">Statistics</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.min.html">mxnet.np.min</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.max.html">mxnet.np.max</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.amin.html">mxnet.np.amin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.amax.html">mxnet.np.amax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmin.html">mxnet.np.nanmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmax.html">mxnet.np.nanmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ptp.html">mxnet.np.ptp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.percentile.html">mxnet.np.percentile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanpercentile.html">mxnet.np.nanpercentile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.quantile.html">mxnet.np.quantile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanquantile.html">mxnet.np.nanquantile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.mean.html">mxnet.np.mean</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.std.html">mxnet.np.std</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.var.html">mxnet.np.var</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.median.html">mxnet.np.median</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.average.html">mxnet.np.average</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmedian.html">mxnet.np.nanmedian</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanstd.html">mxnet.np.nanstd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanvar.html">mxnet.np.nanvar</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.corrcoef.html">mxnet.np.corrcoef</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.correlate.html">mxnet.np.correlate</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cov.html">mxnet.np.cov</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram.html">mxnet.np.histogram</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram2d.html">mxnet.np.histogram2d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogramdd.html">mxnet.np.histogramdd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.bincount.html">mxnet.np.bincount</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram_bin_edges.html">mxnet.np.histogram_bin_edges</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.digitize.html">mxnet.np.digitize</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../npx/index.html">NPX: NumPy Neural Network Extension</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.set_np.html">mxnet.npx.set_np</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.reset_np.html">mxnet.npx.reset_np</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.cpu.html">mxnet.npx.cpu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.cpu_pinned.html">mxnet.npx.cpu_pinned</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gpu.html">mxnet.npx.gpu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gpu_memory_info.html">mxnet.npx.gpu_memory_info</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.current_device.html">mxnet.npx.current_device</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.num_gpus.html">mxnet.npx.num_gpus</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.activation.html">mxnet.npx.activation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_norm.html">mxnet.npx.batch_norm</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.convolution.html">mxnet.npx.convolution</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.dropout.html">mxnet.npx.dropout</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.embedding.html">mxnet.npx.embedding</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.fully_connected.html">mxnet.npx.fully_connected</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.layer_norm.html">mxnet.npx.layer_norm</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.pooling.html">mxnet.npx.pooling</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.rnn.html">mxnet.npx.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.leaky_relu.html">mxnet.npx.leaky_relu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_detection.html">mxnet.npx.multibox_detection</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_prior.html">mxnet.npx.multibox_prior</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_target.html">mxnet.npx.multibox_target</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.roi_pooling.html">mxnet.npx.roi_pooling</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.sigmoid.html">mxnet.npx.sigmoid</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.relu.html">mxnet.npx.relu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.smooth_l1.html">mxnet.npx.smooth_l1</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.softmax.html">mxnet.npx.softmax</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.log_softmax.html">mxnet.npx.log_softmax</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.topk.html">mxnet.npx.topk</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.waitall.html">mxnet.npx.waitall</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.load.html">mxnet.npx.load</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.save.html">mxnet.npx.save</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.one_hot.html">mxnet.npx.one_hot</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.pick.html">mxnet.npx.pick</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.reshape_like.html">mxnet.npx.reshape_like</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_flatten.html">mxnet.npx.batch_flatten</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_dot.html">mxnet.npx.batch_dot</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gamma.html">mxnet.npx.gamma</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.sequence_mask.html">mxnet.npx.sequence_mask</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../gluon/index.html">mxnet.gluon</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/block.html">gluon.Block</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/hybrid_block.html">gluon.HybridBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/symbol_block.html">gluon.SymbolBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/constant.html">gluon.Constant</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter.html">gluon.Parameter</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/trainer.html">gluon.Trainer</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/contrib/index.html">gluon.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/data/index.html">gluon.data</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../gluon/data/vision/index.html">data.vision</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/datasets/index.html">vision.datasets</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/transforms/index.html">vision.transforms</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/loss/index.html">gluon.loss</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/metric/index.html">gluon.metric</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/model_zoo/index.html">gluon.model_zoo.vision</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/nn/index.html">gluon.nn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/rnn/index.html">gluon.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/utils/index.html">gluon.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../autograd/index.html">mxnet.autograd</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../initializer/index.html">mxnet.initializer</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">mxnet.optimizer</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../lr_scheduler/index.html">mxnet.lr_scheduler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html">KVStore: Communication for Distributed Training</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#horovod">Horovod</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.Horovod.html">mxnet.kvstore.Horovod</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#byteps">BytePS</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.BytePS.html">mxnet.kvstore.BytePS</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#kvstore-interface">KVStore Interface</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStore.html">mxnet.kvstore.KVStore</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStoreBase.html">mxnet.kvstore.KVStoreBase</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStoreServer.html">mxnet.kvstore.KVStoreServer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../contrib/index.html">mxnet.contrib</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/io/index.html">contrib.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/ndarray/index.html">contrib.ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/onnx/index.html">contrib.onnx</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/quantization/index.html">contrib.quantization</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/symbol/index.html">contrib.symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorboard/index.html">contrib.tensorboard</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorrt/index.html">contrib.tensorrt</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/text/index.html">contrib.text</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../legacy/index.html">Legacy</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/callback/index.html">mxnet.callback</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/image/index.html">mxnet.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/io/index.html">mxnet.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/ndarray/index.html">mxnet.ndarray</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/ndarray.html">ndarray</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/contrib/index.html">ndarray.contrib</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/image/index.html">ndarray.image</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/linalg/index.html">ndarray.linalg</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/op/index.html">ndarray.op</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/random/index.html">ndarray.random</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/register/index.html">ndarray.register</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/sparse/index.html">ndarray.sparse</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/utils/index.html">ndarray.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/recordio/index.html">mxnet.recordio</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/symbol/index.html">mxnet.symbol</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/symbol.html">symbol</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/contrib/index.html">symbol.contrib</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/image/index.html">symbol.image</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/linalg/index.html">symbol.linalg</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/op/index.html">symbol.op</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/random/index.html">symbol.random</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/register/index.html">symbol.register</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/sparse/index.html">symbol.sparse</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/visualization/index.html">mxnet.visualization</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../device/index.html">mxnet.device</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../engine/index.html">mxnet.engine</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../executor/index.html">mxnet.executor</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore_server/index.html">mxnet.kvstore_server</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../profiler/index.html">mxnet.profiler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../rtc/index.html">mxnet.rtc</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../runtime/index.html">mxnet.runtime</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.Feature.html">mxnet.runtime.Feature</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.Features.html">mxnet.runtime.Features</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.feature_list.html">mxnet.runtime.feature_list</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../test_utils/index.html">mxnet.test_utils</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../util/index.html">mxnet.util</a></li> |
| </ul> |
| </li> |
| </ul> |
| |
| </nav> |
| |
| </div> |
| |
| </header> |
| <main class="mdl-layout__content" tabIndex="0"> |
| <header class="mdl-layout__drawer"> |
| |
| <div class="globaltoc"> |
| <span class="mdl-layout-title toc">Table Of Contents</span> |
| |
| |
| |
| <nav class="mdl-navigation"> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../tutorials/index.html">Python Tutorials</a><ul> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/getting-started/index.html">Getting Started</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/crash-course/index.html">Crash Course</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/0-introduction.html">Introduction</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/1-nparray.html">Step 1: Manipulate data with NP on MXNet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/2-create-nn.html">Step 2: Create a neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/3-autograd.html">Step 3: Automatic differentiation with autograd</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/4-components.html">Step 4: Necessary components that are not in the network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html">Step 5: <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s and <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#Using-own-data-with-included-Datasets">Using own data with included <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#Using-your-own-data-with-custom-Datasets">Using your own data with custom <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-datasets.html#New-in-MXNet-2.0:-faster-C++-backend-dataloaders">New in MXNet 2.0: faster C++ backend dataloaders</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/6-train-nn.html">Step 6: Train a Neural Network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/7-use-gpus.html">Step 7: Load and Run a NN using GPU</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/index.html">Moving to MXNet from Other Frameworks</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/pytorch.html">PyTorch vs Apache MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_from_experiment_to_deployment.html">Gluon: from experiment to deployment</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_migration_guide.html">Gluon2.0: Migration Guide</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/logistic_regression_explained.html">Logistic regression explained</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/image/mnist.html">MNIST</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/packages/index.html">Packages</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/autograd/index.html">Automatic Differentiation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/gluon/index.html">Gluon</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/index.html">Blocks</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom-layer.html">Custom Layers</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/hybridize.html">Hybridize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/init.html">Initialization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/naming.html">Parameter and Block Naming</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/nn.html">Layers and Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/parameters.html">Parameter Management</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/save_load_params.html">Saving and Loading Gluon Models</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/activations/activations.html">Activation Blocks</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/data/index.html">Data Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/data_augmentation.html">Image Augmentation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html">Gluon <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s and <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Using-own-data-with-included-Datasets">Using own data with included <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Using-own-data-with-custom-Datasets">Using own data with custom <code class="docutils literal notranslate"><span class="pre">Dataset</span></code>s</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/data/datasets.html#Appendix:-Upgrading-from-Module-DataIter-to-Gluon-DataLoader">Appendix: Upgrading from Module <code class="docutils literal notranslate"><span class="pre">DataIter</span></code> to Gluon <code class="docutils literal notranslate"><span class="pre">DataLoader</span></code></a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/image/index.html">Image Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/info_gan.html">Image similarity search with InfoGAN</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/mnist.html">Handwritten Digit Recognition</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/loss/index.html">Losses</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/custom-loss.html">Custom Loss Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/kl_divergence.html">Kullback-Leibler (KL) Divergence</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/loss.html">Loss functions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/text/index.html">Text Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/gnmt.html">Google Neural Machine Translation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/transformer.html">Machine Translation with Transformer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/training/index.html">Training</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/fit_api_tutorial.html">MXNet Gluon Fit API</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/trainer.html">Trainer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/index.html">Learning Rates</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_finder.html">Learning Rate Finder</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules.html">Learning Rate Schedules</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules_advanced.html">Advanced Learning Rate Schedules</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/normalization/index.html">Normalization Blocks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/kvstore/index.html">KVStore</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/kvstore/kvstore.html">Distributed Key-Value Store</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/legacy/index.html">Legacy</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/index.html">NDArray</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/01-ndarray-intro.html">An Intro: Manipulate Data the MXNet Way with NDArray</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/02-ndarray-operations.html">NDArray Operations</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/03-ndarray-contexts.html">NDArray Contexts</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/gotchas_numpy_in_mxnet.html">Gotchas using NumPy in Apache MXNet</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/index.html">Tutorials</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/csr.html">CSRNDArray - NDArray in Compressed Sparse Row Storage Format</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/legacy/ndarray/sparse/row_sparse.html">RowSparseNDArray - NDArray for Sparse Gradient Updates</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/np/index.html">What is NP on MXNet</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/np/cheat-sheet.html">The NP on MXNet cheat sheet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/np/np-vs-numpy.html">Differences between NP on MXNet and NumPy</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/onnx/index.html">ONNX</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/fine_tuning_gluon.html">Fine-tuning an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/inference_on_onnx_model.html">Running inference on MXNet/Gluon from an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/deploy/export/onnx.html">Export ONNX Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/optimizer/index.html">Optimizers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/viz/index.html">Visualization</a><ul> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/visualize_graph">Visualize networks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/performance/index.html">Performance</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/compression/index.html">Compression</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/compression/int8.html">Deploy with int-8</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/float16">Float16</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/gradient_compression">Gradient Compression</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html">GluonCV with Quantized Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/backend/index.html">Accelerated Backend Tools</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/index.html">oneDNN</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_readme.html">Install MXNet with oneDNN</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_quantization.html">oneDNN Quantization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/dnnl/dnnl_quantization_inc.html">Improving accuracy with Intel® Neural Compressor</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tvm.html">Use TVM</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/profiler.html">Profiling MXNet Models</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/amp.html">Using AMP: Automatic Mixed Precision</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/deploy/index.html">Deployment</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/export/index.html">Export</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/export/onnx.html">Exporting to ONNX format</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/export_network.html">Export Gluon CV Models</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/blocks/save_load_params.html">Save / Load Parameters</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/inference/index.html">Inference</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/cpp.html">Deploy into C++</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/image_classification_jetson.html">Image Classification using pretrained ResNet-50 model on Jetson module</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/index.html">Run on AWS</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_ec2.html">Run on an EC2 Instance</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_sagemaker.html">Run on Amazon SageMaker</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/cloud.html">MXNet on the Cloud</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/extend/index.html">Extend</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/customop.html">Custom Numpy Operators</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/new_op">New Operator Creation</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/add_op_in_backend">New Operator in MXNet Backend</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/using_rtc">Using RTC for CUDA kernels</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Python API</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../np/index.html">mxnet.np</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../np/arrays.html">Array objects</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../np/arrays.ndarray.html">The N-dimensional array (<code class="xref py py-class docutils literal notranslate"><span class="pre">ndarray</span></code>)</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/arrays.indexing.html">Indexing</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../np/routines.html">Routines</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.array-creation.html">Array creation routines</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.eye.html">mxnet.np.eye</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.empty.html">mxnet.np.empty</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.full.html">mxnet.np.full</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.identity.html">mxnet.np.identity</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ones.html">mxnet.np.ones</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ones_like.html">mxnet.np.ones_like</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.zeros.html">mxnet.np.zeros</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.zeros_like.html">mxnet.np.zeros_like</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.array.html">mxnet.np.array</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.copy.html">mxnet.np.copy</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arange.html">mxnet.np.arange</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linspace.html">mxnet.np.linspace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.logspace.html">mxnet.np.logspace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.meshgrid.html">mxnet.np.meshgrid</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tril.html">mxnet.np.tril</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.array-manipulation.html">Array manipulation routines</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reshape.html">mxnet.np.reshape</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ravel.html">mxnet.np.ravel</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.flatten.html">mxnet.np.ndarray.flatten</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.swapaxes.html">mxnet.np.swapaxes</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.T.html">mxnet.np.ndarray.T</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.transpose.html">mxnet.np.transpose</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.moveaxis.html">mxnet.np.moveaxis</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rollaxis.html">mxnet.np.rollaxis</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.expand_dims.html">mxnet.np.expand_dims</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.squeeze.html">mxnet.np.squeeze</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.broadcast_to.html">mxnet.np.broadcast_to</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.broadcast_arrays.html">mxnet.np.broadcast_arrays</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_1d.html">mxnet.np.atleast_1d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_2d.html">mxnet.np.atleast_2d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.atleast_3d.html">mxnet.np.atleast_3d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.concatenate.html">mxnet.np.concatenate</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.stack.html">mxnet.np.stack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dstack.html">mxnet.np.dstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vstack.html">mxnet.np.vstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.column_stack.html">mxnet.np.column_stack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hstack.html">mxnet.np.hstack</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.split.html">mxnet.np.split</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hsplit.html">mxnet.np.hsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vsplit.html">mxnet.np.vsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.array_split.html">mxnet.np.array_split</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dsplit.html">mxnet.np.dsplit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tile.html">mxnet.np.tile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.repeat.html">mxnet.np.repeat</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.unique.html">mxnet.np.unique</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.delete.html">mxnet.np.delete</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.insert.html">mxnet.np.insert</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.append.html">mxnet.np.append</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.resize.html">mxnet.np.resize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trim_zeros.html">mxnet.np.trim_zeros</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reshape.html">mxnet.np.reshape</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flip.html">mxnet.np.flip</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.roll.html">mxnet.np.roll</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rot90.html">mxnet.np.rot90</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fliplr.html">mxnet.np.fliplr</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flipud.html">mxnet.np.flipud</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.io.html">Input and output</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.genfromtxt.html">mxnet.np.genfromtxt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.tolist.html">mxnet.np.ndarray.tolist</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.set_printoptions.html">mxnet.np.set_printoptions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.linalg.html">Linear algebra (<code class="xref py py-mod docutils literal notranslate"><span class="pre">numpy.linalg</span></code>)</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.dot.html">mxnet.np.dot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.vdot.html">mxnet.np.vdot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.inner.html">mxnet.np.inner</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.outer.html">mxnet.np.outer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tensordot.html">mxnet.np.tensordot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.einsum.html">mxnet.np.einsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.multi_dot.html">mxnet.np.linalg.multi_dot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.matmul.html">mxnet.np.matmul</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.matrix_power.html">mxnet.np.linalg.matrix_power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.kron.html">mxnet.np.kron</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.svd.html">mxnet.np.linalg.svd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.cholesky.html">mxnet.np.linalg.cholesky</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.qr.html">mxnet.np.linalg.qr</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eig.html">mxnet.np.linalg.eig</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigh.html">mxnet.np.linalg.eigh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigvals.html">mxnet.np.linalg.eigvals</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.eigvalsh.html">mxnet.np.linalg.eigvalsh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.norm.html">mxnet.np.linalg.norm</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trace.html">mxnet.np.trace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.cond.html">mxnet.np.linalg.cond</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.det.html">mxnet.np.linalg.det</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.matrix_rank.html">mxnet.np.linalg.matrix_rank</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.slogdet.html">mxnet.np.linalg.slogdet</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.solve.html">mxnet.np.linalg.solve</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.tensorsolve.html">mxnet.np.linalg.tensorsolve</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.lstsq.html">mxnet.np.linalg.lstsq</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.inv.html">mxnet.np.linalg.inv</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.pinv.html">mxnet.np.linalg.pinv</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.linalg.tensorinv.html">mxnet.np.linalg.tensorinv</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.math.html">Mathematical functions</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sin.html">mxnet.np.sin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cos.html">mxnet.np.cos</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tan.html">mxnet.np.tan</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arcsin.html">mxnet.np.arcsin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arccos.html">mxnet.np.arccos</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctan.html">mxnet.np.arctan</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.degrees.html">mxnet.np.degrees</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.radians.html">mxnet.np.radians</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.hypot.html">mxnet.np.hypot</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctan2.html">mxnet.np.arctan2</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.deg2rad.html">mxnet.np.deg2rad</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rad2deg.html">mxnet.np.rad2deg</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.unwrap.html">mxnet.np.unwrap</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sinh.html">mxnet.np.sinh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cosh.html">mxnet.np.cosh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.tanh.html">mxnet.np.tanh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arcsinh.html">mxnet.np.arcsinh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arccosh.html">mxnet.np.arccosh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.arctanh.html">mxnet.np.arctanh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.rint.html">mxnet.np.rint</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fix.html">mxnet.np.fix</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.floor.html">mxnet.np.floor</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ceil.html">mxnet.np.ceil</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trunc.html">mxnet.np.trunc</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.around.html">mxnet.np.around</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.round_.html">mxnet.np.round_</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sum.html">mxnet.np.sum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.prod.html">mxnet.np.prod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cumsum.html">mxnet.np.cumsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanprod.html">mxnet.np.nanprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nansum.html">mxnet.np.nansum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cumprod.html">mxnet.np.cumprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nancumprod.html">mxnet.np.nancumprod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nancumsum.html">mxnet.np.nancumsum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.diff.html">mxnet.np.diff</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ediff1d.html">mxnet.np.ediff1d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cross.html">mxnet.np.cross</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.trapz.html">mxnet.np.trapz</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.exp.html">mxnet.np.exp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.expm1.html">mxnet.np.expm1</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log.html">mxnet.np.log</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log10.html">mxnet.np.log10</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log2.html">mxnet.np.log2</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.log1p.html">mxnet.np.log1p</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.logaddexp.html">mxnet.np.logaddexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.i0.html">mxnet.np.i0</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ldexp.html">mxnet.np.ldexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.signbit.html">mxnet.np.signbit</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.copysign.html">mxnet.np.copysign</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.frexp.html">mxnet.np.frexp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.spacing.html">mxnet.np.spacing</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.lcm.html">mxnet.np.lcm</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.gcd.html">mxnet.np.gcd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.add.html">mxnet.np.add</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.reciprocal.html">mxnet.np.reciprocal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.negative.html">mxnet.np.negative</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.divide.html">mxnet.np.divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.power.html">mxnet.np.power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.subtract.html">mxnet.np.subtract</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.mod.html">mxnet.np.mod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.multiply.html">mxnet.np.multiply</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.true_divide.html">mxnet.np.true_divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.remainder.html">mxnet.np.remainder</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.positive.html">mxnet.np.positive</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.float_power.html">mxnet.np.float_power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmod.html">mxnet.np.fmod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.modf.html">mxnet.np.modf</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.divmod.html">mxnet.np.divmod</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.floor_divide.html">mxnet.np.floor_divide</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.clip.html">mxnet.np.clip</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sqrt.html">mxnet.np.sqrt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cbrt.html">mxnet.np.cbrt</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.square.html">mxnet.np.square</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.absolute.html">mxnet.np.absolute</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sign.html">mxnet.np.sign</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.maximum.html">mxnet.np.maximum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.minimum.html">mxnet.np.minimum</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fabs.html">mxnet.np.fabs</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.heaviside.html">mxnet.np.heaviside</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmax.html">mxnet.np.fmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.fmin.html">mxnet.np.fmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nan_to_num.html">mxnet.np.nan_to_num</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.interp.html">mxnet.np.interp</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/random/index.html">np.random</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.choice.html">mxnet.np.random.choice</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.shuffle.html">mxnet.np.random.shuffle</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.normal.html">mxnet.np.random.normal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.uniform.html">mxnet.np.random.uniform</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.rand.html">mxnet.np.random.rand</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.randint.html">mxnet.np.random.randint</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.beta.html">mxnet.np.random.beta</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.chisquare.html">mxnet.np.random.chisquare</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.exponential.html">mxnet.np.random.exponential</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.f.html">mxnet.np.random.f</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.gamma.html">mxnet.np.random.gamma</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.gumbel.html">mxnet.np.random.gumbel</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.laplace.html">mxnet.np.random.laplace</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.logistic.html">mxnet.np.random.logistic</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.lognormal.html">mxnet.np.random.lognormal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.multinomial.html">mxnet.np.random.multinomial</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.multivariate_normal.html">mxnet.np.random.multivariate_normal</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.pareto.html">mxnet.np.random.pareto</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.power.html">mxnet.np.random.power</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.rayleigh.html">mxnet.np.random.rayleigh</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/random/generated/mxnet.np.random.weibull.html">mxnet.np.random.weibull</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.sort.html">Sorting, searching, and counting</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ndarray.sort.html">mxnet.np.ndarray.sort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.sort.html">mxnet.np.sort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.lexsort.html">mxnet.np.lexsort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argsort.html">mxnet.np.argsort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.msort.html">mxnet.np.msort</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.partition.html">mxnet.np.partition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argpartition.html">mxnet.np.argpartition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argmax.html">mxnet.np.argmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argmin.html">mxnet.np.argmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanargmax.html">mxnet.np.nanargmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanargmin.html">mxnet.np.nanargmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.argwhere.html">mxnet.np.argwhere</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nonzero.html">mxnet.np.nonzero</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.flatnonzero.html">mxnet.np.flatnonzero</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.where.html">mxnet.np.where</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.searchsorted.html">mxnet.np.searchsorted</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.extract.html">mxnet.np.extract</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.count_nonzero.html">mxnet.np.count_nonzero</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../np/routines.statistics.html">Statistics</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.min.html">mxnet.np.min</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.max.html">mxnet.np.max</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.amin.html">mxnet.np.amin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.amax.html">mxnet.np.amax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmin.html">mxnet.np.nanmin</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmax.html">mxnet.np.nanmax</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.ptp.html">mxnet.np.ptp</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.percentile.html">mxnet.np.percentile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanpercentile.html">mxnet.np.nanpercentile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.quantile.html">mxnet.np.quantile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanquantile.html">mxnet.np.nanquantile</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.mean.html">mxnet.np.mean</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.std.html">mxnet.np.std</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.var.html">mxnet.np.var</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.median.html">mxnet.np.median</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.average.html">mxnet.np.average</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanmedian.html">mxnet.np.nanmedian</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanstd.html">mxnet.np.nanstd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.nanvar.html">mxnet.np.nanvar</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.corrcoef.html">mxnet.np.corrcoef</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.correlate.html">mxnet.np.correlate</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.cov.html">mxnet.np.cov</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram.html">mxnet.np.histogram</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram2d.html">mxnet.np.histogram2d</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogramdd.html">mxnet.np.histogramdd</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.bincount.html">mxnet.np.bincount</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.histogram_bin_edges.html">mxnet.np.histogram_bin_edges</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../np/generated/mxnet.np.digitize.html">mxnet.np.digitize</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../npx/index.html">NPX: NumPy Neural Network Extension</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.set_np.html">mxnet.npx.set_np</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.reset_np.html">mxnet.npx.reset_np</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.cpu.html">mxnet.npx.cpu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.cpu_pinned.html">mxnet.npx.cpu_pinned</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gpu.html">mxnet.npx.gpu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gpu_memory_info.html">mxnet.npx.gpu_memory_info</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.current_device.html">mxnet.npx.current_device</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.num_gpus.html">mxnet.npx.num_gpus</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.activation.html">mxnet.npx.activation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_norm.html">mxnet.npx.batch_norm</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.convolution.html">mxnet.npx.convolution</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.dropout.html">mxnet.npx.dropout</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.embedding.html">mxnet.npx.embedding</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.fully_connected.html">mxnet.npx.fully_connected</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.layer_norm.html">mxnet.npx.layer_norm</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.pooling.html">mxnet.npx.pooling</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.rnn.html">mxnet.npx.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.leaky_relu.html">mxnet.npx.leaky_relu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_detection.html">mxnet.npx.multibox_detection</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_prior.html">mxnet.npx.multibox_prior</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.multibox_target.html">mxnet.npx.multibox_target</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.roi_pooling.html">mxnet.npx.roi_pooling</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.sigmoid.html">mxnet.npx.sigmoid</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.relu.html">mxnet.npx.relu</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.smooth_l1.html">mxnet.npx.smooth_l1</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.softmax.html">mxnet.npx.softmax</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.log_softmax.html">mxnet.npx.log_softmax</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.topk.html">mxnet.npx.topk</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.waitall.html">mxnet.npx.waitall</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.load.html">mxnet.npx.load</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.save.html">mxnet.npx.save</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.one_hot.html">mxnet.npx.one_hot</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.pick.html">mxnet.npx.pick</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.reshape_like.html">mxnet.npx.reshape_like</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_flatten.html">mxnet.npx.batch_flatten</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.batch_dot.html">mxnet.npx.batch_dot</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.gamma.html">mxnet.npx.gamma</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../npx/generated/mxnet.npx.sequence_mask.html">mxnet.npx.sequence_mask</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../gluon/index.html">mxnet.gluon</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/block.html">gluon.Block</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/hybrid_block.html">gluon.HybridBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/symbol_block.html">gluon.SymbolBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/constant.html">gluon.Constant</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter.html">gluon.Parameter</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/trainer.html">gluon.Trainer</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/contrib/index.html">gluon.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/data/index.html">gluon.data</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../gluon/data/vision/index.html">data.vision</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/datasets/index.html">vision.datasets</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/transforms/index.html">vision.transforms</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/loss/index.html">gluon.loss</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/metric/index.html">gluon.metric</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/model_zoo/index.html">gluon.model_zoo.vision</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/nn/index.html">gluon.nn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/rnn/index.html">gluon.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/utils/index.html">gluon.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../autograd/index.html">mxnet.autograd</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../initializer/index.html">mxnet.initializer</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">mxnet.optimizer</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../lr_scheduler/index.html">mxnet.lr_scheduler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html">KVStore: Communication for Distributed Training</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#horovod">Horovod</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.Horovod.html">mxnet.kvstore.Horovod</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#byteps">BytePS</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.BytePS.html">mxnet.kvstore.BytePS</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html#kvstore-interface">KVStore Interface</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStore.html">mxnet.kvstore.KVStore</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStoreBase.html">mxnet.kvstore.KVStoreBase</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../kvstore/generated/mxnet.kvstore.KVStoreServer.html">mxnet.kvstore.KVStoreServer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../contrib/index.html">mxnet.contrib</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/io/index.html">contrib.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/ndarray/index.html">contrib.ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/onnx/index.html">contrib.onnx</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/quantization/index.html">contrib.quantization</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/symbol/index.html">contrib.symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorboard/index.html">contrib.tensorboard</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorrt/index.html">contrib.tensorrt</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/text/index.html">contrib.text</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../legacy/index.html">Legacy</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/callback/index.html">mxnet.callback</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/image/index.html">mxnet.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/io/index.html">mxnet.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/ndarray/index.html">mxnet.ndarray</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/ndarray.html">ndarray</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/contrib/index.html">ndarray.contrib</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/image/index.html">ndarray.image</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/linalg/index.html">ndarray.linalg</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/op/index.html">ndarray.op</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/random/index.html">ndarray.random</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/register/index.html">ndarray.register</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/sparse/index.html">ndarray.sparse</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/ndarray/utils/index.html">ndarray.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/recordio/index.html">mxnet.recordio</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/symbol/index.html">mxnet.symbol</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/symbol.html">symbol</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/contrib/index.html">symbol.contrib</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/image/index.html">symbol.image</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/linalg/index.html">symbol.linalg</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/op/index.html">symbol.op</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/random/index.html">symbol.random</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/register/index.html">symbol.register</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../legacy/symbol/sparse/index.html">symbol.sparse</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../legacy/visualization/index.html">mxnet.visualization</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../device/index.html">mxnet.device</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../engine/index.html">mxnet.engine</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../executor/index.html">mxnet.executor</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore_server/index.html">mxnet.kvstore_server</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../profiler/index.html">mxnet.profiler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../rtc/index.html">mxnet.rtc</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../runtime/index.html">mxnet.runtime</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.Feature.html">mxnet.runtime.Feature</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.Features.html">mxnet.runtime.Features</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../runtime/generated/mxnet.runtime.feature_list.html">mxnet.runtime.feature_list</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../test_utils/index.html">mxnet.test_utils</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../util/index.html">mxnet.util</a></li> |
| </ul> |
| </li> |
| </ul> |
| |
| </nav> |
| |
| </div> |
| |
| </header> |
| |
| <div class="document"> |
| <div class="page-content" role="main"> |
| |
| <div class="section" id="module-mxnet.optimizer"> |
| <span id="mxnet-optimizer"></span><h1>mxnet.optimizer<a class="headerlink" href="#module-mxnet.optimizer" title="Permalink to this headline">¶</a></h1> |
| <p>Optimizer API of MXNet.</p> |
| <p><strong>Classes</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Optimizer</span></code></a>([rescale_grad, param_idx2name, …])</p></td> |
| <td><p>The base class inherited by all optimizers.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Test" title="mxnet.optimizer.Test"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Test</span></code></a>(**kwargs)</p></td> |
| <td><p>The Test optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater" title="mxnet.optimizer.Updater"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Updater</span></code></a>(optimizer)</p></td> |
| <td><p>Updater for kvstore.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD" title="mxnet.optimizer.SGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SGD</span></code></a>([learning_rate, momentum, lazy_update, …])</p></td> |
| <td><p>The SGD optimizer with momentum and weight decay.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD" title="mxnet.optimizer.SGLD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SGLD</span></code></a>([learning_rate, use_fused_step])</p></td> |
| <td><p>Stochastic Gradient Riemannian Langevin Dynamics.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum" title="mxnet.optimizer.Signum"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Signum</span></code></a>([learning_rate, momentum, wd_lh, …])</p></td> |
| <td><p>The Signum optimizer that takes the sign of gradient or momentum.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD" title="mxnet.optimizer.DCASGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">DCASGD</span></code></a>([learning_rate, momentum, lamda, …])</p></td> |
| <td><p>The DCASGD optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG" title="mxnet.optimizer.NAG"><code class="xref py py-obj docutils literal notranslate"><span class="pre">NAG</span></code></a>([learning_rate, momentum, …])</p></td> |
| <td><p>Nesterov accelerated gradient.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaBelief" title="mxnet.optimizer.AdaBelief"><code class="xref py py-obj docutils literal notranslate"><span class="pre">AdaBelief</span></code></a>([learning_rate, beta1, beta2, …])</p></td> |
| <td><p>The AdaBelief optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad" title="mxnet.optimizer.AdaGrad"><code class="xref py py-obj docutils literal notranslate"><span class="pre">AdaGrad</span></code></a>([learning_rate, epsilon, use_fused_step])</p></td> |
| <td><p>AdaGrad optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta" title="mxnet.optimizer.AdaDelta"><code class="xref py py-obj docutils literal notranslate"><span class="pre">AdaDelta</span></code></a>([learning_rate, rho, epsilon, …])</p></td> |
| <td><p>The AdaDelta optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam" title="mxnet.optimizer.Adam"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Adam</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>The Adam optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax" title="mxnet.optimizer.Adamax"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Adamax</span></code></a>([learning_rate, beta1, beta2, …])</p></td> |
| <td><p>The AdaMax optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam" title="mxnet.optimizer.Nadam"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Nadam</span></code></a>([learning_rate, beta1, beta2, …])</p></td> |
| <td><p>The Nesterov Adam optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl" title="mxnet.optimizer.Ftrl"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Ftrl</span></code></a>([learning_rate, lamda1, beta, …])</p></td> |
| <td><p>The Ftrl optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML" title="mxnet.optimizer.FTML"><code class="xref py py-obj docutils literal notranslate"><span class="pre">FTML</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>The FTML optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS" title="mxnet.optimizer.LARS"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LARS</span></code></a>([learning_rate, momentum, eta, …])</p></td> |
| <td><p>The LARS optimizer from ‘Large Batch Training of Convolutional Networks’ (<a class="reference external" href="https://arxiv.org/abs/1708.03888">https://arxiv.org/abs/1708.03888</a>)</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB" title="mxnet.optimizer.LAMB"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LAMB</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>LAMB Optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp" title="mxnet.optimizer.RMSProp"><code class="xref py py-obj docutils literal notranslate"><span class="pre">RMSProp</span></code></a>([learning_rate, rho, momentum, …])</p></td> |
| <td><p>The RMSProp optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LANS" title="mxnet.optimizer.LANS"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LANS</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>LANS Optimizer.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p><strong>Functions</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.create" title="mxnet.optimizer.create"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create</span></code></a>(name, **kwargs)</p></td> |
| <td><p>Instantiates an optimizer with a given name and kwargs.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.register" title="mxnet.optimizer.register"><code class="xref py py-obj docutils literal notranslate"><span class="pre">register</span></code></a>(klass)</p></td> |
| <td><p>Registers a new optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.get_updater" title="mxnet.optimizer.get_updater"><code class="xref py py-obj docutils literal notranslate"><span class="pre">get_updater</span></code></a>(optimizer)</p></td> |
| <td><p>Returns a closure of the updater needed for kvstore.</p></td> |
| </tr> |
| </tbody> |
| </table> |
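<p>The <cite>register</cite>/<cite>create</cite> pair above implements a name-based registry: <cite>register</cite> records an Optimizer subclass under its class name, and <cite>create</cite> looks the name up case-insensitively and instantiates it with the given kwargs. A minimal pure-Python sketch of this pattern (an illustration of the documented behavior, not the actual MXNet implementation):</p>

```python
# Minimal sketch of a name-based optimizer registry, mirroring the
# register()/create() behavior documented above. Illustrative only; the
# real mxnet.optimizer module adds duplicate-registration warnings etc.
_registry = {}

def register(klass):
    """Register an optimizer class under its lowercased class name."""
    _registry[klass.__name__.lower()] = klass
    return klass  # returning klass makes this usable as a decorator

def create(name, **kwargs):
    """Instantiate a registered optimizer by (case-insensitive) name."""
    key = name.lower()
    if key not in _registry:
        raise ValueError("Cannot find optimizer %s" % name)
    return _registry[key](**kwargs)

@register
class SGD:
    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate

opt = create("SGD", learning_rate=0.1)  # lookup is case-insensitive
```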
| <dl class="class"> |
| <dt id="mxnet.optimizer.Optimizer"> |
| <em class="property">class </em><code class="sig-name descname">Optimizer</code><span class="sig-paren">(</span><em class="sig-param">rescale_grad=1.0</em>, <em class="sig-param">param_idx2name=None</em>, <em class="sig-param">wd=0.0</em>, <em class="sig-param">clip_gradient=None</em>, <em class="sig-param">learning_rate=None</em>, <em class="sig-param">lr_scheduler=None</em>, <em class="sig-param">sym=None</em>, <em class="sig-param">begin_num_update=0</em>, <em class="sig-param">multi_precision=False</em>, <em class="sig-param">param_dict=None</em>, <em class="sig-param">aggregate_num=None</em>, <em class="sig-param">use_fused_step=None</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p> |
| <p>The base class inherited by all optimizers.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>rescale_grad</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default 1.0</em>) – Multiply the gradient by <cite>rescale_grad</cite> before updating. Often |
| chosen to be <code class="docutils literal notranslate"><span class="pre">1.0/batch_size</span></code>.</p></li> |
| <li><p><strong>param_idx2name</strong> (<em>dict from int to string</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – A dictionary that maps int index to string name.</p></li> |
| <li><p><strong>clip_gradient</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – Clip the gradient by projecting onto the box <code class="docutils literal notranslate"><span class="pre">[-clip_gradient,</span> <span class="pre">clip_gradient]</span></code>.</p></li> |
| <li><p><strong>learning_rate</strong> (<em>float</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>lr_scheduler</strong> (<a class="reference internal" href="../lr_scheduler/index.html#mxnet.lr_scheduler.LRScheduler" title="mxnet.lr_scheduler.LRScheduler"><em>LRScheduler</em></a><em>, </em><em>optional</em><em>, </em><em>default None</em>) – The learning rate scheduler.</p></li> |
| <li><p><strong>wd</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default 0.0</em>) – The weight decay (or L2 regularization) coefficient. Modifies objective |
| by adding a penalty for having large weights.</p></li> |
| <li><p><strong>sym</strong> (<a class="reference internal" href="../legacy/symbol/symbol.html#mxnet.symbol.Symbol" title="mxnet.symbol.Symbol"><em>Symbol</em></a><em>, </em><em>optional</em><em>, </em><em>default None</em>) – The Symbol this optimizer is applying to.</p></li> |
| <li><p><strong>begin_num_update</strong> (<em>int</em><em>, </em><em>optional</em><em>, </em><em>default 0</em>) – The initial number of updates.</p></li> |
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>optional</em><em>, </em><em>default False</em>) – Flag to control the internal precision of the optimizer. |
| False: uses the same precision as the weights (default). |
| True: keeps an internal 32-bit copy of the weights and applies gradients |
| in 32-bit precision even if the actual weights used in the model have lower precision. |
| Turning this on can improve convergence and accuracy when training with float16.</p></li> |
| <li><p><strong>param_dict</strong> (<em>dict of int -> gluon.Parameter</em><em>, </em><em>default None</em>) – Dictionary of parameter index to gluon.Parameter, used to lookup parameter attributes |
| such as lr_mult, wd_mult, etc. param_dict shall not be deep copied.</p></li> |
| <li><p><strong>aggregate_num</strong> (<em>int</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – Number of weights to be aggregated in a list. |
| They are passed to the optimizer for a single optimization step. |
| By default, only one weight is aggregated. |
| When <cite>aggregate_num</cite> is set to numpy.inf, all the weights are aggregated.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – Whether or not to use fused kernels for the optimizer. |
| When use_fused_step=False, step is called; |
| otherwise, fused_step is called.</p></li> |
| <li><p><strong>learning_rate</strong> (<em>property</em>) – The current learning rate of the optimizer. Given an Optimizer object |
| optimizer, its learning rate can be accessed as optimizer.learning_rate.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_optimizer" title="mxnet.optimizer.Optimizer.create_optimizer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_optimizer</span></code></a>(name, **kwargs)</p></td> |
| <td><p>Instantiates an optimizer with a given name and kwargs.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_state" title="mxnet.optimizer.Optimizer.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_state_multi_precision" title="mxnet.optimizer.Optimizer.create_state_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state_multi_precision</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight, including FP32 high precision copy if original weight is FP16.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.fused_step" title="mxnet.optimizer.Optimizer.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.register" title="mxnet.optimizer.Optimizer.register"><code class="xref py py-obj docutils literal notranslate"><span class="pre">register</span></code></a>(klass)</p></td> |
| <td><p>Registers a new optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_learning_rate" title="mxnet.optimizer.Optimizer.set_learning_rate"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_learning_rate</span></code></a>(lr)</p></td> |
| <td><p>Sets a new learning rate of the optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_lr_mult" title="mxnet.optimizer.Optimizer.set_lr_mult"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_lr_mult</span></code></a>(args_lr_mult)</p></td> |
| <td><p>Sets an individual learning rate multiplier for each parameter.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_wd_mult" title="mxnet.optimizer.Optimizer.set_wd_mult"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_wd_mult</span></code></a>(args_wd_mult)</p></td> |
| <td><p>Sets an individual weight decay multiplier for each parameter.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.step" title="mxnet.optimizer.Optimizer.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.update" title="mxnet.optimizer.Optimizer.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Calls step to perform a single optimization update if use_fused_step is False; otherwise calls fused_step.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.update_multi_precision" title="mxnet.optimizer.Optimizer.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Calls step to perform a single optimization update if use_fused_step is False; otherwise calls fused_step.</p></td> |
| </tr> |
| </tbody> |
| </table> |
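<p>As the table notes, <cite>update</cite> (and <cite>update_multi_precision</cite>) is essentially a dispatcher: depending on <cite>use_fused_step</cite> it forwards to either <cite>step</cite> or <cite>fused_step</cite>. A simplified sketch of that control flow (illustrative; the real <cite>Optimizer.update</cite> also handles aggregation, multi-precision copies, and state bookkeeping):</p>

```python
# Simplified sketch of the step/fused_step dispatch performed by update().
class SketchOptimizer:
    def __init__(self, use_fused_step=False):
        self.use_fused_step = use_fused_step
        self.calls = []  # records which update path ran

    def step(self, indices, weights, grads, states):
        self.calls.append("step")        # pure-Python update path

    def fused_step(self, indices, weights, grads, states):
        self.calls.append("fused_step")  # fused-kernel update path

    def update(self, indices, weights, grads, states):
        # Dispatch according to the use_fused_step flag, as documented.
        if self.use_fused_step:
            self.fused_step(indices, weights, grads, states)
        else:
            self.step(indices, weights, grads, states)

opt = SketchOptimizer(use_fused_step=True)
opt.update([0], [None], [None], [None])  # records "fused_step"
```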
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_optimizer"> |
| <em class="property">static </em><code class="sig-name descname">create_optimizer</code><span class="sig-paren">(</span><em class="sig-param">name</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_optimizer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_optimizer" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Instantiates an optimizer with a given name and kwargs.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>We can use the alias <cite>create</cite> for <code class="docutils literal notranslate"><span class="pre">Optimizer.create_optimizer</span></code>.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>name</strong> (<em>str</em>) – Name of the optimizer. Should be the name |
| of a subclass of Optimizer. Case insensitive.</p></li> |
| <li><p><strong>kwargs</strong> (<em>dict</em>) – Parameters for the optimizer.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p>An instantiated optimizer.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer">Optimizer</a></p> |
| </dd> |
| </dl> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sgd</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'sgd'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">sgd</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.SGD'></span> |
| <span class="gp">>>> </span><span class="n">adam</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="s1">'adam'</span><span class="p">,</span> <span class="n">learning_rate</span><span class="o">=</span><span class="mf">.1</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">adam</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.Adam'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional states, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
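<p>As a plain-Python sketch (hypothetical: lists stand in for NDArray), a momentum-style optimizer might allocate one zero-initialized buffer per weight:</p>

```python
# Hypothetical sketch of create_state for a momentum-style optimizer.
# Plain Python lists stand in for NDArray; illustrative only.
def create_state(index, weight):
    # One zero buffer with the same shape as the weight; `update`
    # reads and writes this buffer on every optimization step.
    return [0.0] * len(weight)

state = create_state(0, [0.5, -0.25, 1.0])
```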
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_state_multi_precision"> |
| <code class="sig-name descname">create_state_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_state_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_state_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight, including an FP32 |
| high-precision copy if the original weight is FP16.</p> |
| <p>This method is provided to perform automatic mixed precision training |
| for optimizers that do not support it themselves.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
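<p>The pattern can be sketched in plain Python (hypothetical: a dtype string and lists stand in for NDArray):</p>

```python
# Sketch of the multi-precision state pattern described above.
# Hypothetical: value lists plus a dtype string stand in for NDArray.
def create_state_multi_precision(index, weight_values, weight_dtype):
    base_state = [0.0] * len(weight_values)  # the ordinary optimizer state
    if weight_dtype == "float16":
        # Keep an FP32 master copy so updates accumulate in high precision.
        master_copy = [float(v) for v in weight_values]
        return (master_copy, base_state)
    return base_state

mp_state = create_state_multi_precision(0, [0.5, -0.25], "float16")
```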
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| New operators that fuse the optimizer’s update should be placed in this function.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.register"> |
| <em class="property">static </em><code class="sig-name descname">register</code><span class="sig-paren">(</span><em class="sig-param">klass</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.register"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.register" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Registers a new optimizer.</p> |
| <p>Once an optimizer is registered, we can create an instance of this |
| optimizer with <cite>create_optimizer</cite> later.</p> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nd">@mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">register</span> |
| <span class="gp">... </span><span class="k">class</span> <span class="nc">MyOptimizer</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="p">):</span> |
| <span class="gp">... </span> <span class="k">pass</span> |
| <span class="gp">>>> </span><span class="n">optim</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'MyOptimizer'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">optim</span><span class="p">))</span> |
| <span class="go"><class '__main__.MyOptimizer'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_learning_rate"> |
| <code class="sig-name descname">set_learning_rate</code><span class="sig-paren">(</span><em class="sig-param">lr</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_learning_rate"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_learning_rate" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets a new learning rate of the optimizer.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>lr</strong> (<em>float</em>) – The new learning rate of the optimizer.</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_lr_mult"> |
| <code class="sig-name descname">set_lr_mult</code><span class="sig-paren">(</span><em class="sig-param">args_lr_mult</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_lr_mult"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_lr_mult" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets an individual learning rate multiplier for each parameter.</p> |
| <p>If you specify a learning rate multiplier for a parameter, then |
| the learning rate for the parameter will be set as the product of |
| the global learning rate <cite>self.lr</cite> and its multiplier.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>The default learning rate multiplier of a <cite>Variable</cite> |
| can be set with <cite>lr_mult</cite> argument in the constructor.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>args_lr_mult</strong> (<em>dict of str/int to float</em>) – <p>For each of its key-value entries, the learning rate multiplier for the |
| parameter specified in the key will be set as the given value.</p> |
| <p>You can specify the parameter with either its name or its index. |
| If you use the name, you should pass <cite>sym</cite> in the constructor, |
| and the name you specified in the key of <cite>args_lr_mult</cite> should match |
| the name of the parameter in <cite>sym</cite>. If you use the index, it should |
| correspond to the index of the parameter used in the <cite>update</cite> method.</p> |
| <p>Specifying a parameter by its index is only supported for backward |
| compatibility, and we recommend using the name instead.</p> |
| </p> |
| </dd> |
| </dl> |
| </dd></dl> |
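<p>The resulting per-parameter learning rate can be sketched in plain Python (hypothetical parameter names; a dict stands in for the optimizer’s internal table):</p>

```python
# Sketch: lr_mult scales the global learning rate per parameter.
base_lr = 0.1
lr_mult = {"fc1_weight": 1.0, "fc1_bias": 2.0}  # hypothetical names

def effective_lr(param_name):
    # Parameters without an explicit multiplier fall back to 1.0.
    return base_lr * lr_mult.get(param_name, 1.0)
```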
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_wd_mult"> |
| <code class="sig-name descname">set_wd_mult</code><span class="sig-paren">(</span><em class="sig-param">args_wd_mult</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_wd_mult"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_wd_mult" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets an individual weight decay multiplier for each parameter.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>The default weight decay multiplier for a <cite>Variable</cite> |
| can be set with its <cite>wd_mult</cite> argument in the constructor.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>args_wd_mult</strong> (<em>dict of str/int to float</em>) – <p>For each of its key-value entries, the weight decay multiplier for the |
| parameter specified in the key will be set as the given value.</p> |
| <p>You can specify the parameter with either its name or its index. |
| If you use the name, you should pass <cite>sym</cite> in the constructor, |
| and the name you specified in the key of <cite>args_wd_mult</cite> should match |
| the name of the parameter in <cite>sym</cite>. If you use the index, it should |
| correspond to the index of the parameter used in the <cite>update</cite> method.</p> |
| <p>Specifying a parameter by its index is only supported for backward |
| compatibility, and we recommend using the name instead.</p> |
| </p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Calls <cite>step</cite> to perform a single optimization update if <cite>use_fused_step</cite> |
| is False; otherwise calls <cite>fused_step</cite>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
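<p>The dispatch described above can be sketched as follows (simplified; the real method also handles aggregation and state handling):</p>

```python
# Sketch of the use_fused_step dispatch: update() delegates either to
# step() or to fused_step(). Illustrative only.
class SketchOptimizer:
    def __init__(self, use_fused_step=False):
        self.use_fused_step = use_fused_step
        self.calls = []  # records which path was taken

    def step(self, indices, weights, grads, states):
        self.calls.append("step")

    def fused_step(self, indices, weights, grads, states):
        self.calls.append("fused_step")

    def update(self, indices, weights, grads, states):
        if self.use_fused_step:
            self.fused_step(indices, weights, grads, states)
        else:
            self.step(indices, weights, grads, states)

opt = SketchOptimizer(use_fused_step=True)
opt.update([0], [[1.0]], [[0.1]], [None])
```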
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Mixed-precision version of <cite>update</cite>: calls <cite>step</cite> to perform a single |
| optimization update if <cite>use_fused_step</cite> is False; otherwise calls <cite>fused_step</cite>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Test"> |
| <em class="property">class </em><code class="sig-name descname">Test</code><span class="sig-paren">(</span><em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Test optimizer</p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Test.create_state" title="mxnet.optimizer.Test.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates a state to duplicate weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Test.step" title="mxnet.optimizer.Test.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Performs w += rescale_grad * grad.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Test.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates a state to duplicate weight.</p> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Test.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Performs w += rescale_grad * grad.</p> |
| </dd></dl> |
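<p>The update it performs can be written out directly (plain lists stand in for NDArray; illustrative only):</p>

```python
# Sketch of the Test optimizer's step: w += rescale_grad * grad.
rescale_grad = 1.0
weight = [0.5, -0.25]
grad = [0.1, 0.2]
weight = [w + rescale_grad * g for w, g in zip(weight, grad)]
```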
| |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.create"> |
| <code class="sig-name descname">create</code><span class="sig-paren">(</span><em class="sig-param">name</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="headerlink" href="#mxnet.optimizer.create" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Instantiates an optimizer with a given name and kwargs.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>We can use the alias <cite>create</cite> for <code class="docutils literal notranslate"><span class="pre">Optimizer.create_optimizer</span></code>.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>name</strong> (<em>str</em>) – Name of the optimizer. Should be the name |
| of a subclass of Optimizer. Case insensitive.</p></li> |
| <li><p><strong>kwargs</strong> (<em>dict</em>) – Parameters for the optimizer.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p>An instantiated optimizer.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer">Optimizer</a></p> |
| </dd> |
| </dl> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sgd</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'sgd'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">sgd</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.SGD'></span> |
| <span class="gp">>>> </span><span class="n">adam</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="s1">'adam'</span><span class="p">,</span> <span class="n">learning_rate</span><span class="o">=</span><span class="mf">.1</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">adam</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.Adam'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.register"> |
| <code class="sig-name descname">register</code><span class="sig-paren">(</span><em class="sig-param">klass</em><span class="sig-paren">)</span><a class="headerlink" href="#mxnet.optimizer.register" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Registers a new optimizer.</p> |
| <p>Once an optimizer is registered, we can create an instance of this |
| optimizer with <cite>create_optimizer</cite> later.</p> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nd">@mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">register</span> |
| <span class="gp">... </span><span class="k">class</span> <span class="nc">MyOptimizer</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="p">):</span> |
| <span class="gp">... </span> <span class="k">pass</span> |
| <span class="gp">>>> </span><span class="n">optim</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'MyOptimizer'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">optim</span><span class="p">))</span> |
| <span class="go"><class '__main__.MyOptimizer'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Updater"> |
| <em class="property">class </em><code class="sig-name descname">Updater</code><span class="sig-paren">(</span><em class="sig-param">optimizer</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/updater.html#Updater"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p> |
| <p>Updater for kvstore.</p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.get_states" title="mxnet.optimizer.Updater.get_states"><code class="xref py py-obj docutils literal notranslate"><span class="pre">get_states</span></code></a>([dump_optimizer])</p></td> |
| <td><p>Gets updater states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.set_states" title="mxnet.optimizer.Updater.set_states"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_states</span></code></a>(states)</p></td> |
| <td><p>Sets updater states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.sync_state_context" title="mxnet.optimizer.Updater.sync_state_context"><code class="xref py py-obj docutils literal notranslate"><span class="pre">sync_state_context</span></code></a>(state, context)</p></td> |
| <td><p>Syncs state to the given context.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.get_states"> |
| <code class="sig-name descname">get_states</code><span class="sig-paren">(</span><em class="sig-param">dump_optimizer=False</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/updater.html#Updater.get_states"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.get_states" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Gets updater states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>dump_optimizer</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether to also save the optimizer itself. This would also save optimizer |
| information such as learning rate and weight decay schedules.</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.set_states"> |
| <code class="sig-name descname">set_states</code><span class="sig-paren">(</span><em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/updater.html#Updater.set_states"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.set_states" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets updater states.</p> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.sync_state_context"> |
| <code class="sig-name descname">sync_state_context</code><span class="sig-paren">(</span><em class="sig-param">state</em>, <em class="sig-param">context</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/updater.html#Updater.sync_state_context"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.sync_state_context" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Syncs state to the given context.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.get_updater"> |
| <code class="sig-name descname">get_updater</code><span class="sig-paren">(</span><em class="sig-param">optimizer</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/updater.html#get_updater"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.get_updater" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Returns a closure of the updater needed for kvstore.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>optimizer</strong> (<a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><em>Optimizer</em></a>) – The optimizer.</p> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>updater</strong> – The closure of the updater.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>function</p> |
| </dd> |
| </dl> |
| </dd></dl> |
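<p>The closure pattern can be sketched in plain Python (hypothetical: <cite>DummyOptimizer</cite> stands in for a real Optimizer and lists stand in for NDArray):</p>

```python
# Pure-Python sketch of the closure returned by get_updater.
# The updater lazily creates per-weight state, then delegates to update().
def get_updater(optimizer):
    states = {}
    def updater(index, grad, weight):
        if index not in states:
            states[index] = optimizer.create_state(index, weight)
        optimizer.update([index], [weight], [grad], [states[index]])
    return updater

class DummyOptimizer:  # hypothetical stand-in for mxnet.optimizer.Optimizer
    def create_state(self, index, weight):
        return [0.0] * len(weight)
    def update(self, indices, weights, grads, states):
        for w, g in zip(weights, grads):
            for i in range(len(w)):
                w[i] -= 0.1 * g[i]  # fixed lr of 0.1, illustrative only

update = get_updater(DummyOptimizer())
w = [1.0, 2.0]
update(0, [1.0, 1.0], w)
```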
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.SGD"> |
| <em class="property">class </em><code class="sig-name descname">SGD</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">momentum=0.0</em>, <em class="sig-param">lazy_update=False</em>, <em class="sig-param">multi_precision=False</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">aggregate_num=1</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgd.html#SGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The SGD optimizer with momentum and weight decay.</p> |
| <p>If the storage type of grad is <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code> and <code class="docutils literal notranslate"><span class="pre">lazy_update</span></code> is True, <strong>lazy updates</strong> are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">],</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">momentum</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| </pre></div> |
| </div> |
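| <p>The lazy update above can be sketched in plain Python (an illustrative re-implementation, not the actual fused kernel; <code class="docutils literal notranslate"><span class="pre">momentum</span></code> is treated as a scalar):</p>

```python
# Plain-Python sketch of the lazy (row-sparse) update: only rows listed
# in the sparse gradient's indices are touched.  Hyperparameter names
# (lr, momentum, wd, rescale_grad, clip_gradient) mirror the pseudocode.

def clip(x, bound):
    """Clip a scalar to the interval [-bound, bound]."""
    return max(-bound, min(bound, x))

def lazy_sgd_update(weight, state, grad, indices,
                    lr=0.1, momentum=0.9, wd=1e-4,
                    rescale_grad=1.0, clip_gradient=1.0):
    for row in indices:
        g = clip(rescale_grad * grad[row] + wd * weight[row], clip_gradient)
        state[row] = momentum * state[row] + lr * g
        weight[row] -= state[row]
    return weight, state

weight = [1.0, 2.0, 3.0]
state = [0.0, 0.0, 0.0]
grad = [0.5, 0.0, -0.25]
# Only rows 0 and 2 appear in the sparse gradient; row 1 is left untouched.
weight, state = lazy_sgd_update(weight, state, grad, indices=[0, 2])
```
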
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.create_state" title="mxnet.optimizer.SGD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.fused_step" title="mxnet.optimizer.SGD.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.step" title="mxnet.optimizer.SGD.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.update_multi_precision" title="mxnet.optimizer.SGD.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Override update_multi_precision.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The sparse update only updates the momentum for the weights whose row_sparse |
| gradient indices appear in the current batch, rather than updating it for all |
| indices. Compared with the original update, it can provide large |
| improvements in model training throughput for some applications. However, it |
| provides slightly different semantics than the original update, and |
| may lead to different empirical results.</p> |
| <p>When <code class="docutils literal notranslate"><span class="pre">update_on_kvstore</span></code> is set to False (either globally via the
| MXNET_UPDATE_ON_KVSTORE=0 environment variable or as a parameter in
| <a class="reference internal" href="../gluon/trainer.html#mxnet.gluon.Trainer" title="mxnet.gluon.Trainer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Trainer</span></code></a>), the SGD optimizer can perform an aggregated update
| of parameters, which may improve performance. The aggregation size
| is controlled by <code class="docutils literal notranslate"><span class="pre">aggregate_num</span></code> and defaults to 1.</p>
| <p>Otherwise, <strong>standard updates</strong> are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span>
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">rescaled_grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">state</span> |
| </pre></div> |
| </div> |
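| <p>The standard update can likewise be sketched in plain Python for a single scalar weight (illustrative only; the real optimizer operates on NDArrays and may dispatch to fused kernels):</p>

```python
# Plain-Python sketch of the standard momentum update, run for three
# steps on a constant gradient to show the velocity building up.

def clip(x, bound):
    return max(-bound, min(bound, x))

def sgd_step(weight, state, grad,
             lr=0.1, momentum=0.9, wd=0.0,
             rescale_grad=1.0, clip_gradient=1.0):
    rescaled_grad = clip(rescale_grad * grad, clip_gradient) + wd * weight
    state = momentum * state + lr * rescaled_grad
    weight = weight - state
    return weight, state

w, s = 1.0, 0.0
for _ in range(3):
    w, s = sgd_step(w, s, grad=0.2)
# Successive steps shrink w by roughly 0.02, 0.038, 0.0542:
# the momentum buffer accumulates across steps.
```
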
| <p>For details of the update algorithm see |
| <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.sgd_update" title="mxnet.ndarray.sgd_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_update</span></code></a> and <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.sgd_mom_update" title="mxnet.ndarray.sgd_mom_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_mom_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.</em>) – The momentum value.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>default False</em>) – If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li>
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>default False</em>) – Flag to control the internal precision of the optimizer.
| If False, the optimizer uses the same precision as the weights (default).
| If True, it keeps an internal 32-bit copy of the weights and applies gradients
| in 32-bit precision even if the actual weights used in the model have lower precision.
| Turning this on can improve convergence and accuracy when training with float16.</p></li>
| <li><p><strong>aggregate_num</strong> (<em>int</em><em>, </em><em>default 1</em>) – Number of weights to be aggregated in a list. |
| They are passed to the optimizer for a single optimization step.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgd.html#SGD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgd.html#SGD.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states.
| A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
| individual learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgd.html#SGD.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
| individual learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgd.html#SGD.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Override update_multi_precision.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.SGLD"> |
| <em class="property">class </em><code class="sig-name descname">SGLD</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">use_fused_step=False</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgld.html#SGLD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>Stochastic Gradient Riemannian Langevin Dynamics.</p> |
| <p>This class implements the optimizer described in the paper <em>Stochastic Gradient |
| Riemannian Langevin Dynamics on the Probability Simplex</em>, available at |
| <a class="reference external" href="https://papers.nips.cc/paper/4883-stochastic-gradient-riemannian-langevin-dynamics-on-the-probability-simplex.pdf">https://papers.nips.cc/paper/4883-stochastic-gradient-riemannian-langevin-dynamics-on-the-probability-simplex.pdf</a>.</p> |
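| <p>The update rule is not reproduced on this page. As a rough sketch of the idea behind the algorithm, an SGLD step takes the usual gradient step at half the learning rate and adds zero-mean Gaussian noise with standard deviation sqrt(lr); the exact preconditioned form used here may differ, see the referenced paper:</p>

```python
import math
import random

# Illustrative SGLD step (assumed form; the exact kernel and any
# Riemannian preconditioning may differ -- see the referenced paper).

def sgld_step(weight, grad, lr=0.01, wd=0.0, rng=random):
    noise = rng.gauss(0.0, math.sqrt(lr))
    return weight - (lr / 2.0) * (grad + wd * weight) + noise

rng = random.Random(0)          # seeded, so the sample path is reproducible
w = 1.0
for _ in range(100):
    w = sgld_step(w, grad=2.0 * w, rng=rng)   # gradient of f(w) = w**2
# w is now a noisy sample around the minimum of f, not a point estimate
```
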
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD.create_state" title="mxnet.optimizer.SGLD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD.step" title="mxnet.optimizer.SGLD.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGLD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgld.html#SGLD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGLD.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/sgld.html#SGLD.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
| individual learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Signum"> |
| <em class="property">class </em><code class="sig-name descname">Signum</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.01</em>, <em class="sig-param">momentum=0.9</em>, <em class="sig-param">wd_lh=0.0</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/signum.html#Signum"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Signum optimizer that takes the sign of gradient or momentum.</p> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span><span class="o">-</span><span class="n">momentum</span><span class="p">)</span><span class="o">*</span><span class="n">rescaled_grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">wd_lh</span><span class="p">)</span> <span class="o">*</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">sign</span><span class="p">(</span><span class="n">state</span><span class="p">)</span> |
| </pre></div> |
| </div> |
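| <p>A plain-Python sketch of this update for a scalar weight (illustrative; the actual optimizer uses fused kernels over NDArrays):</p>

```python
# Plain-Python sketch of the Signum update for a scalar weight.
# Only the sign of the momentum buffer reaches the weight, so each
# step moves the weight by exactly lr (plus the wd_lh decay term).

def clip(x, bound):
    return max(-bound, min(bound, x))

def sign(x):
    return (x > 0) - (x < 0)

def signum_step(weight, state, grad,
                lr=0.01, momentum=0.9, wd=0.0, wd_lh=0.0,
                rescale_grad=1.0, clip_gradient=1.0):
    rescaled_grad = rescale_grad * clip(grad, clip_gradient) + wd * weight
    state = momentum * state + (1 - momentum) * rescaled_grad
    weight = (1 - lr * wd_lh) * weight - lr * sign(state)
    return weight, state

w, s = 1.0, 0.0
w, s = signum_step(w, s, grad=0.3)   # state > 0, so w decreases by lr
```
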
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum.create_state" title="mxnet.optimizer.Signum.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum.fused_step" title="mxnet.optimizer.Signum.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum.step" title="mxnet.optimizer.Signum.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p class="rubric">References</p> |
| <p>Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli & Anima Anandkumar. (2018). |
| signSGD: Compressed Optimisation for Non-Convex Problems. In ICML’18.</p> |
| <p>See: <a class="reference external" href="https://arxiv.org/abs/1802.04434">https://arxiv.org/abs/1802.04434</a></p> |
| <p>For details of the update algorithm see |
| <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.signsgd_update" title="mxnet.ndarray.signsgd_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">signsgd_update</span></code></a> and <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.signum_update" title="mxnet.ndarray.signum_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">signum_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.01</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – The momentum value.</p></li>
| <li><p><strong>wd_lh</strong> (<em>float</em><em>, </em><em>default 0.0</em>) – The amount of decoupled weight decay regularization; see details in the original paper at <a class="reference external" href="https://arxiv.org/abs/1711.05101">https://arxiv.org/abs/1711.05101</a>.</p></li>
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Signum.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/signum.html#Signum.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Signum.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/signum.html#Signum.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states.
| A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
| individual learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Signum.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/signum.html#Signum.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
| individual learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.DCASGD"> |
| <em class="property">class </em><code class="sig-name descname">DCASGD</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">momentum=0.0</em>, <em class="sig-param">lamda=0.04</em>, <em class="sig-param">use_fused_step=False</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/dcasgd.html#DCASGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The DCASGD optimizer.</p> |
| <p>This class implements the optimizer described in <em>Asynchronous Stochastic Gradient Descent |
| with Delay Compensation for Distributed Deep Learning</em>, |
| available at <a class="reference external" href="https://arxiv.org/abs/1609.08326">https://arxiv.org/abs/1609.08326</a>.</p> |
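| <p>The key idea is to compensate a stale gradient with a cheap curvature proxy before applying it. A minimal sketch of the delay-compensated step from the paper (assumed form without momentum; the actual implementation also keeps a momentum buffer):</p>

```python
# Illustrative delay-compensated SGD step (no momentum, assumed form).
# `previous_weight` is the stale copy of the weight at which the
# gradient was computed; lamda scales the compensation term.

def dcasgd_step(weight, previous_weight, grad,
                lr=0.1, wd=0.0, lamda=0.04):
    compensation = lamda * grad * grad * (weight - previous_weight)
    new_weight = weight - lr * (grad + wd * weight + compensation)
    return new_weight, weight   # the current weight becomes the stored copy

w, prev = 1.2, 1.0   # the weight drifted since the gradient was taken
w, prev = dcasgd_step(w, prev, grad=0.5)
```
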
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.0</em>) – The momentum value.</p></li>
| <li><p><strong>lamda</strong> (<em>float</em><em>, </em><em>default 0.04</em>) – Scale of the delay-compensation term.</p></li>
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD.create_state" title="mxnet.optimizer.DCASGD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD.step" title="mxnet.optimizer.DCASGD.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.DCASGD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/dcasgd.html#DCASGD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
to gradients in order to update weights. This function creates that state
for a given weight, to be used later in <cite>update</cite>. It is
called only once for each weight.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.DCASGD.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/dcasgd.html#DCASGD.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
individual learning rates and weight decays. Learning-rate and weight-decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
<li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
<li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li>
<li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.NAG"> |
| <em class="property">class </em><code class="sig-name descname">NAG</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">momentum=0.9</em>, <em class="sig-param">multi_precision=False</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nag.html#NAG"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>Nesterov accelerated gradient.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="p">(</span><span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">grad</span><span class="p">)</span> |
| </pre></div> |
| </div> |
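<p>The update above can be sketched in plain Python for a single scalar weight (illustrative only, not MXNet's fused kernel; clipping is omitted):</p>

```python
# Scalar sketch of the NAG update pseudocode above (illustrative only).
def nag_step(weight, grad, state, lr=0.1, momentum=0.9, wd=0.0,
             rescale_grad=1.0):
    # grad = clip(grad * rescale_grad, clip_gradient) + wd * weight
    # (clipping omitted in this sketch)
    g = grad * rescale_grad + wd * weight
    state = momentum * state + lr * g
    weight = weight - (momentum * state + lr * g)
    return weight, state

w, s = nag_step(1.0, 0.5, 0.0)  # one step from weight=1.0, zero momentum
```

<p>Note the look-ahead: the weight moves by <cite>momentum * state + lr * grad</cite>, not just the new <cite>state</cite>.</p>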
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.create_state" title="mxnet.optimizer.NAG.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.fused_step" title="mxnet.optimizer.NAG.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.step" title="mxnet.optimizer.NAG.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.update_multi_precision" title="mxnet.optimizer.NAG.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Override update_multi_precision.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – The momentum value.</p></li> |
<li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>default False</em>) – Flag to control the internal precision of the optimizer.
If False, the optimizer uses the same precision as the weights (default).
If True, it keeps an internal 32-bit copy of the weights and applies gradients
in 32-bit precision even if the actual model weights use lower precision.
Turning this on can improve convergence and accuracy when training with float16.</p></li>
<li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to use fused kernels for the optimizer update.
When use_fused_step=False, <cite>step</cite> is called;
otherwise, <cite>fused_step</cite> is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nag.html#NAG.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
to gradients in order to update weights. This function creates that state
for a given weight, to be used later in <cite>update</cite>. It is
called only once for each weight.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nag.html#NAG.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.fused_step" title="Permalink to this definition">¶</a></dt> |
<dd><p>Perform a fused optimization step using gradients and states.
A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
individual learning rates and weight decays. Learning-rate and weight-decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
<li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
<li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li>
<li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nag.html#NAG.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
individual learning rates and weight decays. Learning-rate and weight-decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
<li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
<li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li>
<li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nag.html#NAG.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Override update_multi_precision.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.AdaBelief"> |
| <em class="property">class </em><code class="sig-name descname">AdaBelief</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">correct_bias=True</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adabelief.html#AdaBelief"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaBelief" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The AdaBelief optimizer.</p> |
<p>This class implements the optimizer described in <em>AdaBelief Optimizer: Adapting Stepsizes by
the Belief in Observed Gradients</em>,
available at <a class="reference external" href="https://arxiv.org/pdf/2010.07468.pdf">https://arxiv.org/pdf/2010.07468.pdf</a>.</p>
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaBelief.create_state" title="mxnet.optimizer.AdaBelief.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
<td><p>Creates auxiliary state for a given weight.</p></td>
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaBelief.fused_step" title="mxnet.optimizer.AdaBelief.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaBelief.step" title="mxnet.optimizer.AdaBelief.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>Updates are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">w</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">s</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">s</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">((</span><span class="n">grad</span> <span class="o">-</span> <span class="n">m</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="n">w</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">m</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">))</span> |
| </pre></div> |
| </div> |
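<p>For a single scalar weight, one bias-corrected step can be sketched as plain Python directly from the pseudocode above (illustrative only; clipping, weight decay, and the fused kernel are omitted):</p>

```python
import math

# Scalar sketch of one AdaBelief step with bias correction, mirroring
# the pseudocode above (illustrative only; not MXNet's fused kernel).
def adabelief_step(w, grad, m, s, t, learning_rate=0.001,
                   beta1=0.9, beta2=0.999, epsilon=1e-06):
    m = beta1 * m + (1 - beta1) * grad
    # Second moment tracks the *belief* deviation (grad - m), not grad**2.
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + epsilon
    lr = learning_rate * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    w = w - lr * (m / (math.sqrt(s) + epsilon))
    return w, m, s

# One step at t=1 starting from zero moments.
w, m, s = adabelief_step(1.0, 0.1, 0.0, 0.0, t=1)
```

<p>Compared with Adam, the only structural change is that <cite>s</cite> accumulates <cite>(grad - m)**2</cite> instead of <cite>grad**2</cite>.</p>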
<p>With bias correction turned off (<code class="docutils literal notranslate"><span class="pre">correct_bias=False</span></code>), the updates become:</p>
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">w</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">s</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">s</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">((</span><span class="n">grad</span> <span class="o">-</span> <span class="n">m</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="n">w</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">m</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">))</span> |
| </pre></div> |
| </div> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-6</em>) – Small value to avoid division by 0.</p></li> |
<li><p><strong>correct_bias</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to apply Adam-style bias correction. Can be set to False to skip
bias correction (as in the BERT TF repository).</p></li>
<li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to use fused kernels for the optimizer update.
When use_fused_step=False, <cite>step</cite> is called;
otherwise, <cite>fused_step</cite> is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaBelief.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adabelief.html#AdaBelief.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaBelief.create_state" title="Permalink to this definition">¶</a></dt> |
<dd><p>Creates auxiliary state for a given weight.</p>
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaBelief.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adabelief.html#AdaBelief.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaBelief.fused_step" title="Permalink to this definition">¶</a></dt> |
<dd><p>Perform a fused optimization step using gradients and states.
A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
individual learning rates and weight decays. Learning-rate and weight-decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
<li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
<li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li>
<li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaBelief.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adabelief.html#AdaBelief.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaBelief.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters, used to look up their
individual learning rates and weight decays. Learning-rate and weight-decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
<li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li>
<li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li>
<li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.AdaGrad"> |
| <em class="property">class </em><code class="sig-name descname">AdaGrad</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.01</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adagrad.html#AdaGrad"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>AdaGrad optimizer.</p> |
| <p>This class implements the AdaGrad optimizer described in <em>Adaptive Subgradient |
| Methods for Online Learning and Stochastic Optimization</em>, and available at |
| <a class="reference external" href="http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf">http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf</a>.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">history</span> <span class="o">+=</span> <span class="n">square</span><span class="p">(</span><span class="n">grad</span><span class="p">)</span> |
| <span class="n">weight</span> <span class="o">-=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">grad</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">history</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
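<p>A scalar sketch of this update, following the pseudocode above (illustrative only; clipping and the fused/sparse kernels are omitted):</p>

```python
import math

# Scalar sketch of the AdaGrad update above (illustrative only; the
# fused kernel and sparse update path differ in implementation detail).
def adagrad_step(weight, grad, history, learning_rate=0.01,
                 epsilon=1e-06, wd=0.0, rescale_grad=1.0):
    g = grad * rescale_grad + wd * weight  # clipping omitted in this sketch
    history += g ** 2                      # per-weight accumulated squared grad
    weight -= learning_rate * g / (math.sqrt(history) + epsilon)
    return weight, history

w, h = adagrad_step(1.0, 2.0, 0.0)  # one step with zero history
```

<p>Because <cite>history</cite> only grows, the effective per-weight step size decays monotonically over training.</p>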
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad.create_state" title="mxnet.optimizer.AdaGrad.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad.fused_step" title="mxnet.optimizer.AdaGrad.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad.step" title="mxnet.optimizer.AdaGrad.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <div class="admonition seealso"> |
| <p class="admonition-title">See also</p> |
| <p><a class="reference internal" href="../legacy/ndarray/sparse/index.html#mxnet.ndarray.sparse.adagrad_update" title="mxnet.ndarray.sparse.adagrad_update"><code class="xref py py-meth docutils literal notranslate"><span class="pre">mxnet.ndarray.sparse.adagrad_update()</span></code></a></p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.01</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-6</em>) – Small value to avoid division by 0.</p></li> |
<li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to use fused kernels for the optimizer update.
When use_fused_step=False or the gradient is not sparse, <cite>step</cite> is called;
otherwise, <cite>fused_step</cite> is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
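| <p>As an illustration only, the AdaGrad update can be sketched in NumPy. This is an assumed form of the classic algorithm (accumulate squared gradients, scale each step by their square root), not MXNet's fused kernel; in particular, the exact placement of <code class="docutils literal notranslate"><span class="pre">epsilon</span></code> may differ from the actual implementation.</p>

```python
import numpy as np

def adagrad_step(weight, grad, history, learning_rate=0.01, epsilon=1e-6,
                 wd=0.0, rescale_grad=1.0):
    """One AdaGrad update (assumed classic form, for illustration only)."""
    grad = grad * rescale_grad + wd * weight   # rescale and apply weight decay
    history = history + grad * grad            # accumulate squared gradients
    weight = weight - learning_rate * grad / (np.sqrt(history) + epsilon)
    return weight, history
```

| <p>Each call consumes and returns the per-weight accumulator, playing the role of the state produced by <cite>create_state()</cite> and consumed by <cite>step()</cite> below.</p>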
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaGrad.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adagrad.html#AdaGrad.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaGrad.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adagrad.html#AdaGrad.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states.
| A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaGrad.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adagrad.html#AdaGrad.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.AdaDelta"> |
| <em class="property">class </em><code class="sig-name descname">AdaDelta</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=1.0</em>, <em class="sig-param">rho=0.9</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">use_fused_step=False</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adadelta.html#AdaDelta"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The AdaDelta optimizer.</p> |
| <p>This class implements AdaDelta, an optimizer described in <em>ADADELTA: An adaptive |
| learning rate method</em>, available at <a class="reference external" href="https://arxiv.org/abs/1212.5701">https://arxiv.org/abs/1212.5701</a>.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">acc_grad</span> <span class="o">=</span> <span class="n">rho</span> <span class="o">*</span> <span class="n">acc_grad</span> <span class="o">+</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">-</span> <span class="n">rho</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">delta</span> <span class="o">=</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">acc_delta</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> <span class="o">/</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">acc_grad</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">acc_delta</span> <span class="o">=</span> <span class="n">rho</span> <span class="o">*</span> <span class="n">acc_delta</span> <span class="o">+</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">-</span> <span class="n">rho</span><span class="p">)</span> <span class="o">*</span> <span class="n">delta</span> <span class="o">*</span> <span class="n">delta</span> |
| <span class="n">weight</span> <span class="o">-=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">delta</span> |
| </pre></div> |
| </div> |
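| <p>The update rule above can be written as a minimal NumPy sketch. This is for illustration only; MXNet's actual step operates on NDArrays and applies per-index learning-rate and weight-decay multipliers.</p>

```python
import numpy as np

def adadelta_step(weight, grad, acc_grad, acc_delta,
                  learning_rate=1.0, rho=0.9, epsilon=1e-6,
                  wd=0.0, rescale_grad=1.0, clip_gradient=None):
    """One AdaDelta update, mirroring the pseudocode above."""
    grad = grad * rescale_grad
    if clip_gradient is not None:
        grad = np.clip(grad, -clip_gradient, clip_gradient)
    grad = grad + wd * weight
    acc_grad = rho * acc_grad + (1.0 - rho) * grad * grad          # running E[g^2]
    delta = np.sqrt(acc_delta + epsilon) / np.sqrt(acc_grad + epsilon) * grad
    acc_delta = rho * acc_delta + (1.0 - rho) * delta * delta      # running E[dx^2]
    weight = weight - learning_rate * delta
    return weight, acc_grad, acc_delta
```

| <p>Note that both accumulators start at zero, so the first step is scaled almost entirely by <code class="docutils literal notranslate"><span class="pre">epsilon</span></code>.</p>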
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta.create_state" title="mxnet.optimizer.AdaDelta.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta.step" title="mxnet.optimizer.AdaDelta.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 1.0</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>rho</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Decay rate for both squared gradients and delta.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-6</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaDelta.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adadelta.html#AdaDelta.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaDelta.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adadelta.html#AdaDelta.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Adam"> |
| <em class="property">class </em><code class="sig-name descname">Adam</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">lazy_update=False</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adam.html#Adam"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Adam optimizer.</p> |
| <p>This class implements the optimizer described in <em>Adam: A Method for |
| Stochastic Optimization</em>, available at <a class="reference external" href="http://arxiv.org/abs/1412.6980">http://arxiv.org/abs/1412.6980</a>.</p> |
| <p>If the storage type of grad is <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code> and <code class="docutils literal notranslate"><span class="pre">lazy_update</span></code> is True, <strong>lazy updates</strong> at step t are applied by:</p>
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| <span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> |
| <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam.create_state" title="mxnet.optimizer.Adam.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam.fused_step" title="mxnet.optimizer.Adam.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam.step" title="mxnet.optimizer.Adam.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The lazy update only updates the mean and var for the weights whose row_sparse
| gradient indices appear in the current batch, rather than updating them for all indices.
| Compared with the original update, it can provide large improvements in model training
| throughput for some applications. However, it provides slightly different semantics than
| the original update and may lead to different empirical results.</p>
| <p>Otherwise, <strong>standard updates</strong> at step t are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span> |
| <span class="n">v</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="n">w</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">m</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
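| <p>The standard update above translates directly into a NumPy sketch (illustration only, not MXNet's fused kernel; the bias correction is folded into <code class="docutils literal notranslate"><span class="pre">lr</span></code> exactly as in the pseudocode):</p>

```python
import numpy as np

def adam_step(weight, grad, m, v, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8,
              wd=0.0, rescale_grad=1.0, clip_gradient=None):
    """One standard (dense) Adam update, mirroring the pseudocode above."""
    grad = grad * rescale_grad
    if clip_gradient is not None:
        grad = np.clip(grad, -clip_gradient, clip_gradient)
    grad = grad + wd * weight
    m = beta1 * m + (1.0 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2      # second-moment estimate
    lr = learning_rate * np.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    weight = weight - lr * m / (np.sqrt(v) + epsilon)
    return weight, m, v
```

| <p>With fresh state and a unit gradient, the first step size is close to <code class="docutils literal notranslate"><span class="pre">learning_rate</span></code>, a well-known property of the bias-corrected update.</p>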
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.adam_update" title="mxnet.ndarray.adam_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">adam_update</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-8</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>default False</em>) – If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li>
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused kernels for the optimizer.
| When use_fused_step=False, step is called;
| otherwise, fused_step is called.</p></li>
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adam.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adam.html#Adam.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adam.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adam.html#Adam.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states.
| A fused kernel is used for the update.</p>
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adam.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adam.html#Adam.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li>
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li>
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Adamax"> |
| <em class="property">class </em><code class="sig-name descname">Adamax</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.002</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">use_fused_step=False</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adamax.html#Adamax"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The AdaMax optimizer.</p> |
| <p>It is a variant of Adam based on the infinity norm, described in Section 7 of
| <a class="reference external" href="http://arxiv.org/abs/1412.6980">http://arxiv.org/abs/1412.6980</a>.</p>
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m_t</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">u</span> <span class="o">=</span> <span class="n">maximum</span><span class="p">(</span><span class="n">beta2</span> <span class="o">*</span> <span class="n">u</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">grad</span><span class="p">))</span> |
| <span class="n">weight</span> <span class="o">-=</span> <span class="n">lr</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">*</span> <span class="n">m</span> <span class="o">/</span> <span class="p">(</span><span class="n">u</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
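| <p>The update above can be sketched in NumPy as follows (illustration only; MXNet's actual step operates on NDArrays with per-index learning-rate and weight-decay multipliers):</p>

```python
import numpy as np

def adamax_step(weight, grad, m, u, t,
                learning_rate=0.002, beta1=0.9, beta2=0.999, epsilon=1e-8,
                wd=0.0, rescale_grad=1.0, clip_gradient=None):
    """One AdaMax update, mirroring the pseudocode above."""
    grad = grad * rescale_grad
    if clip_gradient is not None:
        grad = np.clip(grad, -clip_gradient, clip_gradient)
    grad = grad + wd * weight
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    u = np.maximum(beta2 * u, np.abs(grad))     # exponentially weighted infinity norm
    weight = weight - learning_rate / (1.0 - beta1 ** t) * m / (u + epsilon)
    return weight, m, u
```

| <p>Unlike Adam, the denominator <code class="docutils literal notranslate"><span class="pre">u</span></code> is a running max of gradient magnitudes rather than a root of the second moment, so only the first-moment term needs bias correction.</p>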
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax.create_state" title="mxnet.optimizer.Adamax.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax.step" title="mxnet.optimizer.Adamax.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.002</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether or not to use fused kernels for optimizer. |
| When use_fused_step=False, step is called, |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adamax.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adamax.html#Adamax.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional states, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adamax.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/adamax.html#Adamax.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters, used to look up their individual |
| learning rates and weight decays. Learning rate and weight decay multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Nadam"> |
| <em class="property">class </em><code class="sig-name descname">Nadam</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">schedule_decay=0.004</em>, <em class="sig-param">use_fused_step=False</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nadam.html#Nadam"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Nesterov Adam optimizer.</p> |
| <p>Much like Adam is essentially RMSprop with momentum, |
| Nadam is Adam with Nesterov momentum, described |
| at <a class="reference external" href="http://cs229.stanford.edu/proj2015/054_report.pdf">http://cs229.stanford.edu/proj2015/054_report.pdf</a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-8</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>schedule_decay</strong> (<em>float</em><em>, </em><em>default 0.004</em>) – Exponential decay rate for the momentum schedule.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether or not to use fused kernels for optimizer. |
| When use_fused_step=False, step is called, |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam.create_state" title="mxnet.optimizer.Nadam.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam.step" title="mxnet.optimizer.Nadam.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Nadam.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nadam.html#Nadam.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional states, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Nadam.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/nadam.html#Nadam.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters, used to look up their individual |
| learning rates and weight decays. Learning rate and weight decay multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Ftrl"> |
| <em class="property">class </em><code class="sig-name descname">Ftrl</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">lamda1=0.01</em>, <em class="sig-param">beta=1.0</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftrl.html#Ftrl"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Ftrl optimizer.</p> |
| <p>Referenced from <em>Ad Click Prediction: a View from the Trenches</em>, available at |
| <a class="reference external" href="http://dl.acm.org/citation.cfm?id=2488200">http://dl.acm.org/citation.cfm?id=2488200</a>.</p> |
| <dl> |
| <dt>eta :</dt><dd><div class="math notranslate nohighlight"> |
| \[\eta_{t,i} = \frac{learningrate}{\beta+\sqrt{\sum_{s=1}^tg_{s,i}^2}}\]</div> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl.create_state" title="mxnet.optimizer.Ftrl.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl.fused_step" title="mxnet.optimizer.Ftrl.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl.step" title="mxnet.optimizer.Ftrl.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">z</span> <span class="o">+=</span> <span class="n">rescaled_grad</span> <span class="o">-</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="o">*</span> <span class="n">weight</span> <span class="o">/</span> <span class="n">learning_rate</span> |
| <span class="n">n</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="p">(</span><span class="n">sign</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">*</span> <span class="n">lamda1</span> <span class="o">-</span> <span class="n">z</span><span class="p">)</span> <span class="o">/</span> <span class="p">((</span><span class="n">beta</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="o">/</span> <span class="n">learning_rate</span> <span class="o">+</span> <span class="n">wd</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">></span> <span class="n">lamda1</span><span class="p">)</span> |
| </pre></div> |
| </div> |
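| <p>The dense rule above can be sketched in NumPy as a hypothetical helper; <code>z</code> and <code>n</code> are the two FTRL states (zeros initially):</p> |

```python
import numpy as np

def ftrl_step(weight, grad, z, n, lr=0.1, lamda1=0.01, beta=1.0, wd=0.0,
              rescale_grad=1.0, clip_gradient=None):
    """Dense FTRL update mirroring the pseudocode above (NumPy sketch)."""
    g = grad * rescale_grad
    if clip_gradient is not None:
        g = np.clip(g, -clip_gradient, clip_gradient)
    z = z + g - (np.sqrt(n + g ** 2) - np.sqrt(n)) * weight / lr
    n = n + g ** 2
    # Closed-form solution with L1 shrinkage: a weight stays exactly 0
    # until |z| exceeds lamda1, which is what induces sparsity.
    weight = ((np.sign(z) * lamda1 - z)
              / ((beta + np.sqrt(n)) / lr + wd)
              * (np.abs(z) > lamda1))
    return weight, z, n
```

| <p>For example, starting from a zero weight with a gradient of 1.0 and the defaults above, <code>z</code> becomes 1.0, <code>n</code> becomes 1.0, and the new weight is <code>(0.01 - 1.0) / 20 = -0.0495</code>.</p> |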
| <p>If the storage types of weight, state and grad are all <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>, <strong>sparse updates</strong> are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]))</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">/</span> <span class="n">learning_rate</span> |
| <span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span> |
| <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">sign</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">*</span> <span class="n">lamda1</span> <span class="o">-</span> <span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">/</span> <span class="p">((</span><span class="n">beta</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]))</span> <span class="o">/</span> <span class="n">learning_rate</span> <span class="o">+</span> <span class="n">wd</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">></span> <span class="n">lamda1</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p>The sparse update only updates the z and n states for the weights whose row_sparse |
| gradient indices appear in the current batch, rather than updating them for all |
| indices. Compared with the original update, it can provide large |
| improvements in model training throughput for some applications. However, it |
| provides slightly different semantics than the original update, and |
| may lead to different empirical results.</p> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.ftrl_update" title="mxnet.ndarray.ftrl_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">ftrl_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>lamda1</strong> (<em>float</em><em>, </em><em>default 0.01</em>) – L1 regularization coefficient.</p></li> |
| <li><p><strong>beta</strong> (<em>float</em><em>, </em><em>default 1.0</em>) – Per-coordinate learning rate correlation parameter.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused kernels for optimizer. |
| When use_fused_step=False, step is called, |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Ftrl.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftrl.html#Ftrl.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional states, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Ftrl.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftrl.html#Ftrl.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| A fused kernel is used for the update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters, used to look up their individual |
| learning rates and weight decays. Learning rate and weight decay multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Ftrl.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftrl.html#Ftrl.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters, used to look up their individual |
| learning rates and weight decays. Learning rate and weight decay multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>List of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.FTML"> |
| <em class="property">class </em><code class="sig-name descname">FTML</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.0025</em>, <em class="sig-param">beta1=0.6</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftml.html#FTML"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The FTML optimizer.</p> |
| <p>This class implements the optimizer described in |
| <em>FTML - Follow the Moving Leader in Deep Learning</em>, |
| available at <a class="reference external" href="http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf">http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf</a>.</p> |
| <p>Denote time step by t. The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">v</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="n">square</span><span class="p">(</span><span class="n">rescaled_grad</span><span class="p">)</span> |
| <span class="n">d_t</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">power</span><span class="p">(</span><span class="n">beta1</span><span class="p">,</span> <span class="n">t</span><span class="p">))</span> <span class="o">/</span> <span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">square_root</span><span class="p">(</span><span class="n">v</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">power</span><span class="p">(</span><span class="n">beta2</span><span class="p">,</span> <span class="n">t</span><span class="p">)))</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| <span class="n">z</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">z</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span> <span class="o">-</span> <span class="p">(</span><span class="n">d_t</span> <span class="o">-</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">d_</span><span class="p">(</span><span class="n">t</span><span class="o">-</span><span class="mi">1</span><span class="p">))</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span> <span class="n">z</span> <span class="o">/</span> <span class="n">d_t</span> |
| </pre></div> |
| </div> |
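| <p>A NumPy sketch of one FTML step (hypothetical helper, not MXNet's fused kernel; <code>d_prev</code> carries d_(t-1) and is zeros at the first step):</p> |

```python
import numpy as np

def ftml_step(weight, grad, v, z, d_prev, t, lr=0.0025, beta1=0.6,
              beta2=0.999, epsilon=1e-8, wd=0.0, rescale_grad=1.0,
              clip_gradient=None):
    """One FTML update mirroring the pseudocode above (NumPy sketch)."""
    g = grad * rescale_grad
    if clip_gradient is not None:
        g = np.clip(g, -clip_gradient, clip_gradient)
    g = g + wd * weight
    v = beta2 * v + (1 - beta2) * g ** 2
    # d_t combines the bias-corrected second moment with the step count
    d = (1 - beta1 ** t) / lr * (np.sqrt(v / (1 - beta2 ** t)) + epsilon)
    z = beta1 * z + (1 - beta1) * g - (d - beta1 * d_prev) * weight
    weight = -z / d
    return weight, v, z, d
```

| <p>Note that, unlike Adam-style rules, the new weight is recomputed from the accumulated <code>z</code> and <code>d</code> rather than obtained by subtracting a step from the old weight.</p> |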
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML.create_state" title="mxnet.optimizer.FTML.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML.fused_step" title="mxnet.optimizer.FTML.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML.step" title="mxnet.optimizer.FTML.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.ftml_update" title="mxnet.ndarray.ftml_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">ftml_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.0025</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.6</em>) – 0 < beta1 < 1. Generally close to 0.5.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – 0 < beta2 < 1. Generally close to 1.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-8</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused optimizer kernels. |
| When use_fused_step=False, step is called; |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.FTML.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftml.html#FTML.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition to |
| the gradient in order to update weights. This function creates that state |
| for a given weight; it will be used in <cite>update</cite> and is called |
| only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.FTML.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftml.html#FTML.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| Fused kernel is used for update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.FTML.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/ftml.html#FTML.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LARS"> |
| <em class="property">class </em><code class="sig-name descname">LARS</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">momentum=0.0</em>, <em class="sig-param">eta=0.001</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">lazy_update=False</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">aggregate_num=1</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lars.html#LARS"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The LARS optimizer from ‘Large Batch Training of Convolutional Networks’ (<a class="reference external" href="https://arxiv.org/abs/1708.03888">https://arxiv.org/abs/1708.03888</a>).</p> |
| <p>It behaves mostly like SGD with momentum and weight decay, but adaptively scales the learning rate for each layer:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">w_norm</span> <span class="o">=</span> <span class="n">L2norm</span><span class="p">(</span><span class="n">weights</span><span class="p">)</span> |
| <span class="n">g_norm</span> <span class="o">=</span> <span class="n">L2norm</span><span class="p">(</span><span class="n">gradients</span><span class="p">)</span> |
| <span class="k">if</span> <span class="n">w_norm</span> <span class="o">></span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">g_norm</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span> |
| <span class="n">lr_layer</span> <span class="o">=</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">w_norm</span> <span class="o">/</span> <span class="p">(</span><span class="n">g_norm</span> <span class="o">+</span> <span class="n">weight_decay</span> <span class="o">*</span> <span class="n">w_norm</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| <span class="k">else</span><span class="p">:</span> |
| <span class="n">lr_layer</span> <span class="o">=</span> <span class="n">lr</span> |
| </pre></div> |
| </div> |
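| <p>The layer-wise learning rate above is easy to reproduce in NumPy. This is a sketch of the pseudocode only; the helper name <cite>lars_layer_lr</cite> is hypothetical, not part of the MXNet API, and the actual optimizer additionally applies the <cite>eta</cite> coefficient and momentum when updating weights.</p> |

```python
import numpy as np

def lars_layer_lr(weight, grad, lr=0.1, weight_decay=0.0, epsilon=1e-8):
    """Layer-wise learning rate from the LARS pseudocode (illustrative)."""
    w_norm = np.linalg.norm(weight)
    g_norm = np.linalg.norm(grad)
    if w_norm > 0 and g_norm > 0:
        # "trust ratio": scale lr by the weight-to-gradient norm ratio
        return lr * w_norm / (g_norm + weight_decay * w_norm + epsilon)
    return lr
```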
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.create_state" title="mxnet.optimizer.LARS.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.fused_step" title="mxnet.optimizer.LARS.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.step" title="mxnet.optimizer.LARS.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.update_multi_precision" title="mxnet.optimizer.LARS.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Override update_multi_precision.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.1</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.</em>) – The momentum value.</p></li> |
| <li><p><strong>eta</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – LARS coefficient used to scale the learning rate.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-8</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>default False</em>) – If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li> |
| <li><p><strong>aggregate_num</strong> (<em>int</em><em>, </em><em>default 1</em>) – Number of weights to be aggregated in a list. |
| They are passed to the optimizer for a single optimization step.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused optimizer kernels. |
| When use_fused_step=False, step is called; |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lars.html#LARS.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition to |
| the gradient in order to update weights. This function creates that state |
| for a given weight; it will be used in <cite>update</cite> and is called |
| only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lars.html#LARS.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| Fused kernel is used for update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lars.html#LARS.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lars.html#LARS.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Override update_multi_precision.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LAMB"> |
| <em class="property">class </em><code class="sig-name descname">LAMB</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">lower_bound=None</em>, <em class="sig-param">upper_bound=None</em>, <em class="sig-param">bias_correction=True</em>, <em class="sig-param">aggregate_num=4</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lamb.html#LAMB"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The LAMB optimizer.</p> |
| <p>From ‘Large Batch Optimization for Deep Learning: Training BERT in 76 minutes’ |
| (<a class="reference external" href="https://arxiv.org/pdf/1904.00962.pdf">https://arxiv.org/pdf/1904.00962.pdf</a>).</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-6</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>lower_bound</strong> (<em>float</em><em>, </em><em>default None</em>) – Lower limit on the norm of the weight.</p></li> |
| <li><p><strong>upper_bound</strong> (<em>float</em><em>, </em><em>default None</em>) – Upper limit on the norm of the weight.</p></li> |
| <li><p><strong>bias_correction</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to apply bias correction.</p></li> |
| <li><p><strong>aggregate_num</strong> (<em>int</em><em>, </em><em>default 4</em>) – Number of weights to be aggregated in a list. |
| They are passed to the optimizer for a single optimization step. |
| By default, all the weights are aggregated.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether or not to use fused optimizer kernels. |
| When use_fused_step=False, step is called; |
| otherwise, fused_step is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
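| <p>As a rough sketch, one bias-corrected LAMB step for a single layer can be written in NumPy as follows. This follows the paper’s update rule rather than MXNet’s fused kernel; the norm bounds (<cite>lower_bound</cite>, <cite>upper_bound</cite>) are omitted, and the helper name <cite>lamb_step</cite> is ours.</p> |

```python
import numpy as np

def lamb_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              epsilon=1e-6, wd=0.0):
    """One LAMB update for a single layer (bias-corrected; illustrative)."""
    m = beta1 * m + (1 - beta1) * g               # first-moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2          # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    r = m_hat / (np.sqrt(v_hat) + epsilon) + wd * w
    w_norm, r_norm = np.linalg.norm(w), np.linalg.norm(r)
    trust = w_norm / r_norm if w_norm > 0 and r_norm > 0 else 1.0
    w = w - lr * trust * r                        # layer-wise trust-ratio scaling
    return w, m, v
```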
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.create_state" title="mxnet.optimizer.LAMB.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.fused_step" title="mxnet.optimizer.LAMB.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.step" title="mxnet.optimizer.LAMB.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.update_multi_precision" title="mxnet.optimizer.LAMB.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Override update_multi_precision.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lamb.html#LAMB.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition to |
| the gradient in order to update weights. This function creates that state |
| for a given weight; it will be used in <cite>update</cite> and is called |
| only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lamb.html#LAMB.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| Fused kernel is used for update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lamb.html#LAMB.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices of the parameters into the individual learning rates |
| and weight decays. Learning rates and weight decay may be set via <cite>set_lr_mult()</cite> |
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to the parameters.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lamb.html#LAMB.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Override update_multi_precision.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.RMSProp"> |
| <em class="property">class </em><code class="sig-name descname">RMSProp</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">rho=0.9</em>, <em class="sig-param">momentum=0.9</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">centered=False</em>, <em class="sig-param">clip_weights=None</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/rmsprop.html#RMSProp"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The RMSProp optimizer.</p> |
| <p>Two versions of RMSProp are implemented:</p> |
| <p>If <code class="docutils literal notranslate"><span class="pre">centered=False</span></code>, we follow |
| <a class="reference external" href="http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf">http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf</a> by |
| Tieleman & Hinton, 2012. |
| For details of the update algorithm see <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.rmsprop_update" title="mxnet.ndarray.rmsprop_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">rmsprop_update</span></code></a>.</p> |
| <p>If <code class="docutils literal notranslate"><span class="pre">centered=True</span></code>, we follow <a class="reference external" href="http://arxiv.org/pdf/1308.0850v5.pdf">http://arxiv.org/pdf/1308.0850v5.pdf</a> (38)-(45) |
| by Alex Graves, 2013. |
| For details of the update algorithm see <a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.rmspropalex_update" title="mxnet.ndarray.rmspropalex_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">rmspropalex_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>rho</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – A decay factor of moving average over past squared gradient.</p></li> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Heavy ball momentum factor. Only used if <cite>centered=True</cite>.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-8</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>centered</strong> (<em>bool</em><em>, </em><em>default False</em>) – <p>Flag to control which version of RMSProp to use:</p>
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>True:  use Graves's version of RMSProp,
| False: use Tieleman &amp; Hinton's version of RMSProp.
| </pre></div>
| </div>
| </p></li> |
| <li><p><strong>clip_weights</strong> (<em>float</em><em>, </em><em>optional</em>) – Clips weights into range <code class="docutils literal notranslate"><span class="pre">[-clip_weights,</span> <span class="pre">clip_weights]</span></code>.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to use fused kernels for the optimizer update.
| When <code class="docutils literal notranslate"><span class="pre">use_fused_step=False</span></code>, <cite>step</cite> is called;
| otherwise, <cite>fused_step</cite> is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
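<p>As a concrete illustration, the Tieleman &amp; Hinton variant (<code class="docutils literal notranslate"><span class="pre">centered=False</span></code>) can be sketched in plain NumPy. This is a simplified sketch of the update performed by <cite>rmsprop_update</cite>, not the fused kernel itself; the helper name <code class="docutils literal notranslate"><span class="pre">rmsprop_step</span></code> is hypothetical.</p>

```python
import numpy as np

def rmsprop_step(weight, grad, state, lr=0.001, rho=0.9, epsilon=1e-8):
    # state holds the exponential moving average of squared gradients.
    state *= rho
    state += (1 - rho) * grad * grad
    # Scale the step by the root of the averaged squared gradient.
    weight -= lr * grad / (np.sqrt(state) + epsilon)
    return weight, state

w = np.array([1.0, -2.0])
g = np.array([0.1, -0.1])
n = np.zeros_like(w)
w, n = rmsprop_step(w, g, n)
```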
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp.create_state" title="mxnet.optimizer.RMSProp.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp.fused_step" title="mxnet.optimizer.RMSProp.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp.step" title="mxnet.optimizer.RMSProp.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.RMSProp.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/rmsprop.html#RMSProp.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates that state
| for a given weight; it will be used in <cite>update</cite>. This function is
| called only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
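<p>For RMSProp specifically, the state buffers mirror the weight's shape. A hypothetical NumPy sketch of what <cite>create_state</cite> allocates (the real method returns NDArrays on the weight's context):</p>

```python
import numpy as np

def create_state(index, weight, centered=False):
    # Plain RMSProp tracks a running mean of squared gradients ("n").
    n = np.zeros_like(weight)
    if not centered:
        return n
    # The centered variant also tracks the gradient mean ("g") and a
    # momentum buffer ("delta"), all shaped like the weight.
    g = np.zeros_like(weight)
    delta = np.zeros_like(weight)
    return n, g, delta

state = create_state(0, np.ones((2, 3)), centered=True)
```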
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.RMSProp.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/rmsprop.html#RMSProp.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| Fused kernel is used for update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters. They index into the
| per-parameter learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.RMSProp.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/rmsprop.html#RMSProp.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters. They index into the
| per-parameter learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
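<p>The list-based calling convention shared by <cite>step</cite> and <cite>fused_step</cite> can be illustrated with a non-fused NumPy loop. This is only a sketch: the real <cite>step</cite> also applies the per-parameter learning-rate and weight-decay multipliers looked up by index, and <cite>fused_step</cite> dispatches to fused kernels instead.</p>

```python
import numpy as np

def step(indices, weights, grads, states, lr=0.001, rho=0.9, epsilon=1e-8):
    # One entry per parameter; weights and states are updated in place.
    # (The real implementation uses idx to look up lr/wd multipliers.)
    for idx, w, g, n in zip(indices, weights, grads, states):
        n *= rho
        n += (1 - rho) * g * g
        w -= lr * g / (np.sqrt(n) + epsilon)

weights = [np.ones(3), np.full(2, 2.0)]
grads = [np.full(3, 0.5), np.full(2, -0.5)]
states = [np.zeros_like(w) for w in weights]
step([0, 1], weights, grads, states)
```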
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LANS"> |
| <em class="property">class </em><code class="sig-name descname">LANS</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">lower_bound=None</em>, <em class="sig-param">upper_bound=None</em>, <em class="sig-param">aggregate_num=4</em>, <em class="sig-param">use_fused_step=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lans.html#LANS"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LANS" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>LANS Optimizer.</p> |
| <p>Referenced from ‘Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes’ |
| (<a class="reference external" href="http://arxiv.org/abs/2006.13484">http://arxiv.org/abs/2006.13484</a>)</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>default 0.001</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.001 by default.</p></li> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>default 0.999</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-6</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>lower_bound</strong> (<em>float</em><em>, </em><em>default None</em>) – Lower limit on the norm of the weight.</p></li> |
| <li><p><strong>upper_bound</strong> (<em>float</em><em>, </em><em>default None</em>) – Upper limit on the norm of the weight.</p></li> |
| <li><p><strong>aggregate_num</strong> (<em>int</em><em>, </em><em>default 4</em>) – Number of weights to be aggregated in a list.
| They are passed to the optimizer for a single optimization step.
| By default, all the weights are aggregated.</p></li> |
| <li><p><strong>use_fused_step</strong> (<em>bool</em><em>, </em><em>default True</em>) – Whether to use fused kernels for the optimizer update.
| When <code class="docutils literal notranslate"><span class="pre">use_fused_step=False</span></code>, <cite>step</cite> is called;
| otherwise, <cite>fused_step</cite> is called.</p></li> |
| </ul> |
| </dd> |
| </dl> |
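<p>The core idea behind LANS, per the referenced paper, combines gradient normalization with Adam-style moment estimates and a layer-wise trust ratio. A much-simplified NumPy sketch follows; it omits the paper's two-part Nesterov-style update and block-wise handling, and <code class="docutils literal notranslate"><span class="pre">lans_trust_update</span></code> is a hypothetical helper, not the library API.</p>

```python
import numpy as np

def lans_trust_update(weight, grad, m, v, t,
                      lr=0.001, beta1=0.9, beta2=0.999, epsilon=1e-6):
    # Normalize the raw gradient before updating the moments.
    g = grad / (np.linalg.norm(grad) + epsilon)
    m = beta1 * m + (1 - beta1) * g          # first moment
    v = beta2 * v + (1 - beta2) * g * g      # second moment
    m_hat = m / (1 - beta1 ** t)             # bias correction at step t
    v_hat = v / (1 - beta2 ** t)
    update = m_hat / (np.sqrt(v_hat) + epsilon)
    # Layer-wise trust ratio scales the step to the weight's norm.
    ratio = np.linalg.norm(weight) / (np.linalg.norm(update) + epsilon)
    return weight - lr * ratio * update, m, v

w = np.ones(4)
m = np.zeros(4)
v = np.zeros(4)
w, m, v = lans_trust_update(w, np.full(4, 0.5), m, v, t=1)
```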
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LANS.create_state" title="mxnet.optimizer.LANS.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LANS.fused_step" title="mxnet.optimizer.LANS.fused_step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">fused_step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform a fused optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LANS.step" title="mxnet.optimizer.LANS.step"><code class="xref py py-obj docutils literal notranslate"><span class="pre">step</span></code></a>(indices, weights, grads, states)</p></td> |
| <td><p>Perform an optimization step using gradients and states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LANS.update_multi_precision" title="mxnet.optimizer.LANS.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(indices, weights, …)</p></td> |
| <td><p>Override update_multi_precision.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LANS.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lans.html#LANS.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LANS.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates that state
| for a given weight; it will be used in <cite>update</cite>. This function is
| called only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../legacy/ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LANS.fused_step"> |
| <code class="sig-name descname">fused_step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lans.html#LANS.fused_step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LANS.fused_step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform a fused optimization step using gradients and states. |
| Fused kernel is used for update.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters. They index into the
| per-parameter learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LANS.step"> |
| <code class="sig-name descname">step</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lans.html#LANS.step"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LANS.step" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Perform an optimization step using gradients and states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>indices</strong> (<em>list of int</em>) – List of unique indices identifying the parameters. They index into the
| per-parameter learning-rate and weight-decay multipliers, which may be set via <cite>set_lr_mult()</cite>
| and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weights</strong> (<em>list of NDArray</em>) – List of parameters to be updated.</p></li> |
| <li><p><strong>grads</strong> (<em>list of NDArray</em>) – List of gradients of the objective with respect to each parameter.</p></li> |
| <li><p><strong>states</strong> (<em>list of any obj</em>) – List of states returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LANS.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">indices</em>, <em class="sig-param">weights</em>, <em class="sig-param">grads</em>, <em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/lans.html#LANS.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LANS.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Override update_multi_precision.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| </div> |
| |
| |
| <hr class="feedback-hr-top" /> |
| <div class="feedback-container"> |
| <div class="feedback-question">Did this page help you?</div> |
| <div class="feedback-answer-container"> |
| <div class="feedback-answer yes-link" data-response="yes">Yes</div> |
| <div class="feedback-answer no-link" data-response="no">No</div> |
| </div> |
| <div class="feedback-thank-you">Thanks for your feedback!</div> |
| </div> |
| <hr class="feedback-hr-bottom" /> |
| </div> |
| <div class="side-doc-outline"> |
| <div class="side-doc-outline--content"> |
| </div> |
| </div> |
| |
| <div class="clearer"></div> |
| </div><div class="pagenation"> |
| <a id="button-prev" href="../initializer/index.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="button" accesskey="P"> |
| <i class="pagenation-arrow-L fas fa-arrow-left fa-lg"></i> |
| <div class="pagenation-text"> |
| <span class="pagenation-direction">Previous</span> |
| <div>mxnet.initializer</div> |
| </div> |
| </a> |
| <a id="button-next" href="../lr_scheduler/index.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="button" accesskey="N"> |
| <i class="pagenation-arrow-R fas fa-arrow-right fa-lg"></i> |
| <div class="pagenation-text"> |
| <span class="pagenation-direction">Next</span> |
| <div>mxnet.lr_scheduler</div> |
| </div> |
| </a> |
| </div> |
| <footer class="site-footer h-card"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-4"> |
| <h4 class="footer-category-title">Resources</h4> |
| <ul class="contact-list"> |
| <li><a href="https://lists.apache.org/list.html?dev@mxnet.apache.org">Mailing list</a> <a class="u-email" href="mailto:dev-subscribe@mxnet.apache.org">(subscribe)</a></li> |
| <li><a href="https://discuss.mxnet.io">MXNet Discuss forum</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/issues">Github Issues</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/projects">Projects</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li> |
| <li><a href="/community">Contribute To MXNet</a></li> |
| </ul> |
| </div> |
| |
| <div class="col-4"><ul class="social-media-list"><li><a href="https://github.com/apache/incubator-mxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#github"></use></svg> <span class="username">apache/incubator-mxnet</span></a></li><li><a href="https://www.twitter.com/apachemxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#twitter"></use></svg> <span class="username">apachemxnet</span></a></li><li><a href="https://youtube.com/apachemxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#youtube"></use></svg> <span class="username">apachemxnet</span></a></li></ul> |
| </div> |
| |
| <div class="col-4 footer-text"> |
| <p>A flexible and efficient library for deep learning.</p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| |
| <footer class="site-footer2"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-3"> |
| <img src="../../_static/apache_incubator_logo.png" class="footer-logo col-2"> |
| </div> |
| <div class="footer-bottom-warning col-9"> |
| <p>Apache MXNet is an effort undergoing incubation at <a href="http://www.apache.org/">The Apache Software Foundation</a> (ASF), <span style="font-weight:bold">sponsored by the <i>Apache Incubator</i></span>. Incubation is required |
| of all newly accepted projects until a further review indicates that the infrastructure, |
| communications, and decision making process have stabilized in a manner consistent with other |
| successful ASF projects. While incubation status is not necessarily a reflection of the completeness |
| or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. |
| </p><p>Copyright © 2017-2018, The Apache Software Foundation. Apache MXNet, MXNet, Apache, the Apache
| feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the
| Apache Software Foundation.</p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| |
| </body> |
| </html> |