| <!DOCTYPE html> |
| |
| <html xmlns="http://www.w3.org/1999/xhtml"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta charset="utf-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> |
| <meta http-equiv="x-ua-compatible" content="ie=edge"> |
| <style> |
| .dropdown { |
| position: relative; |
| display: inline-block; |
| } |
| |
| .dropdown-content { |
| display: none; |
| position: absolute; |
| background-color: #f9f9f9; |
| min-width: 160px; |
| box-shadow: 0px 8px 16px 0px rgba(0,0,0,0.2); |
| padding: 12px 16px; |
| z-index: 1; |
| text-align: left; |
| } |
| |
| .dropdown:hover .dropdown-content { |
| display: block; |
| } |
| |
| .dropdown-option:hover { |
| color: #FF4500 !important; |
| } |
| |
| .dropdown-option-active { |
| color: #FF4500; |
| font-weight: lighter; |
| } |
| |
| .dropdown-option { |
| color: #000000; |
| font-weight: lighter; |
| } |
| |
| .dropdown-header { |
| color: #FFFFFF; |
| display: inline-flex; |
| } |
| |
| .dropdown-caret { |
| width: 18px; |
| } |
| |
| .dropdown-caret-path { |
| fill: #FFFFFF; |
| } |
| </style> |
| |
| <title>mxnet.optimizer — Apache MXNet documentation</title> |
| |
| <link rel="stylesheet" href="../../_static/basic.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> |
| <link rel="stylesheet" type="text/css" href="../../_static/mxnet.css" /> |
| <link rel="stylesheet" href="../../_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/sphinx_materialdesign_theme.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/fontawesome/all.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/fonts.css" type="text/css" /> |
| <script id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script> |
| <script src="../../_static/jquery.js"></script> |
| <script src="../../_static/underscore.js"></script> |
| <script src="../../_static/doctools.js"></script> |
| <script src="../../_static/language_data.js"></script> |
| <script src="../../_static/google_analytics.js"></script> |
| <script src="../../_static/autodoc.js"></script> |
| <script crossorigin="anonymous" integrity="sha256-Ae2Vz/4ePdIu6ZyI/5ZGsYnb+m0JlOmKPjt6XZ9JJkA=" src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.4/require.min.js"></script> |
| <script async="async" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script> |
| <script type="text/x-mathjax-config">MathJax.Hub.Config({"tex2jax": {"inlineMath": [["$", "$"], ["\\(", "\\)"]], "processEscapes": true, "ignoreClass": "document", "processClass": "math|output_area"}})</script> |
| <link rel="shortcut icon" href="../../_static/mxnet-icon.png"/> |
| <link rel="index" title="Index" href="../../genindex.html" /> |
| <link rel="search" title="Search" href="../../search.html" /> |
| <link rel="next" title="mxnet.lr_scheduler" href="../lr_scheduler/index.html" /> |
| <link rel="prev" title="mxnet.initializer" href="../initializer/index.html" /> |
| </head> |
| <body><header class="site-header" role="banner"> |
| <div class="wrapper"> |
| <a class="site-title" rel="author" href="/versions/1.6/"><img |
| src="../../_static/mxnet_logo.png" class="site-header-logo"></a> |
| <nav class="site-nav"> |
| <input type="checkbox" id="nav-trigger" class="nav-trigger"/> |
| <label for="nav-trigger"> |
| <span class="menu-icon"> |
| <svg viewBox="0 0 18 15" width="18px" height="15px"> |
| <path d="M18,1.484c0,0.82-0.665,1.484-1.484,1.484H1.484C0.665,2.969,0,2.304,0,1.484l0,0C0,0.665,0.665,0,1.484,0 h15.032C17.335,0,18,0.665,18,1.484L18,1.484z M18,7.516C18,8.335,17.335,9,16.516,9H1.484C0.665,9,0,8.335,0,7.516l0,0 c0-0.82,0.665-1.484,1.484-1.484h15.032C17.335,6.031,18,6.696,18,7.516L18,7.516z M18,13.516C18,14.335,17.335,15,16.516,15H1.484 C0.665,15,0,14.335,0,13.516l0,0c0-0.82,0.665-1.483,1.484-1.483h15.032C17.335,12.031,18,12.695,18,13.516L18,13.516z"/> |
| </svg> |
| </span> |
| </label> |
| |
| <div class="trigger"> |
| <a class="page-link" href="/versions/1.6/get_started">Get Started</a> |
| <a class="page-link" href="/versions/1.6/blog">Blog</a> |
| <a class="page-link" href="/versions/1.6/features">Features</a> |
| <a class="page-link" href="/versions/1.6/ecosystem">Ecosystem</a> |
| <a class="page-link page-current" href="/versions/1.6/api">Docs & Tutorials</a> |
| <a class="page-link" href="https://github.com/apache/incubator-mxnet">GitHub</a> |
| <div class="dropdown"> |
| <span class="dropdown-header">1.6 |
| <svg class="dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true"><path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path></svg> |
| </span> |
| <div class="dropdown-content"> |
| <a class="dropdown-option" href="/">master</a><br> |
| <a class="dropdown-option" href="/versions/1.7/">1.7</a><br> |
| <a class="dropdown-option-active" href="/versions/1.6/">1.6</a><br> |
| <a class="dropdown-option" href="/versions/1.5.0/">1.5.0</a><br> |
| <a class="dropdown-option" href="/versions/1.4.1/">1.4.1</a><br> |
| <a class="dropdown-option" href="/versions/1.3.1/">1.3.1</a><br> |
| <a class="dropdown-option" href="/versions/1.2.1/">1.2.1</a><br> |
| <a class="dropdown-option" href="/versions/1.1.0/">1.1.0</a><br> |
| <a class="dropdown-option" href="/versions/1.0.0/">1.0.0</a><br> |
| <a class="dropdown-option" href="/versions/0.12.1/">0.12.1</a><br> |
| <a class="dropdown-option" href="/versions/0.11.0/">0.11.0</a> |
| </div> |
| </div> |
| </div> |
| </nav> |
| </div> |
| </header> |
| <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header mdl-layout--fixed-drawer"><header class="mdl-layout__header mdl-layout__header--waterfall "> |
| <div class="mdl-layout__header-row"> |
| |
| <nav class="mdl-navigation breadcrumb"> |
| <a class="mdl-navigation__link" href="../index.html">Python API</a><i class="material-icons">navigate_next</i> |
| <a class="mdl-navigation__link is-active">mxnet.optimizer</a> |
| </nav> |
| <div class="mdl-layout-spacer"></div> |
| <nav class="mdl-navigation"> |
| |
| <form class="form-inline pull-sm-right" action="../../search.html" method="get"> |
| <div class="mdl-textfield mdl-js-textfield mdl-textfield--expandable mdl-textfield--floating-label mdl-textfield--align-right"> |
| <label id="quick-search-icon" class="mdl-button mdl-js-button mdl-button--icon" for="waterfall-exp"> |
| <i class="material-icons">search</i> |
| </label> |
| <div class="mdl-textfield__expandable-holder"> |
| <input class="mdl-textfield__input" type="text" name="q" id="waterfall-exp" placeholder="Search" /> |
| <input type="hidden" name="check_keywords" value="yes" /> |
| <input type="hidden" name="area" value="default" /> |
| </div> |
| </div> |
| <div class="mdl-tooltip" data-mdl-for="quick-search-icon"> |
| Quick search |
| </div> |
| </form> |
| |
| <a id="button-show-source" |
| class="mdl-button mdl-js-button mdl-button--icon" |
| href="../../_sources/api/optimizer/index.rst" rel="nofollow"> |
| <i class="material-icons">code</i> |
| </a> |
| <div class="mdl-tooltip" data-mdl-for="button-show-source"> |
| Show Source |
| </div> |
| </nav> |
| </div> |
| <div class="mdl-layout__header-row header-links"> |
| <div class="mdl-layout-spacer"></div> |
| <nav class="mdl-navigation"> |
| </nav> |
| </div> |
| </header><header class="mdl-layout__drawer"> |
| |
| <div class="globaltoc"> |
| <span class="mdl-layout-title toc">Table Of Contents</span> |
| |
| |
| |
| <nav class="mdl-navigation"> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../tutorials/index.html">Python Tutorials</a><ul> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/getting-started/index.html">Getting Started</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/crash-course/index.html">Crash Course</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/1-ndarray.html">Manipulate data with <code class="docutils literal notranslate"><span class="pre">ndarray</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/2-nn.html">Create a neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/3-autograd.html">Automatic differentiation with <code class="docutils literal notranslate"><span class="pre">autograd</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/4-train.html">Train the neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-predict.html">Predict with a pre-trained model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/6-use_gpus.html">Use GPUs</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/index.html">Moving to MXNet from Other Frameworks</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/pytorch.html">PyTorch vs Apache MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_from_experiment_to_deployment.html">Gluon: from experiment to deployment</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/logistic_regression_explained.html">Logistic regression explained</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/image/mnist.html">MNIST</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/packages/index.html">Packages</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/autograd/index.html">Automatic Differentiation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/gluon/index.html">Gluon</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/index.html">Blocks</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom-layer.html">Custom Layers</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom_layer_beginners.html">Customer Layers (Beginners)</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/hybridize.html">Hybridize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/init.html">Initialization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/naming.html">Parameter and Block Naming</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/nn.html">Layers and Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/parameters.html">Parameter Management</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/save_load_params.html">Saving and Loading Gluon Models</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/activations/activations.html">Activation Blocks</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/image/index.html">Image Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/image-augmentation.html">Image Augmentation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/mnist.html">Handwritten Digit Recognition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/pretrained_models.html">Using pre-trained models in MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/loss/index.html">Losses</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/custom-loss.html">Custom Loss Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/kl_divergence.html">Kullback-Leibler (KL) Divergence</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/loss.html">Loss functions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/text/index.html">Text Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/gnmt.html">Google Neural Machine Translation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/transformer.html">Machine Translation with Transformer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/training/index.html">Training</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/fit_api_tutorial.html">MXNet Gluon Fit API</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/trainer.html">Trainer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/index.html">Learning Rates</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_finder.html">Learning Rate Finder</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules.html">Learning Rate Schedules</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules_advanced.html">Advanced Learning Rate Schedules</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/normalization/index.html">Normalization Blocks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/kvstore/index.html">KVStore</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/kvstore/kvstore.html">Distributed Key-Value Store</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/ndarray/index.html">NDArray</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/01-ndarray-intro.html">An Intro: Manipulate Data the MXNet Way with NDArray</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/02-ndarray-operations.html">NDArray Operations</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/03-ndarray-contexts.html">NDArray Contexts</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/gotchas_numpy_in_mxnet.html">Gotchas using NumPy in Apache MXNet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/index.html">Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/csr.html">CSRNDArray - NDArray in Compressed Sparse Row Storage Format</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/row_sparse.html">RowSparseNDArray - NDArray for Sparse Gradient Updates</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/train.html">Train a Linear Regression Model with Sparse Symbols</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/train_gluon.html">Sparse NDArrays with Gluon</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/onnx/index.html">ONNX</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/fine_tuning_gluon.html">Fine-tuning an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/inference_on_onnx_model.html">Running inference on MXNet/Gluon from an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/super_resolution.html">Importing an ONNX model into MXNet</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/deploy/export/onnx.html">Export ONNX Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/optimizer/index.html">Optimizers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/viz/index.html">Visualization</a><ul> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/visualize_graph">Visualize networks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/performance/index.html">Performance</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/compression/index.html">Compression</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/compression/int8.html">Deploy with int-8</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/float16">Float16</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/gradient_compression">Gradient Compression</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html">GluonCV with Quantized Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/backend/index.html">Accelerated Backend Tools</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/index.html">Intel MKL-DNN</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/mkldnn_quantization.html">Quantize with MKL-DNN backend</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/mkldnn_readme.html">Install MXNet with MKL-DNN</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tensorrt/index.html">TensorRT</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/tensorrt/tensorrt.html">Optimized GPU Inference</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tvm.html">Use TVM</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/profiler.html">Profiling MXNet Models</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/amp.html">Using AMP: Automatic Mixed Precision</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/deploy/index.html">Deployment</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/export/index.html">Export</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/export/onnx.html">Exporting to ONNX format</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/export_network.html">Export Gluon CV Models</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/blocks/save_load_params.html">Save / Load Parameters</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/inference/index.html">Inference</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/cpp.html">Deploy into C++</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/scala.html">Deploy into a Java or Scala Environment</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/wine_detector.html">Real-time Object Detection with MXNet On The Raspberry Pi</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/index.html">Run on AWS</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_ec2.html">Run on an EC2 Instance</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_sagemaker.html">Run on Amazon SageMaker</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/cloud.html">MXNet on the Cloud</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/extend/index.html">Extend</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/custom_layer.html">Custom Layers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/customop.html">Custom Numpy Operators</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/new_op">New Operator Creation</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/add_op_in_backend">New Operator in MXNet Backend</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Python API</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../ndarray/index.html">mxnet.ndarray</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/ndarray.html">ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/contrib/index.html">ndarray.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/image/index.html">ndarray.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/linalg/index.html">ndarray.linalg</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/op/index.html">ndarray.op</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/random/index.html">ndarray.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/register/index.html">ndarray.register</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/sparse/index.html">ndarray.sparse</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/utils/index.html">ndarray.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../gluon/index.html">mxnet.gluon</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/block.html">gluon.Block</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/hybrid_block.html">gluon.HybridBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/symbol_block.html">gluon.SymbolBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/constant.html">gluon.Constant</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter.html">gluon.Parameter</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter_dict.html">gluon.ParameterDict</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/trainer.html">gluon.Trainer</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/contrib/index.html">gluon.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/data/index.html">gluon.data</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../gluon/data/vision/index.html">data.vision</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/datasets/index.html">vision.datasets</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/transforms/index.html">vision.transforms</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/loss/index.html">gluon.loss</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/model_zoo/index.html">gluon.model_zoo.vision</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/nn/index.html">gluon.nn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/rnn/index.html">gluon.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/utils/index.html">gluon.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../autograd/index.html">mxnet.autograd</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../initializer/index.html">mxnet.initializer</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">mxnet.optimizer</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../lr_scheduler/index.html">mxnet.lr_scheduler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../metric/index.html">mxnet.metric</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html">mxnet.kvstore</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../symbol/index.html">mxnet.symbol</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/symbol.html">symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/contrib/index.html">symbol.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/image/index.html">symbol.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/linalg/index.html">symbol.linalg</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/op/index.html">symbol.op</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/random/index.html">symbol.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/register/index.html">symbol.register</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/sparse/index.html">symbol.sparse</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../module/index.html">mxnet.module</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../contrib/index.html">mxnet.contrib</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/autograd/index.html">contrib.autograd</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/io/index.html">contrib.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/ndarray/index.html">contrib.ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/onnx/index.html">contrib.onnx</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/quantization/index.html">contrib.quantization</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/symbol/index.html">contrib.symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorboard/index.html">contrib.tensorboard</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorrt/index.html">contrib.tensorrt</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/text/index.html">contrib.text</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../mxnet/index.html">mxnet</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/attribute/index.html">mxnet.attribute</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/base/index.html">mxnet.base</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/callback/index.html">mxnet.callback</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/context/index.html">mxnet.context</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/engine/index.html">mxnet.engine</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/executor/index.html">mxnet.executor</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/executor_manager/index.html">mxnet.executor_manager</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/image/index.html">mxnet.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/io/index.html">mxnet.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/kvstore_server/index.html">mxnet.kvstore_server</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/libinfo/index.html">mxnet.libinfo</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/log/index.html">mxnet.log</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/model/index.html">mxnet.model</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/monitor/index.html">mxnet.monitor</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/name/index.html">mxnet.name</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/notebook/index.html">mxnet.notebook</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/operator/index.html">mxnet.operator</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/profiler/index.html">mxnet.profiler</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/random/index.html">mxnet.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/recordio/index.html">mxnet.recordio</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/registry/index.html">mxnet.registry</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/rtc/index.html">mxnet.rtc</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/test_utils/index.html">mxnet.test_utils</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/torch/index.html">mxnet.torch</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/util/index.html">mxnet.util</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/visualization/index.html">mxnet.visualization</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| |
| </nav> |
| |
| </div> |
| |
| </header> |
| <main class="mdl-layout__content" tabIndex="0"> |
| |
| <script type="text/javascript" src="../../_static/sphinx_materialdesign_theme.js "></script> |
| <header class="mdl-layout__drawer"> |
| |
| <div class="globaltoc"> |
| <span class="mdl-layout-title toc">Table Of Contents</span> |
| |
| |
| |
| <nav class="mdl-navigation"> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../tutorials/index.html">Python Tutorials</a><ul> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/getting-started/index.html">Getting Started</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/crash-course/index.html">Crash Course</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/1-ndarray.html">Manipulate data with <code class="docutils literal notranslate"><span class="pre">ndarray</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/2-nn.html">Create a neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/3-autograd.html">Automatic differentiation with <code class="docutils literal notranslate"><span class="pre">autograd</span></code></a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/4-train.html">Train the neural network</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/5-predict.html">Predict with a pre-trained model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/crash-course/6-use_gpus.html">Use GPUs</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/index.html">Moving to MXNet from Other Frameworks</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/getting-started/to-mxnet/pytorch.html">PyTorch vs Apache MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/gluon_from_experiment_to_deployment.html">Gluon: from experiment to deployment</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/getting-started/logistic_regression_explained.html">Logistic regression explained</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/image/mnist.html">MNIST</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/packages/index.html">Packages</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/autograd/index.html">Automatic Differentiation</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/gluon/index.html">Gluon</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/index.html">Blocks</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom-layer.html">Custom Layers</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/custom_layer_beginners.html">Customer Layers (Beginners)</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/hybridize.html">Hybridize</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/init.html">Initialization</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/naming.html">Parameter and Block Naming</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/nn.html">Layers and Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/parameters.html">Parameter Management</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/save_load_params.html">Saving and Loading Gluon Models</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/blocks/activations/activations.html">Activation Blocks</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/image/index.html">Image Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/image-augmentation.html">Image Augmentation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/mnist.html">Handwritten Digit Recognition</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/image/pretrained_models.html">Using pre-trained models in MXNet</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/loss/index.html">Losses</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/custom-loss.html">Custom Loss Blocks</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/kl_divergence.html">Kullback-Leibler (KL) Divergence</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/loss/loss.html">Loss functions</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/text/index.html">Text Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/gnmt.html">Google Neural Machine Translation</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/text/transformer.html">Machine Translation with Transformer</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/gluon/training/index.html">Training</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/fit_api_tutorial.html">MXNet Gluon Fit API</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/trainer.html">Trainer</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/index.html">Learning Rates</a><ul> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_finder.html">Learning Rate Finder</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules.html">Learning Rate Schedules</a></li> |
| <li class="toctree-l6"><a class="reference internal" href="../../tutorials/packages/gluon/training/learning_rates/learning_rate_schedules_advanced.html">Advanced Learning Rate Schedules</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/gluon/training/normalization/index.html">Normalization Blocks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/kvstore/index.html">KVStore</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/kvstore/kvstore.html">Distributed Key-Value Store</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/ndarray/index.html">NDArray</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/01-ndarray-intro.html">An Intro: Manipulate Data the MXNet Way with NDArray</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/02-ndarray-operations.html">NDArray Operations</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/03-ndarray-contexts.html">NDArray Contexts</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/gotchas_numpy_in_mxnet.html">Gotchas using NumPy in Apache MXNet</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/index.html">Tutorials</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/csr.html">CSRNDArray - NDArray in Compressed Sparse Row Storage Format</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/row_sparse.html">RowSparseNDArray - NDArray for Sparse Gradient Updates</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/train.html">Train a Linear Regression Model with Sparse Symbols</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/packages/ndarray/sparse/train_gluon.html">Sparse NDArrays with Gluon</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/onnx/index.html">ONNX</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/fine_tuning_gluon.html">Fine-tuning an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/inference_on_onnx_model.html">Running inference on MXNet/Gluon from an ONNX model</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/packages/onnx/super_resolution.html">Importing an ONNX model into MXNet</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/deploy/export/onnx.html">Export ONNX Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/optimizer/index.html">Optimizers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/packages/viz/index.html">Visualization</a><ul> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/visualize_graph">Visualize networks</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/performance/index.html">Performance</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/compression/index.html">Compression</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/compression/int8.html">Deploy with int-8</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/float16">Float16</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/faq/gradient_compression">Gradient Compression</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html">GluonCV with Quantized Models</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/performance/backend/index.html">Accelerated Backend Tools</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/index.html">Intel MKL-DNN</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/mkldnn_quantization.html">Quantize with MKL-DNN backend</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/mkldnn/mkldnn_readme.html">Install MXNet with MKL-DNN</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tensorrt/index.html">TensorRT</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../../tutorials/performance/backend/tensorrt/tensorrt.html">Optimized GPU Inference</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/tvm.html">Use TVM</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/profiler.html">Profiling MXNet Models</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/performance/backend/amp.html">Using AMP: Automatic Mixed Precision</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/deploy/index.html">Deployment</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/export/index.html">Export</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/export/onnx.html">Exporting to ONNX format</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://gluon-cv.mxnet.io/build/examples_deployment/export_network.html">Export Gluon CV Models</a></li> |
| <li class="toctree-l4"><a class="reference external" href="https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/blocks/save_load_params.html">Save / Load Parameters</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/inference/index.html">Inference</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/cpp.html">Deploy into C++</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/scala.html">Deploy into a Java or Scala Environment</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/inference/wine_detector.html">Real-time Object Detection with MXNet On The Raspberry Pi</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/index.html">Run on AWS</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_ec2.html">Run on an EC2 Instance</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/use_sagemaker.html">Run on Amazon SageMaker</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="../../tutorials/deploy/run-on-aws/cloud.html">MXNet on the Cloud</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../../tutorials/extend/index.html">Extend</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/custom_layer.html">Custom Layers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../../tutorials/extend/customop.html">Custom Numpy Operators</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/new_op">New Operator Creation</a></li> |
| <li class="toctree-l3"><a class="reference external" href="https://mxnet.apache.org/api/faq/add_op_in_backend">New Operator in MXNet Backend</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Python API</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../ndarray/index.html">mxnet.ndarray</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/ndarray.html">ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/contrib/index.html">ndarray.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/image/index.html">ndarray.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/linalg/index.html">ndarray.linalg</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/op/index.html">ndarray.op</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/random/index.html">ndarray.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/register/index.html">ndarray.register</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/sparse/index.html">ndarray.sparse</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../ndarray/utils/index.html">ndarray.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../gluon/index.html">mxnet.gluon</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/block.html">gluon.Block</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/hybrid_block.html">gluon.HybridBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/symbol_block.html">gluon.SymbolBlock</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/constant.html">gluon.Constant</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter.html">gluon.Parameter</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/parameter_dict.html">gluon.ParameterDict</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/trainer.html">gluon.Trainer</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/contrib/index.html">gluon.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/data/index.html">gluon.data</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="../gluon/data/vision/index.html">data.vision</a><ul> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/datasets/index.html">vision.datasets</a></li> |
| <li class="toctree-l5"><a class="reference internal" href="../gluon/data/vision/transforms/index.html">vision.transforms</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/loss/index.html">gluon.loss</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/model_zoo/index.html">gluon.model_zoo.vision</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/nn/index.html">gluon.nn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/rnn/index.html">gluon.rnn</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../gluon/utils/index.html">gluon.utils</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../autograd/index.html">mxnet.autograd</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../initializer/index.html">mxnet.initializer</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">mxnet.optimizer</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../lr_scheduler/index.html">mxnet.lr_scheduler</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../metric/index.html">mxnet.metric</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../kvstore/index.html">mxnet.kvstore</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../symbol/index.html">mxnet.symbol</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/symbol.html">symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/contrib/index.html">symbol.contrib</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/image/index.html">symbol.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/linalg/index.html">symbol.linalg</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/op/index.html">symbol.op</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/random/index.html">symbol.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/register/index.html">symbol.register</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../symbol/sparse/index.html">symbol.sparse</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../module/index.html">mxnet.module</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../contrib/index.html">mxnet.contrib</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/autograd/index.html">contrib.autograd</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/io/index.html">contrib.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/ndarray/index.html">contrib.ndarray</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/onnx/index.html">contrib.onnx</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/quantization/index.html">contrib.quantization</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/symbol/index.html">contrib.symbol</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorboard/index.html">contrib.tensorboard</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/tensorrt/index.html">contrib.tensorrt</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../contrib/text/index.html">contrib.text</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../mxnet/index.html">mxnet</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/attribute/index.html">mxnet.attribute</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/base/index.html">mxnet.base</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/callback/index.html">mxnet.callback</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/context/index.html">mxnet.context</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/engine/index.html">mxnet.engine</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/executor/index.html">mxnet.executor</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/executor_manager/index.html">mxnet.executor_manager</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/image/index.html">mxnet.image</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/io/index.html">mxnet.io</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/kvstore_server/index.html">mxnet.kvstore_server</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/libinfo/index.html">mxnet.libinfo</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/log/index.html">mxnet.log</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/model/index.html">mxnet.model</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/monitor/index.html">mxnet.monitor</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/name/index.html">mxnet.name</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/notebook/index.html">mxnet.notebook</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/operator/index.html">mxnet.operator</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/profiler/index.html">mxnet.profiler</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/random/index.html">mxnet.random</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/recordio/index.html">mxnet.recordio</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/registry/index.html">mxnet.registry</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/rtc/index.html">mxnet.rtc</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/test_utils/index.html">mxnet.test_utils</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/torch/index.html">mxnet.torch</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/util/index.html">mxnet.util</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../mxnet/visualization/index.html">mxnet.visualization</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| |
| </nav> |
| |
| </div> |
| |
| </header> |
| |
| <div class="document"> |
| <div class="page-content" role="main"> |
| |
| <div class="section" id="module-mxnet.optimizer"> |
| <span id="mxnet-optimizer"></span><h1>mxnet.optimizer<a class="headerlink" href="#module-mxnet.optimizer" title="Permalink to this headline">¶</a></h1> |
| <p>Optimizer API of MXNet.</p> |
| <p><strong>Classes</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta" title="mxnet.optimizer.AdaDelta"><code class="xref py py-obj docutils literal notranslate"><span class="pre">AdaDelta</span></code></a>([rho, epsilon])</p></td> |
| <td><p>The AdaDelta optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad" title="mxnet.optimizer.AdaGrad"><code class="xref py py-obj docutils literal notranslate"><span class="pre">AdaGrad</span></code></a>([eps])</p></td> |
| <td><p>AdaGrad optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam" title="mxnet.optimizer.Adam"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Adam</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>The Adam optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax" title="mxnet.optimizer.Adamax"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Adamax</span></code></a>([learning_rate, beta1, beta2])</p></td> |
| <td><p>The AdaMax optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD" title="mxnet.optimizer.DCASGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">DCASGD</span></code></a>([momentum, lamda])</p></td> |
| <td><p>The DCASGD optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML" title="mxnet.optimizer.FTML"><code class="xref py py-obj docutils literal notranslate"><span class="pre">FTML</span></code></a>([beta1, beta2, epsilon])</p></td> |
| <td><p>The FTML optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl" title="mxnet.optimizer.Ftrl"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Ftrl</span></code></a>([lamda1, learning_rate, beta])</p></td> |
| <td><p>The Ftrl optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS" title="mxnet.optimizer.LARS"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LARS</span></code></a>([momentum, lazy_update, eta, eps, …])</p></td> |
| <td><p>The LARS optimizer from ‘Large Batch Training of Convolutional Networks’ (<a class="reference external" href="https://arxiv.org/abs/1708.03888">https://arxiv.org/abs/1708.03888</a>).</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LBSGD" title="mxnet.optimizer.LBSGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LBSGD</span></code></a>([momentum, multi_precision, …])</p></td> |
| <td><p>The Large Batch SGD optimizer with momentum and weight decay.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG" title="mxnet.optimizer.NAG"><code class="xref py py-obj docutils literal notranslate"><span class="pre">NAG</span></code></a>([momentum])</p></td> |
| <td><p>Nesterov accelerated gradient.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam" title="mxnet.optimizer.Nadam"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Nadam</span></code></a>([learning_rate, beta1, beta2, …])</p></td> |
| <td><p>The Nesterov Adam optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Optimizer</span></code></a>([rescale_grad, param_idx2name, …])</p></td> |
| <td><p>The base class inherited by all optimizers.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp" title="mxnet.optimizer.RMSProp"><code class="xref py py-obj docutils literal notranslate"><span class="pre">RMSProp</span></code></a>([learning_rate, gamma1, gamma2, …])</p></td> |
| <td><p>The RMSProp optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD" title="mxnet.optimizer.SGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SGD</span></code></a>([momentum, lazy_update])</p></td> |
| <td><p>The SGD optimizer with momentum and weight decay.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD" title="mxnet.optimizer.SGLD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SGLD</span></code></a>(**kwargs)</p></td> |
| <td><p>Stochastic Gradient Riemannian Langevin Dynamics.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum" title="mxnet.optimizer.Signum"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Signum</span></code></a>([learning_rate, momentum, wd_lh])</p></td> |
| <td><p>The Signum optimizer that takes the sign of gradient or momentum.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB" title="mxnet.optimizer.LAMB"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LAMB</span></code></a>([learning_rate, beta1, beta2, epsilon, …])</p></td> |
| <td><p>LAMB Optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Test" title="mxnet.optimizer.Test"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Test</span></code></a>(**kwargs)</p></td> |
| <td><p>The Test optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater" title="mxnet.optimizer.Updater"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Updater</span></code></a>(optimizer)</p></td> |
| <td><p>Updater for kvstore.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.ccSGD" title="mxnet.optimizer.ccSGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">ccSGD</span></code></a>(*args, **kwargs)</p></td> |
| <td><p>[DEPRECATED] Same as <cite>SGD</cite>. Left here for backward compatibility.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p><strong>Functions</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.NDabs" title="mxnet.optimizer.NDabs"><code class="xref py py-obj docutils literal notranslate"><span class="pre">NDabs</span></code></a>([data, out, name])</p></td> |
| <td><p>Returns element-wise absolute value of the input.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.create" title="mxnet.optimizer.create"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create</span></code></a>(name, **kwargs)</p></td> |
| <td><p>Instantiates an optimizer with a given name and kwargs.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.get_updater" title="mxnet.optimizer.get_updater"><code class="xref py py-obj docutils literal notranslate"><span class="pre">get_updater</span></code></a>(optimizer)</p></td> |
| <td><p>Returns a closure of the updater needed for kvstore.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.register" title="mxnet.optimizer.register"><code class="xref py py-obj docutils literal notranslate"><span class="pre">register</span></code></a>(klass)</p></td> |
| <td><p>Registers a new optimizer.</p></td> |
| </tr> |
| </tbody> |
| </table> |
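| <p>For illustration, the sketch below wires these functions together: it registers a hypothetical |
| <code class="docutils literal notranslate"><span class="pre">MySGD</span></code> optimizer (the class name and its simplified update rule, which skips |
| <code class="docutils literal notranslate"><span class="pre">clip_gradient</span></code> handling, are assumptions made for this example), then instantiates it through |
| <code class="docutils literal notranslate"><span class="pre">create()</span></code> and obtains an updater closure via <code class="docutils literal notranslate"><span class="pre">get_updater()</span></code>:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| from mxnet.optimizer import Optimizer, register |
| |
| @register                         # makes the class reachable as create('mysgd') |
| class MySGD(Optimizer): |
|     """Toy optimizer (hypothetical); clip_gradient handling is omitted.""" |
|     def create_state(self, index, weight): |
|         return None               # plain SGD needs no auxiliary state |
| |
|     def update(self, index, weight, grad, state): |
|         self._update_count(index) |
|         lr = self._get_lr(index)  # per-parameter learning rate |
|         wd = self._get_wd(index)  # per-parameter weight decay |
|         weight[:] -= lr * (grad * self.rescale_grad + wd * weight) |
| |
| opt = mx.optimizer.create('mysgd', learning_rate=0.1) |
| updater = mx.optimizer.get_updater(opt) |
| </pre></div> |
| </div> |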
| <dl class="class"> |
| <dt id="mxnet.optimizer.AdaDelta"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">AdaDelta</code><span class="sig-paren">(</span><em class="sig-param">rho=0.9</em>, <em class="sig-param">epsilon=1e-05</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaDelta"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The AdaDelta optimizer.</p> |
| <p>This class implements AdaDelta, an optimizer described in <em>ADADELTA: An adaptive |
| learning rate method</em>, available at <a class="reference external" href="https://arxiv.org/abs/1212.5701">https://arxiv.org/abs/1212.5701</a>.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">acc_grad</span> <span class="o">=</span> <span class="n">rho</span> <span class="o">*</span> <span class="n">acc_grad</span> <span class="o">+</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">-</span> <span class="n">rho</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">delta</span> <span class="o">=</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">acc_delta</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> <span class="o">/</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">acc_grad</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">acc_delta</span> <span class="o">=</span> <span class="n">rho</span> <span class="o">*</span> <span class="n">acc_delta</span> <span class="o">+</span> <span class="p">(</span><span class="mf">1.</span> <span class="o">-</span> <span class="n">rho</span><span class="p">)</span> <span class="o">*</span> <span class="n">delta</span> <span class="o">*</span> <span class="n">delta</span> |
| <span class="n">weight</span> <span class="o">-=</span> <span class="p">(</span><span class="n">delta</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta.create_state" title="mxnet.optimizer.AdaDelta.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaDelta.update" title="mxnet.optimizer.AdaDelta.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>rho</strong> (<em>float</em>) – Decay rate for both squared gradients and delta.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em>) – Small value to avoid division by 0.</p></li> |
| </ul> |
| </dd> |
| </dl> |
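| <p>As a minimal usage sketch (the shapes and values here are arbitrary), the optimizer can be |
| created through the registry and driven by hand with an updater closure:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| |
| opt = mx.optimizer.create('adadelta', rho=0.9, epsilon=1e-5) |
| updater = mx.optimizer.get_updater(opt) |
| |
| weight = mx.nd.ones((2, 2)) |
| grad = mx.nd.ones((2, 2)) * 0.1 |
| updater(0, grad, weight)          # index 0 identifies this parameter |
| </pre></div> |
| </div> |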
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaDelta.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaDelta.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaDelta.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaDelta.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaDelta.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.AdaGrad"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">AdaGrad</code><span class="sig-paren">(</span><em class="sig-param">eps=1e-07</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaGrad"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>AdaGrad optimizer.</p> |
| <p>This class implements the AdaGrad optimizer described in <em>Adaptive Subgradient |
| Methods for Online Learning and Stochastic Optimization</em>, and available at |
| <a class="reference external" href="http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf">http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf</a>.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">history</span> <span class="o">+=</span> <span class="n">square</span><span class="p">(</span><span class="n">grad</span><span class="p">)</span> |
| <span class="n">div</span> <span class="o">=</span> <span class="n">grad</span> <span class="o">/</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">history</span> <span class="o">+</span> <span class="n">float_stable_eps</span><span class="p">)</span> |
| <span class="n">weight</span> <span class="o">+=</span> <span class="p">(</span><span class="n">div</span> <span class="o">+</span> <span class="n">weight</span> <span class="o">*</span> <span class="n">wd</span><span class="p">)</span> <span class="o">*</span> <span class="o">-</span><span class="n">lr</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad.create_state" title="mxnet.optimizer.AdaGrad.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.AdaGrad.update" title="mxnet.optimizer.AdaGrad.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <div class="admonition seealso"> |
| <p class="admonition-title">See also</p> |
| <p><a class="reference internal" href="../ndarray/sparse/index.html#mxnet.ndarray.sparse.adagrad_update" title="mxnet.ndarray.sparse.adagrad_update"><code class="xref py py-meth docutils literal notranslate"><span class="pre">mxnet.ndarray.sparse.adagrad_update()</span></code></a></p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>eps</strong> (<em>float</em><em>, </em><em>optional</em>) – Initial value of the history accumulator. Avoids division by 0.</p> |
| </dd> |
| </dl> |
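| <p>The following is a hand-rolled NDArray transcription of the update shown above, for intuition |
| only (the lr, wd, and eps values are arbitrary, and gradient clipping is skipped):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| |
| lr, wd, eps = 0.01, 0.0, 1e-7 |
| weight = mx.nd.ones((3,)) |
| grad = mx.nd.array([0.1, -0.2, 0.3]) |
| history = mx.nd.zeros_like(weight) |
| |
| history += mx.nd.square(grad)          # accumulate squared gradients |
| div = grad / mx.nd.sqrt(history + eps) |
| weight += (div + weight * wd) * -lr |
| </pre></div> |
| </div> |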
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaGrad.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaGrad.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.AdaGrad.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#AdaGrad.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.AdaGrad.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Adam"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Adam</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">lazy_update=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adam"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Adam optimizer.</p> |
| <p>This class implements the optimizer described in <em>Adam: A Method for |
| Stochastic Optimization</em>, available at <a class="reference external" href="http://arxiv.org/abs/1412.6980">http://arxiv.org/abs/1412.6980</a>.</p> |
| <p>If the storage type of grad is <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code> and <code class="docutils literal notranslate"><span class="pre">lazy_update</span></code> is True, <strong>lazy updates</strong> at step t are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">],</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| <span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> |
| <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">m</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam.create_state" title="mxnet.optimizer.Adam.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adam.update" title="mxnet.optimizer.Adam.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The lazy update only updates the mean and var for the weight rows whose row_sparse |
| gradient indices appear in the current batch, rather than updating them for all indices. |
| Compared with the original update, it can provide large improvements in model training |
| throughput for some applications. However, it provides slightly different semantics than |
| the original update, and may lead to different empirical results.</p> |
| <p>Otherwise, <strong>standard updates</strong> at step t are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span> |
| <span class="n">v</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> |
| <span class="n">lr</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="n">w</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">m</span> <span class="o">/</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.adam_update" title="mxnet.ndarray.adam_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">adam_update</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>optional</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>optional</em>) – Default is True. If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
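| <p>As a sketch of the lazy-update path (the array shapes and row indices here are arbitrary): |
| with a <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code> gradient touching only rows 0 and 2, only those rows have their |
| mean/var statistics and weight values advanced in this step:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| |
| opt = mx.optimizer.Adam(learning_rate=0.001, lazy_update=True) |
| updater = mx.optimizer.get_updater(opt) |
| |
| weight = mx.nd.ones((4, 2)) |
| data = mx.nd.array([[0.1, 0.1], [0.2, 0.2]]) |
| grad = mx.nd.sparse.row_sparse_array((data, [0, 2]), shape=(4, 2)) |
| updater(0, grad, weight)       # rows 1 and 3 are left untouched |
| </pre></div> |
| </div> |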
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adam.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adam.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adam.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adam.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adam.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Adamax"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Adamax</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.002</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adamax"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The AdaMax optimizer.</p> |
| <p>It is a variant of Adam based on the infinity norm, described in Section 7 of |
| <a class="reference external" href="http://arxiv.org/abs/1412.6980">http://arxiv.org/abs/1412.6980</a>.</p> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">m</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">m_t</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">grad</span> |
| <span class="n">u</span> <span class="o">=</span> <span class="n">maximum</span><span class="p">(</span><span class="n">beta2</span> <span class="o">*</span> <span class="n">u</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">grad</span><span class="p">))</span> |
| <span class="n">weight</span> <span class="o">-=</span> <span class="n">lr</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="o">**</span><span class="n">t</span><span class="p">)</span> <span class="o">*</span> <span class="n">m</span> <span class="o">/</span> <span class="n">u</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax.create_state" title="mxnet.optimizer.Adamax.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Adamax.update" title="mxnet.optimizer.Adamax.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| </ul> |
| </dd> |
| </dl> |
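| <p>A short sketch of plugging this optimizer into a Gluon training setup by its registered name |
| (the one-layer network here is only a placeholder):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| from mxnet import gluon |
| |
| net = gluon.nn.Dense(1)        # any gluon Block would do |
| net.initialize() |
| trainer = gluon.Trainer(net.collect_params(), 'adamax', |
|                         {'learning_rate': 0.002, 'beta1': 0.9, 'beta2': 0.999}) |
| </pre></div> |
| </div> |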
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adamax.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adamax.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Adamax.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Adamax.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Adamax.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.DCASGD"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">DCASGD</code><span class="sig-paren">(</span><em class="sig-param">momentum=0.0</em>, <em class="sig-param">lamda=0.04</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#DCASGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The DCASGD optimizer.</p> |
| <p>This class implements the optimizer described in <em>Asynchronous Stochastic Gradient Descent |
| with Delay Compensation for Distributed Deep Learning</em>, |
| available at <a class="reference external" href="https://arxiv.org/abs/1609.08326">https://arxiv.org/abs/1609.08326</a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>lamda</strong> (<em>float</em><em>, </em><em>optional</em>) – Scaling factor for the delay-compensation (DC) term.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD.create_state" title="mxnet.optimizer.DCASGD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.DCASGD.update" title="mxnet.optimizer.DCASGD.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.DCASGD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#DCASGD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.DCASGD.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#DCASGD.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.DCASGD.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter multipliers may be set |
| via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.FTML"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">FTML</code><span class="sig-paren">(</span><em class="sig-param">beta1=0.6</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#FTML"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The FTML optimizer.</p> |
| <p>This class implements the optimizer described in |
| <em>FTML - Follow the Moving Leader in Deep Learning</em>, |
| available at <a class="reference external" href="http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf">http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf</a>.</p> |
| <p>Denote time step by t. The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">v</span> <span class="o">=</span> <span class="n">beta2</span> <span class="o">*</span> <span class="n">v</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta2</span><span class="p">)</span> <span class="o">*</span> <span class="n">square</span><span class="p">(</span><span class="n">rescaled_grad</span><span class="p">)</span> |
| <span class="n">d_t</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">power</span><span class="p">(</span><span class="n">beta1</span><span class="p">,</span> <span class="n">t</span><span class="p">))</span> <span class="o">/</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">square_root</span><span class="p">(</span><span class="n">v</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">power</span><span class="p">(</span><span class="n">beta2</span><span class="p">,</span> <span class="n">t</span><span class="p">)))</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> |
| <span class="n">z</span> <span class="o">=</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">z</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">beta1</span><span class="p">)</span> <span class="o">*</span> <span class="n">rescaled_grad</span> <span class="o">-</span> <span class="p">(</span><span class="n">d_t</span> <span class="o">-</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">d_</span><span class="p">(</span><span class="n">t</span><span class="o">-</span><span class="mi">1</span><span class="p">))</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="o">-</span> <span class="n">z</span> <span class="o">/</span> <span class="n">d_t</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML.create_state" title="mxnet.optimizer.FTML.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.FTML.update" title="mxnet.optimizer.FTML.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.ftml_update" title="mxnet.ndarray.ftml_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">ftml_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>optional</em>) – 0 < beta1 < 1. Generally close to 0.5.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>optional</em>) – 0 < beta2 < 1. Generally close to 1.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>optional</em>) – Small value to avoid division by 0.</p></li> |
| </ul> |
| </dd> |
| </dl> |
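| <p>As a minimal usage sketch (not part of the original reference; the shapes and values are illustrative only), the <cite>create_state()</cite>/<cite>update()</cite> protocol documented below can be driven by hand:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| opt = mx.optimizer.FTML(learning_rate=0.0025, beta1=0.6, beta2=0.999, epsilon=1e-8) |
| weight = mx.nd.ones((3,))            # parameter to be optimized |
| grad = mx.nd.full((3,), 0.5)         # stand-in gradient |
| state = opt.create_state(0, weight)  # called once per weight |
| opt.update(0, weight, grad, state)   # one FTML step, updates weight in place |
| </pre></div> |
| </div> |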
| <dl class="method"> |
| <dt id="mxnet.optimizer.FTML.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#FTML.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.FTML.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#FTML.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.FTML.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Ftrl"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Ftrl</code><span class="sig-paren">(</span><em class="sig-param">lamda1=0.01</em>, <em class="sig-param">learning_rate=0.1</em>, <em class="sig-param">beta=1</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Ftrl"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Ftrl optimizer.</p> |
| <p>Referenced from <em>Ad Click Prediction: a View from the Trenches</em>, available at |
| <a class="reference external" href="http://dl.acm.org/citation.cfm?id=2488200">http://dl.acm.org/citation.cfm?id=2488200</a>.</p> |
| <dl> |
| <dt>eta :</dt><dd><div class="math notranslate nohighlight"> |
| \[\eta_{t,i} = \frac{learningrate}{\beta+\sqrt{\sum_{s=1}^tg_{s,i}^2}}\]</div> |
| </dd> |
| </dl> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl.create_state" title="mxnet.optimizer.Ftrl.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Ftrl.update" title="mxnet.optimizer.Ftrl.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">z</span> <span class="o">+=</span> <span class="n">rescaled_grad</span> <span class="o">-</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="o">*</span> <span class="n">weight</span> <span class="o">/</span> <span class="n">learning_rate</span> |
| <span class="n">n</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="o">**</span><span class="mi">2</span> |
| <span class="n">w</span> <span class="o">=</span> <span class="p">(</span><span class="n">sign</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">*</span> <span class="n">lamda1</span> <span class="o">-</span> <span class="n">z</span><span class="p">)</span> <span class="o">/</span> <span class="p">((</span><span class="n">beta</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="o">/</span> <span class="n">learning_rate</span> <span class="o">+</span> <span class="n">wd</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">)</span> <span class="o">></span> <span class="n">lamda1</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p>If the storage types of weight, state and grad are all <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>, <strong>sparse updates</strong> are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">rescale_grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> |
| <span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]))</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">/</span> <span class="n">learning_rate</span> |
| <span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+=</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span> |
| <span class="n">w</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">sign</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">*</span> <span class="n">lamda1</span> <span class="o">-</span> <span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">/</span> <span class="p">((</span><span class="n">beta</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">[</span><span class="n">row</span><span class="p">]))</span> <span class="o">/</span> <span class="n">learning_rate</span> <span class="o">+</span> <span class="n">wd</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> <span class="o">></span> <span class="n">lamda1</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p>The sparse update only updates the z and n for the weights whose row_sparse |
| gradient indices appear in the current batch, rather than updating them for all |
| indices. Compared with the original update, it can provide large |
| improvements in model training throughput for some applications. However, it |
| provides slightly different semantics than the original update, and |
| may lead to different empirical results.</p> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.ftrl_update" title="mxnet.ndarray.ftrl_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">ftrl_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>lamda1</strong> (<em>float</em><em>, </em><em>optional</em>) – L1 regularization coefficient.</p></li> |
| <li><p><strong>learning_rate</strong> (<em>float</em><em>, </em><em>optional</em>) – The initial learning rate.</p></li> |
| <li><p><strong>beta</strong> (<em>float</em><em>, </em><em>optional</em>) – Per-coordinate learning rate correlation parameter.</p></li> |
| </ul> |
| </dd> |
| </dl> |
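| <p>As a hedged usage sketch (assuming an already-initialized Gluon network <cite>net</cite>, which is not defined in this reference), the optimizer can be attached to a trainer by its registered name:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>from mxnet import gluon |
| # 'ftrl' is the registered name of this optimizer; the hyperparameters |
| # mirror the constructor defaults shown above. |
| trainer = gluon.Trainer(net.collect_params(), 'ftrl', |
|                         {'lamda1': 0.01, 'learning_rate': 0.1, 'beta': 1}) |
| </pre></div> |
| </div> |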
| <dl class="method"> |
| <dt id="mxnet.optimizer.Ftrl.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Ftrl.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Ftrl.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Ftrl.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Ftrl.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LARS"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">LARS</code><span class="sig-paren">(</span><em class="sig-param">momentum=0.0</em>, <em class="sig-param">lazy_update=True</em>, <em class="sig-param">eta=0.001</em>, <em class="sig-param">eps=0</em>, <em class="sig-param">momentum_correction=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The LARS optimizer from ‘Large Batch Training of Convolutional Networks’ (<a class="reference external" href="https://arxiv.org/abs/1708.03888">https://arxiv.org/abs/1708.03888</a>).</p> |
| <p>It behaves mostly like SGD with momentum and weight decay, but adaptively scales the learning rate for each layer (except for bias and batch norm parameters):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>w_norm = L2norm(weights) |
| g_norm = L2norm(gradients) |
| if w_norm > 0 and g_norm > 0: |
|     lr_layer = lr * lr_mult * eta * w_norm / (g_norm + weight_decay * w_norm + eps) |
| else: |
|     lr_layer = lr * lr_mult |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.create_state" title="mxnet.optimizer.LARS.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.create_state_multi_precision" title="mxnet.optimizer.LARS.create_state_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state_multi_precision</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight, including FP32 high precision copy if original weight is FP16.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.set_wd_mult" title="mxnet.optimizer.LARS.set_wd_mult"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_wd_mult</span></code></a>(args_wd_mult)</p></td> |
| <td><p>Sets an individual weight decay multiplier for each parameter.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.update" title="mxnet.optimizer.LARS.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LARS.update_multi_precision" title="mxnet.optimizer.LARS.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(index, weight, grad, …)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>optional</em>) – Default is True. If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li> |
| <li><p><strong>eta</strong> (<em>float</em><em>, </em><em>optional</em>) – LARS coefficient used to scale the layer-wise learning rate. Default set to 0.001.</p></li> |
| <li><p><strong>eps</strong> (<em>float</em><em>, </em><em>optional</em>) – Optional epsilon in case of very small gradients. Default set to 0.</p></li> |
| <li><p><strong>momentum_correction</strong> (<em>bool</em><em>, </em><em>optional</em>) – If True, scales momentum w.r.t. global learning rate changes (with an lr_scheduler), as described in ‘Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour’ (<a class="reference external" href="https://arxiv.org/pdf/1706.02677.pdf">https://arxiv.org/pdf/1706.02677.pdf</a>). |
| Default set to True.</p></li> |
| </ul> |
| </dd> |
| </dl> |
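| <p>A short instantiation sketch (the values are illustrative and not from the original docs); the arguments follow the constructor signature above:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| # eta scales the layer-wise learning rate; eps guards against |
| # vanishing gradient or weight norms in the lr_layer formula above. |
| opt = mx.optimizer.LARS(learning_rate=0.1, momentum=0.9, eta=0.001, eps=1e-8) |
| </pre></div> |
| </div> |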
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.create_state_multi_precision"> |
| <code class="sig-name descname">create_state_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS.create_state_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.create_state_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight, including FP32 high |
| precision copy if original weight is FP16.</p> |
| <p>This method is provided to perform automatic mixed precision training |
| for optimizers that do not support it themselves.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.set_wd_mult"> |
| <code class="sig-name descname">set_wd_mult</code><span class="sig-paren">(</span><em class="sig-param">args_wd_mult</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS.set_wd_mult"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.set_wd_mult" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets an individual weight decay multiplier for each parameter.</p> |
| <p>By default, if <cite>param_idx2name</cite> was provided in the |
| constructor, the weight decay multiplier is set to 0 for all |
| parameters whose names don’t end with <code class="docutils literal notranslate"><span class="pre">_weight</span></code> or |
| <code class="docutils literal notranslate"><span class="pre">_gamma</span></code>.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>The default weight decay multiplier for a <cite>Variable</cite> |
| can be set with its <cite>wd_mult</cite> argument in the constructor.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>args_wd_mult</strong> (<em>dict of string/int to float</em>) – <p>For each of its key-value entries, the weight decay multipler for the |
| parameter specified in the key will be set as the given value.</p> |
| <p>You can specify the parameter with either its name or its index. |
| If you use the name, you should pass <cite>sym</cite> in the constructor, |
| and the name you specified in the key of <cite>args_lr_mult</cite> should match |
| the name of the parameter in <cite>sym</cite>. If you use the index, it should |
| correspond to the index of the parameter used in the <cite>update</cite> method.</p> |
| <p>Specifying a parameter by its index is only supported for backward |
| compatibility, and we recommend using the name instead.</p> |
| </p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LARS.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LARS.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LARS.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state. |
| Mixed precision version.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LBSGD"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">LBSGD</code><span class="sig-paren">(</span><em class="sig-param">momentum=0.0</em>, <em class="sig-param">multi_precision=False</em>, <em class="sig-param">warmup_strategy='linear'</em>, <em class="sig-param">warmup_epochs=5</em>, <em class="sig-param">batch_scale=1</em>, <em class="sig-param">updates_per_epoch=32</em>, <em class="sig-param">begin_epoch=0</em>, <em class="sig-param">num_epochs=60</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LBSGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LBSGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Large Batch SGD optimizer with momentum and weight decay.</p> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">state</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LBSGD.create_state" title="mxnet.optimizer.LBSGD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LBSGD.update" title="mxnet.optimizer.LBSGD.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>For details of the update algorithm, see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.sgd_update" title="mxnet.ndarray.sgd_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_update</span></code></a> |
| and <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.sgd_mom_update" title="mxnet.ndarray.sgd_mom_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_mom_update</span></code></a>. |
| In addition to the SGD updates, the LBSGD optimizer uses the LARS (Layer-wise |
| Adaptive Rate Scaling) algorithm to compute a separate learning rate for each |
| layer of the network, which leads to better stability over large batch sizes.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>optional</em>) – Flag to control the internal precision of the optimizer. |
| False: results in using the same precision as the weights (default), |
| True: makes internal 32-bit copy of the weights and applies gradients |
| in 32-bit precision even if actual weights used in the model have lower precision. |
| Turning this on can improve convergence and accuracy when training with float16.</p></li> |
| <li><p><strong>warmup_strategy</strong> (<em>string</em><em>, </em><em>optional</em>) – One of 'linear', 'power2', 'sqrt', or 'lars'. Default: 'linear'.</p></li> |
| <li><p><strong>warmup_epochs</strong> (<em>unsigned</em><em>, </em><em>optional</em>) – Number of warmup epochs. Default: 5.</p></li> |
| <li><p><strong>batch_scale</strong> (<em>unsigned</em><em>, </em><em>optional</em>) – Batch scale, typically batch size * number of workers. Default: 1.</p></li> |
| <li><p><strong>updates_per_epoch</strong> (<em>unsigned</em><em>, </em><em>optional</em>) – Number of updates per epoch, used for warmup; the default might not reflect the true number of batches per epoch. Default: 32.</p></li> |
| <li><p><strong>begin_epoch</strong> (<em>unsigned</em><em>, </em><em>optional</em>) – Starting epoch. Default: 0.</p></li> |
| </ul> |
| </dd> |
| </dl> |
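| <p>A hypothetical configuration sketch (the values are illustrative and should be derived from the actual training setup):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| # batch_scale is typically batch size * number of workers; |
| # updates_per_epoch should reflect the real number of batches per epoch. |
| opt = mx.optimizer.LBSGD(momentum=0.9, multi_precision=True, |
|                          warmup_strategy='linear', warmup_epochs=5, |
|                          batch_scale=256, updates_per_epoch=100, |
|                          begin_epoch=0, num_epochs=60) |
| </pre></div> |
| </div> |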
| <dl class="method"> |
| <dt id="mxnet.optimizer.LBSGD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LBSGD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LBSGD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LBSGD.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LBSGD.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LBSGD.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.NAG"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">NAG</code><span class="sig-paren">(</span><em class="sig-param">momentum=0.0</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#NAG"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>Nesterov accelerated gradient.</p> |
| <p>This optimizer updates each weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">grad</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="p">(</span><span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">grad</span> <span class="o">+</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span><span class="p">))</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.create_state" title="mxnet.optimizer.NAG.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.create_state_multi_precision" title="mxnet.optimizer.NAG.create_state_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state_multi_precision</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight, including FP32 high precision copy if original weight is FP16.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.update" title="mxnet.optimizer.NAG.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.NAG.update_multi_precision" title="mxnet.optimizer.NAG.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(index, weight, grad, …)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>optional</em>) – Flag to control the internal precision of the optimizer. |
| False: results in using the same precision as the weights (default), |
| True: makes internal 32-bit copy of the weights and applies gradients |
| in 32-bit precision even if actual weights used in the model have lower precision. |
| Turning this on can improve convergence and accuracy when training with float16.</p></li> |
| </ul> |
| </dd> |
| </dl> |
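| <p>A minimal sketch of one NAG step via the generic protocol (not from the original reference; values are illustrative only):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>import mxnet as mx |
| opt = mx.optimizer.NAG(learning_rate=0.1, momentum=0.9) |
| weight = mx.nd.ones((2,)) |
| grad = mx.nd.full((2,), 0.1) |
| state = opt.create_state(0, weight)  # momentum buffer, initialized to zeros |
| opt.update(0, weight, grad, state)   # applies the update shown above in place |
| </pre></div> |
| </div> |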
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#NAG.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.create_state_multi_precision"> |
| <code class="sig-name descname">create_state_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#NAG.create_state_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.create_state_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight, including FP32 high |
| precision copy if original weight is FP16.</p> |
| <p>This method is provided to perform automatic mixed precision training |
| for optimizers that do not support it themselves.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#NAG.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.NAG.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#NAG.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.NAG.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state. |
| Mixed precision version.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning |
| rates and weight decays. Learning rates and weight decay |
| may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.NDabs"> |
| <code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">NDabs</code><span class="sig-paren">(</span><em class="sig-param">data=None</em>, <em class="sig-param">out=None</em>, <em class="sig-param">name=None</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="headerlink" href="#mxnet.optimizer.NDabs" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Returns element-wise absolute value of the input.</p> |
| <p>Example:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">abs</span><span class="p">([</span><span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span> <span class="o">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span> |
| </pre></div> |
| </div> |
| <p>The storage type of <code class="docutils literal notranslate"><span class="pre">abs</span></code> output depends upon the input storage type:</p> |
| <blockquote> |
| <div><ul class="simple"> |
| <li><p>abs(default) = default</p></li> |
| <li><p>abs(row_sparse) = row_sparse</p></li> |
| <li><p>abs(csr) = csr</p></li> |
| </ul> |
| </div></blockquote> |
| <p>Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L721</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>data</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The input array.</p></li> |
| <li><p><strong>out</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a><em>, </em><em>optional</em>) – The output NDArray to hold the result.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>out</strong> – The output of this function.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p><a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray">NDArray</a> or list of NDArrays</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Nadam"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Nadam</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">schedule_decay=0.004</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Nadam"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Nesterov Adam optimizer.</p> |
| <p>Much like Adam is essentially RMSprop with momentum, |
| Nadam is Adam with Nesterov momentum; see the report |
| at <a class="reference external" href="http://cs229.stanford.edu/proj2015/054_report.pdf">http://cs229.stanford.edu/proj2015/054_report.pdf</a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>beta1</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the first moment estimates.</p></li> |
| <li><p><strong>beta2</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the second moment estimates.</p></li> |
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>optional</em>) – Small value to avoid division by 0.</p></li> |
| <li><p><strong>schedule_decay</strong> (<em>float</em><em>, </em><em>optional</em>) – Exponential decay rate for the momentum schedule.</p></li> |
| </ul> |
| </dd> |
| </dl> |
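| <p>As a hedged usage sketch (assuming an already-initialized Gluon network <cite>net</cite>, which is not defined here), Nadam can be selected by its registered name:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>from mxnet import gluon |
| # 'nadam' is the registered name; the hyperparameters mirror the defaults above. |
| trainer = gluon.Trainer(net.collect_params(), 'nadam', |
|                         {'learning_rate': 0.001, 'beta1': 0.9, 'beta2': 0.999, |
|                          'epsilon': 1e-08, 'schedule_decay': 0.004}) |
| </pre></div> |
| </div> |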
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam.create_state" title="mxnet.optimizer.Nadam.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Nadam.update" title="mxnet.optimizer.Nadam.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Nadam.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Nadam.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which will be used in <cite>update</cite>. It is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Nadam.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Nadam.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Nadam.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Optimizer"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Optimizer</code><span class="sig-paren">(</span><em class="sig-param">rescale_grad=1.0</em>, <em class="sig-param">param_idx2name=None</em>, <em class="sig-param">wd=0.0</em>, <em class="sig-param">clip_gradient=None</em>, <em class="sig-param">learning_rate=None</em>, <em class="sig-param">lr_scheduler=None</em>, <em class="sig-param">sym=None</em>, <em class="sig-param">begin_num_update=0</em>, <em class="sig-param">multi_precision=False</em>, <em class="sig-param">param_dict=None</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p> |
| <p>The base class inherited by all optimizers.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>rescale_grad</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default 1.0</em>) – Multiply the gradient with <cite>rescale_grad</cite> before updating. Often
chosen to be <code class="docutils literal notranslate"><span class="pre">1.0/batch_size</span></code>.</p></li>
| <li><p><strong>param_idx2name</strong> (<em>dict from int to string</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – A dictionary that maps int index to string name.</p></li> |
| <li><p><strong>clip_gradient</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default None</em>) – Clip the gradient by projecting onto the box <code class="docutils literal notranslate"><span class="pre">[-clip_gradient,</span> <span class="pre">clip_gradient]</span></code>.</p></li> |
| <li><p><strong>learning_rate</strong> (<em>float</em>) – The initial learning rate. If None, the optimization will use the |
| learning rate from <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If not None, it will overwrite |
| the learning rate in <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code>. If None and <code class="docutils literal notranslate"><span class="pre">lr_scheduler</span></code> |
| is also None, then it will be set to 0.01 by default.</p></li> |
| <li><p><strong>lr_scheduler</strong> (<a class="reference internal" href="../lr_scheduler/index.html#mxnet.lr_scheduler.LRScheduler" title="mxnet.lr_scheduler.LRScheduler"><em>LRScheduler</em></a><em>, </em><em>optional</em><em>, </em><em>default None</em>) – The learning rate scheduler.</p></li> |
| <li><p><strong>wd</strong> (<em>float</em><em>, </em><em>optional</em><em>, </em><em>default 0.0</em>) – The weight decay (or L2 regularization) coefficient. Modifies objective |
| by adding a penalty for having large weights.</p></li> |
<li><p><strong>sym</strong> (<a class="reference internal" href="../symbol/symbol.html#mxnet.symbol.Symbol" title="mxnet.symbol.Symbol"><em>Symbol</em></a><em>, </em><em>optional</em><em>, </em><em>default None</em>) – The Symbol this optimizer applies to.</p></li>
| <li><p><strong>begin_num_update</strong> (<em>int</em><em>, </em><em>optional</em><em>, </em><em>default 0</em>) – The initial number of updates.</p></li> |
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>optional</em><em>, </em><em>default False</em>) – Flag to control the internal precision of the optimizer. |
| False: results in using the same precision as the weights (default), |
| True: makes internal 32-bit copy of the weights and applies gradients |
| in 32-bit precision even if actual weights used in the model have lower precision. |
| Turning this on can improve convergence and accuracy when training with float16.</p></li> |
| <li><p><strong>param_dict</strong> (<em>dict of int -> gluon.Parameter</em><em>, </em><em>default None</em>) – Dictionary of parameter index to gluon.Parameter, used to lookup parameter attributes |
| such as lr_mult, wd_mult, etc. param_dict shall not be deep copied.</p></li> |
</ul>
</dd>
</dl>
<p class="rubric">Properties</p>
<p><strong>learning_rate</strong> – The current learning rate of the optimizer. Given an Optimizer object
<cite>optimizer</cite>, its learning rate can be accessed as <cite>optimizer.learning_rate</cite>.</p>
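<p>As an illustrative sketch (the hyperparameter values are arbitrary), the
common constructor arguments and the <cite>learning_rate</cite> property can be exercised
on any concrete subclass, e.g. SGD:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> # rescale_grad is often set to 1.0/batch_size
>>> opt = mx.optimizer.SGD(learning_rate=0.1, wd=0.0001,
...                        rescale_grad=1.0/128, multi_precision=True)
>>> opt.learning_rate
0.1
>>> opt.set_learning_rate(0.01)
>>> opt.learning_rate
0.01
</pre></div>
</div>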
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_optimizer" title="mxnet.optimizer.Optimizer.create_optimizer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_optimizer</span></code></a>(name, **kwargs)</p></td> |
| <td><p>Instantiates an optimizer with a given name and kwargs.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_state" title="mxnet.optimizer.Optimizer.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.create_state_multi_precision" title="mxnet.optimizer.Optimizer.create_state_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state_multi_precision</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight, including FP32 high precision copy if original weight is FP16.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.register" title="mxnet.optimizer.Optimizer.register"><code class="xref py py-obj docutils literal notranslate"><span class="pre">register</span></code></a>(klass)</p></td> |
| <td><p>Registers a new optimizer.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_learning_rate" title="mxnet.optimizer.Optimizer.set_learning_rate"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_learning_rate</span></code></a>(lr)</p></td> |
| <td><p>Sets a new learning rate of the optimizer.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_lr_mult" title="mxnet.optimizer.Optimizer.set_lr_mult"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_lr_mult</span></code></a>(args_lr_mult)</p></td> |
| <td><p>Sets an individual learning rate multiplier for each parameter.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_lr_scale" title="mxnet.optimizer.Optimizer.set_lr_scale"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_lr_scale</span></code></a>(args_lrscale)</p></td> |
| <td><p>[DEPRECATED] Sets lr scale. Use set_lr_mult instead.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.set_wd_mult" title="mxnet.optimizer.Optimizer.set_wd_mult"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_wd_mult</span></code></a>(args_wd_mult)</p></td> |
| <td><p>Sets an individual weight decay multiplier for each parameter.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.update" title="mxnet.optimizer.Optimizer.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Optimizer.update_multi_precision" title="mxnet.optimizer.Optimizer.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(index, weight, grad, …)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_optimizer"> |
| <em class="property">static </em><code class="sig-name descname">create_optimizer</code><span class="sig-paren">(</span><em class="sig-param">name</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_optimizer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_optimizer" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Instantiates an optimizer with a given name and kwargs.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>We can use the alias <cite>create</cite> for <code class="docutils literal notranslate"><span class="pre">Optimizer.create_optimizer</span></code>.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>name</strong> (<em>str</em>) – Name of the optimizer. Should be the name |
| of a subclass of Optimizer. Case insensitive.</p></li> |
| <li><p><strong>kwargs</strong> (<em>dict</em>) – Parameters for the optimizer.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p>An instantiated optimizer.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer">Optimizer</a></p> |
| </dd> |
| </dl> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sgd</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'sgd'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">sgd</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.SGD'></span> |
| <span class="gp">>>> </span><span class="n">adam</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="s1">'adam'</span><span class="p">,</span> <span class="n">learning_rate</span><span class="o">=.</span><span class="mi">1</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">adam</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.Adam'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.create_state_multi_precision"> |
| <code class="sig-name descname">create_state_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.create_state_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.create_state_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight, including FP32 high |
| precision copy if original weight is FP16.</p> |
| <p>This method is provided to perform automatic mixed precision training |
| for optimizers that do not support it themselves.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.register"> |
| <em class="property">static </em><code class="sig-name descname">register</code><span class="sig-paren">(</span><em class="sig-param">klass</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.register"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.register" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Registers a new optimizer.</p> |
| <p>Once an optimizer is registered, we can create an instance of this |
| optimizer with <cite>create_optimizer</cite> later.</p> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nd">@mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">register</span> |
| <span class="gp">... </span><span class="k">class</span> <span class="nc">MyOptimizer</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="p">):</span> |
| <span class="gp">... </span> <span class="k">pass</span> |
| <span class="gp">>>> </span><span class="n">optim</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'MyOptimizer'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">optim</span><span class="p">))</span> |
| <span class="go"><class '__main__.MyOptimizer'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_learning_rate"> |
| <code class="sig-name descname">set_learning_rate</code><span class="sig-paren">(</span><em class="sig-param">lr</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_learning_rate"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_learning_rate" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets a new learning rate of the optimizer.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>lr</strong> (<em>float</em>) – The new learning rate of the optimizer.</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_lr_mult"> |
| <code class="sig-name descname">set_lr_mult</code><span class="sig-paren">(</span><em class="sig-param">args_lr_mult</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_lr_mult"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_lr_mult" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets an individual learning rate multiplier for each parameter.</p> |
<p>If you specify a learning rate multiplier for a parameter, then
the learning rate for the parameter will be set to the product of
the global learning rate <cite>self.lr</cite> and its multiplier.</p>
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>The default learning rate multiplier of a <cite>Variable</cite> |
| can be set with <cite>lr_mult</cite> argument in the constructor.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>args_lr_mult</strong> (<em>dict of str/int to float</em>) – <p>For each of its key-value entries, the learning rate multipler for the |
| parameter specified in the key will be set as the given value.</p> |
| <p>You can specify the parameter with either its name or its index. |
| If you use the name, you should pass <cite>sym</cite> in the constructor, |
| and the name you specified in the key of <cite>args_lr_mult</cite> should match |
| the name of the parameter in <cite>sym</cite>. If you use the index, it should |
| correspond to the index of the parameter used in the <cite>update</cite> method.</p> |
| <p>Specifying a parameter by its index is only supported for backward |
| compatibility, and we recommend to use the name instead.</p> |
| </p> |
| </dd> |
| </dl> |
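<p>A short sketch of typical usage ('fc2_weight' is a hypothetical parameter
name used purely for illustration):</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> opt = mx.optimizer.SGD(learning_rate=0.1)
>>> # the effective learning rate for 'fc2_weight' becomes 0.1 * 0.5
>>> opt.set_lr_mult({'fc2_weight': 0.5})
</pre></div>
</div>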
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_lr_scale"> |
| <code class="sig-name descname">set_lr_scale</code><span class="sig-paren">(</span><em class="sig-param">args_lrscale</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_lr_scale"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_lr_scale" title="Permalink to this definition">¶</a></dt> |
| <dd><p>[DEPRECATED] Sets lr scale. Use set_lr_mult instead.</p> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.set_wd_mult"> |
| <code class="sig-name descname">set_wd_mult</code><span class="sig-paren">(</span><em class="sig-param">args_wd_mult</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.set_wd_mult"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.set_wd_mult" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets an individual weight decay multiplier for each parameter.</p> |
<p>By default, if <cite>param_idx2name</cite> was provided in the
constructor, the weight decay multiplier is set to 0 for all
parameters whose names don’t end with <code class="docutils literal notranslate"><span class="pre">_weight</span></code> or
<code class="docutils literal notranslate"><span class="pre">_gamma</span></code>.</p>
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>The default weight decay multiplier for a <cite>Variable</cite> |
| can be set with its <cite>wd_mult</cite> argument in the constructor.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>args_wd_mult</strong> (<em>dict of string/int to float</em>) – <p>For each of its key-value entries, the weight decay multipler for the |
| parameter specified in the key will be set as the given value.</p> |
| <p>You can specify the parameter with either its name or its index. |
| If you use the name, you should pass <cite>sym</cite> in the constructor, |
| and the name you specified in the key of <cite>args_lr_mult</cite> should match |
| the name of the parameter in <cite>sym</cite>. If you use the index, it should |
| correspond to the index of the parameter used in the <cite>update</cite> method.</p> |
| <p>Specifying a parameter by its index is only supported for backward |
| compatibility, and we recommend to use the name instead.</p> |
| </p> |
| </dd> |
| </dl> |
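<p>A short sketch mirroring <cite>set_lr_mult</cite> ('fc2_bias' is again a hypothetical
parameter name); excluding biases from weight decay is a common use:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> opt = mx.optimizer.SGD(learning_rate=0.1, wd=0.0001)
>>> # the effective weight decay for 'fc2_bias' becomes 0.0001 * 0.0, i.e. none
>>> opt.set_wd_mult({'fc2_bias': 0.0})
</pre></div>
</div>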
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
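<p>A minimal end-to-end sketch of the low-level update API (shapes and values
are arbitrary): state is created once per weight with <cite>create_state</cite> and then
passed back on every <cite>update</cite> call:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9)
>>> weight = mx.nd.ones((2, 2))
>>> grad = mx.nd.ones((2, 2))
>>> state = opt.create_state(0, weight)   # momentum buffer for weight 0
>>> opt.update(0, weight, grad, state)    # weight is modified in place
</pre></div>
</div>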
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Optimizer.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Optimizer.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Optimizer.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state. |
| Mixed precision version.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.RMSProp"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">RMSProp</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">gamma1=0.9</em>, <em class="sig-param">gamma2=0.9</em>, <em class="sig-param">epsilon=1e-08</em>, <em class="sig-param">centered=False</em>, <em class="sig-param">clip_weights=None</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#RMSProp"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The RMSProp optimizer.</p> |
| <p>Two versions of RMSProp are implemented:</p> |
| <p>If <code class="docutils literal notranslate"><span class="pre">centered=False</span></code>, we follow |
| <a class="reference external" href="http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf">http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf</a> by |
| Tieleman & Hinton, 2012. |
| For details of the update algorithm see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.rmsprop_update" title="mxnet.ndarray.rmsprop_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">rmsprop_update</span></code></a>.</p> |
| <p>If <code class="docutils literal notranslate"><span class="pre">centered=True</span></code>, we follow <a class="reference external" href="http://arxiv.org/pdf/1308.0850v5.pdf">http://arxiv.org/pdf/1308.0850v5.pdf</a> (38)-(45) |
| by Alex Graves, 2013. |
| For details of the update algorithm see <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.rmspropalex_update" title="mxnet.ndarray.rmspropalex_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">rmspropalex_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>gamma1</strong> (<em>float</em><em>, </em><em>optional</em>) – A decay factor of moving average over past squared gradient.</p></li> |
<li><p><strong>gamma2</strong> (<em>float</em><em>, </em><em>optional</em>) – A “momentum” factor. Only used if <code class="docutils literal notranslate"><span class="pre">centered=True</span></code>.</p></li>
| <li><p><strong>epsilon</strong> (<em>float</em><em>, </em><em>optional</em>) – Small value to avoid division by 0.</p></li> |
<li><p><strong>centered</strong> (<em>bool</em><em>, </em><em>optional</em>) – <p>Flag to control which version of RMSProp to use:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>True: will use Graves's version of `RMSProp`,
False: will use Tieleman &amp; Hinton's version of `RMSProp`.
</pre></div>
</div>
| </p></li> |
| <li><p><strong>clip_weights</strong> (<em>float</em><em>, </em><em>optional</em>) – Clips weights into range <code class="docutils literal notranslate"><span class="pre">[-clip_weights,</span> <span class="pre">clip_weights]</span></code>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
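<p>A brief construction sketch (the values shown are the signature defaults)
contrasting the two variants:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> # Tieleman & Hinton's version: moving average of squared gradients only
>>> opt = mx.optimizer.RMSProp(learning_rate=0.001, gamma1=0.9, centered=False)
>>> # Graves's centered version, which additionally tracks the mean gradient
>>> opt = mx.optimizer.RMSProp(learning_rate=0.001, gamma1=0.9, gamma2=0.9,
...                            centered=True)
</pre></div>
</div>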
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp.create_state" title="mxnet.optimizer.RMSProp.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.RMSProp.update" title="mxnet.optimizer.RMSProp.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.RMSProp.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#RMSProp.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.RMSProp.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#RMSProp.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.RMSProp.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.SGD"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">SGD</code><span class="sig-paren">(</span><em class="sig-param">momentum=0.0</em>, <em class="sig-param">lazy_update=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The SGD optimizer with momentum and weight decay.</p> |
<p>If the storage type of grad is <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code> and <code class="docutils literal notranslate"><span class="pre">lazy_update</span></code> is True, <strong>lazy updates</strong> are applied by:</p>
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
| <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">],</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">])</span> |
| <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">momentum</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">*</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| </pre></div> |
| </div> |
| <p>The sparse update only updates the momentum for the weights whose row_sparse |
| gradient indices appear in the current batch, rather than updating it for all |
| indices. Compared with the original update, it can provide large |
| improvements in model training throughput for some applications. However, it |
| provides slightly different semantics than the original update, and |
| may lead to different empirical results.</p> |
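<p>As a minimal sketch of a lazy update (shapes and values are illustrative),
only the rows present in the gradient's indices are touched:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx
>>> opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9, lazy_update=True)
>>> weight = mx.nd.ones((4, 2)).tostype('row_sparse')
>>> grad = mx.nd.zeros((4, 2))
>>> grad[0] = 1.0
>>> grad = grad.tostype('row_sparse')     # only row 0 is non-zero
>>> state = opt.create_state(0, weight)   # row_sparse momentum buffer
>>> opt.update(0, weight, grad, state)    # only row 0 of weight and state change
</pre></div>
</div>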
<p>In the case when <code class="docutils literal notranslate"><span class="pre">update_on_kvstore</span></code> is set to False (either globally via the
MXNET_UPDATE_ON_KVSTORE=0 environment variable or as a parameter in
<a class="reference internal" href="../gluon/trainer.html#mxnet.gluon.Trainer" title="mxnet.gluon.Trainer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Trainer</span></code></a>), the SGD optimizer can perform an aggregated update
of parameters, which may lead to improved performance. The aggregation size
is controlled by the MXNET_OPTIMIZER_AGGREGATION_SIZE environment variable and
defaults to 4.</p>
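<p>A hedged configuration sketch (<cite>net</cite> is a hypothetical Gluon network; the
environment variable should be set before training starts):</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import os
>>> os.environ['MXNET_OPTIMIZER_AGGREGATION_SIZE'] = '8'  # default is 4
>>> trainer = mx.gluon.Trainer(net.collect_params(), 'sgd',
...                            {'learning_rate': 0.1, 'momentum': 0.9},
...                            update_on_kvstore=False)
</pre></div>
</div>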
| <p>Otherwise, <strong>standard updates</strong> are applied by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">lr</span> <span class="o">*</span> <span class="p">(</span><span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span><span class="p">)</span> |
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">rescaled_grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">state</span> |
| </pre></div> |
| </div> |
| <p>For details of the update algorithm see |
| <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.sgd_update" title="mxnet.ndarray.sgd_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_update</span></code></a> and <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.sgd_mom_update" title="mxnet.ndarray.sgd_mom_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">sgd_mom_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>lazy_update</strong> (<em>bool</em><em>, </em><em>optional</em>) – Default is True. If True, lazy updates are applied if the storage types of weight and grad are both <code class="docutils literal notranslate"><span class="pre">row_sparse</span></code>.</p></li> |
| <li><p><strong>multi_precision</strong> (<em>bool</em><em>, </em><em>optional</em>) – Flag to control the internal precision of the optimizer. |
| False: results in using the same precision as the weights (default), |
| True: makes internal 32-bit copy of the weights and applies gradients |
| in 32-bit precision even if actual weights used in the model have lower precision. |
| Turning this on can improve convergence and accuracy when training with float16.</p></li> |
| </ul> |
| </dd> |
| </dl> |
<p><strong>Methods</strong></p>
<table class="longtable docutils align-default">
<colgroup>
<col style="width: 10%" />
<col style="width: 90%" />
</colgroup>
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.create_state" title="mxnet.optimizer.SGD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td>
<td><p>Creates auxiliary state for a given weight.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.create_state_multi_precision" title="mxnet.optimizer.SGD.create_state_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state_multi_precision</span></code></a>(index, weight)</p></td>
<td><p>Creates auxiliary state for a given weight, including FP32 high precision copy if original weight is FP16.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.update" title="mxnet.optimizer.SGD.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td>
<td><p>Updates the given parameter using the corresponding gradient and state.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGD.update_multi_precision" title="mxnet.optimizer.SGD.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(index, weight, grad, …)</p></td>
<td><p>Updates the given parameter using the corresponding gradient and state.</p></td>
</tr>
</tbody>
</table>
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.create_state_multi_precision"> |
| <code class="sig-name descname">create_state_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGD.create_state_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.create_state_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight, including FP32 high |
| precision copy if original weight is FP16.</p> |
| <p>This method is provided to perform automatic mixed precision training |
| for optimizers that do not support it themselves.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGD.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGD.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGD.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGD.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state. |
| Mixed precision version.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.SGLD"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">SGLD</code><span class="sig-paren">(</span><em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGLD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>Stochastic Gradient Riemannian Langevin Dynamics.</p> |
| <p>This class implements the optimizer described in the paper <em>Stochastic Gradient |
| Riemannian Langevin Dynamics on the Probability Simplex</em>, available at |
| <a class="reference external" href="https://papers.nips.cc/paper/4883-stochastic-gradient-riemannian-langevin-dynamics-on-the-probability-simplex.pdf">https://papers.nips.cc/paper/4883-stochastic-gradient-riemannian-langevin-dynamics-on-the-probability-simplex.pdf</a>.</p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD.create_state" title="mxnet.optimizer.SGLD.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.SGLD.update" title="mxnet.optimizer.SGLD.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGLD.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGLD.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
<p>Some optimizers require additional state, e.g. momentum, in addition
| to gradients in order to update weights. This function creates state |
| for a given weight which will be used in <cite>update</cite>. This function is |
| called only once for each weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.SGLD.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#SGLD.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.SGLD.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
<li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter into the individual learning
rates and weight decays. Learning rate and weight decay multipliers
may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li>
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Signum"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Signum</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.01</em>, <em class="sig-param">momentum=0.9</em>, <em class="sig-param">wd_lh=0.0</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Signum"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Signum optimizer takes the sign of the gradient or of the momentum.</p> |
| <p>The optimizer updates the weight by:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">wd</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span><span class="o">-</span><span class="n">momentum</span><span class="p">)</span><span class="o">*</span><span class="n">rescaled_grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">wd_lh</span><span class="p">)</span> <span class="o">*</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">lr</span> <span class="o">*</span> <span class="n">sign</span><span class="p">(</span><span class="n">state</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum.create_state" title="mxnet.optimizer.Signum.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Signum.update" title="mxnet.optimizer.Signum.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <p class="rubric">References</p> |
| <p>Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli & Anima Anandkumar. (2018). |
| signSGD: Compressed Optimisation for Non-Convex Problems. In ICML’18.</p> |
| <p>See: <a class="reference external" href="https://arxiv.org/abs/1802.04434">https://arxiv.org/abs/1802.04434</a></p> |
| <p>For details of the update algorithm see |
| <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.signsgd_update" title="mxnet.ndarray.signsgd_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">signsgd_update</span></code></a> and <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.signum_update" title="mxnet.ndarray.signum_update"><code class="xref py py-class docutils literal notranslate"><span class="pre">signum_update</span></code></a>.</p> |
| <p>This optimizer accepts the following parameters in addition to those accepted |
| by <a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a>.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>momentum</strong> (<em>float</em><em>, </em><em>optional</em>) – The momentum value.</p></li> |
| <li><p><strong>wd_lh</strong> (<em>float</em><em>, </em><em>optional</em>) – The amount of decoupled weight decay regularization; see details in the original paper at <a class="reference external" href="https://arxiv.org/abs/1711.05101">https://arxiv.org/abs/1711.05101</a>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
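| <p class="rubric">Examples</p> |
| <p>An illustrative sketch (not part of the original reference) using only the parameters documented |
| above; with non-zero momentum, the buffer created by <cite>create_state</cite> drives the update.</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx |
| >>> opt = mx.optimizer.Signum(learning_rate=0.01, momentum=0.9, wd_lh=0.0) |
| >>> weight = mx.nd.ones((2, 2)) |
| >>> grad = mx.nd.ones((2, 2)) * 0.3 |
| >>> state = opt.create_state(0, weight)  # momentum buffer |
| >>> opt.update(0, weight, grad, state)   # with wd_lh=0, weight moves by -lr * sign(state) |
| </pre></div> |
| </div> |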
| <dl class="method"> |
| <dt id="mxnet.optimizer.Signum.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Signum.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which is later used in <cite>update</cite>. It is |
| called only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Signum.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Signum.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Signum.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter learning-rate and weight-decay |
| multipliers may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.LAMB"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">LAMB</code><span class="sig-paren">(</span><em class="sig-param">learning_rate=0.001</em>, <em class="sig-param">beta1=0.9</em>, <em class="sig-param">beta2=0.999</em>, <em class="sig-param">epsilon=1e-06</em>, <em class="sig-param">lower_bound=None</em>, <em class="sig-param">upper_bound=None</em>, <em class="sig-param">bias_correction=True</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LAMB"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The LAMB optimizer, a layer-wise adaptive large-batch optimizer.</p> |
| <p>See: <a class="reference external" href="https://arxiv.org/abs/1904.00962">https://arxiv.org/abs/1904.00962</a></p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.create_state" title="mxnet.optimizer.LAMB.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates auxiliary state for a given weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.update" title="mxnet.optimizer.LAMB.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.LAMB.update_multi_precision" title="mxnet.optimizer.LAMB.update_multi_precision"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update_multi_precision</span></code></a>(index, weight, grad, …)</p></td> |
| <td><p>Updates the given parameter using the corresponding gradient and state.</p></td> |
| </tr> |
| </tbody> |
| </table> |
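| <p class="rubric">Examples</p> |
| <p>A hedged sketch (not part of the original reference) using the constructor defaults shown in the |
| signature above.</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx |
| >>> opt = mx.optimizer.LAMB(learning_rate=0.001, beta1=0.9, beta2=0.999, |
| ...                         epsilon=1e-06, bias_correction=True) |
| >>> weight = mx.nd.ones((2, 2)) |
| >>> grad = mx.nd.ones((2, 2)) * 0.1 |
| >>> state = opt.create_state(0, weight)  # first and second moment estimates |
| >>> opt.update(0, weight, grad, state) |
| </pre></div> |
| </div> |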
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LAMB.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates auxiliary state for a given weight.</p> |
| <p>Some optimizers require additional state, e.g. momentum, in addition |
| to gradients in order to update weights. This function creates the state |
| for a given weight, which is later used in <cite>update</cite>. It is |
| called only once per weight.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – A unique index to identify the weight.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The weight.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>state</strong> – The state associated with the weight.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>any obj</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LAMB.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter learning-rate and weight-decay |
| multipliers may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.LAMB.update_multi_precision"> |
| <code class="sig-name descname">update_multi_precision</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#LAMB.update_multi_precision"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.LAMB.update_multi_precision" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Updates the given parameter using the corresponding gradient and state. |
| Mixed precision version.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>index</strong> (<em>int</em>) – The unique index of the parameter, used to look up its individual |
| learning rate and weight decay. Per-parameter learning-rate and weight-decay |
| multipliers may be set via <cite>set_lr_mult()</cite> and <cite>set_wd_mult()</cite>, respectively.</p></li> |
| <li><p><strong>weight</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The parameter to be updated.</p></li> |
| <li><p><strong>grad</strong> (<a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.NDArray" title="mxnet.ndarray.NDArray"><em>NDArray</em></a>) – The gradient of the objective with respect to this parameter.</p></li> |
| <li><p><strong>state</strong> (<em>any obj</em>) – The state returned by <cite>create_state()</cite>.</p></li> |
| </ul> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Test"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Test</code><span class="sig-paren">(</span><em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.Optimizer</span></code></p> |
| <p>The Test optimizer.</p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Test.create_state" title="mxnet.optimizer.Test.create_state"><code class="xref py py-obj docutils literal notranslate"><span class="pre">create_state</span></code></a>(index, weight)</p></td> |
| <td><p>Creates a state that duplicates the weight.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Test.update" title="mxnet.optimizer.Test.update"><code class="xref py py-obj docutils literal notranslate"><span class="pre">update</span></code></a>(index, weight, grad, state)</p></td> |
| <td><p>Performs w += rescale_grad * grad.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Test.create_state"> |
| <code class="sig-name descname">create_state</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test.create_state"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test.create_state" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Creates a state that duplicates the weight.</p> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Test.update"> |
| <code class="sig-name descname">update</code><span class="sig-paren">(</span><em class="sig-param">index</em>, <em class="sig-param">weight</em>, <em class="sig-param">grad</em>, <em class="sig-param">state</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Test.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Test.update" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Performs w += rescale_grad * grad.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.Updater"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">Updater</code><span class="sig-paren">(</span><em class="sig-param">optimizer</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Updater"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p> |
| <p>Updater for kvstore.</p> |
| <p><strong>Methods</strong></p> |
| <table class="longtable docutils align-default"> |
| <colgroup> |
| <col style="width: 10%" /> |
| <col style="width: 90%" /> |
| </colgroup> |
| <tbody> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.get_states" title="mxnet.optimizer.Updater.get_states"><code class="xref py py-obj docutils literal notranslate"><span class="pre">get_states</span></code></a>([dump_optimizer])</p></td> |
| <td><p>Gets updater states.</p></td> |
| </tr> |
| <tr class="row-even"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.set_states" title="mxnet.optimizer.Updater.set_states"><code class="xref py py-obj docutils literal notranslate"><span class="pre">set_states</span></code></a>(states)</p></td> |
| <td><p>Sets updater states.</p></td> |
| </tr> |
| <tr class="row-odd"><td><p><a class="reference internal" href="#mxnet.optimizer.Updater.sync_state_context" title="mxnet.optimizer.Updater.sync_state_context"><code class="xref py py-obj docutils literal notranslate"><span class="pre">sync_state_context</span></code></a>(state, context)</p></td> |
| <td><p>Syncs the state to the given context.</p></td> |
| </tr> |
| </tbody> |
| </table> |
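| <p class="rubric">Examples</p> |
| <p>A hedged sketch (not part of the original reference) showing a state round-trip; the positional |
| call order <cite>(index, grad, weight)</cite> of the updater callable is an assumption here.</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx |
| >>> updater = mx.optimizer.Updater(mx.optimizer.create('sgd', momentum=0.9)) |
| >>> weight, grad = mx.nd.ones((2,)), mx.nd.ones((2,)) |
| >>> updater(0, grad, weight)                         # assumed call order: (index, grad, weight) |
| >>> states = updater.get_states(dump_optimizer=True) # serialized states plus the optimizer |
| >>> updater.set_states(states) |
| </pre></div> |
| </div> |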
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.get_states"> |
| <code class="sig-name descname">get_states</code><span class="sig-paren">(</span><em class="sig-param">dump_optimizer=False</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Updater.get_states"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.get_states" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Gets updater states.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>dump_optimizer</strong> (<em>bool</em><em>, </em><em>default False</em>) – Whether to also save the optimizer itself. This would also save optimizer |
| information such as learning rate and weight decay schedules.</p> |
| </dd> |
| </dl> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.set_states"> |
| <code class="sig-name descname">set_states</code><span class="sig-paren">(</span><em class="sig-param">states</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Updater.set_states"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.set_states" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Sets updater states.</p> |
| </dd></dl> |
| |
| <dl class="method"> |
| <dt id="mxnet.optimizer.Updater.sync_state_context"> |
| <code class="sig-name descname">sync_state_context</code><span class="sig-paren">(</span><em class="sig-param">state</em>, <em class="sig-param">context</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#Updater.sync_state_context"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.Updater.sync_state_context" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Syncs the state to the given context.</p> |
| </dd></dl> |
| |
| </dd></dl> |
| |
| <dl class="class"> |
| <dt id="mxnet.optimizer.ccSGD"> |
| <em class="property">class </em><code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">ccSGD</code><span class="sig-paren">(</span><em class="sig-param">*args</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#ccSGD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.ccSGD" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">mxnet.optimizer.optimizer.SGD</span></code></p> |
| <p>[DEPRECATED] Same as <cite>SGD</cite>. Left here for backward compatibility.</p> |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.create"> |
| <code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">create</code><span class="sig-paren">(</span><em class="sig-param">name</em>, <em class="sig-param">**kwargs</em><span class="sig-paren">)</span><a class="headerlink" href="#mxnet.optimizer.create" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Instantiates an optimizer with a given name and kwargs.</p> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>We can use the alias <cite>create</cite> for <code class="docutils literal notranslate"><span class="pre">Optimizer.create_optimizer</span></code>.</p> |
| </div> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><ul class="simple"> |
| <li><p><strong>name</strong> (<em>str</em>) – Name of the optimizer. Should be the name |
| of a subclass of Optimizer. Case insensitive.</p></li> |
| <li><p><strong>kwargs</strong> (<em>dict</em>) – Parameters for the optimizer.</p></li> |
| </ul> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p>An instantiated optimizer.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p><a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer">Optimizer</a></p> |
| </dd> |
| </dl> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sgd</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'sgd'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">sgd</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.SGD'></span> |
| <span class="gp">>>> </span><span class="n">adam</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="s1">'adam'</span><span class="p">,</span> <span class="n">learning_rate</span><span class="o">=.</span><span class="mi">1</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">type</span><span class="p">(</span><span class="n">adam</span><span class="p">)</span> |
| <span class="go"><class 'mxnet.optimizer.Adam'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.get_updater"> |
| <code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">get_updater</code><span class="sig-paren">(</span><em class="sig-param">optimizer</em><span class="sig-paren">)</span><a class="reference internal" href="../../_modules/mxnet/optimizer/optimizer.html#get_updater"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.optimizer.get_updater" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Returns a callable updater wrapping the given optimizer, as needed by kvstore.</p> |
| <dl class="field-list simple"> |
| <dt class="field-odd">Parameters</dt> |
| <dd class="field-odd"><p><strong>optimizer</strong> (<a class="reference internal" href="#mxnet.optimizer.Optimizer" title="mxnet.optimizer.Optimizer"><em>Optimizer</em></a>) – The optimizer.</p> |
| </dd> |
| <dt class="field-even">Returns</dt> |
| <dd class="field-even"><p><strong>updater</strong> – The closure of the updater.</p> |
| </dd> |
| <dt class="field-odd">Return type</dt> |
| <dd class="field-odd"><p>function</p> |
| </dd> |
| </dl> |
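| <p class="rubric">Examples</p> |
| <p>An illustrative sketch (not part of the original reference): the returned callable applies one |
| optimizer step per key; the positional order <cite>(index, grad, weight)</cite> is an assumption.</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span>>>> import mxnet as mx |
| >>> updater = mx.optimizer.get_updater(mx.optimizer.create('sgd', learning_rate=0.1)) |
| >>> weight = mx.nd.ones((2,)) |
| >>> grad = mx.nd.ones((2,)) * 0.5 |
| >>> updater(0, grad, weight)  # updates weight in place; here weight becomes 0.95 |
| </pre></div> |
| </div> |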
| </dd></dl> |
| |
| <dl class="function"> |
| <dt id="mxnet.optimizer.register"> |
| <code class="sig-prename descclassname">mxnet.optimizer.</code><code class="sig-name descname">register</code><span class="sig-paren">(</span><em class="sig-param">klass</em><span class="sig-paren">)</span><a class="headerlink" href="#mxnet.optimizer.register" title="Permalink to this definition">¶</a></dt> |
| <dd><p>Registers a new optimizer.</p> |
| <p>Once an optimizer is registered, we can create an instance of this |
| optimizer with <cite>create_optimizer</cite> later.</p> |
| <p class="rubric">Examples</p> |
| <div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nd">@mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">register</span> |
| <span class="gp">... </span><span class="k">class</span> <span class="nc">MyOptimizer</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="p">):</span> |
| <span class="gp">... </span> <span class="k">pass</span> |
| <span class="gp">>>> </span><span class="n">optim</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">Optimizer</span><span class="o">.</span><span class="n">create_optimizer</span><span class="p">(</span><span class="s1">'MyOptimizer'</span><span class="p">)</span> |
| <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">optim</span><span class="p">))</span> |
| <span class="go"><class '__main__.MyOptimizer'></span> |
| </pre></div> |
| </div> |
| </dd></dl> |
| |
| </div> |
| |
| |
| </div> |
| <div class="side-doc-outline"> |
| <div class="side-doc-outline--content"> |
| </div> |
| </div> |
| |
| <div class="clearer"></div> |
| </div><div class="pagenation"> |
| <a id="button-prev" href="../initializer/index.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="botton" accesskey="P"> |
| <i class="pagenation-arrow-L fas fa-arrow-left fa-lg"></i> |
| <div class="pagenation-text"> |
| <span class="pagenation-direction">Previous</span> |
| <div>mxnet.initializer</div> |
| </div> |
| </a> |
| <a id="button-next" href="../lr_scheduler/index.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="botton" accesskey="N"> |
| <i class="pagenation-arrow-R fas fa-arrow-right fa-lg"></i> |
| <div class="pagenation-text"> |
| <span class="pagenation-direction">Next</span> |
| <div>mxnet.lr_scheduler</div> |
| </div> |
| </a> |
| </div> |
| <footer class="site-footer h-card"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-4"> |
| <h4 class="footer-category-title">Resources</h4> |
| <ul class="contact-list"> |
| <li><a class="u-email" href="mailto:dev@mxnet.apache.org">Dev list</a></li> |
| <li><a class="u-email" href="mailto:user@mxnet.apache.org">User mailing list</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li> |
| <li><a href="https://issues.apache.org/jira/projects/MXNET/issues">Jira Tracker</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/labels/Roadmap">Github Roadmap</a></li> |
| <li><a href="https://discuss.mxnet.io">MXNet Discuss forum</a></li> |
| <li><a href="/versions/1.6/community/contribute">Contribute To MXNet</a></li> |
| |
| </ul> |
| </div> |
| |
| <div class="col-4"><ul class="social-media-list"><li><a href="https://github.com/apache/incubator-mxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#github"></use></svg> <span class="username">apache/incubator-mxnet</span></a></li><li><a href="https://www.twitter.com/apachemxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#twitter"></use></svg> <span class="username">apachemxnet</span></a></li><li><a href="https://youtube.com/apachemxnet"><svg class="svg-icon"><use xlink:href="../../_static/minima-social-icons.svg#youtube"></use></svg> <span class="username">apachemxnet</span></a></li></ul> |
| </div> |
| |
| <div class="col-4 footer-text"> |
| <p>A flexible and efficient library for deep learning.</p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| |
| <footer class="site-footer2"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-3"> |
| <img src="../../_static/apache_incubator_logo.png" class="footer-logo col-2"> |
| </div> |
| <div class="footer-bottom-warning col-9"> |
| <p>Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), <span style="font-weight:bold">sponsored by the <i>Apache Incubator</i></span>. Incubation is required |
| of all newly accepted projects until a further review indicates that the infrastructure, |
| communications, and decision making process have stabilized in a manner consistent with other |
| successful ASF projects. While incubation status is not necessarily a reflection of the completeness |
| or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. |
| </p><p>"Copyright © 2017-2018, The Apache Software Foundation Apache MXNet, MXNet, Apache, the Apache |
| feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the |
| Apache Software Foundation."</p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| |
| </body> |
| </html> |