| <!DOCTYPE html> |
| |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8"/> |
| <meta content="IE=edge" http-equiv="X-UA-Compatible"/> |
| <meta content="width=device-width, initial-scale=1" name="viewport"/> |
| <meta content="RowSparseNDArray - NDArray for Sparse Gradient Updates" property="og:title"> |
| <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image"> |
| <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image:secure_url"> |
| <meta content="RowSparseNDArray - NDArray for Sparse Gradient Updates" property="og:description"/> |
| <title>RowSparseNDArray - NDArray for Sparse Gradient Updates — mxnet documentation</title> |
| <link crossorigin="anonymous" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" rel="stylesheet"/> |
| <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css" rel="stylesheet"/> |
| <link href="../../_static/basic.css" rel="stylesheet" type="text/css"> |
| <link href="../../_static/pygments.css" rel="stylesheet" type="text/css"> |
| <link href="../../_static/mxnet.css" rel="stylesheet" type="text/css"/> |
| <script type="text/javascript"> |
| var DOCUMENTATION_OPTIONS = { |
| URL_ROOT: '../../', |
| VERSION: '', |
| COLLAPSE_INDEX: false, |
| FILE_SUFFIX: '.html', |
| HAS_SOURCE: true, |
| SOURCELINK_SUFFIX: '.txt' |
| }; |
| </script> |
| <script src="https://code.jquery.com/jquery-1.11.1.min.js" type="text/javascript"></script> |
| <script src="../../_static/underscore.js" type="text/javascript"></script> |
| <script src="../../_static/searchtools_custom.js" type="text/javascript"></script> |
| <script src="../../_static/doctools.js" type="text/javascript"></script> |
| <script src="../../_static/selectlang.js" type="text/javascript"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script> |
| <script type="text/javascript"> jQuery(function() { Search.loadIndex("/versions/0.12.1/searchindex.js"); Search.init();}); </script> |
| <script> |
| (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ |
| (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new |
| Date();a=s.createElement(o), |
| m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) |
| })(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); |
| |
| ga('create', 'UA-96378503-1', 'auto'); |
| ga('send', 'pageview'); |
| |
| </script> |
| <link href="../../genindex.html" rel="index" title="Index"> |
| <link href="../../search.html" rel="search" title="Search"/> |
| <link href="../index.html" rel="up" title="Tutorials"/> |
| <link href="train.html" rel="next" title="Train a Linear Regression Model with Sparse Symbols"/> |
| <link href="csr.html" rel="prev" title="CSRNDArray - NDArray in Compressed Sparse Row Storage Format"/> |
| <link href="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-icon.png" rel="icon" type="image/png"/> |
| </link></link></link></meta></meta></meta></head> |
| <body background="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-background-compressed.jpeg" role="document"> |
| <div class="content-block"><div class="navbar navbar-fixed-top"> |
| <div class="container" id="navContainer"> |
| <div class="innder" id="header-inner"> |
| <h1 id="logo-wrap"> |
| <a href="../../" id="logo"><img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet_logo.png"/></a> |
| </h1> |
| <nav class="nav-bar" id="main-nav"> |
| <a class="main-nav-link" href="/versions/0.12.1/install/index.html">Install</a> |
| <span id="dropdown-menu-position-anchor"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Gluon <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/0.12.1/tutorials/gluon/gluon.html">About</a></li> |
| <li><a class="main-nav-link" href="https://www.d2l.ai/">Dive into Deep Learning</a></li> |
| <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li> |
| <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">API <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/python/index.html">Python</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/c++/index.html">C++</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/julia/index.html">Julia</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/perl/index.html">Perl</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/r/index.html">R</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/scala/index.html">Scala</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-docs"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Docs <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-docs"> |
| <li><a class="main-nav-link" href="/versions/0.12.1/faq/index.html">FAQ</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/tutorials/index.html">Tutorials</a> |
| <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/0.12.1/example">Examples</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/architecture/index.html">Architecture</a></li> |
| <li><a class="main-nav-link" href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/model_zoo/index.html">Model Zoo</a></li> |
| <li><a class="main-nav-link" href="https://github.com/onnx/onnx-mxnet">ONNX</a></li> |
| </li></ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-community"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Community <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-community"> |
| <li><a class="main-nav-link" href="http://discuss.mxnet.io">Forum</a></li> |
| <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/0.12.1">Github</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/community/contribute.html">Contribute</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/community/powered_by.html">Powered By</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-version" style="position: relative"><a href="#" class="main-nav-link dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="true">0.12.1<span class="caret"></span></a><ul id="package-dropdown-menu" class="dropdown-menu"><li><a href="/">master</a></li><li><a href="/versions/1.7.0/">1.7.0</a></li><li><a href=/versions/1.6.0/>1.6.0</a></li><li><a href=/versions/1.5.0/index.html>1.5.0</a></li><li><a href=/versions/1.4.1/index.html>1.4.1</a></li><li><a href=/versions/1.3.1/index.html>1.3.1</a></li><li><a href=/versions/1.2.1/index.html>1.2.1</a></li><li><a href=/versions/1.1.0/index.html>1.1.0</a></li><li><a href=/versions/1.0.0/index.html>1.0.0</a></li><li><a href=/versions/0.12.1/index.html>0.12.1</a></li><li><a href=/versions/0.11.0/index.html>0.11.0</a></li></ul></span></nav> |
| <script> function getRootPath(){ return "../../" } </script> |
| <div class="burgerIcon dropdown"> |
| <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button">☰</a> |
| <ul class="dropdown-menu" id="burgerMenu"> |
| <li><a href="/versions/0.12.1/install/index.html">Install</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/tutorials/index.html">Tutorials</a></li> |
| <li class="dropdown-submenu dropdown"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Gluon</a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/0.12.1/tutorials/gluon/gluon.html">About</a></li> |
| <li><a class="main-nav-link" href="http://gluon.mxnet.io">The Straight Dope (Tutorials)</a></li> |
| <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li> |
| <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">API</a> |
| <ul class="dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/python/index.html">Python</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/c++/index.html">C++</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/julia/index.html">Julia</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/perl/index.html">Perl</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/r/index.html">R</a></li> |
| <li><a class="main-nav-link" href="/versions/0.12.1/api/scala/index.html">Scala</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Docs</a> |
| <ul class="dropdown-menu"> |
| <li><a href="/versions/0.12.1/faq/index.html" tabindex="-1">FAQ</a></li> |
| <li><a href="/versions/0.12.1/tutorials/index.html" tabindex="-1">Tutorials</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/tree/0.12.1/example" tabindex="-1">Examples</a></li> |
| <li><a href="/versions/0.12.1/architecture/index.html" tabindex="-1">Architecture</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home" tabindex="-1">Developer Wiki</a></li> |
| <li><a href="/versions/0.12.1/model_zoo/index.html" tabindex="-1">Gluon Model Zoo</a></li> |
| <li><a href="https://github.com/onnx/onnx-mxnet" tabindex="-1">ONNX</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu dropdown"> |
| <a aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" role="button" tabindex="-1">Community</a> |
| <ul class="dropdown-menu"> |
| <li><a href="http://discuss.mxnet.io" tabindex="-1">Forum</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/tree/0.12.1" tabindex="-1">Github</a></li> |
| <li><a href="/versions/0.12.1/community/contribute.html" tabindex="-1">Contribute</a></li> |
| <li><a href="/versions/0.12.1/community/powered_by.html" tabindex="-1">Powered By</a></li> |
| </ul> |
| </li> |
| <li id="dropdown-menu-position-anchor-version-mobile" class="dropdown-submenu" style="position: relative"><a href="#" tabindex="-1">0.12.1</a><ul class="dropdown-menu"><li><a tabindex="-1" href=/>master</a></li><li><a tabindex="-1" href=/versions/1.6.0/>1.6.0</a></li><li><a tabindex="-1" href=/versions/1.5.0/index.html>1.5.0</a></li><li><a tabindex="-1" href=/versions/1.4.1/index.html>1.4.1</a></li><li><a tabindex="-1" href=/versions/1.3.1/index.html>1.3.1</a></li><li><a tabindex="-1" href=/versions/1.2.1/index.html>1.2.1</a></li><li><a tabindex="-1" href=/versions/1.1.0/index.html>1.1.0</a></li><li><a tabindex="-1" href=/versions/1.0.0/index.html>1.0.0</a></li><li><a tabindex="-1" href=/versions/0.12.1/index.html>0.12.1</a></li><li><a tabindex="-1" href=/versions/0.11.0/index.html>0.11.0</a></li></ul></li></ul> |
| </div> |
| <div class="plusIcon dropdown"> |
| <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button"><span aria-hidden="true" class="glyphicon glyphicon-plus"></span></a> |
| <ul class="dropdown-menu dropdown-menu-right" id="plusMenu"></ul> |
| </div> |
| <div id="search-input-wrap"> |
| <form action="../../search.html" autocomplete="off" class="" method="get" role="search"> |
| <div class="form-group inner-addon left-addon"> |
| <i class="glyphicon glyphicon-search"></i> |
| <input class="form-control" name="q" placeholder="Search" type="text"/> |
| </div> |
| <input name="check_keywords" type="hidden" value="yes"> |
| <input name="area" type="hidden" value="default"/> |
| </input></form> |
| <div id="search-preview"></div> |
| </div> |
| <div id="searchIcon"> |
| <span aria-hidden="true" class="glyphicon glyphicon-search"></span> |
| </div> |
| </div> |
| </div> |
| </div> |
| <script type="text/javascript"> |
| $('body').css('background', 'white'); |
| </script> |
| <div class="container"> |
| <div class="row"> |
| <div aria-label="main navigation" class="sphinxsidebar leftsidebar" role="navigation"> |
| <div class="sphinxsidebarwrapper"> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/python/index.html">Python Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/r/index.html">R Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/julia/index.html">Julia Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/c++/index.html">C++ Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/scala/index.html">Scala Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/perl/index.html">Perl Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../faq/index.html">HowTo Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../architecture/index.html">System Documents</a></li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Tutorials</a><ul class="current"> |
| <li class="toctree-l2 current"><a class="reference internal" href="../index.html#python">Python</a><ul class="current"> |
| <li class="toctree-l3"><a class="reference internal" href="../index.html#basic">Basic</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="../index.html#training-and-inference">Training and Inference</a></li> |
| <li class="toctree-l3 current"><a class="reference internal" href="../index.html#sparse-ndarray">Sparse NDArray</a><ul class="current"> |
| <li class="toctree-l4"><a class="reference internal" href="csr.html">CSRNDArray - NDArray in Compressed Sparse Row Storage Format</a></li> |
| <li class="toctree-l4 current"><a class="current reference internal" href="#">RowSparseNDArray - NDArray for Sparse Gradient Updates</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="train.html">Train a Linear Regression Model with Sparse Symbols</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../index.html#contributing-tutorials">Contributing Tutorials</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l1"><a class="reference internal" href="../../community/index.html">Community</a></li> |
| </ul> |
| </div> |
| </div> |
| <div class="content"> |
| <div class="page-tracker"></div> |
| <div class="section" id="rowsparsendarray-ndarray-for-sparse-gradient-updates"> |
| <span id="rowsparsendarray-ndarray-for-sparse-gradient-updates"></span><h1>RowSparseNDArray - NDArray for Sparse Gradient Updates<a class="headerlink" href="#rowsparsendarray-ndarray-for-sparse-gradient-updates" title="Permalink to this headline">¶</a></h1> |
| <div class="section" id="motivation"> |
| <span id="motivation"></span><h2>Motivation<a class="headerlink" href="#motivation" title="Permalink to this headline">¶</a></h2> |
| <p>Many real world datasets deal with high dimensional sparse feature vectors. When learning |
| the weights of models with sparse datasets, the derived gradients of the weights could be sparse.</p> |
| <p>Let’s say we perform a matrix multiplication of <code class="docutils literal"><span class="pre">X</span></code> and <code class="docutils literal"><span class="pre">W</span></code>, where <code class="docutils literal"><span class="pre">X</span></code> is a 1x2 matrix, and <code class="docutils literal"><span class="pre">W</span></code> is a 2x3 matrix. Let <code class="docutils literal"><span class="pre">Y</span></code> be the product of the two matrices:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">mxnet</span> <span class="kn">as</span> <span class="nn">mx</span> |
| <span class="n">X</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">]])</span> |
| <span class="n">W</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span><span class="mi">7</span><span class="p">,</span><span class="mi">8</span><span class="p">]])</span> |
| <span class="n">Y</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">W</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'X'</span><span class="p">:</span> <span class="n">X</span><span class="p">,</span> <span class="s1">'W'</span><span class="p">:</span> <span class="n">W</span><span class="p">,</span> <span class="s1">'Y'</span><span class="p">:</span> <span class="n">Y</span><span class="p">}</span> |
| </pre></div> |
| </div> |
| <p>As you can see,</p> |
| <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Y</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">*</span> <span class="mi">3</span> <span class="o">+</span> <span class="mi">0</span> <span class="o">*</span> <span class="mi">6</span> <span class="o">=</span> <span class="mi">3</span> |
| <span class="n">Y</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">+</span> <span class="mi">0</span> <span class="o">*</span> <span class="mi">7</span> <span class="o">=</span> <span class="mi">4</span> |
| <span class="n">Y</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">2</span><span class="p">]</span> <span class="o">+</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="n">W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">*</span> <span class="mi">5</span> <span class="o">+</span> <span class="mi">0</span> <span class="o">*</span> <span class="mi">8</span> <span class="o">=</span> <span class="mi">5</span> |
| </pre></div> |
| </div> |
| <p>What about dY / dW, the gradient for <code class="docutils literal"><span class="pre">W</span></code>? Let’s call it <code class="docutils literal"><span class="pre">grad_W</span></code>. To start with, the shape of <code class="docutils literal"><span class="pre">grad_W</span></code> is the same as that of <code class="docutils literal"><span class="pre">W</span></code> as we are taking the derivatives with respect to <code class="docutils literal"><span class="pre">W</span></code>, which is 2x3. Then we calculate each entry in <code class="docutils literal"><span class="pre">grad_W</span></code> as follows:</p> |
| <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">grad_W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> |
| <span class="n">grad_W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> |
| <span class="n">grad_W</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span> |
| <span class="n">grad_W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> |
| <span class="n">grad_W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> |
| <span class="n">grad_W</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> |
| </pre></div> |
| </div> |
| <p>As a matter of fact, you can calculate <code class="docutils literal"><span class="pre">grad_W</span></code> by multiplying the transpose of <code class="docutils literal"><span class="pre">X</span></code> with a matrix of ones:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">grad_W</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones_like</span><span class="p">(</span><span class="n">Y</span><span class="p">),</span> <span class="n">transpose_a</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> |
| <span class="n">grad_W</span> |
| </pre></div> |
| </div> |
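| <p>As a cross-check, the sketch below recomputes <code class="docutils literal"><span class="pre">Y</span></code> and <code class="docutils literal"><span class="pre">grad_W</span></code> in plain Python (nested lists rather than MXNet NDArrays, so it runs without the library) and compares the result with a finite-difference estimate. It assumes that dY / dW here means the gradient of the sum of the entries of <code class="docutils literal"><span class="pre">Y</span></code>, which is what multiplying by a matrix of ones computes. The helper names <code class="docutils literal"><span class="pre">matmul</span></code>, <code class="docutils literal"><span class="pre">transpose</span></code> and <code class="docutils literal"><span class="pre">total</span></code> are illustrative and not part of MXNet.</p> |

```python
# Plain-Python sketch (not MXNet): verify grad_W = transpose(X) . ones_like(Y)
# against a finite-difference estimate of d(sum(Y)) / dW.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def total(m):
    """Sum of all entries of a matrix."""
    return sum(sum(row) for row in m)

X = [[1.0, 0.0]]                        # shape 1x2
W = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]  # shape 2x3
Y = matmul(X, W)                        # shape 1x3

# Analytic gradient: transpose of X times a matrix of ones shaped like Y.
ones = [[1.0] * len(Y[0]) for _ in Y]
grad_W = matmul(transpose(X), ones)     # shape 2x3, same as W

# Numeric gradient: bump one entry of W at a time and watch sum(Y) change.
eps = 1e-3
numeric = [[0.0] * len(W[0]) for _ in W]
for r in range(len(W)):
    for c in range(len(W[0])):
        bumped = [row[:] for row in W]
        bumped[r][c] += eps
        numeric[r][c] = (total(matmul(X, bumped)) - total(Y)) / eps

print(Y)
print(grad_W)
```

| <p>Row 1 of the numeric estimate stays at zero because every entry in that row of <code class="docutils literal"><span class="pre">W</span></code> is multiplied by <code class="docutils literal"><span class="pre">X[0][1]</span></code>, which is 0.</p> |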
| <p>As you can see, row 0 of <code class="docutils literal"><span class="pre">grad_W</span></code> contains non-zero values while row 1 of <code class="docutils literal"><span class="pre">grad_W</span></code> does not. Why did that happen? |
| If you look at how <code class="docutils literal"><span class="pre">grad_W</span></code> is calculated, notice that since column 1 of <code class="docutils literal"><span class="pre">X</span></code> is filled with zeros, row 1 of <code class="docutils literal"><span class="pre">grad_W</span></code> is filled with zeros too.</p> |
| <p>In the real world, the gradients of parameters that interact with sparse inputs usually contain many row slices that are entirely zero. Storing and manipulating such matrices in the default dense structure wastes memory and computation on the zeros. More importantly, many gradient-based optimization methods such as SGD, <a class="reference external" href="https://stanford.edu/~jduchi/projects/DuchiHaSi10_colt.pdf">AdaGrad</a> and <a class="reference external" href="https://arxiv.org/pdf/1412.6980.pdf">Adam</a> |
| take advantage of sparse gradients and prove to be efficient and effective. |
| <strong>In MXNet, the <code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> stores the matrix in <code class="docutils literal"><span class="pre">row</span> <span class="pre">sparse</span></code> format, which is designed for arrays of which most row slices are all zeros.</strong> |
| In this tutorial, we will describe what the row sparse format is and how to use RowSparseNDArray for sparse gradient updates in MXNet.</p> |
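| <p>To make the benefit concrete, here is a minimal plain-Python sketch of a plain SGD step (no momentum or weight decay, which is an assumption of this sketch) that visits only the rows listed in a row sparse gradient. The function name <code class="docutils literal"><span class="pre">sgd_update_row_sparse</span></code> and its parameters are hypothetical and are not MXNet APIs.</p> |

```python
# Plain-Python sketch (hypothetical helper, not an MXNet API): an SGD step
# that only touches the rows for which the gradient has non-zero entries.

def sgd_update_row_sparse(weight, grad_indices, grad_data, lr):
    """Apply weight[row] -= lr * grad_row for each row listed in grad_indices."""
    for row, grad_row in zip(grad_indices, grad_data):
        weight[row] = [w - lr * g for w, g in zip(weight[row], grad_row)]
    return weight

weight = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
grad_indices = [0]                # only row 0 of the gradient is non-zero
grad_data = [[1.0, 1.0, 1.0]]     # the non-zero row slice itself

updated = sgd_update_row_sparse(weight, grad_indices, grad_data, lr=0.5)
print(updated)  # [[2.5, 3.5, 4.5], [6.0, 7.0, 8.0]]
```

| <p>The work done is proportional to the number of non-zero row slices rather than to the total number of rows in the weight matrix.</p> |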
| </div> |
| <div class="section" id="prerequisites"> |
| <span id="prerequisites"></span><h2>Prerequisites<a class="headerlink" href="#prerequisites" title="Permalink to this headline">¶</a></h2> |
| <p>To complete this tutorial, we need:</p> |
| <ul> |
| <li><p class="first">MXNet. See the instructions for your operating system in <a class="reference external" href="/versions/0.12.1/install/index.html">Setup and Installation</a></p> |
| </li> |
| <li><p class="first"><a class="reference external" href="http://jupyter.org/">Jupyter</a></p> |
| <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">pip</span> <span class="n">install</span> <span class="n">jupyter</span> |
| </pre></div> |
| </div> |
| </li> |
| <li><p class="first">Basic knowledge of NDArray in MXNet. See the detailed tutorial for NDArray in <a class="reference external" href="/versions/0.12.1/tutorials/basic/ndarray.html">NDArray - Imperative tensor operations on CPU/GPU</a></p> |
| </li> |
| <li><p class="first">Understanding of <a class="reference external" href="http://gluon.mxnet.io/chapter01_crashcourse/autograd.html">automatic differentiation with autograd</a></p> |
| </li> |
| <li><p class="first">GPUs - A section of this tutorial uses GPUs. If you don’t have GPUs on your |
| machine, simply set the variable <code class="docutils literal"><span class="pre">gpu_device</span></code> (set in the GPUs section of this |
| tutorial) to <code class="docutils literal"><span class="pre">mx.cpu()</span></code></p> |
| </li> |
| </ul> |
| </div> |
| <div class="section" id="row-sparse-format"> |
| <span id="row-sparse-format"></span><h2>Row Sparse Format<a class="headerlink" href="#row-sparse-format" title="Permalink to this headline">¶</a></h2> |
| <p>A RowSparseNDArray represents a multidimensional NDArray using two separate arrays: |
| <code class="docutils literal"><span class="pre">data</span></code> and <code class="docutils literal"><span class="pre">indices</span></code>.</p> |
| <ul class="simple"> |
| <li>data: an NDArray of any dtype with shape <code class="docutils literal"><span class="pre">[D0,</span> <span class="pre">D1,</span> <span class="pre">...,</span> <span class="pre">Dn]</span></code>.</li> |
| <li>indices: a 1D int64 NDArray with shape <code class="docutils literal"><span class="pre">[D0]</span></code> with values sorted in ascending order.</li> |
| </ul> |
| <p>The <code class="docutils literal"><span class="pre">indices</span></code> array stores the indices of the row slices with non-zeros, |
| while the values are stored in the <code class="docutils literal"><span class="pre">data</span></code> array. The corresponding NDArray <code class="docutils literal"><span class="pre">dense</span></code> represented by the RowSparseNDArray <code class="docutils literal"><span class="pre">rsp</span></code> has</p> |
| <p><code class="docutils literal"><span class="pre">dense[rsp.indices[i],</span> <span class="pre">:,</span> <span class="pre">:,</span> <span class="pre">:,</span> <span class="pre">...]</span> <span class="pre">=</span> <span class="pre">rsp.data[i,</span> <span class="pre">:,</span> <span class="pre">:,</span> <span class="pre">:,</span> <span class="pre">...]</span></code></p> |
| <p>A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape [LARGE0, D1, ..., Dn] where LARGE0 &gt;&gt; D0 and most row slices are zeros.</p> |
| <p>Given this two-dimensional matrix:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="p">[[</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> |
| <span class="p">[</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">],</span> |
| <span class="p">[</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]]</span> |
| </pre></div> |
| </div> |
| <p>The row sparse representation would be:</p> |
| <ul class="simple"> |
| <li><code class="docutils literal"><span class="pre">data</span></code> array holds all the non-zero row slices of the array.</li> |
| <li><code class="docutils literal"><span class="pre">indices</span></code> array stores the row index for each row slice with non-zero elements.</li> |
| </ul> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">]]</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">]</span> |
| </pre></div> |
| </div> |
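| <p>The relationship between <code class="docutils literal"><span class="pre">data</span></code>, <code class="docutils literal"><span class="pre">indices</span></code> and the dense array can be sketched in plain NumPy. This is only an illustration of the format, not MXNet code; the helper name <code class="docutils literal"><span class="pre">rsp_to_dense</span></code> and its <code class="docutils literal"><span class="pre">num_rows</span></code> parameter are made up for this example.</p> |

```python
import numpy as np

def rsp_to_dense(data, indices, num_rows):
    """Expand a (data, indices) pair into a dense array (hypothetical helper)."""
    data = np.asarray(data)
    indices = np.asarray(indices, dtype=np.int64)
    # Start from all zeros; trailing dimensions come from the row slices.
    dense = np.zeros((num_rows,) + data.shape[1:], dtype=data.dtype)
    # dense[indices[i], ...] = data[i, ...]
    dense[indices] = data
    return dense

dense = rsp_to_dense([[1, 2, 3], [4, 0, 5]], [0, 2], num_rows=5)
print(dense)
# [[1 2 3]
#  [0 0 0]
#  [4 0 5]
#  [0 0 0]
#  [0 0 0]]
```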
| <p><code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> supports multidimensional arrays. Given this 3D tensor:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="p">[[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]],</span> |
| |
| <span class="p">[[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]],</span> |
| |
| <span class="p">[[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> |
| <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]]]</span> |
| </pre></div> |
| </div> |
| <p>The row sparse representation would be (with <code class="docutils literal"><span class="pre">data</span></code> and <code class="docutils literal"><span class="pre">indices</span></code> defined the same as above):</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="p">[[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]],</span> <span class="p">[[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]]]</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span> |
| </pre></div> |
| </div> |
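| <p>Going the other way, the non-zero row slices of a dense tensor can be extracted with a few lines of NumPy. The sketch below assumes a row slice is kept whenever any of its elements is non-zero; <code class="docutils literal"><span class="pre">dense_to_rsp</span></code> is a hypothetical helper, not part of MXNet.</p> |

```python
import numpy as np

def dense_to_rsp(dense):
    """Return (data, indices) for the non-zero row slices (hypothetical helper)."""
    dense = np.asarray(dense)
    # Flatten every row slice so that a single any() covers all trailing dims.
    flat = dense.reshape(dense.shape[0], -1)
    indices = np.flatnonzero(flat.any(axis=1)).astype(np.int64)
    return dense[indices], indices

tensor = np.array([[[1, 0], [0, 2], [3, 4]],
                   [[5, 0], [6, 0], [0, 0]],
                   [[0, 0], [0, 0], [0, 0]]])
data, indices = dense_to_rsp(tensor)
print(indices)      # [0 1]
print(data.shape)   # (2, 3, 2)
```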
| <p><code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> is a subclass of <code class="docutils literal"><span class="pre">NDArray</span></code>. If you query <strong>stype</strong> of a RowSparseNDArray, |
| the value will be <strong>“row_sparse”</strong>.</p> |
| </div> |
| <div class="section" id="array-creation"> |
| <span id="array-creation"></span><h2>Array Creation<a class="headerlink" href="#array-creation" title="Permalink to this headline">¶</a></h2> |
| <p>You can create a <code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> with data and indices by using the <code class="docutils literal"><span class="pre">row_sparse_array</span></code> function:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">mxnet</span> <span class="kn">as</span> <span class="nn">mx</span> |
| <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> |
| <span class="c1"># Create a RowSparseNDArray with python lists</span> |
| <span class="n">shape</span> <span class="o">=</span> <span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> |
| <span class="n">data_list</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]]</span> |
| <span class="n">indices_list</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">row_sparse_array</span><span class="p">((</span><span class="n">data_list</span><span class="p">,</span> <span class="n">indices_list</span><span class="p">),</span> <span class="n">shape</span><span class="o">=</span><span class="n">shape</span><span class="p">)</span> |
| <span class="c1"># Create a RowSparseNDArray with numpy arrays</span> |
| <span class="n">data_np</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]])</span> |
| <span class="n">indices_np</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span> |
| <span class="n">b</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">row_sparse_array</span><span class="p">((</span><span class="n">data_np</span><span class="p">,</span> <span class="n">indices_np</span><span class="p">),</span> <span class="n">shape</span><span class="o">=</span><span class="n">shape</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'a'</span><span class="p">:</span><span class="n">a</span><span class="p">,</span> <span class="s1">'b'</span><span class="p">:</span><span class="n">b</span><span class="p">}</span> |
| </pre></div> |
| </div> |
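| <p>Before handing <code class="docutils literal"><span class="pre">data</span></code>, <code class="docutils literal"><span class="pre">indices</span></code> and <code class="docutils literal"><span class="pre">shape</span></code> to <code class="docutils literal"><span class="pre">row_sparse_array</span></code>, it can help to check that they agree with the format described above. The NumPy-only sketch below is one possible check; the function name is hypothetical, the no-duplicates rule is an assumption of this example, and MXNet's own validation may differ.</p> |

```python
import numpy as np

def check_row_sparse_inputs(data, indices, shape):
    """Sanity-check inputs against the row sparse format (hypothetical helper)."""
    data = np.asarray(data)
    indices = np.asarray(indices, dtype=np.int64)
    if indices.ndim != 1:
        raise ValueError("indices must be 1D")
    if data.shape[0] != indices.shape[0]:
        raise ValueError("data and indices must have the same length D0")
    if data.shape[1:] != tuple(shape[1:]):
        raise ValueError("trailing dimensions of data must match shape[1:]")
    if indices.size and (indices.min() < 0 or indices.max() >= shape[0]):
        raise ValueError("indices must lie in [0, shape[0])")
    # Assumption for this sketch: ascending order with no duplicate rows.
    if np.any(np.diff(indices) <= 0):
        raise ValueError("indices must be sorted in ascending order")
    return True

ok = check_row_sparse_inputs([[1, 2], [3, 4]], [1, 4], (6, 2))
```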
| </div> |
| <div class="section" id="function-overview"> |
| <span id="function-overview"></span><h2>Function Overview<a class="headerlink" href="#function-overview" title="Permalink to this headline">¶</a></h2> |
| <p>As with <code class="docutils literal"><span class="pre">CSRNDArray</span></code>, there are several functions that behave the same way with <code class="docutils literal"><span class="pre">RowSparseNDArray</span></code>. In the code blocks below you can try out these common functions:</p> |
| <ul class="simple"> |
| <li><strong>.dtype</strong> - to set the data type</li> |
| <li><strong>.asnumpy</strong> - to cast as a numpy array for inspecting it</li> |
| <li><strong>.data</strong> - to access the data array</li> |
| <li><strong>.indices</strong> - to access the indices array</li> |
| <li><strong>.tostype</strong> - to set the storage type</li> |
| <li><strong>.cast_storage</strong> - to convert the storage type</li> |
| <li><strong>.copy</strong> - to copy the array</li> |
| <li><strong>.copyto</strong> - to deep copy to an existing array</li> |
| </ul> |
| </div> |
| <div class="section" id="setting-type"> |
| <span id="setting-type"></span><h2>Setting Type<a class="headerlink" href="#setting-type" title="Permalink to this headline">¶</a></h2> |
| <p>You can create a <code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> from another specifying the element data type with the option <code class="docutils literal"><span class="pre">dtype</span></code>, which accepts a numpy type. By default, <code class="docutils literal"><span class="pre">float32</span></code> is used.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Float32 is used by default</span> |
| <span class="n">c</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> |
| <span class="c1"># Create a 16-bit float array</span> |
| <span class="n">d</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float16</span><span class="p">)</span> |
| <span class="p">(</span><span class="n">c</span><span class="o">.</span><span class="n">dtype</span><span class="p">,</span> <span class="n">d</span><span class="o">.</span><span class="n">dtype</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="inspecting-arrays"> |
| <span id="inspecting-arrays"></span><h2>Inspecting Arrays<a class="headerlink" href="#inspecting-arrays" title="Permalink to this headline">¶</a></h2> |
| <p>As with <code class="docutils literal"><span class="pre">CSRNDArray</span></code>, you can inspect the contents of a <code class="docutils literal"><span class="pre">RowSparseNDArray</span></code> by filling |
| its contents into a dense <code class="docutils literal"><span class="pre">numpy.ndarray</span></code> using the <code class="docutils literal"><span class="pre">asnumpy</span></code> function.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">a</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <p>You can inspect the internal storage of a RowSparseNDArray by accessing attributes such as <code class="docutils literal"><span class="pre">indices</span></code> and <code class="docutils literal"><span class="pre">data</span></code>:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Access data array</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">data</span> |
| <span class="c1"># Access indices array</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">indices</span> |
| <span class="p">{</span><span class="s1">'a.stype'</span><span class="p">:</span> <span class="n">a</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'data'</span><span class="p">:</span><span class="n">data</span><span class="p">,</span> <span class="s1">'indices'</span><span class="p">:</span><span class="n">indices</span><span class="p">}</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="storage-type-conversion"> |
| <span id="storage-type-conversion"></span><h2>Storage Type Conversion<a class="headerlink" href="#storage-type-conversion" title="Permalink to this headline">¶</a></h2> |
| <p>You can convert an NDArray to a RowSparseNDArray and vice versa by using the <code class="docutils literal"><span class="pre">tostype</span></code> function:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Create a dense NDArray</span> |
| <span class="n">ones</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="c1"># Cast the storage type from `default` to `row_sparse`</span> |
| <span class="n">rsp</span> <span class="o">=</span> <span class="n">ones</span><span class="o">.</span><span class="n">tostype</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">)</span> |
| <span class="c1"># Cast the storage type from `row_sparse` to `default`</span> |
| <span class="n">dense</span> <span class="o">=</span> <span class="n">rsp</span><span class="o">.</span><span class="n">tostype</span><span class="p">(</span><span class="s1">'default'</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'rsp'</span><span class="p">:</span><span class="n">rsp</span><span class="p">,</span> <span class="s1">'dense'</span><span class="p">:</span><span class="n">dense</span><span class="p">}</span> |
| </pre></div> |
| </div> |
| <p>You can also convert the storage type by using the <code class="docutils literal"><span class="pre">cast_storage</span></code> operator:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Create a dense NDArray</span> |
| <span class="n">ones</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="c1"># Cast the storage type to `row_sparse`</span> |
| <span class="n">rsp</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">cast_storage</span><span class="p">(</span><span class="n">ones</span><span class="p">,</span> <span class="s1">'row_sparse'</span><span class="p">)</span> |
| <span class="c1"># Cast the storage type to `default`</span> |
| <span class="n">dense</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">cast_storage</span><span class="p">(</span><span class="n">rsp</span><span class="p">,</span> <span class="s1">'default'</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'rsp'</span><span class="p">:</span><span class="n">rsp</span><span class="p">,</span> <span class="s1">'dense'</span><span class="p">:</span><span class="n">dense</span><span class="p">}</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="copies"> |
| <span id="copies"></span><h2>Copies<a class="headerlink" href="#copies" title="Permalink to this headline">¶</a></h2> |
| <p>The <code class="docutils literal"><span class="pre">copy</span></code> method makes a deep copy of the array and its data and returns a new array. |
| You can also use the <code class="docutils literal"><span class="pre">copyto</span></code> method or the slice operator <code class="docutils literal"><span class="pre">[]</span></code> to deep copy into an existing array.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">a</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span><span class="o">.</span><span class="n">tostype</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">)</span> |
| <span class="n">b</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span> |
| <span class="n">c</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="n">c</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">a</span> |
| <span class="n">d</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="n">a</span><span class="o">.</span><span class="n">copyto</span><span class="p">(</span><span class="n">d</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'b is a'</span><span class="p">:</span> <span class="n">b</span> <span class="ow">is</span> <span class="n">a</span><span class="p">,</span> <span class="s1">'b.asnumpy()'</span><span class="p">:</span><span class="n">b</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s1">'c.asnumpy()'</span><span class="p">:</span><span class="n">c</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s1">'d.asnumpy()'</span><span class="p">:</span><span class="n">d</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()}</span> |
| </pre></div> |
| </div> |
| <p>If the storage types of the source array and the destination array do not match, |
| the storage type of the destination array will not change when copying with <code class="docutils literal"><span class="pre">copyto</span></code> or the slice operator <code class="docutils literal"><span class="pre">[]</span></code>. The source array will be temporarily converted to the destination's storage type before the copy.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">e</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="n">f</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span> |
| <span class="n">g</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="n">e</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span> |
| <span class="n">e</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">g</span> |
| <span class="n">g</span><span class="o">.</span><span class="n">copyto</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'e.stype'</span><span class="p">:</span><span class="n">e</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'f.stype'</span><span class="p">:</span><span class="n">f</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'g.stype'</span><span class="p">:</span><span class="n">g</span><span class="o">.</span><span class="n">stype</span><span class="p">}</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="retain-row-slices"> |
| <span id="retain-row-slices"></span><h2>Retain Row Slices<a class="headerlink" href="#retain-row-slices" title="Permalink to this headline">¶</a></h2> |
| <p>You can retain a subset of row slices from a RowSparseNDArray specified by their row indices.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]]</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span> |
| <span class="n">rsp</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">row_sparse_array</span><span class="p">((</span><span class="n">data</span><span class="p">,</span> <span class="n">indices</span><span class="p">),</span> <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span> |
| <span class="c1"># Retain row 0 and row 1</span> |
| <span class="n">rsp_retained</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">retain</span><span class="p">(</span><span class="n">rsp</span><span class="p">,</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">]))</span> |
| <span class="p">{</span><span class="s1">'rsp.asnumpy()'</span><span class="p">:</span> <span class="n">rsp</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s1">'rsp_retained'</span><span class="p">:</span> <span class="n">rsp_retained</span><span class="p">,</span> <span class="s1">'rsp_retained.asnumpy()'</span><span class="p">:</span> <span class="n">rsp_retained</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()}</span> |
| </pre></div> |
| </div> |
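| <p>The logical effect of retaining rows can be sketched in NumPy: rows whose index is not requested are dropped, and requesting a row that holds no data (row 1 above) simply leaves it zero. This sketch only mirrors the dense result; how MXNet lays out the <code class="docutils literal"><span class="pre">indices</span></code> of the retained array internally is not modelled here, and <code class="docutils literal"><span class="pre">retain_rows</span></code> is a hypothetical helper.</p> |

```python
import numpy as np

def retain_rows(data, indices, to_retain):
    """Keep only the stored rows whose index is in to_retain (hypothetical helper)."""
    data = np.asarray(data)
    indices = np.asarray(indices, dtype=np.int64)
    keep = np.isin(indices, to_retain)
    return data[keep], indices[keep]

kept_data, kept_indices = retain_rows([[1, 2], [3, 4], [5, 6]], [0, 2, 3], [0, 1])

# Dense view of the result for a (5, 2) array: only row 0 survives.
dense_retained = np.zeros((5, 2), dtype=kept_data.dtype)
dense_retained[kept_indices] = kept_data
print(dense_retained)
```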
| </div> |
| <div class="section" id="sparse-operators-and-storage-type-inference"> |
| <span id="sparse-operators-and-storage-type-inference"></span><h2>Sparse Operators and Storage Type Inference<a class="headerlink" href="#sparse-operators-and-storage-type-inference" title="Permalink to this headline">¶</a></h2> |
| <p>Operators that have specialized implementations for sparse arrays can be accessed in <code class="docutils literal"><span class="pre">mx.nd.sparse</span></code>. You can read the <a class="reference external" href="/versions/0.12.1/versions/master/api/python/ndarray/sparse.html">mxnet.ndarray.sparse API documentation</a> to find out which sparse operators are available.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">shape</span> <span class="o">=</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">9</span><span class="p">]</span> |
| <span class="n">indptr</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span> |
| <span class="c1"># A csr matrix as lhs</span> |
| <span class="n">lhs</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">csr_matrix</span><span class="p">((</span><span class="n">data</span><span class="p">,</span> <span class="n">indices</span><span class="p">,</span> <span class="n">indptr</span><span class="p">),</span> <span class="n">shape</span><span class="o">=</span><span class="n">shape</span><span class="p">)</span> |
| <span class="c1"># A dense matrix as rhs</span> |
| <span class="n">rhs</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span> |
| <span class="c1"># row_sparse result is inferred from sparse operator dot(csr.T, dense) based on input stypes</span> |
| <span class="n">transpose_dot</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">lhs</span><span class="p">,</span> <span class="n">rhs</span><span class="p">,</span> <span class="n">transpose_a</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> |
| <span class="p">{</span><span class="s1">'transpose_dot'</span><span class="p">:</span> <span class="n">transpose_dot</span><span class="p">,</span> <span class="s1">'transpose_dot.asnumpy()'</span><span class="p">:</span> <span class="n">transpose_dot</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()}</span> |
| </pre></div> |
| </div> |
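| <p>To see why a row sparse result is a natural fit here, the same product can be reproduced with dense NumPy arrays. The sketch below rebuilds the CSR matrix from the values used above and shows that only the rows of the result that correspond to non-empty columns of <code class="docutils literal"><span class="pre">lhs</span></code> are non-zero. It illustrates the arithmetic only and does not use MXNet's sparse kernels.</p> |

```python
import numpy as np

shape = (3, 5)
data = [7, 8, 9]
indptr = [0, 2, 2, 3]
indices = [0, 2, 1]

# Expand the CSR triple into a dense matrix, row by row.
lhs = np.zeros(shape)
for row in range(shape[0]):
    for k in range(indptr[row], indptr[row + 1]):
        lhs[row, indices[k]] = data[k]

rhs = np.ones((3, 2))
result = lhs.T @ rhs                      # shape (5, 2)
nonzero_rows = np.flatnonzero(result.any(axis=1))
print(result)
print(nonzero_rows)                       # [0 1 2]
```

| <p>Rows 3 and 4 of the result are all zeros because columns 3 and 4 of <code class="docutils literal"><span class="pre">lhs</span></code> hold no values, so storing only rows 0 to 2 loses nothing.</p> |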
| <p>For any sparse operator, the storage type of the output array is inferred from the storage types of the inputs. You can either read the documentation or inspect the <code class="docutils literal"><span class="pre">stype</span></code> attribute of the output array to find out which storage type was inferred:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">a</span> <span class="o">=</span> <span class="n">transpose_dot</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span> |
| <span class="n">b</span> <span class="o">=</span> <span class="n">a</span> <span class="o">*</span> <span class="mi">2</span> <span class="c1"># b will be a RowSparseNDArray since zero multiplied by 2 is still zero</span> |
| <span class="n">c</span> <span class="o">=</span> <span class="n">a</span> <span class="o">+</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span> <span class="c1"># c will be a dense NDArray</span> |
| <span class="p">{</span><span class="s1">'b.stype'</span><span class="p">:</span><span class="n">b</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'c.stype'</span><span class="p">:</span><span class="n">c</span><span class="o">.</span><span class="n">stype</span><span class="p">}</span> |
| </pre></div> |
| </div> |
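| <p>The intuition behind these two results can be checked on a dense NumPy copy of the same values: scaling maps zero rows to zero rows, while adding a dense array of ones leaves no zero rows to skip. This is a sketch of the reasoning only, not of MXNet's storage type inference; <code class="docutils literal"><span class="pre">zero_rows</span></code> is a hypothetical helper.</p> |

```python
import numpy as np

# Dense copy of a (5, 2) array whose last two rows are all zeros.
a = np.array([[7., 7.], [9., 9.], [8., 8.], [0., 0.], [0., 0.]])

def zero_rows(x):
    """Return the indices of rows that contain only zeros."""
    return np.flatnonzero(~x.any(axis=1)).tolist()

scaled = a * 2                   # zero rows stay zero
shifted = a + np.ones((5, 2))    # every row becomes non-zero

print(zero_rows(a), zero_rows(scaled), zero_rows(shifted))
```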
| <p>For operators that don’t specialize in sparse arrays, you can still use them with sparse inputs with some performance penalty. |
| In MXNet, dense operators require all inputs and outputs to be in the dense format.</p> |
| <p>If sparse inputs are provided, MXNet will convert sparse inputs into dense ones temporarily so that the dense operator can be used.</p> |
| <p>If sparse outputs are provided, MXNet will convert the dense outputs generated by the dense operator into the provided sparse format.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">e</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="n">a</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span> |
| <span class="n">d</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="c1"># dense operator with a sparse input</span> |
| <span class="n">e</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">out</span><span class="o">=</span><span class="n">e</span><span class="p">)</span> <span class="c1"># dense operator with a sparse output</span> |
| <span class="p">{</span><span class="s1">'a.stype'</span><span class="p">:</span><span class="n">a</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'d.stype'</span><span class="p">:</span><span class="n">d</span><span class="o">.</span><span class="n">stype</span><span class="p">,</span> <span class="s1">'e.stype'</span><span class="p">:</span><span class="n">e</span><span class="o">.</span><span class="n">stype</span><span class="p">}</span> <span class="c1"># stypes of a and e are not changed</span> |
| </pre></div> |
| </div> |
| <p>Note that a warning message is printed whenever such a storage fallback happens. If you are using a Jupyter notebook, the warning is printed in your terminal console.</p> |
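| <p>Conceptually, the fallback described above is a round trip through dense storage. The plain-Python sketch below illustrates that round trip only; it is not MXNet's implementation. The helpers <code class="docutils literal"><span class="pre">to_dense</span></code>, <code class="docutils literal"><span class="pre">to_row_sparse</span></code> and <code class="docutils literal"><span class="pre">dense_sqrt</span></code> are hypothetical names introduced for illustration, and a row-sparse array is modelled as a pair of lists (data rows, row indices).</p> |

```python
import math

def to_dense(data, indices, num_rows, num_cols):
    """Expand (data, indices) into a full list of rows (hypothetical helper)."""
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for row, values in zip(indices, data):
        dense[row] = list(values)
    return dense

def to_row_sparse(dense):
    """Keep only the rows that contain at least one non-zero value."""
    indices = [i for i, row in enumerate(dense) if any(v != 0 for v in row)]
    data = [dense[i] for i in indices]
    return data, indices

def dense_sqrt(dense):
    """Stand-in for an operator that only understands dense inputs."""
    return [[math.sqrt(v) for v in row] for row in dense]

# A 3x2 row-sparse array whose rows 0 and 2 are non-zero.
data, indices = [[1.0, 4.0], [9.0, 16.0]], [0, 2]

dense_in = to_dense(data, indices, num_rows=3, num_cols=2)  # sparse input -> dense
dense_out = dense_sqrt(dense_in)                            # dense operator runs
out_data, out_indices = to_row_sparse(dense_out)            # dense output -> sparse
```

| <p>In this sketch both conversions visit every element of the dense shape, which gives an intuition for why a fallback costs more than an operator that works on the sparse rows directly.</p> |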
| </div> |
| <div class="section" id="sparse-optimizers"> |
| <span id="sparse-optimizers"></span><h2>Sparse Optimizers<a class="headerlink" href="#sparse-optimizers" title="Permalink to this headline">¶</a></h2> |
| <p>In MXNet, sparse gradient updates are applied when weight, state and gradient are all in <code class="docutils literal"><span class="pre">row_sparse</span></code> storage. |
| The sparse optimizers only update the row slices of the weight and the states whose indices appear |
| in <code class="docutils literal"><span class="pre">gradient.indices</span></code>. For example, the default update rule for SGD optimizer is:</p> |
| <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">rescaled_grad</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">weight_decay</span> <span class="o">*</span> <span class="n">weight</span> |
| <span class="n">state</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span> <span class="o">+</span> <span class="n">rescaled_grad</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">weight</span> <span class="o">-</span> <span class="n">state</span> |
| </pre></div> |
| </div> |
| <p>Meanwhile, the sparse update rule for SGD optimizer is:</p> |
| <div class="highlight-default"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">grad</span><span class="o">.</span><span class="n">indices</span><span class="p">:</span> |
|     <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">rescale_grad</span> <span class="o">*</span> <span class="n">clip</span><span class="p">(</span><span class="n">grad</span><span class="p">[</span><span class="n">row</span><span class="p">],</span> <span class="n">clip_gradient</span><span class="p">)</span> <span class="o">+</span> <span class="n">weight_decay</span> <span class="o">*</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
|     <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">momentum</span> <span class="o">*</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">+</span> <span class="n">rescaled_grad</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
|     <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">=</span> <span class="n">weight</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> <span class="o">-</span> <span class="n">state</span><span class="p">[</span><span class="n">row</span><span class="p">]</span> |
| </pre></div> |
| </div> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Create weight</span> |
| <span class="n">shape</span> <span class="o">=</span> <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> |
| <span class="n">weight</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="n">shape</span><span class="p">)</span><span class="o">.</span><span class="n">tostype</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">)</span> |
| <span class="c1"># Create gradient</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">]]</span> |
| <span class="n">indices</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">]</span> |
| <span class="n">grad</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">row_sparse_array</span><span class="p">((</span><span class="n">data</span><span class="p">,</span> <span class="n">indices</span><span class="p">),</span> <span class="n">shape</span><span class="o">=</span><span class="n">shape</span><span class="p">)</span> |
| <span class="n">sgd</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">optimizer</span><span class="o">.</span><span class="n">SGD</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.01</span><span class="p">,</span> <span class="n">momentum</span><span class="o">=</span><span class="mf">0.01</span><span class="p">)</span> |
| <span class="c1"># Create momentum</span> |
| <span class="n">momentum</span> <span class="o">=</span> <span class="n">sgd</span><span class="o">.</span><span class="n">create_state</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">weight</span><span class="p">)</span> |
| <span class="c1"># Before the update</span> |
| <span class="p">{</span><span class="s2">"grad.asnumpy()"</span><span class="p">:</span><span class="n">grad</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s2">"weight.asnumpy()"</span><span class="p">:</span><span class="n">weight</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s2">"momentum.asnumpy()"</span><span class="p">:</span><span class="n">momentum</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()}</span> |
| </pre></div> |
| </div> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">sgd</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="n">grad</span><span class="p">,</span> <span class="n">momentum</span><span class="p">)</span> |
| <span class="c1"># Only rows 1 and 2 (the rows listed in grad.indices) are updated for both weight and momentum</span> |
| <span class="p">{</span><span class="s2">"weight.asnumpy()"</span><span class="p">:</span><span class="n">weight</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">(),</span> <span class="s2">"momentum.asnumpy()"</span><span class="p">:</span><span class="n">momentum</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()}</span> |
| </pre></div> |
| </div> |
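| <p>The arithmetic of the sparse rule can be traced by hand with the same values as the example above. The sketch below is plain Python that follows the pseudocode on this page literally; it does not call MXNet, and the function name <code class="docutils literal"><span class="pre">sparse_sgd_update</span></code> and the dictionary representation of the gradient are assumptions made for illustration. The values that MXNet stores in the momentum state may differ in sign or scale from this literal reading.</p> |

```python
def sparse_sgd_update(weight, state, grad_rows, lr, momentum,
                      rescale_grad=1.0, wd=0.0, clip_gradient=None):
    """Apply the row-wise SGD rule in place (hypothetical helper).

    grad_rows maps a row index to that row's gradient values, mirroring
    the (data, indices) pair of a row-sparse gradient.
    """
    for row, grad_row in grad_rows.items():
        for col, g in enumerate(grad_row):
            if clip_gradient is not None:
                # Clamp the gradient to [-clip_gradient, clip_gradient].
                g = max(-clip_gradient, min(clip_gradient, g))
            rescaled = lr * rescale_grad * g + wd * weight[row][col]
            state[row][col] = momentum * state[row][col] + rescaled
            weight[row][col] -= state[row][col]

# Same setup as the example above: 4x2 weight of ones, zero momentum state,
# and a gradient whose only non-zero rows are 1 and 2.
weight = [[1.0, 1.0] for _ in range(4)]
state = [[0.0, 0.0] for _ in range(4)]
grad_rows = {1: [1.0, 2.0], 2: [4.0, 5.0]}

sparse_sgd_update(weight, state, grad_rows, lr=0.01, momentum=0.01)

# Rows 0 and 3 are never visited, so they keep their initial values.
# Rows 1 and 2 of weight become approximately [0.99, 0.98] and [0.96, 0.95].
```

| <p>Because the loop only visits the rows in the gradient, the state of the other rows is not decayed by the momentum term either, which is the difference from applying the dense rule to the full arrays.</p> |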
| <p>Note that both <a class="reference external" href="/api/python/optimization.html#mxnet.optimizer.SGD">mxnet.optimizer.SGD</a> |
| and <a class="reference external" href="/api/python/optimization.html#mxnet.optimizer.Adam">mxnet.optimizer.Adam</a> support sparse updates in MXNet.</p> |
| </div> |
| <div class="section" id="advanced-topics"> |
| <span id="advanced-topics"></span><h2>Advanced Topics<a class="headerlink" href="#advanced-topics" title="Permalink to this headline">¶</a></h2> |
| <div class="section" id="gpu-support"> |
| <span id="gpu-support"></span><h3>GPU Support<a class="headerlink" href="#gpu-support" title="Permalink to this headline">¶</a></h3> |
| <p>By default, RowSparseNDArray operators are executed on the CPU. In MXNet, GPU support for RowSparseNDArray is experimental |
| and covers only a few sparse operators, such as cast_storage and dot.</p> |
| <p>To create a RowSparseNDArray on a GPU, we need to specify the context explicitly:</p> |
| <p><strong>Note</strong> If a GPU is not available, the following code reports an error. To run it on a CPU instead, set <code class="docutils literal"><span class="pre">gpu_device</span></code> to <code class="docutils literal"><span class="pre">mx.cpu()</span></code>.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span> |
| <span class="n">gpu_device</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">gpu</span><span class="p">()</span> <span class="c1"># Change this to mx.cpu() in absence of GPUs.</span> |
| <span class="k">try</span><span class="p">:</span> |
|     <span class="n">a</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">sparse</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="s1">'row_sparse'</span><span class="p">,</span> <span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">100</span><span class="p">),</span> <span class="n">ctx</span><span class="o">=</span><span class="n">gpu_device</span><span class="p">)</span> |
|     <span class="n">a</span> |
| <span class="k">except</span> <span class="n">mx</span><span class="o">.</span><span class="n">MXNetError</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span> |
|     <span class="n">sys</span><span class="o">.</span><span class="n">stderr</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">err</span><span class="p">))</span> |
| </pre></div> |
| </div> |
| <div class="btn-group" role="group"> |
| <div class="download-btn"><a download="row_sparse.ipynb" href="row_sparse.ipynb"><span class="glyphicon glyphicon-download-alt"></span> row_sparse.ipynb</a></div></div></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div aria-label="main navigation" class="sphinxsidebar rightsidebar" role="navigation"> |
| <div class="sphinxsidebarwrapper"> |
| <h3><a href="../../index.html">Table Of Contents</a></h3> |
| <ul> |
| <li><a class="reference internal" href="#">RowSparseNDArray - NDArray for Sparse Gradient Updates</a><ul> |
| <li><a class="reference internal" href="#motivation">Motivation</a></li> |
| <li><a class="reference internal" href="#prerequisites">Prerequisites</a></li> |
| <li><a class="reference internal" href="#row-sparse-format">Row Sparse Format</a></li> |
| <li><a class="reference internal" href="#array-creation">Array Creation</a></li> |
| <li><a class="reference internal" href="#function-overview">Function Overview</a></li> |
| <li><a class="reference internal" href="#setting-type">Setting Type</a></li> |
| <li><a class="reference internal" href="#inspecting-arrays">Inspecting Arrays</a></li> |
| <li><a class="reference internal" href="#storage-type-conversion">Storage Type Conversion</a></li> |
| <li><a class="reference internal" href="#copies">Copies</a></li> |
| <li><a class="reference internal" href="#retain-row-slices">Retain Row Slices</a></li> |
| <li><a class="reference internal" href="#sparse-operators-and-storage-type-inference">Sparse Operators and Storage Type Inference</a></li> |
| <li><a class="reference internal" href="#sparse-optimizers">Sparse Optimizers</a></li> |
| <li><a class="reference internal" href="#advanced-topics">Advanced Topics</a><ul> |
| <li><a class="reference internal" href="#gpu-support">GPU Support</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div><div class="footer"> |
| <div class="section-disclaimer"> |
| <div class="container"> |
| <div> |
| <img height="60" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/apache_incubator_logo.png"/> |
| <p> |
| Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), <strong>sponsored by the <i>Apache Incubator</i></strong>. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. |
| </p> |
| <p> |
| "Copyright © 2017-2018, The Apache Software Foundation |
| Apache MXNet, MXNet, Apache, the Apache feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the Apache Software Foundation." |
| </p> |
| </div> |
| </div> |
| </div> |
| </div> <!-- pagename != index --> |
| </div> |
| <script crossorigin="anonymous" integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script> |
| <script src="../../_static/js/sidebar.js" type="text/javascript"></script> |
| <script src="../../_static/js/search.js" type="text/javascript"></script> |
| <script src="../../_static/js/navbar.js" type="text/javascript"></script> |
| <script src="../../_static/js/clipboard.min.js" type="text/javascript"></script> |
| <script src="../../_static/js/copycode.js" type="text/javascript"></script> |
| <script src="../../_static/js/page.js" type="text/javascript"></script> |
| <script src="../../_static/js/docversion.js" type="text/javascript"></script> |
| <script type="text/javascript"> |
| $('body').ready(function () { |
| $('body').css('visibility', 'visible'); |
| }); |
| </script> |
| </body> |
| </html> |