 <!DOCTYPE html>

 <html lang="en">
 <head>
 <meta charset="utf-8"/>
 <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
 <meta content="width=device-width, initial-scale=1" name="viewport"/>
 <meta content="Gluon Neural Network Layers" property="og:title"/>
 <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image"/>
 <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image:secure_url"/>
 <meta content="Gluon Neural Network Layers" property="og:description"/>
 <title>Gluon Neural Network Layers — mxnet  documentation</title>
 <link crossorigin="anonymous" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" rel="stylesheet"/>
 <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css" rel="stylesheet"/>
 <link href="../../../_static/basic.css" rel="stylesheet" type="text/css"/>
 <link href="../../../_static/pygments.css" rel="stylesheet" type="text/css"/>
 <link href="../../../_static/mxnet.css" rel="stylesheet" type="text/css"/>
 <script type="text/javascript">
       var DOCUMENTATION_OPTIONS = {
         URL_ROOT:    '../../../',
         VERSION:     '',
         COLLAPSE_INDEX: false,
         FILE_SUFFIX: '.html',
         HAS_SOURCE:  true,
         SOURCELINK_SUFFIX: '.txt'
       };
     </script>
 <script src="https://code.jquery.com/jquery-1.11.1.min.js" type="text/javascript"></script>
 <script src="../../../_static/underscore.js" type="text/javascript"></script>
 <script src="../../../_static/searchtools_custom.js" type="text/javascript"></script>
 <script src="../../../_static/doctools.js" type="text/javascript"></script>
 <script src="../../../_static/selectlang.js" type="text/javascript"></script>
 <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
 <script type="text/javascript"> jQuery(function() { Search.loadIndex("/versions/1.2.1/searchindex.js"); Search.init();}); </script>
 <link href="../../../genindex.html" rel="index" title="Index"/>
 <link href="../../../search.html" rel="search" title="Search"/>
 <link href="gluon.html" rel="up" title="Gluon Package"/>
 <link href="rnn.html" rel="next" title="Gluon Recurrent Neural Network API"/>
 <link href="gluon.html" rel="prev" title="Gluon Package"/>
 <link href="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-icon.png" rel="icon" type="image/png"/>
 </head>
 <body background="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-background-compressed.jpeg" role="document">
 <div class="content-block"><div class="navbar navbar-fixed-top">
 <div class="container" id="navContainer">
 <div class="innder" id="header-inner">
 <h1 id="logo-wrap">
 <a href="../../../" id="logo"><img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet_logo.png"/></a>
 </h1>
 <nav class="nav-bar" id="main-nav">
 <a class="main-nav-link" href="/versions/1.2.1/install/index.html">Install</a>
 <span id="dropdown-menu-position-anchor">
 <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Gluon <span class="caret"></span></a>
 <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
 <li><a class="main-nav-link" href="/versions/1.2.1/tutorials/gluon/gluon.html">About</a></li>
 <li><a class="main-nav-link" href="https://www.d2l.ai/">Dive into Deep Learning</a></li>
 <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li>
 <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li>
 </ul>
 </span>
 <span id="dropdown-menu-position-anchor">
 <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">API <span class="caret"></span></a>
 <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
 <li><a class="main-nav-link" href="/versions/1.2.1/api/python/index.html">Python</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/c++/index.html">C++</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/julia/index.html">Julia</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/perl/index.html">Perl</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/r/index.html">R</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/scala/index.html">Scala</a></li>
 </ul>
 </span>
 <span id="dropdown-menu-position-anchor-docs">
 <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Docs <span class="caret"></span></a>
 <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-docs">
 <li><a class="main-nav-link" href="/versions/1.2.1/faq/index.html">FAQ</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/tutorials/index.html">Tutorials</a></li>
 <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/1.2.1/example">Examples</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/architecture/index.html">Architecture</a></li>
 <li><a class="main-nav-link" href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/model_zoo/index.html">Model Zoo</a></li>
 <li><a class="main-nav-link" href="https://github.com/onnx/onnx-mxnet">ONNX</a></li>
 </ul>
 </span>
 <span id="dropdown-menu-position-anchor-community">
 <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Community <span class="caret"></span></a>
 <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-community">
 <li><a class="main-nav-link" href="http://discuss.mxnet.io">Forum</a></li>
 <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/1.2.1">Github</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/community/contribute.html">Contribute</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/community/powered_by.html">Powered By</a></li>
 </ul>
 </span>
 <span id="dropdown-menu-position-anchor-version" style="position: relative"><a href="#" class="main-nav-link dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="true">1.2.1<span class="caret"></span></a><ul id="package-dropdown-menu" class="dropdown-menu"><li><a href="/">master</a></li><li><a href="/versions/1.7.0/">1.7.0</a></li><li><a href=/versions/1.6.0/>1.6.0</a></li><li><a href=/versions/1.5.0/>1.5.0</a></li><li><a href=/versions/1.4.1/>1.4.1</a></li><li><a href=/versions/1.3.1/>1.3.1</a></li><li><a href=/versions/1.2.1/>1.2.1</a></li><li><a href=/versions/1.1.0/>1.1.0</a></li><li><a href=/versions/1.0.0/>1.0.0</a></li><li><a href=/versions/0.12.1/>0.12.1</a></li><li><a href=/versions/0.11.0/>0.11.0</a></li></ul></span></nav>
 <script> function getRootPath(){ return "../../../" } </script>
 <div class="burgerIcon dropdown">
 <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button">☰</a>
 <ul class="dropdown-menu" id="burgerMenu">
 <li><a href="/versions/1.2.1/install/index.html">Install</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/tutorials/index.html">Tutorials</a></li>
 <li class="dropdown-submenu dropdown">
 <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Gluon</a>
 <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
 <li><a class="main-nav-link" href="/versions/1.2.1/tutorials/gluon/gluon.html">About</a></li>
 <li><a class="main-nav-link" href="http://gluon.mxnet.io">The Straight Dope (Tutorials)</a></li>
 <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li>
 <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li>
 </ul>
 </li>
 <li class="dropdown-submenu">
 <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">API</a>
 <ul class="dropdown-menu">
 <li><a class="main-nav-link" href="/versions/1.2.1/api/python/index.html">Python</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/c++/index.html">C++</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/julia/index.html">Julia</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/perl/index.html">Perl</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/r/index.html">R</a></li>
 <li><a class="main-nav-link" href="/versions/1.2.1/api/scala/index.html">Scala</a></li>
 </ul>
 </li>
 <li class="dropdown-submenu">
 <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Docs</a>
 <ul class="dropdown-menu">
 <li><a href="/versions/1.2.1/faq/index.html" tabindex="-1">FAQ</a></li>
 <li><a href="/versions/1.2.1/tutorials/index.html" tabindex="-1">Tutorials</a></li>
 <li><a href="https://github.com/apache/incubator-mxnet/tree/1.2.1/example" tabindex="-1">Examples</a></li>
 <li><a href="/versions/1.2.1/architecture/index.html" tabindex="-1">Architecture</a></li>
 <li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home" tabindex="-1">Developer Wiki</a></li>
 <li><a href="/versions/1.2.1/model_zoo/index.html" tabindex="-1">Gluon Model Zoo</a></li>
 <li><a href="https://github.com/onnx/onnx-mxnet" tabindex="-1">ONNX</a></li>
 </ul>
 </li>
 <li class="dropdown-submenu dropdown">
 <a aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" role="button" tabindex="-1">Community</a>
 <ul class="dropdown-menu">
 <li><a href="http://discuss.mxnet.io" tabindex="-1">Forum</a></li>
 <li><a href="https://github.com/apache/incubator-mxnet/tree/1.2.1" tabindex="-1">Github</a></li>
 <li><a href="/versions/1.2.1/community/contribute.html" tabindex="-1">Contribute</a></li>
 <li><a href="/versions/1.2.1/community/powered_by.html" tabindex="-1">Powered By</a></li>
 </ul>
 </li>
 <li id="dropdown-menu-position-anchor-version-mobile" class="dropdown-submenu" style="position: relative"><a href="#" tabindex="-1">1.2.1</a><ul class="dropdown-menu"><li><a tabindex="-1" href=/>master</a></li><li><a tabindex="-1" href=/versions/1.6.0/>1.6.0</a></li><li><a tabindex="-1" href=/versions/1.5.0/>1.5.0</a></li><li><a tabindex="-1" href=/versions/1.4.1/>1.4.1</a></li><li><a tabindex="-1" href=/versions/1.3.1/>1.3.1</a></li><li><a tabindex="-1" href=/versions/1.2.1/>1.2.1</a></li><li><a tabindex="-1" href=/versions/1.1.0/>1.1.0</a></li><li><a tabindex="-1" href=/versions/1.0.0/>1.0.0</a></li><li><a tabindex="-1" href=/versions/0.12.1/>0.12.1</a></li><li><a tabindex="-1" href=/versions/0.11.0/>0.11.0</a></li></ul></li></ul>
 </div>
 <div class="plusIcon dropdown">
 <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button"><span aria-hidden="true" class="glyphicon glyphicon-plus"></span></a>
 <ul class="dropdown-menu dropdown-menu-right" id="plusMenu"></ul>
 </div>
 <div id="search-input-wrap">
 <form action="../../../search.html" autocomplete="off" class="" method="get" role="search">
 <div class="form-group inner-addon left-addon">
 <i class="glyphicon glyphicon-search"></i>
 <input class="form-control" name="q" placeholder="Search" type="text"/>
 </div>
 <input name="check_keywords" type="hidden" value="yes"/>
 <input name="area" type="hidden" value="default"/>
 </form>
 <div id="search-preview"></div>
 </div>
 <div id="searchIcon">
 <span aria-hidden="true" class="glyphicon glyphicon-search"></span>
 </div>
 </div>
 </div>
 </div>
 <script type="text/javascript">
         $('body').css('background', 'white');
     </script>
 <div class="container">
 <div class="row">
 <div aria-label="main navigation" class="sphinxsidebar leftsidebar" role="navigation">
 <div class="sphinxsidebarwrapper">
 <ul class="current">
 <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Python Documents</a><ul class="current">
 <li class="toctree-l2"><a class="reference internal" href="../index.html#ndarray-api">NDArray API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#symbol-api">Symbol API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#module-api">Module API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#autograd-api">Autograd API</a></li>
 <li class="toctree-l2 current"><a class="reference internal" href="../index.html#gluon-api">Gluon API</a><ul class="current">
 <li class="toctree-l3 current"><a class="reference internal" href="gluon.html">Gluon Package</a><ul class="current">
 <li class="toctree-l4 current"><a class="reference internal" href="gluon.html#overview">Overview</a></li>
 <li class="toctree-l4"><a class="reference internal" href="gluon.html#parameter">Parameter</a></li>
 <li class="toctree-l4"><a class="reference internal" href="gluon.html#containers">Containers</a></li>
 <li class="toctree-l4"><a class="reference internal" href="gluon.html#trainer">Trainer</a></li>
 <li class="toctree-l4"><a class="reference internal" href="gluon.html#utilities">Utilities</a></li>
 <li class="toctree-l4"><a class="reference internal" href="gluon.html#api-reference">API Reference</a></li>
 </ul>
 </li>
 <li class="toctree-l3 current"><a class="current reference internal" href="#">Gluon Neural Network Layers</a><ul>
 <li class="toctree-l4"><a class="reference internal" href="#overview">Overview</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#basic-layers">Basic Layers</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#convolutional-layers">Convolutional Layers</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#pooling-layers">Pooling Layers</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#activation-layers">Activation Layers</a></li>
 <li class="toctree-l4"><a class="reference internal" href="#api-reference">API Reference</a></li>
 </ul>
 </li>
 <li class="toctree-l3"><a class="reference internal" href="rnn.html">Gluon Recurrent Neural Network API</a></li>
 <li class="toctree-l3"><a class="reference internal" href="loss.html">Gluon Loss API</a></li>
 <li class="toctree-l3"><a class="reference internal" href="data.html">Gluon Data API</a></li>
 <li class="toctree-l3"><a class="reference internal" href="model_zoo.html">Gluon Model Zoo</a></li>
 <li class="toctree-l3"><a class="reference internal" href="contrib.html">Gluon Contrib API</a></li>
 </ul>
 </li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#kvstore-api">KVStore API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#io-api">IO API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#image-api">Image API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#optimization-api">Optimization API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#callback-api">Callback API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#metric-api">Metric API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#run-time-compilation-api">Run-Time Compilation API</a></li>
 <li class="toctree-l2"><a class="reference internal" href="../index.html#contrib-package">Contrib Package</a></li>
 </ul>
 </li>
 <li class="toctree-l1"><a class="reference internal" href="../../r/index.html">R Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../julia/index.html">Julia Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../c++/index.html">C++ Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../scala/index.html">Scala Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../perl/index.html">Perl Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../faq/index.html">HowTo Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../architecture/index.html">System Documents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../tutorials/index.html">Tutorials</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../community/index.html">Community</a></li>
 </ul>
 </div>
 </div>
 <div class="content">
 <div class="page-tracker"></div>
 <div class="section" id="gluon-neural-network-layers">
 <span id="gluon-neural-network-layers"></span><h1>Gluon Neural Network Layers<a class="headerlink" href="#gluon-neural-network-layers" title="Permalink to this headline">¶</a></h1>
 <div class="section" id="overview">
 <span id="overview"></span><h2>Overview<a class="headerlink" href="#overview" title="Permalink to this headline">¶</a></h2>
 <p>This document lists the neural network blocks in Gluon:</p>
 </div>
 <div class="section" id="basic-layers">
 <span id="basic-layers"></span><h2>Basic Layers<a class="headerlink" href="#basic-layers" title="Permalink to this headline">¶</a></h2>
 <table border="1" class="longtable docutils">
 <colgroup>
 <col width="10%"/>
 <col width="90%"/>
 </colgroup>
 <tbody valign="top">
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Dense" title="mxnet.gluon.nn.Dense"><code class="xref py py-obj docutils literal"><span class="pre">Dense</span></code></a></td>
 <td>Densely connected (fully connected) neural network layer.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Dropout" title="mxnet.gluon.nn.Dropout"><code class="xref py py-obj docutils literal"><span class="pre">Dropout</span></code></a></td>
 <td>Applies Dropout to the input.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.BatchNorm" title="mxnet.gluon.nn.BatchNorm"><code class="xref py py-obj docutils literal"><span class="pre">BatchNorm</span></code></a></td>
 <td>Batch normalization layer (Ioffe and Szegedy, 2015).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.InstanceNorm" title="mxnet.gluon.nn.InstanceNorm"><code class="xref py py-obj docutils literal"><span class="pre">InstanceNorm</span></code></a></td>
 <td>Applies instance normalization to the n-dimensional input array.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.LayerNorm" title="mxnet.gluon.nn.LayerNorm"><code class="xref py py-obj docutils literal"><span class="pre">LayerNorm</span></code></a></td>
 <td>Applies layer normalization to the n-dimensional input array.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Embedding" title="mxnet.gluon.nn.Embedding"><code class="xref py py-obj docutils literal"><span class="pre">Embedding</span></code></a></td>
 <td>Turns non-negative integers (indexes/tokens) into dense vectors of fixed size.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Flatten" title="mxnet.gluon.nn.Flatten"><code class="xref py py-obj docutils literal"><span class="pre">Flatten</span></code></a></td>
 <td>Flattens the input to a two-dimensional array.</td>
 </tr>
 </tbody>
 </table>
 </div>
 <div class="section" id="convolutional-layers">
 <span id="convolutional-layers"></span><h2>Convolutional Layers<a class="headerlink" href="#convolutional-layers" title="Permalink to this headline">¶</a></h2>
 <table border="1" class="longtable docutils">
 <colgroup>
 <col width="10%"/>
 <col width="90%"/>
 </colgroup>
 <tbody valign="top">
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv1D" title="mxnet.gluon.nn.Conv1D"><code class="xref py py-obj docutils literal"><span class="pre">Conv1D</span></code></a></td>
 <td>1D convolution layer (e.g. temporal convolution).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv2D" title="mxnet.gluon.nn.Conv2D"><code class="xref py py-obj docutils literal"><span class="pre">Conv2D</span></code></a></td>
 <td>2D convolution layer (e.g. spatial convolution over images).</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv3D" title="mxnet.gluon.nn.Conv3D"><code class="xref py py-obj docutils literal"><span class="pre">Conv3D</span></code></a></td>
 <td>3D convolution layer (e.g. spatial convolution over volumes).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv1DTranspose" title="mxnet.gluon.nn.Conv1DTranspose"><code class="xref py py-obj docutils literal"><span class="pre">Conv1DTranspose</span></code></a></td>
 <td>Transposed 1D convolution layer (sometimes called Deconvolution).</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv2DTranspose" title="mxnet.gluon.nn.Conv2DTranspose"><code class="xref py py-obj docutils literal"><span class="pre">Conv2DTranspose</span></code></a></td>
 <td>Transposed 2D convolution layer (sometimes called Deconvolution).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Conv3DTranspose" title="mxnet.gluon.nn.Conv3DTranspose"><code class="xref py py-obj docutils literal"><span class="pre">Conv3DTranspose</span></code></a></td>
 <td>Transposed 3D convolution layer (sometimes called Deconvolution).</td>
 </tr>
 </tbody>
 </table>
 </div>
 <div class="section" id="pooling-layers">
 <span id="pooling-layers"></span><h2>Pooling Layers<a class="headerlink" href="#pooling-layers" title="Permalink to this headline">¶</a></h2>
 <table border="1" class="longtable docutils">
 <colgroup>
 <col width="10%"/>
 <col width="90%"/>
 </colgroup>
 <tbody valign="top">
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.MaxPool1D" title="mxnet.gluon.nn.MaxPool1D"><code class="xref py py-obj docutils literal"><span class="pre">MaxPool1D</span></code></a></td>
 <td>Max pooling operation for one-dimensional data.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.MaxPool2D" title="mxnet.gluon.nn.MaxPool2D"><code class="xref py py-obj docutils literal"><span class="pre">MaxPool2D</span></code></a></td>
 <td>Max pooling operation for two-dimensional (spatial) data.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.MaxPool3D" title="mxnet.gluon.nn.MaxPool3D"><code class="xref py py-obj docutils literal"><span class="pre">MaxPool3D</span></code></a></td>
 <td>Max pooling operation for 3D data (spatial or spatio-temporal).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.AvgPool1D" title="mxnet.gluon.nn.AvgPool1D"><code class="xref py py-obj docutils literal"><span class="pre">AvgPool1D</span></code></a></td>
 <td>Average pooling operation for temporal data.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.AvgPool2D" title="mxnet.gluon.nn.AvgPool2D"><code class="xref py py-obj docutils literal"><span class="pre">AvgPool2D</span></code></a></td>
 <td>Average pooling operation for spatial data.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.AvgPool3D" title="mxnet.gluon.nn.AvgPool3D"><code class="xref py py-obj docutils literal"><span class="pre">AvgPool3D</span></code></a></td>
 <td>Average pooling operation for 3D data (spatial or spatio-temporal).</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalMaxPool1D" title="mxnet.gluon.nn.GlobalMaxPool1D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalMaxPool1D</span></code></a></td>
 <td>Global max pooling operation for temporal data.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalMaxPool2D" title="mxnet.gluon.nn.GlobalMaxPool2D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalMaxPool2D</span></code></a></td>
 <td>Global max pooling operation for spatial data.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalMaxPool3D" title="mxnet.gluon.nn.GlobalMaxPool3D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalMaxPool3D</span></code></a></td>
 <td>Global max pooling operation for 3D data.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalAvgPool1D" title="mxnet.gluon.nn.GlobalAvgPool1D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalAvgPool1D</span></code></a></td>
 <td>Global average pooling operation for temporal data.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalAvgPool2D" title="mxnet.gluon.nn.GlobalAvgPool2D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalAvgPool2D</span></code></a></td>
 <td>Global average pooling operation for spatial data.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.GlobalAvgPool3D" title="mxnet.gluon.nn.GlobalAvgPool3D"><code class="xref py py-obj docutils literal"><span class="pre">GlobalAvgPool3D</span></code></a></td>
 <td>Global average pooling operation for 3D data.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.ReflectionPad2D" title="mxnet.gluon.nn.ReflectionPad2D"><code class="xref py py-obj docutils literal"><span class="pre">ReflectionPad2D</span></code></a></td>
 <td>Pads the input tensor using the reflection of the input boundary.</td>
 </tr>
 </tbody>
 </table>
 </div>
 <div class="section" id="activation-layers">
 <span id="activation-layers"></span><h2>Activation Layers<a class="headerlink" href="#activation-layers" title="Permalink to this headline">¶</a></h2>
 <table border="1" class="longtable docutils">
 <colgroup>
 <col width="10%"/>
 <col width="90%"/>
 </colgroup>
 <tbody valign="top">
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.Activation" title="mxnet.gluon.nn.Activation"><code class="xref py py-obj docutils literal"><span class="pre">Activation</span></code></a></td>
 <td>Applies an activation function to the input.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.LeakyReLU" title="mxnet.gluon.nn.LeakyReLU"><code class="xref py py-obj docutils literal"><span class="pre">LeakyReLU</span></code></a></td>
 <td>Leaky version of a Rectified Linear Unit.</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.PReLU" title="mxnet.gluon.nn.PReLU"><code class="xref py py-obj docutils literal"><span class="pre">PReLU</span></code></a></td>
 <td>Parametric leaky version of a Rectified Linear Unit.</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.ELU" title="mxnet.gluon.nn.ELU"><code class="xref py py-obj docutils literal"><span class="pre">ELU</span></code></a></td>
 <td>Exponential Linear Unit (ELU).</td>
 </tr>
 <tr class="row-odd"><td><a class="reference internal" href="#mxnet.gluon.nn.SELU" title="mxnet.gluon.nn.SELU"><code class="xref py py-obj docutils literal"><span class="pre">SELU</span></code></a></td>
 <td>Scaled Exponential Linear Unit (SELU).</td>
 </tr>
 <tr class="row-even"><td><a class="reference internal" href="#mxnet.gluon.nn.Swish" title="mxnet.gluon.nn.Swish"><code class="xref py py-obj docutils literal"><span class="pre">Swish</span></code></a></td>
 <td>Swish activation function.</td>
 </tr>
 </tbody>
 </table>
 </div>
 <div class="section" id="api-reference">
 <span id="api-reference"></span><h2>API Reference<a class="headerlink" href="#api-reference" title="Permalink to this headline">¶</a></h2>
 <script src="../../_static/js/auto_module_index.js" type="text/javascript"></script><span class="target" id="module-mxnet.gluon.nn"></span><p>Neural network layers.</p>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Activation">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Activation</code><span class="sig-paren">(</span><em>activation</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#Activation"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Activation" title="Permalink to this definition">¶</a></dt>
 <dd><p>Applies an activation function to the input.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>activation</strong> (<em>str</em>) – Name of activation function to use.
 See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a> for available choices.</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.AvgPool1D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">AvgPool1D</code><span class="sig-paren">(</span><em>pool_size=2</em>, <em>strides=None</em>, <em>padding=0</em>, <em>layout='NCW'</em>, <em>ceil_mode=False</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#AvgPool1D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.AvgPool1D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Average pooling operation for temporal data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em>) – Size of the average pooling window.</li>
 <li><strong>strides</strong> (<em>int</em><em>, or </em><em>None</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCW'</em>) – Dimension ordering of the data. Can be ‘NCW’, ‘NWC’, etc.
 ‘N’, ‘C’, ‘W’ stand for batch, channel, and width (time) dimensions
 respectively. Padding is applied on the ‘W’ dimension.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When <cite>True</cite>, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 3D input tensor with shape <cite>(batch_size, in_channels, width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 3D output tensor with shape <cite>(batch_size, channels, out_width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. out_width is calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="o">-</span><span class="n">pool_size</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil will be used instead of floor in this
 equation.</p>
 </li>
 </ul>
 </dd>
 </dl>
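 <p>The output-width formula can be checked with a few lines of plain Python. This is only a sketch of the arithmetic (it does not import mxnet), and <cite>pool_out_width</cite> is a hypothetical helper name.</p>

```python
import math


def pool_out_width(width, pool_size=2, strides=None, padding=0, ceil_mode=False):
    """Output width of a 1D pooling layer, following the documented formula."""
    if strides is None:
        strides = pool_size  # documented default: strides falls back to pool_size
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((width + 2 * padding - pool_size) / strides) + 1


print(pool_out_width(10))                                          # 5
print(pool_out_width(10, pool_size=3, strides=2))                  # 4
print(pool_out_width(10, pool_size=3, strides=2, ceil_mode=True))  # 5
```

 <p>The last two calls differ only in <cite>ceil_mode</cite>: the window does not tile the input evenly, so rounding up yields one extra output position.</p>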
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.AvgPool2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">AvgPool2D</code><span class="sig-paren">(</span><em>pool_size=(2</em>, <em>2)</em>, <em>strides=None</em>, <em>padding=0</em>, <em>ceil_mode=False</em>, <em>layout='NCHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#AvgPool2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.AvgPool2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Average pooling operation for spatial data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em><em> or </em><em>list/tuple of 2 ints</em><em>,</em>) – Size of the average pooling window.</li>
 <li><strong>strides</strong> (<em>int</em><em>, </em><em>list/tuple of 2 ints</em><em>, or </em><em>None.</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>list/tuple of 2 ints</em><em>,</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCHW'</em>) – Dimension ordering of the data. Can be ‘NCHW’, ‘NHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’ stand for batch, channel, height, and width
 dimensions respectively. Padding is applied on the ‘H’ and ‘W’ dimensions.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When True, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 4D input tensor with shape
 <cite>(batch_size, in_channels, height, width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 4D output tensor with shape
 <cite>(batch_size, channels, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil will be used instead of floor in this
 equation.</p>
 </li>
 </ul>
 </dd>
 </dl>
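 <p>To make the operation itself concrete, here is a naive average pooling over a single 2D channel in plain Python. It is a sketch under simplifying assumptions (no padding, floor mode, one channel, nested lists instead of tensors), and <cite>avg_pool_2d</cite> is a hypothetical name rather than part of the library.</p>

```python
def avg_pool_2d(image, pool_size=(2, 2), strides=None):
    """Naive average pooling over one 2D channel (no padding, floor mode)."""
    if strides is None:
        strides = pool_size  # documented default
    height, width = len(image), len(image[0])
    # Same shape formula as above with padding = 0.
    out_height = (height - pool_size[0]) // strides[0] + 1
    out_width = (width - pool_size[1]) // strides[1] + 1
    out = []
    for i in range(out_height):
        row = []
        for j in range(out_width):
            top, left = i * strides[0], j * strides[1]
            window = [image[top + di][left + dj]
                      for di in range(pool_size[0])
                      for dj in range(pool_size[1])]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
pooled = avg_pool_2d(image)
print(pooled)  # [[3.5, 5.5], [11.5, 13.5]]
```

 <p>A 4x4 input with the default 2x2 window and matching strides gives a 2x2 output, as the formula predicts.</p>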
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.AvgPool3D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">AvgPool3D</code><span class="sig-paren">(</span><em>pool_size=(2</em>, <em>2</em>, <em>2)</em>, <em>strides=None</em>, <em>padding=0</em>, <em>ceil_mode=False</em>, <em>layout='NCDHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#AvgPool3D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.AvgPool3D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Average pooling operation for 3D data (spatial or spatio-temporal).</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em><em> or </em><em>list/tuple of 3 ints</em><em>,</em>) – Size of the average pooling window.</li>
 <li><strong>strides</strong> (<em>int</em><em>, </em><em>list/tuple of 3 ints</em><em>, or </em><em>None.</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>list/tuple of 3 ints</em><em>,</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCDHW'</em>) – Dimension ordering of the data. Can be ‘NCDHW’, ‘NDHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’, ‘D’ stand for batch, channel, height, width and
 depth dimensions respectively. Padding is applied on the ‘D’, ‘H’ and ‘W’
 dimensions.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When True, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 5D input tensor with shape
 <cite>(batch_size, in_channels, depth, height, width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 5D output tensor with shape
 <cite>(batch_size, channels, out_depth, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 out_depth, out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_depth</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">depth</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil will be used instead of floor in this
 equation.</p>
 </li>
 </ul>
 </dd>
 </dl>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.BatchNorm">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">BatchNorm</code><span class="sig-paren">(</span><em>axis=1</em>, <em>momentum=0.9</em>, <em>epsilon=1e-05</em>, <em>center=True</em>, <em>scale=True</em>, <em>use_global_stats=False</em>, <em>beta_initializer='zeros'</em>, <em>gamma_initializer='ones'</em>, <em>running_mean_initializer='zeros'</em>, <em>running_variance_initializer='ones'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#BatchNorm"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.BatchNorm" title="Permalink to this definition">¶</a></dt>
 <dd><p>Batch normalization layer (Ioffe and Szegedy, 2014).
 Normalizes the input at each batch, i.e. applies a transformation
 that maintains the mean activation close to 0 and the activation
 standard deviation close to 1.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>axis</strong> (<em>int</em><em>, </em><em>default 1</em>) – The axis that should be normalized. This is typically the channels
 (C) axis. For instance, after a <cite>Conv2D</cite> layer with <cite>layout=’NCHW’</cite>,
 set <cite>axis=1</cite> in <cite>BatchNorm</cite>. If <cite>layout=’NHWC’</cite>, then set <cite>axis=3</cite>.</li>
 <li><strong>momentum</strong> (<em>float</em><em>, </em><em>default 0.9</em>) – Momentum for the moving average.</li>
 <li><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-5</em>) – Small float added to variance to avoid dividing by zero.</li>
 <li><strong>center</strong> (<em>bool</em><em>, </em><em>default True</em>) – If True, add offset of <cite>beta</cite> to normalized tensor.
 If False, <cite>beta</cite> is ignored.</li>
 <li><strong>scale</strong> (<em>bool</em><em>, </em><em>default True</em>) – If True, multiply by <cite>gamma</cite>. If False, <cite>gamma</cite> is not used.
 When the next layer is linear (or, e.g., <cite>nn.relu</cite>),
 this can be disabled since the scaling
 will be done by the next layer.</li>
 <li><strong>use_global_stats</strong> (<em>bool</em><em>, </em><em>default False</em>) – If True, use the global moving statistics instead of the local batch statistics. This
 turns batch-norm into a scale-shift operator.
 If False, use local batch-norm.</li>
 <li><strong>beta_initializer</strong> (str or <cite>Initializer</cite>, default ‘zeros’) – Initializer for the beta weight.</li>
 <li><strong>gamma_initializer</strong> (str or <cite>Initializer</cite>, default ‘ones’) – Initializer for the gamma weight.</li>
 <li><strong>running_mean_initializer</strong> (str or <cite>Initializer</cite>, default ‘zeros’) – Initializer for the moving mean.</li>
 <li><strong>running_variance_initializer</strong> (str or <cite>Initializer</cite>, default ‘ones’) – Initializer for the moving variance.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – Number of channels (feature maps) in input data. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
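 <p>The following plain-Python sketch shows the kind of computation the parameters above describe, for a 2D input of shape <cite>(batch, channels)</cite> normalized along <cite>axis=1</cite>. It rests on several assumptions that the text does not spell out: the biased batch variance is used, and the moving statistics are updated as <cite>momentum * old + (1 - momentum) * new</cite>. The function name <cite>batch_norm_train</cite> is hypothetical, and the library's implementation may differ in detail.</p>

```python
import math


def batch_norm_train(data, gamma, beta, moving_mean, moving_var,
                     momentum=0.9, epsilon=1e-5):
    """Training-mode batch norm for data of shape (batch, channels), axis=1."""
    batch = len(data)
    channels = len(data[0])
    out = [[0.0] * channels for _ in range(batch)]
    for c in range(channels):
        column = [row[c] for row in data]
        mean = sum(column) / batch
        var = sum((x - mean) ** 2 for x in column) / batch  # assumed biased variance
        for n in range(batch):
            x_hat = (data[n][c] - mean) / math.sqrt(var + epsilon)
            out[n][c] = gamma[c] * x_hat + beta[c]  # scale, then center
        # Assumed exponential moving-average update, modified in place.
        moving_mean[c] = momentum * moving_mean[c] + (1 - momentum) * mean
        moving_var[c] = momentum * moving_var[c] + (1 - momentum) * var
    return out


data = [[1.0, 10.0], [3.0, 30.0]]
moving_mean, moving_var = [0.0, 0.0], [1.0, 1.0]
out = batch_norm_train(data, gamma=[1.0, 1.0], beta=[0.0, 0.0],
                       moving_mean=moving_mean, moving_var=moving_var)
print(out)          # each channel is roughly [-1, 1] after normalization
print(moving_mean)  # moves one tenth of the way toward the batch means
```

 <p>With <cite>use_global_stats=True</cite>, the same scale and shift would instead be computed from <cite>moving_mean</cite> and <cite>moving_var</cite>, which is why the layer then behaves like a fixed scale-shift operator.</p>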
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv1D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv1D</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=1</em>, <em>padding=0</em>, <em>dilation=1</em>, <em>groups=1</em>, <em>layout='NCW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv1D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv1D" title="Permalink to this definition">¶</a></dt>
 <dd><p>1D convolution layer (e.g. temporal convolution).</p>
 <p>This layer creates a convolution kernel that is convolved
 with the layer input over a single spatial (or temporal) dimension
 to produce a tensor of outputs.
 If <cite>use_bias</cite> is True, a bias vector is created and added to the outputs.
 Finally, if <cite>activation</cite> is not <cite>None</cite>,
 it is applied to the outputs as well.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em><em>,</em>) – Specify the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>a tuple/list of 1 int</em><em>,</em>) – If padding is non-zero, then the input is implicitly zero-padded
 on both sides for padding number of points.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCW'</em>) – Dimension ordering of data and weight. Can be ‘NCW’, ‘NWC’, etc.
 ‘N’, ‘C’, ‘W’ stand for batch, channel, and width (time) dimensions
 respectively. Convolution is applied on the ‘W’ dimension.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e., “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> tensor.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 3D input tensor with shape <cite>(batch_size, in_channels, width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 3D output tensor with shape <cite>(batch_size, channels, out_width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. out_width is calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="o">-</span><span class="n">dilation</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
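 <p>The effect of <cite>padding</cite>, <cite>strides</cite> and <cite>dilation</cite> on the output width is easiest to see by evaluating the formula directly. The sketch below does that in plain Python (no mxnet import); <cite>conv1d_out_width</cite> is a hypothetical helper name.</p>

```python
import math


def conv1d_out_width(width, kernel_size, strides=1, padding=0, dilation=1):
    """Output width of a 1D convolution, following the documented formula."""
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return math.floor((width + 2 * padding - effective_kernel) / strides) + 1


print(conv1d_out_width(32, kernel_size=3))                        # 30
print(conv1d_out_width(32, kernel_size=3, padding=1))             # 32
print(conv1d_out_width(32, kernel_size=3, strides=2, padding=1))  # 16
print(conv1d_out_width(32, kernel_size=3, dilation=2))            # 28
```

 <p>With a kernel of size 3, <cite>padding=1</cite> preserves the input width at stride 1, and a dilation of 2 makes the kernel behave like one of size 5 for shape purposes.</p>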
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv1DTranspose">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv1DTranspose</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=1</em>, <em>padding=0</em>, <em>output_padding=0</em>, <em>dilation=1</em>, <em>groups=1</em>, <em>layout='NCW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv1DTranspose"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv1DTranspose" title="Permalink to this definition">¶</a></dt>
 <dd><p>Transposed 1D convolution layer (sometimes called Deconvolution).</p>
 <p>The need for transposed convolutions generally arises
 from the desire to use a transformation going in the opposite direction
 of a normal convolution, i.e., from something that has the shape of the
 output of some convolution to something that has the shape of its input
 while maintaining a connectivity pattern that is compatible with
 said convolution.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em><em>,</em>) – Specify the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>a tuple/list of 1 int</em><em>,</em>) – If padding is non-zero, then the input is implicitly zero-padded
 on both sides for padding number of points.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 1 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCW'</em>) – Dimension ordering of data and weight. Can be ‘NCW’, ‘NWC’, etc.
 ‘N’, ‘C’, ‘W’ stand for batch, channel, and width (time) dimensions
 respectively. Convolution is applied on the ‘W’ dimension.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e., “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> tensor.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 3D input tensor with shape <cite>(batch_size, in_channels, width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 3D output tensor with shape <cite>(batch_size, channels, out_width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. out_width is calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_width</span> <span class="o">=</span> <span class="p">(</span><span class="n">width</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="o">+</span><span class="n">kernel_size</span><span class="o">+</span><span class="n">output_padding</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
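 <p>The "opposite direction" described above can be illustrated with shapes alone. The sketch below evaluates the transposed-convolution formula next to the ordinary convolution formula (with dilation 1) in plain Python; both helper names are hypothetical, and nothing here imports mxnet.</p>

```python
def conv1d_out_width(width, kernel_size, strides=1, padding=0):
    """Output width of an ordinary 1D convolution with dilation 1."""
    return (width + 2 * padding - kernel_size) // strides + 1


def conv1d_transpose_out_width(width, kernel_size, strides=1, padding=0,
                               output_padding=0):
    """Output width of a transposed 1D convolution, per the documented formula."""
    return (width - 1) * strides - 2 * padding + kernel_size + output_padding


# A strided convolution shrinks the input ...
down = conv1d_out_width(32, kernel_size=3, strides=2, padding=1)
# ... and the transposed convolution with the same settings grows it back.
up = conv1d_transpose_out_width(down, kernel_size=3, strides=2, padding=1)
up_fixed = conv1d_transpose_out_width(down, kernel_size=3, strides=2, padding=1,
                                      output_padding=1)
print(down, up, up_fixed)  # 16 31 32
```

 <p>Because the floor in the forward formula maps both 31 and 32 to 16, the inverse shape is ambiguous; <cite>output_padding</cite> selects which of the candidate widths the transposed layer produces.</p>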
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv2D</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=(1</em>, <em>1)</em>, <em>padding=(0</em>, <em>0)</em>, <em>dilation=(1</em>, <em>1)</em>, <em>groups=1</em>, <em>layout='NCHW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>2D convolution layer (e.g. spatial convolution over images).</p>
 <p>This layer creates a convolution kernel that is convolved
 with the layer input to produce a tensor of
 outputs. If <cite>use_bias</cite> is True,
 a bias vector is created and added to the outputs. Finally, if
 <cite>activation</cite> is not <cite>None</cite>, it is applied to the outputs as well.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em><em>,</em>) – Specify the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>a tuple/list of 2 int</em><em>,</em>) – If padding is non-zero, then the input is implicitly zero-padded
 on both sides for padding number of points.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCHW'</em>) – Dimension ordering of data and weight. Can be ‘NCHW’, ‘NHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’ stand for batch, channel, height, and width
 dimensions respectively. Convolution is applied on the ‘H’ and
 ‘W’ dimensions.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e., “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> tensor.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 4D input tensor with shape
 <cite>(batch_size, in_channels, height, width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 4D output tensor with shape
 <cite>(batch_size, channels, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">dilation</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">dilation</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
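 <p>The output-size formula above can be checked with a few lines of plain Python. This is a minimal sketch that does not import MXNet; <code>conv2d_output_size</code> is a hypothetical helper introduced here for illustration, not part of the library.</p>

```python
def conv2d_output_size(height, width, kernel_size, strides=(1, 1),
                       padding=(0, 0), dilation=(1, 1)):
    """Hypothetical helper: apply the documented formula to each spatial axis."""
    def one_axis(size, k, s, p, d):
        # Integer floor division plays the role of floor(... / strides).
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    out_height = one_axis(height, kernel_size[0], strides[0], padding[0], dilation[0])
    out_width = one_axis(width, kernel_size[1], strides[1], padding[1], dilation[1])
    return out_height, out_width


# 3x3 kernel, stride 1, padding 1: the spatial size is preserved.
same = conv2d_output_size(32, 32, kernel_size=(3, 3), padding=(1, 1))
# Same kernel with stride 2: (32 + 2 - 2 - 1) // 2 + 1 = 16 per axis.
halved = conv2d_output_size(32, 32, kernel_size=(3, 3), strides=(2, 2), padding=(1, 1))
print(same, halved)
```

 <p>Here <code>same</code> is <code>(32, 32)</code> and <code>halved</code> is <code>(16, 16)</code>.</p>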
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv2DTranspose">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv2DTranspose</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=(1</em>, <em>1)</em>, <em>padding=(0</em>, <em>0)</em>, <em>output_padding=(0</em>, <em>0)</em>, <em>dilation=(1</em>, <em>1)</em>, <em>groups=1</em>, <em>layout='NCHW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv2DTranspose"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv2DTranspose" title="Permalink to this definition">¶</a></dt>
 <dd><p>Transposed 2D convolution layer (sometimes called Deconvolution).</p>
 <p>The need for transposed convolutions generally arises
 from the desire to use a transformation going in the opposite direction
 of a normal convolution, i.e., from something that has the shape of the
 output of some convolution to something that has the shape of its input
 while maintaining a connectivity pattern that is compatible with
 said convolution.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – Specifies the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – If padding is non-zero, the input is implicitly zero-padded
 on both sides by <cite>padding</cite> points along each spatial dimension.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 2 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCHW'</em>) – Dimension ordering of data and weight. Can be ‘NCHW’, ‘NHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’ stand for batch, channel, height, and width
 dimensions respectively. Convolution is applied on the ‘H’ and
 ‘W’ dimensions.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e. “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> matrix.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 4D input tensor with shape
 <cite>(batch_size, in_channels, height, width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 For other layouts the shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 4D output tensor with shape
 <cite>(batch_size, channels, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_height</span> <span class="o">=</span> <span class="p">(</span><span class="n">height</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+</span><span class="n">output_padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="p">(</span><span class="n">width</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="n">output_padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
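 <p>As a rough illustration of the transposed-convolution formula above, the following plain-Python sketch evaluates it for a typical upsampling configuration. It does not import MXNet, and <code>conv2d_transpose_output_size</code> is a hypothetical helper name. Like the documented formula, it does not take <code>dilation</code> into account.</p>

```python
def conv2d_transpose_output_size(height, width, kernel_size, strides=(1, 1),
                                 padding=(0, 0), output_padding=(0, 0)):
    """Hypothetical helper: documented Conv2DTranspose size formula, per axis."""
    def one_axis(size, k, s, p, op):
        return (size - 1) * s - 2 * p + k + op

    out_height = one_axis(height, kernel_size[0], strides[0],
                          padding[0], output_padding[0])
    out_width = one_axis(width, kernel_size[1], strides[1],
                         padding[1], output_padding[1])
    return out_height, out_width


# 4x4 kernel, stride 2, padding 1 doubles the spatial size:
# (16 - 1) * 2 - 2 * 1 + 4 + 0 = 32 per axis.
doubled = conv2d_transpose_output_size(16, 16, kernel_size=(4, 4),
                                       strides=(2, 2), padding=(1, 1))
print(doubled)
```

 <p>The result is <code>(32, 32)</code>, twice the input size along each spatial axis.</p>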
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv3D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv3D</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=(1</em>, <em>1</em>, <em>1)</em>, <em>padding=(0</em>, <em>0</em>, <em>0)</em>, <em>dilation=(1</em>, <em>1</em>, <em>1)</em>, <em>groups=1</em>, <em>layout='NCDHW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv3D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv3D" title="Permalink to this definition">¶</a></dt>
 <dd><p>3D convolution layer (e.g. spatial convolution over volumes).</p>
 <p>This layer creates a convolution kernel that is convolved
 with the layer input to produce a tensor of
 outputs. If <cite>use_bias</cite> is <cite>True</cite>,
 a bias vector is created and added to the outputs. Finally, if
 <cite>activation</cite> is not <cite>None</cite>, it is applied to the outputs as well.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – If padding is non-zero, the input is implicitly zero-padded
 on both sides by <cite>padding</cite> points along each spatial dimension.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCDHW'</em>) – Dimension ordering of data and weight. Can be ‘NCDHW’, ‘NDHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’, ‘D’ stand for batch, channel, height, width and
 depth dimensions respectively. Convolution is applied on the ‘D’,
 ‘H’ and ‘W’ dimensions.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e. “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> matrix.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 5D input tensor with shape
 <cite>(batch_size, in_channels, depth, height, width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 For other layouts the shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 5D output tensor with shape
 <cite>(batch_size, channels, out_depth, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 out_depth, out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_depth</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">depth</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">dilation</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">dilation</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">-</span><span class="n">dilation</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">*</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
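 <p>Because the same expression is applied independently to depth, height and width, the three formulas above collapse into one loop over the spatial axes. The sketch below is plain Python with a hypothetical helper name (<code>conv_output_shape</code>) and does not call MXNet.</p>

```python
def conv_output_shape(spatial, kernel_size, strides, padding, dilation):
    """Hypothetical helper: documented convolution size formula for every axis.

    All arguments are equal-length tuples, one entry per spatial axis
    (depth, height, width for Conv3D).
    """
    return tuple(
        (size + 2 * p - d * (k - 1) - 1) // s + 1
        for size, k, s, p, d in zip(spatial, kernel_size, strides, padding, dilation)
    )


# A clip of 16 frames at 112x112, with stride 2 on height and width only.
out = conv_output_shape(
    spatial=(16, 112, 112),
    kernel_size=(3, 3, 3),
    strides=(1, 2, 2),
    padding=(1, 1, 1),
    dilation=(1, 1, 1),
)
print(out)
```

 <p>The depth axis keeps its 16 frames while height and width drop to 56, giving <code>(16, 56, 56)</code>.</p>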
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Conv3DTranspose">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Conv3DTranspose</code><span class="sig-paren">(</span><em>channels</em>, <em>kernel_size</em>, <em>strides=(1</em>, <em>1</em>, <em>1)</em>, <em>padding=(0</em>, <em>0</em>, <em>0)</em>, <em>output_padding=(0</em>, <em>0</em>, <em>0)</em>, <em>dilation=(1</em>, <em>1</em>, <em>1)</em>, <em>groups=1</em>, <em>layout='NCDHW'</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#Conv3DTranspose"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Conv3DTranspose" title="Permalink to this definition">¶</a></dt>
 <dd><p>Transposed 3D convolution layer (sometimes called Deconvolution).</p>
 <p>The need for transposed convolutions generally arises
 from the desire to use a transformation going in the opposite direction
 of a normal convolution, i.e., from something that has the shape of the
 output of some convolution to something that has the shape of its input
 while maintaining a connectivity pattern that is compatible with
 said convolution.</p>
 <p>If <cite>in_channels</cite> is not specified, <cite>Parameter</cite> initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_channels</cite> will be
 inferred from the shape of input data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>channels</strong> (<em>int</em>) – The dimensionality of the output space, i.e. the number of output
 channels (filters) in the convolution.</li>
 <li><strong>kernel_size</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the dimensions of the convolution window.</li>
 <li><strong>strides</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the strides of the convolution.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – If padding is non-zero, the input is implicitly zero-padded
 on both sides by <cite>padding</cite> points along each spatial dimension.</li>
 <li><strong>dilation</strong> (<em>int</em><em> or </em><em>tuple/list of 3 int</em>) – Specifies the dilation rate to use for dilated convolution.</li>
 <li><strong>groups</strong> (<em>int</em>) – Controls the connections between inputs and outputs.
 At groups=1, all inputs are convolved to all outputs.
 At groups=2, the operation becomes equivalent to having two conv
 layers side by side, each seeing half the input channels, and producing
 half the output channels, and both subsequently concatenated.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCDHW'</em>) – Dimension ordering of data and weight. Can be ‘NCDHW’, ‘NDHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’, ‘D’ stand for batch, channel, height, width and
 depth dimensions respectively. Convolution is applied on the ‘D’,
 ‘H’, and ‘W’ dimensions.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – The number of input channels to this layer. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See <a class="reference internal" href="../ndarray/ndarray.html#mxnet.ndarray.Activation" title="mxnet.ndarray.Activation"><code class="xref py py-func docutils literal"><span class="pre">Activation()</span></code></a>.
 If you don’t specify anything, no activation is applied
 (i.e. “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> matrix.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 5D input tensor with shape
 <cite>(batch_size, in_channels, depth, height, width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 For other layouts the shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 5D output tensor with shape
 <cite>(batch_size, channels, out_depth, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 out_depth, out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_depth</span> <span class="o">=</span> <span class="p">(</span><span class="n">depth</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+</span><span class="n">output_padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
 <span class="n">out_height</span> <span class="o">=</span> <span class="p">(</span><span class="n">height</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="n">output_padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="p">(</span><span class="n">width</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">strides</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">+</span><span class="n">kernel_size</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">+</span><span class="n">output_padding</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
 </pre></div>
 </div>
 </li>
 </ul>
 </dd>
 </dl>
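 <p>One way to see what <code>output_padding</code> is for: with a stride greater than 1, several input sizes map to the same convolution output size, so the transposed layer needs an extra term to pick which one to restore. The plain-Python sketch below applies the documented formulas with dilation fixed at 1; both helper names are hypothetical and MXNet is not imported.</p>

```python
def conv3d_size(spatial, kernel_size, strides, padding):
    """Hypothetical helper: forward Conv3D size formula with dilation 1."""
    return tuple((n + 2 * p - k) // s + 1
                 for n, k, s, p in zip(spatial, kernel_size, strides, padding))


def conv3d_transpose_size(spatial, kernel_size, strides, padding, output_padding):
    """Hypothetical helper: documented Conv3DTranspose size formula."""
    return tuple((n - 1) * s - 2 * p + k + op
                 for n, k, s, p, op in zip(spatial, kernel_size, strides,
                                           padding, output_padding))


kernel, strides, padding = (3, 3, 3), (2, 2, 2), (1, 1, 1)

# Sizes 7 and 8 both shrink to 4 under this stride-2 convolution.
small = conv3d_size((7, 8, 8), kernel, strides, padding)

# Without output_padding every axis comes back as 7.
plain = conv3d_transpose_size(small, kernel, strides, padding, (0, 0, 0))
# output_padding of 1 on height and width recovers the original 8.
restored = conv3d_transpose_size(small, kernel, strides, padding, (0, 1, 1))
print(small, plain, restored)
```

 <p>Here <code>small</code> is <code>(4, 4, 4)</code>, <code>plain</code> is <code>(7, 7, 7)</code> and <code>restored</code> is <code>(7, 8, 8)</code>, matching the original input.</p>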
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Dense">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Dense</code><span class="sig-paren">(</span><em>units</em>, <em>activation=None</em>, <em>use_bias=True</em>, <em>flatten=True</em>, <em>dtype='float32'</em>, <em>weight_initializer=None</em>, <em>bias_initializer='zeros'</em>, <em>in_units=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#Dense"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Dense" title="Permalink to this definition">¶</a></dt>
 <dd><p>A regular densely-connected (fully-connected) neural network layer.</p>
 <p><cite>Dense</cite> implements the operation:
 <cite>output = activation(dot(input, weight) + bias)</cite>
 where <cite>activation</cite> is the element-wise activation function
 passed as the <cite>activation</cite> argument, <cite>weight</cite> is a weights matrix
 created by the layer, and <cite>bias</cite> is a bias vector created by the layer
 (only applicable if <cite>use_bias</cite> is <cite>True</cite>).</p>
 <p>Note: when <cite>flatten</cite> is True (the default), input of rank greater than 2 is collapsed
 to rank 2 before the operation is applied. See the <cite>flatten</cite> parameter for details.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>units</strong> (<em>int</em>) – Dimensionality of the output space.</li>
 <li><strong>activation</strong> (<em>str</em>) – Activation function to use. See the documentation of the <cite>Activation</cite> layer.
 If you don’t specify anything, no activation is applied
 (i.e. “linear” activation: <cite>a(x) = x</cite>).</li>
 <li><strong>use_bias</strong> (<em>bool</em>) – Whether the layer uses a bias vector.</li>
 <li><strong>flatten</strong> (<em>bool</em>) – Whether the input tensor should be flattened.
 If true, all but the first axis of input data are collapsed together.
 If false, all but the last axis of input data are kept the same, and the transformation
 applies on the last axis.</li>
 <li><strong>dtype</strong> (<em>str</em><em> or </em><em>np.dtype</em><em>, </em><em>default 'float32'</em>) – Data type of the layer's weight and bias parameters.</li>
 <li><strong>weight_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the <cite>weight</cite> matrix.</li>
 <li><strong>bias_initializer</strong> (str or <cite>Initializer</cite>) – Initializer for the bias vector.</li>
 <li><strong>in_units</strong> (<em>int</em><em>, </em><em>optional</em>) – Size of the input data. If not specified, initialization will be
 deferred to the first time <cite>forward</cite> is called and <cite>in_units</cite>
 will be inferred from the shape of input data.</li>
 <li><strong>prefix</strong> (<em>str</em><em> or </em><em>None</em>) – See document of <cite>Block</cite>.</li>
 <li><strong>params</strong> (<a class="reference internal" href="gluon.html#mxnet.gluon.ParameterDict" title="mxnet.gluon.ParameterDict"><em>ParameterDict</em></a><em> or </em><em>None</em>) – See document of <cite>Block</cite>.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: if <cite>flatten</cite> is True, <cite>data</cite> should be a tensor with shape
 <cite>(batch_size, x1, x2, ..., xn)</cite>, where x1 * x2 * ... * xn is equal to
 <cite>in_units</cite>. If <cite>flatten</cite> is False, <cite>data</cite> should have shape
 <cite>(x1, x2, ..., xn, in_units)</cite>.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: if <cite>flatten</cite> is True, <cite>out</cite> will be a tensor with shape
 <cite>(batch_size, units)</cite>. If <cite>flatten</cite> is False, <cite>out</cite> will have shape
 <cite>(x1, x2, ..., xn, units)</cite>.</li>
 </ul>
 </dd>
 </dl>
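 <p>The effect of <code>flatten</code> on shapes, as described under Inputs and Outputs above, can be summarised in a short plain-Python sketch. <code>dense_shapes</code> is a hypothetical helper that only does shape arithmetic; it does not create a layer or import MXNet.</p>

```python
from math import prod


def dense_shapes(data_shape, units, flatten=True):
    """Hypothetical helper: return (in_units, out_shape) for a Dense layer."""
    if flatten:
        # All axes except the first are collapsed into in_units.
        in_units = prod(data_shape[1:])
        return in_units, (data_shape[0], units)
    # Only the last axis is transformed; leading axes are kept as they are.
    return data_shape[-1], tuple(data_shape[:-1]) + (units,)


# A batch of 32 RGB 28x28 images fed to a 10-unit layer.
flat = dense_shapes((32, 3, 28, 28), units=10)
# A batch of 32 sequences of 20 steps with 64 features each.
last_axis = dense_shapes((32, 20, 64), units=10, flatten=False)
print(flat, last_axis)
```

 <p>With <code>flatten=True</code> the inferred <code>in_units</code> is 3 * 28 * 28 = 2352 and the output shape is <code>(32, 10)</code>. With <code>flatten=False</code> the inferred <code>in_units</code> is 64 and the output shape is <code>(32, 20, 10)</code>.</p>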
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Dropout">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Dropout</code><span class="sig-paren">(</span><em>rate</em>, <em>axes=()</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#Dropout"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Dropout" title="Permalink to this definition">¶</a></dt>
 <dd><p>Applies Dropout to the input.</p>
 <p>Dropout randomly sets a fraction <cite>rate</cite> of the input units
 to 0 at each update during training, which helps prevent overfitting.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>rate</strong> (<em>float</em>) – Fraction of the input units to drop. Must be a number between 0 and 1.</li>
 <li><strong>axes</strong> (<em>tuple of int</em><em>, </em><em>default</em><em> (</em><em>)</em>) – The axes on which the dropout mask is shared. If empty, regular dropout is applied.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
 <p class="rubric">References</p>
 <p><a class="reference external" href="http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf">Dropout: A Simple Way to Prevent Neural Networks from Overfitting</a></p>
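 <p>To make the training-time behaviour concrete, here is a minimal plain-Python sketch of dropout on a flat list of floats. It assumes the common "inverted dropout" convention, in which surviving units are rescaled by 1 / (1 - rate) so that the expected activation is unchanged; that rescaling, the <code>dropout</code> function and its <code>training</code> and <code>rng</code> arguments are assumptions of this sketch rather than a description of the layer's implementation.</p>

```python
import random


def dropout(values, rate, training=True, rng=random):
    """Inverted-dropout sketch over a flat list of floats (hypothetical helper)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("this sketch expects 0 <= rate < 1")
    if not training or rate == 0.0:
        return list(values)  # identity outside training
    keep = 1.0 - rate
    # Drop each unit with probability `rate`; rescale survivors by 1 / keep
    # so the expected value of every unit is unchanged (an assumption here).
    return [0.0 if rng.random() < rate else v / keep for v in values]


rng = random.Random(0)  # seeded only to make the demo repeatable
train_out = dropout([1.0] * 8, rate=0.5, rng=rng)
eval_out = dropout([1.0] * 8, rate=0.5, training=False)
print(train_out)  # each entry is either 0.0 or 2.0
print(eval_out)   # the input, unchanged
```

 <p>Scaling at training time means no correction is needed at inference, where the sketch simply returns its input.</p>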
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.ELU">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">ELU</code><span class="sig-paren">(</span><em>alpha=1.0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#ELU"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.ELU" title="Permalink to this definition">¶</a></dt>
 <dd><dl class="docutils">
 <dt>Exponential Linear Unit (ELU)</dt>
 <dd>“Fast and Accurate Deep Network Learning by Exponential Linear Units”, Clevert et al., 2016
 <a class="reference external" href="https://arxiv.org/abs/1511.07289">https://arxiv.org/abs/1511.07289</a>
 Published as a conference paper at ICLR 2016</dd>
 </dl>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>alpha</strong> (<em>float</em>) – The alpha parameter as described by Clevert et al., 2016</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
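 <p>For reference, the ELU function from the cited paper is the identity for positive inputs and <code>alpha * (exp(x) - 1)</code> otherwise. A scalar plain-Python sketch follows; the function name <code>elu</code> is chosen here for illustration and the code does not use MXNet.</p>

```python
import math


def elu(x, alpha=1.0):
    """Scalar ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    # math.expm1(x) computes exp(x) - 1 accurately for x close to 0.
    return x if x > 0 else alpha * math.expm1(x)


for value in (-3.0, -1.0, 0.0, 2.0):
    print(value, elu(value))
```

 <p>Negative inputs saturate towards <code>-alpha</code>, while positive inputs pass through unchanged.</p>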
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Embedding">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Embedding</code><span class="sig-paren">(</span><em>input_dim</em>, <em>output_dim</em>, <em>dtype='float32'</em>, <em>weight_initializer=None</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#Embedding"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Embedding" title="Permalink to this definition">¶</a></dt>
 <dd><p>Turns non-negative integers (indexes/tokens) into dense vectors
 of fixed size, e.g. [4, 20] -> [[0.25, 0.1], [0.6, -0.2]].</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>input_dim</strong> (<em>int</em>) – Size of the vocabulary, i.e. maximum integer index + 1.</li>
 <li><strong>output_dim</strong> (<em>int</em>) – Dimension of the dense embedding.</li>
 <li><strong>dtype</strong> (<em>str</em><em> or </em><em>np.dtype</em><em>, </em><em>default 'float32'</em>) – Data type of output embeddings.</li>
 <li><strong>weight_initializer</strong> (<a class="reference internal" href="../optimization/optimization.html#mxnet.initializer.Initializer" title="mxnet.initializer.Initializer"><em>Initializer</em></a>) – Initializer for the <cite>embeddings</cite> matrix.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: (N-1)-D tensor with shape: <cite>(x1, x2, ..., xN-1)</cite>.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: N-D tensor with shape: <cite>(x1, x2, ..., xN-1, output_dim)</cite>.</li>
 </ul>
 </dd>
 </dl>
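 <p>Conceptually, an embedding is a table lookup: each integer index selects one row of an <code>input_dim</code> x <code>output_dim</code> matrix, and the input's shape gains a trailing axis of size <code>output_dim</code>. The plain-Python sketch below reproduces the example from the description; the <code>embed</code> helper and the table contents are hypothetical, and in the real layer the table is a learned parameter.</p>

```python
def embed(indices, table):
    """Hypothetical helper: replace every integer in a nested list by its table row."""
    if isinstance(indices, int):
        return list(table[indices])
    return [embed(item, table) for item in indices]


# input_dim = 21 (indices 0..20), output_dim = 2; values chosen for illustration.
table = [[0.0, 0.0] for _ in range(21)]
table[4] = [0.25, 0.1]
table[20] = [0.6, -0.2]

vectors = embed([4, 20], table)       # input shape (2,)   -> output shape (2, 2)
nested = embed([[4], [20]], table)    # input shape (2, 1) -> output shape (2, 1, 2)
print(vectors)
print(nested)
```

 <p>The first call yields <code>[[0.25, 0.1], [0.6, -0.2]]</code>, as in the description above.</p>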
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Flatten">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Flatten</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#Flatten"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Flatten" title="Permalink to this definition">¶</a></dt>
 <dd><p>Flattens the input to two dimensional.</p>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape <cite>(N, x1, x2, ..., xn)</cite></li>
 </ul>
 </dd>
 <dt>Output:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: 2D tensor with shape: <cite>(N, x1 * x2 * ... * xn)</cite></li>
 </ul>
 </dd>
 </dl>
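 <p>As a quick check of the output shape rule, the following plain-Python sketch computes the flattened shape from an input shape. The helper name <cite>flatten_shape</cite> is hypothetical and only illustrates the arithmetic; it does not call MXNet.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
from functools import reduce
from operator import mul


def flatten_shape(shape):
    """Return (N, x1 * x2 * ... * xn) for an input shape (N, x1, ..., xn)."""
    batch, rest = shape[0], shape[1:]
    return (batch, reduce(mul, rest, 1))


print(flatten_shape((32, 3, 28, 28)))  # (32, 2352)
```
 </pre></div>
 </div>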
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalAvgPool1D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalAvgPool1D</code><span class="sig-paren">(</span><em>layout='NCW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalAvgPool1D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalAvgPool1D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global average pooling operation for temporal data.</p>
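 <p>A minimal plain-Python sketch of the operation, assuming the default <cite>NCW</cite> layout and assuming the pooled axis is kept with length 1. The function name <cite>global_avg_pool_1d</cite> is hypothetical and works on nested lists rather than NDArrays.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
def global_avg_pool_1d(data):
    """Average every channel over its whole W axis.

    data is a nested list in NCW layout; the result has shape (N, C, 1).
    """
    return [[[sum(channel) / len(channel)] for channel in sample] for sample in data]


x = [[[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]]  # shape (1, 2, 3)
out = global_avg_pool_1d(x)
print(out)  # [[[2.0], [6.0]]]
```
 </pre></div>
 </div>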
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalAvgPool2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalAvgPool2D</code><span class="sig-paren">(</span><em>layout='NCHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalAvgPool2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalAvgPool2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global average pooling operation for spatial data.</p>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalAvgPool3D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalAvgPool3D</code><span class="sig-paren">(</span><em>layout='NCDHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalAvgPool3D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalAvgPool3D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global average pooling operation for 3D data.</p>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalMaxPool1D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalMaxPool1D</code><span class="sig-paren">(</span><em>layout='NCW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalMaxPool1D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalMaxPool1D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global max pooling operation for temporal data.</p>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalMaxPool2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalMaxPool2D</code><span class="sig-paren">(</span><em>layout='NCHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalMaxPool2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalMaxPool2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global max pooling operation for spatial data.</p>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.GlobalMaxPool3D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">GlobalMaxPool3D</code><span class="sig-paren">(</span><em>layout='NCDHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#GlobalMaxPool3D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.GlobalMaxPool3D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Global max pooling operation for 3D data.</p>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.HybridLambda">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">HybridLambda</code><span class="sig-paren">(</span><em>function</em>, <em>prefix=None</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#HybridLambda"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.HybridLambda" title="Permalink to this definition">¶</a></dt>
 <dd><p>Wraps an operator or an expression as a HybridBlock object.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>function</strong> (<em>str</em><em> or </em><em>function</em>) – <p>Function used in lambda must be one of the following:
 1) the name of an operator that is available in both symbol and ndarray. For example:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">block</span> <span class="o">=</span> <span class="n">HybridLambda</span><span class="p">(</span><span class="s1">'tanh'</span><span class="p">)</span>
 </pre></div>
 </div>
 <ol class="arabic" start="2">
 <li>a function that conforms to “def function(F, data, *args)”. For example:<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">block</span> <span class="o">=</span> <span class="n">HybridLambda</span><span class="p">(</span><span class="k">lambda</span> <span class="n">F</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">F</span><span class="o">.</span><span class="n">LeakyReLU</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">slope</span><span class="o">=</span><span class="mf">0.1</span><span class="p">))</span>
 </pre></div>
 </div>
 </li>
 </ol>
 </li>
 <li><strong>Inputs</strong> – <ul>
 <li><strong>*args</strong>: one or more input data. First argument must be symbol or ndarray.</li>
 </ul>
 <p>Their shapes depend on the function.</p>
 </li>
 <li><strong>Output</strong> – <ul>
 <li><strong>*outputs</strong>: one or more output data. Their shapes depend on the function.</li>
 </ul>
 </li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.InstanceNorm">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">InstanceNorm</code><span class="sig-paren">(</span><em>axis=1</em>, <em>epsilon=1e-05</em>, <em>center=True</em>, <em>scale=False</em>, <em>beta_initializer='zeros'</em>, <em>gamma_initializer='ones'</em>, <em>in_channels=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#InstanceNorm"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.InstanceNorm" title="Permalink to this definition">¶</a></dt>
 <dd><p>Applies instance normalization to the n-dimensional input array.
 This operator takes an n-dimensional input array where (n>2) and normalizes
 the input using the following formula:</p>
 <div class="math">
 \[ \begin{align}\begin{aligned}\bar{C} = \{i \mid i \neq 0, i \neq axis\}\\out = \frac{x - mean[data, \bar{C}]}{ \sqrt{Var[data, \bar{C}] + \epsilon}}
  * gamma + beta\end{aligned}\end{align} \]</div>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>axis</strong> (<em>int</em><em>, </em><em>default 1</em>) – The axis that will be excluded in the normalization process. This is typically the channels
 (C) axis. For instance, after a <cite>Conv2D</cite> layer with <cite>layout=’NCHW’</cite>,
 set <cite>axis=1</cite> in <cite>InstanceNorm</cite>. If <cite>layout=’NHWC’</cite>, then set <cite>axis=3</cite>. Data will be
 normalized along axes excluding the first axis and the axis given.</li>
 <li><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-5</em>) – Small float added to variance to avoid dividing by zero.</li>
 <li><strong>center</strong> (<em>bool</em><em>, </em><em>default True</em>) – If True, add offset of <cite>beta</cite> to normalized tensor.
 If False, <cite>beta</cite> is ignored.</li>
 <li><strong>scale</strong> (<em>bool</em><em>, </em><em>default False</em>) – If True, multiply by <cite>gamma</cite>. If False, <cite>gamma</cite> is not used.
 When the next layer is linear (or e.g. <cite>nn.relu</cite>),
 this can be disabled since the scaling
 will be done by the next layer.</li>
 <li><strong>beta_initializer</strong> (str or <cite>Initializer</cite>, default ‘zeros’) – Initializer for the beta weight.</li>
 <li><strong>gamma_initializer</strong> (str or <cite>Initializer</cite>, default ‘ones’) – Initializer for the gamma weight.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – Number of channels (feature maps) in input data. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
 <p class="rubric">References</p>
 <p><a class="reference external" href="https://arxiv.org/abs/1607.08022">Instance Normalization: The Missing Ingredient for Fast Stylization</a></p>
 <p class="rubric">Examples</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="c1"># Input of shape (2,1,2)</span>
 <span class="gp">>>> </span><span class="n">x</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([[[</span> <span class="mf">1.1</span><span class="p">,</span>  <span class="mf">2.2</span><span class="p">]],</span>
 <span class="gp">... </span>                <span class="p">[[</span> <span class="mf">3.3</span><span class="p">,</span>  <span class="mf">4.4</span><span class="p">]]])</span>
 <span class="gp">>>> </span><span class="c1"># Instance normalization is calculated with the above formula</span>
 <span class="gp">>>> </span><span class="n">layer</span> <span class="o">=</span> <span class="n">InstanceNorm</span><span class="p">()</span>
 <span class="gp">>>> </span><span class="n">layer</span><span class="o">.</span><span class="n">initialize</span><span class="p">(</span><span class="n">ctx</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">cpu</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span>
 <span class="gp">>>> </span><span class="n">layer</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
 <span class="go">[[[-0.99998355  0.99998331]]</span>
 <span class="go"> [[-0.99998319  0.99998361]]]</span>
 <span class="go"><NDArray 2x1x2 @cpu(0)></span>
 </pre></div>
 </div>
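 <p>The example above can be reproduced approximately in plain Python. This sketch assumes <cite>axis=1</cite>, a 3D <cite>NCW</cite> input, <cite>gamma=1</cite>, <cite>beta=0</cite>, and that <cite>epsilon</cite> is added to the variance inside the square root, which is what the printed output is consistent with. The function name <cite>instance_norm_ncw</cite> is hypothetical.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
import math


def instance_norm_ncw(data, epsilon=1e-5):
    """Normalize each (sample, channel) row of an NCW nested list over W.

    Assumes gamma = 1 and beta = 0, so only the normalization step is shown.
    """
    out = []
    for sample in data:
        normalized_sample = []
        for channel in sample:
            mean = sum(channel) / len(channel)
            var = sum((v - mean) ** 2 for v in channel) / len(channel)
            denom = math.sqrt(var + epsilon)
            normalized_sample.append([(v - mean) / denom for v in channel])
        out.append(normalized_sample)
    return out


x = [[[1.1, 2.2]], [[3.3, 4.4]]]  # shape (2, 1, 2), as in the example above
y = instance_norm_ncw(x)
print(y)  # every entry is close to -0.99998 or 0.99998
```
 </pre></div>
 </div>
 <p>Statistics are computed separately for every sample and channel, so the two samples normalize to nearly the same values even though their raw magnitudes differ.</p>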
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Lambda">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Lambda</code><span class="sig-paren">(</span><em>function</em>, <em>prefix=None</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#Lambda"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Lambda" title="Permalink to this definition">¶</a></dt>
 <dd><p>Wraps an operator or an expression as a Block object.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>function</strong> (<em>str</em><em> or </em><em>function</em>) – <p>Function used in lambda must be one of the following:
 1) the name of an operator that is available in ndarray. For example:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">block</span> <span class="o">=</span> <span class="n">Lambda</span><span class="p">(</span><span class="s1">'tanh'</span><span class="p">)</span>
 </pre></div>
 </div>
 <ol class="arabic" start="2">
 <li>a function that conforms to “def function(*args)”. For example:<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">block</span> <span class="o">=</span> <span class="n">Lambda</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">nd</span><span class="o">.</span><span class="n">LeakyReLU</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">slope</span><span class="o">=</span><span class="mf">0.1</span><span class="p">))</span>
 </pre></div>
 </div>
 </li>
 </ol>
 </li>
 <li><strong>Inputs</strong> – <ul>
 <li><strong>*args</strong>: one or more input data. Their shapes depend on the function.</li>
 </ul>
 </li>
 <li><strong>Output</strong> – <ul>
 <li><strong>*outputs</strong>: one or more output data. Their shapes depend on the function.</li>
 </ul>
 </li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.LayerNorm">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">LayerNorm</code><span class="sig-paren">(</span><em>axis=-1</em>, <em>epsilon=1e-05</em>, <em>center=True</em>, <em>scale=True</em>, <em>beta_initializer='zeros'</em>, <em>gamma_initializer='ones'</em>, <em>in_channels=0</em>, <em>prefix=None</em>, <em>params=None</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/basic_layers.html#LayerNorm"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.LayerNorm" title="Permalink to this definition">¶</a></dt>
 <dd><p>Applies layer normalization to the n-dimensional input array.
 This operator takes an n-dimensional input array and normalizes
 the input using the given axis:</p>
 <div class="math">
 \[out = \frac{x - mean[data, axis]}{ \sqrt{Var[data, axis] + \epsilon}} * gamma + beta\]</div>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>axis</strong> (<em>int</em><em>, </em><em>default -1</em>) – The axis that should be normalized. This is typically the axis of the channels.</li>
 <li><strong>epsilon</strong> (<em>float</em><em>, </em><em>default 1e-5</em>) – Small float added to variance to avoid dividing by zero.</li>
 <li><strong>center</strong> (<em>bool</em><em>, </em><em>default True</em>) – If True, add offset of <cite>beta</cite> to normalized tensor.
 If False, <cite>beta</cite> is ignored.</li>
 <li><strong>scale</strong> (<em>bool</em><em>, </em><em>default True</em>) – If True, multiply by <cite>gamma</cite>. If False, <cite>gamma</cite> is not used.</li>
 <li><strong>beta_initializer</strong> (str or <cite>Initializer</cite>, default ‘zeros’) – Initializer for the beta weight.</li>
 <li><strong>gamma_initializer</strong> (str or <cite>Initializer</cite>, default ‘ones’) – Initializer for the gamma weight.</li>
 <li><strong>in_channels</strong> (<em>int</em><em>, </em><em>default 0</em>) – Number of channels (feature maps) in input data. If not specified,
 initialization will be deferred to the first time <cite>forward</cite> is called
 and <cite>in_channels</cite> will be inferred from the shape of input data.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
 <p class="rubric">References</p>
 <p><a class="reference external" href="https://arxiv.org/pdf/1607.06450.pdf">Layer Normalization</a></p>
 <p class="rubric">Examples</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="c1"># Input of shape (2, 5)</span>
 <span class="gp">>>> </span><span class="n">x</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">]])</span>
 <span class="gp">>>> </span><span class="c1"># Layer normalization is calculated with the above formula</span>
 <span class="gp">>>> </span><span class="n">layer</span> <span class="o">=</span> <span class="n">LayerNorm</span><span class="p">()</span>
 <span class="gp">>>> </span><span class="n">layer</span><span class="o">.</span><span class="n">initialize</span><span class="p">(</span><span class="n">ctx</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">cpu</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span>
 <span class="gp">>>> </span><span class="n">layer</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
 <span class="go">[[-1.41421    -0.707105    0.          0.707105    1.41421   ]</span>
 <span class="go"> [-1.2247195  -1.2247195   0.81647956  0.81647956  0.81647956]]</span>
 <span class="go"><NDArray 2x5 @cpu(0)></span>
 </pre></div>
 </div>
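 <p>To make the arithmetic behind the example concrete, here is a plain-Python sketch for a 2D input normalized over the last axis. It assumes <cite>gamma=1</cite>, <cite>beta=0</cite>, and that <cite>epsilon</cite> is added to the variance inside the square root, which agrees with the printed values. The function name <cite>layer_norm_last_axis</cite> is hypothetical.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
import math


def layer_norm_last_axis(rows, epsilon=1e-5):
    """Normalize every row of a 2D nested list to zero mean and unit variance.

    Assumes gamma = 1 and beta = 0, so only the normalization step is shown.
    """
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        denom = math.sqrt(var + epsilon)
        out.append([(v - mean) / denom for v in row])
    return out


x = [[1, 2, 3, 4, 5], [1, 1, 2, 2, 2]]  # shape (2, 5), as in the example above
y = layer_norm_last_axis(x)
print(y)
```
 </pre></div>
 </div>
 <p>Unlike instance normalization, the statistics here are taken along the given axis only, so each row is normalized on its own.</p>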
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.LeakyReLU">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">LeakyReLU</code><span class="sig-paren">(</span><em>alpha</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#LeakyReLU"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.LeakyReLU" title="Permalink to this definition">¶</a></dt>
 <dd><p>Leaky version of a Rectified Linear Unit.</p>
 <p>It allows a small gradient when the unit is not active:</p>
 <div class="math">
 \[\begin{split}f\left(x\right) = \left\{
     \begin{array}{lr}
        \alpha x &amp; : x \lt 0 \\
               x &amp; : x \geq 0 \\
     \end{array}
 \right.\\\end{split}\]</div>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>alpha</strong> (<em>float</em>) – slope coefficient for the negative half axis. Must be >= 0.</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
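 <p>The piecewise definition can be sketched element-wise in plain Python. The helper <cite>leaky_relu</cite> is a hypothetical scalar function for illustration and does not use MXNet.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
def leaky_relu(x, alpha):
    """Return x for x >= 0 and alpha * x otherwise."""
    if not alpha >= 0:
        raise ValueError("alpha must be >= 0")
    return x if x >= 0 else alpha * x


values = [-2.0, -0.5, 0.0, 3.0]
out = [leaky_relu(v, alpha=0.1) for v in values]
print(out)  # negative inputs are scaled by 0.1, the rest pass through
```
 </pre></div>
 </div>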
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.MaxPool1D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">MaxPool1D</code><span class="sig-paren">(</span><em>pool_size=2</em>, <em>strides=None</em>, <em>padding=0</em>, <em>layout='NCW'</em>, <em>ceil_mode=False</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#MaxPool1D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.MaxPool1D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Max pooling operation for one dimensional data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em>) – Size of the max pooling windows.</li>
 <li><strong>strides</strong> (<em>int</em><em>, or </em><em>None</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCW'</em>) – Dimension ordering of data and weight. Can be ‘NCW’, ‘NWC’, etc.
 ‘N’, ‘C’, ‘W’ stand for batch, channel, and width (time) dimensions
 respectively. Pooling is applied on the W dimension.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When <cite>True</cite>, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 3D input tensor with shape <cite>(batch_size, in_channels, width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 3D output tensor with shape <cite>(batch_size, channels, out_width)</cite>
 when <cite>layout</cite> is <cite>NCW</cite>. out_width is calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="o">-</span><span class="n">pool_size</span><span class="p">)</span><span class="o">/</span><span class="n">strides</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil will be used instead of floor in this
 equation.</p>
 </li>
 </ul>
 </dd>
 </dl>
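 <p>The output-width formula can be checked with a small plain-Python sketch. Both helpers are hypothetical: <cite>pool_out_width</cite> evaluates the formula, and <cite>max_pool_1d</cite> pools a single row under the simplifying assumptions <cite>padding=0</cite> and <cite>ceil_mode=False</cite>.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
import math


def pool_out_width(width, pool_size=2, strides=None, padding=0, ceil_mode=False):
    """Evaluate out_width = floor((width + 2*padding - pool_size) / strides) + 1."""
    if strides is None:
        strides = pool_size  # strides defaults to pool_size
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((width + 2 * padding - pool_size) / strides) + 1


def max_pool_1d(row, pool_size=2, strides=None):
    """Max-pool one row of values, assuming padding=0 and ceil_mode=False."""
    if strides is None:
        strides = pool_size
    out_width = pool_out_width(len(row), pool_size, strides)
    return [max(row[i * strides:i * strides + pool_size]) for i in range(out_width)]


print(pool_out_width(10))                                          # 5
print(pool_out_width(8, pool_size=3, strides=2))                   # 3
print(pool_out_width(8, pool_size=3, strides=2, ceil_mode=True))   # 4
print(max_pool_1d([1, 3, 2, 5, 4, 0, 7], pool_size=3, strides=2))  # [3, 5, 7]
```
 </pre></div>
 </div>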
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.MaxPool2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">MaxPool2D</code><span class="sig-paren">(</span><em>pool_size=(2</em>, <em>2)</em>, <em>strides=None</em>, <em>padding=0</em>, <em>layout='NCHW'</em>, <em>ceil_mode=False</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#MaxPool2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.MaxPool2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Max pooling operation for two dimensional (spatial) data.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em><em> or </em><em>list/tuple of 2 ints</em><em>,</em>) – Size of the max pooling windows.</li>
 <li><strong>strides</strong> (<em>int</em><em>, </em><em>list/tuple of 2 ints</em><em>, or </em><em>None.</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>list/tuple of 2 ints</em><em>,</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCHW'</em>) – Dimension ordering of data and weight. Can be ‘NCHW’, ‘NHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’ stand for batch, channel, height, and width
 dimensions respectively. Padding is applied on the ‘H’ and ‘W’ dimensions.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When <cite>True</cite>, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 4D input tensor with shape
 <cite>(batch_size, in_channels, height, width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 4D output tensor with shape
 <cite>(batch_size, channels, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCHW</cite>.
 out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil will be used instead of floor in this
 equation.</p>
 </li>
 </ul>
 </dd>
 </dl>
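 <p>A plain-Python sketch of the two formulas, including the expansion of an <cite>int</cite> argument to a pair. The helper names <cite>_pair</cite> and <cite>max_pool2d_out_shape</cite> are hypothetical and only illustrate the arithmetic.</p>
 <div class="highlight-default"><div class="highlight"><pre>
```python
import math


def _pair(value):
    """Expand an int to a 2-tuple; pass a 2-element list/tuple through."""
    return (value, value) if isinstance(value, int) else tuple(value)


def max_pool2d_out_shape(height, width, pool_size=(2, 2), strides=None,
                         padding=0, ceil_mode=False):
    """Return (out_height, out_width) according to the formulas above."""
    pool_size = _pair(pool_size)
    strides = pool_size if strides is None else _pair(strides)
    padding = _pair(padding)
    rounding = math.ceil if ceil_mode else math.floor
    out_height = rounding((height + 2 * padding[0] - pool_size[0]) / strides[0]) + 1
    out_width = rounding((width + 2 * padding[1] - pool_size[1]) / strides[1]) + 1
    return out_height, out_width


print(max_pool2d_out_shape(32, 32))                                    # (16, 16)
print(max_pool2d_out_shape(7, 9, pool_size=3, strides=2, padding=1))   # (4, 5)
```
 </pre></div>
 </div>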
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.MaxPool3D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">MaxPool3D</code><span class="sig-paren">(</span><em>pool_size=(2</em>, <em>2</em>, <em>2)</em>, <em>strides=None</em>, <em>padding=0</em>, <em>ceil_mode=False</em>, <em>layout='NCDHW'</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#MaxPool3D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.MaxPool3D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Max pooling operation for 3D data (spatial or spatio-temporal).</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
 <li><strong>pool_size</strong> (<em>int</em><em> or </em><em>list/tuple of 3 ints</em><em>,</em>) – Size of the max pooling windows.</li>
 <li><strong>strides</strong> (<em>int</em><em>, </em><em>list/tuple of 3 ints</em><em>, or </em><em>None.</em>) – Factor by which to downscale. E.g. 2 will halve the input size.
 If <cite>None</cite>, it will default to <cite>pool_size</cite>.</li>
 <li><strong>padding</strong> (<em>int</em><em> or </em><em>list/tuple of 3 ints</em><em>,</em>) – If padding is non-zero, then the input is implicitly
 zero-padded on both sides for padding number of points.</li>
 <li><strong>layout</strong> (<em>str</em><em>, </em><em>default 'NCDHW'</em>) – Dimension ordering of data and weight. Can be ‘NCDHW’, ‘NDHWC’, etc.
 ‘N’, ‘C’, ‘H’, ‘W’, ‘D’ stand for batch, channel, height, width and
 depth dimensions respectively. Padding is applied on the ‘D’, ‘H’ and ‘W’
 dimensions.</li>
 <li><strong>ceil_mode</strong> (<em>bool</em><em>, </em><em>default False</em>) – When <cite>True</cite>, will use ceil instead of floor to compute the output shape.</li>
 </ul>
 </td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: 5D input tensor with shape
 <cite>(batch_size, in_channels, depth, height, width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 For other layouts shape is permuted accordingly.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: 5D output tensor with shape
 <cite>(batch_size, channels, out_depth, out_height, out_width)</cite> when <cite>layout</cite> is <cite>NCDHW</cite>.
 out_depth, out_height and out_width are calculated as:</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">out_depth</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">depth</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_height</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">height</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 <span class="n">out_width</span> <span class="o">=</span> <span class="n">floor</span><span class="p">((</span><span class="n">width</span><span class="o">+</span><span class="mi">2</span><span class="o">*</span><span class="n">padding</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">-</span><span class="n">pool_size</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span><span class="o">/</span><span class="n">strides</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span><span class="o">+</span><span class="mi">1</span>
 </pre></div>
 </div>
 <p>When <cite>ceil_mode</cite> is <cite>True</cite>, ceil is used instead of floor in these
 equations.</p>
 </li>
 </ul>
 </dd>
 </dl>
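 <p>The output-size formula above can be checked by hand with a few lines of plain Python. This is only a sketch of the arithmetic, not the layer's implementation: the helper name <cite>pool3d_output_shape</cite> and the input sizes are hypothetical, and it takes explicit 3-tuples for <cite>pool_size</cite>, <cite>strides</cite> and <cite>padding</cite>.</p>

```python
import math


def pool3d_output_shape(in_shape, pool_size, strides, padding, ceil_mode=False):
    """Apply the documented formula to each of (depth, height, width).

    Hypothetical helper for illustration; all arguments are 3-tuples.
    """
    rounding = math.ceil if ceil_mode else math.floor
    return tuple(
        rounding((size + 2 * pad - pool) / stride) + 1
        for size, pool, stride, pad in zip(in_shape, pool_size, strides, padding)
    )


# Assumed example: depth=9, height=32, width=32, 2x2x2 window, stride 2, no padding.
floor_shape = pool3d_output_shape((9, 32, 32), (2, 2, 2), (2, 2, 2), (0, 0, 0))
ceil_shape = pool3d_output_shape((9, 32, 32), (2, 2, 2), (2, 2, 2), (0, 0, 0),
                                 ceil_mode=True)

print(floor_shape)  # (4, 16, 16)
print(ceil_shape)   # (5, 16, 16)
```

 <p>Only the odd-sized depth dimension changes between the two modes, because (9 - 2) / 2 is not an integer; the even-sized height and width divide exactly.</p>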
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.PReLU">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">PReLU</code><span class="sig-paren">(</span><em>alpha_initializer=&lt;mxnet.initializer.Constant object&gt;</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#PReLU"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.PReLU" title="Permalink to this definition">¶</a></dt>
 <dd><p>Parametric leaky version of a Rectified Linear Unit, as described in the paper at
 <a class="reference external" href="https://arxiv.org/abs/1502.01852">https://arxiv.org/abs/1502.01852</a>.</p>
 <p>It learns the slope applied to the input when the unit is not active (<cite>x &lt; 0</cite>):</p>
 <div class="math">
 \[\begin{split}f\left(x\right) = \left\{
     \begin{array}{lr}
        \alpha x &amp; : x \lt 0 \\
               x &amp; : x \geq 0 \\
     \end{array}
 \right.\\\end{split}\]</div>
 <p>where alpha is a learned parameter.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>alpha_initializer</strong> (<a class="reference internal" href="../optimization/optimization.html#mxnet.initializer.Initializer" title="mxnet.initializer.Initializer"><em>Initializer</em></a>) – Initializer for the learned <cite>alpha</cite> parameter.</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
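 <p>As a rough illustration of the piecewise definition above, the following plain-Python sketch applies it element by element. The function name <cite>prelu</cite> and the fixed value <cite>alpha = 0.25</cite> are assumptions for the example; in the layer itself alpha is a parameter updated during training.</p>

```python
def prelu(x, alpha):
    """Piecewise definition from above: alpha * x for x < 0, x otherwise."""
    return x if x >= 0 else alpha * x


alpha = 0.25  # assumed illustrative value; the real layer learns this
values = [-2.0, -0.5, 0.0, 1.5]
activated = [prelu(v, alpha) for v in values]

print(activated)  # [-0.5, -0.125, 0.0, 1.5]
```

 <p>Negative inputs are scaled by alpha, while non-negative inputs pass through unchanged.</p>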
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.ReflectionPad2D">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">ReflectionPad2D</code><span class="sig-paren">(</span><em>padding=0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/conv_layers.html#ReflectionPad2D"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.ReflectionPad2D" title="Permalink to this definition">¶</a></dt>
 <dd><p>Pads the input tensor using the reflection of the input boundary.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>padding</strong> (<em>int</em>) – Number of elements to pad on each side of the height and width dimensions.</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with the shape <span class="math">\((N, C, H_{in}, W_{in})\)</span>.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last">
 <li><p class="first"><strong>out</strong>: output tensor with the shape <span class="math">\((N, C, H_{out}, W_{out})\)</span>, where</p>
 <div class="math">
 \[ \begin{align}\begin{aligned}H_{out} = H_{in} + 2 \cdot padding\\W_{out} = W_{in} + 2 \cdot padding\end{aligned}\end{align} \]</div>
 </li>
 </ul>
 </dd>
 </dl>
 <p class="rubric">Examples</p>
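 <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">mxnet</span> <span class="k">as</span> <span class="nn">mx</span>
 <span class="gp">>>> </span><span class="kn">from</span> <span class="nn">mxnet.gluon</span> <span class="k">import</span> <span class="n">nn</span>
 <span class="gp">>>> </span><span class="n">m</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">ReflectionPad2D</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>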
 <span class="gp">>>> </span><span class="nb">input</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">224</span><span class="p">,</span> <span class="mi">224</span><span class="p">))</span>
 <span class="gp">>>> </span><span class="n">output</span> <span class="o">=</span> <span class="n">m</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span>
 </pre></div>
 </div>
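 <p>To make the idea of reflection padding and the output-shape formula concrete, here is a small sketch on nested Python lists that does not use MXNet. It assumes the border element itself is not repeated when mirroring (the convention NumPy calls <cite>reflect</cite>); whether this layer follows exactly that convention is an assumption, and the helper names are hypothetical.</p>

```python
def reflect_pad_1d(seq, padding):
    """Mirror `padding` items around the first and last elements of a list.

    Assumption: the border element is not duplicated, so `padding` must be
    smaller than len(seq).
    """
    left = seq[1:padding + 1][::-1]
    right = seq[-padding - 1:-1][::-1]
    return left + seq + right


def reflect_pad_2d(image, padding):
    """Pad the width of every row, then pad the height by mirroring rows."""
    rows = [reflect_pad_1d(row, padding) for row in image]
    return reflect_pad_1d(rows, padding)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = reflect_pad_2d(image, 1)

for row in padded:
    print(row)
# [5, 4, 5, 6, 5]
# [2, 1, 2, 3, 2]
# [5, 4, 5, 6, 5]
# [8, 7, 8, 9, 8]
# [5, 4, 5, 6, 5]
```

 <p>The 3 x 3 input becomes 5 x 5, matching H_out = H_in + 2 * padding and W_out = W_in + 2 * padding.</p>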
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.SELU">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">SELU</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#SELU"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.SELU" title="Permalink to this definition">¶</a></dt>
 <dd><dl class="docutils">
 <dt>Scaled Exponential Linear Unit (SELU)</dt>
 <dd>“Self-Normalizing Neural Networks”, Klambauer et al., 2017
 <a class="reference external" href="https://arxiv.org/abs/1706.02515">https://arxiv.org/abs/1706.02515</a></dd>
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
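 <p>The entry above gives no formula, so the following sketch uses the definition from the cited paper: scale * x for positive inputs and scale * alpha * (exp(x) - 1) otherwise. The constants are the rounded values reported in that paper; that this layer uses the same constants is an assumption, and the function name <cite>selu</cite> is chosen only for the example.</p>

```python
import math

# Constants from the cited paper (assumed to match the layer's values).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805


def selu(x):
    """Scaled ELU: linear for x > 0, saturating exponential otherwise."""
    if x > 0:
        return SCALE * x
    return SCALE * ALPHA * (math.exp(x) - 1.0)


samples = [-100.0, -1.0, 0.0, 1.0]
outputs = [selu(v) for v in samples]
print(outputs)
```

 <p>For large negative inputs the output saturates near -SCALE * ALPHA, while positive inputs are simply multiplied by SCALE.</p>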
 </dd></dl>
 <dl class="class">
 <dt id="mxnet.gluon.nn.Swish">
 <em class="property">class </em><code class="descclassname">mxnet.gluon.nn.</code><code class="descname">Swish</code><span class="sig-paren">(</span><em>beta=1.0</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/mxnet/gluon/nn/activations.html#Swish"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#mxnet.gluon.nn.Swish" title="Permalink to this definition">¶</a></dt>
 <dd><dl class="docutils">
 <dt>Swish Activation function</dt>
 <dd><a class="reference external" href="https://arxiv.org/pdf/1710.05941.pdf">https://arxiv.org/pdf/1710.05941.pdf</a></dd>
 </dl>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name"/>
 <col class="field-body"/>
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>beta</strong> (<em>float</em>) – Scaling factor applied to the input of the sigmoid, so that swish(x) = x * sigmoid(beta*x).</td>
 </tr>
 </tbody>
 </table>
 <dl class="docutils">
 <dt>Inputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>data</strong>: input tensor with arbitrary shape.</li>
 </ul>
 </dd>
 <dt>Outputs:</dt>
 <dd><ul class="first last simple">
 <li><strong>out</strong>: output tensor with the same shape as <cite>data</cite>.</li>
 </ul>
 </dd>
 </dl>
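 <p>The formula swish(x) = x * sigmoid(beta*x) can be tried out directly in plain Python. This is a scalar sketch for illustration only; the helper names are hypothetical, and the simple <cite>sigmoid</cite> shown here is not protected against overflow for very large negative arguments.</p>

```python
import math


def sigmoid(z):
    """Logistic function (simple form; math.exp overflows if z is below about -709)."""
    return 1.0 / (1.0 + math.exp(-z))


def swish(x, beta=1.0):
    """swish(x) = x * sigmoid(beta * x), as given in the parameter description."""
    return x * sigmoid(beta * x)


default_beta = swish(1.0)         # beta = 1.0
zero_beta = swish(2.0, beta=0.0)  # sigmoid(0) = 0.5, so the result is x / 2
large_beta = swish(3.0, beta=50.0)

print(default_beta, zero_beta, large_beta)
```

 <p>With beta = 0 the function reduces to x / 2, and for a large beta it behaves almost like ReLU, since the sigmoid approaches a step function.</p>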
 </dd></dl>
 <script>auto_index("api-reference");</script></div>
 </div>
 </div>
 </div>
 <div aria-label="main navigation" class="sphinxsidebar rightsidebar" role="navigation">
 <div class="sphinxsidebarwrapper">
 <h3><a href="../../../index.html">Table Of Contents</a></h3>
 <ul>
 <li><a class="reference internal" href="#">Gluon Neural Network Layers</a><ul>
 <li><a class="reference internal" href="#overview">Overview</a></li>
 <li><a class="reference internal" href="#basic-layers">Basic Layers</a></li>
 <li><a class="reference internal" href="#convolutional-layers">Convolutional Layers</a></li>
 <li><a class="reference internal" href="#pooling-layers">Pooling Layers</a></li>
 <li><a class="reference internal" href="#activation-layers">Activation Layers</a></li>
 <li><a class="reference internal" href="#api-reference">API Reference</a></li>
 </ul>
 </li>
 </ul>
 </div>
 </div>
 </div><div class="footer">
 <div class="section-disclaimer">
 <div class="container">
 <div>
 <img height="60" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/apache_incubator_logo.png"/>
 <p>
             Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), <strong>sponsored by the <i>Apache Incubator</i></strong>. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
         </p>
 <p>
             Copyright © 2017-2018, The Apache Software Foundation.
             Apache MXNet, MXNet, Apache, the Apache feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the Apache Software Foundation.
         </p>
 </div>
 </div>
 </div>
 </div> <!-- pagename != index -->
 </div>
 <script crossorigin="anonymous" integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script>
 <script src="../../../_static/js/sidebar.js" type="text/javascript"></script>
 <script src="../../../_static/js/search.js" type="text/javascript"></script>
 <script src="../../../_static/js/navbar.js" type="text/javascript"></script>
 <script src="../../../_static/js/clipboard.min.js" type="text/javascript"></script>
 <script src="../../../_static/js/copycode.js" type="text/javascript"></script>
 <script src="../../../_static/js/page.js" type="text/javascript"></script>
 <script src="../../../_static/js/docversion.js" type="text/javascript"></script>
 <script type="text/javascript">
         $('body').ready(function () {
             $('body').css('visibility', 'visible');
         });
     </script>
 </body>
 </html>