| <!DOCTYPE html> |
| |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8"/> |
| <meta content="IE=edge" http-equiv="X-UA-Compatible"/> |
| <meta content="width=device-width, initial-scale=1" name="viewport"/> |
| <meta content="Iterators - Loading data" property="og:title"> |
| <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image"> |
| <meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image:secure_url"> |
| <meta content="Iterators - Loading data" property="og:description"/> |
| <title>Iterators - Loading data — mxnet documentation</title> |
| <link crossorigin="anonymous" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" rel="stylesheet"/> |
| <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css" rel="stylesheet"/> |
| <link href="../../_static/basic.css" rel="stylesheet" type="text/css"> |
| <link href="../../_static/pygments.css" rel="stylesheet" type="text/css"> |
| <link href="../../_static/mxnet.css" rel="stylesheet" type="text/css"/> |
| <script type="text/javascript"> |
| var DOCUMENTATION_OPTIONS = { |
| URL_ROOT: '../../', |
| VERSION: '', |
| COLLAPSE_INDEX: false, |
| FILE_SUFFIX: '.html', |
| HAS_SOURCE: true, |
| SOURCELINK_SUFFIX: '.txt' |
| }; |
| </script> |
| <script src="https://code.jquery.com/jquery-1.11.1.min.js" type="text/javascript"></script> |
| <script src="../../_static/underscore.js" type="text/javascript"></script> |
| <script src="../../_static/searchtools_custom.js" type="text/javascript"></script> |
| <script src="../../_static/doctools.js" type="text/javascript"></script> |
| <script src="../../_static/selectlang.js" type="text/javascript"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script> |
| <script type="text/javascript"> jQuery(function() { Search.loadIndex("/versions/1.1.0/searchindex.js"); Search.init();}); </script> |
| <script> |
| (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ |
| (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new |
| Date();a=s.createElement(o), |
| m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) |
| })(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); |
| |
| ga('create', 'UA-96378503-1', 'auto'); |
| ga('send', 'pageview'); |
| |
| </script> |
| <link href="../../genindex.html" rel="index" title="Index"> |
| <link href="../../search.html" rel="search" title="Search"/> |
| <link href="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-icon.png" rel="icon" type="image/png"/> |
| </link></link></link></meta></meta></meta></head> |
| <body background="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-background-compressed.jpeg" role="document"> |
| <div class="content-block"><div class="navbar navbar-fixed-top"> |
| <div class="container" id="navContainer"> |
| <div class="innder" id="header-inner"> |
| <h1 id="logo-wrap"> |
| <a href="../../" id="logo"><img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet_logo.png"/></a> |
| </h1> |
| <nav class="nav-bar" id="main-nav"> |
| <a class="main-nav-link" href="/versions/1.1.0/install/index.html">Install</a> |
| <span id="dropdown-menu-position-anchor"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Gluon <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/1.1.0/tutorials/gluon/gluon.html">About</a></li> |
| <li><a class="main-nav-link" href="https://www.d2l.ai/">Dive into Deep Learning</a></li> |
| <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li> |
| <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">API <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/python/index.html">Python</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/c++/index.html">C++</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/julia/index.html">Julia</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/perl/index.html">Perl</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/r/index.html">R</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/scala/index.html">Scala</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-docs"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Docs <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-docs"> |
| <li><a class="main-nav-link" href="/versions/1.1.0/faq/index.html">FAQ</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/tutorials/index.html">Tutorials</a> |
| <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/v1.1.0/example">Examples</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/architecture/index.html">Architecture</a></li> |
| <li><a class="main-nav-link" href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/model_zoo/index.html">Model Zoo</a></li> |
| <li><a class="main-nav-link" href="https://github.com/onnx/onnx-mxnet">ONNX</a></li> |
| </li></ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-community"> |
| <a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Community <span class="caret"></span></a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-community"> |
| <li><a class="main-nav-link" href="http://discuss.mxnet.io">Forum</a></li> |
| <li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/v1.1.0">Github</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/community/contribute.html">Contribute</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/community/powered_by.html">Powered By</a></li> |
| </ul> |
| </span> |
| <span id="dropdown-menu-position-anchor-version" style="position: relative"><a href="#" class="main-nav-link dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="true">1.1.0<span class="caret"></span></a><ul id="package-dropdown-menu" class="dropdown-menu"><li><a href="/">master</a></li><li><a href="/versions/1.7.0/">1.7.0</a></li><li><a href=/versions/1.6.0/>1.6.0</a></li><li><a href=/versions/1.5.0/>1.5.0</a></li><li><a href=/versions/1.4.1/>1.4.1</a></li><li><a href=/versions/1.3.1/>1.3.1</a></li><li><a href=/versions/1.2.1/>1.2.1</a></li><li><a href=/versions/1.1.0/>1.1.0</a></li><li><a href=/versions/1.0.0/>1.0.0</a></li><li><a href=/versions/0.12.1/>0.12.1</a></li><li><a href=/versions/0.11.0/>0.11.0</a></li></ul></span></nav> |
| <script> function getRootPath(){ return "../../" } </script> |
| <div class="burgerIcon dropdown"> |
| <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button">☰</a> |
| <ul class="dropdown-menu" id="burgerMenu"> |
| <li><a href="/versions/1.1.0/install/index.html">Install</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/tutorials/index.html">Tutorials</a></li> |
| <li class="dropdown-submenu dropdown"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Gluon</a> |
| <ul class="dropdown-menu navbar-menu" id="package-dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/1.1.0/tutorials/gluon/gluon.html">About</a></li> |
| <li><a class="main-nav-link" href="http://gluon.mxnet.io">The Straight Dope (Tutorials)</a></li> |
| <li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li> |
| <li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">API</a> |
| <ul class="dropdown-menu"> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/python/index.html">Python</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/c++/index.html">C++</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/julia/index.html">Julia</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/perl/index.html">Perl</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/r/index.html">R</a></li> |
| <li><a class="main-nav-link" href="/versions/1.1.0/api/scala/index.html">Scala</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu"> |
| <a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Docs</a> |
| <ul class="dropdown-menu"> |
| <li><a href="/versions/1.1.0/faq/index.html" tabindex="-1">FAQ</a></li> |
| <li><a href="/versions/1.1.0/tutorials/index.html" tabindex="-1">Tutorials</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/tree/v1.1.0/example" tabindex="-1">Examples</a></li> |
| <li><a href="/versions/1.1.0/architecture/index.html" tabindex="-1">Architecture</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home" tabindex="-1">Developer Wiki</a></li> |
| <li><a href="/versions/1.1.0/model_zoo/index.html" tabindex="-1">Gluon Model Zoo</a></li> |
| <li><a href="https://github.com/onnx/onnx-mxnet" tabindex="-1">ONNX</a></li> |
| </ul> |
| </li> |
| <li class="dropdown-submenu dropdown"> |
| <a aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" role="button" tabindex="-1">Community</a> |
| <ul class="dropdown-menu"> |
| <li><a href="http://discuss.mxnet.io" tabindex="-1">Forum</a></li> |
| <li><a href="https://github.com/apache/incubator-mxnet/tree/v1.1.0" tabindex="-1">Github</a></li> |
| <li><a href="/versions/1.1.0/community/contribute.html" tabindex="-1">Contribute</a></li> |
| <li><a href="/versions/1.1.0/community/powered_by.html" tabindex="-1">Powered By</a></li> |
| </ul> |
| </li> |
| <li id="dropdown-menu-position-anchor-version-mobile" class="dropdown-submenu" style="position: relative"><a href="#" tabindex="-1">1.1.0</a><ul class="dropdown-menu"><li><a tabindex="-1" href=/>master</a></li><li><a tabindex="-1" href=/versions/1.6.0/>1.6.0</a></li><li><a tabindex="-1" href=/versions/1.5.0/>1.5.0</a></li><li><a tabindex="-1" href=/versions/1.4.1/>1.4.1</a></li><li><a tabindex="-1" href=/versions/1.3.1/>1.3.1</a></li><li><a tabindex="-1" href=/versions/1.2.1/>1.2.1</a></li><li><a tabindex="-1" href=/versions/1.1.0/>1.1.0</a></li><li><a tabindex="-1" href=/versions/1.0.0/>1.0.0</a></li><li><a tabindex="-1" href=/versions/0.12.1/>0.12.1</a></li><li><a tabindex="-1" href=/versions/0.11.0/>0.11.0</a></li></ul></li></ul> |
| </div> |
| <div class="plusIcon dropdown"> |
| <a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button"><span aria-hidden="true" class="glyphicon glyphicon-plus"></span></a> |
| <ul class="dropdown-menu dropdown-menu-right" id="plusMenu"></ul> |
| </div> |
| <div id="search-input-wrap"> |
| <form action="../../search.html" autocomplete="off" class="" method="get" role="search"> |
| <div class="form-group inner-addon left-addon"> |
| <i class="glyphicon glyphicon-search"></i> |
| <input class="form-control" name="q" placeholder="Search" type="text"/> |
| </div> |
| <input name="check_keywords" type="hidden" value="yes"> |
| <input name="area" type="hidden" value="default"/> |
| </input></form> |
| <div id="search-preview"></div> |
| </div> |
| <div id="searchIcon"> |
| <span aria-hidden="true" class="glyphicon glyphicon-search"></span> |
| </div> |
| </div> |
| </div> |
| </div> |
| <script type="text/javascript"> |
| $('body').css('background', 'white'); |
| </script> |
| <div class="container"> |
| <div class="row"> |
| <div aria-label="main navigation" class="sphinxsidebar leftsidebar" role="navigation"> |
| <div class="sphinxsidebarwrapper"> |
| <ul> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/python/index.html">Python Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/r/index.html">R Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/julia/index.html">Julia Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/c++/index.html">C++ Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/scala/index.html">Scala Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../api/perl/index.html">Perl Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../faq/index.html">HowTo Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../architecture/index.html">System Documents</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../index.html">Tutorials</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../community/index.html">Community</a></li> |
| </ul> |
| </div> |
| </div> |
| <div class="content"> |
| <div class="page-tracker"></div> |
| <div class="section" id="iterators-loading-data"> |
| <span id="iterators-loading-data"></span><h1>Iterators - Loading data<a class="headerlink" href="#iterators-loading-data" title="Permalink to this headline">¶</a></h1> |
| <p>In this tutorial, we focus on how to feed data into a training or inference program. |
| Most training and inference modules in MXNet accept data iterators, |
| which simplifies this procedure, especially when reading large datasets. |
| Here we discuss the API conventions and several provided iterators.</p> |
| <div class="section" id="prerequisites"> |
| <span id="prerequisites"></span><h2>Prerequisites<a class="headerlink" href="#prerequisites" title="Permalink to this headline">¶</a></h2> |
| <p>To complete this tutorial, we need:</p> |
| <ul class="simple"> |
| <li>MXNet. See the instructions for your operating system in <a class="reference external" href="/versions/1.1.0/install/index.html">Setup and Installation</a>.</li> |
| <li><a class="reference external" href="http://opencv.org/opencv-3-2.html">OpenCV Python library</a>, <a class="reference external" href="http://docs.python-requests.org/en/master/">Python Requests</a>, <a class="reference external" href="https://matplotlib.org/">Matplotlib</a> and <a class="reference external" href="http://jupyter.org/index.html">Jupyter Notebook</a>.</li> |
| </ul> |
| <div class="highlight-default"><div class="highlight"><pre><span></span>$ pip install opencv-python requests matplotlib jupyter |
| </pre></div> |
| </div> |
| <ul class="simple"> |
| <li>Set the environment variable <code class="docutils literal"><span class="pre">MXNET_HOME</span></code> to the root of the MXNet source folder.</li> |
| </ul> |
| <div class="highlight-default"><div class="highlight"><pre><span></span>$ git clone https://github.com/dmlc/mxnet ~/mxnet |
| $ export MXNET_HOME="$HOME/mxnet" |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="mxnet-data-iterator"> |
| <span id="mxnet-data-iterator"></span><h2>MXNet Data Iterator<a class="headerlink" href="#mxnet-data-iterator" title="Permalink to this headline">¶</a></h2> |
| <p>Data Iterators in <em>MXNet</em> are similar to Python iterator objects. |
| In Python, the built-in function <code class="docutils literal"><span class="pre">iter</span></code> returns an iterator over an iterable object such as a <code class="docutils literal"><span class="pre">list</span></code>, |
| and each call to <code class="docutils literal"><span class="pre">next()</span></code> on that iterator fetches the following item. |
| Iterators provide an abstract interface for traversing various types of iterable collections |
| without needing to expose details about the underlying data source.</p> |
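<p>As a minimal plain-Python sketch of the protocol described above (no MXNet involved; the list contents are made up for illustration), fetching items with <code class="docutils literal"><span class="pre">next()</span></code> until <code class="docutils literal"><span class="pre">StopIteration</span></code> is raised might look like this:</p>

```python
# Plain-Python illustration of the iterator protocol (illustrative values only).
items = [10, 20, 30]
it = iter(items)           # iter() returns an iterator over the list

first = next(it)           # each next() call fetches the following item
rest = []
while True:
    try:
        rest.append(next(it))
    except StopIteration:  # raised once the iterator is exhausted
        break

print(first, rest)
```

<p>A <code class="docutils literal"><span class="pre">for</span></code> loop performs the same steps implicitly, which is why iterators can be consumed with <code class="docutils literal"><span class="pre">for</span> <span class="pre">item</span> <span class="pre">in</span> <span class="pre">iterable</span></code>.</p>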
| <p>In MXNet, data iterators return a batch of data as <code class="docutils literal"><span class="pre">DataBatch</span></code> on each call to <code class="docutils literal"><span class="pre">next</span></code>. |
| A <code class="docutils literal"><span class="pre">DataBatch</span></code> often contains <em>n</em> training examples and their corresponding labels. Here <em>n</em> is the <code class="docutils literal"><span class="pre">batch_size</span></code> of the iterator. At the end of the data stream, when there is no more data to read, the iterator raises a <code class="docutils literal"><span class="pre">StopIteration</span></code> exception, just as a Python iterator does. |
| The structure of <code class="docutils literal"><span class="pre">DataBatch</span></code> is defined <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.DataBatch">here</a>.</p> |
| <p>Information such as the name, shape, type and layout of each training example and its corresponding label can be provided as <code class="docutils literal"><span class="pre">DataDesc</span></code> data descriptor objects via the <code class="docutils literal"><span class="pre">provide_data</span></code> and <code class="docutils literal"><span class="pre">provide_label</span></code> properties in <code class="docutils literal"><span class="pre">DataBatch</span></code>. |
| The structure of <code class="docutils literal"><span class="pre">DataDesc</span></code> is defined <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.DataDesc">here</a>.</p> |
| <p>All IO in MXNet is handled via <code class="docutils literal"><span class="pre">mx.io.DataIter</span></code> and its subclasses. In this tutorial, we’ll discuss a few commonly used iterators provided by MXNet.</p> |
| <p>Before diving into the details, let’s set up the environment by importing some required packages:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">mxnet</span> <span class="kn">as</span> <span class="nn">mx</span> |
| <span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span> |
| <span class="kn">import</span> <span class="nn">os</span> |
| <span class="kn">import</span> <span class="nn">sys</span> |
| <span class="kn">import</span> <span class="nn">subprocess</span> |
| <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> |
| <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span> |
| <span class="kn">import</span> <span class="nn">tarfile</span> |
| |
| <span class="kn">import</span> <span class="nn">warnings</span> |
| <span class="n">warnings</span><span class="o">.</span><span class="n">filterwarnings</span><span class="p">(</span><span class="s2">"ignore"</span><span class="p">,</span> <span class="n">category</span><span class="o">=</span><span class="ne">DeprecationWarning</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="reading-data-in-memory"> |
| <span id="reading-data-in-memory"></span><h2>Reading data in memory<a class="headerlink" href="#reading-data-in-memory" title="Permalink to this headline">¶</a></h2> |
| <p>When data is stored in memory, backed by either an <code class="docutils literal"><span class="pre">NDArray</span></code> or <code class="docutils literal"><span class="pre">numpy</span></code> <code class="docutils literal"><span class="pre">ndarray</span></code>, |
| we can use the <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.NDArrayIter"><strong><code class="docutils literal"><span class="pre">NDArrayIter</span></code></strong></a> to read data as below:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span><span class="mi">3</span><span class="p">)</span> |
| <span class="n">label</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="p">(</span><span class="mi">100</span><span class="p">,))</span> |
| <span class="n">data_iter</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">io</span><span class="o">.</span><span class="n">NDArrayIter</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span> |
| <span class="k">for</span> <span class="n">batch</span> <span class="ow">in</span> <span class="n">data_iter</span><span class="p">:</span> |
|     <span class="k">print</span><span class="p">([</span><span class="n">batch</span><span class="o">.</span><span class="n">data</span><span class="p">,</span> <span class="n">batch</span><span class="o">.</span><span class="n">label</span><span class="p">,</span> <span class="n">batch</span><span class="o">.</span><span class="n">pad</span><span class="p">])</span> |
| </pre></div> |
| </div> |
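<p>The <code class="docutils literal"><span class="pre">pad</span></code> field printed by the loop above counts how many entries in a batch are padding rather than real examples. The following plain-Python sketch shows the arithmetic, assuming the iterator pads only the final batch up to <code class="docutils literal"><span class="pre">batch_size</span></code> (to the best of our knowledge the default behaviour of <code class="docutils literal"><span class="pre">NDArrayIter</span></code>, but treat that as an assumption). The helper name <code class="docutils literal"><span class="pre">padding_per_batch</span></code> is hypothetical and not part of MXNet.</p>

```python
import math


def padding_per_batch(num_examples, batch_size):
    """Hypothetical helper: pad count of each batch when only the last batch is padded.

    Assumes num_examples and batch_size are positive integers.
    """
    num_batches = math.ceil(num_examples / batch_size)
    pads = [0] * num_batches
    # The last batch is filled up to batch_size; the shortfall is the pad count.
    pads[-1] = num_batches * batch_size - num_examples
    return pads


# Same sizes as the example above: 100 examples, batch size 30.
print(padding_per_batch(100, 30))
```

<p>With 100 examples and a batch size of 30, this gives four batches, the last of which carries 20 padded entries.</p>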
| </div> |
| <div class="section" id="reading-data-from-csv-files"> |
| <span id="reading-data-from-csv-files"></span><h2>Reading data from CSV files<a class="headerlink" href="#reading-data-from-csv-files" title="Permalink to this headline">¶</a></h2> |
| <p>MXNet provides <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.CSVIter"><code class="docutils literal"><span class="pre">CSVIter</span></code></a> |
| to read from CSV files. It can be used as follows:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Let's save `data` into a CSV file first and try reading it back</span> |
| <span class="n">np</span><span class="o">.</span><span class="n">savetxt</span><span class="p">(</span><span class="s1">'data.csv'</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">delimiter</span><span class="o">=</span><span class="s1">','</span><span class="p">)</span> |
| <span class="n">data_iter</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">io</span><span class="o">.</span><span class="n">CSVIter</span><span class="p">(</span><span class="n">data_csv</span><span class="o">=</span><span class="s1">'data.csv'</span><span class="p">,</span> <span class="n">data_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span><span class="p">,),</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span> |
| <span class="k">for</span> <span class="n">batch</span> <span class="ow">in</span> <span class="n">data_iter</span><span class="p">:</span> |
|     <span class="k">print</span><span class="p">([</span><span class="n">batch</span><span class="o">.</span><span class="n">data</span><span class="p">,</span> <span class="n">batch</span><span class="o">.</span><span class="n">pad</span><span class="p">])</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="custom-iterator"> |
| <span id="custom-iterator"></span><h2>Custom Iterator<a class="headerlink" href="#custom-iterator" title="Permalink to this headline">¶</a></h2> |
| <p>When the built-in iterators do not suit your application needs, |
| you can create your own custom data iterator.</p> |
| <p>An iterator in <em>MXNet</em> should</p> |
| <ol class="simple"> |
| <li>Implement <code class="docutils literal"><span class="pre">next()</span></code> in <code class="docutils literal"><span class="pre">Python2</span></code> or <code class="docutils literal"><span class="pre">__next__()</span></code> in <code class="docutils literal"><span class="pre">Python3</span></code>, |
| returning a <code class="docutils literal"><span class="pre">DataBatch</span></code> or raising a <code class="docutils literal"><span class="pre">StopIteration</span></code> exception at the end of the data stream.</li> |
| <li>Implement the <code class="docutils literal"><span class="pre">reset()</span></code> method to restart reading from the beginning.</li> |
| <li>Have a <code class="docutils literal"><span class="pre">provide_data</span></code> attribute consisting of a list of <code class="docutils literal"><span class="pre">DataDesc</span></code> objects that store the name, shape, type and layout information of the data (more info <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.DataDesc">here</a>).</li> |
| <li>Have a <code class="docutils literal"><span class="pre">provide_label</span></code> attribute consisting of a list of <code class="docutils literal"><span class="pre">DataDesc</span></code> objects that store the name, shape, type and layout information of the label.</li> |
| </ol> |
| <p>When creating a new iterator, you can either start from scratch and define an iterator or reuse one of the existing iterators. |
| For example, in an image captioning application, each input example is an image and its label is a sentence. |
| Thus we can create a new iterator by:</p> |
| <ul class="simple"> |
| <li>creating an <code class="docutils literal"><span class="pre">image_iter</span></code> by using <code class="docutils literal"><span class="pre">ImageRecordIter</span></code>, which provides multithreaded pre-fetch and augmentation.</li> |
| <li>creating a <code class="docutils literal"><span class="pre">caption_iter</span></code> by using <code class="docutils literal"><span class="pre">NDArrayIter</span></code> or the bucketing iterator provided in the <em>rnn</em> package.</li> |
| <li>defining <code class="docutils literal"><span class="pre">next()</span></code> to return the combined result of <code class="docutils literal"><span class="pre">image_iter.next()</span></code> and <code class="docutils literal"><span class="pre">caption_iter.next()</span></code>.</li> |
| </ul> |
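<p>The three steps above can be sketched in plain Python. This is only an illustration of the combining pattern, not the MXNet implementation: the class name <code class="docutils literal"><span class="pre">CombinedIter</span></code> is hypothetical, and ordinary Python iterators over made-up strings stand in for <code class="docutils literal"><span class="pre">ImageRecordIter</span></code> and <code class="docutils literal"><span class="pre">NDArrayIter</span></code>.</p>

```python
class CombinedIter(object):
    """Hypothetical wrapper pairing batches from two underlying iterators."""

    def __init__(self, image_iter, caption_iter):
        self.image_iter = image_iter
        self.caption_iter = caption_iter

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration raised by either source ends the combined stream.
        image_batch = next(self.image_iter)
        caption_batch = next(self.caption_iter)
        return image_batch, caption_batch

    next = __next__  # Python 2 spelling of the same method


# Stand-in data: plain iterators play the role of the two MXNet iterators.
images = iter(["img0", "img1"])
captions = iter(["a cat", "a dog"])
pairs = list(CombinedIter(images, captions))
print(pairs)
```

<p>A real MXNet iterator would additionally implement <code class="docutils literal"><span class="pre">reset()</span></code>, <code class="docutils literal"><span class="pre">provide_data</span></code> and <code class="docutils literal"><span class="pre">provide_label</span></code>, and would wrap the paired results in a <code class="docutils literal"><span class="pre">DataBatch</span></code>.</p>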
| <p>The example below shows how to create a simple iterator.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">SimpleIter</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">io</span><span class="o">.</span><span class="n">DataIter</span><span class="p">):</span> |
|     <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data_names</span><span class="p">,</span> <span class="n">data_shapes</span><span class="p">,</span> <span class="n">data_gen</span><span class="p">,</span> |
|                  <span class="n">label_names</span><span class="p">,</span> <span class="n">label_shapes</span><span class="p">,</span> <span class="n">label_gen</span><span class="p">,</span> <span class="n">num_batches</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">_provide_data</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">data_names</span><span class="p">,</span> <span class="n">data_shapes</span><span class="p">))</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">_provide_label</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">label_names</span><span class="p">,</span> <span class="n">label_shapes</span><span class="p">))</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">num_batches</span> <span class="o">=</span> <span class="n">num_batches</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">data_gen</span> <span class="o">=</span> <span class="n">data_gen</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">label_gen</span> <span class="o">=</span> <span class="n">label_gen</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">cur_batch</span> <span class="o">=</span> <span class="mi">0</span> |
| |
|     <span class="k">def</span> <span class="fm">__iter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="k">return</span> <span class="bp">self</span> |
| |
|     <span class="k">def</span> <span class="nf">reset</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">cur_batch</span> <span class="o">=</span> <span class="mi">0</span> |
| |
|     <span class="k">def</span> <span class="nf">__next__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> |
| |
|     <span class="nd">@property</span> |
|     <span class="k">def</span> <span class="nf">provide_data</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">_provide_data</span> |
| |
|     <span class="nd">@property</span> |
|     <span class="k">def</span> <span class="nf">provide_label</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">_provide_label</span> |
| |
|     <span class="k">def</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> |
|         <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">cur_batch</span> <span class="o">&lt;</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_batches</span><span class="p">:</span> |
|             <span class="bp">self</span><span class="o">.</span><span class="n">cur_batch</span> <span class="o">+=</span> <span class="mi">1</span> |
|             <span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">g</span><span class="p">(</span><span class="n">d</span><span class="p">[</span><span class="mi">1</span><span class="p">]))</span> <span class="k">for</span> <span class="n">d</span><span class="p">,</span><span class="n">g</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_provide_data</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">data_gen</span><span class="p">)]</span> |
|             <span class="n">label</span> <span class="o">=</span> <span class="p">[</span><span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">g</span><span class="p">(</span><span class="n">d</span><span class="p">[</span><span class="mi">1</span><span class="p">]))</span> <span class="k">for</span> <span class="n">d</span><span class="p">,</span><span class="n">g</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_provide_label</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">label_gen</span><span class="p">)]</span> |
| <span class="k">return</span> <span class="n">mx</span><span class="o">.</span><span class="n">io</span><span class="o">.</span><span class="n">DataBatch</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> |
| <span class="k">else</span><span class="p">:</span> |
| <span class="k">raise</span> <span class="ne">StopIteration</span> |
| </pre></div> |
| </div> |
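| <p>The iteration contract that <code class="docutils literal"><span class="pre">SimpleIter</span></code> follows can be tried without MXNet. The sketch below uses a hypothetical <code class="docutils literal"><span class="pre">CountingIter</span></code> class (not part of MXNet) that yields batch indices instead of <code class="docutils literal"><span class="pre">DataBatch</span></code> objects. It only illustrates how <code class="docutils literal"><span class="pre">next</span></code>, <code class="docutils literal"><span class="pre">StopIteration</span></code> and <code class="docutils literal"><span class="pre">reset</span></code> interact across epochs.</p> |

```python
class CountingIter(object):
    """Hypothetical stand-in for SimpleIter: yields batch indices, not DataBatch objects."""

    def __init__(self, num_batches):
        self.num_batches = num_batches
        self.cur_batch = 0

    def __iter__(self):
        return self

    def reset(self):
        # Rewind so that the next epoch starts from the first batch.
        self.cur_batch = 0

    def __next__(self):
        return self.next()

    def next(self):
        if self.cur_batch == self.num_batches:
            raise StopIteration
        self.cur_batch += 1
        return self.cur_batch - 1


it = CountingIter(num_batches=3)
first_epoch = list(it)    # consumes all three batches
exhausted = list(it)      # nothing is left until reset() is called
it.reset()
second_epoch = list(it)
print(first_epoch, exhausted, second_epoch)
```

| <p>An exhausted iterator stays empty until <code class="docutils literal"><span class="pre">reset</span></code> is called, which is why a training loop needs to reset the iterator between epochs.</p> |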
| <p>We can use the <code class="docutils literal"><span class="pre">SimpleIter</span></code> defined above to train a simple MLP. First, we define the network:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">mxnet</span> <span class="kn">as</span> <span class="nn">mx</span> |
| <span class="n">num_classes</span> <span class="o">=</span> <span class="mi">10</span> |
| <span class="n">net</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="s1">'data'</span><span class="p">)</span> |
| <span class="n">net</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">FullyConnected</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">'fc1'</span><span class="p">,</span> <span class="n">num_hidden</span><span class="o">=</span><span class="mi">64</span><span class="p">)</span> |
| <span class="n">net</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">Activation</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">'relu1'</span><span class="p">,</span> <span class="n">act_type</span><span class="o">=</span><span class="s2">"relu"</span><span class="p">)</span> |
| <span class="n">net</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">FullyConnected</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">'fc2'</span><span class="p">,</span> <span class="n">num_hidden</span><span class="o">=</span><span class="n">num_classes</span><span class="p">)</span> |
| <span class="n">net</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">SoftmaxOutput</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">net</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">'softmax'</span><span class="p">)</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">net</span><span class="o">.</span><span class="n">list_arguments</span><span class="p">())</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">net</span><span class="o">.</span><span class="n">list_outputs</span><span class="p">())</span> |
| </pre></div> |
| </div> |
| <p>Here, four variables are learnable parameters: |
| the <em>weights</em> and <em>biases</em> of the FullyConnected layers <em>fc1</em> and <em>fc2</em>. |
| Two variables hold the input: <em>data</em> for the training examples |
| and <em>softmax_label</em> for the corresponding labels. The network has a single output, <em>softmax_output</em>.</p> |
| <p>The <em>data</em> variables are called free variables in MXNet’s Symbol API. |
| To execute a Symbol, they need to be bound to data. |
| <a class="reference external" href="/versions/1.1.0/tutorials/basic/symbol.html">Click here to learn more about Symbol</a>.</p> |
| <p>We use the data iterator to feed examples to a neural network via MXNet’s <code class="docutils literal"><span class="pre">module</span></code> API. |
| <a class="reference external" href="/versions/1.1.0/tutorials/basic/module.html">Click here to learn more about Module</a>.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">logging</span> |
| <span class="n">logging</span><span class="o">.</span><span class="n">basicConfig</span><span class="p">(</span><span class="n">level</span><span class="o">=</span><span class="n">logging</span><span class="o">.</span><span class="n">INFO</span><span class="p">)</span> |
| |
| <span class="n">n</span> <span class="o">=</span> <span class="mi">32</span> |
| <span class="n">data_iter</span> <span class="o">=</span> <span class="n">SimpleIter</span><span class="p">([</span><span class="s1">'data'</span><span class="p">],</span> <span class="p">[(</span><span class="n">n</span><span class="p">,</span> <span class="mi">100</span><span class="p">)],</span> |
| <span class="p">[</span><span class="k">lambda</span> <span class="n">s</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">s</span><span class="p">)],</span> |
| <span class="p">[</span><span class="s1">'softmax_label'</span><span class="p">],</span> <span class="p">[(</span><span class="n">n</span><span class="p">,)],</span> |
| <span class="p">[</span><span class="k">lambda</span> <span class="n">s</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">num_classes</span><span class="p">,</span> <span class="n">s</span><span class="p">)])</span> |
| |
| <span class="n">mod</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">mod</span><span class="o">.</span><span class="n">Module</span><span class="p">(</span><span class="n">symbol</span><span class="o">=</span><span class="n">net</span><span class="p">)</span> |
| <span class="n">mod</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">data_iter</span><span class="p">,</span> <span class="n">num_epoch</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span> |
| </pre></div> |
| </div> |
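| <p>The shapes passed to <code class="docutils literal"><span class="pre">SimpleIter</span></code> determine the shapes of the learnable parameters. The sketch below works through that arithmetic in plain Python. It assumes that a fully connected layer stores its weight as <code class="docutils literal"><span class="pre">(num_hidden,</span> <span class="pre">input_dim)</span></code> and its bias as <code class="docutils literal"><span class="pre">(num_hidden,)</span></code>. The helpers <code class="docutils literal"><span class="pre">fully_connected_shapes</span></code> and <code class="docutils literal"><span class="pre">count</span></code> are hypothetical and exist only for this illustration.</p> |

```python
def fully_connected_shapes(input_dim, num_hidden):
    """Return (weight_shape, bias_shape), assuming a (num_hidden, input_dim) weight layout."""
    return (num_hidden, input_dim), (num_hidden,)


def count(shape):
    """Number of scalar entries in an array of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Values taken from the example above: batches of 32 examples with 100 features each.
batch_size, input_dim, num_classes = 32, 100, 10

fc1_weight, fc1_bias = fully_connected_shapes(input_dim, 64)
fc2_weight, fc2_bias = fully_connected_shapes(64, num_classes)

num_params = sum(count(s) for s in (fc1_weight, fc1_bias, fc2_weight, fc2_bias))
print(fc1_weight, fc1_bias, fc2_weight, fc2_bias)
print(num_params)  # 6400 + 64 + 640 + 10 = 7114
```

| <p>The batch size does not appear in any parameter shape. It only affects the shapes of <em>data</em> and <em>softmax_label</em>.</p> |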
| <p>A note on Python 3 usage: many methods in MXNet take a string in Python 2 and bytes in Python 3. |
| To keep this tutorial readable, we define a utility function that converts |
| a string to bytes in a Python 3 environment.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">str_or_bytes</span><span class="p">(</span><span class="nb">str</span><span class="p">):</span> |
|     <span class="sd">"""</span> |
| <span class="sd">    A utility function for this tutorial that helps us convert string </span> |
| <span class="sd">    to bytes if we are using python3.</span> |
| |
| <span class="sd">    Parameters</span> |
| <span class="sd">    ----------</span> |
| <span class="sd">    str : string</span> |
| |
| <span class="sd">    Returns</span> |
| <span class="sd">    -------</span> |
| <span class="sd">    string (python2) or bytes (python3)</span> |
| <span class="sd">    """</span> |
|     <span class="k">if</span> <span class="n">sys</span><span class="o">.</span><span class="n">version_info</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mi">3</span><span class="p">:</span> |
|         <span class="k">return</span> <span class="nb">str</span> |
|     <span class="k">else</span><span class="p">:</span> |
|         <span class="k">return</span> <span class="nb">bytes</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="s1">'utf-8'</span><span class="p">)</span> |
| </pre></div> |
| </div> |
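| <p>As a quick check of what the helper returns, the sketch below repeats the same logic with the argument renamed to <code class="docutils literal"><span class="pre">text</span></code> (a change made here only to avoid shadowing the built-in <code class="docutils literal"><span class="pre">str</span></code>) and decodes the result back into text.</p> |

```python
import sys


def str_or_bytes(text):
    """Same logic as the tutorial helper; the argument is renamed for this sketch."""
    if sys.version_info[0] == 2:
        return text
    return bytes(text, 'utf-8')


payload = str_or_bytes('record_0')
print(type(payload).__name__)        # 'bytes' on Python 3, 'str' on Python 2
roundtrip = payload.decode('utf-8')  # decoding turns the payload back into text
print(roundtrip)
```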
| </div> |
| <div class="section" id="record-io"> |
| <span id="record-io"></span><h2>Record IO<a class="headerlink" href="#record-io" title="Permalink to this headline">¶</a></h2> |
| <p>Record IO is a file format used by MXNet for data IO. |
| It packs data compactly for efficient reads and writes on distributed file systems such as Hadoop HDFS and AWS S3. |
| You can learn more about the design of <code class="docutils literal"><span class="pre">RecordIO</span></code> <a class="reference external" href="/versions/1.1.0/architecture/note_data_loading.html">here</a>.</p> |
| <p>MXNet provides <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.MXRecordIO"><strong><code class="docutils literal"><span class="pre">MXRecordIO</span></code></strong></a> |
| and <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.MXIndexedRecordIO"><strong><code class="docutils literal"><span class="pre">MXIndexedRecordIO</span></code></strong></a> |
| for sequential access and random access to the data, respectively.</p> |
| <div class="section" id="mxrecordio"> |
| <span id="mxrecordio"></span><h3>MXRecordIO<a class="headerlink" href="#mxrecordio" title="Permalink to this headline">¶</a></h3> |
| <p>First, let’s look at an example of how to read and write sequentially |
| using <code class="docutils literal"><span class="pre">MXRecordIO</span></code>. The files are named with a <code class="docutils literal"><span class="pre">.rec</span></code> extension.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">record</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">MXRecordIO</span><span class="p">(</span><span class="s1">'tmp.rec'</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span> |
| <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5</span><span class="p">):</span> |
|     <span class="n">record</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">str_or_bytes</span><span class="p">(</span><span class="s1">'record_</span><span class="si">%d</span><span class="s1">'</span><span class="o">%</span><span class="n">i</span><span class="p">))</span> |
| |
| <span class="n">record</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <p>We can read the data back by opening the file in <code class="docutils literal"><span class="pre">r</span></code> (read) mode, as below:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">record</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">MXRecordIO</span><span class="p">(</span><span class="s1">'tmp.rec'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> |
| <span class="k">while</span> <span class="bp">True</span><span class="p">:</span> |
|     <span class="n">item</span> <span class="o">=</span> <span class="n">record</span><span class="o">.</span><span class="n">read</span><span class="p">()</span> |
|     <span class="k">if</span> <span class="ow">not</span> <span class="n">item</span><span class="p">:</span> |
|         <span class="k">break</span> |
|     <span class="k">print</span> <span class="p">(</span><span class="n">item</span><span class="p">)</span> |
| <span class="n">record</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
| </pre></div> |
| </div> |
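| <p>To see why such a file can only be read in order, consider the simplified sketch below. It is not MXNet’s on-disk format. It is a hypothetical length-prefixed layout written to an in-memory buffer, and <code class="docutils literal"><span class="pre">write_record</span></code> and <code class="docutils literal"><span class="pre">read_record</span></code> are illustrative names. Because each record’s position depends on the lengths of all records before it, a reader has to walk the stream from the start.</p> |

```python
import io
import struct


def write_record(stream, payload):
    # Prefix each payload with its length as a 4-byte unsigned integer.
    stream.write(struct.pack('!I', len(payload)))
    stream.write(payload)


def read_record(stream):
    # Return the next payload, or None once the stream is exhausted.
    prefix = stream.read(4)
    if len(prefix) != 4:
        return None
    (length,) = struct.unpack('!I', prefix)
    return stream.read(length)


buf = io.BytesIO()  # stands in for a file opened in binary mode
for i in range(5):
    write_record(buf, ('record_%d' % i).encode('utf-8'))

buf.seek(0)
items = []
while True:
    item = read_record(buf)
    if item is None:
        break
    items.append(item)
print(items)
```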
| </div> |
| <div class="section" id="mxindexedrecordio"> |
| <span id="mxindexedrecordio"></span><h3>MXIndexedRecordIO<a class="headerlink" href="#mxindexedrecordio" title="Permalink to this headline">¶</a></h3> |
| <p><code class="docutils literal"><span class="pre">MXIndexedRecordIO</span></code> supports random or indexed access to the data. |
| We will create an indexed record file and a corresponding index file as below:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">record</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">MXIndexedRecordIO</span><span class="p">(</span><span class="s1">'tmp.idx'</span><span class="p">,</span> <span class="s1">'tmp.rec'</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span> |
| <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5</span><span class="p">):</span> |
|     <span class="n">record</span><span class="o">.</span><span class="n">write_idx</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">str_or_bytes</span><span class="p">(</span><span class="s1">'record_</span><span class="si">%d</span><span class="s1">'</span><span class="o">%</span><span class="n">i</span><span class="p">))</span> |
| |
| <span class="n">record</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <p>Now, we can access the individual records using their keys:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">record</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">MXIndexedRecordIO</span><span class="p">(</span><span class="s1">'tmp.idx'</span><span class="p">,</span> <span class="s1">'tmp.rec'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> |
| <span class="n">record</span><span class="o">.</span><span class="n">read_idx</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <p>You can also list all the keys in the file.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">record</span><span class="o">.</span><span class="n">keys</span> |
| </pre></div> |
| </div> |
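| <p>The idea behind an index file can be sketched in a few lines: store, for every key, the byte offset at which its record starts, then seek to that offset on read. The class below is a hypothetical in-memory analogue, not MXNet’s implementation or file format. Its method names simply mirror <code class="docutils literal"><span class="pre">write_idx</span></code>, <code class="docutils literal"><span class="pre">read_idx</span></code> and <code class="docutils literal"><span class="pre">keys</span></code> from the example above.</p> |

```python
import io
import struct


class TinyIndexedRecords(object):
    """Hypothetical in-memory analogue of an indexed record file."""

    def __init__(self):
        self.data = io.BytesIO()  # plays the role of the .rec file
        self.index = {}           # plays the role of the .idx file: key -> byte offset

    def write_idx(self, key, payload):
        # Always append at the end and remember where this record starts.
        self.data.seek(0, io.SEEK_END)
        self.index[key] = self.data.tell()
        self.data.write(struct.pack('!I', len(payload)))
        self.data.write(payload)

    def read_idx(self, key):
        # Jump straight to the stored offset instead of scanning from the start.
        self.data.seek(self.index[key])
        (length,) = struct.unpack('!I', self.data.read(4))
        return self.data.read(length)

    @property
    def keys(self):
        return sorted(self.index)


records = TinyIndexedRecords()
for i in range(5):
    records.write_idx(i, ('record_%d' % i).encode('utf-8'))

third = records.read_idx(3)
print(third, records.keys)
```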
| </div> |
| <div class="section" id="packing-and-unpacking-data"> |
| <span id="packing-and-unpacking-data"></span><h3>Packing and Unpacking data<a class="headerlink" href="#packing-and-unpacking-data" title="Permalink to this headline">¶</a></h3> |
| <p>Each record in a .rec file can contain arbitrary binary data. However, most deep learning tasks require data to be input in label/data format. |
| The <code class="docutils literal"><span class="pre">mx.recordio</span></code> package provides a few utility functions for such operations, namely: <code class="docutils literal"><span class="pre">pack</span></code>, <code class="docutils literal"><span class="pre">unpack</span></code>, <code class="docutils literal"><span class="pre">pack_img</span></code>, and <code class="docutils literal"><span class="pre">unpack_img</span></code>.</p> |
| <div class="section" id="packing-unpacking-binary-data"> |
| <span id="packing-unpacking-binary-data"></span><h4>Packing/Unpacking Binary Data<a class="headerlink" href="#packing-unpacking-binary-data" title="Permalink to this headline">¶</a></h4> |
| <p><a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.pack"><strong><code class="docutils literal"><span class="pre">pack</span></code></strong></a> and <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.unpack"><strong><code class="docutils literal"><span class="pre">unpack</span></code></strong></a> store and retrieve a float label (or a 1-D array of floats) together with binary data. The data is packed along with a header. The header structure is defined <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.IRHeader">here</a>.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># pack</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="s1">'data'</span> |
| <span class="n">label1</span> <span class="o">=</span> <span class="mf">1.0</span> |
| <span class="n">header1</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">IRHeader</span><span class="p">(</span><span class="n">flag</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label1</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">id2</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> |
| <span class="n">s1</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="n">header1</span><span class="p">,</span> <span class="n">str_or_bytes</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> |
| |
| <span class="n">label2</span> <span class="o">=</span> <span class="p">[</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">2.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">]</span> |
| <span class="n">header2</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">IRHeader</span><span class="p">(</span><span class="n">flag</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label2</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">id2</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> |
| <span class="n">s2</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="n">header2</span><span class="p">,</span> <span class="n">str_or_bytes</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> |
| </pre></div> |
| </div> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># unpack</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">unpack</span><span class="p">(</span><span class="n">s1</span><span class="p">))</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">unpack</span><span class="p">(</span><span class="n">s2</span><span class="p">))</span> |
| </pre></div> |
| </div> |
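| <p>The following sketch shows one way a header and a payload can be combined into a single byte string using Python’s <code class="docutils literal"><span class="pre">struct</span></code> module. The names <code class="docutils literal"><span class="pre">Header</span></code>, <code class="docutils literal"><span class="pre">HEADER_FORMAT</span></code>, <code class="docutils literal"><span class="pre">pack</span></code> and <code class="docutils literal"><span class="pre">unpack</span></code> are local to this illustration, and the byte layout is an assumption made for the example rather than something stated in this tutorial. The point is the convention visible in the example above: a scalar label lives in the header with <code class="docutils literal"><span class="pre">flag=0</span></code>, while a list of labels is stored after the header and <code class="docutils literal"><span class="pre">flag</span></code> records how many there are.</p> |

```python
import struct
from collections import namedtuple

# Hypothetical stand-in for mx.recordio.IRHeader with the same four fields.
Header = namedtuple('Header', ['flag', 'label', 'id', 'id2'])
HEADER_FORMAT = 'IfQQ'  # assumed layout: uint32 flag, float32 label, two uint64 ids
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def pack(header, payload):
    if isinstance(header.label, (int, float)):
        # Scalar label: it fits in the header itself.
        header = header._replace(flag=0)
        labels = b''
    else:
        # List of labels: store them after the header and count them in flag.
        values = list(header.label)
        labels = struct.pack('%df' % len(values), *values)
        header = header._replace(flag=len(values), label=0.0)
    return struct.pack(HEADER_FORMAT, *header) + labels + payload


def unpack(record):
    header = Header(*struct.unpack(HEADER_FORMAT, record[:HEADER_SIZE]))
    body = record[HEADER_SIZE:]
    if header.flag:
        # flag tells us how many 4-byte floats precede the payload.
        values = struct.unpack('%df' % header.flag, body[:4 * header.flag])
        header = header._replace(label=list(values))
        body = body[4 * header.flag:]
    return header, body


s1 = pack(Header(flag=0, label=1.0, id=1, id2=0), b'data')
s2 = pack(Header(flag=3, label=[1.0, 2.0, 3.0], id=2, id2=0), b'data')
print(unpack(s1))
print(unpack(s2))
```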
| </div> |
| <div class="section" id="packing-unpacking-image-data"> |
| <span id="packing-unpacking-image-data"></span><h4>Packing/Unpacking Image Data<a class="headerlink" href="#packing-unpacking-image-data" title="Permalink to this headline">¶</a></h4> |
| <p>MXNet provides <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.pack_img"><strong><code class="docutils literal"><span class="pre">pack_img</span></code></strong></a> and <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.recordio.unpack_img"><strong><code class="docutils literal"><span class="pre">unpack_img</span></code></strong></a> to pack/unpack image data. |
| Records packed by <code class="docutils literal"><span class="pre">pack_img</span></code> can be loaded by <code class="docutils literal"><span class="pre">mx.io.ImageRecordIter</span></code>.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">1</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">uint8</span><span class="p">)</span> |
| <span class="n">label</span> <span class="o">=</span> <span class="mf">1.0</span> |
| <span class="n">header</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">IRHeader</span><span class="p">(</span><span class="n">flag</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">id2</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> |
| <span class="n">s</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">pack_img</span><span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">quality</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span> <span class="n">img_fmt</span><span class="o">=</span><span class="s1">'.jpg'</span><span class="p">)</span> |
| </pre></div> |
| </div> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># unpack_img</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">recordio</span><span class="o">.</span><span class="n">unpack_img</span><span class="p">(</span><span class="n">s</span><span class="p">))</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="using-tools-im2rec-py"> |
| <span id="using-tools-im2rec-py"></span><h4>Using tools/im2rec.py<a class="headerlink" href="#using-tools-im2rec-py" title="Permalink to this headline">¶</a></h4> |
| <p>You can also convert raw images into <em>RecordIO</em> format using the <code class="docutils literal"><span class="pre">im2rec.py</span></code> utility script that is provided in the MXNet <a class="reference external" href="https://github.com/dmlc/mxnet/tree/master/tools">tools</a> folder. |
| An example of how to use the script for converting to <em>RecordIO</em> format is shown in the <code class="docutils literal"><span class="pre">Image</span> <span class="pre">IO</span></code> section below.</p> |
| </div> |
| </div> |
| </div> |
| <div class="section" id="image-io"> |
| <span id="image-io"></span><h2>Image IO<a class="headerlink" href="#image-io" title="Permalink to this headline">¶</a></h2> |
| <p>In this section, we will learn how to preprocess and load image data in MXNet.</p> |
| <p>There are four ways to load image data in MXNet:</p> |
| <ol class="simple"> |
| <li>Using <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.image.imdecode"><strong>mx.image.imdecode</strong></a> to load raw image files.</li> |
| <li>Using <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.image.ImageIter"><strong><code class="docutils literal"><span class="pre">mx.img.ImageIter</span></code></strong></a>, which is implemented in Python and is easy to customize. It can read from .rec (<code class="docutils literal"><span class="pre">RecordIO</span></code>) files and raw image files.</li> |
| <li>Using <a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.ImageRecordIter"><strong><code class="docutils literal"><span class="pre">mx.io.ImageRecordIter</span></code></strong></a>, which is implemented in the MXNet backend in C++. It is harder to customize but is available through various language bindings.</li> |
| <li>Creating a custom iterator that inherits from <code class="docutils literal"><span class="pre">mx.io.DataIter</span></code>.</li> |
| </ol> |
| <div class="section" id="preprocessing-images"> |
| <span id="preprocessing-images"></span><h3>Preprocessing Images<a class="headerlink" href="#preprocessing-images" title="Permalink to this headline">¶</a></h3> |
| <p>Images can be preprocessed in different ways. We list some of them below:</p> |
| <ul class="simple"> |
| <li>Using <code class="docutils literal"><span class="pre">mx.io.ImageRecordIter</span></code> which is fast but not very flexible. It is great for simple tasks like image recognition but won’t work for more complex tasks like detection and segmentation.</li> |
| <li>Using <code class="docutils literal"><span class="pre">mx.recordio.unpack_img</span></code> (or <code class="docutils literal"><span class="pre">cv2.imread</span></code>, <code class="docutils literal"><span class="pre">skimage</span></code>, etc.) + <code class="docutils literal"><span class="pre">numpy</span></code>, which is flexible but slow due to the Python Global Interpreter Lock (GIL).</li> |
| <li>Using the <code class="docutils literal"><span class="pre">mx.image</span></code> package provided by MXNet. It stores images in <a class="reference external" href="/versions/1.1.0/tutorials/basic/ndarray.html"><strong><code class="docutils literal"><span class="pre">NDArray</span></code></strong></a> format and leverages MXNet’s <a class="reference external" href="/versions/1.1.0/architecture/note_engine.html">dependency engine</a> to automatically parallelize processing and circumvent the GIL.</li> |
| </ul> |
| <p>Below, we demonstrate some of the frequently used preprocessing routines provided by the <code class="docutils literal"><span class="pre">mx.image</span></code> package.</p> |
| <p>Let’s download sample images that we can work with.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">fname</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">test_utils</span><span class="o">.</span><span class="n">download</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s1">'http://data.mxnet.io/data/test_images.tar.gz'</span><span class="p">,</span> <span class="n">dirname</span><span class="o">=</span><span class="s1">'data'</span><span class="p">,</span> <span class="n">overwrite</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span> |
| <span class="n">tar</span> <span class="o">=</span> <span class="n">tarfile</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">fname</span><span class="p">)</span> |
| <span class="n">tar</span><span class="o">.</span><span class="n">extractall</span><span class="p">(</span><span class="n">path</span><span class="o">=</span><span class="s1">'./data'</span><span class="p">)</span> |
| <span class="n">tar</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <div class="section" id="loading-raw-images"> |
| <span id="loading-raw-images"></span><h4>Loading raw images<a class="headerlink" href="#loading-raw-images" title="Permalink to this headline">¶</a></h4> |
| <p><code class="docutils literal"><span class="pre">mx.image.imdecode</span></code> lets us load the images. <code class="docutils literal"><span class="pre">imdecode</span></code> provides a similar interface to <code class="docutils literal"><span class="pre">OpenCV</span></code>.</p> |
| <p><strong>Note:</strong> You will still need <code class="docutils literal"><span class="pre">OpenCV</span></code> (not the CV2 Python library) installed to use <code class="docutils literal"><span class="pre">mx.image.imdecode</span></code>.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">img</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">image</span><span class="o">.</span><span class="n">imdecode</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s1">'data/test_images/ILSVRC2012_val_00000001.JPEG'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">())</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">());</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="section" id="image-transformations"> |
| <span id="image-transformations"></span><h4>Image Transformations<a class="headerlink" href="#image-transformations" title="Permalink to this headline">¶</a></h4> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># resize to w x h</span> |
| <span class="n">tmp</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">image</span><span class="o">.</span><span class="n">imresize</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">70</span><span class="p">)</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">tmp</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">());</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># crop a random w x h region from image</span> |
| <span class="n">tmp</span><span class="p">,</span> <span class="n">coord</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">image</span><span class="o">.</span><span class="n">random_crop</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">150</span><span class="p">,</span> <span class="mi">200</span><span class="p">))</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">coord</span><span class="p">)</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">tmp</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">());</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> |
| </pre></div> |
| </div> |
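<p>The <code class="docutils literal"><span class="pre">coord</span></code> value printed above describes where the crop was taken. The following is a minimal sketch of that idea in plain Python, assuming the box is expressed as (x0, y0, width, height) and that the crop must fit inside the image. The helper names <code class="docutils literal"><span class="pre">random_crop_box</span></code> and <code class="docutils literal"><span class="pre">crop</span></code> and the 500x375 image size are hypothetical, and this is not MXNet's implementation.</p>

```python
import random


def random_crop_box(img_w, img_h, crop_w, crop_h, rng=random):
    """Return (x0, y0, w, h) for a random crop that lies fully inside the image."""
    if crop_w > img_w or crop_h > img_h:
        raise ValueError("crop size must fit inside the image")
    # randint is inclusive on both ends, so the crop can touch the far edge.
    x0 = rng.randint(0, img_w - crop_w)
    y0 = rng.randint(0, img_h - crop_h)
    return x0, y0, crop_w, crop_h


def crop(pixels, box):
    """Crop a row-major image (pixels[y][x]) to the given (x0, y0, w, h) box."""
    x0, y0, w, h = box
    return [row[x0:x0 + w] for row in pixels[y0:y0 + h]]


# Hypothetical 500x375 image, cropped to a 150x200 (w x h) region.
box = random_crop_box(500, 375, 150, 200, rng=random.Random(0))
print(box)
```

<p>Passing a seeded <code class="docutils literal"><span class="pre">random.Random</span></code> instance makes the sketch reproducible, which can help when debugging an augmentation pipeline.</p>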
| </div> |
| </div> |
| <div class="section" id="loading-data-using-image-iterators"> |
| <span id="loading-data-using-image-iterators"></span><h3>Loading Data using Image Iterators<a class="headerlink" href="#loading-data-using-image-iterators" title="Permalink to this headline">¶</a></h3> |
| <p>Before we see how to read data using the two built-in image iterators, |
| let's get the <strong>Caltech 101</strong> dataset, |
| which contains 101 classes of objects, and convert it into RecordIO format. |
| First, download and extract the archive:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">fname</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">test_utils</span><span class="o">.</span><span class="n">download</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s1">'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz'</span><span class="p">,</span> <span class="n">dirname</span><span class="o">=</span><span class="s1">'data'</span><span class="p">,</span> <span class="n">overwrite</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span> |
| <span class="n">tar</span> <span class="o">=</span> <span class="n">tarfile</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">fname</span><span class="p">)</span> |
| <span class="n">tar</span><span class="o">.</span><span class="n">extractall</span><span class="p">(</span><span class="n">path</span><span class="o">=</span><span class="s1">'./data'</span><span class="p">)</span> |
| <span class="n">tar</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
| </pre></div> |
| </div> |
| <p>Let’s take a look at the data. Under the root folder (./data/101_ObjectCategories), every category has its own subfolder (for example, ./data/101_ObjectCategories/yin_yang).</p> |
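<p>One way to check a folder-per-category layout is to count the images in each subfolder. The sketch below uses only the standard library and runs on a throwaway directory tree rather than the real dataset. The helper name <code class="docutils literal"><span class="pre">count_images_per_category</span></code>, the file names, and the list of extensions are assumptions made for illustration.</p>

```python
import os
import tempfile

IMAGE_EXTS = ('.jpg', '.jpeg', '.png')  # assumed set of image extensions


def count_images_per_category(root):
    """Map each immediate subfolder of root to the number of image files in it."""
    counts = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if not os.path.isdir(folder):
            continue  # skip stray files in the root folder
        counts[name] = sum(
            1 for f in os.listdir(folder) if f.lower().endswith(IMAGE_EXTS)
        )
    return counts


# Build a tiny stand-in tree with two categories and empty image files.
with tempfile.TemporaryDirectory() as root:
    for category, n in (('yin_yang', 2), ('airplanes', 3)):
        os.makedirs(os.path.join(root, category))
        for i in range(n):
            path = os.path.join(root, category, 'image_%04d.jpg' % (i + 1))
            open(path, 'wb').close()
    demo_counts = count_images_per_category(root)

print(demo_counts)  # {'airplanes': 3, 'yin_yang': 2}
```

<p>On the real dataset, the same function could be pointed at ./data/101_ObjectCategories after extraction.</p>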
| <p>Now let’s convert them into record io format using the <code class="docutils literal"><span class="pre">im2rec.py</span></code> utility script. |
| First, we need to make a list that contains all the image files and their categories:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s1">'python </span><span class="si">%s</span><span class="s1">/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories'</span><span class="o">%</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'MXNET_HOME'</span><span class="p">])</span> |
| </pre></div> |
| </div> |
| <p>The resulting list file (./data/caltech_train.lst) is in the format <code class="docutils literal"><span class="pre">index\t(one</span> <span class="pre">or</span> <span class="pre">more</span> <span class="pre">labels)\tpath</span></code>. In this case, each image has only one label, but you can edit the list to add more labels per line for multi-label training.</p> |
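<p>To make the list format concrete, here is a small sketch that parses and writes lines of the form <code class="docutils literal"><span class="pre">index\tlabel(s)\tpath</span></code>. The function names and the sample lines are hypothetical, and formatting labels as floating-point numbers is an assumption rather than something the format description above states.</p>

```python
def parse_lst_line(line):
    """Split one tab-separated list line into (index, labels, path)."""
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 3:
        raise ValueError('expected index, at least one label, and a path')
    index = int(fields[0])
    labels = [float(v) for v in fields[1:-1]]  # everything between index and path
    path = fields[-1]
    return index, labels, path


def format_lst_line(index, labels, path):
    """Build a list line; labels are written as floats (an assumption)."""
    return '\t'.join([str(index)] + ['%f' % v for v in labels] + [path])


# Hypothetical single-label and multi-label entries.
single = parse_lst_line('0\t95.000000\tyin_yang/image_0001.jpg\n')
multi = parse_lst_line('1\t3.000000\t7.000000\tairplanes/image_0002.jpg')
print(single)
print(multi)
```

<p>For multi-label training, each extra label becomes one more tab-separated field between the index and the path.</p>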
| <p>Then we can use this list to create our record io file:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s2">"python </span><span class="si">%s</span><span class="s2">/tools/im2rec.py --num-thread=4 --pass-through=1 data/caltech data/101_ObjectCategories"</span><span class="o">%</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'MXNET_HOME'</span><span class="p">])</span> |
| </pre></div> |
| </div> |
| <p>The RecordIO files are now saved in the ./data directory.</p> |
| <div class="section" id="using-imagerecorditer"> |
| <span id="using-imagerecorditer"></span><h4>Using ImageRecordIter<a class="headerlink" href="#using-imagerecorditer" title="Permalink to this headline">¶</a></h4> |
| <p><a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.ImageRecordIter"><strong><code class="docutils literal"><span class="pre">ImageRecordIter</span></code></strong></a> can be used for loading image data saved in record io format. To use ImageRecordIter, simply create an instance by loading your record file:</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data_iter</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">io</span><span class="o">.</span><span class="n">ImageRecordIter</span><span class="p">(</span> |
| <span class="n">path_imgrec</span><span class="o">=</span><span class="s2">"./data/caltech.rec"</span><span class="p">,</span> <span class="c1"># the target record file</span> |
| <span class="n">data_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">227</span><span class="p">,</span> <span class="mi">227</span><span class="p">),</span> <span class="c1"># output data shape. A 227x227 region will be cropped from the original image.</span> |
| <span class="n">batch_size</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="c1"># number of samples per batch</span> |
| <span class="n">resize</span><span class="o">=</span><span class="mi">256</span> <span class="c1"># resize the shorter edge to 256 before cropping</span> |
| <span class="c1"># ... you can add more augmentation options as defined in ImageRecordIter.</span> |
| <span class="p">)</span> |
| <span class="n">data_iter</span><span class="o">.</span><span class="n">reset</span><span class="p">()</span> |
| <span class="n">batch</span> <span class="o">=</span> <span class="n">data_iter</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="n">batch</span><span class="o">.</span><span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> |
| <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">):</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">uint8</span><span class="p">)</span><span class="o">.</span><span class="n">transpose</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">0</span><span class="p">)))</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> |
| </pre></div> |
| </div> |
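<p>The comments in the block above describe two geometric steps: resize the shorter edge to 256, then crop a 227x227 region. The sketch below works through that arithmetic in plain Python. It assumes the aspect ratio is preserved, that the longer edge is rounded to the nearest integer, and that the crop is centered. The helper names and the 500x375 input size are hypothetical, and the iterator's actual rounding and crop placement may differ.</p>

```python
def resize_shorter_edge(width, height, target):
    """Scale (width, height) so the shorter edge equals target, keeping aspect ratio."""
    if width <= height:
        return target, int(round(height * target / width))
    return int(round(width * target / height)), target


def center_crop_box(width, height, crop_w, crop_h):
    """Return (x0, y0, w, h) of a crop centered in a width x height image."""
    if crop_w > width or crop_h > height:
        raise ValueError('crop size must fit inside the image')
    return (width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h


# Hypothetical 500x375 input: the shorter edge (height) becomes 256.
new_w, new_h = resize_shorter_edge(500, 375, 256)
print(new_w, new_h)                              # 341 256
print(center_crop_box(new_w, new_h, 227, 227))   # (57, 14, 227, 227)
```

<p>Resizing before cropping keeps most of the object in view, since the crop then covers a large fraction of the resized image.</p>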
| </div> |
| <div class="section" id="using-imageiter"> |
| <span id="using-imageiter"></span><h4>Using ImageIter<a class="headerlink" href="#using-imageiter" title="Permalink to this headline">¶</a></h4> |
| <p><a class="reference external" href="/versions/1.1.0/api/python/io/io.html#mxnet.io.ImageIter"><strong>ImageIter</strong></a> is a flexible interface that supports loading images in both RecordIO and raw formats.</p> |
| <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data_iter</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">image</span><span class="o">.</span><span class="n">ImageIter</span><span class="p">(</span><span class="n">batch_size</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">data_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">227</span><span class="p">,</span> <span class="mi">227</span><span class="p">),</span> |
| <span class="n">path_imgrec</span><span class="o">=</span><span class="s2">"./data/caltech.rec"</span><span class="p">,</span> |
| <span class="n">path_imgidx</span><span class="o">=</span><span class="s2">"./data/caltech.idx"</span> <span class="p">)</span> |
| <span class="n">data_iter</span><span class="o">.</span><span class="n">reset</span><span class="p">()</span> |
| <span class="n">batch</span> <span class="o">=</span> <span class="n">data_iter</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="n">batch</span><span class="o">.</span><span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> |
| <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">):</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">()</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">uint8</span><span class="p">)</span><span class="o">.</span><span class="n">transpose</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">0</span><span class="p">)))</span> |
| <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> |
| </pre></div> |
| </div> |
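<p>Both iterator examples call <code class="docutils literal"><span class="pre">transpose((1,2,0))</span></code> before plotting, because the batch stores each image as (channels, height, width) while <code class="docutils literal"><span class="pre">plt.imshow</span></code> expects (height, width, channels). The sketch below shows the same reordering on nested Python lists. The function name <code class="docutils literal"><span class="pre">chw_to_hwc</span></code> and the tiny 3x2x2 image are made up for illustration.</p>

```python
def chw_to_hwc(chw):
    """Reorder a nested list from [channel][row][col] to [row][col][channel]."""
    channels = len(chw)
    height = len(chw[0])
    width = len(chw[0][0])
    return [
        [[chw[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]


# A made-up 3-channel, 2x2 image: one 2x2 plane per channel.
chw = [
    [[1, 2], [3, 4]],      # channel 0
    [[5, 6], [7, 8]],      # channel 1
    [[9, 10], [11, 12]],   # channel 2
]
hwc = chw_to_hwc(chw)
print(hwc[0][0])  # [1, 5, 9]
```

<p>Each pixel in the result holds its three channel values together, which is the layout that plotting libraries typically expect.</p>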
| <div class="btn-group" role="group"> |
| <div class="download-btn"><a download="data.ipynb" href="data.ipynb"><span class="glyphicon glyphicon-download-alt"></span> data.ipynb</a></div></div></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div aria-label="main navigation" class="sphinxsidebar rightsidebar" role="navigation"> |
| <div class="sphinxsidebarwrapper"> |
| <h3><a href="../../index.html">Table Of Contents</a></h3> |
| <ul> |
| <li><a class="reference internal" href="#">Iterators - Loading data</a><ul> |
| <li><a class="reference internal" href="#prerequisites">Prerequisites</a></li> |
| <li><a class="reference internal" href="#mxnet-data-iterator">MXNet Data Iterator</a></li> |
| <li><a class="reference internal" href="#reading-data-in-memory">Reading data in memory</a></li> |
| <li><a class="reference internal" href="#reading-data-from-csv-files">Reading data from CSV files</a></li> |
| <li><a class="reference internal" href="#custom-iterator">Custom Iterator</a></li> |
| <li><a class="reference internal" href="#record-io">Record IO</a><ul> |
| <li><a class="reference internal" href="#mxrecordio">MXRecordIO</a></li> |
| <li><a class="reference internal" href="#mxindexedrecordio">MXIndexedRecordIO</a></li> |
| <li><a class="reference internal" href="#packing-and-unpacking-data">Packing and Unpacking data</a><ul> |
| <li><a class="reference internal" href="#packing-unpacking-binary-data">Packing/Unpacking Binary Data</a></li> |
| <li><a class="reference internal" href="#packing-unpacking-image-data">Packing/Unpacking Image Data</a></li> |
| <li><a class="reference internal" href="#using-tools-im2rec-py">Using tools/im2rec.py</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li><a class="reference internal" href="#image-io">Image IO</a><ul> |
| <li><a class="reference internal" href="#preprocessing-images">Preprocessing Images</a><ul> |
| <li><a class="reference internal" href="#loading-raw-images">Loading raw images</a></li> |
| <li><a class="reference internal" href="#image-transformations">Image Transformations</a></li> |
| </ul> |
| </li> |
| <li><a class="reference internal" href="#loading-data-using-image-iterators">Loading Data using Image Iterators</a><ul> |
| <li><a class="reference internal" href="#using-imagerecorditer">Using ImageRecordIter</a></li> |
| <li><a class="reference internal" href="#using-imageiter">Using ImageIter</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div><div class="footer"> |
| <div class="section-disclaimer"> |
| <div class="container"> |
| <div> |
| <img height="60" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/apache_incubator_logo.png"/> |
| <p> |
| Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), <strong>sponsored by the <i>Apache Incubator</i></strong>. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. |
| </p> |
| <p> |
| "Copyright © 2017-2018, The Apache Software Foundation |
| Apache MXNet, MXNet, Apache, the Apache feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the Apache Software Foundation." |
| </p> |
| </div> |
| </div> |
| </div> |
| </div> <!-- pagename != index --> |
| </div> |
| <script crossorigin="anonymous" integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script> |
| <script src="../../_static/js/sidebar.js" type="text/javascript"></script> |
| <script src="../../_static/js/search.js" type="text/javascript"></script> |
| <script src="../../_static/js/navbar.js" type="text/javascript"></script> |
| <script src="../../_static/js/clipboard.min.js" type="text/javascript"></script> |
| <script src="../../_static/js/copycode.js" type="text/javascript"></script> |
| <script src="../../_static/js/page.js" type="text/javascript"></script> |
| <script src="../../_static/js/docversion.js" type="text/javascript"></script> |
| <script type="text/javascript"> |
| $('body').ready(function () { |
| $('body').css('visibility', 'visible'); |
| }); |
| </script> |
| </body> |
| </html> |