blob: cbb7575202ca8bc50382979789e93bc72b97ddf5 [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<meta content="Profiling MXNet Models" property="og:title">
<meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image">
<meta content="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/og-logo.png" property="og:image:secure_url">
<meta content="Profiling MXNet Models" property="og:description"/>
<title>Profiling MXNet Models — mxnet documentation</title>
<link crossorigin="anonymous" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" rel="stylesheet"/>
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css" rel="stylesheet"/>
<link href="../../_static/basic.css" rel="stylesheet" type="text/css">
<link href="../../_static/pygments.css" rel="stylesheet" type="text/css">
<link href="../../_static/mxnet.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../../',
VERSION: '',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: '.txt'
};
</script>
<script src="https://code.jquery.com/jquery-1.11.1.min.js" type="text/javascript"></script>
<script src="../../_static/underscore.js" type="text/javascript"></script>
<script src="../../_static/searchtools_custom.js" type="text/javascript"></script>
<script src="../../_static/doctools.js" type="text/javascript"></script>
<script src="../../_static/selectlang.js" type="text/javascript"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<script type="text/javascript"> jQuery(function() { Search.loadIndex("/versions/1.4.1/searchindex.js"); Search.init();}); </script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new
Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-96378503-1', 'auto');
ga('send', 'pageview');
</script>
<!-- -->
<!-- <script type="text/javascript" src="../../_static/jquery.js"></script> -->
<!-- -->
<!-- <script type="text/javascript" src="../../_static/underscore.js"></script> -->
<!-- -->
<!-- <script type="text/javascript" src="../../_static/doctools.js"></script> -->
<!-- -->
<!-- <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script> -->
<!-- -->
<link href="../../genindex.html" rel="index" title="Index">
<link href="../../search.html" rel="search" title="Search"/>
<link href="index.html" rel="up" title="Tutorials"/>
<link href="types_of_data_augmentation.html" rel="next" title="Types of Data Augmentation"/>
<link href="predict_image.html" rel="prev" title="Predict with pre-trained models"/>
<link href="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-icon.png" rel="icon" type="image/png"/>
</link></link></link></meta></meta></meta></head>
<body background="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet-background-compressed.jpeg" role="document">
<div class="content-block"><div class="navbar navbar-fixed-top">
<div class="container" id="navContainer">
<div class="innder" id="header-inner">
<h1 id="logo-wrap">
<a href="../../" id="logo"><img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/mxnet_logo.png"/></a>
</h1>
<nav class="nav-bar" id="main-nav">
<a class="main-nav-link" href="/versions/1.4.1/install/index.html">Install</a>
<span id="dropdown-menu-position-anchor">
<a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Gluon <span class="caret"></span></a>
<ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
<li><a class="main-nav-link" href="/versions/1.4.1/tutorials/gluon/gluon.html">About</a></li>
<li><a class="main-nav-link" href="https://www.d2l.ai/">Dive into Deep Learning</a></li>
<li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li>
<li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li>
</ul>
</span>
<span id="dropdown-menu-position-anchor">
<a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">API <span class="caret"></span></a>
<ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
<li><a class="main-nav-link" href="/versions/1.4.1/api/python/index.html">Python</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/c++/index.html">C++</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/clojure/index.html">Clojure</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/java/index.html">Java</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/julia/index.html">Julia</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/perl/index.html">Perl</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/r/index.html">R</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/scala/index.html">Scala</a></li>
</ul>
</span>
<span id="dropdown-menu-position-anchor-docs">
<a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Docs <span class="caret"></span></a>
<ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-docs">
<li><a class="main-nav-link" href="/versions/1.4.1/faq/index.html">FAQ</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/tutorials/index.html">Tutorials</a>
<li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/1.4.1/example">Examples</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/architecture/index.html">Architecture</a></li>
<li><a class="main-nav-link" href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home">Developer Wiki</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/model_zoo/index.html">Model Zoo</a></li>
<li><a class="main-nav-link" href="https://github.com/onnx/onnx-mxnet">ONNX</a></li>
</li></ul>
</span>
<span id="dropdown-menu-position-anchor-community">
<a aria-expanded="true" aria-haspopup="true" class="main-nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button">Community <span class="caret"></span></a>
<ul class="dropdown-menu navbar-menu" id="package-dropdown-menu-community">
<li><a class="main-nav-link" href="http://discuss.mxnet.io">Forum</a></li>
<li><a class="main-nav-link" href="https://github.com/apache/incubator-mxnet/tree/1.4.1">Github</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/community/contribute.html">Contribute</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/community/ecosystem.html">Ecosystem</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/community/powered_by.html">Powered By</a></li>
</ul>
</span>
<span id="dropdown-menu-position-anchor-version" style="position: relative"><a href="#" class="main-nav-link dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="true">1.4.1<span class="caret"></span></a><ul id="package-dropdown-menu" class="dropdown-menu"><li><a href="/">master</a></li><li><a href="/versions/1.7.0/">1.7.0</a></li><li><a href=/versions/1.6.0/>1.6.0</a></li><li><a href=/versions/1.5.0/>1.5.0</a></li><li><a href=/versions/1.4.1/>1.4.1</a></li><li><a href=/versions/1.3.1/>1.3.1</a></li><li><a href=/versions/1.2.1/>1.2.1</a></li><li><a href=/versions/1.1.0/>1.1.0</a></li><li><a href=/versions/1.0.0/>1.0.0</a></li><li><a href=/versions/0.12.1/>0.12.1</a></li><li><a href=/versions/0.11.0/>0.11.0</a></li></ul></span></nav>
<script> function getRootPath(){ return "../../" } </script>
<div class="burgerIcon dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button"></a>
<ul class="dropdown-menu" id="burgerMenu">
<li><a href="/versions/1.4.1/install/index.html">Install</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/tutorials/index.html">Tutorials</a></li>
<li class="dropdown-submenu dropdown">
<a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Gluon</a>
<ul class="dropdown-menu navbar-menu" id="package-dropdown-menu">
<li><a class="main-nav-link" href="/versions/1.4.1/tutorials/gluon/gluon.html">About</a></li>
<li><a class="main-nav-link" href="http://gluon.mxnet.io">The Straight Dope (Tutorials)</a></li>
<li><a class="main-nav-link" href="https://gluon-cv.mxnet.io">GluonCV Toolkit</a></li>
<li><a class="main-nav-link" href="https://gluon-nlp.mxnet.io/">GluonNLP Toolkit</a></li>
</ul>
</li>
<li class="dropdown-submenu">
<a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">API</a>
<ul class="dropdown-menu">
<li><a class="main-nav-link" href="/versions/1.4.1/api/python/index.html">Python</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/c++/index.html">C++</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/clojure/index.html">Clojure</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/java/index.html">Java</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/julia/index.html">Julia</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/perl/index.html">Perl</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/r/index.html">R</a></li>
<li><a class="main-nav-link" href="/versions/1.4.1/api/scala/index.html">Scala</a></li>
</ul>
</li>
<li class="dropdown-submenu">
<a aria-expanded="true" aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" tabindex="-1">Docs</a>
<ul class="dropdown-menu">
<li><a href="/versions/1.4.1/faq/index.html" tabindex="-1">FAQ</a></li>
<li><a href="/versions/1.4.1/tutorials/index.html" tabindex="-1">Tutorials</a></li>
<li><a href="https://github.com/apache/incubator-mxnet/tree/1.4.1/example" tabindex="-1">Examples</a></li>
<li><a href="/versions/1.4.1/architecture/index.html" tabindex="-1">Architecture</a></li>
<li><a href="https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+Home" tabindex="-1">Developer Wiki</a></li>
<li><a href="/versions/1.4.1/model_zoo/index.html" tabindex="-1">Gluon Model Zoo</a></li>
<li><a href="https://github.com/onnx/onnx-mxnet" tabindex="-1">ONNX</a></li>
</ul>
</li>
<li class="dropdown-submenu dropdown">
<a aria-haspopup="true" class="dropdown-toggle burger-link" data-toggle="dropdown" href="#" role="button" tabindex="-1">Community</a>
<ul class="dropdown-menu">
<li><a href="http://discuss.mxnet.io" tabindex="-1">Forum</a></li>
<li><a href="https://github.com/apache/incubator-mxnet/tree/1.4.1" tabindex="-1">Github</a></li>
<li><a href="/versions/1.4.1/community/contribute.html" tabindex="-1">Contribute</a></li>
<li><a href="/versions/1.4.1/community/ecosystem.html" tabindex="-1">Ecosystem</a></li>
<li><a href="/versions/1.4.1/community/powered_by.html" tabindex="-1">Powered By</a></li>
</ul>
</li>
<li id="dropdown-menu-position-anchor-version-mobile" class="dropdown-submenu" style="position: relative"><a href="#" tabindex="-1">1.4.1</a><ul class="dropdown-menu"><li><a tabindex="-1" href=/>master</a></li><li><a tabindex="-1" href=/versions/1.6.0/>1.6.0</a></li><li><a tabindex="-1" href=/versions/1.5.0/>1.5.0</a></li><li><a tabindex="-1" href=/versions/1.4.1/>1.4.1</a></li><li><a tabindex="-1" href=/versions/1.3.1/>1.3.1</a></li><li><a tabindex="-1" href=/versions/1.2.1/>1.2.1</a></li><li><a tabindex="-1" href=/versions/1.1.0/>1.1.0</a></li><li><a tabindex="-1" href=/versions/1.0.0/>1.0.0</a></li><li><a tabindex="-1" href=/versions/0.12.1/>0.12.1</a></li><li><a tabindex="-1" href=/versions/0.11.0/>0.11.0</a></li></ul></li></ul>
</div>
<div class="plusIcon dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#" role="button"><span aria-hidden="true" class="glyphicon glyphicon-plus"></span></a>
<ul class="dropdown-menu dropdown-menu-right" id="plusMenu"></ul>
</div>
<div id="search-input-wrap">
<form action="../../search.html" autocomplete="off" class="" method="get" role="search">
<div class="form-group inner-addon left-addon">
<i class="glyphicon glyphicon-search"></i>
<input class="form-control" name="q" placeholder="Search" type="text"/>
</div>
<input name="check_keywords" type="hidden" value="yes">
<input name="area" type="hidden" value="default"/>
</input></form>
<div id="search-preview"></div>
</div>
<div id="searchIcon">
<span aria-hidden="true" class="glyphicon glyphicon-search"></span>
</div>
<!-- <div id="lang-select-wrap"> -->
<!-- <label id="lang-select-label"> -->
<!-- <\!-- <i class="fa fa-globe"></i> -\-> -->
<!-- <span></span> -->
<!-- </label> -->
<!-- <select id="lang-select"> -->
<!-- <option value="en">Eng</option> -->
<!-- <option value="zh">中文</option> -->
<!-- </select> -->
<!-- </div> -->
<!-- <a id="mobile-nav-toggle">
<span class="mobile-nav-toggle-bar"></span>
<span class="mobile-nav-toggle-bar"></span>
<span class="mobile-nav-toggle-bar"></span>
</a> -->
</div>
</div>
</div>
<script type="text/javascript">
$('body').css('background', 'white');
</script>
<div class="container">
<div class="row">
<div aria-label="main navigation" class="sphinxsidebar leftsidebar" role="navigation">
<div class="sphinxsidebarwrapper">
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../api/index.html">MXNet APIs</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../architecture/index.html">MXNet Architecture</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../community/index.html">MXNet Community</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../faq/index.html">MXNet FAQ</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../gluon/index.html">About Gluon</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../install/index.html">Installing MXNet</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../install/index.html#nvidia-jetson-tx-family">Nvidia Jetson TX family</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../install/index.html#source-download">Source Download</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../model_zoo/index.html">MXNet Model Zoo</a></li>
<li class="toctree-l1"><a class="reference internal" href="../index.html">Tutorials</a></li>
</ul>
</div>
</div>
<div class="content">
<div class="page-tracker"></div>
<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements. See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership. The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License. You may obtain a copy of the License at --><!--- http://www.apache.org/licenses/LICENSE-2.0 --><!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied. See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. --><div class="section" id="profiling-mxnet-models">
<span id="profiling-mxnet-models"></span><h1>Profiling MXNet Models<a class="headerlink" href="#profiling-mxnet-models" title="Permalink to this headline"></a></h1>
<p>It is often helpful to understand what operations take how much time while running a model. This helps optimize the model to run faster. In this tutorial, we will learn how to profile MXNet models to measure their running time and memory consumption using the MXNet profiler.</p>
<div class="section" id="the-incorrect-way-to-profile">
<span id="the-incorrect-way-to-profile"></span><h2>The incorrect way to profile<a class="headerlink" href="#the-incorrect-way-to-profile" title="Permalink to this headline"></a></h2>
<p>If you have just begun using MXNet, you might be tempted to measure the execution time of your model using Python’s <code class="docutils literal"><span class="pre">time</span></code> module like shown below:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">time</span> <span class="kn">import</span> <span class="n">time</span>
<span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">autograd</span><span class="p">,</span> <span class="n">nd</span>
<span class="kn">import</span> <span class="nn">mxnet</span> <span class="kn">as</span> <span class="nn">mx</span>
<span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="p">()</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">random_uniform</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="mi">2000</span><span class="p">,</span><span class="mi">2000</span><span class="p">))</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s1">'Time for matrix multiplication: </span><span class="si">%f</span><span class="s1"> sec</span><span class="se">\n</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">start</span><span class="p">))</span>
<span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">y</span><span class="o">.</span><span class="n">asnumpy</span><span class="p">())</span>
<span class="k">print</span><span class="p">(</span><span class="s1">'Time for printing the output: </span><span class="si">%f</span><span class="s1"> sec'</span> <span class="o">%</span> <span class="p">(</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">start</span><span class="p">))</span>
</pre></div>
</div>
<p><strong>Time for matrix multiplication: 0.005051 sec</strong><!--notebook-skip-line--></p>
<p>[[501.1584 508.29724 495.65237 ... 492.84705 492.69092 490.0481 ]<!--notebook-skip-line--></p>
<p>[508.81058 507.1822 495.1743 ... 503.10526 497.29315 493.67917]<!--notebook-skip-line--></p>
<p>[489.56598 499.47015 490.17722 ... 490.99945 488.05008 483.28836]<!--notebook-skip-line--></p>
<p>...<!--notebook-skip-line--></p>
<p>[484.0019 495.7179 479.92142 ... 493.69952 478.89194 487.2074 ]<!--notebook-skip-line--></p>
<p>[499.64932 507.65094 497.5938 ... 493.0474 500.74512 495.82712]<!--notebook-skip-line--></p>
<p>[516.0143 519.1715 506.354 ... 510.08878 496.35608 495.42523]]<!--notebook-skip-line--></p>
<p><strong>Time for printing the output: 0.167693 sec</strong><!--notebook-skip-line--></p>
<p>From the output above, it seems as if printing the output takes lot more time that multiplying two large matrices. That doesn’t feel right.</p>
<p>This is because, in MXNet, all operations are executed asynchronously. So, when <code class="docutils literal"><span class="pre">nd.dot(x,</span> <span class="pre">x)</span></code> returns, the matrix multiplication is not complete, it has only been queued for execution. <code class="docutils literal"><span class="pre">asnumpy</span></code> in <code class="docutils literal"><span class="pre">print(y.asnumpy())</span></code> however, waits for the result to be computed and hence takes longer time.</p>
<p>While it is possible to use <code class="docutils literal"><span class="pre">NDArray.waitall()</span></code> before and after operations to get running time of operations, it is not a scalable method to measure running time of multiple sets of operations, especially in a Sequential or Hybrid network.</p>
</div>
<div class="section" id="the-correct-way-to-profile">
<span id="the-correct-way-to-profile"></span><h2>The correct way to profile<a class="headerlink" href="#the-correct-way-to-profile" title="Permalink to this headline"></a></h2>
<p>The correct way to measure running time of MXNet models is to use MXNet profiler. In the rest of this tutorial, we will learn how to use the MXNet profiler to measure the running time and memory consumption of MXNet models. You can import the profiler and configure it from Python code.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">profiler</span>
<span class="n">profiler</span><span class="o">.</span><span class="n">set_config</span><span class="p">(</span><span class="n">profile_all</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">aggregate_stats</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">filename</span><span class="o">=</span><span class="s1">'profile_output.json'</span><span class="p">)</span>
</pre></div>
</div>
<p><code class="docutils literal"><span class="pre">profile_all</span></code> enables all types of profiling. You can also individually enable the following types of profiling:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">profile_symbolic</span></code> (boolean): whether to profile symbolic operators</li>
<li><code class="docutils literal"><span class="pre">profile_imperative</span></code> (boolean): whether to profile imperative operators</li>
<li><code class="docutils literal"><span class="pre">profile_memory</span></code> (boolean): whether to profile memory usage</li>
<li><code class="docutils literal"><span class="pre">profile_api</span></code> (boolean): whether to profile the C API</li>
</ul>
<p><code class="docutils literal"><span class="pre">aggregate_stats</span></code> aggregates statistics in memory which can then be printed to console by calling <code class="docutils literal"><span class="pre">profiler.dumps()</span></code>.</p>
<div class="section" id="setup-build-a-model">
<span id="setup-build-a-model"></span><h3>Setup: Build a model<a class="headerlink" href="#setup-build-a-model" title="Permalink to this headline"></a></h3>
<p>Let’s build a small convolutional neural network that we can use for profiling.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">gluon</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">HybridSequential</span><span class="p">()</span>
<span class="k">with</span> <span class="n">net</span><span class="o">.</span><span class="n">name_scope</span><span class="p">():</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">(</span><span class="n">channels</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">kernel_size</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s1">'relu'</span><span class="p">))</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">MaxPool2D</span><span class="p">(</span><span class="n">pool_size</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">strides</span><span class="o">=</span><span class="mi">2</span><span class="p">))</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">(</span><span class="n">channels</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">kernel_size</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s1">'relu'</span><span class="p">))</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">MaxPool2D</span><span class="p">(</span><span class="n">pool_size</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">strides</span><span class="o">=</span><span class="mi">2</span><span class="p">))</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Flatten</span><span class="p">())</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">512</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s2">"relu"</span><span class="p">))</span>
<span class="n">net</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">10</span><span class="p">))</span>
</pre></div>
</div>
<p>We need data that we can run through the network for profiling. We’ll use the MNIST dataset.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">mxnet.gluon.data.vision</span> <span class="kn">import</span> <span class="n">transforms</span>
<span class="n">train_data</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">DataLoader</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">vision</span><span class="o">.</span><span class="n">MNIST</span><span class="p">(</span><span class="n">train</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span><span class="o">.</span><span class="n">transform_first</span><span class="p">(</span><span class="n">transforms</span><span class="o">.</span><span class="n">ToTensor</span><span class="p">()),</span>
<span class="n">batch_size</span><span class="o">=</span><span class="mi">64</span><span class="p">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</pre></div>
</div>
<p>Let’s define a method that will run one training iteration given data and label.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Use GPU if available</span>
<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">test_utils</span><span class="o">.</span><span class="n">list_gpus</span><span class="p">())</span><span class="o">!=</span><span class="mi">0</span><span class="p">:</span>
<span class="n">ctx</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">gpu</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">ctx</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">cpu</span><span class="p">()</span>
<span class="c1"># Initialize the parameters with random weights</span>
<span class="n">net</span><span class="o">.</span><span class="n">collect_params</span><span class="p">()</span><span class="o">.</span><span class="n">initialize</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">init</span><span class="o">.</span><span class="n">Xavier</span><span class="p">(),</span> <span class="n">ctx</span><span class="o">=</span><span class="n">ctx</span><span class="p">)</span>
<span class="c1"># Use SGD optimizer</span>
<span class="n">trainer</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">Trainer</span><span class="p">(</span><span class="n">net</span><span class="o">.</span><span class="n">collect_params</span><span class="p">(),</span> <span class="s1">'sgd'</span><span class="p">,</span> <span class="p">{</span><span class="s1">'learning_rate'</span><span class="p">:</span> <span class="o">.</span><span class="mi">1</span><span class="p">})</span>
<span class="c1"># Softmax Cross Entropy is a frequently used loss function for multi-classs classification</span>
<span class="n">softmax_cross_entropy</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">loss</span><span class="o">.</span><span class="n">SoftmaxCrossEntropyLoss</span><span class="p">()</span>
<span class="c1"># A helper function to run one training iteration</span>
<span class="k">def</span> <span class="nf">run_training_iteration</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">label</span><span class="p">):</span>
<span class="c1"># Load data and label is the right context</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">as_in_context</span><span class="p">(</span><span class="n">ctx</span><span class="p">)</span>
<span class="n">label</span> <span class="o">=</span> <span class="n">label</span><span class="o">.</span><span class="n">as_in_context</span><span class="p">(</span><span class="n">ctx</span><span class="p">)</span>
<span class="c1"># Run the forward pass</span>
<span class="k">with</span> <span class="n">autograd</span><span class="o">.</span><span class="n">record</span><span class="p">():</span>
<span class="n">output</span> <span class="o">=</span> <span class="n">net</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">softmax_cross_entropy</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
<span class="c1"># Run the backward pass</span>
<span class="n">loss</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>
<span class="c1"># Apply changes to parameters</span>
<span class="n">trainer</span><span class="o">.</span><span class="n">step</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
</pre></div>
</div>
</div>
<div class="section" id="starting-and-stopping-the-profiler-from-python">
<span id="starting-and-stopping-the-profiler-from-python"></span><h3>Starting and stopping the profiler from Python<a class="headerlink" href="#starting-and-stopping-the-profiler-from-python" title="Permalink to this headline"></a></h3>
<p>When the first forward pass is run on a network, MXNet does a number of housekeeping tasks including inferring the shapes of various parameters, allocating memory for intermediate and final outputs, etc. For these reasons, profiling the first iteration doesn’t provide accurate results. We will, therefore skip the first iteration.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># Run the first iteration without profiling</span>
<span class="n">itr</span> <span class="o">=</span> <span class="nb">iter</span><span class="p">(</span><span class="n">train_data</span><span class="p">)</span>
<span class="n">run_training_iteration</span><span class="p">(</span><span class="o">*</span><span class="nb">next</span><span class="p">(</span><span class="n">itr</span><span class="p">))</span>
</pre></div>
</div>
<p>We’ll run the next iteration with the profiler turned on.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">data</span><span class="p">,</span> <span class="n">label</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">itr</span><span class="p">)</span>
<span class="c1"># Ask the profiler to start recording</span>
<span class="n">profiler</span><span class="o">.</span><span class="n">set_state</span><span class="p">(</span><span class="s1">'run'</span><span class="p">)</span>
<span class="n">run_training_iteration</span><span class="p">(</span><span class="o">*</span><span class="nb">next</span><span class="p">(</span><span class="n">itr</span><span class="p">))</span>
<span class="c1"># Ask the profiler to stop recording after operations have completed</span>
<span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">waitall</span><span class="p">()</span>
<span class="n">profiler</span><span class="o">.</span><span class="n">set_state</span><span class="p">(</span><span class="s1">'stop'</span><span class="p">)</span>
</pre></div>
</div>
<p>Between running and stopping the profiler, you can also pause and resume the profiler using <code class="docutils literal"><span class="pre">profiler.pause()</span></code> and <code class="docutils literal"><span class="pre">profiler.resume()</span></code> respectively to profile only parts of the code you want to profile.</p>
</div>
<div class="section" id="starting-profiler-automatically-using-environment-variable">
<span id="starting-profiler-automatically-using-environment-variable"></span><h3>Starting profiler automatically using environment variable<a class="headerlink" href="#starting-profiler-automatically-using-environment-variable" title="Permalink to this headline"></a></h3>
<p>The method described above requires code changes to start and stop the profiler. You can also start the profiler automatically and profile the entire code without any code changes using the <code class="docutils literal"><span class="pre">MXNET_PROFILER_AUTOSTART</span></code> environment variable.</p>
<p>MXNet will start the profiler automatically if you run your code with the environment variable <code class="docutils literal"><span class="pre">MXNET_PROFILER_AUTOSTART</span></code> set to <code class="docutils literal"><span class="pre">1</span></code>. The profiler output is stored into <code class="docutils literal"><span class="pre">profile.json</span></code> in the current directory.</p>
<p>Note that the profiler output could be large depending on your code. It might be helpful to profile only sections of your code using the <code class="docutils literal"><span class="pre">set_state</span></code> API described in the previous section.</p>
</div>
<div class="section" id="increasing-granularity-of-the-profiler-output">
<span id="increasing-granularity-of-the-profiler-output"></span><h3>Increasing granularity of the profiler output<a class="headerlink" href="#increasing-granularity-of-the-profiler-output" title="Permalink to this headline"></a></h3>
<p>MXNet executes computation graphs in ‘bulk mode’ which reduces kernel launch gaps in between symbolic operators for faster execution. This could reduce the granularity of the profiler output. If you need profiling result of every operator, please set the environment variables <code class="docutils literal"><span class="pre">MXNET_EXEC_BULK_EXEC_INFERENCE</span></code> and <code class="docutils literal"><span class="pre">MXNET_EXEC_BULK_EXEC_TRAIN</span></code> to <code class="docutils literal"><span class="pre">0</span></code> to disable the bulk execution mode.</p>
</div>
<div class="section" id="viewing-profiler-output">
<span id="viewing-profiler-output"></span><h3>Viewing profiler output<a class="headerlink" href="#viewing-profiler-output" title="Permalink to this headline"></a></h3>
<p>There are two ways to view the information collected by the profiler. You can either view it in the console or you can view a more graphical version in a browser.</p>
<div class="section" id="view-in-console">
<span id="view-in-console"></span><h4>1. View in console<a class="headerlink" href="#view-in-console" title="Permalink to this headline"></a></h4>
<p>You can use the <code class="docutils literal"><span class="pre">profiler.dumps()</span></code> method to view the information collected by the profiler in the console. The collected information contains time taken by each operator, time taken by each C API and memory consumed in both CPU and GPU.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="k">print</span><span class="p">(</span><span class="n">profiler</span><span class="o">.</span><span class="n">dumps</span><span class="p">())</span>
</pre></div>
</div>
<p><img alt="Profile Statistics" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profile_stats.png"><!--notebook-skip-line--></img></p>
</div>
<div class="section" id="view-in-browser">
<span id="view-in-browser"></span><h4>2. View in browser<a class="headerlink" href="#view-in-browser" title="Permalink to this headline"></a></h4>
<p>You can also dump the information collected by the profiler into a <code class="docutils literal"><span class="pre">json</span></code> file using the <code class="docutils literal"><span class="pre">profiler.dump()</span></code> function and view it in a browser.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">profiler</span><span class="o">.</span><span class="n">dump</span><span class="p">()</span>
</pre></div>
</div>
<p><code class="docutils literal"><span class="pre">dump()</span></code> creates a <code class="docutils literal"><span class="pre">json</span></code> file which can be viewed using a trace consumer like <code class="docutils literal"><span class="pre">chrome://tracing</span></code> in the Chrome browser. Here is a snapshot that shows the output of the profiling we did above.</p>
<p><img alt="Tracing Screenshot" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profiler_output_chrome.png"/></p>
<p>Let’s zoom in to check the time taken by operators</p>
<p><img alt="Operator profiling" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profile_operators.png"/></p>
<p>The above picture visualizes the sequence in which the operators were executed and the time taken by each operator.</p>
</div>
</div>
<div class="section" id="further-reading">
<span id="further-reading"></span><h3>Further reading<a class="headerlink" href="#further-reading" title="Permalink to this headline"></a></h3>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference external" href="https://github.com/apache/incubator-mxnet/tree/master/example/profiler">Examples using MXNet profiler.</a></li>
<li class="toctree-l1"><a class="reference external" href="/faq/perf.html">Some tips for improving MXNet performance.</a></li>
</ul>
</div>
<div class="btn-group" role="group">
<div class="download-btn"><a download="profiler.ipynb" href="profiler.ipynb"><span class="glyphicon glyphicon-download-alt"></span> profiler.ipynb</a></div></div></div>
</div>
</div>
</div>
</div>
<div aria-label="main navigation" class="sphinxsidebar rightsidebar" role="navigation">
<div class="sphinxsidebarwrapper">
<h3><a href="../../index.html">Table Of Contents</a></h3>
<ul>
<li><a class="reference internal" href="#">Profiling MXNet Models</a><ul>
<li><a class="reference internal" href="#the-incorrect-way-to-profile">The incorrect way to profile</a></li>
<li><a class="reference internal" href="#the-correct-way-to-profile">The correct way to profile</a><ul>
<li><a class="reference internal" href="#setup-build-a-model">Setup: Build a model</a></li>
<li><a class="reference internal" href="#starting-and-stopping-the-profiler-from-python">Starting and stopping the profiler from Python</a></li>
<li><a class="reference internal" href="#starting-profiler-automatically-using-environment-variable">Starting profiler automatically using environment variable</a></li>
<li><a class="reference internal" href="#increasing-granularity-of-the-profiler-output">Increasing granularity of the profiler output</a></li>
<li><a class="reference internal" href="#viewing-profiler-output">Viewing profiler output</a><ul>
<li><a class="reference internal" href="#view-in-console">1. View in console</a></li>
<li><a class="reference internal" href="#view-in-browser">2. View in browser</a></li>
</ul>
</li>
<li><a class="reference internal" href="#further-reading">Further reading</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
</div>
</div><div class="footer">
<div class="section-disclaimer">
<div class="container">
<div>
<img height="60" src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/apache_incubator_logo.png"/>
<p>
Apache MXNet is an effort undergoing incubation at The Apache Software Foundation (ASF), <strong>sponsored by the <i>Apache Incubator</i></strong>. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
</p>
<p>
"Copyright © 2017-2018, The Apache Software Foundation
Apache MXNet, MXNet, Apache, the Apache feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the Apache Software Foundation."
</p>
</div>
</div>
</div>
</div> <!-- pagename != index -->
</div>
<script crossorigin="anonymous" integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script>
<script src="../../_static/js/sidebar.js" type="text/javascript"></script>
<script src="../../_static/js/search.js" type="text/javascript"></script>
<script src="../../_static/js/navbar.js" type="text/javascript"></script>
<script src="../../_static/js/clipboard.min.js" type="text/javascript"></script>
<script src="../../_static/js/copycode.js" type="text/javascript"></script>
<script src="../../_static/js/page.js" type="text/javascript"></script>
<script src="../../_static/js/docversion.js" type="text/javascript"></script>
<script type="text/javascript">
$('body').ready(function () {
$('body').css('visibility', 'visible');
});
</script>
</body>
</html>