<div class="section" id="training-on-gpu">
<span id="training-on-gpu"></span><h1>Training on GPU<a class="headerlink" href="#training-on-gpu" title="Permalink to this headline"></a></h1>
<hr class="docutils" />
Because GPUs are much faster than CPUs at linear algebra, and training deep learning models involves a large number of linear algebra operations, it is essential to support training on GPU cards. SINGA currently supports training on a single node (i.e., a single process) with multiple GPU cards. Training over a GPU cluster with multiple nodes is under development.
<div class="section" id="instructions">
<span id="instructions"></span><h2>Instructions<a class="headerlink" href="#instructions" title="Permalink to this headline"></a></h2>
<div class="section" id="compilation">
<span id="compilation"></span><h3>Compilation<a class="headerlink" href="#compilation" title="Permalink to this headline"></a></h3>
<p>To enable the training on GPU, you need to compile SINGA with <a class="reference external" href="http://www.nvidia.com/object/cuda_home_new.html">CUDA</a> from Nvidia,</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">./</span><span class="n">configure</span> <span class="o">--</span><span class="n">enable</span><span class="o">-</span><span class="n">cuda</span> <span class="o">--</span><span class="k">with</span><span class="o">-</span><span class="n">cuda</span><span class="o">=&lt;</span><span class="n">path</span> <span class="n">to</span> <span class="n">cuda</span> <span class="n">folder</span><span class="o">&gt;</span>
</pre></div>
</div>
In addition, if you want to use the [cuDNN library](https://developer.nvidia.com/cudnn) provided by Nvidia for convolutional neural networks, enable cuDNN as well:

```
./configure --enable-cuda --with-cuda=<path to cuda folder> --enable-cudnn --with-cudnn=<path to cudnn folder>
```
SINGA currently supports cuDNN V3 and V4.
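For example, on a machine where CUDA is installed under `/usr/local/cuda` and cuDNN under `/usr/local/cudnn` (these paths are assumptions; substitute your own install locations), the combined command might look like:

```
# example install paths; adjust to your system
./configure --enable-cuda --with-cuda=/usr/local/cuda \
            --enable-cudnn --with-cudnn=/usr/local/cudnn
```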
<div class="section" id="configuration">
<span id="configuration"></span><h3>Configuration<a class="headerlink" href="#configuration" title="Permalink to this headline"></a></h3>
<p>The job configuration for GPU training is similar to that for training on CPU.
There is one more field to configure, <code class="docutils literal"><span class="pre">gpu</span></code>, which indicate the device ID of
the GPU you want to use. The simplest configuration is</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># job.conf</span>
<span class="o">...</span>
<span class="n">gpu</span><span class="p">:</span> <span class="mi">0</span>
<span class="o">...</span>
</pre></div>
</div>
<div class="section" id="single-node-with-multiple-gpus">
<span id="single-node-with-multiple-gpus"></span><h4>Single node with multiple GPUs<a class="headerlink" href="#single-node-with-multiple-gpus" title="Permalink to this headline"></a></h4>
<p>This configuration will run the worker on GPU 0. If you want to launch multiple
workers, each on a separate GPU, you can configure it as</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># job.conf</span>
<span class="o">...</span>
<span class="n">gpu</span><span class="p">:</span> <span class="mi">0</span>
<span class="n">gpu</span><span class="p">:</span> <span class="mi">2</span>
<span class="o">...</span>
<span class="n">cluster</span> <span class="p">{</span>
<span class="n">nworkers_per_group</span><span class="p">:</span> <span class="mi">2</span>
<span class="n">nworkers_per_process</span><span class="p">:</span> <span class="mi">2</span>
<span class="p">}</span>
</pre></div>
</div>
With the above configuration, SINGA partitions each mini-batch evenly across the two workers, which run on GPU 0 and GPU 2 respectively. For more information on running multiple workers on a single node, please refer to [Training Framework](frameworks.html). Be careful to configure the same number of workers as the number of `gpu` fields; otherwise some workers would run on GPU while the rest run on CPU, and such hybrid training is not well supported for now.
For some layers, e.g., InnerProductLayer, GRULayer and ReLULayer, the implementation is transparent to GPU/CPU, so the same configuration runs them on either device. For other layers, especially those used in ConvNets, SINGA uses separate implementations for GPU and CPU; in particular, the GPU versions are implemented with the cuDNN library. To train a ConvNet on GPU, configure the layers as
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">layer</span> <span class="p">{</span>
<span class="nb">type</span><span class="p">:</span> <span class="n">kCudnnConv</span>
<span class="o">...</span>
<span class="p">}</span>
<span class="n">layer</span> <span class="p">{</span>
<span class="nb">type</span><span class="p">:</span> <span class="n">kCudnnPool</span>
<span class="o">...</span>
<span class="p">}</span>
</pre></div>
</div>
<p>The <a class="reference external" href="cnn.html">cifar10 example</a> and <a class="reference external" href="alexnet.html">Alexnet example</a> have complete
configurations for ConvNet.</p>
</div>
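As a rough sketch of what such a configuration can look like in full, the snippet below pairs a cuDNN convolution layer with a cuDNN pooling layer. The layer names, the `srclayers` wiring and the `convolution_conf`/`pooling_conf` values are illustrative assumptions modeled on the cifar10 example, not a definitive recipe; check that example for the exact fields your SINGA version expects.

```
layer {
  name: "conv1"                # illustrative name
  type: kCudnnConv
  srclayers: "data"            # assumed source layer
  convolution_conf {
    num_filters: 32            # example hyper-parameters
    kernel: 5
    pad: 2
    stride: 1
  }
}
layer {
  name: "pool1"
  type: kCudnnPool
  srclayers: "conv1"
  pooling_conf {
    pool: MAX
    kernel: 3
    stride: 2
  }
}
```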
<div class="section" id="gpu-cluster">
<span id="gpu-cluster"></span><h4>GPU cluster<a class="headerlink" href="#gpu-cluster" title="Permalink to this headline"></a></h4>
<p>For distributed training over a (GPU) cluster, you just need to configure SINGA with
<code class="docutils literal"><span class="pre">--enable-dist</span></code>, which would then compile SINGA with zookeeper and ZeroMQ.</p>
</div>
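For instance, combining the distributed option with the GPU options from above (the CUDA path is again an assumption):

```
# example path; adjust to your CUDA install
./configure --enable-dist --enable-cuda --with-cuda=/usr/local/cuda
```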

## Implementation details
SINGA implements GPU training by assigning each worker a GPU device at the beginning of training (done by the Driver class). The worker can then call GPU functions, which run on its assigned GPU. GPUs are typically used for the linear algebra computation in layer functions, since that is what they excel at. There is a Context singleton that stores the handles and random number generators for each device. Layer code should detect the device it is running on and call the CPU or GPU functions accordingly.
To make layer implementation easier, SINGA provides linear algebra functions (in *math_blob.h*) that are transparent to the running device. Internally, they query the Context singleton for the device information and then run the computation on the CPU or GPU accordingly. Consequently, users can implement layers without being aware of the underlying device.
If the required functionality cannot be implemented with the functions provided in *math_blob.h*, the layer code has to handle the CPU and GPU devices explicitly by querying the Context singleton. For layers that cannot run on GPU, e.g., input/output layers and connection layers, which do little computation but heavy I/O or network work, there is no need to consider the GPU device; when such layers are configured in a neural net, they run on CPU (since they do not call any GPU functions).