<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Distributed Training &mdash; incubator-singa 0.3.0 documentation</title>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="top" title="incubator-singa 0.3.0 documentation" href="../index.html"/>
<script src="../_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search">
<a href="../index.html" class="icon icon-home"> incubator-singa
<img src="../_static/singa.png" class="logo" />
</a>
<div class="version">
0.3.0
</div>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="index.html">Latest Documentation</a></li>
</ul>
<p class="caption"><span class="caption-text">Development</span></p>
<ul class="simple">
</ul>
<p class="caption"><span class="caption-text">Community</span></p>
<ul class="simple">
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
<nav class="wy-nav-top" role="navigation" aria-label="top navigation">
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">incubator-singa</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html">Docs</a> &raquo;</li>
<li>Distributed Training</li>
<li class="wy-breadcrumbs-aside">
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="">
<span id="id1"></span><h1>Distributed Training<a class="headerlink" href="#" title="Permalink to this headline"></a></h1>
<hr class="docutils" />
<div class="section" id="cluster-topology">
<span id="cluster-topology"></span><h2>Configuring the Cluster Topology<a class="headerlink" href="#cluster-topology" title="Permalink to this headline"></a></h2>
<p>This page describes how to run different distributed training frameworks in SINGA.</p>
<p>The cluster topology is configured through the <code class="docutils literal"><span class="pre">cluster</span></code> field of <code class="docutils literal"><span class="pre">JobProto</span></code>.
The <code class="docutils literal"><span class="pre">cluster</span></code> field is of type <code class="docutils literal"><span class="pre">ClusterProto</span></code>, an excerpt of which is shown below:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span>message ClusterProto {
  optional int32 nworker_groups = 1;
  optional int32 nserver_groups = 2;
  optional int32 nworkers_per_group = 3 [default = 1];
  optional int32 nservers_per_group = 4 [default = 1];
  optional int32 nworkers_per_procs = 5 [default = 1];
  optional int32 nservers_per_procs = 6 [default = 1];
  // servers and workers in different processes?
  optional bool server_worker_separate = 20 [default = false];
  ......
}
</pre></div>
</div>
<p>The most commonly used fields are:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">nworkers_per_group</span></code> and <code class="docutils literal"><span class="pre">nworkers_per_procs</span></code>:
determine how the worker-side ParamShard is partitioned.</li>
<li><code class="docutils literal"><span class="pre">nservers_per_group</span></code> and <code class="docutils literal"><span class="pre">nservers_per_procs</span></code>:
determine how the server-side ParamShard is partitioned.</li>
<li><code class="docutils literal"><span class="pre">server_worker_separate</span></code>:
whether servers and workers run in separate processes.</li>
</ul>
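<p>As a rough illustration of how the group fields combine, the following Python sketch mirrors a few of the fields listed above in a hypothetical <code class="docutils literal"><span class="pre">ClusterConfig</span></code> class and derives the total number of workers and servers. The class and its method names are assumptions made for this example and are not part of SINGA; only the field names and defaults come from <code class="docutils literal"><span class="pre">ClusterProto</span></code>.</p>

```python
from dataclasses import dataclass


@dataclass
class ClusterConfig:
    """Hypothetical mirror of some ClusterProto fields (not a SINGA class)."""
    nworker_groups: int
    nserver_groups: int
    nworkers_per_group: int = 1   # same default as in ClusterProto
    nservers_per_group: int = 1   # same default as in ClusterProto
    server_worker_separate: bool = False

    def total_workers(self):
        # Every worker group holds nworkers_per_group workers.
        return self.nworker_groups * self.nworkers_per_group

    def total_servers(self):
        # Every server group holds nservers_per_group servers.
        return self.nserver_groups * self.nservers_per_group


# Example values chosen only for illustration.
cfg = ClusterConfig(nworker_groups=2, nserver_groups=1,
                    nworkers_per_group=2, nservers_per_group=2,
                    server_worker_separate=True)
print(cfg.total_workers(), cfg.total_servers())
```

<p>With the values above, the sketch reports 4 workers and 2 servers.</p>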
</div>
<div class="section" id="">
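<span id="id2"></span><h2>Different Training Frameworks<a class="headerlink" href="#" title="Permalink to this headline"></a></h2>
<p>In SINGA, worker groups run asynchronously with respect to one another, while the workers inside a group run synchronously. Users can build on this general design to run both <strong>synchronous</strong> and <strong>asynchronous</strong> training frameworks. The sections below describe how to configure and run several well-known distributed training frameworks.</p>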
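<p>The rule stated above can be written down as a small Python sketch: a topology with a single worker group trains synchronously, and one with several worker groups trains asynchronously across groups. The function name <code class="docutils literal"><span class="pre">training_mode</span></code> is hypothetical and only serves this example.</p>

```python
def training_mode(nworker_groups):
    """Classify a topology by its number of worker groups.

    Workers inside one group run synchronously, while separate groups
    run asynchronously, so a single group means synchronous training.
    """
    if nworker_groups == 1:
        return "synchronous"
    if nworker_groups > 1:
        return "asynchronous"
    raise ValueError("nworker_groups must be at least 1")


# (nworker_groups, nworkers_per_group) pairs chosen only for illustration.
for groups, per_group in [(1, 3), (2, 2), (3, 1)]:
    print(groups, per_group, training_mode(groups))
```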
<p><img src="../_static/images/frameworks.png" style="width: 800px"/></p>
<p><strong>Fig.1 - Different training frameworks</strong></p>
<h3>Sandblaster</h3>
<p>A <strong>synchronous</strong> framework used by Google Brain.
Fig.1(a) shows an example cluster configuration for running the Sandblaster framework in SINGA.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cluster</span> <span class="p">{</span>
  <span class="n">nworker_groups</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nserver_groups</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nworkers_per_group</span><span class="p">:</span> <span class="mi">3</span>
  <span class="n">nservers_per_group</span><span class="p">:</span> <span class="mi">2</span>
  <span class="n">server_worker_separate</span><span class="p">:</span> <span class="n">true</span>
<span class="p">}</span>
</pre></div>
</div>
<p>Each server group handles the requests from all workers.
Each worker computes on the part of the neural network model assigned to it and communicates with all servers to obtain the values of the parameters it needs.</p>
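<p>The communication pattern described above can be pictured with a toy Python sketch: the parameters are spread over two servers, and each of the three workers fetches the parameters for its part of the model from whichever servers hold them. The parameter names, their values, and the round-robin placement rule are assumptions made for illustration; this is not how SINGA itself places parameters.</p>

```python
# Toy parameter store: two servers, each holding a slice of the parameters.
# The parameter names and the round-robin placement are hypothetical.
NUM_SERVERS = 2
params = {"w0": 0.1, "w1": 0.2, "w2": 0.3, "w3": 0.4, "w4": 0.5, "w5": 0.6}

servers = [dict() for _ in range(NUM_SERVERS)]
for index, (name, value) in enumerate(sorted(params.items())):
    servers[index % NUM_SERVERS][name] = value


def fetch(needed):
    """Collect the requested parameters from whichever servers hold them."""
    found = {}
    for shard in servers:
        for name in needed:
            if name in shard:
                found[name] = shard[name]
    return found


# Three workers, each responsible for one part of the model.
worker_parts = [["w0", "w1"], ["w2", "w3"], ["w4", "w5"]]
for worker_id, needed in enumerate(worker_parts):
    print(worker_id, fetch(needed))
```

<p>In this sketch every worker needs one parameter from each server, so every worker ends up talking to both servers.</p>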
<h3>AllReduce</h3>
<p>A <strong>synchronous</strong> framework used by Baidu&#8217;s DeepImage.
Fig.1(b) shows an example cluster configuration for running the AllReduce framework in SINGA.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cluster</span> <span class="p">{</span>
  <span class="n">nworker_groups</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nserver_groups</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nworkers_per_group</span><span class="p">:</span> <span class="mi">3</span>
  <span class="n">nservers_per_group</span><span class="p">:</span> <span class="mi">3</span>
  <span class="n">server_worker_separate</span><span class="p">:</span> <span class="n">false</span>
<span class="p">}</span>
</pre></div>
</div>
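<p>One worker and one server run on each node, so that each node is responsible for one partition of the parameters and performs the computation for it. Each node then exchanges the updated information with the other nodes.</p>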
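<p>A minimal Python sketch of this exchange, assuming three nodes that each own one parameter: the owner of a partition collects the gradient for that partition from all nodes, applies the averaged update, and the updated partitions are then shared with everyone. The learning rate, the gradient values, and the function name <code class="docutils literal"><span class="pre">allreduce_step</span></code> are made up for illustration and do not describe SINGA's internals.</p>

```python
NUM_NODES = 3
LEARNING_RATE = 0.5  # hypothetical value

# Node i owns parameter i; every node starts with the same full copy.
params = [1.0, 2.0, 3.0]

# Gradients each node computed on its own data (made-up numbers).
local_grads = [
    [1.0, 0.0, 2.0],   # node 0
    [1.0, 2.0, 0.0],   # node 1
    [1.0, 1.0, 1.0],   # node 2
]


def allreduce_step(params, local_grads):
    updated = list(params)
    for owner in range(NUM_NODES):
        # The owner collects the gradient for its partition from all nodes
        total = sum(grads[owner] for grads in local_grads)
        # and applies the averaged update to the parameter it maintains.
        updated[owner] = params[owner] - LEARNING_RATE * total / NUM_NODES
    # Every node then receives the updated partitions from the other nodes.
    return updated


new_params = allreduce_step(params, local_grads)
print(new_params)
```

<p>With these numbers each partition receives a summed gradient of 3.0, so every parameter decreases by 0.5.</p>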
<h3>Downpour</h3>
<p>An <strong>asynchronous</strong> framework used by Google Brain.
Fig.1(c) shows an example cluster configuration for running the Downpour framework in SINGA.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cluster</span> <span class="p">{</span>
  <span class="n">nworker_groups</span><span class="p">:</span> <span class="mi">2</span>
  <span class="n">nserver_groups</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nworkers_per_group</span><span class="p">:</span> <span class="mi">2</span>
  <span class="n">nservers_per_group</span><span class="p">:</span> <span class="mi">2</span>
  <span class="n">server_worker_separate</span><span class="p">:</span> <span class="n">true</span>
<span class="p">}</span>
</pre></div>
</div>
<p>As in the synchronous Sandblaster framework, all workers send their requests to a single server group. Here, however, the workers are divided into several worker groups, and each worker computes with the latest parameters it received in the <em>update</em> reply.</p>
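<p>The following Python sketch illustrates the asynchronous pattern under simplifying assumptions: a single scalar parameter, two worker groups, and a toy <code class="docutils literal"><span class="pre">ServerGroup</span></code> class whose <code class="docutils literal"><span class="pre">update</span></code> method returns the freshest value as its reply. The class, the learning rate, and the gradient values are hypothetical and are not SINGA code.</p>

```python
LEARNING_RATE = 0.5  # hypothetical value


class ServerGroup:
    """Toy stand-in for the single server group (not a SINGA class)."""

    def __init__(self, value):
        self.value = value

    def update(self, gradient):
        # Apply the gradient, then reply with the freshest parameter.
        self.value -= LEARNING_RATE * gradient
        return self.value


server = ServerGroup(4.0)
# Each worker group keeps its own, possibly stale, copy of the parameter.
local_copy = {"group0": 4.0, "group1": 4.0}

# Groups report in arbitrary order; the gradients are made-up numbers.
arrivals = [("group0", 2.0), ("group1", 4.0), ("group1", 1.0)]
for group, gradient in arrivals:
    local_copy[group] = server.update(gradient)
    print(group, local_copy[group])
```

<p>After the three updates the server holds 0.5, while <code class="docutils literal"><span class="pre">group0</span></code> still works with the older value 3.0 until its next update reply arrives.</p>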
<h3>Distributed Hogwild</h3>
<p>An <strong>asynchronous</strong> framework used by Caffe.
Fig.1(d) shows an example cluster configuration for running the Hogwild framework in SINGA.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cluster</span> <span class="p">{</span>
  <span class="n">nworker_groups</span><span class="p">:</span> <span class="mi">3</span>
  <span class="n">nserver_groups</span><span class="p">:</span> <span class="mi">3</span>
  <span class="n">nworkers_per_group</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">nservers_per_group</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">server_worker_separate</span><span class="p">:</span> <span class="n">false</span>
<span class="p">}</span>
</pre></div>
</div>
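<p>Each node runs one server group and one worker group.
Because parameter updates are applied locally on each node, the communication cost of each training step is kept to a minimum. However, the server groups must periodically synchronize with their neighboring groups.</p>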
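<p>A toy Python sketch of this scheme, assuming three nodes that each keep their own copy of a single parameter: updates are applied locally at every step, and every second step each node averages its copy with a neighbor. The synchronization period, the ring-shaped neighbor choice, the averaging rule, and all numbers are assumptions made for illustration rather than SINGA's actual synchronization protocol.</p>

```python
LEARNING_RATE = 0.5   # hypothetical value
SYNC_EVERY = 2        # hypothetical synchronization period, in steps

# One parameter copy per node (each node hosts its own server group).
node_params = [4.0, 4.0, 4.0]

# Per-step gradients for each node (made-up numbers).
gradients = [
    [2.0, 0.0, 4.0],   # step 1
    [0.0, 4.0, 0.0],   # step 2
]

for step, grads in enumerate(gradients, start=1):
    # Updates are applied locally, with no communication.
    node_params = [p - LEARNING_RATE * g for p, g in zip(node_params, grads)]
    if step % SYNC_EVERY == 0:
        # Periodic synchronization, simplified here to averaging each
        # node with its right-hand neighbor on a ring.
        n = len(node_params)
        node_params = [(node_params[i] + node_params[(i + 1) % n]) / 2
                       for i in range(n)]

print(node_params)
```

<p>Before the synchronization at step 2 the copies are 3.0, 2.0 and 2.0; averaging with the neighbor pulls them closer together, to 2.5, 2.0 and 2.5.</p>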
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016 The Apache Software Foundation. All rights reserved. Apache Singa, Apache, the Apache feather logo, and the Apache Singa project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'../',
VERSION:'0.3.0',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script type="text/javascript">
jQuery(function () {
SphinxRtdTheme.StickyNav.enable();
});
</script>
<div class="rst-versions shift-up" data-toggle="rst-versions" role="note" aria-label="versions">
<img src="../_static/apache.jpg">
<span class="rst-current-version" data-toggle="rst-current-version">
<span class="fa fa-book"> incubator-singa </span>
v: 0.3.0
<span class="fa fa-caret-down"></span>
</span>
<div class="rst-other-versions">
<dl>
<dt>Languages</dt>
<dd><a href="../../en/index.html">English</a></dd>
<dd><a href="../../zh/index.html">中文</a></dd>
<dd><a href="../../jp/index.html">日本語</a></dd>
<dd><a href="../../kr/index.html">한국어</a></dd>
</dl>
</div>
</div>
<a href="https://github.com/apache/incubator-singa">
<img style="position: absolute; top: 0; right: 0; border: 0; z-index: 10000;"
src="https://s3.amazonaws.com/github/ribbons/forkme_right_orange_ff7600.png"
alt="Fork me on GitHub">
</a>
</body>
</html>