<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>Overview - MXNet.jl</title>
</head>
<body>
<main>
<article>
<!-- Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements. See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership. The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied. See the License for the
     specific language governing permissions and limitations
     under the License. -->
<p><a id='Overview-1'></a></p>
<h1 id="overview">Overview</h1>
<p><a id='MXNet.jl-Namespace-1'></a></p>
<h2 id="mxnetjl-namespace">MXNet.jl Namespace</h2>
<p>Most of the functions and types in MXNet.jl are organized in a flat namespace. Because some functions conflict with existing names in the Julia Base module, we wrap them all in an <code>mx</code> module. The convention for accessing the MXNet.jl interface is to use the <code>mx.</code> prefix explicitly:</p>
<pre><code class="julia">julia&gt; using MXNet
julia&gt; x = mx.zeros(2, 3) # MXNet NDArray
2×3 mx.NDArray{Float32} @ CPU0:
0.0 0.0 0.0
0.0 0.0 0.0
julia&gt; y = zeros(eltype(x), size(x)) # Julia Array
2×3 Array{Float32,2}:
0.0 0.0 0.0
0.0 0.0 0.0
julia&gt; copy!(y, x) # Overloaded function in Julia Base
2×3 Array{Float32,2}:
0.0 0.0 0.0
0.0 0.0 0.0
julia&gt; z = mx.ones(size(x), mx.gpu()) # MXNet NDArray on GPU
2×3 mx.NDArray{Float32} @ GPU0:
1.0 1.0 1.0
1.0 1.0 1.0
julia&gt; mx.copy!(z, y) # Same as copy!(z, y)
2×3 mx.NDArray{Float32} @ GPU0:
0.0 0.0 0.0
0.0 0.0 0.0
</code></pre>
<p>Note that functions like <code>size</code> and <code>copy!</code>, which are extensively overloaded for various types, work out of the box. But functions like <code>zeros</code> and <code>ones</code> would be ambiguous, so we always use the <code>mx.</code> prefix. If you prefer, the <code>mx.</code> prefix can be used explicitly for all MXNet.jl functions, including <code>size</code> and <code>copy!</code>, as shown in the last line.</p>
<p><a id='Low-Level-Interface-1'></a></p>
<h2 id="low-level-interface">Low Level Interface</h2>
<p><a id='NDArray-1'></a></p>
<h3 id="ndarray"><code>NDArray</code></h3>
<p><code>NDArray</code> is the basic building block of the actual computations in MXNet. It is like a Julia <code>Array</code> object, with some important differences listed here:</p>
<ul>
<li>The actual data can live on a different <code>Context</code> (e.g. a GPU). For some contexts, iterating over the elements one by one is very slow, so indexing into an <code>NDArray</code> is not recommended in general. The easiest way to inspect the contents of an <code>NDArray</code> is to use the <code>copy</code> function to copy the contents into a Julia <code>Array</code> (see the snippet just after this list).</li>
<li>Operations on <code>NDArray</code>s (including basic arithmetic and neural-network-related operators) are executed in parallel, with automatic dependency tracking to ensure correctness.</li>
<li>There are no generics in <code>NDArray</code>: the <code>eltype</code> is always <code>mx.MX_float</code>. For machine learning applications, single-precision floating point numbers are typically the best choice, balancing precision, speed, and portability. Also, since libmxnet is designed to support multiple language front-ends, it is much simpler to implement with a fixed data type.</li>
</ul>
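<p>For example, a minimal sketch of the <code>copy</code>-to-inspect pattern from the first bullet:</p>
<pre><code class="julia">using MXNet
x = mx.zeros(2, 3)  # NDArray living on a device
x_jl = copy(x)      # a regular 2×3 Array{Float32,2}, free to index
x_jl[1, 2]
</code></pre>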
<p>While most of the computation is hidden inside libmxnet, behind operators corresponding to various neural network layers, getting familiar with the <code>NDArray</code> API is useful for implementing an <code>Optimizer</code> or customized operators directly in Julia.</p>
<p>The following are common ways to create <code>NDArray</code> objects (a combined sketch follows the list):</p>
<ul>
<li><code>NDArray(undef, shape...; ctx = context, writable = true)</code>: create an uninitialized array of a given shape on a specific device. For example, <code>NDArray(undef, 2, 3)</code>, <code>NDArray(undef, 2, 3, ctx = mx.gpu(2))</code>.</li>
<li><code>NDArray(undef, shape; ctx = context, writable = true)</code>: the same, with the shape given as a tuple.</li>
<li><code>NDArray{T}(undef, shape...; ctx = context, writable = true)</code>: create an uninitialized array with the given element type <code>T</code>.</li>
<li><code>mx.zeros(shape[, context])</code> and <code>mx.ones(shape[, context])</code>: similar to Julia's built-in <code>zeros</code> and <code>ones</code>.</li>
<li><code>mx.copy(jl_arr, context)</code>: copy the contents of a Julia <code>Array</code> to a specific device.</li>
</ul>
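<p>A minimal combined sketch of the constructors above (the <code>mx.gpu()</code> line assumes a GPU-enabled build of libmxnet; drop it otherwise):</p>
<pre><code class="julia">using MXNet
a = NDArray(undef, 2, 3)                    # uninitialized, default context
b = NDArray{Float32}(undef, (2, 3))         # shape as a tuple, explicit eltype
c = mx.zeros((2, 3), mx.cpu())              # all zeros on the CPU
d = mx.ones((2, 3), mx.gpu())               # all ones on the first GPU
e = mx.copy(rand(Float32, 2, 3), mx.cpu())  # copy a Julia Array to a device
</code></pre>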
<p>Most of the convenience functions like <code>size</code>, <code>length</code>, <code>ndims</code>, <code>eltype</code> on array objects should work out of the box. Although element indexing is not supported, it is possible to take <em>slices</em>:</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; a = mx.ones(2, 3)
2×3 NDArray{Float32,2} @ cpu0:
1.0f0 1.0f0 1.0f0
1.0f0 1.0f0 1.0f0
julia&gt; b = mx.slice(a, 1:2)
2×2 NDArray{Float32,2} @ cpu0:
1.0f0 1.0f0
1.0f0 1.0f0
julia&gt; b[:] = 2
2
julia&gt; a
2×3 NDArray{Float32,2} @ cpu0:
2.0f0 2.0f0 1.0f0
2.0f0 2.0f0 1.0f0
</code></pre>
<p>A slice is a sub-region sharing the same memory with the original <code>NDArray</code> object. A slice is always a contiguous piece of memory, so only slicing on the <em>last</em> dimension is supported. The example above also shows a way to set the contents of an <code>NDArray</code>.</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; mx.seed!(42);
julia&gt; a = NDArray(undef, 2, 3)
2×3 NDArray{Float32,2} @ cpu0:
 2.2f-44   3.36f-43  0.0f0
 0.0f0     0.0f0     4.5901f-41
julia&gt; a[:] = rand(Float32, size(a));  # set contents with a Julia Array
julia&gt; copy!(a, rand(Float32, size(a)));  # set value by copying a Julia Array
julia&gt; a[:] = 0.5  # set all elements to a scalar
0.5
julia&gt; b = NDArray(undef, size(a))
2×3 NDArray{Float32,2} @ cpu0:
 2.2f-44  3.14f-43  0.0f0
 0.0f0    0.0f0     4.5901f-41
julia&gt; b[:] = a  # copying and assignment between NDArrays
2×3 NDArray{Float32,2} @ cpu0:
 0.5f0  0.5f0  0.5f0
 0.5f0  0.5f0  0.5f0
</code></pre>
<p>Note that, due to the intrinsic design of the Julia language, a normal assignment</p>
<pre><code class="julia">a = b
</code></pre>
<p>does <strong>not</strong> mean copying the contents of <code>b</code> to <code>a</code>. Instead, it just makes the variable <code>a</code> point to a new object, which is <code>b</code>. Similarly, inplace arithmetic does not work as expected:</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; a = mx.ones(2)
2-element NDArray{Float32,1} @ cpu0:
1.0f0
1.0f0
julia&gt; r = a # keep a reference to a
2-element NDArray{Float32,1} @ cpu0:
1.0f0
1.0f0
julia&gt; b = mx.ones(2)
2-element NDArray{Float32,1} @ cpu0:
1.0f0
1.0f0
julia&gt; a += b # translates to a = a + b
2-element NDArray{Float32,1} @ cpu0:
2.0f0
2.0f0
julia&gt; a
2-element NDArray{Float32,1} @ cpu0:
2.0f0
2.0f0
julia&gt; r
2-element NDArray{Float32,1} @ cpu0:
1.0f0
1.0f0
</code></pre>
<p>As we can see, <code>a</code> has the expected value, but instead of updating in place, a new <code>NDArray</code> was created and <code>a</code> was set to point to this new object. If we look at <code>r</code>, which still references the old <code>a</code>, its content has not changed. There is currently no way in Julia to overload operators like <code>+=</code> to get customized behavior.</p>
<p>Instead, you will need to write <code>a[:] = a + b</code>, or, if you want a <em>real</em> inplace <code>+=</code> operation, use the simple macro <code>@mx.inplace</code> that MXNet.jl provides:</p>
<pre><code class="julia-repl">julia&gt; @mx.inplace a += b
2-element NDArray{Float32,1} @ cpu0:
3.0f0
3.0f0
julia&gt; macroexpand(Main, :(@mx.inplace a += b))
:(MXNet.mx.add_to!(a, b))
</code></pre>
<p>As we can see, it translates the <code>+=</code> operator into an explicit <code>add_to!</code> function call, which calls into libmxnet to add the contents of <code>b</code> into <code>a</code> directly. For example, the following is the update rule in the <code>SGD</code> optimizer (both the gradient <code>∇</code> and the weight <code>W</code> are <code>NDArray</code> objects):</p>
<pre><code class="julia">@inplace W .+= -η .* (∇ + λ .* W)
</code></pre>
<p>Note there is not much magic in <code>@mx.inplace</code>: it only does a shallow translation. In the SGD update rule above, computations like scaling the gradient by the learning rate and adding the weight-decay term still create temporary <code>NDArray</code> objects. To mitigate this issue, libmxnet has a customized memory allocator designed specifically to handle this kind of situation. The following snippet benchmarks allocating temporary <code>NDArray</code>s in a loop vs. pre-allocating:</p>
<pre><code class="julia">using Benchmark
using MXNet

N_REP = 1000
SHAPE = (128, 64)
CTX   = mx.cpu()
LR    = 0.1

function inplace_op()
    weight = mx.zeros(SHAPE, CTX)
    grad   = mx.ones(SHAPE, CTX)
    # pre-allocate temp objects
    grad_lr = NDArray(undef, SHAPE, ctx = CTX)
    for i = 1:N_REP
        copy!(grad_lr, grad)
        @mx.inplace grad_lr .*= LR
        @mx.inplace weight -= grad_lr
    end
    return weight
end

function normal_op()
    weight = mx.zeros(SHAPE, CTX)
    grad   = mx.ones(SHAPE, CTX)
    for i = 1:N_REP
        weight[:] -= LR * grad
    end
    return weight
end

# make sure the results are the same
@assert(maximum(abs.(copy(normal_op() - inplace_op()))) &lt; 1e-6)
println(compare([inplace_op, normal_op], 100))
</code></pre>
<p>The comparison on my laptop shows that <code>normal_op</code>, while allocating a lot of temporary <code>NDArray</code>s in the loop (the relative performance gets worse as <code>N_REP</code> increases), is only about twice as slow as the pre-allocated version.</p>
<table>
<thead>
<tr>
<th align="right">Row</th>
<th align="right">Function</th>
<th align="right">Average</th>
<th align="right">Relative</th>
<th align="right">Replications</th>
</tr>
</thead>
<tbody>
<tr>
<td align="right">1</td>
<td align="right">"inplace_op"</td>
<td align="right">0.0074854</td>
<td align="right">1.0</td>
<td align="right">100</td>
</tr>
<tr>
<td align="right">2</td>
<td align="right">"normal_op"</td>
<td align="right">0.0174202</td>
<td align="right">2.32723</td>
<td align="right">100</td>
</tr>
</tbody>
</table>
<p>So this will usually not be a big problem unless it sits at the bottleneck of your computation.</p>
<p><a id='Distributed-Key-value-Store-1'></a></p>
<h3 id="distributed-key-value-store">Distributed Key-value Store</h3>
<p>The type <code>KVStore</code> and related methods are used for data sharing across different devices or machines. It provides a simple and efficient integer-to-<code>NDArray</code> key-value store that each device can push to and pull from.</p>
<p>The following example shows how to create a local <code>KVStore</code>, initialize a value and then pull it back.</p>
<pre><code class="julia">using MXNet

kv = mx.KVStore(:local)
shape = (2, 3)
key = 3
mx.init!(kv, key, mx.ones(shape) * 2)
a = NDArray(undef, shape)
mx.pull!(kv, key, a) # pull value into a
a
</code></pre>
<pre><code>2×3 NDArray{Float32,2} @ cpu0:
2.0f0 2.0f0 2.0f0
2.0f0 2.0f0 2.0f0
</code></pre>
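<p>Pushing works in the same way. A minimal sketch, reusing <code>kv</code>, <code>key</code>, <code>shape</code>, and <code>a</code> from above (with a <code>:local</code> store the update happens in-process):</p>
<pre><code class="julia">mx.push!(kv, key, mx.ones(shape) * 8)  # push a new value for the key
mx.pull!(kv, key, a)                   # pull the stored value back into a
</code></pre>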
<p><a id='Intermediate-Level-Interface-1'></a></p>
<h2 id="intermediate-level-interface">Intermediate Level Interface</h2>
<p><a id='Symbols-and-Composition-1'></a></p>
<h3 id="symbols-and-composition">Symbols and Composition</h3>
<p>The way we build deep learning models in MXNet.jl is to use its powerful symbolic composition system. It is like <a href="http://deeplearning.net/software/theano/">Theano</a>, except that we avoid the long expression-compilation time by providing <em>larger</em> neural-network-related building blocks that guarantee computation performance.</p>
<p>The basic type is <code>mx.SymbolicNode</code>. The following is a trivial example of composing two symbols with the <code>+</code> operation.</p>
<pre><code class="julia">A = mx.Variable(:A)
B = mx.Variable(:B)
C = A + B
print(C) # debug printing
</code></pre>
<pre><code>Symbol Outputs:
output[0]=_plus0(0)
Variable:A
Variable:B
--------------------
Op:elemwise_add, Name=_plus0
Inputs:
arg[0]=A(0) version=0
arg[1]=B(0) version=0
</code></pre>
<p>We get a new <code>SymbolicNode</code> by composing existing <code>SymbolicNode</code>s with some <em>operations</em>. A hierarchical architecture of a deep neural network can be realized by recursive composition. For example, the following code snippet shows a simple 2-layer MLP construction, using a hidden layer of 128 units and a <code>ReLU</code> activation function.</p>
<pre><code class="julia">net = mx.Variable(:data)
net = mx.FullyConnected(net, name=:fc1, num_hidden=128)
net = mx.Activation(net, name=:relu1, act_type=:relu)
net = mx.FullyConnected(net, name=:fc2, num_hidden=64)
net = mx.SoftmaxOutput(net, name=:out)
print(net) # debug printing
</code></pre>
<pre><code>Symbol Outputs:
output[0]=out(0)
Variable:data
Variable:fc1_weight
Variable:fc1_bias
--------------------
Op:FullyConnected, Name=fc1
Inputs:
arg[0]=data(0) version=0
arg[1]=fc1_weight(0) version=0
arg[2]=fc1_bias(0) version=0
Attrs:
num_hidden=128
--------------------
Op:Activation, Name=relu1
Inputs:
arg[0]=fc1(0)
Attrs:
act_type=relu
Variable:fc2_weight
Variable:fc2_bias
--------------------
Op:FullyConnected, Name=fc2
Inputs:
arg[0]=relu1(0)
arg[1]=fc2_weight(0) version=0
arg[2]=fc2_bias(0) version=0
Attrs:
num_hidden=64
Variable:out_label
--------------------
Op:SoftmaxOutput, Name=out
Inputs:
arg[0]=fc2(0)
arg[1]=out_label(0) version=0
</code></pre>
<p>Each time we take the previous symbol and compose it with an operation. Unlike the simple <code>+</code> example above, the <em>operations</em> here are &quot;bigger&quot; ones that correspond to common computation layers in deep neural networks.</p>
<p>Each of those operations takes one or more input symbols for composition, with optional hyper-parameters (e.g. <code>num_hidden</code>, <code>act_type</code>) to further customize the result of the composition.</p>
<p>When applying those operations, we can also specify a <code>name</code> for the result symbol. This is convenient if we want to refer to this symbol later on. If not supplied, a name will be automatically generated.</p>
<p>Each symbol takes some arguments. For example, in the <code>+</code> case above, to compute the value of <code>C</code> we need to know the values of the two inputs <code>A</code> and <code>B</code>. For neural networks, the arguments fall primarily into two categories: <em>inputs</em> and <em>parameters</em>. <em>Inputs</em> are the data and labels for the network, while <em>parameters</em> are typically trainable <em>weights</em>, <em>biases</em>, and <em>filters</em>.</p>
<p>When composing symbols, their arguments accumulate. We can list all the arguments with</p>
<pre><code class="julia">mx.list_arguments(net)
</code></pre>
<pre><code>6-element Array{Symbol,1}:
:data
:fc1_weight
:fc1_bias
:fc2_weight
:fc2_bias
:out_label
</code></pre>
<p>Note the names of the arguments are generated according to the name provided for each layer. We can also specify those names explicitly:</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; net = mx.Variable(:data)
SymbolicNode data
julia&gt; w = mx.Variable(:myweight)
SymbolicNode myweight
julia&gt; net = mx.FullyConnected(net, weight=w, name=:fc1, num_hidden=128)
SymbolicNode fc1
julia&gt; mx.list_arguments(net)
3-element Array{Symbol,1}:
:data
:myweight
:fc1_bias
</code></pre>
<p>The simple fact is that a <code>Variable</code> is just a placeholder <code>mx.SymbolicNode</code>. In composition, we can use arbitrary symbols for arguments. For example:</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; net = mx.Variable(:data)
SymbolicNode data
julia&gt; net = mx.FullyConnected(net, name=:fc1, num_hidden=128)
SymbolicNode fc1
julia&gt; net2 = mx.Variable(:data2)
SymbolicNode data2
julia&gt; net2 = mx.FullyConnected(net2, name=:net2, num_hidden=128)
SymbolicNode net2
julia&gt; mx.list_arguments(net2)
3-element Array{Symbol,1}:
:data2
:net2_weight
:net2_bias
julia&gt; composed_net = net2(data2=net, name=:composed)
SymbolicNode composed
julia&gt; mx.list_arguments(composed_net)
5-element Array{Symbol,1}:
:data
:fc1_weight
:fc1_bias
:net2_weight
:net2_bias
</code></pre>
<p>Note we use the composed symbol <code>net</code> as the argument <code>data2</code> for <code>net2</code> to get a new symbol, which we named <code>:composed</code>. It also shows that a symbol itself is a callable object that can be invoked to fill in missing arguments and build more complicated symbol compositions.</p>
<p><a id='Shape-Inference-1'></a></p>
<h3 id="shape-inference">Shape Inference</h3>
<p>Given enough information, the shapes of all arguments in a composed symbol can be inferred automatically. For example, given the input shape and some hyper-parameters like <code>num_hidden</code>, the shapes of the weights and bias in a neural network can be inferred.</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; net = mx.Variable(:data)
SymbolicNode data
julia&gt; net = mx.FullyConnected(net, name=:fc1, num_hidden=10)
SymbolicNode fc1
julia&gt; arg_shapes, out_shapes, aux_shapes = mx.infer_shape(net, data=(10, 64))
(Tuple[(10, 64), (10, 10), (10,)], Tuple[(10, 64)], Tuple[])
</code></pre>
<p>The returned shapes correspond to the arguments, in the same order as returned by <code>mx.list_arguments</code>. The <code>out_shapes</code> are the shapes of the outputs, and <code>aux_shapes</code> can be safely ignored for now.</p>
<pre><code class="julia-repl">julia&gt; for (n, s) in zip(mx.list_arguments(net), arg_shapes)
println(&quot;$n\t=&gt; $s&quot;)
end
data =&gt; (10, 64)
fc1_weight =&gt; (10, 10)
fc1_bias =&gt; (10,)
</code></pre>
<pre><code class="julia-repl">julia&gt; for (n, s) in zip(mx.list_outputs(net), out_shapes)
println(&quot;$n\t=&gt; $s&quot;)
end
fc1_output =&gt; (10, 64)
</code></pre>
<p><a id='Binding-and-Executing-1'></a></p>
<h3 id="binding-and-executing">Binding and Executing</h3>
<p>In order to execute the computation graph specified by a composed symbol, we <em>bind</em> the free variables to concrete values, specified as <code>mx.NDArray</code>s. This creates an <code>mx.Executor</code> on a given <code>mx.Context</code>. A context describes the computation devices (CPUs, GPUs, etc.), and an executor carries out the computation (forward/backward) specified in the corresponding symbolic composition.</p>
<pre><code class="julia-repl">julia&gt; using MXNet
julia&gt; A = mx.Variable(:A)
SymbolicNode A
julia&gt; B = mx.Variable(:B)
SymbolicNode B
julia&gt; C = A .* B
SymbolicNode _mul0
julia&gt; a = mx.ones(3) * 4
3-element NDArray{Float32,1} @ cpu0:
4.0f0
4.0f0
4.0f0
julia&gt; b = mx.ones(3) * 2
3-element NDArray{Float32,1} @ cpu0:
2.0f0
2.0f0
2.0f0
julia&gt; c_exec = mx.bind(C, context=mx.cpu(), args=Dict(:A =&gt; a, :B =&gt; b));
julia&gt; mx.forward(c_exec)
1-element Array{NDArray{Float32,1},1}:
NDArray(Float32[8.0, 8.0, 8.0])
julia&gt; c_exec.outputs[1]
3-element NDArray{Float32,1} @ cpu0:
8.0f0
8.0f0
8.0f0
julia&gt; copy(c_exec.outputs[1]) # copy turns NDArray into Julia Array
3-element Array{Float32,1}:
8.0
8.0
8.0
</code></pre>
<p>For neural networks, it is easier to use <code>simple_bind</code>. Given the shapes of the input arguments, it performs shape inference for the rest of the arguments and creates the NDArrays automatically. In practice, the binding and executing steps are hidden under the <code>Model</code> interface.</p>
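<p>A minimal sketch of <code>simple_bind</code>, reusing the <code>net</code> from the shape-inference example above (the <code>arg_dict</code> field and the keyword-shape form are assumptions here, following the pattern of <code>mx.infer_shape</code>):</p>
<pre><code class="julia">exec = mx.simple_bind(net, mx.cpu(), data = (10, 64))
copy!(exec.arg_dict[:data], rand(Float32, 10, 64))  # fill in the input
mx.forward(exec)
copy(exec.outputs[1])  # fetch fc1's output as a Julia Array
</code></pre>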
<p><strong>TODO</strong> Provide pointers to model tutorial and further details about binding and symbolic API.</p>
<p><a id='High-Level-Interface-1'></a></p>
<h2 id="high-level-interface">High Level Interface</h2>
<p>The high-level interface includes the model training and prediction APIs, etc.</p>
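<p>As a taste of what this looks like, here is a hedged sketch of training the MLP from the composition section with the <code>FeedForward</code> model API (the optimizer keyword names and the <code>train_provider</code>/<code>eval_provider</code> data providers are assumptions; see the MNIST tutorial for a complete, working example):</p>
<pre><code class="julia"># assumes net ends in SoftmaxOutput, and that train_provider and
# eval_provider are data providers (see the Data Providers API docs)
model     = mx.FeedForward(net, context = mx.cpu())
optimizer = mx.SGD(η = 0.1, μ = 0.9)  # learning rate and momentum; kwarg names vary by version
mx.fit(model, optimizer, train_provider, n_epoch = 10, eval_data = eval_provider)
probs = mx.predict(model, eval_provider)
</code></pre>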
</article>
</main>
</body>
</html>