| <!DOCTYPE html> |
| |
| <!--- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| http://www.apache.org/licenses/LICENSE-2.0 |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| <html lang="en"><head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <link href="/versions/master/assets/img/mxnet-icon.png" rel="icon" type="image/png"><!-- Begin Jekyll SEO tag v2.6.1 --> |
| <title>Using MXNet with Large Tensor Support | Apache MXNet</title> |
| <meta name="generator" content="Jekyll v4.0.0" /> |
| <meta property="og:title" content="Using MXNet with Large Tensor Support" /> |
| <meta property="og:locale" content="en_US" /> |
| <meta name="description" content="A flexible and efficient library for deep learning." /> |
| <meta property="og:description" content="A flexible and efficient library for deep learning." /> |
| <link rel="canonical" href="https://mxnet.apache.org/versions/master/api/faq/large_tensor_support" /> |
| <meta property="og:url" content="https://mxnet.apache.org/versions/master/api/faq/large_tensor_support" /> |
| <meta property="og:site_name" content="Apache MXNet" /> |
| <script type="application/ld+json"> |
| {"url":"https://mxnet.apache.org/versions/master/api/faq/large_tensor_support","headline":"Using MXNet with Large Tensor Support","description":"A flexible and efficient library for deep learning.","@type":"WebPage","@context":"https://schema.org"}</script> |
| <!-- End Jekyll SEO tag --> |
| <link rel="stylesheet" href="/versions/master/assets/docsearch.min.css" /><link rel="stylesheet" href="/versions/master/assets/main.css"><link type="application/atom+xml" rel="alternate" href="https://mxnet.apache.org/versions/master/feed.xml" title="Apache MXNet" /><!-- Matomo --> |
| <script> |
| var _paq = window._paq = window._paq || []; |
| /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ |
| /* We explicitly disable cookie tracking to avoid privacy issues */ |
| _paq.push(['disableCookies']); |
| _paq.push(['trackPageView']); |
| _paq.push(['enableLinkTracking']); |
| (function() { |
| var u="https://analytics.apache.org/"; |
| _paq.push(['setTrackerUrl', u+'matomo.php']); |
| _paq.push(['setSiteId', '23']); |
| var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; |
| g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); |
| })(); |
| </script> |
| <!-- End Matomo Code --> |
| |
| <script src="/versions/master/assets/js/jquery-3.3.1.min.js"></script> |
| <script src="/versions/master/assets/js/docsearch.min.js"></script><script src="/versions/master/assets/js/globalSearch.js" defer></script> |
| <script src="/versions/master/assets/js/clipboard.js" defer></script> |
| <script src="/versions/master/assets/js/copycode.js" defer></script></head> |
| <body><header class="site-header" role="banner"> |
| |
| <script> |
| $(document).ready(function () { |
| |
| // HEADER OPACITY LOGIC |
| |
| function opacity_header() { |
| var value = "rgba(4,140,204," + ($(window).scrollTop() / 300 + 0.4) + ")" |
| $('.site-header').css("background-color", value) |
| } |
| |
| $(window).scroll(function () { |
| opacity_header() |
| }) |
| opacity_header(); |
| |
| // MENU SELECTOR LOGIC |
| $('.page-link').each( function () { |
| if (window.location.href.includes(this.href)) { |
| $(this).addClass("page-current"); |
| } |
| }); |
| }) |
| </script> |
| <div class="wrapper"> |
| <a class="site-title" rel="author" href="/versions/master/"><img |
| src="/versions/master/assets/img/mxnet_logo.png" class="site-header-logo"></a> |
| <nav class="site-nav"> |
| <input type="checkbox" id="nav-trigger" class="nav-trigger"/> |
| <label for="nav-trigger"> |
| <span class="menu-icon"> |
| <svg viewBox="0 0 18 15" width="18px" height="15px"> |
| <path d="M18,1.484c0,0.82-0.665,1.484-1.484,1.484H1.484C0.665,2.969,0,2.304,0,1.484l0,0C0,0.665,0.665,0,1.484,0 h15.032C17.335,0,18,0.665,18,1.484L18,1.484z M18,7.516C18,8.335,17.335,9,16.516,9H1.484C0.665,9,0,8.335,0,7.516l0,0 c0-0.82,0.665-1.484,1.484-1.484h15.032C17.335,6.031,18,6.696,18,7.516L18,7.516z M18,13.516C18,14.335,17.335,15,16.516,15H1.484 C0.665,15,0,14.335,0,13.516l0,0c0-0.82,0.665-1.483,1.484-1.483h15.032C17.335,12.031,18,12.695,18,13.516L18,13.516z"/> |
| </svg> |
| </span> |
| </label> |
| <div class="gs-search-border"> |
| <div id="gs-search-icon"></div> |
| <form id="global-search-form"> |
| <input id="global-search" type="text" title="Search" placeholder="Search" /> |
| <div id="global-search-dropdown-container"> |
| <button class="gs-current-version btn" type="button" data-toggle="dropdown"> |
| <span id="gs-current-version-label">master</span> |
| <svg class="gs-dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true"> |
| <path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path> |
| </svg> |
| </button> |
| <ul class="gs-opt-group gs-version-dropdown"> |
| |
| |
| <li class="gs-opt gs-versions active">master</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.9.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.8.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.7.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.6.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.5.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.4.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.3.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.2.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.1.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.0.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">0.12.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">0.11.0</li> |
| |
| |
| </ul> |
| </div> |
| <span id="global-search-close">x</span> |
| </form> |
| </div> |
| <div class="trigger"> |
| <div id="global-search-mobile-border"> |
| <div id="gs-search-icon-mobile"></div> |
| <input id="global-search-mobile" placeholder="Search..." type="text"/> |
| <div id="global-search-dropdown-container-mobile"> |
| <button class="gs-current-version-mobile btn" type="button" data-toggle="dropdown"> |
| <svg class="gs-dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true"> |
| <path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path> |
| </svg> |
| </button> |
| <ul class="gs-opt-group gs-version-dropdown-mobile"> |
| |
| |
| <li class="gs-opt gs-versions active">master</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.9.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.8.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.7.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.6.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.5.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.4.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.3.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.2.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.1.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">1.0.0</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">0.12.1</li> |
| |
| |
| |
| <li class="gs-opt gs-versions">0.11.0</li> |
| |
| |
| </ul> |
| </div> |
| </div> |
| <a class="page-link" href="/versions/master/get_started">Get Started</a> |
| <a class="page-link" href="/versions/master/features">Features</a> |
| <a class="page-link" href="/versions/master/ecosystem">Ecosystem</a> |
| <a class="page-link" href="/versions/master/api">Docs &amp; Tutorials</a> |
| <a class="page-link" href="/versions/master/trusted_by">Trusted By</a> |
| <a class="page-link" href="https://github.com/apache/incubator-mxnet">GitHub</a> |
| <div class="dropdown" style="min-width:100px"> |
| <span class="dropdown-header">Apache |
| <svg class="dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true"><path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path></svg> |
| </span> |
| <div class="dropdown-content" style="min-width:250px"> |
| <a href="https://www.apache.org/foundation/">Apache Software Foundation</a> |
| <a href="https://incubator.apache.org/">Apache Incubator</a> |
| <a href="https://www.apache.org/licenses/">License</a> |
| <a href="/versions/master/api/faq/security.html">Security</a> |
| <a href="https://privacy.apache.org/policies/privacy-policy-public.html">Privacy</a> |
| <a href="https://www.apache.org/events/current-event">Events</a> |
| <a href="https://www.apache.org/foundation/sponsorship.html">Sponsorship</a> |
| <a href="https://www.apache.org/foundation/thanks.html">Thanks</a> |
| </div> |
| </div> |
| <div class="dropdown"> |
| <span class="dropdown-header">master |
| <svg class="dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true"><path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path></svg> |
| </span> |
| <div class="dropdown-content"> |
| |
| |
| <a class="dropdown-option-active" href="/">master</a> |
| |
| |
| |
| <a href="/versions/1.9.1/">1.9.1</a> |
| |
| |
| |
| <a href="/versions/1.8.0/">1.8.0</a> |
| |
| |
| |
| <a href="/versions/1.7.0/">1.7.0</a> |
| |
| |
| |
| <a href="/versions/1.6.0/">1.6.0</a> |
| |
| |
| |
| <a href="/versions/1.5.0/">1.5.0</a> |
| |
| |
| |
| <a href="/versions/1.4.1/">1.4.1</a> |
| |
| |
| |
| <a href="/versions/1.3.1/">1.3.1</a> |
| |
| |
| |
| <a href="/versions/1.2.1/">1.2.1</a> |
| |
| |
| |
| <a href="/versions/1.1.0/">1.1.0</a> |
| |
| |
| |
| <a href="/versions/1.0.0/">1.0.0</a> |
| |
| |
| |
| <a href="/versions/0.12.1/">0.12.1</a> |
| |
| |
| |
| <a href="/versions/0.11.0/">0.11.0</a> |
| |
| |
| </div> |
| </div> |
| </div> |
| </nav> |
| </div> |
| </header> |
| <main class="page-content" aria-label="Content"> |
| <script> |
| |
| </script> |
| <article class="post"> |
| |
| <header class="post-header wrapper"> |
| <h1 class="post-title">Using MXNet with Large Tensor Support</h1> |
| <h3></h3></header> |
| |
| <div class="post-content"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-3 docs-side-bar"> |
| <h3 style="text-transform: capitalize; padding-left:10px">faq</h3> |
| <ul> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/add_op_in_backend">A Beginner's Guide to Implementing Operators in MXNet Backend</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/cloud">MXNet on the Cloud</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/distributed_training">Distributed Training in MXNet</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/env_var">Environment Variables</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/float16">Float16</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/large_tensor_support">Using MXNet with Large Tensor Support</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/model_parallel_lstm">Model Parallel</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/new_op">Create New Operators</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/perf">Some Tips for Improving MXNet Performance</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/recordio">Create a Dataset Using RecordIO</a></li> |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/s3_integration">Use data from S3 for training</a></li> |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/security">MXNet Security Best Practices</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/tensor_inspector_tutorial">Use TensorInspector to Help Debug Operators</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/using_rtc">Using runtime compilation (RTC) to write CUDA kernels in MXNet</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| |
| |
| <li><a href="/versions/master/api/faq/why_mxnet">Why MXNet came to be?</a></li> |
| <!-- page-category --> |
| |
| <!-- page-category --> |
| <!-- resource-p --> |
| </ul> |
| </div> |
| <div class="col-9"> |
| |
| <h1 id="using-mxnet-with-large-tensor-support">Using MXNet with Large Tensor Support</h1> |
| |
| <h2 id="what-is-large-tensor-support">What is large tensor support?</h2> |
| <p>When you build a network that works with large amounts of data, as in a deep graph problem, you may need large tensor support. With large tensor support, tensors are indexed with INT64 indices instead of INT32 indices.</p> |
| |
| <p>This feature is enabled when MXNet is built with the flag <em>USE_INT64_TENSOR_SIZE=1</em>, which is now the default setting. You can also make MXNet use INT32 indices by changing this flag.</p> |
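<p>To check whether an installed build has the feature, one option is to query the runtime feature list. The sketch below assumes that <code class="highlighter-rouge">mxnet.runtime.Features</code> reports the build flag under the feature name <code class="highlighter-rouge">INT64_TENSOR_SIZE</code>. The helper name <code class="highlighter-rouge">large_tensor_enabled</code> is hypothetical, and the function returns <code class="highlighter-rouge">None</code> when MXNet is not installed.</p>

```python
def large_tensor_enabled():
    """Return True/False for the installed MXNet build, or None if MXNet is absent."""
    try:
        # mxnet.runtime.Features lists the compile-time features of the build.
        from mxnet.runtime import Features
    except ImportError:
        return None  # MXNet is not installed in this environment
    # Assumption: the build flag is exposed under the name "INT64_TENSOR_SIZE".
    return Features().is_enabled("INT64_TENSOR_SIZE")


status = large_tensor_enabled()
if status is None:
    print("MXNet is not installed")
else:
    print(f"INT64 tensor size enabled: {status}")
```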
| |
| <h2 id="when-do-you-need-it">When do you need it?</h2> |
| <ol> |
| <li>When you are creating NDArrays with more than 2^31 elements.</li> |
| <li>When the input to your model requires tensors with more than 2^31 elements (when you load them all at once in your code) or attributes greater than 2^31.</li> |
| </ol> |
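<p>The first condition can be checked before an array is allocated. The following is a minimal sketch in plain Python, not part of the MXNet API: the helper name <code class="highlighter-rouge">needs_large_tensor</code> is hypothetical, and the threshold is assumed to be the largest signed 32-bit value, 2^31 - 1.</p>

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def needs_large_tensor(shape):
    """Return True if the total element count of `shape` exceeds INT32_MAX."""
    return math.prod(shape) > INT32_MAX


print(needs_large_tensor((4300000000,)))   # True: a single axis already exceeds the limit
print(needs_large_tensor((50000, 50000)))  # True: 2.5e9 elements in total
print(needs_large_tensor((10000, 10000)))  # False: 1e8 elements
```

<p>Note that a tensor can cross the limit even when every individual axis is small, as the second call shows.</p>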
| |
| <h2 id="how-to-identify-that-you-need-to-use-large-tensors-">How to identify that you need to use large tensors?</h2> |
| <p>You likely need large tensor support when you see one of the following errors:</p> |
| |
| <ol> |
| <li>OverflowError: unsigned int is greater than maximum</li> |
| <li>Check failed: inp-&gt;shape().Size() &lt; 1 &gt;&gt; 31 (4300000000 vs. 0) : Size of tensor you are trying to allocate is larger than 2^32 elements. Please build with flag USE_INT64_TENSOR_SIZE=1</li> |
| <li>Invalid Parameter format for end expect int or None but value='2150000000', in operator slice_axis(name="", end="2150000000", begin="0", axis="0"). <em>The input attribute was expected to be an int32, which must be less than 2^31. The received value is larger than that, so the operator's parameter inference treats it as a string, which is an unexpected input.</em></li> |
| </ol> |
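<p>The values in these messages fail because they do not fit in a signed 32-bit integer. The sketch below uses only the Python standard library to illustrate this. The helper names <code class="highlighter-rouge">fits_int32</code> and <code class="highlighter-rouge">wrap_to_int32</code> are hypothetical, and the wraparound shown is what a plain 32-bit slot would hold, not necessarily what MXNet does internally.</p>

```python
import ctypes

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fits_int32(value):
    """Return True if `value` is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_to_int32(value):
    """Show what a signed 32-bit slot would hold; ctypes does no overflow checking."""
    return ctypes.c_int32(value).value


end = 2150000000  # the slice_axis `end` value from the error message above
print(fits_int32(end))     # False
print(wrap_to_int32(end))  # -2144967296
```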
| |
| <h2 id="how-to-use-it-">How to use it?</h2> |
| <p>You can create a large NDArray, which requires a build with large tensor support enabled, as follows:</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">mxnet</span> <span class="k">as</span> <span class="n">mx</span> |
| <span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">nd</span> |
| |
| <span class="n">LARGE_X</span> <span class="o">=</span> <span class="mi">4300000000</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">LARGE_X</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s">"int64"</span><span class="p">)</span> |
| <span class="c1"># or</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="n">LARGE_X</span><span class="p">)</span> |
| <span class="c1"># or</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">LARGE_X</span><span class="p">)</span> |
| <span class="c1"># or</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">exponential</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="n">LARGE_X</span><span class="p">)</span> |
| <span class="c1"># or</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">gamma</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="n">LARGE_X</span><span class="p">)</span> |
| <span class="c1"># or</span> |
| <span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="n">LARGE_X</span><span class="p">)</span> |
| </code></pre></div></div> |
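<p>Arrays of this size need a lot of memory, so it may help to estimate the footprint before running the calls above. The sketch below is plain Python arithmetic under stated assumptions: the helper name <code class="highlighter-rouge">array_bytes</code> is hypothetical, the item sizes are the standard fixed-width sizes, and allocator overhead is ignored.</p>

```python
# Bytes per element for a few common dtypes (standard fixed-width sizes).
ITEMSIZE = {"float16": 2, "float32": 4, "float64": 8, "int32": 4, "int64": 8}


def array_bytes(num_elements, dtype="float32"):
    """Raw storage needed for `num_elements` values of `dtype`, in bytes."""
    return num_elements * ITEMSIZE[dtype]


LARGE_X = 4300000000
for dtype in ("float32", "int64"):
    gib = array_bytes(LARGE_X, dtype) / 2**30
    print(f"{dtype}: {gib:.1f} GiB")
```

<p>With <code class="highlighter-rouge">LARGE_X = 4300000000</code>, this works out to roughly 16 GiB for a float32 array and roughly 32 GiB for an int64 array.</p>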
| |
| <h2 id="caveats">Caveats</h2> |
| <ol> |
| <li>Use <code class="highlighter-rouge">int64</code> as the <code class="highlighter-rouge">dtype</code> whenever you slice an NDArray over a range that exceeds the maximum <code class="highlighter-rouge">int32</code> value.</li> |
| <li>Use <code class="highlighter-rouge">int64</code> as the <code class="highlighter-rouge">dtype</code> when passing indices to operators as parameters or when expecting indices back from operators as output.</li> |
| </ol> |
| |
| <p>The following are the cases for large tensor usage where you must specify <code class="highlighter-rouge">dtype</code> as <code class="highlighter-rouge">int64</code>:</p> |
| |
| <ul> |
| <li><em>randint():</em></li> |
| </ul> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span> |
| <span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">nd</span> |
| |
| <span class="n">low_large_value</span> <span class="o">=</span> <span class="mi">2</span><span class="o">**</span><span class="mi">32</span> |
| <span class="n">high_large_value</span> <span class="o">=</span> <span class="mi">2</span><span class="o">**</span><span class="mi">34</span> |
| <span class="c1"># dtype is explicitly specified since default type is int32 for randint |
| </span><span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="n">low_large_value</span><span class="p">,</span> <span class="n">high_large_value</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">)</span> |
| </code></pre></div></div> |
| |
| <ul> |
| <li><em>ravel_multi_index()</em> and <em>unravel_index()</em>:</li> |
| </ul> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x1</span><span class="p">,</span> <span class="n">y1</span> <span class="o">=</span> <span class="n">rand_coord_2d</span><span class="p">((</span><span class="n">LARGE_X</span> <span class="o">-</span> <span class="mi">100</span><span class="p">),</span> <span class="n">LARGE_X</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">)</span> |
| <span class="n">x2</span><span class="p">,</span> <span class="n">y2</span> <span class="o">=</span> <span class="n">rand_coord_2d</span><span class="p">((</span><span class="n">LARGE_X</span> <span class="o">-</span> <span class="mi">200</span><span class="p">),</span> <span class="n">LARGE_X</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">)</span> |
| <span class="n">x3</span><span class="p">,</span> <span class="n">y3</span> <span class="o">=</span> <span class="n">rand_coord_2d</span><span class="p">((</span><span class="n">LARGE_X</span> <span class="o">-</span> <span class="mi">300</span><span class="p">),</span> <span class="n">LARGE_X</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">)</span> |
| <span class="n">indices_2d</span> <span class="o">=</span> <span class="p">[[</span><span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">x3</span><span class="p">],</span> <span class="p">[</span><span class="n">y1</span><span class="p">,</span> <span class="n">y2</span><span class="p">,</span> <span class="n">y3</span><span class="p">]]</span> |
| <span class="c1"># dtype is explicitly specified for indices else they will default to float32 |
| </span><span class="n">idx</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">ravel_multi_index</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">indices_2d</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">),</span> |
| <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">LARGE_X</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">))</span> |
| <span class="n">indices_2d</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">unravel_index</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">idx_numpy</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">),</span> |
| <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">LARGE_X</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">))</span> |
| </code></pre></div></div> |
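<p>To see why these two operators need 64-bit indices, it can help to work through the index arithmetic by hand. The sketch below is a plain-Python illustration, not MXNet code. It assumes the row-major convention used by the NumPy functions of the same names, and the value of <code class="highlighter-rouge">SMALL_Y</code> is an assumption because the snippet above does not define it.</p>

```python
def ravel_index(coords, shape):
    """Row-major flat index of one coordinate tuple."""
    flat = 0
    for coord, dim in zip(coords, shape):
        flat = flat * dim + coord
    return flat


def unravel_index(flat, shape):
    """Inverse of ravel_index: recover the coordinate tuple from a flat index."""
    coords = []
    for dim in reversed(shape):
        flat, coord = divmod(flat, dim)
        coords.append(coord)
    return tuple(reversed(coords))


LARGE_X = 4300000000
SMALL_Y = 50  # assumed value; the original snippet does not define SMALL_Y
shape = (LARGE_X, SMALL_Y)

flat = ravel_index((LARGE_X - 100, 10), shape)
print(flat)                        # 214999995010
print(flat > 2**31 - 1)            # True: the flat index does not fit in int32
print(unravel_index(flat, shape))  # (4299999900, 10)
```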
| |
| <ul> |
| <li><em>argsort()</em> and <em>topk()</em></li> |
| </ul> |
| |
| <p>Both return indices, so request them as <code class="highlighter-rouge">int64</code> by passing <code class="highlighter-rouge">dtype=np.int64</code>.</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">b</span> <span class="o">=</span> <span class="n">create_2d_tensor</span><span class="p">(</span><span class="n">rows</span><span class="o">=</span><span class="n">LARGE_X</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="n">SMALL_Y</span><span class="p">)</span> |
| <span class="c1"># argsort |
| </span><span class="n">s</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">argsort</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">is_ascend</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">)</span> |
| <span class="c1"># topk |
| </span><span class="n">k</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">topk</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">)</span> |
| </code></pre></div></div> |
| |
| <ul> |
| <li><em>index_copy()</em></li> |
| </ul> |
| |
| <p>Again, whenever you pass indices as arguments while using large tensors, the <code class="highlighter-rouge">dtype</code> of the indices must be <code class="highlighter-rouge">int64</code>.</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">LARGE_X</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">))</span> |
| <span class="n">t</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">SMALL_Y</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="n">SMALL_Y</span><span class="p">))</span> |
| <span class="c1"># explicitly specifying dtype of indices to np.int64 |
| </span><span class="n">index</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="n">LARGE_X</span> <span class="o">-</span> <span class="mi">1</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s">"int64"</span><span class="p">)</span> |
| <span class="n">x</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">index_copy</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">index</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span> |
| </code></pre></div></div> |
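<p>One reason the index <code class="highlighter-rouge">dtype</code> matters is that a floating-point default cannot hold large indices exactly. The sketch below uses only the Python standard library to round-trip values through IEEE 754 single precision. The helper name <code class="highlighter-rouge">to_float32</code> is hypothetical, and the example assumes that an index array created without an explicit <code class="highlighter-rouge">dtype</code> would be stored as float32.</p>

```python
import struct


def to_float32(value):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


LARGE_X = 4300000000
index = LARGE_X - 1

print(to_float32(2**24) == 2**24)  # True: still exactly representable
print(to_float32(index) == index)  # False: the index is rounded
print(int(to_float32(index)))      # the row that would actually be addressed
```

<p>Single precision has a 24-bit significand, so integers above 2^24 are not all representable, and an index near <code class="highlighter-rouge">LARGE_X</code> silently moves to a neighboring value.</p>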
| |
| <ul> |
| <li><em>one_hot()</em></li> |
| </ul> |
| |
| <p>Here again, an array is used as indices. The indices give the locations in the large vector of the bits that need to be activated.</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># a is the index array here whose dtype should be int64. |
| </span><span class="n">a</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="n">VLARGE_X</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int64</span><span class="p">)</span> |
| <span class="n">b</span> <span class="o">=</span> <span class="n">nd</span><span class="o">.</span><span class="n">one_hot</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">VLARGE_X</span><span class="p">)</span> |
| </code></pre></div></div> |
| |
| <h2 id="what-platforms-and-version-of-mxnet-are-supported-">What platforms and versions of MXNet are supported?</h2> |
| <p>You can use MXNet with large tensor support in the following configuration:</p> |
| |
| <p><em>MXNet built for CPU on Linux (Ubuntu or Amazon Linux), and only for Python bindings.</em> |
| <em>Custom wheels are provided with this configuration.</em></p> |
| |
| <p>These flavors of MXNet are currently built with large tensor support:</p> |
| |
| <ol> |
| <li>MXNet for linux-cpu</li> |
| <li>MXNet for linux_cu100</li> |
| </ol> |
| |
| <p>Large tensor support only works for the <em>forward pass</em>. |
| The backward pass is partially supported and not completely tested, so it is considered experimental at best.</p> |
| |
| <p>Not supported:</p> |
| |
| <ul> |
| <li>GPU.</li> |
| <li>Windows, ARM or any operating system other than Ubuntu</li> |
| <li>Other language bindings like Scala, Java, R, and Julia.</li> |
| </ul> |
| |
| <h2 id="other-known-issues">Other known issues:</h2> |
| <ul> |
| <li>Randint operator is flaky: https://github.com/apache/incubator-mxnet/issues/16172.</li> |
| <li>dgemm operations using BLAS libraries currently don’t support int64.</li> |
| <li>linspace() is not supported.</li> |
| </ul> |
| |
| <p>Symbolic reshape is also not supported: reshaping a bound executor to a large shape raises an <code>OverflowError</code>, as the following example shows.</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">a</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span> |
| <span class="n">b</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">sym</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="s">'b'</span><span class="p">)</span> |
| <span class="n">c</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span> |
| <span class="n">texec</span> <span class="o">=</span> <span class="n">c</span><span class="o">.</span><span class="n">bind</span><span class="p">(</span><span class="n">mx</span><span class="o">.</span><span class="n">cpu</span><span class="p">(),</span> <span class="p">{</span><span class="s">'a'</span><span class="p">:</span> <span class="n">nd</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">LARGE_X</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'int64'</span><span class="p">)</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">LARGE_X</span><span class="p">),</span> <span class="s">'b'</span> <span class="p">:</span> <span class="n">nd</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">LARGE_X</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'int64'</span><span class="p">)</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">LARGE_X</span><span class="p">)})</span> |
| <span class="n">new_shape</span> <span class="o">=</span> <span class="p">{</span><span class="s">'a'</span><span class="p">:</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">LARGE_X</span><span class="p">),</span> <span class="s">'b'</span><span class="p">:</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">LARGE_X</span><span class="p">)}</span> |
| <span class="n">texec</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">allow_up_sizing</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="o">**</span><span class="n">new_shape</span><span class="p">)</span> |
| |
| <span class="n">Traceback</span> <span class="p">(</span><span class="n">most</span> <span class="n">recent</span> <span class="n">call</span> <span class="n">last</span><span class="p">):</span> |
| <span class="n">File</span> <span class="s">"<stdin>"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1</span><span class="p">,</span> <span class="ow">in</span> <span class="o"><</span><span class="n">module</span><span class="o">></span> |
| <span class="n">File</span> <span class="s">"/home/ubuntu/incubator-mxnet/python/mxnet/executor.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">449</span><span class="p">,</span> <span class="ow">in</span> <span class="n">reshape</span> |
| <span class="n">py_array</span><span class="p">(</span><span class="s">'i'</span><span class="p">,</span> <span class="n">provided_arg_shape_data</span><span class="p">)),</span> |
| <span class="nb">OverflowError</span><span class="p">:</span> <span class="n">signed</span> <span class="n">integer</span> <span class="ow">is</span> <span class="n">greater</span> <span class="n">than</span> <span class="n">maximum</span> |
| </code></pre></div></div> |
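| <p>The traceback points at a call that packs the requested shape into an array with type code <code>'i'</code>, which in Python's <code>array</code> module is a signed C int. A minimal standard-library sketch of that failure mode, assuming a platform where a C int is 32 bits wide, is shown below. Here <code>new_shape</code> is a hypothetical shape chosen so that one dimension exceeds the 32-bit range, and <code>fits_typecode</code> is a helper introduced only for this example.</p> |

```python
from array import array

INT32_MAX = 2**31 - 1


def fits_typecode(shape, typecode):
    """Return True if every dimension of `shape` can be stored with `typecode`."""
    try:
        array(typecode, shape)
    except OverflowError:
        return False
    return True


# Hypothetical shape: the second dimension is one past the signed 32-bit maximum.
new_shape = (1, INT32_MAX + 1)

# 'i' is a signed C int (assumed 32-bit here); 'q' is a signed 64-bit integer.
print(fits_typecode(new_shape, "i"))
print(fits_typecode(new_shape, "q"))
```

| <p>On such a platform the first check fails and the second succeeds, which suggests the limitation lies in how the shape is packed rather than in the shape itself.</p> |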
| |
| <h2 id="working-dgl-exampledglai">Working DGL Example (dgl.ai)</h2> |
| <p>The following sample DGL code runs with int64 but not with int32.</p> |
| |
| <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">mxnet</span> <span class="k">as</span> <span class="n">mx</span> |
| <span class="kn">from</span> <span class="nn">mxnet</span> <span class="kn">import</span> <span class="n">gluon</span> |
| <span class="kn">import</span> <span class="nn">dgl</span> |
| <span class="kn">import</span> <span class="nn">dgl.function</span> <span class="k">as</span> <span class="n">fn</span> |
| <span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span> |
| <span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">sparse</span> <span class="k">as</span> <span class="n">spsp</span> |
| |
| <span class="n">num_nodes</span> <span class="o">=</span> <span class="mi">10000000</span> |
| <span class="n">num_edges</span> <span class="o">=</span> <span class="mi">100000000</span> |
| |
| <span class="n">col1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">num_nodes</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="n">num_edges</span><span class="p">,))</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create col1'</span><span class="p">)</span> |
| <span class="n">col2</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">num_nodes</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="n">num_edges</span><span class="p">,))</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create col2'</span><span class="p">)</span> |
| <span class="n">data</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="n">num_edges</span><span class="p">,))</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create data'</span><span class="p">)</span> |
| <span class="n">spm</span> <span class="o">=</span> <span class="n">spsp</span><span class="o">.</span><span class="n">coo_matrix</span><span class="p">((</span><span class="n">data</span><span class="p">,</span> <span class="p">(</span><span class="n">col1</span><span class="p">,</span> <span class="n">col2</span><span class="p">)),</span> <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">num_nodes</span><span class="p">,</span> <span class="n">num_nodes</span><span class="p">))</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create coo'</span><span class="p">)</span> |
| <span class="n">labels</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">num_nodes</span><span class="p">,))</span> |
| |
| <span class="n">g</span> <span class="o">=</span> <span class="n">dgl</span><span class="o">.</span><span class="n">DGLGraph</span><span class="p">(</span><span class="n">spm</span><span class="p">,</span> <span class="n">readonly</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create DGLGraph'</span><span class="p">)</span> |
| <span class="n">g</span><span class="o">.</span><span class="n">ndata</span><span class="p">[</span><span class="s">'h'</span><span class="p">]</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">num_nodes</span><span class="p">,</span> <span class="mi">200</span><span class="p">))</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'create node data'</span><span class="p">)</span> |
| |
| <span class="k">class</span> <span class="nc">node_update</span><span class="p">(</span><span class="n">gluon</span><span class="o">.</span><span class="n">Block</span><span class="p">):</span> |
|     <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">out_feats</span><span class="p">):</span> |
|         <span class="nb">super</span><span class="p">(</span><span class="n">node_update</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">__init__</span><span class="p">()</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">dense</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="n">out_feats</span><span class="p">,</span> <span class="s">'relu'</span><span class="p">)</span> |
|         <span class="bp">self</span><span class="o">.</span><span class="n">dropout</span> <span class="o">=</span> <span class="mf">0.5</span> |
| |
|     <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">nodes</span><span class="p">):</span> |
|         <span class="n">h</span> <span class="o">=</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span><span class="n">nodes</span><span class="o">.</span><span class="n">data</span><span class="p">[</span><span class="s">'h'</span><span class="p">],</span> <span class="n">nodes</span><span class="o">.</span><span class="n">data</span><span class="p">[</span><span class="s">'accum'</span><span class="p">],</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> |
|         <span class="n">h</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">dense</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> |
|         <span class="k">return</span> <span class="p">{</span><span class="s">'h'</span><span class="p">:</span> <span class="n">mx</span><span class="o">.</span><span class="n">nd</span><span class="o">.</span><span class="n">Dropout</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">dropout</span><span class="p">)}</span> |
| |
| <span class="n">update_fn</span> <span class="o">=</span> <span class="n">node_update</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span> |
| <span class="n">update_fn</span><span class="o">.</span><span class="n">initialize</span><span class="p">(</span><span class="n">ctx</span><span class="o">=</span><span class="n">mx</span><span class="o">.</span><span class="n">cpu</span><span class="p">())</span> |
| |
| <span class="n">g</span><span class="o">.</span><span class="n">update_all</span><span class="p">(</span><span class="n">fn</span><span class="o">.</span><span class="n">copy_src</span><span class="p">(</span><span class="n">src</span><span class="o">=</span><span class="s">'h'</span><span class="p">,</span> <span class="n">out</span><span class="o">=</span><span class="s">'m'</span><span class="p">),</span> <span class="n">fn</span><span class="o">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">msg</span><span class="o">=</span><span class="s">'m'</span><span class="p">,</span> <span class="n">out</span><span class="o">=</span><span class="s">'accum'</span><span class="p">),</span> <span class="n">update_fn</span><span class="p">)</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'update all'</span><span class="p">)</span> |
| |
| <span class="n">loss_fcn</span> <span class="o">=</span> <span class="n">gluon</span><span class="o">.</span><span class="n">loss</span><span class="o">.</span><span class="n">SoftmaxCELoss</span><span class="p">()</span> |
| <span class="n">loss</span> <span class="o">=</span> <span class="n">loss_fcn</span><span class="p">(</span><span class="n">g</span><span class="o">.</span><span class="n">ndata</span><span class="p">[</span><span class="s">'h'</span><span class="p">],</span> <span class="n">labels</span><span class="p">)</span> |
| <span class="k">print</span><span class="p">(</span><span class="s">'loss'</span><span class="p">)</span> |
| <span class="n">loss</span> <span class="o">=</span> <span class="n">loss</span><span class="o">.</span><span class="nb">sum</span><span class="p">()</span> |
| <span class="k">print</span><span class="p">(</span><span class="n">loss</span><span class="p">)</span> |
| </code></pre></div></div> |
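| <p>One way to see why 32-bit indexing is not enough for this example is to count tensor elements. The sketch below is plain arithmetic on the sizes used above (10,000,000 nodes, 200 features per node). Treating the concatenation in <code>forward</code> as a single <code>(num_nodes, 400)</code> tensor is an inference from the code rather than something the example states, and <code>element_count</code> is a hypothetical helper.</p> |

```python
import math

INT32_MAX = 2**31 - 1

num_nodes = 10_000_000
num_feats = 200


def element_count(shape):
    """Total number of elements in a tensor of the given shape."""
    return math.prod(shape)


# Node features 'h' alone: (num_nodes, 200).
node_feats = element_count((num_nodes, num_feats))
# 'h' concatenated with 'accum' along dim=1: (num_nodes, 400), assuming one batch.
concatenated = element_count((num_nodes, 2 * num_feats))

print(node_feats, node_feats <= INT32_MAX)
print(concatenated, concatenated <= INT32_MAX)
```

| <p>Under that assumption the node features still fit within the signed 32-bit range, while the concatenated tensor does not.</p> |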
| |
| <h2 id="performance-regression">Performance Regression:</h2> |
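| <p>In our preliminary analysis of large tensor performance, roughly 40 operators showed a performance regression, as listed in the table below. The ratio columns give the int64 time as a percentage of the int32 time, so values above 100% indicate a slowdown.</p> |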
| |
| <table> |
| <thead> |
| <tr> |
| <th>Operator</th> |
| <th>int32 (msec)</th> |
| <th>int64 (msec)</th> |
| <th>int64 / int32</th> |
| <th>int32+mkl (msec)</th> |
| <th>int64+mkl (msec)</th> |
| <th>int64+mkl / int32+mkl</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>topk</td> |
| <td>12.81245198</td> |
| <td>42.2472195</td> |
| <td>329.74%</td> |
| <td>12.728027</td> |
| <td>43.462353</td> |
| <td>341.47%</td> |
| </tr> |
| <tr> |
| <td>argsort</td> |
| <td>16.43896801</td> |
| <td>46.2231455</td> |
| <td>281.18%</td> |
| <td>17.200311</td> |
| <td>46.7779985</td> |
| <td>271.96%</td> |
| </tr> |
| <tr> |
| <td>sort</td> |
| <td>16.57822751</td> |
| <td>46.5644815</td> |
| <td>280.88%</td> |
| <td>16.401236</td> |
| <td>46.263803</td> |
| <td>282.08%</td> |
| </tr> |
| <tr> |
| <td>flip</td> |
| <td>0.221817521</td> |
| <td>0.535838</td> |
| <td>241.57%</td> |
| <td>0.2123705</td> |
| <td>0.7950055</td> |
| <td>374.35%</td> |
| </tr> |
| <tr> |
| <td>depth_to_space</td> |
| <td>0.250976998</td> |
| <td>0.534083</td> |
| <td>212.80%</td> |
| <td>0.2338155</td> |
| <td>0.631252</td> |
| <td>269.98%</td> |
| </tr> |
| <tr> |
| <td>space_to_depth</td> |
| <td>0.254336512</td> |
| <td>0.5368935</td> |
| <td>211.10%</td> |
| <td>0.2334405</td> |
| <td>0.6343175</td> |
| <td>271.73%</td> |
| </tr> |
| <tr> |
| <td>min_axis</td> |
| <td>0.685826526</td> |
| <td>1.4393255</td> |
| <td>209.87%</td> |
| <td>0.6266175</td> |
| <td>1.3538925</td> |
| <td>216.06%</td> |
| </tr> |
| <tr> |
| <td>sum_axis</td> |
| <td>0.720809505</td> |
| <td>1.5110635</td> |
| <td>209.63%</td> |
| <td>0.6566265</td> |
| <td>0.8290575</td> |
| <td>126.26%</td> |
| </tr> |
| <tr> |
| <td>nansum</td> |
| <td>1.279337012</td> |
| <td>2.635434</td> |
| <td>206.00%</td> |
| <td>1.227156</td> |
| <td>2.4305255</td> |
| <td>198.06%</td> |
| </tr> |
| <tr> |
| <td>argmax</td> |
| <td>4.765146994</td> |
| <td>9.682672</td> |
| <td>203.20%</td> |
| <td>4.6576605</td> |
| <td>9.394067</td> |
| <td>201.69%</td> |
| </tr> |
| <tr> |
| <td>swapaxes</td> |
| <td>0.667943008</td> |
| <td>1.3544455</td> |
| <td>202.78%</td> |
| <td>0.649036</td> |
| <td>1.8293235</td> |
| <td>281.85%</td> |
| </tr> |
| <tr> |
| <td>argmin</td> |
| <td>4.774890491</td> |
| <td>9.545651</td> |
| <td>199.91%</td> |
| <td>4.666858</td> |
| <td>9.5194385</td> |
| <td>203.98%</td> |
| </tr> |
| <tr> |
| <td>sum_axis</td> |
| <td>0.540210982</td> |
| <td>1.0550705</td> |
| <td>195.31%</td> |
| <td>0.500895</td> |
| <td>0.616179</td> |
| <td>123.02%</td> |
| </tr> |
| <tr> |
| <td>max_axis</td> |
| <td>0.117824005</td> |
| <td>0.226481</td> |
| <td>192.22%</td> |
| <td>0.149085</td> |
| <td>0.224334</td> |
| <td>150.47%</td> |
| </tr> |
| <tr> |
| <td>argmax_channel</td> |
| <td>0.261897018</td> |
| <td>0.49573</td> |
| <td>189.28%</td> |
| <td>0.251171</td> |
| <td>0.4814885</td> |
| <td>191.70%</td> |
| </tr> |
| <tr> |
| <td>min_axis</td> |
| <td>0.147698505</td> |
| <td>0.2675355</td> |
| <td>181.14%</td> |
| <td>0.148424</td> |
| <td>0.2874105</td> |
| <td>193.64%</td> |
| </tr> |
| <tr> |
| <td>nansum</td> |
| <td>1.142132009</td> |
| <td>2.058077</td> |
| <td>180.20%</td> |
| <td>1.042387</td> |
| <td>1.263102</td> |
| <td>121.17%</td> |
| </tr> |
| <tr> |
| <td>min_axis</td> |
| <td>0.56951947</td> |
| <td>1.020972</td> |
| <td>179.27%</td> |
| <td>0.4722595</td> |
| <td>0.998179</td> |
| <td>211.36%</td> |
| </tr> |
| <tr> |
| <td>min</td> |
| <td>1.154684491</td> |
| <td>2.0446045</td> |
| <td>177.07%</td> |
| <td>1.0534145</td> |
| <td>1.9723065</td> |
| <td>187.23%</td> |
| </tr> |
| <tr> |
| <td>sum</td> |
| <td>1.121753477</td> |
| <td>1.959272</td> |
| <td>174.66%</td> |
| <td>0.9984095</td> |
| <td>1.213339</td> |
| <td>121.53%</td> |
| </tr> |
| <tr> |
| <td>sum_axis</td> |
| <td>0.158632494</td> |
| <td>0.2744115</td> |
| <td>172.99%</td> |
| <td>0.1573735</td> |
| <td>0.2266315</td> |
| <td>144.01%</td> |
| </tr> |
| <tr> |
| <td>nansum</td> |
| <td>0.21418152</td> |
| <td>0.3661335</td> |
| <td>170.95%</td> |
| <td>0.2162935</td> |
| <td>0.269517</td> |
| <td>124.61%</td> |
| </tr> |
| <tr> |
| <td>random_normal</td> |
| <td>1.229072484</td> |
| <td>2.093057</td> |
| <td>170.30%</td> |
| <td>1.222785</td> |
| <td>2.095916</td> |
| <td>171.41%</td> |
| </tr> |
| <tr> |
| <td>LeakyReLU</td> |
| <td>0.344101485</td> |
| <td>0.582337</td> |
| <td>169.23%</td> |
| <td>0.389167</td> |
| <td>0.7003465</td> |
| <td>179.96%</td> |
| </tr> |
| <tr> |
| <td>nanprod</td> |
| <td>1.273265516</td> |
| <td>2.095068</td> |
| <td>164.54%</td> |
| <td>1.0906815</td> |
| <td>2.054369</td> |
| <td>188.36%</td> |
| </tr> |
| <tr> |
| <td>nanprod</td> |
| <td>0.203272473</td> |
| <td>0.32792</td> |
| <td>161.32%</td> |
| <td>0.202548</td> |
| <td>0.3288335</td> |
| <td>162.35%</td> |
| </tr> |
| <tr> |
| <td>sample_gamma</td> |
| <td>8.079962019</td> |
| <td>12.7266385</td> |
| <td>157.51%</td> |
| <td>12.4216245</td> |
| <td>12.7957475</td> |
| <td>103.01%</td> |
| </tr> |
| <tr> |
| <td>sum</td> |
| <td>0.21571602</td> |
| <td>0.3396875</td> |
| <td>157.47%</td> |
| <td>0.1939995</td> |
| <td>0.262942</td> |
| <td>135.54%</td> |
| </tr> |
| <tr> |
| <td>argmin</td> |
| <td>0.086381478</td> |
| <td>0.1354795</td> |
| <td>156.84%</td> |
| <td>0.0826235</td> |
| <td>0.134886</td> |
| <td>163.25%</td> |
| </tr> |
| <tr> |
| <td>argmax</td> |
| <td>0.08664903</td> |
| <td>0.135826</td> |
| <td>156.75%</td> |
| <td>0.082693</td> |
| <td>0.1269225</td> |
| <td>153.49%</td> |
| </tr> |
| <tr> |
| <td>sample_gamma</td> |
| <td>7.712843508</td> |
| <td>12.0266355</td> |
| <td>155.93%</td> |
| <td>11.8900915</td> |
| <td>12.143009</td> |
| <td>102.13%</td> |
| </tr> |
| <tr> |
| <td>sample_exponential</td> |
| <td>2.312778</td> |
| <td>3.5953945</td> |
| <td>155.46%</td> |
| <td>3.0935085</td> |
| <td>3.5656265</td> |
| <td>115.26%</td> |
| </tr> |
| <tr> |
| <td>prod</td> |
| <td>0.203170988</td> |
| <td>0.3113865</td> |
| <td>153.26%</td> |
| <td>0.180757</td> |
| <td>0.264523</td> |
| <td>146.34%</td> |
| </tr> |
| <tr> |
| <td>random_uniform</td> |
| <td>0.40893798</td> |
| <td>0.6240795</td> |
| <td>152.61%</td> |
| <td>0.244613</td> |
| <td>0.6319695</td> |
| <td>258.35%</td> |
| </tr> |
| <tr> |
| <td>min</td> |
| <td>0.205482502</td> |
| <td>0.3122025</td> |
| <td>151.94%</td> |
| <td>0.2023835</td> |
| <td>0.33234</td> |
| <td>164.21%</td> |
| </tr> |
| <tr> |
| <td>random_negative_binomial</td> |
| <td>3.919228504</td> |
| <td>5.919488</td> |
| <td>151.04%</td> |
| <td>5.685851</td> |
| <td>6.0220735</td> |
| <td>105.91%</td> |
| </tr> |
| <tr> |
| <td>max</td> |
| <td>0.212521001</td> |
| <td>0.3130105</td> |
| <td>147.28%</td> |
| <td>0.2039755</td> |
| <td>0.2956105</td> |
| <td>144.92%</td> |
| </tr> |
| <tr> |
| <td>LeakyReLU</td> |
| <td>2.813424013</td> |
| <td>4.1121625</td> |
| <td>146.16%</td> |
| <td>2.719118</td> |
| <td>5.613753</td> |
| <td>206.45%</td> |
| </tr> |
| <tr> |
| <td>mean</td> |
| <td>0.242281501</td> |
| <td>0.344385</td> |
| <td>142.14%</td> |
| <td>0.209396</td> |
| <td>0.313411</td> |
| <td>149.67%</td> |
| </tr> |
| <tr> |
| <td>Deconvolution</td> |
| <td>7.43279251</td> |
| <td>10.4240845</td> |
| <td>140.24%</td> |
| <td>2.9548925</td> |
| <td>5.812926</td> |
| <td>196.72%</td> |
| </tr> |
| <tr> |
| <td>abs</td> |
| <td>0.273286481</td> |
| <td>0.38319</td> |
| <td>140.22%</td> |
| <td>0.3711615</td> |
| <td>0.338064</td> |
| <td>91.08%</td> |
| </tr> |
| <tr> |
| <td>arcsinh</td> |
| <td>0.155792513</td> |
| <td>0.2090985</td> |
| <td>134.22%</td> |
| <td>0.113365</td> |
| <td>0.1702855</td> |
| <td>150.21%</td> |
| </tr> |
| <tr> |
| <td>sample_gamma</td> |
| <td>0.137634983</td> |
| <td>0.1842455</td> |
| <td>133.87%</td> |
| <td>0.1792825</td> |
| <td>0.172175</td> |
| <td>96.04%</td> |
| </tr> |
| <tr> |
| <td>sort</td> |
| <td>0.864107016</td> |
| <td>1.1560165</td> |
| <td>133.78%</td> |
| <td>0.8239285</td> |
| <td>1.1454645</td> |
| <td>139.02%</td> |
| </tr> |
| <tr> |
| <td>argsort</td> |
| <td>0.847259507</td> |
| <td>1.1320885</td> |
| <td>133.62%</td> |
| <td>0.842302</td> |
| <td>1.1179105</td> |
| <td>132.72%</td> |
| </tr> |
| <tr> |
| <td>cosh</td> |
| <td>0.129947497</td> |
| <td>0.1727415</td> |
| <td>132.93%</td> |
| <td>0.1192565</td> |
| <td>0.1217325</td> |
| <td>102.08%</td> |
| </tr> |
| <tr> |
| <td>random_randint</td> |
| <td>0.822044531</td> |
| <td>1.085645</td> |
| <td>132.07%</td> |
| <td>0.6036805</td> |
| <td>1.0953995</td> |
| <td>181.45%</td> |
| </tr> |
| <tr> |
| <td>arctanh</td> |
| <td>0.119817996</td> |
| <td>0.1576315</td> |
| <td>131.56%</td> |
| <td>0.115616</td> |
| <td>0.111907</td> |
| <td>96.79%</td> |
| </tr> |
| <tr> |
| <td>arccos</td> |
| <td>0.185662502</td> |
| <td>0.2423095</td> |
| <td>130.51%</td> |
| <td>0.238534</td> |
| <td>0.2351415</td> |
| <td>98.58%</td> |
| </tr> |
| <tr> |
| <td>mean</td> |
| <td>1.758513477</td> |
| <td>2.2908485</td> |
| <td>130.27%</td> |
| <td>1.5868465</td> |
| <td>2.530801</td> |
| <td>159.49%</td> |
| </tr> |
| <tr> |
| <td>erfinv</td> |
| <td>0.142498524</td> |
| <td>0.184796</td> |
| <td>129.68%</td> |
| <td>0.1529025</td> |
| <td>0.1538225</td> |
| <td>100.60%</td> |
| </tr> |
| <tr> |
| <td>degrees</td> |
| <td>0.12517249</td> |
| <td>0.1576175</td> |
| <td>125.92%</td> |
| <td>0.1166425</td> |
| <td>0.1199775</td> |
| <td>102.86%</td> |
| </tr> |
| <tr> |
| <td>sample_exponential</td> |
| <td>0.07651851</td> |
| <td>0.0960485</td> |
| <td>125.52%</td> |
| <td>0.0885775</td> |
| <td>0.095597</td> |
| <td>107.92%</td> |
| </tr> |
| <tr> |
| <td>arctan</td> |
| <td>0.120863522</td> |
| <td>0.1496115</td> |
| <td>123.79%</td> |
| <td>0.1161245</td> |
| <td>0.17206</td> |
| <td>148.17%</td> |
| </tr> |
| <tr> |
| <td>prod</td> |
| <td>1.147695002</td> |
| <td>1.408007</td> |
| <td>122.68%</td> |
| <td>1.0491025</td> |
| <td>1.4065515</td> |
| <td>134.07%</td> |
| </tr> |
| <tr> |
| <td>fix</td> |
| <td>0.073436997</td> |
| <td>0.089991</td> |
| <td>122.54%</td> |
| <td>0.0390455</td> |
| <td>0.099307</td> |
| <td>254.34%</td> |
| </tr> |
| <tr> |
| <td>exp</td> |
| <td>0.047701993</td> |
| <td>0.058272</td> |
| <td>122.16%</td> |
| <td>0.0397295</td> |
| <td>0.0506725</td> |
| <td>127.54%</td> |
| </tr> |
| </tbody> |
| </table> |
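| <p>The ratio columns can be reproduced from the timing columns. The sketch below recomputes two of the int64/int32 entries from values in the table; <code>slowdown_percent</code> is a hypothetical helper name, and the results match the table only up to rounding.</p> |

```python
def slowdown_percent(int32_msec, int64_msec):
    """Express the int64 time as a percentage of the int32 time."""
    return 100.0 * int64_msec / int32_msec


# Timings (msec) copied from the first topk and argsort rows of the table.
topk = slowdown_percent(12.81245198, 42.2472195)
argsort = slowdown_percent(16.43896801, 46.2231455)

print(f"topk:    {topk:.2f}%")
print(f"argsort: {argsort:.2f}%")
```

| <p>A result above 100% means the int64 build took longer than the int32 build for that operator.</p> |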
| |
| </div> |
| </div> |
| |
| </div> |
| </div> |
| |
| </article> |
| |
| </main> |
| <footer class="site-footer2"> |
| <div class="wrapper"> |
| <div class="row"> |
| <div class="col-3"> |
| <img src="/versions/master/assets/img/apache_incubator_logo.png" class="footer-logo col-2"> |
| </div> |
| <div class="footer-bottom-warning col-9"> |
| <p>Apache MXNet is an effort undergoing incubation at <a href="http://www.apache.org/">The Apache Software Foundation</a> (ASF), <span |
| style="font-weight:bold">sponsored by the <i>Apache Incubator</i></span>. Incubation is required |
| of all newly accepted projects until a further review indicates that the infrastructure, |
| communications, and decision making process have stabilized in a manner consistent with other |
| successful ASF projects. While incubation status is not necessarily a reflection of the completeness |
| or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. |
| </p><p>"Copyright © 2017-2022, The Apache Software Foundation Apache MXNet, MXNet, Apache, the Apache |
| feather, and the Apache MXNet project logo are either registered trademarks or trademarks of the |
| Apache Software Foundation."</p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| |
| |
| |
| |
| </body> |
| |
| </html> |