| <!-- |
| Documentation/_templates/layout.html |
| |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. The |
| ASF licenses this file to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance with the |
| License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, WITHOUT |
| WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the |
| License for the specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| |
| |
| <!DOCTYPE html> |
| <html class="writer-html5" lang="en"> |
| <head> |
| <meta charset="utf-8" /><meta name="generator" content="Docutils 0.18.1: http://docutils.sourceforge.net/" /> |
| |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <title>Work Queue Deadlocks — NuttX latest documentation</title> |
| <link rel="stylesheet" type="text/css" href="../../_static/pygments.css" /> |
| <link rel="stylesheet" type="text/css" href="../../_static/css/theme.css" /> |
| <link rel="stylesheet" type="text/css" href="../../_static/copybutton.css" /> |
| <link rel="stylesheet" type="text/css" href="../../_static/custom.css" /> |
| |
| |
| <link rel="shortcut icon" href="../../_static/favicon.ico"/> |
| <script src="../../_static/jquery.js"></script> |
| <script src="../../_static/_sphinx_javascript_frameworks_compat.js"></script> |
| <script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script> |
| <script src="../../_static/doctools.js"></script> |
| <script src="../../_static/sphinx_highlight.js"></script> |
| <script src="../../_static/clipboard.min.js"></script> |
| <script src="../../_static/copybutton.js"></script> |
| <script src="../../_static/js/theme.js"></script> |
| <link rel="index" title="Index" href="../../genindex.html" /> |
| <link rel="search" title="Search" href="../../search.html" /> |
| <link rel="next" title="Memory Management" href="../mm/index.html" /> |
| <link rel="prev" title="SLIP" href="slip.html" /> |
| </head> |
| |
| <body class="wy-body-for-nav"> |
| <div class="wy-grid-for-nav"> |
| <nav data-toggle="wy-nav-shift" class="wy-nav-side"> |
| <div class="wy-side-scroll"> |
| <div class="wy-side-nav-search" > |
| |
| <a href="../../index.html" class="icon icon-home"> NuttX |
| |
| |
| |
| </a> |
| |
| |
| |
| <div role="search"> |
| <form id="rtd-search-form" class="wy-form" action="../../search.html" method="get"> |
| <input type="text" name="q" placeholder="Search docs" aria-label="Search docs" /> |
| <input type="hidden" name="check_keywords" value="yes" /> |
| <input type="hidden" name="area" value="default" /> |
| </form> |
| </div> |
| |
| </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu"> |
| <p class="caption" role="heading"><span class="caption-text">Table of Contents</span></p> |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../index.html">Home</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../introduction/index.html">Introduction</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../quickstart/index.html">Getting Started</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../contributing/index.html">Contributing</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../introduction/inviolables.html">The Inviolable Principles of NuttX</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../platforms/index.html">Supported Platforms</a></li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">OS Components</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../binfmt.html">Binary Loader</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../drivers/index.html">Device Drivers</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../nxflat.html">NXFLAT</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../nxgraphics/index.html">NX Graphics Subsystem</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../paging.html">On-Demand Paging</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../audio/index.html">Audio Subsystem</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../filesystem/index.html">NuttX File System</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../libs/index.html">NuttX libraries</a></li> |
| <li class="toctree-l2 current"><a class="reference internal" href="index.html">Network Support</a><ul class="current"> |
| <li class="toctree-l3"><a class="reference internal" href="sixlowpan.html">6LoWPAN</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="socketcan.html">SocketCAN Device Drivers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="pkt.html">“Raw” packet socket support</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="ipfilter.html">IP Packet Filter</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="nat.html">Network Address Translation (NAT)</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="netdev.html">Network Devices</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="netdriver.html">Network Drivers</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="netguardsize.html">CONFIG_NET_GUARDSIZE</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="netlink.html">Netlink Route support</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="slip.html">SLIP</a></li> |
| <li class="toctree-l3 current"><a class="current reference internal" href="#">Work Queue Deadlocks</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="#use-of-work-queues">Use of Work Queues</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#high-and-low-priority-work-queues">High and Low Priority Work Queues</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#downsides-of-work-queues">Downsides of Work Queues</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#networking-on-work-queues">Networking on Work Queues</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#deadlocks">Deadlocks</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#iobs">IOBs</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#alternatives-to-work-queues">Alternatives to Work Queues</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../mm/index.html">Memory Management</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../syscall.html">Syscall Layer</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../tools/index.html"><code class="docutils literal notranslate"><span class="pre">/tools</span></code> Host Tools</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../arch/index.html">Architecture-Specific Code</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../boards.html">Boards Support</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../cmake.html">CMake Support</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../openamp.html">OpenAMP Support</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../video.html">Video Subsystem</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../crypto.html">Crypto API Subsystem</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../wireless.html">Wireless Subsystem</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l1"><a class="reference internal" href="../../applications/index.html">Applications</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../implementation/index.html">Implementation Details</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../reference/index.html">API Reference</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../faq/index.html">FAQ</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../guides/index.html">Guides</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../glossary.html">Glossary</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../logos/index.html">NuttX Logos</a></li> |
| </ul> |
| |
| </div> |
| </div> |
| </nav> |
| |
| <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" > |
| <i data-toggle="wy-nav-top" class="fa fa-bars"></i> |
| <a href="../../index.html">NuttX</a> |
| </nav> |
| |
| <div class="wy-nav-content"> |
| <div class="rst-content"> |
| <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> |
| <div itemprop="articleBody"> |
| |
| <section id="work-queue-deadlocks"> |
| <h1>Work Queue Deadlocks<a class="headerlink" href="#work-queue-deadlocks" title="Permalink to this heading"></a></h1> |
| <section id="use-of-work-queues"> |
| <h2>Use of Work Queues<a class="headerlink" href="#use-of-work-queues" title="Permalink to this heading"></a></h2> |
| <p>Most network drivers use a work queue to handle network events. This is done for |
| two reasons: (1) most of the example code available to leverage does it that way, and (2) |
| it is easier, and a more efficient use of memory resources, to use the work queue |
| rather than creating a dedicated task/thread to service the network.</p> |
| </section> |
| <section id="high-and-low-priority-work-queues"> |
| <h2>High and Low Priority Work Queues<a class="headerlink" href="#high-and-low-priority-work-queues" title="Permalink to this heading"></a></h2> |
| <p>There are two kinds of work queues: a single, high priority work queue that is intended |
| only to service back end interrupt processing in a semi-normal, tasking |
| context, and low priority work queue(s) that are similar but, as the name implies, |
| are lower in priority and not dedicated to time-critical back end interrupt |
| processing.</p> |
| </section> |
| <section id="downsides-of-work-queues"> |
| <h2>Downsides of Work Queues<a class="headerlink" href="#downsides-of-work-queues" title="Permalink to this heading"></a></h2> |
| <p>There are two important downsides to the use of work queues. First, the work queues |
| are inherently non-deterministic. The delay between the point at which you |
| schedule work and the time at which the work is performed is highly random, and |
| that delay is due not only to the strict priority scheduling but also to what |
| work has been queued ahead of you.</p> |
| <p>Why do you bother to use an RTOS if you rely on non-deterministic work queues to do |
| most of the work?</p> |
| <p>A second problem is related: Only one work queue job can be performed at a time. |
| That job should be brief so that it can make the work queue available again for |
| the next work queue job as soon as possible. And that job should never block |
| waiting for resources! If the job blocks, then it blocks the entire work queue |
| and makes the whole work queue unavailable for the duration of the wait.</p> |
| </section> |
| <section id="networking-on-work-queues"> |
| <h2>Networking on Work Queues<a class="headerlink" href="#networking-on-work-queues" title="Permalink to this heading"></a></h2> |
| <p>As mentioned, most network drivers use a work queue to handle network events. |
| (Some are even configurable to use the high priority work queue… YIKES!) Most |
| network operations are not really suited for execution on a work queue: the |
| networking operations can be quite extended and can also block waiting for |
| the availability of resources. So, at a minimum, networking should never use |
| the high priority work queue.</p> |
| </section> |
| <section id="deadlocks"> |
| <h2>Deadlocks<a class="headerlink" href="#deadlocks" title="Permalink to this heading"></a></h2> |
| <p>If there is only a single instance of a work queue, then it is easy to create a |
| deadlock on the work queue if a work job blocks on the work queue. Here is the |
| generic work queue deadlock scenario:</p> |
| <ul class="simple"> |
| <li><p>A job runs on a work queue and waits for the availability of a resource.</p></li> |
| <li><p>The operation that provides that resource also runs on the same work queue.</p></li> |
| <li><p>But since the work queue is blocked waiting for the resource, the job that |
| provides the resource cannot run and a deadlock results.</p></li> |
| </ul> |
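| <p>The scenario above can be sketched with a small, single-threaded model. This is |
| not NuttX code: <code class="docutils literal notranslate"><span class="pre">run_queue()</span></code> and the job kinds are hypothetical names, and |
| the model assumes that jobs are taken in FIFO order and that a job which cannot get its |
| resource keeps its worker thread blocked.</p> |

```c
#include <stddef.h>

/* Toy model of a work queue.  All names are hypothetical. */

enum job_kind
{
  JOB_NEEDS_RESOURCE,     /* Blocks its worker until a resource exists */
  JOB_PROVIDES_RESOURCE   /* Makes one resource available */
};

/* Feed 'njobs' jobs, in FIFO order, to 'nworkers' worker threads.
 * Returns the number of jobs that ran to completion.  A result smaller
 * than 'njobs' means that the queue is stuck.
 */

int run_queue(const enum job_kind *jobs, size_t njobs, int nworkers)
{
  int resources = 0;   /* Resources currently available */
  int blocked   = 0;   /* Workers blocked waiting for a resource */
  int completed = 0;   /* Jobs that finished */

  for (size_t i = 0; i < njobs; i++)
    {
      if (blocked == nworkers)
        {
          break;   /* No free worker: nothing else can be dequeued */
        }

      if (jobs[i] == JOB_PROVIDES_RESOURCE)
        {
          resources++;
          completed++;

          /* Wake up blocked workers while resources remain */

          while (blocked > 0 && resources > 0)
            {
              resources--;
              blocked--;
              completed++;
            }
        }
      else if (resources > 0)
        {
          resources--;   /* Resource available: no need to block */
          completed++;
        }
      else
        {
          blocked++;     /* This job now occupies a worker thread */
        }
    }

  return completed;
}
```

| <p>In this model, a waiting job queued ahead of its provider completes nothing with one |
| worker, while both jobs complete with two workers.</p> |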
| </section> |
| <section id="iobs"> |
| <h2>IOBs<a class="headerlink" href="#iobs" title="Permalink to this heading"></a></h2> |
| <p>IOBs (I/O Blocks) are small I/O buffers that can be linked together in chains to |
| efficiently buffer variable-sized network packet data. This is a much more |
| efficient use of buffering space than full packet buffers since the packet |
| content is often much smaller than the full packet size (the MSS).</p> |
| <p>The network allocates IOBs to support TCP and UDP read-ahead buffering and write |
| buffering. Read-ahead buffering is used when TCP/UDP data is received and there is |
| no receiver in place waiting to accept the data. In this case, the received |
| payload is buffered in the IOB-based, read-ahead buffers. When the application |
| next calls <code class="docutils literal notranslate"><span class="pre">recv()</span></code> or <code class="docutils literal notranslate"><span class="pre">recvfrom()</span></code>, the data will be removed from the read-ahead |
| buffer and returned to the caller immediately.</p> |
| <p>Write-buffering refers to the similar feature on the outgoing side. When an application |
| calls <code class="docutils literal notranslate"><span class="pre">send()</span></code> or <code class="docutils literal notranslate"><span class="pre">sendto()</span></code> and the driver is not available to accept the new packet |
| data, the data is buffered in IOBs in the write buffer chain. When the network |
| driver is finally available to take more data, packet data is removed from |
| the write buffer and provided to the driver.</p> |
| <p>The IOBs are allocated with a fixed size. A fixed number of IOBs are pre-allocated |
| when the system starts. If the network runs out of IOBs, additional IOBs will not |
| be allocated dynamically. Instead, the IOB allocator, <code class="docutils literal notranslate"><span class="pre">iob_alloc()</span></code>, will block waiting |
| until an IOB is finally returned to the pool of free IOBs. There is also a non-blocking |
| IOB allocator, <code class="docutils literal notranslate"><span class="pre">iob_tryalloc()</span></code>.</p> |
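| <p>As a rough illustration of a pre-allocated, fixed-size pool with a non-blocking |
| allocator, consider the minimal sketch below. It is not the NuttX IOB implementation: |
| <code class="docutils literal notranslate"><span class="pre">struct</span> <span class="pre">buf</span></code>, <code class="docutils literal notranslate"><span class="pre">pool_init()</span></code>, <code class="docutils literal notranslate"><span class="pre">pool_tryalloc()</span></code>, <code class="docutils literal notranslate"><span class="pre">pool_free()</span></code> and both sizes are hypothetical, |
| and the sketch does no locking, so it is only safe from a single thread.</p> |

```c
#include <assert.h>
#include <stddef.h>

#define POOL_NBUFFERS 4    /* Hypothetical number of pre-allocated buffers */
#define POOL_BUFSIZE  64   /* Hypothetical payload bytes per buffer */

struct buf
{
  struct buf *next;                   /* Links buffers in a chain or free list */
  size_t len;                         /* Payload bytes in use */
  unsigned char data[POOL_BUFSIZE];   /* Fixed-size payload area */
};

static struct buf g_pool[POOL_NBUFFERS];   /* Storage reserved at build time */
static struct buf *g_freelist;             /* Head of the list of free buffers */

/* Put every buffer on the free list.  Call once at start-up. */

void pool_init(void)
{
  g_freelist = NULL;
  for (size_t i = 0; i < POOL_NBUFFERS; i++)
    {
      g_pool[i].next = g_freelist;
      g_pool[i].len  = 0;
      g_freelist     = &g_pool[i];
    }
}

/* Non-blocking allocation: returns NULL when the pool is exhausted
 * instead of waiting and never allocates additional memory.
 */

struct buf *pool_tryalloc(void)
{
  struct buf *b = g_freelist;

  if (b != NULL)
    {
      g_freelist = b->next;
      b->next    = NULL;
      b->len     = 0;
    }

  return b;
}

/* Return a buffer to the free list. */

void pool_free(struct buf *b)
{
  assert(b != NULL);
  b->next    = g_freelist;
  g_freelist = b;
}
```

| <p>A blocking allocator would wait at the point where this sketch returns <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, which |
| is exactly where a work queue job can stall.</p> |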
| <p>Under conditions of high utilization, such as sending large amounts of data at high |
| rates or receiving large amounts of data at high rates, it is inevitable that the |
| system will run out of pre-allocated IOBs. For read-ahead buffering, the packets |
| are simply dropped in this case. For TCP this means that there will be a subsequent |
| timeout on the remote peer because no ACK will be received and the remote peer will |
| eventually re-transmit the packet. UDP is a lossy transfer and handling of lost or |
| dropped datagrams must be included in any UDP design.</p> |
| <p>For write-buffering, there are three possible behaviors that can occur when the |
| IOB pool has been exhausted: First, if there are no available IOBs at the beginning |
| of a <code class="docutils literal notranslate"><span class="pre">send()</span></code> or <code class="docutils literal notranslate"><span class="pre">sendto()</span></code> transfer, then the operation will block until IOBs are again |
| available if <code class="docutils literal notranslate"><span class="pre">O_NONBLOCK</span></code> is not selected. This delay can be a substantial amount |
| of time.</p> |
| <p>Second, if <code class="docutils literal notranslate"><span class="pre">O_NONBLOCK</span></code> is selected, the send will, of course, return immediately, |
| failing with errno set to <code class="docutils literal notranslate"><span class="pre">EAGAIN</span></code> if we cannot allocate the first IOB for the transfer.</p> |
| <p>The third behavior occurs if we run out of IOBs in the middle of the transfer. |
| Then the send operation will not wait but will instead send the number of bytes that |
| it has successfully buffered. Applications should always check the return value from |
| <code class="docutils literal notranslate"><span class="pre">send()</span></code> or <code class="docutils literal notranslate"><span class="pre">sendto()</span></code>. If it is a byte count less than the requested transfer |
| size, then the send function should be called again with the remaining data.</p> |
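| <p>One way to handle such short counts on a blocking socket is a retry loop like the |
| following minimal sketch. It uses only the standard <code class="docutils literal notranslate"><span class="pre">send()</span></code> call; <code class="docutils literal notranslate"><span class="pre">send_all()</span></code> is a |
| hypothetical helper name, not a NuttX function.</p> |

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send all 'len' bytes of 'buf', calling send() again after a short count.
 * Returns 0 on success or -1 with errno set by send().
 */

int send_all(int sockfd, const void *buf, size_t len)
{
  const unsigned char *p = buf;

  while (len > 0)
    {
      ssize_t nsent = send(sockfd, p, len, 0);
      if (nsent < 0)
        {
          if (errno == EINTR)
            {
              continue;   /* Interrupted by a signal: try again */
            }

          return -1;      /* Includes EAGAIN on a non-blocking socket */
        }

      /* Skip what was accepted and send the rest on the next pass */

      p   += nsent;
      len -= (size_t)nsent;
    }

  return 0;
}
```

| <p>With <code class="docutils literal notranslate"><span class="pre">O_NONBLOCK</span></code> selected, the caller would normally wait for the socket to become |
| writable before retrying rather than treat <code class="docutils literal notranslate"><span class="pre">EAGAIN</span></code> as fatal.</p> |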
| <p>The blocking <code class="docutils literal notranslate"><span class="pre">iob_alloc()</span></code> call is also a common cause of work queue deadlocks. |
| The scenario again is:</p> |
| <ul class="simple"> |
| <li><p>Some logic in the OS runs on a work queue and blocks waiting for an IOB to |
| become available.</p></li> |
| <li><p>The logic that releases the IOB also runs on the same work queue.</p></li> |
| <li><p>The logic that releases the IOB cannot execute, however, because the other job |
| is blocked waiting for the IOB on the same work queue.</p></li> |
| </ul> |
| </section> |
| <section id="alternatives-to-work-queues"> |
| <h2>Alternatives to Work Queues<a class="headerlink" href="#alternatives-to-work-queues" title="Permalink to this heading"></a></h2> |
| <p>To avoid network deadlocks, here is the rule: Never run the network on a singleton |
| work queue!</p> |
| <p>Most network implementations do just that! Here are a couple of alternatives:</p> |
| <ol class="arabic"> |
| <li><p>Use Multiple Low Priority Work Queues |
| Unlike the high priority work queue, the low priority work queues utilize a |
| thread pool. The number of threads in the pool is controlled by |
| <code class="docutils literal notranslate"><span class="pre">CONFIG_SCHED_LPNTHREADS</span></code>. If <code class="docutils literal notranslate"><span class="pre">CONFIG_SCHED_LPNTHREADS</span></code> is greater than one, |
| then such deadlocks should not be possible: In that case, if a thread is busy with |
| some other job (even if it is only waiting for a resource), then the job will be |
| assigned to a different thread and the deadlock will be broken. The cost of the |
| additional low priority work queue thread is primarily the memory set aside for |
| the thread’s stack.</p></li> |
| <li><p>Use a Dedicated Network Thread |
| The best solution would be to write a custom kernel thread to handle driver |
| network operations. This would be the highest performing and the most manageable. |
| It would also, however, be substantially more work.</p></li> |
| <li><p>Interactions with Network Locks |
| The network lock is a re-entrant mutex that enforces mutually exclusive access to |
| the network. The network lock can also cause deadlocks and can also interact with |
| the work queues to degrade performance. Consider this scenario:</p> |
| <blockquote> |
| <div><ul class="simple"> |
| <li><p>Some network logic, perhaps running on the application thread, takes the network |
| lock then waits for an IOB to become available (on the application thread, not a |
| work queue).</p></li> |
| <li><p>Some network related event runs on the work queue but is blocked waiting for |
| the network lock.</p></li> |
| <li><p>Another job is queued behind that network job. This is the one that provides the |
| IOB, but it cannot run because the other thread is blocked waiting for the network |
| lock on the work queue.</p></li> |
| </ul> |
| </div></blockquote> |
| <p>But the network will not be unlocked because the application logic holds the network |
| lock and is waiting for the IOB which can never be released.</p> |
| <p>Within the network, this deadlock condition is avoided using a special function |
| <code class="docutils literal notranslate"><span class="pre">net_ioballoc()</span></code>. <code class="docutils literal notranslate"><span class="pre">net_ioballoc()</span></code> is a wrapper around the blocking <code class="docutils literal notranslate"><span class="pre">iob_alloc()</span></code> |
| that momentarily releases the network lock while waiting for the IOB to become available.</p> |
| <p>Similarly, the network functions <code class="docutils literal notranslate"><span class="pre">net_lockedwait()</span></code> and <code class="docutils literal notranslate"><span class="pre">net_timedwait()</span></code> are wrappers |
| around <code class="docutils literal notranslate"><span class="pre">nxsem_wait()</span></code> and <code class="docutils literal notranslate"><span class="pre">nxsem_timedwait()</span></code>, respectively, and also release the network |
| lock for the duration of the wait.</p> |
| <p>Caution should be used with any of these wrapper functions. Because the network lock is |
| relinquished during the wait, there could be changes in the network state that occur before |
| the lock is recovered. Your design should account for this possibility.</p> |
| </li> |
| </ol> |
| </section> |
| </section> |
| |
| |
| </div> |
| </div> |
| <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer"> |
| <a href="slip.html" class="btn btn-neutral float-left" title="SLIP" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a> |
| <a href="../mm/index.html" class="btn btn-neutral float-right" title="Memory Management" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a> |
| </div> |
| |
| <hr/> |
| |
| <div role="contentinfo"> |
| <p>© Copyright 2023, The Apache Software Foundation.</p> |
| </div> |
| |
| |
| |
| </footer> |
| </div> |
| </div> |
| </section> |
| </div> |
| <script> |
| jQuery(function () { |
| SphinxRtdTheme.Navigation.enable(true); |
| }); |
| </script> |
| |
| </body> |
| </html> |