| <?xml version="1.0" encoding="ISO-8859-1"?> | 
 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | 
 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!-- | 
 |         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | 
 |               This file is generated from xml source: DO NOT EDIT | 
 |         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | 
 |       --> | 
 | <title>Connections in the FIN_WAIT_2 state and Apache - Apache HTTP Server</title> | 
 | <link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" /> | 
 | <link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" /> | 
 | <link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /> | 
 | <link href="../images/favicon.ico" rel="shortcut icon" /></head> | 
 | <body id="manual-page"><div id="page-header"> | 
 | <p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p> | 
 | <p class="apache">Apache HTTP Server Version 2.0</p> | 
 | <img alt="" src="../images/feather.gif" /></div> | 
 | <div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div> | 
 | <div id="path"> | 
 | <a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs-project/">Documentation</a> &gt; <a href="../">Version 2.0</a> &gt; <a href="./">Miscellaneous Documentation</a></div><div id="page-content"><div id="preamble"><h1>Connections in the FIN_WAIT_2 state and Apache</h1> | 
 | <div class="toplang"> | 
 | <p><span>Available Languages: </span><a href="../en/misc/fin_wait_2.html" title="English"> en </a></p> | 
 | </div> | 
 |  | 
 |  | 
 |     <div class="warning"><h3>Warning:</h3> | 
 |       <p>This document has not been fully updated | 
 |       to take into account changes made in the 2.0 version of the | 
 |       Apache HTTP Server. Some of the information may still be | 
 |       relevant, but please use it with care.</p> | 
 |     </div> | 
 |  | 
 |         <p>Starting with the Apache 1.2 betas, people are reporting | 
 |         many more connections in the FIN_WAIT_2 state (as reported | 
 |         by <code>netstat</code>) than they saw using older | 
 |         versions. When the server closes a TCP connection, it sends | 
 |         a packet with the FIN bit set to the client, which then | 
 |         responds with a packet with the ACK bit set. The client | 
 |         then sends a packet with the FIN bit set to the server, | 
 |         which responds with an ACK and the connection is closed. | 
 |         The state that the connection is in during the period | 
 |         between when the server gets the ACK from the client and | 
 |         the server gets the FIN from the client is known as | 
 |         FIN_WAIT_2. See the <a href="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</a> for | 
 |         the technical details of the state transitions.</p> | 
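<p>The close sequence described above can be modelled as a small state table. This is a simplified sketch rather than a full TCP implementation: the state names come from RFC 793, while the event names (<code>close</code>, <code>recv_ack</code>, <code>recv_fin</code>, <code>timeout</code>) and the <code>run</code> helper are invented here for illustration. It also shows the TIME_WAIT state that the closing side passes through after it receives the client's FIN.</p>

```python
# Simplified model of the server side of an active TCP close (RFC 793).
# Only the path described in the text is modelled; simultaneous close
# and resets are deliberately left out.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",     # server sends its FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",   # client ACKs the server's FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",    # client sends its FIN, server ACKs it
    ("TIME_WAIT", "timeout"): "CLOSED",         # 2*MSL timer expires
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


# A well-behaved client: the connection reaches CLOSED.
full_close = run(["close", "recv_ack", "recv_fin", "timeout"])

# A client that never sends its FIN: the server is left in FIN_WAIT_2.
stuck = run(["close", "recv_ack"])
```

<p>In the second run no further event arrives, so the last state in <code>stuck</code> is FIN_WAIT_2, which is the situation this document is about.</p>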
 |  | 
 |         <p>The FIN_WAIT_2 state is somewhat unusual in that there | 
 |         is no timeout defined in the standard for it. This means | 
 |         that on many operating systems, a connection in the | 
 |         FIN_WAIT_2 state will stay around until the system is | 
 |         rebooted. If the system does not have a timeout and too | 
 |         many FIN_WAIT_2 connections build up, it can fill up the | 
 |         space allocated for storing information about the | 
 |         connections and crash the kernel. The connections in | 
 |         FIN_WAIT_2 do not tie up an httpd process.</p> | 
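<p>To see how many connections are sitting in this state, the output of <code>netstat</code> can be tallied by state. The following is a minimal sketch that assumes the state is the last column of each TCP line; the column layout differs between platforms, and some <code>netstat</code> implementations spell the state <code>FIN_WAIT2</code>. The function name and the sample text (which uses documentation-only addresses) are made up for illustration.</p>

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -an`-style text.

    Assumes the state is the last whitespace-separated field on each
    line whose first field starts with "tcp".
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


# Invented sample input, not real measurements.
SAMPLE = """\
tcp        0      0 192.0.2.10:80    198.51.100.7:1042   FIN_WAIT_2
tcp        0      0 192.0.2.10:80    198.51.100.8:1100   ESTABLISHED
tcp        0      0 192.0.2.10:80    198.51.100.9:1201   FIN_WAIT_2
tcp        0      0 0.0.0.0:80       0.0.0.0:*           LISTEN
"""

fin_wait_2_count = count_tcp_states(SAMPLE)["FIN_WAIT_2"]
```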
 |  | 
 |   </div> | 
 | <div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#why">Why Does It Happen?</a></li> | 
 | <li><img alt="" src="../images/down.gif" /> <a href="#what">What Can I Do About it?</a></li> | 
 | <li><img alt="" src="../images/down.gif" /> <a href="#appendix">Appendix</a></li> | 
 | </ul></div> | 
 | <div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div> | 
 | <div class="section"> | 
 | <h2><a name="why" id="why">Why Does It Happen?</a></h2> | 
 |  | 
 |     <p>There are numerous reasons for it happening, some of which | 
 |     may not yet be fully clear. What is known follows.</p> | 
 |  | 
 |     <h3><a name="buggy" id="buggy">Buggy Clients and Persistent  | 
 |                                Connections</a></h3> | 
 |  | 
 |         <p>Several clients have a bug that appears when dealing with | 
 |         persistent connections (also known as keepalives). When the | 
 |         connection is idle and the server closes it (based on the | 
 |         <code class="directive"><a href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code>), | 
 |         the client does not send back a FIN and ACK to the server. | 
 |         This means that the | 
 |         connection stays in the FIN_WAIT_2 state until one of the | 
 |         following happens:</p> | 
 |  | 
 |         <ul> | 
 |           <li>The client opens a new connection to the same or a | 
 |           different site, which causes it to fully close the older | 
 |           connection on that socket.</li> | 
 |  | 
 |           <li>The user exits the client, which on some (most?) | 
 |           clients causes the OS to fully shut down the | 
 |           connection.</li> | 
 |  | 
 |           <li>The FIN_WAIT_2 times out, on servers that have a | 
 |           timeout for this state.</li> | 
 |         </ul> | 
 |  | 
 |         <p>If you are lucky, this means that the buggy client will | 
 |         fully close the connection and release the resources on | 
 |         your server. However, there are some cases where the socket | 
 |         is never fully closed, such as a dialup client | 
 |         disconnecting from their provider before closing the | 
 |         client. In addition, a client might sit idle for days | 
 |         without making another connection, and thus may hold its | 
 |         end of the socket open for days even though it has no | 
 |         further use for it. <strong>This is a bug in the browser or | 
 |         in its operating system's TCP implementation.</strong></p> | 
 |  | 
 |         <p>The clients on which this problem has been verified to | 
 |         exist:</p> | 
 |  | 
 |         <ul> | 
 |           <li>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE | 
 |           i386)</li> | 
 |  | 
 |           <li>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE | 
 |           i386)</li> | 
 |  | 
 |           <li>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)</li> | 
 |  | 
 |           <li>MSIE 3.01 on the Macintosh</li> | 
 |  | 
 |           <li>MSIE 3.01 on Windows 95</li> | 
 |         </ul> | 
 |  | 
 |         <p>This does not appear to be a problem on:</p> | 
 |  | 
 |         <ul> | 
 |           <li>Mozilla/3.01 (Win95; I)</li> | 
 |         </ul> | 
 |  | 
 |         <p>It is expected that many other clients have the same | 
 |         problem. What a client <strong>should do</strong> is | 
 |         periodically check its open socket(s) to see if they have | 
 |         been closed by the server, and close its side of the | 
 |         connection if the server has closed. This check need only | 
 |         occur once every few seconds, and the closure may even be | 
 |         reported by an OS signal on some systems (<em>e.g.</em>, Win95 | 
 |         and NT clients have this capability, but they seem to be | 
 |         ignoring it).</p> | 
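<p>The periodic check described above could look roughly like this on the client side. This is a sketch under stated assumptions, not code from any real browser: the function names are hypothetical, and it relies on the common behaviour that a socket whose peer has sent a FIN becomes readable and returns an empty read.</p>

```python
import select
import socket


def server_has_closed(sock):
    """Return True if the peer has closed its side of an idle connection.

    A zero-timeout select() tells us whether anything is pending; a
    MSG_PEEK read that yields b"" means the peer's FIN has arrived.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False              # nothing pending: still open and idle
    try:
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True               # a reset also means the connection is gone


def close_if_server_closed(sock):
    """Close our side once the server has closed; report whether we did."""
    if server_has_closed(sock):
        sock.close()              # sends our FIN, letting the server leave FIN_WAIT_2
        return True
    return False
```

<p>A client would call <code>close_if_server_closed()</code> on each idle persistent connection every few seconds, for example from its event loop.</p>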
 |  | 
 |         <p>Apache <strong>cannot</strong> avoid these FIN_WAIT_2 | 
 |         states unless it disables persistent connections for the | 
 |         buggy clients, just like we recommend doing for Navigator | 
 |         2.x clients due to other bugs. However, non-persistent | 
 |         connections increase the total number of connections needed | 
 |         per client and slow retrieval of an image-laden web page. | 
 |         Since non-persistent connections have their own resource | 
 |         costs and a short waiting period after each closure, | 
 |         a busy server may need persistence in order to best serve | 
 |         its clients.</p> | 
 |  | 
 |         <p>As far as we know, the client-caused FIN_WAIT_2 problem | 
 |         is present for all servers that support persistent | 
 |         connections, including Apache 1.1.x and 1.2.</p> | 
 |  | 
 |      | 
 |  | 
 |     <h3><a name="code" id="code">A necessary bit of code  | 
 |                               introduced in 1.2</a></h3> | 
 |  | 
 |         <p>While the above bug is a problem, it is not the whole | 
 |         problem. Some users have observed no FIN_WAIT_2 problems | 
 |         with Apache 1.1.x, but with 1.2b enough connections build | 
 |         up in the FIN_WAIT_2 state to crash their server. The most | 
 |         likely source for additional FIN_WAIT_2 states is a | 
 |         function called <code>lingering_close()</code> which was | 
 |         added between 1.1 and 1.2. This function is necessary for | 
 |         the proper handling of persistent connections and any | 
 |         request which includes content in the message body | 
 |         (<em>e.g.</em>, PUTs and POSTs). What it does is read any | 
 |         data sent by the client for a certain time after the server | 
 |         closes the connection. The exact reasons for doing this are | 
 |         somewhat complicated, but involve what happens if the | 
 |         client is making a request at the same time the server | 
 |         sends a response and closes the connection. Without | 
 |         lingering, the client might be forced to reset its TCP | 
 |         input buffer before it has a chance to read the server's | 
 |         response, and thus understand why the connection has | 
 |         closed. See the <a href="#appendix">appendix</a> for more | 
 |         details.</p> | 
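<p>The idea behind a lingering close can be sketched as follows. This is an illustration of the technique only and is not Apache's <code>lingering_close()</code> implementation: the parameter names and timeout values are assumptions chosen for the example. The server closes only the write half of the connection, then keeps reading and discarding client data until the client closes, goes quiet, or an overall deadline passes.</p>

```python
import select
import socket
import time


def lingering_close(sock, max_linger=30.0, idle_timeout=2.0):
    """Half-close, then drain client data until EOF or a timeout.

    The timeout defaults are illustrative assumptions.
    Returns the number of bytes read and discarded.
    """
    discarded = 0
    deadline = time.monotonic() + max_linger
    try:
        sock.shutdown(socket.SHUT_WR)       # send FIN: stop writing, keep reading
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                       # overall linger time used up
            wait = min(idle_timeout, remaining)
            readable, _, _ = select.select([sock], [], [], wait)
            if not readable:
                break                       # client went quiet
            data = sock.recv(4096)
            if not data:
                break                       # client closed its side (EOF)
            discarded += len(data)
    except OSError:
        pass                                # e.g. reset by peer: nothing left to do
    finally:
        sock.close()
    return discarded
```

<p>Because the server keeps reading instead of closing outright, data that the client sends late is consumed rather than triggering a reset, so the client still gets to read the final response.</p>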
 |  | 
 |         <p>The code in <code>lingering_close()</code> appears to | 
 |         cause problems for a number of reasons, including the | 
 |         change in traffic patterns that it causes. The code has | 
 |         been thoroughly reviewed and we are not aware of any bugs | 
 |         in it. It is possible that there is some problem in the BSD | 
 |         TCP stack, aside from the lack of a timeout for the | 
 |         FIN_WAIT_2 state, exposed by the | 
 |         <code>lingering_close</code> code that causes the observed | 
 |         problems.</p> | 
 |  | 
 |      | 
 |   </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div> | 
 | <div class="section"> | 
 | <h2><a name="what" id="what">What Can I Do About it?</a></h2> | 
 |  | 
 |     <p>There are several possible workarounds to the problem, some | 
 |      of which work better than others.</p> | 
 |  | 
 |     <h3><a name="add_timeout" id="add_timeout">Add a timeout for FIN_WAIT_2</a></h3> | 
 |  | 
 |         <p>The obvious workaround is to simply have a timeout for the | 
 |         FIN_WAIT_2 state. This is not specified by the RFC, and | 
 |         could be claimed to be a violation of the RFC, but it is | 
 |         widely recognized as being necessary. The following systems | 
 |         are known to have a timeout:</p> | 
 |  | 
 |         <ul> | 
 |           <li><a href="http://www.freebsd.org/">FreeBSD</a> | 
 |           versions starting at 2.0 or possibly earlier.</li> | 
 |  | 
 |           <li><a href="http://www.netbsd.org/">NetBSD</a> version | 
 |           1.2(?)</li> | 
 |  | 
 |           <li><a href="http://www.openbsd.org/">OpenBSD</a> all | 
 |           versions(?)</li> | 
 |  | 
 |           <li><a href="http://www.bsdi.com/">BSD/OS</a> 2.1, with | 
 |           the <a href="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027"> | 
 |           K210-027</a> patch installed.</li> | 
 |  | 
 |           <li><a href="http://www.sun.com/">Solaris</a> as of | 
 |           around version 2.2. The timeout can be tuned by using | 
 |           <code>ndd</code> to modify | 
 |           <code>tcp_fin_wait_2_flush_interval</code>, but the | 
 |           default should be appropriate for most servers and | 
 |           improper tuning can have negative impacts.</li> | 
 |  | 
 |           <li><a href="http://www.linux.org/">Linux</a> 2.0.x and | 
 |           earlier(?)</li> | 
 |  | 
 |           <li><a href="http://www.hp.com/">HP-UX</a> 10.x defaults | 
 |           to terminating connections in the FIN_WAIT_2 state after | 
 |           the normal keepalive timeouts. This does not refer to the | 
 |           persistent connection or HTTP keepalive timeouts, but the | 
 |           <code>SO_LINGER</code> socket option which is enabled by | 
 |           Apache. This parameter can be adjusted by using | 
 |           <code>nettune</code> to modify parameters such as | 
 |           <code>tcp_keepstart</code> and <code>tcp_keepstop</code>. | 
 |           In later revisions, there is an explicit timer for | 
 |           connections in FIN_WAIT_2 that can be modified; contact | 
 |           HP support for details.</li> | 
 |  | 
 |           <li><a href="http://www.sgi.com/">SGI IRIX</a> can be | 
 |           patched to support a timeout. For IRIX 5.3, 6.2, and 6.3, | 
 |           use patches 1654, 1703 and 1778 respectively. If you have | 
 |           trouble locating these patches, please contact your SGI | 
 |           support channel for help.</li> | 
 |  | 
 |           <li><a href="http://www.ncr.com/">NCR's MP RAS Unix</a> | 
 |           2.xx and 3.xx both have FIN_WAIT_2 timeouts. In 2.xx it | 
 |           is non-tunable at 600 seconds, while in 3.xx it defaults | 
 |           to 600 seconds and is calculated based on the tunable | 
 |           "max keep alive probes" (default of 8) multiplied by the | 
 |           "keep alive interval" (default 75 seconds).</li> | 
 |  | 
 |           <li><a href="http://www.sequent.com">Sequent's ptx/TCP/IP | 
 |           for DYNIX/ptx</a> has had a FIN_WAIT_2 timeout since | 
 |           around release 4.1 in mid-1994.</li> | 
 |         </ul> | 
 |  | 
 |         <p>The following systems are known to not have a | 
 |         timeout:</p> | 
 |  | 
 |         <ul> | 
 |           <li><a href="http://www.sun.com/">SunOS 4.x</a> does not | 
 |           and almost certainly never will have one because it is at | 
 |           the very end of its development cycle for Sun. If you | 
 |           have kernel source, it should be easy to patch.</li> | 
 |         </ul> | 
 |  | 
 |         <p>There is a <a href="http://www.apache.org/dist/httpd/contrib/patches/1.2/fin_wait_2.patch"> | 
 |         patch available</a> for adding a timeout to the FIN_WAIT_2 | 
 |         state; it was originally intended for BSD/OS, but should be | 
 |         adaptable to most systems using BSD networking code. You | 
 |         need kernel source code to be able to use it.</p> | 
 |  | 
 |      | 
 |  | 
 |     <h3><a name="no_lingering" id="no_lingering">Compile without using | 
 |                              <code>lingering_close()</code></a></h3> | 
 |  | 
 |         <p>It is possible to compile Apache 1.2 without using the | 
 |         <code>lingering_close()</code> function. This will result | 
 |         in that section of code being similar to that which was in | 
 |         1.1. If you do this, be aware that it can cause problems | 
 |         with PUTs, POSTs and persistent connections, especially if | 
 |         the client uses pipelining. That said, it is no worse than | 
 |         on 1.1, and we understand that keeping your server running | 
 |         is quite important.</p> | 
 |  | 
 |         <p>To compile without the <code>lingering_close()</code> | 
 |         function, add <code>-DNO_LINGCLOSE</code> to the end of the | 
 |         <code>EXTRA_CFLAGS</code> line in your | 
 |         <code>Configuration</code> file, rerun | 
 |         <code class="program"><a href="../programs/Configure.html">Configure</a></code> and rebuild the server.</p> | 
 |  | 
 |      | 
 |  | 
 |     <h3><a name="so_linger" id="so_linger">Use <code>SO_LINGER</code> as  | 
 |                 an alternative to <code>lingering_close()</code></a></h3> | 
 |  | 
 |         <p>On most systems, there is an option called | 
 |         <code>SO_LINGER</code> that can be set with | 
 |         <code>setsockopt(2)</code>. It does something very similar | 
 |         to <code>lingering_close()</code>, except that it is broken | 
 |         on many systems so that it causes far more problems than | 
 |         <code>lingering_close</code>. On some systems, it could | 
 |         possibly work better so it may be worth a try if you have | 
 |         no other alternatives.</p> | 
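<p>For reference, this is roughly what setting <code>SO_LINGER</code> through <code>setsockopt</code> looks like from a program. It is a sketch only, and it assumes that <code>struct linger</code> is laid out as two native integers (<code>l_onoff</code>, <code>l_linger</code>), which holds on common Unix platforms but not everywhere. The helper names are hypothetical.</p>

```python
import socket
import struct

# Assumed layout of `struct linger`: two native ints (l_onoff, l_linger).
LINGER_FORMAT = "ii"


def enable_so_linger(sock, seconds):
    """Ask the kernel to linger on close for up to `seconds` seconds."""
    value = struct.pack(LINGER_FORMAT, 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_so_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack(LINGER_FORMAT, raw)
```

<p>Whether the kernel then behaves usefully is platform dependent, which is the caveat the paragraph above describes.</p>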
 |  | 
 |         <p>To try it, add <code>-DUSE_SO_LINGER | 
 |         -DNO_LINGCLOSE</code> to the end of the | 
 |         <code>EXTRA_CFLAGS</code> line in your | 
 |         <code>Configuration</code> file, rerun | 
 |         <code class="program"><a href="../programs/Configure.html">Configure</a></code> and rebuild the server.</p> | 
 |  | 
 |         <div class="note"><h3>NOTE</h3>Attempting to use | 
 |         <code>SO_LINGER</code> and <code>lingering_close()</code> | 
 |         at the same time is very likely to do very bad things, so | 
 |         don't.</div> | 
 |  | 
 |      | 
 |  | 
 |     <h3><a name="increase_mem" id="increase_mem">Increase the amount of memory  | 
 |                            used for storing connection state</a></h3> | 
 |  | 
 |         <dl> | 
 |           <dt>BSD based networking code:</dt> | 
 |  | 
 |           <dd> | 
 |             BSD stores network data, such as connection states, in | 
 |             something called an mbuf. When you get so many | 
 |             connections that the kernel does not have enough mbufs | 
 |             to put them all in, your kernel will likely crash. You | 
 |             can reduce the effects of the problem by increasing the | 
 |             number of mbufs that are available; this will not | 
 |             prevent the problem, it will just make the server go | 
 |             longer before crashing.  | 
 |  | 
 |             <p>The exact way to increase them may depend on your | 
 |             OS; look for some reference to the number of "mbufs" or | 
 |             "mbuf clusters". On many systems, this can be done by | 
 |             adding the line <code>NMBCLUSTERS="n"</code> to your | 
 |             kernel config file, where <code>n</code> is the number | 
 |             of mbuf clusters you want, and rebuilding your | 
 |             kernel.</p> | 
 |           </dd> | 
 |         </dl> | 
 |  | 
 |      | 
 |  | 
 |     <h3><a name="disable" id="disable">Disable KeepAlive</a></h3> | 
 |  | 
 |         <p>If you are unable to do any of the above then you | 
 |         should, as a last resort, disable KeepAlive. Edit your | 
 |         <code>httpd.conf</code> and change <code>KeepAlive On</code> to | 
 |         <code>KeepAlive Off</code>.</p> | 
 |  | 
 |      | 
 |   </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div> | 
 | <div class="section"> | 
 | <h2><a name="appendix" id="appendix">Appendix</a></h2> | 
 |  | 
 |    <p>Below is a message from Roy Fielding, one of the authors | 
 |    of HTTP/1.1.</p> | 
 |  | 
 |    <h3><a name="message" id="message">Why the lingering close  | 
 |                      functionality is necessary with HTTP</a></h3> | 
 |  | 
 |         <p>The need for a server to linger on a socket after a close | 
 |         is noted a couple times in the HTTP specs, but not | 
 |         explained. This explanation is based on discussions between | 
 |         myself, Henrik Frystyk, Robert S. Thau, Dave Raggett, and | 
 |         John C. Mallery in the hallways of MIT while I was at W3C.</p> | 
 |  | 
 |         <p>If a server closes the input side of the connection | 
 |         while the client is sending data (or is planning to send | 
 |         data), then the server's TCP stack will signal an RST | 
 |         (reset) back to the client. Upon receipt of the RST, the | 
 |         client will flush its own incoming TCP buffer back to the | 
 |         un-ACKed packet indicated by the RST packet argument. If | 
 |         the server has sent a message, usually an error response, | 
 |         to the client just before the close, and the client | 
 |         receives the RST packet before its application code has | 
 |         read the error message from its incoming TCP buffer and | 
 |         before the server has received the ACK sent by the client | 
 |         upon receipt of that buffer, then the RST will flush the | 
 |         error message before the client application has a chance to | 
 |         see it. The result is that the client is left thinking that | 
 |         the connection failed for no apparent reason.</p> | 
 |  | 
 |         <p>There are two conditions under which this is likely to | 
 |         occur:</p> | 
 |  | 
 |         <ol> | 
 |           <li>sending POST or PUT data without proper | 
 |           authorization</li> | 
 |  | 
 |           <li>sending multiple requests before each response | 
 |           (pipelining) and one of the middle requests resulting in | 
 |           an error or other break-the-connection result.</li> | 
 |         </ol> | 
 |  | 
 |         <p>The solution in all cases is to send the response, close | 
 |         only the write half of the connection (what shutdown is | 
 |         supposed to do), and continue reading on the socket until | 
 |         it is either closed by the client (signifying it has | 
 |         finally read the response) or a timeout occurs. That is | 
 |         what the kernel is supposed to do if SO_LINGER is set. | 
 |         Unfortunately, SO_LINGER has no effect on some systems; on | 
 |         some other systems, it does not have its own timeout and | 
 |         thus the TCP memory segments just pile-up until the next | 
 |         reboot (planned or not).</p> | 
 |  | 
 |         <p>Please note that simply removing the linger code will | 
 |         not solve the problem -- it only moves it to a different | 
 |         and much harder one to detect.</p> | 
 |      | 
 |   </div></div> | 
 | <div id="footer"> | 
 | <p class="apache">Copyright 1995-2005 The Apache Software Foundation or its licensors, as applicable.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p> | 
 | </div> | 
 | </body></html> |