 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
 <TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
 <LINK REV="made" HREF="mailto:marc@apache.org">

 </HEAD>

 <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
 <BODY
  BGCOLOR="#FFFFFF"
  TEXT="#000000"
  LINK="#0000FF"
  VLINK="#000080"
  ALINK="#FF0000"
 >
 <!--#include virtual="header.html" -->

 <H1 ALIGN="CENTER">Connections in the FIN_WAIT_2 state and Apache</H1>
 <OL>
 <LI><H2>What is the FIN_WAIT_2 state?</H2>
 Starting with the Apache 1.2 betas, people are reporting many more
 connections in the FIN_WAIT_2 state (as reported by
 <code>netstat</code>) than they saw using older versions.  When the
 server closes a TCP connection, it sends a packet with the FIN bit
 set to the client, which then responds with a packet with the ACK bit
 set.  The client then sends a packet with the FIN bit set to the
 server, which responds with an ACK and the connection is closed.  The
 state that the connection is in during the period between when the
 server gets the ACK from the client and the server gets the FIN from
 the client is known as FIN_WAIT_2.  See the <A
 HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the
 technical details of the state transitions.<P>
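
 The transitions described above can be sketched as a small lookup
 table.  This is an illustration only: the event names and the
 <CODE>walk()</CODE> helper are made up for this page, and the model
 ignores simultaneous close and retransmission.  Per the TCP RFC, the
 closing side also passes through TIME_WAIT after it receives the
 client's FIN, before the connection is finally gone.<P>

 <PRE>
```python
# Simplified view of the closing side's TCP states (after RFC 793).
# Only the path described above is modelled; the event names are
# hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


# A client that ACKs the server's FIN but never sends its own FIN
# leaves the server's end of the connection parked in FIN_WAIT_2.
stuck = walk(["send FIN", "recv ACK"])
clean = walk(["send FIN", "recv ACK", "recv FIN"])
```
 </PRE>

 In this sketch <CODE>stuck</CODE> ends in FIN_WAIT_2 because the
 client's FIN never arrives, while <CODE>clean</CODE> runs on to
 TIME_WAIT.<P>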

 The FIN_WAIT_2 state is somewhat unusual in that there is no timeout
 defined in the standard for it.  This means that on many operating
 systems, a connection in the FIN_WAIT_2 state will stay around until
 the system is rebooted.  If the system does not have a timeout and
 too many FIN_WAIT_2 connections build up, it can fill up the space
 allocated for storing information about the connections and crash
 the kernel.  The connections in FIN_WAIT_2 do not tie up an httpd
 process.<P>
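
 To see whether connections are building up, it can help to count
 them by state.  The following is a rough sketch, not a supported
 tool: the sample text is a hypothetical excerpt of
 <CODE>netstat -an</CODE> output, and it assumes that the state is the
 last column of each TCP line, which may not hold on every system.<P>

 <PRE>
```python
from collections import Counter

# Hypothetical excerpt of `netstat -an` output; the real column layout
# varies between operating systems.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address      Foreign Address    (state)
tcp        0      0 10.0.0.1.80        192.0.2.7.1042     FIN_WAIT_2
tcp        0      0 10.0.0.1.80        192.0.2.9.1187     ESTABLISHED
tcp        0      0 10.0.0.1.80        192.0.2.7.1043     FIN_WAIT_2
tcp        0      0 *.80               *.*                LISTEN
"""


def count_states(netstat_output):
    """Count TCP connections per state, using the last column as the state."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


counts = count_states(SAMPLE)
print(counts["FIN_WAIT_2"])
```
 </PRE>

 Running the same count at intervals shows whether the number of
 FIN_WAIT_2 entries levels off (a timeout is at work) or keeps
 growing.<P>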

 <LI><H2>But why does it happen?</H2>

 There are several reasons for it happening, and not all of them are
 fully understood by the Apache team yet.  What is known follows.<P>

 <H3>Buggy clients and persistent connections</H3>

 Several clients have a bug which pops up when dealing with
 <A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
 When the connection is idle and the server closes the connection
 (based on the <A HREF="../mod/core.html#keepalivetimeout">
 KeepAliveTimeout</A>), the client is programmed so that it does
 not send back a FIN and ACK to the server.  This means that the
 connection stays in the FIN_WAIT_2 state until one of the following
 happens:<P>
 <UL>
         <LI>The client opens a new connection to the same or a different
             site, which causes it to fully close the older connection on
             that socket.
         <LI>The user exits the client, which on some (most?) clients
             causes the OS to fully shut down the connection.
         <LI>The FIN_WAIT_2 times out, on servers that have a timeout
             for this state.
 </UL><P>
 If you are lucky, this means that the buggy client will fully close the
 connection and release the resources on your server.  However, there
 are some cases where the socket is never fully closed, such as a dialup
 client disconnecting from their provider before closing the client.
 In addition, a client might sit idle for days without making another
 connection, and thus may hold its end of the socket open for days
 even though it has no further use for it.
 <STRONG>This is a bug in the browser or in its operating system's
 TCP implementation.</STRONG>  <P>

 The clients on which this problem has been verified to exist:<P>
 <UL>
         <LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
         <LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
         <LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
         <LI>MSIE 3.01 on the Macintosh
         <LI>MSIE 3.01 on Windows 95
 </UL><P>

 This does not appear to be a problem on:
 <UL>
         <LI>Mozilla/3.01 (Win95; I)
 </UL>
 <P>

 It is expected that many other clients have the same problem. What a
 client <STRONG>should do</STRONG> is periodically check its open
 socket(s) to see if they have been closed by the server, and close its
 side of the connection if the server has closed.  This check need only
 occur once every few seconds, and the closure may even be reported by an OS signal
 on some systems (e.g., Win95 and NT clients have this capability, but
 they seem to be ignoring it).<P>
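
 The check described above might look roughly like the following on
 the client side.  This is a sketch under assumptions, not code taken
 from any browser: the function names are invented here, and it relies
 on the usual socket behaviour that a closed peer makes the socket
 readable with a zero-length read.<P>

 <PRE>
```python
import select
import socket


def server_has_closed(sock):
    """Return True if the peer has closed its side of an idle connection."""
    readable, _, _ = select.select([sock], [], [], 0)  # poll, do not block
    if not readable:
        return False  # nothing pending: the connection is still open
    # Peek so that any real response data stays in the buffer.
    return sock.recv(1, socket.MSG_PEEK) == b""


def reap_idle(connections):
    """Close every idle connection the server has already closed."""
    still_open = []
    for sock in connections:
        if server_has_closed(sock):
            sock.close()  # sends our FIN, letting the server leave FIN_WAIT_2
        else:
            still_open.append(sock)
    return still_open
```
 </PRE>

 A client would call something like <CODE>reap_idle()</CODE> every few
 seconds on its pool of idle persistent connections.<P>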

 Apache <STRONG>cannot</STRONG> avoid these FIN_WAIT_2 states unless it
 disables persistent connections for the buggy clients, just
 like we recommend doing for Navigator 2.x clients due to other bugs.
 However, non-persistent connections increase the total number of
 connections needed per client and slow retrieval of an image-laden
 web page.  Since non-persistent connections consume resources of their
 own and incur a short waiting period after each closure, a busy server
 may need persistence in order to best serve its clients.<P>

 As far as we know, the client-caused FIN_WAIT_2 problem is present for
 all servers that support persistent connections, including Apache 1.1.x
 and 1.2.<P>

 <H3>Something in Apache may be broken</H3>

 While the above bug is a problem, it is not the whole problem.
 Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x,
 but with 1.2b enough connections build up in the FIN_WAIT_2 state to
 crash their server.  We have not yet identified why this would occur
 and welcome additional test input.<P>

 One possible (and most likely) source for additional FIN_WAIT_2 states
 is a function called <CODE>lingering_close()</CODE> which was added
 between 1.1 and 1.2.  This function is necessary for the proper
 handling of persistent connections and any request which includes
 content in the message body (e.g., PUTs and POSTs).
 What it does is read any data sent by the client for
 a certain time after the server closes the connection.  The exact
 reasons for doing this are somewhat complicated, but involve what
 happens if the client is making a request at the same time the
 server sends a response and closes the connection. Without lingering,
 the client might be forced to reset its TCP input buffer before it
 has a chance to read the server's response, and thus understand why
 the connection has closed.
 See the <A HREF="#appendix">appendix</A> for more details.<P>

 We have not yet tracked down the exact reason why
 <CODE>lingering_close()</CODE> causes problems.  Its code has been
 thoroughly reviewed and extensively updated in 1.2b6.  It is possible
 that there is some problem in the BSD TCP stack which is causing the
 observed problems.  It is also possible that we fixed it in 1.2b6.
 Unfortunately, we have not been able to replicate the problem on our
 test servers.<P>

 <LI><H2>What can I do about it?</H2>

 There are several possible workarounds to the problem, some of
 which work better than others.<P>

 <H3>Add a timeout for FIN_WAIT_2</H3>

 The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state.
 This is not specified by the RFC, and could be claimed to be a
 violation of the RFC, but it is widely recognized as being necessary.
 The following systems are known to have a timeout:
 <P>
 <UL>
         <LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at 2.0 or possibly earlier.
         <LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
         <LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
         <LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the
             <A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
             K210-027</A> patch installed.
         <LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
             2.2.  The timeout can be tuned by using <CODE>ndd</CODE> to
             modify <CODE>tcp_fin_wait_2_flush_interval</CODE>, but the
             default should be appropriate for most servers and improper
             tuning can have negative impacts.
         <LI><A HREF="http://www.sco.com/">SCO TCP/IP Release 1.2.1</A>
             can be modified to have a timeout by following
             <A HREF="http://www.sco.com/cgi-bin/waisgate?WAISdocID=2242622956+0+0+0&WAISaction=retrieve"> SCO's instructions</A>.
         <LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
             earlier(?)
         <LI><A HREF="http://www.hp.com/">HP-UX</A> 10.x defaults to
             terminating connections in the FIN_WAIT_2 state after the
             normal keepalive timeouts.  This does not
             refer to the persistent connection or HTTP keepalive
             timeouts, but the <CODE>SO_LINGER</CODE> socket option
             which is enabled by Apache.  This parameter can be adjusted
             by using <CODE>nettune</CODE> to modify parameters such as
             <CODE>tcp_keepstart</CODE> and <CODE>tcp_keepstop</CODE>.
             In later revisions, there is an explicit timer for
             connections in FIN_WAIT_2 that can be modified; contact HP
             support for details.
         <LI><A HREF="http://www.sgi.com/">SGI IRIX</A> can be patched to
             support a timeout.  For IRIX 5.3, 6.2, and 6.3,
             use patches 1654, 1703 and 1778 respectively.  If you
             have trouble locating these patches, please contact your
             SGI support channel for help.
         <LI><A HREF="http://www.ncr.com/">NCR's MP RAS Unix</A> 2.xx and
             3.xx both have FIN_WAIT_2 timeouts.  In 2.xx it is non-tunable
             at 600 seconds, while in 3.xx it defaults to 600 seconds and
             is calculated based on the tunable "max keep alive probes"
             (default of 8) multiplied by the "keep alive interval" (default
             75 seconds).
         <LI><A HREF="http://www.sequent.com">Sequent's ptx/TCP/IP for
             DYNIX/ptx</A> has had a FIN_WAIT_2 timeout since around
             release 4.1 in mid-1994.
 </UL>
 <P>
 The following systems are known to not have a timeout:
 <P>
 <UL>
         <LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
             almost certainly never will have one because it is at the
             very end of its development cycle for Sun.  If you have kernel
             source, it should be easy to patch.
 </UL>
 <P>
 There is a
 <A HREF="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch">
 patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
 was originally intended for BSD/OS, but should be adaptable to most
 systems using BSD networking code.  You need kernel source code to be
 able to use it.  If you do adapt it to work for any other systems,
 please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
 <P>
 <H3>Compile without using <CODE>lingering_close()</CODE></H3>

 It is possible to compile Apache 1.2 without using the
 <CODE>lingering_close()</CODE> function.  This will result in that
 section of code being similar to that which was in 1.1.  If you do
 this, be aware that it can cause problems with PUTs, POSTs and
 persistent connections, especially if the client uses pipelining.
 That said, it is no worse than on 1.1, and we understand that keeping your
 server running is quite important.<P>

 To compile without the <CODE>lingering_close()</CODE> function, add
 <CODE>-DNO_LINGCLOSE</CODE> to the end of the
 <CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
 rerun <CODE>Configure</CODE> and rebuild the server.
 <P>
 <H3>Use <CODE>SO_LINGER</CODE> as an alternative to
 <CODE>lingering_close()</CODE></H3>

 On most systems, there is an option called <CODE>SO_LINGER</CODE> that
 can be set with <CODE>setsockopt(2)</CODE>.  It does something very
 similar to <CODE>lingering_close()</CODE>, except that it is broken
 on many systems so that it causes far more problems than
 <CODE>lingering_close</CODE>.  On some systems, it could possibly work
 better so it may be worth a try if you have no other alternatives. <P>

 To try it, add <CODE>-DUSE_SO_LINGER -DNO_LINGCLOSE</CODE>  to the end of the
 <CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
 file, rerun <CODE>Configure</CODE> and rebuild the server.  <P>

 <STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
 <CODE>lingering_close()</CODE> at the same time is very likely to do
 very bad things, so don't.<P>
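
 For readers unfamiliar with the socket option itself, the sketch
 below shows what setting <CODE>SO_LINGER</CODE> looks like at the
 socket API level.  It is not Apache's code: the helper name and the
 30 second value are made up for illustration, and the two-integer
 layout of the linger structure is an assumption that holds on
 BSD-derived systems and Linux but not everywhere.<P>

 <PRE>
```python
import socket
import struct


def enable_so_linger(sock, seconds):
    """Ask the kernel to linger on close() for up to `seconds` seconds."""
    # struct linger { int l_onoff; int l_linger; } on BSD-derived systems
    # and Linux; this layout is an assumption and differs on Windows.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, seconds))


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_so_linger(sock, 30)  # 30 is a hypothetical value, not a recommendation

# Read the option back to confirm what the kernel recorded.
raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.calcsize("ii"))
l_onoff, l_linger = struct.unpack("ii", raw)
sock.close()
```
 </PRE>

 Whether the kernel then behaves usefully on close is exactly the part
 that varies between systems, as noted above.<P>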

 <H3>Increase the amount of memory used for storing connection state</H3>
 <DL>
 <DT>BSD based networking code:
 <DD>BSD stores network data, such as connection states,
 in something called an mbuf.  When you get so many connections
 that the kernel does not have enough mbufs to put them all in, your
 kernel will likely crash.  You can reduce the effects of the problem
 by increasing the number of mbufs that are available; this will not
 prevent the problem, it will just make the server go longer before
 crashing.<P>

 The exact way to increase them may depend on your OS; look
 for some reference to the number of "mbufs" or "mbuf clusters".  On
 many systems, this can be done by adding the line
 <CODE>NMBCLUSTERS="n"</CODE>, where <CODE>n</CODE> is the number of
 mbuf clusters you want, to your kernel config file and rebuilding your
 kernel.<P>
 </DL>

 <H3>Disable KeepAlive</H3>
 <P>If you are unable to do any of the above then you should, as a last
 resort, disable KeepAlive.  Edit your httpd.conf and change "KeepAlive On"
 to "KeepAlive Off".

 <LI><H2>Feedback</H2>

 If you have any information to add to this page, please contact me at
 <A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>

 <LI><H2><A NAME="appendix">Appendix</A></H2>
 <P>
 Below is a message from Roy Fielding, one of the authors of HTTP/1.1.

 <H3>Why the lingering close functionality is necessary with HTTP</H3>

 The need for a server to linger on a socket after a close is noted a couple
 times in the HTTP specs, but not explained.  This explanation is based on
 discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett,
 and John C. Mallery in the hallways of MIT while I was at W3C.<P>

 If a server closes the input side of the connection while the client
 is sending data (or is planning to send data), then the server's TCP
 stack will signal an RST (reset) back to the client.  Upon
 receipt of the RST, the client will flush its own incoming TCP buffer
 back to the un-ACKed packet indicated by the RST packet argument.
 If the server has sent a message, usually an error response, to the
 client just before the close, and the client receives the RST packet
 before its application code has read the error message from its incoming
 TCP buffer and before the server has received the ACK sent by the client
 upon receipt of that buffer, then the RST will flush the error message
 before the client application has a chance to see it. The result is
 that the client is left thinking that the connection failed for no
 apparent reason.<P>

 There are two conditions under which this is likely to occur:
 <OL>
 <LI>sending POST or PUT data without proper authorization
 <LI>sending multiple requests before each response (pipelining)
     and one of the middle requests resulting in an error or
     other break-the-connection result.
 </OL>
 <P>
 The solution in all cases is to send the response, close only the
 write half of the connection (what shutdown is supposed to do), and
 continue reading on the socket until it is either closed by the
 client (signifying it has finally read the response) or a timeout occurs.
 That is what the kernel is supposed to do if SO_LINGER is set.
 Unfortunately, SO_LINGER has no effect on some systems; on some other
 systems, it does not have its own timeout and thus the TCP memory
 segments just pile up until the next reboot (planned or not).<P>
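
 The sequence described in the message above (send the response, close
 only the write half, keep reading until the client closes or a timeout
 occurs) can be sketched in user space as follows.  This is a minimal
 illustration and not the <CODE>lingering_close()</CODE> source from
 Apache: the two second default and the 4096 byte read size are
 assumptions chosen for the example.<P>

 <PRE>
```python
import select
import socket
import time


def lingering_close(sock, timeout=2.0):
    """Half-close, then drain input until the peer closes or time runs out.

    Returns True if the peer closed first, False if the timeout expired.
    """
    sock.shutdown(socket.SHUT_WR)  # send FIN but keep the read side open
    deadline = time.monotonic() + timeout
    peer_closed = False
    while True:
        remaining = deadline - time.monotonic()
        if remaining > 0:
            readable, _, _ = select.select([sock], [], [], remaining)
        else:
            readable = []
        if not readable:
            break  # timed out waiting for the client
        if sock.recv(4096) == b"":
            peer_closed = True  # client has closed its side
            break
        # Anything else is late request data: discard it and keep waiting.
    sock.close()
    return peer_closed
```
 </PRE>

 Because late request data is read and discarded rather than left
 unread at close time, the client gets a chance to read the response
 that was sent before the half-close.<P>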

 Please note that simply removing the linger code will not solve the
 problem -- it only moves it to a different and much harder one to detect.
 </OL>
 <!--#include virtual="footer.html" -->
 </BODY>
 </HTML>