| <?xml version="1.0" encoding="UTF-8" ?> |
| <!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd"> |
| <?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?> |
| <!-- $LastChangedRevision$ --> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <manualpage metafile="compliance.xml.meta"> |
| |
| <title>HTTP Protocol Compliance</title> |
| |
| <summary> |
| <p>This document describes the mechanism for setting a policy for HTTP |
| protocol compliance that the origin servers or applications behind a |
| given URL space must meet.</p> |
| |
| <p>For those who may have received an error message from a rejected |
| policy, and need to know what the policy rejection means and what |
| they might do to fix the error, each policy is described below.</p> |
| </summary> |
| <seealso><a href="filter.html">Filters</a></seealso> |
| |
| <section id="intro"> |
| <title>Enforcing HTTP Protocol Compliance in Apache 2</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyConditional</directive> |
| <directive module="mod_policy">PolicyLength</directive> |
| <directive module="mod_policy">PolicyKeepalive</directive> |
| <directive module="mod_policy">PolicyType</directive> |
| <directive module="mod_policy">PolicyVary</directive> |
| <directive module="mod_policy">PolicyValidation</directive> |
| <directive module="mod_policy">PolicyNocache</directive> |
| <directive module="mod_policy">PolicyMaxage</directive> |
| <directive module="mod_policy">PolicyVersion</directive> |
| </directivelist> |
| </related> |
| |
| <p>The HTTP protocol follows the <strong>robustness principle</strong> |
| as described in <a href="http://tools.ietf.org/html/rfc1122">RFC1122</a>, |
| which states <strong>"Be liberal in what you accept, and conservative in |
| what you send"</strong>. As a result of this principle, HTTP clients will |
| compensate for and recover from incorrect or misconfigured responses, or |
| responses that are uncacheable.</p> |
| |
| <p>As a website is scaled up to face greater and greater traffic loads, |
| suboptimal or misconfigured applications or server configurations can |
| threaten both the stability and scalability of the website, as well as |
| the hosting costs associated with it. A website can also scale up to face |
| greater configuration complexity, and it can be increasingly difficult to |
| detect and keep track of suboptimally configured URL spaces on a given |
| server.</p> |
| |
| <p>Eventually a point is reached where the principle "conservative in |
| what you send" needs to be enforced by the server administrator.</p> |
| |
| <p>The <module>mod_policy</module> module provides a set of filters |
| which can be applied to a server, allowing key features of the HTTP |
| protocol to be explicitly tested, and non-compliant responses logged as |
| warnings, or rejected outright as an error. Each filter can be applied |
| separately, allowing the administrator to pick and choose which policies |
| should be enforced depending on the circumstances of their environment. |
| </p> |
| |
| <p>The filters might be placed in testing and staging environments for |
| the benefit of application and website developers, or may be applied |
| to production servers to protect infrastructure from systems outside |
| the administrator's direct control.</p> |
| |
| <p class="figure"> |
| <img src="images/compliance-reverse-proxy.png" width="666" height="239" alt= |
| "Enforcing HTTP protocol compliance for an application server"/> |
| </p> |
| |
| <p>In the above example, an Apache httpd server has been placed between |
| the application server and the internet at large, and configured to cache |
| responses from the application server. The <module>mod_policy</module> |
| filters have been added to enforce support for cacheable content and |
| conditional requests, ensuring that both <module>mod_cache</module> and |
| public caches on the internet are fully able to cache content created |
| by the RESTful application server efficiently.</p> |
| |
| <p class="figure"> |
| <img src="images/compliance-static.png" width="469" height="239" alt= |
| "Enforcing HTTP protocol compliance in a static server"/> |
| </p> |
| |
| <p>In the simpler example above, a static server serving highly cacheable |
| content has a set of policies applied to ensure that the server configuration |
| conforms to a minimum level of compliance.</p> |
| |
| </section> |
| |
| <section id="policyconditional"> |
| <title>Conditional Request Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyConditional</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server does not correctly respond |
| to a conditional request with the appropriate status code.</p> |
| |
| <p>Conditional requests form the mechanism by which an HTTP cache makes |
| stale content fresh again, and particularly for content with short freshness |
| lifetimes, lack of support for conditional requests can add avoidable load |
| to the server.</p> |
| |
| <p>Specifically, the presence of any of the following headers in the |
| request makes the request conditional:</p> |
| |
| <dl> |
| <dt><code>If-Match</code></dt> |
| <dd>If the provided ETag in the <code>If-Match</code> header does not match |
| the ETag of the response, the server should return |
| <code>412 Precondition Failed</code>. Full details of how to handle an |
| <code>If-Match</code> header can be found in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24"> |
| RFC2616 section 14.24</a>.</dd> |
| |
| <dt><code>If-None-Match</code></dt> |
| <dd>If the provided ETag in the <code>If-None-Match</code> header matches |
| the ETag of the response, the server should return either |
| <code>304 Not Modified</code> for GET/HEAD requests, or |
| <code>412 Precondition Failed</code> for other methods. Full details of how |
| to handle an <code>If-None-Match</code> header can be found in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26"> |
| RFC2616 section 14.26</a>.</dd> |
| |
| <dt><code>If-Modified-Since</code></dt> |
| <dd>If the provided date in the <code>If-Modified-Since</code> header is |
| the same as or newer than the <code>Last-Modified</code> header of the response, the server |
| should return <code>304 Not Modified</code>. Full details of how to handle an |
| <code>If-Modified-Since</code> header can be found in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25"> |
| RFC2616 section 14.25</a>.</dd> |
| |
| <dt><code>If-Unmodified-Since</code></dt> |
| <dd>If the provided date in the <code>If-Unmodified-Since</code> header is |
| older than the <code>Last-Modified</code> header of the response, the server |
| should return <code>412 Precondition Failed</code>. Full details of how to |
| handle an <code>If-Unmodified-Since</code> header can be found in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28"> |
| RFC2616 section 14.28</a>.</dd> |
| |
| <dt><code>If-Range</code></dt> |
| <dd>If the provided ETag or date in the <code>If-Range</code> header matches |
| the ETag or Last-Modified of the response, and a valid <code>Range</code> |
| is present, the server should return |
| <code>206 Partial Content</code>. Full details of how to handle an |
| <code>If-Range</code> header can be found in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.27"> |
| RFC2616 section 14.27</a>.</dd> |
| |
| </dl> |
| |
| <p>If the request was conditional and the response is detected to have been |
| successful (a 2xx response) when one of the responses above was expected |
| instead, this policy will be rejected. Responses that indicate a redirect or a |
| failure of some kind (3xx, 4xx, 5xx) will be ignored by this policy.</p> |
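| |
| <p>As an illustration, and assuming a hypothetical resource |
| <code>/index.html</code> on a hypothetical host whose current ETag is |
| <code>"xyzzy"</code>, a compliant exchange for a conditional GET might |
| look as follows:</p> |
| |
| <example> |
| # request<br /> |
| GET /index.html HTTP/1.1<br /> |
| Host: www.example.com<br /> |
| If-None-Match: "xyzzy"<br /> |
| <br /> |
| # compliant response<br /> |
| HTTP/1.1 304 Not Modified<br /> |
| ETag: "xyzzy" |
| </example> |
| |
| <p>Under the same assumptions, a <code>200 OK</code> response carrying the |
| same ETag and a full body would cause this policy to be rejected.</p> |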
| |
| <p>This policy is implemented by the <strong>POLICY_CONDITIONAL</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policylength"> |
| <title>Content-Length Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyLength</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response does not contain |
| an explicit <code>Content-Length</code> header.</p> |
| |
| <p>There are a number of ways of determining the length of a response |
| body, described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4"> |
| RFC2616 section 4.4 Message Length</a>.</p> |
| |
| <p>When the <code>Content-Length</code> header is present, the size of |
| the body is declared at the start of the response. If this information |
| is missing, an HTTP cache might choose to ignore the response, as it |
| does not know in advance whether the response will fit within the |
| cache's defined limits.</p> |
| |
| <p>HTTP/1.1 defines the <code>Transfer-Encoding</code> header as an |
| alternative to <code>Content-Length</code>, allowing the end of the |
| response to be indicated to the client without the client having to |
| know the length beforehand. However, when HTTP/1.0 requests are |
| processed, and no <code>Content-Length</code> is specified, the only |
| mechanism available to the server to indicate the end of the response |
| is to drop the connection. In an environment containing load |
| balancers, this can cause the keepalive mechanism to be bypassed. |
| </p> |
| |
| <p>If the response is detected to have been successful (a 2xx response), |
| and has a response body (this excludes <code>204 No Content</code>), and |
| the <code>Content-Length</code> header is missing, this policy will be |
| rejected. Responses that indicate a redirect or a failure of some kind |
| (3xx, 4xx, 5xx) will be ignored by this policy.</p> |
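| |
| <p>For illustration, a successful response that satisfies this policy |
| declares the size of its body up front. The length value shown is |
| hypothetical:</p> |
| |
| <example> |
| HTTP/1.1 200 OK<br /> |
| Content-Type: text/html<br /> |
| Content-Length: 3495 |
| </example> |
| |
| <p>The same response sent with only <code>Transfer-Encoding: chunked</code>, |
| or with no length information at all, would cause this policy to be |
| rejected.</p> |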
| |
| <note type="warning">It should be noted that some modules, such as |
| <module>mod_proxy</module>, add their own <code>Content-Length</code> |
| header when a response lacking one is small enough to be read in one |
| go. This may cause small responses to pass this policy, while larger |
| responses fail for the same URL.</note> |
| |
| <p>This policy is implemented by the <strong>POLICY_LENGTH</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policytype"> |
| <title>Content-Type Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyType</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response does not contain |
| an explicit and syntactically correct <code>Content-Type</code> header |
| that matches the server defined pattern.</p> |
| |
| <p>The media type of the body is placed in the <code>Content-Type</code> |
| header, and the format of the header is described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7"> |
| RFC2616 section 3.7 Media Types</a>.</p> |
| |
| <p>A syntactically valid content type might look as follows:</p> |
| |
| <example> |
| Content-Type: text/html; charset=iso-8859-1 |
| </example> |
| |
| <p>Invalid content types might include:</p> |
| |
| <example> |
| # invalid<br /> |
| Content-Type: foo<br /> |
| # blank<br /> |
| Content-Type: |
| </example> |
| |
| <p>The server administrator has the option to restrict the policy to one |
| or more specific types, or could specify a general wildcard type such as |
| <code>*/*</code>.</p> |
| |
| <p>This policy is implemented by the <strong>POLICY_TYPE</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policykeepalive"> |
| <title>Keepalive Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyKeepalive</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response does not contain |
| an explicit <code>Content-Length</code> header, or a |
| <code>Transfer-Encoding</code> of chunked.</p> |
| |
| <p>There are a number of ways of determining the length of a response |
| body, described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4"> |
| RFC2616 section 4.4 Message Length</a>.</p> |
| |
| <p>When the <code>Content-Length</code> header is present, the size of |
| the body is declared at the start of the response. HTTP/1.1 defines the |
| <code>Transfer-Encoding</code> header as an alternative to |
| <code>Content-Length</code>, allowing the end of the response to be |
| indicated to the client without the client having to know the length |
| beforehand. In the absence of these two mechanisms, the only way for |
| a server to indicate the end of the response is to drop the connection. |
| In an environment containing load balancers, this can cause the keepalive |
| mechanism to be bypassed. |
| </p> |
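| |
| <p>As a sketch, either of the following response headers gives the body |
| a defined length and so leaves keepalive possible. The length value is |
| hypothetical:</p> |
| |
| <example> |
| # explicit length<br /> |
| Content-Length: 3495<br /> |
| # chunked encoding, defined by HTTP/1.1<br /> |
| Transfer-Encoding: chunked |
| </example> |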
| |
| <p>Most specifically, we follow these rules:</p> |
| |
| <dl> |
| <dt>IF</dt> |
| <dd>we have not marked this connection as errored;</dd> |
| |
| <dt>and</dt> |
| <dd>the client isn't expecting 100-continue;</dd> |
| |
| <dt>and</dt> |
| <dd>the response status does not require a close;</dd> |
| |
| <dt>and</dt> |
| <dd>the response body has a defined length, due to the status code |
| being 304 or 204, the request method being HEAD, <code>Content-Length</code> |
| or <code>Transfer-Encoding: chunked</code> already being defined, or the |
| request version being HTTP/1.1 and the response thus capable of being chunked;</dd> |
| |
| <dt>THEN</dt> |
| <dd>we support keepalive.</dd> |
| </dl> |
| |
| <note type="warning">The server may choose to turn off keepalive for |
| various reasons, such as an imminent shutdown, or a Connection: close from |
| the client, or an HTTP/1.0 client request with a response with no |
| <code>Content-Length</code>, but for our purposes we only care that |
| keepalive was possible from the application, not that keepalive actually |
| took place.</note> |
| |
| <p>It should also be noted that the Apache httpd server includes a filter |
| that adds chunked encoding to responses without an explicit content |
| length. This policy catches those cases where this filter is bypassed or |
| not in effect.</p> |
| |
| <p>This policy is implemented by the <strong>POLICY_KEEPALIVE</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policymaxage"> |
| <title>Freshness Lifetime / Maxage Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyMaxage</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response does not have |
| an explicit <strong>freshness lifetime</strong> at least as long |
| as the server defined limit, or if the freshness lifetime is |
| calculated based on a heuristic.</p> |
| |
| <p>The calculation of a freshness lifetime is described in |
| full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2"> |
| RFC2616 section 13.2 Expiration Model</a>.</p> |
| |
| <p>During the freshness lifetime, a cache does not need to contact the |
| origin server at all; it can simply pass the cached content as is back |
| to the client.</p> |
| |
| <p>When the freshness lifetime has expired, the cache should contact the |
| origin server in an effort to check whether the content is still fresh, |
| and if not, replace the content.</p> |
| |
| <p>When the freshness lifetime is too short, it can result in excessive |
| load on the server. In addition, should an outage occur that lasts as long |
| as or longer than the freshness lifetime, all cached content will become |
| stale, which could cause a thundering herd of traffic when the |
| server or network returns.</p> |
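| |
| <p>As an illustration, assuming a hypothetical server defined limit of |
| one day (86400 seconds), a response declaring the following explicit |
| freshness lifetime would pass this policy:</p> |
| |
| <example> |
| Cache-Control: max-age=86400 |
| </example> |
| |
| <p>Under the same assumption, a response with |
| <code>Cache-Control: max-age=60</code>, or one carrying only a |
| <code>Last-Modified</code> header and so leaving a cache to fall back on |
| a heuristic, would be expected to cause the policy to be rejected.</p> |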
| |
| <p>This policy is implemented by the <strong>POLICY_MAXAGE</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policynocache"> |
| <title>No Cache Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyNocache</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response declares itself |
| uncacheable using either the <code>Cache-Control</code> or |
| <code>Pragma</code> headers.</p> |
| |
| <p>How content may be declared uncacheable is described in |
| full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1"> |
| RFC2616 section 14.9.1 What is Cacheable</a>, and within the definition |
| for the <code>Pragma</code> header in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32"> |
| RFC2616 section 14.32 Pragma</a>.</p> |
| |
| <p>Specifically, should any of the following headers and values |
| exist in the response headers, the response will be rejected:</p> |
| |
| <ul> |
| <li><code>Cache-Control: no-cache</code></li> |
| <li><code>Cache-Control: no-store</code></li> |
| <li><code>Cache-Control: private</code></li> |
| <li><code>Pragma: no-cache</code></li> |
| </ul> |
| |
| <p>When unexpected, uncacheable content may produce unacceptable levels |
| of server load, or may incur significant cost. When this policy is enabled, |
| all server defined uncacheable content will be rejected.</p> |
| |
| <p>This policy is implemented by the <strong>POLICY_NOCACHE</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policyvalidation"> |
| <title>Validation Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyValidation</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response does not contain |
| either a syntactically correct <code>ETag</code> or |
| <code>Last-Modified</code> header.</p> |
| |
| <p>The <code>ETag</code> header is described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19"> |
| RFC2616 section 14.19 ETag</a>, and the <code>Last-Modified</code> header |
| is described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29"> |
| RFC2616 section 14.29 Last-Modified</a>.</p> |
| |
| <p>In addition to being checked for presence, the headers are checked for |
| syntax.</p> |
| |
| <p>An <code>ETag</code> that is not a quoted string, optionally prefixed |
| with "W/" to declare it "weak", will cause the policy to be |
| rejected. A <code>Last-Modified</code> that is not parsed as a valid date |
| will cause the policy to be rejected.</p> |
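| |
| <p>For illustration, validators that are syntactically valid might look |
| as follows. The values are hypothetical:</p> |
| |
| <example> |
| # strong ETag<br /> |
| ETag: "xyzzy"<br /> |
| # weak ETag<br /> |
| ETag: W/"xyzzy"<br /> |
| # valid date<br /> |
| Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT |
| </example> |
| |
| <p>Values that would be expected to cause the policy to be rejected |
| might include:</p> |
| |
| <example> |
| # unquoted ETag<br /> |
| ETag: xyzzy<br /> |
| # unparseable date<br /> |
| Last-Modified: yesterday |
| </example> |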
| |
| <p>This policy is implemented by the <strong>POLICY_VALIDATION</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policyvary"> |
| <title>Vary Header Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyVary</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the server response contains a |
| <code>Vary</code> header, and that header in turn contains a header |
| forbidden by the administrator.</p> |
| |
| <p>The <code>Vary</code> header is described in full in |
| <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44"> |
| RFC2616 section 14.44 Vary</a>.</p> |
| |
| <p>Some client provided headers, such as <code>User-Agent</code>, |
| can contain thousands or millions of combinations of values over a period |
| of time, and if the response is declared cacheable, a cache might attempt |
| to cache each of these responses separately, filling up the cache and |
| crowding out other entries in the cache. In this scenario, if so |
| configured, the policy will reject the response.</p> |
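| |
| <p>As a sketch, assuming the administrator has forbidden |
| <code>User-Agent</code> and nothing else, the first header below would |
| cause the policy to be rejected, while the second would pass:</p> |
| |
| <example> |
| # rejected<br /> |
| Vary: Accept-Encoding, User-Agent<br /> |
| # accepted<br /> |
| Vary: Accept-Encoding |
| </example> |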
| |
| <p>This policy is implemented by the <strong>POLICY_VARY</strong> |
| filter.</p> |
| |
| </section> |
| |
| <section id="policyversion"> |
| <title>Protocol Version Policy</title> |
| <related> |
| <modulelist> |
| <module>mod_policy</module> |
| </modulelist> |
| <directivelist> |
| <directive module="mod_policy">PolicyVersion</directive> |
| </directivelist> |
| </related> |
| |
| <p>This policy will be rejected if the client request was made with a |
| version number lower than the version of HTTP specified.</p> |
| |
| <p>This policy is typically used with RESTful applications where |
| control over the type of client is desired. This policy can be used |
| alongside the <code>POLICY_KEEPALIVE</code> filter to ensure that |
| HTTP/1.0 clients don't cause keepalive connections to be dropped.</p> |
| |
| <p>Possible minimum versions that could be specified are:</p> |
| |
| <ul><li><code>HTTP/1.1</code></li> |
| <li><code>HTTP/1.0</code></li> |
| <li><code>HTTP/0.9</code></li> |
| </ul> |
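| |
| <p>For illustration, assuming a configured minimum version of |
| <code>HTTP/1.1</code>, a request line such as the following would cause |
| the policy to be rejected. The path is hypothetical:</p> |
| |
| <example> |
| GET /index.html HTTP/1.0 |
| </example> |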
| |
| <p>This policy is implemented by the <strong>POLICY_VERSION</strong> |
| filter.</p> |
| |
| </section> |
| </manualpage> |