| <?xml version='1.0' encoding='UTF-8' ?> |
| <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd"> |
| <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?> |
| <!-- $LastChangedRevision$ --> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <manualpage metafile="tech.xml.meta"> |
| <parentdocument href="./">Rewrite</parentdocument> |
| |
| <title>Apache mod_rewrite Technical Details</title> |
| |
| <summary> |
| <p>This document discusses some of the technical details of mod_rewrite |
| and URL matching.</p> |
| </summary> |
| <seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso> |
| <seealso><a href="intro.html">mod_rewrite introduction</a></seealso> |
| <seealso><a href="remapping.html">Redirection and remapping</a></seealso> |
| <seealso><a href="access.html">Controlling access</a></seealso> |
| <seealso><a href="vhosts.html">Virtual hosts</a></seealso> |
| <seealso><a href="proxy.html">Proxying</a></seealso> |
| <seealso><a href="rewritemap.html">Using RewriteMap</a></seealso> |
| <seealso><a href="advanced.html">Advanced techniques</a></seealso> |
| <seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso> |
| |
| <section id="InternalAPI"><title>API Phases</title> |
| |
| <p>The Apache HTTP Server handles requests in several phases. At |
| each of these phases, one or more modules may be called upon to |
| handle that portion of the request lifecycle. Phases include things |
| like URL-to-filename translation, authentication, authorization, |
| content, and logging. (This is not an exhaustive list.)</p> |
| |
| <p>mod_rewrite acts in two of these phases (or "hooks", as they are |
| often called) to influence how URLs may be rewritten.</p> |
| |
| <p>First, it uses the URL-to-filename translation hook, which occurs |
| after the HTTP request has been read, but before any authorization |
| starts. Secondly, it uses the Fixup hook, which is after the |
| authorization phases, and after per-directory configuration files |
| (<code>.htaccess</code> files) have been read, but before the |
| content handler is called.</p> |
| |
| <p>After a request comes in and a corresponding server or |
| virtual host has been determined, the rewriting engine starts |
| processing any <code>mod_rewrite</code> directives appearing in the |
| per-server configuration. (i.e., in the main server configuration file |
| and <directive module="core" type="section">Virtualhost</directive> |
| sections.) This happens in the URL-to-filename phase.</p> |
| |
| <p>A few steps later, once the final data directories have been found, |
| the per-directory configuration directives (<code>.htaccess</code> |
| files and <directive module="core" |
| type="section">Directory</directive> blocks) are applied. This |
| happens in the Fixup phase.</p> |
| |
| <p>In each of these cases, mod_rewrite rewrites the |
| <code>REQUEST_URI</code> either to a new URL, or to a filename.</p> |
| |
| <p>In per-directory context (i.e., within <code>.htaccess</code> files |
| and <code>Directory</code> blocks), these rules are being applied |
| after a URL has already been translated to a filename. Because of |
| this, the URL-path that mod_rewrite initially compares <directive |
| module="mod_rewrite">RewriteRule</directive> directives against |
| is the full filesystem path to the translated filename with the current |
| directories path (including a trailing slash) removed from the front.</p> |
| |
| <p> To illustrate: If rules are in /var/www/foo/.htaccess and a request |
| for /foo/bar/baz is being processed, an expression like ^bar/baz$ would |
| match.</p> |
| |
| <p> If a substitution is made in per-directory context, a new internal |
| subrequest is issued with the new URL, which restarts processing of the |
| request phases. If the substitution is a relative path, the <directive |
| module="mod_rewrite">RewriteBase</directive> directive |
| determines the URL-path prefix prepended to the substitution. |
| In per-directory context, care must be taken to |
| create rules which will eventually (in some future "round" of per-directory |
| rewrite processing) not perform a substitution to avoid looping. |
| (See <a href="http://wiki.apache.org/httpd/RewriteLooping">RewriteLooping</a> |
| for further discussion of this problem.)</p> |
| |
| <p>Because of this further manipulation of the URL in per-directory |
| context, you'll need to take care to craft your rewrite rules |
| differently in that context. In particular, remember that the |
| leading directory path will be stripped off of the URL that your |
| rewrite rules will see. Consider the examples below for further |
| clarification.</p> |
| |
| <table border="1"> |
| |
| <tr> |
| <th>Location of rule</th> |
| <th>Rule</th> |
| </tr> |
| |
| <tr> |
| <td>VirtualHost section</td> |
| <td>RewriteRule "^/images/(.+)\.jpg" "/images/$1.gif"</td> |
| </tr> |
| |
| <tr> |
| <td>.htaccess file in document root</td> |
| <td>RewriteRule "^images/(.+)\.jpg" "images/$1.gif"</td> |
| </tr> |
| |
| <tr> |
| <td>.htaccess file in images directory</td> |
| <td>RewriteRule "^(.+)\.jpg" "$1.gif"</td> |
| </tr> |
| |
| </table> |
| |
| <p>For even more insight into how mod_rewrite manipulates URLs in |
| different contexts, you should consult the <a |
| href="../mod/mod_rewrite.html#logging">log entries</a> made during |
| rewriting.</p> |
| |
| </section> |
| |
| <section id="InternalRuleset"><title>Ruleset Processing</title> |
| |
| <p>Now when mod_rewrite is triggered in these two API phases, it |
| reads the configured rulesets from its configuration |
| structure (which itself was either created on startup for |
| per-server context or during the directory walk of the Apache |
| kernel for per-directory context). Then the URL rewriting |
| engine is started with the contained ruleset (one or more |
| rules together with their conditions). The operation of the |
| URL rewriting engine itself is exactly the same for both |
| configuration contexts. Only the final result processing is |
| different.</p> |
| |
| <p>The order of rules in the ruleset is important because the |
| rewriting engine processes them in a special (and not very |
| obvious) order. The rule is this: The rewriting engine loops |
| through the ruleset rule by rule (<directive |
| module="mod_rewrite">RewriteRule</directive> directives) and |
| when a particular rule matches it optionally loops through |
| existing corresponding conditions (<code>RewriteCond</code> |
| directives). For historical reasons the conditions are given |
| first, and so the control flow is a little bit long-winded. See |
| Figure 1 for more details.</p> |
| <p class="figure"> |
| <img src="../images/rewrite_process_uri.png" |
| alt="Flow of RewriteRule and RewriteCond matching" /><br /> |
| <dfn>Figure 1:</dfn>The control flow through the rewriting ruleset |
| </p> |
| <p>First the URL is matched against the |
| <em>Pattern</em> of each rule. If it fails, mod_rewrite |
| immediately stops processing this rule, and continues with the |
| next rule. If the <em>Pattern</em> matches, mod_rewrite looks |
| for corresponding rule conditions (RewriteCond directives, |
| appearing immediately above the RewriteRule in the configuration). |
| If none are present, it substitutes the URL with a new value, which is |
| constructed from the string <em>Substitution</em>, and goes on |
| with its rule-looping. But if conditions exist, it starts an |
| inner loop for processing them in the order that they are |
| listed. For conditions, the logic is different: we don't match |
| a pattern against the current URL. Instead we first create a |
| string <em>TestString</em> by expanding variables, |
| back-references, map lookups, <em>etc.</em> and then we try |
| to match <em>CondPattern</em> against it. If the pattern |
| doesn't match, the complete set of conditions and the |
| corresponding rule fails. If the pattern matches, then the |
| next condition is processed until no more conditions are |
| available. If all conditions match, processing is continued |
| with the substitution of the URL with |
| <em>Substitution</em>.</p> |
| |
| </section> |
| |
| |
| </manualpage> |