| <?xml version="1.0" encoding="UTF-8" ?> |
| <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd"> |
| <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?> |
| <!-- $LastChangedRevision$ --> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <manualpage metafile="advanced.xml.meta"> |
| <parentdocument href="./">Rewrite</parentdocument> |
| |
| <title>Advanced Techniques with mod_rewrite</title> |
| |
| <summary> |
| |
| <p>This document supplements the <module>mod_rewrite</module> |
| <a href="../mod/mod_rewrite.html">reference documentation</a>. It provides |
| a few advanced techniques using mod_rewrite.</p> |
| |
| <!-- |
| I question whether anything remailing in this document qualifies as |
| "advanced". It's probably time to take inventory of the examples that we |
| have in the various docs, and consider a reorg of the stuff in this |
| directory. Again. |
| --> |
| |
| <note type="warning">Note that many of these examples won't work unchanged in your |
| particular server configuration, so it's important that you understand |
| them, rather than merely cutting and pasting the examples into your |
| configuration.</note> |
| |
| </summary> |
| <seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso> |
| <seealso><a href="intro.html">mod_rewrite introduction</a></seealso> |
| <seealso><a href="remapping.html">Redirection and remapping</a></seealso> |
| <seealso><a href="access.html">Controlling access</a></seealso> |
| <seealso><a href="vhosts.html">Virtual hosts</a></seealso> |
| <seealso><a href="proxy.html">Proxying</a></seealso> |
| <seealso><a href="rewritemap.html">Using RewriteMap</a></seealso> |
| <!--<seealso><a href="advanced.html">Advanced techniques</a></seealso>--> |
| <seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso> |
| |
| <section id="sharding"> |
| |
| <title>URL-based sharding across multiple backends</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>A common technique for distributing the burden of |
| server load or storage space is called "sharding". |
| When using this method, a front-end server will use the |
| url to consistently "shard" users or objects to separate |
| backend servers.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>A mapping is maintained, from users to target servers, in |
| external map files. They look like:</p> |
| |
| <example> |
| user1 physical_host_of_user1<br /> |
| user2 physical_host_of_user2<br /> |
| : : |
| </example> |
| |
| <p>We put this into a <code>map.users-to-hosts</code> file. The |
| aim is to map;</p> |
| |
| <example> |
| /u/user1/anypath |
| </example> |
| |
| <p>to</p> |
| |
| <example> |
| http://physical_host_of_user1/u/user/anypath |
| </example> |
| |
| <p>thus every URL path need not be valid on every backend physical |
| host. The following ruleset does this for us with the help of the map |
| files assuming that server0 is a default server which will be used if |
| a user has no entry in the map:</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| RewriteMap users-to-hosts "txt:/path/to/map.users-to-hosts" |
| RewriteRule "^/u/([^/]+)/?(.*)" "http://${users-to-hosts:$1|server0}/u/$1/$2" |
| </highlight> |
| </dd> |
| </dl> |
| |
| <p>See the <directive module="mod_rewrite">RewriteMap</directive> |
| documentation for more discussion of the syntax of this directive.</p> |
| |
| </section> |
| |
| <section id="on-the-fly-content"> |
| |
| <title>On-the-fly Content-Regeneration</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to dynamically generate content, but store it |
| statically once it is generated. This rule will check for the |
| existence of the static file, and if it's not there, generate |
| it. The static files can be removed periodically, if desired (say, |
| via cron) and will be regenerated on demand.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| This is done via the following ruleset: |
| |
| <highlight language="config"> |
| # This example is valid in per-directory context only |
| RewriteCond "%{REQUEST_URI}" !-U |
| RewriteRule "^(.+)\.html$" "/regenerate_page.cgi" [PT,L] |
| </highlight> |
| |
| <p>The <code>-U</code> operator determines whether the test string |
| (in this case, <code>REQUEST_URI</code>) is a valid URL. It does |
| this via a subrequest. In the event that this subrequest fails - |
| that is, the requested resource doesn't exist - this rule invokes |
| the CGI program <code>/regenerate_page.cgi</code>, which generates |
| the requested resource and saves it into the document directory, so |
| that the next time it is requested, a static copy can be served.</p> |
| |
| <p>In this way, documents that are infrequently updated can be served in |
| static form. if documents need to be refreshed, they can be deleted |
| from the document directory, and they will then be regenerated the |
| next time they are requested.</p> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="load-balancing"> |
| |
| <title>Load Balancing</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to randomly distribute load across several servers |
| using mod_rewrite.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We'll use <directive |
| module="mod_rewrite">RewriteMap</directive> and a list of servers |
| to accomplish this.</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| RewriteMap lb "rnd:/path/to/serverlist.txt" |
| RewriteRule "^/(.*)" "http://${lb:servers}/$1" [P,L] |
| </highlight> |
| |
| <p><code>serverlist.txt</code> will contain a list of the servers:</p> |
| |
| <example> |
| ## serverlist.txt<br /> |
| <br /> |
| servers one.example.com|two.example.com|three.example.com<br /> |
| </example> |
| |
| <p>If you want one particular server to get more of the load than the |
| others, add it more times to the list.</p> |
| |
| </dd> |
| |
| <dt>Discussion</dt> |
| <dd> |
| <p>Apache comes with a load-balancing module - |
| <module>mod_proxy_balancer</module> - which is far more flexible and |
| featureful than anything you can cobble together using mod_rewrite.</p> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="structuredhomedirs"> |
| |
| <title>Structured Userdirs</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>Some sites with thousands of users use a |
| structured homedir layout, <em>i.e.</em> each homedir is in a |
| subdirectory which begins (for instance) with the first |
| character of the username. So, <code>/~larry/anypath</code> |
| is <code>/home/<strong>l</strong>/larry/public_html/anypath</code> |
| while <code>/~waldo/anypath</code> is |
| <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We use the following ruleset to expand the tilde URLs |
| into the above layout.</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| RewriteRule "^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*)" "/home/<strong>$2</strong>/$1/public_html$3" |
| </highlight> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="redirectanchors"> |
| |
| <title>Redirecting Anchors</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>By default, redirecting to an HTML anchor doesn't work, |
| because mod_rewrite escapes the <code>#</code> character, |
| turning it into <code>%23</code>. This, in turn, breaks the |
| redirection.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the <code>[NE]</code> flag on the |
| <code>RewriteRule</code>. NE stands for No Escape. |
| </p> |
| </dd> |
| |
| <dt>Discussion:</dt> |
| <dd>This technique will of course also work with other |
| special characters that mod_rewrite, by default, URL-encodes.</dd> |
| </dl> |
| |
| </section> |
| |
| <section id="time-dependent"> |
| |
| <title>Time-Dependent Rewriting</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to use mod_rewrite to serve different content based on |
| the time of day.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>There are a lot of variables named <code>TIME_xxx</code> |
| for rewrite conditions. In conjunction with the special |
| lexicographic comparison patterns <code><STRING</code>, |
| <code>>STRING</code> and <code>=STRING</code> we can |
| do time-dependent redirects:</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| RewriteCond "%{TIME_HOUR}%{TIME_MIN}" >0700 |
| RewriteCond "%{TIME_HOUR}%{TIME_MIN}" <1900 |
| RewriteRule "^foo\.html$" "foo.day.html" [L] |
| RewriteRule "^foo\.html$" "foo.night.html" |
| </highlight> |
| |
| <p>This provides the content of <code>foo.day.html</code> |
| under the URL <code>foo.html</code> from |
| <code>07:01-18:59</code> and at the remaining time the |
| contents of <code>foo.night.html</code>.</p> |
| |
| <note type="warning"><module>mod_cache</module>, intermediate proxies |
| and browsers may each cache responses and cause the either page to be |
| shown outside of the time-window configured. |
| <module>mod_expires</module> may be used to control this |
| effect. You are, of course, much better off simply serving the |
| content dynamically, and customizing it based on the time of day.</note> |
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="setenvvars"> |
| |
| <title>Set Environment Variables Based On URL Parts</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>At time, we want to maintain some kind of status when we |
| perform a rewrite. For example, you want to make a note that |
| you've done that rewrite, so that you can check later to see if a |
| request can via that rewrite. One way to do this is by setting an |
| environment variable.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the [E] flag to set an environment variable.</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| RewriteRule "^/horse/(.*)" "/pony/$1" [E=<strong>rewritten:1</strong>] |
| </highlight> |
| |
| <p>Later in your ruleset you might check for this environment |
| variable using a RewriteCond:</p> |
| |
| <highlight language="config"> |
| RewriteCond "%{ENV:rewritten}" =1 |
| </highlight> |
| |
| <p>Note that environment variables do not survive an external |
| redirect. You might consider using the [CO] flag to set a |
| cookie.</p> |
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| </manualpage> |