| <?xml version="1.0" encoding="UTF-8" ?> |
| <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd"> |
| <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?> |
| <!-- $LastChangedRevision$ --> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <manualpage metafile="avoid.xml.meta"> |
| <parentdocument href="./">Rewrite</parentdocument> |
| |
| <title>Advanced Techniques with mod_rewrite</title> |
| |
| <summary> |
| |
| <p>This document supplements the <module>mod_rewrite</module> |
| <a href="../mod/mod_rewrite.html">reference documentation</a>. It provides |
| a few advanced techniques and tricks using mod_rewrite.</p> |
| |
| <note type="warning">Note that many of these examples won't work unchanged in your |
| particular server configuration, so it's important that you understand |
| them, rather than merely cutting and pasting the examples into your |
| configuration.</note> |
| |
| </summary> |
| <seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso> |
| <seealso><a href="intro.html">mod_rewrite introduction</a></seealso> |
| <seealso><a href="remapping.html">Redirection and remapping</a></seealso> |
| <seealso><a href="access.html">Controlling access</a></seealso> |
| <seealso><a href="vhosts.html">Virtual hosts</a></seealso> |
| <seealso><a href="proxy.html">Proxying</a></seealso> |
| <seealso><a href="rewritemap.html">Using RewriteMap</a></seealso> |
| <!--<seealso><a href="advanced.html">Advanced techniques and tricks</a></seealso>--> |
| <seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso> |
| |
| <section id="sharding"> |
| |
| <title>URL-based sharding accross multiple backends</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>A common technique for distributing the burden of |
| server load or storage space is called "sharding". |
| When using this method, a front-end server will use the |
| url to consistently "shard" users or objects to separate |
| backend servers.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>A mapping is maintained, from users to target servers, in |
| external map files. They look like:</p> |
| |
| <example> |
| user1 physical_host_of_user1<br /> |
| user2 physical_host_of_user2<br /> |
| : : |
| </example> |
| |
| <p>We put this into a <code>map.users-to-hosts</code> file. The |
| aim is to map;</p> |
| |
| <example> |
| /u/user1/anypath |
| </example> |
| |
| <p>to</p> |
| |
| <example> |
| http://physical_host_of_user1/u/user/anypath |
| </example> |
| |
| <p>thus every URL path need not be valid on every backend physical |
| host. The following ruleset does this for us with the help of the map |
| files assuming that server0 is a default server which will be used if |
| a user has no entry in the map:</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| <br /> |
| RewriteMap users-to-hosts txt:/path/to/map.users-to-hosts<br /> |
| <br /> |
| RewriteRule ^/u/<strong>([^/]+)</strong>/?(.*) http://<strong>${users-to-hosts:$1|server0}</strong>/u/$1/$2 |
| </example> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="on-the-fly-content"> |
| |
| <title>On-the-fly Content-Regeneration</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to dynamically generate content, but store it |
| statically once it is generated. This rule will check for the |
| existence of the static file, and if it's not there, generate |
| it. The static files can be removed periodically, if desired (say, |
| via cron) and will be regenerated on demand.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| This is done via the following ruleset: |
| |
| <example> |
| # This example is valid in per-directory context only<br /> |
| RewriteCond %{REQUEST_FILENAME} <strong>!-s</strong><br /> |
| RewriteRule ^page\.<strong>html</strong>$ page.<strong>cgi</strong> [T=application/x-httpd-cgi,L] |
| </example> |
| |
| <p>Here a request for <code>page.html</code> leads to an |
| internal run of a corresponding <code>page.cgi</code> if |
| <code>page.html</code> is missing or has filesize |
| null. The trick here is that <code>page.cgi</code> is a |
| CGI script which (additionally to its <code>STDOUT</code>) |
| writes its output to the file <code>page.html</code>. |
| Once it has completed, the server sends out |
| <code>page.html</code>. When the webmaster wants to force |
| a refresh of the contents, he just removes |
| <code>page.html</code> (typically from <code>cron</code>).</p> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="load-balancing"> |
| |
| <title>Load Balancing</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to randomly distribute load across several servers |
| using mod_rewrite.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We'll use <directive |
| module="mod_rewrite">RewriteMap</directive> and a list of servers |
| to accomplish this.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteMap lb rnd:/path/to/serverlist.txt<br /> |
| <br /> |
| RewriteRule ^/(.*) http://${lb:servers}/$1 [P,L] |
| </example> |
| |
| <p><code>serverlist.txt</code> will contain a list of the servers:</p> |
| |
| <example> |
| ## serverlist.txt<br /> |
| <br /> |
| servers one.example.com|two.example.com|three.example.com<br /> |
| </example> |
| |
| <p>If you want one particular server to get more of the load than the |
| others, add it more times to the list.</p> |
| |
| </dd> |
| |
| <dt>Discussion</dt> |
| <dd> |
| <p>Apache comes with a load-balancing module - |
| <module>mod_proxy_balancer</module> - which is far more flexible and |
| featureful than anything you can cobble together using mod_rewrite.</p> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="autorefresh"> |
| |
| <title>Document With Autorefresh</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>Wouldn't it be nice, while creating a complex web page, if |
| the web browser would automatically refresh the page every |
| time we save a new version from within our editor? |
| Impossible?</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>No! We just combine the MIME multipart feature, the |
| web server NPH feature, and the URL manipulation power of |
| <module>mod_rewrite</module>. First, we establish a new |
| URL feature: Adding just <code>:refresh</code> to any |
| URL causes the 'page' to be refreshed every time it is |
| updated on the filesystem.</p> |
| |
| <example> |
| RewriteRule ^(/[uge]/[^/]+/?.*):refresh /internal/cgi/apache/nph-refresh?f=$1 |
| </example> |
| |
| <p>Now when we reference the URL</p> |
| |
| <example> |
| /u/foo/bar/page.html:refresh |
| </example> |
| |
| <p>this leads to the internal invocation of the URL</p> |
| |
| <example> |
| /internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html |
| </example> |
| |
| <p>The only missing part is the NPH-CGI script. Although |
| one would usually say "left as an exercise to the reader" |
| ;-) I will provide this, too.</p> |
| |
| <example><pre> |
| #!/sw/bin/perl |
| ## |
| ## nph-refresh -- NPH/CGI script for auto refreshing pages |
| ## Copyright (c) 1997 Ralf S. Engelschall, All Rights Reserved. |
| ## |
| $| = 1; |
| |
| # split the QUERY_STRING variable |
| @pairs = split(/&/, $ENV{'QUERY_STRING'}); |
| foreach $pair (@pairs) { |
| ($name, $value) = split(/=/, $pair); |
| $name =~ tr/A-Z/a-z/; |
| $name = 'QS_' . $name; |
| $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg; |
| eval "\$$name = \"$value\""; |
| } |
| $QS_s = 1 if ($QS_s eq ''); |
| $QS_n = 3600 if ($QS_n eq ''); |
| if ($QS_f eq '') { |
| print "HTTP/1.0 200 OK\n"; |
| print "Content-type: text/html\n\n"; |
| print "&lt;b&gt;ERROR&lt;/b&gt;: No file given\n"; |
| exit(0); |
| } |
| if (! -f $QS_f) { |
| print "HTTP/1.0 200 OK\n"; |
| print "Content-type: text/html\n\n"; |
| print "&lt;b&gt;ERROR&lt;/b&gt;: File $QS_f not found\n"; |
| exit(0); |
| } |
| |
| sub print_http_headers_multipart_begin { |
| print "HTTP/1.0 200 OK\n"; |
| $bound = "ThisRandomString12345"; |
| print "Content-type: multipart/x-mixed-replace;boundary=$bound\n"; |
| &print_http_headers_multipart_next; |
| } |
| |
| sub print_http_headers_multipart_next { |
| print "\n--$bound\n"; |
| } |
| |
| sub print_http_headers_multipart_end { |
| print "\n--$bound--\n"; |
| } |
| |
| sub displayhtml { |
| local($buffer) = @_; |
| $len = length($buffer); |
| print "Content-type: text/html\n"; |
| print "Content-length: $len\n\n"; |
| print $buffer; |
| } |
| |
| sub readfile { |
| local($file) = @_; |
| local(*FP, $size, $buffer, $bytes); |
| ($x, $x, $x, $x, $x, $x, $x, $size) = stat($file); |
| $size = sprintf("%d", $size); |
| open(FP, "&lt;$file"); |
| $bytes = sysread(FP, $buffer, $size); |
| close(FP); |
| return $buffer; |
| } |
| |
| $buffer = &readfile($QS_f); |
| &print_http_headers_multipart_begin; |
| &displayhtml($buffer); |
| |
| sub mystat { |
| local($file) = $_[0]; |
| local($time); |
| |
| ($x, $x, $x, $x, $x, $x, $x, $x, $x, $mtime) = stat($file); |
| return $mtime; |
| } |
| |
| $mtimeL = &mystat($QS_f); |
| $mtime = $mtime; |
| for ($n = 0; $n &lt; $QS_n; $n++) { |
| while (1) { |
| $mtime = &mystat($QS_f); |
| if ($mtime ne $mtimeL) { |
| $mtimeL = $mtime; |
| sleep(2); |
| $buffer = &readfile($QS_f); |
| &print_http_headers_multipart_next; |
| &displayhtml($buffer); |
| sleep(5); |
| $mtimeL = &mystat($QS_f); |
| last; |
| } |
| sleep($QS_s); |
| } |
| } |
| |
| &print_http_headers_multipart_end; |
| |
| exit(0); |
| |
| ##EOF## |
| </pre></example> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="structuredhomedirs"> |
| |
| <title>Structured Userdirs</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>Some sites with thousands of users use a |
| structured homedir layout, <em>i.e.</em> each homedir is in a |
| subdirectory which begins (for instance) with the first |
| character of the username. So, <code>/~larry/anypath</code> |
| is <code>/home/<strong>l</strong>/larry/public_html/anypath</code> |
| while <code>/~waldo/anypath</code> is |
| <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We use the following ruleset to expand the tilde URLs |
| into the above layout.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/public_html$3 |
| </example> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="redirectanchors"> |
| |
| <title>Redirecting Anchors</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>By default, redirecting to an HTML anchor doesn't work, |
| because mod_rewrite escapes the <code>#</code> character, |
| turning it into <code>%23</code>. This, in turn, breaks the |
| redirection.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the <code>[NE]</code> flag on the |
| <code>RewriteRule</code>. NE stands for No Escape. |
| </p> |
| </dd> |
| |
| <dt>Discussion:</dt> |
| <dd>This technique will of course also work with other |
| special characters that mod_rewrite, by default, URL-encodes.</dd> |
| </dl> |
| |
| </section> |
| |
| <section id="time-dependent"> |
| |
| <title>Time-Dependent Rewriting</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to use mod_rewrite to serve different content based on |
| the time of day.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>There are a lot of variables named <code>TIME_xxx</code> |
| for rewrite conditions. In conjunction with the special |
| lexicographic comparison patterns <code><STRING</code>, |
| <code>>STRING</code> and <code>=STRING</code> we can |
| do time-dependent redirects:</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700<br /> |
| RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900<br /> |
| RewriteRule ^foo\.html$ foo.day.html [L]<br /> |
| RewriteRule ^foo\.html$ foo.night.html |
| </example> |
| |
| <p>This provides the content of <code>foo.day.html</code> |
| under the URL <code>foo.html</code> from |
| <code>07:01-18:59</code> and at the remaining time the |
| contents of <code>foo.night.html</code>.</p> |
| |
| <note type="warning"><module>mod_cache</module>, intermediate proxies |
| and browsers may each cache responses and cause the either page to be |
| shown outside of the time-window configured. |
| <module>mod_expires</module> may be used to control this |
| effect. You are, of course, much better off simply serving the |
| content dynamically, and customizing it based on the time of day.</note> |
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="setenvvars"> |
| |
| <title>Set Environment Variables Based On URL Parts</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>At time, we want to maintain some kind of status when we |
| perform a rewrite. For example, you want to make a note that |
| you've done that rewrite, so that you can check later to see if a |
| request can via that rewrite. One way to do this is by setting an |
| environment variable.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the [E] flag to set an environment variable.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteRule ^/horse/(.*) /pony/$1 [E=<strong>rewritten:1</strong>] |
| </example> |
| |
| <p>Later in your ruleset you might check for this environment |
| variable using a RewriteCond:</p> |
| |
| <example> |
| RewriteCond %{ENV:rewritten} =1 |
| </example> |
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| </manualpage> |