| <?xml version="1.0" encoding="UTF-8" ?> |
| <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd"> |
| <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?> |
| <!-- $LastChangedRevision$ --> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <manualpage metafile="advanced.xml.meta"> |
| <parentdocument href="./">Rewrite</parentdocument> |
| |
| <title>Advanced Techniques with mod_rewrite</title> |
| |
| <summary> |
| |
| <p>This document supplements the <module>mod_rewrite</module> |
| <a href="../mod/mod_rewrite.html">reference documentation</a>. It provides |
| a few advanced techniques and tricks using mod_rewrite.</p> |
| |
| <note type="warning">Note that many of these examples won't work unchanged in your |
| particular server configuration, so it's important that you understand |
| them, rather than merely cutting and pasting the examples into your |
| configuration.</note> |
| |
| </summary> |
| <seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso> |
| <seealso><a href="intro.html">mod_rewrite introduction</a></seealso> |
| <seealso><a href="remapping.html">Redirection and remapping</a></seealso> |
| <seealso><a href="access.html">Controlling access</a></seealso> |
| <seealso><a href="vhosts.html">Virtual hosts</a></seealso> |
| <seealso><a href="proxy.html">Proxying</a></seealso> |
| <seealso><a href="rewritemap.html">Using RewriteMap</a></seealso> |
| <!--<seealso><a href="advanced.html">Advanced techniques and tricks</a></seealso>--> |
| <seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso> |
| |
| <section id="sharding"> |
| |
| <title>URL-based sharding across multiple backends</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>A common technique for distributing the burden of |
| server load or storage space is called "sharding". |
| When using this method, a front-end server will use the |
| URL to consistently "shard" users or objects to separate |
| backend servers.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>A mapping is maintained, from users to target servers, in |
| external map files. They look like:</p> |
| |
| <example> |
| user1 physical_host_of_user1<br /> |
| user2 physical_host_of_user2<br /> |
| : : |
| </example> |
| |
| <p>We put this into a <code>map.users-to-hosts</code> file. The |
| aim is to map:</p> |
| |
| <example> |
| /u/user1/anypath |
| </example> |
| |
| <p>to</p> |
| |
| <example> |
| http://physical_host_of_user1/u/user1/anypath |
| </example> |
| |
| <p>so that a given URL path only needs to be valid on the one backend |
| host that serves that user. The following ruleset does this with the |
| help of the map file, assuming that server0 is a default server which |
| will be used if a user has no entry in the map:</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| <br /> |
| RewriteMap users-to-hosts txt:/path/to/map.users-to-hosts<br /> |
| <br /> |
| RewriteRule ^/u/<strong>([^/]+)</strong>/?(.*) http://<strong>${users-to-hosts:$1|server0}</strong>/u/$1/$2 |
| </example> |
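| |
| <p>As a worked illustration, assume a map file with the two hypothetical |
| entries below (the user and host names are made up for this sketch):</p> |
| |
| <example> |
| alice hostA.example.com<br /> |
| bob hostB.example.com |
| </example> |
| |
| <p>With the ruleset above, requests would then be redirected as follows. |
| The user <code>carol</code> has no entry in the map, so the default |
| <code>server0</code> is used:</p> |
| |
| <example> |
| /u/alice/docs/index.html -&gt; http://hostA.example.com/u/alice/docs/index.html<br /> |
| /u/bob/ -&gt; http://hostB.example.com/u/bob/<br /> |
| /u/carol/notes.txt -&gt; http://server0/u/carol/notes.txt |
| </example> |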
| </dd> |
| </dl> |
| |
| <p>See the <directive module="mod_rewrite">RewriteMap</directive> |
| documentation for more discussion of the syntax of this directive.</p> |
| |
| </section> |
| |
| <section id="on-the-fly-content"> |
| |
| <title>On-the-fly Content-Regeneration</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to dynamically generate content, but store it |
| statically once it is generated. This rule will check for the |
| existence of the static file, and if it's not there, generate |
| it. The static files can be removed periodically, if desired (say, |
| via cron) and will be regenerated on demand.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>This is done via the following ruleset:</p> |
| |
| <highlight language="config"> |
| # This example is valid in per-directory context only |
| RewriteCond %{REQUEST_URI} !-U |
| RewriteRule ^(.+)\.html$ /regenerate_page.cgi [PT,L] |
| </highlight> |
| |
| <p>The <code>-U</code> operator determines whether the test string |
| (in this case, <code>REQUEST_URI</code>) is a valid URL. It does |
| this via a subrequest. In the event that this subrequest fails - |
| that is, the requested resource doesn't exist - this rule invokes |
| the CGI program <code>/regenerate_page.cgi</code>, which generates |
| the requested resource and saves it into the document directory, so |
| that the next time it is requested, a static copy can be served.</p> |
| |
| <p>In this way, documents that are infrequently updated can be served in |
| static form. If documents need to be refreshed, they can be deleted |
| from the document directory, and they will then be regenerated the |
| next time they are requested.</p> |
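| |
| <p>A variation on the same idea, shown here only as a sketch: if the |
| static copies are plain files at the path the request maps to, the |
| filesystem test <code>-f</code> can stand in for <code>-U</code>, which |
| avoids the subrequest. The <code>page</code> query parameter is a |
| hypothetical way of telling the CGI program which page to build; it is |
| not part of the ruleset above.</p> |
| |
| <highlight language="config"> |
| # Per-directory context, as in the ruleset above |
| # Assumes the static copy lives where the request maps on disk |
| RewriteCond "%{REQUEST_FILENAME}" !-f |
| RewriteRule "^(.+)\.html$" "/regenerate_page.cgi?page=$1" [PT,L] |
| </highlight> |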
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="load-balancing"> |
| |
| <title>Load Balancing</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to randomly distribute load across several servers |
| using mod_rewrite.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We'll use <directive |
| module="mod_rewrite">RewriteMap</directive> and a list of servers |
| to accomplish this.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteMap lb rnd:/path/to/serverlist.txt<br /> |
| <br /> |
| RewriteRule ^/(.*) http://${lb:servers}/$1 [P,L] |
| </example> |
| |
| <p><code>serverlist.txt</code> will contain a list of the servers:</p> |
| |
| <example> |
| ## serverlist.txt<br /> |
| <br /> |
| servers one.example.com|two.example.com|three.example.com<br /> |
| </example> |
| |
| <p>If you want one particular server to get more of the load than the |
| others, list it more than once.</p> |
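| |
| <p>For instance, the following hypothetical variant of |
| <code>serverlist.txt</code> names <code>one.example.com</code> twice. |
| Because the <code>rnd</code> map picks one of the listed values at |
| random, that server would receive roughly half of the requests and the |
| other two roughly a quarter each:</p> |
| |
| <example> |
| servers one.example.com|one.example.com|two.example.com|three.example.com |
| </example> |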
| |
| </dd> |
| |
| <dt>Discussion:</dt> |
| <dd> |
| <p>Apache comes with a load-balancing module - |
| <module>mod_proxy_balancer</module> - which is far more flexible and |
| featureful than anything you can cobble together using mod_rewrite.</p> |
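| |
| <p>For comparison, a minimal sketch of a roughly equivalent setup with |
| <module>mod_proxy_balancer</module> might look like the following. It |
| assumes that <module>mod_proxy</module>, <module>mod_proxy_http</module> |
| and a load-balancing method module are loaded; the balancer name |
| <code>mycluster</code> and the <code>loadfactor</code> value are |
| illustrative choices, not taken from the example above.</p> |
| |
| <highlight language="config"> |
| # "mycluster" is an arbitrary, hypothetical balancer name |
| &lt;Proxy "balancer://mycluster"&gt; |
| # loadfactor=2 gives this member twice the share of the others |
| BalancerMember "http://one.example.com" loadfactor=2 |
| BalancerMember "http://two.example.com" |
| BalancerMember "http://three.example.com" |
| &lt;/Proxy&gt; |
| |
| ProxyPass "/" "balancer://mycluster/" |
| ProxyPassReverse "/" "balancer://mycluster/" |
| </highlight> |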
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="autorefresh"> |
| |
| <title>Document With Autorefresh</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>Wouldn't it be nice, while creating a complex web page, if |
| the web browser would automatically refresh the page every |
| time we save a new version from within our editor? |
| Impossible?</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>No! We just combine the MIME multipart feature, the |
| web server NPH feature, and the URL manipulation power of |
| <module>mod_rewrite</module>. First, we establish a new |
| URL feature: Adding just <code>:refresh</code> to any |
| URL causes the 'page' to be refreshed every time it is |
| updated on the filesystem.</p> |
| |
| <example> |
| RewriteRule ^(/[uge]/[^/]+/?.*):refresh /internal/cgi/apache/nph-refresh?f=$1 |
| </example> |
| |
| <p>Now when we reference the URL</p> |
| |
| <example> |
| /u/foo/bar/page.html:refresh |
| </example> |
| |
| <p>this leads to the internal invocation of the URL</p> |
| |
| <example> |
| /internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html |
| </example> |
| |
| <p>The only missing part is the NPH-CGI script. Although |
| one would usually say "left as an exercise to the reader" |
| ;-) I will provide this, too.</p> |
| |
| <example><pre> |
| #!/sw/bin/perl |
| ## |
| ## nph-refresh -- NPH/CGI script for auto refreshing pages |
| ## Copyright (c) 1997 Ralf S. Engelschall, All Rights Reserved. |
| ## |
| $| = 1; |
| |
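| # split the QUERY_STRING variable into $QS_name variables |
| @pairs = split( /&amp;/, $ENV{'QUERY_STRING'} ); |
| foreach $pair (@pairs) { |
| ( $name, $value ) = split( /=/, $pair, 2 ); |
| $name =~ tr/A-Z/a-z/; |
| next unless $name =~ /^[a-z]\w*$/;    # skip malformed parameter names |
| $name = 'QS_' . $name; |
| $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg; |
| # assign via a symbolic reference; eval on request data could run arbitrary code |
| ${$name} = $value; |
| } |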
| $QS_s = 1 if ( $QS_s eq '' ); |
| $QS_n = 3600 if ( $QS_n eq '' ); |
| if ( $QS_f eq '' ) { |
| print "HTTP/1.0 200 OK\n"; |
| print "Content-type: text/html\n\n"; |
| print "&lt;b&gt;ERROR&lt;/b&gt;: No file given\n"; |
| exit(0); |
| } |
| if ( !-f $QS_f ) { |
| print "HTTP/1.0 200 OK\n"; |
| print "Content-type: text/html\n\n"; |
| print "&lt;b&gt;ERROR&lt;/b&gt;: File $QS_f not found\n"; |
| exit(0); |
| } |
| |
| sub print_http_headers_multipart_begin { |
| print "HTTP/1.0 200 OK\n"; |
| $bound = "ThisRandomString12345"; |
| print "Content-type: multipart/x-mixed-replace;boundary=$bound\n"; |
| &amp;print_http_headers_multipart_next; |
| } |
| |
| sub print_http_headers_multipart_next { |
| print "\n--$bound\n"; |
| } |
| |
| sub print_http_headers_multipart_end { |
| print "\n--$bound--\n"; |
| } |
| |
| sub displayhtml { |
| local ($buffer) = @_; |
| $len = length($buffer); |
| print "Content-type: text/html\n"; |
| print "Content-length: $len\n\n"; |
| print $buffer; |
| } |
| |
| sub readfile { |
| local ($file) = @_; |
| local ( *FP, $size, $buffer, $bytes ); |
| ( $x, $x, $x, $x, $x, $x, $x, $size ) = stat($file); |
| $size = sprintf( "%d", $size ); |
| open(FP, "&lt;$file"); |
| $bytes = sysread( FP, $buffer, $size ); |
| close(FP); |
| return $buffer; |
| } |
| |
| $buffer = &amp;readfile($QS_f); |
| &amp;print_http_headers_multipart_begin; |
| &amp;displayhtml($buffer); |
| |
| sub mystat { |
| local ($file) = $_[0]; |
| local ($mtime); |
| |
| # the tenth field returned by stat() is the last-modified time |
| ( $x, $x, $x, $x, $x, $x, $x, $x, $x, $mtime ) = stat($file); |
| return $mtime; |
| } |
| |
| $mtimeL = &amp;mystat($QS_f); |
| $mtime = $mtimeL; |
| for ( $n = 0 ; $n &lt; $QS_n ; $n++ ) { |
| while (1) { |
| $mtime = &amp;mystat($QS_f); |
| if ( $mtime ne $mtimeL ) { |
| $mtimeL = $mtime; |
| sleep(2); |
| $buffer = &amp;readfile($QS_f); |
| &amp;print_http_headers_multipart_next; |
| &amp;displayhtml($buffer); |
| sleep(5); |
| $mtimeL = &amp;mystat($QS_f); |
| last; |
| } |
| sleep($QS_s); |
| } |
| } |
| |
| &amp;print_http_headers_multipart_end; |
| |
| exit(0); |
| |
| ##EOF## |
| </pre></example> |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="structuredhomedirs"> |
| |
| <title>Structured Userdirs</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>Some sites with thousands of users use a |
| structured homedir layout, <em>i.e.</em> each homedir is in a |
| subdirectory which begins (for instance) with the first |
| character of the username. So, <code>/~larry/anypath</code> |
| is <code>/home/<strong>l</strong>/larry/public_html/anypath</code> |
| while <code>/~waldo/anypath</code> is |
| <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>We use the following ruleset to expand the tilde URLs |
| into the above layout.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/public_html$3 |
| </example> |
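| |
| <p>To see how the backreferences line up, consider a request for |
| <code>/~larry/docs/index.html</code> (the path after the username is |
| only an illustration). The outer group captures the whole username, the |
| inner group its first letter, and the last group the rest of the path:</p> |
| |
| <example> |
| $1 = larry<br /> |
| $2 = l<br /> |
| $3 = /docs/index.html<br /> |
| result: /home/l/larry/public_html/docs/index.html |
| </example> |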
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="redirectanchors"> |
| |
| <title>Redirecting Anchors</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>By default, redirecting to an HTML anchor doesn't work, |
| because mod_rewrite escapes the <code>#</code> character, |
| turning it into <code>%23</code>. This, in turn, breaks the |
| redirection.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the <code>[NE]</code> flag on the |
| <code>RewriteRule</code>. NE stands for No Escape. |
| </p> |
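| |
| <p>A minimal sketch, using hypothetical paths and a hypothetical anchor |
| name:</p> |
| |
| <highlight language="config"> |
| # Without NE the client would be sent to /docs/new.html%23install |
| RewriteRule "^/docs/old\.html$" "/docs/new.html#install" [NE,R] |
| </highlight> |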
| </dd> |
| |
| <dt>Discussion:</dt> |
| <dd>This technique will of course also work with other |
| special characters that mod_rewrite, by default, URL-encodes.</dd> |
| </dl> |
| |
| </section> |
| |
| <section id="time-dependent"> |
| |
| <title>Time-Dependent Rewriting</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>We wish to use mod_rewrite to serve different content based on |
| the time of day.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Several variables named <code>TIME_xxx</code> are available |
| for rewrite conditions. Combined with the special |
| lexicographic comparison patterns <code>&lt;STRING</code>, |
| <code>&gt;STRING</code> and <code>=STRING</code>, they let us |
| do time-dependent rewrites:</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700<br /> |
| RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900<br /> |
| RewriteRule ^foo\.html$ foo.day.html [L]<br /> |
| RewriteRule ^foo\.html$ foo.night.html |
| </example> |
| |
| <p>This provides the content of <code>foo.day.html</code> |
| under the URL <code>foo.html</code> from |
| <code>07:01-18:59</code> and at the remaining time the |
| contents of <code>foo.night.html</code>.</p> |
| |
| <note type="warning"><module>mod_cache</module>, intermediate proxies |
| and browsers may each cache responses and cause either page to be |
| shown outside of the configured time window. |
| <module>mod_expires</module> may be used to control this |
| effect. You are, of course, much better off simply serving the |
| content dynamically, and customizing it based on the time of day.</note> |
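| |
| <p>A minimal sketch of the <module>mod_expires</module> approach, |
| assuming the module is loaded and that a one-minute cache lifetime is |
| acceptable. Both the lifetime and the <code>FilesMatch</code> pattern |
| are illustrative choices:</p> |
| |
| <highlight language="config"> |
| &lt;FilesMatch "^foo\.(day|night)\.html$"&gt; |
| ExpiresActive On |
| # Let caches keep either variant for at most one minute after it is fetched |
| ExpiresDefault "access plus 1 minute" |
| &lt;/FilesMatch&gt; |
| </highlight> |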
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| <section id="setenvvars"> |
| |
| <title>Set Environment Variables Based On URL Parts</title> |
| |
| <dl> |
| <dt>Description:</dt> |
| |
| <dd> |
| <p>At times, we want to maintain some kind of status when we |
| perform a rewrite. For example, we may want to make a note that |
| we've done that rewrite, so that we can check later to see if a |
| request came via that rewrite. One way to do this is by setting an |
| environment variable.</p> |
| </dd> |
| |
| <dt>Solution:</dt> |
| |
| <dd> |
| <p>Use the [E] flag to set an environment variable.</p> |
| |
| <example> |
| RewriteEngine on<br /> |
| RewriteRule ^/horse/(.*) /pony/$1 [E=<strong>rewritten:1</strong>] |
| </example> |
| |
| <p>Later in your ruleset you might check for this environment |
| variable using a RewriteCond:</p> |
| |
| <example> |
| RewriteCond %{ENV:rewritten} =1 |
| </example> |
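| |
| <p>Putting the two pieces together, one possible use (a sketch for |
| server context, reusing the <code>/horse/</code> and <code>/pony/</code> |
| paths from above) is to refuse requests that reach <code>/pony/</code> |
| directly rather than through the rewrite:</p> |
| |
| <highlight language="config"> |
| RewriteEngine on |
| # Mark requests that were rewritten from /horse/ |
| RewriteRule "^/horse/(.*)" "/pony/$1" [E=rewritten:1] |
| |
| # Anything under /pony/ that does not carry the mark is forbidden |
| RewriteCond "%{ENV:rewritten}" "!=1" |
| RewriteRule "^/pony/" "-" [F] |
| </highlight> |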
| |
| </dd> |
| </dl> |
| |
| </section> |
| |
| </manualpage> |