| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
| <HTML><HEAD> |
| <TITLE>An In-Depth Discussion of Virtual Host Matching</TITLE> |
| </HEAD> |
| |
| <!-- Background white, links blue (unvisited), navy (visited), red (active) --> |
| <BODY |
| BGCOLOR="#FFFFFF" |
| TEXT="#000000" |
| LINK="#0000FF" |
| VLINK="#000080" |
| ALINK="#FF0000" |
| > |
| <!--#include virtual="header.html" --> |
| <H1 ALIGN="CENTER">An In-Depth Discussion of Virtual Host Matching</H1> |
| |
| <P>The virtual host code was completely rewritten in |
| <STRONG>Apache 1.3</STRONG>. |
| This document attempts to explain exactly what Apache does when |
| deciding what virtual host to serve a hit from. With the help of the |
| new <A HREF="../mod/core.html#namevirtualhost"><SAMP>NameVirtualHost</SAMP></A> |
| directive virtual host configuration should be a lot easier and safer |
| than with versions prior to 1.3. |
| |
| <P>If you just want to <CITE>make it work</CITE> without understanding |
| how, here are <A HREF="examples.html">some examples</A>. |
| |
| <H3>Config File Parsing</H3> |
| |
| <P>There is a <EM>main_server</EM> which consists of all |
| the definitions appearing outside of <CODE><VirtualHost></CODE> sections. |
| There are virtual servers, called <EM>vhosts</EM>, which are defined by |
| <A HREF="../mod/core.html#virtualhost"><SAMP><VirtualHost></SAMP></A> |
| sections. |
| |
| <P>The directives |
| <A HREF="../mod/core.html#port"><SAMP>Port</SAMP></A>, |
| <A HREF="../mod/core.html#servername"><SAMP>ServerName</SAMP></A>, |
| <A HREF="../mod/core.html#serverpath"><SAMP>ServerPath</SAMP></A>, |
| and |
| <A HREF="../mod/core.html#serveralias"><SAMP>ServerAlias</SAMP></A> |
| can appear anywhere within the definition of |
| a server. However, each appearance overrides the previous appearance |
| (within that server). |
| |
| <P>The default value of the <CODE>Port</CODE> field for main_server |
| is 80. The main_server has no default <CODE>ServerPath</CODE>, or |
| <CODE>ServerAlias</CODE>. The default <CODE>ServerName</CODE> is |
| deduced from the servers IP address. |
| |
| <P>The main_server Port directive has two functions due to legacy |
| compatibility with NCSA configuration files. One function is |
| to determine the default network port Apache will bind to. This |
| default is overridden by the existence of any |
| <A HREF="../mod/core.html#listen"><CODE>Listen</CODE></A> directives. |
| The second function is to specify the port number which is used |
| in absolute URIs during redirects. |
| |
| <P>Unlike the main_server, vhost ports <EM>do not</EM> affect what |
| ports Apache listens for connections on. |
| |
| <P>Each address appearing in the <CODE>VirtualHost</CODE> directive |
| can have an optional port. If the port is unspecified it defaults to |
| the value of the main_server's most recent <CODE>Port</CODE> statement. |
| The special port <SAMP>*</SAMP> indicates a wildcard that matches any port. |
| Collectively the entire set of addresses (including multiple |
| <SAMP>A</SAMP> record |
| results from DNS lookups) are called the vhost's <EM>address set</EM>. |
| |
| <P>Unless a <A HREF="../mod/core.html#namevirtualhost">NameVirtualHost</A> |
| directive is used for a specific IP address the first vhost with |
| that address is treated as an IP-based vhost. The IP address can also |
| be the wildcard <CODE>*</CODE>. |
| |
| <P>If name-based vhosts should be used a <CODE>NameVirtualHost</CODE> |
| directive <EM>must</EM> appear with the IP address set to be used for the |
| name-based vhosts. In other words, you must specify the IP address that |
| holds the hostname aliases (CNAMEs) for your name-based vhosts via a |
| <CODE>NameVirtualHost</CODE> directive in your configuration file. |
| |
| <P>Multiple <CODE>NameVirtualHost</CODE> directives can be used each |
| with a set of <CODE>VirtualHost</CODE> directives but only one |
| <CODE>NameVirtualHost</CODE> directive should be used for each |
| specific IP:port pair. |
| |
| <P>The ordering of <CODE>NameVirtualHost</CODE> and |
| <CODE>VirtualHost</CODE> directives is not important which makes the |
| following two examples identical (only the order of the |
| <CODE>VirtualHost</CODE> directives for <EM>one</EM> address set |
| is important, see below): |
| |
| <PRE> |
| | |
| NameVirtualHost 111.22.33.44 | <VirtualHost 111.22.33.44> |
| <VirtualHost 111.22.33.44> | # server A |
| # server A | </VirtualHost> |
| ... | <VirtualHost 111.22.33.55> |
| </VirtualHost> | # server C |
| <VirtualHost 111.22.33.44> | ... |
| # server B | </VirtualHost> |
| ... | <VirtualHost 111.22.33.44> |
| </VirtualHost> | # server B |
| | ... |
| NameVirtualHost 111.22.33.55 | </VirtualHost> |
| <VirtualHost 111.22.33.55> | <VirtualHost 111.22.33.55> |
| # server C | # server D |
| ... | ... |
| </VirtualHost> | </VirtualHost> |
| <VirtualHost 111.22.33.55> | |
| # server D | NameVirtualHost 111.22.33.44 |
| ... | NameVirtualHost 111.22.33.55 |
| </VirtualHost> | |
| | |
| </PRE> |
| |
| <P>(To aid the readability of your configuration you should prefer the |
| left variant.) |
| |
| <P> After parsing the <CODE>VirtualHost</CODE> directive, the vhost server |
| is given a default <CODE>Port</CODE> equal to the port assigned to the |
| first name in its <CODE>VirtualHost</CODE> directive. |
| |
| <P>The complete list of names in the <CODE>VirtualHost</CODE> directive |
| are treated just like a <CODE>ServerAlias</CODE> (but are not overridden by any |
| <CODE>ServerAlias</CODE> statement) if all names resolve to the same address |
| set. Note that subsequent <CODE>Port</CODE> statements for this vhost will not |
| affect the ports assigned in the address set. |
| |
| <P>During initialization a list for each IP address |
| is generated and inserted into an hash table. If the IP address is |
| used in a <CODE>NameVirtualHost</CODE> directive the list contains |
| all name-based vhosts for the given IP address. If there are no |
| vhosts defined for that address the <CODE>NameVirtualHost</CODE> directive |
| is ignored and an error is logged. For an IP-based vhost the list in the |
| hash table is empty. |
| |
| <P>Due to a fast hashing function the overhead of hashing an IP address |
| during a request is minimal and almost not existent. Additionally |
| the table is optimized for IP addresses which vary in the last octet. |
| |
| <P>For every vhost various default values are set. In particular: |
| |
| <OL> |
| <LI>If a vhost has no |
| <A HREF="../mod/core.html#serveradmin"><CODE>ServerAdmin</CODE></A>, |
| <A HREF="../mod/core.html#resourceconfig"><CODE>ResourceConfig</CODE></A>, |
| <A HREF="../mod/core.html#accessconfig"><CODE>AccessConfig</CODE></A>, |
| <A HREF="../mod/core.html#timeout"><CODE>Timeout</CODE></A>, |
| <A HREF="../mod/core.html#keepalivetimeout" |
| ><CODE>KeepAliveTimeout</CODE></A>, |
| <A HREF="../mod/core.html#keepalive"><CODE>KeepAlive</CODE></A>, |
| <A HREF="../mod/core.html#maxkeepaliverequests" |
| ><CODE>MaxKeepAliveRequests</CODE></A>, |
| or |
| <A HREF="../mod/core.html#sendbuffersize"><CODE>SendBufferSize</CODE></A> |
| directive then the respective value is |
| inherited from the main_server. (That is, inherited from whatever |
| the final setting of that value is in the main_server.) |
| |
| <LI>The "lookup defaults" that define the default directory |
| permissions |
| for a vhost are merged with those of the main_server. This includes |
| any per-directory configuration information for any module. |
| |
| <LI>The per-server configs for each module from the main_server are |
| merged into the vhost server. |
| </OL> |
| |
| Essentially, the main_server is treated as "defaults" or a |
| "base" on which to build each vhost. |
| But the positioning of these main_server |
| definitions in the config file is largely irrelevant -- the entire |
| config of the main_server has been parsed when this final merging occurs. |
| So even if a main_server definition appears after a vhost definition |
| it might affect the vhost definition. |
| |
| <P> If the main_server has no <CODE>ServerName</CODE> at this point, |
| then the hostname of the machine that httpd is running on is used |
| instead. We will call the <EM>main_server address set</EM> those IP |
| addresses returned by a DNS lookup on the <CODE>ServerName</CODE> of |
| the main_server. |
| |
| <P> For any undefined <CODE>ServerName</CODE> fields, a name-based vhost |
| defaults to the address given first in the <CODE>VirtualHost</CODE> |
| statement defining the vhost. |
| |
| <P>Any vhost that includes the magic <SAMP>_default_</SAMP> wildcard |
| is given the same <CODE>ServerName</CODE> as the main_server. |
| |
| |
| <H3>Virtual Host Matching</H3> |
| |
| <P>The server determines which vhost to use for a request as follows: |
| |
| <H4>Hash table lookup</H4> |
| |
| <P>When the connection is first made by a client, the IP address to |
| which the client connected is looked up in the internal IP hash table. |
| |
| <P>If the lookup fails (the IP address wasn't found) the request is |
| served from the <SAMP>_default_</SAMP> vhost if there is such a vhost |
| for the port to which the client sent the request. If there is no |
| matching <SAMP>_default_</SAMP> vhost the request is served from the |
| main_server. |
| |
| <P>If the IP address is not found in the hash table then the match |
| against the port number may also result in an entry corresponding to a |
| <CODE>NameVirtualHost *</CODE>, which is subsequently handled like |
| other name-based vhosts. |
| |
| <P>If the lookup succeeded (a corresponding list for the IP address was |
| found) the next step is to decide if we have to deal with an IP-based |
| or a name-base vhost. |
| |
| <H4>IP-based vhost</H4> |
| |
| <P>If the entry we found has an empty name list then we have found an |
| IP-based vhost, no further actions are performed and the request is |
| served from that vhost. |
| |
| <H4>Name-based vhost</H4> |
| |
| <P>If the entry corresponds to a name-based vhost the name list contains |
| one or more vhost structures. This list contains the vhosts in the same |
| order as the <CODE>VirtualHost</CODE> directives appear in the config |
| file. |
| |
| <P>The first vhost on this list (the first vhost in the config file with |
| the specified IP address) has the highest priority and catches any request |
| to an unknown server name or a request without a <CODE>Host:</CODE> |
| header field. |
| |
| <P>If the client provided a <CODE>Host:</CODE> header field the list is |
| searched for a matching vhost and the first hit on a <CODE>ServerName</CODE> |
| or <CODE>ServerAlias</CODE> is taken and the request is served from |
| that vhost. A <CODE>Host:</CODE> header field can contain a port number, but |
| Apache always matches against the real port to which the client sent |
| the request. |
| |
| <P>If the client submitted a HTTP/1.0 request without <CODE>Host:</CODE> |
| header field we don't know to what server the client tried to connect and |
| any existing <CODE>ServerPath</CODE> is matched against the URI |
| from the request. The first matching path on the list is used and the |
| request is served from that vhost. |
| |
| <P>If no matching vhost could be found the request is served from the |
| first vhost with a matching port number that is on the list for the IP |
| to which the client connected (as already mentioned before). |
| |
| <H4>Persistent connections</H4> |
| The IP lookup described above is only done <EM>once</EM> for a particular |
| TCP/IP session while the name lookup is done on <EM>every</EM> request |
| during a KeepAlive/persistent connection. In other words a client may |
| request pages from different name-based vhosts during a single |
| persistent connection. |
| |
| |
| <H4>Absolute URI</H4> |
| |
| <P>If the URI from the request is an absolute URI, and its hostname and |
| port match the main server or one of the configured virtual hosts |
| <EM>and</EM> match the address and port to which the client sent the request, |
| then the scheme/hostname/port prefix is stripped off and the remaining |
| relative URI is served by the corresponding main server or virtual host. |
| If it does not match, then the URI remains untouched and the request is |
| taken to be a proxy request. |
| |
| |
| <H3>Observations</H3> |
| |
| <UL> |
| |
| <LI>A name-based vhost can never interfere with an IP-base vhost and |
| vice versa. IP-based vhosts can only be reached through an IP address |
| of its own address set and never through any other address. |
| The same applies to name-based vhosts, they can only be reached |
| through an IP address of the corresponding address set which must |
| be defined with a <CODE>NameVirtualHost</CODE> directive. |
| <P> |
| |
| <LI><CODE>ServerAlias</CODE> and <CODE>ServerPath</CODE> checks are never |
| performed for an IP-based vhost. |
| <P> |
| |
| <LI>The order of name-/IP-based, the <SAMP>_default_</SAMP> |
| vhost and the <CODE>NameVirtualHost</CODE> directive within the config |
| file is not important. Only the ordering |
| of name-based vhosts for a specific address set is significant. The one |
| name-based vhosts that comes first in the configuration file has |
| the highest priority for its corresponding address set. |
| <P> |
| |
| <LI>For security reasons the port number given in a <CODE>Host:</CODE> |
| header field is never used during the matching process. Apache always |
| uses the real port to which the client sent the request. |
| <P> |
| |
| <LI>If a <CODE>ServerPath</CODE> directive exists which is a prefix of |
| another <CODE>ServerPath</CODE> directive that appears later in |
| the configuration file, then the former will always be matched |
| and the latter will never be matched. (That is assuming that no |
| <CODE>Host:</CODE> header field was available to disambiguate the two.) |
| <P> |
| |
| <LI>If two IP-based vhosts have an address in common, the vhost appearing |
| first in the config file is always matched. Such a thing might happen |
| inadvertently. The server will give a warning in the error |
| logfile when it detects this. |
| <P> |
| |
| <LI>A <CODE>_default_</CODE> vhost catches a request only if there is no |
| other vhost with a matching IP address <EM>and</EM> a matching port |
| number for the request. The request is only caught if the port number |
| to which the client sent the request matches the port number of your |
| <CODE>_default_</CODE> vhost which is your standard <CODE>Port</CODE> |
| by default. A wildcard port can be specified (<EM>i.e.</EM>, |
| <CODE>_default_:*</CODE>) to catch requests to any available port. |
| This also applies to <CODE>NameVirtualHost *</CODE> vhosts. |
| <P> |
| |
| <LI>The main_server is only used to serve a request if the IP address |
| and port number to which the client connected is unspecified |
| and does not match any other vhost (including a <CODE>_default_</CODE> |
| vhost). In other words the main_server only catches a request for an |
| unspecified address/port combination (unless there is a |
| <CODE>_default_</CODE> vhost which matches that port). |
| <P> |
| |
| <LI>A <CODE>_default_</CODE> vhost or the main_server is <EM>never</EM> |
| matched for a request with an unknown or missing <CODE>Host:</CODE> header |
| field if the client connected to an address (and port) which is used |
| for name-based vhosts, <EM>e.g.</EM>, in a <CODE>NameVirtualHost</CODE> |
| directive. |
| <P> |
| |
| <LI>You should never specify DNS names in <CODE>VirtualHost</CODE> |
| directives because it will force your server to rely on DNS to boot. |
| Furthermore it poses a security threat if you do not control the |
| DNS for all the domains listed. |
| There's <A HREF="../dns-caveats.html">more information</A> |
| available on this and the next two topics. |
| <P> |
| |
| <LI><CODE>ServerName</CODE> should always be set for each vhost. Otherwise |
| A DNS lookup is required for each vhost. |
| <P> |
| |
| </UL> |
| |
| <H3>Tips</H3> |
| |
| <P>In addition to the tips on the <A HREF="../dns-caveats.html#tips">DNS |
| Issues</A> page, here are some further tips: |
| |
| <UL> |
| |
| <LI>Place all main_server definitions before any <CODE>VirtualHost</CODE> |
| definitions. (This is to aid the readability of the configuration -- |
| the post-config merging process makes it non-obvious that definitions |
| mixed in around virtual hosts might affect all virtual hosts.) |
| <P> |
| |
| <LI>Group corresponding <CODE>NameVirtualHost</CODE> and |
| <CODE>VirtualHost</CODE> definitions in your configuration to ensure |
| better readability. |
| <P> |
| |
| <LI>Avoid <CODE>ServerPaths</CODE> which are prefixes of other |
| <CODE>ServerPaths</CODE>. If you cannot avoid this then you have to |
| ensure that the longer (more specific) prefix vhost appears earlier in |
| the configuration file than the shorter (less specific) prefix |
| (<EM>i.e.</EM>, "ServerPath /abc" should appear after |
| "ServerPath /abc/def"). |
| <P> |
| |
| </UL> |
| |
| <!--#include virtual="footer.html" --> |
| </BODY> |
| </HTML> |