<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<manualpage metafile="reverse_proxy.xml.meta">
<parentdocument href="./">How-To / Tutorials</parentdocument>
<title>Reverse Proxy Guide</title>
<summary>
<p>In addition to being a "basic" web server, and providing static and
dynamic content to end-users, Apache httpd (as well as most other web
servers) can also act as a reverse proxy server, also known as a
"gateway" server.</p>
<p>In such scenarios, httpd itself does not generate or host the data;
rather, the content is obtained from one or several backend servers,
which normally have no direct connection to the external network. As
httpd receives a request from a client, the request itself is <em>proxied</em>
to one of these backend servers, which handles the request, generates
the content and sends this content back to httpd, which then
generates the actual HTTP response back to the client.</p>
<p>There are numerous reasons for such an implementation, but the
typical rationales are security, high availability, load balancing
and centralized authentication/authorization. It is critical in these
implementations that the layout, design and architecture of the backend
infrastructure (those servers which actually handle the requests) are
insulated and protected from the outside; as far as the client is concerned,
the reverse proxy server <em>is</em> the sole source of all content.</p>
<p>A typical implementation is below:</p>
<p class="centered"><img src="../images/reverse-proxy-arch.png" alt="reverse-proxy-arch" /></p>
</summary>
<section id="related">
<title>Reverse Proxy</title>
<related>
<modulelist>
<module>mod_proxy</module>
<module>mod_proxy_balancer</module>
<module>mod_proxy_hcheck</module>
</modulelist>
<directivelist>
<directive module="mod_proxy">ProxyPass</directive>
<directive module="mod_proxy">BalancerMember</directive>
</directivelist>
</related>
</section>
<section id="simple">
<title>Simple reverse proxying</title>
<p>
The <directive module="mod_proxy">ProxyPass</directive>
directive specifies the mapping of incoming requests to the backend
server (or a cluster of servers known as a <code>Balancer</code>
group). The simplest example proxies all requests (<code>"/"</code>)
to a single backend:
</p>
<highlight language="config">
ProxyPass "/" "http://www.example.com/"
</highlight>
<p>
To ensure that any <code>Location:</code> headers generated by
the backend are modified to point to the reverse proxy, instead of
back to the backend itself, the
<directive module="mod_proxy">ProxyPassReverse</directive>
directive is most often required:
</p>
<highlight language="config">
ProxyPass "/" "http://www.example.com/"
ProxyPassReverse "/" "http://www.example.com/"
</highlight>
<p>Proxying can also be limited to specific URIs, as shown in this example:</p>
<highlight language="config">
ProxyPass "/images/" "http://www.example.com/"
ProxyPassReverse "/images/" "http://www.example.com/"
</highlight>
<p>In the above, any request whose path starts with <code>/images/</code>
will be proxied to the specified backend; all other requests will be handled
locally. Note that the trailing slashes of the local path and the backend
URL match, so that no slash is doubled or lost when the URL is rewritten.
</p>
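<p>
Conversely, a path can be excluded from an otherwise proxied URL space by
giving <code>!</code> as the target. The following is a minimal sketch,
assuming a hypothetical <code>/static/</code> path that should be served
locally. Because <directive module="mod_proxy">ProxyPass</directive> rules
are checked in configuration order, the exclusion has to appear before the
more general mapping:
</p>
<highlight language="config">
# Hypothetical local path: requests under /static/ are not proxied
ProxyPass "/static/" "!"
# Everything else is sent to the backend
ProxyPass "/" "http://www.example.com/"
ProxyPassReverse "/" "http://www.example.com/"
</highlight>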
</section>
<section id="cluster">
<title>Clusters and Balancers</title>
<p>
As useful as the above is, it still has a deficiency: should the (single)
backend node go down or become heavily loaded, proxying those requests
provides no real advantage. What is needed is the ability
to define a set or group of backend servers which can handle such
requests, and for the reverse proxy to load balance and fail over among
them. This group is sometimes called a <em>cluster</em>, but Apache httpd's
term is a <em>balancer</em>. One defines a balancer by using the
<directive module="mod_proxy" type="section">Proxy</directive> and
<directive module="mod_proxy">BalancerMember</directive> directives as
shown:
</p>
<highlight language="config">
&lt;Proxy balancer://myset&gt;
BalancerMember http://www2.example.com:8080
BalancerMember http://www3.example.com:8080
ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;
ProxyPass "/images/" "balancer://myset/"
ProxyPassReverse "/images/" "balancer://myset/"
</highlight>
<p>
The <code>balancer://</code> scheme tells httpd that we are creating
a balancer set, with the name <em>myset</em>. It includes two backend servers,
which httpd calls <em>BalancerMembers</em>. In this case, any request under
<code>/images/</code> will be proxied to <em>one</em> of the two backends.
The <directive module="mod_proxy">ProxySet</directive> directive
specifies that the <em>myset</em> balancer uses a load balancing algorithm
that balances based on I/O bytes.
</p>
<note type="hint"><title>Hint</title>
<p>
<em>BalancerMembers</em> are also sometimes referred to as <em>workers</em>.
</p>
</note>
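<p>
The balancer example above depends on several modules being loaded. As a
rough sketch, assuming a default module layout (the <code>modules/</code>
paths are an assumption and vary between installations), the relevant
lines might look like this:
</p>
<highlight language="config">
# Core proxy support and the HTTP scheme handler
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
# Balancer support, its shared-memory provider, and the bytraffic method
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
</highlight>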
</section>
<section id="config">
<title>Balancer and BalancerMember configuration</title>
<p>
You can adjust numerous configuration details of the <em>balancers</em>
and the <em>workers</em> via the various parameters defined in
<directive module="mod_proxy">ProxyPass</directive>. For example,
if we want <code>http://www3.example.com:8080</code> to
handle 3x the traffic with a timeout of 1 second, we would adjust the
configuration as follows:
</p>
<highlight language="config">
&lt;Proxy balancer://myset&gt;
BalancerMember http://www2.example.com:8080
BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;
ProxyPass "/images/" "balancer://myset/"
ProxyPassReverse "/images/" "balancer://myset/"
</highlight>
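<p>
Balancer-wide parameters are set through
<directive module="mod_proxy">ProxySet</directive> in the same way. The
sketch below is illustrative only: the values chosen for
<code>retry</code>, <code>maxattempts</code> and <code>failonstatus</code>
are assumptions, not recommendations.
</p>
<highlight language="config">
&lt;Proxy balancer://myset&gt;
# Worker-level: wait 30 seconds before retrying a worker in error state
BalancerMember http://www2.example.com:8080 retry=30
BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1 retry=30
# Balancer-level: at most 2 failover attempts; a 503 from a backend
# forces that worker into the error state
ProxySet lbmethod=bytraffic maxattempts=2 failonstatus=503
&lt;/Proxy&gt;
</highlight>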
</section>
<section id="failover">
<title>Failover</title>
<p>
You can also fine-tune various failover scenarios, detailing which workers
and even which balancers should be accessed in such cases. For example, the
setup below implements three failover cases:
</p>
<ol>
<li>
<code>http://spare1.example.com:8080</code> and
<code>http://spare2.example.com:8080</code> are only sent traffic if one
or both of <code>http://www2.example.com:8080</code> or
<code>http://www3.example.com:8080</code> is unavailable. (One spare
will be used to replace one unusable member of the same balancer set.)
</li>
<li>
<code>http://hstandby.example.com:8080</code> is only sent traffic if
all other workers in balancer set <code>0</code> are not available.
</li>
<li>
If all load balancer set <code>0</code> workers, spares, and the standby
are unavailable, only then will the
<code>http://bkup1.example.com:8080</code> and
<code>http://bkup2.example.com:8080</code> workers from balancer set
<code>1</code> be brought into rotation.
</li>
</ol>
<p>
Thus, it is possible to have one or more hot spares and hot standbys for
each load balancer set.
</p>
<highlight language="config">
&lt;Proxy balancer://myset&gt;
BalancerMember http://www2.example.com:8080
BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
BalancerMember http://spare1.example.com:8080 status=+R
BalancerMember http://spare2.example.com:8080 status=+R
BalancerMember http://hstandby.example.com:8080 status=+H
BalancerMember http://bkup1.example.com:8080 lbset=1
BalancerMember http://bkup2.example.com:8080 lbset=1
ProxySet lbmethod=byrequests
&lt;/Proxy&gt;
ProxyPass "/images/" "balancer://myset/"
ProxyPassReverse "/images/" "balancer://myset/"
</highlight>
<p>
For failover, hot spares are used as replacements for unusable workers in
the same load balancer set. A worker is considered unusable if it is
draining, stopped, or otherwise in an error/failed state. Hot standbys are
used if all workers and spares in the load balancer set are
unavailable. Load balancer sets (with their respective hot spares and
standbys) are always tried in order from lowest to highest.
</p>
</section>
<section id="manager">
<title>Balancer Manager</title>
<p>
One of the most distinctive and useful features of Apache httpd's reverse proxy is
the embedded <em>balancer-manager</em> application. Similar to
<module>mod_status</module>, <em>balancer-manager</em> displays
the current working configuration and status of the enabled
balancers and workers currently in use. However, not only does it
display these parameters, it also allows for dynamic, runtime, on-the-fly
reconfiguration of almost all of them, including adding new <em>BalancerMembers</em>
(workers) to an existing balancer. To enable this capability, the following
needs to be added to your configuration:
</p>
<highlight language="config">
&lt;Location "/balancer-manager"&gt;
SetHandler balancer-manager
Require host localhost
&lt;/Location&gt;
</highlight>
<note type="warning"><title>Warning</title>
<p>Do not enable the <em>balancer-manager</em> until you have <a
href="../mod/mod_proxy.html#access">secured your server</a>. In
particular, ensure that access to the URL is tightly
restricted.</p>
</note>
<p>
When the reverse proxy server is accessed at that URL
(e.g. <code>http://rproxy.example.com/balancer-manager/</code>), you will see a
page similar to the one below:
</p>
<p class="centered"><img src="../images/bal-man.png" alt="balancer-manager page" /></p>
<p>
This form allows the administrator to adjust various parameters, take
workers offline, change load balancing methods and add new workers. For
example, clicking on the balancer itself displays the following page:
</p>
<p class="centered"><img src="../images/bal-man-b.png" alt="balancer-manager page" /></p>
<p>
Clicking on a worker, by contrast, displays this page:
</p>
<p class="centered"><img src="../images/bal-man-w.png" alt="balancer-manager page" /></p>
<p>
To have these changes persist across restarts of the reverse proxy, ensure that
<directive module="mod_proxy">BalancerPersist</directive> is enabled.
</p>
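<p>
As a minimal sketch, persistence is switched on at the server level. The
<code>growth</code> parameter shown here reserves room for workers that
are added at runtime through the <em>balancer-manager</em>; the value
<code>5</code> is an arbitrary assumption.
</p>
<highlight language="config">
# Keep balancer-manager changes across restarts
BalancerPersist On

&lt;Proxy balancer://myset&gt;
BalancerMember http://www2.example.com:8080
BalancerMember http://www3.example.com:8080
# Allow up to 5 additional BalancerMembers to be added at runtime
ProxySet growth=5
&lt;/Proxy&gt;
</highlight>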
</section>
<section id="health-check">
<title>Dynamic Health Checks</title>
<p>
Before httpd proxies a request to a worker, it can <em>"test"</em> whether that worker
is available by setting the <code>ping</code> parameter for that worker using
<directive module="mod_proxy">ProxyPass</directive>. It is often
more useful to check the health of the workers <em>out of band</em>, in a
dynamic fashion. This is achieved in Apache httpd by the
<module>mod_proxy_hcheck</module> module.
</p>
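<p>
A minimal sketch of such out-of-band checks follows, reusing the
<em>myset</em> balancer from the earlier examples. The expression name
<code>ok234</code>, the <code>/health</code> URI, and the interval and
threshold values are assumptions chosen for illustration;
<module>mod_proxy_hcheck</module> also needs <module>mod_watchdog</module>
to be available.
</p>
<highlight language="config">
# Named expression: treat 2xx, 3xx and 4xx responses as healthy
ProxyHCExpr ok234 {%{REQUEST_STATUS} =~ /^[234]/}

&lt;Proxy balancer://myset&gt;
# Send a HEAD request for /health every 10 seconds
BalancerMember http://www2.example.com:8080 hcmethod=HEAD hcuri=/health hcexpr=ok234 hcinterval=10
# Only check that a TCP connection can be opened, every 5 seconds;
# 2 passes re-enable the worker, 3 failures disable it
BalancerMember http://www3.example.com:8080 hcmethod=TCP hcinterval=5 hcpasses=2 hcfails=3
&lt;/Proxy&gt;
</highlight>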
</section>
<section id="status">
<title>BalancerMember status flags</title>
<p>
In the <em>balancer-manager</em> the current state, or <em>status</em>, of a worker
is displayed and can be set/reset. The meanings of these statuses are as follows:
</p>
<table border="1">
<tr><th>Flag</th><th>String</th><th>Description</th></tr>
<tr><td>&nbsp;</td><td><em>Ok</em></td><td>Worker is available</td></tr>
<tr><td>&nbsp;</td><td><em>Init</em></td><td>Worker has been initialized</td></tr>
<tr><td><code>D</code></td><td><em>Dis</em></td><td>Worker is disabled and will not accept any requests; will be
automatically retried.</td></tr>
<tr><td><code>S</code></td><td><em>Stop</em></td><td>Worker is administratively stopped; will not accept requests
and will not be automatically retried.</td></tr>
<tr><td><code>I</code></td><td><em>Ign</em></td><td>Worker is in ignore-errors mode and will always be considered available.</td></tr>
<tr><td><code>R</code></td><td><em>Spar</em></td><td>Worker is a hot spare. For each worker in a given lbset that is unusable
(draining, stopped, in error, etc.), a usable hot spare with the same lbset will be used in
its place. Hot spares can help ensure that a specific number of workers are always available
for use by a balancer.</td></tr>
<tr><td><code>H</code></td><td><em>Stby</em></td><td>Worker is in hot-standby mode and will only be used if no other
viable workers or spares are available in the balancer set.</td></tr>
<tr><td><code>E</code></td><td><em>Err</em></td><td>Worker is in an error state, usually due to failing a pre-request check;
requests will not be proxied to this worker, but it will be retried depending on
the <code>retry</code> setting of the worker.</td></tr>
<tr><td><code>N</code></td><td><em>Drn</em></td><td>Worker is in drain mode and will only accept existing sticky sessions
destined for itself and ignore all other requests.</td></tr>
<tr><td><code>C</code></td><td><em>HcFl</em></td><td>Worker has failed a dynamic health check and will not be used until it
passes subsequent health checks.</td></tr>
</table>
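<p>
Most of these flags can also be given as a worker's initial state through
the <code>status</code> parameter, where <code>+</code> sets a flag and
<code>-</code> clears it. A short sketch, reusing the worker names from
the earlier examples (which workers start drained or disabled is purely
illustrative):
</p>
<highlight language="config">
&lt;Proxy balancer://myset&gt;
# Starts in drain mode: only existing sticky sessions are served
BalancerMember http://www2.example.com:8080 status=+N
# Starts disabled
BalancerMember http://www3.example.com:8080 status=+D
# Hot standby, used only when no other worker is usable
BalancerMember http://hstandby.example.com:8080 status=+H
&lt;/Proxy&gt;
</highlight>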
</section>
</manualpage>