<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<manualpage metafile="request.xml.meta">
<parentdocument href="./">Developer Documentation</parentdocument>
<title>Request Processing in the Apache HTTP Server 2.x</title>
<summary>
<p>This document describes how the Apache HTTP Server processes
requests internally, covering the full hook sequence from URI
translation through content generation and logging. Module authors
should understand these phases to correctly insert their processing
at the appropriate point in the cycle.</p>
<p>All requests pass through
<code>ap_process_request_internal()</code> in
<code>server/request.c</code>, including subrequests and internal
redirects. Do not duplicate this logic elsewhere; doing so will
break when the request processing API changes.</p>
<p>The first major design principle is that all request processing
paths (main requests, subrequests, and redirects) share a single
code path. Duplicate code was folded back into
<code>ap_process_request_internal()</code> in 2.0 to prevent the
paths from falling out of sync.</p>
<p>To streamline requests, module authors can take advantage of
the <a href="./modguide.html#hooking">hooks offered</a> to drop
out of the request cycle early, or to bypass core hooks which are
irrelevant (and costly in terms of CPU).</p>
</summary>
<section id="overview"><title>Hook Overview</title>
<p>The complete request processing cycle involves the following hooks,
listed in execution order. Most of them are run from
<code>ap_process_request_internal()</code> in
<code>server/request.c</code>; the others, such as
<code>quick_handler</code> and <code>log_transaction</code>, are declared
in <code>http_config.h</code> or <code>http_protocol.h</code> and
run from the MPM or protocol layer.</p>
<ol>
<li><strong><a href="#create_request">create_request</a></strong> &mdash;
Initialize request-specific module data</li>
<li><strong><a href="#quick_handler">quick_handler</a></strong> &mdash;
Short-circuit before the full request cycle (e.g. cache hits)</li>
<li><strong><a href="#pre_translate_name">pre_translate_name</a></strong> &mdash;
Manipulate URI before decoding/translation</li>
<li><strong><a href="#translate_name">translate_name</a></strong> &mdash;
Map URI to filesystem path</li>
<li><strong><a href="#map_to_storage">map_to_storage</a></strong> &mdash;
Merge per-directory config, directory/file walks</li>
<li><strong><a href="#post_perdir_config">post_perdir_config</a></strong> &mdash;
Act on merged per-directory configuration</li>
<li><strong><a href="#header_parser">header_parser</a></strong> &mdash;
Examine client request headers</li>
<li><strong><a href="#token_checker">token_checker</a></strong> &mdash;
Parse bearer tokens or other auth metadata (trunk)</li>
<li><strong><a href="#access_checker">access_checker</a></strong> &mdash;
Host-based or environment-based access control</li>
<li><strong><a href="#access_checker_ex">access_checker_ex</a></strong> &mdash;
Extended access control with auth bypass capability</li>
<li><strong><a href="#force_authn">force_authn</a></strong> &mdash;
Force authentication even when not otherwise required</li>
<li><strong><a href="#check_user_id">check_user_id</a></strong> &mdash;
Authenticate the user (set <code>r-&gt;user</code>)</li>
<li><strong><a href="#auth_checker">auth_checker</a></strong> &mdash;
Authorize the authenticated user</li>
<li><strong><a href="#type_checker">type_checker</a></strong> &mdash;
Determine content type, language, encoding</li>
<li><strong><a href="#fixups">fixups</a></strong> &mdash;
Last chance to adjust request fields before content generation</li>
<li><strong><a href="#insert_filter">insert_filter</a></strong> &mdash;
Insert content/protocol filters</li>
<li><strong><a href="#hook_handler">handler</a></strong> &mdash;
Generate the response content</li>
<li><strong><a href="#log_transaction">log_transaction</a></strong> &mdash;
Log the completed transaction</li>
</ol>
<p>Additionally, the <strong><a href="#dirwalk_stat">dirwalk_stat</a></strong>
hook is called during directory walks to allow modules to emulate or
override <code>apr_stat()</code> calls.</p>
</section>
<section id="parsing"><title>The Request Parsing Phase</title>
<p>Early in <code>ap_process_request_internal()</code>, the server
normalizes the URL:</p>
<section id="unescape"><title>Unescapes the URL</title>
<p>The request's <code>parsed_uri</code> path is unescaped, once and only
once, at the beginning of internal request processing.</p>
<p>This step is bypassed if the <code>proxyreq</code> flag is set, or the
<code>parsed_uri.path</code> element is unset. Modules have no further
control of this one-time unescape operation; either failing to
unescape or unescaping the URL more than once has security
repercussions.</p>
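<p>The risk of unescaping more than once can be illustrated with a
minimal sketch. The <code>unescape_once()</code> helper below is
hypothetical and is not the server's own decoder: it performs no
validation of malformed or forbidden escapes and only shows what a single
decoding pass does.</p>
<highlight language="c"><![CDATA[
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hex_value(unsigned char c)
{
    if (isdigit(c)) {
        return c - '0';
    }
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical helper (NOT the httpd implementation): decode every %XX
 * escape in `s` exactly once, in place. Malformed escapes are copied
 * through unchanged. Returns the length of the decoded string.
 */
static size_t unescape_once(char *s)
{
    const char *in = s;
    char *out = s;

    assert(s != NULL);
    while (*in != '\0') {
        if (in[0] == '%' && isxdigit((unsigned char)in[1])
            && isxdigit((unsigned char)in[2])) {
            /* Two hex digits follow: emit the byte they encode. */
            *out++ = (char)(hex_value((unsigned char)in[1]) * 16
                            + hex_value((unsigned char)in[2]));
            in += 3;
        }
        else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```
]]></highlight>
<p>Applied once to <code>/%252e%252e/</code>, the sketch yields
<code>/%2e%2e/</code>. A second application would yield
<code>/../</code>, which is why the decode must happen exactly once.</p>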
</section>
<section id="strip"><title>Strips Parent and This Elements from the
URI</title>
<p>All <code>/../</code> and <code>/./</code> elements are
removed by <code>ap_getparents()</code>, as well as any trailing
<code>/.</code> or <code>/..</code> element. This helps to ensure
the path is (nearly) absolute before the request processing
continues. (See RFC 1808 section 4 for further discussion.)</p>
<p>This step cannot be bypassed.</p>
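<p>The effect of this step can be sketched as follows. The
<code>remove_dot_segments()</code> function is a hypothetical,
simplified stand-in and not the code of <code>ap_getparents()</code>;
it assumes an absolute path that starts with <code>/</code> and drops any
<code>..</code> that would climb above the root.</p>
<highlight language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (NOT httpd's ap_getparents()): collapse "." and ".."
 * segments of an absolute path in place. A trailing "/." or "/.." leaves
 * a trailing slash; a ".." at the root is silently dropped.
 */
static void remove_dot_segments(char *path)
{
    const char *in = path;  /* read position, always at a '/' or at NUL */
    char *out = path;       /* write position, never ahead of `in`      */

    assert(path != NULL && path[0] == '/');
    while (*in == '/') {
        const char *seg = in + 1;            /* segment after the slash */
        size_t len = strcspn(seg, "/");      /* its length              */
        int is_last = (seg[len] == '\0');

        if (len == 1 && seg[0] == '.') {
            /* "/." : drop the segment, keep a trailing slash if last. */
            if (is_last) {
                *out++ = '/';
            }
        }
        else if (len == 2 && seg[0] == '.' && seg[1] == '.') {
            /* "/.." : back up over the previously written segment. */
            while (out > path && *--out != '/') {
                /* scanning backwards to the preceding '/' */
            }
            if (is_last) {
                *out++ = '/';
            }
        }
        else {
            /* Ordinary (or empty) segment: copy "/segment" as is. */
            memmove(out, in, len + 1);
            out += len + 1;
        }
        in = seg + len;
    }
    *out = '\0';
}
```
]]></highlight>
<p>With this sketch, <code>/a/b/../c</code> becomes <code>/a/c</code> and
<code>/a/b/..</code> becomes <code>/a/</code>.</p>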
</section>
<section id="initial-location-walk"><title>Initial URI Location Walk</title>
<p>Every request is subject to an
<code>ap_location_walk()</code> call. This ensures that
<directive type="section" module="core">Location</directive> sections
are consistently enforced for all requests. If the request is an internal
redirect or a subrequest, it may borrow some or all of the processing
from the previous or parent request's <code>ap_location_walk()</code>
call, so this step is generally very efficient after processing the main
request.</p>
</section>
</section>
<section id="quick_handler"><title>Hook: quick_handler</title>
<p>The <code>quick_handler</code> hook runs <em>before</em>
<code>ap_process_request_internal()</code> is entered, and therefore
before location walks, directory walks, access checking, and
authentication. It provides a fast path for modules that can serve
content directly from a URI-keyed cache or similar mechanism without
needing per-directory configuration.</p>
<p>This hook is declared in <code>http_config.h</code> and called from
the MPM/protocol layer, not from
<code>ap_process_request_internal()</code>.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, quick_handler, (request_rec *r, int lookup_uri))
</highlight>
<p>The <code>lookup_uri</code> parameter is set to 1 when called from
<code>ap_sub_req_lookup_uri()</code>, indicating the caller only needs
metadata (not actual content delivery).</p>
<p>Used by: <module>mod_cache</module></p>
<p>Return <code>OK</code> to indicate the request has been fully handled.
Return <code>DECLINED</code> to fall through to normal processing.</p>
</section>
<section id="create_request"><title>Hook: create_request</title>
<p>Called when a new <code>request_rec</code> is created (for main
requests, subrequests, and internal redirects). Modules use this hook
to initialize per-request module state and set up private data
structures attached to the request pool or request notes.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, create_request, (request_rec *r))
</highlight>
<p>This is a <code>RUN_ALL</code> hook — all registered modules get
a chance to run. Return <code>OK</code> or <code>DECLINED</code>.</p>
<p>Used by: <module>mod_http</module> (http_core.c),
<module>mod_firehose</module></p>
</section>
<section id="translation"><title>The Translation Phase</title>
<section id="pre_translate_name"><title>Hook: pre_translate_name</title>
<p>Runs before URL decoding happens. Modules can manipulate the
raw URI before it is translated to a filesystem path. This is
useful for modules that need to operate on the URI before
percent-decoding or normalization.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, pre_translate_name, (request_rec *r))
</highlight>
<p>Return <code>DECLINED</code> to let other modules handle the
pre-translation, <code>OK</code> if it was handled, <code>DONE</code>
if no further transformation should happen on the URI, or an
HTTP error status code.</p>
<p>Used by: <module>mod_proxy</module></p>
</section>
<section id="translate_name"><title>Hook: translate_name</title>
<p>Modules can determine the file name, or alter the given URI
in this step. For example, <module>mod_vhost_alias</module> will
translate the URI's path into the configured virtual host,
<module>mod_alias</module> will translate the path to an alias path,
and if the request falls back on the core, the <directive module="core"
>DocumentRoot</directive> is prepended to the request resource.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, translate_name, (request_rec *r))
</highlight>
<p>If all modules return <code>DECLINED</code> in this phase, an error 500
is returned to the browser, and a "couldn't translate name" error is logged
automatically.</p>
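<p>The fallback described above amounts to mapping a leftover
<code>DECLINED</code> status to a server error. The sketch below is a
standalone illustration, not the server's code: the function name
<code>declined_to_500()</code> is hypothetical, the constants are
redefined locally, and the log line only approximates the wording quoted
above.</p>
<highlight language="c"><![CDATA[
```c
#include <stdio.h>

/* Redefined locally so the sketch compiles without httpd.h. */
#define DECLINED (-1)
#define HTTP_INTERNAL_SERVER_ERROR 500

/*
 * Hypothetical helper: if no module claimed the phase (status is still
 * DECLINED), report a configuration error and answer with a 500.
 * Any other status is passed through unchanged.
 */
static int declined_to_500(int status, const char *phase, const char *uri)
{
    if (status == DECLINED) {
        /* Illustrative message only; the server writes to its error log. */
        fprintf(stderr, "configuration error: couldn't %s: %s\n", phase, uri);
        return HTTP_INTERNAL_SERVER_ERROR;
    }
    return status;
}
```
]]></highlight>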
</section>
</section>
<section id="map_to_storage"><title>Hook: map_to_storage</title>
<p>Once the file or correct URI has been determined, the
appropriate per-directory configurations are merged together. For
example, <module>mod_proxy</module> compares and merges the appropriate
<directive module="mod_proxy" type="section">Proxy</directive> sections.
If the URI is nothing more than a local (non-proxy) <code>TRACE</code>
request, the core handles the request and returns <code>DONE</code>.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, map_to_storage, (request_rec *r))
</highlight>
<p>If no module answers this hook with <code>OK</code> or <code>DONE</code>,
the core will run the request filename against the <directive
module="core" type="section">Directory</directive> and <directive
module="core" type="section">Files</directive> sections. If the request
<code>filename</code> is not an absolute, legal filename, a note is set for
later termination.</p>
<p>After <code>map_to_storage</code>, a second
<code>ap_location_walk()</code> call hardens the request by re-applying
<directive module="core" type="section">Location</directive> sections
to the translated URI.</p>
</section>
<section id="post_perdir_config"><title>Hook: post_perdir_config</title>
<p>This hook fires immediately after per-directory configuration has been
merged (after both <code>map_to_storage</code> and the second location
walk). Modules can use it to act on the fully-merged per-directory
configuration before access control runs.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, post_perdir_config, (request_rec *r))
</highlight>
<p>Return <code>OK</code> to allow processing to continue,
<code>DECLINED</code> to let later modules decide, or an HTTP
error status code to abort.</p>
</section>
<section id="header_parser"><title>Hook: header_parser</title>
<p>The main request then parses the client's headers. This
prepares the remaining request processing steps to better serve
the client's request. This hook only runs for the initial
request (not subrequests).</p>
</section>
<section id="security"><title>The Security Phase</title>
<p>The security phase in 2.4+ uses the "new" provider-based
authentication/authorization architecture managed by
<module>mod_auth_basic</module>, <module>mod_authz_core</module>,
and related modules. The hook execution order depends on the
<directive module="mod_access_compat">Satisfy</directive> setting
and whether access control is required (<directive
module="mod_authz_core">Require</directive> directives).</p>
<p>The hooks execute in this order:</p>
<section id="token_checker"><title>Hook: token_checker</title>
<p>Parses any tokens in the request (e.g. bearer tokens, API keys)
that contain metadata such as user identities or IP addresses
relevant to the request. Runs before the access checker.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, token_checker, (request_rec *r))
</highlight>
<p>If this hook returns <code>OK</code> under <code>Satisfy any</code>,
the request is authorized immediately without running further
access/auth hooks.</p>
<note><title>Note</title>
<p>This hook is available in trunk only (not backported to 2.4
at the time of writing).</p></note>
</section>
<section id="access_checker"><title>Hook: access_checker</title>
<p>Applies additional access control to the resource. This hook runs
<em>before</em> a user is authenticated, so it is for restrictions
independent of user identity (e.g. IP-based access, time-of-day
restrictions). It runs independent of <directive
module="mod_authz_core">Require</directive> directive usage.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, access_checker, (request_rec *r))
</highlight>
<p>This is a <code>RUN_ALL</code> hook — all registered modules run.
Return <code>OK</code> to allow, or an HTTP error status to deny.</p>
</section>
<section id="access_checker_ex"><title>Hook: access_checker_ex</title>
<p>Extended access control that runs after <code>access_checker</code>
but before user authentication. This hook can also <em>bypass</em>
authentication entirely by returning <code>OK</code> — used by
<module>mod_authz_core</module> to implement the new authorization
model where <code>Require</code> directives can grant access
without credentials (e.g. <code>Require ip</code>).</p>
<highlight language="c">
AP_DECLARE_HOOK(int, access_checker_ex, (request_rec *r))
</highlight>
<p>Return <code>OK</code> to grant access (skipping authn unless
<code>force_authn</code> overrides), <code>DECLINED</code> to
require authentication, or an HTTP error status to deny.</p>
</section>
<section id="force_authn"><title>Hook: force_authn</title>
<p>Allows a module to force authentication to be required even when
<code>access_checker_ex</code> has already granted access. This is
useful when a module needs the authenticated user identity for
purposes beyond authorization (e.g. logging, personalization).</p>
<highlight language="c">
AP_DECLARE_HOOK(int, force_authn, (request_rec *r))
</highlight>
<p>Return <code>OK</code> to force authentication, or
<code>DECLINED</code> to let later modules decide.</p>
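<p>The interplay between <code>access_checker_ex</code> and
<code>force_authn</code> can be summarized as a small decision function.
This is a sketch of the default path only, ignoring
<code>Satisfy any</code> and <code>token_checker</code>. The names
<code>decide_authn()</code> and <code>enum authn_decision</code> are
hypothetical, and <code>OK</code> and <code>DECLINED</code> are redefined
locally so the example compiles on its own.</p>
<highlight language="c"><![CDATA[
```c
/* Values mirror httpd.h; redefined so the sketch compiles standalone. */
#define OK        0
#define DECLINED (-1)

/* Hypothetical outcome of the decision. */
enum authn_decision {
    AUTHN_SKIP,  /* access granted without authenticating the user */
    AUTHN_RUN,   /* run check_user_id and then auth_checker        */
    AUTHN_DENY   /* abort the request with the returned status     */
};

/*
 * access_ex_status:   result of running the access_checker_ex hook
 * force_authn_status: result of running the force_authn hook
 */
static enum authn_decision decide_authn(int access_ex_status,
                                        int force_authn_status)
{
    if (access_ex_status == DECLINED) {
        /* Nobody granted access up front: authentication is required. */
        return AUTHN_RUN;
    }
    if (access_ex_status == OK) {
        /* Access was granted; authenticate only if a module insists. */
        return (force_authn_status == OK) ? AUTHN_RUN : AUTHN_SKIP;
    }
    /* Any other value is an HTTP error status. */
    return AUTHN_DENY;
}
```
]]></highlight>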
</section>
<section id="check_user_id"><title>Hook: check_user_id (authn)</title>
<p>Authenticates the user — analyzes the request headers, validates
credentials, and sets <code>r-&gt;user</code> and
<code>r-&gt;ap_auth_type</code>. This hook only runs when Apache
determines that authentication is required for this resource.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, check_user_id, (request_rec *r))
</highlight>
<p>Modules should register using <code>ap_hook_check_authn()</code>
rather than hooking <code>check_user_id</code> directly.</p>
</section>
<section id="auth_checker"><title>Hook: auth_checker (authz)</title>
<p>Checks whether the authenticated user (<code>r-&gt;user</code>)
is authorized to access this resource. Runs after
<code>check_user_id</code>, and only when a <directive
module="mod_authz_core">Require</directive> directive is
in effect.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, auth_checker, (request_rec *r))
</highlight>
<p>Modules should register using <code>ap_hook_check_authz()</code>
rather than hooking <code>auth_checker</code> directly.</p>
</section>
</section>
<section id="preparation"><title>The Preparation Phase</title>
<section id="type_checker"><title>Hook: type_checker</title>
<p>The modules have an opportunity to test the URI or filename
against the target resource, and set mime information for the
request. Both <module>mod_mime</module> and
<module>mod_mime_magic</module> use this phase to compare the file
name or contents against the administrator's configuration and set the
content type, language, character set and request handler. Some modules
may set up their filters or other request handling parameters at this
time.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, type_checker, (request_rec *r))
</highlight>
<p>If all modules return <code>DECLINED</code> in this phase, an error 500
is returned to the browser, and a "couldn't find types" error is logged
automatically.</p>
</section>
<section id="fixups"><title>Hook: fixups</title>
<p>Many modules are "trounced" by some phase above. The fixups
phase is used by modules to reassert their ownership or force
the request's fields to their appropriate values. It is the last
hook to run before content generation.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, fixups, (request_rec *r))
</highlight>
<p>This is a <code>RUN_ALL</code> hook — all registered modules
get a chance to run. Used by <module>mod_env</module>,
<module>mod_headers</module>, and others.</p>
</section>
</section>
<section id="handler"><title>The Handler Phase</title>
<p>This phase is <strong>not</strong> part of the processing in
<code>ap_process_request_internal()</code>. After the core or a module
calls <code>ap_process_request_internal()</code>, it then calls
<code>ap_invoke_handler()</code> to generate the response.</p>
<section id="insert_filter"><title>Hook: insert_filter</title>
<p>Modules that transform the content in some way can insert
their filters and override existing ones, so that if the
user configured a more advanced filter out of order, the
module can adjust the filter order as needed. There is no result code,
so actions in this hook must always succeed.</p>
<highlight language="c">
AP_DECLARE_HOOK(void, insert_filter, (request_rec *r))
</highlight>
<p>This is a VOID hook — no return value. Used by
<module>mod_deflate</module>, <module>mod_filter</module>,
<module>mod_ssl</module>, and other filter modules to insert
themselves into the output filter chain.</p>
</section>
<section id="hook_handler"><title>Hook: handler</title>
<p>The module finally has a chance to serve the request in its
handler hook. Note that not every prepared request is sent to
the handler hook. Many modules, such as <module>mod_autoindex</module>,
will create subrequests for a given URI, and then never serve the
subrequest, but simply list it for the user. Remember not to
put required teardown from the hooks above into this hook;
instead, register pool cleanups against the request pool to free
resources as required.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, handler, (request_rec *r))
</highlight>
</section>
</section>
<section id="logging"><title>The Logging Phase</title>
<section id="log_transaction"><title>Hook: log_transaction</title>
<p>After the response has been sent to the client, modules can
perform logging activities. This hook is declared in
<code>http_protocol.h</code> and runs outside of
<code>ap_process_request_internal()</code>.</p>
<highlight language="c">
AP_DECLARE_HOOK(int, log_transaction, (request_rec *r))
</highlight>
<p>Used by: <module>mod_log_config</module>,
<module>mod_log_forensic</module>, <module>mod_logio</module></p>
<p>Return <code>OK</code> or <code>DECLINED</code>. Errors at
this stage do not affect the client response (it has already
been sent).</p>
</section>
</section>
<section id="dirwalk_stat"><title>Hook: dirwalk_stat</title>
<p>This hook is called during directory walks to allow modules to
handle or emulate the <code>apr_stat()</code> calls needed to
traverse the filesystem. This enables modules to serve content
from non-filesystem backends (databases, remote storage, etc.)
while still participating in the directory walk mechanism.</p>
<highlight language="c">
AP_DECLARE_HOOK(apr_status_t, dirwalk_stat,
(apr_finfo_t *finfo, request_rec *r, apr_int32_t wanted))
</highlight>
<p>Return an <code>apr_status_t</code> value, or
<code>AP_DECLINED</code> to let later modules (or the default
<code>apr_stat()</code> call) decide.</p>
</section>
<section id="hookorder"><title>Hook Types and Ordering</title>
<p>Each hook uses one of the following execution strategies:</p>
<dl>
<dt><code>RUN_FIRST</code></dt>
<dd>Hooks stop at the first module that does <em>not</em> return
<code>DECLINED</code>. Used by: <code>pre_translate_name</code>,
<code>translate_name</code>, <code>map_to_storage</code>,
<code>check_user_id</code>, <code>type_checker</code>,
<code>access_checker_ex</code>, <code>auth_checker</code>,
<code>force_authn</code>, <code>token_checker</code>,
<code>dirwalk_stat</code>.</dd>
<dt><code>RUN_ALL</code></dt>
<dd>Every registered module runs until one returns a status other than
<code>OK</code> or <code>DECLINED</code>.
Used by: <code>fixups</code>, <code>access_checker</code>,
<code>create_request</code>, <code>post_perdir_config</code>.</dd>
<dt><code>VOID</code></dt>
<dd>Every registered module runs with no return value.
Used by: <code>insert_filter</code>.</dd>
</dl>
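<p>The difference between the first two strategies can be sketched with
a standalone dispatcher. This is an illustration only, not the macros the
server uses to implement hooks: <code>hook_fn</code>,
<code>run_first()</code>, <code>run_all()</code>, and the three sample
hooks are hypothetical names, and <code>OK</code> and
<code>DECLINED</code> are redefined locally.</p>
<highlight language="c"><![CDATA[
```c
#include <stddef.h>

/* Values mirror httpd.h; redefined so the sketch compiles standalone. */
#define OK        0
#define DECLINED (-1)

/* Hypothetical hook signature; ctx stands in for the request_rec. */
typedef int (*hook_fn)(void *ctx);

/* RUN_FIRST: stop at the first hook that does not return DECLINED. */
static int run_first(const hook_fn *hooks, size_t n, void *ctx)
{
    for (size_t i = 0; i < n; i++) {
        int rv = hooks[i](ctx);
        if (rv != DECLINED) {
            return rv;
        }
    }
    return DECLINED;
}

/* RUN_ALL: run every hook unless one returns neither OK nor DECLINED. */
static int run_all(const hook_fn *hooks, size_t n, void *ctx)
{
    for (size_t i = 0; i < n; i++) {
        int rv = hooks[i](ctx);
        if (rv != OK && rv != DECLINED) {
            return rv;
        }
    }
    return OK;
}

/* Sample hooks; each counts its invocation through ctx (an int *). */
static int hook_declines(void *ctx)  { ++*(int *)ctx; return DECLINED; }
static int hook_ok(void *ctx)        { ++*(int *)ctx; return OK; }
static int hook_forbidden(void *ctx) { ++*(int *)ctx; return 403; }
```
]]></highlight>
<p>For a chain of <code>hook_declines</code>, <code>hook_ok</code>,
<code>hook_forbidden</code> (in that order), <code>run_first()</code>
stops after the second hook and returns <code>OK</code>, whereas
<code>run_all()</code> calls all three and returns 403.</p>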
<p>Modules control their position in the hook chain using the
<code>order</code>, <code>predecessors</code>, and
<code>successors</code> arguments to the <code>ap_hook_*</code>
registration functions. See <a href="./modguide.html#hooking">the
module guide</a> for details.</p>
</section>
</manualpage>