| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
| <HTML> |
| <HEAD> |
| <TITLE>Request Processing in Apache 2.0</TITLE> |
| </HEAD> |
| |
| <!-- Background white, links blue (unvisited), navy (visited), red (active) --> |
| <BODY |
| BGCOLOR="#FFFFFF" |
| TEXT="#000000" |
| LINK="#0000FF" |
| VLINK="#000080" |
| ALINK="#FF0000" |
| > |
| |
| <!--#include virtual="header.html" --> |
| |
| <h1>Request Processing in Apache 2.0</h1> |
| |
| <p>Warning - this is a first (fast) draft that needs further revision!</p> |
| |
| <p>Several changes in Apache 2.0 affect the internal request processing |
| mechanics. Module authors need to be aware of these changes so they |
| may take advantage of the optimizations and security enhancements.</p> |
| |
| <p>The first major change is to the subrequest and redirect mechanisms. |
| There were a number of different code paths in Apache 1.3 to attempt |
| to optimize subrequest or redirect behavior. As patches were introduced |
| to 2.0, these optimizations (and the server behavior) were quickly broken |
| due to this duplication of code. All duplicate code has been folded |
| back into <code>ap_process_internal_request()</code> to prevent the |
| code from falling out of sync again.</p> |
| |
| <p>This means that much of the existing code was 'unoptimized'. It is |
| the Apache HTTP Project's first goal to create a robust and correct |
| implementation of the HTTP server RFC. Additional goals include |
| security, scalability and optimization. New methods were sought to |
| optimize the server (beyond the performance of Apache 1.3) without |
| introducing fragile or insecure code.</p> |
| |
| <h2>The Request Processing Cycle</h2> |
| |
| <p>All requests pass through <code>ap_process_request_internal()</code> |
| in request.c, including subrequests and redirects. If a module doesn't |
| pass generated requests through this code, the author is cautioned that |
| the module may be broken by future changes to request processing.</p> |
| |
| <p>To streamline requests, the module author can take advantage of the |
| hooks offered to drop out of the request cycle early, or to bypass |
| core Apache hooks which are irrelevant (and costly in terms of CPU.)</p> |
| |
| <h2>The Request Parsing Phase</h3> |
| |
| <h3>Unescapes the URL</h3> |
| |
| <p>The request's parsed_uri path is unescaped, once and only once, at the |
| beginning of internal request processing.</p> |
| |
| <p>This step is bypassed if the proxyreq flag is set, or the parsed_uri.path |
| element is unset. The module has no further control of this one-time |
| unescape operation, either failing to unescape or multiply unescaping |
| the URL leads to security reprecussions.</p> |
| |
| <h3>Strips Parent and This Elements from the URI</h3> |
| |
| <p>All <code>/../</code> and <code>/./</code> elements are removed by |
| <code>ap_getparents()</code>. This helps to ensure the path is (nearly) |
| absolute before the request processing continues.</p> |
| |
| <p>This step cannot be bypassed.</p> |
| |
| <h3>Initial URI Location Walk</h3> |
| |
| <p>Every request is subject to an <code>ap_location_walk()</code> call. |
| This ensures that <Location > sections are consistently enforced for |
| all requests. If the request is an internal redirect or a sub-request, |
| it may borrow some or all of the processing from the previous or parent |
| request's ap_location_walk, so this step is generally very efficient |
| after processing the main request.</p> |
| |
| <h3>Hook: translate_name</h3> |
| |
| <p>Modules can determine the file name, or alter the given URI in this step. |
| For example, mod_vhost_alias will translate the URI's path into the |
| configured virtual host, mod_alias will translate the path to an alias |
| path, and if the request falls back on the core, the DocumentRoot is |
| prepended to the request resource. |
| |
| <p>If all modules DECLINE this phase, an error 500 is returned to the browser, |
| and a "couldn't translate name" error is logged automatically.</p> |
| |
| <h3>Hook: map_to_storage</h3> |
| |
| <p>After the file or correct URI was determined, the appropriate per-dir |
| configurations are merged together. For example, mod_proxy compares |
| and merges the appropriate <Proxy > sections. If the URI is nothing |
| more than a local (non-proxy) TRACE request, the core handles the |
| request and returns DONE. If no module answers this hook with OK or |
| DONE, the core will run the request filename against the <Directory > |
| and <Files > sections. If the request 'filename' isn't an absolute, |
| legal filename, a note is set for later termination.</p> |
| |
| <h3>Initial URI Location Walk</h3> |
| |
| <p>Every request is hardened by a second <code>ap_location_walk()</code> |
| call. This reassures that a translated request is still subjected to |
| the configured <Location > sections. The request again borrows |
| some or all of the processing from it's previous location_walk above, |
| so this step is almost always very efficient unless the translated URI |
| mapped to a substantially different path or Virtual Host.</p> |
| |
| <h3>Hook: header_parser</h3> |
| |
| <p>The main request then parses the client's headers. This prepares |
| the remaining request processing steps to better serve the client's |
| request.</p> |
| |
| <h2>The Security Phase</h3> |
| |
| <p>Needs Documentation. Code is;</p> |
| <pre> |
| switch (ap_satisfies(r)) { |
| case SATISFY_ALL: |
| case SATISFY_NOSPEC: |
| if ((access_status = ap_run_access_checker(r)) != 0) { |
| return decl_die(access_status, "check access", r); |
| } |
| if (ap_some_auth_required(r)) { |
| if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) { |
| return decl_die(access_status, ap_auth_type(r) |
| ? "check user. No user file?" |
| : "perform authentication. AuthType not set!", r); |
| } |
| if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) { |
| return decl_die(access_status, ap_auth_type(r) |
| ? "check access. No groups file?" |
| : "perform authentication. AuthType not set!", r); |
| } |
| } |
| break; |
| case SATISFY_ANY: |
| if (((access_status = ap_run_access_checker(r)) != 0) || !ap_auth_type(r)) { |
| if (!ap_some_auth_required(r)) { |
| return decl_die(access_status, ap_auth_type(r) |
| ? "check access" |
| : "perform authentication. AuthType not set!", r); |
| } |
| if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) { |
| return decl_die(access_status, ap_auth_type(r) |
| ? "check user. No user file?" |
| : "perform authentication. AuthType not set!", r); |
| } |
| if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) { |
| return decl_die(access_status, ap_auth_type(r) |
| ? "check access. No groups file?" |
| : "perform authentication. AuthType not set!", r); |
| } |
| } |
| break; |
| } |
| </pre> |
| <h2>The Preparation Phase</h2> |
| |
| <h3>Hook: type_checker</h3> |
| |
| <p>The modules have an opportunity to test the URI or filename against |
| the target resource, and set mime information for the request. Both |
| mod_mime and mod_mime_magic use this phase to compare the file name |
| or contents against the administrator's configuration and set the |
| content type, language, character set and request handler. Some |
| modules may set up their filters or other request handling parameters |
| at this time.</p> |
| |
| <p>If all modules DECLINE this phase, an error 500 is returned to the browser, |
| and a "couldn't find types" error is logged automatically.</p> |
| |
| <h3>Hook: fixups</h3> |
| |
| <p>Many modules are 'trounced' by some phase above. The fixups phase is |
| used by modules to 'reassert' their ownership or force the request's |
| fields to their appropriate values. It isn't always the cleanest |
| mechanism, but occasionally it's the only option.</p> |
| |
| <h3>Hook: insert_filter</h3> |
| |
| <p>Modules that transform the content in some way can insert their values |
| and override existing filters, such that if the user configured a more |
| advanced filter out-of-order, then the module can move it's order as |
| need be. |
| |
| <h2>The Handler Phase</h2> |
| |
| <p>This phase is <strong><em>not</em></strong> part of the processing in |
| <code>ap_process_request_internal()</code>. Many modules prepare one |
| or more subrequests prior to creating any content at all. After the |
| core, or a module calls <code>ap_process_request_internal()</code> it |
| then calls <code>ap_invoke_handler()</code> to generate the request.</p> |
| |
| <h3>Hook: handler</h3> |
| |
| <p>The module finally has a chance to serve the request in it's handler |
| hook. Note that not every prepared request is sent to the handler |
| hook. Many modules, such as mod_autoindex, will create subrequests |
| for a given URI, and then never serve the subrequest, but simply |
| lists it for the user. Remember not to put required teardown from |
| the hooks above into this module, but register pool cleanups against |
| the request pool to free resources as required.</p> |
| |
| <!--#include virtual="footer.html" --> |
| |
| </body> |
| </html> |
| |
| |
| |
| |
| |