| Design of APR |
| |
| The Apache Portable Run-time libraries have been designed to provide a common |
| interface to low level routines across any platform. The original goal of APR |
| was to combine all code in Apache into one common code base. That is not the |
| correct approach, however, so the goal of APR has changed. |
| |
| There are places where common code is not a good thing. For example, how to |
| map requests to either threads or processes should be platform specific. |
| APR's place is now to combine any code that can be safely combined without |
| sacrificing performance. |
| |
| To this end we have created a set of operations that are required for |
| cross-platform development. There may be other types that are desired, and |
| those will be implemented in the future. The first version of APR will focus |
| on what Apache 2.0 needs. Of course, anything that is submitted will be |
| considered for inclusion. |
| |
| This document will discuss the structure of APR, and how best to contribute |
| code to the effort. |
| |
| APR On Windows |
| |
| APR on Windows is different from APR on all other systems, because it |
| doesn't use autoconf. On Unix, apr_private.h (private to APR) and apr.h |
| (public, used by applications that use APR) are generated by autoconf |
| from acconfig.h and apr.h.in respectively. On Windows, apr_private.h |
| and apr.h are created from apr_private.hw and apr.hw respectively. |
| |
| !!!*** If you add code to acconfig.h or tests to configure.in or aclocal.m4, |
| please give some thought to whether or not Windows needs this addition |
| as well. A general rule of thumb is that if it is a feature macro, |
| such as APR_HAS_THREADS, Windows needs it. If the definition is going |
| to be used in a public APR header file, such as apr_general.h, Windows |
| needs it. |
| |
| The only time it is safe to add a macro or test without also adding |
| the macro to apr*.hw is when the macro tells APR how to build. For |
| example, a test for a header file does not need to be added to Windows. |
| ***!!! |
| |
| APR Features |
| |
| One of the goals of APR is to provide a common set of features across all |
| platforms. This is an admirable goal, but it is not realistic. We cannot |
| expect to be able to implement ALL features on ALL platforms. So we are |
| going to do the next best thing: provide a common interface to ALL APR |
| features on MOST platforms. |
| |
| APR developers should create FEATURE MACROS for any feature that is not |
| available on ALL platforms. This should be a simple definition which has |
| the form: |
| |
| APR_HAS_FEATURE |
| |
| This macro should evaluate to true if APR has this feature on this platform. |
| For example, Linux and Windows have mmap'ed files, and APR is providing an |
| interface for mmap'ing a file. On both Linux and Windows, APR_HAS_MMAP |
| should evaluate to one, and the ap_mmap_* functions should map files into |
| memory and return the appropriate status codes. |
| |
| If your OS of choice does not have mmap'ed files, APR_HAS_MMAP should evaluate |
| to zero, and none of the ap_mmap_* functions should be defined. Leaving the |
| functions undefined is a precaution that will allow us to break at compile |
| time if a programmer tries to use unsupported functions. |
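
A minimal sketch of how calling code might test such a feature macro. It
assumes the macro would normally come from the generated apr.h; the fallback
#define and the helper name demo_uses_mmap are hypothetical and exist only to
keep the example self-contained.

```c
/* Assumption: a real build gets APR_HAS_MMAP from the generated apr.h.
 * The fallback below is hypothetical and only keeps this sketch
 * self-contained. */
#ifndef APR_HAS_MMAP
#define APR_HAS_MMAP 0
#endif

/* Hypothetical helper: reports which file-loading path was compiled in.
 * Returns 1 when the mmap path is available and 0 when the code must
 * fall back to ordinary reads. */
int demo_uses_mmap(void)
{
#if APR_HAS_MMAP
    /* ap_mmap_* functions may be called only inside this branch. */
    return 1;
#else
    /* No ap_mmap_* calls here: they are not defined on this platform. */
    return 0;
#endif
}
```

Because the check happens in the preprocessor, code that touches the
unsupported functions never reaches the compiler on platforms that lack the
feature.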
| |
| APR types |
| |
| The base types in APR: |
|   file_io      File I/O, including pipes |
|   lib          A portable library originally used in Apache. This contains |
|                memory management, tables, and arrays. |
|   locks        Mutex and reader/writer locks |
|   misc         Any APR type which doesn't have any other place to belong |
|   network_io   Network I/O |
|   shmem        Shared Memory (Not currently implemented) |
|   signal       Asynchronous Signals |
|   threadproc   Threads and Processes |
|   time         Time |
| |
| Directory Structure |
| |
| Each type has a base directory. Inside this base directory are |
| subdirectories, which contain the actual code. These subdirectories are named |
| after the platforms they are compiled on. Unix is also used as a common |
| directory. If the code you are writing is POSIX based, you should look at the |
| code in the unix directory. A good rule of thumb is that if more than half |
| your code needs to be ifdef'ed out, and the structures required for your code |
| are substantively different from the POSIX code, you should create a new |
| directory. |
| |
| Currently, the APR code is written for Unix, BeOS, Windows, and OS/2. An |
| example of the directory structure is the file I/O directory: |
| |
| apr |
|  | |
|   -> file_io |
|       | |
|        -> unix     The Unix and common base code |
|       | |
|        -> win32    The Windows code |
|       | |
|        -> os2      The OS/2 code |
| |
| Note that BeOS does not have a directory. This is because BeOS is currently |
| using the Unix directory for its file_io. In the near future, it will be |
| possible to use individual files from the Unix directory. |
| |
| There are a few special top level directories. These are test, inc, include, |
| and libs. Test is a directory which stores all test programs. It is expected |
| that if a new type is developed, there will also be a new test program, to |
| help people port this new type to different platforms. Inc is a directory for |
| internal header files. This directory is likely to go away soon. Include is |
| a directory which stores all required APR header files for external use. The |
| distinction between internal and external header files will be made soon. |
| Finally, libs is a generated directory. When APR finishes building, it will |
| store its library files in the libs directory. |
| |
| Creating an APR Type |
| |
| The current design of APR requires that APR types be incomplete. It is not |
| possible to write flexible portable code if programs can access the internals |
| of APR types. This is because different platforms are likely to define |
| different native types. |
| |
| For this reason, each platform defines a structure in its own directory. |
| Those structures are then typedef'ed in an external header file. For example, |
| in file_io/unix/fileio.h: |
| |
| struct ap_file_t { |
|     ap_context_t *cntxt; |
|     int filedes; |
|     FILE *filehand; |
|     ... |
| }; |
| |
| In include/apr_file_io.h: |
| typedef struct ap_file_t ap_file_t; |
| |
| This will cause a compiler error if somebody tries to access the filedes field |
| in this structure. Windows does not have a filedes field, so it is |
| important that programs not be able to access these fields. |
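
A minimal sketch of the incomplete-type pattern described above. All names
(demo_file_t and the demo_file_* functions) are hypothetical, and malloc is
used only for brevity; it is an assumption of this sketch, not how APR
allocates memory.

```c
#include <stdlib.h>

/* --- What a public header would contain (hypothetical names) --- */
typedef struct demo_file_t demo_file_t;      /* incomplete type */
demo_file_t *demo_file_create(int fd);
int demo_file_descriptor(const demo_file_t *file);
void demo_file_destroy(demo_file_t *file);

/* --- What a platform-private source file would contain --- */
struct demo_file_t {
    int filedes;    /* POSIX-style descriptor; another platform could
                     * keep a different native handle here instead */
};

/* Allocate a file object wrapping fd; returns NULL if allocation fails. */
demo_file_t *demo_file_create(int fd)
{
    demo_file_t *file = malloc(sizeof(*file));
    if (file != NULL)
        file->filedes = fd;
    return file;
}

/* Accessor: the only way outside code can read the descriptor. */
int demo_file_descriptor(const demo_file_t *file)
{
    return file->filedes;
}

/* Release a file object created by demo_file_create. */
void demo_file_destroy(demo_file_t *file)
{
    free(file);
}
```

A program that includes only the public part can hold and pass around a
demo_file_t pointer, but an expression such as file->filedes does not compile
there because the structure's members are not visible.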
| |
| The only exception to the incomplete type rule can be found in apr_portable.h. |
| This file defines the native types for each platform. Using these types, it |
| is possible to extract native types for any APR type. |
| |
| You may notice the ap_context_t field. All APR types have this field. This |
| type is used to allocate memory within APR. |
| |
| New Function |
| |
| When creating a new function, please try to adhere to these rules. |
| |
| 1) Result arguments should be the first arguments. |
| 2) If a function needs a context, it should be the last argument. |
| 3) These rules are flexible, especially if it makes the code easier |
| to understand because it mimics a standard function. |
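
A short sketch of the argument ordering these rules suggest: the result
argument comes first and the context comes last. The names demo_status_t,
demo_context_t and demo_parse_port are hypothetical, and the context is left
unused because this example allocates nothing.

```c
#include <errno.h>
#include <stdlib.h>

typedef int demo_status_t;                        /* hypothetical status type */
typedef struct demo_context_t demo_context_t;     /* hypothetical context */

/* Parse a decimal TCP port number.
 * port  - result argument, listed first
 * text  - input string
 * cont  - context, listed last (unused in this sketch)
 * Returns 0 on success or EINVAL if text is not a port in 0..65535. */
demo_status_t demo_parse_port(int *port, const char *text,
                              demo_context_t *cont)
{
    char *end;
    long value;

    (void)cont;     /* nothing to allocate here */

    if (text == NULL || *text == '\0')
        return EINVAL;

    value = strtol(text, &end, 10);
    if (*end != '\0' || value < 0 || value > 65535)
        return EINVAL;

    *port = (int)value;
    return 0;
}
```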
| |
| Documentation |
| |
| Whenever a new function is added to APR, it MUST be documented. New |
| functions will not be committed unless there are docs to go along with them. |
| The documentation should be a comment block above the function in the header |
| file. |
| |
| The format for the comment block is: |
| |
| /** |
| * Brief description of the function |
| * @param param_1_name explanation |
| * @param param_2_name explanation |
| * @param param_n_name explanation |
| * @tip Any extra information people should know. |
| * @deffunc function prototype if required |
| */ |
| |
| The last line is not strictly needed. The parser in ScanDoc is not perfect |
| yet, and it cannot parse prototypes that are in any form other than |
|     return_type function_name(type1 param1, type2 param2, ...) |
| This means that any function prototype that resembles: |
|     APR_EXPORT(ap_status_t) ap_foo(int f1, char *f2) |
| will need the deffunc. |
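
As an illustration, here is a filled-in comment block for a made-up function.
DEMO_EXPORT and demo_strlen are hypothetical names; the export macro stands in
for something like APR_EXPORT, which is why the sketch includes a deffunc
line.

```c
#include <stddef.h>

/* Hypothetical stand-in for an export macro such as APR_EXPORT. */
#define DEMO_EXPORT(type) type

/**
 * Count the bytes in a string, not including the terminating NUL.
 * @param len Result: the number of bytes counted
 * @param str The NUL-terminated string to measure
 * @tip Returns a nonzero status if either argument is NULL.
 * @deffunc int demo_strlen(size_t *len, const char *str)
 */
DEMO_EXPORT(int) demo_strlen(size_t *len, const char *str)
{
    size_t n = 0;

    if (len == NULL || str == NULL)
        return 1;

    while (str[n] != '\0')
        n++;

    *len = n;
    return 0;
}
```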
| |
| For an actual example, look at any file in the include directory (ap_tables.h |
| hasn't been done yet). |
| |
| APR Error reporting |
| |
| Most APR functions should return an ap_status_t type. The only time an |
| APR function does not return an ap_status_t is if it absolutely CAN NOT |
| fail. An example of this would be filling out an array when you know you are |
| not beyond the array's range. If it cannot fail on your platform, but it |
| could conceivably fail on another platform, it should return an ap_status_t. |
| Unless you are sure, return an ap_status_t. :-) |
| |
| All platforms return errno values unchanged. Each platform can also have |
| one system error type, which can be returned after an offset is added. |
| There are five types of error values in APR, each with its own offset. |
| |
|    Name                  Purpose |
| 0)                       This is 0 for all platforms and isn't really defined |
|                          anywhere, but it is the offset for errno values. |
|                          (This has no name because it isn't actually defined, |
|                          but for completeness we are discussing it here). |
| 1) APR_OS_START_ERROR    This is platform dependent, and is the offset at which |
|                          APR errors start to be defined. (Canonical error |
|                          values are also defined in this section. [Canonical |
|                          error values are discussed later]). |
| 2) APR_OS_START_STATUS   This is platform dependent, and is the offset at which |
|                          APR status values start. |
| 3) APR_OS_START_USEERR   This is platform dependent, and is the offset at which |
|                          APR apps can begin to add their own error codes. |
| 4) APR_OS_START_SYSERR   This is platform dependent, and is the offset at which |
|                          system error values begin. |
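
The offsets above partition the status space into ranges. A minimal sketch of
classifying a status value by range follows; the numeric offsets and every
DEMO_ name are placeholders chosen for illustration, since the real values are
platform dependent and come from apr_errno.h.

```c
/* Placeholder offsets, assumed for this sketch only. */
#define DEMO_OS_START_ERROR    20000
#define DEMO_OS_START_STATUS   70000
#define DEMO_OS_START_USEERR  120000
#define DEMO_OS_START_SYSERR  720000

typedef enum {
    DEMO_CLASS_ERRNO,        /* below START_ERROR: plain errno values */
    DEMO_CLASS_APR_ERROR,    /* START_ERROR  .. START_STATUS - 1 */
    DEMO_CLASS_APR_STATUS,   /* START_STATUS .. START_USEERR - 1 */
    DEMO_CLASS_APP_ERROR,    /* START_USEERR .. START_SYSERR - 1 */
    DEMO_CLASS_SYSTEM_ERROR  /* START_SYSERR and above */
} demo_error_class;

/* Return the range a status value falls into. */
demo_error_class demo_classify(int status)
{
    if (status < DEMO_OS_START_ERROR)
        return DEMO_CLASS_ERRNO;
    if (status < DEMO_OS_START_STATUS)
        return DEMO_CLASS_APR_ERROR;
    if (status < DEMO_OS_START_USEERR)
        return DEMO_CLASS_APR_STATUS;
    if (status < DEMO_OS_START_SYSERR)
        return DEMO_CLASS_APP_ERROR;
    return DEMO_CLASS_SYSTEM_ERROR;
}
```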
| |
| All of these definitions can be found in apr_errno.h for all platforms. When |
| an error occurs in an APR function, the function must return an error code. |
| If the error occurred in a system call and that system call uses errno to |
| report an error, then the code is returned unchanged. For example: |
| |
| if (open(fname, oflags, 0777) < 0) |
|     return errno; |
| |
| |
| The next place an error can occur is a system call that uses some error value |
| other than the primary error value on a platform. This can also be handled |
| by APR applications. For example: |
| |
| if (CreateFile(fname, oflags, sharemod, NULL, |
|                createflags, attributes, 0) == INVALID_HANDLE_VALUE) |
|     return (GetLastError() + APR_OS_START_SYSERR); |
| |
| These two examples implement the same function for two different platforms. |
| Even if the underlying problem is the same on both platforms, this |
| will result in two different error codes being returned. This is OKAY, and |
| is correct for APR. APR relies on the fact that most of the time an error |
| occurs, the program logs the error and continues; it does not try to |
| programmatically solve the problem. This does not mean we have not provided |
| support for programmatically solving the problem; it just isn't the default |
| case. We'll get to how this problem is solved in a little while. |
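
The two return conventions can be sketched in a self-contained form. This is
an illustration under stated assumptions: the first function needs a POSIX
system, the second models a non-errno native code as a plain number, and
DEMO_OS_START_SYSERR is a placeholder value, not the real offset.

```c
#include <errno.h>
#include <fcntl.h>

#define DEMO_OS_START_SYSERR 720000     /* placeholder offset */

/* errno convention: the failing call sets errno, and the value is
 * returned unchanged. The result argument comes first. */
int demo_open_readonly(int *fd, const char *fname)
{
    int result = open(fname, O_RDONLY);

    if (result < 0)
        return errno;

    *fd = result;
    return 0;
}

/* Offset convention: a native code that is not an errno value is
 * shifted into the system error range so it cannot collide with
 * errno values or with APR's own codes. */
int demo_status_from_native(unsigned long native_code)
{
    return (int)native_code + DEMO_OS_START_SYSERR;
}

/* Reverse of the above: recover the native code from a status value
 * that is known to lie in the system error range. */
unsigned long demo_native_from_status(int status)
{
    return (unsigned long)(status - DEMO_OS_START_SYSERR);
}
```

Since the offset is simply added, no information is lost: the original native
code can always be recovered by subtracting it again.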
| |
| If the error occurs in an APR function and is not due to a system call, |
| but is instead an APR error or just a status code from APR, then the |
| appropriate code should be returned. These codes are defined in apr_errno.h |
| and are self explanatory. |
| |
| No APR code should ever return a code between APR_OS_START_USEERR and |
| APR_OS_START_SYSERR; those codes are reserved for APR applications. |
| |
| To programmatically correct an error in a running application, the error codes |
| need to be consistent across platforms. This should make sense. To get |
| consistent error codes, APR provides a function ap_canonical_error(). |
| This function will take as input any ap_status_t value, and return a small |
| subset of canonical APR error codes. These codes will be equivalent to |
| Unix errno's. Why is it a small subset? Because we don't want to try to |
| convert everything in the first pass. As more programs require that more |
| error codes are converted, they will be added to this function. |
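
A rough sketch of what such a conversion could look like. It is not the
actual ap_canonical_error() implementation: the function name, the placeholder
offset, and the choice to pass unconverted codes through unchanged are all
assumptions. The three native codes use the numeric values of the Win32
ERROR_FILE_NOT_FOUND, ERROR_PATH_NOT_FOUND and ERROR_ACCESS_DENIED constants.

```c
#include <errno.h>

#define DEMO_OS_START_SYSERR 720000     /* placeholder offset */

/* Numeric values of three well-known Win32 error codes. */
#define DEMO_WIN_FILE_NOT_FOUND 2
#define DEMO_WIN_PATH_NOT_FOUND 3
#define DEMO_WIN_ACCESS_DENIED  5

/* Convert a status value to an errno-style canonical code when a
 * mapping is known. Values below the system error range, and native
 * codes without a mapping yet, are returned unchanged (an assumption
 * of this sketch). */
int demo_canonical_error(int status)
{
    if (status < DEMO_OS_START_SYSERR)
        return status;

    switch (status - DEMO_OS_START_SYSERR) {
    case DEMO_WIN_FILE_NOT_FOUND:
    case DEMO_WIN_PATH_NOT_FOUND:
        return ENOENT;      /* two native codes share one canonical code */
    case DEMO_WIN_ACCESS_DENIED:
        return EACCES;
    default:
        return status;      /* not converted in this first pass */
    }
}
```

A caller that wants to branch on the error compares the converted value
against errno constants, for example
demo_canonical_error(status) == ENOENT.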
| |
| Why did APR take this approach? There are two ways to deal with error |
| codes portably. |
| |
| 1) return the same error code across all platforms. |
| 2) return platform specific error codes and convert them when necessary. |
| |
| The problem with option number one is that it takes time to convert error |
| codes to a common code, and most of the time programs want to just output |
| an error string. If we convert all errors to a common subset, we have four |
| steps to output an error string: |
| |
| make syscall that fails |
| convert to common error code        step 1 |
| return common error code |
| check for success |
| call error output function          step 2 |
| convert back to system error        step 3 |
| output error string                 step 4 |
| |
| By keeping the errors platform specific, we can output error strings in two |
| steps. |
| |
| make syscall that fails |
| return error code |
| check for success |
| call error output function          step 1 |
| output error string                 step 2 |
| |
| Less often, programs change their execution based on what error was returned. |
| This is no more expensive using option 2 than it is using option 1, but we |
| put the onus of converting the error code on the programmer. |
| For example, using option 1: |
| |
| make syscall that fails |
| convert to common error code |
| return common error code |
| decide execution based on common error code |
| |
| Using option 2: |
| |
| make syscall that fails |
| return error code |
| convert to common error code (using ap_canonical_error) |
| decide execution based on common error code |
| |
| Finally, there is one more operation on error codes. You can get a string |
| that explains in human readable form what has happened. To do this using |
| APR, call ap_strerror(). |
| |
| On all platforms ap_strerror takes the form: |
| |
| char *ap_strerror(ap_status_t err) |
| { |
|     if (err == 0) |
|         return "No error was found" |
|     if (err < APR_OS_START_ERRNO2) |
|         return (platform dependent error string generator) |
|     if (err < APR_OS_START_ERROR) |
|         return (platform dependent error string generator for |
|                 supplemental error values) |
|     if (err < APR_OS_SYSERR) |
|         return (APR generated error or status string) |
|     else |
|         return "APR doesn't understand this error value" |
| } |
| |
| Notice that this does not handle canonicalized error values well. Those will |
| return "APR doesn't understand this error value" on some platforms and |
| an actual error string on others. To deal with this, just get the |
| string before canonicalizing your error code. |
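
A compilable sketch of range-based string lookup in the same spirit. It is
not the real ap_strerror(): the DEMO_ names, the placeholder offset and the
single APR-level code are invented for illustration, and the zero check is
done first so that it cannot be shadowed by the range tests.

```c
#include <errno.h>
#include <string.h>

#define DEMO_OS_START_ERROR 20000                       /* placeholder offset */
#define DEMO_EXAMPLE_ERROR  (DEMO_OS_START_ERROR + 1)   /* made-up APR-level code */

/* Return a human readable string for a status value. */
const char *demo_strerror(int err)
{
    if (err == 0)
        return "No error was found";

    /* errno range: defer to the C library. */
    if (err > 0 && err < DEMO_OS_START_ERROR)
        return strerror(err);

    /* Library-defined codes get strings owned by the library itself. */
    if (err == DEMO_EXAMPLE_ERROR)
        return "Example APR-level error";

    return "APR doesn't understand this error value";
}
```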
| |
| The other problem with option 1 is that it is a lossy conversion. For |
| example, Windows and OS/2 have a couple hundred error codes, but POSIX errno |
| only defines about 50 errno values. This means that if we convert to a |
| canonical error value immediately, there is no way for the programmer to |
| get the actual system error. |
| |