blob: 267d74c7417405252b507c2820b7d8166e59330a [file] [log] [blame]
=head1 NAME
mod_perl_traps - common/known mod_perl traps
=head1 DESCRIPTION
In the CGI environment, the server starts a single external process
(Perl interpreter) per HTTP request which runs single script in that
process space. When the request is over, the process goes away
everything is cleaned up and a fresh script is started for the next
request. mod_perl brings Perl inside of the HTTP server not only for
speedup of CGI scripts, but also for access to server functionality
that CGI scripts do not and/or cannot have. Now that we're inside the
server, each process will likely handle more than one Perl script and
keep it "compiled" in memory for longer than a single HTTP request.
This new location and longer lifetime of Perl execution brings with it
some common traps. This document is here to tell you what they are
and how to prevent them. The descriptions here are short, please
consult the mod_perl FAQ for more detail. If you trip over something
not documented here, please send a message to the mod_perl list.
=head2 Migrating from CGI
=over 4
=item *
Be sure to have read L<cgi_to_mod_perl>
=item *
Scripts under Apache::Registry are not run in package B<main>, they
are run in a unique namespace based on the requested uri.
=item *
Apache::Registry scripts cannot contain __END__ or __DATA__ tokens
=item *
Output of C<system>, C<exec> and C<open PIPE, "|program"> calls will
not be sent to the browser unless you Perl was configured with sfio.
=item *
Perl's exit() built-in function cannot be used in mod_perl scripts.
The Apache::exit() function should be used instead. Apache::exit()
automatically overrides the built-in exit() for Apache::Registry
and Apache::PerlRun scripts.
=item *
Your script *will not* run from the command line if your script makes
any direct calls to Apache-E<gt>methods. See Apache::FakeRequest.
=back
=head2 Apache::Registry
=over 4
=item undefined subroutine &Apache::Registry::handler
Interaction with certain modules causes the shortcut configuration to
break, if you see this message change your configuration from this:
<Location /perl>
PerlHandler Apache::Registry
...
</Location>
To this:
PerlModule Apache::Registry
<Location /perl>
PerlHandler Apache::Registry::handler
...
</Location>
=back
=head2 Using CGI.pm and CGI::*
=over 4
=item *
CGI.pm users B<must> have version B<2.39> of the package or higher,
earlier versions will not work under mod_perl.
=item *
If you use the C<SendHeaders()> function, be sure to call
$req_obj-E<gt>cgi-E<gt>done when you are done with a request, just as you
would under I<CGI::MiniSrv>.
=back
=head2 Perl Modules and Extensions
=over 4
=item *
Files pulled in via C<use> or C<require> statements are not
automatically reloaded when changed on disk. See the Apache::StatINC
or the Apache::Reload module to add this functionality.
=item Undefined subroutines
A common trap with required files may result in an error message
similar to this in the error_log:
[Thu Sep 11 11:03:06 1997] Undefined subroutine
&Apache::ROOT::perl::test_2epl::some_function called at
/opt/www/apache/perl/test.pl line 79.
As the above items explains, a file pulled in via C<require> will only
happen once per-process (unless %INC is modified). If the file does
not contain a C<package> declaration, the file's subroutines and
variables will be created in the current package. Under CGI, this is
commonly package C<main>. However, B<Apache::Registry> scripts are
compiled into a unique package name (base on the uri). So, if
multiple scripts in the same process try to require the same file,
which does not declare a package, only one script will actually be
able to see the subroutines. The solution is to read L<perlmodlib>,
L<perlmod> and related perl documentation and re-work your required
file into a module which exports functions or defines a method
interface.
Or something more simple, along these lines:
#required_file.pl
package Test;
sub some_function {...}
...
__END__
Now, have your scripts say:
require "required_file.pl";
Test::some_function();
=item Undefined subroutine &Foo::Bar::handler called at PerlHandler subroutine `Foo::Bar' line 1.
You mistyped the module name in the 'package' line in your module.
=item "Use of uninitialized value"
Because of eval context, you may see warnings with useless
filename/line, example:
Use of uninitialized value at (eval 80) line 12.
Use of uninitialized value at (eval 80) line 43.
Use of uninitialized value at (eval 80) line 44.
To track down where this eval is really happening, try using a
B<__WARN__> handler to give you a stack trace:
use Carp ();
local $SIG{__WARN__} = \&Carp::cluck;
=item "Callback called exit"
=item "Out of memory!"
If something goes really wrong with your code, Perl may die with an
"Out of memory!" message and or "Callback called exit". A common
cause of this are never-ending loops, deep recursion or calling an
undefined subroutine. Here's one way to catch the problem:
See Perl's INSTALL document for this item:
=item -DPERL_EMERGENCY_SBRK
If PERL_EMERGENCY_SBRK is defined, running out of memory need not be a
fatal error: a memory pool can allocated by assigning to the special
variable $^M. See perlvar(1) for more details.
If you compile with that option and add 'use Apache::Debug level =E<gt> 4;'
to your PerlScript, it will allocate the $^M emergency pool and the
$SIG{__DIE__} handler will call Carp::confess, giving you a stack
trace which should reveal where the problem is.
See the B<Apache::Resource> module for prevention of spinning httpds.
=item *
If you wish to use a module that is normally linked static with your
Perl, it must be listed in static_ext in Perl's Config.pm to be linked
with httpd during the mod_perl build.
=item Can't load '$Config{sitearchexp}/auto/Foo/Foo.so' for module Foo...
When starting httpd some people have reported seeing an error along
the lines of:
[Thu Jul 9 17:33:42 1998] [error] Can't load
'/usr/local/ap/lib/perl5/site_perl/sun4-solaris/auto/DBI/DBI.so' for
module DBI: ld.so.1: src/httpd: fatal: relocation error: file
/usr/local/ap/lib/perl5/site_perl/sun4-solaris/auto/DBI/DBI.so: symbol
Perl_sv_undef: referenced symbol not found at
/usr/local/ap/lib/perl5/sun4-solaris/5.00404/DynaLoader.pm line 166.
Or similar for the IO module or whatever dynamic module mod_perl tries
to pull in first. The solution is to re-configure, re-build and
re-install Perl and dynamic modules with the following flags when
Configure asks for "additional LD flags":
-Xlinker --export-dynamic
or
-Xlinker -E
This problem is only known to be caused by installing gnu ld under Solaris.
Other known causes of this problem:
OS distributions that ship with a (broken) binary Perl installation.
The `perl' program and `libperl.a' library are somehow built with
different binary compatiblity flags.
The solution to these problems is to rebuild Perl and extension
modules from a fresh source tree. Tip for running Perl's Configure
script, use the `C<-des>' flags to accepts defaults and `C<-D>' flag to
override certain attributes:
% ./Configure -des -Dcc=gcc ... && make test && make install
Read Perl's INSTALL doc for more details.
=back
=head2 Clashes with other Apache C modules
=over 4
=item mod_auth_dbm
If you are a user of B<mod_auth_dbm> or B<mod_auth_db>, you may need
to edit Perl's C<Config> module. When Perl is configured it attempts
to find libraries for ndbm, gdbm, db, etc., for the *DBM*_File
modules. By default, these libraries are linked with Perl and
remembered by the B<Config> module. When mod_perl is configured with
apache, the B<ExtUtils::Embed> module returns these libraries to be
linked with httpd so Perl extensions will work under mod_perl.
However, the order in which these libraries are stored in
B<Config.pm>, may confuse C<mod_auth_db*>. If C<mod_auth_db*> does
not work with mod_perl, take a look at this order with the following
command:
% perl -V:libs
If C<-lgdbm> or C<-ldb> is before C<-lndbm>, example:
libs='-lnet -lnsl_s -lgdbm -lndbm -ldb -ldld -lm -lc -lndir -lcrypt';
Edit B<Config.pm> and move C<-lgdbm> and C<-ldb> to the end of the
list. Here's how to find B<Config.pm>:
% perl -MConfig -e 'print "$Config{archlibexp}/Config.pm\n"'
Another solution for building Apache/mod_perl+mod_auth_dbm under Solaris
is to remove the DBM and NDBM "emulation" from libgdbm.a. Seems
Solaris already provides its own DBM and NDBM, and there's no reason
to build GDBM with them (for us anyway).
In our Makefile for GDBM, we changed
OBJS = $(DBM_OF) $(NDBM_OF) $(GDBM_OF)
to
OBJS = $(GDBM_OF)
Rebuild libgdbm, then Apache/mod_perl.
=back
=head1 REGULAR EXPRESSIONS
=head2 COMPILED REGULAR EXPRESSIONS
When using a regular expression that contains an interpolated Perl variable,
if it is known that the variable (or variables) will not vary during the
execution of the program, a standard optimization technique consists of
adding the C<o> modifier to the regexp pattern, to direct the compiler to
build the internal table once, for the entire lifetime of the script, rather
than every time the pattern is executed. Consider:
my $pat = '^foo$'; # likely to be input from an HTML form field
foreach( @list ) {
print if /$pat/o;
}
This is usually a big win in loops over lists, or when using C<grep> or
C<map>.
In long-lived C<mod_perl> scripts, however, this can pose a problem if the
variable changes according to the invocation. The first invocation of a
fresh httpd child will compile the table and perform the search correctly,
however, all subsequent uses by the httpd child will continue to match the
original pattern, regardless of the current contents of the Perl variables
the pattern is dependent on. Your script will appear broken.
There are two solutions to this problem.
The first is to use C<eval q//>, to force the code to be evaluated each
time. Just make sure that the C<eval> block covers the entire loop of
processing, and not just the pattern match itself.
The above code fragment would be rewritten as:
my $pat = '^foo$';
eval q{
foreach( @list ) {
print if /$pat/o;
}
}
Just saying
eval q{ print if /$pat/o; };
is going to be a horribly expensive proposition.
You use this approach if you require more than one pattern match operator in
a given section of code. If the section contains only one operator (be it an
C<m//> or C<s///>), you can rely on the property of the null pattern, that
reuses the last pattern seen. This leads to the second solution, which also
eliminates the use of C<eval>.
The above code fragment becomes:
my $pat = '^foo$';
"something" =~ /$pat/; # dummy match (MUST NOT FAIL!)
foreach( @list ) {
print if //;
}
The only gotcha is that the dummy match that boots the regular expression
engine must absolutely, positively succeed, otherwise the pattern will not
be cached, and the // will match everything. If you can't count on fixed
text to ensure the match succeeds, you have two possibilities.
If you can guaranteee that the pattern variable contains no meta-characters
(things like C<*>, C<+>, C<^>, C<$>...), you can use the dummy match:
"$pat" =~ /\Q$pat\E/; # guaranteed if no meta-characters present
If there is a possibility that the pattern can contain meta-characters, you
should search for the pattern or the unsearchable C<\377> character as
follows:
"\377" =~ /$pat|^[\377]$/; # guarenteed if meta-characters present
=head2 References
The Camel Book, 2nd edition, p. 538 (p. 356 in the 1st edition).
=head1 AUTHORS
Doug MacEachern, with contributions from
Jens Heunemann E<lt>heunemann2@janet.deE<gt>,
David Landgren E<lt>david@landgren.netE<gt>,
Mark Mills E<lt>mark@ntr.netE<gt>,
Randal Schwartz E<lt>merlyn@stonehenge.comE<gt> and
Ask Bjoern Hansen E<lt>ask@develooper.comE<gt>