    DocumentRoot /usr/local/www/htdocs
    PerlRequire /usr/local/mason/handler.pl
    <Location />
        SetHandler perl-script
        PerlHandler HTML::Mason
    </Location>
handler.pl
creates three Mason objects: the Parser, Interpreter, and Apache handler.
The Parser compiles components into Perl subroutines; the Interpreter
executes those compiled components; and the Apache handler routes mod_perl
requests to Mason. These objects are created once in the parent httpd and
then copied to each child process.
These objects have a fair number of initial parameters, only two of which are required: comp_root and data_dir. The various parameters are documented in the individual reference manuals for each object: HTML::Mason::Parser, HTML::Mason::Interp, and HTML::Mason::ApacheHandler.
Components will often need access to external Perl modules. Any such modules that export symbols should be listed in handler.pl, rather than loaded through the standard PerlModule configuration directive. This is because components are executed inside the HTML::Mason::Commands package, and can only access symbols exported to that package. Here's a sample module list:
    {
        package HTML::Mason::Commands;
        use CGI ':standard';
        use LWP::UserAgent;
        ...
    }
In any case, for optimal memory utilization, make sure all Perl modules are used in the parent process, and not in components. Otherwise, each child allocates its own copy and you lose the benefit of shared memory between parent processes and their children. See Vivek Khera's mod_perl tuning FAQ for details.
Another parent/child consideration is file ownership. Web servers that run on privileged ports like 80 start with a root parent process, then spawn children running as the 'User' and 'Group' specified in httpd.conf. This difference leads to permission errors when child processes try to write files or directories created by the parent process.
To work around this conflict, Mason remembers all directories and files
created at startup, returning them in response to $interp->files_written.
This list can be fed to a chown() at the end of the startup code in
handler.pl:
    chown ( [getpwnam('nobody')]->[2],
            [getgrnam('nobody')]->[2],
            $interp->files_written );
    cache: data cache files
    debug: debug files
    etc:   miscellaneous files
    obj:   compiled components
These directories will be discussed in appropriate sections throughout this manual.
Each component's cache file is stored in data_dir/cache, named by replacing
slashes in the component path with ``::''. For example, the cache file for
component /foo/bar is data_dir/cache/foo::bar.
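As a minimal sketch of that naming rule, the helper below maps a component path to its cache file name. The function name cache_file_for and the data directory /usr/local/mason are assumptions made for illustration; this is not Mason's own code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mason): map a component path to the
# cache file name described above.
sub cache_file_for {
    my ($data_dir, $comp_path) = @_;
    (my $name = $comp_path) =~ s{^/+}{};    # drop the leading slash
    $name =~ s{/+}{::}g;                    # remaining slashes become "::"
    return "$data_dir/cache/$name";
}

# Prints /usr/local/mason/cache/foo::bar
print cache_file_for('/usr/local/mason', '/foo/bar'), "\n";
```

A helper like this can be handy in a cleanup script that needs to find the cache file belonging to one particular component.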
Currently Mason never deletes cache files, not even when the associated component file is modified. (This may change in the near future.) Thus cache files hang around and grow indefinitely. You may want to use a cron job or similar mechanism to delete cache files that get too large or too old. For example:
    # Shoot cache files more than 30 days old
    # (replace data_dir/cache with the path to your cache directory)
    foreach (glob('data_dir/cache/*')) {
        unlink $_ if -M $_ >= 30;
    }
In general you can feel free to delete cache files periodically and without warning, because the data cache mechanism is explicitly not guaranteed -- developers are warned that cached data may disappear anytime and components must still function.
$r) and calls the same PerlHandler that Apache called. Several ApacheHandler
parameters are required to activate and configure debug files:
data_dir/debug/<username> for authenticated users, otherwise they are placed
in data_dir/debug/anon.

/usr/bin/perl. This is used in the Unix ``shebang'' line at the top of each
debug file.

handler.pl script. Debug files invoke handler.pl just as Apache does at
startup, to load needed modules and create Mason objects.

handler.pl. This routine is called with the saved Apache request object.
    my $ah = new HTML::Mason::ApacheHandler (
        interp               => $interp,
        debug_mode           => 'all',
        debug_perl_binary    => '/usr/local/bin/perl',
        debug_handler_script => '/usr/local/mason/eg/handler.pl',
        debug_handler_proc   => 'HTML::Mason::handler'
    );
Follow these steps to activate the Previewer:
    Listen your.site.ip.address:3001
    ...
    Listen your.site.ip.address:3005
You'll also probably want to restrict access to these ports in your access.conf. If you have multiple site developers, it is helpful to use username/password access control, since the previewer will use the username to keep configurations separate.
handler.pl) to intercept Previewer requests on the ports defined above. Your
handler should end up looking like this:
    sub handler {
        my ($r) = @_;

        # Compute port number from Host header
        my $host = $r->header_in('Host');
        my ($port) = ($host =~ /:([0-9]+)$/);
        $port = 80 if (!defined($port));

        # Handle previewer request on special ports
        if ($port >= 3001 && $port <= 3005) {
            my $parser = new HTML::Mason::Parser(...);
            my $interp = new HTML::Mason::Interp(...);
            my $ah = new HTML::Mason::ApacheHandler (...);
            return HTML::Mason::Preview::handle_preview_request($r,$ah);
        } else {
            $ah->handle_request($r);    # else, normal request handler
        }
    }
The three ``new'' lines inside the if block should look exactly the same as
the lines at the top of handler.pl. Note that these separate Mason objects
are created for a single request and then discarded. The reason is that the
previewer may alter the objects' settings, so it is safer to create new ones
every time.
data_dir/obj/component-path. Future server processes can eval the object
file and save time on parsing. Both entities are recomputed if the component
source file changes.
    %my $name = "Jon";
    Hello <% $name %>, how are you?
translates to something like:
    my $name = "Jon";
    $r->print("Hello ");
    $r->print($name);
    $r->print(", how are you?");
The amount of memory taken up by a compiled component is therefore at least as large as the combined size of its HTML blocks. If a component has 50K of HTML, that means 50K of storage for each child process that loads the component. Multiply that by ten processes and twenty such components and you've got some noticeable memory overhead.
To reduce this overhead Mason generates, in certain cases, code that reads from the source file at runtime. For example, the following component:
    <%mc_comp('top')%>
    ... 20K of HTML ...
    <%mc_comp('center')%>
    ... 30K of HTML ...
translates to something like:
    my $_srctext = mc_file('/usr/local/www/htdocs/foo/bar');
    $r->print(mc_comp('top'));
    $r->print(substr($_srctext,18,20498));
    $r->print(mc_comp('center'));
    $r->print(substr($_srctext,20520,30720));
The resulting code is a bit slower but more memory efficient. Mason decides whether to use these ``source references'' by first measuring both the total size and the amount of HTML in a component. Those values are then examined by a customizable ``source_refer_predicate'' which makes a determination based on local policy, say ``more than 50% HTML'', or ``more than 20K of HTML''.
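To make the idea of a local policy concrete, here is a rough sketch of a predicate that combines the two example policies. The calling convention is an assumption: I am supposing the predicate receives the total component size and the HTML size, both in bytes and in that order, and returns true to request source references. The name use_source_refs is made up; check the HTML::Mason::Parser reference for the actual interface before relying on any of this.

```perl
use strict;
use warnings;

# Assumed signature: ($total_bytes, $html_bytes). Returns true when the
# component should read its HTML from the source file at runtime.
sub use_source_refs {
    my ($total_bytes, $html_bytes) = @_;
    return 0 unless $total_bytes > 0;                # empty component: nothing to gain
    return 1 if $html_bytes > 20 * 1024;             # "more than 20K of HTML"
    return 1 if $html_bytes / $total_bytes > 0.5;    # "more than 50% HTML"
    return 0;
}

# A code reference like this is what would be handed over as the predicate.
my $source_refer_predicate = \&use_source_refs;
```

Joining the two rules with "or" is my own choice; a site short on memory might keep only the size threshold, while one that cares more about speed might raise both limits.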
This feature requires no administration; I mention it simply so that you are not surprised to see zero size object files.
To prevent these constant file checks, Mason can monitor a single ``reload file'' of modified components. When a component changes, you append its component path to the reload file, one path per line. At the beginning of each request Mason checks to see if the reload file has changed; if so, it reads the new paths and invalidates their cache entries, which in turn forces a recompile the next time those components are requested.
The reload file is kept in data_dir/etc/reload.lst. You can activate reload
file monitoring with interp->use_reload_file.
The advantage of using a reload file is that Mason stats one file per request instead of ten or twenty. The disadvantage is a major increase in maintenance costs as the reload file has to be kept up-to-date. If developers on your site use editorial tools to access and trigger components, you can update the reload file as part of these tools. Or you might run a cron job or similar timed task that periodically scans the component hierarchy, updating the reload file if anything has changed.
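The timed-task approach might be sketched as follows. This is only an illustration: update_reload_file is a hypothetical helper, not something Mason provides, and it assumes that every regular file under the component root is a component. It compares each file's modification time with that of the reload file, so a change made in the same second as the previous scan can be missed.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical maintenance helper (not part of Mason): append the component
# path of every file under $comp_root that was modified more recently than
# the reload file. Returns the list of paths appended.
sub update_reload_file {
    my ($comp_root, $reload_file) = @_;
    $comp_root =~ s{/+$}{};    # normalize: no trailing slash

    # Modification time of the reload file; 0 if it does not exist yet,
    # in which case every component is listed on the first run.
    my $last_update = (stat $reload_file)[9] || 0;

    my @changed;
    find({
        no_chdir => 1,    # $_ holds the full path of each file
        wanted   => sub {
            return unless -f $_;
            return unless (stat _)[9] > $last_update;
            # Component paths are relative to the component root
            (my $comp_path = $_) =~ s{^\Q$comp_root\E}{};
            push @changed, $comp_path;
        },
    }, $comp_root);

    if (@changed) {
        open my $fh, '>>', $reload_file
            or die "Cannot append to $reload_file: $!";
        print {$fh} "$_\n" for sort @changed;
        close $fh or die "Cannot close $reload_file: $!";
    }
    return @changed;
}
```

A cron entry could then run a short script that calls update_reload_file('/usr/local/www/htdocs', '/usr/local/mason/etc/reload.lst') every few minutes, with both paths adjusted to your own comp_root and data_dir.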
The priorities for the staging site are rapid development and easy debugging, while the main priority for the production site is performance. This section describes various ways to adapt Mason for each case.
Which is better, batch or stream? It depends on the context.
For production web servers, stream mode is better because it gets data to the browser more quickly. A browser can only process and display data at a certain rate--streaming the data allows the browser to start working in parallel with the server, while waiting to the end serializes the task (first the server does all its work, then the browser does all its work). From a user perspective the initial bytes are especially important: until the browser receives some data, it simply displays a ``waiting'' message. Serving a computationally intense page in batch mode makes the server look unresponsive and tempts users to hit Stop, whereas in stream mode the browser at least acknowledges an answer and draws a background.
For development or staging web servers, batch mode has the advantage of better error handling. Suppose an error occurs in the middle of a page. In stream mode, the error message interrupts existing output, often appearing in an awkward HTML context such as the middle of a table which never gets closed. The user may see a partial page and have to ``View source'' to see the error message. In batch mode, the error message is output neatly and alone.
You control output mode by setting ah->output_mode to ``batch'' or
``stream''.
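The difference in error handling can be seen in a toy model. The sketch below is not Mason's implementation: render, the chunk code references, and the sink callback are all invented for illustration. Each chunk returns a piece of the page or dies, and the sink stands in for the browser.

```perl
use strict;
use warnings;

# Hypothetical renderer, not Mason's code. Each chunk is a code ref that
# returns a string or dies; $sink receives whatever is sent to the browser.
sub render {
    my ($mode, $sink, @chunks) = @_;
    my $buffer = '';
    for my $chunk (@chunks) {
        my $text = eval { $chunk->() };
        if ($@) {
            # Batch: nothing has been sent yet, so the error stands alone.
            # Stream: earlier chunks have already gone out.
            $sink->("ERROR: $@");
            return 0;
        }
        if ($mode eq 'stream') { $sink->($text) }      # send immediately
        else                   { $buffer .= $text }    # hold until the end
    }
    $sink->($buffer) if $mode eq 'batch';
    return 1;
}

# A page whose second piece fails halfway through a table
my @page = (
    sub { "<table><tr><td>" },
    sub { die "lookup failed\n" },
    sub { "</td></tr></table>" },
);

for my $mode (qw(stream batch)) {
    my $sent = '';
    render($mode, sub { $sent .= $_[0] }, @page);
    print "$mode: $sent";
}
```

In stream mode the error arrives after an unclosed table cell, while in batch mode the partial page is discarded and the error is the only thing sent.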
When configuring Mason to serve multiple virtual hosts, Mason's comp_root must be separated from the DocumentRoot (since DocumentRoot changes per virtual server). In this case you'll want to collect all of your DocumentRoots inside a single component space:
    # httpd.conf
    PerlRequire /usr/local/mason/handler.pl

    # Web site #1
    <VirtualHost www.site1.com>
        DocumentRoot /usr/local/www/htdocs/site1
        <Location />
            SetHandler perl-script
            PerlHandler HTML::Mason
        </Location>
    </VirtualHost>

    # Web site #2
    <VirtualHost www.site2.com>
        DocumentRoot /usr/local/www/htdocs/site2
        <Location />
            SetHandler perl-script
            PerlHandler HTML::Mason
        </Location>
    </VirtualHost>
In contrast to these big changes to httpd.conf, the Mason bootstrap in handler.pl stays the same:
    my $interp = new HTML::Mason::Interp (
        parser    => $parser,
        comp_root => '/usr/local/www/htdocs',
        data_dir  => '/usr/local/mason/'
    );
The <Location> directives in this example now route all requests through Mason--every page is dynamic. The directory structure for this scenario might look like this:
    /usr/local/www/htdocs/    # component root
      +- shared/              # shared components
      +- site1/               # DocumentRoot for first site
      +- site2/               # DocumentRoot for second site
Incoming URLs for each site can only request components in their respective DocumentRoots, while components internally can call other components anywhere in the component space. The shared/ directory, then, is a private directory for use by components, inaccessible from the Web.