NAME
Apache::Clean - run regular expressions on HTML output
SYNOPSIS
httpd.conf:
SetHandler perl-script
PerlHandler Apache::Clean
PerlSetVar CleanChange "change this"
PerlSetVar CleanTo "'to that'"
PerlAddVar CleanChange "another change"
PerlAddVar CleanTo "'gets made'"
Apache::Clean is Filter aware, meaning that it can be used within the
Apache::Filter framework without modification. Just include the
directive
PerlSetVar Filter On
and modify the PerlHandler directive accordingly...
DESCRIPTION
Apache::Clean is a simple regular expression utility that allows you
to pass HTML output through the regular expressions of your choice.
It is only as intelligent as the regular expressions you give it, so
it can break otherwise valid HTML if it is not used with the
appropriate level of supervision.
Only documents with a content type of "text/html" are affected - all
others are passed through unaltered.
EXAMPLE
A simple but real-life example:
httpd.conf:
SetHandler perl-script
PerlHandler Apache::Clean Apache::SSI
Options +Includes
PerlSetVar CleanChange "javascript\:parent\.myscroll.load\(\'(.*?)\'\)"
PerlSetVar CleanTo "$1"
PerlSetVar Filter On
foo.html before:
foo.html after:
This is used to make a JavaScript-enabled page usable as text-only.
While the overhead is high, the number of requests for text-only
pages is small - using a regex to clean up the page saves significant
maintenance overhead at minimal expense.
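As a rough illustration of what the rule above does to a single line,
here is a minimal sketch. The pattern is copied from the CleanChange
value in the example; the input line and the clean_line() helper are
hypothetical, since the original foo.html is not reproduced here, and
the code is not taken from the module itself.

```perl
use strict;
use warnings;

# Pattern copied verbatim from the CleanChange value in the example.
my $change = q{javascript\:parent\.myscroll.load\(\'(.*?)\'\)};

# Hypothetical helper: apply the rule to one line, replacing the whole
# JavaScript call with the path captured in $1.
sub clean_line {
    my ($line) = @_;
    $line =~ s/$change/$1/g;
    return $line;
}

# Hypothetical input line, invented for illustration only.
my $before = q{<a href="javascript:parent.myscroll.load('/text/news.html')">News</a>};

print clean_line($before), "\n";
```

The link target is reduced to the captured path, so the anchor works
without JavaScript.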
NOTES
Verbose debugging is enabled by setting $Apache::Clean::DEBUG=1.
Setting it to 2 enables very verbose debugging. To turn off all debug
information, set your Apache LogLevel directive above info level.
This is alpha software, and as such has not been tested on multiple
platforms or environments. It requires PERL_LOG_API=1,
PERL_FILE_API=1, and maybe other hooks to function properly.
FEATURES/BUGS
Hopefully, you noted the additional set of single quotes in the
synopsis. They are unfortunately necessary on the right hand side
of the regex for plain text substitutions if we want to be able to
allow things like $1 there as well.
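The need for the extra quotes is easier to see in code. The sketch
below assumes the right hand side is evaluated as a Perl expression,
which is what the note above suggests; the module source is not shown
here, so the substitute() helper and its use of the /ee modifier are
assumptions, not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::Clean's documented interface.
sub substitute {
    my ($line, $change, $to) = @_;
    # /ee evaluates the contents of $to as Perl source for each match,
    # so $to must hold a valid expression: a quoted string, or a
    # variable such as $1.
    $line =~ s/$change/$to/eeg;
    return $line;
}

# Plain text needs its own quotes to be a valid expression.
print substitute('please change this now', 'change this', q{'to that'}), "\n";

# A capture variable is already a valid expression, so no quotes.
print substitute('id=42', 'id=(\d+)', '$1'), "\n";
```

Under this assumption, an unquoted value such as to that is not a
valid Perl expression and would not produce the literal text.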
The regular expression terms are internally stored in a plain hash,
keyed on the CleanChange expression. Thus, the order of replacements
cannot be guaranteed, and two rules with an identical CleanChange
expression cannot coexist because they share the same hash key.
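A minimal sketch of both consequences follows. The build_rules()
helper is hypothetical, and how the module actually fills its hash is
not shown here; in this sketch a later duplicate simply overwrites the
earlier one.

```perl
use strict;
use warnings;

# Hypothetical helper: pair CleanChange values with CleanTo values in
# a plain hash, keyed on the CleanChange expression.
sub build_rules {
    my ($change, $to) = @_;    # array refs, in configuration order
    my %rules;
    @rules{@$change} = @$to;   # hash slice: duplicate keys collapse
    return \%rules;
}

my $rules = build_rules(
    ['change this', 'another change', 'change this'],
    [q{'to that'},  q{'gets made'},   q{'something else'}],
);

# Three rules were configured, but only two keys survive.
printf "%d rules\n", scalar keys %$rules;

# keys() returns the rules in no particular order.
print "$_ => $rules->{$_}\n" for keys %$rules;
```

If the order of replacements matters, it is safer to fold the
dependent steps into a single expression than to rely on two rules
running in sequence.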
Apache::Clean performs a line-by-line replacement, so a pattern that
spans a line break will not match - sorry, no multiline intelligence
yet...
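The practical effect of line-by-line processing can be sketched as
follows. Both helpers are hypothetical and only contrast the two
approaches; they are not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitution to each line separately.
sub clean_by_line {
    my ($text, $change, $to) = @_;
    # split /^/ breaks the text into lines and keeps the newlines.
    return join '', map { (my $l = $_) =~ s/$change/$to/g; $l }
                    split /^/, $text;
}

# Hypothetical helper: apply the substitution to the whole document.
sub clean_whole {
    my ($text, $change, $to) = @_;
    $text =~ s/$change/$to/g;
    return $text;
}

my $html = "<b>\nbold\n</b>\n";

# The pattern needs "<b>" and "bold" in the same string, so it only
# matches when the whole document is searched at once.
print clean_by_line($html, '<b>\s*bold', 'BOLD');
print clean_whole($html, '<b>\s*bold', 'BOLD');
```

Keeping each tag or script call that you want to rewrite on a single
line in the source document avoids the problem.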
SEE ALSO
perl(1), mod_perl(3), Apache(3), Apache::Filter(3)
AUTHOR
Geoffrey Young
COPYRIGHT
Copyright (c) 2000, Geoffrey Young. All rights reserved.
This module is free software. It may be used, redistributed
and/or modified under the same terms as Perl itself.