...making Linux just a little more fun!
By Ben Okopnik
Recently, I needed to generate a Web page - the Linux Gazette's "Mirrors and Translations"
page, actually - based on the contents of a database. Perl is famous for
its ability to connect to almost any database via a common interface, given
its DBI module and the DBD::* driver kit; however, the challenge in this case
came from the front end, the HTML generation. Sure, I could use the CGI
module to output whatever I needed - but in this case, I already had the
static page that I wanted to create, and saw no reason to rewrite all the
static content in CGI. Also, the final product was not to be a CGI file but
a generated HTML page. In fact, everything in this case hinted at
templating, a process in which I would use the static HTML with a
few special tags and a script which would then apply processing based on
those tags. This made especially good sense since it drew a clean
separating line between writing HTML and creating code, very different
tasks and ones for which I have different mental states (layout designer
vs. programmer.)
As with anything in Perl, TMTOWTDI (There's More Than One Way To Do It)
- there were a number of modules available on CPAN
(the Comprehensive Perl Archive Network) that could do the job. However, I
had used the HTML::Template module in the past, and the job
wasn't particularly complicated (although HTML::Template can
handle some very complex jobs indeed), so that's what I settled on. My
first task was to hunt through the HTML, removing the dozens of repetitive
stanzas and replacing them with the appropriate tag framework that the
module would utilize later. We had also made the decision not to display
the maintainers' email addresses, even in the munged form that I use to deter
spammers; those of you who use our mirrors and want to thank these fine
folks for making LG available should be able to find an address link on the
mirror site without much trouble.
Fragment of the old page (there were several dozen entries like this):

    ...
    <A name="AU"></A>
    <DT><B><font color="maroon">AUSTRALIA (AU)</font></B></DT>
    <DD>
    <STRONG><FONT COLOR="green"><TT>[WWW]</TT></FONT></STRONG>
    <A HREF="http://www.localnet.com.au/lg/index.html">http://www.localnet.com.au/lg/index.html</A>
    <BR>
    <SMALL>
    Maintainer: Jim McGregor
    <<A HREF="mailto:nospam@here.please">nospam@here.please</A>>
    </SMALL>
    <P>
    </DD>
    <DD>
    <STRONG><FONT COLOR="green"><TT>[WWW]</TT></FONT></STRONG>
    <A HREF="http://www.eastwood.apana.org.au/Linux/LinuxGazette/">http://www.eastwood.apana.org.au/Linux/LinuxGazette/</A>
    <BR>
    <SMALL>
    Maintainer: Mick Stock
    <<A HREF="mailto:nospam@here.please">nospam@here.please</A>>
    </SMALL>
    <P>
    </DD>
    ...

Single-stanza replacement for all the entries (new template):

    ...
    <a name="<TMPL_VAR NAME=FQDN>"></a>
    <dt><b><font color="maroon"><TMPL_VAR NAME=FQDN> (<TMPL_VAR NAME=TLD>)</font></b></dt>
    <dd><strong><font color="green"><tt>[WWW]</tt></font></strong>
    <a href="<TMPL_VAR NAME=HTTP>"><TMPL_VAR NAME=HTTP></a>
    <br>
    <strong><font color="red"><tt>[FTP]</tt></font></strong>
    <a href="<TMPL_VAR NAME=FTP>"><TMPL_VAR NAME=FTP></a>
    <br>
    <small>
    Maintainer: <TMPL_VAR NAME=MAINT>
    </small>
    <p>
    </dd>
    ...

Now, the challenge had shifted away from generating the HTML to just dealing with code. What I needed to do was sort the data into groups and subgroups - that is, there would be some number of "country" headings, some number of "mirror" headings under each of those, and either one or two (WWW, FTP, or both) hosts plus a maintainer under each "mirror" heading. In programmatic terms, these are known as "nested loops", and are not that difficult to code. However, translating that into HTML terms could be an exercise in the kind of language abilities of which your mother would not approve... if it weren't for HTML::Template.
Note: Using HTML::Template is normally very simple; in
fact, learning the basics of using it usually takes only a minute or two
(see the example at the top of "perldoc HTML::Template".) In this
instance, however, we're creating nested lists - a rather more complex issue
than simple variable/tag replacement - and thus the coding issues get a bit
deeper. This isn't due to HTML::Template, though; if you
think about the issues inherent in modeling what is already a complex data
structure and then transferring that structure into a "passive" layout
language... truth to tell, I'm somewhat surprised that it can be done at
all. Kudos and my hat's off to Sam Tregar (author of the module) and Jesse
Erlbaum (the man responsible for TMPL_LOOP.)
First, in order to understand how the data structure must be laid out to create the pattern that we need, let's take a look at that pattern. Fortunately, in Perl it's easy to lay out the data structures to match what they represent (whitespace is arbitrary, so you can follow your preferences - but see "perldoc perlstyle".) What we'll want to do here is build the structure that contains all the values we want to assign within the loop as well as the names which are associated with those values. Those of you with a little Perl experience are nodding and smiling already: the word "associated" points very clearly to the type of variable we need - a hash! Taking a single "row" (per-country entry) - Austria, as a random example - here is how it looks:

    %row = (
        tld   => "AT",
        fqdn  => "Austria",
        sites => [
            {
                http  => "http://www.luchs.at/linuxgazette/",
                maint => "Rene Pfeiffer"
            },
            {
                http  => "http://info.ccone.at/lg/",
                maint => "Gerhard Beck"
            },
            {
                http  => "http://linuxgazette.tuwien.ac.at/",
                ftp   => "ftp://linuxgazette.tuwien.ac.at/pub/linuxgazette/",
                maint => "Tony Sprinzl"
            }
        ]
    );

The above hash, %row, matches our requirements exactly: its
keys will be used to match (case-insensitively) the tag names in the HTML,
and the values associated with those keys will be used to replace those
tags. That is, every instance of <TMPL_VAR NAME=FQDN>
in the template will be replaced by "Austria" while this entry is being
processed. Here are some of the less-obvious points of the above
structure:
The "sites" key points to (is associated with) a reference to the
anonymous array above (the part in square brackets),
and is the name of the loop that we'll use within the HTML to iterate
through all of the above data.
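
To make the shape of that structure concrete, here is a minimal sketch that walks a cut-down copy of the Austria row in plain Perl, without any template involved. The printed layout is only an illustration of my own; it shows the same "for each site under this country" traversal that the nested template loops perform.

```perl
use strict;
use warnings;

# A shortened copy of the row shown above: two sites, one with an FTP host
my %row = (
    tld   => "AT",
    fqdn  => "Austria",
    sites => [
        {
            http  => "http://www.luchs.at/linuxgazette/",
            maint => "Rene Pfeiffer",
        },
        {
            http  => "http://linuxgazette.tuwien.ac.at/",
            ftp   => "ftp://linuxgazette.tuwien.ac.at/pub/linuxgazette/",
            maint => "Tony Sprinzl",
        },
    ],
);

# Outer level: the per-country values
print "$row{fqdn} ($row{tld})\n";

# Inner level: dereference the arrayref and visit each site hashref
for my $site ( @{ $row{sites} } ) {
    print "  WWW: $site->{http}\n" if $site->{http};
    print "  FTP: $site->{ftp}\n"  if $site->{ftp};
    print "  Maintainer: $site->{maint}\n";
}
```

The "if" modifiers play the same role as a conditional tag in a template: a site without an FTP host simply produces no FTP line.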
Each completed row is then added to the array that will be handed to the
template:

    # Add the hashref to the end of the array
    push @mirrors, \%row;

Note the '\' preceding the %row; this stores a reference to
%row rather than the hash itself (stuffing a hash into an
array would result in a generally unusable mess - key/value pairs in
effectively random order as array elements.) This is a standard mechanism
for creating multidimensional arrays, lists-of-hashes, etc. in Perl.
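
If the difference between pushing a hash and pushing a reference to it seems abstract, the following small sketch may help. The array name @flat is invented for the comparison; everything else uses only core Perl.

```perl
use strict;
use warnings;

my %row = ( tld => "AT", fqdn => "Austria" );

# Without the backslash, the hash dissolves into a flat list of
# keys and values, in no guaranteed order
my @flat;
push @flat, %row;

# With the backslash, the array gains exactly one element: a hashref
my @mirrors;
push @mirrors, \%row;

print scalar(@flat),    " elements in \@flat\n";      # 4
print scalar(@mirrors), " element in \@mirrors\n";    # 1
print "Country: $mirrors[0]{fqdn}\n";                 # Austria
```

One caveat: a reference points at the original hash rather than a copy. When rows are built in a loop, the hash should be declared with "my" inside the loop body so that each pass gets a fresh hash.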
And - one more time, with gusto - HTML::Template's param()
subroutine, like most other subroutines in Perl and many
other languages, expects a reference to the array rather than the array
itself:

    # Create a new HTML::Template object
    my $t = HTML::Template -> new( filename => "mirrors.tmpl" );

    # Pass the listref to param()
    $t -> param( MIRR => \@mirrors );

"And", as Austin Powers would say, "Oi'm spent." Those of you scared of the Big Bad References may come out from under the bed now. :)
Looking at it from the other end, the matching part of the template for this loop looks like this:

    <dl>
    <TMPL_LOOP NAME=MIRR>
      <dt><b><font color="maroon">
        <a name="<TMPL_VAR NAME=TLD>">
          <img src="gx/flags/<TMPL_VAR NAME=TLD>.jpg" border="1">
        </a>
        <TMPL_VAR FQDN> [<TMPL_VAR NAME=TLD>]
      </font></b></dt>
      <TMPL_LOOP NAME=SITES>
        <dd>
          <TMPL_IF NAME="HTTP">
            <strong><font color="green"><tt>[WWW]</tt></font></strong>
            <a href=<TMPL_VAR HTTP>> <TMPL_VAR HTTP> </a><br>
          </TMPL_IF>
          <TMPL_IF NAME="FTP">
            <strong><font color="green"><tt>[FTP]</tt></font></strong>
            <a href=<TMPL_VAR NAME=FTP>> <TMPL_VAR NAME=FTP> </a><br>
          </TMPL_IF>
          <small>
            Maintainer: <TMPL_VAR NAME=MAINT>
          </small>
          <p>
        </dd>
      </TMPL_LOOP>
    </TMPL_LOOP>
    </dl>

To recap what we're looking at, there are two loops defined in the above
template, one inside the other: <TMPL_LOOP NAME=MIRR> and
<TMPL_LOOP NAME=SITES>. Note that the outside loop corresponds
to the name of the parameter key that we assigned when passing the data
construct to param(), and the name of the inside loop is the
same as the key associated with the groups inside the hash we created.
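
For anyone who wants to watch the two loops fire without the real page, here is a stripped-down sketch. It assumes HTML::Template is installed from CPAN. The toy template is inlined via the module's "scalarref" option and is my own invention rather than the template used for the Mirrors page; the data is a shortened Austria row.

```perl
use strict;
use warnings;
use HTML::Template;

# Toy template: an outer MIRR loop with an inner SITES loop
my $tmpl = <<'END';
<TMPL_LOOP NAME=MIRR><TMPL_VAR NAME=FQDN> (<TMPL_VAR NAME=TLD>)
<TMPL_LOOP NAME=SITES>  <TMPL_VAR NAME=HTTP> - <TMPL_VAR NAME=MAINT>
</TMPL_LOOP></TMPL_LOOP>
END

# One row per country; each row carries an arrayref of site hashrefs
my @mirrors = (
    {
        tld   => "AT",
        fqdn  => "Austria",
        sites => [
            { http => "http://www.luchs.at/linuxgazette/", maint => "Rene Pfeiffer" },
            { http => "http://info.ccone.at/lg/",          maint => "Gerhard Beck"  },
        ],
    },
);

# Read the template from a string instead of a file
my $t = HTML::Template->new( scalarref => \$tmpl );

# The key given to param() names the outer loop; "sites" names the inner one
$t->param( MIRR => \@mirrors );

my $out = $t->output;
print $out;
```

With the module's default settings, a hash key that has no matching tag in the template makes param() die, so every key in the data here has a counterpart in the toy template.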
Here is the code that builds that structure from the mirror list, one
country at a time:

    for my $tld ( @tlds ){
        # Set some temporary (per-loop) variables
        my @sites;
        my %row;
        my %line;

        # Here's the inner loop!
        for ( grep /^$tld/, @mirr ){
            # Parse the CSV into fields
            my @rec;
            my %site;

            # Hide escaped commas behind a placeholder, split, then restore them
            s/\\,/&comma;/g;
            @rec = split /,/;
            s/&comma;/,/g for @rec;

            # Mirror listings don't require much data
            $site{ http }  = $rec[2];
            $site{ ftp }   = $rec[3];
            $site{ maint } = $rec[4];

            # Load it up!
            push @sites, \%site;
        }

        # Outer loop vars; the key must match the FQDN tag in the template
        $row{ tld }  = $tld;
        $row{ fqdn } = $country{ $tld };

        # Ref to the inner loop, attached
        $row{ sites } = \@sites;

        # ...and load up the total into the array to be passed to param()
        push @mirrs, \%row;
    }

    # Feed the data to the hungry HTML::Template object
    $t -> param( MIRR => \@mirrs );

By the way, the data we're reading in looks like this:

    AT,,http://www.luchs.at/linuxgazette/,,Rene Pfeiffer,nospam@here.please,
    AT,,http://info.ccone.at/lg/,,Gerhard Beck,nospam@here.please,
    BE,,http://linuxgazette.linuxbe.org/,,Cedric Gavage,nospam@here.please,
    CA,,http://blue7green.crosswinds.net/hobbies/lg/,,Jim Pierce,nospam@here.please,

Now we have a highly dynamic chunk of code that will process the data that we give it, generate the necessary data structure on the fly, and feed it out to the template. Voila!
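
The escaped-comma handling is easy to try in isolation. The sketch below is an illustration under my own assumptions, not a copy of the production script: the input line is hypothetical, with a maintainer name that contains a backslash-escaped comma, and a NUL byte serves as the temporary placeholder. Any string that cannot occur in the data would do just as well.

```perl
use strict;
use warnings;

# Hypothetical record: field 4 contains an escaped comma
my $line = 'XX,,http://example.org/lg/,,Doe\, Jane,nospam@here.please,';

# 1. Hide each escaped comma behind a placeholder byte
( my $protected = $line ) =~ s/\\,/\x00/g;

# 2. Split on the remaining (real) field separators
my @rec = split /,/, $protected;

# 3. Put literal commas back inside the individual fields
s/\x00/,/g for @rec;

print "HTTP:       $rec[2]\n";
print "Maintainer: $rec[4]\n";    # Doe, Jane
```

Note that split drops trailing empty fields, so the comma at the end of the record does not produce an extra element. For data with quoting or embedded newlines, a dedicated CSV module from CPAN would be the safer choice.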
If you want to see the complete script that I wrote for this project, go here; the template can be found here. If you would like to see the latest generated page, go here. If you would like to change the way the page looks and do something great for the Linux community, join the folks on the list and become a mirror maintainer: commit some of your disk space and bandwidth and let the Linux Gazette "mirrors and translations" person - that's me! - know about it here.
Happy Linuxing to all!
References:

    perldoc perlreftut
    perldoc perlref
    perldoc HTML::Template

Motivation:
My annoyance at the lack of good documentation for nested loops under HTML::Template. :)
Ben is the Editor-in-Chief for Linux Gazette and a member of The Answer Gang.
Ben was born in Moscow, Russia in 1962. He became interested in electricity at the tender age of six, promptly demonstrated it by sticking a fork into a socket and starting a fire, and has been falling down technological mineshafts ever since. He has been working with computers since the Elder Days, when they had to be built by soldering parts onto printed circuit boards and programs had to fit into 4k of memory. He would gladly pay good money to any psychologist who can cure him of the recurrent nightmares.
His subsequent experiences include creating software in nearly a dozen languages, network and database maintenance during the approach of a hurricane, and writing articles for publications ranging from sailing magazines to technological journals. After a seven-year Atlantic/Caribbean cruise under sail and passages up and down the East coast of the US, he is currently anchored in St. Augustine, Florida. He works as a technical instructor for Sun Microsystems and a private Open Source consultant/Web developer. His current set of hobbies includes flying, yoga, martial arts, motorcycles, writing, and Roman history; his Palm Pilot is crammed full of alarms, many of which contain exclamation points.
He has been working with Linux since 1997, and credits it with his complete
loss of interest in waging nuclear warfare on parts of the Pacific Northwest.