Here's a small sample of some of the non-OO ways you can use this module:
use HTML::Stream qw(:funcs);

print html_tag('A', HREF=>$link);
print html_escape("<<Hello & welcome!>>");
And some of the OO ways as well:
use HTML::Stream;
$HTML = new HTML::Stream \*STDOUT;

# The vanilla interface...
$HTML->tag('A', HREF=>"$href");
$HTML->tag('IMG', SRC=>"logo.gif", ALT=>"LOGO");
$HTML->text($copyright);
$HTML->tag('_A');

# The chocolate interface...
$HTML -> A(HREF=>"$href");
$HTML -> IMG(SRC=>"logo.gif", ALT=>"LOGO");
$HTML -> t($caption);
$HTML -> _A;

# The chocolate interface, with whipped cream...
$HTML -> A(HREF=>"$href")
      -> IMG(SRC=>"logo.gif", ALT=>"LOGO")
      -> t($caption)
      -> _A;

# The strawberry interface...
output $HTML [A, HREF=>"$href"],
             [IMG, SRC=>"logo.gif", ALT=>"LOGO"],
             $caption,
             [_A];
There's even a small built-in subclass, HTML::Stream::Latin1, which can handle Latin-1 input right out of the box. But all in good time...
use HTML::Stream qw(:funcs);   # imports functions from @EXPORT_OK

print html_tag(A, HREF=>$url);
print '&copy; 1996 by', html_escape($myname), '!';
print html_tag('/A');
By the way: that last line could be rewritten as:
print html_tag(_A);
And if you need to get a parameter in your tag that doesn't have an associated value, supply the undefined value (not the empty string!):
print html_tag(TD, NOWRAP=>undef, ALIGN=>'LEFT');
    <TD NOWRAP ALIGN=LEFT>

print html_tag(IMG, SRC=>'logo.gif', ALT=>'');
    <IMG SRC="logo.gif" ALT="">
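The undef-versus-empty-string distinction above can be sketched in plain Perl. The helper render_start_tag below is hypothetical and written only for illustration: it is not HTML::Stream's html_tag(), it skips value escaping, and it always quotes values, which may differ from the module's actual output.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Stream): renders one start tag.
# An attribute whose value is undef is emitted as a bare name;
# any defined value, including '', is emitted as NAME="value".
sub render_start_tag {
    my ($name, @attrs) = @_;
    my $out = "<$name";
    while (@attrs) {
        my ($key, $value) = splice(@attrs, 0, 2);
        $out .= defined $value ? qq{ $key="$value"} : " $key";
    }
    return "$out>";
}

print render_start_tag('TD', NOWRAP => undef, ALIGN => 'LEFT'), "\n";
# prints <TD NOWRAP ALIGN="LEFT">
print render_start_tag('IMG', SRC => 'logo.gif', ALT => ''), "\n";
# prints <IMG SRC="logo.gif" ALT="">
```

Taking the attributes as a list rather than a hash keeps them in the order the caller wrote them.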
There are also some routines for reversing the process, like:
$text = "This <i>isn't</i> &quot;fun&quot;...";

print html_unmarkup($text);
    This isn't &quot;fun&quot;...

print html_unescape($text);
    This isn't "fun"...
Yeah, yeah, yeah, I hear you cry. We've seen this stuff before. But wait! There's more...
use HTML::Stream;
$HTML = new HTML::Stream \*STDOUT;

$HTML->tag(A, HREF=>$url);
$HTML->ent('copy');
$HTML->text(" 1996 by $myname!");
$HTML->tag(_A);
As you've probably guessed:
text()
    Outputs some text, which will be HTML-escaped.
tag()
    Outputs an ordinary tag, like <A>, possibly with parameters. The parameters will all be HTML-escaped automatically.
ent()
    Outputs an HTML entity, like &copy; or &lt;. You mostly don't need to use it; you can often just put the Latin-1 representation of the character in the text().
You might prefer to use t() and e() instead of text() and ent(): they're absolutely identical, and easier to type:
$HTML -> tag(A, HREF=>$url);
$HTML -> e('copy');
$HTML -> t(" 1996 by $myname!");
$HTML -> tag(_A);
Now, it wouldn't be nice to give you those text() and ent() shortcuts without giving you one for tag(), would it? Of course not...
$HTML -> A(HREF=>$url);
$HTML -> e('copy');
$HTML -> t(" 1996 by $myname!");
$HTML -> _A;
As you've probably guessed:
A(HREF=>$url)   ==   tag(A, HREF=>$url)   ==   <A HREF="/the/url">
_A              ==   tag(_A)              ==   </A>
All of the autoloaded ``tag-methods'' use the tagname in all-uppercase. A "_" prefix on any tag-method means that an end-tag is desired. The "_" was chosen for several reasons: (1) it's short and easy to type, (2) it doesn't produce much visual clutter to look at, (3) _TAG looks a little like /TAG because of the straight line.
$HTML -> IMGG(SRC=>$src);
(You're not yet protected from illegal tag parameters, but it's a start, ain't it?)
If you need to make a tag known (sorry, but this is currently a global operation, and not stream-specific), do this:
accept_tag HTML::Stream 'MARQUEE'; # for you MSIE fans...
Note: there is no corresponding "reject_tag".
I thought and thought about it, and could not convince myself that such a
method would do anything more useful than cause other people's modules to
suddenly stop working because some bozo function decided to reject the FONT
tag.
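As a rough illustration of how autoloaded tag-methods and a global table of known tags could fit together, here is a minimal sketch. The package Sketch::Stream, its %Known table, and its buffer method are all assumptions made for this example; the real module's internals are more involved, and this sketch does no escaping or auto-formatting.

```perl
use strict;
use warnings;

package Sketch::Stream;

# Global table of known tags (assumed shape; the real module's table differs).
our %Known = map { $_ => 1 } qw(A B I IMG);

sub new        { my ($class) = @_; return bless { out => '' }, $class }
sub accept_tag { my ($class, $tag) = @_; $Known{uc $tag} = 1; return }
sub buffer     { return $_[0]{out} }
sub DESTROY    { }    # keep AUTOLOAD from seeing object destruction

our $AUTOLOAD;
sub AUTOLOAD {
    my ($self, %attrs) = @_;
    (my $method = $AUTOLOAD) =~ s/.*:://;           # strip the package name
    my ($end, $tag) = $method =~ /\A(_?)([A-Z][A-Z0-9]*)\z/
        or die "No such method: $method\n";
    $Known{$tag} or die "Unknown tag: $tag\n";      # e.g. a typo like IMGG
    my $params = '';
    $params .= qq{ $_="$attrs{$_}"} for sort keys %attrs;
    $self->{out} .= $end ? "</$tag>" : "<$tag$params>";
    return $self;                                   # returning self allows chaining
}

package main;

my $s = Sketch::Stream->new;
$s->A(HREF => '/the/url')->_A;
print $s->buffer, "\n";    # prints <A HREF="/the/url"></A>

print "IMGG rejected\n" unless eval { $s->IMGG(SRC => 'x.gif'); 1 };

Sketch::Stream->accept_tag('MARQUEE');              # global, affects every stream
$s->MARQUEE->_MARQUEE;
print $s->buffer, "\n";    # prints <A HREF="/the/url"></A><MARQUEE></MARQUEE>
```

Because %Known is a package variable, accepting a tag is visible to every object of the class, which mirrors the "global operation" caveat above.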
$HTML -> A(HREF=>$url) -> e('copy') -> t(" 1996 by $myname!") -> _A;
But wait! Neapolitan ice cream has one more flavor...
p(), a(), etc. (especially when markup-functions like tr() conflict with existing Perl functions). So I came up with this:
output $HTML [A, HREF=>$url], "Here's my $caption", [_A];
Conceptually, arrayrefs are sent to html_tag(), and strings to html_escape().
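That dispatch rule can be sketched as follows. The functions sketch_escape, sketch_tag, and sketch_output are hypothetical stand-ins written for this example, not the module's own code; the real html_tag() and html_escape() handle more cases, such as 8-bit characters and valueless attributes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for html_escape(): escapes the four markup characters.
sub sketch_escape {
    my ($text) = @_;
    my %ent = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
    $text =~ s/([<>&"])/$ent{$1}/g;
    return $text;
}

# Hypothetical stand-in for html_tag(): a leading "_" marks an end tag.
sub sketch_tag {
    my ($name, @attrs) = @_;
    $name =~ s{\A_}{/};
    my $out = "<$name";
    while (my ($key, $value) = splice(@attrs, 0, 2)) {
        $out .= qq{ $key="} . sketch_escape($value) . '"';
    }
    return "$out>";
}

# Array references become tags; plain strings become escaped text.
sub sketch_output {
    return join '', map { ref($_) eq 'ARRAY' ? sketch_tag(@$_) : sketch_escape($_) } @_;
}

print sketch_output(['A', HREF => '/the/url'], 'Tom & Jerry', ['_A']), "\n";
# prints <A HREF="/the/url">Tom &amp; Jerry</A>
```

The sketch quotes tag names so that it runs under "use strict"; the barewords in the example above rely on strict subs being off.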
$HTML -> HTML -> HEAD -> TITLE -> t("Hello!") -> _TITLE -> _HEAD -> BODY(BGCOLOR=>'#808080');
Actually produces this:
<HTML>
<HEAD>
<TITLE>Hello!</TITLE>
</HEAD>
<BODY BGCOLOR="#808080">
To turn off autoformatting altogether on a given HTML::Stream object, use the auto_format() method:
$HTML->auto_format(0); # stop autoformatting!
To change whether a newline is automatically output before/after the begin/end form of a tag at a global level, use set_tag():
HTML::Stream->set_tag('B', Newlines=>15);   # 15 means "\n<B>\n \n</B>\n"
HTML::Stream->set_tag('I', Newlines=>7);    # 7 means "\n<I>\n \n</I> "
To change whether a newline is automatically output before/after the begin/end form of a tag for a single stream only, give the stream its own private ``tag info'' table, and then use set_tag():
$HTML->private_tags;
$HTML->set_tag('B', Newlines=>0);     # won't affect anyone else!
To output newlines explicitly, just use the special nl method in the Chocolate Interface:
$HTML->nl;      # one newline
$HTML->nl(6);   # six newlines
I am sometimes asked, ``why don't you put more newlines in automatically?'' Well, mostly because...
PRE
environment.
ent() (or e()) method to output an entity:
$HTML->t('Copyright ')->e('copy')->t(' 1996 by Me!');
But this can be a pain, particularly for generating output with non-ASCII characters:
$HTML -> t('Copyright ') -> e('copy') -> t(' 1996 by Fran') -> e('ccedil') -> t('ois, Inc.!');
Granted, Europeans can always type the 8-bit characters directly in their Perl code, and just have this:
$HTML -> t("Copyright \251 1996 by Fran\347ois, Inc.!");
But folks without 8-bit text editors can find this kind of output cumbersome to generate. Sooooooooo...
The default ``auto-escape'' behavior of an HTML stream can be a drag if you've got a lot of character entities that you want to output, or if you're using the Latin-1 character set, or some other input encoding. Fortunately, you can use the auto_escape() method to change the way a particular HTML::Stream works at any time.
First, here are a couple of special invocations:
$HTML->auto_escape('ALL');      # Default; escapes [<>"&] and 8-bit chars.
$HTML->auto_escape('LATIN_1');  # Like ALL, but uses Latin-1 entities
                                # instead of decimal equivalents.
$HTML->auto_escape('NON_ENT');  # Like ALL, but leaves "&" alone.
You can also install your own auto-escape function (note that you might very well want to install it for just a little bit only, and then de-install it):
sub my_auto_escape {
    my $text = shift;
    HTML::Entities::encode($text);       # start with default
    $text =~ s/\(c\)/&copy;/ig;          # (C) becomes copyright
    $text =~ s/\\,(c)/\&$1cedil;/ig;     # \,c becomes a cedilla
    $text;
}

# Start using my auto-escape:
my $old_esc = $HTML->auto_escape(\&my_auto_escape);

# Output some stuff:
$HTML-> IMG(SRC=>'logo.gif', ALT=>'Fran\,cois, Inc');
output $HTML 'Copyright (C) 1996 by Fran\,cois, Inc.!';

# Stop using my auto-escape:
$HTML->auto_escape($old_esc);
If you find yourself in a situation where you're doing this a lot, a better way is to create a subclass of HTML::Stream which installs your custom function when constructed. For an example, see the HTML::Stream::Latin1 subclass in this module.
new() with a filehandle: any object that responds to a print() method will do. Of course, this includes blessed FileHandles, and IO::Handles.
If you supply a GLOB reference (like \*STDOUT) or a string (like "Module::FH"), HTML::Stream will automatically create an invisible object for talking to that filehandle (I don't dare bless it into a FileHandle, since the underlying descriptor would get closed when the HTML::Stream is destroyed, and you might not want that).
You say you want to print to a string? For kicks and giggles, try this:
package StringHandle;
sub new {
    my $self = '';
    bless \$self, shift;
}
sub print {
    my $self = shift;
    $$self .= join('', @_);
}

package main;
use HTML::Stream;

my $SH = new StringHandle;
my $HTML = new HTML::Stream $SH;
$HTML -> H1 -> t("<Hello & welcome!>") -> _H1;
print "PRINTED STRING: ", $$SH, "\n";
package MY::HTML;
@ISA = qw(HTML::Stream);

sub Aside {
    $_[0] -> FONT(SIZE=>-1) -> I;
}
sub _Aside {
    $_[0] -> _I -> _FONT;
}
Now, you can do this:
my $HTML = new MY::HTML \*STDOUT;
$HTML -> Aside
      -> t("Don't drink the milk, it's spoiled... pass it on...")
      -> _Aside;
If you're defining these markup-like, chocolate-interface-style functions, I recommend using mixed case with a leading capital. You probably shouldn't use all-uppercase, since that's what this module uses for real HTML tags.
< > = &
Note: provided for convenience and backwards-compatibility only. You may want to use the more-powerful HTML::Entities::encode function instead.
For convenience and readability, you can say _A instead of "/A" for the first tag, if you're into barewords.
lt, gt, amp, quot, and #ddd) into ASCII characters.
Note: provided for convenience and backwards-compatibility only. You may want to use the more-powerful HTML::Entities::decode function instead: unlike this function, it can collapse entities like copy and ccedil into their Latin-1 byte values.
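To make the list of handled escapes concrete, here is a minimal sketch of that kind of decoding in plain Perl. sketch_unescape is a hypothetical stand-in, not the module's own implementation; it assumes tags never contain a literal ">" inside attribute values, and it leaves named entities other than the four listed untouched.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only: strips <...> markup,
# then decodes lt, gt, amp, quot, and decimal #ddd references in one pass.
sub sketch_unescape {
    my ($text) = @_;
    my %char = (lt => '<', gt => '>', amp => '&', quot => '"');
    $text =~ s/<[^>]*>//g;    # drop angle-tag markup
    $text =~ s/&(?:#(\d+)|(lt|gt|amp|quot));/defined $1 ? chr($1) : $char{$2}/ge;
    return $text;
}

print sketch_unescape(q{This <i>isn't</i> &quot;fun&quot; &#38; games}), "\n";
# prints This isn't "fun" & games
```

Decoding everything in a single substitution keeps a doubly escaped sequence such as "&amp;lt;" from being decoded twice.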
The PRINTABLE may be a FileHandle, a glob reference, or any object that responds to a print() message. If no PRINTABLE is given, does a select() and uses that.
If the argument is a subroutine reference SUBREF, then that subroutine will be used. Declare such subroutines like this:
sub my_escape {
    my $text = shift;     # it's passed in the first argument
    ...
    $text;
}
If a textual NAME is given, then one of the appropriate built-in functions is used. Possible values are:
#123
).
ccedil) instead of decimal entity codes to escape characters. This makes the HTML more readable but it is currently not advised, as ``older'' browsers (like Netscape 2.0) do not recognize many of the ISO-8859-1 entity names (like deg).
Warning: If you specify this option, you'll find that it attempts to ``require'' HTML::Entities at run time. That's because I didn't want to force you to have that module just to use the rest of HTML::Stream. To pick up problems at compile time, you are advised to say:
use HTML::Stream;
use HTML::Entities;
in your source code.
output $HTML "If A is an obtuse angle, then A > 90&deg;";
select(). With no arguments, it just returns the currently-installed function.
Please use no other values; they are reserved for future use.
$html->ent('nbsp');
You may abbreviate this method name as e:
$html->e('nbsp');
Warning: this function assumes that the entity argument is legal.
print() message:
$HTML->io->print("This is not auto-escaped or nuthin!");
_A instead of "/A", if you're into barewords.
t
:
$html->t('Hi there, ', $yournamehere, '!');
html_tag() and output the result. If an item is a text string, escape the text and output the result. Like this:
output $HTML [A, HREF=>$url], "Here's my $caption!", [_A];
# Make sure methods MARQUEE and _MARQUEE are compiled on demand:
HTML::Stream->accept_tag('MARQUEE');
...gives the Chocolate Interface permission to create (via AUTOLOAD) definitions for the MARQUEE and _MARQUEE methods, so you can then say:
$HTML -> MARQUEE -> t("Hi!") -> _MARQUEE;
If you want to set the default attribute of the tag as well, you can do so via the set_tag() method instead; it will effectively do an accept_tag() as well.
# Make sure methods MARQUEE and _MARQUEE are compiled on demand,
# *and*, set the characteristics of that tag.
HTML::Stream->set_tag('MARQUEE', Newlines=>9);
set_tag will affect everyone.
However, if you want an HTML stream to have a private copy of that table to munge with, just send it this message after creating it. Like this:
my $HTML = new HTML::Stream \*STDOUT;
$HTML->private_tags;
Then, you can say stuff like:
$HTML->set_tag('PRE', Newlines=>0);
$HTML->set_tag('BLINK', Newlines=>9);
And it won't affect anyone else's auto-formatting (although they will possibly be able to use the BLINK tag method without a fatal exception :-( ).
Returns the self object.
HTML::Stream->set_tag('MARQUEE', Newlines=>9);
Once you do this, all HTML streams you open from then on will allow that tag to be output in the chocolate interface.
Warning: by default, an HTML stream just references the ``master tag table'' (this makes new() more efficient), so by default, the instance method will behave exactly like the class method.
my $HTML = new HTML::Stream \*STDOUT;
$HTML->set_tag('BLINK', Newlines=>0);  # changes it for others!
If you want to diddle with one stream's auto-formatting only, you'll need to give that stream its own private tag table. Like this:
my $HTML = new HTML::Stream \*STDOUT;
$HTML->private_tags;
$HTML->set_tag('BLINK', Newlines=>0);  # doesn't affect other streams
Note: this will still force a default entry for BLINK in the master tag table: otherwise, we'd never know that it was legal to AUTOLOAD a BLINK method. However, it will only alter the characteristics of the BLINK tag (like auto-formatting) in the object's tag table.
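The difference between referencing the master table and owning a private copy can be sketched with plain hashes. The data layout below is an assumption chosen for illustration; the module's actual tables may be organized differently.

```perl
use strict;
use warnings;

# Assumed data layout, for illustration: one master table shared by reference.
my %master = (BLINK => { Newlines => 9 });

# A stream is modeled here as a hash holding a reference to its tag table.
my %shared_a = (tags => \%master);
my %shared_b = (tags => \%master);

# Hypothetical equivalent of private_tags: copy each entry one level deep.
my %private_table;
$private_table{$_} = { %{ $master{$_} } } for keys %master;
my %private = (tags => \%private_table);

$shared_a{tags}{BLINK}{Newlines} = 0;    # visible through every shared reference
$private{tags}{BLINK}{Newlines}  = 15;   # visible only through the private copy

print "shared_b sees $shared_b{tags}{BLINK}{Newlines}\n";   # prints shared_b sees 0
print "master has    $master{BLINK}{Newlines}\n";           # prints master has    0
print "private has   $private{tags}{BLINK}{Newlines}\n";    # prints private has   15
```

Copying only the outer hash would not be enough in this layout, since the per-tag entries are themselves references that would still be shared.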
0x01  newline before <TAG>         .<TAG>.     .</TAG>.
0x02  newline after <TAG>          |     |     |      |
0x04  newline before </TAG>        1     2     4      8
0x08  newline after </TAG>
Hence, to output BLINK environments which are preceded/followed by newlines:
set_tag HTML::Stream 'BLINK', Newlines=>9;
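Since Newlines is a bitmask, a value can be decoded by testing each bit in turn. The sketch below uses the four bit values from the table above; describe_newlines is a hypothetical helper written for this example, not part of HTML::Stream.

```perl
use strict;
use warnings;

# Bit values taken from the table above; the helper name is hypothetical.
my @NEWLINE_BITS = (
    [0x01, 'before <TAG>'],
    [0x02, 'after <TAG>'],
    [0x04, 'before </TAG>'],
    [0x08, 'after </TAG>'],
);

# Return a readable list of the positions where a newline would be output.
sub describe_newlines {
    my ($mask) = @_;
    my @where = map { $_->[1] } grep { $mask & $_->[0] } @NEWLINE_BITS;
    return @where ? join(', ', @where) : 'none';
}

printf "%2d: %s\n", $_, describe_newlines($_) for (0, 7, 9, 15);
# prints:
#  0: none
#  7: before <TAG>, after <TAG>, before </TAG>
#  9: before <TAG>, after </TAG>
# 15: before <TAG>, after <TAG>, before </TAG>, after </TAG>
```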
set_tag for class/instance method differences).
&ccedil;) for ISO-8859-1 characters.
So using HTML::Stream::Latin1 like this:
use HTML::Stream;

$HTML = new HTML::Stream::Latin1 \*STDOUT;
output $HTML "\253A right angle is 90\260, \277No?\273\n";
Prints this:
&laquo;A right angle is 90&deg;, &iquest;No?&raquo;
Instead of what HTML::Stream would print, which is this:
&#171;A right angle is 90&#176;, &#191;No?&#187;
Warning: a lot of Latin-1 HTML markup is not recognized by older browsers (e.g., Netscape 2.0). Consider using HTML::Stream; it will output the decimal entities which currently seem to be more ``portable''.
Note: using this class ``requires'' that you have HTML::Entities.
output() method and the various ``tag'' methods seem to run about 5 times slower than the old just-hardcode-the-darn stuff approach. That is, in general, this:
### Approach #1...
tag  $HTML 'A', HREF=>"$href";
tag  $HTML 'IMG', SRC=>"logo.gif", ALT=>"LOGO";
text $HTML $caption;
tag  $HTML '_A';
text $HTML $a_lot_of_text;
And this:
### Approach #2...
output $HTML [A, HREF=>"$href"],
             [IMG, SRC=>"logo.gif", ALT=>"LOGO"],
             $caption,
             [_A];
output $HTML $a_lot_of_text;
And this:
### Approach #3...
$HTML -> A(HREF=>"$href")
      -> IMG(SRC=>"logo.gif", ALT=>"LOGO")
      -> t($caption)
      -> _A
      -> t($a_lot_of_text);
Each runs about 5x slower than this:
### Approach #4...
print '<A HREF="', html_escape($href), '>',
      '<IMG SRC="logo.gif" ALT="LOGO">',
      html_escape($caption),
      '</A>';
print html_escape($a_lot_of_text);
Of course, I'd much rather use any of the first three (especially #3) if I had to get something done right in a hurry. Or did you not notice the typo in approach #4? ;-)
(BTW, thanks to Benchmark:: for allowing me to... er... benchmark stuff.)
Added built-in support for escaping 8-bit characters.
Added LATIN_1 auto-escape, which uses HTML::Entities to generate mnemonic entities. This is now the default method for HTML::Stream::Latin1.
Added auto_format(), so you can now turn auto-formatting off/on.
Added private_tags, so it is now possible for HTML streams to each have their own ``private'' copy of the %Tags table, for use by set_tag().
Added set_tag(). The tags tables may now be modified dynamically so as to change how formatting is done on-the-fly. This will hopefully not compromise the efficiency of the chocolate interface (until now, the formatting was compiled into the method itself), and will add greater flexibility for more-complex programs.
Added POD documentation for all subroutines in the public interface.
comment().
Thanks to John D Groenveld for the suggestion and the patch.
Fixed bug in accept_tag(), where 'my' variable was shadowing argument. Thanks to John D Groenveld for the bug report and the patch.
John Buckman
    For suggesting that I write an "html2perlstream", and inspiring me to look at supporting Latin-1.
Tony Cebzanov
    For suggesting that I write an "html2perlstream"
John D Groenveld
    Bug reports, patches, and suggestions
B. K. Oxley (binkley)
    For suggesting the support of "writing to strings" which became the "printable" interface.
Enjoy.