In short, uHTML is an elegant way to extend HTML by defining new tags according to the requirements of a particular website. It lets you connect HTML tags with chunks of code and insert function results into tag attributes. It does not matter whether a tag belongs to the regular HTML tag set or extends it. A function connected to a tag can leave it unchanged, modify it, replace it with generated content, or skip it completely. The library that performs the connection consists of two packages: uHTML and uHTMLnode. This documentation refers to version 1.8 of the uHTML library.
The first package, uHTML, connects tag names with the appropriate functions and invokes the processing of the uHTML code. It loads modules from the script directory (cgi) to bind website-specific tags. The loaded modules have to meet one of three naming conditions. First, they can be located in a subdirectory of the script directory named uHTML and match the pattern uHTML/*.pm. Second, they can be placed in the script directory itself and match the pattern *-uHTML.pm. Third, they can be placed in a uHTML subdirectory located in any global perl library directory. The last option is mainly meant for libraries used across many websites (projects), e.g. the uHTML::std library.
Library modules can be put into the uHTML subdirectory of the script (cgi) directory when there is no access to the global perl library directories. In this case the strictly project-bound modules should match the pattern *-uHTML.pm and be located in the script (cgi) directory to avoid confusion. Keeping to this standard eases the sharing of uHTML modules among several projects.
The second package, uHTMLnode, provides the interface to a tag structure. This structure is passed to the functions that uHTML binds to a particular tag.
The following full working example shows the basic integration of uHTML into a website. It does not show the use of functions in attributes or request initialisation, but including either of these two is trivial. Note that uHTML, unlike HTML, is case sensitive.
The <head> data usually remains mostly unaltered across all HTML files of a website. The advantage of keeping the constant part in a separate file is obvious. To do this, we create an <include> tag that inserts the content of one file called head-data into several html files. We suppose that we have an include directory in our document root where we keep the chunks of html shared between pages. Our uhtml file could then look like this:
index.uhtml

    <html>
      <head>
        <include file="/include/head-data">
        ...
      </head>
      ...
    </html>
The functionality of the <include> tag is an asset in many projects. Following the naming convention proposed above, the perl code is placed in a file named include.pm located in the uHTML subdirectory of the script directory. The file include.pm then looks like this:
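include.pm

    use uHTML ;

    # Replace an <include file="..."> tag with the content of the named file.
    sub Include($)
    {
      my $Node = shift;
      my $File = $Node->Attr('file') or return;

      # The file attribute is taken relative to the document root.
      open my $FH, '<', $ENV{'DOCUMENT_ROOT'}.$File or return;
      $Node->map(join('',<$FH>),'');
      close $FH;
    }

    uHTML->registerTag('include',\&Include);

    1;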
To link this small library into a website, a cgi hook is needed. We place the necessary code in a file called hook.pl in the script directory:
hook.pl

    #!/usr/bin/perl

    use uHTML;

    # PATH_INFO carries the path of the requested uhtml file.
    open my $FILE, '<', "$ENV{'DOCUMENT_ROOT'}$ENV{'PATH_INFO'}"
      or die "File: $ENV{'PATH_INFO'} not found";

    print "Content-type: text/html\n\n";
    print uHTML::recode(join('',<$FILE>));
To get the functionality “magically” into all *.uhtml files, we add some lines to the .htaccess file:
.htaccess

    DirectoryIndex index.uhtml
    RewriteEngine on
    RewriteRule ^(/?)(.*\.uhtml) $1cgi-bin/hook.pl/$2 [L,QSA]
Nginx serves only static files and expects an external server to handle dynamic content. uHTML works well with Plack over the FCGI interface. For this purpose uHTML must be installed system wide.
To link the small library above with our website, an FCGI hook is needed. We place the necessary code in a file called uHTML.psgi in the script directory:
uHTML.psgi

    use uHTML;
    use Encode;

    sub
    {
      my $env = shift ;
      my( $FILE,$DATA,$HTML,@HEAD,$LEN ) ;

      if( open $FILE,$env->{'PATH_TRANSLATED'} and $LEN = -s $FILE and read( $FILE,$DATA,$LEN ) == $LEN )
      {
        $HTML = uHTML::recoded_list( $DATA,$env ) ;
        $LEN = 0 ;
        $LEN += length Encode::encode( 'UTF-8',$_ ) foreach @{$HTML} ;
        push @HEAD,'Content-Type','text/html; charset=UTF-8' ;
        push @HEAD,'Content-Length',$LEN ;
        push @HEAD,'x-powered-by','uHTML' ;
        return [ 200,\@HEAD, $HTML ] ;
      }
      return [ 404, [ 'Content-Type' => 'text/plain' ], [ 'File Not Found' ] ] ;
    }
Now we can start the FCGI server using:

    plackup -s FCGI -S /tmp/uHTML -a /srv/cgi-bin/uHTML.psgi
In the nginx configuration we add the section redirecting uHTML requests to our plackup server:
server.conf

    location ~ \.uhtml$ {
      try_files $uri /index.uhtml =404 ;
      fastcgi_keep_conn on ;
      fastcgi_split_path_info ^()(.*uhtml)$;
      fastcgi_pass unix:/tmp/uHTML ;
      fastcgi_index index.uhtml;

      fastcgi_param URI $uri;
      fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
      fastcgi_param SCRIPT_NAME $fastcgi_script_name;
      fastcgi_param PATH_TRANSLATED $document_root$fastcgi_path_info;
      fastcgi_param QUERY_STRING $query_string;
      fastcgi_param REQUEST_METHOD $request_method;
      fastcgi_param CONTENT_TYPE $content_type;
      fastcgi_param CONTENT_LENGTH $content_length;
      fastcgi_param REQUEST_URI $request_uri;
      fastcgi_param REQUEST $request;
      fastcgi_param REQUEST_SCHEME $scheme ;
      fastcgi_param REQUEST_FILE $request_filename;
      fastcgi_param DOCUMENT_URI $document_uri;
      fastcgi_param DOCUMENT_ROOT $document_root;
      fastcgi_param SERVER_PROTOCOL $server_protocol;
      fastcgi_param GATEWAY_INTERFACE CGI/1.1;
      fastcgi_param SERVER_SOFTWARE nginx/$nginx_version;
      fastcgi_param REMOTE_ADDR $remote_addr;
      fastcgi_param REMOTE_PORT $remote_port;
      fastcgi_param SERVER_ADDR $server_addr;
      fastcgi_param SERVER_PORT $server_port;
      fastcgi_param SERVER_NAME $server_name;
      fastcgi_param HOST_NAME $hostname;
      fastcgi_param HTTPS $https;
      fastcgi_param HTTP_COOKIE $http_cookie;
      fastcgi_param HTTP_ACCEPT_LANGUAGE $http_accept_language;
      fastcgi_param REDIRECT_STATUS 200;
      fastcgi_param SCRIPT_ROOT /srv/cgi-bin ;
      fastcgi_param DATA_ROOT /srv/cgi-bin ;
    }

    location / {
    }
All methods and variables defined in the packages uHTML and uHTMLnode at a glance.
The uHTML package loads all modules from the script directory that match uHTML/*.pm or *-uHTML.pm. It provides methods that assign code to tags and attributes, and it invokes the uHTML to HTML translation.
uHTML::registerTagCode( $TagName,$Code ) ;
Bind the function $Code to the tags named $TagName. The function $Code will be called with a reference to the uHTML node corresponding to the tag: $Code( $Node ). The function is expected to alter and adjust the tag attributes and content. The modified tag gets automatically inserted into the HTML output.
If more than one function is bound to one tag, the functions are daisy-chained. The execution order of those functions is not determined.
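As an illustration of a handler that only adjusts a tag, the following sketch adds a rel attribute to links that open a new window. The function name SafeLink and the choice of the <a> tag are hypothetical; only setAttr, testAttr and registerTagCode are taken from this reference. The registration is guarded so that the sketch also loads where uHTML is not installed.

```perl
use strict;
use warnings;

# Hypothetical handler for <a> tags: add rel="noopener" to links that
# carry a target attribute but no rel attribute yet. The node is only
# modified; according to the description above uHTML inserts the
# modified tag into the output by itself.
sub SafeLink
{
  my $Node = shift;

  $Node->setAttr('rel', 'noopener')
    if $Node->testAttr('target') and not $Node->testAttr('rel');
}

# Called as written in this reference; skipped when uHTML is missing.
uHTML::registerTagCode('a', \&SafeLink) if eval { require uHTML; 1 };
```

Because the handler never produces output itself, it can be combined with other handlers bound to the same tag, as long as it does not rely on their execution order.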
uHTML::registerTag( $TagName,$Code ) ;
Bind the function $Code to the tags named $TagName. The function $Code will be called with a reference to the uHTML node corresponding to the tag: $Code( $Node ). The function is expected to insert the necessary data using the appropriate uHTMLnode methods $node->map( $HeadText,$TailText ) or $node->insert().
uHTML::registerAttrCode( $VarName,$Code ) ;
uHTML::registerVar( $VarName,$Code ) ;
Bind the function $Code to the attribute variable called $VarName. Both functions are identical. The attribute variable gets replaced by the return value of the function.
The function $Code is called with a reference to the node representing the tag, the name of the attribute containing the function, and the function name, followed by the function arguments: $Code( $Node,$Attribute,$Function,$Value1,$Value2, ... ).
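A minimal sketch of such an attribute function, assuming the argument order given above. The name Year and its optional offset argument are hypothetical and only serve to show how the arguments arrive and that the return value is what replaces the attribute variable.

```perl
use strict;
use warnings;

# Hypothetical attribute function: returns the current year, optionally
# shifted by an integer offset passed as the first function argument.
sub Year
{
  my ($Node, $Attribute, $Function, @Values) = @_;

  # Ignore a missing or non-numeric argument.
  my $Offset = (defined($Values[0]) && $Values[0] =~ /^-?\d+$/) ? $Values[0] : 0;
  my $Year   = (localtime)[5] + 1900;

  return $Year + $Offset;
}

# registerVar and registerAttrCode are described above as identical.
uHTML::registerVar('Year', \&Year) if eval { require uHTML; 1 };
```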
uHTML::register( $Name,$Code ) ;
Bind the function $Code to the attribute variable called $Name and to a tag called $Name simultaneously. The tag or attribute variable gets replaced by the return value of the function.
The function $Code is called with a reference to the node representing the tag, the name of the attribute containing the function, and the function name, followed by the function arguments: $Code( $Node,$Attribute,$Function,$Value1,$Value2, ... ).
If the function is called in reference to a tag, $Attribute and $Function are not defined. In this case the function has to set the values $Value1, $Value2, ..., if necessary, from the attributes of the tag using $Node->Attr( $Name ).
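The two calling situations can be told apart by testing $Attribute. The following sketch assumes a hypothetical function Greeting that takes one value: as an attribute function it receives the value as an argument, as a tag it reads it from a hypothetical name attribute with $Node->Attr, as described above.

```perl
use strict;
use warnings;

# Hypothetical function usable both as a tag and as an attribute variable.
sub Greeting
{
  my ($Node, $Attribute, $Function, @Values) = @_;

  # Called for a tag: $Attribute is not defined, so the value has to be
  # taken from the attributes of the tag itself.
  @Values = ($Node->Attr('name')) unless defined $Attribute;

  my $Name = (defined($Values[0]) && length($Values[0])) ? $Values[0] : 'world';

  # The return value replaces the tag or the attribute variable.
  return "Hello, $Name";
}

uHTML::register('Greeting', \&Greeting) if eval { require uHTML; 1 };
```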
uHTML::Tags() ;
Returns a list of all tags that have a function assigned.
uHTML::TestTag( $Name )
Check if some code is bound to the tag $Name.
uHTML::TestVar( $Name )
Check if some code is bound to the attribute variable $Name.
uHTML::FileStart
Set the current file name for debug output. Ignored in production mode.
uHTML::FileEnd
Reset the current file name for debug output to the previous name. Ignored in production mode.
uHTML::recoded_list( $uhtml,$env ) ;
Translates the uHTML data $uhtml into HTML. Returns a reference to an array of HTML chunks containing the final HTML code. $env provides a reference to the environment. If it is not given, the current environment is used.
uHTML::recode( $uhtml,$env ) ;
Translates the uHTML data $uhtml into HTML. Depending on the context, it returns a scalar or an array of strings containing the final HTML code. $env provides a reference to the environment. If it is not given, the current environment is used.
uHTML produces some (sparse) error output. It is advisable to switch it off in production mode. In production mode HTML comments get removed and the code gets slightly compacted, too. The production mode is activated by setting $uHTML::FileName = '' ; prior to the translation of uHTML to HTML.
The package uHTMLnode provides the hierarchical structure for the uHTML code. After the translation it contains the HTML data.
The uHTMLnode data structure is only remotely related to the HTML nodes in DOM. The data structure is intended to be manipulated only through its methods.
FirstChild: first child node
LastChild: last child node
Parent: parent node
Prev: previous node (Null for the first node in a hierarchy level)
Next: following node (Null for the last node in a hierarchy level)
Name: name of the node (tag name)
End: true if the node has a closing counterpart (e.g. <div> ... </div>)
XMLClose: true if the node has no closing counterpart but is noted in XML manner with a "/" before the closing bracket (e.g. <img ..... />)
Attributes: reference to a HASH containing the attributes of a tag
Text: text within a node up to the first child node or the end of the node (corresponds to the first text node in DOM if the first DOM child node is a text node)
Trailer: text following a node (corresponds in DOM to the first text node following the node if the first following node is a DOM text node)
tainted: recursive processing of the node necessary
HTML: final HTML code
ENV: pointer to the current environment, decisive in FCGI environments

uHTMLnode->new( $Name,$Text,$Prev,$env ) ;
Create a new node with the name $Name, a trailing text $Text and the preceding node $Prev. This method is called by the uHTML package and is seldom needed outside of it.
$node->name() ;
The name of a node. It equals the name of the uHTML tag represented by the node. By passing an argument, $node->Name($NewName), the tag can be renamed.
$node->parent() ;
The parent node.
$node->prev() ;
The preceding node.
$node->next() ;
The following node.
$node->copy() ;
Copies a node. This function is useful to generate lists. The copy of the node is not hooked into the structure of the original uHTML file, although the parent node is correctly assigned. All child nodes are copied as well. The trailing text of the node is not included in the copy.
$node->prepend( $Node ) ;
Insert a node into the uHTML tree before the current node.
$node->append( $Node ) ;
Insert a node into the uHTML tree after the current node.
$node->embed( $Name ) ;
Creates a new node $Name and embeds the current node in it. In effect the current node gets replaced by the new node $Name while the current node becomes the only child of the new node.
$node->firstChild() ;
First subordinated node.
$node->lastChild() ;
Last subordinated node.
$node->addChild( $Child,$PrevChild ) ;
Add a child node after the child node $PrevChild. If $PrevChild is not defined, the node is added as the new first child; if $PrevChild equals $node->lastChild(), the new node becomes the new last child. The node $Child mustn't be a child of $node. If $Child has its parent node set, it will be correctly moved within the uHTML document.
$node->appendChild( $Child ) ;
Add a child node as the new last child. The node $Child mustn't be a child of $node. If $Child has its parent node set, it will be correctly moved within the uHTML document.
$node->detach( $KeepTrailer ) ;
Detaches a node from the uHTML structure. Normally the trailing text gets deleted in the process. To keep it, $KeepTrailer must be true.
$node->delete() ;
Deletes a node from the uHTML structure.
$node->attr( $Name ) ;
The value of a single attribute as a string. Any attribute functions it contains get interpreted. If more than one attribute with the same name exists, the values are concatenated. If a value is provided ($node->Attr( $Name,$Value );), the attribute is set to this value. If the attribute does not exist, it gets created.
$node->rawAttr( $Name ) ;
The original value of a single attribute as a string. Any attribute functions it contains are not interpreted. If more than one attribute with the same name exists, the values are concatenated. If a value is provided ($node->RawAttr( $Name,$Value );), the attribute is set to this value. If the attribute does not exist, it gets created.
$node->codeAttr( $Name ) ;
The value of a single attribute as a string. Any attribute functions it contains get interpreted. If more than one attribute with the same name exists, the values are concatenated.
$node->setAttr( $Name,$Value ) ;
Sets the attribute $Name to $Value. If the attribute does not exist, it gets created.
$node->testAttr( $Name ) ;
Tests the existence of the attribute $Name. This is necessary to test for attributes without any value provided.
$node->testAnyAttr( $Name1,$Name2,$Name3, ... ) ;
Tests the existence of any of the attributes with the provided names.
$node->testAllAttr( $Name1,$Name2,$Name3, ... ) ;
Tests the existence of all attributes with the provided names.
$node->addAttr( $Name1,$Name2,$Name3, ... ) ;
Creates the attributes $Name1, $Name2, $Name3, ..., without assigning a value to them.
$node->deleteAttr( $Name1,$Name2,$Name3, ... ) ;
Deletes the attributes $Name1, $Name2, $Name3, ...
$node->attributes() ;
Reference to the attributes of a node. E.g. the style of a tag can be accessed by $node->attributes()->{'style'}. The methods above which access single attributes should be preferred.
$node->text() ;
The text inside a closed tag up to the first child tag. It corresponds to the first text node in DOM if the first DOM child node is a text node. It can be altered by passing an argument.
$node->trailer() ;
The text following a tag up to the next tag. It corresponds in DOM to the first text node following the node if the first following node is a DOM text node. It can be altered by passing an argument.
$node->end() ;
True if a tag is closed (the closing tag exists). If an argument is passed, the node becomes a closed node or an open node depending on the argument.
$node->XMLClose() ;
True if the tag is closed by a "/>" instead of a simple ">". This can be enforced or removed by passing a corresponding argument.
$node->map( $HeadText,$TailText ) ;
Map a node into the HTML output without its tags, preceding the node content with $HeadText and closing it with $TailText. If a node has no closing tag, $TailText directly follows $HeadText. In practice, it replaces the opening and closing tags with $HeadText and $TailText. This is the most common way to produce HTML output in functions hooked into uHTML using uHTML::registerTag( $TagName,$Code ).
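As a sketch of the replacement of opening and closing tags, the following handler turns a hypothetical <note> ... </note> tag into a div with a fixed class. The tag name, the function name Note and the class name are assumptions made for illustration; the registration call follows the include.pm example.

```perl
use strict;
use warnings;

# Hypothetical handler: <note> ... </note> becomes
# <div class="note"> ... </div>. The content between the tags is kept,
# only the opening and closing tags are replaced.
sub Note
{
  my $Node = shift;

  $Node->map('<div class="note">', '</div>');
}

# Registered in the same way as Include in include.pm; skipped when
# uHTML is not installed.
uHTML->registerTag('note', \&Note) if eval { require uHTML; 1 };
```

Passing an empty string as $TailText, as include.pm does, is the natural choice for tags that are written without a closing counterpart.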
$node->insert() ;
Inserts a node's HTML code including the tags and attributes. It is meant to insert an altered node into the HTML output. This is the second way to produce HTML output in functions hooked into uHTML using uHTML::registerTag( $TagName,$Code ).
$node->HTML() ;
The HTML code of a node after a map() or insert() was performed. It is empty before a map() or insert() on the node is done. It is possible to set this value directly by passing an argument: $node->HTML( $html ). Setting it replaces the resulting HTML code with $html.
$node->appendText( $text ) ;
Append $text to the existing HTML output.
$node->env() ;
Returns a reference to the current environment in which an HTTP request is performed.