WebMake Documentation (version 1.1)

All-In-One Documentation

The Blurb

WebMake is a simple content management system, based around a templating system for HTML documents, with lots of built-in smarts about what a "typical" informational website needs in the way of functionality: metadata, sitemapping, navigational aids, and (of course) embedded Perl code. ;)

  • Creates portable sites: It requires no dynamic scripting capabilities on the server; WebMake sites can be deployed to a plain old FTP site without any problems.

  • No need to edit lots of files: A multi-level website can be generated entirely from 1 WebMake file containing content, links to content files, perl code (if needed), and output instructions.

  • Useful for team work: Since the file-to-page mapping is no longer required, WebMake allows the separation of responsibilities between the content editors, the HTML page designers, and the site architect. Only the site architect needs to edit the WebMake file itself, or know perl or WebMake code. Standard file access permissions can be used to restrict editing by role.

  • Efficient: WebMake supports dependency checking, so a one-line change to one source file will not regenerate your entire site -- unless it's supposed to. Only the files that refer to that chunk of content, however indirectly, will be modified.

  • Supports content conversion, on the fly: Text can be edited as standard HTML, converted from plain text (see below), or converted from any other format by adding a conversion method to the HTML::WebMake::FormatConvert module.

  • Edit text as text, not as HTML: One of the built-in content conversion modules is Text::EtText, which provides an easy-to-edit, easy-to-read and intuitive way to write HTML, based on the plain-text markup conventions we've been using for years.

  • Rearrange your site in 30 seconds: Since URLs can be referred to symbolically, pages can be moved around and URLs changed by changing just one line. All references to that URL will then change automatically. This is vaguely Xanalogical.

  • Scriptable: Content items and output URLs can be generated, altered, or read in dynamically using perl code. Perl code can even be used to generate other perl code to generate content/output URLs/etc., recursively. New tags can be defined and interpreted in perl.

  • Extensible: New tags (for use in content items or in the WebMake file itself) can be added from perl code, providing what amounts to a dynamically-loaded plugin API.

  • Inclusion of text: Content can incorporate other content items, simply by referring to their names. This is a form of Xanadu-style transclusion.


WebMake as a CMS

WebMake is, arguably, a Content Management System, or CMS.

To be more specific, it's oriented entirely towards generating a relatively static site, such as a weblog, a news site (without comments or personalisation) or a typical informational site.

It does not have any dynamic, database-driven features suitable for "live" sites that update frequently with dynamic data; nor does it have support for "personalisation" features, where the site displays different data based on what the user presents in their HTTP request. (Of course, using WebMake does not preclude using PHP, mod_perl, Mason etc. to provide these.)

Here are the relevant details of what it can do.

WebMake's CMS Features

  • Separation between content and layout

    Since, logically, content and layout are entirely separate tasks, they should be easy to keep separate in the CMS.

    WebMake uses content references to include content into pages, and implement templating. This allows you to separate the content text from the template layout HTML.

  • No requirement for text editors to know HTML

    Only the layout staff should really need to know HTML, so the staff who provide text content can do this without HTML knowledge.

    WebMake provides Text::EtText, which provides an easy-to-edit, easy-to-read and intuitive way to write HTML, based on the plain-text markup conventions we've been using for years.

  • Generation of pages automatically, using metadata from content items

    It should be possible to generate index pages, sitemaps, navigation links, and other text automatically, based on properties and metadata of the pieces of content loaded.

    WebMake supports this using metadata on content items.

  • Flexible URL support

    It should be trivial to rearrange a site, if required, totally changing the URLs used in the site's pages.

    WebMake supports this by using symbolic URL references, which can be modified by changing one line, causing references to that URL throughout the site to change.

What WebMake Is Missing

  • Edit-In-Page Functionality

    Most CMSes boast a nicer, browser-based user interface to creating, naming, uploading and filling out content items and media.

    WebMake currently leaves you with your trusty text editor, to edit the .wmk file and add the tags needed to define these. This is fine if you're of the UNIX mindset (like me ;) -- but it does need work.

  • Database Support

    It would be nice if WebMake could load content from a database. It currently cannot, although there's nothing in the architecture that would preclude this; there just has not been a need, just yet.

    Unfortunately, this may not be possible -- an IBM software patent details a mechanism whereby a server can dynamically rebuild its pages, based on changes to objects in a database. WebMake could run afoul of this if database support is added (although there are a few points where this could be avoided).

  • XML Support

    This will definitely arrive -- as soon as a good XSLT engine becomes part of Perl, or at least becomes easy to install from CPAN. It's on my list ;)


WebMake Operation

When you run WebMake, it'll first search for a file ending with .wmk in the current directory, then in the parent directory, and so on 'til it hits the root directory.

Alternatively, if you use the -R switch, it'll search relative to the filename specified on the command-line; this is very handy if you're calling WebMake from a macro in your editor or IDE, as it means you don't even have to be running the editor in the same working directory as the files you're working on.
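As a rough illustration of the upward search described above, here is a small Python sketch. It is not WebMake's own (Perl) code; the function name find_wmk_file and the choice of the alphabetically first .wmk file when a directory holds several are assumptions made for the example.

```python
import os

def find_wmk_file(start_dir):
    """Walk from start_dir towards the root; return the first .wmk file found, or None."""
    current = os.path.abspath(start_dir)
    while True:
        # Assumption: if several .wmk files exist, take the alphabetically first.
        candidates = sorted(f for f in os.listdir(current) if f.endswith(".wmk"))
        if candidates:
            return os.path.join(current, candidates[0])
        parent = os.path.dirname(current)
        if parent == current:
            # dirname() of the root is the root itself: nothing left to search.
            return None
        current = parent
```

With -R, the same search would simply start from the directory of the file named on the command line instead of the current working directory.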

Anyway, once it finds the WebMake file, it reads the file and parses it. Contents and media statements will cause it to search directories or other data sources for content (directly includable files and text blocks) or media (static content that can be linked to or otherwise referred to with a href). Include statements will cause it to directly include a block of code from another file.

Finally, once the WebMake file has been parsed, the list of out files will have been read. Each of these is roughly equivalent to a target in traditional UNIX make(1) terminology.

If a target has been specified on the command line, that file will be made; otherwise, all the out files named in the WebMake file will be made.

Dependencies And Other Optimisations

"Making" the target is not the end of it -- strictly speaking, the target may or may not be updated. WebMake tracks the dependencies of each file, and if these have not changed, the file will not be rebuilt.

That's the first optimisation. However, it doesn't always work; if some of the file's text is generated by, or depends on, text that contains dynamic Perl code, WebMake will always have to rebuild the file.
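The decision described in the last two paragraphs can be sketched as follows. This is a simplified Python illustration, not WebMake's implementation; the function name, the has_dynamic_code flag, and the idea of storing dependencies as a name-to-modification-time mapping are assumptions.

```python
def needs_rebuild(cached_mtimes, current_mtimes, has_dynamic_code=False):
    """Decide whether an output file must be regenerated.

    cached_mtimes and current_mtimes map each dependency's name to its
    modification time on the previous run and on this run.
    """
    if has_dynamic_code:
        # Text produced by dynamic Perl code cannot be proven unchanged.
        return True
    # An added, removed, or touched dependency makes the mappings differ.
    return cached_mtimes != current_mtimes
```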

To avoid continually "churning" the file, regenerating it every time WebMake is run, a comparison step takes place. Before the file is written to disk, WebMake compares the file in memory with the file on disk; if there are no changes, the on-disk file will not be modified in any way. This means tools like rsync(1), rdist(1) or even make(1) itself will work fine with a WebMake site.

All of these optimisations can be overridden by using the -F (freshen) command-line switch; this will force output whether or not the files have changed.
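The comparison step, and the way a "freshen" override bypasses it, might look like this in Python. It is a sketch under stated assumptions: write_if_changed and its force parameter are hypothetical names, and the real tool is written in Perl.

```python
import os

def write_if_changed(path, new_text, force=False):
    """Write new_text to path unless the file on disk is already identical.

    Returns True if the file was written, False if it was left untouched.
    """
    if not force and os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            if fh.read() == new_text:
                # Identical: leave the file, and its timestamp, alone.
                return False
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(new_text)
    return True
```

Because an unchanged file keeps its old modification time, timestamp-based tools see nothing new to copy or rebuild.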

Ensuring A Seamless Transition

A very large (or very complicated) WebMake site can take a while to update. To avoid broken links while updating the site, WebMake generates all output into temporary files called filename.new; once all the output has been generated, these are renamed into place. This minimises the time during which there may be inconsistencies in the site.
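A minimal Python sketch of this write-then-rename approach follows. The function name publish and the dictionary argument are assumptions for illustration; only the filename.new convention comes from the text above.

```python
import os

def publish(outputs):
    """outputs maps each final path to its text.

    Every file is first written as path + ".new"; only once all of them
    exist are they renamed into place.
    """
    staged = []
    for path, text in outputs.items():
        tmp = path + ".new"
        with open(tmp, "w", encoding="utf-8") as fh:
            fh.write(text)
        staged.append((tmp, path))
    # The slow part (generation) is over; renaming is quick.
    for tmp, path in staged:
        os.replace(tmp, path)
```

The window in which old and new pages coexist is then limited to the rename loop, not the whole generation run.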

Caching

Since WebMake uses dependencies to avoid rebuilding the entire site every time, it needs to cache metadata and dependency information somewhere.

Currently this data is stored in the .webmake subdirectory of your home directory, in a file called filename/cache.db, where filename is a sanitised version of the WebMake file's name.


Invoking WebMake

WebMake can be run using the command-line tool webmake, or by using the Perl module HTML::WebMake::Main.

In addition, the EtText format can be used through its command-line tools, or by using the Perl modules directly.

The command-line tools' POD documentation:

And the POD documentation for the Perl module:


How to Use WebMake

Chances are, you already have an HTML site you wish to migrate to WebMake. This document introduces WebMake's way of doing things, and how to go about a typical migration.


Place The .wmk File

First, pick a top-level directory for the site; that's where you'll place your .wmk file. All the generated files should be beneath this directory. In this example I'll call the file index.wmk.


Make Templates

Next, identify the page templates used in the site. To keep it simple, let's imagine you have only one look and feel on the pages, with the usual stuff in it; high-level HTML document tags, such as <html>, <head>, <title>, <body>, that kind of stuff. There may also be some formatting, such as a <table> with a side column containing links, etc., or a top-of-page title. All of these are good candidates for moving into a template. I typically call these templates something obvious like page_template or sitename_template, where sitename is the name of the site.

For this example, let's imagine you have the HTML high-level tags and a page title as your typical template items.

So edit the index.wmk file, and add a template content item, by cutting and pasting it from one of your pages. Instead of cutting and pasting the real title, use a metadata reference: $[this.title]. Also, replace the text of the page with ${page_text}; the plan is that, before the template is referenced, the page_text content item will have been set to the text you wish to use.

<webmake>
<content name=page_template>
  <html><head><title>$[this.title]</title></head>
  <body bgcolor=#ffffff><h1>$[this.title]</h1>
  <hr>
    ${page_text}
  <hr>
  </body></html>
</content>

Grab The Pages' Text

Next, run through the pages you wish to WebMake-ify, and either:

  1. move them into a "raw" subdirectory, from where WebMake can read them with a <contents> tag, or

  2. include them into the index.wmk file directly.

It's a matter of taste; I initially preferred to do 1, but nowadays 2 seems more convenient for editing, as it provides a very easy way to break up long pages, and it makes search-and-replace easy. Anyway, it's up to you. I'll illustrate using 2 in this example.

Give each content item a name. I generally use the name of the HTML file, but with a .txt extension instead of .html to mentally differentiate the input from the output.

Strip the template elements (head tag, surrounding eye-candy tables, etc.) from each page, leaving just the main text body behind. Keep the titles around for later, though.

<content name="document1.txt">
  ....your html here...
</content>
<content name="document2.txt">
  ....your html here...
</content>
<content name="document3.txt">
  ....your html here...
</content>

Convert To EtText (OPTIONAL!)

Now, one of the best bits of WebMake (in my opinion) is EtText, the built-in simple text markup language; to use this, run the command-line tool ethtml2text on each of your HTML files to convert them to EtText, then include that text, instead of the HTML, as the content items. Don't forget to add format="text/et" to the content tag's attributes, though:

<content name="document1.txt" format="text/et">
  ....your ettext here...
</content>
...

To keep things simple, the examples from now on assume you haven't used EtText.


Add Titles

Next, you need to set the titles in the content items, so that they can be used in higher-level templates, such as the page_template content item we defined earlier.

To really get some power from WebMake, use metadata to do this.


What is Metadata?

A metadatum is like a normal content item, except it is exposed to other pages in the index.wmk file. Normally, you cannot reliably read a dynamic content item that was set from another page; if one content item sets a variable like this:

<{set foo="Value!"}>

Any content items evaluated after that variable is set can access ${foo}, as long as they occur on the same output page. However, if they occur on another output page, they may not be able to access ${foo}.

To get around this, WebMake includes the <wmmeta> tag, which allows you to attach data to a content item. This data will then be accessible, both to other pages in the site (as $[contentname.metaname]), and to other content items within the same page (as $[this.metaname]).

Think of them as like size, modification time, owner etc. on files; or member variables in an object-oriented language.


Anyway, titles of pages are a perfect fit for metadata. So convert your page titles into <wmmeta> tags like so:

<content name="document1.txt">
  <wmmeta name="title">Your Title Here</wmmeta>
  ....your html here...
</content>
...

Sometimes, for example if you plan to generate index pages or a sitemap later on, you may wish to add a one-line summary of the content item as a metadatum called abstract. I'll leave it out of the examples, just to keep them simple.

Metadata should always be referred to in $[square brackets]. I'll explain why in the next section.


Naming The Output URLs

Finally, you've assembled all the content items; now to tell WebMake where they should go. This is accomplished using the <out> tag.

Each output URL, in this example, requires the following content items:

  • page_template, which refers to:

      • title

      • page_text

As you can see, both title and page_text depend on which output URL is being written; otherwise you'll wind up with lots of finished pages containing the same text. ;)

There are several ways to deal with this.

  1. Set a variable in the <out> text, using <{set}>, to the name of the content item that should be used for the page_text.

  2. Derive the correct value for page_text using the name of the <out> section itself.

The easiest way is the latter. WebMake defines a built-in "magic" variable, ${WebMake.OutName}, which contains the name of the output URL. (Note that output URLs have both a name and a filename; you'll see why in the next section.)

To do this, define another content item:

<content name=out_helper>
  <{set page_text="${${WebMake.OutName}.txt}" }>
  ${page_template}
</content>

As you can see, this takes the name of the output URL, appends .txt to it, and sets a variable called page_text to contain the content item named thereby.
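To see why a nested reference such as ${${WebMake.OutName}.txt} works, it helps to expand the innermost reference first. The Python sketch below shows one way to get that behaviour; it is an illustration only, with a hypothetical expand function and a plain dictionary standing in for WebMake's content items.

```python
import re

# Matches a ${...} reference that contains no further reference inside it.
INNERMOST = re.compile(r"\$\{([^${}]+)\}")

def expand(text, items):
    """Repeatedly replace the innermost ${name} with items[name]."""
    while True:
        match = INNERMOST.search(text)
        if match is None:
            return text
        # An undefined name raises KeyError in this sketch.
        text = text[:match.start()] + items[match.group(1)] + text[match.end():]

items = {
    "WebMake.OutName": "document1",
    "document1.txt": "<p>Text of document 1</p>",
}
page_text = expand("${${WebMake.OutName}.txt}", items)
```

Here the inner reference becomes document1 first, leaving ${document1.txt}, which is then replaced by that content item's text.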

BTW: you could simply skip defining this "helper" content item altogether, and just go to the top of the file and change the template to refer directly to ${${WebMake.OutName}.txt} instead of ${page_text} . That's what I usually do.

But what about the title? Handily, since we defined the titles as metadata, and referred to them as $[this.title] in page_template, this is taken care of; once the ${page_text} reference is expanded, $[this.title] will be set.

Remember I mentioned that metadata should always be referred to in $[square brackets]? Here's why. Square bracket references, or deferred references, are evaluated only after normal, "squiggly bracket" content references.

The example page contains the following content references:

  • ${page_template}, which refers to:

      • $[this.title]

      • ${page_text}

This allows ${page_text} to be expanded first, which causes this.title to be set. Finally, $[this.title] is expanded last.

If page_template had used a normal content reference to refer to ${this.title}, WebMake would have tried to expand it before ${page_text}, since it appeared in the file earlier.
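The ordering can be made concrete with a two-pass Python sketch: curly-bracket references are expanded first, recording metadata as they go, and square-bracket references are filled in afterwards. This is a simplification and not WebMake's code; render, the two dictionaries, and the rule that the most recently expanded item supplies "this" are all assumptions.

```python
import re

CURLY = re.compile(r"\$\{([^${}]+)\}")
SQUARE = re.compile(r"\$\[([^\[\]]+)\]")

def render(template_name, items, metadata):
    """items: name -> text; metadata: name -> {metaname: value}."""
    this = {}
    text = "${" + template_name + "}"
    # Pass 1: expand ${...}; each expanded item contributes its metadata.
    while True:
        match = CURLY.search(text)
        if match is None:
            break
        name = match.group(1)
        this.update(metadata.get(name, {}))
        text = text[:match.start()] + items[name] + text[match.end():]

    # Pass 2: expand the deferred $[...] references.
    def lookup(match):
        # Split on the last dot, since content names may contain dots.
        owner, _, key = match.group(1).rpartition(".")
        source = this if owner == "this" else metadata[owner]
        return source[key]

    return SQUARE.sub(lookup, text)

items = {
    "page_template": "<title>$[this.title]</title>${page_text}",
    "page_text": "${document1.txt}",
    "document1.txt": "<p>Body</p>",
}
metadata = {"document1.txt": {"title": "Your Title Here"}}
page = render("page_template", items, metadata)
```

Even though $[this.title] appears before ${page_text} in the template, it is resolved only after the page text has been expanded and has supplied the title.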

Anyway, I digress.


Writing The <out> Tags

Each output URL needs an <out> tag, with a name and a file. The name provides a symbolic name which one can use to refer to the URL; the file names the file that the output should be written to.

Typically the name should be similar to the page's main content item's name, to keep things simple and allow the shortcut detailed in the previous section to work.

Also, sites typically use a filename pretty similar to the name, for obvious reasons. At least, they do to start with. Further down the line, you may need to move one (or more) pages around in the URL or directory hierarchy. Since you've been referring to them by name, instead of by URL or by filename, this means changing only one attribute in the <out> tag, instead of trying to do a global search and replace throughout hundreds of HTML files.

Anyway, here's a sample <out> tag:

<out name="document1" file="document1.html"> ${out_helper} </out>

But what about multiple outputs? Two choices:

  1. Simply list all the output HTML files, one after the other. Works fine for small sites, and it's simple.

  2. Use a <for> tag.

I don't think you need to see how 1. works; it's pretty obvious. Let's see how 2. does it:

<for name="page" values="document1 document2 document3">
  <out name="${page}" file="${page}.html"> ${out_helper} </out>
</for>

Simple.


Putting <out> Names To Work

So you've named the output URLs. However, all your content items contain static URLs in the HREFs! Let's fix that.

This really is up to you; it's a global search-and-replace. Let's say you want to fix all links to "document1.html". Replace this:

<a href="document1.html">foo</a>

with a URL reference, like this:

<a href="$(document1)">foo</a>

Now, even if "document1.html" is renamed to "blah/whatever/doc1.cgi", you won't have to do a search-and-replace again.
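The idea behind URL references can be shown with a short Python sketch: one table maps each <out> name to its file, and every $(name) is looked up in it. The function name and the dictionary are hypothetical, and the sketch ignores details such as working out relative paths between directories.

```python
import re

URL_REF = re.compile(r"\$\(([^()]+)\)")

def expand_url_refs(text, out_files):
    """Replace each $(name) with the file registered for that <out> name."""
    return URL_REF.sub(lambda m: out_files[m.group(1)], text)

link = '<a href="$(document1)">foo</a>'
before = expand_url_refs(link, {"document1": "document1.html"})
# Moving the page means changing one table entry; the link text is untouched.
after = expand_url_refs(link, {"document1": "blah/whatever/doc1.cgi"})
```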


Getting Advanced - Adding Navigation and a Sitemap

This hasn't been written yet. Sorry! (TODO)


Tips On Using WebMake

Editor/IDE Support

The root directory of the WebMake distribution includes a Vim rc file to support syntax-highlighting for WebMake. To use it, make a directory called .vim in your home directory, copy it there, and add the following lines to your .vimrc:

au BufNewFile,BufReadPost *.wmk so $HOME/.vim/webmake.vim
 map ,wm :w!<CR>:! ~/ftp/webmake/webmake -R %<CR>

Once you do this, the macro sequence ,wm will cause a rebuild of the site which contains the file you're currently editing.

The Button

WebMake now includes a WebMake button:

Feel free to include it on your pages; but please, if possible, add it with a href to http://webmake.taint.org/, so people who are curious can find out more about WebMake.

It's 88 pixels wide and 31 high, by the way. If you look in the "images" directory of the distribution, there's also a 130x45 one and a 173x60 one.

To make things really easy, here's some cut-and-paste HTML for the image:


 <a href="http://webmake.taint.org/"><img
 src="http://webmake.taint.org/BuiltWithWebMake.png"
 width="88" height="31" border="0" /></a>


Contributors to WebMake

Here's a list of people who've contributed to WebMake:

  • Justin Mason <jm /at/ jmason.org>: original author and maintainer

  • Mark McLoughlin <mark /at/ skynet.ie>: added perlout directive, fixes to HTML cleaner

  • Caolan McNamara <caolan /at/ csn.ul.ie>: EtText contributions; lists, pre-formatted text, lots of suggestions

  • Matthew Clarke <clamat /at/ van.maves.ca>: doco fix for datasource documentation

  • rudif /at/ bluemail.ch: lots of help with supporting Windows

Thanks all! Patches and suggestions are welcomed -- send them in! (By the way, patch contributors get listed at the top, 'cos patches save me writing the code ;)


The <webmake> Tag

The <webmake> section is required in a WebMake file. Any text before or after this section will be ignored.

In the current implementation, you can leave these tags out, but it isn't advised; their requirement may be enforced later.

Example


  <webmake>
    [...WebMake file omitted...]
  </webmake>


The <include> Tag

Arbitrary files can be included into the current WebMake file using this tag. It has one attribute, file, which names the file to include.

A set of libraries is available to include, distributed with WebMake. See the Included Library Code section of the index page for their documentation. However, these should be loaded using the <use> tag instead of this one.

Example


  <include file="inc/footer.wmk" />


The <use> Tag

WebMake supports "plugin" libraries, which are generally other .wmk files or Perl modules which can be loaded to extend WebMake's functionality.

For example, there are standard plugins to provide support for "download" links, which allow links to files to include their size, ownership information, etc.; there's also a plugin which allows HTML tables to be defined using a comma-separated value list.

It has one attribute, plugin, which names the plugin to load.

Plugins can be loaded from the WebMake perl library directory, or from the user's home directory. The search path for a plugin is as follows:

  • ~/.webmake/plugins/plugin.wmk

  • ${WebMake.PerlLib}/plugin.wmk

The set of standard plugins is listed in the Included Library Code section of the index page.

Example


  <use plugin="safe_tag" />


The <content> Tag

The <content> tag has one required attribute: its name, which is used to substitute in that section's text, by inserting it in other sections or out tags in a curly-bracket reference, like so:

${foo}

The following attributes are supported. These can also be set using the <attrdefault> tag.

format

This defines what format the content is in, which allows markup languages other than HTML to be used; WebMake will convert to HTML format, or other output formats, as required using the HTML::WebMake::FormatConvert module. The default value is "text/html".

asis

This will block any interpretation of content or URL references in the content item, until after it has been converted into HTML format. This is useful for POD documentation, which may be embedded inside a file containing other text; without "asis", the text would be scanned for content references before the POD converter stripped out the extraneous bits. The default value is "false".

map

Whether the content item should be mapped in a site map, or not. The default value is "true".

up

The name of the content item which is this content item's parent, in the site map.

isroot

Whether or not this content item is the root of the site map. The default value is "false".

If you wish to define a number of content sections at once, they can be searched for and loaded en masse using the <contents> tag.

Every content item can have metadata associated with it. See the documentation for the <wmmeta> tag for details.

Defining Content Items On-The-Fly

The <{set}> processing instruction can be used to define small pieces of content on the fly, from within other content or <out> sections.

In addition, Perl code can create content items using the set_content() function.

Using Content From Perl Code

Perl code can obtain the text of content items using the get_content() function, and can treat content items as whitespace-separated lists using get_list().

In addition, each content item has a range of properties and associated metadata; the get_content_object() method allows Perl code to retrieve an object of type HTML::WebMake::Content representing the content item.

Example


  <content name="foo" format="text/html">
  <em>This is a test.</em>
  </content>

  <content name="bar" format="text/et">
  Still Testing
  -------------

  So is this!
  </content>


The <contenttable> Tag

Quite often, it's handy to define small (one-line) content items quickly, in bulk, directly inside the WMK file itself. The <contenttable> tag provides a good way to do this.

Firstly, pick a delimiter character, such as |. Set the delimiter attribute to this character.

Next, list a table of content names and their values, separated by a delimiter character, one name-value-pair per line.

Note: if you would prefer to load the content items from a separate file, the <contents> tag is better suited.

Example


  <contenttable delimiter="|">
  index.title|The index
  similar.title|Similar Projects
  administrivia.title|Administrivia
  reviews.title|Reviews
  </contenttable>
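Conceptually, each line of the table is split once at the delimiter into a name and a value. The Python sketch below illustrates that reading; it is not WebMake's parser, and the function name and the trimming of surrounding whitespace are assumptions.

```python
def parse_contenttable(block, delimiter="|"):
    """Turn 'name|value' lines into a dict of content items."""
    items = {}
    for line in block.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split at the first delimiter only, so values may contain it.
        name, _, value = line.partition(delimiter)
        items[name.strip()] = value.strip()
    return items

table = """
index.title|The index
similar.title|Similar Projects
administrivia.title|Administrivia
reviews.title|Reviews
"""
titles = parse_contenttable(table)
```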


The <contents> Tag

Content can be searched for using the <contents> tag, which allows you to search a data source (directory, delimiter-separated-values file, database etc.) for a pattern.

The attributes supported are listed on the data source page.

Apart from the fact that it loads many contents instead of one, it's otherwise identical to the content tag; see that tag's documentation for details on what attributes are supported.

Example


  <contents src="file:raw/text" name=".../*.txt" format="text/et" />
  <contents src="file:raw/html" name=".../*.html" format="text/html" />


The <media> Tag

WebMake allows you to refer to files and web pages symbolically, separating the site layout from the URL structure, and avoiding later problems with dangling links when a page's URL is changed. This is done using $(url_refs).

This works well for content items defined in WebMake, such as output files defined using the <out> tag. However, it is not handy when dealing with images or other files that are not generated using WebMake.

Therefore media files and external, non-WebMake files can be searched for using the <media> tag, which allows you to search a data source (directory, etc.) for a pattern.

The attributes supported are listed on the data source page.

Note that data sources which do not map to files in a filesystem, or other methods accessible to a web browser browsing your site, do not make sense for the <media> tag; so, for example, the svfile: protocol is not supported, as a web browser cannot load an image from a CSV file. (yet.)

As a result, currently only one data source protocol can be used with the <media> tag, namely file:.

Example


  <media src="file:images" name=".../*.gif" />
  <media src="file:images" name=".../*.jpg" />


Data Sources for the <contents> and <media> Tags

Contents or URLs can be searched for using the <contents> or <media> tags, which allow you to search a data source (directory, delimiter-separated-values file, database etc.) for a pattern.

Currently two data source protocols are defined, file: and svfile: . More will probably follow, especially if other people contribute them, hint hint ;)

file: is the default protocol, if none is specified.

Attributes Supported By Datasource Tags

src

All datasources require this attribute, which specifies a protocol and path, in a URL-style syntax:

protocol:path

name

This attribute is used to specify the pattern of data, under this path, which will be converted into content items. The part of the data's location which matches this name pattern will become the name of the item. Typically, glob patterns, such as "*.txt" or ".../*.html" are used.

prefix

The items' names can be further modified by specifying a prefix and/or suffix; these strings are prepended or appended to the raw name to make the name the content is given.

suffix

See above.

namesubst

A Perl-formatted s// substitution, which is used to convert source filenames to content names.

nametr

A Perl tr// translation, which is used to convert source filenames to content names.

listname

The name of a content item. This content item will be created, and will contain the names of all content items picked up by the <contents> or <media> search.

In addition, the attributes supported by the content tag can be specified as attributes to <contents>, including format, up, map, etc.

The content blocks picked up from a <contents> search can also contain metadata, such as headlines, visibility dates, workflow approval statuses, etc.

The file: Protocol

The file: protocol loads content from a directory; each file is made into one content chunk. The src attribute indicates the source directory, the name attribute indicates the glob pattern that will pick up the content items in question. The filename of the file will be used as the content chunk's name.

<contents src="stories" name="*.txt" />

Note that the files in question are not actually opened until their content chunks are referenced using ${name} or get_content("name").

Normally only the top level of files inside the src directory are added to the content set. However, if the name pattern starts with .../, the directory will be searched recursively:

<contents src="stories" name=".../*.txt" />

The resulting content items will contain the full path from that directory down, i.e. if the file stories/dir1/foo/bar.txt exists, the example above would define a content item called ${dir1/foo/bar.txt}.
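The naming rule for the file: protocol can be sketched in Python as follows. This only illustrates how names are derived from paths; find_content_names is a hypothetical helper, and treating a ".../" pattern as also matching top-level files is an assumption.

```python
import fnmatch
import os

def find_content_names(src_dir, pattern):
    """Return the content names a file: search over src_dir would define."""
    recursive = pattern.startswith(".../")
    if recursive:
        pattern = pattern[len(".../"):]
    names = []
    for root, _dirs, files in os.walk(src_dir):
        for filename in files:
            if fnmatch.fnmatch(filename, pattern):
                rel = os.path.relpath(os.path.join(root, filename), src_dir)
                # Content names use forward slashes on every platform.
                names.append(rel.replace(os.sep, "/"))
        if not recursive:
            break  # os.walk yields src_dir first; stop after the top level
    return sorted(names)
```

Called with "*.txt" it returns only top-level names; with ".../*.txt" it also returns names such as dir1/foo/bar.txt.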

The svfile: Protocol

The svfile: protocol loads content from a delimiter-separated-values file; the src attribute is the name of the file, and the name attribute is the glob pattern used to catch the relevant content items. The namefield attribute specifies the field number (counting from 1) which the name pattern is matched against, and the valuefield specifies the number of the field from which the content chunk is read. The delimiter attribute specifies the delimiter used to separate values in the file.

<contents src="svfile:stories.csv" name="*" namefield=1 valuefield=2 delimiter="," />
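How the namefield, valuefield and delimiter attributes interact can be illustrated with a small Python sketch. It is not WebMake's implementation: load_svfile is a hypothetical function, it takes lines directly instead of a file, and it splits naively on the delimiter with no support for quoting.

```python
import fnmatch

def load_svfile(lines, name_pattern, namefield, valuefield, delimiter):
    """Build content items from delimiter-separated lines.

    namefield and valuefield count from 1, as in the tag's attributes.
    """
    items = {}
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if max(namefield, valuefield) > len(fields):
            continue  # line too short to hold both fields
        name = fields[namefield - 1]
        if fnmatch.fnmatchcase(name, name_pattern):
            items[name] = fields[valuefield - 1]
    return items

rows = ["story1,First story text\n", "story2,Second story text\n"]
stories = load_svfile(rows, "*", namefield=1, valuefield=2, delimiter=",")
```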

Adding New Protocols

New data sources for <contents> and <media> tags are added by writing an implementation of the DataSourceBase.pm module, in the HTML::WebMake::DataSources package space (the lib/HTML/WebMake/DataSources directory of the distribution).

Every data source needs a protocol, an alphanumeric lowercase identifier to use at the start of the src attribute to indicate that a data source is of that type.

Each implementation of this module should implement these methods:

new ($parent)

instantiate the object, as usual.

add ()

add all the items in that data source as content chunks. (See below!)

get_location_url ($location)

get the location (in URL format) of a content chunk loaded by add().

get_location_contents ($location)

get the contents of the location. The location, again, is the string provided by add().

get_location_mod_time ($location)

get the current modification date of a location for dependency checking. The location, again, is in the format of the string provided by add().

Notes:

  • If you want add() to read the content immediately, call $self->{parent}->add_text ($name, $text, $self->{src}, $modtime).

  • add() can defer opening and reading content chunks, rather than doing so straight away. If it calls $self->{parent}->add_location ($name, $location, $lastmod), providing a location string which starts with the data source's protocol identifier, the content will not be loaded until it is needed, at which point get_location_contents() is called.

  • This location string should contain all the information needed to access that content chunk later, even if add() has not been called. Consider it as similar to a URL. This is required so that get_location_mod_time() (see below) can work.

  • All implementations of add() should call $fixed = $self->{parent}->fixname ($name); to modify the name of each content chunk appropriately, followed by $self->{parent}->add_file_to_list ($fixed); to add the content chunk's name to the filelist content item.

  • Data sources that support the <media> tag need to implement get_location_url, otherwise an error message will be output.

  • Data sources that support the <contents> tag, and defer reading the content until it's required, need to implement get_location_contents, which is used to provide content from a location set using $self->{parent}->add_location().

  • Data sources that support the <contents> tag need to implement get_location_mod_time. This is used to support dependency checking, and should return the modification time (in UNIX time() format) of that location. Note that since this is used to compare the modification time of a content chunk from the previous time webmake was run, and the current modification time, this is called before the real data source is opened.
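As a rough illustration of the methods and notes above, here is a minimal sketch of a data source that defers loading. The "mem:" protocol, the %STORE hash, the _key() helper and the MockParent class are all invented for this example; only the method names and the parent calls (fixname, add_file_to_list, add_location) come from the text above. Passing the fixed name to add_location() and returning the location itself as the URL are assumptions too, and a real implementation would presumably build on DataSourceBase.pm rather than stand alone.

```perl
use strict;
use warnings;

# Hypothetical data source for an invented "mem:" protocol.
package HTML::WebMake::DataSources::Mem;

# Invented backing store: name => [ text, modification time ].
my %STORE = (
  'intro.txt' => [ 'Hello from memory', 1_000_000_000 ],
  'about.txt' => [ 'About this site',   1_000_000_500 ],
);

sub new {
  my ($class, $parent) = @_;
  # {src} would normally hold the tag's src attribute (assumption).
  return bless { parent => $parent, src => 'mem:' }, $class;
}

# Register every item, deferring the actual read via add_location().
sub add {
  my ($self) = @_;
  for my $name (sort keys %STORE) {
    my $fixed = $self->{parent}->fixname ($name);
    $self->{parent}->add_file_to_list ($fixed);
    $self->{parent}->add_location ($fixed, $self->{src}.$name, $STORE{$name}[1]);
  }
}

# Strip the protocol identifier to recover the key into %STORE.
sub _key {
  my ($location) = @_;
  (my $key = $location) =~ s/^mem://;
  return $key;
}

# A real source would map the location to a browsable URL here.
sub get_location_url {
  my ($self, $location) = @_;
  return $location;
}

sub get_location_contents {
  my ($self, $location) = @_;
  my $key = _key ($location);
  return $STORE{$key}[0];
}

sub get_location_mod_time {
  my ($self, $location) = @_;
  my $key = _key ($location);
  return $STORE{$key}[1];
}

# Minimal stand-in for the parent object, recording what add() hands over.
package MockParent;

sub new {
  my ($class) = @_;
  return bless { files => [], locations => {} }, $class;
}
sub fixname          { my ($self, $name) = @_; return $name; }
sub add_file_to_list { my ($self, $name) = @_; push @{$self->{files}}, $name; }
sub add_location {
  my ($self, $name, $location, $lastmod) = @_;
  $self->{locations}{$name} = [ $location, $lastmod ];
}

package main;

my $parent = MockParent->new;
my $source = HTML::WebMake::DataSources::Mem->new ($parent);
$source->add;
print $source->get_location_contents ('mem:intro.txt'), "\n";
```

Note that the location string ("mem:intro.txt") is enough on its own to find the content and its modification time, which is the property the notes above ask for.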


The <for> Tag

The <for> tag provides a quick way to iterate through a list of items.

It requires two attributes, name and values; the content item named name is set to each space-separated value in the values string, and the text inside the block is processed.

Supported Attributes

name

The name of the variable which will be set to each value in the values list in turn (if you know your comp-sci lingo, the iterator).

values

A space-separated list of values which is iterated through.

namesubst

A Perl s/// substitution; each value in the values list will be processed by this, if set.

Variable references to ${name} are processed immediately, so you can use this variable inside another variable reference, like this: ${all_${name}_text} .

Example

Here's an example, taken from my own home site:


	<!-- Create output for files in top dir -->
	<for name="out" values="index contact work nonwork home">
	  <out file="${out}.html" name="${out}">
	    ${jmason_template}
	  </out>
	</for>
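
To make the iteration concrete, the following Perl sketch mimics what a loop like the one above does to its body text. It is not WebMake's own code: expand_for() is a made-up helper, and the namesubst attribute is represented here by a code reference rather than an s/// string.

```perl
use strict;
use warnings;

# Repeat $body once per value, replacing ${name} with the current value.
# $namesubst is a code ref standing in for the namesubst attribute's s///.
sub expand_for {
  my ($name, $values, $body, $namesubst) = @_;
  my $out = '';
  for my $value (split ' ', $values) {
    $value = $namesubst->($value) if $namesubst;
    (my $copy = $body) =~ s/\$\{\Q$name\E\}/$value/g;
    $out .= $copy;
  }
  return $out;
}

# One <out> line per value in the list.
print expand_for ('out', 'index contact',
                  q{<out file="${out}.html" name="${out}">} . "\n");
```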


The <out> Tag

The <out> tag is used to generate output. Surprise!

It has one required attribute -- file, which defines the output file generated by this section. In addition it has some optional attributes, as follows:

name

which is used to substitute in that section's URL address, by inserting it in other sections or out tags in a URL reference, like so: $(out_foo) .

More optional attributes are as follows. These ones also pick up defaults from the <attrdefault> tag.

format

which defines the format the output is expected in (MIME-style). The default is text/html.

clean

specifies which features of the HTML cleaner to use. The HTML cleaner is a powerful filter which can polish grotty, messy HTML into fully-standards-compliant glory. The default value is all.

ismainurl

Whether this output file should be used as a "main URL" for any content items used within it, to support the url magic metadatum. If you plan to have multiple output styles for your content, be sure to set "ismainurl=false" on the pages which use "alternative" styles. The default value is true.

Perl code can also access out URLs using the get_url() function.

The production of multiple out files that are more-or-less identical can be automated using the <for> tag.

Output and Dependencies

Out files will not be generated if the resulting text has not changed from the previous run, or if the content sections it depends on have not changed.

The latter functionality is accomplished by caching the modification dates of each file from which content was read to generate the output file. If:

  1. the output file exists,

  2. none of the files are newer than they were last time the output file was written,

  3. none of them are newer than the output file itself, and

  4. none of the content items contain dynamic content, such as Perl code or sitemaps,

then it does not need to be rebuilt.
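
The four conditions above can be sketched as a single Perl function. This is only an illustration of the rule as stated, not WebMake's implementation; the argument names (out_exists, out_mtime, current, cached, has_dynamic) are invented, and treating a source file with no cached date as "changed" is an assumption.

```perl
use strict;
use warnings;

# Returns 1 if the output file must be rebuilt, 0 if it can be left alone.
#   out_exists  - true if the output file is already on disk
#   out_mtime   - modification time of the output file
#   current     - { file => mtime now } for each source file
#   cached      - { file => mtime recorded on the previous run }
#   has_dynamic - true if any content item is dynamic (Perl code, sitemaps)
sub needs_rebuild {
  my (%arg) = @_;
  return 1 unless $arg{out_exists};            # condition 1
  return 1 if $arg{has_dynamic};               # condition 4
  for my $file (sort keys %{ $arg{current} }) {
    my $now  = $arg{current}{$file};
    my $then = $arg{cached}{$file};
    return 1 if !defined $then;                # never seen before (assumption)
    return 1 if $now > $then;                  # condition 2
    return 1 if $now > $arg{out_mtime};        # condition 3
  }
  return 0;
}

print needs_rebuild (out_exists => 1, out_mtime => 200,
                     current => { 'a.txt' => 150 },
                     cached  => { 'a.txt' => 100 }), "\n";
```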

Example


  <out name="index" file="index.html">
    ${header}
    ${index_text}
    ${footer}
  </out>


The <sitemap> Tag

The <sitemap> tag is used to generate a content item containing a map, in a tree structure, of the current site.

It does this by traversing every content item you have defined, looking for one tagged with an isroot=true attribute. This will become the root of the site map tree.

While traversing, it also searches for content items with a metadatum called up. This is used to tie all the content together into a tree structure.

Note: content items that do not have an up metadatum are considered children of the root by default. If you do not want to map a piece of content, declare it with the attribute map=false.

By default, the content items are arranged by their score and title metadata at each level. The sort criteria can be overridden by setting the sortorder attribute.

Note: if you wish to include external HTML pages into the sitemap, you will need to load them as URL references using the <media> tag and use the <metatable> tag to associate metadata with them. t/data/sitemap_with_metatable.wmk in the WebMake test suite demonstrates this. This needs more documentation (TODO).

The <sitemap> tag takes the following required attributes:

name

The name of the sitemap item, used to refer to it later. Sitemaps are referred to, in other content items or in out files, using the normal ${foo} style of content reference.

node

The name of the content item to evaluate for each node with children in the tree. See Processing, below.

leaf

The name of the content item to evaluate for each leaf node, ie. a node with no children, in the tree. See Processing, below.

And the following optional attributes:

rootname

The root content item to start traversing at. The default root is whichever content item has the isroot attribute set to true.

all

Whether or not all content items should be mapped. Normally dynamic content, such as metadata and perl-code-defined content items, is not included. (default: false)

dynamic

The name of the content item to evaluate for dynamic content items, required if the all attribute is set to true.

grep

Perl code to evaluate at each step of the tree. See the Grep section below.

sortorder

A sort string specifying what metadata should be used to sort the items in the tree, for example "section score title".

Note that the root attribute is deprecated; use rootname instead.

The sitemap can be declared either as an empty element, with /> at the end, or with a pair of starting and ending tags and text between. If the sitemap is declared using the latter style, any text between the tags will be prepended to the generated site map. It's typically only useful if you wish to set metadata on the map itself.

Processing

Here's the key to sitemap generation. Once the internal tree structure of the site has been determined, WebMake will run through each node from the root down, up to 20 levels deep, and for each node, evaluate one of the 3 content items named in the <sitemap> tag's attributes:

  1. node: For pages with pages beneath them;

  2. leaf: For "leaf" pages with no pages beneath them;

  3. dynamic: For dynamic content items, defined by perl code or metadata.

By changing the template content items you name in the tag's attributes, you have total control over the way the sitemap is rendered.

The following variables (ie. content items) are set for each node:

name

the content name

title

the content's Title metadatum, if set

score

the content's Score metadatum, if set

list

the text for all children of this node (node items only)

is_node

whether the content is a node or a leaf (1 for node, 0 for leaf)

In addition, the following URL reference is set:

url

the first URL listed in a WebMake <out> tag to refer to the content item.

Confused? Don't worry, there's an example below.

Grep

The grep attribute is used to filter which content items are included in the site map.

The "grep" code is evaluated once for every node in the sitemap, with $_ set to the name of that node's content item; you can then decide whether or not to display it, as follows.

If the perl code returns 0, the node is skipped; if the perl code sets the variable $PRUNE to 1, all nodes at this level and below are skipped.
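
The difference between skipping and pruning can be sketched with a small tree walk. This is a simulation, not the sitemap code itself: filter_tree() and the sample tree are invented, the grep code is a code reference rather than an attribute string, and the sketch assumes that a skipped node's children are still visited while a pruned node's children are not.

```perl
use strict;
use warnings;

our $PRUNE;

# Walk the tree, returning the names of the nodes the grep code keeps.
# Each node is { name => ..., kids => [ ... ] }.
sub filter_tree {
  my ($node, $grep) = @_;
  local $PRUNE = 0;
  my $keep;
  {
    local $_ = $node->{name};
    $keep = $grep->();
  }
  return () if $PRUNE;                  # drop this node and everything below
  my @kept;
  push @kept, $node->{name} if $keep;   # a skipped node still has its kids visited
  for my $kid (@{ $node->{kids} || [] }) {
    push @kept, filter_tree ($kid, $grep);
  }
  return @kept;
}

my $tree = {
  name => 'root',
  kids => [
    { name => 'news',    kids => [ { name => 'news_old' } ] },
    { name => 'private', kids => [ { name => 'private_notes' } ] },
    { name => 'drafts',  kids => [ { name => 'sketch1' } ] },
  ],
};

my $grep = sub {
  if (/^private/) { $PRUNE = 1; return 0; }   # hide a whole subtree
  return 0 if /^drafts$/;                     # hide just this node
  return 1;
};

print join (' ', filter_tree ($tree, $grep)), "\n";
```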

Example

If you're still not sure how it works, take a look at examples/sitemap.wmk in the distribution. Here are the important bits from that file.

Firstly, two content items are necessary -- a template for a sitemap node, and a template for a leaf. Note the use of $(url), ${title}, etc., which are filled in by the sitemap code.


	<content name=sitemapnode map=false>
	  <li>
	    <p>
	      <a href=$(url)>${title}</a>: $[${name}.abstract]<br>
	      <!-- don't forget to list the sub-items -->
	      <ul> ${list} </ul>
	    </p>
	  </li>
	</content>

	<content name=sitemapleaf map=false>
	  <li>
	    <p>
	      <a href=$(url)>${title}</a>: $[${name}.abstract]<br>
	      <!-- no sub-items here -->
	    </p>
	  </li>
	</content>

Finally, the sitemap itself is declared.


	<sitemap name=mainsitemap node=sitemapnode leaf=sitemapleaf />

From then on, it's just a matter of including the sitemap content item in an output file:


	<out name=map file=sitemap_html/map.html>
	  ${header}${mainsitemap}${footer}
	</out>

And that's it.

This documentation includes a sitemap, by the way. It's used to generate the navigation links.


The <navlinks> Tag

A common site structure strategy is to provide Back, Forward and Up links between pages. This is especially frequent in papers or documentation. WebMake now supports this using the <navlinks> tag.

To use this, first define a sitemap. This tells WebMake how to order the page hierarchy, and which pages to include.

Next, define 3 content items, one for previous, one for next and one for up links. These should contain references to ${url} (note: not $(url)), which will be replaced with the URL for the next, previous, or parent content item, whichever is applicable for the direction in question.

Also, references to ${name} will be expanded to the name of the content item in that direction, allowing you to retrieve metadata for that content like so: $[${name}.title] .

You can also add content items to be used when there is no previous, next or up content item; for example, the "top" page of a site has no up content item. These are strictly optional though.

Then add a <navlinks> tag to the WebMake file as follows.


	<navlinks name=mynavlinks map=sitemapname
		up=upcontentname
		next=nextcontentname
		prev=prevcontentname
		noup=noupcontentname
		nonext=nonextcontentname
		noprev=noprevcontentname>
	content text
	</navlinks>

The content text acts just like a normal content item, but references to ${nexttext}, ${prevtext} or ${uptext} will be replaced with the appropriate content item; e.g. ${uptext} will be replaced by either ${upcontentname} or ${noupcontentname} depending on whether this is the top page or not.

You can then add references to $[mynavlinks] in other content items, and the navigation links will be inserted. Note! be sure to use a deferred reference, or the links may not appear!

Attribute Reference

These are the attributes accepted by the <navlinks> tag.

name

the name of the navigation-links content item. Required.

map

the name of the sitemap used to determine page ordering. Required.

up

the name of the content item used to draw Up links. Required.

next

the name of the content item used to draw Next links. Required.

prev

the name of the content item used to draw Prev links. Required.

noup

the name of the content item used when there is no Up link, ie. for the page at the top level of the site. Optional -- the default is an empty string.

nonext

the name of the content item used when there is no Next link, ie. the last page in the site. Optional -- the default is an empty string.

noprev

the name of the content item used when there is no Prev link, ie. for the first page in the site. Optional -- the default is an empty string.

Example

This will generate an extremely simple set of <a href> links, no frills. The sitemap it uses isn't specified here; see the sitemap documentation for details on doing that.


	<content name=up><a href=${url}>Up</a></content>

	<content name=next><a href=${url}>Next</a></content>

	<content name=prev><a href=${url}>Prev</a></content>

	<navlinks name=name map=sitemapname up=up next=next prev=prev>
	  ${prevtext} | ${uptext} | ${nexttext}
	</navlinks>


The <breadcrumbs> Tag

Another common site navigation strategy is to provide what Jakob Nielsen has called a "breadcrumb trail". The <breadcrumbs> tag supports this.

WTF Is A Breadcrumb Trail?

The "breadcrumb trail" is a piece of navigation text, displaying a list of the parent pages, from the top-level page right down to the current page. You've probably seen them before; take a look at this Yahoo category for an example.

To illustrate, here's an example. Let's say you're browsing the Man Bites Dog story in an issue of Dogbiting Monthly, which in turn is part of the Bizarre Periodicals site. Here's a hypothetical breadcrumb trail for that page:

Bizarre Periodicals : Dogbiting Monthly : Issue 24 : Man Bites Dog

Typically those would be links, of course, so the user can jump right back to the contents page for Issue 24 with one click.

If you have a site that contains pages that are more than 2 levels deep from the front page, you should consider using this to aid navigation.

How To Use It With WebMake

To use a breadcrumb trail, first define a sitemap. This tells WebMake how to order the page hierarchy, and which pages to include.

Next, define a content item to be used for each entry in the trail. This should contain references to ${url} (note: not $(url)), which will be replaced with the URL for the page in question; and ${name}, which will be expanded to the name of the "main" content item on that page, allowing you to retrieve metadata for that content like so: $[${name}.title] .

Note: the "main" content item is defined as the first content item on the page which is not metadata, not perl-generated code, and has the map attribute set to "true".

You can also define two more content items to be used at the top of the breadcrumb trail, ie. the root page, and at the tail of it, ie. the current page being viewed. These are optional though, and if not specified, the above content item will be used.

Then add a <breadcrumbs> tag to the WebMake file as follows.


	<breadcrumbs name=mycrumbs map=sitemapname
		top=topcontentname
		tail=tailcontentname
		level=levelcontentname />

The top and tail attributes are optional, as explained above. The level attribute, which names the "generic" breadcrumb content item to use for intermediate levels, is mandatory.

You can then add references to $[mycrumbs] in other content items, and the breadcrumb-trail text will be inserted. Note! be sure to use a deferred reference, or the links may not appear!

Attribute Reference

These are the attributes accepted by the <breadcrumbs> tag.

name

the name of the breadcrumb-trail content item. Required.

map

the name of the sitemap used to determine page hierarchy. Required.

level

the name of the content item used to draw links at the intermediate levels of the trail. Required.

top

the name of the content item used to draw the link to the top-most, or root, page. Optional -- level will be used as a fallback.

tail

the name of the content item used to draw the link to the bottom-most, currently-viewed page. Optional -- level will be used as a fallback.

Example

This will generate an extremely simple set of <a href> links, no frills. The sitemap it uses isn't specified here; see the sitemap documentation for details on doing that.


  <content name=btop map=false>
  	[ <a href=${url}>$[${name}.title]</a> /
  </content>
  <content name=blevel map=false>
  	<a href=${url}>$[${name}.title]</a> /
  </content>
  <content name=btail map=false>
  	<a href=${url}>$[${name}.title]</a> ]
  </content>
  <breadcrumbs map=sitemapname name=crumbs
  	top=btop tail=btail level=blevel />


Defining Tags

Like Roxen or Java Server Pages, WebMake allows you to define your own tags; these cause a perl function to be called whenever they are encountered in either content text, or inside the WebMake file itself.

Defining Content Tags

You do this by calling the define_tag() function from within a <{perl}> section in the WebMake file. This sets up a tag, giving a reference to the handler function to call when that tag is encountered, and the list of attributes that are required to use that tag.

Any occurrences of this tag, with at least the set of attributes defined in the define_tag() call, will cause the handler function to be called.

Handler functions are called as follows:


        handler ($tagname, $attrs, $text, $perlcode);

Where $tagname is the name of the tag, $attrs is a reference to a hash containing the attribute names and the values used in the tag, and $text is the text between the start and end tags.

$perlcode is the PerlCode object, allowing you to write proper object-oriented code that can be run in a threaded environment or from mod_perl. This can be ignored if you like.

Note that there are two variations, one for conventional tag pairs with a start and end tag, the other for stand-alone empty tags with no end tag. The latter variation is called define_empty_tag().

define_tag()

define a content tag with a start and end

define_empty_tag()

define a standalone content tag

Defining WebMake Tags

This is identical to using content tags, above, but the functions are as follows:

define_wmk_tag()

define a WebMake tag with a start and end

define_empty_wmk_tag()

define a standalone WebMake tag

Example

Let's say you've got the following in your WebMake file.


  <{perl
   define_tag ("thumb", \&make_thumbnail, qw(img thumb));
  }>

  <content name="foo">
    <thumb img="big.jpg" thumb="big_thumb.jpg">
      Picture of a big thing
    </thumb>
  </content>

When the foo content item comes to be included in an output file, the thumb tag will be replaced with a call to a perl function, as follows:


  make_thumbnail ("thumb",
     { img => 'big.jpg', thumb => 'big_thumb.jpg' },
     'Picture of a big thing', $perlcode);

Note that if the tag omitted one of the 2 required attributes, img or thumb, it would result in an error message.

$perlcode is a reference to the PerlCode interpreter which called it, allowing you to use the object-oriented style of WebMake perl code, which is required for safe use under mod_perl or other multithreaded environments.
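
For completeness, here is one way the make_thumbnail handler itself might look. This is a sketch, assuming that whatever the handler returns is the text that replaces the tag; the HTML it produces is made up for illustration, and $perlcode is simply ignored, which the text above says is allowed.

```perl
use strict;
use warnings;

# Handler with the calling convention described above:
#   handler ($tagname, $attrs, $text, $perlcode)
sub make_thumbnail {
  my ($tagname, $attrs, $text, $perlcode) = @_;

  # Use the text between the tags as the alt text, minus surrounding space.
  my $alt = $text;
  $alt =~ s/^\s+//;
  $alt =~ s/\s+$//;

  # Link the thumbnail image to the full-size picture.
  return '<a href="' . $attrs->{img} . '">'
       . '<img src="' . $attrs->{thumb} . '" alt="' . $alt . '" />'
       . '</a>';
}

print make_thumbnail ("thumb",
   { img => 'big.jpg', thumb => 'big_thumb.jpg' },
   "\n  Picture of a big thing\n", undef), "\n";
```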


The Order of Processing

In order to fully control the WebMake file processing using Perl code, it's important to know the order in which the tags and so on are parsed.

Initially, WebMake used a set order of tag parsing, but this proved to be unwieldy and confusing. Now, it uses the order in which the tags are defined in the .wmk file, so if you want tag A to be interpreted before tag B, put A before B and the right thing will happen.

Once the file is fully parsed, the <out> tags are processed, one by one. At this point, <{set}> and <{perl}> processing directives will be interpreted, if they are found within content chunks.


The <{set}> Directive

Small pieces of content can be set from within other content chunks or <out> sections using the <{set}> directive. The format is

<{set name="value"}>

This can be useful to set headlines or titles, by including a <{set}> directive in their text.

In addition, to insert some text into a template, set the text as variables beforehand:

<{set body="${foo.txt}"}> ${bar_template}

Note: Order of Content Reference Processing

The processing of content references starts at each <out> URL in turn, and descends from the chunk of text defined for that file, replacing each ${content_ref} and $(url_ref) one-by-one, in a depth-first manner.

Finally, the tree-traversal starts again from the chunk of <out> text, searching for $[deferred_content refs].

Therefore if you wish to <{set}> a variable, let's say x, in a chunk of content that will not be loaded before x is accessed, you should use a $[deferred content ref] to access it.
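
The effect of this ordering can be simulated in a few lines of Perl. This is a toy model of the two passes described above, not WebMake's parser: expand_pass1(), expand_deferred() and render() are invented names, only simple \w+ names are handled, and an unset variable is assumed to expand to an empty string.

```perl
use strict;
use warnings;

# Pass 1: expand ${refs} depth-first, applying <{set}> directives as they are met.
sub expand_pass1 {
  my ($text, $items, $depth) = @_;
  $depth ||= 0;
  die "references nested too deeply\n" if $depth > 30;
  my $out = '';
  for my $piece (split /(<\{set \w+="[^"]*"\}>|\$\{\w+\})/, $text) {
    if ($piece =~ /^<\{set (\w+)="([^"]*)"\}>$/) {
      $items->{$1} = $2;                       # the directive itself vanishes
    } elsif ($piece =~ /^\$\{(\w+)\}$/) {
      my $chunk = exists $items->{$1} ? $items->{$1} : '';
      $out .= expand_pass1 ($chunk, $items, $depth + 1);
    } else {
      $out .= $piece;
    }
  }
  return $out;
}

# Pass 2: fill in $[deferred] references once everything else is done.
sub expand_deferred {
  my ($text, $items) = @_;
  $text =~ s/\$\[(\w+)\]/exists $items->{$1} ? $items->{$1} : ''/ge;
  return $text;
}

sub render {
  my ($out_text, %items) = @_;   # copy, so <{set}> does not leak between runs
  return expand_deferred (expand_pass1 ($out_text, \%items), \%items);
}

my $body = '<{set title="Home"}><p>Welcome</p>';

# The header is expanded before the body sets title, so ${title} is empty...
print render ('${header}${body}',
              header => '<title>${title}</title>', body => $body), "\n";
# ...whereas the deferred reference picks the value up afterwards.
print render ('${header}${body}',
              header => '<title>$[title]</title>', body => $body), "\n";
```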

How <{set}> Relates To Meta-data

The <{set}> directive was implemented before metadata was, and initially provided a way to do similar things, such as substitute page titles, etc.

Now, however, it's probably better to use <wmmeta> tags to do that kind of content-item metadata, as it means your pages will be able to take advantage of new features, like index and site-map generation.

The <{set}> directive is retained as a way of quickly setting content items from within other content, in case this feature proves useful for other purposes.


The <{perl}> Directives

Arbitrary perl code can be executed using this directive.

It works like perl's eval command; the return value from the perl block is inserted into the file, so a perl code block like this:


	<{perl
	  $_ = '';
	  for my $fruit (qw(apples oranges pears)) {
	    $_ .= " ".$fruit;
	  }
	  $_;
	}>

will be replaced with the string " apples oranges pears". Note that the $_ variable is declared as local when you enter the perl block; you don't have to do this yourself.

If you don't like the eval style, you can use a more PHP-like construct using the perlout directive, which replaces the perl code text with anything that the perl code prints on STDOUT, like so:


	<{perlout
	  for my $fruit (qw(apples oranges pears)) {
	    print " ", $fruit;
	  }
	}>

<{perl}> sections found at the top level of the WebMake file will be evaluated during the file-parsing pass, as they are found.

<{perl}> sections embedded inside content chunks or other tagged blocks will be evaluated only once they are referenced.

Perl code can access content variables and URLs using the library functions provided.

The library functions are available either as normal perl functions in the default main package or, if you want to write thread-safe or mod_perl-safe perl code, as methods on the $self object. The $self object is available as a local variable in the perl code block.

A good example of perl use inside a WebMake file can be found in the news_site.wmk file in the examples directory.


Globs and Regexps

A number of WebMake parameters and perl APIs support pattern matching. This is performed using glob patterns and regular expressions.

Glob Patterns

These are more-or-less traditional shell- or MS-DOS-like globs, as follows:

* matches any number of characters except /
... matches any number of characters, including /
? matches one character

This is the default mode of matching. Example globs are: *.html, .../*.txt.

Regular Expressions

These are perl-style regular expressions. They are differentiated from glob patterns by prefixing them with RE:, for example: RE:^.*\.html$.
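
The two matching modes can be approximated in Perl as follows. This is a sketch of the rules as listed in this section, not the matcher WebMake uses; pattern_to_regex() is an invented name, and the sketch assumes that ? may match any single character, including /.

```perl
use strict;
use warnings;

# Turn a WebMake-style pattern into a compiled regular expression.
sub pattern_to_regex {
  my ($pattern) = @_;

  # RE: prefix -- the rest is already a perl regular expression.
  return qr/$1/ if $pattern =~ /^RE:(.*)$/s;

  # Otherwise translate the glob, one token at a time.
  my $re = '';
  while ($pattern =~ /\G(\.\.\.|\*|\?|.)/gs) {
    my $tok = $1;
    if    ($tok eq '...') { $re .= '.*'; }       # any characters, including /
    elsif ($tok eq '*')   { $re .= '[^/]*'; }    # any characters except /
    elsif ($tok eq '?')   { $re .= '.'; }        # one character (assumption: any)
    else                  { $re .= quotemeta ($tok); }
  }
  return qr/\A$re\z/s;
}

for my $test ([ '*.html', 'index.html' ], [ '*.html', 'docs/index.html' ],
              [ '.../*.txt', 'a/b/c.txt' ], [ 'RE:^.*\.html$', 'docs/index.html' ]) {
  my ($pattern, $name) = @$test;
  my $result = $name =~ pattern_to_regex ($pattern) ? 'matches' : 'does not match';
  print "$pattern $result $name\n";
}
```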

For more details, check your perl documentation, or search the web.


Sorting Lists of Content Items

Frequently, you will need to get a list of content items in sorted order. WebMake itself does this for the sitemap tag, among others.

Sorting is typically performed using a content item's metadata; some metadata that are especially useful are:

score

A number representing the "priority" of a content item; specifically intended for use when sorting. Defaults to 50 if unset.

title

The title of a content item. Handy for alphabetic lists. Defaults to (Untitled) if not set.

declared

The item's declaration order. This is a number representing when the content item was first encountered in the WebMake file; earlier content items have a lower declaration order. You do not need to set this; WebMake will do so automatically.

mtime

The modification date, in UNIX time_t seconds-since-the-epoch format, of the file the content item was loaded from.

name

The name of the content item.

WebMake provides a built-in mechanism to allow easy sorting of content items, called a sort spec or sort string.

This is typically used either with the Perl code library's sort_content_objects() call, or using a sortorder attribute as the sitemap tag does.

A sort string is a text string, containing a space-separated list of metadata items. The first entry in the list is the main sorting criterion; the second entry is then used to break ties if two entries match for the main criterion, etc.

In addition, a metadata item can be prefixed with a !, to reverse its order.

Example

score title: sort by score, and if two content items have the same score, sort by title.

declared: sort by the order in which they were declared in the WebMake file.

score title !mtime: sort by score and title, and if more than one content item has the same score and title, sort them into oldest-first order.
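
To show how a sort string turns into a comparison, here is a small stand-alone sketch. It is not the code behind sort_content_objects(): sort_by_spec() is an invented helper that works on plain hashes, and comparing score, declared and mtime numerically while comparing everything else as text is an assumption.

```perl
use strict;
use warnings;

# Metadata compared as numbers; anything else is compared as text (assumption).
my %NUMERIC = map { $_ => 1 } qw(score declared mtime);

# Sort a list of hashes according to a sort string such as "score title".
sub sort_by_spec {
  my ($spec, @items) = @_;

  # Parse each word into [ field, reversed? ].
  my @keys;
  for my $word (split ' ', $spec) {
    my $reverse = ($word =~ s/^!//) ? 1 : 0;
    push @keys, [ $word, $reverse ];
  }

  return sort {
    my $cmp = 0;
    for my $key (@keys) {
      my ($field, $reverse) = @$key;
      my ($x, $y) = ($a->{$field}, $b->{$field});
      $cmp = $NUMERIC{$field} ? $x <=> $y : $x cmp $y;
      $cmp = -$cmp if $reverse;
      last if $cmp;              # later fields only break ties
    }
    $cmp;
  } @items;
}

my @items = (
  { name => 'a', score => 50, title => 'Zebra' },
  { name => 'b', score => 10, title => 'Mango' },
  { name => 'c', score => 50, title => 'Apple' },
);

print join (' ', map { $_->{name} } sort_by_spec ('score title', @items)), "\n";
print join (' ', map { $_->{name} } sort_by_spec ('!score title', @items)), "\n";
```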


${content_refs} - References to Content Chunks

Content chunks and variables can be referred to using this format. This is evaluated before any other variable reference is.

${name}

Content chunks can refer to other chunks, URLs, or use deferred references, up to 30 levels deep.

If you wish to refer to a content item or variable, but are not sure if it exists, you can provide a default value by following the content name with a question mark and the default value.

${name?defaultvalue}
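
The default-value form can be modelled with a single substitution. The sketch below is illustrative only: expand_with_defaults() and resolve_ref() are invented names, and leaving an unknown reference with no default untouched is an assumption about what should happen in that case.

```perl
use strict;
use warnings;

# Pick the item's value, else the default, else leave the reference as written.
sub resolve_ref {
  my ($items, $whole, $name, $default) = @_;
  return $items->{$name} if exists $items->{$name};
  return defined $default ? $default : $whole;
}

# Expand ${name} and ${name?defaultvalue} references in $text.
sub expand_with_defaults {
  my ($text, $items) = @_;
  $text =~ s/(\$\{([^{}?]+)(?:\?([^{}]*))?\})/resolve_ref ($items, $1, $2, $3)/ge;
  return $text;
}

print expand_with_defaults ('${title} / ${subtitle?no subtitle}',
                            { title => 'Home' }), "\n";
```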


$(url_refs) - References to URLs

URLs of defined <out> sections and <media> items can be inserted into the current content using this reference format.

$(name)

Note that all URL references are written relatively; so a file created in the foo/bar/baz subdirectory which contains a URL reference to blah/argh.html will be rewritten to refer to ../../../blah/argh.html.
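
The rewriting rule can be sketched in Perl like this. relative_url() is an invented helper, not a WebMake function, and the output file name page.html is made up for the example; the sketch also assumes that directories shared by both paths are dropped rather than climbed out of and back into.

```perl
use strict;
use warnings;

# Work out how to refer to $to_file from a page written to $from_file.
# Both paths are relative to the top of the site and use / separators.
sub relative_url {
  my ($from_file, $to_file) = @_;
  my @from = split m{/}, $from_file;
  pop @from;                          # drop the file name, keep its directories
  my @to = split m{/}, $to_file;

  # Directories shared by both paths need no "../" at all.
  while (@from && @to > 1 && $from[0] eq $to[0]) {
    shift @from;
    shift @to;
  }

  # One "../" per remaining directory, then the rest of the target path.
  return join ('/', ('..') x @from, @to);
}

print relative_url ('foo/bar/baz/page.html', 'blah/argh.html'), "\n";
```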

Again, if you're not sure a URL exists, a default value can be supplied, using this format:

$(name?defaultvalue)


$[deferred_content refs] - Deferred Content References

These are identical to ${content_refs}, but are evaluated only after all other references.

$[name]

This means that a content variable can be set at the end of an <out> section, but referred to at the start, for example. Handy for HTML page titles.

In addition, this is the recommended way to access metadata set using the wmmeta tag.

Again, a default value can be supplied, using this format:

$[name?defaultvalue]


The <wmmeta> Tag

WebMake can load metadata embedded in any content chunk, and use this metadata at any point in the site. In addition, metadata can be set externally from the content using the <metatable> and <metadefault> tags.

A metadatum is like a normal content item, except it is exposed to other pages in the index.wmk file. Normally, you cannot reliably read a dynamic content item that was set from another page; if one content item sets a variable like this:

<{set foo="Value!"}>

Any content items evaluated after that variable is set can access ${foo}, as long as they occur on the same output page. However if they occur on another output page, they may not be able to access ${foo}.

To get around this, WebMake includes the <wmmeta> tag, which allows you to attach data to a content item. This data will then be accessible, both to other pages in the site (as $[contentname.metaname]), and to other content items within the same page (as $[this.metaname]).

Think of them as like size, modification time, owner etc. on files; or member variables in an object-oriented language. Another good way to think of it is as "catalog data", as opposed to "narrative data", which is what a normal content item is. (thanks to Vaibhav Arya, vaibhav /at/ mymcomm.com, for that analogy.)

Examples of metadata that can be useful, and suggested names for that data, are as follows:

Title

the title of a content item. The default title for content items is (Untitled). (built-in)

Score

a number representing the "priority" of a content item; used to affect how the item should be ranked in a list of stories. The default value is 50. (built-in)

Abstract

a short summary of a content item. (optional)

Up

used to map the site's content; this metadata indicates the content item that is the parent of the current content item. This metadatum is used to generate dynamic sitemaps. (built-in)

Here are some built-in "magic" items of metadata that do not need to be tagged with the <wmmeta> tag. Instead, they are automatically inferred by WebMake itself:

declared

the item's declaration order. This is a number representing when the content item was first encountered in the WebMake file; earlier content items have a lower declaration order. Useful for sorting.

url

the first <out> URL which contains that content item (you should order your <out> tags to ensure each story's "primary" page is listed first, or set ismainurl=false on the "alternative" output pages, if you plan to use this). See also the get_url() method on the HTML::WebMake::Content object.

is_generated

0 for items loaded from a <content> or <contents> tag, 1 for items created by Perl code using the add_content() function.

mtime

The modification date, in UNIX time_t seconds-since-the-epoch format, of the file the content item was loaded from. Handy for sorting.

More suggested meta tags, and their formats, are listed at the end of this document.

Note that content items representing metadata cannot, themselves, have metadata.

How to Use It

Meta-data is set from within a content chunk using the <wmmeta> tag; this tag is automatically stripped from the content when the content is referenced. It can be used either as an XML-style empty tag, similar to the HTML <meta> tag, if it ends in />:


  <wmmeta name="Title" value="Story 1, blah blah" />

or with start and end tags, for longer bits of content:


  <wmmeta name="Abstract">
    Story 1, just another story.
  </wmmeta>

As you can see, each item of metadata needs a name and a value. The latter format reads the value from the text between the start and end tags.

For efficiency during subsequent site builds, metadata is cached in the site cache file, so it will not need to be re-read from the original content chunk unless that content chunk is modified again.

Meta tag names are case-insensitive, for compatibility with HTML meta tags.

Referring to Metadata

Metadata is referred to using the deferred content ref format:

$[content.metaname]

Where content is the name of the content item, and metaname is the name of the metadatum. So, for example, $[blurb.txt.title] would return the title metadatum from the content item blurb.txt.

Any content chunk can access metadata from other content chunks within the same <out> tag, using this as the content name, i.e. $[this.title] . This is handy, for example, in setting the page title in the main content chunk, and accessing it from the header chunk.

If more than one content item sets the same item of metadata inside the <out> tag, the first one will take precedence.
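
The lookup rules above can be modelled in a few lines of Perl. This is a sketch over plain hashes, not WebMake's metadata code: lookup_meta() is an invented name, metadata names are assumed to be stored in lower case, and falling back to the built-in defaults for Title and Score when nothing is set is an assumption.

```perl
use strict;
use warnings;

# Built-in defaults mentioned in this document.
my %DEFAULT = (title => '(Untitled)', score => 50);

# Resolve a reference such as "blurb.txt.title" or "this.title".
#   $meta       - { content name => { metaname => value } }
#   $page_items - content items used in the current <out>, in order
sub lookup_meta {
  my ($ref, $meta, $page_items) = @_;

  # The metadatum name is everything after the last dot.
  my ($content, $metaname) = $ref =~ /^(.+)\.([^.]+)$/ or return;
  $metaname = lc $metaname;           # names are case-insensitive

  # "this" means: the first item on the page that sets the metadatum.
  my @candidates = $content eq 'this' ? @$page_items : ($content);
  for my $item (@candidates) {
    my $value = $meta->{$item}{$metaname};
    return $value if defined $value;
  }
  return $DEFAULT{$metaname};
}

my %meta = (
  'header'    => {},
  'blurb.txt' => { title => 'Story 1' },
  'other.txt' => { title => 'Story 2' },
);
my @page = ('header', 'blurb.txt', 'other.txt');

print lookup_meta ('this.title', \%meta, \@page), "\n";
print lookup_meta ('other.txt.title', \%meta, \@page), "\n";
```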

The example files "news_site.wmk" and "news_site_with_sections.wmk" demonstrate how meta tags can be used to generate a SlashDot or Wired News-style news site. The index pages in those sites are generated dynamically, using the metadata to decide which pages to link to, their ordering, and the titles and abstracts to use.

Suggested Metadata Names

The tags marked (built-in) are supported directly inside WebMake, and used internally for functionality like building site maps and indices. All the other suggested metadata names here are just that, suggestions, which support commonly-required functionality.

Also note that tag names are case-insensitive; they're just capitalised here for presentation.

Title

the title of a content item. The default title for content items is (Untitled). (built-in)

Score

a number representing the "priority" of a content item; used to affect how the item should be ranked in a list of stories. The default value is 50. Items with the same score will be ranked alphabetically by title. (built-in)

Abstract

a short summary of a content item.

Section

the section of a site under which a story should be filed.

Author

who wrote the item.

Approved

has this item been approved by an editor; used to support workflow, so that content items need to be approved before they are displayed on the site.

Visible_Start

the start of an item's "visibility window", ie. when it is listed on an index page. (TODO: define a recommended format for this, or replace with DC.Coverage.temporal)

Visible_End

the end of an item's "visibility window", ie. when it is listed on an index page.

DC.Publisher

a Dublin Core tag. The organisation or individual that publishes the entire site.

The Dublin Core is a whole load of suggested metadata names and formats, which can be used either to replace or supplement the optional tags named above. Regardless of whether you replace or supplement the tags above internally, it is definitely recommended to use the DC tag names for metadata that's made visible in the output HTML through conventional HTML <meta> tags.

Why Use Metadata

Support for metadata is an important CMS feature.

It is used by Midgard and Microsoft's SiteServer, and is available as user-contributed code for Manila. It provides copious benefits for flexible index and sitemap generation, and, with the addition of an Approved tag, adds initial support for workflow.

It allows the efficient generation of site maps, back/forward navigation links, and breadcrumb trails, and enables index pages to be generated using Perl code easily and in a well-defined way.

Example


  <content name="foo">
    <wmmeta name="Title" value="Foo" />
    <wmmeta name="Abstract">
      Foo is all about fooing.
    </wmmeta>

    Foo foo foo foo bar. etc.
  </content>


The <metadefault> Tag

Metadata is usually embedded inside a content item using the <wmmeta> tag. However, this can be a chore for lots of content items, so to make things easier, you can specify default metadata using the <metadefault> tag.

Specify this tag before the content items in question, and those content items will all be tagged with the metadata you set.

Like the attrdefault tag, this tag can be used either in a scoped mode, or in a command mode.

Scoped Mode

"Scoped" mode uses opening (<metadefault>) and closing (</metadefault>) tags; the metadata is only set on content items between the two tags.

Note! One warning about "scoped" mode: WebMake does not use a fully-correct XML parser to parse the .wmk file, so nested <metadefault> tags will not be parsed correctly; instead, the first closing </metadefault> tag found will be used to close the scope.

Command Mode

Command mode uses standalone tags (<metadefault ... />); the metadata are set until the end of the WebMake file, or until you change them with another <metadefault> tag.

Attributes

name

the metadatum's name, e.g. Title, Section, etc. This is required.

value

the metadatum's value. This is optional. If the value is not specified, the metadatum will be removed from the list of default metadata.
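
The two modes can be sketched as follows, mirroring the style of the <attrdefault> examples elsewhere in this documentation. The content names and the values news and jm are made up for illustration.

Using the scoped style:

```
<metadefault name="Section" value="news">
  <content name="story_1.txt"><!-- illustrative name -->...</content>
  <content name="story_2.txt">...</content>
</metadefault>
```

Or, in the "command" style, where the final tag omits value and so removes the default again:

```
<metadefault name="Author" value="jm" />

<content name="story_3.txt"><!-- illustrative name -->...</content>
<content name="story_4.txt">...</content>

<metadefault name="Author" />
```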


The <metatable> Tag

Metadata is usually embedded inside a content item using the <wmmeta> tag. However, sometimes you may want to tag a content item with metadata from outside, if the text of the content is not under your control; or you may want to tag metadata to an object that is not even a content item, such as an image.

The metatable tag allows you to do this, and in bulk. You list a table of content names and the metadata you want to attach to each content item, in tab-, comma-, or pipe-separated-value format.

Firstly, pick a delimiter character, such as |. Set the delimiter attribute to this character.

Next, the first line of the metatable lists the metadata you wish to set; it must start with the value . (a single full stop). This indicates to WebMake that the line defines the metadata to be set.

Finally, list as many lines of metadata as you like; the first value on the line is the name of the content item you wish to attach the metadata to. From then on, the other values on the line are the values of the metadata.

So, for example, consider this table, from the WebMake documentation:

<metatable delimiter="|">
 .|title|abstract
 Main.pm|HTML::WebMake::Main|module documentation
 PerlCodeLibrary.pm|HTML::WebMake::PerlCodeLibrary|module documentation
 Content.pm|HTML::WebMake::Content|module documentation
 EtText2HTML.pm|Text::EtText::EtText2HTML|module documentation
 HTML2EtText.pm|Text::EtText::HTML2EtText|module documentation
 webmake|webmake(1)|script documentation
 ettext2html|ettext2html(1)|script documentation
 ethtml2text|ethtml2text(1)|script documentation
 </metatable>

This will set Main.pm.title to HTML::WebMake::Main, Main.pm.abstract to module documentation, etc.

Using <metatable> To Tag Non-Content Items

Often, you will need to attach metadata to non-content items, such as images, or HTML files that are not generated by WebMake. Here's how to do this.

First, load the URLs of the items using a <media> tag. Then create an empty content item for each one (possibly automated using the <for> tag). The URLs from the <media> tag will automatically take precedence over the URLs of the fake content items.

Then use a metatable, as above, to set the metadata you wish to use.


The <attrdefault> Tag

Attributes are usually specified inside a content item's <content> or <contents> tags, or, for output files, inside the <out> tag. However, this can be a chore if you have many items to set attributes on, so, to make things easier, you can specify default attributes using the <attrdefault> tag.

Specify this tag before the content items or output files in question, and those items will all be tagged with the attributes you set.

Like the metadefault tag, this tag can be used either in a scoped mode, or in a command mode.

Scoped Mode

"Scoped" mode uses opening (<attrdefault>) and closing (</attrdefault>) tags; the attributes are only set on content items or output files between the two tags.

Note! One warning about "scoped" mode: WebMake does not use a fully-correct XML parser to parse the .wmk file, so nested <attrdefault> tags will not be parsed correctly; instead, the first closing </attrdefault> tag found will be used to close the scope.

Command Mode

Command mode uses standalone tags (<attrdefault ... />); the attributes are set until the end of the WebMake file, or until you change them with another <attrdefault> tag.

Attributes

name

the attribute's name, e.g. up, map, etc. This is required.

value

the attribute's value. This is optional. If the value is not specified, the attribute will be removed from the list of default attributes.

Example

Using the scoped style:


  <attrdefault name="format" value="text/html">
    <content name="chunk_1.txt">...</content>
    <content name="chunk_2.txt">...</content>
    <content name="chunk_3.txt">...</content>
    <content name="chunk_4.txt">...</content>
  </attrdefault>

Or, in the "command" style:


  <attrdefault name="format" value="text/html" />

  <content name="chunk_1.txt">...</content>
  <content name="chunk_2.txt">...</content>
  <content name="chunk_3.txt">...</content>
  <content name="chunk_4.txt">...</content>

  <attrdefault name="format" />


The ${IMGSIZE} Magic Variable

This reference provides an easy way to automatically add image size information to an <img> tag, for example:

<img src="foo.gif" ${IMGSIZE}>

Would become:

<img src="foo.gif" height=30 width=11>

It requires that the Image::Size Perl module be installed; otherwise it does nothing.


The $(TOP/) Magic Variable

This URL reference always evaluates to a relative path to the top-level of the site, for URLs.

Note that setting the EtTextHrefsRelativeToTop option will cause all URLs in Text::EtText blocks, which don't start with a slash or a protocol specification, to be made relative to the top-level of the site.
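
A minimal sketch of how this might be used in a shared navigation chunk follows. It assumes, as the trailing slash in the name suggests, that the expansion can be prefixed directly to a site-relative path; the content name navbar and the file names are hypothetical.

```
<content name="navbar">
  <!-- index.html and contact/index.html are hypothetical paths -->
  <a href="$(TOP/)index.html">Home</a> |
  <a href="$(TOP/)contact/index.html">Contact</a>
</content>
```

Because the path is computed relative to each output file, the same chunk can be included in pages at any depth of the site.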


The ${WebMake.*} Magic Variables

WebMake defines several magic variables that expand to useful information about the current environment. These are as follows. Each one is illustrated with the value at the time this documentation was generated.

WebMake.Version

The version of WebMake that generated this site. (1.1)

WebMake.GeneratorString

A generator string for WebMake; this is in the form WebMake/v.vv where v.vv is the version number of WebMake. (WebMake/1.1)

WebMake.Who

The username of the person who generated the site. (jm)

WebMake.Time

The time the site was last generated. (Wed Mar 21 15:10:28 2001)

WebMake.OutFile

The filename used in the current <out> tag. (allinone.html)

WebMake.OutName

The name used in the current <out> tag. (allinone)

WebMake.PerlLib

The directory WebMake expects to find Perl code library files (ie. plugins) in.
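
For illustration, a page footer could combine several of these variables. The content name footer is an assumption; the variable names are the ones listed above.

```
<content name="footer">
  <!-- "footer" is an illustrative content name -->
  <hr>
  <p>Built with ${WebMake.GeneratorString} by ${WebMake.Who}
  on ${WebMake.Time}; this file is ${WebMake.OutFile}.</p>
</content>
```

With the sample values quoted above, the paragraph would read roughly "Built with WebMake/1.1 by jm on Wed Mar 21 15:10:28 2001; this file is allinone.html."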


The Text::EtText Format Converter

This converter converts from Text::EtText, a simple plain-text format, to HTML. Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of <P> tags, header recognition and markup. However, it also adds a powerful link markup system.

EtText markup is simple and effective; it's based loosely on WikiWikiWeb TextFormattingRules.

Basic Text Markup

If you leave blank lines between paragraphs, <p> and </p> tags will be inserted in the correct places. EtText does quite a good job of this.

Words wrap and fill automatically, so there's no need to worry about wrapping before 80 characters. (It's good form to do so anyway, in case other people ever need to edit your text, though.)

A paragraph consisting of a line of 10 or more consecutive - or _ signs will be converted to a HR tag.

Sections of text between pairs of certain characters will be turned into markup, as follows:

EtText      Tag Used    Result
**text**    <strong>    text
__text__    <em>        text
##text##    <code>      text

& signs that have whitespace on either side will be converted to &amp; signs automatically.

Text indented from the left margin will be converted into a <P> paragraph wrapped in a <blockquote> -- unless it starts with a *, -, + or o character followed by whitespace, in which case it's interpreted as a list item; see Lists below.

Another exception to the above rule is that text indented by only 1 space, or on lines starting in the first column with two colon characters, will be surrounded by <pre> tags.

If you find writing HTML tag-pairs manually annoying, EtText includes an idea from Latte; balanced-tag generation. Wrap the text to be tagged with the name of the tag followed immediately by a { character on the left, and a } character on the right. In other words,

strong{text}

will be rendered as

<strong>text</strong>

which a browser typically displays as bold text. This can be nested, so strong{text with i{italic} bits} will be rendered as <strong>text with <i>italic</i> bits</strong>.

In addition, the balanced-tag support has a bonus feature, in that it supports CSS classes; follow the name of the tag with a full stop and the class, and it will use that class, like so:

i.green{foo}

will be rendered as

<i class="green">foo</i>

Lists

A paragraph indented from the left margin (by either spaces or tabs, or both), and starting with a *, -, + or o character followed by whitespace, will be converted into a list item (<li> tag).

The same goes for indented paragraphs that start with the string 1., followed by whitespace. However the default list tag in this case will be an <ol>...</ol> list. Any positive integer followed immediately by a full stop and a space will do the trick. (BTW: I used to use # to do this, but I preferred the WikiIdea, it looks better.)

(Compatibility note: previous versions of EtText required that the <ul> or <ol> tags be written manually. This is no longer the case.)

Some text editors (such as vim) will reformat list items automatically, assuming that you want the text to line up with the start of the text, instead of the bullet-point character, on the previous line, like so:


	- this is a list item. We should make sure that
	  blah blah etc. etc.

WebMake supports this.

Indented paragraphs that start with term: tab rest of paragraph will be converted into definition lists (this is another StolenFromWikiIdea). They look like this:

Foo

Blah blah blah etc.
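
As a rough illustration of the list rules above, the following EtText source is a sketch of a bulleted list, a numbered list, and a definition-list entry. The wording of the items is invented; the entry on the last line separates the term from its definition with a single tab character.

```
  - first bullet item, continued
    on a second line.

  - second bullet item.

  1. first numbered item.

  2. second numbered item.

  Term:	the definition of the term.
```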

Sidebars and Side Images

If you wish to display an image, or small sidebar, beside a paragraph of text, use the <etleft> and <etright> tags. These are rendered as a one-row, two-column <table> wrapping the paragraph and the sidebar, as follows:


<etleft><img src=bubba.png></etleft>This is the main
paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.


<etright><img src=bubba.png></etright>This is the
main paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.

When HTML and EtText Collide

HTML tags can be used freely throughout an EtText document. However, in some situations, you may wish to preserve whitespace, avoid paragraph tags being added, etc.; to use your own HTML without meddling from EtText, wrap it in an <!--etsafe-->...<!--/etsafe--> tag pair; this will protect it.

Note that text blocks wrapped in <pre>, <listing> and <xmp> tags are automatically protected in this way; the <!--etsafe--> tag pair is not required.

EtText adds two entities, &etsqi; and &etsqo;. These represent [ and ] respectively, and are used to protect a square-bracketed piece of text from being interpreted as a link URL (see EtText Links below).

EtText Links

As well as the standard <a href=url>...</a> link specification used in HTML, EtText will automatically add href tags for URLs and email addresses that occur in the text. In addition, EtText supports its own link format, as follows.

The basic concept is of a word or "quoted set of words" followed by a link label in [square brackets], like this: "this is a link" [label].

The href used in the link is then defined at another point in the document, as an indented line like this:

[label]: http://url...

Text and markup can be enclosed in the quotes; everything quoted will become part of the link text. Single words or HTML tags do not need to be quoted, so <img src="http://jmason.org/license_plate.jpg" width="10" height="10"> [homepage] will work correctly.
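
Putting the pieces together, a short EtText passage using this link format might look like the sketch below. The labels wm and jm are arbitrary names chosen for illustration; the first link text is quoted because it spans several words, while the single word author needs no quotes.

```
The "WebMake home page" [wm] has downloads, and the
author [jm] maintains it.

    [wm]: http://webmake.taint.org/
    [jm]: http://jmason.org/
```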

Glossary Links

EtText also supports a concept called glossary links; if you define a link, the name of that link will automatically become a href if enclosed in quotes. For example:

[Justin Mason]: http://jmason.org/

will mean that any occurrence of the name "Justin Mason", in quotes, in any EtText content chunk or file in the site, becomes a link to that address. These links are stored in the WebMake cache file.

Quoted bits of text that do not map to an entry in the glossary are not converted to links (unless they're followed by a square-bracketed link-label reference).

URLs, such as http://webmake.taint.org/ , and email addresses, such as jm@nospam-jmason.org, are automatically converted into links to that same address.

Blocking EtText Link Interpretation

To block interpretation as a link, replace square brackets with the HTML entities &etsqi; and &etsqo;, which map to [ and ] respectively; replace quote characters, ", with two apostrophes, ''. If that doesn't do the trick, wrap the entire section of text with the <!--etsafe-->...<!--/etsafe--> tags.
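
For example, the following EtText source is a sketch of both escapes in use; the sentence itself is invented for illustration.

```
The first array element is written foo&etsqi;0&etsqo;, and
''quoted text'' written with doubled apostrophes is left alone.
```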

Similar Systems

EtText-like plain-text-to-markup conversion systems have a long history. The first time I came across the concept was with Setext, which was included with Tony Sanders' Plexus web server, back in September 1993. Yes, 1993. Setext has been around for a while!

WikiWikiWeb is quite a recent, well-established system which uses a similar markup style.

Userland's Frontier includes a text-to-markup conversion system as well.

Some well-known sites that use their own converters to convert plain-text to markup include http://www.blogger.com/, http://slashdot.org/ (for comments) and http://www.advogato.org/.

Jorn Barger maintains an impressive summary of etext formats at his Robot Wisdom site. Skip down to section 3, Internet etext standards, for the directly-relevant stuff.

Zope and ZWiki use a format called StructuredText, which again comes from WikiLand. There's some interesting work going on there with the STXDocument object, which is a web-manageable object that contains information marked up in the structured text format.


The POD Format Converter

This converter converts from POD to HTML, using Tom Christiansen's Pod::Html module.

POD is a powerful, but simple, editable-text format for marking up manual-page-style documentation. See the "perlpod" manual page in your Perl documentation for more information on the POD format.

Things to watch out for in WebMake's support for POD:

  • Anything before the <BODY> tag, or after the </BODY> tag, in the generated output is stripped, so that the POD output can be embedded in HTML pages without requiring a page of its own.

  • WebMake allows options to pod2html to be specified using the podargs attribute of the <content> tag; see below.

  • If you are reading POD documentation embedded inside other files, you should probably use the "asis" attribute on the content items in question, otherwise all sorts of weird things could happen as WebMake tries to interpret Perl variable references and so on! See the <content> documentation for details on "asis".

Specifying Options to the POD Translator

If you want to specify pod2html options to the converter, just put them in a string as a podargs attribute of the <content> tag, like so:

<content name="some_pod" podargs="--noindex"> ... </content>


The HTML Cleaner

The HTML cleaner is a powerful filter which can polish grotty, messy HTML into fully-standards-compliant glory. By default, all output of format text/html (the default format) will be passed through it.

It is controlled using the clean parameter of the <out> tag. The features to be used should be listed in this parameter's value, separated by whitespace.

Here are the features available:

  • pack - Compress the HTML, removing all white space that is not part of an attribute's value, or inside <xmp> or <pre> tags.

  • nocomments - Trim all comments.

  • addimgsizes - Add image sizes to <img> tags if they do not already specify them.

  • cleanattrs - Quote all attributes in opening tags, and lowercase all tag names.

  • addxmlslashes - Add XML-style slashes to the end of empty-element tags, such as <hr>, <img> etc.

  • fixcolors - Fix colors that do not start with a # character, so that they do.

The feature string all can be used to include all cleaning modes. This is the default.
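
As a sketch, an output file that should be compressed, stripped of comments, and given quoted attributes, but otherwise left alone, could be declared like this. The output name, file name, and the content item index_page are hypothetical.

```
<out name="index" file="index.html" clean="pack nocomments cleanattrs">
  <!-- index_page is a hypothetical content item -->
  ${index_page}
</out>
```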


HTML::WebMake::Content


NAME

Content - a content item.


SYNOPSIS

  <{perl

    $cont = get_content_object ("foo.txt");
    [... etc.]

  }>


DESCRIPTION

This object allows manipulation of WebMake content items directly.


METHODS

$text = $cont->get_name();

Return the content item's name.

$text = $cont->as_string();

A textual description of the object for debugging purposes; currently its name.

$fname = $cont->get_filename();

Get the filename or datasource location that this content was loaded from. Datasource locations look like this: proto:protocol-specific-location-data, e.g. file:blah/foo.txt or http://webmake.taint.org/index.html.

@filenames = $cont->get_deps();

Return an array of filenames and locations that this content depends on, i.e. the filenames or locations that it contains variable references to.

$flag = $cont->is_generated_content();

Whether or not a content item was generated from Perl code, or is metadata. Generated content items cannot themselves hold metadata.

$val = $cont->expand()

Expand a content item, as if in a curly-bracket content reference. If the content item has not been expanded before, the current output file will be noted as the content item's "main" URL.

$val = $cont->expand_no_ref()

Expand a content item, as if in a curly-bracket content reference. The current output file will not be used as the content item's "main" URL.

$val = $cont->get_metadata($metaname);

Get an item of this object's metadata, e.g.

        $score = $cont->get_metadata("score");

The metadatum is converted to its native type, e.g. score is returned as an integer, title as a string, etc. If the metadatum is not provided, the default value for that item, defined in HTML::WebMake::Metadata, is used.

$score = $cont->get_score();

Return a content item's score.

$title = $cont->get_title();

Return a content item's title.

$modtime = $cont->get_modtime();

Return a content item's modification date, in UNIX time_t format, ie. seconds since Jan 1 1970.

$order = $cont->get_declared();

Returns the content item's declaration order. This is a number representing when the content item was first encountered in the WebMake file; earlier content items have a lower declaration order. Useful for sorting.

$text = $cont->get_url();

Get a content item's URL. The URL is defined as the first page listed in the WebMake file's out tags which refers to that item of content.

Note that, in some cases, the content item may not have been referred to yet by the time its get_url() method is called. In this case, WebMake will insert a symbolic tag, hold the file in memory, and defer writing the file in question until all other output files have been processed and the URL has been found.
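
A short sketch of several of these methods used together follows. The content name blurb.txt is only an example and must be defined elsewhere in the WebMake file; the sketch also assumes that the value of the last expression is what the perl section inserts into the page.

```
<{perl
  # "blurb.txt" is an example content name, not one WebMake provides.
  my $cont = get_content_object ("blurb.txt");

  my $title = $cont->get_title();    # "(Untitled)" if no title was set
  my $score = $cont->get_score();    # 50 unless the item overrides it
  my $when  = scalar localtime ($cont->get_modtime());

  # Assumption: this final expression becomes the section's output.
  "<p>$title (score $score), last modified $when</p>";
}>
```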


HTML::WebMake::Main


NAME

HTML::WebMake - a simple web site management system, allowing an entire site to be created from a set of text and markup files and one WebMake file.


SYNOPSIS

  my $f = new HTML::WebMake::Main ();
  $f->readfile ($filename);
  $f->make();
  my $failures = $f->finish();
  exit $failures;


DESCRIPTION

WebMake is a simple web site management system, allowing an entire site to be created from a set of text and markup files and one WebMake file.

It requires no dynamic scripting capabilities on the server; WebMake sites can be deployed to a plain old FTP site without any problems.

It allows the separation of responsibilities between the content editors, the HTML page designers, and the site architect; only the site architect needs to edit the WebMake file itself, or know perl or WebMake code.

A multi-level website can be generated entirely from 1 or more WebMake files containing content, links to content files, perl code (if needed), and output instructions. Since the file-to-page mapping no longer applies, and since elements of pages can be loaded from different files, this means that standard file access permissions can be used to restrict editing by role.

Since WebMake is written in perl, it is not limited to command-line invocation; using the HTML::WebMake::Main module directly allows WebMake to be run from other Perl scripts, or even mod_perl (WebMake uses use strict throughout, and temporary globals are used only where strictly necessary).


METHODS

$f = new HTML::WebMake::Main

Constructs a new HTML::WebMake::Main object. You may pass the following attribute-value pairs to the constructor.

force_output

Force output. Normally if a file is already up to date, it is not modified. This will force the file to be re-made.

force_cache_rebuild

Force the cached metadata and dependency data for the site to be rebuilt. Normally this is used to speed up partial rebuilds of the site. This option implies force_output.

risky_fast_rebuild

Run more quickly, but take more risks. Normally, dynamic content, such as Perl sections, sitemaps, or navigation links, are always considered to be in need of rebuilding, as mapping their dependencies is often very difficult or impossible. This switch forces them to be ignored for dependency-tracking purposes, and so an output file that depends on them will not be rebuilt unless a normal content item on that page changes.

base_href

Rewrite links to be absolute URLs based at this URL. By default, links are specified as relative wherever possible.

base_dir

Generate output, and look for support files (images etc.), relative to this directory.

paranoid

Paranoid mode; do not allow perl code evaluation or accesses to directories above the WebMake file.

debug

Debug mode; more output.

$f->set_option ($optname, $optval);

Set a WebMake option. Currently supported options are:

$f->readfile ($filename)

Read and parse the given WebMake file.

$f->make ()

Make all outputs, based on the WebMake files read earlier.

$num_failures = $f->finish();

Finish with a WebMake object and dispose of its internal open files etc. Returns the number of serious failure conditions that occurred (files that could not be created, etc.).


MORE DOCUMENTATION

See also http://webmake.taint.org/ for more information.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText::EtText2HTML Text::EtText::EtHTML2Text


AUTHOR

Justin Mason <jm /at/ jmason.org>


COPYRIGHT

WebMake is distributed under the terms of the GNU Public License.


AVAILABILITY

The latest version of this library is likely to be available from CPAN as well as:

  http://webmake.taint.org/

HTML::WebMake::PerlCodeLibrary


NAME

PerlCodeLibrary - a selection of functions for use by perl code embedded in a WebMake file.


SYNOPSIS

  <{perl

    $foo = get_content ($bar);
    [... etc.]

    # or:

    $foo = $self->get_content ($bar);
    [... etc.]

  }>


DESCRIPTION

These functions allow code embedded in a <{perl}> or <{perlout}> section of a WebMake file to be used to script the generation of content.

Each of these functions is defined both as a standalone function and as a method on the PerlCode object. Code in one of the <{perl*}> sections can access this PerlCode object as the $self variable. If you plan to use WebMake from mod_perl or in a threaded environment, be sure to call them as methods on $self.


METHODS

@names = content_matching ($pattern);

Find all items of content that match the glob pattern $pattern. If $pattern begins with the prefix RE:, it is treated as a regular expression. The list of items returned is not in any logical order.

@objs = content_names_to_objects (@names);

Given a list of content names, convert to the corresponding list of content objects, ie. objects of type HTML::WebMake::Content.

$obj = get_content_object ($name);

Given a content name, convert to the corresponding content object, ie. objects of type HTML::WebMake::Content.

@names = content_objects_to_names (@objs);

Given a list of objects of type HTML::WebMake::Content, convert to the corresponding list of content name strings.
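
The functions above can be combined to build a simple index by hand. The following is a sketch only: the glob *.txt is an example pattern, and it is assumed that the value of the last expression is inserted as the perl section's result.

```
<{perl
  # "*.txt" is an example glob; adjust it to your own content names.
  my @objs = content_names_to_objects (content_matching ("*.txt"));

  # content_matching returns names in no particular order, so sort here.
  my @sorted = sort { $a->get_title() cmp $b->get_title() } @objs;

  my $html = "<ul>\n";
  foreach my $obj (@sorted) {
    $html .= "  <li>" . $obj->get_title() . "</li>\n";
  }

  # Assumption: this final expression becomes the section's output.
  $html . "</ul>\n";
}>
```

Sorting in plain Perl keeps the sketch independent of the sort-string syntax, which is documented separately.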

@sortedobjs = sort_content_objects ($sortstring, @objs);

Sort a list of content objects by the sort string $sortstring. See "sorting.html" in the WebMake documentation for details on sort strings.

@names = sorted_content_matching ($sortstring, $pattern);

Find all items of content that match the glob-style pattern $pattern. The list of items returned is ordered according to the sort string $sortstring. If $pattern begins with the prefix RE:, it is treated as a regular expression.

See "sorting.html" in the WebMake documentation for details on sort strings.

This, by the way, is essentially implemented as follows:

        my @list = $self->content_matching ($pattern);
        @list = $self->content_names_to_objects (@list);
        @list = $self->sort_content_objects ($sortstring, @list);
        return $self->content_objects_to_names (@list);

$str = get_content ($name);

Get the item of content named $name. Equivalent to a ${content_reference}.

@list = get_list ($name);

Get the item of content named $name, but in Perl list format. It is assumed that the list is stored in the content item in whitespace-separated format.

set_content ($name, $value);

Set a content chunk to the value provided. This content will not appear in a sitemap, and navigation links will never point to it.

Returns the content object created.

set_list ($name, @values);

Set a content chunk to a list containing the values provided, separated by spaces. This content will not appear in a sitemap, and navigation links will never point to it.

Returns the content object created.
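
As a small sketch of the list functions, the section below stores a list and reads it back. The name sections and its values are made up for illustration, and the final expression is assumed to be what the perl section inserts.

```
<{perl
  # "sections" and its values are invented names for illustration.
  set_list ("sections", "news", "reviews", "links");

  # The list is stored space-separated and read back by splitting on
  # whitespace, so this should yield the same three values.
  my @sections = get_list ("sections");

  join (", ", @sections);
}>
```

Because the list is stored in whitespace-separated form, values that themselves contain spaces would not survive the round trip.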

set_mapped_content ($name, $value, $upname);

Set a content chunk to the value provided. This content will appear in a sitemap and the navigation hierarchy. $upname should be the name of its parent content item. This item must not be metadata, or other dynamically-generated content; only first-class mapped content can be used.

Returns the content object created.

del_content ($name);

Delete a named content chunk.

@names = url_matching ($pattern);

Find all URLs (from <out> and <media> tags) whose name matches the glob-style pattern $pattern. The names of the URLs, not the URLs themselves, are returned. If $pattern begins with the prefix RE:, it is treated as a regular expression.

$url = get_url ($name);

Get a named URL. Equivalent to a $(url_reference).

set_url ($name, $url);

Set an URL to the value provided.

del_url ($name);

Delete an URL.
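
A brief sketch of the URL functions in use follows; the URL name download is an invented example, and the final expression is assumed to be inserted as the section's result.

```
<{perl
  # "download" is an illustrative URL name.
  set_url ("download", "http://webmake.taint.org/");

  '<a href="' . get_url ("download") . '">Get WebMake</a>';
}>
```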

$listtext = make_list ($itemname, @namelist);

Generate a list by iterating through the @namelist, setting the content item named item to the current name, and interpreting the content chunk named $itemname. This content chunk should refer to ${item} appropriately.

Each resulting block of content is appended to a $listtext, which is finally returned.

See the news_site.wmk sample site for an example of this in use.

define_tag ($tagname, \&handlerfn, @required_attributes);

Define a tag for use in content items. Any occurrences of this tag, with at least the set of attributes defined in @required_attributes, will cause the handler function referred to by handlerfn to be called.

Handler functions are called as follows:

        handler ($tagname, $attrs, $text, $perlcode);

Where $tagname is the name of the tag, $attrs is a reference to a hash containing the attribute names and the values used in the tag, and $text is the text between the start and end tags.

$perlcode is the PerlCode object, allowing you to write proper object-oriented code that can be run in a threaded environment or from mod_perl. This can be ignored if you like.

This function returns an empty string.
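
A minimal sketch of a tag definition follows. The tag name shout and the handler name are hypothetical, and the sketch assumes that the handler's return value replaces the tag in the content, which is not stated above.

```
<{perl
  # Hypothetical handler for a made-up <shout> tag.
  sub shout_handler {
    my ($tagname, $attrs, $text, $perlcode) = @_;

    # Assumption: the returned string replaces the tag in the output.
    return "<strong>" . uc ($text) . "</strong>";
  }

  # No required attributes are listed, so any <shout>...</shout> matches.
  define_tag ("shout", \&shout_handler);
}>
```

Since define_tag returns an empty string, the perl section itself would add nothing to the page when define_tag is its last statement.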

define_empty_tag ($tagname, \&handlerfn, @required_attributes);

Define a tag for use in content items. This is identical to define_tag above, but is intended for use to define "empty" tags, ie. tags which occur alone, not as part of a start and end tag pair.

The handler in this case is called with an empty string for the $text argument.

define_wmk_tag ($tagname, \&handlerfn, @required_attributes);

Define a tag for use in the WebMake file.

Aside from operating on the WebMake file instead of inside content items, this is otherwise identical to define_tag above.

define_empty_wmk_tag ($tagname, \&handlerfn, @required_attributes);

Define an empty, aka. standalone, tag for use in the WebMake file.

Aside from operating on the WebMake file instead of inside content items, this is otherwise identical to define_tag above.

$obj = get_root_content_object();

Get the content object representing the "root" of the site map. Returns undef if no root object exists, or the WebMake file does not contain a <sitemap> command.

$name = get_current_main_content();

Get the "main" content on the current output page. The "main" content is defined as the most recently referenced content item which (a) is not generated content (perl code, sitemaps, breadcrumb trails etc.), and (b) has its map attribute set to "true".

Note that this API should only be called from a deferred content reference; otherwise the "main" content item may not have been referenced by the time this API is called.

undef is returned if no main content item has been referenced.

$main = get_webmake_main_object();

Get the current WebMake interpreter's instance of HTML::WebMake::Main object. Virtually all of WebMake's functionality and internals can be accessed through this.


webmake(1)


NAME

webmake - a simple web site management system, allowing an entire site to be created from a set of text and markup files and one WebMake file.


SYNOPSIS

  webmake [option ...]

  webmake [option ...] [-f webmakefile]

  webmake [option ...] [-R dir_or_file]


DESCRIPTION

WebMake is a simple web site management system, allowing an entire site to be created from a set of text and markup files and one WebMake file.

It requires no dynamic scripting capabilities on the server; WebMake sites can be deployed to a plain old FTP site without any problems.

It allows the separation of responsibilities between the content editors, the HTML page designers, and the site architect; only the site architect needs to edit the WebMake file itself, or know perl or WebMake code.

A multi-level website can be generated entirely from 1 or more WebMake files containing content, links to content files, perl code (if needed), and output instructions. Since the file-to-page mapping no longer applies, and since elements of pages can be loaded from different files, this means that standard file access permissions can be used to restrict editing by role.

Text can be edited as standard HTML, converted from plain text (using the included Text::EtText module), or converted from any other format by adding a conversion method to the WebMake::FormatConvert module.

Since URLs can be referred to symbolically, pages can be moved around and URLs changed by changing just one line. All references to that URL will then change automatically.

Content items and output URLs can be generated, altered, or read in dynamically using perl code. Perl code can even be used to generate other perl code to generate content/output URLs/etc., recursively.


OPTIONS

-f

The WebMake file to read and generate output from. If this option is not supplied, the default behaviour is to search the current directory and its parents for a file ending in .wmk.
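
The default search can be pictured with the following Python sketch. It is only a model of the behaviour described above, not WebMake's own code; the function name find_wmk_file is hypothetical, and the tie-break used when a directory holds several .wmk files is an assumption, since the text does not say which one wins.

```python
from pathlib import Path
from typing import Optional


def find_wmk_file(start: Path) -> Optional[Path]:
    """Return a *.wmk file from `start` or its nearest ancestor, else None."""
    start = start.resolve()
    # Walk from the starting directory up to the filesystem root.
    for directory in [start, *start.parents]:
        # Assumption: with several candidates, take the alphabetically first.
        candidates = sorted(p for p in directory.glob("*.wmk") if p.is_file())
        if candidates:
            return candidates[0]
    return None
```

Calling find_wmk_file(Path.cwd()) from a subdirectory of a site would return the site's .wmk file in a parent directory, which is why webmake can be run from anywhere inside the source tree without -f.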

-F

Force output. Normally if a file is already up to date, it is not modified. This will force the file to be re-made.

-r

Run more quickly, but take more risks. Normally, dynamic content, such as Perl sections, sitemaps, or navigation links, is always considered to be in need of rebuilding, as mapping its dependencies is often very difficult or impossible. This switch forces such content to be ignored for dependency-tracking purposes, and so an output file that depends on it will not be rebuilt unless a normal content item on that page changes.
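
The difference between the normal and the risky rebuild decision can be sketched as follows. This is a simplified Python model under stated assumptions, not WebMake's dependency tracker: the Dep record and the needs_rebuild function are hypothetical names, and "changed" stands in for whatever up-to-date check WebMake really performs.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Dep:
    name: str
    dynamic: bool  # Perl section, sitemap, navigation links, ...
    changed: bool  # source modified since the output file was written


def needs_rebuild(deps: Iterable[Dep], risky: bool = False) -> bool:
    """Decide whether an output file depending on `deps` must be re-made."""
    for dep in deps:
        if dep.dynamic:
            if not risky:
                return True  # normal mode: dynamic content is always stale
            continue         # -r: dynamic content is ignored entirely
        if dep.changed:
            return True      # a normal content item changed
    return False
```

With a page that depends on an unchanged text file and a sitemap, needs_rebuild returns True in normal mode and False with risky=True, which is the trade-off the switch makes.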

-b basehref

Rewrite links to be absolute URLs based at this URL. By default, links are specified as relative wherever possible.
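
The effect on a single link can be illustrated with Python's standard URL resolution. This is a sketch of the general idea only; it assumes the link is given relative to the top of the site, and the absolutize function and the example.com URLs are made up for illustration.

```python
from urllib.parse import urljoin


def absolutize(basehref: str, link: str) -> str:
    """Resolve `link` against `basehref` using standard URL joining rules."""
    return urljoin(basehref, link)


print(absolutize("http://www.example.com/site/", "docs/index.html"))
# prints http://www.example.com/site/docs/index.html
```

Note that under standard URL resolution a base without a trailing slash, such as http://www.example.com/site, would lose its last path segment when joined, so the trailing slash matters in this sketch.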

-d basedir

Generate output, and look for support files (images etc.), relative to this directory.

-p

Paranoid mode; do not allow perl code evaluation or accesses to directories above the WebMake file.

-D

Debug mode; more output.

-L

Debug level; how much debug output to produce. 0 means no debug output, 3 means lots.

-C dir

Change to this directory before reading files or generating output.

-R dir_or_file

If dir_or_file is a directory, change to that directory, or if it is a file, change to that file's parent directory, before starting.
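
The rule for choosing the starting directory is small enough to model directly. The following Python sketch assumes that anything which is not an existing directory is treated as a file; start_directory is a hypothetical name, not part of WebMake.

```python
from pathlib import Path


def start_directory(dir_or_file: str) -> Path:
    """Return the directory to change to before starting."""
    path = Path(dir_or_file)
    # A directory is used as-is; for a file, use its parent directory.
    return path if path.is_dir() else path.parent
```

So both start_directory("mysite") and start_directory("mysite/index.txt") would lead to the mysite directory, provided mysite exists.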


INSTALLATION

The webmake command is part of the HTML::WebMake Perl module. Install this as a normal Perl module, using perl -MCPAN -e shell, or by hand.


ENVIRONMENT

No environment variables, aside from those used by perl, are required to be set.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


PREREQUISITES

HTML::Entities File::Spec File::Path File::Basename Carp Cwd


COREQUISITES

Image::Size is required to support the IMGSIZE tag. If this tag is not used, or if the module is not available, webmake can still operate acceptably.


csvtable_tag.wmk


LOADING

  < use plugin="csvtable_tag" />


HTML TAGS

  < csvtable [delimiter="char"] [HTML table attributes] >
  [...cells...]
  < /csvtable >


DESCRIPTION

This WebMake Perl library provides a tag to allow HTML tables to be constructed, quickly, using a tab-, comma-, or pipe-separated value table.

Firstly, pick a delimiter character, such as |. Set the delimiter attribute to this character.

Each line of the CSV table will become a < TR >; each delimiter-separated cell will be enclosed in a < TD > tag pair.

Attributes for the HTML table tag itself can be provided as attributes to this tag; they will be passed through into the resulting < TABLE > tag.

By default, items inside the tables are represented as < TD > cells, with no attributes. Certain special line prefixes allow control over formatting of table items, as follows. These are all case-insensitive, and whitespace after them will be stripped; but they must start on the first character of the line (no leading spaces), and, despite how they're rendered here, should not contain any spaces between the angle brackets.

Blank lines are skipped.

< !-- .... -- >

Comments, a la HTML.

< csvfmt >

The rest of the line is used to specify the format to be used for each line afterwards, until the end of the < csvtable >, or until the next < csvfmt > line.

The line should end in a < /csvfmt > closing tag.

Specify a < tr >...< /tr > block, with $1, $2, $3, etc. for the numbered cells (counting from 1). For example:

  < csvfmt >< tr >< td >$1< /td >< td >$2< /td >< td >$3< /td >< /tr >< /csvfmt >


EXAMPLE

  < csvtable delimiter="|" >
  < !-- heading -- >
  < csvfmt >< tr >< th >$1< /th >< th >$2< /th >< th >$3< /th >< /tr >< /csvfmt >
  First Name|Surname|Title
  < !-- contents -- >
  < csvfmt >< tr >< td >$1< /td >< td >$2< /td >< td >$3< /td >< /tr >< /csvfmt >
  Justin|Mason|JAPH
  Foo|Bar|Baz
  < /csvtable >


THANKS

Thanks to Chris Barrett; he suggested this tag.


download_tag.wmk


LOADING

  < use plugin="download_tag" />


HTML TAGS

  < download file="filename.dat" [text="template"] />


DESCRIPTION

This WebMake Perl library provides a quick shortcut to make links to files for download.

The attributes supported are as follows:

file="filename.dat"

The filename to link to. If a file by this filename does not exist, a warning will be printed.

Filenames should be specified relative to one of the following:

  • the top level of the site
  • the output file which contains the tag (not recommended, as it precludes the tag being used in another output file in a different directory)
  • a directory named in the FileSearchPath WebMake option

text="template"

The link text to be used. The following content items are defined for use inside the link text:

download.path

The real path to the file.

download.href

The path to the file, relative to the current output file.

download.name

The file's name, without directories.

download.mdate

The file's modification date, in ctime() format, e.g. Thu Mar 01 20:54:34 2001.

download.mtime

The file's modification date, in UNIX time_t format.

download.size_in_k

The file's size, in kilobytes (rounded up).

download.size

The file's size, in bytes.

download.owner

The file's owner.

download.group

The file's group.

download.tag_attrs

The remaining attributes of the download tag.

template can be a $ {content_reference}. The default template is:

  < a href="$ {download.href}" $ {download.tag_attrs}>$ {download.name}
  ($ {download.size_in_k}k)< /a>

Note that this means that any unrecognised attributes of the download tag itself will become attributes of the A tag.
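
The way the template is filled in can be modelled with a short Python sketch. It is illustrative only and makes several assumptions: a kilobyte is taken to be 1024 bytes, the owner and group items are left out, the line break in the default template is written as a single space, and the names download_items and expand are hypothetical. References are written here as they would appear in a real WebMake file, without the space after the dollar sign.

```python
import math
import os
import re
import time

DEFAULT_TEMPLATE = ('<a href="${download.href}" ${download.tag_attrs}>'
                    '${download.name} (${download.size_in_k}k)</a>')


def download_items(path: str, href: str, tag_attrs: str = "") -> dict:
    """Build the download.* items for one file (owner/group omitted)."""
    st = os.stat(path)
    return {
        "download.path": os.path.realpath(path),
        "download.href": href,
        "download.name": os.path.basename(path),
        "download.mdate": time.strftime("%a %b %d %H:%M:%S %Y",
                                        time.localtime(st.st_mtime)),
        "download.mtime": str(int(st.st_mtime)),
        # Assumption: 1k = 1024 bytes, rounded up.
        "download.size_in_k": str(math.ceil(st.st_size / 1024)),
        "download.size": str(st.st_size),
        "download.tag_attrs": tag_attrs,
    }


def expand(template: str, items: dict) -> str:
    """Substitute ${name} references; unknown names are left untouched."""
    return re.sub(r"\$\{([\w.]+)\}",
                  lambda m: items.get(m.group(1), m.group(0)),
                  template)
```

For a 1500-byte file named data.dat, this model gives a size_in_k of 2, and an extra attribute such as class="dl" on the download tag ends up inside the generated A tag, as the note above describes.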

OPTIONS WHICH AFFECT THIS TAG

FileSearchPath - WebMake option


dump_vars.wmk


NAME

dump_vars.wmk - dump all WebMake variables and content items


LOADING

  < use plugin="dump_vars" />


CONTENT ITEMS

  $ {DumpVars_names}

  $ {DumpVars_full}


DESCRIPTION

Some debugging help. If you include this file in your WebMake file, it will define these content items:

$ {DumpVars_names}

This content contains a list of the names of all content items defined.

$ {DumpVars_full}

This content contains a dump of all content items defined, including their names and their values. It excludes $ {DumpVars_names} and $ {DumpVars_full}.


safe_tag.wmk


LOADING

  < use plugin="safe_tag" />


HTML TAGS

  < safe>
  ...some data with HTML tags or WebMake references
  < /safe>


PERL CODE

  <{perl

    $safe_text = make_safe ($unsafe_text);

  }>


DESCRIPTION

This WebMake Perl library provides a way to "make safe" WebMake, EtText or HTML data, escaping all metacharacters appropriately so that content references, EtText links or HTML tags are not interpreted.
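
The general idea of such escaping can be shown with a small Python sketch. This is not the plugin's implementation: which characters the real make_safe escapes, and how, is not stated here, so the choice below (HTML entities for markup characters, a numeric entity for the dollar sign) is an assumption, and EtText link syntax is not handled at all.

```python
import html


def make_safe(text: str) -> str:
    """Escape text so tags and ${...} references are shown literally."""
    # &, <, > and quotes become entities, so HTML tags are not interpreted.
    safe = html.escape(text, quote=True)
    # Assumption: writing "$" as a numeric entity is enough to stop
    # "${name}" from being read as a content reference.
    return safe.replace("$", "&#36;")


print(make_safe("<b>${title}</b> & more"))
# prints &lt;b&gt;&#36;{title}&lt;/b&gt; &amp; more
```

The ampersand is escaped first, so the entities introduced for the other characters are not escaped a second time.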


sitetree.wmk


LOADING

  < use plugin="sitetree" />


WEBMAKE TAGS

  < sitetree name=... sitemap=...
        opennode=... closednode=...
        thispage=... leaf=... />


DESCRIPTION

This WebMake Perl library provides the sitetree tag.

Sitetree operates similarly to the built-in sitemap tag, but displays only a subset of the site's nodes: it maps all of the top-level nodes of the site, and then only the parent nodes of the current page. The effect is similar to a tree-view-based file browser, like Windows Explorer.

In terms of differences in usage, where sitemap creates a single map which includes every page in the site, sitetree maps only the pages up to and including the current page, and generates a map for each individual output page.

So, for a site like this:

+ Section 1
  + Section 1 Subsection 1
  + Section 1 Subsection 2
+ Section 2
  + Section 2 Subsection 1
  + Section 2 Subsection 2

A reference to the site tree on page Section 1 Subsection 1 would result in a site tree like this:

- Section 1
  - Section 1 Subsection 1
+ Section 2
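
The pruning rule behind this example can be modelled in a few lines of Python. This is a sketch of the rule as described, not the plugin's code: the Node class and function names are hypothetical, "-" and "+" stand in for the open and closed templates, nested entries are indented by two spaces for readability, and the pages beneath the current page are assumed not to be listed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    title: str
    children: List["Node"] = field(default_factory=list)


def contains(node: Node, current: str) -> bool:
    """True if `node` is the current page or one of its ancestors."""
    return node.title == current or any(contains(c, current) for c in node.children)


def sitetree(nodes: List[Node], current: str, depth: int = 0) -> List[str]:
    lines = []
    for node in nodes:
        if contains(node, current):
            lines.append("  " * depth + "- " + node.title)  # on the path: open
            if node.title != current:
                lines.extend(sitetree(node.children, current, depth + 1))
        elif depth == 0:
            lines.append("+ " + node.title)  # other top-level nodes: closed
    return lines


site = [
    Node("Section 1", [Node("Section 1 Subsection 1"), Node("Section 1 Subsection 2")]),
    Node("Section 2", [Node("Section 2 Subsection 1"), Node("Section 2 Subsection 2")]),
]
print("\n".join(sitetree(site, "Section 1 Subsection 1")))
```

For the site above this lists Section 1, then Section 1 Subsection 1 beneath it, then Section 2 in closed form, matching the example.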

Display of each page's entry in the tree is performed by expanding one of the 4 template content items named in the tag's attributes: closednode, opennode, thispage, or leaf. See the sitemap tag documentation for more details on how to use these (note however that the is_node variable is not available for sitetrees).


ATTRIBUTES

name

The name of the sitetree object. To include a sitetree in a page, refer to it using this name, as a deferred reference.

sitemap

The name of the sitemap. The sitetree requires a sitemap, as the sitemap is responsible for mapping out the site and defining which pages and content items are included.

closednode

A content item which is evaluated to display a "closed" node, i.e. a node which is not on the path to the current page.

opennode

A content item which is evaluated to display an "open" node, one which is on the path to the current page. As for the sitemap tag's node attribute, this content item must include a reference to the list variable, which will contain all the entries for the pages beneath it in the hierarchy.

thispage

A content item which is evaluated to display the current page.

leaf

A content item which is evaluated to display a leaf-node page, one which has no pages beneath it in the hierarchy.


THANKS

Thanks go to Alex Canady, who came up with the idea for this one.

