EtText Documentation (version 2.1)

All-In-One Documentation


The Blurb

EtText is a simple plain-text format which allows conversion to and from HTML. Instead of editing HTML directly, it provides an easy-to-edit, easy-to-read and intuitive way to write HTML, based on the plain-text markup conventions we've been using for years.

Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of <p> tags, header recognition and markup. However, it also adds a powerful link markup system.

EtText markup is simple and effective; it's very similar to setext, WikiWikiWeb TextFormattingRules or Zope's StructuredText.

EtText is distributed under the same licensing terms as Perl itself.


Contributors to Text::EtText

Here's a list of people who've contributed to Text::EtText:

  • Justin Mason <jm /at/ jmason.org>: original author and maintainer

  • Caolan McNamara <caolan /at/ csn.ul.ie>: EtText contributions; lists, pre-formatted text, lots of suggestions

  • rudif /at/ bluemail.ch: lots of help with supporting Windows

  • Chris Barrett, chris /at/ getfrank.com: suggested CSS class support for the Latte-style balanced tags

Thanks all! Patches and suggestions are welcomed -- send them in! (By the way, patch contributors get listed at the top, 'cos patches save me writing the code ;)


Using EtText

Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of <P> tags, header recognition and markup. However it adds a powerful link markup system and several other useful features.

EtText markup is simple and effective; it's based loosely on setext, with bits of WikiWikiWebTextFormattingRules thrown in.

EtText was previously part of WebMake, but is now distributed as a standalone component.

Basic Text Markup

If you leave blank lines between paragraphs, <p> and </p> tags will be inserted in the correct places. EtText does quite a good job of this.

Words wrap and fill automatically, so there's no need to worry about wrapping before 80 characters. (It's good form to do so anyway, in case other people ever need to edit your text, or you need to mail it around.)

A paragraph consisting of a line of 10 or more consecutive - or _ signs will be converted to an <hr> tag.

Sections of text between pairs of certain characters will be turned into markup, as follows:

EtText       Tag Used    Result
**text**     <strong>    text (bold)
__text__     <em>        text (emphasised)
##text##     <code>      text (monospaced)

& signs that have whitespace on either side will be converted to &amp; signs automatically.
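
The pair-delimited markup and the & rule above can be approximated with a handful of substitutions. This is only a minimal sketch for illustration, not the code Text::EtText actually uses; the function name inline_markup is hypothetical, and edge cases (markup spanning lines, delimiters inside HTML tags, the EtTextOneCharMarkup option) are ignored.

```perl
use strict;
use warnings;

# Hypothetical helper: approximates the pair-delimited rules and the
# "& with whitespace on either side" rule. Not EtText's real code.
sub inline_markup {
    my ($text) = @_;
    $text =~ s{\*\*(.+?)\*\*}{<strong>$1</strong>}g;   # **text** -> <strong>
    $text =~ s{__(.+?)__}{<em>$1</em>}g;               # __text__ -> <em>
    $text =~ s{##(.+?)##}{<code>$1</code>}g;           # ##text## -> <code>
    $text =~ s{(?<=\s)&(?=\s)}{&amp;}g;                # lone & -> &amp;
    return $text;
}

print inline_markup("Use **bold**, __emphasis__ and ##code## here & there."), "\n";
# prints: Use <strong>bold</strong>, <em>emphasis</em> and <code>code</code> here &amp; there.
```

Note that an & with no surrounding whitespace (as in AT&T) is left alone by this sketch, matching the rule as stated above.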

Text indented from the left margin will be converted into a <P> paragraph wrapped in a <blockquote> -- unless it starts with a *, -, + or o character followed by whitespace, in which case it's interpreted as a list item; see Lists below.

Another exception to the above rule is that text indented by only 1 space, or on lines starting in the first column with two colon characters, will be surrounded by <pre> tags.

If you find writing HTML tag-pairs manually annoying, EtText includes an idea from Latte: balanced-tag generation. Wrap the text to be tagged with the name of the tag followed immediately by a { character on the left, and a } character on the right. In other words,

strong{text}

will be rendered as

<strong>text</strong>

or, in other words, bold text. This can be nested, so strong{text with i{italic} bits} will be rendered as <strong>text with <i>italic</i> bits</strong>.

In addition, the balanced-tag support has a bonus feature, in that it supports CSS classes; follow the name of the tag with a full stop and the class, and it will use that class, like so:

i.green{foo}

will be rendered as

<i class="green">foo</i>
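
To make the nesting and class rules concrete, here is a rough sketch of how balanced tags could be expanded from the inside out. It is an illustration under my own assumptions (the helper name balanced_tags is made up, and tag and class names are limited to simple word characters), not the parser EtText ships with.

```perl
use strict;
use warnings;

# Hypothetical helper: expands tag{...} and tag.class{...}, innermost first.
sub balanced_tags {
    my ($text) = @_;
    # Each pass rewrites only groups whose body contains no braces, so
    # repeating the substitution unwinds nested tags from the inside out.
    1 while $text =~ s!\b([A-Za-z][A-Za-z0-9]*)(?:\.([\w-]+))?\{([^{}]*)\}!defined $2 ? qq(<$1 class="$2">$3</$1>) : "<$1>$3</$1>"!ge;
    return $text;
}

print balanced_tags("strong{text with i{italic} bits}"), "\n";
# prints: <strong>text with <i>italic</i> bits</strong>
print balanced_tags("i.green{foo}"), "\n";
# prints: <i class="green">foo</i>
```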


Lists

A paragraph indented from the left margin (by either spaces or tabs, or both), and starting with a *, -, + or o character followed by whitespace, will be converted into a list item (<li> tag).

The same goes for indented paragraphs that start with the string 1., followed by whitespace. However the default list tag in this case will be an <ol>...</ol> list. Any positive integer followed immediately by a full stop and a space will do the trick. (BTW: I used to use # to do this, but I preferred the WikiIdea, it looks better.)

(Compatibility note: previous versions of EtText required that the <ul> or <ol> tags be written manually. This is no longer the case; they will be added automatically.)

Some text editors (such as vim) will reformat list items automatically, assuming that you want continuation lines to line up with the start of the text on the previous line, rather than with the bullet-point character, like so:

    - this is a list item. We should make sure that
      blah blah etc. etc.

This is pretty handy, so using a - as the list bullet point character is recommended.

Indented paragraphs of the form term:<tab>rest of paragraph (that is, a term, a colon, a tab character, then the rest of the paragraph) will be converted into definition lists (this is another StolenFromWikiIdea). As a result, this:

    Foo:	Blah blah blah etc.

will look like this:

Foo

Blah blah blah etc.
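
Pulling together the indentation rules from Basic Text Markup and this section, the decision about what a paragraph becomes can be sketched as a small classifier that looks at the paragraph's first line. Treat this as a hedged illustration: classify_paragraph is a hypothetical name, the returned labels are mine, and the order in which the checks are applied is an assumption, since the text above does not say which rule wins when more than one could apply.

```perl
use strict;
use warnings;

# Hypothetical classifier for the first line of a paragraph. The order of
# the checks below is an assumption made for this sketch.
sub classify_paragraph {
    my ($line) = @_;
    return 'hr'      if $line =~ /^(?:-{10,}|_{10,})\s*$/;     # 10+ - or _ signs
    return 'pre'     if $line =~ /^::/;                        # :: in first column
    return 'p'       unless $line =~ /^[ \t]+/;                # not indented
    return 'ul-item' if $line =~ /^[ \t]+[*+o-]\s/;            # *, -, + or o bullet
    return 'ol-item' if $line =~ /^[ \t]+[1-9][0-9]*\.\s/;     # "1. ", "12. ", ...
    return 'dl-item' if $line =~ /^[ \t]+[^:\t]+:\t/;          # term:<tab>rest
    return 'pre'     if $line =~ /^ (?![ \t])/;                # exactly one space
    return 'blockquote';                                       # any other indent
}

for my $line ("    - a list item", "    12. a numbered item",
              "    Foo:\tBlah blah", " one-space indent",
              "    quoted text", "plain text", "----------") {
    printf "%-12s %s\n", classify_paragraph($line), $line;
}
```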


Sidebars and Side Images

If you wish to display an image, or small sidebar, beside a paragraph of text, use the <etleft> and <etright> tags. These are rendered as a one-row, two-column <table> wrapping the paragraph and the sidebar, as follows:

  <etleft><img src=bubba.png></etleft>This is the main
  paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.

  <etright><img src=bubba.png></etright>This is the
  main paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.


Links in EtText

As well as the standard <a href=url>...</a> link specification used in HTML, EtText will automatically turn URLs and email addresses that occur in the text into links. In addition, EtText supports its own link format, as follows.

The basic concept is of a word or "quoted set of words" followed by an optional link label in [square brackets], like this: "this is a link" [label].

The href used in the link is then defined at another point in the document, as an indented line like this:

  label: http://url

Text and markup can be enclosed in the quotes; everything quoted will become part of the link text. Single words or HTML tags do not need to be quoted, so

  <img src="/license_plate.jpg" width="10" height="10"> [homepage]

will work correctly.
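
The label mechanism described above can be sketched in two passes: collect the indented label definitions, then rewrite each piece of link text that is followed by a known [label]. This is a simplified illustration, not EtText's implementation; resolve_links is a hypothetical name, labels are assumed to be single words, definitions are assumed to look like a URL with a scheme, and http://example.org/ is a placeholder address.

```perl
use strict;
use warnings;

# Hypothetical helper: resolves "text" [label], <tag> [label] and
# word [label] against indented "label: url" definition lines.
sub resolve_links {
    my ($doc) = @_;
    my %href;
    # Pass 1: remember and remove definition lines such as "  home: http://...".
    $doc =~ s{^[ \t]+(\w+):[ \t]+([a-z]+:\S+)[ \t]*\n?}{ $href{$1} = $2; '' }gme;
    # Pass 2: link text is a quoted run, a single HTML tag, or a single word.
    # Unknown labels are left exactly as they were written.
    $doc =~ s{(?|"([^"]+)"|(<[^>]+>)|(\S+))[ \t]+\[(\w+)\]}
             { exists $href{$2} ? qq(<a href="$href{$2}">$1</a>) : $& }ge;
    return $doc;
}

print resolve_links(<<'END');
Visit "my home page" [home] today.

  home: http://example.org/
END
```

Checking for a whole HTML tag before falling back to a single word is what lets the unquoted <img> example above work: otherwise the last attribute of the tag would be mistaken for the link text.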

Glossary Links

EtText also supports a concept called glossary links; if you define a link, the name of that link will automatically become a link wherever it appears in quotes. For example:

  [Justin Mason]: http://jmason.org/

will mean that any occurrence of the name "Justin Mason", in quotes, in any EtText content chunk or file in the site, becomes a link to that address.

These links are stored in the WebMake cache file, if WebMake is being used. Alternatively, if you are using the Text::EtText modules yourself, provide an implementation of the Text::EtText::LinkGlossary interface to support this.

Quoted bits of text that do not map to an entry in the glossary are not converted to links (unless they're followed by a square-bracketed link-label reference).

More Convenient Links

In addition, if the link definition is preceded with Auto:, the quotes are not required, and any occurrence of the words in the link label, with or without quotes, will become a link.

  Auto: [WebMake]: http://webmake.taint.org/
  Auto: [any occurrence of the words]: http://webmake.taint.org/

URLs and Email Addresses

URLs, such as http://webmake.taint.org/ , and email addresses, such as jm@nospam-jmason.org, are automatically converted into links to that same address.

Blocking EtText Link Interpretation

To block interpretation as a link, replace square brackets with the HTML entities &etsqi; and &etsqo;, which map to [ and ] respectively; replace quote characters, ", with two apostrophes, ''. If that doesn't do the trick, wrap the entire section of text with the <!--etsafe-->...<!--/etsafe--> tags.


Similar Systems

EtText-like plain-text-to-markup conversion systems have a long history. The first time I came across the concept was with Setext, which was included with Tony Sanders' Plexus web server, back in September 1993. Yes, 1993. Setext has been around for a while!

WikiWikiWeb is quite a recent, well-established system which uses a similar markup style.

The real inspiration for EtText was Userland's Frontier; Dave Winer's evangelisation of its easily-editable markup system convinced me that it was worth polishing up the rudimentary EtText system I had then. In addition, the name "EtText" is derived from "Edit This Text", in a tip of the hat to Dave's "Edit This Page" concept.

Some well-known sites that use their own converters to convert plain-text to markup include http://www.blogger.com/, http://slashdot.org/ (for comments) and http://www.advogato.org/.

Jorn Barger maintains an impressive summary of etext formats at his Robot Wisdom site. Skip down to section 3, Internet etext standards, for the directly-relevant stuff.

Zope and ZWiki use a format called StructuredText, which again comes from WikiLand. There's some interesting work going on there with the STXDocument object, which is "a web-managable object that contains information marked up in the structured text format".


When HTML and EtText Collide

HTML tags can be used freely throughout an EtText document. However, in some situations, you may wish to preserve whitespace, avoid paragraph tags being added, etc.; to use your own HTML without meddling from EtText, wrap it in an <!--etsafe-->...<!--/etsafe--> tag pair; this will protect it.

Note that text blocks wrapped in <pre>, <listing> and <xmp> tags are automatically protected in this way; the <!--etsafe--> tag pair is not required.

EtText adds two entities, &etsqi; and &etsqo;. These represent [ and ] respectively, and are used to protect a square-bracketed piece of text from being interpreted as a link label (see Links in EtText above).

If this is insufficient, and you're using WebMake, the <safe> tag will escape any type of code to protect it from interpretation by WebMake, EtText or HTML.


Text::EtText::DefaultGlossary


NAME

Text::EtText::DefaultGlossary - default, non-persistent link glossary


SYNOPSIS


DESCRIPTION

The Text::EtText::DefaultGlossary is an implementation of Text::EtText::LinkGlossary which is used if no other implementation is registered.

It will not save glossary link details persistently.


METHODS


Text::EtText::EtText2HTML


NAME

Text::EtText::EtText2HTML - convert from the simple EtText editable-text format into HTML


SYNOPSIS

  use Text::EtText::EtText2HTML;

  my $t = new Text::EtText::EtText2HTML;
  print $t->text2html ($text);

or

  use Text::EtText::EtText2HTML;

  my $t = new Text::EtText::EtText2HTML;
  print $t->text2html ();               # from STDIN

DESCRIPTION

ettext2html will convert a text file in the EtText editable-text format into HTML.

For more information on the EtText format, check the WebMake documentation on the web at http://webmake.taint.org/ .


METHODS

$f = new Text::EtText::EtText2HTML

Constructs a new Text::EtText::EtText2HTML object.

$f->set_option ($optname, $optval);

Set an EtText option. (Options can also be set on the WebMake object itself, or from inside the WebMake file.) Currently supported options are:

EtTextOneCharMarkup (default: 0)

Allow one-character sets of asterisks etc. to mark up as strong, emphasis etc., instead of the default two-character markup.

EtTextBaseHref (default: '')

The base HREF to use for relative links. If set, all relative links in tags with HREF attributes will be rewritten as absolute links, making the output HTML independent of the URL tree structure.

EtTextHrefsRelativeToTop (default: 1)

Indicates that all EtText links are relative to the top of the WebMake document tree. This (obviously) is only relevant if you are using EtText in conjunction with WebMake. If set, all relative links in tags with HREF attributes will be rewritten as relative to the "top" of the WebMake site, making the output HTML independent of the URL tree structure.

$html = $f->set_glossary ($glosobj)

Provide a glossary for shared link definitions, allowing link definitions to be shared and reused across multiple EtText files. $glosobj must implement the interface defined by Text::EtText::LinkGlossary.

See below for more information on this interface.

$html = $f->text2html( [$text] )

Convert text, either from the argument or from STDIN, into HTML.
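
Putting the methods above together, a typical call sequence might look like the following. It is a sketch only: the option values, the sample text and the example.org address are made up for illustration, and the module is loaded at run time so that the script still runs (and says so) where Text::EtText is not installed. No output is shown because it depends on the installed version.

```perl
use strict;
use warnings;

# Load the converter at run time so that a missing installation produces a
# readable message instead of a compile-time failure.
my $have_ettext = eval { require Text::EtText::EtText2HTML; 1 };

if ($have_ettext) {
    my $t = Text::EtText::EtText2HTML->new;

    # Options must be set before converting. Values here are illustrative.
    $t->set_option ('EtTextOneCharMarkup', 1);
    $t->set_option ('EtTextBaseHref', 'http://www.example.org/docs/');

    # Sample EtText input: one-character markup plus a labelled link.
    my $text = "A *bold* word and a link [home].\n\n  home: index.html\n";
    print $t->text2html ($text);
} else {
    print "Text::EtText::EtText2HTML is not installed\n";
}
```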


MORE DOCUMENTATION

See also http://webmake.taint.org/ for more information.


SEE ALSO

webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText::EtText2HTML, Text::EtText::HTML2EtText, Text::EtText::LinkGlossary, Text::EtText::DefaultGlossary


AUTHOR

Justin Mason <jm /at/ jmason.org>


COPYRIGHT

WebMake is distributed under the terms of the GNU Public License.


AVAILABILITY

The latest version of this library is likely to be available from CPAN as well as:

  http://webmake.taint.org/

Text::EtText::HTML2EtText


NAME

Text::EtText::HTML2EtText - convert from HTML to the EtText editable-text format


SYNOPSIS

  use Text::EtText::HTML2EtText;

  my $t = new Text::EtText::HTML2EtText;
  print $t->html2text ($html);

or

  use Text::EtText::HTML2EtText;

  my $t = new Text::EtText::HTML2EtText;
  print $t->html2text ();                       # from STDIN

DESCRIPTION

ethtml2text will convert an HTML file into the EtText editable-text format, for use with webmake or ettext2html.

For more information on the EtText format, check the WebMake documentation on the web at http://webmake.taint.org/ .


METHODS

$f = new Text::EtText::HTML2EtText

Constructs a new Text::EtText::HTML2EtText object.

$text = $f->html2text( [$html] )

Convert HTML, either from the argument or from STDIN, into EtText.


MORE DOCUMENTATION

See also http://webmake.taint.org/ for more information.


SEE ALSO

webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText::EtText2HTML, Text::EtText::HTML2EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


COPYRIGHT

WebMake is distributed under the terms of the GNU Public License.


AVAILABILITY

The latest version of this library is likely to be available from CPAN as well as:

  http://webmake.taint.org/

Text::EtText::LinkGlossary


NAME

Text::EtText::LinkGlossary - interface for EtText link glossaries to implement.


SYNOPSIS

  use Text::EtText::LinkGlossary;

  @ISA = qw(Text::EtText::LinkGlossary);

  sub open { ... }
  sub close { ... }
  ...

DESCRIPTION

The Text::EtText::LinkGlossary is an interface which allows EtText to support "link glossaries", persistent collections of link text and its corresponding HREF.

The interface which needs to be implemented is as follows:


METHODS

$g->open()

Open the link glossary $g for reading and writing.

$g->close()

Close the link glossary; no more links can be written or read.

$url = $g->get_link ($name)

Get a named link from the glossary.

$g->put_link ($name, $url)

Put a named link to the glossary.

$url = $g->get_auto_link ($name)

Get a named automatic link from the glossary.

$g->put_auto_link ($name, $url)

Put a named automatic link to the glossary.

@keys = $g->get_auto_link_keys ()

Get a list of the names of automatic links stored in the glossary.

$g->add_auto_link_keys (@keys)

Add to the list of names of automatic links stored in the glossary.
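
As an illustration of the interface above, here is a minimal in-memory glossary that provides all eight methods. It is a sketch under assumptions rather than a reference implementation: the class name My::MemoryGlossary is hypothetical, nothing is persisted, and it deliberately does not inherit from Text::EtText::LinkGlossary so that it runs on its own (a real implementation would set @ISA as in the SYNOPSIS). Whether put_auto_link should also record the key is not stated above, so the sketch keeps the two calls separate.

```perl
package My::MemoryGlossary;   # hypothetical class name
use strict;
use warnings;

# A real implementation would load Text::EtText::LinkGlossary and add it to
# @ISA; that is left out here so the sketch has no dependencies.

sub new {
    my ($class) = @_;
    return bless { links => {}, auto => {}, auto_keys => [] }, $class;
}

sub open  { return 1; }   # nothing to open for an in-memory store
sub close { return 1; }   # nothing to flush either

sub get_link { my ($self, $name) = @_; return $self->{links}{$name}; }
sub put_link { my ($self, $name, $url) = @_; $self->{links}{$name} = $url; return; }

sub get_auto_link { my ($self, $name) = @_; return $self->{auto}{$name}; }
sub put_auto_link { my ($self, $name, $url) = @_; $self->{auto}{$name} = $url; return; }

sub get_auto_link_keys { my ($self) = @_; return @{ $self->{auto_keys} }; }
sub add_auto_link_keys { my ($self, @keys) = @_; push @{ $self->{auto_keys} }, @keys; return; }

package main;

my $g = My::MemoryGlossary->new;
$g->open;
$g->put_link ('Justin Mason', 'http://jmason.org/');
$g->put_auto_link ('WebMake', 'http://webmake.taint.org/');
$g->add_auto_link_keys ('WebMake');
print $g->get_link ('Justin Mason'), "\n";          # prints: http://jmason.org/
print join (',', $g->get_auto_link_keys), "\n";     # prints: WebMake
$g->close;
```

An object like this could then be handed to set_glossary; swapping the hashes for a DBM file or similar would be one way to make the links persistent.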


ethtml2text(1)


NAME

ethtml2text - convert from HTML to the EtText editable-text format


SYNOPSIS

  ethtml2text file.html > file.txt

DESCRIPTION

ethtml2text will convert an HTML file into the EtText editable-text format, for use with webmake or ettext2html.

For more information on the EtText format, check the WebMake documentation on the web at http://ettext.taint.org/ .


INSTALLATION

The ethtml2text command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.


ENVIRONMENT

No environment variables, aside from those used by perl, are required to be set.


SEE ALSO

webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


PREREQUISITES

HTML::Entities


ettext2html(1)


NAME

ettext2html - convert from the simple EtText editable-text format into HTML


SYNOPSIS

  ettext2html file.txt > file.html

DESCRIPTION

ettext2html will convert a text file in the EtText editable-text format into HTML.

For more information on the EtText format, check the WebMake documentation on the web at http://ettext.taint.org/ .


INSTALLATION

The ettext2html command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.


ENVIRONMENT

No environment variables, aside from those used by perl, are required to be set.


SEE ALSO

webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


PREREQUISITES

HTML::Entities

