AsTMa= Authoring Tutorial

Robert Barta Bond University

rho@bond.edu.au

Copyright © 2002 Robert Barta

AsTMa= is part of the AsTMa language family, which was designed to facilitate authoring, constraining and querying topic maps. This document provides an introduction to AsTMa= for Topic Map authors who already have some knowledge of XTM.

v1.0, 2002-07-10, Revision 1.0

Introduction

Since the stabilisation of XTM, an XML-based notation for Topic Maps, the interest in authoring Topic Maps has increased.

While the automatic generation of topic maps from backend databases into XTM can easily be achieved, manual authoring in XTM is tedious and error-prone. One option is to use XML-aware development tools, such as XML editors or XML IDEs. While feasible, generic XML editors offer little help beyond syntactical conformance. Another option is to use the integrated development environments for Topic Maps (server or client-side) that have appeared on the market.

AsTMa= is a linear, textual notation for Topic Map information. The motivation was to create a shorthand notation, in contrast to XTM, which is mainly suitable for interchange purposes. AsTMa= shares this motivation with comparable languages such as LTM.

In the tradition of Huffman-encoding languages AsTMa= has the following design objectives:

AsTMa design objectives

Minimum of effort:
A converter should be able to interpret the intention of the author in a specific context reducing the verbosity of the language (DWIM, do what I mean).
Minimal use of special characters and keywords:
Banning [(&^%$}] delimiters should increase the usability of the language. It also reduces the need to escape these special characters when they are part of the information.
Asymptotic with respect to XTM:
The language should not have a built-in syntax-sound-barrier making it impossible to reach the same expressiveness as XTM.
Keep things together:
The author should not be forced to split up (topic) information into several fragments, which have to be merged via TNC afterwards by a followup Topic Map processing stage.

At this stage AsTMa= does not fulfill all of the above objectives. As outlined in more detail in the section about conformance, AsTMa= is not (yet) as expressive as XTM, as that has not been a prime concern. Still, AsTMa= is sufficiently rich to prototype medium-sized topic maps.

The following assumes that the AsTMa= text will either be understood directly by a particular Topic Map processing software or that a specialized processor will first convert the AsTMa= text stream into an XTM stream.

Basics

AsTMa= is line oriented. This means that pertinent information is terminated with the end of the line. A single line containing

   filesystem (software)
already defines a topic (as explained below). If there is more to a topic (or an association) this information will be on follow-up lines:
   filesystem (software)
   bn: File System

An empty line thus separates items such as topics and associations. On any line, white-space can be used before and after keywords and special characters; it is silently ignored. Any line can also contain a comment, introduced by the character '#' (following a white-space character):

   filesystem (software) # more information will follow
Such comments will be discarded by any processor and are only for internal documentation purposes.

Please note that (starting from Rev1.5) such a comment must have at least one blank before the '#'. This allows the hassle-free notation of URIs containing a '#' and prevents the XPointer part from being interpreted as a comment. The blanks between the text and the '#' are ignored.
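
The comment rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the code of any actual AsTMa= processor; the function name strip_inline_comment is made up for this example.

```python
import re

# A comment starts at a '#' that is preceded by at least one
# white-space character; a '#' glued to other text (as in a URI
# fragment) does not start a comment.
_COMMENT = re.compile(r"\s+#.*$")


def strip_inline_comment(line: str) -> str:
    """Remove a trailing '#' comment together with the blanks before it.

    The line is expected without its trailing newline.
    """
    return _COMMENT.sub("", line)


if __name__ == "__main__":
    print(strip_inline_comment("filesystem (software) # more information will follow"))
    print(strip_inline_comment("oc: http://example.org/doc#section"))
```

Because the pattern demands white-space in front of the '#', a line that begins with '#' in the first column is left untouched by this function; such lines are the output comments described next.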

If you would like to have a comment in the processor output, then this comment MUST begin at the start of a line:

AsTMa=, followed by the equivalent XTM:
# I will survive and (hopefully) 
#     the line structure will not
#        be broken
<!--  I will survive and (hopefully)
          the line structure will not
             be broken -->

Comments on consecutive lines will be treated as one comment. Any non-comment line signals the end of such a group. Also, any '-->' occurrence within a comment will be converted into '--_ >' (one underscore character) to avoid obvious problems in the resulting XML code.
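
As a rough sketch, the grouping and the '-->' substitution could look as follows in Python. The function names are hypothetical, and the exact white-space a real processor emits inside the XML comment may differ.

```python
from itertools import groupby


def comments_to_xml(lines):
    """Turn a group of '#'-prefixed lines into one XML comment string."""
    # Drop the leading '#' of every line, keep the rest (including indentation).
    body = "\n".join(line[1:] for line in lines)
    # Neutralise a premature comment terminator as described above.
    body = body.replace("-->", "--_ >")
    return "<!--" + body + "-->"


def output_comments(lines):
    """Collect one XML comment per run of consecutive comment lines."""
    result = []
    for is_comment, group in groupby(lines, key=lambda l: l.startswith("#")):
        if is_comment:
            result.append(comments_to_xml(list(group)))
    return result


if __name__ == "__main__":
    source = ["# I will survive", "#   a --> b", "linux (os)", "# second group"]
    for comment in output_comments(source):
        print(comment)
```

Note that XML also forbids a plain '--' inside comments; the tutorial only describes the '-->' substitution, so that is all this sketch performs.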

Topics

The line

   reiserfs (filesystem)
indicates the definition of a topic with id reiserfs which is an instance of another topic, filesystem:

AsTMa=, followed by the equivalent XTM:
  reiserfs (filesystem)
<topic id="reiserfs">
   <instanceOf>
     <topicRef xlink:href="#filesystem"/>
   </instanceOf>
   <baseName>
     <baseNameString>reiserfs</baseNameString>
   </baseName>
</topic>

As we did not provide a base name, the topic id reiserfs is also assumed to be the topic's base name. While this heuristic approach works fine for some words, it does not necessarily do well with others, say,

   linux-distribution (software)

which would designate linux-distribution as being an instance of some software. Any AsTMa= processor is free to apply any kind of heuristics to derive a base name when none is provided explicitly, as for example:

AsTMa=, followed by the equivalent XTM:
   linux-distribution (software)
<topic id="linux-distribution">
  <instanceOf>
    <topicRef xlink:href="#software"/>
  </instanceOf>
  <baseName>
     <baseNameString>linux distribution</baseNameString>
  </baseName>
</topic>

Possible heuristics include substituting blanks for dashes, looking up third-party databases, or leaving the id as it is. Of course, the author can enforce a particular base name:

AsTMa=, followed by the equivalent XTM:
   RedHat-Linux-sparc (linux-distribution-port)
   bn: RedHat Linux for SPARC
<topic id="RedHat-Linux-sparc">
  <instanceOf>
    <topicRef 
      xlink:href="#linux-distribution-port"/>
  </instanceOf>
  <baseName>
     <baseNameString>RedHat Linux 
              for SPARC</baseNameString>
  </baseName>
</topic>
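
The fallback described earlier, deriving a base name from the topic id when no bn: line is given, might be sketched like this. Only the dash-to-blank heuristic is shown; the function name and its explicit parameter are assumptions made for this example, and a real processor is free to choose a different heuristic.

```python
from typing import Optional


def base_name_for(topic_id: str, explicit: Optional[str] = None) -> str:
    """Return the base name of a topic.

    An explicitly provided base name (from a bn: line) always wins;
    otherwise the topic id is used with dashes replaced by blanks.
    """
    if explicit is not None:
        return explicit
    return topic_id.replace("-", " ")


if __name__ == "__main__":
    print(base_name_for("reiserfs"))
    print(base_name_for("linux-distribution"))
    print(base_name_for("RedHat-Linux-sparc", "RedHat Linux for SPARC"))
```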

In a similar way, you can also add occurrences to topics:

AsTMa=, followed by the equivalent XTM:
  linux (os)
  bn: Linux kernel
  oc: http://www.kernel.org/
<topic id="linux">
  <instanceOf>
    <topicRef xlink:href="#os"/>
  </instanceOf>
  <baseName>
     <baseNameString>Linux kernel</baseNameString>
  </baseName>
  <occurrence>
    <resourceRef xlink:href="http://www.kernel.org/"/>
  </occurrence>
</topic>

The oc: line above adds a resource reference. Inline data (XTM resourceData) is added with an in: line:

AsTMa=, followed by the equivalent XTM:
  linux-port-on-sparc (linux-port)
  bn: SPARC Linux port
  oc: http://www.sparc.org/linux.shtml
  in: The kernel and kernel modules \
      are 64-bit on sparc64, \
      userland is still 32-bit, \
      and in fact the same as on sparc32.
<topic id="linux-port-on-sparc">
  <instanceOf>
    <topicRef xlink:href="#linux-port"/>
  </instanceOf>
  <baseName>
     <baseNameString>SPARC Linux 
                               port</baseNameString>
  </baseName>
  <occurrence>
    <resourceRef xlink:href="http:....linux.shtml"/>
  </occurrence>
  <occurrence>
    <resourceData>The kernel ...</resourceData>
  </occurrence>
</topic>
Any number of occurrences can be added.
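
The in: example above spreads one value over several lines by ending each line with a backslash. A minimal sketch of how such continuation lines could be joined before further processing is given below. It assumes that the pieces are joined with a single blank and that surrounding white-space is dropped; the tutorial does not spell out these details, and the function name is hypothetical.

```python
def join_continuations(lines):
    """Merge lines ending in a backslash with the line(s) that follow."""
    joined = []
    parts = []
    for line in lines:
        text = line.rstrip()
        if text.endswith("\\"):
            # Keep the text before the backslash and wait for more.
            parts.append(text[:-1].strip())
        else:
            parts.append(text.strip())
            joined.append(" ".join(parts))
            parts = []
    if parts:
        # The input ended with a dangling backslash.
        joined.append(" ".join(parts))
    return joined


if __name__ == "__main__":
    block = [
        "in: The kernel and kernel modules \\",
        "      are 64-bit on sparc64, \\",
        "      userland is still 32-bit, \\",
        "      and in fact the same as on sparc32.",
    ]
    print(join_continuations(block))
```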

Types and Scopes

If appropriate, you can also type topic characteristics:

AsTMa=, followed by the equivalent XTM:
reiserfs (filesystem)
bn: Reiser File System, ReiserFS
oc (download): \
     http://www.namesys.com/download.html
<topic id="reiserfs">
  <instanceOf>
    <topicRef xlink:href="#filesystem"/>
  </instanceOf>
  <baseName>
     <baseNameString>Reiser ....</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf>
       <topicRef xlink:href="#download"/>
    </instanceOf>
    <resourceRef xlink:href="http:...download.html"/>
  </occurrence>
</topic>

To scope a characteristic you use '@' to introduce a particular context:

AsTMa=, followed by the equivalent XTM:
RedHat-Linux-sparc (linux-distribution-port)
bn: RedHat Linux for SPARC
bn @ deutsch : RedHat Linux für SPARC
<topic id="RedHat-Linux-sparc">
  <instanceOf>
    <topicRef
        xlink:href="#linux-distribution-port"/>
  </instanceOf>
  <baseName>
     <baseNameString>
        RedHat Linux for SPARC
     </baseNameString>
  </baseName>
  <baseName>
     <scope>
        <topicRef xlink:href="#deutsch"/>
     </scope>
     <baseNameString>
        RedHat Linux für SPARC
     </baseNameString>
     </baseNameString>
  </baseName>
</topic>
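
To illustrate how the optional type in () and the optional scope after '@' fit into a characteristic line, here is a small regular-expression sketch. It assumes that continuation lines have already been joined and that a type, if present, precedes the scope; the tutorial does not state the latter, so treat it as an assumption. The function name is hypothetical.

```python
import re

# keyword, optional "(type)", optional "@ scope", then ':' and the value
_CHARACTERISTIC = re.compile(
    r"^\s*(?P<kind>bn|oc|in|sin)"
    r"\s*(?:\(\s*(?P<type>[^)\s]+)\s*\))?"
    r"\s*(?:@\s*(?P<scope>[^\s:]+))?"
    r"\s*:\s*(?P<value>.*?)\s*$"
)


def parse_characteristic(line):
    """Split a characteristic line into kind, type, scope and value.

    Returns None if the line is not a characteristic line.
    """
    match = _CHARACTERISTIC.match(line)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(parse_characteristic("oc (download): http://www.namesys.com/download.html"))
    print(parse_characteristic("bn @ deutsch : RedHat Linux"))
```

Only the first ':' after the keyword part acts as the separator, so colons inside a URL value are left alone.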

Associations

Associations may or may not have a particular association type. This topic type is provided inside a () pair.

(kernel-patch-provides-feature)
...
If the association has no explicit type, the type can be omitted by writing only ().

Associations also have a number of members playing roles:
AsTMa=, followed by the equivalent XTM:
(kernel-patch-provides-feature)
feature: reiserfs
platform: i386
patch:   generic-reiserfs-patch-2.4.x
<association>
  <instanceOf>
    <topicRef xlink:href="#kernel-patch-provides-feature"/>
  </instanceOf>
  <member>
     <roleSpec>
       <topicRef xlink:href="#feature"/>
     </roleSpec>
     <topicRef xlink:href="#reiserfs"/>
  </member>
  <member>
     <roleSpec>
       <topicRef xlink:href="#platform"/>
     </roleSpec>
     <topicRef xlink:href="#i386"/>
  </member>
  <member>
     <roleSpec>
       <topicRef xlink:href="#patch"/>
     </roleSpec>
     <topicRef xlink:href="#generic-reiserfs-patch-2.4.x"/>
  </member>
</association>

For better readability you may want to indent the roles:

  (kernel-patch-provides-feature)
      feature: reiserfs
      platform: i386
      patch:   generic-reiserfs-patch-2.4.x
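
Since leading white-space is ignored, both layouts above describe the same association. A small sketch of reading such a block into its type and its role/player pairs follows. The function name is made up, and the sketch handles only the plain form shown so far: a header consisting of nothing but the () pair, followed by role lines.

```python
def parse_association(block: str):
    """Return (type, members) for a simple association block.

    type is None for an untyped association written as "()";
    members is a list of (role, player) tuples in source order.
    """
    lines = [line.strip() for line in block.strip().splitlines()]
    header = lines[0]
    if not (header.startswith("(") and header.endswith(")")):
        raise ValueError("association must start with a () pair")
    assoc_type = header[1:-1].strip() or None
    members = []
    for line in lines[1:]:
        # Split at the first ':' only; the player follows the role.
        role, player = line.split(":", 1)
        members.append((role.strip(), player.strip()))
    return assoc_type, members


if __name__ == "__main__":
    text = """
    (kernel-patch-provides-feature)
        feature: reiserfs
        platform: i386
        patch:   generic-reiserfs-patch-2.4.x
    """
    print(parse_association(text))
```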

Identification and Reification

Topics are said to reify subjects. A topic in a topic map is either a representative of the subject, which is the case if the subject itself is not directly addressable, or, if the subject is addressable, the topic can directly and unambiguously name the subject it embodies via a URI.

In case a subject indicator (a not necessarily unique identification for a particular subject) is known, it can be provided via sin:
AsTMa=, followed by the equivalent XTM:
linux (os)
bn: Linux kernel
oc: http://www.kernel.org/
sin: http://dmoz.org/.../Linux/
<topic id="linux">
  <instanceOf>
    <topicRef xlink:href="#os"/>
  </instanceOf>
  <baseName>
     <baseNameString>Linux kernel</baseNameString>
  </baseName>
  <occurrence>
    <resourceRef xlink:href="http://www.kernel.org/"/>
  </occurrence>
  <subjectIdentity>
     <subjectIndicatorRef 
         xlink:href="http://dmoz.org/.../Linux/"/>
  </subjectIdentity>
</topic>
Several such subject indicators can be provided for a single topic. If the indicator string provided contains a URI scheme, then AsTMa= assumes a reference to an (external) subject indicator. Otherwise, AsTMa= will assume this to be a reference to a local topic (topicRef):
AsTMa=, followed by the equivalent XTM:
linux (os)
bn: Linux kernel
...
sin: http://dmoz.org/.../Linux/
sin: linux-os
<topic id="linux">
  <instanceOf>
    <topicRef xlink:href="#os"/>
  </instanceOf>
  <baseName>
     <baseNameString>Linux kernel</baseNameString>
  </baseName>
  ...
  <subjectIdentity>
     <subjectIndicatorRef 
         xlink:href="http://dmoz.org/.../Linux/"/>
     <topicRef 
         xlink:href="#linux-os"/>
  </subjectIdentity>
</topic>
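
The distinction between external subject indicators and local topic references can be sketched as a simple test for a leading URI scheme. The scheme pattern below follows the generic URI syntax (a letter followed by letters, digits, '+', '.' or '-', then ':'); how a particular AsTMa= processor actually detects a scheme is not specified here, so this is an assumption, as is the function name.

```python
import re

# Generic URI scheme: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
_URI_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def subject_identity_ref(value: str):
    """Map the value of a sin: line to an XTM element name and href."""
    value = value.strip()
    if _URI_SCHEME.match(value):
        # Looks like a full URI: external subject indicator.
        return ("subjectIndicatorRef", value)
    # No scheme: treat the value as the id of a local topic.
    return ("topicRef", "#" + value)


if __name__ == "__main__":
    print(subject_identity_ref("http://dmoz.org/.../Linux/"))
    print(subject_identity_ref("linux-os"))
```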

In the case where the topic is the subject in question, we can use the AsTMa= reifies clause:

AsTMa=, followed by the equivalent XTM:
linux-kernel-site (web-site) reifies http://www.linux.org/
bn: Linux kernel Site
...
<topic id="linux-kernel-site">
  <instanceOf>
    <topicRef xlink:href="#web-site"/>
  </instanceOf>
  <baseName>
     <baseNameString>Linux kernel Site</baseNameString>
  </baseName>
  ...
  <subjectIdentity>
     <resourceRef 
         xlink:href="http://www.linux.org/"/>
  </subjectIdentity>
</topic>
The subject provided can be external (by providing a full URI) or can also be a local resource, such as another topic or an association. It is clear from the syntax that only a single such resource reference can be specified.

On some occasions it is more convenient to use the reification the other way round, as in
AsTMa=, followed by the equivalent XTM:
linux-org (web-site) is-reified-by linux-kernel-site
oc: http://www.linux.org/

linux-kernel-site
bn: Linux kernel Site
<topic id="linux-org">
  <instanceOf>
    <topicRef xlink:href="#web-site"/>
  </instanceOf>
  <baseName>
     <baseNameString>linux org</baseNameString>
  </baseName>
</topic>

<topic id="linux-kernel-site">
  <baseName>
     <baseNameString>Linux kernel Site</baseNameString>
  </baseName>
  <subjectIdentity>
     <resourceRef 
         xlink:href="#linux-org"/>
  </subjectIdentity>
</topic>
While a topic-topic reification might be of limited use, reifying associations helps to create a topic about the association statement:

(threatens) is-reified-by threat-1
victim: it-market
threat: linux

threat-1 (threat)
bn: Linux threatens the IT market

(claims)
claimant: mirkosoft
claim: threat-1

Topic Maps

There is no special format or syntax for an AsTMa= Topic Map instance. All text blocks within the document are regarded as part of the map.

Optionally you can control the name (id) of the Topic Map. This, though, might only be relevant to your local topic map processor, so there is no counterpart of this in XTM. If you do so, the very first non-empty line within the document MUST provide this name (identifier) of the topic map itself:
AsTMa=, followed by the equivalent XTM:
sparclinux : iso-8859-1
<?xml version="1.0" encoding="iso-8859-1"?>
<topicMap id="sparclinux"
          xmlns       = 'http://www.topicmaps.org/xtm/1.0/'
          xmlns:xlink = 'http://www.w3.org/1999/xlink'>

As you can see, you may also specify a particular encoding. If omitted, the encoding defaults to iso-8859-1.
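
A minimal sketch of reading such a header line, including the default encoding, is shown below. It only splits a line that is already known to be the header; how a processor tells a header apart from an ordinary topic line is not covered in this tutorial, and the function name is hypothetical.

```python
def parse_map_header(line: str, default_encoding: str = "iso-8859-1"):
    """Split a 'name : encoding' header line into (name, encoding).

    If no encoding is given, the default encoding is returned.
    """
    name, _sep, encoding = line.partition(":")
    return name.strip(), (encoding.strip() or default_encoding)


if __name__ == "__main__":
    print(parse_map_header("sparclinux : iso-8859-1"))
    print(parse_map_header("sparclinux"))
```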

Any AsTMa= implementation may also provide special commands or syntactical forms to control other processing aspects of your map. Please consult the appropriate documentation.