java.lang.Object | +--gnu.iou.sgmlp
SGML Lexer splits document text string into lines and columns delimited by tags.
Static function
The `parse' function returns a two dimensional array. The first dimension is lines, the second we'll sometimes call "columns" -- although each line can have an irregular number of columns (zero or more second dimension elements per line).
This is less than an SGML tokenizer: it is a pre-tokenizer that preserves lines and quoted attributes, and creates arrays within lines for tags and non-tags.
Each element of a line is checked for starting or ending with a less-than ("<") or greater-than (">") character to see whether it is a tag, or part of a tag. Tags can span multiple lines, so an opening less-than (LT) character may not be closed by a greater-than (GT) character until the next line.
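The result shape described above can be illustrated with a small stand-alone sketch. It is not the sgmlp implementation: the class and method names (`LineColumnSketch`, `split`, `isTagPart`) are hypothetical, and the splitter is deliberately naive. It ignores quoting and treats any ">" as the end of a column, which the real lexer is documented not to do for quoted attributes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the documented result shape, not the sgmlp code. */
final class LineColumnSketch {

    /** Splits text into lines, then each line into tag and non-tag columns. */
    static String[][] split(String text) {
        // Accept CRLF or LF newlines; keep trailing empty lines.
        String[] lines = text.split("\r?\n", -1);
        String[][] result = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            List<String> cols = new ArrayList<>();
            StringBuilder buf = new StringBuilder();
            for (char c : lines[i].toCharArray()) {
                if (c == '<' && buf.length() > 0) {
                    // A tag opens: flush the text collected so far.
                    cols.add(buf.toString());
                    buf.setLength(0);
                }
                buf.append(c);
                if (c == '>') {
                    // A tag closes: flush it as its own column.
                    cols.add(buf.toString());
                    buf.setLength(0);
                }
            }
            if (buf.length() > 0) {
                cols.add(buf.toString());
            }
            // An empty line yields zero columns.
            result[i] = cols.toArray(new String[0]);
        }
        return result;
    }

    /** True if the column starts a tag, ends one, or both. */
    static boolean isTagPart(String col) {
        return col.startsWith("<") || col.endsWith(">");
    }
}
```

With this sketch, a tag split across two lines shows up as one column starting with "<" on the first line and one column ending with ">" on the next, which is why each column is tested at both ends.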
The `parseValidate' functions (parseValidateClient and parseValidateServer) return a non-null result only when the source text contains a "< - >" valid document possessing at least one "< - >" tag.
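The validation idea, together with the `blindquotes' flag documented for `parse', can be sketched as a scan that looks for at least one "<" ... ">" pair and optionally skips quoted text. This is a guess at the behavior from the descriptions on this page, not the actual code. The class and method names (`QuoteScanSketch`, `containsTag`) are hypothetical, and the quote handling is simplistic: a lone apostrophe in ordinary text would be treated as an opening quote.

```java
/** Illustrative only: a guess at validate/blindquotes, not the sgmlp code. */
final class QuoteScanSketch {

    /**
     * Returns true if a '<' followed later by a '>' is found.
     * When blindquotes is true, everything inside single or double
     * quotes is ignored.
     */
    static boolean containsTag(String text, boolean blindquotes) {
        char quote = 0;        // active quote character, or 0 outside quotes
        boolean open = false;  // true once '<' was seen without its '>'
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (blindquotes) {
                if (quote != 0) {
                    if (c == quote) {
                        quote = 0;   // closing quote
                    }
                    continue;        // skip everything inside quotes
                }
                if (c == '"' || c == '\'') {
                    quote = c;       // opening quote
                    continue;
                }
            }
            if (c == '<') {
                open = true;
            } else if (c == '>' && open) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, a caller would return null when `containsTag` is false. Passing `true` for `blindquotes` presumably corresponds to the "client-side" variant that ignores quoted text, and `false` to the "server-side" variant that goes into quotes.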
Field Summary | |
static java.lang.String | EQ | Interned "="
Constructor Summary | |
sgmlp()
Method Summary | |
static void | main(java.lang.String[] argv)
static java.lang.String[][] | parse(java.io.InputStream in) | Parse text string into lines and columns.
static java.lang.String[][] | parse(java.io.InputStream in, boolean validate, boolean blindquotes)
static java.lang.String[][] | parse(java.lang.String text) | Parse text string into lines and columns.
static java.lang.String[][] | parse(java.lang.String text, boolean validate, boolean blindquotes)
static java.lang.String[][] | parseValidateClient(java.io.InputStream in) | Parse as a template, returning null if there are no template objects in the text.
static java.lang.String[][] | parseValidateClient(java.lang.String text) | Parse as a template, returning null if there are no template objects in the text.
static java.lang.String[][] | parseValidateServer(java.io.InputStream in) | Parse as a template, returning null if there are no template objects in the text.
static java.lang.String[][] | parseValidateServer(java.lang.String text) | Parse as a template, returning null if there are no template objects in the text.
static java.lang.String[] | tag_tokenizer(java.lang.String tag) | Split a tag into tokens according to tag syntax, preserving quoted attribute values, stripping leading SGML tag "start" and "end" ('<', '>') characters.
static java.lang.String | trim_value(java.lang.String att_value) | Normalize an SGML tag attribute value, stripping symmetric quotes, returning null for empty strings.
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
public static final java.lang.String EQ
Constructor Detail |
public sgmlp()
Method Detail |
public static final java.lang.String[][] parseValidateServer(java.lang.String text)
"Server-side" parse validation goes into quotes.
public static final java.lang.String[][] parseValidateClient(java.lang.String text)
"Client-side" parse validation ignores everything in quotes.
public static final java.lang.String[][] parse(java.lang.String text)
text - Text string
public static final java.lang.String[][] parseValidateServer(java.io.InputStream in)
"Server-side" parse validation goes into quotes.
public static final java.lang.String[][] parseValidateClient(java.io.InputStream in)
"Client-side" parse validation ignores everything in quotes.
public static final java.lang.String[][] parse(java.io.InputStream in)
in - Stream supplying the text string
public static final java.lang.String[][] parse(java.lang.String text, boolean validate, boolean blindquotes)
text - Source with CRLF or LF newlines
validate - If source doesn't contain <sgml tokens>, return null.
blindquotes - Whether anything in quotes is blindly ignored.
public static final java.lang.String[][] parse(java.io.InputStream in, boolean validate, boolean blindquotes)
in - Source
validate - If source doesn't contain <sgml tokens>, return null.
blindquotes - Whether anything in quotes is blindly ignored.
public static final java.lang.String[] tag_tokenizer(java.lang.String tag)
Tag attributes are guaranteed as three tokens, as available: name string, equals character string, value string. The equals string is "interned" so that, in processing the result, each element can be compared to a similarly interned string ("=".intern()) using Java's reference-equality ("==") operator.
Returned tokens are all trimmed: no leading or trailing whitespace.
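The interning guarantee can be illustrated with a short stand-alone sketch. It does not call sgmlp: `InternedEqSketch`, its `EQ` constant, `splitAttribute` and `valueAfterEq` are hypothetical stand-ins, and the splitter handles only a single unquoted name=value pair.

```java
/** Illustrative only: shows why an interned "=" allows the == operator. */
final class InternedEqSketch {

    /** Stand-in for the documented EQ field: an interned "=". */
    static final String EQ = "=".intern();

    /** Naive split of one unquoted name=value attribute into three tokens. */
    static String[] splitAttribute(String attr) {
        int i = attr.indexOf('=');
        if (i < 0) {
            return new String[] { attr.trim() };
        }
        return new String[] {
            attr.substring(0, i).trim(), EQ, attr.substring(i + 1).trim()
        };
    }

    /** Returns the token following the interned equals token, or null. */
    static String valueAfterEq(String[] tokens) {
        for (int i = 0; i + 1 < tokens.length; i++) {
            // Reference comparison works only because EQ is interned.
            if (tokens[i] == EQ) {
                return tokens[i + 1];
            }
        }
        return null;
    }
}
```

The reference comparison is cheap, but it fails for any "=" string that was built at run time and never interned, so callers who construct their own tokens would need `equals` instead.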
tag - A whole or part of an SGML tag. Ignores anything before the tag open character ('<').
public static final java.lang.String trim_value(java.lang.String att_value)
att_value - Must not include leading or trailing whitespace, or otherwise be a string other than a bare tag attribute value.
public static void main(java.lang.String[] argv)
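The normalization documented for trim_value (stripping symmetric quotes, returning null for empty strings) can be sketched as follows. This is a minimal sketch based only on that one-line description. The name `TrimValueSketch.trimValue` is hypothetical, and returning null when only a pair of quotes was supplied is an assumption.

```java
/** Illustrative only: one reading of the trim_value description. */
final class TrimValueSketch {

    /**
     * Strips one pair of matching single or double quotes and
     * returns null for null or empty input.
     */
    static String trimValue(String attValue) {
        if (attValue == null || attValue.isEmpty()) {
            return null;
        }
        int last = attValue.length() - 1;
        char first = attValue.charAt(0);
        boolean quoted = last > 0
                && (first == '"' || first == '\'')
                && attValue.charAt(last) == first;
        if (quoted) {
            // Symmetric quotes: drop the first and last character.
            attValue = attValue.substring(1, last);
        }
        // Assumption: a value that is empty after stripping is also null.
        return attValue.isEmpty() ? null : attValue;
    }
}
```

Mismatched quotes are left untouched in this sketch, since only symmetric quotes are described as being stripped.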