gnu.xml
Class XMLFilter

java.lang.Object
  extended by gnu.xml.XMLFilter
All Implemented Interfaces:
Consumer, PositionConsumer, XConsumer, SourceLocator, org.xml.sax.ContentHandler, org.xml.sax.DocumentHandler, org.xml.sax.Locator

public class XMLFilter
extends java.lang.Object
implements org.xml.sax.DocumentHandler, org.xml.sax.ContentHandler, SourceLocator, XConsumer, PositionConsumer

Fix up XML input events. Handles namespace resolution and adds "namespace nodes" if needed. Also performs various error checks. This wrapper should be used when creating a NodeTree, as is done for XQuery node constructor expressions. It can also be called directly from XMLParser, in which case a slightly lower-level interface is used that passes char array segments rather than Strings. This avoids duplicate String allocation and interning. The combination XMLParser+XMLFilter+NodeTree makes for a fast and compact way to read an XML file into a DOM.


Field Summary
static int COPY_NAMESPACES_INHERIT
           
static int COPY_NAMESPACES_PRESERVE
           
 int copyNamespacesMode
           
protected  int ignoringLevel
          Positive if all output should be ignored.
 boolean namespacePrefixes
          True if namespace declarations should be passed through as attributes.
protected  int nesting
          Twice the number of active startElement and startDocument calls.
 Consumer out
          The specified target Consumer that accepts the output.
protected  int stringizingElementNesting
          Value of nesting just before outermost startElement while stringizingLevel > 0.
protected  int stringizingLevel
          If stringizingLevel > 0 then stringize rather than copy nodes.
 
Constructor Summary
XMLFilter(Consumer out)
           
 
Method Summary
 void beginEntity(java.lang.Object baseUri)
           
 void characters(char[] ch, int start, int length)
           
protected  void checkValidComment(char[] chars, int offset, int length)
           
protected  boolean checkWriteAtomic()
           
 void commentFromParser(char[] chars, int start, int length)
          Process a comment, when called from an XML parser.
 void consume(SeqPosition position)
          Consume node at current position.
static java.lang.String duplicateAttributeMessage(Symbol attrSymbol, java.lang.Object elementName)
           
 void emitCharacterReference(int value, char[] name, int start, int length)
          Process a character entity reference.
 void emitDoctypeDecl(char[] buffer, int target, int tlength, int data, int dlength)
          Process a DOCTYPE declaration.
 void emitEndAttributes()
          Process the end of a start tag.
 void emitEndElement(char[] data, int start, int length)
          Process an end tag.
 void emitEntityReference(char[] name, int start, int length)
          Process an entity reference.
 void emitStartAttribute(char[] data, int start, int count)
          Process an attribute, with the given attribute name.
 void emitStartElement(char[] data, int start, int count)
          Process a start tag, with the given element name.
 void endAttribute()
          End of an attribute or end of an actual parameter.
 void endDocument()
           
 void endElement()
           
 void endElement(java.lang.String name)
           
 void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
           
 void endEntity()
           
 void endPrefixMapping(java.lang.String prefix)
           
 void error(char severity, java.lang.String message)
           
 NamespaceBinding findNamespaceBinding(java.lang.String prefix, java.lang.String uri, NamespaceBinding oldBindings)
          Functionally equivalent to new NamespaceBinding(prefix, uri, oldBindings), but uses "hash consing".
 int getColumnNumber()
          Return current column number.
 java.lang.String getFileName()
          Normally same as getSystemId.
 int getLineNumber()
          Return current line number.
 java.lang.String getPublicId()
           
 java.lang.String getSystemId()
           
 void ignorableWhitespace(char[] ch, int start, int length)
           
 boolean ignoring()
          True if consumer is ignoring rest of element.
 boolean isStableSourceLocation()
          True if position is unlikely to change.
 void linefeedFromParser()
           
 gnu.xml.MappingInfo lookupNamespaceBinding(java.lang.String prefix, char[] uriChars, int uriStart, int uriLength, int uriHash, NamespaceBinding oldBindings)
          Return a MappingInfo with matching namespaces.
 void processingInstruction(java.lang.String target, java.lang.String data)
           
 void processingInstructionFromParser(char[] buffer, int tstart, int tlength, int dstart, int dlength)
          Process a processing instruction.
 void setDocumentLocator(org.xml.sax.Locator locator)
           
 void setMessages(SourceMessages messages)
           
 void setSourceLocator(LineBufferedReader in)
           
 void setSourceLocator(SourceLocator locator)
           
 void skippedEntity(java.lang.String name)
           
 void startAttribute(java.lang.Object attrType)
          Write an attribute for the current element.
 void startDocument()
           
 void startElement(java.lang.Object type)
           
 void startElement(java.lang.String name, org.xml.sax.AttributeList atts)
           
 void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts)
           
protected  void startElementCommon()
           
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
           
 void textFromParser(char[] data, int start, int length)
           
 void write(char[] data, int start, int length)
          Process raw text.
 void write(java.lang.CharSequence str, int start, int length)
           
 void write(int v)
           
 void write(java.lang.String str)
           
 void writeBoolean(boolean v)
           
 void writeCDATA(char[] data, int start, int length)
          Process a CDATA section.
 void writeComment(char[] chars, int start, int length)
          Process a comment.
 void writeDocumentUri(java.lang.Object uri)
           
 void writeDouble(double v)
           
 void writeFloat(float v)
           
 void writeInt(int v)
           
protected  void writeJoiner()
           
 void writeLong(long v)
           
 void writeObject(java.lang.Object v)
          If v is a node, make a copy of it.
 void writePosition(AbstractSequence seq, int ipos)
          Consume a single position pair.
 void writeProcessingInstruction(java.lang.String target, char[] content, int offset, int length)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

out

public Consumer out
The specified target Consumer that accepts the output. In contrast, base may be either ==out or ==tlist.


COPY_NAMESPACES_PRESERVE

public static final int COPY_NAMESPACES_PRESERVE
See Also:
Constant Field Values

COPY_NAMESPACES_INHERIT

public static final int COPY_NAMESPACES_INHERIT
See Also:
Constant Field Values

copyNamespacesMode

public transient int copyNamespacesMode

nesting

protected int nesting
Twice the number of active startElement and startDocument calls.


stringizingLevel

protected int stringizingLevel
If stringizingLevel > 0 then stringize rather than copy nodes. It counts the number of nested startAttributes that are active. (In the future it should also count begun comment and processing-instruction constructors, when those support nesting.)


stringizingElementNesting

protected int stringizingElementNesting
Value of nesting just before outermost startElement while stringizingLevel > 0. I.e. if we're nested inside an element nested inside an attribute then stringizingElementNesting >= 0, otherwise stringizingElementNesting == -1.


ignoringLevel

protected int ignoringLevel
Positive if all output should be ignored. This happens if we're inside an attribute value inside an element which is stringized because it is in turn inside an outer attribute. Phew. It gets incremented by nested attributes so we can tell when to stop.


namespacePrefixes

public boolean namespacePrefixes
True if namespace declarations should be passed through as attributes. Like SAX2's http://xml.org/features/namespace-prefixes.
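
The SAX2 feature mentioned here can be observed with the JDK's own SAX parser. The sketch below does not use XMLFilter; NamespacePrefixesDemo and attributeQNames are hypothetical names, and the code only shows what "passed through as attributes" means on the SAX side: with the feature enabled, xmlns declarations show up in the attribute list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of gnu.xml.
class NamespacePrefixesDemo {
    // Collects the qualified names of all attributes reported by the parser.
    static List<String> attributeQNames(String xml, boolean namespacePrefixes) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature("http://xml.org/features/namespace-prefixes",
                              namespacePrefixes);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local,
                                         String qName, Attributes atts) {
                    for (int i = 0; i < atts.getLength(); i++) {
                        names.add(atts.getQName(i));
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

With the feature off, only ordinary attributes are reported; with it on, the declaration xmlns:p appears alongside them.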

Constructor Detail

XMLFilter

public XMLFilter(Consumer out)
Method Detail

setSourceLocator

public void setSourceLocator(LineBufferedReader in)

setSourceLocator

public void setSourceLocator(SourceLocator locator)

setMessages

public void setMessages(SourceMessages messages)

findNamespaceBinding

public NamespaceBinding findNamespaceBinding(java.lang.String prefix,
                                             java.lang.String uri,
                                             NamespaceBinding oldBindings)
Functionally equivalent to new NamespaceBinding(prefix, uri, oldBindings), but uses "hash consing".
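
"Hash consing" means that structurally equal values are looked up in a table and shared instead of being allocated again. The following is a minimal sketch of the idea, not the actual implementation: Binding and BindingTable are hypothetical stand-ins for NamespaceBinding and whatever table XMLFilter keeps internally.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NamespaceBinding: an immutable linked list
// of prefix/URI pairs. It does not override equals, so it compares by identity.
final class Binding {
    final String prefix;
    final String uri;
    final Binding next;

    Binding(String prefix, String uri, Binding next) {
        this.prefix = prefix;
        this.uri = uri;
        this.next = next;
    }
}

// Hypothetical intern table: one shared Binding per (prefix, uri, next) triple.
class BindingTable {
    private final Map<List<Object>, Binding> table = new HashMap<>();

    Binding find(String prefix, String uri, Binding oldBindings) {
        // The key compares prefix and uri by value and oldBindings by identity.
        List<Object> key = Arrays.asList(prefix, uri, oldBindings);
        return table.computeIfAbsent(
            key, k -> new Binding(prefix, uri, oldBindings));
    }
}
```

Because equal requests return the same object, callers can compare bindings with == and repeated declarations cost no extra allocation.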


lookupNamespaceBinding

public gnu.xml.MappingInfo lookupNamespaceBinding(java.lang.String prefix,
                                                  char[] uriChars,
                                                  int uriStart,
                                                  int uriLength,
                                                  int uriHash,
                                                  NamespaceBinding oldBindings)
Return a MappingInfo with matching namespaces. Specifically, return a MappingInfo info such that info.namespaces is equal to new NamespaceBinding(prefix, uri, oldBindings), where uri is new String(uriChars, uriStart, uriLength).intern().
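
Passing the URI as a char segment plus a precomputed hash lets a lookup proceed without first building a String. The sketch below illustrates that idea with a hypothetical CharSegments helper. It assumes the hash is computed like String.hashCode; the documentation above does not say which hash function uriHash actually uses.

```java
// Hypothetical helper, not part of gnu.xml.
class CharSegments {
    // Hash of buf[start .. start+length), computed like String.hashCode
    // (an assumption about uriHash, not a documented fact).
    static int hash(char[] buf, int start, int length) {
        int h = 0;
        for (int i = 0; i < length; i++) {
            h = 31 * h + buf[start + i];
        }
        return h;
    }

    // True if s has exactly the characters of the segment; allocates nothing.
    static boolean matches(String s, char[] buf, int start, int length) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (s.charAt(i) != buf[start + i]) {
                return false;
            }
        }
        return true;
    }
}
```

A table keyed this way only needs to create and intern a String the first time a given URI is seen.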


endAttribute

public void endAttribute()
Description copied from interface: Consumer
End of an attribute or end of an actual parameter. The former use matches a startAttribute; the latter may not, and can be used to separate parameters in a parameter list. This double duty suggests the method should at least be renamed.

Specified by:
endAttribute in interface Consumer

checkWriteAtomic

protected boolean checkWriteAtomic()

write

public void write(int v)
Specified by:
write in interface Consumer

writeBoolean

public void writeBoolean(boolean v)
Specified by:
writeBoolean in interface Consumer

writeFloat

public void writeFloat(float v)
Specified by:
writeFloat in interface Consumer

writeDouble

public void writeDouble(double v)
Specified by:
writeDouble in interface Consumer

writeInt

public void writeInt(int v)
Specified by:
writeInt in interface Consumer

writeLong

public void writeLong(long v)
Specified by:
writeLong in interface Consumer

writeDocumentUri

public void writeDocumentUri(java.lang.Object uri)

consume

public void consume(SeqPosition position)
Description copied from interface: PositionConsumer
Consume node at current position. The caller may invalidate or change the position after consume returns, so if the consumer wants to save it, it needs to copy it.

Specified by:
consume in interface PositionConsumer
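
The copying requirement can be illustrated with a small sketch. Cursor and SavingConsumer are hypothetical types standing in for SeqPosition and a PositionConsumer; the point is only that a consumer which wants to keep a position must copy it, because the caller may reuse the same object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable position, standing in for SeqPosition.
class Cursor {
    int index;

    Cursor(int index) {
        this.index = index;
    }

    Cursor copy() {
        return new Cursor(index);
    }
}

// Hypothetical consumer that remembers every position it is given.
class SavingConsumer {
    final List<Cursor> saved = new ArrayList<>();

    void consume(Cursor position) {
        // Store a copy: the caller may change 'position' after we return.
        saved.add(position.copy());
    }
}
```

Storing the argument itself instead of a copy would leave every saved entry pointing at whatever the caller's cursor holds last.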

writePosition

public void writePosition(AbstractSequence seq,
                          int ipos)
Description copied from interface: PositionConsumer
Consume a single position pair. This PositionConsumer may assume the sequence does no reference management; i.e. that copyPos is trivial and releasePos is a no-op. If that is not the case, use consume(TreePosition) instead.

Specified by:
writePosition in interface PositionConsumer

writeObject

public void writeObject(java.lang.Object v)
If v is a node, make a copy of it.

Specified by:
writeObject in interface Consumer

write

public void write(char[] data,
                  int start,
                  int length)
Process raw text.

Specified by:
write in interface Consumer

write

public void write(java.lang.String str)
Specified by:
write in interface Consumer

write

public void write(java.lang.CharSequence str,
                  int start,
                  int length)
Specified by:
write in interface Consumer

linefeedFromParser

public void linefeedFromParser()

textFromParser

public void textFromParser(char[] data,
                           int start,
                           int length)

writeJoiner

protected void writeJoiner()

writeCDATA

public void writeCDATA(char[] data,
                       int start,
                       int length)
Process a CDATA section. The data (starting at start, for length chars) does not include the delimiters (i.e. "<![CDATA[" and "]]>" are excluded).

Specified by:
writeCDATA in interface XConsumer
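
As an illustration of the (data, start, length) convention, the sketch below locates the content of a CDATA section inside a larger buffer, so that the delimiters are excluded from the reported segment. CdataSegment is a hypothetical helper, not part of gnu.xml.

```java
// Hypothetical helper, not part of gnu.xml.
class CdataSegment {
    static final String OPEN = "<![CDATA[";
    static final String CLOSE = "]]>";

    // Returns {start, length} of the content of the first CDATA section
    // in buf, or null if no complete section is present.
    static int[] content(char[] buf) {
        String s = new String(buf);
        int open = s.indexOf(OPEN);
        if (open < 0) {
            return null;
        }
        int start = open + OPEN.length();
        int end = s.indexOf(CLOSE, start);
        if (end < 0) {
            return null;
        }
        return new int[] { start, end - start };
    }
}
```

A caller would then pass the buffer together with the returned start and length, rather than a substring.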

startElementCommon

protected void startElementCommon()

emitStartElement

public void emitStartElement(char[] data,
                             int start,
                             int count)
Process a start tag, with the given element name.


startElement

public void startElement(java.lang.Object type)
Specified by:
startElement in interface Consumer

startAttribute

public void startAttribute(java.lang.Object attrType)
Description copied from interface: Consumer
Write an attribute for the current element. This is only allowed immediately after a startElement.

Specified by:
startAttribute in interface Consumer

emitStartAttribute

public void emitStartAttribute(char[] data,
                               int start,
                               int count)
Process an attribute, with the given attribute name. The attribute value is given using write. The value is terminated by either another emitStartAttribute or an emitEndAttributes.


emitEndAttributes

public void emitEndAttributes()
Process the end of a start tag. There are no more attributes.


emitEndElement

public void emitEndElement(char[] data,
                           int start,
                           int length)
Process an end tag. For an abbreviated tag (such as '<br/>'), the name is null.
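
The call sequence described for emitStartElement, emitStartAttribute, write, emitEndAttributes and emitEndElement can be sketched as follows. ParserEvents, EventRecorder and ProtocolDemo are hypothetical; the interface only mirrors the signatures documented above, and the recorder stands in for XMLFilter. The offsets are worked out by hand for this one buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface mirroring the documented signatures.
interface ParserEvents {
    void emitStartElement(char[] data, int start, int count);
    void emitStartAttribute(char[] data, int start, int count);
    void write(char[] data, int start, int length);
    void emitEndAttributes();
    void emitEndElement(char[] data, int start, int length);
}

// Records each call as a readable string.
class EventRecorder implements ParserEvents {
    final List<String> log = new ArrayList<>();

    public void emitStartElement(char[] d, int s, int n) {
        log.add("start " + new String(d, s, n));
    }
    public void emitStartAttribute(char[] d, int s, int n) {
        log.add("attr " + new String(d, s, n));
    }
    public void write(char[] d, int s, int n) {
        log.add("text " + new String(d, s, n));
    }
    public void emitEndAttributes() {
        log.add("endAttributes");
    }
    public void emitEndElement(char[] d, int s, int n) {
        log.add(d == null ? "end" : "end " + new String(d, s, n));
    }
}

class ProtocolDemo {
    // Emits the events for <br class="x"/> as segments of one shared buffer.
    static List<String> run() {
        char[] buf = "<br class=\"x\"/>".toCharArray();
        EventRecorder r = new EventRecorder();
        r.emitStartElement(buf, 1, 2);    // "br"
        r.emitStartAttribute(buf, 4, 5);  // "class"
        r.write(buf, 11, 1);              // attribute value "x"
        r.emitEndAttributes();            // no more attributes
        r.emitEndElement(null, 0, 0);     // abbreviated tag: null name
        return r.log;
    }
}
```

No String is created by the caller at any point; names and values are passed as offsets into the parser's buffer.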


endElement

public void endElement()
Specified by:
endElement in interface Consumer

emitEntityReference

public void emitEntityReference(char[] name,
                                int start,
                                int length)
Process an entity reference. The entity name is given. This handles the predefined entities, such as "&lt;" and "&quot;".
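
XML 1.0 predefines five entities: lt, gt, amp, apos and quot. A minimal sketch of resolving them from a name segment is shown below. PredefinedEntities is a hypothetical helper; how XMLFilter itself handles other entity names is not described here.

```java
// Hypothetical helper, not part of gnu.xml.
class PredefinedEntities {
    // Returns the character for a predefined entity name given as a
    // segment of 'name', or -1 if the name is not one of the five.
    static int resolve(char[] name, int start, int length) {
        switch (new String(name, start, length)) {
            case "lt":   return '<';
            case "gt":   return '>';
            case "amp":  return '&';
            case "apos": return '\'';
            case "quot": return '"';
            default:     return -1;
        }
    }
}
```

The segment covers only the name, without the leading ampersand or the trailing semicolon.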


emitCharacterReference

public void emitCharacterReference(int value,
                                   char[] name,
                                   int start,
                                   int length)
Process a character entity reference. The string encoding of the character (e.g. "xFF" or "255") is given, as well as the character value.
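
The relation between the two arguments can be sketched as follows: the string encoding is either hexadecimal with a leading "x" or plain decimal, and both forms denote the same character value. CharRef is a hypothetical helper, not part of gnu.xml.

```java
// Hypothetical helper, not part of gnu.xml.
class CharRef {
    // Decodes the text between "&#" and ";" (e.g. "xFF" or "255")
    // into the character value it denotes.
    static int decode(char[] name, int start, int length) {
        String s = new String(name, start, length);
        if (s.startsWith("x")) {
            return Integer.parseInt(s.substring(1), 16);
        }
        return Integer.parseInt(s, 10);
    }
}
```

Both "xFF" and "255" decode to the value 255, so a parser that has already computed the value can pass it alongside the original spelling.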


checkValidComment

protected void checkValidComment(char[] chars,
                                 int offset,
                                 int length)
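
No description is given for this method. As a rough guide to what a validity check on comment text usually involves, the sketch below applies the XML 1.0 rule that a comment must not contain "--" and must not end with "-". CommentCheck is hypothetical, and whether checkValidComment performs exactly this test is an assumption.

```java
// Hypothetical helper, not part of gnu.xml.
class CommentCheck {
    // True if the segment is legal comment content under XML 1.0:
    // no "--" anywhere, and no "-" as the last character.
    static boolean isValid(char[] chars, int offset, int length) {
        for (int i = 0; i < length; i++) {
            if (chars[offset + i] == '-'
                && (i + 1 == length || chars[offset + i + 1] == '-')) {
                return false;
            }
        }
        return true;
    }
}
```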

writeComment

public void writeComment(char[] chars,
                         int start,
                         int length)
Process a comment. The data (starting at start, for length chars) does not include the delimiters (i.e. "<!--" and "-->" are excluded).

Specified by:
writeComment in interface XConsumer

commentFromParser

public void commentFromParser(char[] chars,
                              int start,
                              int length)
Process a comment, when called from an XML parser. The data (starting at start, for length chars) does not include the delimiters (i.e. "<!--" and "-->" are excluded).


writeProcessingInstruction

public void writeProcessingInstruction(java.lang.String target,
                                       char[] content,
                                       int offset,
                                       int length)
Specified by:
writeProcessingInstruction in interface XConsumer

processingInstructionFromParser

public void processingInstructionFromParser(char[] buffer,
                                            int tstart,
                                            int tlength,
                                            int dstart,
                                            int dlength)
Process a processing instruction.


startDocument

public void startDocument()
Specified by:
startDocument in interface Consumer
Specified by:
startDocument in interface org.xml.sax.ContentHandler
Specified by:
startDocument in interface org.xml.sax.DocumentHandler

endDocument

public void endDocument()
Specified by:
endDocument in interface Consumer
Specified by:
endDocument in interface org.xml.sax.ContentHandler
Specified by:
endDocument in interface org.xml.sax.DocumentHandler

emitDoctypeDecl

public void emitDoctypeDecl(char[] buffer,
                            int target,
                            int tlength,
                            int data,
                            int dlength)
Process a DOCTYPE declaration.


beginEntity

public void beginEntity(java.lang.Object baseUri)
Specified by:
beginEntity in interface XConsumer

endEntity

public void endEntity()
Specified by:
endEntity in interface XConsumer

duplicateAttributeMessage

public static java.lang.String duplicateAttributeMessage(Symbol attrSymbol,
                                                         java.lang.Object elementName)

error

public void error(char severity,
                  java.lang.String message)

ignoring

public boolean ignoring()
Description copied from interface: Consumer
True if consumer is ignoring rest of element. The producer can use this information to skip ahead.

Specified by:
ignoring in interface Consumer
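
How a producer might use ignoring() to skip ahead can be sketched like this. Sink, LimitedSink and SkippingProducer are hypothetical types; the real Consumer interface has many more methods.

```java
import java.util.List;

// Hypothetical cut-down consumer interface.
interface Sink {
    boolean ignoring();
    void write(String s);
}

// Accepts a fixed number of items, then reports that it is ignoring the rest.
class LimitedSink implements Sink {
    private int remaining;
    final StringBuilder received = new StringBuilder();

    LimitedSink(int limit) {
        this.remaining = limit;
    }

    public boolean ignoring() {
        return remaining <= 0;
    }

    public void write(String s) {
        if (remaining > 0) {
            received.append(s);
            remaining--;
        }
    }
}

class SkippingProducer {
    // Writes items until the sink starts ignoring; returns how many were sent.
    static int produce(Sink sink, List<String> items) {
        int written = 0;
        for (String item : items) {
            if (sink.ignoring()) {
                break; // skip ahead: the rest would be discarded anyway
            }
            sink.write(item);
            written++;
        }
        return written;
    }
}
```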

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler
Specified by:
setDocumentLocator in interface org.xml.sax.DocumentHandler

startElement

public void startElement(java.lang.String namespaceURI,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes atts)
Specified by:
startElement in interface org.xml.sax.ContentHandler

endElement

public void endElement(java.lang.String namespaceURI,
                       java.lang.String localName,
                       java.lang.String qName)
Specified by:
endElement in interface org.xml.sax.ContentHandler

startElement

public void startElement(java.lang.String name,
                         org.xml.sax.AttributeList atts)
Specified by:
startElement in interface org.xml.sax.DocumentHandler

endElement

public void endElement(java.lang.String name)
                throws org.xml.sax.SAXException
Specified by:
endElement in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Specified by:
characters in interface org.xml.sax.ContentHandler
Specified by:
characters in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Specified by:
ignorableWhitespace in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Specified by:
processingInstruction in interface org.xml.sax.DocumentHandler

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
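
The order in which a SAX2 parser delivers the ContentHandler callbacks listed above can be observed with the JDK's parser. In the sketch below, the hypothetical EventLogger stands in for a handler such as XMLFilter and simply records what it receives; it does not use XMLFilter itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a ContentHandler: records the callbacks it gets.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        events.add("startPrefixMapping " + prefix + "=" + uri);
    }

    @Override
    public void endPrefixMapping(String prefix) {
        events.add("endPrefixMapping " + prefix);
    }

    @Override
    public void startElement(String uri, String local,
                             String qName, Attributes atts) {
        events.add("startElement {" + uri + "}" + local);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        events.add("endElement {" + uri + "}" + local);
    }

    // Parses xml with a namespace-aware JDK parser and returns the events.
    static List<String> log(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            EventLogger logger = new EventLogger();
            reader.setContentHandler(logger);
            reader.parse(new InputSource(new StringReader(xml)));
            return logger.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A prefix mapping is announced before the startElement that declares it and withdrawn after the matching endElement.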

skippedEntity

public void skippedEntity(java.lang.String name)
Specified by:
skippedEntity in interface org.xml.sax.ContentHandler

getPublicId

public java.lang.String getPublicId()
Specified by:
getPublicId in interface SourceLocator
Specified by:
getPublicId in interface org.xml.sax.Locator

getSystemId

public java.lang.String getSystemId()
Specified by:
getSystemId in interface SourceLocator
Specified by:
getSystemId in interface org.xml.sax.Locator

getFileName

public java.lang.String getFileName()
Description copied from interface: SourceLocator
Normally same as getSystemId.

Specified by:
getFileName in interface SourceLocator

getLineNumber

public int getLineNumber()
Description copied from interface: SourceLocator
Return current line number. The "first" line is line 1; unknown is -1.

Specified by:
getLineNumber in interface SourceLocator
Specified by:
getLineNumber in interface org.xml.sax.Locator

getColumnNumber

public int getColumnNumber()
Description copied from interface: SourceLocator
Return current column number. The "first" column is column 1; unknown is -1.

Specified by:
getColumnNumber in interface SourceLocator
Specified by:
getColumnNumber in interface org.xml.sax.Locator
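
The conventions above (lines and columns start at 1, and -1 means unknown) matter when formatting a location for a message. The sketch below shows one way to do that; PositionFormat and its output layout are hypothetical and not part of gnu.xml.

```java
// Hypothetical helper, not part of gnu.xml.
class PositionFormat {
    // Formats "file:line:column", leaving out any part that is unknown.
    static String format(String file, int line, int column) {
        StringBuilder sb = new StringBuilder(file == null ? "<unknown>" : file);
        if (line > 0) {
            sb.append(':').append(line);
            if (column > 0) {
                sb.append(':').append(column);
            }
        }
        return sb.toString();
    }
}
```

A caller would pass the results of getFileName(), getLineNumber() and getColumnNumber() from any SourceLocator.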

isStableSourceLocation

public boolean isStableSourceLocation()
Description copied from interface: SourceLocator
True if position is unlikely to change. True for an expression but not an input file.

Specified by:
isStableSourceLocation in interface SourceLocator