gnu.xml.util

Class XCat

Implemented Interfaces:
EntityResolver, EntityResolver2

public class XCat
extends java.lang.Object
implements EntityResolver2

Packages OASIS XML Catalogs, primarily for entity resolution by parsers. That specification defines an XML syntax for mappings between identifiers declared in DTDs (particularly PUBLIC identifiers) and locations. SAX has always supported such mappings, but conventions for an XML file syntax to maintain them have previously been lacking.

This class has three main operational modes. The primary intended mode is to create a resolver, then preload it with one or more site-standard catalogs before using it with one or more SAX parsers:

	XCat	catalog = new XCat ();
	catalog.setErrorHandler (diagnosticErrorHandler);
	catalog.loadCatalog ("file:/local/catalogs/catalog.cat");
	catalog.loadCatalog ("http://shared/catalog.cat");
	...
	catalog.disableLoading ();
	parser1.setEntityResolver (catalog);
	parser2.setEntityResolver (catalog);
	...

A second mode is to arrange that your application uses instances of this class as its entity resolver, and automatically loads catalogs referenced by <?oasis-xml-catalog...?> processing instructions found before the DTD in documents it parses. It would then discard the resolver after each parse.
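The second mode can be sketched as a SAX handler that watches for the processing instruction and forwards the catalog URI to a loader. This is only an illustration under stated assumptions: CatalogSink and CatalogPiHandler are hypothetical names that are not part of this package, the regular expression is a simplified reading of the catalog pseudo-attribute, and relative catalog URIs are passed through without being resolved against the document's base URI.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical callback with the same shape as loadCatalog(String). */
interface CatalogSink {
    void loadCatalog(String uri) throws SAXException, IOException;
}

/** Forwards oasis-xml-catalog PIs seen before the DTD or the root element. */
final class CatalogPiHandler extends DefaultHandler2 {
    // Simplified match for: catalog="..." or catalog='...'
    private static final Pattern CATALOG_ATTR =
        Pattern.compile("catalog\\s*=\\s*([\"'])(.*?)\\1");

    private final CatalogSink sink;
    private final Runnable onPrologEnd;
    private boolean prologOpen = true;

    CatalogPiHandler(CatalogSink sink, Runnable onPrologEnd) {
        this.sink = sink;
        this.onPrologEnd = onPrologEnd;
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        if (!prologOpen || !"oasis-xml-catalog".equals(target) || data == null) {
            return; // too late in the document, or some other PI
        }
        Matcher m = CATALOG_ATTR.matcher(data);
        if (m.find()) {
            try {
                sink.loadCatalog(m.group(2));
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
    }

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        closeProlog();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        closeProlog();
    }

    // Runs the callback once, at the first DTD or element event.
    private void closeProlog() {
        if (prologOpen) {
            prologOpen = false;
            onPrologEnd.run();
        }
    }
}
```

With a resolver named catalog, such a handler could presumably be built as new CatalogPiHandler(catalog::loadCatalog, catalog::disableLoading). The startDTD() callback is delivered only if the handler is also registered through the http://xml.org/sax/properties/lexical-handler property; otherwise the root element closes the prolog.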

A third mode applies catalogs in contexts other than entity resolution for parsers. The resolveURI() method supports resolving URIs stored in XML application data, rather than inside DTDs. Catalogs would be loaded as shown above, and the catalog could be used concurrently for parser entity resolution and for application URI resolution.


Errors in catalogs implicitly loaded (during resolution) are ignored beyond being reported through any ErrorHandler assigned using setErrorHandler(). SAX exceptions thrown from such a handler won't abort resolution, although throwing a RuntimeException or Error will normally abort both resolution and parsing. Useful diagnostic information is available to any ErrorHandler used to report problems, or from any exception thrown from an explicit loadCatalog() invocation. Applications can use that information as troubleshooting aids.
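A diagnostic handler of the kind described above might look like the following minimal sketch. CollectingErrorHandler is a hypothetical name, and the report format is an arbitrary choice. It records each problem with its location and never throws, so it cannot interfere with resolution.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

/** Collects catalog problems as "level: systemId:line:column: message". */
final class CollectingErrorHandler implements ErrorHandler {
    private final List<String> reports = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        record("error", e);
    }

    @Override
    public void fatalError(SAXParseException e) {
        record("fatal", e);
    }

    // Location fields are -1 or null when the parser did not supply them.
    private void record(String level, SAXParseException e) {
        reports.add(level + ": " + e.getSystemId()
            + ":" + e.getLineNumber()
            + ":" + e.getColumnNumber()
            + ": " + e.getMessage());
    }

    List<String> reports() {
        return reports;
    }
}
```

An instance would be passed to setErrorHandler() before any catalog is loaded, and reports() inspected afterwards.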

While this class requires SAX2 Extensions 1.1 classes in its class path, basic functionality does not require using a SAX2 parser that supports the extended entity resolution functionality. See the original SAX1 resolveEntity() method for a list of restrictions which apply when it is used with older SAX parsers.

See Also:
EntityResolver2

Constructor Summary

XCat()
Initializes without preloading a catalog.
XCat(String uri)
Initializes, and preloads a catalog using the default SAX parser.

Method Summary

void
disableLoading()
Records that catalog loading is no longer permitted.
ErrorHandler
getErrorHandler()
Returns the error handler used to report catalog errors.
InputSource
getExternalSubset(String name, String baseURI)
"New Style" parser callback to add an external subset.
String
getParserClass()
Returns the name of the SAX2 parser class used to parse catalogs.
boolean
isUnified()
Returns true (the default) if all methods resolve a given URI in the same way.
boolean
isUsingPublic()
Returns true (the default) if a catalog's public identifier mappings will be used.
void
loadCatalog(String uri)
Loads an OASIS XML Catalog.
InputSource
resolveEntity(String publicId, String systemId)
"Old Style" external entity resolution for parsers.
InputSource
resolveEntity(String name, String publicId, String baseURI, String systemId)
"New Style" external entity resolution for parsers.
InputSource
resolveURI(String baseURI, String uri)
Resolves a URI reference that is not declared in the DTD.
void
setErrorHandler(ErrorHandler handler)
Assigns the error handler used to report catalog errors.
void
setParserClass(String parser)
Names the SAX2 parser class used to parse catalogs.
void
setUnified(boolean value)
Assigns the value of the flag returned by isUnified().
void
setUsingPublic(boolean value)
Specifies which catalog search mode is used.

Constructor Details

XCat

public XCat()
Initializes without preloading a catalog. This API is convenient when you may want to arrange that catalogs are automatically loaded when explicitly referenced in documents, using the oasis-xml-catalog processing instruction. In such cases you won't usually be able to preload catalogs.


XCat

public XCat(String uri)
            throws SAXException,
                   IOException
Initializes, and preloads a catalog using the default SAX parser. This API is convenient when you operate with one or more standard catalogs.

This just delegates to loadCatalog(); see it for exception information.

Parameters:
uri - absolute URI for the catalog file.

Method Details

disableLoading

public void disableLoading()
Records that catalog loading is no longer permitted. Loading is automatically disabled when lookups are performed, and should be manually disabled when startDTD() (or any other DTD declaration callback) is invoked, or at the latest when the document root element is seen.


getErrorHandler

public ErrorHandler getErrorHandler()
Returns the error handler used to report catalog errors. Null is returned if the parser's default error handling will be used.

See Also:
setErrorHandler(ErrorHandler)


getExternalSubset

public InputSource getExternalSubset(String name,
                                     String baseURI)
            throws SAXException,
                   IOException
"New Style" parser callback to add an external subset. For documents that don't include an external subset, this may return one according to doctype catalog entries. (This functionality is not a core part of the OASIS XML Catalog specification, though it's presented in an appendix.) If no such entry is defined, this returns null to indicate that this document will not be modified to include such a subset. Calls to this method prevent explicit loading of additional catalogs using loadCatalog().

Warning: That catalog functionality can be dangerous. It can provide definitions of general entities, and thereby mask certain well-formedness errors.

Specified by:
getExternalSubset in interface EntityResolver2

Parameters:
name - Name of the document element, either as declared in a DOCTYPE declaration or as observed in the text.
baseURI - Document's base URI (absolute).

Returns:
Input source for accessing the external subset, or null if no mapping was found. The input source may have opened the stream, and will have a fully resolved URI.


getParserClass

public String getParserClass()
Returns the name of the SAX2 parser class used to parse catalogs. Null is returned if the system default is used.

See Also:
setParserClass(String)


isUnified

public boolean isUnified()
Returns true (the default) if all methods resolve a given URI in the same way. Returns false if calls resolving URIs as entities (such as resolveEntity()) use different catalog entries than those resolving them as URIs (resolveURI()), which will generally produce different results.

The OASIS XML Catalog specification defines two related schemes to map URIs "as URIs" or "as system IDs". URIs use uri, rewriteURI, and delegateURI elements. System IDs do the same things with system, rewriteSystem, and delegateSystem elements. It's confusing and error prone to maintain two parallel copies of such data. Accordingly, this class makes that behavior optional. The unified interpretation of URI mappings is preferred, since it prevents surprises where one URI gets mapped to different contents depending on whether the reference happens to have come from a DTD (or not).

See Also:
setUnified(boolean)


isUsingPublic

public boolean isUsingPublic()
Returns true (the default) if a catalog's public identifier mappings will be used. When false is returned, such mappings are ignored except when system IDs are discarded, such as for entities using the urn:publicid: URI scheme in their system identifiers. (See RFC 3151 for information about that URI scheme. Using it in system identifiers may not work well with many SAX parsers unless the resolve-dtd-uris feature flag is set to false.)
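To illustrate why such system identifiers are treated as public identifiers, the following sketch converts a urn:publicid: URN back into the public ID it encodes, using the character transcription described in RFC 3151. PublicIdUrn is a hypothetical helper written for this example; it is not a description of how this class performs the conversion internally.

```java
import java.util.Locale;

/** Illustrative decoder for urn:publicid: system identifiers. */
final class PublicIdUrn {
    private static final String PREFIX = "urn:publicid:";

    /** Returns the encoded public ID, or null if the argument is not such a URN. */
    static String unwrap(String systemId) {
        if (systemId == null
                || !systemId.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return null;
        }
        String body = systemId.substring(PREFIX.length());
        StringBuilder out = new StringBuilder(body.length() + 8);
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '+') {
                out.append(' ');
            } else if (c == ':') {
                out.append("//");
            } else if (c == ';') {
                out.append("::");
            } else if (c == '%' && i + 2 < body.length()) {
                char decoded = decodeEscape(body.substring(i + 1, i + 3));
                if (decoded != 0) {
                    out.append(decoded);
                    i += 2; // skip the two hex digits just consumed
                } else {
                    out.append(c); // unknown escape: keep it verbatim
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Only the escapes listed in the transcription table are decoded.
    private static char decodeEscape(String hex) {
        switch (hex.toUpperCase(Locale.ROOT)) {
            case "2B": return '+';
            case "3A": return ':';
            case "2F": return '/';
            case "3B": return ';';
            case "27": return '\'';
            case "3F": return '?';
            case "23": return '#';
            case "25": return '%';
            default:   return 0;
        }
    }
}
```

For example, unwrap("urn:publicid:-:OASIS:DTD+DocBook+XML+V4.1.2:EN") yields "-//OASIS//DTD DocBook XML V4.1.2//EN".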

See Also:
setUsingPublic(boolean)


loadCatalog

public void loadCatalog(String uri)
            throws SAXException,
                   IOException
Loads an OASIS XML Catalog. It is appended to the list of currently active catalogs, or reloaded if a catalog with the same URI was already loaded. Callers have control over what parser is used, how catalog parsing errors are reported, and whether URIs will be resolved consistently.

The OASIS specification says that errors detected when loading catalogs "must recover by ignoring the catalog entry file that failed, and proceeding." In this API, that action can be the responsibility of applications, when they explicitly load any catalog using this method.
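One way an application might take on that responsibility is to load each catalog in turn, remember the failures, and carry on. The sketch below assumes a hypothetical CatalogLoader callback with the same shape as this method, so that a resolver's loadCatalog can be supplied as a method reference; TolerantCatalogLoading is likewise an invented name.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.xml.sax.SAXException;

/** Hypothetical callback with the same shape as loadCatalog(String). */
interface CatalogLoader {
    void loadCatalog(String uri) throws SAXException, IOException;
}

final class TolerantCatalogLoading {
    /**
     * Tries every URI in order. A catalog that fails to load is skipped,
     * and its exception is returned keyed by URI for later diagnosis.
     */
    static Map<String, Exception> loadAll(CatalogLoader loader,
                                          List<String> uris) {
        Map<String, Exception> failures = new LinkedHashMap<>();
        for (String uri : uris) {
            try {
                loader.loadCatalog(uri);
            } catch (SAXException | IOException e) {
                failures.put(uri, e); // ignore this catalog and proceed
            }
        }
        return failures;
    }
}
```

A call such as TolerantCatalogLoading.loadAll(catalog::loadCatalog, uris) would then leave the resolver holding every catalog that parsed cleanly.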

Note that catalogs referenced by this one will not be loaded at this time. Catalogs referenced through nextCatalog or delegate* elements are normally loaded only if needed.

Parameters:
uri - absolute URI for the catalog file.

Throws:
SAXException - As thrown by the parser, typically to indicate problems parsing data from that URI. It may also be thrown if the parser doesn't support necessary handlers.

See Also:
setErrorHandler(ErrorHandler), setParserClass(String), setUnified(boolean)


resolveEntity

public final InputSource resolveEntity(String publicId,
                                       String systemId)
            throws SAXException,
                   IOException
"Old Style" external entity resolution for parsers. This API provides only core functionality. Calls to this method prevent explicit loading of additional catalogs using loadCatalog().

The functional limitations of this interface include:

  • Since system IDs will be absolutized before the resolver sees them, matching against relative URIs won't work. This may affect system, rewriteSystem, and delegateSystem catalog entries.
  • Because of that absolutization, documents declaring entities with system IDs using URI schemes that the JVM does not recognize may be unparsable. URI schemes such as file:/, http://, https://, and ftp:// will usually work reliably.
  • Because missing external subsets can't be provided, the doctype catalog entries will be ignored. (The getExternalSubset() method is a "New Style" resolution option.)

Applications can tell whether this limited functionality will be used: if the feature flag associated with the EntityResolver2 interface is not true, the limitations apply. Applications can't usually know whether a given document and catalog will trigger those limitations. The issue can only be bypassed by operational procedures such as not using catalogs or documents which involve those features.
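The feature flag check can be sketched as follows. The flag is the standard SAX2 Extensions 1.1 feature http://xml.org/sax/features/use-entity-resolver2; Resolver2Support is a hypothetical helper name. A parser that does not recognize the feature at all is treated here as one to which the limitations apply.

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

final class Resolver2Support {
    static final String USE_ENTITY_RESOLVER2 =
        "http://xml.org/sax/features/use-entity-resolver2";

    /** Returns true only if the reader reports that it uses EntityResolver2. */
    static boolean isEnabled(XMLReader reader) {
        try {
            return reader.getFeature(USE_ENTITY_RESOLVER2);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // Older parsers: only the "Old Style" callback will be used.
            return false;
        }
    }
}
```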

Specified by:
resolveEntity in interface EntityResolver

Parameters:
publicId - Either a normalized public ID, or null.
systemId - Always an absolute URI.

Returns:
Input source for accessing the external entity, or null if no mapping was found. The input source may have opened the stream, and will have a fully resolved URI.


resolveEntity

public InputSource resolveEntity(String name,
                                 String publicId,
                                 String baseURI,
                                 String systemId)
            throws SAXException,
                   IOException
"New Style" external entity resolution for parsers. Calls to this method prevent explicit loading of additional catalogs using loadCatalog().

This supports the full core catalog functionality for locating (and relocating) parsed entities that have been declared in a document's DTD.

Specified by:
resolveEntity in interface EntityResolver2

Parameters:
name - Entity name, such as "dudley", "%nell", or "[dtd]".
publicId - Either a normalized public ID, or null.
baseURI - Absolute base URI associated with systemId.
systemId - URI found in entity declaration (may be relative to baseURI).

Returns:
Input source for accessing the external entity, or null if no mapping was found. The input source may have opened the stream, and will have a fully resolved URI.

See Also:
getExternalSubset(String,String)


resolveURI

public InputSource resolveURI(String baseURI,
                              String uri)
            throws SAXException,
                   IOException
Resolves a URI reference that is not declared in the DTD. This is intended for use with URIs found in document text, such as in xml-stylesheet processing instructions and in attribute values, where they are not recognized as URIs by XML parsers. Calls to this method prevent explicit loading of additional catalogs using loadCatalog().

This functionality is supported by the OASIS XML Catalog specification, but will never be invoked by an XML parser. It corresponds closely to functionality for mapping system identifiers for entities declared in DTDs; closely enough that this implementation's default behavior is that they be identical, to minimize potential confusion.

This method could be useful when implementing the URIResolver interface, wrapping the input source in a SAXSource.
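A minimal sketch of such an adapter follows. UriLookup is a hypothetical callback with the same shape as resolveURI(), so a resolver's method reference can be supplied, and CatalogUriResolver is an invented name. Note that URIResolver passes the reference first and the base second, the reverse of resolveURI().

```java
import java.io.IOException;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical callback with the same shape as resolveURI(baseURI, uri). */
interface UriLookup {
    InputSource resolveURI(String baseURI, String uri)
        throws SAXException, IOException;
}

/** Adapts a catalog lookup to the JAXP URIResolver interface. */
final class CatalogUriResolver implements URIResolver {
    private final UriLookup lookup;

    CatalogUriResolver(UriLookup lookup) {
        this.lookup = lookup;
    }

    @Override
    public Source resolve(String href, String base)
            throws TransformerException {
        try {
            InputSource mapped = lookup.resolveURI(base, href);
            // Returning null lets the processor resolve the URI itself.
            return mapped == null ? null : new SAXSource(mapped);
        } catch (SAXException | IOException e) {
            throw new TransformerException(e);
        }
    }
}
```

With a resolver named catalog, an instance could presumably be created as new CatalogUriResolver(catalog::resolveURI) and installed on a TransformerFactory with setURIResolver().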

Parameters:
baseURI - The relevant base URI as specified by the XML Base specification. This recognizes xml:base attributes as overriding the actual (physical) base URI.
uri - Either an absolute URI, or one relative to baseURI.

Returns:
Input source for accessing the mapped URI, or null if no mapping was found. The input source may have opened the stream, and will have a fully resolved URI.

See Also:
isUnified(), setUnified(boolean)


setErrorHandler

public void setErrorHandler(ErrorHandler handler)
Assigns the error handler used to report catalog errors. These errors may come either from the SAX2 parser or from the catalog parsing code driven by the parser.

If you're sharing the resolver between parsers, don't change this once lookups have begun.

Parameters:
handler - The error handler, or null saying to use the parser's default error handling.

See Also:
getErrorHandler()


setParserClass

public void setParserClass(String parser)
Names the SAX2 parser class used to parse catalogs.

Parameters:
parser - The parser class name, or null saying to use the system default SAX2 parser.

See Also:
getParserClass()


setUnified

public void setUnified(boolean value)
Assigns the value of the flag returned by isUnified(). Set it to false to be strictly conformant with the OASIS XML Catalog specification. Set it to true to make all mappings for a given URI give the same result, regardless of the reason for the mapping.

Don't change this once you've loaded the first catalog.

Parameters:
value - new flag setting


setUsingPublic

public void setUsingPublic(boolean value)
Specifies which catalog search mode is used. By default, public identifier mappings are able to override system identifiers when both are available. Applications may choose to ignore public identifier mappings in such cases, so that system identifiers declared in DTDs will only be overridden by an explicit catalog match for that system ID.

If you're sharing the resolver between parsers, don't change this once lookups have begun.

Parameters:
value - true to always use public identifier mappings, false to only use them for system IDs using the urn:publicid: URI scheme.

See Also:
isUsingPublic()