Filters

It is often useful to perform specialized data transforms, especially decompression, on the data used as a source for raster graphics imaging operators. In SPDL these types of data transforms are provided in the form of filters. An important feature of filters is the ability to layer, or pipeline, them. Connectivity between pipelined filters is provided in the form of StreamObjects. This clause describes the semantics of filters, StreamObjects, the standard filters specified by this International Standard, and the Filter operator which constructs new StreamObjects and pipelines filters.

Model for Filters

Filters are employed in a document via the Filter operator. This operator takes as operands a data source (see the description of data sources in ; this is not a Data Source Resource), parameters for the filter to be used (if necessary), and the name of the filter. The Filter operator returns a filtered StreamObject that can be used as an operand to subsequent executions of the Filter operator or to other operators (see ).

Data Sources for Filters

When a filtered StreamObject is accessed for data by an operator, the first data source in the filter pipeline with which the StreamObject is associated is accessed for data. The data flows through all the filters in the pipeline in turn, and the resultant filtered data is provided to the operator accessing the StreamObject.

There are three possible types of data sources for filters — StreamObjects, Procedures, and OctetStrings. The following subclauses describe how each is used as a filter data source.

StreamObjects

It is expected that the most common data source for filters in documents will be StreamObjects. A StreamObject can be obtained in one of two ways:

The pipelining of filters is possible because a StreamObject returned by the Filter operator, which is the output from the data transform defined by the filter, can in turn be used as the StreamObject operand to a subsequent execution of the Filter operator; thus the output from one filter can be the input to another filter.
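The pipelining described above can be sketched with Python generators. This is only an illustration of the data flow, not SPDL content: `make_filter`, `swap_nibbles`, and `invert` are hypothetical names and toy transforms invented for this example, and an iterator of `bytes` chunks stands in for a StreamObject.

```python
from typing import Callable, Iterable, Iterator


def make_filter(transform: Callable[[bytes], bytes]):
    """Wrap a chunk-wise transform as a lazy filter over a data source."""
    def apply(source: Iterable[bytes]) -> Iterator[bytes]:
        # Data is pulled from the source only when the consumer asks for it.
        for chunk in source:
            yield transform(chunk)
    return apply


# Two toy transforms (assumptions for illustration, not standard filters).
swap_nibbles = make_filter(
    lambda chunk: bytes(((b << 4) | (b >> 4)) & 0xFF for b in chunk)
)
invert = make_filter(lambda chunk: bytes(b ^ 0xFF for b in chunk))

# The output stream of one filter is the data source of the next.
source = iter([b"\x0f", b"\x12\x34"])
pipeline = invert(swap_nibbles(source))

# The consuming "operator" reads from the last stream in the pipeline.
result = b"".join(pipeline)
```

Nothing is read from `source` until the final consumer iterates over `pipeline`, which mirrors the access order described for filtered StreamObjects.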

Procedures

The data source for a filter can be a Procedure. When the filter requires more data to transform, it calls the Procedure. The Procedure shall return (on the operand stack) a value of type OctetStringReference which references an OctetString containing an arbitrary number of octets of data. The filter then pops this OctetStringReference from the stack and uses its contents as input to the filter.

This process repeats until the filter encounters its End-of-Data condition (see ); any leftover data in the final OctetString is discarded. The Procedure can return an OctetStringReference to an OctetString of length 0 to indicate that no more data is available.

The implementation of a filter shall use or safely store all octets in the data source OctetString to which a reference is returned by the Procedure before again calling the Procedure for more data. This allows the Procedure to allocate a single OctetString as a buffer and return the same OctetString each time with different contents.
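The buffer-reuse rule above can be illustrated with a minimal Python sketch. The names `make_procedure` and `read_from_procedure` are hypothetical, a `bytearray` stands in for the OctetString, and a plain function call stands in for the operand-stack exchange.

```python
def make_procedure(payload: bytes, buffer_size: int = 4):
    """Return a procedure that hands out the same buffer object on every call."""
    buffer = bytearray()          # single reusable buffer (the "OctetString")
    offset = 0

    def procedure() -> bytearray:
        nonlocal offset
        chunk = payload[offset:offset + buffer_size]
        offset += len(chunk)
        buffer[:] = chunk         # overwrite the previous contents in place
        return buffer             # length 0 signals that no more data exists

    return procedure


def read_from_procedure(procedure) -> bytes:
    """Filter side: copy every octet before calling the procedure again."""
    collected = bytearray()
    while True:
        octets = procedure()
        if len(octets) == 0:      # empty OctetString: source is exhausted
            break
        collected += bytes(octets)  # safe copy; the buffer will be reused
    return bytes(collected)


data = read_from_procedure(make_procedure(b"abcdefghij"))

# A reader that only kept references would see the buffer change under it.
proc = make_procedure(b"abcdefgh")
first = proc()
second = proc()
same_object = first is second
```

Because `first` and `second` are the same object, a filter that stored the reference without copying would lose the first chunk once the Procedure is called again.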

OctetStrings

If the data source for a filter is an OctetString, the filter simply uses its contents as data to be decoded. If the filter encounters its End-of-Data condition (see ), it ignores the remainder of the OctetString; otherwise, it continues until it has exhausted the contents of the OctetString.

End-of-Data

A filter can reach a state in which it cannot continue filtering data; this is called the End-of-Data (EOD) condition. Most filters are able to detect an EOD marker encoded in the data that they are reading; the nature of this marker depends on the filter. In a few instances, the EOD condition is based on predetermined information, such as an octet count or a scan line count, rather than on an explicit marker in the encoded data. The EOD condition for each filter is specified in the description of the filter in .

Filter Identifiers

Unlike ColorSpaceObjects, which are parameterized in structure, Filters are parameterized in content at the time of their instantiation. Filters are parameterized under program control via execution of the Filter operator. This allows previously defined Filter resources to be reused many times in content with different sets of parameters. A Filter Identifier is obtained via execution of the FindResource operator with an operand that is the value of an INTERNAL RESOURCE IDENTIFIER which was previously bound to a Filter resource via a RESOURCE DECLARATION. The result returned by the FindResource operator is an OctetStringReference that references an OctetString containing the canonical character string of a public identifier (as defined in ISO/IEC 9070) for the Filter. For the Filters defined in this International Standard, the owner name is "ISO/IEC 10180" and the Object Name is specified in the description of the filter below.

Standard Filters

The following subclauses describe the semantics, names, and parameters of the standard filters defined by this International Standard. The algorithms for some filters have been formally standardized (for example, CCITTFAXDecode), while the algorithms for others are based on de facto standards in wide use in the computing industry (for example, ASCII85Decode). Conforming implementations may provide additional filters not defined by this International Standard.

ASCIIHexDecode

The Object Name of the ASCIIHexDecode filter is Filter::ASCIIHexDecode. The ASCIIHexDecode filter requires no parameters. This filter decodes ASCII-encoded pairs of octets and produces binary data. For each pair of ASCII-encoded hexadecimal digits (0-9 and A-F or a-f), it interprets the first digit as the hexadecimal value for the most significant four bits of an octet and the second digit as the hexadecimal value for the least significant four bits of the octet; it therefore produces one octet of output data for each two octets of input. All white space characters (space, tab, carriage return, line feed, form feed, and null) are ignored. The ASCII character > indicates EOD. Any other character will cause RaiseError to be invoked with DataError as its operand.

If the filter encounters EOD when it has read an odd number of hexadecimal digits, it will behave as if it had read an additional zero digit.
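The decoding rules above can be sketched in Python as follows. This is an illustrative sketch only: `ascii_hex_decode` and the `DataError` exception class are hypothetical stand-ins for the filter and for RaiseError with DataError, and the sketch assumes that running out of input without a > is treated like EOD.

```python
WHITESPACE = b" \t\r\n\f\x00"      # space, tab, CR, LF, FF, null
HEX_DIGITS = b"0123456789ABCDEFabcdef"


class DataError(Exception):
    """Hypothetical stand-in for RaiseError invoked with DataError."""


def ascii_hex_decode(data: bytes) -> bytes:
    out = bytearray()
    pending = None                  # high nibble waiting for its partner
    for octet in data:
        if octet in WHITESPACE:
            continue                # white space is ignored
        if octet == ord(">"):
            break                   # EOD marker: the rest is not read
        if octet not in HEX_DIGITS:
            raise DataError(f"invalid character {octet:#04x}")
        value = int(chr(octet), 16)
        if pending is None:
            pending = value         # first digit: most significant 4 bits
        else:
            out.append((pending << 4) | value)
            pending = None
    if pending is not None:
        # Odd number of digits: behave as if an extra zero digit was read.
        # (Assumption: also applied when input ends without '>'.)
        out.append(pending << 4)
    return bytes(out)
```

For example, `ascii_hex_decode(b"48 65 6C 6c 6F>")` yields the five octets of `Hello`, and `ascii_hex_decode(b"ABC>")` yields the two octets AB and C0.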

ASCII85Decode

The Object Name of the ASCII85Decode filter is Filter::ASCII85Decode. The ASCII85Decode filter requires no parameters. This filter decodes data encoded in the ASCII base-85 encoding and produces binary data; four octets of binary data are produced for every five octets of input data. The input data is encoded as follows:

Binary data octets are encoded in 4-tuples (groups of 4). Each 4-tuple is used to produce a 5-tuple of ASCII-encoded characters. If the binary 4-tuple is (b1 b2 b3 b4) and the encoded 5-tuple is (c1 c2 c3 c4 c5), then the relation between them is

$$(b_1 \times 256^3) + (b_2 \times 256^2) + (b_3 \times 256) + b_4 = (c_1 \times 85^4) + (c_2 \times 85^3) + (c_3 \times 85^2) + (c_4 \times 85) + c_5$$

    In other words, four octets of binary data are interpreted as a base-256 number and then converted into a base-85 number. The five digits of this number, (c1 c2 c3 c4 c5), are then converted into ASCII characters by adding 33, which is the ASCII code for !, to each. Thus, ASCII characters in the range ! to u are used, where ! represents the value 0 and u represents the value 84. As a special case, if all five digits are zero, they are represented by a single character z instead of by !!!!!.

    All white space characters (space, tab, carriage return, line feed, form feed, and null) are ignored. The character sequence ~> indicates EOD: if the filter encounters the character ~ in its input, the next character shall be > and the filter will reach EOD. Any other characters will cause RaiseError to be invoked with DataError as its operand. Also, any character sequences that represent impossible combinations in the ASCII base-85 encoding will cause RaiseError to be invoked with IOError as its operand.
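A minimal Python sketch of the decoding direction is given here. The names `ascii85_decode`, `SpdlDataError`, and `SpdlIOError` are hypothetical. Two points are assumptions that the text above does not state: a final partial group of n characters (n at least 2) is taken to yield n - 1 octets, following the de facto convention for this encoding, and a z inside a 5-tuple or a 5-tuple whose value exceeds 2^32 - 1 is treated as an "impossible combination".

```python
WHITESPACE = b" \t\r\n\f\x00"


class SpdlDataError(Exception):
    """Hypothetical stand-in for RaiseError invoked with DataError."""


class SpdlIOError(Exception):
    """Hypothetical stand-in for RaiseError invoked with IOError."""


def _group_value(digits) -> int:
    value = 0
    for digit in digits:
        value = value * 85 + digit          # base-85 to integer
    if value > 0xFFFFFFFF:
        raise SpdlIOError("5-tuple does not fit in four octets")
    return value


def ascii85_decode(data: bytes) -> bytes:
    out = bytearray()
    group = []
    stream = iter(data)
    for octet in stream:
        if octet in WHITESPACE:
            continue
        if octet == ord("~"):
            if next(stream, None) != ord(">"):
                raise SpdlDataError("'~' not followed by '>'")
            break                            # EOD reached
        if octet == ord("z"):
            if group:
                raise SpdlIOError("'z' inside a 5-tuple")
            out += b"\x00\x00\x00\x00"       # shorthand for !!!!!
            continue
        if not ord("!") <= octet <= ord("u"):
            raise SpdlDataError(f"invalid character {octet:#04x}")
        group.append(octet - 33)             # '!' represents digit 0
        if len(group) == 5:
            out += _group_value(group).to_bytes(4, "big")
            group = []
    if group:
        # Assumption: partial final group, padded with the highest digit.
        if len(group) == 1:
            raise SpdlIOError("single leftover character")
        count = len(group)
        padded = group + [84] * (5 - count)
        out += _group_value(padded).to_bytes(4, "big")[:count - 1]
    return bytes(out)
```

As a check against the relation above, the 5-tuple `9jqo^` has digits (24, 73, 80, 78, 61), whose base-85 value 1298230816 corresponds to the four octets of `Man ` (with a trailing space).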

    LZWDecode

    The Object Name Filter::LZWDecode is reserved for the LZWDecode Filter. The LZWDecode filter requires no parameters. The filter supports the decoding scheme specified in of the TIFF 6.0 Specification, dated June 3, 1992 available from Adobe Corporation, Seattle, Washington. The encoded data format and the decoding algorithm are as specified in that Section. The code 257 indicates EOD.

    RunLengthDecode

    The Object Name of the RunLengthDecode filter is Filter::RunLengthDecode. The RunLengthDecode filter requires no parameters. This filter decodes data in run-length encoded format. The encoded data consists of a sequence of runs, where each run consists of a length octet followed by between 1 and 128 octets of data. If the length octet is in the range 0 to 127, the following length + 1 octets (between 1 and 128 octets) are to be copied literally to the output of the filter. If length is in the range 129 to 255, the following single octet is to be replicated 257 - length times (between 2 and 128 times) to the filter output. A run length of 128 indicates EOD.
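The run format above translates directly into a short decoder. The following Python sketch is illustrative only; `run_length_decode` is a hypothetical name, and the sketch does not check for runs that are truncated by the end of the input.

```python
def run_length_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:
            break                           # EOD: remaining input is ignored
        if length < 128:
            count = length + 1              # literal run of 1 to 128 octets
            out += data[i:i + count]
            i += count
        else:
            repeat = 257 - length           # replicated run of 2 to 128 octets
            out += data[i:i + 1] * repeat
            i += 1
    return bytes(out)


# Literal run "ABC" (length octet 2), then "X" three times (length octet 254),
# then the EOD marker; the final octet is never read.
decoded = run_length_decode(bytes([2, 65, 66, 67, 254, 88, 128, 99]))
```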

    CCITTFaxDecode

    The Object Name of the CCITTFaxDecode filter is Filter::CCITTFaxDecode. The CCITTFaxDecode filter requires several parameters which are presented to the Filter operator in the form of a Dictionary. The filter supports both Group 3 and Group 4 encoding schemes, as specified in Recommendations T.4 and T.6 of the CCITT Blue Book, volume VII.3, 1988. The encoded data format and the decoding algorithm are specified in these Recommendations. If the filter encounters improperly encoded source data, RaiseError shall be invoked with DataError as its operand; the filter does not perform any error correction or resynchronization. The form in which EOD is indicated is determined by the value of the Rows key in the Dictionary. SPDL support for Recommendations T.4 and T.6 is limited to the decoding of data. It does not include initial connection and handshaking protocols that would be required to communicate with an actual facsimile machine. The purpose of this filter is to provide for efficient inclusion of facsimile-ready bi-level sampled raster graphics images in SPDL documents.

    The parameters in the Dictionary referenced by the DictionaryReference operand to the Filter operator for the CCITTFaxDecode filter are described in the following subclauses.

    K

    The optional key/value pair <K: Integer> specifies the encoding scheme that was used to encode the source data. A negative value indicates pure two-dimensional (Group 4) encoding. Zero indicates pure one-dimensional encoding (Group 3, 1-D). A positive value indicates mixed one- and two-dimensional encoding (Group 3, 2-D). The default value used by the filter if the key K is not in the Dictionary is 0.

    EndOfLine

    The optional key/value pair <EndOfLine: Boolean> specifies whether the source data was encoded with an end-of-line bit pattern prefixed to each line of data. The default value used by the filter if the key EndOfLine is not in the Dictionary is false.

    EncodedByteAlign

    The optional key/value pair <EncodedByteAlign: Boolean> specifies whether the source data was encoded with extra zero bits prefixed to each line of data as necessary to ensure that each line begins on an octet boundary. If the value of EncodedByteAlign is true, the extra zero bits are present in the source data. The default value used by the filter if the key EncodedByteAlign is not in the Dictionary is false.

    Columns

    The optional key/value pair <Columns: Cardinal> specifies the width of the raster graphics image data in pixels. The default value used by the filter if the key Columns is not in the Dictionary is 1728.

    Rows

    The optional key/value pair <Rows: Cardinal> specifies the height of the raster graphics image data in scan lines. If this parameter is zero, the height of the raster graphics image data is not predetermined; the encoded data shall be terminated by an end-of-block bit pattern or by an EOD condition on the filter's data source. The default value used by the filter if the key Rows is not in the Dictionary is 0.

    EndOfBlock

    The optional key/value pair <EndOfBlock: Boolean> specifies whether the source data was encoded with an end-of-block bit pattern appended to the encoded data. An EndOfBlock value of true overrides the Rows parameter. The default value used by the filter if the key EndOfBlock is not in the Dictionary is true.

    The end-of-block pattern is the CCITT end-of-facsimile-block (EOFB) or return-to-control (RTC) appropriate for the K parameter.

    BlackIs1

    The optional key/value pair <BlackIs1: Boolean>, if true, causes bits whose value is 1 to be interpreted as black pixels and bits whose value is 0 as white pixels. The default value used by the filter if the key BlackIs1 is not in the Dictionary is false.
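The defaults listed in the preceding subclauses can be collected in one place, as in the following Python sketch. It only models how a supplied Dictionary might be completed with default values and how K selects the encoding scheme; it does not decode facsimile data. The names `CCITT_FAX_DEFAULTS`, `resolve_ccitt_params`, and `encoding_scheme` are hypothetical.

```python
# Default values as stated in the subclauses above.
CCITT_FAX_DEFAULTS = {
    "K": 0,
    "EndOfLine": False,
    "EncodedByteAlign": False,
    "Columns": 1728,
    "Rows": 0,
    "EndOfBlock": True,
    "BlackIs1": False,
}


def resolve_ccitt_params(supplied: dict) -> dict:
    """Fill in the default for every key missing from the supplied dictionary."""
    return {**CCITT_FAX_DEFAULTS, **supplied}


def encoding_scheme(k: int) -> str:
    """Map the K parameter to the encoding scheme it selects."""
    if k < 0:
        return "Group 4 (pure two-dimensional)"
    if k == 0:
        return "Group 3, 1-D (pure one-dimensional)"
    return "Group 3, 2-D (mixed one- and two-dimensional)"


params = resolve_ccitt_params({"K": -1, "Columns": 2480})
scheme = encoding_scheme(params["K"])
```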

    NullDecode

    The Object Name of the NullDecode filter is Filter::NullDecode. The NullDecode filter performs no data transformation — the output of the filter is identical to its input. This filter enables an arbitrary data source (Procedure or OctetString) to be easily treated as a data source because it does not require the detection of an EOD marker.

    The NullDecode filter requires two parameters:

  • <EODstring: OctetStringReference>
  • <EODcount: Cardinal>

    These parameters specify the condition under which the filter is to recognize EOD. If the value of EODcount is positive (and the length of the EODstring is non-zero), the filter will allow data to pass through the filter until it has encountered exactly EODcount instances of the EODstring; then it will reach EOD. All occurrences of the EODstring are also passed through.

    If EODcount is zero (and the length of the EODstring is non-zero), then the first occurrence of the EODstring results in EOD. In this case, the EODstring is consumed by the filter, but it will not be passed through the filter.

    In each case, overlapping instances of the EODstring are not recognized; for example, an EODstring of eee is recognized only once in the input XeeeeX.

    The EODstring may also be of length zero. In this case, if the value of EODcount is positive, the filter will simply pass EODcount octets of arbitrary data through to its output. If the value of EODcount is zero, detection of EOD markers is disabled; the filter will never reach EOD. This is useful primarily when using Procedures or OctetStrings as data sources.
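The four parameter combinations described above can be summarized in a short Python sketch. The function name `null_decode` is hypothetical, the whole input is assumed to be available as one `bytes` value, and the sketch assumes that all input is passed through when the source is exhausted before EOD is reached.

```python
def null_decode(data: bytes, eod_string: bytes, eod_count: int) -> bytes:
    if not eod_string:
        # Empty EODstring: pass exactly eod_count octets, or everything
        # when eod_count is zero (EOD detection disabled).
        return data if eod_count == 0 else data[:eod_count]

    if eod_count == 0:
        # First occurrence ends the data and is not passed through.
        index = data.find(eod_string)
        return data if index < 0 else data[:index]

    # Positive count: pass data up to and including the eod_count-th
    # non-overlapping occurrence of eod_string.
    position = 0
    for _ in range(eod_count):
        index = data.find(eod_string, position)
        if index < 0:
            return data                     # source exhausted before EOD
        position = index + len(eod_string)  # resume after the match
    return data[:position]
```

With the input `XeeeeX` and an EODstring of `eee`, a count of 1 passes `Xeee`, while a count of 2 never reaches EOD because only one non-overlapping instance exists.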

    Operators

    The specification of the Filter operator and its semantics includes the specification of conditions which may cause content exceptions to be raised as a result of interpretation of the operator. Content exceptions and exception handling are defined in . In addition to these operator-specific exceptions, there are generic exceptions which may be raised during the interpretation of almost any operator. These generic exceptions and their semantics are described in .

    Filter

    The Filter operator takes two or more operands:

  • <filteridentifier: OctetStringReference>
  • <paramn: Cardinal or DictionaryReference or OctetStringReference>
  • <param1: Cardinal or DictionaryReference or OctetStringReference>
  • <param0: Cardinal or DictionaryReference or OctetStringReference>
  • <datasource: StreamObject or Procedure or OctetStringReference>

    and returns one result:

  • <stream: StreamObject>

    where filteridentifier is a reference to an OctetString that contains the canonical character string form of the Public Object Identifier which identifies the filter (this reference is obtained via execution of the FindResource operator), datasource is the source for encoded input data appropriate to filteridentifier, and the number and nature of the intervening parameters, if any, depend on filteridentifier. The result is a newly created StreamObject that is filtered according to the value of filteridentifier and the other parameters to the Filter operator. The standard filters defined by this International Standard and their parameters, if any, are defined in

    If filteridentifier does not identify a filter known to the implementation, RaiseError shall be invoked with UndefinedKey as its operand. If filteridentifier is not a reference obtained via execution of the FindResource operator, RaiseError shall be invoked with UndefinedResource as its operand.