[version information: $XConsortium: proto.mif /main/25 1996/09/13 13:51:45 ray $]
This introductory chapter is non-normative. That is, this chapter provides background and overview material and
is not part of the formal definition of the protocol, but will aid in understanding the audio system.
This document was nearly finished when work on the project stopped. The remaining questions and issues are
indicated by enclosing the questions with [question]. The major open issues are roughly as follows:
- Audio Manager Hooks - how the user interactively selects or switches input and output devices for
one or all applications.
- Device attributes - precisely what controls we wanted in common across all devices.
- A request and reply need to be added to allow fetching the string for a given atom.
- Preferences - whether to provide a preference database for applications to look at.
- A few vestigial trusted/untrusted references.
Please send comments to the author, Ray Tice (ray@x.org). At some point, when the X Consortium goes away,
that won't be useful. Then try Shawn McMurdo (shawnm@sco.com), Peter Derr (pderr@zk3.dec.com), or
Stephen Hocking (sysseh@devetir.qld.gov.au).
The X Audio System is the result of efforts by many people. The architect, Ray Tice, of the X Consortium wishes
to thank Mark Welch of the X Consortium, Peter Derr of Digital Equipment Corp., David Rivas of SunSoft,
Shawn McMurdo of SCO, [...fill in list...]
The X Audio System also owes much of its heritage to the X Window System, Digital's AF, NCD's Network
Audio System, and many other prior systems.
Simply put, the X Audio System provides applications access to audio services. These include the ability to play,
generate, and record audio clips. The system also allows these services to be coordinated with other services,
such as video or graphics.
The X Audio System shares many goals with the X Window System. Network transparency allows an application
to use hardware on the same machine, or across a network. Hardware independence allows programs to be
written once, but usable for a wide variety of audio hardware. Device sharing allows multiple applications to use
the audio hardware simultaneously. A C compatible common application programming interface (API) allows
programs to be portable across different platforms. And extensibility allows vendors to add additional
capabilities.
There are other goals for the audio services that are not shared with the core X protocol. For example, notions of
security, compression, and cooperation with other media have been built into the core audio protocol. These
allow better integration into a larger infrastructure, currently known as Broadway. Description of the Broadway
project is outside of the scope of this document, and the X Audio System can be used and understood without
further knowledge of Broadway.
Finally, the X Audio System has been designed to make writing simple programs simple, with the remainder of
the system learnable on an incremental basis. The programming model has been designed to fit well with
toolkits, so that a single programming style can be utilized throughout an application.
Targeted Applications
In order to ship in a timely manner, version 1.0 focuses on support for the following applications:
- Basic record and playback
- Audio on the web
- Playing synchronized audio/video clips
- Teleconferencing
- Support needed for NT audio device drivers
In addition, the architecture was selected to allow future growth.
Applications Not Targeted
There are also application areas for which advanced capabilities have been omitted from the X Audio System, or
have been intentionally omitted from version 1.0 of the core protocol. In most cases it is felt that such capabilities
are best layered on top of the audio system, designed as an extension to the core audio system, or could be
deferred past the first release. Non-goals for version 1.0 of the core protocol include the following:
- A generalized digital signal processing or filtering environment.
The system focuses on handling audio for human consumption, rather than providing signal analysis tools.
- Post-production or studio production.
It is not the goal of the audio system to provide a full sound studio environment within the core server.
- Internal provision of sophisticated multimedia synchronization paradigms.
The current audio system provides low level support for synchronization of other media to audio and a virtual
time model. Its architecture allows clients or extensions to provide higher level synchronization, but the core
protocol does not provide these directly.
- Control of generalized analog signal routing and processing.
- MIDI support in the core server.
- Full game support.
Note that since the audio system architecture is designed to be very extensible, these services could be added at a
later date. For further discussion, please see "Future Directions" on page 79.
The audio system meets the needs of targeted applications with the following features:
- Record and playback of audio clips
- Temporary storage of audio clips
- Encapsulation of audio hardware services in a server.
- Rate and format conversion.
- Explicit time model for audio data streams and devices.
- An extensible programming model and interface compatible with toolkits.
- [...]
The X Audio System uses a client-server architectural model, where audio hardware is abstracted into the server,
and the application becomes a client of that server to obtain audio services. The application becomes a client of
the server by opening a connection to the server.
The X Audio System defines three components: the API that the client uses to interact with the library, the
protocol that the library uses to interact with the server, and the object classes that the application manipulates via the
library and protocol. Objects exist on both the client and server sides, depending on what services they abstract.
This protocol specification defines the protocol and its encoding, as well as the server side objects. The library
interface and client side objects are defined in the X Audio System Library Specification. Note that it is possible
for someone to define their own library that also uses the audio protocol, as long as that library obeys the
protocol definition.
The object model uses the notion of classes. A class defines a list of values called attributes and defines the
meaning of each of these attributes for that class and what happens when these attributes change. Unlike some
object models, this object model defines only a few methods (or requests) on the object: create, destroy, get, set,
and (for some objects) read and write.
The protocol and C API are relatively small, since they provide a generic mechanism to create, destroy, modify,
and query objects. The complete client visible state of the server and library is presented as a collection of
objects. The classes of these objects are defined in the protocol and library specifications. The system provides
pre-created instances of some of these classes, and the application may create instances of some classes. It is not
intended for applications to subclass from these classes.
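As a rough illustration of how small the generic request set is, the following sketch models a server whose objects are nothing more than a tag plus a set of attributes. Python is used only for brevity. The ToyServer class and its method names are hypothetical; they stand in for the create, destroy, get, and set requests and are neither the protocol encoding nor the C API.

```python
import itertools


class ToyServer:
    """Hypothetical in-memory model: every object is a tag plus attributes."""

    def __init__(self):
        self._next_tag = itertools.count(1)
        self._objects = {}  # tag -> {"class": name, "attrs": {...}}

    def create(self, class_name, **attrs):
        # Create an instance of a class and return its tag.
        tag = next(self._next_tag)
        self._objects[tag] = {"class": class_name, "attrs": dict(attrs)}
        return tag

    def destroy(self, tag):
        del self._objects[tag]

    def get(self, tag, name):
        return self._objects[tag]["attrs"][name]

    def set(self, tag, name, value):
        self._objects[tag]["attrs"][name] = value


server = ToyServer()
port = server.create("port", run=True)
server.set(port, "run", False)
```

The point of the sketch is that adding a new class does not add new requests; it only adds new attribute names and their meanings.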
Server Classes
A client uses an instance of a server side "port" object to move data into and out of the server. The application
uses the port to access the buffer of an output "device" or input device in the server. A simple example may help
explain this.
A simple case is where an application has samples in its memory and would like to play them. To do this, the
application takes the following steps:
- Opens a connection to the server.
- Obtains a "format" object in the server that describes the sample rate and other characteristics of the samples.
- Creates a port object in the server to accept samples from the client for output to the default output device.
(The port will use the format created in the previous step to decode the audio samples sent to it.)
- Sends the audio samples to the port.
The diagram below shows the resulting setup, with client-created objects to the left of the vertical dashed line:
In the above figure, the port object receives the audio samples in the client's format and timeline. The format of
the client's audio samples is defined by the format object attached to the port. The port object converts the
samples to the format of the device and schedules the samples to the timeline of the output device for playback.
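The playback steps listed earlier can be walked through with a toy model. Everything in the sketch is an assumption made for illustration: the Connection class, the create_format, create_port, and write methods, and the sampleRate and channels attribute names are hypothetical and do not come from the library specification. The sketch mirrors only the order of operations.

```python
class Connection:
    """Hypothetical stand-in for a client connection to the audio server."""

    def __init__(self):
        self.objects = []

    def create_format(self, sample_rate, channels):
        fmt = {"class": "format", "sampleRate": sample_rate, "channels": channels}
        self.objects.append(fmt)
        return fmt

    def create_port(self, fmt):
        # outputBuffer is left at its default, the default output device.
        port = {"class": "port", "format": fmt, "queued": []}
        self.objects.append(port)
        return port

    def write(self, port, samples):
        # A real port would convert the samples to the device format and timeline.
        port["queued"].extend(samples)
        return len(samples)


conn = Connection()                                      # open a connection
fmt = conn.create_format(sample_rate=8000, channels=1)   # obtain a format object
port = conn.create_port(fmt)                             # create a port using that format
sent = conn.write(port, [0, 120, -120, 0])               # send the samples to the port
```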
To record samples, the process is very similar, except that the client creates a port object that makes the audio
samples from the default input device available for reading, and then the client fetches samples from the port. In
the figure below, client created objects are shown to the right of the vertical dashed line.

There are several other classes of objects in the server. Examples include buckets, which temporarily store audio clips in
the server, waveform objects, which generate synthetic audio signals, and other classes used for access control. In
fact, the entire client visible state of the server is presented as attributes on instances of the various classes.
Client Classes
The audio system allows for libraries to support client side objects. These can be used to simplify client
programs by providing higher level services. Some possibilities are:
- Player object
This encapsulates the logic needed to pace the feeding of audio samples to the server so the number of
unplayed audio samples queued in the server is neither too high nor too low.
- File object
A file class can be created to encapsulate knowledge of various audio file layouts.
- Event handler object
These can be used to receive events from the server and call application routines.
The actual list of client side objects depends on the library definition. Please refer to the appropriate specification
for details.
The X Audio System uses an object model to allow clients to allocate and use audio services in both the library
and server. This chapter of the document describes the object model used. In principle, this object model and the
related protocol are not specific to audio and are appropriate for many low to mid request rate client-server
problem domains. However, generalization of the object model and protocol is not a primary goal of the X Audio
System.
The X Audio System uses a connection oriented client-server approach to network transparency and sharing of
audio services. This approach differs from other object models, such as CORBA, that are not connection
oriented. All server side objects are manipulated through the connection.
The X Audio System revolves around the use of server side objects. These objects are used for several purposes:
- representing services provided by the server.
- describing server state to a client.
- describing client state to the server.
- allocating server resources by a client.
There are also client side objects for services such as decoding audio file formats. The client side objects are
described in the library specification.
All server side objects are instances of a static set of classes. Each class has a set of attributes used to express the
values associated with the object. Each class defines the meaning of its attributes. The server uses instances of
some classes to describe its state or advertise the services it provides. The client creates objects to describe its
state to the server or to allocate server resources. Note that in this specification, the term "object" refers to an
instance of a class and not the class itself. A single-inheritance form of subclassing is used; however, clients do
not have the ability to subclass or create new classes in the core protocol. Extensions may provide new classes
and capabilities.
All client operations on the server (known as requests) are done by operating on server objects. The client may
create and destroy objects, get and set attributes on objects, and on some objects, send data (such as audio
samples) to an object or retrieve data from an object. These operations are defined in the protocol chapter of this
document.
Server messages to the client are known as events. The server sends events in one of three circumstances:
- An attribute that the client has expressed interest in changes.
- The current request requires an event be sent. (A reply, in X11 parlance.)
- The request being processed is invalid. (An error, in X11 parlance.)
A client uses a monitor object to generate an event when an attribute changes. The Object Definitions chapter
defines the monitor class.
Tags are 32-bit numbers used to identify objects and atoms (atoms are defined below). Tags are more formally
defined in "Tags and Atoms" on page 58 of the Protocol Components chapter.
An atom binds a tag to a sequence of characters. This binding is unique: any FindAtom request with the same string
will return the same atom, until the server is reset. Atoms are commonly used in the protocol to represent
constants. For a more complete definition of atoms, see "Tags and Atoms" on page 58 of the Protocol Components
chapter.
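The uniqueness of the binding can be pictured with a small table. The sketch below is a hypothetical model only: the AtomTable class, the find_atom method, and the choice to allocate tags sequentially are assumptions, as is the behaviour of creating a binding for a string that has not been seen before.

```python
class AtomTable:
    """Hypothetical model of the string-to-tag binding kept by the server."""

    def __init__(self, first_tag=1):
        self._first_tag = first_tag
        self.reset()

    def reset(self):
        # A server reset discards every binding.
        self._by_string = {}
        self._next_tag = self._first_tag

    def find_atom(self, string):
        # The same string always yields the same tag until the next reset.
        if string not in self._by_string:
            self._by_string[string] = self._next_tag
            self._next_tag += 1
        return self._by_string[string]


atoms = AtomTable()
gain = atoms.find_atom("gain")
mix = atoms.find_atom("mix")
```

Because the binding is stable for the life of the server, a client can look a string up once and compare tags thereafter instead of comparing strings.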
As mentioned in the previous chapter, the audio protocol uses atoms to identify attributes, and in some cases as
values for attributes. Since an atom consists of both a character string and a tag, the following conventions have
been established:
- XaN* is the character string of the atom used as a name. (e.g. XaNgain)
- XaA* is the tag of the atom, for when it is used as a value.
- XaP* is an atom used for parsing the protocol stream.
- XaT* is a tag. (e.g. XaTnone)
In this specification, attributes follow this example - For the attribute "gain":
- The atomized string is "gain".
- The string of the atom is referred to as XaNgain.
- The tag of the atom is XaAgain.
Parse atoms, used for parsing requests, follow slightly different rules. To distinguish them from normal atoms,
they use the reserved prefix "XaP" as part of their string. For example, the XaParray atom:
- The atomized string is "XaParray"
- The string of the atom is referred to as XaParray.
- The tag of the atom is XaAarray. (protocol parse tags can be distinguished by value.)
In the X Audio System, any object that has been created will continue to exist until it is no longer referenced, or
until the server is reset or killed. There are three types of references:
- Creator reference:
This is the reference held by the client (or sometimes the server) that created the object. In the case of an
object created with a create request, the reference is removed by a destroy request or by the shutdown of the
creating client.
- Inter-object reference:
This refers to when the tag of the object is used as the value of another server object's attribute. (A reference
to a server object by a client side object is not counted.)
- TagRange reference.
An object implicitly references the tag range object that its tag belongs to.
The following inter-object references are not counted for lifetime purposes, and these references are
automatically removed when the object is destroyed:
- Reference by the "objects" attribute of the connection class.
- Reference by the "monitors" attribute of the core class.
- Reference by the "retainedObjects" attribute of the connection class for an untrusted connection. [xxx should
eliminate trusted/untrusted distinction].
It should be noted that the client's connection object remains after connection shutdown until all other objects
created by the client have been destroyed. This is a result of the references by the connection attribute of the core
class for the objects belonging to that connection.
Using the above rules, a client can save one or more of its objects past connection shutdown in a number of
ways:
- Add it to the retainedObjects attribute of its connection object.
- Add it to an advertised list (such as the buckets attribute of the server object).
- Make sure that it is actively being used by a longer-lived connection.
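The lifetime rules can be sketched as simple bookkeeping. The model below is a deliberately reduced assumption: it tracks only the creator reference and counted inter-object references, and it omits tag range references, the uncounted references listed above, and the release of references held by an object that is itself removed. The LifetimeModel class and its method names are hypothetical.

```python
class LifetimeModel:
    """Hypothetical bookkeeping for creator and inter-object references."""

    def __init__(self):
        self.creator_ref = {}  # tag -> True while the creator reference is held
        self.referrers = {}    # tag -> set of tags whose attributes name it

    def create(self, tag):
        self.creator_ref[tag] = True
        self.referrers[tag] = set()

    def add_reference(self, from_tag, to_tag):
        self.referrers[to_tag].add(from_tag)

    def drop_reference(self, from_tag, to_tag):
        self.referrers[to_tag].discard(from_tag)
        self._collect(to_tag)

    def destroy(self, tag):
        # A destroy request or client shutdown removes only the creator reference.
        self.creator_ref[tag] = False
        self._collect(tag)

    def exists(self, tag):
        return tag in self.creator_ref

    def _collect(self, tag):
        # The object goes away once nothing references it any longer.
        if self.exists(tag) and not self.creator_ref[tag] and not self.referrers[tag]:
            del self.creator_ref[tag]
            del self.referrers[tag]


model = LifetimeModel()
model.create("server")
model.create("bucket")
model.add_reference("server", "bucket")   # e.g. listed in the buckets attribute
model.destroy("bucket")                   # the creating client shuts down
survives = model.exists("bucket")
model.drop_reference("server", "bucket")  # removed from the advertised list
gone = not model.exists("bucket")
```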
The class definitions below use tables to describe the attributes. Each row in the table contains an attribute's
name, type, default value on creation, and which requests can access the attribute. If a class subclasses from
another class, it inherits the attributes of its parent class. The subclass is permitted to change the set of valid
values for that attribute or change the semantics of the attribute, provided this change is documented in the subclass.
In the tables, types can be followed by square brackets. If the brackets contain a number, it means the attribute is
an array of values, with the indicated number of dimensions. If the brackets contain a "c", it indicates that the
attribute contains a collection of values.
The access column of the table describes which requests can use the given attribute. The entries are the letters C,
S, G, and M. C, S, and G refer to the corresponding protocol requests, Create, Set and Get, while M (mandatory)
refers to a field that must be filled in by the client before using the object.
If the Default column of the table has the entry "(dynamic)", then the default value depends on the conditions
present when the object is created. An entry of "(empty)" means that the collection defaults to having no values
in it.
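The access column can be read mechanically, as the following sketch suggests. The three rows are taken from the class tables in this chapter (with the unexplained asterisk on the format entry dropped); the helper functions and their names are hypothetical and are not part of the protocol.

```python
# A few rows from the class tables: attribute name -> access letters.
ACCESS = {
    "classID": "G",
    "name": "CSG",
    "format": "MCSG",
}

REQUEST_LETTER = {"Create": "C", "Set": "S", "Get": "G"}


def request_allowed(access_table, attribute, request):
    """Return True if the named request may use the attribute."""
    return REQUEST_LETTER[request] in access_table[attribute]


def missing_mandatory(access_table, supplied):
    """List attributes marked M that the client has not yet filled in."""
    return [name for name, access in access_table.items()
            if "M" in access and name not in supplied]
```

For example, request_allowed(ACCESS, "classID", "Set") is false because classID is marked G only, and missing_mandatory(ACCESS, {"name"}) reports that format still needs a value before the object is used.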
All object classes inherit from the core class. The core class allows objects to be queried for their type, and
provides notification support. The following attributes are defined for the core class:
TABLE 1. Attributes common to all classes
-------------------------------------------------
| Name | Type | Default | Access |
=================================================
| classID | TAG | (dynamic) | G |
-------------------------------------------------
| name | TAG | XaTnone | CSG |
-------------------------------------------------
| monitors | TAG [c] | (empty) | CSG |
-------------------------------------------------
| connection | TAG | (dynamic) | G |
-------------------------------------------------
| access | TAG | XaTnone | CSG |
-------------------------------------------------
- classID
- The classID attribute identifies the object's class. The attribute contains the tag of the corresponding class
object. If the class is a subclass of another class, only the subclass is reported. See the "Class Class" on
page 34 for more information.
- name
- The name attribute is the tag of a string object that describes the object.
- monitors
- The monitors field contains references to monitor objects. Monitors, defined later in this chapter, send events
according to certain conditions in the referring object. See "Events and Errors" on page 68 for more
information about events. The monitors field is defined as a collection, and may contain more than one monitor. (A
complete description of collections is in the "Protocol Components" chapter.) If a set request is done with a
scalar instead of a collection operation, the scalar is treated as a collectionAdd operation.
- connection
- The connection attribute contains the tag of the connection that created the object. If the object belongs to the
server instead of a connection, then the attribute has the value XaTnone.
- access
- The access attribute contains a collection of access objects that determine what other connections may access
this object, in addition to whatever access was granted by the connection object that owns this object. See
"Security" on page 53 for more detail.
Instances of the core class are never created directly. Instances of subclasses of the core class are created instead.
A buffer is an object that holds samples. There are four subclasses of buffers defined in the core protocol. They
are devices, buckets, waveforms and ports. It is possible for extensions to define other subclasses of the buffer
class.
The buffer class is an abstract class. That is, you can create an instance of a subclass of buffer,
but not of the buffer class itself.
TABLE 2. Buffer Attributes
----------------------------------------------------------
| Name | Type | Default | Access |
==========================================================
| format | TAG | XaTnone | MCSG* |
----------------------------------------------------------
| supportedFormats | TAG[c] | (empty) | G |
----------------------------------------------------------
| latestTime | TIME | 0 | CSG |
----------------------------------------------------------
| earliestTime | TIME | 0 | CSG |
----------------------------------------------------------
| bufferSize | CARD32 | (dynamic) | CSG |
----------------------------------------------------------
| gain | DECIBELS | 0x0 (Unity) | CSG |
----------------------------------------------------------
| mix | DECIBELS | no mixing | CSG |
----------------------------------------------------------
The buffer class also inherits the attributes of the core class.
- format
- The format attribute is the tag of a format object that describes the encoding of the contents of the buffer. The
format must be one of those listed in the supportedFormats attribute, or XaTnone. Some subclasses may require that the
format attribute not be XaTnone for some operations.
- supportedFormats
- The supportedFormats attribute expresses the formats supported by this object. [5:Need description of how].
- latestTime
- The latestTime attribute is the timestamp of the latest complete sample in the buffer, in the time co-ordinates
of the buffer.
- earliestTime
- The earliestTime attribute is the timestamp of the earliest complete sample in the buffer, in the time
co-ordinates of the buffer.
- gain
- The gain attribute is the amplification factor to apply to the signal. For input buffers, it should be used to
express the amount of amplification to produce a signal of standard volume. [6:Need definition of standard
volume]. Gain is expressed in decibels in signed 16.16 fixed-point notation, where 3dB represents approximately
a doubling in power, and 6dB approximately a doubling in amplitude. (0 corresponds to unity, 0x80000000 is
off.) [7:Is this the right expression of gain?]
- mix
- The mix attribute describes cross-track mixing of the signal. It is represented as a two-dimensional array, with
a gain factor (using the above description) for each input/output pair. The default matrix does no cross-track
mixing. (i.e., the diagonal elements are unity gain, and all others are off.)
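The gain encoding and the default mix matrix described above can be made concrete with a short sketch. Several points are assumptions rather than statements of the protocol: that the 32-bit value is two's complement, that "off" behaves as a gain of minus infinity, that matrix rows index inputs and columns index outputs, and that the usual decibel formulas apply. All function names are hypothetical.

```python
GAIN_UNITY = 0x00000000  # 0 dB
GAIN_OFF = 0x80000000    # bit pattern the text reserves for "off"


def db_to_fixed(db):
    """Encode decibels as signed 16.16 fixed point stored in 32 bits."""
    return round(db * 65536) & 0xFFFFFFFF


def fixed_to_db(raw):
    """Decode a 32-bit signed 16.16 value; "off" decodes to -infinity."""
    if raw == GAIN_OFF:
        return float("-inf")
    if raw & 0x80000000:
        raw -= 1 << 32  # undo two's complement
    return raw / 65536


def amplitude_factor(raw):
    """Linear amplitude multiplier: 10 ** (dB / 20)."""
    return 10 ** (fixed_to_db(raw) / 20)


def power_factor(raw):
    """Linear power multiplier: 10 ** (dB / 10)."""
    return 10 ** (fixed_to_db(raw) / 10)


def default_mix(tracks):
    """Default matrix: unity gain on the diagonal, off everywhere else."""
    return [[GAIN_UNITY if i == j else GAIN_OFF for j in range(tracks)]
            for i in range(tracks)]


def apply_mix(matrix, frame):
    """Mix one frame; matrix[i][o] is the gain from input i to output o."""
    return [sum(frame[i] * amplitude_factor(matrix[i][o])
                for i in range(len(frame)))
            for o in range(len(matrix[0]))]
```

With these definitions a gain of 6 dB is stored as 0x00060000, and applying default_mix(2) to a two-track frame returns the frame unchanged, which is what "no cross-track mixing" means.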
A buffer object that accepts samples may be listed as the output buffer of multiple port objects. In this case the
buffer must mix the samples provided by the ports. A buffer object that produces samples may be the input buffer
for multiple port objects, in which case the samples are shared among the ports. [Except buffers that are ports.
This is awkward, specification-wise.]
Port objects move data into and out of buffer objects (including devices). The port can move data between
buffers, place data into a buffer from a write request on the port, or retrieve data from a buffer for read requests on the
port. The port can also optionally be used to store samples to make this movement occur more smoothly. As
audio samples move through the port they are translated from the source format and timeline to the destination
format and timeline. A discussion of data transfer and its timing using ports is in chapter "Time" on page 40.
The port class is a subclass of buffer, and defines the following additional attributes:
TABLE 3. Port Attributes
--------------------------------------------------------
| Name | Type | Default | Access |
========================================================
| continuity | BOOL | True | CSG |
--------------------------------------------------------
| ordered | BOOL | True | CSG |
--------------------------------------------------------
| syncPolicy | TAG | XaAoutput | CSG |
--------------------------------------------------------
| inputBuffer | TAG | (dynamic) | CSG |
--------------------------------------------------------
| outputBuffer | TAG | (dynamic) | CSG |
--------------------------------------------------------
| run | TAG | True | CSG |
--------------------------------------------------------
| overflowAction | TAG | ??? | CSG |
--------------------------------------------------------
| underflowAction | TAG | ??? | CSG |
--------------------------------------------------------
| format | TAG | none | CSGM* |
--------------------------------------------------------
| inputTimestamp | CARD32 | (dynamic) | CSG |
--------------------------------------------------------
| inputBufferTimestamp | CARD32 | (dynamic) | G? |
--------------------------------------------------------
| outputTimestamp | CARD32 | (dynamic) | CSG |
--------------------------------------------------------
| outputBufferTimestamp | CARD32 | (dynamic) | G? |
--------------------------------------------------------
| inputVolatileSize | CARD32 | (dynamic) | CSG |
--------------------------------------------------------
| outputVolatileSize | CARD32 | (dynamic) | CSG |
--------------------------------------------------------
| samplesQueued | CARD32 | (dynamic) | G |
--------------------------------------------------------
| pitchTempoNum | CARD32 | 1 | CSG |
--------------------------------------------------------
| pitchTempoDen | CARD32 | 1 | CSG |
--------------------------------------------------------
The port class also inherits the attributes of the "Buffer Class" on page 11. The meaning of the format attribute
defined by the buffer class is augmented.
- continuity
- Whether the data coming from the client is a continuous stream, or has gaps in it. This attribute affects
synchronization, since otherwise the server cannot distinguish late data from a gap in the data.
- ordered
- Whether the data coming from the client will always arrive in timestamp order. Setting this to true may allow
the server to work more efficiently, since it can discard out of order requests. [Is this really useful?]
- syncPolicy
- [This needs rework.] The synchronization expected between the port and the buffers. This may be one of
{XaAsyncInput, XaAsyncOutput, XaAsyncBoth}. XaAsyncInput causes the port to synchronize to the input
buffer. XaAsyncOutput causes the port to synchronize to the output buffer. A synchronization of XaAsyncBoth
will cause the port to wait if needed for either the input or output buffer associated with this port.
Servers may provide other synchronization policies.
- inputBuffer, outputBuffer
- These are the tags of the source and destination buffers that the port should use for the audio data it handles.
The inputBuffer and outputBuffer will default to the default input and default output devices of the server.
[Subject to preferences and the audio manager.???] If the client wishes to perform write operations on the
buffer, then inputBuffer must be set to XaTnone. If the client wishes to perform read operations on the buffer,
then outputBuffer must be set to XaTnone. The inputBuffer or outputBuffer attributes can be set to XaTnone
only at port creation time, and at most one of them can be set to XaTnone. [8:should we reintroduce direction
attribute? Would that make things easier or harder to understand?] It is an error to set inputBuffer to a buffer
that is not an input buffer or to set outputBuffer to a buffer that is not an output buffer.
- run
- The run attribute enables or inhibits data transfer to the output buffer and from the input buffer, and the
corresponding updates to earliestTimestamp and latestTimestamp. It does not affect whether read and write
requests can be performed on the port. This attribute does not affect the update of inputTimestamp,
inputBufferTimestamp, outputTimestamp, or outputBufferTimestamp.
When the run attribute value is changed to true, the port will perform any underflow, overflow, or
synchronization actions needed to resume operation. If any other attributes are set in the same request that changes
the state of the run attribute, the effect is as if they are set before run transitions to true, or after run transitions
to false.
- overflowAction
- This attribute specifies what to do with data in the port when data must be transferred from the port to an
output buffer that is too full to accept it. XaAoverflowDropOld and XaAoverflowDropNew actions are defined in
the core protocol. The XaAoverflowDropOld action will cause the port to discard its oldest samples, while
the XaAoverflowDropNew action will cause the newest samples to be dropped. Other actions may be defined
via server extensions.
- underflowAction
- This attribute specifies what action to take when the port requires samples from an input buffer that cannot
provide them. The core protocol defines the XaAunderflowSilence and XaAunderflowRepeat actions.
Other actions may be provided by server extensions. The XaAunderflowSilence action will cause the buffer
to insert silence when data is requested and none is available. The XaAunderflowRepeat action will repeat the
last available sample. [Might want a "fade" action, to minimize pops?]
- format
- The format attribute is applied to client read and write operations and must be set to a valid format for these
operations [else some kind of error happens].
- inputTimestamp, inputBufferTimestamp
- The inputTimestamp and inputBufferTimestamp attributes of the port represent the latestTimestamp attribute
of the input buffer. The inputTimestamp attribute presents the value in port time coordinates, while the
inputBufferTimestamp attribute presents the value in the input buffer's time coordinates. See the port description
below for more details on these attributes. If there is no input buffer to this port, then both inputTimestamp
and inputBufferTimestamp are equivalent to the latestTimestamp of the port. [should one of them be different
for out of order write requests?]
- outputTimestamp, outputBufferTimestamp
- The outputTimestamp and outputBufferTimestamp attributes of the port represent the earliestTimestamp
attribute of the output buffer. The outputTimestamp attribute presents the value in port time coordinates,
while outputBufferTimestamp presents the value in the output buffer's time coordinates. If there is no output
buffer to this port, then both outputTimestamp and outputBufferTimestamp are equivalent to the
earliestTimestamp of the port. [should one of them be different for out of order read requests?]
- inputVolatileSize, outputVolatileSize
- The inputVolatileSize attribute specifies the maximum delay, as specified by the client, between data arriving
in the input buffer and its transferral to the port's buffer. The delay is specified in samples in port time
coordinates. The outputVolatileSize is similar, except it controls how far ahead of the port's earliest time
data may be moved to the output buffer.
Specifying a volatile size that is too large or too small may cause data to be dropped. Also, a large value can cause
poor interactivity, and a small value can produce a high processing load in some implementations. The
volatileSize attribute of the input and output buffers gives minimum values for reliable operation. The client can
specify smaller values than these limits, but the implementation may choose to use any value between that
specified by the port, and the minimum advertised on the buffer.
If there is either no input or output buffer, the corresponding volatile size attribute is ignored.
- samplesQueued
- The samplesQueued attribute returns (outputTimestamp - inputTimestamp) which represents the number of
samples in the server for this port and its buffers. This provides a measure of latency in the server. (This value
could be computed by the client from other attributes in the port, however, samplesQueued is provided to
simplify event generation for flow control purposes.)
- pitchTempoNum, pitchTempoDen
- These two attributes are unsigned, nonzero values that form a ratio (numerator/denominator) used to affect
the pitch and tempo of data passed through the port. Pitch and tempo are controlled together - doubling the
pitch (apparent frequency) will also double the tempo (rate of playback). These controls are not normally
useful for input buffers that produce data in real-time, such as microphones. They are useful for waveform
buckets, and client data.
Use of these controls is not needed for format conversion between source and destination - the port calculates
sample rate conversions directly from format information of the source and destination.
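The effect of the pitchTempoNum/pitchTempoDen ratio can be illustrated with a small calculation. This is only a sketch: the function name is hypothetical, and it assumes that a ratio greater than 1 raises the pitch and shortens playback, which is how the text above describes doubling.

```python
from fractions import Fraction

def apply_pitch_tempo(native_hz, duration_s, pitch_tempo_num, pitch_tempo_den):
    """Return (apparent frequency, playback duration) for a pitch/tempo ratio.

    ASSUMPTION: the ratio num/den multiplies the apparent frequency and
    divides the playback duration; the function is illustrative only.
    """
    if pitch_tempo_num == 0 or pitch_tempo_den == 0:
        raise ValueError("pitchTempoNum and pitchTempoDen must be nonzero")
    ratio = Fraction(pitch_tempo_num, pitch_tempo_den)
    return native_hz * ratio, duration_s / ratio

# A 1000 Hz waveform lasting 4 seconds, played with a ratio of 2/1.
apparent_hz, playback_s = apply_pitch_tempo(1000, 4, 2, 1)
```

Using exact fractions avoids rounding drift when the ratio is not a power of two, for example 3/2.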
The port object inherits attributes from the buffer class. The inherited volatileSize attribute is ignored for port
objects. The inherited bufferSize attribute controls the maximum number of samples that can be stored in the port.
This value can be 0, which is useful in certain situations.
The details of using ports for data transfer and the timing of that transfer are quite lengthy, and are discussed in
"Time" on page 40.
The supported operations on port objects are XaCreate, XaDestroy, XaSet, XaGet, XaRead, and XaWrite.
Input and Output Timestamps
The port object allows applications to estimate the current latency in the server by providing the latest timestamp
of the input buffer in port coordinates as the inputTimestamp and outputTimestamp attributes.
Some applications also need to know the relationship between port timestamps and timestamps in either the input
or output buffer. The port supports this by supplying, in input buffer coordinates, a timestamp equivalent to
inputTimestamp. This additional attribute is known as inputBufferTimestamp. Similarly, an
outputBufferTimestamp is also provided. Knowledge of equivalent timestamps and their associated rates allows an
application to estimate the equivalent timestamps in the future.
The application can also use these four attributes to set time mappings. A new inputTimestamp value for the
current value of inputBufferTimestamp can be set by performing a set request on the inputTimestamp attribute with
the new inputTimestamp value. This is the most common form of setting inputTimestamp, and is a specific case of a
more powerful rule, described below, that allows positioning of the port's timeline relative to the input buffer's
timeline.
If a set request specifies a new value for inputTimestamp (IT) and/or inputBufferTimestamp (IBT), a new value
for IT is calculated as follows:
This set request does not change the value of IBT. If the request has only one of IT.request or IBT.request, the old
value of the corresponding attribute is used in the calculation. The sample rates used in the calculation come
from the format objects associated with the port and input buffer.
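One plausible reading of this rule, offered only as a sketch, is a linear mapping between the two timelines: the requested pair (IT.request, IBT.request) is treated as a pair of equivalent timestamps, and the new IT is whatever port time corresponds to the unchanged IBT. The function names, the argument layout, and the exact form of the calculation are assumptions; timestamp wraparound is ignored.

```python
from fractions import Fraction

def to_port_time(buffer_time, it, ibt, port_rate, buffer_rate):
    """Map an input buffer timestamp to port coordinates.

    (it, ibt) is a known pair of equivalent timestamps. ASSUMPTION: the
    mapping is linear and scaled by the ratio of the two sample rates.
    """
    return it + Fraction(buffer_time - ibt) * Fraction(port_rate, buffer_rate)

def new_input_timestamp(it_old, ibt_old, port_rate, buffer_rate,
                        it_request=None, ibt_request=None):
    """Compute the new IT for the current (unchanged) IBT.

    A value missing from the request falls back to the old value of the
    corresponding attribute, as the text describes.
    """
    it_ref = it_old if it_request is None else it_request
    ibt_ref = ibt_old if ibt_request is None else ibt_request
    return to_port_time(ibt_old, it_ref, ibt_ref, port_rate, buffer_rate)

# Port at 16000 samples/s, input buffer at 8000 samples/s.
future_port_time = to_port_time(60, 100, 50, 16000, 8000)
```

Under this reading, a request that carries only IT.request yields exactly that value as the new IT, which matches the common case described above. The same mapping also lets an application extrapolate equivalent timestamps into the future.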
The attributes above are updated whenever a get request reports their values back to the client and, for event
triggering purposes, whenever the earliestTimestamp attribute of the output buffer changes. Additionally, a server
implementation may update them more frequently. [Update granularity is still under discussion.]
XaRead Semantics on Port Objects
[...]
XaWrite Semantics on Port Objects
[...]
[Need way to tell what timeline the port defaults to.]
The device class is a subclass of the buffer class. Device objects are used to express server hardware that captures
or emits audio signals. For example, a Digital to Analog Converter (DAC) hooked to a speaker would be modeled
in the server as a device. The additional attributes of the device class are:
TABLE 4. Device Attributes
-------------------------------------------------------
| Name | Type | Default | Access |
=======================================================
| direction | Atom | (dynamic) | G |
-------------------------------------------------------
| volatileSize | CARD32 | None | G |
-------------------------------------------------------
| lineLevel | ??? | ??? | SG |
-------------------------------------------------------
| jacks??? | ??? | ??? | SG |
-------------------------------------------------------
| jacksSupported??? | ???[1] | ??? | SG |
-------------------------------------------------------
| switches????? | ??? | ??? | SG |
-------------------------------------------------------
The device class also inherits the attributes of the "Buffer Class" on page 11.
[11:Obviously, how to express jacks, speakers, etc., is still up in the air.]
- direction
- The direction attribute indicates whether the device is an input or output device. It has two valid values -
XaAinput and XaAoutput.
- volatileSize
- The volatile size is the number of samples of the buffer that may be in transport out of the buffer to the output
hardware, or in transport into the buffer from input hardware. An application should avoid access to this
portion of the buffer of an output device, since samples written to the volatile zone may arrive in the buffer after
the hardware has read that portion of the buffer. Therefore playback of that data is not guaranteed. For input
devices, attempts to access the volatile zone may cause the request to stall or underflow, if data is not
available. See "Time" on page 40 for more details. [12:Is there a similar zone on the other end of the buffer, since
data transfer is typically not continuous, but occurs in bursts?]
In the core protocol, devices are created by the server and cannot be created or destroyed by clients. When a
device is created, it is added to either the outputBuffers or inputBuffers attribute of the server object, depending
on whether it is an output device or an input device. A device is never both an output device and an input device,
and its direction never changes.
The supported operations on core device objects are XaSet and XaGet.
The bucket class is a subclass of the buffer class. A bucket is used to store samples for later playback. Bucket
objects follow the normal object lifetime rules. (See "Object Lifetime" on page 8 for details.) The bucket class
defines the following attributes:
TABLE 5. Bucket Attributes
----------------------------------------------------
| Name | Type | Default | Access |
====================================================
| maxSize? | ??? | ??? | G??? |
----------------------------------------------------
| overflowAction | ATOM | XaAoverflowDropNew | CSG |
----------------------------------------------------
The bucket class also inherits the attributes of the "Buffer Class" on page 11.
- maxSize???
- [do we want to allow size limits for a bucket - does this make sense if we don't declare the format of a
bucket?]
- overflowAction
- The overflow action is applied when a port transfers samples with a minimum size that, combined with the
currently stored samples, exceeds the bucket's size. The valid values in the core protocol for the
overflowAction attribute of the bucket object are:
XaAoverflowDropOld: Drop the oldest samples as needed to make room for the new samples.
XaAoverflowDropNew: Accept incoming samples until full, and then drop incoming samples, up to the
minimum transfer size.
XaAoverflowDont: Reject the transfer of samples that will not fit, which forces the port to apply its overflow
action.
XaAoverflowDropUsed: Drop the oldest samples in the port that have been copied by all ports using this
bucket as an input buffer. Refuse transfer of samples that will not fit, which forces the port to apply its
overflow action.
[Do we drop *all* used samples as soon as they are used, or only enough to satisfy minimum transfers? If the
answer is the former, does it deserve to be a separate attribute that is applied at every read transfer?]
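The first three overflow actions can be sketched as a small simulation. This is an illustration only: the function name and the list-based bucket are hypothetical, the minimum transfer size is not modeled, XaAoverflowDont is assumed to refuse the whole transfer, and XaAoverflowDropUsed is left out because its semantics are still an open question.

```python
def store(bucket, incoming, capacity, overflow_action):
    """Return (new bucket contents, count of samples refused back to the port).

    ASSUMPTIONS: samples are list items, capacity is the bucket size in
    samples, and len(bucket) never exceeds capacity.
    """
    room = capacity - len(bucket)
    if len(incoming) <= room:
        return bucket + incoming, 0          # everything fits, no overflow
    if overflow_action == "XaAoverflowDropOld":
        combined = bucket + incoming
        # Keep only the newest `capacity` samples.
        return combined[len(combined) - capacity:], 0
    if overflow_action == "XaAoverflowDropNew":
        # Fill the remaining room, silently drop the rest.
        return bucket + incoming[:room], 0
    if overflow_action == "XaAoverflowDont":
        # Refuse the transfer; the port must apply its own overflow action.
        return bucket, len(incoming)
    raise ValueError("overflow action not modeled: " + overflow_action)

kept, refused = store([1, 2, 3], [4, 5], 4, "XaAoverflowDropOld")
```

With a four-sample bucket holding three samples, two incoming samples overflow by one: DropOld discards the oldest stored sample, DropNew discards the last incoming sample, and Dont leaves the bucket unchanged.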
See "Port Class" on page 13 for details on manipulating apparent playback pitch and tempo.
A bucket can be used as both an input buffer and an output buffer. [Need more discussion of buckets.]
The supported operations on bucket objects are XaCreate, XaDestroy, XaSet, and XaGet.
The waveform class is a subclass of buffer that synthesizes samples of a function at the required sample rate. For
example, a waveform object might produce a 1000 Hz sine wave at any sample rate.
If a waveform is periodic, it has a native periodicity of 1 ms (1000 Hz). The apparent frequency can be changed by
manipulating the port used for playback. See the "Port Class" on page 13 for details on changing the apparent
frequency.
The waveform class inherits the attributes of the "Buffer Class" on page 11.
For the waveform types supported by the core protocol, the waveform objects are created by the server when the
server is started. Clients cannot create or destroy objects of the waveform class (although subclasses may
override this restriction). The predefined waveforms can be distinguished from one another by their names. The
predefined waveforms are defined in "Waveforms" on page 37 of the predefined objects chapter of this
document.
Waveforms can be used as input buffers, but not output buffers.
The supported operation for the waveform class is XaGet.
The sampledWaveform class allows a portion of the contents in a bucket to be used in a repeating fashion to
function as a waveform. This class is a subclass of the waveform class, and has the following attributes:
TABLE 6. SampledWaveform Attributes
---------------------------------------------------
| Name | Type | Default | Access |
===================================================
| bucket | TAG | XaTnone | CSG |
---------------------------------------------------
| earliestTime | CARD32 | (dynamic) | CSG |
---------------------------------------------------
| latestTime | CARD32 | (dynamic) | CSG |
---------------------------------------------------
| repetitions | CARD32 | 0xFFFFFFFF | CSG |
---------------------------------------------------
The sampledWaveform class also inherits from the "Waveform Class" on page 20.
- bucket
- The bucket attribute contains the tag of the bucket whose samples are used. If the value is XaTnone, then
silence is produced.
- earliestTime, latestTime
- The earliestTime and latestTime attributes reflect the range of samples used from the bucket. At creation
time, the default values of these attributes are copied from the earliestTime and latestTime of the bucket. If the
bucket attribute is XaTnone when the sampledWaveform object is created, then these attributes default to 0.
- repetitions
- The repetitions attribute specifies how many times the contents of the bucket are presented. A value of 0
indicates no repetitions (the bucket is played once), a value of 1 indicates 1 repetition, and so on. A value of
0xFFFFFFFF indicates that the contents should be repeated endlessly.
A sampledWaveform object presents samples from a bucket in a repeating fashion. The number of samples it
uses from the bucket is:
The waveform maps these samples to timestamps in the range of:
A waveform timestamp within that range would map to a bucket timestamp of:
All samples outside this range are treated as silence, as well as any samples that map to samples outside of the
bucket's current storage.
Samples are fetched from the bucket as needed - changing bucket contents used by the sampledWaveform affects
the output of the sampledWaveform.
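The three expressions referred to above are not present in this text, so the following is only a guess at the intended mapping, under loudly stated assumptions: the range [earliestTime, latestTime) is treated as half-open, the waveform's own timeline is assumed to start at 0, and all function names are hypothetical.

```python
ENDLESS = 0xFFFFFFFF  # repetitions value meaning "repeat endlessly"

def samples_used(earliest_time, latest_time):
    # ASSUMPTION: half-open range, so latestTime itself is not included.
    return latest_time - earliest_time

def waveform_range(earliest_time, latest_time, repetitions):
    """Return (start, end) of valid waveform timestamps; end is None if endless."""
    n = samples_used(earliest_time, latest_time)
    if repetitions == ENDLESS:
        return 0, None
    # 0 repetitions plays the samples once, 1 repetition plays them twice.
    return 0, n * (repetitions + 1)

def to_bucket_time(t, earliest_time, latest_time, repetitions):
    """Map waveform timestamp t to a bucket timestamp, or None for silence."""
    n = samples_used(earliest_time, latest_time)
    start, end = waveform_range(earliest_time, latest_time, repetitions)
    if n <= 0 or t < start or (end is not None and t >= end):
        return None
    return earliest_time + t % n

bucket_time = to_bucket_time(13, 100, 110, 1)
```

A caller would still need to treat a mapped timestamp as silence when it falls outside the bucket's current storage, which this sketch does not model.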
The supported operations on sampledWaveform objects are XaCreate, XaDestroy, XaSet, and XaGet.
A format object describes certain characteristics of a stream of audio data. Each attribute of a format describes
some particular aspect of the stream. Some of the attributes apply only to conventional sampled data (such as the
sample width and number of channels), whereas others (such as the encoding tag) apply to all streams.
If a conventional sampled data stream has more than one channel, it is assumed that the channels are multiplexed
on a per-sample basis. Vendor defined formats may have other multiplexing schemes.
TABLE 7. Format Attributes
---------------------------------------------------------
| Name | Type | Default | Access |
=========================================================
| encoding | TAG | XaAEncodeLinear | CG |
---------------------------------------------------------
| bigEndian | BOOL | True | CG |
---------------------------------------------------------
| numChannels | CARD32 | 2 | CG |
---------------------------------------------------------
| bitsPerSample | CARD32 | 8 | CG |
---------------------------------------------------------
| sampleRate | CARD32 | 8000??? | CG |
---------------------------------------------------------
The format object also inherits the attributes of the core class.
- encoding
- The encoding attribute contains an atom that indicates how numbers in the data stream represent the
magnitude of the sample. The predefined encodings are listed in XXX. For some encodings, the encoding forces
limitations on the other attributes of the format.
- bigEndian
- The bigEndian attribute indicates whether the most significant or least significant bits occur first in multi-byte
samples.
- numChannels
- The numChannels attribute defines the number of channels of data in the byte stream. For each timestamp,
there are numChannels samples, one for each channel.
- bitsPerSample
- The bitsPerSample attribute defines the number of bits per sample for fixed sample width encodings.
- sampleRate
- The sampleRate attribute defines the number of samples per second declared for this format.
The supported operations on format objects are XaCreate, XaDestroy, and XaGet.
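As an illustration of how the format attributes describe a conventional sampled stream, the sketch below splits a per-sample multiplexed byte stream into channels. The function names are hypothetical, samples are assumed to be signed integers (the text does not say), and bitsPerSample is assumed to be a multiple of 8.

```python
def bytes_per_frame(num_channels, bits_per_sample):
    """Bytes occupied by one timestamp's worth of samples (one per channel)."""
    if bits_per_sample % 8 != 0:
        raise ValueError("this sketch only handles whole-byte sample widths")
    return num_channels * bits_per_sample // 8

def deinterleave(data, num_channels, bits_per_sample, big_endian, signed=True):
    """Split per-sample multiplexed bytes into one list of samples per channel.

    ASSUMPTION: linear integer samples; `signed` is not a format attribute
    in the text and is included only to make the decoding explicit.
    """
    width = bits_per_sample // 8
    if len(data) % bytes_per_frame(num_channels, bits_per_sample) != 0:
        raise ValueError("data does not contain a whole number of frames")
    order = "big" if big_endian else "little"
    channels = [[] for _ in range(num_channels)]
    for offset in range(0, len(data), width):
        sample = int.from_bytes(data[offset:offset + width], order, signed=signed)
        channels[(offset // width) % num_channels].append(sample)
    return channels

left, right = deinterleave(bytes([0, 1, 0, 2, 0, 3, 0, 4]), 2, 16, True)
```

Multiplying bytes_per_frame by sampleRate gives the byte rate of the stream, which can be handy when sizing buffers.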
A monitor object generates events in response to operations on an object that the monitor is attached to. The
attributes of a monitor allow an application to select which operations and conditions trigger event
generation. The monitor object is a subclass of the core class and has the following attributes:
TABLE 8. Monitor Attributes
---------------------------------------------------------
| Name | Type | Default | Access |
=========================================================
| eventType | ATOM | XaAeventChange | CG |
---------------------------------------------------------
| objects | TAG[c] | (empty) | CSG |
---------------------------------------------------------
| changes | ATOM[c] | (empty) | CSG |
---------------------------------------------------------
| conditions | TAG[c] | (empty) | CSG |
---------------------------------------------------------
| filters | TAG[c] | (empty) | CSG |
---------------------------------------------------------
| retAttributes | ATOM[c] | (empty) | CSG |
---------------------------------------------------------
| active | BOOL | XaAtrue | CSG |
---------------------------------------------------------
The Monitor class also inherits the attributes of the core class.
- eventType
- The eventType attribute specifies what type of event is generated. The valid values are XaAeventCreate,
XaAeventDestroy, XaAeventChange, and XaAeventError. The use of these values is described later in this
section.
- objects
- The objects attribute lists what objects the monitor is attached to. The attached object also has a list of
attached monitors as an attribute of the core class. The connection must have read access on the listed objects.
- changes
- The changes attribute is a collection of atoms that represent attributes for which any change should be
reported, subject to the monitor's filters. This monitoring happens only when the eventType attribute is set to
XaAeventChange, and the state of the target object(s) changes. Atoms in the collection that represent
attributes that do not exist in the target object are ignored when the monitor is tested against the object. [need
value for "all attributes".]
- conditions
- The conditions attribute is a collection of condition objects, each of which gives a threshold for a single
attribute. When an attribute changes, all corresponding conditions are evaluated to see if they should cause an
event to happen (subject to the monitor's filters). See "Condition Class" on page 25 for more details.
- filters
- The filters attribute provides a way to suppress unwanted events. The attribute is a collection of condition
objects. All of the condition objects listed in the filters attribute must evaluate to true for the event to be sent.
The filters are evaluated only when the changes attribute or the conditions attribute finds a candidate for an
event.
- retAttributes
- The retAttributes attribute is a collection of names of attributes to return in the generated event. [need value
for "just the ones that changed".] [can this list contain names of attributes not in the target object(s)?]
- active
- The active attribute determines whether the monitor may generate events.
Monitors are most commonly used to generate change events. This is specified by creating the monitor with
eventType set to XaAeventChange. In this mode, an event is generated whenever an attribute in an object
changes, provided the object is listed in the objects attribute of the monitor, and the changed attribute(s) are listed
in the changes attribute of the monitor or listed in condition objects listed in the conditions attribute of the
monitor. If a single set request changes multiple monitored attributes of an object, then the monitor generates a single
event for all of the changes. See "Events and Errors" on page 68 for more details on events.
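The decision a change monitor makes for one set request can be sketched as follows. This is a simplified model, not the protocol's definition: the dictionary layout and function name are hypothetical, and each condition or filter is reduced to an attribute name plus a test callable standing in for a condition object.

```python
def change_event(monitor, object_tag, old, new):
    """Return the attributes to report in a single event, or None for no event.

    `old` and `new` are dicts holding the object's attributes before and
    after one set request. ASSUMPTION: conditions and filters are modeled as
    {"attribute": name, "test": callable(new_state) -> bool}.
    """
    if monitor["eventType"] != "XaAeventChange" or not monitor["active"]:
        return None
    if object_tag not in monitor["objects"]:
        return None
    changed = {name for name in old.keys() | new.keys()
               if old.get(name) != new.get(name)}
    # A candidate arises from the changes list or from a triggered condition.
    candidate = bool(changed & set(monitor["changes"])) or any(
        cond["attribute"] in changed and cond["test"](new)
        for cond in monitor["conditions"])
    if not candidate:
        return None
    # Filters run only for candidates, and all of them must pass.
    if not all(f["test"](new) for f in monitor["filters"]):
        return None
    return {name: new[name] for name in monitor["retAttributes"] if name in new}

monitor = {"eventType": "XaAeventChange", "active": True, "objects": ["port1"],
           "changes": ["samplesQueued"], "conditions": [], "filters": [],
           "retAttributes": ["samplesQueued"]}
event = change_event(monitor, "port1", {"samplesQueued": 5}, {"samplesQueued": 9})
```

Because the function is called once per set request, several changed attributes produce one event, in line with the paragraph above.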
Monitors can also be used to monitor for creation of objects. To do this, the monitor is created with the
eventType attribute set to XaAeventCreate. A monitor of this type can only be attached to class objects, and will
generate an event whenever an object of that class or a subclass of that class is created. The event will contain the
attributes of the created object that are listed in the retAttributes attribute of the monitor.
A monitor can generate events for the destruction of an object if it is created with the eventType attribute set to
XaAeventDestroy. The monitor can then be attached to an object to generate an event when it is destroyed. If the
monitor is attached to a class object, it will report destruction of objects of that class or subclasses of that class. In
either case, the event will contain the attributes of the destroyed object that were listed in the retAttributes
attribute of the monitor.
[What about XaAeventError???]
The condition class is used by the monitor class to conditionally test an object's attribute. The conditional test is
performed by comparing the attribute's value against a reference, and interpreting the results. The condition class
defines the following attributes:
TABLE 9. Condition Attributes
-------------------------------------------------------
| Name | Type | Default | Access |
=======================================================
| attribute | ATOM | XaAnone | CSG |
-------------------------------------------------------
| comparison | ATOM | XaApositive | CSG |
-------------------------------------------------------
| onTransition | BOOL | False | CSG |
-------------------------------------------------------
| reference | (dynamic) | 0 | CSG |
-------------------------------------------------------
| delta | (dynamic) | 0 | CSG |
-------------------------------------------------------
The condition class also inherits the attributes of the core class.
- attribute
- The attribute attribute is the atom that represents the attribute to perform the comparison on. The condition
evaluates to false if the attribute is not present on the monitored object.
- comparison
- The comparison attribute indicates how the results of the comparison are interpreted. The attribute contains
one of the following atoms: XaApositive, XaAnegative, XaAequal, XaAnotEqual. For non-numeric
attributes, XaApositive and XaAnegative always evaluate to false.
- onTransition
- If the onTransition attribute is true, then the condition evaluates to true only when the results of the
comparison change from false to true. Otherwise the condition evaluates to true whenever the comparison evaluates
to true.
- reference
- The reference attribute is used as described in the comparison attribute. [Need to define out of range results]
[Need to talk about 64-bits vs. 32 bit values].
- delta
- The delta attribute is used to modify the reference when an event is generated, so that the comparison is again
false. The delta is only applied if the attribute under test is numeric and the delta is non-zero. The delta value
is applied by adding to the reference the smallest multiple of the delta needed to make the comparison false.
If the comparison attribute is set to XaAnotEqual and the delta is non-zero, the reference is replaced with the
current value.
The state of the condition object is evaluated by performing a comparison to produce a {positive, equal, or
negative} result, and resolving this ternary result to a binary result using the test specified in the comparison
attribute. The comparison is performed by subtracting the reference from the value of the attribute under test. The
subtraction is performed using modulo 2^N arithmetic, where N is the number of significant bits for that attribute. If
the high order bit of the result is on, the comparison is negative. If the result is zero, the comparison is equal.
Otherwise the comparison is positive. This ternary result is matched against the test specified in the comparison
attribute.
The above result is returned to the monitor if the result is false, or the onTransition attribute is set to false.
Otherwise, a comparison is also performed using the old value of the attribute under test, and the condition returns true
to the monitor if this second comparison resolves to false.
A condition object can modify its reference value when an event is generated. This modification happens only if
this condition caused the event to be generated, or was used in filtering the event. The modification happens via
the delta attribute, which is described above. Applications should be careful about using a single self-modifying
condition object across monitors or monitored objects, since a single reference value is maintained for a given
condition object.
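The evaluation rules for a condition object can be sketched as below. The function names are hypothetical, attributes are assumed to be numeric with 32 significant bits by default, and the bounded search in apply_delta is an implementation convenience of this sketch rather than anything the protocol specifies.

```python
def compare(value, reference, bits=32):
    """Ternary comparison using subtraction modulo 2**bits."""
    diff = (value - reference) % (1 << bits)
    if diff == 0:
        return "equal"
    return "negative" if diff >> (bits - 1) else "positive"

def resolve(ternary, comparison):
    """Reduce the ternary result to a boolean using the comparison atom."""
    return {"XaApositive": ternary == "positive",
            "XaAnegative": ternary == "negative",
            "XaAequal": ternary == "equal",
            "XaAnotEqual": ternary != "equal"}[comparison]

def evaluate(new, old, reference, comparison, on_transition, bits=32):
    """Condition result for a change of the tested attribute from old to new."""
    now = resolve(compare(new, reference, bits), comparison)
    if not now or not on_transition:
        return now
    # With onTransition set, report true only on a false-to-true change.
    return not resolve(compare(old, reference, bits), comparison)

def apply_delta(value, reference, delta, comparison, bits=32):
    """Return the reference after an event, moved so the comparison is false."""
    modulus = 1 << bits
    if delta % modulus == 0:
        return reference                      # a zero delta is never applied
    if comparison == "XaAnotEqual":
        return value                          # reference becomes the current value
    for k in range(1, 1 << 16):               # ASSUMPTION: bounded search
        candidate = (reference + k * delta) % modulus
        if not resolve(compare(value, candidate, bits), comparison):
            return candidate
    raise ValueError("no multiple of delta in the search bound gives false")

next_reference = apply_delta(125, 100, 10, "XaApositive")
```

The modular subtraction means a value that has just wrapped past zero still compares as positive against a reference near the top of the range, which matters for timestamp attributes.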
The server class represents server-wide information. There is only one instance of the server class in a server.
The server object has the following attributes:
TABLE 10. Server Attributes
--------------------------------------------------------
| Name | Type | Default | Access |
========================================================
| connections | TAG[c] | (empty) | G |
--------------------------------------------------------
| buckets | TAG[c] | (empty) | SG |
--------------------------------------------------------
| inputBuffers | TAG[c] | (dynamic) | G |
--------------------------------------------------------
| outputBuffers | TAG[c] | (dynamic) | G |
--------------------------------------------------------
| formats | TAG[c] | (empty)??? | SG |
--------------------------------------------------------
| keys | TAG[c] | (empty) | G |
--------------------------------------------------------
| trustedAccess | TAG[c] | (dynamic) | S*G |
--------------------------------------------------------
| extensions | TAG[c] | (dynamic) | G |
--------------------------------------------------------
| resourceDatabase? | ??? | ??? | SG |
--------------------------------------------------------
The server class also inherits the attributes of the core class.
- connections
- The connections attribute contains the tags for all existing authenticated connections.
- buckets
- The buckets attribute contains all buckets that applications and the server have decided to share. The buckets can
be differentiated by name.
- inputBuffers, outputBuffers
- The inputBuffers attribute and outputBuffers attribute advertise which existing input and output buffers are
available to clients for input and output of audio data. Clients can use these lists to discover where to get or
put data. Presence in these lists does not imply access to the buffers listed. See the chapter "Security" on
page 53 for details on access. The inputBuffers and outputBuffers attributes always contain the existing input
and output devices. Clients may additionally add other buffers they wish to advertise.
- formats
- The formats attribute contains the predefined formats available for clients to use.
- keys
- The keys attribute contains the key objects that are used to authenticate clients.
- trustedAccess
- The trustedAccess attribute lists tags of access groups that grant trusted access. Trusted access allows access
to all resources in the server, including those of all clients. [This feature is intended primarily for audio
managers, and should be used sparingly.] A client must have trusted access to modify this attribute, regardless of
the normal access control for the server object. For more detail on access, see "Security" on page 53.
- extensions
- The extensions attribute contains the extensions (loaded and unloaded) currently available in the server.
[14:Need mechanism to cause this list to be reloaded. We could do it every time a get is done, but that might
be too heavyweight. Suggestions?]
- resourceDatabase
- [15:What or whether this is depends on how we deal with preferences]
The default values for attributes of the instance of the server class are [or rather should be] defined in "Predefined
Objects" on page 36.
The supported operations on the server class are XaSet and XaGet.
The connection class represents the connection-specific information. There is one instance of the connection
object per connection in the server. The connection object has the following attributes:
TABLE 11. Connection Attributes
-----------------------------------------------------
| Name | Type | Default | Access |
=====================================================
| retainedObjects | TAG[c] | (empty) | SG* |
-----------------------------------------------------
| closed | BOOL | False | G |
-----------------------------------------------------
| access | TAG[c] | (empty) | SG |
-----------------------------------------------------
| key | TAG | (dynamic) | G |
-----------------------------------------------------
The connection class also inherits the attributes of the core class.
- retainedObjects
- The retainedObjects attribute lists objects that should remain when the connection closes. See "Object
Lifetime" on page 8 for details concerning object lifetime. The retainedObjects attribute values are ignored for an
untrusted client - its objects are always destroyed on the close. [17:Is this the right thing to do?] See
"Security" on page 53 for further details on trusted vs. untrusted clients. [oops. no more trusted clients.]
- closed
- The closed attribute is set to true by the server when the connection to the client is severed, but the
destroyOnClose attribute was set to True.
- access
- The access attribute is a tag of an access object that determines which clients have access to all objects
created by this connection. See "Security" on page 53 for more information.
- key
- The key attribute of the connection contains the tag of the key used to authenticate the client at connect time.
When the server determines that the communications link to the client has been severed, the server checks the
close down mode of the connection. Unless the close down mode has been set to retain the resources, the server
destroys the connection object and cleans up any allocations made for this object. A client that severs its
connection while having requests buffered for processing or samples queued for output has no guarantee that either the
requests or samples will be serviced.
Any client with full access to a connection object can destroy it. This terminates any link back to the client
connected through that object, destroys the connection object, and cleans up any allocations made for this object.
The supported operations on connection objects are XaCreate, XaDestroy, XaSet, and XaGet.
Objects of the access class describe what access is granted to specific connections, or to all connections of
specific keys. The scope of access that an access object grants depends on the object to which it is attached. This
is discussed in "Security" on page 53. The access object has the following attributes:
TABLE 12. Access Attributes
------------------------------------------------
| Name | Type | Default | Access |
================================================
| operations | ATOM[c] | XaAview | CSG |
------------------------------------------------
| keys | TAG[c] | (empty) | CSG |
------------------------------------------------
| connections | TAG[c] | (empty) | CSG |
------------------------------------------------
The access object also inherits the attributes of the core class.
- operations
- The operations attribute describes what kind of operations are permitted. The defined values are: XaAcreate,
XaAfind, XaAget, XaAset, XaAread, XaAwrite, XaAdestroy and XaAany. These values represent the
corresponding requests, except for XaAany, which allows any operation. The XaAcreate value is only applied to
class objects, where it allows objects of that class to be created.
- keys
- The keys attribute contains a list of keys. Access is granted to connections that were accepted using one of the
listed keys.
- connections
- The connections attribute is a collection of tags of connections that are granted access by this access object.
Access objects specify what operations are granted to which connections. The objects to which the permission
applies depend on the objects that refer to the access object. The "Connection Class" on page 29, and "Buffer
Class" on page 11 make such references to access objects. Also see the Chapter "Security" on page 53.
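A check against a single access object might look like the following sketch. The function name and dictionary layout are hypothetical, and the model covers only the three attributes in the table above; trusted access and the rule that XaAcreate applies only to class objects are not modeled.

```python
def grants(access, operation, connection_tag, key_tag):
    """Return True if this access object permits `operation` for the connection.

    ASSUMPTION: a connection qualifies if it is listed directly in the
    connections attribute or was accepted using one of the listed keys.
    """
    operations = access["operations"]
    operation_ok = "XaAany" in operations or operation in operations
    connection_ok = (connection_tag in access["connections"]
                     or key_tag in access["keys"])
    return operation_ok and connection_ok

access = {"operations": ["XaAget", "XaAread"],
          "keys": ["key7"],
          "connections": ["conn3"]}
allowed = grants(access, "XaAread", "conn9", "key7")
```

Here conn9 is not listed by tag, but it was accepted with key7, so the read is permitted while a set request would not be.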
The key class is used to represent the keys used by the server to authenticate new clients. The key class has the
following attributes:
TABLE 13. Key Attributes
--------------------------------------------------
| Name | Type | Default | Access |
==================================================
| authName | TAG | none | MC |
--------------------------------------------------
| authDataIn | BYTE[1] | none | M*CG |
--------------------------------------------------
| authDataOut | BYTE[1] | (dynamic) | G |
--------------------------------------------------
| active | BOOL | True | CSG |
--------------------------------------------------
The key class also inherits the attributes of the core class.
- authName
- The authName attribute contains the name of the authentication scheme used for this key. The list of valid
authentication schemes is implementation dependent and is listed on the server object.
- authDataIn
- The authDataIn attribute contains data passed in by the client when creating this key. The meaning of the
data, and whether the creating data is required, depend on the authentication scheme.
- authDataOut
- The authDataOut attribute contains the authorization data needed for a client to connect using this key. The
meaning of this data is authorization scheme dependent.
- active
- The active attribute determines whether the key can be used to accept new connections.
Key objects are used to authenticate incoming clients. When the client is authenticated, the key attribute of the
corresponding connection is marked with the key. When a key object is created, it is added to the list of keys in
the server object. If a destroy request is made on a key object, then the active attribute of the key is set to false,
and the key object is not destroyed until all connections made with that key are destroyed.
The supported operations on key objects are XaCreate, XaDestroy, XaSet, and XaGet.
The string class contains a user-defined byte string.
TABLE 14. String Attributes
-----------------------------------------
| Name | Type | Default | Access |
=========================================
| data | BYTE[c] | empty | CSG |
-----------------------------------------
- data
- The data attribute is an array of user-defined bytes.
String objects are different from atoms in the following ways:
- They are full-fledged objects, and use the object interfaces.
- Their contents are not necessarily unique.
- They can be destroyed (atoms exist until the server resets).
- The string value can be changed.
String objects are used primarily for naming other objects.
Instances of the extension class are used to show what extensions are supported by a given server.
TABLE 15. Extension Attributes
-----------------------------------------------
| Name | Type | Default | Access |
===============================================
| versions | CARD32[c] | none | G |
-----------------------------------------------
| vendor | ATOM? | XaTnone | G |
-----------------------------------------------
| release? | ATOM | XaTnone | G |
-----------------------------------------------
The extension class also inherits the attributes of the core class.
- versions
- The versions attribute is a collection of 32-bit values expressing the supported revisions of the protocol specification
for that extension. The high 16 bits represent the major version, and the low 16 bits represent the highest
minor version supported for that major version.
- vendor
- The vendor attribute is the Atom of a vendor-supplied string.
- release
- The release attribute is the Atom of a vendor-supplied string.
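The layout of one entry in the versions collection (major version in the high 16 bits, highest supported minor version in the low 16 bits) can be sketched in C as follows. The function names are hypothetical and are not part of the protocol or of any client library.

```c
#include <assert.h>
#include <stdint.h>

/* Build one entry of the versions collection from its two halves. */
static uint32_t xa_version_pack(unsigned major, unsigned minor)
{
    assert(major <= 0xFFFFu && minor <= 0xFFFFu);  /* each half is 16 bits */
    return ((uint32_t)major << 16) | (uint32_t)minor;
}

/* High 16 bits: the major version. */
static unsigned xa_version_major(uint32_t version)
{
    return (unsigned)(version >> 16);
}

/* Low 16 bits: the highest minor version supported for that major version. */
static unsigned xa_version_minor(uint32_t version)
{
    return (unsigned)(version & 0xFFFFu);
}
```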
It is not a goal to provide fully self-descriptive extensions in this release of the core protocol.
Extensions may be loadable. An extension object exists regardless of whether the extension is loaded or not.
An extension object provides version and vendor information about an extension. Extensions are described in
"Extensions" on page 51.
This class provides a place for a client to place monitors for create and destroy events on all objects in a class.
This class also expresses the class hierarchy and a minimal description of the attributes of the class. This class is
a subclass of the core class and has the following attributes:
TABLE 16. Class Attributes
-------------------------------------------------
| Name | Type | Default | Access |
=================================================
| superClass | ATOM | (dynamic) | G |
-------------------------------------------------
| attributes | ATOM[1] | (dynamic) | G |
-------------------------------------------------
| type | ATOM[1] | (dynamic) | G |
-------------------------------------------------
| grouping | ATOM[1] | (dynamic) | G |
-------------------------------------------------
The class class also inherits the attributes of the core class.
- superClass
- The superClass attribute lists the atom of the parent class of the described class.
- attributes
- The attributes attribute is an array of the atoms used to name attributes in the described class.
- type
- The type attribute is an array of the types of each of the attributes of the described class. The types are those listed in
the specification, and are expressed as atoms.
- groupings
- The groupings attribute is an array of atoms that describe the grouping of values for each attribute. Valid
groupings are:
XaAScalar - the described attribute contains a single value.
XaACollection - the described attribute contains a collection of values.
XaAArray - the described attribute contains an array of values.
For each attribute in the attributes array, there are corresponding values in the type and groupings arrays.
The tag range class is provided so that the client library can allocate groups of tags within the server. These tags
are then passed by the client to the server in create requests to label the created objects. It is intended that this
class will be used primarily by the client library, rather than the application directly. The tag range class is a
subclass of the core class.
TABLE 17. TagRange Attributes
---------------------------------------
| Name | Type | Default | Access |
=======================================
| empty | BOOL | False | G |
---------------------------------------
The tagRange class also inherits the attributes of the core class.
- empty
- The empty attribute is set to true when the last in-use tag in this range is freed.
A tag range contains tags starting with the tag after the tag used to identify this tag range. The tag range size is
sent to the client in the ProtocolReply message.
If the create request passes in a tag of XaTnone (0), then the created tag range object will be identified by the first
tag in its range. This is the normal usage by libraries. TagRange objects are not intended for direct use by
applications, but rather by the underlying library implementation of XaCreate.
While there are many objects that a client can create directly, some are created by the server. This chapter
describes which instances are created by the server.
The server has exactly one instance of the server class, created by the server prior to allowing connections. By
default, all clients in the access group have full access to the server object, and all other clients have read
access to the server object.[XXX - Need better text] If a client destroys the server object, the server terminates all
connections and resets itself.
The tag of the server object is passed to the client at connection setup time.
The server creates a connection object whenever a new client is connected to the server. The tag of this object is passed back
to the client at connection setup time. This object can be destroyed (implying that the connection is shut down) by
either the server or clients. See "Connection Class" on page 29 for further details.
[Do we want this?????] The server also creates a connection object for itself, to track objects created for the
server. Destroying the server connection object causes a server reset.
The server typically has a device object for each set of speakers or microphone in the server. Clients cannot
create the devices defined in the standard classes. (Extensions may create new device types that may have different
behavior.)
The server will predefine one or more buckets. At a minimum, there will be a "system beep" bucket
predefined.[22:is this a requirement?] This is an implementation dependent sound. This sound may be replaced
[27:how?] by [trusted?]clients. Applications can find the available predefined buckets. [28:How? -probably as an
attribute on the server object, or as a named object?]
The server will have one predefined waveform, with a name of XaAsine. This waveform produces a sine wave.
The server will have a class object for each defined class.
The audio protocol defines atoms describing encodings. Not all servers will implement all encodings, although a
minimum required set is listed later in this chapter. The table below lists the defined encodings, and the
encoding algorithm or document that describes how the sample is encoded. The string for the atom representing the
encoding is made by prepending "encoding" to it.
TABLE 18.
------------------------------------------------------------------------------------------------
| encoding | Algorithm or Reference |
================================================================================================
| LinearSigned | linear, -(2 ^ (bitsPerSample - 1)) <= x < 2 ^ (bitsPerSample - 1) |
------------------------------------------------------------------------------------------------
| LinearUnsigned | linear, 0 <= x < 2 ^ (bitsPerSample) |
------------------------------------------------------------------------------------------------
| U_Law | CCITT Recommendation G.711 ("Blue book") |
------------------------------------------------------------------------------------------------
| A_Law | CCITT Recommendation G.711 ("Blue book") |
------------------------------------------------------------------------------------------------
| G_721 | CCITT Recommendation G.721 ("Blue book") |
------------------------------------------------------------------------------------------------
| G_722 | CCITT Recommendation G.722 ("Blue book") |
------------------------------------------------------------------------------------------------
| G_723 | CCITT Recommendation G.723 ("Blue book") |
------------------------------------------------------------------------------------------------
| G_728 | CCITT Recommendation G.728 ("Blue book") |
------------------------------------------------------------------------------------------------
| CELP_1016 | "Federal Standard 1016, Telecommunications: Analog to Digital Conversion |
| | of Radio Voice by 4,800 bit/second Code Excited Linear Prediction (CELP)" |
------------------------------------------------------------------------------------------------
| GSM_6_10 | ETSI GSM/DCS technical specification 06.10 |
------------------------------------------------------------------------------------------------
| IEEEFloat | ??? (need IEEE citation here) |
------------------------------------------------------------------------------------------------
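The value ranges given for the two linear encodings can be checked with a short C sketch. It assumes the signed range is -(2^(bitsPerSample - 1)) <= x < 2^(bitsPerSample - 1) and the unsigned range is 0 <= x < 2^bitsPerSample; the function names and the 32-bit upper bound on bitsPerSample are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LinearSigned: -(2^(bits-1)) <= x < 2^(bits-1) */
static bool linear_signed_in_range(int64_t x, unsigned bitsPerSample)
{
    assert(bitsPerSample >= 1 && bitsPerSample <= 32);
    int64_t limit = (int64_t)1 << (bitsPerSample - 1);
    return -limit <= x && x < limit;
}

/* LinearUnsigned: 0 <= x < 2^bits */
static bool linear_unsigned_in_range(int64_t x, unsigned bitsPerSample)
{
    assert(bitsPerSample >= 1 && bitsPerSample <= 32);
    int64_t limit = (int64_t)1 << bitsPerSample;
    return 0 <= x && x < limit;
}
```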
All audio servers will provide conversion to and from the following formats:
TABLE 19. Required Formats
------------------------------------------------------------------------------
| encoding | bigEndian | numChannels | bitsPerSample | sampleRate |
==============================================================================
| LinearUnsigned | * | * | {8,16} | * |
------------------------------------------------------------------------------
| LinearSigned | * | * | {8,16} | * |
------------------------------------------------------------------------------
| U_Law | * | * | 8 | * |
------------------------------------------------------------------------------
A server may also support other formats and encodings by extension. Some servers have limited computational
facilities, and may not support real-time playback of arbitrarily high sample rates, or large numbers of
simultaneous streams.
The defined encodings are believed to place the following restrictions on formats using those encodings:
TABLE 20. Format Restrictions Due to Encodings
-------------------------------------------------------------------------
| encoding | bigEndian | numChannels | bitsPerSample | sampleRate |
=========================================================================
| U_Law | | | 8 | |
-------------------------------------------------------------------------
| A_Law | | | 8 | |
-------------------------------------------------------------------------
| G_721 | True | 1 | ??? | |
-------------------------------------------------------------------------
| G_722 | True | 1 | ??? | |
-------------------------------------------------------------------------
| G_723 | True | 1 | ??? | |
-------------------------------------------------------------------------
| G_728 | True | 1 | ??? | |
-------------------------------------------------------------------------
| CELP_1016 | True | 1 | ??? | ??? |
-------------------------------------------------------------------------
| IEEEFloat | | | {32,64} | |
-------------------------------------------------------------------------
| GSM_6_10 | ??? | 1 | (unused) | ??? |
-------------------------------------------------------------------------
[Should we limit the G_7xx formats to one channel?]
[Should we limit the linear encodings to {8,16,32} bitsPerSample?]
For most fixed-rate bit encodings, the bit rate can be calculated by the following formula:
bitRate = sampleRate * numChannels * bitsPerSample
For the GSM_6_10 encoding, an 8000 Hz sample rate produces a bit rate of 13000 BPS.
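As a rough C sketch of the bit rate calculation, the function below assumes the conventional product of the sampleRate, numChannels, and bitsPerSample format attributes. That product is an assumption here, and it does not apply to encodings such as GSM_6_10, where bitsPerSample is unused. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits per second for a fixed-rate format, assuming
 * bitRate = sampleRate * numChannels * bitsPerSample. */
static uint64_t xa_fixed_bit_rate(uint32_t sampleRate,
                                  uint32_t numChannels,
                                  uint32_t bitsPerSample)
{
    assert(sampleRate > 0 && numChannels > 0 && bitsPerSample > 0);
    /* Widen before multiplying so large rates cannot overflow 32 bits. */
    return (uint64_t)sampleRate * numChannels * bitsPerSample;
}
```

Under this assumption, single-channel U_Law at 8000 Hz with 8 bits per sample comes to 64000 bits per second.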
The representation of time in the X Audio System is tightly bound to the representation of sound. This chapter
discusses the representation of time in the audio system, and how it is used to schedule the playback of sound.
Time and Sampled Audio Data
In the X Audio System, sound is represented as a sequence of digital samples known as a stream. The samples
occur at regularly spaced intervals, with the time interval between samples determined by the number of samples
per second (sample rate) of the stream. Each sample in the stream has a unique integer index, which increments
from sample to sample. These indices, called timestamps, are illustrated as subscripts in the figure below.
A stream may have more than one track of samples. For example, a stereo signal would have two tracks, where
concurrent samples in the tracks are marked with the same timestamp, as demonstrated by the figure below.

Different streams of audio data in the server may have different numbers of tracks, data sample rates, and
representations of the audio data. These characteristics are collectively known as a format and are described by the
format object. Conversion of the samples between formats happens when data is transferred into or out of the
port, and is not discussed further in this chapter.
Numerical Representation of Timestamps
Timestamps in the X Audio System are represented by signed 32-bit integers. The timestamps increase towards
the future, and the number space wraps from (2^31 - 1) to (-2^31), as is normal for signed 32-bit integers.
Timestamps that are within 2^31 samples of one another can be compared by performing a 32-bit subtraction of
timestamp A from timestamp B. If the result is positive, then B occurred after A. Comparisons should be performed in
the preceding manner to correctly handle wrapping cases.
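The wrap-safe comparison can be sketched in C as follows. The function names are hypothetical. The subtraction is done on unsigned values because signed overflow is undefined in C; converting the result back to int32_t assumes a two's complement platform, which is the common case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit subtraction of timestamp A from timestamp B, with wraparound. */
static int32_t xa_timestamp_diff(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)b - (uint32_t)a);
}

/* True if B occurred after A. Timestamps exactly 2^31 apart cannot be
 * ordered, so that case is rejected as a caller error. */
static bool xa_timestamp_after(int32_t a, int32_t b)
{
    int32_t d = xa_timestamp_diff(a, b);
    assert(d != INT32_MIN);
    return d > 0;
}
```

Comparing across the wrap point works as expected: with A = 2^31 - 1 and B = -2^31, the difference is 1, so B is after A.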
In some requests, time can also be specified relative to a reference time. A relative time is specified as an offset
from a reference time, where 0 indicates no offset, a positive offset indicates some time after the reference time,
and a negative value indicates a time previous to the reference time.
In the X Audio System, the sources and destinations of audio data may have timestamps in different coordinate
systems, and it is the responsibility of the port to resolve these differences. For example, a client may have a
stream sampled at 8 kHz with the first sample having a timestamp of 0, and desire to play it one second in the
future on an 8 kHz device that has a current timestamp of 37. The port would then have to add an offset of (8000 +
37) = 8037 to the timestamps of the incoming data stream. This is illustrated below.
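The offset arithmetic of this example can be sketched in C. The sketch assumes the source and the device run at the same sample rate, so that the mapping is a pure offset; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offset to add to source timestamps so that the sample stamped
 * sourceFirst plays delaySeconds after the device's current timestamp. */
static int32_t xa_schedule_offset(int32_t sourceFirst, int32_t deviceNow,
                                  uint32_t sampleRate, uint32_t delaySeconds)
{
    assert(sampleRate > 0);
    uint32_t delaySamples = sampleRate * delaySeconds;
    /* Unsigned arithmetic gives the 32-bit wraparound timestamps use. */
    return (int32_t)((uint32_t)deviceNow + delaySamples - (uint32_t)sourceFirst);
}

/* Apply the offset to one source timestamp. */
static int32_t xa_map_timestamp(int32_t sourceTimestamp, int32_t offset)
{
    return (int32_t)((uint32_t)sourceTimestamp + (uint32_t)offset);
}
```

For the figures in the text (first sample at 0, device at 37, 8 kHz, one second of delay) the offset is 8037, so source timestamp 0 maps to device timestamp 8037.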
In most implementations, the samples are actually moved from the source to the destination in groups, as shown
in the figure below.

Sometimes, however, the destination consumes (or source produces) data at a fixed rate - for example, the DAC
(Digital to Analog Converter) of an output device constantly consumes samples, which forces its timeline
forward, whether or not any data is available from the source. If the source of the audio data does not produce
continuous audio data, or the data arrives too late, there will be a gap in the source audio data, which is filled using
the underflowAction of the port. The impact on mapping subsequent data depends on the syncPolicy of the port.
When data arrives late, and the syncPolicy is set to XaAsyncLock, data that should already have been played is
discarded. This situation is shown below.

If the syncPolicy is set to XaAsyncWait, then samples that arrive late are played when they
arrive. This is pictured below.

Notice that this syncPolicy changes the mapping from input to output timestamps, causing all subsequent
samples to be played later than if the underflow had not occurred.
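The difference between the two policies for a late block can be sketched in C. This is a simplified model, not the protocol definition: the enum values merely stand in for XaAsyncLock and XaAsyncWait, the PortMapSketch structure and late_block function are hypothetical, and the mapping is assumed to be a pure offset between source and device timestamps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SYNC_LOCK, SYNC_WAIT } SyncPolicySketch;

typedef struct {
    int32_t offset;  /* added to source timestamps to get device timestamps */
} PortMapSketch;

/* A block of count samples, whose first source timestamp is first, arrives
 * when the device is at deviceNow. Returns how many samples get played. */
static uint32_t late_block(PortMapSketch *map, SyncPolicySketch policy,
                           int32_t first, uint32_t count, int32_t deviceNow)
{
    assert(map != NULL);
    uint32_t scheduled = (uint32_t)first + (uint32_t)map->offset;
    int32_t late = (int32_t)((uint32_t)deviceNow - scheduled);

    if (late <= 0)
        return count;            /* on time: nothing to resolve */
    if (policy == SYNC_WAIT) {
        /* Play everything on arrival; later samples shift by the same amount. */
        map->offset = (int32_t)((uint32_t)map->offset + (uint32_t)late);
        return count;
    }
    /* Lock: samples whose play time has already passed are discarded. */
    return (uint32_t)late >= count ? 0 : count - (uint32_t)late;
}
```

In this model a 50-sample block that is 20 samples late loses 20 samples under the lock policy, while under the wait policy all 50 play and the offset grows by 20.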
[This section is non-normative]
As noted earlier in this chapter, many implementations of the server actually move samples in clumps, rather
than one at a time, due to timesharing of compute facilities, and also for computational efficiency. This batching
of processing impacts applications because it (along with other factors) limits how little buffering can be used
and still achieve reliable results. The X Audio System provides means to monitor, and to some extent, control the
batching of samples.
To illustrate this process, and the interface the audio system provides, a simple output scenario will be discussed,
where a client is writing to a port that is attached to an output device. This is shown in the figure below.

Also, in this example, the write requests are supplying a continuous stream of ordered audio samples. The port's
run attribute is set to true.
For discussion purposes, the device in this example is a Hypothetical Output Device Implementation (hereafter
referred to as HODI). Actual implementations of devices may differ from the HODI, but the external interface
remains the same.
The HODI encases the actual audio hardware, which consists of a software buffer and a hardware buffer connected
to a DAC (Digital to Analog Converter), which is in turn hooked to a speaker or headphone. The hardware pulls
blocks of samples from the device object's software buffer and puts them into the hardware buffer. The hardware
also feeds samples one-by-one from the hardware buffer to the DAC. This flow of data is shown in the figure below,
which is expanded from the previous figure.

Not all device implementations will have identical internal parts. For example, some devices might not separate
the device buffer from the hardware buffer.
The above movement of the audio samples can also be shown in a cascade diagram, which allows us to label the
time relationships better. This diagram is shown below, with grey blocks representing blocks of audio samples,
and arrows indicating the movement of data. As time progresses, data is added to the left and removed from the
right, so the rightmost data is played first.

In the above diagram, dotted lines are used to indicate transfer points and are labeled tr1 to tr4. The first transfer
point, tr1, is where the latest sample of the last write request landed. The timestamp of this latest sample is
visible to the client as the latestTimestamp attribute of the port, and is marked in the diagram as LT.
The second transfer point, tr2, is where data moves from the port to the device, typically multiple samples at a
time. If samples are written to the port with a timestamp earlier than the timestamp of tr2, the samples are
forwarded on to the device to be integrated into the data stream, but no guarantee is made that the device can merge
these samples. The timestamp of tr2 is visible to the client as the earliestTimestamp attribute of the port.
The third transfer point, tr3, is where data moves from the device object's buffer to the hardware buffer, typically
multiple samples at a time. Again, late samples can be forwarded on some hardware, but there is no guarantee
about their delivery. The current timestamp of tr3 is not directly accessible by the client since this transfer point
is an implementation detail.
The fourth transfer point, tr4, is where the data moves from the hardware buffer into the DAC, and becomes an
analog signal to drive the speaker. The data is typically moved one sample at a time. The conversion is nearly
instantaneous and the timestamp of tr4 can be considered to be the current timestamp of the device. The port
makes the timestamp of tr4 available in port coordinates via the outputTimestamp attribute, and in device
coordinates via the outputBufferTimestamp attribute. This is updated at least whenever tr2 is updated, and when a get request is
done on this attribute.
Controlling Transfer from the Port
The transfer of samples from the port to the device deserves some attention, since the timing of this transfer
affects delivery of the sound, and in many implementations, when volume is applied to the sound from this port.
If the transfer occurs too early, it increases the chance that data may arrive at the port after its scheduled transfer
time. Also, if the transfer occurs early, it may reduce the interactivity of volume control adjustments or other
changes to port or device controls. On the other hand, if the transfer from the port to the device occurs too close
to the time it is to be played, it increases the chance that the samples may not be given to the hardware in time to
be played. Also, since the transfers will tend to be much smaller, there will be many more of them, which places a
burden on some machines. So limits need to be placed so that the transfers do not occur too early or too late.
This is controlled by the volatileSize attributes of the port and device.
Referring back to our diagram, the maximum delay from tr2 to tr4 is expressed by the volatileSize attribute of the
port object and is expressed in number of samples. The client can set this attribute. A small volatile size on the
port means data is moved to the device closer to presentation time. The minimum effective value for the
maximum delay is expressed by the volatileSize attribute of the device object. The client can specify smaller values
than this limit, and possibly have its samples dropped, but the implementation may choose to use any value
between that specified by the port, and the minimum advertised on the buffer. [Should the server be allowed to
violate the maximum, if needed, to provide reliable service?]
This section provides detailed descriptions of the behavior of data transfers and their effect on port attributes.
The formal definitions of these attributes are in "Port Class" on page 13 and "Buffer Class" on page 11. An
overview of how data transfer in ports works is available in "Transfer Timing Example" on page 42. In reading the
definitions below, it is helpful to picture a simple loopback example, where an input device is used as the input
buffer to a port, and the output buffer of the port is an output device. This is pictured below.
The movement of the audio samples in the figure above can be shown in the cascade diagram below, which
allows better labeling of time relationships. Grey blocks represent clusters of audio samples, and arrows indicate
the movement of data. Data moves down and to the right. The rightmost data is played first.

The port and input and output buffers all have the earliestTimestamp and latestTimestamp attributes, as defined
by the buffer class. The values of these attributes are indicated in the above diagram by the ET and LT labels.
These are the timestamps of the earliest and latest complete samples in possession of the corresponding object.
The remaining attributes from the diagram are described below.
- inputTimestamp, inputBufferTimestamp
- The inputTimestamp and inputBufferTimestamp attributes of the port represent the latestTimestamp attribute
of the input buffer. They are labelled as IT and IBT in the above diagram. The inputTimestamp attribute
presents the value in port time coordinates, while the inputBufferTimestamp attribute presents the value in the
input buffer's time coordinates. (Having both values available in the same object allows them to be fetched
by clients simultaneously to discover the current translation between coordinate systems.)
- outputTimestamp, outputBufferTimestamp
- (These attributes are conceptually identical to inputTimestamp and inputBufferTimestamp, except they deal
with output.) The outputTimestamp and outputBufferTimestamp attributes of the port represent the
earliestTimestamp attribute of the output buffer. They are labelled as OT and OBT in the above diagram. The
outputTimestamp attribute presents the value in port time coordinates, while outputBufferTimestamp presents the
value in the output buffer's time coordinates. (Having both values available in the same object allows them to
be fetched by clients simultaneously to discover the current translation between coordinate systems.)
- inputVolatileSize, outputVolatileSize
The inputVolatileSize attribute specifies the maximum delay, as specified by the client, between data arriving in
the input buffer and its transferral to the port's buffer. The delay is specified in samples in port time coordinates.
The outputVolatileSize is similar, except it controls how far ahead of the port's earliest time data may be
moved to the output buffer. These are illustrated on the diagram below, which is based on the previous diagram.
In the above diagram, the input volatile zone has been labeled IVZ, and the transfer of data (labeled LT) from the
input buffer to the port will occur within this zone. Similarly, the output volatile zone has been labeled OVZ, and
the transfer from the port to the output buffer will occur within this zone.
Specifying too large of an input volatile size or too small of an output volatile size may cause data to be dropped.
[Should we drop this, and have the server always try to "do the right thing"?] Also, a large value can cause poor
interactivity, and a small value can produce a high processing load in some implementations. The volatileSize
attributes of the input and output buffers give minimum values for reliable operation. The client can specify
smaller values than these limits, but the implementation may choose to use any value between that specified by
the port, and the minimum advertised on the buffer.
- These two attributes, along with the bufferSize attribute on the port, control the maximum latency of data in
the flow.
The port attempts to schedule transfers of audio samples from the input buffer to the port and from the port to the
output buffer such that the input buffer and output buffer do not overflow or underflow their constraints - in
other words, the transfer of data is constraint based. The system provides the following constraints on buffer
sizes:
inputBuffer.latestTime <= inputBuffer.earliestTime + inputBuffer.bufferSize
port.latestTime <= port.earliestTime + port.bufferSize
outputBuffer.latestTime <= outputBuffer.earliestTime + outputBuffer.bufferSize
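Each of the three buffer size constraints has the same shape, latestTime <= earliestTime + bufferSize, and can be checked with wraparound arithmetic as in the C sketch below. The TimedBufferSketch structure and the function name are hypothetical; the fields simply mirror the names used in the constraints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the three values a size constraint relates. */
typedef struct {
    int32_t  earliestTime;
    int32_t  latestTime;
    uint32_t bufferSize;   /* in samples */
} TimedBufferSketch;

/* Checks latestTime <= earliestTime + bufferSize without overflowing:
 * the span latestTime - earliestTime is taken as a wrapped 32-bit
 * difference and compared against bufferSize. */
static bool buffer_within_size(const TimedBufferSketch *buf)
{
    assert(buf != NULL);
    int32_t span = (int32_t)((uint32_t)buf->latestTime -
                             (uint32_t)buf->earliestTime);
    return (int64_t)span <= (int64_t)buf->bufferSize;
}
```

The same check would apply to the input buffer, the port, and the output buffer, and it keeps working when the timestamps straddle the 32-bit wrap point.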
The port is responsible for keeping these relationships, where the map() operation converts timestamps from one
timeline to another:
port.latestTime < map(inputBuffer.earliestTime)
port.latestTime < port.inputTime + port.inputVolatileSize
port.earliestTime + port.bufferSize < port.latestTime
port.earliestTime > map(outputBuffer.latestTime)
It is not always possible to meet the constraints, because of slight differences in clock rate, or data arriving late.
The port object deals with these situations, based on the port's syncPolicy, overflowAction, and
underflowAction. Server implementations are not bound to a specific scheduling strategy, as long as they meet the
constraints. In general though, servers end up needing to transfer data between the objects.
There are two primary reasons for a transfer to happen:
- The output buffer requires more samples - causes transfer from port to output buffer.
- The input buffer must hand off samples - causes transfer from input buffer to port.
These transfers have minimum requirements to avoid overflow and underflow. To meet these requirements, the
server may have to adjust the amount of data in the port. This causes two secondary reasons for transfers:
- Port requires samples - causes transfer from input buffer to port.
- Port must hand off samples to output buffer - causes transfer from port to output buffer.
(These secondary reasons can also be caused by read or write requests.)
Sometimes the buffer cannot supply or receive the minimum amount of data to be transferred. In this case the
underflow or overflow action of the port will be applied to either add samples to or drop samples from the stream. This may, in
turn, affect the time mapping between the port and the input and/or output buffers, depending on the
syncPolicy.
The server may move additional samples during the transfers above, or schedule additional transfers; however,
these should be done so as not to cause additional dropping of data.
Port to Output Buffer Transfer
On a port to output buffer transfer, the port will use the samples stored in its buffer. If there will not be enough
samples after format conversion to meet the minimum transfer, then the port initiates a transfer from the input
buffer to the port with an appropriate minimum. If the output buffer does not have room for at least the minimum
transfer, then the port applies the overflow action. (Output policies are described later in this chapter.) The data is
converted to a format appropriate for the output buffer and transferred. The earliestTimestamp attribute of the
port is updated, and if appropriate[?] the latestTimestamp attribute of the outputBuffer is updated.
If there is no output buffer, the port applies the overflow action to meet the minimum transfer, and updates the
port's earliestTimestamp, outputTimestamp, and outputBufferTimestamp accordingly.
Input Buffer to Port Transfer
On an input buffer to port transfer, the samples come from the input buffer's storage. If the input buffer cannot
supply enough samples, after conversion, to meet the minimum transfer, the underflow action is applied.
(Underflow policies are described later in the chapter.) If the port does not have enough room for the samples, then the
port initiates a transfer from the port to the output buffer, with an appropriate minimum. The samples are
converted to a format appropriate for the port and transferred. The latestTimestamp attribute of the port is updated.
If there is no input buffer, the port applies the underflow action to meet the minimum transfer, and updates the
port's latestTimestamp, inputTimestamp, and inputBufferTimestamp accordingly.
Late Data
If samples arrive in a port or buffer after they should have been transferred out of the port or buffer, then they are
immediately forwarded on to the intended recipient(s). The quality of sample rate conversion may suffer as a
result, and for some buffers (such as devices), the recipient may be unable to use the data. (It should be noted
that if the syncPolicy is to wait, then the samples are never "late".)
Write and Read Requests
xxx [Need to talk about out-of-order requests]
The X Audio System provides a number of opportunities for extension:
- Additional subclasses.
These can be used to represent additional types of services or resources. For example, a voice synthesizer
might be represented by instances of a new class or classes.
- Additional attribute values.
For example, one could provide new underflow actions for ports.
- Additional requests or replies.
A server extension provides one or more of the above additions to the definition of the audio server. The core
protocol does not support client provided server extensions.
Each server extension provides an instance of the extension class. This object provides information about the
extension, but is not intended to fully describe the extension; the client is expected to know the details of the
extension based on the name and versions contained in the extension object. Applications can use the FindObject
request (matching on the extension class) to find which extensions the server supports. See "Extension Class" on
page 33 for more details on the extension class.
Each extension has a unique name, which is a STRING8, of the following form:
The organization field above is the organization name as registered in section 1 of the X Registry (the registry is
provided as a free service by the X Consortium.) This prevents conflicts among extensions.
As an example, if the X Consortium defined a synchronization extension, it might be called:
[By convention, the consortium will use XC- for standardized extensions, and XC-d- for extensions under
development.]
Extensions have major and minor version numbers. A major version change is used to indicate an
incompatible change across releases. A minor version change means that the newer version is backwards compatible.
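A client could apply these compatibility rules to the versions attribute of an extension object roughly as in the C sketch below. It assumes each entry carries the major version in the high 16 bits and the highest supported minor version in the low 16 bits, as described for the extension class. The function name is hypothetical and is not part of any client library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the server's versions collection covers the version the client
 * was written against: same major version (a major change is incompatible)
 * and a minor version at least as new (minor changes are backwards
 * compatible). */
static bool xa_extension_supports(const uint32_t *versions, size_t count,
                                  unsigned wantMajor, unsigned wantMinor)
{
    assert(versions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        unsigned major = (unsigned)(versions[i] >> 16);
        unsigned minor = (unsigned)(versions[i] & 0xFFFFu);
        if (major == wantMajor && minor >= wantMinor)
            return true;
    }
    return false;
}
```

A server advertising versions 1.3 and 2.0 would satisfy a client built for 1.2 or 2.0, but not one built for 1.4 or 3.0.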
An extension may define new subclasses in the server. The name of an extension subclass must start with the
organization prefix to prevent naming conflicts. It is recommended that the class name also include the extension
and major version number, and an extension-specific string.
As an example, if the X Consortium defined version 1 of a synchronization extension with a remote clock class,
it might call it:
An extension may also define new requests or replies. These requests and replies are carried over a separate ICE
subprotocol on the same connection. The client must open the corresponding ICE subprotocol before using these
requests or replies.
The recommended convention for naming audio extension subprotocols is to prepend "XA_" to the extension
name.
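The naming convention amounts to a simple string prefix, sketched in C below. The function name and the sample extension name used in the usage note are hypothetical; the actual form of extension names is defined elsewhere in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes "XA_" followed by extensionName into out.
 * Returns 0 on success, or -1 if out is too small for the result. */
static int xa_subprotocol_name(char *out, size_t outSize,
                               const char *extensionName)
{
    static const char prefix[] = "XA_";
    const size_t prefixLen = sizeof prefix - 1;
    size_t nameLen;

    assert(out != NULL && extensionName != NULL);
    nameLen = strlen(extensionName);
    if (prefixLen + nameLen + 1 > outSize)
        return -1;                                  /* would not fit */
    memcpy(out, prefix, prefixLen);
    memcpy(out + prefixLen, extensionName, nameLen + 1);  /* copies the NUL */
    return 0;
}
```

For a made-up extension name "example-extension", the resulting ICE subprotocol name would be "XA_example-extension".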
An extension may also define and implement new values for attributes of existing classes, and implement
corresponding behavior for these values. If these new values are atoms, the atom names should follow the same
naming conventions used for extension classes.
An extension desiring to add attributes to a class must do so by creating a subclass with the additional attributes.
It can not add attributes to an existing class. The owner of an extension may add attributes to a class within the
extension by changing the extension's minor version. The additional attributes should have default values such
that clients of previous minor revisions work correctly.
There are many types of security threats, not all of which should be handled within the core audio protocol. The
following are the security requirements placed on the audio protocol:
- Restrict incoming connections to those with correct credentials.
- Selectively restrict access to individual devices (e.g., allow access to speaker but not microphone.) This should
be on both a per credential (or credential group) and per connection basis.
- Allow "untrusted" clients partial service. This partial service must be configurable. (e.g. sometimes it's OK
for an untrusted client to have the microphone.)
- Allow clients privacy of their audio data and server resources.
- Allow clients to selectively share their audio data and server resources. (read-only and read/write)
- Hide object tags of unshared resources.
- Allow same user/machine to have both trusted and untrusted clients.
- Allow for dynamic addition and removal of valid credentials.
- Allow for multiple credential groups (and creation and destruction).
- Allow for shutdown of specific connections or all connections of a credential group.
- Allow clients selective privacy (for set and get) of object contents.
- Allow clients selective sharing of audio data and object contents.
- Allow for a trusted audio manager.
The following goal is not supported by the core audio protocol, but may be a desirable extension:
- Allow for, but do not mandate, the use of an interactive security manager. (does this need to be defined for
first release?)
The following are not requirements of the core audio protocol:
- Protection from snooping of the protocol stream (including data) during transport. (Snooping = unauthorized
copying.) This should be handled by the transport via encryption or other means.
- Protection from alteration of the protocol stream (including data) during transport. This is the transport's
responsibility. This is sometimes called an "active attack".
- Prevention of denial of service attacks by accepted clients.
Security in the X Audio system uses two basic mechanisms. The first is to restrict access to the server by
connection-time credential authentication. The second mechanism is selective access, which defines what connections
have read or read/write access to which server objects. These mechanisms are described in the following
sections.
When a client attempts to connect to the server, authentication is performed, using the authentication portion of
the ICE protocol. Like X11, the authentication schemes in X Audio are not specified by the Audio protocol, and
are implementation dependent.
The server defines a "Key Class" on page 31 of this document. Each key contains authentication data and the
name of the authentication scheme used for that data. These keys are used to validate authentication data
presented by clients at connection time. The key used to admit a given client is noted in a read-only field in the
connection object created for the client. Keys can also be used in access objects to identify applications that
connected using that key.
Only clients in the access groups listed in the class instance describing the key class may create keys. See the
"Key Class" on page 31 for full information.
A client can always access objects that it created. For a given client to examine, modify, or refer to any other
object in the server, it must have been granted access to it. There are three major questions to answer:
- What operations are permitted?
- What connections are permitted to do these operations?
- Which objects can these operations be applied to?
The protocol uses the notion of access groups to answer the first two questions, and the third question is
answered by identifying what access groups apply to a given object. Access groups are briefly described below.
Access Groups
Access groups, which are objects created from the access class, identify a group of connections that are being
granted access, and what operations they are being permitted to perform. The access group identifies connections
by listing the tags of specific connection objects, or listing the keys the connections used to connect to the server.
The access object also lists atoms representing the permitted operations. See "Access Class" on
page 30 for more information.
Granting access
Once an access group has been defined, it can be used for granting access by adding it to the proper access
attribute. There are three categories of access:
- Server global access - Access to all objects in the server, including those of other connections.
Controlled by the trustedAccess attribute of the server object.
- Per object access - Access to a specific object.
Controlled by the access attribute of the specific object.
- Per connection access - Access to all objects belonging to a specific connection.
Controlled by the access attribute of the connection object.
In each of the above cases, more than one access group can be used in an access list. The effective access for a
given request is the union of the accesses granted by the above mechanisms.
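The union rule can be sketched as follows. This is an illustrative model only: the AccessGroup dataclass, the function names, and the operation names "read" and "write" are assumptions made for the example (the protocol represents operations as atoms and identifies connections and keys by tag).

```python
from dataclasses import dataclass, field


@dataclass
class AccessGroup:
    """Hypothetical in-memory model of an access group object."""
    connections: set = field(default_factory=set)  # tags of connection objects
    keys: set = field(default_factory=set)         # tags of key objects
    operations: set = field(default_factory=set)   # permitted operations


def granted(groups, conn_tag, key_tag):
    """Union of the operations granted to a connection by one access list."""
    ops = set()
    for group in groups:
        # A group applies if it lists the connection itself or the key it used.
        if conn_tag in group.connections or key_tag in group.keys:
            ops |= group.operations
    return ops


def effective_access(conn_tag, key_tag, server_trusted, object_access, owner_access):
    """Union of server global, per object, and per connection access."""
    return (granted(server_trusted, conn_tag, key_tag)
            | granted(object_access, conn_tag, key_tag)
            | granted(owner_access, conn_tag, key_tag))
```

Here server_trusted stands for the trustedAccess attribute of the server object, object_access for the access attribute of the target object, and owner_access for the access attribute of the connection that owns the object.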
Server global access should be used very sparingly - only for connections, such as an audio manager, that must
have access to every client's resources.
[Right now, we lack the capability to start up an application such that its resources are automatically shared,
without specific action being taken by the application. This makes life hard for a debugger, unless the debugger
is a trusted client. On the other hand, it guarantees a higher level of privacy to incoming clients. Any opinions?]
Default Access Groups and keys
The server starts up with at least one access group and one key, which [somehow] grants access to the server
object. These are [or rather should be] defined in "Predefined Objects" on page 36.
[31:Need major work here...]
This chapter describes connection setup and teardown, as well as protocol components that are used in multiple
places in the protocol.
The X Audio System protocol uses the Inter-Client Exchange Protocol (ICE) version 1.0 for the connection
authorization framework, version negotiation, and protocol framework. Please refer to the ICE protocol
specification for more information on ICE.
ProtocolSetup
For the protocol setup phase of ICE, the client library initiates the ICE ProtocolSetup message. The ICE
ProtocolSetup message has three fields that are specific to each sub-protocol. The audio protocol uses these fields in the
following manner:
- protocol-name:
"X Consortium Audio"
- versions:
the current major and minor versions are 1 and 0, respectively.
- vendor, release:
These fields have no defined semantics in the core protocol and are ignored. However, the library may define
semantics for them.
The server responds to the ProtocolSetup message with the ICE ProtocolReply message. The audio protocol
attaches no semantics to the vendor and release fields of the ProtocolReply message. However, the server
implementation may identify itself via these fields in a server dependent fashion.
The following setup information is sent:
- Tag of server object.
- Tag of connection object for this client.
- Tag of first tag range for this connection.
- The number of tags in tag range objects for this connection. (Note that the first tag of the range is preallocated
as the tag of the range.)
Tags are CARD32 values that uniquely identify objects and atoms in the server. Tags may be shared
across connections, and the tag is valid for the life of the object or atom.
Objects are assigned a tag at creation time. Clients use this tag to identify the object in subsequent requests. A
client request to create an object contains the tag to assign to the created object. The client must supply a tag not
currently in use, but in a tag range allocated by this client from the server. See the description of the "TagRange
Class" on page 35 on how to allocate and free tag ranges. Many libraries will choose to perform tag assignment
automatically.
An atom assigns a tag to a unique string. This assignment stays for the life of the server, and neither the tag nor
the string of the atom will change. Atoms are used for several purposes:
- As constants used for parsing requests in the protocol stream.
- To represent attribute names in the protocol stream.
- To represent constants used as values for some attributes.
A client obtains atoms from the server using the find request. The find request may cause a new atom to be
created. See the "Find Atom Request" on page 62 for more details.
(Design note: the choice to use tags to identify atoms allows for an attribute of an object to accept both atoms and
objects as valid values, and be able to distinguish atoms from objects.)
Partitioning of Tag Values
The possible tag values are partitioned as follows:
TABLE 21. Tag Space Partitioning
-------------------------------------------------------------
| From | To | Description |
=============================================================
| 0x00000000 | 0x000003FF | Protocol parse constants |
-------------------------------------------------------------
| 0x00000400 | 0x7FFFFFFF | Allocatable Tags |
-------------------------------------------------------------
| 0x80000000 | 0xFFFFFFFF | Reserved for Client Library |
-------------------------------------------------------------
The first partition of tags is used for the protocol for parsing constants. See "Parsing Parameter Lists" on page 59
for defined constants. The value 0 is reserved for the atom XaTnone.
The second partition of tags listed is the range from which all server tag range allocations will come,
regardless of whether the object in the server was created on behalf of the client or server. Ranges of tags within
this partition are allocated using Tag Range objects, which are described in "TagRange Class" on page 35.
The third partition of tags is reserved for use in the client library. It is anticipated that a client library may want to
employ tags to represent client side abstractions. This third partition allows the client library to easily distinguish
between client and server side tags. Examples of client library uses for tags include:
(Non-normative comment: Libraries may find the range 0x80000000 to 0x800003FF convenient to use for
parse constants in arg lists from applications, due to bit similarities with the range that is used for parsing the
protocol.)
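As an illustration of the partitioning, a library might classify a tag as sketched below. The function and constant names are hypothetical, and the sketch assumes that the boundary value 0x000003FF belongs to the parse constant partition.

```python
PARSE_CONSTANT_MAX = 0x000003FF  # last tag reserved for protocol parse constants
ALLOCATABLE_MAX = 0x7FFFFFFF     # last tag a server tag range may contain
TAG_MAX = 0xFFFFFFFF             # tags are CARD32 values


def classify_tag(tag: int) -> str:
    """Return the name of the partition that a tag value falls into."""
    if not 0 <= tag <= TAG_MAX:
        raise ValueError("tag must fit in a CARD32")
    if tag <= PARSE_CONSTANT_MAX:
        return "parse-constant"
    if tag <= ALLOCATABLE_MAX:
        return "allocatable"
    # The high bit is set: reserved for client library abstractions.
    return "client-library"
```

Because the third partition is exactly the set of tags with the high bit set, a library can also make the client/server distinction with a single bit test.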
A parameter list consists of a sequence of items. Each item begins with an atom. If the atom is not in the range
reserved for parse constants, then the item is a generic item, which is a name/value pair of the following form:
The name is an atom for the string that represents that parameter or attribute of the object. The interpretation of
the value field depends on the type of the parameter or attribute reference in the name field. (These are defined in
the chapter "Object Definitions".) The encoding of the generic item is described in "Generic Item" on page 70.
If the Atom of the item is a parse constant, rather than the name of an attribute, then it affects how the subsequent
parameters are parsed. The protocol parse constants are defined in the following subsections.
The XaParray item is used to transmit an array of values. This item has the following form:
arrayAtom: ATOM
length: CARD32
name: ATOM
values: LISTofVALUES
The arrayAtom field contains XaParray. The length field is the number of items in the array. If the name field is
XaParray, then this is a nested array, and the length and name fields are repeated. The rest of the item is a list of
(N0 x N1 x ... Nn) values, where (N0 ... Nn) are the length fields of the XaParray item. The first occurring
dimension is for the outermost loop, so that a sequence of:
XaParray, 2, XaParray, 3, XaAmix, 10, 11, 12, 13, 14, 15
describes a matrix of 2 rows and 3 columns, and gives the elements the following values:
mix[0,0] = 10, mix[0,1] = 11, mix[0,2] = 12, mix[1,0] = 13, mix[1,1] = 14, mix[1,2] = 15
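The nesting rule can be sketched as a small decoder. This is an illustration only: it operates on an already decoded token list rather than on the wire encoding, uses the strings "XaParray" and "XaAmix" as stand-ins for the atoms' tags, and the function names are hypothetical.

```python
from math import prod

XA_PARRAY = "XaParray"  # stand-in for the tag of the XaParray atom


def reshape(values, dims):
    """Arrange a flat list into nested lists; the first dimension is outermost."""
    if len(dims) == 1:
        return list(values)
    if dims[0] == 0:
        return []
    stride = len(values) // dims[0]
    return [reshape(values[i * stride:(i + 1) * stride], dims[1:])
            for i in range(dims[0])]


def parse_array_item(tokens):
    """Decode one XaParray item; return (name, dims, nested values, tokens used)."""
    if not tokens or tokens[0] != XA_PARRAY:
        raise ValueError("item does not start with XaParray")
    pos = 0
    dims = []
    # Each XaParray atom is followed by a length; nesting repeats the pair.
    while tokens[pos] == XA_PARRAY:
        dims.append(tokens[pos + 1])
        pos += 2
    name = tokens[pos]
    pos += 1
    count = prod(dims)
    values = tokens[pos:pos + count]
    if len(values) != count:
        raise ValueError("array item is truncated")
    return name, dims, reshape(values, dims), pos + count
```

Applied to the sequence in the example above, this yields the dimensions [2, 3] and the rows [10, 11, 12] and [13, 14, 15].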
Some attributes do not have restrictions on the dimensionality or size of the array they contain. If an attribute
does have this flexibility, it is not an error for an application to supply an array of different size or dimension than
expected. In this case, the array is adjusted in size to fit the requirements. Each attribute may define its own
adjustment rules. However, unless specified in the object's definition, the rules are as follows:
Arrays too large in a given dimension are truncated to fit. Arrays that are too small are expanded with
corresponding elements of the identity matrix of that dimension. [need better rules].
The encoding of the array item is described in "XaParray" on page 71.
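One possible reading of the default adjustment rules, restricted to two-dimensional arrays, is sketched below. The rules themselves are flagged above as needing work, so this is an interpretation and not a definition; the function name is hypothetical.

```python
def adjust_matrix(src, rows, cols):
    """Fit a 2-D array to rows x cols by truncating or padding with identity."""
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            if i < len(src) and j < len(src[i]):
                # The element was supplied: keep it (extra elements are dropped).
                row.append(src[i][j])
            else:
                # The element is missing: use the identity matrix element.
                row.append(1 if i == j else 0)
        out.append(row)
    return out
```

Under this reading, a 1 x 1 array holding 5 that is expanded to 2 x 2 becomes the rows [5, 0] and [0, 1].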
The XaParrayPart tag is used to transmit a portion of an array. This item has the following form:
arrayAtom: ATOM
offset: CARD32
length: CARD32
name: ATOM
values: LISTofVALUES
The arrayAtom field contains XaParrayPart. The offset field contains the number of elements to skip. The
length field is the number of items in the array. If the name field is XaParrayPart, then this is a nested array, and
the offset, length and name fields are repeated. The rest of the item is a list of (N0 * N1 * ... Nn) values, where
(N0 ... Nn) are the length fields of the item. In a multidimensional array, each offset is reported in the same
dimension as the corresponding length field. In multidimensional arrays, the ordering rules defined in the
previous section are used. Padding and truncation rules of the previous section are applied as well.
The encoding of the array part item is described in "XaParrayPart" on page 71.
The XaPtype atom is used to specify the source type of the item that follows it. This item has the following form:
typeAtom: ATOM
sourceType: ATOM
The type must be one that the server recognizes and can convert into the normal type for that argument. It is a
protocol error if another item does not follow this item. [36:Need list of valid conversions!]
The encoding of the XaPtype item is described in "XaPtype" on page 71.
A collection is an unordered list of values. Each value occurs at most once. There are three defined items that
operate on sets, to replace, add or subtract. All have the same form:
collectionAtom: ATOM
length: CARD32
name: ATOM
values: LISTofVALUES
The value of collectionAtom is one of {XaPcollectionReplace, XaPcollectionAdd, XaPcollectionSubtract}. The
number of entries in values list is described by the length field. The name field contains the atom of the attribute
to operate on. Multiple occurrences of a value in an item are not an error; the extra occurrences are ignored. For
the XaPcollectionAdd operation, if a value is added to a collection that already has that value, no change occurs.
If a value is subtracted from a collection that does not contain it, no error results.
The encoding of the collection items is in "XaPcollectionReplace" on page 71, "XaPcollectionAdd" on page 72,
and "XaPcollectionSubtract" on page 72.
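The semantics of the three collection items map directly onto set operations, as in the following sketch. The function name is hypothetical, and the atom names are represented as strings rather than tags.

```python
def apply_collection_item(current, op, values):
    """Return the new collection after applying one collection item."""
    incoming = set(values)  # repeated values within an item are ignored
    if op == "XaPcollectionReplace":
        return incoming
    if op == "XaPcollectionAdd":
        # Adding a value that is already present causes no change.
        return set(current) | incoming
    if op == "XaPcollectionSubtract":
        # Subtracting a value that is not present is not an error.
        return set(current) - incoming
    raise ValueError("not a collection atom: %r" % (op,))
```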
[37:do we need a list parsing as well?]
This request is used to get the atom that corresponds to the given string. The tag will be valid until the server
resets. This request has the following form:
reply_id: CARD32
create: BOOL
name: STRING8
=>
reply_id: CARD32
tag: LISTofTAG
[38:...]
The name field is a null terminated STRING8 and should use the ISO Latin-1 encoding (although this encoding
is not mandatory.) The server's atom database is searched for a matching atom. If the atom exists, the tag of the
atom is returned. If the atom does not exist and the create field is true, then an atom is created and the tag of that
atom is returned. Otherwise, the tag XaTnone is returned.
The returned atom is placed in the tag field of the reply. The contents of the reply_id field are not interpreted by
the server, but are placed into the reply_id field of the reply. (Some library implementations may find the reply_id
field useful to aid in processing replies.)
For the encoding of this request, see "XaFindAtom" on page 72. For the encoding of the reply, see "XaFind" on
page 76.
Errors:
Alloc.
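The lookup behavior described for this request can be modeled as sketched below. This is not a server implementation: the AtomTable class is hypothetical, and the choice of 0x400 as the first tag handed out is an assumption made only for the example.

```python
XA_T_NONE = 0  # the value 0 is reserved for the atom XaTnone


class AtomTable:
    """Hypothetical model of the server's atom database."""

    def __init__(self, first_tag=0x400):
        self._by_name = {}
        self._next_tag = first_tag  # assumed starting point, not from the protocol

    def find(self, name, create):
        """Return the tag for name, creating the atom only if create is true."""
        tag = self._by_name.get(name)
        if tag is not None:
            return tag
        if not create:
            return XA_T_NONE
        tag = self._next_tag
        self._next_tag += 1
        # The assignment stays fixed for the life of the server.
        self._by_name[name] = tag
        return tag
```

Repeating a find for an existing name returns the same tag whether or not the create field is set.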
This request is used to find objects of a particular class that match the given search values. This request has the
following form:
reply_id: CARD32
class: TAG
attributes: LISTofITEM
=>
reply_id: CARD32
object_tag: LISTofTAG
The request returns the tags of the objects of the class specified in the class field that also match the attribute
values listed in the attributes field. If no objects are matched, the XaTnone tag is returned.
The returned tags are placed in the object_tag field of the reply. The contents of the reply_id field are not
interpreted by the server, but are placed into the reply_id field of the reply. (Some library implementations may find
the reply_id field useful to aid in processing replies.)
For the encoding of this request, see "XaFindObject" on page 73. For the encoding of the reply, see "XaFind" on
page 76.
Errors:
Class, Name, Value.
The create request creates an object of the given class using the given tag as the identifier of the object. The form
of the request is as follows:
id: TAG
classId: ATOM
params: LISTofITEM
The id tag must be in a range allocated by this client from the server. The params list is a list of parameters.
These parameters can be used to set values for attributes of the object being created, and are expressed in the forms
defined by "Parsing Parameter Lists" on page 59. For the encoding of this request, see "XaCreate" on page 73.
Errors:
Alloc, Class, Name, Value.
The destroy request removes the client's reference to the object. When all references to the object are gone, the
object is destroyed. See "Object Lifetime" on page 8 for details.
This request has the following form:
For the encoding of this request, see "XaDestroy" on page 73.
Errors:
Object.
This request sets the attributes of the given object. It has the following form:
object: TAG
params: LISTofITEM
The params list is a list of parameters. These parameters can be used to set values for attributes of the object
being modified, and for some objects, a parameter can be included to define when the change should take place.
The forms used to express parameter lists are defined by "Parsing Parameter Lists" on page 59. The parameters
valid for the set vary by object class. See the object class definitions for valid parameters. For the encoding of
this request, see "XaSet" on page 74.
Errors:
Object, Name, Value.
This request is used to retrieve attributes from a given object. This request has the following form:
object: TAG
attributes: LISTofATOM
reply_id: CARD32
=>
reply_id: CARD32
params: LISTofITEM
The attributes field lists the attributes that are returned in the params field of the reply. The value supplied in the
reply_id field of the request is copied into the reply_id field of the reply and is not interpreted by the server.
(Some library implementations may find the reply_id field useful to aid in processing replies.)
For the encoding of this request, see "XaGet" on page 74. For the encoding of the reply, see "XaFindAtom" on
page 72.
Errors:
Object, Name.
The Write request is used to put data into an object. In the core protocol, the only objects that accept writes are
ports with an input buffer of XaTnone.
object: TAG
when: TIME
timeReference: TAG
lengthInBits: CARD32
leftPad: CARD32
data: LISTofBYTE
The object field describes which (port) object the write is using. The when field specifies the input time
coordinates of the first sample in the data field of the request. The timeReference field of the request specifies whether
the when field is an absolute time in the input timeline, or relative to the last sample of the last write request on
the object. [40:need detail here.]
The lengthInBits field specifies the length of the data being written, in bits. The leftPad field describes how
many of the supplied bits to ignore, and these bits are not included in the count in the lengthInBits field. For the
encoding of this request, see "XaWrite" on page 74.
Errors:
Object, Class, Alloc.
=>
reply_id: CARD32
num_bits: CARD32
left_pad: CARD32
data: LISTofCARD8
[32:Need description of this request]
For the encoding of this request, see "XaRead" on page 75. For the encoding of the reply, see "XaFindAtom" on
page 72.
Errors:
Object, Class, Value.
(A write may also cause an error event at some later time on the port object, if the data is illegal for that
encoding.)
This request sends a reply. This indicates to the client that all requests prior to this request have been parsed by
the server, and any errors from that parsing have been returned. The reply does not indicate that all data queued
by prior write requests has been played.
=>
Errors:
none.
Events are messages sent from the server to the client that reflect changes in the state of the server. These
changes include: creation of an object, destruction of an object, change in the state of an object, or an error
occurring in an object. Events are sent only to clients that express interest in them. This interest is expressed using
monitor objects, which are defined in "Monitor Class" on page 17. A monitor contains information that specifies
the conditions under which the monitor generates an event and what event is generated. An object may have
multiple monitors, and a monitor may be attached to multiple objects.
Create, destroy, and change events have the following fields in common:
object: TAG
monitor_count: CARD32
monitor: LISTofTAG
params: LISTofITEM
The monitor field contains the tag(s) of the monitor(s) that generated the event, and the object field contains the
tag of the object the monitor is attached to that caused the event. The params field contains the attributes of the
object, as specified by the retAttributes attribute of the monitor. The monitor_count field indicates the number of
monitors listed in the monitor field.
The create and destroy events define an additional field that contains the tag of the object being created or
destroyed:
If more than one monitor generates an event on the same object for the same reason at the same time, the server
may optionally combine these events and send them as a single event on the connection. Combining events
requires that the sent monitor field contain all relevant monitors, and that the params field be the union of what
the individual params fields would have contained.
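The combining rule can be sketched as follows. Events are modeled here as dictionaries with hypothetical keys that mirror the common event fields, and params is modeled as a mapping from attribute name to value; neither representation comes from the protocol.

```python
def combine_events(events):
    """Merge events generated on the same object into a single event."""
    if not events:
        raise ValueError("no events to combine")
    obj = events[0]["object"]
    if any(ev["object"] != obj for ev in events):
        raise ValueError("only events on the same object may be combined")
    monitors = []
    params = {}
    for ev in events:
        for tag in ev["monitor"]:
            # The combined monitor field lists every relevant monitor once.
            if tag not in monitors:
                monitors.append(tag)
        # The combined params field is the union of the individual ones.
        params.update(ev["params"])
    return {
        "object": obj,
        "monitor_count": len(monitors),
        "monitor": monitors,
        "params": params,
    }
```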
The library interface may optionally choose to use monitors to provide a higher level abstraction to applications,
and may discourage direct use of monitors by applications.
A change event notifies the client that the state of an object's attributes has changed. The event is generated if the
conditions specified by the monitor are met. If several attributes of an object are changed atomically by a single
request or internal server operation, at most one event is generated for any attached monitor. A change event is
not generated when the object is destroyed. The change event encoding is in "XaSet" on page 74.
A create event notifies the client that an object is created. This event is generated by a monitor attached to a class
object representing the class or a superclass of the created object. The object field of the event contains the tag of
the class object the monitor was attached to, and the params field contains attributes from the created object, as
specified by the retAttributes attribute of the monitor. The create event encoding is in "XaSet" on page 74.
A destroy event notifies the client that an object has been destroyed. This event is generated by a monitor
attached to the object being destroyed, or a class object representing the class or a superclass of the object. The
object field of the event contains the tag of the class object the monitor was attached to, and the params field
contains attributes from the destroyed object, as specified by the retAttributes attribute of the monitor. The encoding
of the destroy event is in "XaSet" on page 74.
[Some rough work done, more needed.]
This chapter will use the documentation conventions for encoding as established by the X Protocol Specification.
[Should use formal title and rev]
All requests and events will be padded to 8 bytes to conform to the Inter-Client Exchange protocol (ICE).
Variable length padding will be expressed as p=pad8(n).
Protocol senders must set all unused fields and padding in the protocol stream to zero, with the exception of
the data portions of write requests and read replies. The receiver is not required to check unused fields and
padding for compliance. [44: should we insist that the receiver not generate errors?]
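The pad8 notation is not defined further in this chapter. The sketch below assumes it is the 8-byte analogue of the pad(E) function of the X Protocol encoding conventions, that is, the number of zero bytes needed to bring n up to the next multiple of 8.

```python
def pad8(n: int) -> int:
    """Number of padding bytes needed to round n up to a multiple of 8."""
    return (8 - n % 8) % 8


def padded(data: bytes) -> bytes:
    """Append zero bytes so that the result is a multiple of 8 bytes long."""
    # Senders must set padding to zero.
    return data + b"\x00" * pad8(len(data))
```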
BYTE: 8-bit value
CARD32: 32-bit unsigned integer
INT32: 32-bit signed integer
ATOM: CARD32
TAG: CARD32
BUFFER: TAG
TIME: INT32
LPCE: A character from the X Portable Character Set in Latin Portable Character Encoding.
LISTofFOO: some number of repetitions of type FOO. Length of the list is expressed elsewhere.
Many requests use parse items to define the contents of the request. The defined parse types are listed below:
TABLE 22. Generic Item
---------------------------------------
| 4 | ATOM | | attribute name |
---------------------------------------
| 4 | CARD32 | | value |
---------------------------------------
The generic item is described in "Parsing Parameter Lists" on page 59.
TABLE 23. XaParray
---------------------------------------------------
| 4 | XaParray | | array atom |
---------------------------------------------------
| 4 | CARD32 | | n (length of array) |
---------------------------------------------------
| 4 | ATOM | | attribute name |
---------------------------------------------------
| 4n | CARD32 | | elements |
---------------------------------------------------
The XaParray item is described in "XaParray parsing" on page 31.
TABLE 24. XaParrayPart
------------------------------------------------------
| 4 | XaParrayPart | | arrayPart atom |
------------------------------------------------------
| 4 | CARD32 | | offset into array |
------------------------------------------------------
| 4 | CARD32 | | n (length of subarray) |
------------------------------------------------------
| 4 | ATOM | | attribute name |
------------------------------------------------------
| 4n | CARD32 | | elements |
------------------------------------------------------
The XaParrayPart item is described in "XaParrayPart" on page 60.
TABLE 25. XaPtype
-------------------------------------
| 4 | XaPtype | | type atom |
-------------------------------------
| 4 | ATOM | | source type |
-------------------------------------
An XaPtype item must be followed by another item.
The XaPtype item is described in "XaPtype" on page 61.
TABLE 26. XaPcollectionReplace
---------------------------------------------------------------
| 4 | XaPcollectionReplace | | collection replace atom |
---------------------------------------------------------------
| 4 | CARD32 | | n (number of items) |
---------------------------------------------------------------
| 4 | ATOM | | attribute name |
---------------------------------------------------------------
| 4n | CARD32 | | values in set |
---------------------------------------------------------------
TABLE 27. XaPcollectionAdd
-------------------------------------------------------
| 4 | XaPcollectionAdd | | collection add atom |
-------------------------------------------------------
| 4 | CARD32 | | n (number of items) |
-------------------------------------------------------
| 4 | ATOM | | attribute name |
-------------------------------------------------------
| 4n | CARD32 | | values in set |
-------------------------------------------------------
TABLE 28. XaPcollectionSubtract
-----------------------------------------------------------------
| 4 | XaPcollectionSubtract | | collection subtract atom |
-----------------------------------------------------------------
| 4 | CARD32 | | n (number of items) |
-----------------------------------------------------------------
| 4 | ATOM | | attribute name |
-----------------------------------------------------------------
| 4n | CARD32 | | values in set |
-----------------------------------------------------------------
The collection items are described in the section "XaPcollectionReplace, XaPcollectionAdd,
XaPcollectionSubtract" on page 61.
The audio protocol uses the ICE protocol framework (version 1.0) for connection setup and protocol setup. Refer
to the ICE protocol specification for encoding details. See "Connection Setup" on page 57 for the name of the
protocol.
The request encodings are defined in the following tables:
TABLE 29. XaFindAtom
---------------------------------------------------
| 1 | BYTE | | major opcode |
---------------------------------------------------
| 1 | 1 | | minor opcode (FindAtom) |
---------------------------------------------------
| 1 | BOOL | | create |
---------------------------------------------------
| 1 | 0 | | unused |
---------------------------------------------------
| 4 | (4+m+p)/8 | | length |
---------------------------------------------------
| 4 | CARD32 | | reply_id |
---------------------------------------------------
| m | STRING8 | | name |
---------------------------------------------------
| p | 0 | | unused p = pad8(4+m) |
---------------------------------------------------
The XaFindAtom request is defined in "Find Atom Request" on page 62.
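As an illustration of how the XaFindAtom table reads, the following sketch packs the fields in table order. Several points are assumptions and not taken from this document: the major opcode value (ICE assigns it during protocol setup), the byte order (big-endian is used here by default), and the inclusion of the terminating null byte in the name length m. The function names are hypothetical.

```python
import struct


def pad8(n: int) -> int:
    """Assumed meaning of pad8: bytes needed to reach a multiple of 8."""
    return (8 - n % 8) % 8


def encode_find_atom(major_opcode, reply_id, name, create, byte_order=">"):
    """Pack an XaFindAtom request following the field order of the table."""
    data = name.encode("latin-1") + b"\x00"  # null terminated STRING8 (assumed counted in m)
    m = len(data)
    p = pad8(4 + m)
    length = (4 + m + p) // 8
    header = struct.pack(
        byte_order + "BBBxII",
        major_opcode,  # major opcode
        1,             # minor opcode (FindAtom)
        int(create),   # create
        length,        # length (the pad byte 'x' before it is the unused field)
        reply_id,      # reply_id
    )
    return header + data + b"\x00" * p
```

With these assumptions, a request for the name "XaAmix" has m = 7 and p = 5, giving a length field of 2 and a total message size of 24 bytes.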
TABLE 30. XaFindObject
------------------------------------------------------
| 1 | BYTE | | major opcode |
------------------------------------------------------
| 1 | 8 | | minor opcode (FindObject) |
------------------------------------------------------
| 2 | 0 | | unused |
------------------------------------------------------
| 4 | (8+m+p)/8 | | length |
------------------------------------------------------
| 4 | CARD32 | | reply_id |
------------------------------------------------------
| 4 | TAG | | class |
------------------------------------------------------
| m | LISTofITEM | | name |
------------------------------------------------------
| p | 0 | | unused p = pad8(8+m) |
------------------------------------------------------
The XaFindObject request is defined in "Find Object Request" on page 62.
TABLE 31. XaCreate
--------------------------------------------------
| 1 | BYTE | | major opcode |
--------------------------------------------------
| 1 | 2 | | minor opcode (Create) |
--------------------------------------------------
| 2 | 0 | | unused |
--------------------------------------------------
| 4 | (8+m+p)/8 | | length |
--------------------------------------------------
| 4 | TAG | | id |
--------------------------------------------------
| 4 | ATOM | | classId |
--------------------------------------------------
| m | LISTofITEM | | items... |
--------------------------------------------------
| p | 0 | | unused p = pad8(8+m) |
--------------------------------------------------
The XaCreate request is defined in "Create Request" on page 63.
TABLE 32. XaDestroy
---------------------------------------------
| 1 | BYTE | | major opcode |
---------------------------------------------
| 1 | 3 | | minor opcode (Destroy) |
---------------------------------------------
| 2 | 0 | | unused |
---------------------------------------------
| 4 | 1 | | length |
---------------------------------------------
| 4 | TAG | | tag of object |
---------------------------------------------
| 4 | 0 | | unused |
---------------------------------------------
The XaDestroy request is defined in "Destroy Request" on page 64.
TABLE 33. XaSet
-------------------------------------------------
| 1 | BYTE | | major opcode |
-------------------------------------------------
| 1 | 4 | | minor opcode (Set) |
-------------------------------------------------
| 2 | 0 | | unused |
-------------------------------------------------
| 4 | (4+m+p)/8 | | length |
-------------------------------------------------
| 4 | TAG | | object |
-------------------------------------------------
| m | LISTofITEM | | items... |
-------------------------------------------------
| p | 0 | | unused p = pad8(4+m) |
-------------------------------------------------
The XaSet request is defined in "Set Request" on page 64.
TABLE 34. XaGet
-------------------------------------------------------
| 1 | BYTE | | major opcode |
-------------------------------------------------------
| 1 | 5 | | minor opcode (Get) |
-------------------------------------------------------
| 2 | 0 | | unused |
-------------------------------------------------------
| 4 | (12+4n+p)/8 | | length |
-------------------------------------------------------
| 4 | TAG | | object |
-------------------------------------------------------
| 4 | TAG | | reply_id |
-------------------------------------------------------
| 4 | CARD32 | | n (number of attributes) |
-------------------------------------------------------
| 4n | LISTofATOM | | return attributes |
-------------------------------------------------------
| p | 0 | | unused p = pad8(12+4n) |
-------------------------------------------------------
The XaGet request is defined in "Get Request" on page 64.
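The length and padding arithmetic of a variable-length request such as Table 34 can be sketched as follows. This is not a definitive implementation: pad_bytes is a hypothetical helper that assumes pad8(x) in "p = pad8(12+4n)" means the number of zero bytes needed to reach the next multiple of 8, and the byte order and major opcode are likewise assumptions supplied by the caller.

```python
import struct

XA_MINOR_GET = 5  # minor opcode from Table 34


def pad_bytes(size):
    """Zero bytes needed to round size up to a multiple of 8 (assumed pad8)."""
    return -size % 8


def encode_get(major_opcode, obj, reply_id, atoms, byte_order="<"):
    """Pack an XaGet request asking for the attributes named by atoms."""
    n = len(atoms)
    body_size = 12 + 4 * n            # object + reply_id + n, then n atoms
    p = pad_bytes(body_size)
    length = (body_size + p) // 8     # length field, in 8-byte units
    header = struct.pack(byte_order + "BBHI", major_opcode, XA_MINOR_GET, 0, length)
    body = struct.pack(byte_order + "III", obj, reply_id, n)
    body += struct.pack(byte_order + "%dI" % n, *atoms)
    return header + body + bytes(p)
```

With two atoms the body is 20 bytes, so 4 pad bytes are appended and the length field is 3.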
TABLE 35. XaWrite
--------------------------------------------------------------
| 1 | BYTE | | major opcode |
--------------------------------------------------------------
| 1 | 6 | | minor opcode (Write) |
--------------------------------------------------------------
| 2 | 0 | | unused |
--------------------------------------------------------------
| 4 | (20+m+p)/8 | | length |
--------------------------------------------------------------
| 4 | TAG | | object |
--------------------------------------------------------------
| 4 | TIME | | when |
--------------------------------------------------------------
| 4 | ATOM | | time reference |
--------------------------------------------------------------
| 4 | CARD32 | | n (length in bits); m = pad8(n)/8 |
--------------------------------------------------------------
| 4 | CARD32 | | leftPad |
--------------------------------------------------------------
| m | BYTE | | data |
--------------------------------------------------------------
| p | 0 | | unused p = pad8(20+m) |
--------------------------------------------------------------
[43:The encoding of leftPad needs to be thought about. It is really a CARD8, but for alignment and byte-swapping
reasons it is easier to make it a CARD32. This wastes space. We could stuff it in the unused field of the header,
although that removes the possibility of using that field globally in the future (and has a slight impact on
swapping?). We'll worry about this later.]
The XaWrite request is defined in "Write Request" on page 65.
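The XaWrite layout in Table 35 could be packed along the following lines. This sketch rests on several assumptions that the table does not settle: "m = pad8(n)/8" is read as the number of whole bytes needed to hold n bits, "p = pad8(20+m)" is read as the number of zero bytes needed to reach an 8-byte boundary, TIME is treated as an unsigned 32-bit value, and the byte order and major opcode are chosen by the caller. The name encode_write is hypothetical.

```python
import struct

XA_MINOR_WRITE = 6  # minor opcode from Table 35


def encode_write(major_opcode, obj, when, time_ref, data, num_bits,
                 left_pad=0, byte_order="<"):
    """Pack an XaWrite request carrying num_bits of sample data."""
    m = (num_bits + 7) // 8           # assumed reading of m = pad8(n)/8
    if len(data) != m:
        raise ValueError("data must hold exactly ceil(num_bits / 8) bytes")
    body_size = 20 + m                # five 4-byte fields, then the data
    p = -body_size % 8                # assumed reading of p = pad8(20+m)
    length = (body_size + p) // 8     # length field, in 8-byte units
    header = struct.pack(byte_order + "BBHI", major_opcode, XA_MINOR_WRITE, 0, length)
    fixed = struct.pack(byte_order + "IIIII", obj, when, time_ref, num_bits, left_pad)
    return header + fixed + data + bytes(p)
```

For 24 bits of data the body is 23 bytes, so one pad byte is appended and the length field is 3.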
TABLE 36. XaRead
--------------------------------------------
| 1 | BYTE | | major opcode |
--------------------------------------------
| 1 | 7 | | minor opcode (Read) |
--------------------------------------------
| 2 | 0 | | unused |
--------------------------------------------
| 4 | 3 | | length |
--------------------------------------------
| 4 | TAG | | object |
--------------------------------------------
| 4 | TIME | | when |
--------------------------------------------
| 4 | ATOM | | time reference |
--------------------------------------------
| 4 | CARD32 | | minimum bits |
--------------------------------------------
| 4 | CARD32 | | maximum bits |
--------------------------------------------
| 4 | TAG | | reply_id |
--------------------------------------------
The XaRead request is defined in "Read Request" on page 66.
TABLE 37. XaPing
------------------------------------------
| 1 | BYTE | | major opcode |
------------------------------------------
| 1 | 9 | | minor opcode (Ping) |
------------------------------------------
| 2 | 0 | | unused |
------------------------------------------
| 4 | 1 | | length |
------------------------------------------
| 4 | TAG | | reply_id |
------------------------------------------
| 4 | 0 | | unused |
------------------------------------------
The XaPing request is defined in "Ping Request" on page 66.
The reply encodings are defined in the following tables:
TABLE 38. XaFind
----------------------------------------------------------
| 1 | BYTE | | major opcode |
----------------------------------------------------------
| 1 | 36 | | minor opcode (FindAtomReply) |
----------------------------------------------------------
| 2 | 0 | | unused |
----------------------------------------------------------
| 4 | (4+4m+p)/8 | | length |
----------------------------------------------------------
| 4 | CARD32 | | reply_id |
----------------------------------------------------------
| 4m | TAG | | tag |
----------------------------------------------------------
| p | 0 | | unused p = pad8(4+4m) |
----------------------------------------------------------
The XaFind reply encoding is used by the "Find Atom Request" on page 62 and the "Find Object Request" on
page 62.
TABLE 39. XaAttribute
----------------------------------------------------
| 1 | BYTE | | major opcode |
----------------------------------------------------
| 1 | 37 | | minor opcode (GetReply) |
----------------------------------------------------
| 2 | 0 | | unused |
----------------------------------------------------
| 4 | (4+m+p)/8 | | length |
----------------------------------------------------
| 4 | CARD32 | | reply_id |
----------------------------------------------------
| m | LISTofITEM | | items... |
----------------------------------------------------
| p | 0 | | unused p = pad8(4+m) |
----------------------------------------------------
TABLE 40. XaRead
------------------------------------------------------------
| 1 | BYTE | | major opcode |
------------------------------------------------------------
| 1 | 38 | | minor opcode (ReadReply) |
------------------------------------------------------------
| 2 | 0 | | unused |
------------------------------------------------------------
| 4 | (12+m+p)/8 | | length |
------------------------------------------------------------
| 4 | CARD32 | | reply_id |
------------------------------------------------------------
| 4 | CARD32 | | num_bits |
------------------------------------------------------------
| 4 | CARD32 | | leftPad |
------------------------------------------------------------
| m | CARD8 | | data m = pad8(leftPad+num_bits) |
------------------------------------------------------------
| p | 0 | | unused p = pad8(12+m) |
------------------------------------------------------------
The XaAttribute reply is used for the "Get Request" on page 64.
The XaRead reply is defined in "Read Request" on page 66.
TABLE 41. XaPing
-----------------------------------------------
| 1 | BYTE | | major opcode |
-----------------------------------------------
| 1 | 39 | | minor opcode (PingReply) |
-----------------------------------------------
| 2 | 0 | | unused |
-----------------------------------------------
| 4 | 1 | | length |
-----------------------------------------------
| 4 | TAG | | reply_id |
-----------------------------------------------
| 4 | 0 | | unused |
-----------------------------------------------
The XaPing reply is defined in "Ping Request" on page 66.
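On the receiving side, a client might decode the fixed-size reply of Table 41 roughly as shown below, returning the reply_id so it can be matched against the outstanding XaPing request. The function name decode_ping_reply and the little-endian default are assumptions; the sketch does not check the major opcode because its value is not given in the tables.

```python
import struct

XA_MINOR_PING_REPLY = 39  # minor opcode from Table 41


def decode_ping_reply(message, byte_order="<"):
    """Return the reply_id carried by a 16-byte XaPing reply."""
    if len(message) != 16:
        raise ValueError("a ping reply is an 8-byte header plus an 8-byte body")
    # major, minor, 2 unused bytes, length, reply_id, 4 unused bytes
    _major, minor, _unused, length, reply_id, _pad = struct.unpack(
        byte_order + "BBHIII", message)
    if minor != XA_MINOR_PING_REPLY or length != 1:
        raise ValueError("not a ping reply")
    return reply_id
```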
The event encodings are defined in the following tables:
TABLE 42. XaChange
---------------------------------------------------------
| 1 | BYTE | | major opcode |
---------------------------------------------------------
| 1 | 33 | | minor opcode (ChangeEvent) |
---------------------------------------------------------
| 2 | 0 | | unused |
---------------------------------------------------------
| 4 | (8+m+4n+p)/8 | | length |
---------------------------------------------------------
| 4 | TAG | | object |
---------------------------------------------------------
| 4 | CARD32 | | monitor_count n |
---------------------------------------------------------
| 4n | TAG | | monitors |
---------------------------------------------------------
| m | LISTofITEM | | items... |
---------------------------------------------------------
| p | 0 | | unused p = pad8(8+m+4n) |
---------------------------------------------------------
The XaChange Event is defined in "Change Event" on page 69.
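A client-side reading of the Table 42 layout might look like the sketch below. It splits the event into the object tag, the list of monitor tags, and the remaining bytes. Because the LISTofITEM encoding is defined elsewhere, the remainder is returned undecoded and still includes any trailing pad bytes. The name decode_change_event, the little-endian default, and the reading of the length field as 8-byte units after the header are assumptions.

```python
import struct

XA_MINOR_CHANGE_EVENT = 33  # minor opcode from Table 42
HEADER_SIZE = 8             # major, minor, 2 unused bytes, 4-byte length


def decode_change_event(message, byte_order="<"):
    """Return (object, monitors, rest) for an XaChange event.

    rest holds the undecoded LISTofITEM bytes plus any trailing padding.
    """
    _major, minor, _unused, length = struct.unpack_from(
        byte_order + "BBHI", message, 0)
    if minor != XA_MINOR_CHANGE_EVENT:
        raise ValueError("not a ChangeEvent")
    body = message[HEADER_SIZE:HEADER_SIZE + 8 * length]
    if len(body) != 8 * length or len(body) < 8:
        raise ValueError("truncated event")
    obj, n = struct.unpack_from(byte_order + "II", body, 0)
    if 8 + 4 * n > len(body):
        raise ValueError("monitor_count exceeds event length")
    monitors = list(struct.unpack_from(byte_order + "%dI" % n, body, 8))
    return obj, monitors, body[8 + 4 * n:]
```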
TABLE 43. XaCreate, XaDestroy
-------------------------------------------------------------
| 1 | BYTE | | major opcode |
-------------------------------------------------------------
| 1 | 34, 35 | | minor opcode (Create, Destroy) |
-------------------------------------------------------------
| 2 | 0 | | unused |
-------------------------------------------------------------
| 4 | (12+m+4n+p)/8 | | length |
-------------------------------------------------------------
| 4 | TAG | | object |
-------------------------------------------------------------
| 4 | CARD32 | | monitor_count n |
-------------------------------------------------------------
| 4 | TAG | | target_object |
-------------------------------------------------------------
| 4n | TAG | | monitors |
-------------------------------------------------------------
| m | LISTofITEM | | items... |
-------------------------------------------------------------
| p | 0 | | unused p = pad8(12+m+4n) |
-------------------------------------------------------------
The create event is defined in "Create Event" on page 69 and the destroy event is defined in "Destroy Event" on
page 69.
This is a non-normative chapter provided to show anticipated directions for future development and
standardization. The futures listed are not a formal part of the standard, and no commitment is made toward these futures.
[should talk about true processing flows (that use multiports as wrappers, direct connect devices, MPEG, etc.)]
Glossary
The glossary is non-normative. This means that the glossary is not a formal part of the definition of the Audio
system, but is provided as a service to the reader. [suggestions for additions to the glossary?]
- Active Attack
- An active attack is where the enemy can insert messages and -- in some variants -- delete or modify legitimate
messages. (From Firewalls and Internet Security: Repelling the Wily Hacker, Cheswick & Bellovin, p. 213.)
- Attribute
- An attribute is a field of a class.
- Bucket
- An instance of the bucket class is used to cache audio samples in a server. Bucket is a subclass of buffer.
Data is added to or removed from buckets using ports.
- Buffer
- The buffer class represents objects that hold samples. It has a format and a time coordinate system. The buffer
class is an abstract class. This means that all instances of the buffer class are actually instances of a subclass
of buffer.
- Class
- A class defines the attributes (or fields) of instances of that class. The class also defines the meanings of those
attributes, and any behavior of objects of that class. In the core audio system, object classes are statically
defined. The client cannot define new classes. (Also see instance and object.)
- Client
- An application connects to the audio server via some communications path, such as a TCP connection or
shared memory. This makes the application a client of the server, and the server keeps track of the
communications link via an object of the "client" class. If the application opens multiple paths to the server, it is treated
as multiple clients. The lifetimes of objects created by a client relate to the client object, not directly to the
application lifetime.
- Device
- A device is a subclass of buffer that converts audio samples to or from sound.
- Error
- An error is an event that is sent when an invalid request has been sent. (This definition differs slightly from the
X definition.)
- Event
- A message sent from the server to the client. Events can be generated by monitors, as responses to requests,
or to indicate errors in a request.
- Format
- A server class that encapsulates the layout and meaning of samples in the data stream. For example, it contains
the sample rate, bits per sample, and number of tracks.
- Instance
- An instance of a class is an allocation of memory and server resources whose contents and behavior are
defined by the class.
- Object
- An object is an instance of a class.
- Padding
- Padding is extra bits or bytes in the protocol stream used to keep fields of the protocol properly aligned.
- Port
- Instances of the port class are used to move samples into or out of a buffer. This can be to or from the client,
or from another buffer. The port class is a subclass of the buffer class, and provides a view onto another
buffer. This allows audio samples to be moved from the source time system and format to the destination time
system and format.
- Reply
- A reply is an event that was generated as return information for a request.
- Request
- A request is a message sent from the client to the server.
- Server
- A server allows network-transparent sharing of audio services by applications. The server accepts new
communication paths from clients, processes requests sent by clients to create and manipulate server objects, and
sends events back to the client.
- Server (Class)
- The server class represents the overall state of the server.
- Subclass
- A subclass is a class that inherits attributes of another class.
- Timestamp
- [...]
- Monitor
- An object of the monitor class conditionally generates an event when the object it monitors changes.