PXB: Perl XML data Binding
Maxim Grigoriev

=head1 CONTENTS

This document describes an API for binding a RelaxNG (Compact) schema, or an XML schema expressed in RelaxNG Compact notation, to a perl class library. The main idea behind the binding is to provide a flexible and scalable API that lets developers of document/literal web services describe any service messaging profile by writing simple perl data structures. You work with the data from any XML message through your own class structures: XML elements are bound to perl objects, hence the name Perl XML data Binding (PXB). The PXB framework handles all the details of converting data to and from XML or DOM objects based on your instructions. The DOM object tree is created with the very fast LibXML library. PXB was designed to give you a high degree of control over the translation process through custom callbacks and through SQL mapping of hierarchical XML entities onto a flat SQL database schema.

Just as importantly, PXB creates heavily documented, supportable and maintainable perl class hierarchies, and it automatically creates a test suite for every generated class. Perl::Tidy and Perl::Critic profile files are provided in the doc directory to assure compliance with Perl Best Practices. If you are familiar with Java XML data binding packages such as L</xmlbeans>, L</CASTOR>, L</METRO> or the older L</JAXB>, then the idea of unmarshalling XML into simple objects should already be known to you.
PXB uses the power of perl data structures to define the rules by which perl objects are converted to and from XML (the binding) or, more explicitly, to and from DOM tree objects. Why not use the DOM directly? The DOM is just a representation of the XML tree data structure. It provides no facility for easily hooking a custom callback, aligned with some external data schema, to a particular element in the tree. The current spectrum of XML APIs available for perl is limited to tree walking and XPath based search. Refactoring heavily XML driven perl code built on the XML/XPath approach was also found to be tedious and error prone, especially when schemas constantly adapt to ever changing customer requirements.

The PXB framework is an attempt to bring configurable callbacks into the DOM, to add some functional validation inside the binding, and to provide a mechanism for mapping an SQL database schema onto the hierarchical structure of the DOM tree. The importance of such a mapping technique was exposed by the lack of robust support for a native XML data type among the current “freeware” storage engines, where the SQL RDBMS is more or less the standard and the performance of native XML databases is still far from optimal.

PXB also uses perl as its meta-language: it uses perl to create a perl OO API, which makes the schema code significantly smaller. Developers can concentrate on implementing semantic rules, protocols and actual functionality instead of spending time reimplementing an XML tree walking API once again. Moreover, there is an automated test suite for each class of the generated API, and every method call is documented. The internal structure of every class was designed to be clean, readable and easily understandable by any other perl developer. It follows perl best practices [2] and uses the L<fields> package to provide tighter encapsulation and to create explicitly named lists of accessors and mutators.
The behavioral semantics of each class in the class tree are the same: every class implements exactly the same methods, which allows transparent propagation of the implemented callbacks across the API.
The basic XML element definition is represented as a perl hash reference:
Where each <xxxx-definition> can be expressed in EBNF [3] as
The elements definition covers several cases: a list of ‘parameter’ sub-elements, a single ‘parameter’ sub-element, ‘parameters’ sub-elements of different type, and ‘datum’ sub-elements of different type.
In an elements-definition the third member is an optional conditional statement that represents a validation rule. For example, the ‘unless:value’ conditional statement is translated into the perl conditional && !($self->get_value), where ‘value’ must be a registered attribute or sub-element name. This condition is placed in every piece of code where a perl object is serialized into an XML DOM object or unmarshalled from one.
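The translation of a condition such as ‘unless:value’ into a perl code fragment can be illustrated with a minimal sketch. The helper name condition_to_code and the layout of the sample entry are assumptions made for illustration; only the position of the condition (third member) and the resulting fragment come from the description above, and the real PXB generator may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PXB: translate a condition string
# of the form 'unless:<name>' into a Perl code fragment.
sub condition_to_code {
    my ($condition) = @_;
    return q{} unless defined $condition;    # the third member is optional
    my ($name) = $condition =~ /\Aunless:(\w+)\z/
        or die "Unsupported condition: $condition\n";
    return "&& !(\$self->get_$name)";
}

# Hypothetical entry: the first two members are placeholders; only the
# position of the condition (third member) is taken from the text.
my $entry = [ 'text', undef, 'unless:value' ];

print condition_to_code( $entry->[2] ), "\n";    # prints: && !($self->get_value)
```

A generator could append the returned fragment to the guard of every serialization and unmarshalling branch it emits for the element in question.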
In order to provide the same class interface throughout the API, the same universal constructor and the same set of method calls were implemented for every class. Currently the constructor body also includes an initialization part, which differs slightly between modules. In the future it might be reduced to a generic constructor, possibly inherited from some base Object class, plus a custom initialization part.
Currently every constructor “knows” how to initialize the whole class from a DOM object, an XML string fragment or a reference to a perl hash of the named class fields. It also “knows” how to handle arrays of class fields, with support for elements identified by id; a special field named idmap serves that purpose. It “knows” how to serialize an object of the class into a DOM or an XML text string and how to map the contents of the object onto an SQL schema. By issuing a registerNamespaces call on the root object one can obtain a reference to a hash with all namespaces used by every sub-element in the object. Another special field, called nsmap, is used for that. This field is an object of a generated helper class and serves as a container for the map between local names and namespace prefixes. Every namespace is identified by its namespace prefix, and the version of the schema is built into the generated API package pathname.
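As a rough illustration of the class layout described here, the following sketch builds a small class on top of the fields pragma, with a constructor that initializes itself from a reference to a hash of named fields. The package My::Parameter and its field list are hypothetical; the generated PXB classes are larger, also accept DOM objects and XML strings, and carry the idmap and nsmap fields, none of which is modelled here.

```perl
use strict;
use warnings;

# Hypothetical class, written by hand for illustration only.
package My::Parameter;
use fields qw(name value);

sub new {
    my ( $class, $param ) = @_;
    my $self = fields::new($class);    # restricted hash: declared fields only
    if ( ref $param eq 'HASH' ) {
        for my $field ( keys %{$param} ) {
            my $setter = "set_$field";
            die "Unknown field: $field\n" unless $self->can($setter);
            $self->$setter( $param->{$field} );
        }
    }
    return $self;
}

# Explicitly named accessors and mutators
sub get_name  { my ($self) = @_; return $self->{name} }
sub get_value { my ($self) = @_; return $self->{value} }
sub set_name  { my ( $self, $v ) = @_; $self->{name}  = $v; return $self }
sub set_value { my ( $self, $v ) = @_; $self->{value} = $v; return $self }

package main;

my $p = My::Parameter->new( { name => 'count', value => '100' } );
print $p->get_name, ' = ', $p->get_value, "\n";    # prints: count = 100
```

Routing the hash keys through the setters means that a misspelled field name fails in the constructor instead of silently creating a new hash key.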
Our case is based on building an interoperable SOAP document/literal web service for the perfSONAR-PS project.
The web services are wrappers around network performance monitoring tools. There are two types of services: the Measurement Archive (MA), which publishes historical data, and the Measurement Point (MP), which provides on-demand measurements.
The major utilization scenario of the PXB framework comes when there is a need to build an MA service with an SQL database based storage engine. In either case the PXB framework provides a complete solution. For an MP, the actual development consists of writing a real time measurement facility class and then using PXB to inject the measured data into the published XML message. For an MA, the whole development process is dedicated to writing the actual message handler and implementing the actual protocol specifications. For example, for the PingER MA it is the L<perfSONAR_PS::Datatypes::PingER> class, inherited from the L<perfSONAR_PS::Datatypes::Message> class, with the actual implementation of the SetupData request and MetaDataKey request handlers. L<perfSONAR_PS::Datatypes::Message> is an abstract class extending the L<perfSONAR_PS::Datatypes::v2_0::nmwg::Message> class. This extra class exists because the root Message class from PXB is message type agnostic.
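The three-level class hierarchy described above can be sketched with stand-in packages. Everything in this example is hypothetical: the My:: package names, the handle method, the handler method names and the dispatch on a type field are assumptions chosen to show the layering, not the actual perfSONAR-PS implementation.

```perl
use strict;
use warnings;

# Stand-in for the generated, message type agnostic root class.
package My::v2_0::nmwg::Message;
sub new {
    my ( $class, %args ) = @_;
    return bless { type => $args{type} }, $class;
}
sub get_type { my ($self) = @_; return $self->{type} }

# Abstract layer: knows about message types, leaves handlers to subclasses.
package My::Datatypes::Message;
our @ISA = ('My::v2_0::nmwg::Message');
sub handle {
    my ($self) = @_;
    my $handler = $self->can( 'handle_' . $self->get_type )
        or die 'No handler for message type ' . $self->get_type . "\n";
    return $self->$handler();
}

# Service specific class implementing the two request handlers.
package My::Datatypes::PingER;
our @ISA = ('My::Datatypes::Message');
sub handle_SetupDataRequest   { return 'SetupData response' }
sub handle_MetadataKeyRequest { return 'MetaDataKey response' }

package main;

my $msg = My::Datatypes::PingER->new( type => 'SetupDataRequest' );
print $msg->handle, "\n";    # prints: SetupData response
```

Keeping the type dispatch in the middle class leaves the generated root class untouched when the schema is regenerated.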
=head2 perfSONAR-PS data model for PingER service
The root of the perfSONAR-PS schema, and the root of the OO API built by PXB, is the Message object.
It exists in the perfSONAR base namespace, identified by the nmwg id. The schema is versioned.
The most current root package name for the Message class is L<perfSONAR_PS::Datatypes::v2_0::nmwg::Message>.
The base schema is completely defined by the L<perfSONAR_PS::DataModels::DataModel> module.
This is a simple perl package rather than a class, because it contains only data definitions.
The current DataModel implements definitions from several OGF schemas, namely filter.rnc, nmtopo.rnc, nmbase.rnc, nmtm.rnc, nmtl3.rnc and nmtl4.rnc.
No SQL mapping definitions are allowed in the base data model, because SQL mapping is service specific.
Any service specific extension of the base schema must be made in a separate data model package, as was done for the PingER service. The PingER data model can be viewed as an example; it is contained in L<perfSONAR_PS::DataModels::PingER_Model>. An extension data model package can be created for any other service in the same way.
Let’s look at the example of the parameter element. It is described in the base model as:
For a Parameter object with the name attribute set to ‘count’ and the ‘value’ attribute or text content set to ‘100’, it will add metaData => {'count' => {'eq' => '100'}} into the SQL query hash.
The resulting hash for the metaData table will be returned and will look as:
where it can be easily passed to any SQL ORM framework in order to build the proper WHERE clause, for example L<Class::DBI> or, with minor refactoring, L<Rose::DB>.
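To make the shape of the SQL query hash concrete, the sketch below turns the metaData entry from the example into a WHERE clause with bind values. It is a hand-rolled stand-in for what an ORM would do, not the API of Class::DBI or Rose::DB. The where_clause function and the operator table are assumptions; only the ‘eq’ operator appears in the text.

```perl
use strict;
use warnings;

# Assumed operator names; only 'eq' is taken from the text.
my %SQL_OP = ( eq => '=', ne => '<>', gt => '>', lt => '<', ge => '>=', le => '<=' );

# Hypothetical helper: build a WHERE clause and bind list from a hash
# of the form { column => { operator => value } }.
sub where_clause {
    my ($conditions) = @_;
    my ( @sql, @bind );
    for my $column ( sort keys %{$conditions} ) {
        for my $op ( sort keys %{ $conditions->{$column} } ) {
            my $sql_op = $SQL_OP{$op} or die "Unsupported operator: $op\n";
            push @sql,  "$column $sql_op ?";
            push @bind, $conditions->{$column}{$op};
        }
    }
    return ( join( ' AND ', @sql ), @bind );
}

# Query hash fragment as produced for the 'count' parameter.
my $query = { metaData => { count => { eq => '100' } } };

my ( $where, @bind ) = where_clause( $query->{metaData} );
print "WHERE $where\n";    # prints: WHERE count = ?
print "bind: @bind\n";     # prints: bind: 100
```

Using placeholders keeps the values out of the SQL text, which is also how an ORM would normally pass them to the database driver.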