XML-TreePuller

INSTALLATION

To install this module, run the following commands:

    perl Makefile.PL
    make
    make test
    make install

ABOUT

This module implements a tree oriented XML pull processor using a
combination of XML::LibXML::Reader and an object-oriented interface
around the output of XML::CompactTree. It provides a fast and
convenient way to access the content of extremely large XML documents
serially.

EXAMPLE

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use XML::TreePuller;

    sub gen_xml {
        return <<EOF
    <wiki version="0.3">
      <!-- schema says that there is always 1 siteinfo and zero or more
           page elements follow -->
      <siteinfo>
        <sitename>ExamplePedia</sitename>
        <url>http://example.pedia/</url>
        <namespaces>
          <namespace key="-1">Special</namespace>
          <namespace key="0" />
          <namespace key="1">Talk</namespace>
        </namespaces>
      </siteinfo>
      <page>
        <title>A good article</title>
        <text>Some good content</text>
      </page>
      <page>
        <title>A bad article</title>
        <text>Some bad content</text>
      </page>
    </wiki>
    EOF
    }

    sub element_example {
        my $xml = XML::TreePuller->new(string => gen_xml());

        print "Printing namespace names using configuration style:\n";

        $xml->config('/wiki/siteinfo/namespaces/namespace' => 'short');

        while(defined(my $element = $xml->next)) {
            print $element->attribute('key'), ": ", $element->text, "\n";
        }

        print "End of namespace names\n";
    }

    sub subtree_example {
        my $xml = XML::TreePuller->new(string => gen_xml());

        print "Printing titles using a subtree:\n";

        $xml->config('/wiki/page' => 'subtree');

        while(defined(my $element = $xml->next)) {
            print "Title: ", $element->get_elements('title')->text, "\n";
        }

        print "End of titles\n";
    }

    sub path_example {
        my $xml = XML::TreePuller->new(string => gen_xml());

        print "Printing path example:\n";

        $xml->config('/wiki/siteinfo', 'subtree');
        $xml->config('/wiki/page/title', 'short');

        while(my ($matched_path, $element) = $xml->next) {
            print "Path: $matched_path\n";
        }

        print "End path example\n";
    }

    element_example(); print "\n";
    subtree_example(); print "\n";
    path_example(); print "\n";

    __END__

    Printing namespace names using configuration style:
    -1: Special
    0: 
    1: Talk
    End of namespace names

    Printing titles using a subtree:
    Title: A good article
    Title: A bad article
    End of titles

    Printing path example:
    Path: /wiki/siteinfo
    Path: /wiki/page/title
    Path: /wiki/page/title
    End path example

SUPPORT AND DOCUMENTATION

After installing, you can find documentation for this module with the
perldoc command.

    perldoc XML::TreePuller

You can also look for information at:

    RT, CPAN's request tracker
        http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-TreePuller

    AnnoCPAN, Annotated CPAN documentation
        http://annocpan.org/dist/XML-TreePuller

    CPAN Ratings
        http://cpanratings.perl.org/d/XML-TreePuller

    Search CPAN
        http://search.cpan.org/dist/XML-TreePuller/

COPYRIGHT AND LICENCE

Copyright (C) 2010 "Tyler Riddle"

This program is free software; you can redistribute it and/or modify it
under the terms of either: the GNU General Public License as published
by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.