NAME
scrape.pl - simple HTML scraping from the command line
ABSTRACT
This is a simple program to extract data from HTML by specifying CSS3 or
XPath selectors.
SYNOPSIS
scrape.pl URL selector selector ...
# Print page title
scrape.pl http://perl.org title
# The Perl Programming Language - www.perl.org
# Print links with titles, make links absolute
scrape.pl http://perl.org a //a/@href --uri=2
# Print all links to JPG images, make links absolute
scrape.pl http://perl.org a[@href=$"jpg"]
DESCRIPTION
This program fetches an HTML page and extracts nodes matched by XPath or
CSS selectors from it.
If URL is `-', input will be read from STDIN.
OPTIONS
--sep
Separator character to use for columns. Default is tab.
--uri COLUMNS
Numbers of columns to convert into absolute URIs, if the known
attributes do not everything you want.
--no-uri
Switches off the automatic translation to absolute URIs for known
attributes like `href' and `src'.
REPOSITORY
The public repository of this module is
http://github.com/Corion/App-scrape.
SUPPORT
The public support forum of this program is http://perlmonks.org/.
AUTHOR
Max Maischein `corion@cpan.org'
COPYRIGHT (c)
Copyright 2011-2011 by Max Maischein `corion@cpan.org'.
LICENSE
This module is released under the same terms as Perl itself.