NAME
AnyEvent::WebDriver - control browsers using the W3C WebDriver protocol
SYNOPSIS
# start geckodriver or any other w3c-compatible webdriver via the shell
$ geckodriver -b myfirefox/firefox --log trace --port 4444
# then use it
use AnyEvent::WebDriver;
# create a new webdriver object
my $wd = new AnyEvent::WebDriver;
# create a new session with default capabilities.
$wd->new_session ({});
$wd->navigate_to ("https://duckduckgo.com/html");
my $searchbox = $wd->find_element ("css selector" => 'input[type="text"]');
$wd->element_send_keys ($searchbox => "free software");
$wd->element_click ($wd->find_element ("css selector" => 'input[type="submit"]'));
sleep 10;
DESCRIPTION
This module aims to implement the W3C WebDriver specification, which is
the standardised equivalent to the Selenium WebDriver API, which in
turn aims at remotely controlling web browsers such as Firefox or
Chromium.
At the time of this writing, the specification was only available as a
draft document, so changes are to be expected. Also, only geckodriver
implemented it, or at least most of it.
To make the most of this module, or, in fact, to make any reasonable use
of it, you need to refer to the W3C WebDriver document, which can be
found here:
https://w3c.github.io/webdriver/
CREATING WEBDRIVER OBJECTS
new AnyEvent::WebDriver key => value...
Create a new WebDriver object. Example for a remote webdriver
connection (the only type supported at the moment):
my $wd = new AnyEvent::WebDriver endpoint => "http://localhost:4444";
Supported keys are:
endpoint => $string
For remote connections, the endpoint to connect to (defaults to
"http://localhost:4444").
proxy => $proxyspec
The proxy to use (same as the "proxy" argument used by
AnyEvent::HTTP). The default is "undef", which disables proxies.
To use the system-provided proxy (e.g. "http_proxy" environment
variable), specify a value of "default".
autodelete => $boolean
If true (the default), then automatically execute
"delete_session" when the WebDriver object is destroyed with an
active session. If set to a false value, then the session will
continue to exist.
SIMPLIFIED API
This section documents the simplified API, which is really just a very
thin wrapper around the WebDriver protocol commands. They all block
(using AnyEvent condvars) the caller until the result is available, so
must not be called from an event loop callback - see "EVENT BASED API"
for an alternative.
The method names are pretty much taken directly from the W3C WebDriver
specification, e.g. the request documented in the "Get All Cookies"
section is implemented via the "get_all_cookies" method.
The order is the same as in the WebDriver draft at the time of this
writing, and only minimal massaging is done to request parameters and
results.
SESSIONS
$wd->new_session ({ key => value... })
Try to connect to a webdriver and initialize a session with a "new
session" command, passing the given key-value pairs as value (e.g.
"capabilities").
Session-dependent methods must not be called before this function
returns successfully.
On success, "$wd->{sid}" is set to the session id, and
"$wd->{capabilities}" is set to the returned capabilities.
my $wd = new AnyEvent::WebDriver endpoint => "http://localhost:4545";
$wd->new_session ({
capabilities => {
pageLoadStrategy => "normal",
},
});
$wd->delete_session
Deletes the session - the WebDriver object must not be used after
this call.
$timeouts = $wd->get_timeouts
Get the current timeouts, e.g.:
my $timeouts = $wd->get_timeouts;
# { implicit => 0, pageLoad => 300000, script => 30000 }
$wd->set_timeouts ($timeouts)
Sets one or more timeouts, e.g.:
$wd->set_timeouts ({ script => 60000 });
NAVIGATION
$wd->navigate_to ($url)
Navigates to the specified URL.
$url = $wd->get_current_url
Queries the current page URL as set by "navigate_to".
$wd->back
The equivalent of pressing "back" in the browser.
$wd->forward
The equivalent of pressing "forward" in the browser.
$wd->refresh
The equivalent of pressing "refresh" in the browser.
$title = $wd->get_title
Returns the current document title.
COMMAND CONTEXTS
$handle = $wd->get_window_handle
Returns the current window handle.
$wd->close_window
Closes the current browsing context.
$wd->switch_to_window ($handle)
Changes the current browsing context to the given window.
$handles = $wd->get_window_handles
Return the current window handles as an array-ref of handle IDs.
$wd->switch_to_frame ($frame)
Switch to the given frame.
$wd->switch_to_parent_frame
Switch to the parent frame.
$rect = $wd->get_window_rect
Return the current window rect, e.g.:
$rect = $wd->get_window_rect;
# { height => 1040, width => 540, x => 0, y => 0 }
$wd->set_window_rect ($rect)
Sets the window rect.
$wd->maximize_window
$wd->minimize_window
$wd->fullscreen_window
Changes the window size by either maximising, minimising or making
it fullscreen. In my experience, this might time out if no window
manager is running.
ELEMENT RETRIEVAL
$element_id = $wd->find_element ($location_strategy, $selector)
Finds the first element specified by the given selector and returns
its web element id (the string, not the object from the protocol).
Raises an error when no element was found.
$element = $wd->find_element ("css selector" => "body a");
$element = $wd->find_element ("link text" => "Click Here For Porn");
$element = $wd->find_element ("partial link text" => "orn");
$element = $wd->find_element ("tag name" => "input");
$element = $wd->find_element ("xpath" => '//input[@type="text"]');
# "decddca8-5986-4e1d-8c93-efe952505a5f"
$element_ids = $wd->find_elements ($location_strategy, $selector)
As above, but returns an arrayref of all found element IDs.
$element_id = $wd->find_element_from_element ($element_id,
$location_strategy, $selector)
Like "find_element", but looks only inside the specified $element_id.
$element_ids = $wd->find_elements_from_element ($element_id,
$location_strategy, $selector)
Like "find_elements", but looks only inside the specified $element_id.
my $head = $wd->find_element ("tag name" => "head");
my $links = $wd->find_elements_from_element ($head, "tag name", "link");
$element_id = $wd->get_active_element
Returns the active element.
ELEMENT STATE
$bool = $wd->is_element_selected ($element_id)
Returns whether the given input or option element is selected or
not.
$string = $wd->get_element_attribute ($element_id, $name)
Returns the value of the given attribute.
$string = $wd->get_element_property ($element_id, $name)
Returns the value of the given property.
$string = $wd->get_element_css_value ($element_id, $name)
Returns the computed value of the given CSS property.
$string = $wd->get_element_text ($element_id)
Returns the (rendered) text content of the given element.
$string = $wd->get_element_tag_name ($element_id)
Returns the tag of the given element.
$rect = $wd->get_element_rect ($element_id)
Returns the element rect of the given element.
$bool = $wd->is_element_enabled ($element_id)
Returns whether the element is enabled or not.
ELEMENT INTERACTION
$wd->element_click ($element_id)
Clicks the given element.
$wd->element_clear ($element_id)
Clear the contents of the given element.
$wd->element_send_keys ($element_id, $text)
Sends the given text as key events to the given element.
DOCUMENT HANDLING
$source = $wd->get_page_source
Returns the (HTML/XML) page source of the current document.
$results = $wd->execute_script ($javascript, $args)
Synchronously execute the given script with given arguments and
return its results ($args can be "undef" if no arguments are
wanted/needed).
$ten = $wd->execute_script ("return arguments[0]+arguments[1]", [3, 7]);
$results = $wd->execute_async_script ($javascript, $args)
Similar to "execute_script", but instead of waiting for the script
to return, it waits for the script to call its last argument (a
callback), which is added to $args automatically.
$twenty = $wd->execute_async_script ("arguments[0](20)", undef);
COOKIES
$cookies = $wd->get_all_cookies
Returns all cookies, as an arrayref of hashrefs.
# google surely sets a lot of cookies without my consent
$wd->navigate_to ("http://google.com");
use Data::Dump;
ddx $wd->get_all_cookies;
$cookie = $wd->get_named_cookie ($name)
Returns a single cookie as a hashref.
$wd->add_cookie ($cookie)
Adds the given cookie hashref.
$wd->delete_cookie ($name)
Delete the named cookie.
$wd->delete_all_cookies
Delete all cookies.
ACTIONS
$wd->perform_actions ($actions)
Perform the given actions (an arrayref of action specifications
simulating user activity). For further details, read the spec.
An example to get you started:
$wd->navigate_to ("https://duckduckgo.com/html");
$wd->set_timeouts ({ implicit => 10000 });
my $input = $wd->find_element ("css selector", 'input[type="text"]');
$wd->perform_actions ([
{
id => "myfatfinger",
type => "pointer",
pointerType => "touch",
actions => [
{ type => "pointerMove", duration => 100, origin => $wd->element_object ($input), x => 40, y => 5 },
{ type => "pointerDown", button => 1 },
{ type => "pause", duration => 40 },
{ type => "pointerUp", button => 1 },
],
},
{
id => "mykeyboard",
type => "key",
actions => [
{ type => "pause" },
{ type => "pause" },
{ type => "pause" },
{ type => "pause" },
{ type => "keyDown", value => "a" },
{ type => "pause", duration => 100 },
{ type => "keyUp", value => "a" },
{ type => "pause", duration => 100 },
{ type => "keyDown", value => "b" },
{ type => "pause", duration => 100 },
{ type => "keyUp", value => "b" },
{ type => "pause", duration => 2000 },
{ type => "keyDown", value => "\x{E007}" }, # enter
{ type => "pause", duration => 100 },
{ type => "keyUp", value => "\x{E007}" }, # enter
{ type => "pause", duration => 5000 },
],
},
]);
$wd->release_actions
Release all keys and pointer buttons currently depressed.
USER PROMPTS
$wd->dismiss_alert
Dismiss a simple dialog, if present.
$wd->accept_alert
Accept a simple dialog, if present.
$text = $wd->get_alert_text
Returns the text of any simple dialog.
$wd->send_alert_text ($text)
Fills in the user prompt with the given text.
SCREEN CAPTURE
$screenshot = $wd->take_screenshot
Create a screenshot, returning it as a PNG image in a data url.
$screenshot = $wd->take_element_screenshot ($element_id)
Similar to "take_screenshot", but only captures the given element.
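Since the screenshot arrives as a data url rather than as raw image
data, it usually has to be decoded before it can be saved. The following
is a minimal sketch only: it assumes the url has the common
"data:image/png;base64,..." form, and "png_from_data_url" is a
hypothetical helper name, not something this module provides.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper (not part of AnyEvent::WebDriver): extract the raw
# PNG bytes from a base64-encoded "data:" url.
# ASSUMPTION: the url starts with "data:image/png;base64,".
sub png_from_data_url {
   my ($url) = @_;

   $url =~ m{^data:image/png;base64,(.*)\z}s
      or die "not a base64 PNG data url\n";

   return decode_base64 ($1);
}

# Possible usage with a live session (commented out, needs a webdriver):
#
#    my $png = png_from_data_url ($wd->take_screenshot);
#    open my $fh, ">:raw", "screenshot.png" or die "screenshot.png: $!";
#    print $fh $png;
#    close $fh;
```

If the driver returns a different url form, the pattern match fails
loudly instead of silently writing a broken file.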
HELPER METHODS
$object = $wd->element_object ($element_id)
Encoding element ids in data structures is done by representing them
as an object with a special key and the element id as value. This
helper method does this for you.
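To illustrate what such an object looks like, here is a rough sketch of
the encoding. The special key is the web element identifier that also
shows up in the "LOW LEVEL API" example; the two function names are
hypothetical and only serve this illustration, so prefer the real
"element_object" method in actual code.

```perl
use strict;
use warnings;

# The key the W3C protocol uses to mark a hash as a web element reference.
my $WEB_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

# Hypothetical stand-in for what element_object presumably does:
# wrap an element id string into a protocol-level element object.
sub element_object_sketch {
   my ($element_id) = @_;

   return +{ $WEB_ELEMENT_KEY => $element_id };
}

# The reverse direction: pull the id string back out of such an object.
sub element_id_sketch {
   my ($object) = @_;

   return $object->{$WEB_ELEMENT_KEY};
}
```

This is why "perform_actions" origins need "$wd->element_object
($input)" instead of the bare id: the protocol expects the wrapped form.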
EVENT BASED API
This module wouldn't be a good AnyEvent citizen if it didn't have a true
event-based API.
In fact, the simplified API, as documented above, is emulated via the
event-based API and an "AUTOLOAD" function that automatically provides
blocking wrappers around the callback-based API.
Every method documented in the "SIMPLIFIED API" section has an
equivalent event-based method that is formed by appending an underscore
("_") to the method name, and appending a callback to the argument list
(mnemonic: the underscore indicates that "the action is not yet
finished" after the call returns).
For example, instead of blocking calls to "new_session", "navigate_to"
and "back", you can make callback-based ones:
my $cv = AE::cv;
$wd->new_session_ ({}, sub {
my ($status, $value) = @_;
die "error $value->{error}" if $status ne "200";
$wd->navigate_to_ ("http://www.nethype.de", sub {
$wd->back_ (sub {
print "all done\n";
$cv->send;
});
});
});
$cv->recv;
While the blocking methods "croak" on errors, the callback-based ones
all pass two values to the callback, $status and $res, where $status is
the HTTP status code (200 for successful requests, typically 4xx or 5xx
for errors), and $res is the value of the "value" key in the JSON
response object.
Other than that, the underscore variants and the blocking variants are
identical.
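Because every callback receives the same status/value pair, the error
check can be factored out. The sketch below is one possible way to do
that, not something the module ships: "value_or_die" is a hypothetical
name, and it assumes that error responses carry an "error" key in the
value, as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the ($status, $value) pair passed to a
# callback into either the value or an exception, roughly mimicking
# what the blocking variants do.
sub value_or_die {
   my ($status, $value) = @_;

   return $value if $status eq "200";

   # ASSUMPTION: failed requests put a short error code into {error}.
   my $error = ref ($value) eq "HASH" && defined $value->{error}
      ? $value->{error}
      : "unknown error";

   die "webdriver request failed (HTTP $status): $error\n";
}

# Possible usage inside a callback (commented out, needs a live session):
#
#    $wd->get_title_ (sub {
#       my $title = eval { value_or_die (@_) };
#       warn $@ unless defined $title;
#    });
```

Wrapping the call in "eval" matters here, since dying inside an event
loop callback is rarely what you want.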
LOW LEVEL API
All the simplified API methods are very thin wrappers around WebDriver
commands of the same name. They are all implemented in terms of the
low-level methods ("req", "get", "post" and "delete"), which exist in
blocking and callback-based variants ("req_", "get_", "post_" and
"delete_").
Examples are after the function descriptions.
$wd->req_ ($method, $uri, $body, $cb->($status, $value))
$value = $wd->req ($method, $uri, $body)
Appends the $uri to the "endpoint/session/{sessionid}/" URL and
makes an HTTP $method request ("GET", "POST" etc.). "POST" requests
can provide a UTF-8-encoded JSON text as HTTP request body, or the
empty string to indicate no body is used.
For the callback version, the callback gets passed the HTTP status
code (200 for every successful request), and the value of the
"value" key in the JSON response object as second argument.
$wd->get_ ($uri, $cb->($status, $value))
$value = $wd->get ($uri)
Simply a call to "req_" with $method set to "GET" and an empty body.
$wd->post_ ($uri, $data, $cb->($status, $value))
$value = $wd->post ($uri, $data)
Simply a call to "req_" with $method set to "POST" - if $data is
"undef", then an empty object is sent, otherwise, $data must be a
valid request object, which gets encoded into JSON for you.
$wd->delete_ ($uri, $cb->($status, $value))
$value = $wd->delete ($uri)
Simply a call to "req_" with $method set to "DELETE" and an empty
body.
Example: implement "get_all_cookies", which is a simple "GET" request
without any parameters:
$cookies = $wd->get ("cookie");
Example: implement "execute_script", which needs some parameters:
$results = $wd->post ("execute/sync" => { script => "$javascript", args => [] });
Example: call "find_elements" to find all "IMG" elements, stripping the
returned element objects to only return the element ID strings:
my $elems = $wd->post (elements => { using => "css selector", value => "img" });
# yes, the W3C found an interesting way around the typelessness of JSON
$_ = $_->{"element-6066-11e4-a52e-4f735466cecf"}
for @$elems;
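To make the request mechanics described above a bit more concrete, here
is a rough sketch of how the URL and the "POST" body could be put
together. It only follows the description in this section and is not
the module's actual code: "session_url" and "post_body" are
hypothetical names, and details such as slash handling may well differ.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: append $uri to "endpoint/session/{sessionid}/".
sub session_url {
   my ($endpoint, $sid, $uri) = @_;

   return "$endpoint/session/$sid/$uri";
}

# Hypothetical helper: build the body of a POST request. An undefined
# $data becomes an empty JSON object, anything else is encoded as
# UTF-8 JSON text (keys sorted only to make the output predictable).
sub post_body {
   my ($data) = @_;

   return JSON::PP->new->utf8->canonical->encode ($data // {});
}

# What the "find_elements" example above would roughly send:
my $url  = session_url ("http://localhost:4444", "some-session-id", "elements");
my $body = post_body ({ using => "css selector", value => "img" });
```

The session id used here is a placeholder; a real one is stored in
"$wd->{sid}" after "new_session" succeeds.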
HISTORY
This module was unintentionally created (it started inside some quickly
hacked-together script) simply because I couldn't get the existing
"Selenium::Remote::Driver" module to work, ever, despite multiple
attempts over the years and trying to report multiple bugs, which have
been completely ignored. It's also not event-based, so, yeah...
AUTHOR
Marc Lehmann
http://anyevent.schmorp.de