NAME
    Brannigan - Comprehensive, flexible system for validating and parsing input, mainly targeted at web applications.

VERSION
    version 0.3

SYNOPSIS
    This example uses Catalyst, but should be pretty self-explanatory. It's fairly complex, since it covers pretty much all of the available Brannigan functionality, so don't be alarmed by its size.

        package MyApp::Controller::Post;

        use strict;
        use warnings;
        use Brannigan;

        # create a new Brannigan object with two validation/parsing schemes:
        my $b = Brannigan->new(
            {
                name => 'post',
                ignore_missing => 1,
                params => {
                    subject => {
                        required => 1,
                        length_between => [3, 40],
                    },
                    text => {
                        required => 1,
                        min_length => 10,
                        validate => sub {
                            my $value = shift;

                            return undef unless $value;

                            return $value =~ m/^lorem ipsum/ ? 1 : undef;
                        }
                    },
                    day => {
                        required => 0,
                        integer => 1,
                        value_between => [1, 31],
                    },
                    mon => {
                        required => 0,
                        integer => 1,
                        value_between => [1, 12],
                    },
                    year => {
                        required => 0,
                        integer => 1,
                        value_between => [1900, 2900],
                    },
                    section => {
                        required => 1,
                        integer => 1,
                        value_between => [1, 3],
                        parse => sub {
                            my $val = shift;

                            my $ret = $val == 1 ? 'reviews' :
                                      $val == 2 ? 'recipes' :
                                      'general';

                            return { section => $ret };
                        },
                    },
                    id => {
                        required => 1,
                        exact_length => 10,
                        value_between => [1000000000, 2000000000],
                    },
                    '/^picture_(\d+)$/' => {
                        length_between => [3, 100],
                        validate => sub {
                            my ($value, $num) = @_;

                            ...
                        },
                    },
                    somearray => {
                        array => 1,
                        min_length => 3,
                        values => {
                            integer => 1,
                        },
                    },
                    somehash => {
                        hash => 1,
                        keys => {
                            _all => {
                                required => 1,
                            },
                            en => {
                                exact_length => 10,
                            },
                            he => {
                                exact_length => 10,
                            },
                        },
                    },
                },
                groups => {
                    date => {
                        params => [qw/year mon day/],
                        parse => sub {
                            my ($year, $mon, $day) = @_;
                            return undef unless $year && $mon && $day;
                            return { date => $year.'-'.$mon.'-'.$day };
                        },
                    },
                    tags => {
                        regex => '/^tags_(en|he|fr)$/',
                        parse => sub {
                            return { tags => \@_ };
                        },
                    },
                },
            },
            {
                name => 'edit_post',
                inherits_from => 'post',
                params => {
                    subject => {
                        required => 0, # subject is no longer required
                    },
                    id => {
                        forbidden => 1,
                    },
                },
            }
        );

        # post a new blog post
        sub new_post : Local {
            my ($self, $c) = @_;

            # get input parameters hash-ref
            my $params = $c->request->params;

            # process the parameters
            my $parsed_params = $b->process('post', $params);

            if ($parsed_params->{_rejects}) {
                die $c->list_errors($parsed_params);
            } else {
                $c->model('DB::BlogPost')->create($parsed_params);
            }
        }

        # edit a blog post
        sub edit_post : Local {
            my ($self, $c, $id) = @_;

            # 'edit_post' is the name of the second scheme defined above
            my $params = $b->process('edit_post', $c->req->params);

            if ($params->{_rejects}) {
                die $c->list_errors($params);
            } else {
                $c->model('DB::BlogPost')->find($id)->update($params);
            }
        }

DESCRIPTION
    Brannigan is an attempt to ease the pain of collecting input parameters in web applications, validating them and finally (if necessary) parsing them before actually using them. It's designed to answer both of the main problems that web applications face:

    * simple user input

      Brannigan can validate and parse simple, "flat" user input, possibly coming from web forms.

    * complex data structures

      Brannigan can validate and parse complex data structures, possibly deserialized from JSON or XML requests sent to web services and APIs.

    Brannigan's approach to data validation is as follows: define a structure of parameters and their needed validations, and let the module automatically examine input parameters against this structure.
    Brannigan provides you with common validation methods that are used everywhere, and also allows you to create custom validations easily. This structure also defines how, if at all, the input should be parsed. This is akin to schema-based validations such as XSD, but much more functional, and most of all flexible.

    Check the synopsis section for an example of such a structure. I call this structure a validation/parsing scheme. Schemes can inherit all the properties of other schemes, which allows you to be much more flexible in certain situations. As per the synopsis example, imagine you have a blogging application. The base scheme defines all validations and parsing needed to create a new blog post from a user's input. When editing a post, however, some parameters that were required when creating the post might not be required now (so you can just use older values). Inheritance allows you to do so easily by creating another scheme which gets all the properties of the base scheme, changing only what needs changing (and possibly adding specific properties that don't exist in the base scheme).

    Brannigan works by receiving a hash-ref of input parameters, asserting all validation methods required for each parameter, and parsing every parameter (or group of parameters, see below). Brannigan then returns a hash-ref with all parsed input parameters, and a '_rejects' key with failed validations (see more info below).

HOW BRANNIGAN WORKS
    In essence, Brannigan works in three stages, which all boil down to one single command:

    * input stage

      Brannigan receives a hash-ref of input parameters from the user, or a hash-ref based data structure, and the name of a scheme to validate against.

    * data validation

      Brannigan applies all validation methods on the input data, and generates a hash-ref of rejected parameters. For every parameter in the hash-ref, a list of failed validations is created in an array-ref (see more info in the HOW SCHEMES WORK section).
    * data parsing

      Regardless of the previous stage, every parsing method defined in the scheme is applied on the relevant data. The data resulting from these parsing methods, along with the values of all input data for which no parsing methods were defined, is returned to the user in a hash-ref. This hash-ref also includes a _rejects key whose value is the rejects hash created in the previous stage.

      The reason I say this stage isn't dependent on the previous stage is simple. First of all, it's possible no parameters failed validation, but the truth is this stage doesn't care if a parameter failed validation. It will still parse it and return it to the user, and no errors are ever raised by Brannigan. It is the developer's (i.e. your) job to decide what to do in case rejects are present.

HOW SCHEMES WORK
    The validation/parsing scheme defines the structure of the data you're expecting to receive, along with information about the way it should be validated and parsed. A scheme is a hash-ref based data structure that has the following keys:

    * name

      Defines the name of the scheme. Required.

    * ignore_missing

      Boolean value indicating whether input parameters that are not referenced in the scheme should be left out of the parsed output. Optional, defaults to false (i.e. parameters missing from the scheme will be added to the output as-is). It is probably a good idea to turn this on, so that any input parameters you're not expecting to receive from users are ignored.

    * inherits_from

      Either a scalar naming a different scheme or an array-ref of scheme names. The new scheme will inherit all the properties of the scheme(s) defined by this key. If an array-ref is provided, the scheme will inherit their properties in the order they are defined. See the CAVEATS section for some "heads-up" about inheritance.

    * params

      The params key is the most important part of the scheme, as it defines the expected input. This key takes a hash-ref containing the names of input parameters.
      The value of every such name (i.e. key) is itself a hash-ref. This hash-ref defines the validation methods to assert for this parameter, and optionally a parse method. The idea is this: use the name of the validation method as the key, and the appropriate values for this method as the value of this key. For example, if a certain parameter, let's say 'subject', must be between 3 and 10 characters long, then your scheme will contain:

          subject => {
              length_between => [3, 10]
          }

      These values (3 and 10) will be passed to the "length_between()" validation method, along with the actual value of the parameter from the input, which is automatically prepended to the "length_between()" method's parameters. Suppose a certain subject sent to your app failed the "length_between()" validation; then the rejects hash-ref described earlier will have something like this:

          subject => ['length_between(3, 10)']

      Notice the values of the "length_between()" validation method were added to the string, so you can easily know why the parameter failed the validation.

      Aside from the built-in validation methods that come with Brannigan, a custom validation method can be defined for each parameter. This is done by adding a 'validate' key to the parameter, with an anonymous subroutine as the value. As with built-in methods, the parameter's value will be automatically sent to this method. So, for example, if the subject parameter from above must start with the words 'lorem ipsum', then we can define the subject parameter like so:

          subject => {
              length_between => [3, 10],
              validate => sub {
                  my $value = shift;

                  return $value =~ m/^lorem ipsum/ ? 1 : 0;
              }
          }

      Custom validation methods, just like built-in ones, are expected to return a true value if the parameter passed the validation, or a false value otherwise. If a parameter failed a custom validation method, then 'validate' will be added to the list of failed validations for this parameter.
      So, in our 'subject' example, the rejects hash-ref will have something like this:

          subject => ['length_between(3, 10)', 'validate']

      It is more than possible that the way input parameters are passed to your application will not be exactly the way you'll eventually use them. That's where parsing methods come in handy. Brannigan doesn't have any built-in parsing methods (obviously), so you must create these yourself, just like custom validation methods. All you need to do is add a 'parse' key to the parameter's definition, with an anonymous subroutine. This subroutine also receives the value of the parameter automatically, and is expected to return a hash-ref of key-value pairs. You will probably find that most of the time this hash-ref will only contain one key-value pair, and that the key will probably just be the name of the parameter. But note that when a parse method exists, Brannigan makes absolutely no assumptions about what else to do with that parameter, so you must do so yourself. After all parameters have been parsed by Brannigan, all these little hash-refs are merged into one hash-ref that is returned to the application. If a parse method doesn't exist for a parameter, Brannigan will simply add it "as-is" to the resulting hash-ref. Returning to our subject example (which we defined must start with 'lorem ipsum'), let's say we want to substitute 'lorem ipsum' with 'effing awesome' before using this parameter. Then the subject definition will now look like this:

          subject => {
              length_between => [3, 10],
              validate => sub {
                  my $value = shift;

                  return $value =~ m/^lorem ipsum/ ? 1 : 0;
              },
              parse => sub {
                  my $value = shift;

                  $value =~ s/^lorem ipsum/effing awesome/;

                  return { subject => $value };
              }
          }

      If you're still not sure what happens when no parse method exists, then you can imagine Brannigan uses the following default parse method:

          param => {
              parse => sub {
                  my $value = shift;

                  return { param => $value };
              }
          }

      As of version 0.3, parameter names can also be regular expressions in the form '/regex/'. Sometimes you cannot know the names of all parameters passed to your app. For example, you might have a dynamic web form which starts with a single field called 'url_1', but your app allows your visitors to dynamically add more fields, such as 'url_2', 'url_3', etc. Regular expressions are handy in such situations. Your parameter key can be '/^url_(\d+)$/', and all such fields will be matched. Regex params have a special feature: if your regex uses capturing, then the captured values will be passed to the custom "validate" and "parse" methods (in their order) after the parameter's value. For example:

          '/^url_(\d+)$/' => {
              validate => sub {
                  my ($value, $num) = @_;

                  # $num has the value captured by (\d+) in the regex

                  return $value =~ m!^http://! ? 1 : undef;
              },
              parse => sub {
                  my ($value, $num) = @_;

                  return { urls => { $num => $value } };
              },
          }

      Please note that a regex must be defined with a leading and a trailing slash, in single quotes, otherwise it won't work. It is also important to note what happens when a parameter matches a regex rule (or perhaps several rules), and also has a direct reference in the scheme.
      For example, let's say we have the following rules in our scheme:

          '/^sub(ject|headline)$/' => {
              required => 1,
              length_between => [3, 10],
          },
          subject => {
              required => 0,
          }

      When validating and parsing the 'subject' parameter, Brannigan will automatically merge both of these references to the subject parameter, giving preference to the direct reference, so the actual structure against which the parameter will be validated is as follows:

          subject => {
              required => 0,
              length_between => [3, 10],
          }

      If your parameter matches more than one regex rule, they will all be merged, but there's no way (yet) to control the order in which these regex rules will be merged.

      As previously stated, Brannigan can also validate and parse somewhat more complex data structures. So, your parameter no longer has to be just a string or a number, but may be a hash-ref or an array-ref. In the first case, you tell Brannigan the parameter is a hash-ref by adding a 'hash' key with a true value, and a 'keys' key with a hash-ref which is just like the 'params' hash-ref. For example, suppose you're receiving a 'name' parameter from the user as a hash-ref containing first and last names. This is how the 'name' parameter might be defined:

          name => {
              hash => 1,
              required => 1,
              keys => {
                  first_name => {
                      length_between => [3, 10],
                  },
                  last_name => {
                      required => 1,
                      min_length => 3,
                  },
              }
          }

      What are we seeing here? We see that the 'name' parameter must be a hash-ref, that it's required, and that it has two keys: first_name, whose length must be between 3 and 10 if it's present, and last_name, which must be 3 characters or more, and must be present.

      An array parameter, on the other hand, is a little different. Like before, you define the parameter as an array-ref with the 'array' key and a true value, and a 'values' key. This key has a hash-ref of validation and parse methods that will be applied to EVERY value inside this array.
      For example, suppose you're receiving a 'pictures' parameter from the user as an array-ref containing URLs to pictures on the web. This is how the 'pictures' parameter might be defined:

          pictures => {
              array => 1,
              length_between => [1, 5],
              values => {
                  min_length => 3,
                  validate => sub {
                      my $value = shift;

                      return $value =~ m!^http://! ? 1 : 0;
                  },
              },
          }

      What are we seeing this time? We see that the 'pictures' parameter must be an array, with no less than one item (i.e. value) and no more than five items (notice that we're using the same "length_between()" method from before, but in the context of an array, it doesn't validate against character count but item count). We also see that every value in the 'pictures' array must have a minimum length of three (this time it is character-wise), and must start with 'http://'.

      What Brannigan returns for such structures when they fail validations is a little different than before. Instead of an array-ref of failed validations, Brannigan will return a hash-ref. This hash-ref might contain a '_self' key with an array-ref of validations that failed specifically on the parameter itself (such as the 'required' validation for the 'name' parameter or the 'length_between' validation for the 'pictures' parameter), and/or keys for each value in these structures that failed validation. If it's a hash, then the key will simply be the name of that key. If it's an array, it will be its index. For example, let's say the 'first_name' key under the 'name' parameter failed the "length_between(3, 10)" validation method, and that the 'last_name' key was not present (and hence failed the "required()" validation). Also, let's say the 'pictures' parameter failed the "length_between(1, 5)" validation (for the sake of the argument, let's say it had 6 items instead of the maximum allowed 5), and that the 2nd item failed the "min_length(3)" validation, and the 6th item failed the custom validate method.
      Then our rejects hash-ref will have something like this:

          name => {
              first_name => ['length_between(3, 10)'],
              last_name => ['required(1)'],
          },
          pictures => {
              _self => ['length_between(1, 5)'],
              1 => ['min_length(3)'],
              5 => ['validate'],
          }

      Notice the '_self' key under 'pictures' and that the numbering of the items of the 'pictures' array starts at zero (obviously).

      The beauty of Brannigan's data structure support is that it's recursive. So, it's not that a parameter can be a hash-ref and that's it. Every key in that hash-ref might be in itself a hash-ref, and every key in that hash-ref might be an array-ref, and every value in that array-ref might be a hash-ref... well, you get the idea. What might that look like? Well, just take a look at this:

          pictures => {
              array => 1,
              values => {
                  hash => 1,
                  keys => {
                      filename => {
                          min_length => 5,
                      },
                      source => {
                          hash => 1,
                          keys => {
                              website => {
                                  validate => sub { ... },
                              },
                              license => {
                                  one_of => [qw/GPL FDL CC/],
                              },
                          },
                      },
                  },
              },
          }

      So, we have a pictures array in which every value is a hash-ref with a filename key and a source key, and the value of the source key is a hash-ref with a website key and a license key.

      The _all "parameter" can be used in a scheme to define rules that apply to all of the parameters at that level. It can be used either directly in the 'params' key of the scheme, or in the 'keys' key of a hash parameter.

    * groups

      Groups are very useful for parsing parameters that are somehow related to each other. This key takes a hash-ref containing the names of the groups (the names themselves are irrelevant, they're more for you). Every group also takes a hash-ref, with a rule defining which parameters are members of this group, and a parse method to use with these parameters (just like our custom parse methods from the 'params' key). This custom parse method will automatically receive the values of all the parameters in the group, in the order they were defined.
      For example, suppose our app gets a user's birth date by using three web form fields: day, month and year. And suppose our app saves this date in a database in the format 'YYYY-MM-DD'. Then we can define a group, say 'date', that automatically does this. For example:

          date => {
              params => [qw/year month day/],
              parse => sub {
                  my ($year, $month, $day) = @_;

                  # sprintf zero-pads month and day, and (unlike prepending
                  # '0' by hand) copes with input that is already padded
                  return { date => sprintf('%04d-%02d-%02d', $year, $month, $day) };
              },
          }

      As an alternative to the 'params' key, you can define a 'regex' key that takes a regex. All parameters whose names match this regex will be parsed as a group. As opposed to using regexes in the 'params' key of the scheme, captured values in the regexes will not be passed to the parse method, only the values of the parameters will. Also, please note that there's no way to know in which order the values will be provided when using regexes for groups.

      For example, let's say our app receives one or more URLs (to whatever type of resource) in the input, in parameters named 'url_1', 'url_2', 'url_3' and so on, and that there's no limit on the number of such parameters we can receive. Now, suppose we want to create an array of all of these URLs, possibly to push it to a database. Then we can create a 'urls' group such as this:

          urls => {
              regex => '/^url_(\d+)$/',
              parse => sub {
                  my @urls = @_;

                  return { urls => \@urls };
              }
          }

    Schemes are created by passing them to the Brannigan constructor. You can pass as many schemes as you like, and these schemes can inherit from one another. You can create the Brannigan object that gets these schemes wherever you want: maybe in a controller of your web app that will directly use this object to validate and parse the input it gets, or maybe in a special validation class that will hold all schemes. It doesn't matter where, as long as you make the object available to your application.
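
    To make the rule merging described under 'params' a little more concrete (a regex rule and a direct reference that both apply to 'subject'), here is a minimal plain-Perl sketch. It does not use Brannigan at all: rules_for() is a hypothetical helper written only for this illustration, and the actual merge happens inside the module and may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Brannigan's API): collect the rules that
# apply to one parameter name from a 'params'-style hash-ref.
sub rules_for {
    my ($params, $name) = @_;
    my %merged;

    # Regex keys are written as '/.../'. Merge every regex rule that matches.
    # The sort only makes this sketch deterministic; the text above says
    # Brannigan gives no guarantee about the order of regex rules.
    foreach my $key (sort keys %$params) {
        next unless $key =~ m!^/(.+)/$!;
        my $re = qr/$1/;
        next unless $name =~ $re;
        %merged = (%merged, %{ $params->{$key} });
    }

    # The direct reference is merged last, so its values take preference.
    %merged = (%merged, %{ $params->{$name} }) if exists $params->{$name};

    return \%merged;
}

my $params = {
    '/^sub(ject|headline)$/' => { required => 1, length_between => [3, 10] },
    subject                  => { required => 0 },
};

# 'subject' gets required => 0 (direct reference wins) plus length_between
my $subject_rules = rules_for($params, 'subject');

# 'subheadline' only matches the regex rule, so it stays required
my $headline_rules = rules_for($params, 'subheadline');
```

    Merging the direct reference last is what gives it preference here; that ordering is an assumption of this sketch, chosen to match the behaviour described above.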
HOW THE PARSE METHOD WORKS
    As stated earlier, your "parse" methods are expected to return a hash-ref of key-value pairs. Brannigan collects all of these key-value pairs and merges them into one big hash-ref (along with all the non-parsed parameters).

    Brannigan actually allows your "parse" methods to be two-leveled. This means that a value in a key-value pair can itself be a hash-ref or an array-ref. This allows you to use the same key in different places, and Brannigan will automatically aggregate all of these places, just like in the first level. So, for example, suppose your scheme has a regex rule that matches parameters like 'tag_en' and 'tag_he'. Your parse method might return something like "{ tags => { en => 'an english tag' } }" when it matches the 'tag_en' parameter, and something like "{ tags => { he => 'a hebrew tag' } }" when it matches the 'tag_he' parameter. The resulting hash-ref from the process method will thus include "{ tags => { en => 'an english tag', he => 'a hebrew tag' } }".

    Similarly, let's say your scheme has a regex rule that matches parameters like 'url_1', 'url_2', etc. Your parse method might return something like "{ urls => [$url_1] }" for 'url_1' and "{ urls => [$url_2] }" for 'url_2'. The resulting hash-ref in this case will be "{ urls => [$url_1, $url_2] }".

    Take note, however, that only two levels are supported, and that I'm so tired right now that I have no idea what I'm writing.

OK, BUT HOW DO I USE THIS THING?
    OK, so we have created our scheme(s), we know how schemes look and work, but what now? Well, that's the easy part. All you need to do is call the "process()" method on the Brannigan object, passing it the name of the scheme to enforce and a hash-ref of the input parameters/data structure. This method will return a hash-ref, with all the parameters after parsing. If any validations failed, this hash-ref will have a '_rejects' key, with the rejects hash-ref described earlier.
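
    Since the '_rejects' hash-ref can be nested (array-refs of failed validations for simple parameters, hash-refs with an optional '_self' key for hash and array parameters), one way to report it is to flatten it into plain strings. The following is only a sketch in plain Perl: list_rejects() is a hypothetical helper, not something Brannigan provides, and the sample data is copied from the rejects examples shown earlier rather than produced by a real call to "process()".

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Brannigan): flatten a rejects hash-ref
# into a list of "path: failed_validation" strings.
sub list_rejects {
    my ($rejects, $prefix) = @_;
    my @out;

    foreach my $key (sort keys %$rejects) {
        # '_self' refers to the enclosing parameter, so it adds no path part
        my $path = !defined $prefix ? $key
                 : $key eq '_self'  ? $prefix
                 :                    "$prefix.$key";

        my $val = $rejects->{$key};
        if (ref $val eq 'ARRAY') {
            # a plain list of failed validations
            push @out, map { "$path: $_" } @$val;
        } elsif (ref $val eq 'HASH') {
            # a nested hash or array parameter: recurse into it
            push @out, list_rejects($val, $path);
        }
    }

    return @out;
}

# Sample rejects, shaped like the examples in HOW SCHEMES WORK
my $rejects = {
    subject  => ['length_between(3, 10)', 'validate'],
    name     => {
        first_name => ['length_between(3, 10)'],
        last_name  => ['required(1)'],
    },
    pictures => {
        _self => ['length_between(1, 5)'],
        1     => ['min_length(3)'],
        5     => ['validate'],
    },
};

my @lines = list_rejects($rejects);
```

    In an application you would pass "$parsed_params->{_rejects}" (when it exists) to such a helper and decide what to do with the resulting list, for example rendering it back into the form.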
    Remember: Brannigan doesn't raise any errors. It's your job to decide what to do, and that's a good thing.

    Example schemes, input and output can be seen in Brannigan::Examples.

SO WHICH VALIDATION METHODS ARE PROVIDED?
    For a list of all validation methods provided by Brannigan, check Brannigan::Validations.

METHODS
  new( \%scheme | @schemes )
    Creates a new instance of Brannigan, with the provided scheme(s) (see HOW SCHEMES WORK for more info on schemes).

  process( $scheme, \%params )
    Receives the name of a scheme and a hash-ref of input parameters (or a data structure), and validates and parses these parameters according to the scheme (see HOW SCHEMES WORK for detailed information about this process).

    Returns a hash-ref of parsed parameters according to the parsing scheme, possibly containing a list of failed validations for each parameter.

    Actual processing is done by Brannigan::Tree.

INTERNAL METHODS
  _build_tree( $scheme )
    Builds the final "tree" of validations and parsing methods to be performed on the parameters hash during processing.

CAVEATS
    Brannigan is still in an early stage. Currently, no checks are made to validate the schemes built, so if you incorrectly define your schemes, Brannigan will not croak and processing will probably fail. Also, there is no support yet for recursive inheritance or any crazy inheritance situation. While deep inheritance is supported, it hasn't been tested extensively. Bugs are also popping up as I go along, so keep in mind that you might encounter some (and please report any if that happens).

IDEAS FOR THE FUTURE
    The following list of ideas may or may not be implemented in future versions of Brannigan:

    * cross-scheme custom validation/parsing methods

      Add an option to define custom validate/parse methods in the Brannigan object that can be used in the schemes as if they were built-in methods.
    * support for third-party validation methods

      Add support for loading validation methods defined in third-party modules (such as Brannigan::Validations) and using them in schemes as if they were built-in methods.

    * validate schemes with Brannigan itself

      Have Brannigan use itself to validate the schemes it receives from the developers (i.e. users of this module).

    * support loading schemes from JSON/XML

      Allow loading schemes from JSON/XML files or any other source. Does that make any sense?

    * something to aid rejects traversal

      Find something that would make traversal of the rejects list easier or whatever.

SEE ALSO
    Brannigan::Validations, Brannigan::Tree.

AUTHOR
    Ido Perlmuter

BUGS
    Please report any bugs or feature requests to "bug-brannigan at rt.cpan.org", or through the web interface. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT
    You can find documentation for this module with the perldoc command.

        perldoc Brannigan

    You can also look for information at:

    * RT: CPAN's request tracker

    * AnnoCPAN: Annotated CPAN documentation

    * CPAN Ratings

    * Search CPAN

ACKNOWLEDGEMENTS
    Brannigan is inspired by Oogly (Al Newkirk) and the "Ketchup" jQuery validation plugin.

LICENSE AND COPYRIGHT
    Copyright 2010 Ido Perlmuter.

    This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

    See http://dev.perl.org/licenses/ for more information.