SYNOPSIS
    For English:

        use DateTime::Format::Alami::EN;
        my $parser = DateTime::Format::Alami::EN->new();
        my $dt;
        $dt = $parser->parse_datetime("2 hours 13 minutes from now");
        $dt = $parser->parse_datetime("yesterday");

    For Indonesian:

        use DateTime::Format::Alami::ID;
        my $parser = DateTime::Format::Alami::ID->new();
        my $dt;
        $dt = $parser->parse_datetime("5 jam lagi");
        $dt = $parser->parse_datetime("hari ini");

DESCRIPTION
    This class parses a human/natural-language date/time string and returns
    a DateTime object. Currently it supports English and Indonesian. The
    goal of this module is to make it easier to add support for other human
    languages.

    To actually use this class, you must use one of its subclasses, one for
    each human language that you want to parse.

    There are already some other DateTime human language parsers on CPAN and
    elsewhere, see "SEE ALSO".

HOW IT WORKS
    DateTime::Format::Alami is the base class. Each human language is
    implemented in a separate DateTime::Format::Alami::* module (e.g.
    DateTime::Format::Alami::EN and DateTime::Format::Alami::ID), which is a
    subclass.

    Parsing is done using a single recursive regex (i.e. one containing
    (?&NAME) and (?(DEFINE)) patterns, see perlre). This regex is composed
    from the pattern strings returned by the p_* and o_* methods, which
    makes individual pieces easy to override in an OO fashion.

    A pattern string returned by a p_* method is a normal regex pattern
    string that will be compiled with the /x and /i regex modifiers. The
    pattern string can also refer to the pattern of another o_* or p_*
    method. For example, p_today for English might be something like:

        # single quotes keep \s intact in the pattern string
        sub p_today { '(?: today | this \s+ day )' }

    Other examples:

        sub p_yesterday { '(?: yesterday )' }

        sub p_dateymd { join(
            "",
            '(?: \\s* ? | \\s* \\b|[ /-]\\b )',
            '(?: \\s*[,/-]?\\s* )?'
        )}

        sub o_date { '(?: ||)' }

        sub p_time { '(?: :(?:)? \s* )' }

        sub p_date_time { '(?: (?:\s+ at)? )' }

    When a p_* pattern matches, the corresponding a_* action method is
    invoked.
    Usually the method sets or modifies a DateTime object in $self->{_dt}.
    For example, this is the code for a_today:

        sub a_today {
            my $self = shift;
            $self->{_dt} = DateTime->today;
        }

    The patterns from all p_* methods are combined in an alternation to form
    the final pattern.

    An o_* pattern is just like a p_* pattern, except that it is not
    combined into the final pattern and matching it does not execute a
    corresponding a_* method.

    There are also w_* methods, which return arrays of strings.

ADDING A NEW HUMAN LANGUAGE
    TBD

METHODS
  new => obj
    Constructor. You must instantiate a subclass rather than this class.

  parse_datetime($str[ , \%opts ]) => obj
    Parse/extract a date/time expression in $str. Returns undef if the
    expression cannot be parsed. Otherwise returns a DateTime object (or a
    string/number if the format option is verbatim/epoch, or a hash if the
    format option is combined), or an array of objects/strings/numbers (if
    the returns option is all/all_cron).

    Known options:

    * time_zone => str

      Will be passed to the DateTime constructor.

    * format => str (DateTime|verbatim|epoch|combined)

      The default is DateTime, which returns a DateTime object. Other
      choices are verbatim (returns the original text), epoch (returns a
      Unix timestamp), and combined (returns a hash containing keys like
      DateTime, verbatim, epoch, and other extra information: pos [position
      of pattern in the string], pattern [pattern name], m [raw named
      capture groups], uses_time [whether the date involves time of day]).

      You might think that choosing epoch avoids the overhead of DateTime,
      but it does not, since DateTime is used as the primary format during
      parsing. The epoch is retrieved from the DateTime object using its
      epoch method. Choosing verbatim, however, does avoid the overhead of
      DateTime (as long as you set returns to first, last, or all).

    * prefers => str (nearest|future|past)

      NOT YET IMPLEMENTED.

      This option decides what happens when an ambiguous date appears in the
      input. For example, "Friday" may refer to any number of Fridays.
      Possible choices are: nearest (prefer the nearest date, the default),
      future (prefer the closest future date), past (prefer the closest past
      date).

    * returns => str (first|last|earliest|latest|all|all_cron)

      If the text contains multiple dates, this option determines which date
      is returned. Possible choices are: first (return the first date found
      in the string, the default), last (return the final date found in the
      string), earliest (return the date found in the string that
      chronologically precedes any other date in the string), latest (return
      the date found in the string that chronologically follows any other
      date in the string), all (return all dates found in the string, in the
      order they were found in the string), all_cron (return all dates found
      in the string, in chronological order).

      When all or all_cron is chosen, the method returns an array(ref) of
      results instead of a single result, even if there is only one result.

FAQ
  What does "alami" mean?
    It is an Indonesian word meaning "natural".

SEE ALSO
  Similar modules on CPAN
    Date::Extract. DateTime::Format::Alami has some features of
    Date::Extract, so it can be used to replace Date::Extract.

    For Indonesian: DateTime::Format::Indonesian, Date::Extract::ID
    (currently this module uses DateTime::Format::Alami as its backend).

    For English: DateTime::Format::Natural. You probably want to use that
    module instead, unless you want something other than English. I did try
    to create an Indonesian translation for it a few years ago, but gave up.
    Perhaps I should make another attempt.

  Other modules on CPAN
    DateTime::Format::Human deals with formatting and not parsing.

  Similar non-Perl libraries
    Natty Java library, which the last time I tried sometimes gave weird
    answers, e.g. "32 Oct" becomes 1 Oct in the far future.
    http://natty.joestelmach.com/

    Duckling Clojure library, which can parse date/time as well as numbers
    with some other units like temperature.
    https://github.com/wit-ai/duckling
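
EXAMPLE
    The following is a minimal, self-contained sketch of the mechanism
    described under HOW IT WORKS: pattern strings from p_* and o_* methods
    are placed in a (?(DEFINE)...) block, the p_* names form the top-level
    alternation, and a match dispatches to the corresponding a_* method.
    The MiniAlami class, its parse method, the hit_* capture names, the
    p_in_days/o_int/a_in_days methods, and the plain-string results are all
    assumptions made for illustration, and sub-patterns are referenced with
    raw (?&NAME) syntax. It is not the module's actual implementation.

```perl
use strict;
use warnings;

package MiniAlami;    # hypothetical class, for illustration only

sub new { return bless {}, shift }

# p_* patterns: part of the final alternation; each one has an a_* action.
sub p_today     { '(?: today | this \s+ day )' }
sub p_yesterday { '(?: yesterday )' }
sub p_in_days   { '(?: in \s+ (?&o_int) \s+ days? )' }    # reuses an o_* pattern

# o_* patterns: building blocks only; not in the alternation, no action.
sub o_int { '(?: \d+ )' }

# a_* actions record a result (the real module sets $self->{_dt} instead).
sub a_today     { $_[0]{_result} = 'today' }
sub a_yesterday { $_[0]{_result} = 'yesterday' }
sub a_in_days {
    my ($self, $text) = @_;
    my ($n) = $text =~ /(\d+)/;
    $self->{_result} = "+${n}d";
}

sub parse {
    my ($self, $str) = @_;
    my @p = qw(p_today p_yesterday p_in_days);
    my @o = qw(o_int);

    # One named capture per p_* pattern, each recursing into its definition.
    my $alternation = join ' | ', map { "(?<hit_$_> (?&$_) )" } @p;

    # The DEFINE block declares every named sub-pattern without matching text.
    my $define = join ' ', map { "(?<$_> " . $self->$_() . " )" } @p, @o;

    my $re = qr/\b (?: $alternation ) \b (?(DEFINE) $define )/xi;

    delete $self->{_result};
    if ($str =~ $re) {
        for my $p (@p) {
            next unless defined $+{"hit_$p"};
            (my $action = $p) =~ s/^p_/a_/;    # p_today -> a_today
            $self->$action($+{"hit_$p"});
            last;
        }
    }
    return $self->{_result};    # undef when nothing matched
}

package main;

my $parser = MiniAlami->new;
for my $text ('see you this day', 'Yesterday was fine', 'due in 3 days') {
    my $res = $parser->parse($text);
    printf "%-20s => %s\n", $text, defined $res ? $res : '(no match)';
}
```

    Because every pattern comes from a method, a subclass for another
    language could override p_today alone and inherit the rest, which is
    the point of composing the regex from small overridable pieces.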