NAME AI::Genetic - A pure Perl genetic algorithm implementation. SYNOPSIS use AI::Genetic; my $ga = new AI::Genetic( -fitness => sub { rand }, -genes => [qw/geneA geneB geneC/], -population => 500, -crossover => 0.9, -mutation => 0.01, ); DESCRIPTION This module implements a Genetic Algorithm (GA) in pure Perl. Other Perl modules that achieve the same thing (perhaps better, perhaps worse) do exist. Please check CPAN. I mainly wrote this module to satisfy my own needs, and to learn something about GAs along the way. I will not go into the details of GAs here, but here are the bare basics. Plenty of information can be found on the web. In a GA, a population of individuals compete for survival. Each individual is designated by a set of genes that defines its behaviour. Individuals that perform better (as defined by the fitness function) have a higher chance of mating with other individuals. When two individuals mate, they swap some of their genes, resulting in an individual that has properties from both of its "parents". Every now and then, a mutation occurs where some gene randomly changes value, resulting in a different individual. A GA implementation runs for a discrete number of time steps call *generations*. During each generation, the following happens: 1. Selection Here the performances of all the individuals are evaluated based on the fitness function, and each is given a specific fitness value. The higher the value, the bigger the chance of an individual passing its genes on in future generations. Currently, individuals are ranked by fitness, and the top half are selected for subsequent steps. 2. Crossover Here, individuals selected are randomly paired up for crossover (aka *sexual reproduction*). This is further controlled by the crossover rate specified and may result in a new offspring individual that contains genes common to both parents. More details are given in the section on "CROSSOVER". 3. Mutation In this step, each individual is given the chance to mutate based on the mutation probability specified. More details are given in the section on "MUTATION". CLASS METHODS Here are the public methods. *$ga*->new(*options*) This is the constructor. It accepts options in the form of hash-value pairs. These are: -population This defines the size of the population, i.e. how many individuals to simultaneously exist at each generation. Defaults to 100. -crossover This defines the crossover rate. Defaults to 0.95. -mutation This defines the mutation rate. Defaults to 0.05. -genes This defines the gene pool. Please see the section on "GENES". *-init* This defines the genes of any initial individuals you want to exist at generation 0. Please see the section on "GENES". *-fitness* This defines a fitness function. It expects a reference to a subroutine. More details are given in the section on "FITNESS FUNCTION". *$ga*->evolve(?*num_generations*?) This method causes the GA to evolve the specified number of generations. If no argument is given, it defaults to 1 generation. In each generation, the 3 steps defined above (selection -> crossover -> mutation) are executed. *$ga*->get_fittest(?*N*?) This returns the *N* fittest individuals. If not specified, N defaults to 1. The actual AI::Genetic::Individual objects are returned. You can use the "genes()" and "score()" methods to get the genes and the scores of the individuals. Please check the AI::Genetic::Individual manpage for details. GENES There are two ways to specify the gene pool to the GA. For lack of imagination, I call them LIST and LOL. They do not mix. So please use one or the other. LIST This is the simplest way to pass the gene pool, but it has its limits. Here, you pass the argument to -genes as an anonymous list of all the possible genes: "-genes => [qw/geneA geneB geneC geneD .../]" The catch here is that genes are binary, so they can assume either an ON or an OFF value. Genes that are off will not be present in an individuals genome. You can specify an initial population of LIST individuals using the -init option as follows: "-init => [ [qw/geneA geneC geneD/], [qw/geneB geneC/], ]" This defines two individuals: the first has genes geneA, geneC and geneD; and the second has two genes: geneB and geneC. LOL Here, each gene can have a range of possible values, which have to be passed on to the GA object as so: "-genes => [ [qw/geneA 0 1 2 3 4], [qw/geneB a b c d e], .... ]" In this example, geneA is allowed to have any of the values 0, 1, 2, 3 or 4. Similarly, geneB can have only one of the values to its right. In this case, all genes will be present in each individual, but the gene will have a value chosen from its respective list. Pleas note: for now, you have to specify all the values that a gene can take. Perhaps in the future I will support specifying a range of values instead, but not now :) You can specify an initial population of LOL individuals using the -init option as follows: "-init => [ { geneA => 1, geneB => 'b', geneC => 5, } ]" This defines one individual with the genes having the specified values. Note here that you have to specify values for ALL genes. This is necessary and failing to do so will create warnings/errors at *run-time*. CROSSOVER I am aware that many crossover techniques exist. I have chosen only one that suited my needs. Basically, fit individuals are paired up randomly. Whether mating actually occurs depends on the crossover rate defined by the -crossover option. For LIST individuals (see the section on "GENES"), the genes of both parents are inspected. Genes shared by both parents have a higher chance of being passed on. Genes present in only one parent have a 50% chance of making it to the child. Genes in neither parent can not pass on to the child. Based on this, genes are randomly selected for the child. Note that under this scheme, the number of genes in both parents need NOT be equal to each other, or to the number of genes in the child. Note also that this might result in a child with no genes, in which case it is assumed that the parent individuals did not mate at all. For LOL individuals (see the section on "GENES"), both parents have all the genes, but with potentially different values. Here, the child individual will have all the genes, and the value of each gene is randomly chosen among the two values of the parents'. If you require a different crossover technique, please let me know. MUTATION Again, I'm aware of the many mutation techniques out there. Again, I have chosen something simple that suits my needs. For LIST individuals (see the section on "GENES"), each individual has a list of the ON genes only. Iterating through this list, the mutation probability determines whether this gene is to be mutated or not. If so, then another gene is chosen from the original list of all possible genes, and replaces the current gene. Note that duplicate genes are not allowed. If a gene is to be replaced by another one that is already in the genome, then no mutation occurs. For LOL individuals (see the section on "GENES"), all genes are present in each individual. Iterating through all the genes, the mutation probability determines whether the gene is to mutate or not. If so, then a new value is chosen from the original list of possible values. Again, if you require a different mutation technique, please let me know. FITNESS FUNCTION AI::Genetic expects the fitness function to be an anonymous subroutine, defined by the user. It is used during selection to calculate a *fitness* value for each individual. The higher the value, the more fit the individual. The arguments to this subroutine depend on the type of genome used. The result is expected to be a scalar number, that defines the fitness of the individual. For LIST individuals (see the section on "GENES"), the input arguments to the fitness function will be a list of the ON genes of the individual. For LOL individuals (see the section on "GENES"), the input argument to the fitness function will be an anonymous hash. The keys of this hash are the the genes, and the values are the values of the genes. At this point, AI::Genetic passes the same reference that it is storing inside of it. So, if you modify the gene within the fitness function subroutine, it will affect the actual individual. Please be careful. BUGS You tell me :) This module is still in early beta mode. So please send any bugs or feature requests to me (See the section on "AUTHOR"). INSTALLATION Either the usual: perl Makefile.PL make make install or just stick it somewhere in @INC where perl can find it. It's in pure Perl. AUTHOR Ala Qumsieh *aqumsieh@cpan.org* COPYRIGHTS This module is distributed under the same terms as Perl itself.