NAME
    Bio::Minimizer - minimizer package

    Based on the ideas put forth by Roberts et al. (2004).

SYNOPSIS
        my $minimizer = Bio::Minimizer->new($sequenceString);
        my $kmers     = $minimizer->{kmers};      # hash of minimizer => kmer
        my $minimizers= $minimizer->{minimizers}; # hash of minimizer => [kmer1,kmer2,...]

        # With more options
        my $minimizer2= Bio::Minimizer->new($sequenceString,{k=>31,l=>21});

DESCRIPTION
    Creates a set of minimizers from a sequence.

EXAMPLES
    Example: sort a fastq file by minimizer, potentially shrinking its
    gzip size.

        use Bio::Minimizer;

        my @entry;

        # Read fastq file via stdin, in this example
        while(my $id = <>){
          # Grab an entry
          my ($seq,$plus,$qual) = (scalar(<>), scalar(<>), scalar(<>));
          chomp($id,$seq,$plus,$qual);

          # minimizer object
          my $MINIMIZER = Bio::Minimizer->new($seq);
          # Get smallest minimizer in this entry
          my $minMinimizer = (sort {$a cmp $b} values(%{$$MINIMIZER{minimizers}}))[0];

          # Combine the minimum minimizer with the entry, for
          # sorting later.
          # Save the entry as a string so that we don't have to
          # parse it later.
          my $entry = [$minMinimizer, "$id\n$seq\n$plus\n$qual\n"];
          push(@entry,$entry);
        }

        # Print the entries in order of their smallest minimizer
        for my $e(sort {$$a[0] cmp $$b[0]} @entry){
          print $$e[1];
        }

VARIABLES
    $Bio::Minimizer::iThreads
        Boolean describing whether the module instance is using threads

METHODS
    Bio::Minimizer->new()
            Arguments:
              sequence     A string of ACGT
              settings     A hash
                k          Kmer length
                l          Minimizer length (some might call it lmer)
                numcpus    Number of threads to use.
                           Using threads might currently work against you.
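To illustrate what the k and l settings mean, here is a minimal pure-Perl sketch of the general minimizer idea: for every kmer of length k, take the lexicographically smallest substring of length l. The function name naive_minimizers and the example sequence are hypothetical and are not part of Bio::Minimizer. The module itself may do more than this (for example, it may treat reverse complements differently or store its results in another layout), so treat this as an illustration of the concept only.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Bio::Minimizer.
# Returns a hash ref mapping each kmer of $seq (length $k) to its
# lexicographically smallest substring of length $l.
sub naive_minimizers {
    my ($seq, $k, $l) = @_;
    my %minimizer_of;

    # Slide a window of length $k along the sequence
    for my $i (0 .. length($seq) - $k) {
        my $kmer = substr($seq, $i, $k);

        # Find the smallest lmer inside this kmer
        my $min;
        for my $j (0 .. $k - $l) {
            my $lmer = substr($kmer, $j, $l);
            $min = $lmer if !defined($min) || $lmer lt $min;
        }
        $minimizer_of{$kmer} = $min;
    }
    return \%minimizer_of;
}

# Small values of k and l so the output is easy to inspect
my $m = naive_minimizers("GATTACAGATTACA", 7, 3);
for my $kmer (sort keys %$m) {
    print "$kmer\t$m->{$kmer}\n";
}
```

Neighboring kmers often share the same minimizer, which is why grouping reads by minimizer tends to place similar sequences next to each other.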