======================= A T T E N T I O N : ======================= ############################################################################## ## ## ## The "Set::IntegerFast"-distribution has been renamed to "Bit::Vector"!!! ## ## ======================================================================== ## ## ## ############################################################################## ===================================== Package "Bit::Vector" Version 4.0 ===================================== for Perl version 5.000 and higher Copyright (c) 1995, 1996, 1997 by Steffen Beyer. All rights reserved. This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Contents of this file: ---------------------- + Where to find + Reasons for the change of the name of this distribution and module + Migration strategy + Prerequisites + What does it do + New features in version 4.0 + This distribution contains + Erratum Where to find: -------------- You can download this module directly from the author's web site, where you will also find my other modules and a couple of logos describing what the modules do: http://www.engelschall.com/u/sb/download/ You should also be able to find this module on any CPAN ftp site (CPAN = "Comprehensive Perl Archive Network"), where the file "Bit-Vector-4.0.tar.gz" should be found in any of the following directories: .../CPAN/authors/id/STBEY/ .../CPAN/modules/by-category/06_Data_Type_Utilities/Bit/ .../CPAN/modules/by-module/Bit/ To find a CPAN ftp site, you can either direct your web browser to http://www.perl.com/CPAN/modules/by-module/Bit/Bit-Vector-4.0.tar.gz (which will automatically redirect you to a CPAN ftp server near you) or look into "The Perl 5 Module List" by Tim Bunce and Andreas Koenig either on USENET in the "comp.lang.perl.modules" newsgroup or at http://www.perl.com/CPAN/modules/00modlist.long.html Reasons for the change of the name of this distribution and module: ------------------------------------------------------------------- Version 4.0 is a complete rewrite of the whole "Set::Integerfast" package, during which a huge number of new methods have been added to this module (see also below!), with also large additions to its associated application modules. The module has been renamed to "Bit::Vector" since it now not only offers methods for the handling of sets, but also for other applications based on bit vectors, like boolean matrices and shift registers. Having multiple modules using C code via the XSUB interface in one distribution (the "Set::IntegerFast" and the "Math::MatrixBool" modules) made things prohibitively costly to maintain. A third such module was on the verge to arrive, one to implement bit shift registers of arbitrary length. Instead of duplicating the basic code to create, handle and destroy bit vectors (which would have been the same for all three modules!), everything was assembled into one unique C library. The name of "Set::IntegerFast" simply didn't fit this grown library anymore, a more general name was needed. "Bit::Vector" seems to be the most appropriate for it. Migration strategy: ------------------- You will need to apply the following changes to your existing applications that use the "Set::IntegerFast" module (use interactive global text search & replace for this): "Set::IntegerFast" --> "Bit::Vector" (required) "Empty_Interval(" --> "Interval_Empty(" (recommended) "Fill_Interval(" --> "Interval_Fill(" (recommended) "Flip_Interval(" --> "Interval_Flip(" (recommended) "Delete(" --> "Bit_Off(" (recommended) "Insert(" --> "Bit_On(" (recommended) "flip(" --> "bit_flip(" (recommended) "in(" --> "bit_test(" (recommended) "in(" --> "contains(" (alternative) "inclusion(" --> "subset(" (recommended) You should also apply these very changes to all of your applications using the "Set::IntegerRange" and the "Math::MatrixBool" module. This is because the same changes as in the new "Bit::Vector" module have also been applied to these two modules, including (some of) the new methods and (all of) the new method names and aliases! The old method names on the left of the table above are still supported, but deprecated from now on (this means that they could disappear at some day in the future). The module "Set::IntegerFast" still exists, but it is maintained only to assure backward compatibility. It might also go sometime in the future. Therefore, it is strongly recommended that you migrate your application(s). Prerequisites: -------------- Perl version 5.000 or higher, a C compiler capable of the ANSI C standard (!) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ What does it do: ---------------- The base class of this distribution, "Bit::Vector", allows you to create bit vectors and sets of arbitrary size (only limited by the size of a machine word and available memory on your system) with indices (= elements) in the range from zero to some positive integer, to dynamically change the size of such bit vectors or sets and to perform a broad range of basic operations on them, like - adding or removing elements (setting and clearing single bits), - testing the presence of a certain element (testing a single bit), - setting or clearing contiguous ranges of bits, - detecting contiguous ranges of set bits, - copying bit vectors, - converting a bit vector into either a compact (hexadecimal) or a human-readable string representation (allowing you to store bit vectors in a file, for instance), - reading in the contents of a bit vector from a string, - comparing two bit vectors for equality and lexical order, - performing bitwise shift and rotation operations, - computing the union, intersection, difference, symmetric difference or complement of sets, - testing two sets for equality or inclusion (subset relationship), - computing the minimum, the maximum and the norm (number of elements) of a set, and more. Note also that it is very easy to implement sets of arbitrary intervals of integers using this module (negative indices are no obstacle), despite the fact that only intervals of positive integers (from zero to some positive integer) are supported directly. Please refer to the "Set::IntegerRange" module (also contained in this distribution) and its man page to see how this can be done! The "Bit::Vector" module is mainly intended for mathematical or algorithmical computations. There are also a number of efficient algorithms that rely on sets (and bit vectors). An example of such an efficient algorithm (which uses a different representation for sets, however, not bit vectors) is Kruskal's algorithm for minimal spanning trees in graphs. (See the module "Graph::Kruskal" and its man page in this distribution.) Another famous algorithm using bit vectors is the "Seave of Erathostenes" for calculating prime numbers, which is included here as a demo program (see the file "primes.pl" in this distribution). An important field of application is the computation of "first", "follow" and "look-ahead" character sets for the construction of LL, SLR, LR and LALR parsers for compilers (or a compiler-compiler, like "yacc", for instance). (That's what the C library in this package was initially written for.) (See Aho, Hopcroft, Ullman, "The Design and Analysis of Computer Algorithms" for an excellent book on efficient algorithms and the famous "Dragon Book" on how to build compilers by Aho, Sethi, Ullman.) Therefore, this module is primarily designed for efficiency, which is the reason why most of its methods are implemented in C. To increase execution speed, the module doesn't use bytes as its basic storage unit, it rather uses machine words, assuming that a machine word is the most efficiently handled size of all scalar types on any machine (that's what the ANSI C standard proposes and assumes anyway). In order to achieve this, it automatically determines the number of bits in a machine word on your system and then adjusts its internal configuration constants accordingly. The greater the size of this basic storage unit, the better the complexity (= execution speed) of the methods in this module (but also the greater the average waste of unused bits in the last word). Note that the C library of this package ("BitVector.c") is designed in such a way that it can be used independently from Perl and this Perl extension module. (!) For this, you can use the file "BitVector.o" exactly as it is produced when building this module! It contains no references to Perl, and it doesn't need any Perl header files in order to compile. (It only needs "Definitions.h" and some system header files.) Note however that this C library does not perform any bounds checking whatsoever! (This is your application's duty!) (See the respective explanation in the file "BitVector.c" for more details and the file "Vector.xs" for an example of how this can be done!) In this module, all bounds and type checking (which should be absolutely fool-proof, BTW!) is done in the XSUB routines (in C). For more details about the modules in this distribution, please refer to their respective man pages! New features in version 4.0: ---------------------------- Flip $vector->Flip(); Interval_Scan_inc while (($min,$max) = $vector->Interval_Scan_inc($start)) Interval_Scan_dec while (($min,$max) = $vector->Interval_Scan_dec($start)) rotate_left $carry_out = $vector->rotate_left(); rotate_right $carry_out = $vector->rotate_right(); shift_left $carry_out = $vector->shift_left($carry_in); shift_right $carry_out = $vector->shift_right($carry_in); to_String $string = $vector->to_String(); # e.g., "A08A28AC" from_string $ok = $vector->from_string($string); Multiplication $matrix1->Multiplication($rows1,$cols1, $matrix2,$rows2,$cols2, $matrix3,$rows3,$cols3); Closure $matrix->Closure($rows,$cols); Shadow $other_vector = $some_vector->Shadow(); Clone $twin_vector = $some_vector->Clone(); new_from_String eval { $vector = Bit::Vector->new_from_String($string); }; to_ASCII $string = $vector->to_ASCII(); # e.g., "2,3,5-7,11,13-19" from_ASCII eval { $vector->from_ASCII($string); }; Emptyness if ($vector) # if not empty if (! $vector) # if empty unless ($vector) # if empty # "$index" is a number or a Perl scalar variable containing a # number which represents the set containing only that element: Equality if ($vector1 == $vector2) if ($vector1 != $vector2) if ($vector == $index) if ($vector != $index) Lexical Comparison $cmp = $vector1 cmp $vector2; if ($vector1 lt $vector2) if ($vector1 le $vector2) if ($vector1 gt $vector2) if ($vector1 ge $vector2) if ($vector1 eq $vector2) if ($vector1 ne $vector2) $cmp = $vector cmp $index; if ($vector lt $index) if ($vector le $index) if ($vector gt $index) if ($vector ge $index) if ($vector eq $index) if ($vector ne $index) Shift Register $carry_out = $vector << $carry_in; $carry_out = $vector >> $carry_in; $carry_out = $carry_in >> $vector; $vector <<= $carry_in; $vector >>= $carry_in; Rotate Register $carry_out = $vector << $vector->bit_test($vector->Size()-1); $carry_out = $vector >> $vector->bit_test(0); $carry_out = $vector->bit_test(0) >> $vector; $vector <<= $vector->bit_test($vector->Size()-1); $vector >>= $vector->bit_test(0); String Conversion $string = "$vector"; print "\$vector = '$vector'\n"; Union $set1 = $set2 + $set3; $set1 += $set2; $set1 = $set2 | $set3; $set1 |= $set2; $vector1 = $vector2 + $index; $vector += $index; $vector1 = $vector2 | $index; $vector |= $index; Intersection $set1 = $set2 * $set3; $set1 *= $set2; $set1 = $set2 & $set3; $set1 &= $set2; $vector1 = $vector2 * $index; $vector *= $index; $vector1 = $vector2 & $index; $vector &= $index; Difference $set1 = $set2 - $set3; $set1 -= $set2; $set1 = $set2 - $set1; $vector1 = $vector2 - $index; $vector1 = $index - $vector2; $vector -= $index; ExclusiveOr $set1 = $set2 ^ $set3; $set1 ^= $set2; $vector1 = $vector2 ^ $index; $vector ^= $index; Complement $set1 = -$set2; $set1 = ~$set2; $set = -$set; $set = ~$set; Subset Relationship if ($set1 <= $set2) True Subset Relationship if ($set1 < $set2) Superset Relationship if ($set1 >= $set2) True Superset Relationship if ($set1 > $set2) Norm $norm = abs($set); This distribution contains: --------------------------- * a "BitVector" C library (can be used stand-alone!) (files "Definitions.h" and "BitVector.c") + efficient (fast) handling of bit vectors and sets of integers (and more) + determines the size of a machine word on your system and automatically configures itself accordingly for maximum speed of execution * a "Bit::Vector" base class (files "Vector.pm", "Vector.xs", "typemap", "BitVector.h", "Definitions.h" and "BitVector.c") + efficient (fast) object-oriented methods for handling bit vectors and sets of integers (intervals from zero to some positive integer) + methods for converting bit vectors in strings and vice-versa (supports "newsrc" style sets) + overloaded arithmetic and relational operators for maximum comfort * a "Set::IntegerFast" wrapper module (file "lib/Set/IntegerFast.pm") + assures backward compatibility to previous versions * a "Set::IntegerRange" application module (file "lib/Set/IntegerRange.pm") + object-oriented methods for handling sets of integers (arbitrary intervals) + overloaded arithmetic and relational operators for maximum comfort + methods for converting sets in strings and vice-versa * a "Math::MatrixBool" application module (file "lib/Math/MatrixBool.pm") + object-oriented methods for handling matrices of booleans (Boolean Algebra) + overloaded arithmetic and relational operators for maximum comfort + computes reflexive transitive closure using Kleene's algorithm (essential for solving path-problem in graphs) + methods for converting matrices in strings and vice-versa + matrix multiplication and closure are actually carried out in the "Bit::Vector" module for maximum speed of execution Related modules (NOT derived from the "Bit::Vector" base class): * "Math::MatrixReal" (file "lib/Math/MatrixReal.pm") + object-oriented methods for handling matrices of reals + overloaded arithmetic and relational operators for maximum comfort + allows to solve linear equation systems using an efficient algorithm known as "LR decomposition" and several approximative (iterative) methods + features an implementation of Kleene's algorithm to compute the minimal costs for all paths in a graph with weighted edges + methods for converting matrices in strings and vice-versa * "DFA::Kleene" (file "lib/DFA/Kleene.pm") + another implementation of Kleene's algorithm to compute the language accepted by a Deterministic Finite Automaton * "Math::Kleene" (file "lib/Math/Kleene.pod") + a man page giving a brief introduction into the theory behind Kleene's algorithm * "Graph::Kruskal" (file "lib/Graph/Kruskal.pm") + implementation of Kruskal's efficient algorithm for Minimal Spanning Trees in graphs O( n * ld(n) ) + example of an efficient algorithm relying heavily on sets (with a different representation, though, not bit vectors) Erratum: -------- In the file "CHANGES" in the "Bit::Vector" distribution in section "Migration strategy" is an error: Instead of "Delete(" --> "Bit_On(" (recommended) "Insert(" --> "Bit_Off(" (recommended) it should of course read "Delete(" --> "Bit_Off(" (recommended) "Insert(" --> "Bit_On(" (recommended) Please excuse this lapse! Thank you! ----------------------------------------------------------------------------- I hope you will find this module benefitful! Yours, -- Steffen Beyer http://www.engelschall.com/u/sb/ "There is enough for the need of everyone in this world, but not for the greed of everyone." - Mahatma Gandhi