regexps.com
Dynamically allocated arrays are a common feature of C programs. The Hackerlab C library provides two array-like data structures: variable size arrays, and power-of-two size sparse arrays.
Variable size arrays are contiguous regions of memory, similar to memory allocated by malloc. The size of a variable size array, measured as the number of elements it contains, can be retrieved at run-time using the function ar_size.
Power-of-two size sparse arrays are a tree structure, used to represent an array holding a number of elements which is a power of two. Access to these arrays is very fast, and they are memory efficient for very large arrays which are sparsely populated.
A variable size array is a dynamically allocated block of memory, similar to a block returned by lim_malloc, except that a variable size array is tagged with its size, measured in the number of array elements.
A null pointer counts as an array of 0 elements. For example, if ar_size, which returns the size of a variable size array, is passed 0, it returns 0. That means there is no special function to allocate a new variable size array; instead, array pointers should be initialized to 0. This example creates an array of ten integers by using ar_ref:
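{
  int * the_array;
  int * tenth_element;

  /* A null pointer is an array of 0 elements. */
  the_array = 0;
  /* Referencing element 9 expands the array to ten elements. */
  tenth_element = (int *)ar_ref ((void **)&the_array, lim_use_must_malloc, 9, sizeof (int));
}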
A variable size array can be used as a stack. (See ar_push and ar_pop.)
Array functions use the lim_malloc family of functions to allocate memory. (See Allocation With Limitations.)
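The idea of a size-tagged block can be sketched in standalone C. This is only an illustration under stated assumptions: the names sketch_tag, sketch_size, sketch_ref and sketch_free are hypothetical, the tag is assumed to sit immediately before the element data, and plain realloc stands in for the lim_malloc family. It is not the Hackerlab implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size tag stored just before the element data.
 * The union keeps the element data suitably aligned (C11 max_align_t).
 */
union sketch_tag
{
  size_t n_elts;
  max_align_t align;
};

/* Number of elements; a null pointer counts as an empty array. */
static size_t
sketch_size (void * base)
{
  return base ? ((union sketch_tag *)base - 1)->n_elts : 0;
}

/* Address of element n, growing the array to n+1 elements if needed.
 * New elements are filled with 0 bytes.  On allocation failure,
 * returns 0 and leaves *base unchanged.
 */
static void *
sketch_ref (void ** base, size_t n, size_t szof)
{
  size_t old_n = sketch_size (*base);

  if (n >= old_n)
    {
      union sketch_tag * old_tag = *base ? (union sketch_tag *)*base - 1 : 0;
      union sketch_tag * new_tag = realloc (old_tag, sizeof *new_tag + (n + 1) * szof);

      if (!new_tag)
        return 0;
      new_tag->n_elts = n + 1;
      *base = new_tag + 1;      /* the block may have moved */
      memset ((char *)*base + old_n * szof, 0, (n + 1 - old_n) * szof);
    }
  return (char *)*base + n * szof;
}

/* Release the block, tag included, and reset *base to the empty array. */
static void
sketch_free (void ** base)
{
  if (*base)
    free ((union sketch_tag *)*base - 1);
  *base = 0;
}
```

Because the empty array is a null pointer, no separate constructor is needed in this sketch either: the first call to sketch_ref allocates the block.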
int ar_size (void * base, alloc_limits limits, size_t elt_size);
Return the number of elements in the array. If base == 0, return 0.
limits is the allocation limits associated with this array.
void * ar_ref (void ** base, alloc_limits limits, int n, int szof);
Return the address of element n of an array, expanding the array to n+1 elements, if necessary.
base is a pointer to a pointer to the array.
limits is the allocation limits associated with this array.
szof is the size, in bytes, of one element of the array.
If this function adds new elements to an array, those elements are filled with 0 bytes.
This function may resize and relocate the array. If it does, *base is updated to point to the new location of the array.
void ar_setsize (void ** base, alloc_limits limits, int n, size_t szof);
Resize the array so that it contains exactly n elements.
base is a pointer to a pointer to the array.
limits is the allocation limits associated with this array.
szof is the size, in bytes, of one element.
If this function adds new elements to an array, those elements are filled with 0 bytes.
This function can be used to make an array smaller, but doing so does not reclaim any storage. (See ar_compact.)
void ar_compact (void ** base, alloc_limits limits, size_t szof);
Resize an array so that it is only as large as it needs to be.
base is a pointer to a pointer to the array.
limits is the allocation limits associated with this array.
szof is the size, in bytes, of one element.
This function may resize and relocate the array. If it does, *base is updated to point to the new location of the array.
Functions like ar_setsize can be used to make an array smaller, but doing so does not reclaim any storage used by the array and does not move the array in memory.
This function does attempt to reclaim storage (by using lim_realloc). If the array occupies significantly more memory than needed, this function will move it to a smaller block.
If lim_realloc returns 0, this function has no effect.
void ar_free (void ** base, alloc_limits limits);
Release storage associated with the array pointed to by *base. Set *base to 0.
limits is the allocation limits associated with this array.
void * ar_push (void ** base, alloc_limits limits, size_t szof);
Return the address of the nth element of an array that previously contained only n-1 elements; that is, add one new element to the end of the array and return its address.
base is a pointer to a pointer to the array.
limits is the allocation limits associated with this array.
szof is the size, in bytes, of one element.
The new array element is filled with 0 bytes.
This function may resize and relocate the array. If it does, *base is updated to point to the new location of the array.
void * ar_pop (void ** base, alloc_limits limits, size_t szof);
Return the address of the nth element in an array previously containing n elements. Resize the array so that it contains exactly n-1 elements.
base is a pointer to a pointer to the array.
limits is the allocation limits associated with this array.
szof is the size, in bytes, of one element.
This function may resize and relocate the array. If it does, *base is updated to point to the new location of the array.
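The stack discipline described for ar_push and ar_pop can be sketched in standalone C. The names int_stack, int_stack_push and int_stack_pop are hypothetical, and the separate capacity field and the doubling growth policy are assumptions made for the illustration rather than details of the Hackerlab implementation. The sketch shows why the address returned by a pop is still usable: shrinking the element count does not release storage.

```c
#include <stdlib.h>

/* Hypothetical stack of ints.  `cap` counts allocated slots and
 * `n` counts the elements currently in use.
 */
struct int_stack
{
  int * elts;
  size_t n;
  size_t cap;
};

/* Append one element filled with 0 and return its address,
 * or return 0 if allocation fails.
 */
static int *
int_stack_push (struct int_stack * s)
{
  if (s->n == s->cap)
    {
      size_t new_cap = s->cap ? 2 * s->cap : 4;
      int * moved = realloc (s->elts, new_cap * sizeof (int));

      if (!moved)
        return 0;
      s->elts = moved;          /* the block may have been relocated */
      s->cap = new_cap;
    }
  s->elts[s->n] = 0;
  return &s->elts[s->n++];
}

/* Shrink the stack by one element and return the address of the
 * element just removed.  No storage is released, so the address stays
 * readable until the next push.  Returning 0 for an empty stack is an
 * assumption of this sketch.
 */
static int *
int_stack_pop (struct int_stack * s)
{
  return s->n ? &s->elts[--s->n] : 0;
}
```

A caller would typically copy the popped value out immediately, since a later push may overwrite or relocate it.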
void * ar_copy (void * base, alloc_limits limits, size_t szof);
Create a new array which is a copy of the array pointed to by base.
limits is the allocation limits associated with this array.
#include <hackerlab/arrays/pow2-array.h>
A pow2_array (power-of-two sparse array) is an array-like data structure. It always holds a number of elements which is some power of two. It provides reasonably fast access to elements (but slower than ordinary arrays). It provides good memory efficiency for sparsely populated arrays.
NOTE: this interface is not yet complete. Some details of the existing interface may change in future releases.
A pow2_array is represented by a tree structure of uniform depth. Leaf elements are ordinary (dynamically allocated) arrays, each leaf having the same number of elements.
So that sparsely populated arrays can be stored efficiently in memory, subtrees which are populated entirely with default values are represented in one of two ways: the root of such a subtree may be represented as a NULL pointer; or the root of such a subtree may be represented by a default node. In the latter case, one default node exists for each level of the tree (a default root, a default second-level node, a default leaf node, and so on).
Representation by null pointer saves memory by not allocating default nodes. Representation by default nodes speeds up access, in some cases.
For each level of the tree, two values are defined: a shift and a mask. For a given internal node of the tree, the subtree containing the Nth element below that node is stored in the subtree:
(N >> shift) & mask
The index of the same element within that subtree is:
N & ((1 << shift) - 1)
For leaf nodes, shift is 0.
The opaque type pow2_array_rules holds the set of shift and mask values which define a tree structure for arrays of some size.
The opaque type pow2_array holds a particular array.
Here is a simple example: a sparse array containing 8 elements. (In ordinary use, we would presumably choose a much larger power of two.)
We will define one possible tree structure for this array: a two-level tree with four elements in each leaf. Other structures are possible: we might have defined a two-level structure with two elements in each leaf, or a three-level structure with two elements in each leaf and two sub-trees below each internal node.
For the two level tree with four elements per leaf node, we have:
root: shift == 2 mask == 1
leaf nodes: shift == 0 mask == 3
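As a worked illustration of these values, the following standalone C sketch applies the shift and mask of each level in turn to find the slot chosen at the root and the slot chosen within the leaf. The names level_rule, example_rules and index_path are hypothetical and exist only for this example; they are not part of the pow2_array interface.

```c
#include <stddef.h>

/* Hypothetical description of one level of the tree. */
struct level_rule
{
  int shift;
  size_t mask;
};

/* Rules for the 8-element example: a root with two children,
 * each child a leaf holding four elements.
 */
static const struct level_rule example_rules[] = { { 2, 1 }, { 0, 3 } };

/* Store in path[] the slot selected at each level for element n.
 * The last rule must have shift == 0 (the leaf level), and path[]
 * must have room for one entry per level.
 */
static void
index_path (const struct level_rule * rules, size_t n, size_t * path)
{
  size_t level = 0;

  for (;;)
    {
      /* Which child (or leaf slot) holds the element at this level. */
      path[level] = (n >> rules[level].shift) & rules[level].mask;
      if (rules[level].shift == 0)
        break;
      /* Index of the same element within the chosen subtree. */
      n &= ((size_t)1 << rules[level].shift) - 1;
      ++level;
    }
}
```

For element 6, for instance, the root slot is (6 >> 2) & 1 == 1, the index within that subtree is 6 & 3 == 2, and the leaf slot is (2 >> 0) & 3 == 2.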
The default leaf node, at address Ld, is a four element array:
Ld:
    -----------------------------
    | dflt | dflt | dflt | dflt |
    -----------------------------
The default root node, at address Rd, is a two element array:
Rd:
    -----------
    | Ld | Ld |
    -----------
An array with a non-default value (V) in element 1 (counting from 0), but default values everywhere else might look like:
root:
              ----------------
              | leaf |  Ld   |
              ---/-------|----
                /        |
               /         |
              /          |
leaf:                    Ld:
--------------------------  -----------------------------
| dflt | V | dflt | dflt |  | dflt | dflt | dflt | dflt |
--------------------------  -----------------------------
Suppose that elements in this array are of type T. Then, using the shift and mask values given above, the address of element N in that tree is:
(T *)((char *)root[(N >> 2) & 1] + sizeof (T) * ((N & ((1 << 2) - 1)) & 3))
That is the address returned by the function pow2_array_rref. Note that this address might be in leaf, or it might be in the default leaf Ld.
When modifying a particular element, it is important to not modify the default leaf. A copy-on-write strategy is used. For example, before modifying element 7, the tree is rewritten:
root:
              -----------------
              | leaf0 | leaf1 |
              ---/--------\----
                /          \
               /            \
              /              \
leaf0:                       leaf1:
--------------------------  -----------------------------
| dflt | V | dflt | dflt |  | dflt | dflt | dflt | dflt |
--------------------------  -----------------------------
The function pow2_array_ref performs that copy-on-write operation and then returns an element address similarly to pow2_array_rref.
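The difference between a read-only reference and a copy-on-write reference can be sketched in standalone C for the 8-element example. The type sparse8 and the functions sparse8_init, sparse8_rref and sparse8_ref are hypothetical stand-ins written for this illustration, with int elements and plain malloc in place of the limited allocator. They are not the pow2_array implementation.

```c
#include <stdlib.h>
#include <string.h>

enum { LEAF_ELTS = 4, ROOT_SLOTS = 2 };

/* Hypothetical two-level sparse array of 8 ints. */
struct sparse8
{
  int * root[ROOT_SLOTS];       /* each slot points to a leaf */
  int * default_leaf;           /* shared leaf; never written through */
};

/* Start with every root slot pointing at the shared default leaf. */
static void
sparse8_init (struct sparse8 * a, int * default_leaf)
{
  int i;

  a->default_leaf = default_leaf;
  for (i = 0; i < ROOT_SLOTS; ++i)
    a->root[i] = default_leaf;
}

/* Read-only address of element n; may point into the default leaf. */
static const int *
sparse8_rref (const struct sparse8 * a, size_t n)
{
  return a->root[(n >> 2) & 1] + (n & 3);
}

/* Writable address of element n.  If the leaf is still the shared
 * default leaf, replace it with a private copy first.  Returns 0 if
 * the copy cannot be allocated.
 */
static int *
sparse8_ref (struct sparse8 * a, size_t n)
{
  size_t slot = (n >> 2) & 1;

  if (a->root[slot] == a->default_leaf)
    {
      int * copy = malloc (LEAF_ELTS * sizeof (int));

      if (!copy)
        return 0;
      memcpy (copy, a->default_leaf, LEAF_ELTS * sizeof (int));
      a->root[slot] = copy;
    }
  return a->root[slot] + (n & 3);
}
```

In this sketch, writing through the address returned by sparse8_ref for element 7 leaves the default leaf untouched, so element 3 still reads as the default value.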
If the default value for elements is a region of memory filled with 0 bytes, a tree can be represented without using default nodes. For example, the array containing a value only in element 1 (counting from 0) would be represented:
root:
        ----------------
        | leaf |   0   |
        ---/------------
          /
         /
leaf:
-----------------
| 0 | V | 0 | 0 |
-----------------
A tree of this variety is created by not specifying a default leaf node when calling make_pow2_array_rules. pow2_array_rref returns a NULL pointer if an element is accessed which is not currently in such a tree.
The function pow2_array_compact compresses the representation of a sparse array by eliminating identical subtrees. For example, after calling pow2_array_compact on an array with default values everywhere except elements 1 and 5, the tree would look like:
root:
        -----------------
        | leaf  | leaf  |
        ----\-------/----
             \     /
              \   /
               \ /
leaf:
--------------------------
| dflt | V | dflt | dflt |
--------------------------
After calling pow2_array_compact, it is no longer safe to call pow2_array_ref for the same array. pow2_array_rref is safe.
pow2_array_compact is useful in combination with pow2_array_print.
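Why a writable reference becomes unsafe after compaction can be seen in a small standalone C sketch. The function share_identical_leaves is hypothetical and handles only the two-leaf example with int elements, assuming each leaf was allocated separately with malloc. How pow2_array_compact actually detects identical subtrees is not described here.

```c
#include <stdlib.h>
#include <string.h>

enum { LEAF_ELTS = 4 };

/* If the two leaves hold identical contents, free one of them and
 * make both root slots point at the other.  Assumes each leaf was
 * allocated separately with malloc.
 */
static void
share_identical_leaves (int * root[2])
{
  if (root[0] != root[1]
      && !memcmp (root[0], root[1], LEAF_ELTS * sizeof (int)))
    {
      free (root[1]);
      root[1] = root[0];
    }
}
```

Once both slots share one leaf, a write through either slot changes every element that maps to that leaf: storing to element 1 would also change element 5. Reading is unaffected, which is why only read-only access remains safe.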
Function
make_pow2_array_rules
pow2_array_rules make_pow2_array_rules (alloc_limits limits, size_t elt_size, void * default_page, int shift, size_t mask, ...);
Return the pow2_array_rules which defines the tree structure for a particular type of sparse array.
limits is used when allocating the pow2_array_rules and default nodes. See Allocation With Limitations.
elt_size is the size, in bytes, of individual elements.
default_page is either 0 or a default leaf node.
The remaining arguments are a series of shift and mask pairs, ending with a pair in which shift is 0.
See The pow2_array Data Structure for more information about default leaf nodes, shifts, and masks.
If allocation fails, this function returns 0.
pow2_array pow2_array_alloc (alloc_limits limits, pow2_array_rules rules);
Allocate a sparse array.
limits is used when allocating the array. It is also used by pow2_array_ref when allocating nodes within the array. See Allocation With Limitations.
rules defines the tree structure for the array and should be an object returned by make_pow2_array_rules.
If allocation fails, this function returns 0.
void * pow2_array_rref (pow2_array array, size_t addr);
Return the address of element addr within array.
The value pointed to by this address should not be modified.
If the element has never been modified, and no default leaf node was passed to make_pow2_array_rules, this function returns 0.
void * pow2_array_ref (pow2_array array, size_t addr);
Return the address of element addr within array.
The value pointed to by this address may be modified.
This function might allocate memory if the element has not previously been modified. If allocation fails, this function returns 0.