36.1.1 Shared Substrings

Whenever you extract a substring using substring, the Scheme interpreter allocates a new string and copies data from the old string. This allocation and copying is expensive, but substring is so convenient for manipulating text that programmers use it often.
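The copying behaviour described above can be observed directly. The following is a minimal sketch in portable Scheme; the names original and piece are illustrative only, and string-copy is used because string literals may be read-only in some implementations.

```scheme
;; Start from a mutable string (literals may be read-only).
(define original (string-copy "the quick brown fox"))

;; substring returns a string with its own contents: "quick".
(define piece (substring original 4 9))

;; Changing the original afterwards...
(string-set! original 7 #\r)   ; original is now "the quirk brown fox"

;; ...does not affect the extracted substring.
piece                          ; still "quick"
```

Because piece has its own copy of the characters, later changes to original are invisible to it. That independence is what the copy pays for.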

Guile Scheme provides the concept of the shared substring to improve performance of many substring-related operations. A shared substring is an object that mostly behaves just like an ordinary substring, except that it actually shares storage space with its parent string.

make-shared-substring str [start [end]] Deprecated Scheme Procedure
scm_make_shared_substring (str, start, end) Deprecated C Function
Return a shared substring of str. The arguments are the same as for the substring function: the shared substring returned includes all of the text from str between indexes start (inclusive) and end (exclusive). If end is omitted, it defaults to the end of str. The shared substring returned by make-shared-substring occupies the same storage space as str.

Example:

(define foo "the quick brown fox")
(define bar (make-shared-substring foo 4 9))

foo => "t h e   q u i c k   b r o w n   f o x"
bar =========> |---------|

The shared substring bar is not given its own storage space. Instead, the Guile interpreter notes internally that bar points to a portion of the memory allocated to foo. However, bar behaves like an ordinary string in most respects: it may be used with string primitives such as string-length, string-ref, and string=?. Guile automatically translates between indices of bar and indices of foo.

(string-length bar) => 5	; bar covers indices 4 (inclusive) to 9 (exclusive) of foo
(string-ref bar 3)  => #\c	; same as (string-ref foo 7)
(make-shared-substring bar 2)
  => "ick"			; can even make a shared substring of a shared substring!

Because creating a shared substring does not require allocating new storage from the heap, it is a very fast operation. However, because it shares memory with its parent string, a change to the contents of the parent string will implicitly change the contents of its shared substrings.

(string-set! foo 7 #\r)
bar => "quirk"
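The index translation and the sharing of storage can be modelled in a few lines of portable Scheme. This is only an illustrative sketch, not Guile's actual implementation: the view procedures (make-view, view-ref, view-sub, and so on) and the variable names text and quick are hypothetical and exist only for this example.

```scheme
;; Hypothetical model: a "view" is a vector #(parent start end) that
;; refers to the characters parent[start, end) without copying them.
(define (make-view parent start end) (vector parent start end))
(define (view-parent v) (vector-ref v 0))
(define (view-start v) (vector-ref v 1))
(define (view-end v) (vector-ref v 2))

(define (view-length v) (- (view-end v) (view-start v)))

;; Index I of the view corresponds to index START + I of the parent.
(define (view-ref v i)
  (string-ref (view-parent v) (+ (view-start v) i)))

;; A view of a view still points at the original parent string.
(define (view-sub v start end)
  (make-view (view-parent v)
             (+ (view-start v) start)
             (+ (view-start v) end)))

;; Copy the viewed characters out into a fresh string.
(define (view->string v)
  (substring (view-parent v) (view-start v) (view-end v)))

(define text (string-copy "the quick brown fox"))  ; mutable parent
(define quick (make-view text 4 9))

(view-length quick)                  ; 5
(view-ref quick 3)                   ; #\c, same as (string-ref text 7)
(view->string (view-sub quick 2 5))  ; "ick"

(string-set! text 7 #\r)             ; mutate the parent...
(view->string quick)                 ; ...and the view now reads "quirk"
```

Creating a view stores only a reference and two integers, so its cost does not depend on the length of the text. The price is the aliasing shown in the last two lines.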

Guile considers shared substrings to be immutable. This is because programmers might not always be aware that a given string is really a shared substring, and might innocently try to mutate it without realizing that the change would affect its parent string. (We are currently considering a "copy-on-write" strategy that would permit modifying shared substrings without affecting the parent string.)

In general, shared substrings are useful in circumstances where it is important to divide a string into smaller portions, but you do not expect to change the contents of any of the strings involved.
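As an illustration of that kind of read-only division, the sketch below splits a string into words by recording index ranges rather than copying characters. It is written in portable Scheme and does not use make-shared-substring; the procedure word-ranges and the variables text and ranges are hypothetical names chosen for this example.

```scheme
;; Return a list of (start . end) index pairs, one per space-separated
;; word in STR.  No characters are copied; each pair points into STR.
(define (word-ranges str)
  (let loop ((i 0) (start #f) (acc '()))
    (cond
      ;; End of string: close any word still in progress.
      ((= i (string-length str))
       (reverse (if start (cons (cons start i) acc) acc)))
      ;; A space ends the current word, if there is one.
      ((char=? (string-ref str i) #\space)
       (loop (+ i 1) #f (if start (cons (cons start i) acc) acc)))
      ;; Any other character starts or continues a word.
      (else
       (loop (+ i 1) (or start i) acc)))))

(define text "the quick brown fox")
(define ranges (word-ranges text))
;; ranges => ((0 . 3) (4 . 9) (10 . 15) (16 . 19))

;; Copy a word out only when an independent string is really needed.
(substring text (car (cadr ranges)) (cdr (cadr ranges)))  ; "quick"
```

Each range plays the role a shared substring would play: it identifies a portion of the parent without owning any storage, which is safe as long as the parent string is not modified.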