

4.1 String and Character Basics

Characters are represented in Emacs Lisp as integers; whether an integer is a character or not is determined only by how it is used. Thus, strings really contain integers.
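
As a small illustration of the point above, the following sketch shows that a character read with the `?` syntax is an ordinary integer, and that the elements of a string are those same integers. It assumes an Emacs recent enough to provide characterp.

```elisp
;; A character literal is just an integer.
(list (= ?A 65)          ; ?A and 65 are the same object
      (characterp ?A)    ; 65 is a valid character code
      (aref "ABC" 0)     ; the first element of the string is an integer
      (string ?A 66))    ; build a string from integers
;; => (t t 65 "AB")
```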

The length of a string (like any array) is fixed, and cannot be altered once the string exists. Strings in Lisp are not terminated by a distinguished character code. (By contrast, strings in C are terminated by a character with ASCII code 0.)
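
The sketch below illustrates both statements: a character with code 0 is an ordinary string element rather than a terminator, and "lengthening" a string with concat produces a new string instead of changing the original. The variable names are only illustrative.

```elisp
(let* ((s "a\0b")                 ; code 0 in the middle of the string
       (longer (concat s "!")))   ; a new, separate string
  (list (length s)                ; the NUL counts as an element
        (length longer)
        (eq s longer)))           ; S itself was not altered
;; => (3 4 nil)
```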

Since strings are arrays, and therefore sequences as well, you can operate on them with the general array and sequence functions. (See Sequences Arrays Vectors.) For example, you can access or change individual characters in a string using the functions aref and aset (see Array Functions).
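
For instance, the following sketch applies aref, aset and a few general sequence functions to a string. It works on a copy made with copy-sequence, on the assumption that modifying a string constant that appears literally in the program is best avoided.

```elisp
(let ((s (copy-sequence "hello")))   ; fresh, modifiable copy
  (aset s 0 ?j)                      ; change the first character in place
  (list s
        (aref s 1)                   ; array access
        (elt s 4)                    ; generic sequence access
        (length s)
        (append s nil)))             ; the string as a list of integers
;; => ("jello" 101 111 5 (106 101 108 108 111))
```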

There are two text representations for non-ASCII characters in Emacs strings (and in buffers): unibyte and multibyte (see Text Representations). An ASCII character always occupies one byte in a string; in fact, when a string is all ASCII, there is no real difference between the unibyte and multibyte representations. For most Lisp programming, you don't need to be concerned with these two representations.
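
The following sketch compares a unibyte and a multibyte version of the same all-ASCII string. It assumes the functions multibyte-string-p, string-to-multibyte and string-bytes behave as in current Emacs versions; the results shown are what one would expect there.

```elisp
(let ((uni "abc")                              ; an all-ASCII literal is unibyte
      (multi (string-to-multibyte "abc")))     ; the same text, multibyte
  (list (multibyte-string-p uni)
        (multibyte-string-p multi)
        (string= uni multi)                    ; same contents either way
        (string-bytes uni)                     ; one byte per ASCII character
        (string-bytes multi)))
;; => (nil t t 3 3)
```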

Sometimes key sequences are represented as strings. When a string is a key sequence, string elements in the range 128 to 255 represent meta characters (which are large integers) rather than character codes in the range 128 to 255.
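
As a rough illustration, the sketch below shows the element stored in a string for Meta-x next to the large integer that represents the same key as an event. It uses listify-key-sequence and kbd, on the assumption that both are available as in current Emacs versions.

```elisp
(list (aref "\M-x" 0)                  ; string element: ?x plus 128
      ?\M-x                            ; the meta character as an event
      (listify-key-sequence "\M-x")    ; string elements converted to events
      (kbd "M-x"))                     ; the same key sequence as a vector
;; => (248 134217848 (134217848) [134217848])
```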

Strings cannot hold characters that have the hyper, super or alt modifiers; they can hold ASCII control characters, but no other control characters. They do not distinguish case in ASCII control characters. If you want to store such characters in a sequence, such as a key sequence, you must use a vector instead of a string. See Character Type, for more information about the representation of meta and other modifiers for keyboard input characters.
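
The sketch below illustrates these restrictions: the two spellings of Control-a yield the same string element, while keys with the hyper or super modifier are kept in a vector. Writing such a modifier inside a string literal is rejected by the Lisp reader, so it is not attempted here.

```elisp
(list (string= "\C-a" "\C-A")            ; case is not distinguished
      (aref "\C-a" 0)                    ; ASCII control character, code 1
      (= ?\H-a (+ ?a (ash 1 24)))        ; hyper is a high bit on the integer
      (vector ?\H-a ?\s-b)               ; a vector can hold such characters
      (kbd "H-a"))                       ; kbd returns a vector in this case
;; => (t 1 t [16777313 8388706] [16777313])
```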

Strings are useful for holding regular expressions. You can also match regular expressions against strings (see Regexp Search). The functions match-string (see Simple Match Data) and replace-match (see Replacing Match) are useful for decomposing and modifying strings based on regular expression matching.
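
For example, the following sketch matches a regular expression held in a string against another string, extracts two groups with match-string, and builds a modified copy with replace-match. The sample text and pattern are made up for illustration.

```elisp
(let ((s "version 27.1")
      (re "\\([0-9]+\\)\\.\\([0-9]+\\)"))   ; the regexp is itself a string
  (when (string-match re s)
    (list (match-string 1 s)                ; first group
          (match-string 2 s)                ; second group
          ;; FIXEDCASE and LITERAL are t; passing S makes replace-match
          ;; return a new string rather than edit a buffer.
          (replace-match "X.Y" t t s))))
;; => ("27" "1" "version X.Y")
```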

Like a buffer, a string can contain text properties for the characters in it, as well as the characters themselves. See Text Properties. All the Lisp primitives that copy text from strings to buffers or other strings also copy the properties of the characters being copied.
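
As a brief sketch of this behavior, the code below attaches a face property to a string with propertize and checks that the property survives concatenation and insertion into a temporary buffer, while substring-no-properties discards it.

```elisp
(let* ((s (propertize "bold" 'face 'bold))
       (joined (concat s " text")))
  (list (get-text-property 0 'face s)
        (get-text-property 0 'face joined)        ; copied by concat
        (get-text-property 5 'face joined)        ; " text" has no face
        (with-temp-buffer
          (insert s)                              ; copied into the buffer
          (get-text-property (point-min) 'face))
        (get-text-property 0 'face (substring-no-properties s))))
;; => (bold bold nil bold nil)
```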

See Text, for information about functions that display strings or copy them into buffers. See Character Type, and String Type, for information about the syntax of characters and strings. See Non-ASCII Characters, for functions to convert between text representations and to encode and decode character codes.