

8.7.1 Character-Related Data Types

First, let’s review the basic character-related data types used by SXEmacs. The separate typedefs are not strictly necessary in the current implementation (all of them boil down to unsigned char or int), but they make the code much clearer, because a glance at a declaration reveals the intended use of the variable.

Emchar

An Emchar holds a single Emacs character.

Obviously, the equality between characters and bytes is lost in the Mule world. Characters can be represented by one or more bytes in the buffer, and Emchar is the C type large enough to hold any character.

Without Mule support, an Emchar is equivalent to an unsigned char.

Bufbyte

The data representing the text in a buffer or string is logically a sequence of Bufbytes.

SXEmacs does not work with the same character format all the time: when reading characters from the outside, it decodes them to an internal format, and it encodes them again when writing. Bufbyte (in fact unsigned char) is the basic unit of the internal format used for SXEmacs buffers and strings. A Bufbyte * points at text encoded in the variable-width internal encoding.

One character can correspond to one or more Bufbytes. In the current Mule implementation, an ASCII character is represented by a single Bufbyte with the same value, and every other character is represented by a sequence of two or more Bufbytes.

Without Mule support, there are exactly 256 characters, implicitly Latin-1. Each character is represented by one Bufbyte, so there is a one-to-one correspondence between Bufbytes and Emchars.
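
To make the distinction between bytes and characters concrete, here is a minimal sketch that counts the characters in a run of Bufbytes. It assumes a toy variable-width encoding in which any byte below 0xA0 starts a character and bytes 0xA0 and above continue one. That byte layout, the typedef stand-ins, and the names toy_first_byte_p and toy_bytecount_to_charcount are assumptions made for illustration, not taken from the SXEmacs sources.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in typedefs for illustration; the real ones come from the
   SXEmacs headers. */
typedef unsigned char Bufbyte;
typedef long Bytecount;   /* assumption: EMACS_INT modeled as long */
typedef long Charcount;

/* Assumed toy rule: bytes >= 0xA0 continue a character, any other
   byte starts one. */
static int toy_first_byte_p(Bufbyte b)
{
    return b < 0xA0;
}

/* Count the characters contained in the first LEN bytes of TEXT. */
static Charcount toy_bytecount_to_charcount(const Bufbyte *text, Bytecount len)
{
    Charcount chars = 0;
    Bytecount i;

    assert(text != NULL && len >= 0);
    for (i = 0; i < len; i++) {
        if (toy_first_byte_p(text[i]))
            chars++;
    }
    return chars;
}
```

Under these assumptions, a four-byte run made of an ASCII byte, one two-byte character, and another ASCII byte counts as three characters. In the non-Mule case every byte starts a character, so the two counts are always equal.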

Bufpos
Charcount

A Bufpos represents a character position in a buffer or string. A Charcount represents a number (count) of characters. Logically, subtracting two Bufpos values yields a Charcount value. Although both are typedefed to EMACS_INT, we use them in preference to EMACS_INT to make it clear what sort of value is being used.

Bufpos and Charcount values are the only ones that are ever visible to Lisp.
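
The relationship between the two types can be sketched as follows. The typedefs are stand-ins (EMACS_INT is modeled as long here, which is an assumption), and region_length and advance_position are hypothetical helpers that exist only for this illustration.

```c
#include <assert.h>

/* Stand-ins for illustration; the real width of EMACS_INT is
   platform-dependent. */
typedef long EMACS_INT;
typedef EMACS_INT Bufpos;
typedef EMACS_INT Charcount;

/* Hypothetical helper: number of characters in the region [from, to).
   Position minus position yields a count. */
static Charcount region_length(Bufpos from, Bufpos to)
{
    assert(from <= to);
    return to - from;
}

/* Hypothetical helper: the position COUNT characters after POS.
   Position plus count yields a position. */
static Bufpos advance_position(Bufpos pos, Charcount count)
{
    return pos + count;
}
```

The compiler treats all of these as the same integer type, so the typedefs catch nothing by themselves. Their value is that a signature such as region_length (Bufpos, Bufpos) documents which arguments are positions and that the result is a count.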

Bytind
Bytecount

A Bytind represents a byte position in a buffer or string. A Bytecount represents the distance between two positions, in bytes. The relationship between Bytind and Bytecount is the same as the relationship between Bufpos and Charcount.
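
Because characters may occupy more than one byte, a character position and a byte position generally differ. The sketch below walks a run of Bufbytes to find the byte index of a given character position. It assumes a toy encoding in which bytes 0xA0 and above are continuation bytes, and it uses 0-based positions for simplicity. Both choices, the typedef stand-ins, and the name toy_bufpos_to_bytind are assumptions made for illustration, not the conversion routines SXEmacs actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in typedefs for illustration; EMACS_INT is modeled as long. */
typedef unsigned char Bufbyte;
typedef long Bufpos;
typedef long Bytind;
typedef long Bytecount;

/* Return the byte index at which the character at 0-based character
   position POS begins.  Returns LEN when POS is at or past the end of
   the text.  Assumed toy rule: bytes >= 0xA0 are continuation bytes. */
static Bytind toy_bufpos_to_bytind(const Bufbyte *text, Bytecount len,
                                   Bufpos pos)
{
    Bytind i = 0;

    assert(text != NULL && len >= 0 && pos >= 0);
    while (i < len && pos > 0) {
        i++;                                 /* step over the first byte */
        while (i < len && text[i] >= 0xA0)   /* skip continuation bytes */
            i++;
        pos--;
    }
    return i;
}
```

A linear scan like this costs time proportional to the position, which is one reason to keep byte positions in Bytind variables once they have been computed rather than converting repeatedly.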

Extbyte
Extcount

When dealing with the outside world, SXEmacs works with Extbytes, which are equivalent to unsigned char. An Extcount is the distance between two Extbytes. Extbytes and Extcounts are relatively rare in SXEmacs code.

