First, let’s review the basic character-related datatypes used by
SXEmacs. Note that the separate typedefs are not mandatory in the
current implementation (all of them boil down to unsigned char or
int), but they improve clarity of code a great deal, because one
glance at the declaration can tell the intended use of the variable.
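As a rough illustration of the clarity argument, the sketch below declares the types covered in this section and uses two of them in a small helper. The underlying types are assumptions made for the example (in particular, long stands in for EMACS_INT), and sketch_text_span is a hypothetical function, not part of SXEmacs.

```c
#include <stddef.h>

/* Assumption: a stand-in for the real EMACS_INT, whose width is
   platform dependent. */
typedef long EMACS_INT;

typedef int           Emchar;    /* a single character */
typedef unsigned char Bufbyte;   /* one byte of internal text */
typedef EMACS_INT     Bufpos;    /* character position */
typedef EMACS_INT     Charcount; /* number of characters */
typedef EMACS_INT     Bytind;    /* byte position */
typedef EMACS_INT     Bytecount; /* number of bytes */
typedef unsigned char Extbyte;   /* one byte of external text */
typedef EMACS_INT     Extcount;  /* number of external bytes */

/* With bare types the reader has to guess what is being measured:
     long span (const unsigned char *a, const unsigned char *b);
   With the typedefs, the declaration says it is a byte count over
   internally encoded text.  (Hypothetical helper.) */
static Bytecount
sketch_text_span (const Bufbyte *start, const Bufbyte *end)
{
  return (Bytecount) (end - start);
}
```

The compiler treats Bytecount and Charcount as the same integer type here, so the benefit is purely for the human reader.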

Emchar
An Emchar holds a single Emacs character.

Obviously, the equality between characters and bytes is lost in the Mule
world. Characters can be represented by one or more bytes in the
buffer, and Emchar is the C type large enough to hold any character.

Without Mule support, an Emchar is equivalent to an unsigned char.

Bufbyte
The data representing the text in a buffer or string is logically a
sequence of Bufbytes.

SXEmacs does not work with the same character formats all the time; when
reading characters from the outside, it decodes them to an internal
format, and likewise encodes them when writing. Bufbyte (in fact
unsigned char) is the basic unit of the internal format of SXEmacs
buffers and strings. A Bufbyte * is the type that points at text
encoded in the variable-width internal encoding.

One character can correspond to one or more Bufbytes. In the current
Mule implementation, an ASCII character is represented by a single
Bufbyte of the same value, and other characters are represented by a
sequence of two or more Bufbytes.

Without Mule support, there are exactly 256 characters, implicitly
Latin-1; each character is represented by one Bufbyte, so there is a
one-to-one correspondence between Bufbytes and Emchars.
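To make the one-to-many relationship between characters and Bufbytes concrete, here is a minimal sketch that counts the characters in a run of internally encoded text. It assumes a byte layout that this section does not spell out: bytes below 0x80 are ASCII, bytes 0x80 to 0x9F begin a multi-byte character, and bytes 0xA0 to 0xFF continue one. That layout, the long stand-in for EMACS_INT, and the function name sketch_count_chars are assumptions for illustration.

```c
typedef long EMACS_INT;           /* assumption: stand-in type */
typedef unsigned char Bufbyte;
typedef EMACS_INT Bytecount;
typedef EMACS_INT Charcount;

/* Assumed layout: a byte >= 0xA0 continues the current character;
   any other byte starts a new one. */
static int
sketch_starts_char (Bufbyte b)
{
  return b < 0xA0;
}

/* Count the characters in LEN bytes of internal text by counting
   the bytes that start a character. */
static Charcount
sketch_count_chars (const Bufbyte *text, Bytecount len)
{
  Charcount n = 0;
  Bytecount i;

  for (i = 0; i < len; i++)
    if (sketch_starts_char (text[i]))
      n++;
  return n;
}
```

Under these assumptions, the four bytes 'a', 0x81, 0xE9, 'b' hold three characters. For pure ASCII text, the character count equals the byte count.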

Bufpos
Charcount
A Bufpos represents a character position in a buffer or string. A
Charcount represents a number (count) of characters. Logically,
subtracting two Bufpos values yields a Charcount value. Although all of
these are typedefed to EMACS_INT, we use them in preference to
EMACS_INT to make it clear what sort of position is being used.

Bufpos and Charcount values are the only ones that are ever visible to
Lisp.
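The arithmetic relationship between the two types can be sketched as follows. The helper names are hypothetical, and long is again only a stand-in for EMACS_INT.

```c
typedef long EMACS_INT;           /* assumption: stand-in type */
typedef EMACS_INT Bufpos;
typedef EMACS_INT Charcount;

/* Position minus position gives a count of characters. */
static Charcount
sketch_chars_between (Bufpos from, Bufpos to)
{
  return to - from;
}

/* Position plus count gives another position. */
static Bufpos
sketch_advance (Bufpos pos, Charcount n)
{
  return pos + n;
}
```

Nothing stops a caller from adding two Bufpos values, since both are plain integers to the compiler; the typedefs only make such a mistake easier to spot when reading the code.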

Bytind
Bytecount
A Bytind represents a byte position in a buffer or string. A Bytecount
represents the distance between two positions, in bytes. The
relationship between Bytind and Bytecount is the same as the
relationship between Bufpos and Charcount.
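Because one character may occupy several bytes, a Charcount cannot be used directly as a Bytecount; the text has to be walked. The sketch below converts a number of characters into the number of bytes they occupy. It assumes, for illustration only, that bytes 0xA0 to 0xFF continue a multi-byte character and every other byte starts one. The function name is hypothetical.

```c
typedef long EMACS_INT;           /* assumption: stand-in type */
typedef unsigned char Bufbyte;
typedef EMACS_INT Bytecount;
typedef EMACS_INT Charcount;

/* Return how many bytes the first NCHARS characters of TEXT occupy.
   TEXT holds LEN bytes; the result never exceeds LEN. */
static Bytecount
sketch_chars_to_bytes (const Bufbyte *text, Bytecount len,
                       Charcount nchars)
{
  Bytecount i = 0;

  while (i < len && nchars > 0)
    {
      i++;                                /* first byte of the character */
      while (i < len && text[i] >= 0xA0)  /* assumed trailing bytes */
        i++;
      nchars--;
    }
  return i;
}
```

With the bytes 'a', 0x81, 0xE9, 'b', two characters occupy three bytes under this assumed layout. A linear walk like this is the simplest possible approach and says nothing about how SXEmacs actually performs or caches such conversions.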

Extbyte
Extcount
When dealing with the outside world, SXEmacs works with Extbytes, which
are equivalent to unsigned char. Obviously, an Extcount is the distance
between two Extbytes. Extbytes and Extcounts are not all that frequent
in SXEmacs code.
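As a sketch of where Extbytes meet Bufbytes, the following hypothetical decoder turns external Latin-1 bytes into a variable-width internal form. The lead-byte values (0x81 and 0x8F) and the 0x20 offset for C1 control characters are assumptions chosen for the example, not values taken from this section, and the function is not an SXEmacs API.

```c
typedef long EMACS_INT;           /* assumption: stand-in type */
typedef unsigned char Bufbyte;
typedef unsigned char Extbyte;
typedef EMACS_INT Bytecount;
typedef EMACS_INT Extcount;

/* Assumed lead bytes, for illustration only. */
#define SKETCH_LEAD_LATIN1   0x81
#define SKETCH_LEAD_CONTROL1 0x8F

/* Decode LEN external Latin-1 bytes from SRC into DST, which must
   have room for 2 * LEN bytes.  Returns the number of internal
   bytes written. */
static Bytecount
sketch_decode_latin1 (const Extbyte *src, Extcount len, Bufbyte *dst)
{
  Bytecount out = 0;
  Extcount i;

  for (i = 0; i < len; i++)
    {
      Extbyte c = src[i];

      if (c < 0x80)
        dst[out++] = c;                      /* ASCII: same byte */
      else if (c >= 0xA0)
        {
          dst[out++] = SKETCH_LEAD_LATIN1;   /* lead byte, then char */
          dst[out++] = c;
        }
      else
        {
          dst[out++] = SKETCH_LEAD_CONTROL1; /* C1 control, shifted */
          dst[out++] = (Bufbyte) (c + 0x20);
        }
    }
  return out;
}
```

The input length is an Extcount and the result is a Bytecount, which keeps it visible at the call site that the two numbers measure different encodings and will generally differ.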