Previous: Big5 and Shift-JIS Functions, Up: Coding Systems [Contents][Index]
MULE initializes most of the commonly used coding systems at SXEmacs’s startup. A few others are initialized only when the relevant language environment is selected and support libraries are loaded. (NB: The following list is based on XEmacs 21.2.19, the development branch at the time of writing. The list may be somewhat different for other versions. Recent versions of GNU Emacs 20 implement a few more rare coding systems; work is being done to port these to SXEmacs.)
Unfortunately, there is no consistent naming convention for character sets, and for practical purposes coding systems often take their names from their principal character sets (ASCII, KOI8-R, Shift JIS). Others take their names from the coding system (ISO-2022-JP, EUC-KR), and a few from their non-text usages (internal, binary). To accommodate this, and the fact that many coding systems have several common names, an aliasing system is provided. Finally, some effort has been made to use names that are registered as MIME charsets (this is why the name shift_jis contains that un-Lisp-y underscore).
There is a systematic naming convention for the end-of-line (EOL) conventions of different systems. A coding system whose name ends in "-unix" forces the assumption that lines are broken by newlines (0x0A). A coding system whose name ends in "-mac" forces the assumption that lines are broken by ASCII CRs (0x0D). A coding system whose name ends in "-dos" forces the assumption that lines are broken by CRLF sequences (0x0D 0x0A). These subsidiary coding systems are automatically derived from a base coding system. Using the base coding system implies autodetection of the text file convention. (Because the -unix, -mac, and -dos variants are derived from a base system, they show up as "aliases" in ‘list-coding-systems’.) These subsidiaries have consistent modeline indicators as well: "-dos" coding systems have ":T" appended to their modeline indicator, while "-mac" coding systems have ":t" appended (e.g., "ISO8:t" for iso-2022-8-mac).
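The subsidiary naming convention can be spelled out in a few lines of Lisp. The following is only a sketch: my-eol-subsidiaries is a hypothetical helper, not part of SXEmacs, and the second form assumes that find-coding-system and coding-system-eol-type are available as they are in XEmacs 21.

```elisp
;; Sketch only.  `my-eol-subsidiaries' is a hypothetical helper, not part
;; of SXEmacs; it just spells out the naming convention described above.
(defun my-eol-subsidiaries (base)
  "Return an alist of (EOL-STRING . SUBSIDIARY-NAME) for symbol BASE."
  (list (cons "\n"   (intern (format "%s-unix" base)))   ; LF,   0x0A
        (cons "\r"   (intern (format "%s-mac"  base)))   ; CR,   0x0D
        (cons "\r\n" (intern (format "%s-dos"  base))))) ; CRLF, 0x0D 0x0A

;; Names only; no coding system needs to exist for this form to run.
(my-eol-subsidiaries 'iso-2022-8)

;; Assumption: on a MULE build, `find-coding-system' and
;; `coding-system-eol-type' exist, as in XEmacs 21.  The guard makes the
;; form return nil elsewhere instead of signalling an error.
(when (and (fboundp 'find-coding-system)
           (fboundp 'coding-system-eol-type))
  (mapcar (lambda (pair)
            (let ((cs (find-coding-system (cdr pair))))
              (and cs (coding-system-eol-type cs))))
          (my-eol-subsidiaries 'iso-2022-8)))
```

The helper only builds names; whether a given subsidiary is actually defined depends on the build and the language environment, which is why the lookup is guarded.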
In the following table, each coding system is given with its mode line indicator in parentheses. Non-textual coding systems are listed first, followed by textual coding systems and their aliases. (The coding system subsidiary modeline indicators ":T" and ":t" will be omitted from the table of coding systems.)
### SJT 1999-08-23 Maybe should order these by language? Definitely need language usage for the ISO-8859 family.
Note that although true coding system aliases have been implemented for XEmacs 21.2, the coding system initialization has not yet been converted as of 21.2.19. So coding systems described as aliases have the same properties as the aliased coding system, but will not be equal as Lisp objects.
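Given this caveat, code that needs to know whether two names refer to "the same" coding system should probably not rely on eq alone. A minimal sketch follows, assuming find-coding-system (returning nil for an unknown name) and coding-system-type behave as in XEmacs 21; my-coding-alias-report is a hypothetical helper.

```elisp
;; Sketch only.  `my-coding-alias-report' is a hypothetical helper.
;; Assumption: `find-coding-system' and `coding-system-type' behave as
;; in XEmacs 21 (the former returns nil for an unknown name).
(defun my-coding-alias-report (a b)
  "Compare the coding systems named by symbols A and B.
Return a list (SAME-OBJECT SAME-TYPE), or nil if either is undefined."
  (let ((cs-a (find-coding-system a))
        (cs-b (find-coding-system b)))
    (when (and cs-a cs-b)
      (list (eq cs-a cs-b)                    ; identical Lisp object?
            (eq (coding-system-type cs-a)     ; same coding system type?
                (coding-system-type cs-b))))))

;; Guarded so the form returns nil on builds without these functions.
(when (fboundp 'find-coding-system)
  (my-coding-alias-report 'raw-text 'no-conversion))
```

The first element of the result reflects object identity, which the note above says may be nil for names described as aliases; the second compares a property that aliases are expected to share.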
automatic-conversion
undecided
undecided-dos
undecided-mac
undecided-unix
Modeline indicator: Auto. A type undecided coding system. Attempts to determine an appropriate coding system from file contents or the environment.
raw-text
no-conversion
raw-text-dos
raw-text-mac
raw-text-unix
no-conversion-dos
no-conversion-mac
no-conversion-unix
Modeline indicator: Raw. A type no-conversion coding system, which converts only line-break codes. An implementation quirk means that this coding system is also used for ISO8859-1.
binary
Modeline indicator: Binary. A type no-conversion coding system which does no character coding or EOL conversions. An alias for raw-text-unix.
alternativnyj
alternativnyj-dos
alternativnyj-mac
alternativnyj-unix
Modeline indicator: Cy.Alt. A type ccl coding system used for Alternativnyj, an encoding of the Cyrillic alphabet.
big5
big5-dos
big5-mac
big5-unix
Modeline indicator: Zh/Big5. A type big5 coding system used for BIG5, the most common encoding of traditional Chinese as used in Taiwan.
cn-gb-2312
cn-gb-2312-dos
cn-gb-2312-mac
cn-gb-2312-unix
Modeline indicator: Zh-GB/EUC. A type iso2022 coding system used for simplified Chinese (as used in the People’s Republic of China), with the ascii (G0), chinese-gb2312 (G1), and sisheng (G2) character sets initially designated. Chinese EUC (Extended Unix Code).
ctext-hebrew
ctext-hebrew-dos
ctext-hebrew-mac
ctext-hebrew-unix
Modeline indicator: CText/Hbrw. A type iso2022 coding system with the ascii (G0) and hebrew-iso8859-8 (G1) character sets initially designated for Hebrew.
ctext
ctext-dos
ctext-mac
ctext-unix
Modeline indicator: CText. A type iso2022 8-bit coding system with the ascii (G0) and latin-iso8859-1 (G1) character sets initially designated. X11 Compound Text Encoding. Often mistakenly recognized instead of EUC encodings; the usual cause is an inappropriate setting of coding-priority-list.
escape-quoted
Modeline indicator: ESC/Quot. A type iso2022 8-bit coding system with the ascii (G0) and latin-iso8859-1 (G1) character sets initially designated and escape quoting. Unix EOL conversion (i.e., no conversion). It is used for .ELC files.
euc-jp
euc-jp-dos
euc-jp-mac
euc-jp-unix
Modeline indicator: Ja/EUC. A type iso2022 8-bit coding system with ascii (G0), japanese-jisx0208 (G1), katakana-jisx0201 (G2), and japanese-jisx0212 (G3) initially designated. Japanese EUC (Extended Unix Code).
euc-kr
euc-kr-dos
euc-kr-mac
euc-kr-unix
Modeline indicator: ko/EUC. A type iso2022 8-bit coding system with ascii (G0) and korean-ksc5601 (G1) initially designated. Korean EUC (Extended Unix Code).
hz-gb-2312
Modeline indicator: Zh-GB/Hz. A type no-conversion coding system with Unix EOL convention (i.e., no conversion) using post-read-decode and pre-write-encode functions to translate the Hz/ZW coding system used for Chinese.
iso-2022-7bit
iso-2022-7bit-unix
iso-2022-7bit-dos
iso-2022-7bit-mac
iso-2022-7
Modeline indicator: ISO7. A type iso2022 7-bit coding system with ascii (G0) initially designated. Other character sets must be explicitly designated to be used.
iso-2022-7bit-ss2
iso-2022-7bit-ss2-dos
iso-2022-7bit-ss2-mac
iso-2022-7bit-ss2-unix
Modeline indicator: ISO7/SS. A type iso2022 7-bit coding system with ascii (G0) initially designated. Other character sets must be explicitly designated to be used. SS2 is used to invoke a 96-charset, one character at a time.
iso-2022-8
iso-2022-8-dos
iso-2022-8-mac
iso-2022-8-unix
Modeline indicator: ISO8. A type iso2022 8-bit coding system with ascii (G0) and latin-iso8859-1 (G1) initially designated. Other character sets must be explicitly designated to be used. No single-shift or locking-shift.
iso-2022-8bit-ss2
iso-2022-8bit-ss2-dos
iso-2022-8bit-ss2-mac
iso-2022-8bit-ss2-unix
Modeline indicator: ISO8/SS. A type iso2022 8-bit coding system with ascii (G0) and latin-iso8859-1 (G1) initially designated. Other character sets must be explicitly designated to be used. SS2 is used to invoke a 96-charset, one character at a time.
iso-2022-int-1
iso-2022-int-1-dos
iso-2022-int-1-mac
iso-2022-int-1-unix
Modeline indicator: INT-1. A type iso2022 7-bit coding system with ascii (G0) and korean-ksc5601 (G1) initially designated. ISO-2022-INT-1.
iso-2022-jp-1978-irv
iso-2022-jp-1978-irv-dos
iso-2022-jp-1978-irv-mac
iso-2022-jp-1978-irv-unix
Modeline indicator: Ja-78/7bit. A type iso2022 7-bit coding system. For compatibility with old Japanese terminals; if you need to know, look at the source.
iso-2022-jp
iso-2022-jp-2 (ISO7/SS)
iso-2022-jp-dos
iso-2022-jp-mac
iso-2022-jp-unix
iso-2022-jp-2-dos
iso-2022-jp-2-mac
iso-2022-jp-2-unix
Modeline indicator: MULE/7bit. A type iso2022 7-bit coding system with ascii (G0) initially designated, and complex specifications to ensure backward compatibility with old Japanese systems. Used for communication with mail and news in Japan. The "-2" versions also use SS2 to invoke a 96-charset one character at a time.
iso-2022-kr
Modeline indicator: Ko/7bit. A type iso2022 7-bit coding system with ascii (G0) and korean-ksc5601 (G1) initially designated. Used for e-mail in Korea.
iso-2022-lock
iso-2022-lock-dos
iso-2022-lock-mac
iso-2022-lock-unix
Modeline indicator: ISO7/Lock. A type iso2022 7-bit coding system with ascii (G0) initially designated, using Locking-Shift to invoke a 96-charset.
iso-8859-1
iso-8859-1-dos
iso-8859-1-mac
iso-8859-1-unix
Due to an implementation quirk, this is not a type iso2022 coding system, but rather an alias for the raw-text coding system.
iso-8859-2
iso-8859-2-dos
iso-8859-2-mac
iso-8859-2-unix
Modeline indicator: MIME/Ltn-2. A type iso2022 coding system with ascii (G0) and latin-iso8859-2 (G1) initially invoked.
iso-8859-3
iso-8859-3-dos
iso-8859-3-mac
iso-8859-3-unix
Modeline indicator: MIME/Ltn-3. A type iso2022 coding system with ascii (G0) and latin-iso8859-3 (G1) initially invoked.
iso-8859-4
iso-8859-4-dos
iso-8859-4-mac
iso-8859-4-unix
Modeline indicator: MIME/Ltn-4. A type iso2022 coding system with ascii (G0) and latin-iso8859-4 (G1) initially invoked.
iso-8859-5
iso-8859-5-dos
iso-8859-5-mac
iso-8859-5-unix
Modeline indicator: ISO8/Cyr. A type iso2022 coding system with ascii (G0) and cyrillic-iso8859-5 (G1) initially invoked.
iso-8859-7
iso-8859-7-dos
iso-8859-7-mac
iso-8859-7-unix
Modeline indicator: Grk. A type iso2022 coding system with ascii (G0) and greek-iso8859-7 (G1) initially invoked.
iso-8859-8
iso-8859-8-dos
iso-8859-8-mac
iso-8859-8-unix
Modeline indicator: MIME/Hbrw. A type iso2022 coding system with ascii (G0) and hebrew-iso8859-8 (G1) initially invoked.
iso-8859-9
iso-8859-9-dos
iso-8859-9-mac
iso-8859-9-unix
Modeline indicator: MIME/Ltn-5. A type iso2022 coding system with ascii (G0) and latin-iso8859-9 (G1) initially invoked.
koi8-r
koi8-r-dos
koi8-r-mac
koi8-r-unix
Modeline indicator: KOI8. A type ccl coding system used for KOI8-R, an encoding of the Cyrillic alphabet.
shift_jis
shift_jis-dos
shift_jis-mac
shift_jis-unix
Modeline indicator: Ja/SJIS. A type shift-jis coding system implementing the Shift-JIS encoding for Japanese. The underscore is to conform to the MIME charset implementing this encoding.
tis-620
tis-620-dos
tis-620-mac
tis-620-unix
Modeline indicator: TIS620. A type ccl encoding for Thai. The external encoding is defined by TIS620; the internal encoding is peculiar to MULE and is called thai-xtis.
viqr
Modeline indicator: VIQR. A type no-conversion coding system with Unix EOL convention (i.e., no conversion) using post-read-decode and pre-write-encode functions to translate the VIQR coding system for Vietnamese.
viscii
viscii-dos
viscii-mac
viscii-unix
Modeline indicator: VISCII. A type ccl coding system used for VISCII 1.1 for Vietnamese. Differs slightly from VSCII; VISCII is given priority by SXEmacs.
vscii
vscii-dos
vscii-mac
vscii-unix
Modeline indicator: VSCII. A type ccl coding system used for VSCII 1.1 for Vietnamese. Differs slightly from VISCII, which is given priority by SXEmacs. Use (prefer-coding-system 'vietnamese-vscii) to give priority to VSCII.
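Because the list above may differ between versions, it can be useful to ask the running editor which coding systems are defined and what type each one has. The following is a sketch under stated assumptions: coding-system-list, find-coding-system, and coding-system-type are taken to behave as in XEmacs 21 with file-coding support, and my-coding-systems-by-type is a hypothetical helper.

```elisp
;; Sketch only.  `my-coding-systems-by-type' is a hypothetical helper.
;; Assumption: `coding-system-list' returns the names of all defined
;; coding systems, and `find-coding-system' / `coding-system-type'
;; behave as in XEmacs 21.
(defun my-coding-systems-by-type ()
  "Return an alist mapping each coding system type to a list of names."
  (let ((result nil))
    (dolist (name (coding-system-list))
      (let* ((cs (find-coding-system name))
             (type (and cs (coding-system-type cs)))
             (cell (assq type result)))
        (if cell
            ;; Type already seen: add this name to its entry.
            (setcdr cell (cons name (cdr cell)))
          ;; First coding system of this type.
          (setq result (cons (list type name) result)))))
    result))

;; Guarded so the form returns nil on builds without these functions.
(when (and (fboundp 'coding-system-list)
           (fboundp 'find-coding-system)
           (fboundp 'coding-system-type))
  (my-coding-systems-by-type))
```

The keys of the resulting alist correspond to the types named in the entries above (for example iso2022, ccl, or no-conversion); the exact set depends on the build.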