

18.3.2 Internal Character Encoding

A single character is represented by one 19-bit word, which is divided into three fields:

Bit number:     18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
                <------------> <------------------> <------------------>
Field:                1                  2                    3

Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.
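
The field layout can be illustrated with a few shift-and-mask helpers. This is a minimal sketch based only on the bit positions shown above; the function names are hypothetical and are not taken from the actual sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; shifts and masks follow the bit diagram above. */

/* Field 1: bits 18-14 (5 bits). */
static unsigned char_field1(uint32_t ch) { return (ch >> 14) & 0x1F; }

/* Field 2: bits 13-07 (7 bits). */
static unsigned char_field2(uint32_t ch) { return (ch >> 7) & 0x7F; }

/* Field 3: bits 06-00 (7 bits). */
static unsigned char_field3(uint32_t ch) { return ch & 0x7F; }

/* Pack three field values into one 19-bit word. */
static uint32_t char_pack(unsigned f1, unsigned f2, unsigned f3)
{
    assert(f1 <= 0x1F && f2 <= 0x7F && f3 <= 0x7F);
    return ((uint32_t)f1 << 14) | ((uint32_t)f2 << 7) | (uint32_t)f3;
}
```

With all three fields at their maximum values, the packed word is 0x7FFFF, which is 19 one-bits.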

Character set           Field 1         Field 2         Field 3
-------------           -------         -------         -------
ASCII                      0               0              PC1
   range:                                               (00 - 7F)
Control-1                  0               1              PC1
   range:                                               (00 - 1F)
Dimension-1 official       0            LB - 0x80         PC1
   range:                               (01 - 0D)       (20 - 7F)
Dimension-1 private        0            LB - 0x80         PC1
   range:                               (20 - 6F)       (20 - 7F)
Dimension-2 official    LB - 0x8F         PC1             PC2
   range:               (01 - 0A)       (20 - 7F)       (20 - 7F)
Dimension-2 private     LB - 0xE1         PC1             PC2
   range:               (0F - 1E)       (20 - 7F)       (20 - 7F)
Composite                 0x1F             ?               ?
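
The rows of the table can be read as recipes for building a character code from a leading byte (LB) and one or two position codes (PC1, PC2). The following is a sketch under that reading, not the actual implementation: the function names and the `is_private` flag are hypothetical, and the range checks simply restate the ranges listed in the table. The Composite row is omitted because the table does not specify its fields.

```c
#include <assert.h>
#include <stdint.h>

/* ASCII: fields 1 and 2 are 0, field 3 = PC1 (00 - 7F). */
static uint32_t make_ascii_char(unsigned pc1)
{
    assert(pc1 <= 0x7F);
    return pc1;
}

/* Control-1: field 1 = 0, field 2 = 1, field 3 = PC1 (00 - 1F). */
static uint32_t make_control1_char(unsigned pc1)
{
    assert(pc1 <= 0x1F);
    return (1u << 7) | pc1;
}

/* Dimension-1 sets: field 1 = 0, field 2 = LB - 0x80, field 3 = PC1.
   Field 2 is 01 - 0D for official sets and 20 - 6F for private sets. */
static uint32_t make_dim1_char(unsigned lb, unsigned pc1)
{
    unsigned f2 = lb - 0x80;
    assert((f2 >= 0x01 && f2 <= 0x0D) || (f2 >= 0x20 && f2 <= 0x6F));
    assert(pc1 >= 0x20 && pc1 <= 0x7F);
    return ((uint32_t)f2 << 7) | pc1;
}

/* Dimension-2 sets: field 2 = PC1, field 3 = PC2.
   Field 1 is LB - 0x8F (official, 01 - 0A) or LB - 0xE1 (private, 0F - 1E). */
static uint32_t make_dim2_char(unsigned lb, int is_private,
                               unsigned pc1, unsigned pc2)
{
    unsigned f1 = is_private ? lb - 0xE1 : lb - 0x8F;
    if (is_private)
        assert(f1 >= 0x0F && f1 <= 0x1E);
    else
        assert(f1 >= 0x01 && f1 <= 0x0A);
    assert(pc1 >= 0x20 && pc1 <= 0x7F);
    assert(pc2 >= 0x20 && pc2 <= 0x7F);
    return ((uint32_t)f1 << 14) | ((uint32_t)pc1 << 7) | pc2;
}
```

For example, `make_dim1_char(0x81, 0x20)` gives field 2 = 1 and field 3 = 0x20, so the resulting character code is 0xA0.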

Note that character codes 0 - 255 are the same as the “binary encoding” described above.
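
One way to see why codes 0 - 255 line up is to decompose them into fields: field 1 is always 0, and field 2 is either 0 or 1. The sketch below classifies such a code by the table row it falls into. The enum and function names are hypothetical, and the classification assumes that field 2 = 1 with PC1 in 20 - 7F is the dimension-1 official set whose leading byte is 0x81, as the table's formula implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three table rows that cover codes 0 - 255. */
enum low_char_row { ROW_ASCII, ROW_CONTROL_1, ROW_DIM1_OFFICIAL };

static enum low_char_row classify_low_char(uint32_t ch)
{
    assert(ch <= 0xFF);               /* only codes 0 - 255 are handled */
    unsigned f2 = (ch >> 7) & 0x7F;   /* 0 or 1 for this range */
    unsigned f3 = ch & 0x7F;

    if (f2 == 0)
        return ROW_ASCII;             /* codes 0x00 - 0x7F */
    /* f2 == 1: Control-1 uses PC1 00 - 1F, the rest is PC1 20 - 7F. */
    return f3 <= 0x1F ? ROW_CONTROL_1 : ROW_DIM1_OFFICIAL;
}
```

Under this reading, 0x80 - 0x9F fall in the Control-1 row and 0xA0 - 0xFF in the dimension-1 official row, so the three rows together cover the whole 0 - 255 range without gaps.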