Step 1: Determine the UTF-8 encoding bit layout
The character ᘸ has the Unicode code point U+1638. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore the UTF-8 encoding carries 16 payload bits within 24 encoded bits, and it has the format:
1110xxxx 10xxxxxx 10xxxxxx
Where the x's are the payload bits.
UTF-8 encoding bit layout by code point range
Code point range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits
Convert the hexadecimal code point U+1638 to binary:
00010110 00111000
Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern
Split the 16 payload bits into groups of 4, 6, and 6 (0001 011000 111000) and place them in the x positions of the bit layout to obtain the final bytes:
11100001 10011000 10111000
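The three steps above can be sketched in Python. This is a minimal illustration, not a full encoder: the helper names `utf8_length` and `encode_utf8_3byte` are hypothetical and chosen for this example, and only the 3-byte case is implemented. The result is checked against Python's built-in `str.encode`.

```python
def utf8_length(code_point: int) -> int:
    """Step 1: pick the byte count from the code point range."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    if code_point <= 0x10FFFF:
        return 4
    raise ValueError("code point is beyond U+10FFFF")


def encode_utf8_3byte(code_point: int) -> bytes:
    """Steps 2 and 3 for the 3-byte layout 1110xxxx 10xxxxxx 10xxxxxx."""
    if utf8_length(code_point) != 3:
        raise ValueError("code point is outside the 3-byte UTF-8 range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    byte1 = 0b11100000 | (code_point >> 12)              # top 4 payload bits
    byte2 = 0b10000000 | ((code_point >> 6) & 0b111111)  # middle 6 payload bits
    byte3 = 0b10000000 | (code_point & 0b111111)         # low 6 payload bits
    return bytes([byte1, byte2, byte3])


payload_bits = format(0x1638, "016b")
encoded = encode_utf8_3byte(0x1638)
bits = " ".join(format(b, "08b") for b in encoded)

print(payload_bits)               # 0001011000111000
print(bits)                       # 11100001 10011000 10111000
print(encoded.hex(" ").upper())   # E1 98 B8

# Cross-check against the built-in encoder.
assert encoded == "\u1638".encode("utf-8")
```

Shifting and masking the integer is equivalent to slicing the 16-bit binary string into 4, 6, and 6 bits; the integer form simply avoids string handling.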
CANADIAN SYLLABICS CARRIER TLHI · U+1638
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E1 98 B8 | 11100001 10011000 10111000 |
UTF-16 (big-endian) | 16 38 | 00010110 00111000 |
UTF-16 (little-endian) | 38 16 | 00111000 00010110 |
UTF-32 (big-endian) | 00 00 16 38 | 00000000 00000000 00010110 00111000 |
UTF-32 (little-endian) | 38 16 00 00 | 00111000 00010110 00000000 00000000 |
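The rows of this table can be reproduced with Python's standard codecs. The sketch below keys each row by the Python codec name (`utf-16-be`, `utf-32-le`, and so on) rather than by the table's labels; the `rows` dictionary is just a convenience for this example.

```python
char = "\u1638"  # ᘸ

# Standard Python codec names; the -be/-le variants emit no byte order mark.
codecs_to_show = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]

rows = {}
for codec in codecs_to_show:
    data = char.encode(codec)
    hex_form = data.hex(" ").upper()
    binary_form = " ".join(format(b, "08b") for b in data)
    rows[codec] = (hex_form, binary_form)
    print(f"{codec} | {hex_form} | {binary_form} |")
```

Because U+1638 lies in the Basic Multilingual Plane, UTF-16 needs a single 16-bit unit and no surrogate pair, which is why its bytes are just the code point itself in the chosen byte order.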
Description
The Unicode character U+1638, "Canadian Syllabics Carrier TLHI," belongs to the Unified Canadian Aboriginal Syllabics block (U+1400 to U+167F, 640 code points), which covers the syllabic writing systems used for several Indigenous languages of Canada, including Cree, Ojibwe, and Inuktitut. "Carrier" in the name refers to the Carrier (Dakelh) language, an Athabaskan language of central British Columbia, and not to any carrier function: the character is a letter of the Carrier syllabary and, as its name indicates, represents the syllable "tlhi." Encoding it in Unicode allows Carrier text written in syllabics to be stored, displayed, and exchanged digitally.
How to type the ᘸ symbol on Windows
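Hold Alt and type 5688 (the decimal value of hexadecimal 1638) on the numeric keypad. This works only in applications that accept decimal Unicode Alt codes, such as WordPad or Microsoft Word; elsewhere, copy the character from Character Map.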
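The Alt code is simply the code point written in decimal. A short Python check of that conversion (the variable names are illustrative only):

```python
code_point = ord("\u1638")   # ᘸ
alt_code = str(code_point)   # decimal digits to type on the numpad

print(f"U+{code_point:04X} -> Alt+{alt_code}")  # U+1638 -> Alt+5688
```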