Step 1: Determine the UTF-8 encoding bit layout
The character ᱭ has the Unicode code point U+1C6D. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore, the 16 payload bits are distributed over 3 bytes (24 bits in total), and the encoding has the format: 1110xxxx 10xxxxxx 10xxxxxx
Where the x are the payload bits.

UTF-8 encoding bit layout by code point range

Code point range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |

Step 2: Obtain the payload bits
Convert the hexadecimal code point U+1C6D to binary: 00011100 01101101. Those are the payload bits.

Step 3: Fill in the bits to match the bit pattern
Split the payload into groups of 4, 6, and 6 bits (0001 110001 101101) and place them in the x positions of the bit layout to obtain the final bytes:
11100001 10110001 10101101
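
The three steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation; the function name `utf8_encode_manual` is hypothetical, and the built-in `str.encode` is used only to cross-check the result.

```python
def utf8_encode_manual(code_point: int) -> bytes:
    """Encode one code point to UTF-8 by hand (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    # Step 1: pick the bit layout from the code point range.
    if code_point <= 0x7F:            # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:           # 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0xFFFF:          # 1110xxxx 10xxxxxx 10xxxxxx
        # Steps 2 and 3: split the 16 payload bits into 4 + 6 + 6
        # and place each group after its marker bits.
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


encoded = utf8_encode_manual(0x1C6D)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100001 10110001 10101101
print(encoded == "\u1C6D".encode("utf-8"))    # True
```

The shifts by 12 and 6 select the 4-bit and 6-bit payload groups, and the mask `0b111111` keeps only the low six bits of each continuation byte.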
OL CHIKI LETTER UY·U+1C6D
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF8 | E1 B1 AD | 11100001 10110001 10101101 |
UTF16 (big Endian) | 1C 6D | 00011100 01101101 |
UTF16 (little Endian) | 6D 1C | 01101101 00011100 |
UTF32 (big Endian) | 00 00 1C 6D | 00000000 00000000 00011100 01101101 |
UTF32 (little Endian) | 6D 1C 00 00 | 01101101 00011100 00000000 00000000 |
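
The rows of this table can be reproduced with Python's built-in codecs. The sketch below assumes the table labels map to the standard codec names `utf-8`, `utf-16-be`, `utf-16-le`, `utf-32-be`, and `utf-32-le`; the helper name `representations` is hypothetical.

```python
# Table label -> Python codec name (the mapping is an assumption).
ENCODINGS = [
    ("UTF8", "utf-8"),
    ("UTF16 (big Endian)", "utf-16-be"),
    ("UTF16 (little Endian)", "utf-16-le"),
    ("UTF32 (big Endian)", "utf-32-be"),
    ("UTF32 (little Endian)", "utf-32-le"),
]


def representations(char: str) -> dict:
    """Return {label: (hex string, binary string)} for one character."""
    rows = {}
    for label, codec in ENCODINGS:
        raw = char.encode(codec)
        hex_form = " ".join(f"{b:02X}" for b in raw)
        bin_form = " ".join(f"{b:08b}" for b in raw)
        rows[label] = (hex_form, bin_form)
    return rows


for label, (hex_form, bin_form) in representations("\u1C6D").items():
    print(f"{label} | {hex_form} | {bin_form} |")
```

The explicit `-be` and `-le` codec variants are used because the plain `utf-16` and `utf-32` codecs prepend a byte order mark, which the table does not show.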
Description
U+1C6D is the Ol Chiki letter UY. It belongs to the Ol Chiki script, which was developed for writing the Santali language, spoken predominantly by the Santal people in the Indian states of West Bengal, Jharkhand, and Bihar. Ol Chiki is an original alphabet rather than a derivative of the Latin alphabet, and it consists of 30 letters, among which U+1C6D represents the consonant /j/. The development of the script was a significant step towards promoting literacy and cultural preservation in the Santali community, and encoding its letters in Unicode allows Santali text to be stored and exchanged digitally.
How to type the ᱭ symbol on Windows
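Hold Alt and type 7277 (the decimal value of hexadecimal 1C6D) on the numeric keypad. This works only in applications that accept decimal Unicode Alt codes, such as some rich-text editors; elsewhere, copy the character from Character Map.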
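
The number 7277 is the code point 1C6D converted from hexadecimal to decimal. A short check in Python, given as an illustrative sketch (the variable names are arbitrary):

```python
# Convert the hexadecimal code point to the decimal value used in the Alt code.
alt_code = int("1C6D", 16)
print(alt_code)                     # 7277

# Round trip: the decimal value maps back to the same character.
print(chr(alt_code) == "\u1C6D")    # True
print(f"U+{alt_code:04X}")          # U+1C6D
```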