Step 1: Determine the UTF-8 encoding bit layout
The character ඁ has the Unicode code point U+0D81. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore the UTF-8 encoding carries 16 payload bits within the 24 bits of the 3 encoded bytes, in the format:
1110xxxx 10xxxxxx 10xxxxxx
where the x are the payload bits.
UTF-8 encoding bit layout by codepoint range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0D81 to binary:
00001101 10000001
Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern:
Split the 16 payload bits into groups of 4, 6, and 6 bits (0000 110110 000001) and place each group after the corresponding prefix (1110, 10, 10) to obtain the final bytes:
11100000 10110110 10000001
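The three steps can be sketched in Python with shifts and masks. This is a minimal illustration that handles only the 3-byte range; the function name `utf8_encode_3byte` is hypothetical and not part of any library.

```python
def utf8_encode_3byte(code_point: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF using the 3-byte UTF-8 layout."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the 3-byte UTF-8 range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    byte1 = 0b11100000 | (code_point >> 12)             # prefix 1110 + top 4 payload bits
    byte2 = 0b10000000 | ((code_point >> 6) & 0b111111)  # prefix 10 + middle 6 payload bits
    byte3 = 0b10000000 | (code_point & 0b111111)         # prefix 10 + low 6 payload bits
    return bytes([byte1, byte2, byte3])


encoded = utf8_encode_3byte(0x0D81)
hex_bytes = " ".join(f"{b:02X}" for b in encoded)
bits = " ".join(f"{b:08b}" for b in encoded)
print(hex_bytes)  # E0 B6 81
print(bits)       # 11100000 10110110 10000001
```

The result can be cross-checked against Python's built-in encoder with `"\u0d81".encode("utf-8")`.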
SINHALA SIGN CANDRABINDU·U+0D81
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E0 B6 81 | 11100000 10110110 10000001 |
UTF-16 (big-endian) | 0D 81 | 00001101 10000001 |
UTF-16 (little-endian) | 81 0D | 10000001 00001101 |
UTF-32 (big-endian) | 00 00 0D 81 | 00000000 00000000 00001101 10000001 |
UTF-32 (little-endian) | 81 0D 00 00 | 10000001 00001101 00000000 00000000 |
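The rows of this table can be reproduced with Python's built-in codecs. The sketch below assumes the standard codec names `utf-8`, `utf-16-be`, `utf-16-le`, `utf-32-be`, and `utf-32-le`; the row labels and the `rows` dictionary are chosen here for illustration only.

```python
ch = "\u0d81"  # SINHALA SIGN CANDRABINDU

# Map a display label to the Python codec name that produces it.
codecs_by_label = {
    "UTF-8": "utf-8",
    "UTF-16 (big-endian)": "utf-16-be",
    "UTF-16 (little-endian)": "utf-16-le",
    "UTF-32 (big-endian)": "utf-32-be",
    "UTF-32 (little-endian)": "utf-32-le",
}

# Encode the character once per codec.
rows = {label: ch.encode(codec) for label, codec in codecs_by_label.items()}

for label, raw in rows.items():
    hex_str = " ".join(f"{b:02X}" for b in raw)
    bin_str = " ".join(f"{b:08b}" for b in raw)
    print(f"{label} | {hex_str} | {bin_str} |")
```

The explicit `-be` and `-le` codec variants are used so that no byte order mark is prepended to the output.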
Description
The character U+0D81, the Sinhala Sign Candrabindu, is a combining sign of the Sinhala script. It attaches to a letter and modifies how the accompanying vowel is pronounced, so its use is linguistic rather than merely stylistic. Sinhala, a member of the Indo-Aryan branch of the Indo-European languages, is spoken by millions of people in Sri Lanka.
Technically, U+0D81 is part of the Unicode Standard, which assigns a unique code point to characters from virtually every written language in use today. Having a dedicated code point lets the sign be stored, exchanged, and rendered consistently across digital platforms.
Culturally, the sign is associated with religious texts and older manuscripts written in Sinhala. Encoding it allows such material, as well as modern writing that uses the sign, to be represented faithfully in digital text.
How to type the ඁ symbol on Windows
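In applications that accept decimal Unicode Alt codes, hold Alt and type 3457 (the decimal value of 0x0D81) on the numeric keypad. Alternatively, copy the character from Character Map.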
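The number 3457 is simply the code point written in decimal. A short Python check, given only as an illustration of the conversion:

```python
code_point = ord("\u0d81")   # numeric value of the character
alt_code = str(code_point)   # decimal digits to type on the numpad
print(f"U+{code_point:04X} -> Alt+{alt_code}")  # U+0D81 -> Alt+3457
```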