Step 1: Determine the UTF-8 encoding bit layout
The character ထ has the Unicode code point U+1011. In UTF-8, it is encoded using 3 bytes because its code point falls in the range 0x0800 to 0xffff.
Therefore the 16 payload bits are spread over 3 bytes (24 bits), using the format:
1110xxxx 10xxxxxx 10xxxxxx
where the x positions are the payload bits.

UTF-8 Encoding bit layout by codepoint range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |

Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+1011 to binary: 00010000 00010001. Those are the payload bits.

Step 3: Fill in the bits to match the bit pattern:
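Obtain the final bytes by arranging the payload bits to match the bit layout. Split the 16 payload bits into groups of 4, 6, and 6 bits (0001, 000000, 010001) and place each group after the prefixes 1110, 10, and 10:
11100001 10000000 10010001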
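
The three steps above can be sketched in Python. This is a minimal illustration rather than a production encoder: the function name utf8_encode is hypothetical and introduced only for this example, and in practice Python's built-in str.encode("utf-8") does the same job.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by hand, following the bit layout table."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    # Step 1: pick the layout from the code point range.
    if code_point <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        # Steps 2 and 3: split the payload into 4 + 6 + 6 bits
        # and attach each group to its prefix.
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


encoded = utf8_encode(0x1011)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100001 10000000 10010001
print(" ".join(f"{b:02X}" for b in encoded))  # E1 80 91
```

Comparing the result with "\u1011".encode("utf-8") is a quick way to check the hand-written version against the built-in codec.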
MYANMAR LETTER THA·U+1011
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E1 80 91 | 11100001 10000000 10010001 |
UTF-16 (big endian) | 10 11 | 00010000 00010001 |
UTF-16 (little endian) | 11 10 | 00010001 00010000 |
UTF-32 (big endian) | 00 00 10 11 | 00000000 00000000 00010000 00010001 |
UTF-32 (little endian) | 11 10 00 00 | 00010001 00010000 00000000 00000000 |
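
The rows of this table can be reproduced with Python's built-in codecs. The sketch below is illustrative only: the helper name representations and the CODECS list are assumptions made for this example, while the codec names (utf-8, utf-16-be, utf-16-le, utf-32-be, utf-32-le) are standard Python encodings that write no byte order mark.

```python
# (label, Python codec name) pairs; the labels are chosen for this example.
CODECS = [
    ("UTF-8", "utf-8"),
    ("UTF-16 (big endian)", "utf-16-be"),
    ("UTF-16 (little endian)", "utf-16-le"),
    ("UTF-32 (big endian)", "utf-32-be"),
    ("UTF-32 (little endian)", "utf-32-le"),
]


def representations(char: str) -> list:
    """Return (label, hex string, binary string) for each encoding."""
    rows = []
    for label, codec in CODECS:
        data = char.encode(codec)
        hex_form = " ".join(f"{b:02X}" for b in data)
        bin_form = " ".join(f"{b:08b}" for b in data)
        rows.append((label, hex_form, bin_form))
    return rows


for label, hex_form, bin_form in representations("\u1011"):
    print(f"{label} | {hex_form} | {bin_form} |")
```

The explicit -be and -le codec variants are used because the plain utf-16 and utf-32 codecs prepend a byte order mark, which would not match the table.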
Description
U+1011, known as MYANMAR LETTER THA, is a consonant letter in the Myanmar script. In digital text, it typically represents the consonant 'THA' in Myanmar's alphasyllabic writing system. This script, used primarily in Myanmar (formerly Burma), is composed of 33 consonants and 14 vowels. MYANMAR LETTER THA matters for applications such as text-to-speech conversion, as it indicates the correct pronunciation when combined with vowel symbols. The character is part of a script with over 1,000 years of history. The Myanmar script evolved from Mon-Khmer origins and is now used to write both modern and classical texts in the Burmese language.

Technically speaking, U+1011 is encoded in the Unicode Standard, which keeps it compatible across digital platforms and devices and allows text containing it to be exchanged reliably between computing systems.
How to type the ထ symbol on Windows
Hold Alt and type 4113 (the decimal value of hexadecimal 1011) on the numeric keypad, or use Character Map.