Step 1: Determine the UTF-8 encoding bit layout
The character ྟ has the Unicode code point U+0F9F. In UTF-8, it is encoded using 3 bytes because its code point falls in the range 0x0800 to 0xFFFF.
Therefore the 16 payload bits are spread across 24 encoded bits, in the format: 1110xxxx 10xxxxxx 10xxxxxx
Where the x are the payload bits.
UTF-8 encoding bit layout by code point range
Code point range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits
Convert the hexadecimal code point U+0F9F to binary: 00001111 10011111. Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern
Split the 16 payload bits into groups of 4, 6, and 6 bits (0000, 111110, 011111) and place each group after the corresponding prefix (1110, 10, 10) to obtain the final bytes:
11100000 10111110 10011111
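The three steps above can be sketched in Python with plain bit operations. This is a minimal illustration, not a full UTF-8 encoder: the function name `utf8_encode_3byte` is made up for this example, and it assumes the code point lies in the 3-byte range U+0800 to U+FFFF (excluding surrogates, which UTF-8 does not encode).

```python
def utf8_encode_3byte(code_point: int) -> bytes:
    """Encode a code point from the 3-byte range by hand (hypothetical helper)."""
    # Step 1: confirm the code point uses the 1110xxxx 10xxxxxx 10xxxxxx layout.
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("only the 3-byte range U+0800..U+FFFF is handled here")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    # Step 2: the 16 payload bits are simply the code point's binary value.
    # Step 3: split them 4 / 6 / 6 and attach the prefixes 1110, 10, 10.
    byte1 = 0b11100000 | (code_point >> 12)          # top 4 payload bits
    byte2 = 0b10000000 | ((code_point >> 6) & 0x3F)  # middle 6 payload bits
    byte3 = 0b10000000 | (code_point & 0x3F)         # low 6 payload bits
    return bytes([byte1, byte2, byte3])


encoded = utf8_encode_3byte(0x0F9F)
print(" ".join(f"{b:02X}" for b in encoded))  # hex form of the three bytes
print(" ".join(f"{b:08b}" for b in encoded))  # binary form of the three bytes
```

The result can be cross-checked against Python's built-in encoder with `chr(0x0F9F).encode("utf-8")`.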
TIBETAN SUBJOINED LETTER TA·U+0F9F
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E0 BE 9F | 11100000 10111110 10011111 |
UTF-16 (big endian) | 0F 9F | 00001111 10011111 |
UTF-16 (little endian) | 9F 0F | 10011111 00001111 |
UTF-32 (big endian) | 00 00 0F 9F | 00000000 00000000 00001111 10011111 |
UTF-32 (little endian) | 9F 0F 00 00 | 10011111 00001111 00000000 00000000 |
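The rows of this table can be reproduced with Python's built-in codecs. The sketch below is only one way to check the values: the `CODECS` mapping and the `representations` helper are names invented for this example, while the codec names themselves (`utf-8`, `utf-16-be`, and so on) are standard Python encodings.

```python
# Table label -> Python codec name. The explicit -be/-le codecs emit no byte order mark.
CODECS = {
    "UTF-8": "utf-8",
    "UTF-16 (big endian)": "utf-16-be",
    "UTF-16 (little endian)": "utf-16-le",
    "UTF-32 (big endian)": "utf-32-be",
    "UTF-32 (little endian)": "utf-32-le",
}


def representations(ch: str) -> dict:
    """Return the encoded bytes of a single character for each codec above."""
    return {label: ch.encode(codec) for label, codec in CODECS.items()}


for label, data in representations("\u0f9f").items():
    hex_form = " ".join(f"{b:02X}" for b in data)
    bin_form = " ".join(f"{b:08b}" for b in data)
    print(f"{label} | {hex_form} | {bin_form} |")
```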
Description
U+0F9F is the Tibetan script character named "TIBETAN SUBJOINED LETTER TA". The Tibetan script is used predominantly in the region of Tibet for writing the Tibetan language. It has developed over centuries and holds significant cultural and religious importance in the region.

In digital text, U+0F9F functions as a component in the formation of Tibetan words. It represents the subjoined form of the letter TA, that is, the form written beneath another consonant in a stacked consonant cluster. Encoding the subjoined form as its own character lets such stacks be represented accurately in digital communication and information systems.

The Tibetan script has been an essential tool for transmitting Buddhist teachings, literature, and historical records over the centuries. By enabling accurate digital representation of the Tibetan language, U+0F9F contributes to preserving this literature and these religious texts for future generations.
How to type the ྟ symbol on Windows
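Hold Alt and type 3999 (the decimal value of U+0F9F) on the numeric keypad. This decimal Alt code method works only in applications that support it, such as Microsoft Word. Alternatively, use Character Map.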