Step 1: Determine the UTF-8 encoding bit layout
The character ૼ has the Unicode code point U+0AFC. In UTF-8 it is encoded in 3 bytes because its code point lies in the range 0x0800 to 0xFFFF.
The encoding therefore carries 16 payload bits inside 24 encoded bits, in the format: 1110xxxx 10xxxxxx 10xxxxxx
where the x positions are the payload bits.
UTF-8 encoding bit layout by code point range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0AFC to binary: 00001010 11111100. Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern:
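Obtain the final bytes by arranging the payload bits to match the bit layout. Split the 16 payload bits into groups of 4, 6, and 6 bits (0000, 101011, 111100) and place each group after its prefix (1110, 10, 10):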
11100000 10101011 10111100
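The three steps above can be sketched in Python. This is a minimal illustration rather than a production encoder; the function name `utf8_encode_manual` is an assumption introduced here, and in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def utf8_encode_manual(code_point: int) -> bytes:
    """Encode one code point by filling payload bits into the UTF-8 bit patterns."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


encoded = utf8_encode_manual(0x0AFC)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100000 10101011 10111100
print(encoded.hex(" ").upper())               # E0 AB BC
```

The range checks mirror the code point ranges of the bit layout, and the shifts and masks pick out the 4-, 6-, and 6-bit payload groups for the 3-byte case.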
GUJARATI SIGN MADDAH · U+0AFC
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E0 AB BC | 11100000 10101011 10111100 |
UTF-16 (big-endian) | 0A FC | 00001010 11111100 |
UTF-16 (little-endian) | FC 0A | 11111100 00001010 |
UTF-32 (big-endian) | 00 00 0A FC | 00000000 00000000 00001010 11111100 |
UTF-32 (little-endian) | FC 0A 00 00 | 11111100 00001010 00000000 00000000 |
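The byte sequences in this table can be checked with Python's built-in codecs. The sketch below is illustrative only; the helper name `representations` is an assumption introduced here, and the `-be`/`-le` codec variants are used because they emit no byte order mark.

```python
def representations(ch: str) -> dict:
    """Return the hex bytes of one character in each encoding from the table."""
    codecs_by_label = {
        "UTF-8": "utf-8",
        "UTF-16 (big-endian)": "utf-16-be",
        "UTF-16 (little-endian)": "utf-16-le",
        "UTF-32 (big-endian)": "utf-32-be",
        "UTF-32 (little-endian)": "utf-32-le",
    }
    return {
        label: ch.encode(codec).hex(" ").upper()
        for label, codec in codecs_by_label.items()
    }


for label, hex_bytes in representations("\u0AFC").items():
    print(f"{label}: {hex_bytes}")
```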
Description
The Unicode character U+0AFC, named GUJARATI SIGN MADDAH, is a combining sign in the Gujarati script. The Gujarati script belongs to the Indic family of scripts and is used predominantly in the Indian state of Gujarat and by Gujarati-speaking communities worldwide.
In terms of usage, the Gujarati Sign Maddah (U+0AFC) marks a lengthened vowel sound on the preceding vowel or consonant cluster within a word. The Unicode code charts list it among the Gujarati signs used for transliterating Arabic script. As with other diacritics in the script, it lets readers interpret the intended pronunciation of a word without ambiguity.
Including U+0AFC in digital text supports accurate representation of written Gujarati, which matters for cultural and linguistic preservation among Gujarati-speaking communities. The Gujarati script has a rich history that dates back to the 12th century, and its proper usage helps preserve this historical and cultural heritage for future generations.
In summary, U+0AFC, the Gujarati Sign Maddah, contributes to linguistic integrity, cultural preservation, and effective communication in digital Gujarati text.
How to type the ૼ symbol on Windows
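Hold Alt and type 2812 on the numeric keypad (2812 is the decimal value of the code point 0x0AFC). This works only in applications that accept decimal Unicode Alt codes; elsewhere, copy the character from Character Map.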
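The number typed on the numpad is simply the code point written in decimal. A minimal sketch of that conversion follows; the helper name `alt_code_for` is an assumption introduced here, and whether a given application honours the resulting Alt code depends on that application.

```python
def alt_code_for(ch: str) -> str:
    """Return the decimal code point of a single character as a string."""
    return str(ord(ch))


print(alt_code_for("\u0AFC"))  # 2812
```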