Step 1: Determine the UTF-8 encoding bit layout
The character ݦ has the Unicode code point U+0766. In UTF-8, it is encoded using 2 bytes because its code point falls in the range 0x0080 to 0x07FF.
Therefore, the code point contributes 11 payload bits, which are spread across 2 bytes (16 bits) in the format:
110xxxxx 10xxxxxx
Here each x is a payload bit.
UTF-8 encoding bit layout by code point range
Code point range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0766 to binary:
00000111 01100110
The 11 low-order bits, 111 01100110, are the payload bits; the five leading zeros are dropped.
Step 3: Fill in the bits to match the bit pattern:
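Obtain the final bytes by arranging the payload bits to match the bit layout. The top 5 payload bits (11101) follow the 110 prefix in the first byte, and the remaining 6 bits (100110) follow the 10 prefix in the second byte: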
11011101 10100110
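The three steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the names `LAYOUT` and `utf8_encode` are made up for this example, and the table rows are copied from the bit layout table in Step 1.

```python
# Rows of the layout table: (last code point in range, byte count, lead-byte prefix).
# LAYOUT and utf8_encode are illustrative names, not part of any library.
LAYOUT = [
    (0x007F, 1, 0b00000000),    # 0xxxxxxx
    (0x07FF, 2, 0b11000000),    # 110xxxxx
    (0xFFFF, 3, 0b11100000),    # 1110xxxx
    (0x10FFFF, 4, 0b11110000),  # 11110xxx
]

def utf8_encode(code_point: int) -> bytes:
    """Encode a single code point by hand, following steps 1 to 3."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode scalar value")

    # Step 1: pick the layout row from the code point range.
    for last, length, prefix in LAYOUT:
        if code_point <= last:
            break

    # Steps 2 and 3: take 6 payload bits per continuation byte,
    # starting from the low end, and tag each with the 10xxxxxx marker.
    out = []
    payload = code_point
    for _ in range(length - 1):
        out.append(0b10000000 | (payload & 0b00111111))
        payload >>= 6

    # The remaining high-order payload bits go into the lead byte.
    out.append(prefix | payload)
    return bytes(reversed(out))

encoded = utf8_encode(0x0766)
print(encoded.hex(" "))                        # dd a6
print(" ".join(f"{b:08b}" for b in encoded))   # 11011101 10100110
```

In practice Python's built-in `"\u0766".encode("utf-8")` does the same job; the hand-written version only makes the bit shuffling visible.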
ARABIC LETTER MEEM WITH DOT BELOW·U+0766
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | DD A6 | 11011101 10100110 |
UTF-16 (big endian) | 07 66 | 00000111 01100110 |
UTF-16 (little endian) | 66 07 | 01100110 00000111 |
UTF-32 (big endian) | 00 00 07 66 | 00000000 00000000 00000111 01100110 |
UTF-32 (little endian) | 66 07 00 00 | 01100110 00000111 00000000 00000000 |
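The rows of this table can be reproduced with Python's built-in codecs. The snippet below is a small sketch for checking the values; the `rows` dictionary is just a helper name used here.

```python
ch = "\u0766"  # ARABIC LETTER MEEM WITH DOT BELOW

# Map each codec name to the hex string of the encoded bytes.
rows = {}
for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    data = ch.encode(codec)
    rows[codec] = data.hex(" ").upper()
    binary = " ".join(f"{b:08b}" for b in data)
    print(f"{codec} | {rows[codec]} | {binary} |")
```

The explicit `-be` and `-le` codec names are used so that no byte order mark is prepended to the output.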
Description
The Unicode character U+0766 is named ARABIC LETTER MEEM WITH DOT BELOW. It is the Arabic letter meem (م, U+0645) with an added dot below, which distinguishes it from the plain meem and from other dotted variants. The character is encoded in the Arabic Supplement block (U+0750 to U+077F), which holds extended letters used mainly by languages other than Arabic that are written in Arabic script. It is not one of the 28 letters of the standard Arabic alphabet, and it is not part of the standard Persian or Urdu alphabets. From a technical perspective, U+0766 can be represented in any Unicode encoding form, as shown in the table above, but displaying it correctly requires a font that covers the Arabic Supplement block.
How to type the ݦ symbol on Windows
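Hold Alt and type 1894 (the decimal value of 0x0766) on the numeric keypad. This works only in applications that accept Unicode Alt codes, such as WordPad or Word. Alternatively, use Character Map.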
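The decimal value used for the Alt code is simply the code point converted from hexadecimal. A quick check in Python, shown here only as an illustration:

```python
# Convert the hexadecimal code point 0766 to the decimal number typed on the numpad.
code_point = int("0766", 16)
print(code_point)            # 1894

# Going the other way: the character's code point in decimal and in hex.
print(ord("\u0766"))         # 1894
print(f"U+{code_point:04X}") # U+0766
```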