Step 1: Determine the UTF-8 encoding bit layout
The character ݬ has the Unicode code point U+076C. In UTF-8, it is encoded using 2 bytes because its codepoint is in the range of
0x0080
to0x07ff
.
Therefore we know that the UTF-8 encoding will be done over 11 bits within the final 16 bits and that it will have the format:110xxxxx 10xxxxxx
Where thex
are the payload bits.UTF-8 Encoding bit layout by codepoint range Codepoint Range Bytes Bit pattern Payload length U+0000 - U+007F 1 0xxxxxxx 7 bits U+0080 - U+07FF 2 110xxxxx 10xxxxxx 11 bits U+0800 - U+FFFF 3 1110xxxx 10xxxxxx 10xxxxxx 16 bits U+10000 - U+10FFFF 4 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 21 bits Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+076C to binary:
00000111 01101100
. Those are the payload bits.Step 3: Fill in the bits to match the bit pattern:
Obtain the final bytes by arranging the paylod bits to match the bit layout:
11011101 10101100
ARABIC LETTER REH WITH HAMZA ABOVE·U+076C
Character Information
Character Representations
Click elements to copyEncoding | Hex | Binary |
---|---|---|
UTF8 | DD AC | 11011101 10101100 |
UTF16 (big Endian) | 07 6C | 00000111 01101100 |
UTF16 (little Endian) | 6C 07 | 01101100 00000111 |
UTF32 (big Endian) | 00 00 07 6C | 00000000 00000000 00000111 01101100 |
UTF32 (little Endian) | 6C 07 00 00 | 01101100 00000111 00000000 00000000 |
Description
U+076C is the Unicode character code for Arabic Letter Reh with Hamza Above (Arabic: رُ مَسْتَعْمَلُ في النصوص الرقمية). In the context of digital text, it plays a significant role in representing the Arabic language, which is widely spoken across the Middle East and North Africa. The character is part of the Arabic script, an abjad system that has been used for more than 1,400 years to write various languages, including Modern Standard Arabic, classical Arabic, and several dialects. U+076C features a unique combination of two characters: the base letter "رُ" (Reh) and an additional diacritical mark, in this case, a Hamza Above, which indicates a short vowel sound. This character's accurate representation in digital text is crucial for maintaining linguistic integrity and facilitating seamless communication across different platforms and devices.
How to type the ݬ symbol on Windows
Hold Alt and type 1900 on the numpad. Or use Character Map.