Step 1: Determine the UTF-8 encoding bit layout
The character ه has the Unicode code point U+0647. In UTF-8, it is encoded using 2 bytes because its codepoint is in the range of
0x0080
to0x07ff
.
Therefore we know that the UTF-8 encoding will be done over 11 bits within the final 16 bits and that it will have the format:110xxxxx 10xxxxxx
Where thex
are the payload bits.UTF-8 Encoding bit layout by codepoint range Codepoint Range Bytes Bit pattern Payload length U+0000 - U+007F 1 0xxxxxxx 7 bits U+0080 - U+07FF 2 110xxxxx 10xxxxxx 11 bits U+0800 - U+FFFF 3 1110xxxx 10xxxxxx 10xxxxxx 16 bits U+10000 - U+10FFFF 4 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 21 bits Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0647 to binary:
00000110 01000111
. Those are the payload bits.Step 3: Fill in the bits to match the bit pattern:
Obtain the final bytes by arranging the paylod bits to match the bit layout:
11011001 10000111
ARABIC LETTER HEH·U+0647
Character Information
Character Representations
Click elements to copyEncoding | Hex | Binary |
---|---|---|
UTF8 | D9 87 | 11011001 10000111 |
UTF16 (big Endian) | 06 47 | 00000110 01000111 |
UTF16 (little Endian) | 47 06 | 01000111 00000110 |
UTF32 (big Endian) | 00 00 06 47 | 00000000 00000000 00000110 01000111 |
UTF32 (little Endian) | 47 06 00 00 | 01000111 00000110 00000000 00000000 |
Description
U+0647 is the Unicode code point for Arabic Letter Heh (ه), a character widely used within the Arabic script. In digital text, this character plays a significant role as it serves as a fundamental building block of various words in the Arabic language. The Arabic script is not only used to communicate among Arabic-speaking communities but also serves as an important tool for conveying religious texts, literature, and scientific knowledge. U+0647 specifically represents the Arabic letter 'ه', which phonetically corresponds to a voiceless glottal plosive sound (/h/). In written form, this character is visually distinct due to its unique shape and appearance. Apart from its linguistic role, U+0647 also carries cultural and historical significance. The Arabic script has been in use for over a thousand years, with the modern Arabic alphabet deriving largely from the pre-Islamic period. As an integral part of this ancient writing system, the character U+0647 has witnessed the evolution of languages, cultures, and civilizations in the Middle East and North Africa. Additionally, it is worth noting that the Arabic script does not use uppercase letters like the Latin alphabet. Thus, U+0647 always appears in its lowercase form, providing a uniform appearance across various text formats. In technical contexts, U+0647 may be encountered when working with software and systems that involve text encoding or processing in Arabic language environments. For instance, developers creating applications, websites, or other digital platforms targeting Arabic-speaking users must consider proper support for characters like U+0647 to ensure accurate representation and display of content. The character's encoding within the Unicode Standard enables seamless integration and compatibility across various platforms and devices, thereby facilitating efficient communication in the Arabic language on a global scale.
How to type the ه symbol on Windows
Hold Alt and type 1607 on the numpad. Or use Character Map.