Step 1: Determine the UTF-8 encoding bit layout
The character অ has the Unicode code point U+0985. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore the 16 payload bits are carried within the 24 bits (3 bytes) of the encoded form, which has the format:
1110xxxx 10xxxxxx 10xxxxxx
where each x is a payload bit.
UTF-8 encoding bit layout by code point range
Code point range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0985 to binary:
00001001 10000101
Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern:
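Obtain the final bytes by arranging the payload bits to match the bit layout. The 16 payload bits split into groups of 4, 6 and 6 bits (0000 100110 000101), one group per byte, giving:
11100000 10100110 10000101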
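The three steps above can be sketched in Python. This is a minimal illustration rather than a full encoder: the function name `utf8_encode_3byte` is hypothetical, and it handles only the 3-byte range U+0800 to U+FFFF. The result is compared with Python's built-in `str.encode`.

```python
def utf8_encode_3byte(code_point: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF with the 3-byte UTF-8 pattern."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the 3-byte UTF-8 range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    # 1110xxxx: marker bits plus the top 4 payload bits
    byte1 = 0b11100000 | (code_point >> 12)
    # 10xxxxxx: continuation marker plus the middle 6 payload bits
    byte2 = 0b10000000 | ((code_point >> 6) & 0b111111)
    # 10xxxxxx: continuation marker plus the low 6 payload bits
    byte3 = 0b10000000 | (code_point & 0b111111)
    return bytes([byte1, byte2, byte3])


encoded = utf8_encode_3byte(0x0985)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100000 10100110 10000101
print(encoded == "\u0985".encode("utf-8"))    # True
```

The shifts by 12 and 6 pick out the 4-, 6- and 6-bit payload groups, and the OR with each marker fills in the fixed prefix bits of the pattern.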
BENGALI LETTER A · U+0985
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E0 A6 85 | 11100000 10100110 10000101 |
UTF-16 (big endian) | 09 85 | 00001001 10000101 |
UTF-16 (little endian) | 85 09 | 10000101 00001001 |
UTF-32 (big endian) | 00 00 09 85 | 00000000 00000000 00001001 10000101 |
UTF-32 (little endian) | 85 09 00 00 | 10000101 00001001 00000000 00000000 |
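The hex column of this table can be reproduced with Python's built-in codecs. The sketch below assumes Python 3.8 or later (for the separator argument of `bytes.hex`) and uses the explicit big-endian and little-endian codec names so that no byte order mark is prepended; the variable names are illustrative.

```python
char = "\u0985"  # BENGALI LETTER A

# Explicit -be / -le codecs emit the bare code units without a byte order mark.
encodings = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]

# Map each encoding name to its bytes written as space-separated uppercase hex.
rows = {name: char.encode(name).hex(" ").upper() for name in encodings}

for name, hex_bytes in rows.items():
    print(f"{name:10} {hex_bytes}")
```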
Description
The Unicode character U+0985 is BENGALI LETTER A, an independent vowel letter of the Bengali script and the first letter of the Bengali alphabet. Bengali is an Indo-Aryan language spoken predominantly in Bangladesh and the Indian state of West Bengal, and its script is written from left to right. Vowels in the script have two forms: an independent letter, used when the vowel is not attached to a consonant (typically at the start of a word), and a dependent sign attached to a consonant. U+0985 is the independent form of the inherent vowel, conventionally transliterated as 'a'. It belongs to the Bengali block of Unicode, which spans U+0980 to U+09FF. Its inclusion in Unicode lets Bengali text be stored and exchanged consistently across digital platforms.
How to type the অ symbol on Windows
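Hold Alt and type 2437 (the decimal value of hexadecimal 0985) on the numeric keypad; this works only in applications that accept Unicode Alt codes. Alternatively, copy the character from Character Map.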
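The number 2437 is the code point written in decimal. As a quick check, the conversion can be sketched in Python (the variable name `alt_code` is illustrative):

```python
# Convert the hexadecimal code point 0985 to the decimal number typed on the numpad.
alt_code = int("0985", 16)
print(alt_code)                     # 2437

# Converting the number back to a character yields the same letter.
print(chr(alt_code) == "\u0985")    # True
```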