Step 1: Determine the UTF-8 encoding bit layout
The character ᪰ has the Unicode code point U+1AB0. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore we know that the 16 payload bits will be spread across 3 bytes (24 bits) in the format:
1110xxxx 10xxxxxx 10xxxxxx
where the x positions are the payload bits.
UTF-8 encoding bit layout by code point range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+1AB0 to binary: 00011010 10110000. Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern:
Split the 16 payload bits into groups of 4, 6, and 6 bits (0001, 101010, 110000) and place each group after the prefixes 1110, 10, and 10 of the bit layout. The final bytes are:
11100001 10101010 10110000
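The three steps can be sketched in Python. This is a minimal illustration rather than a production encoder: the function name `utf8_encode` is hypothetical, and the result is only cross-checked against Python's built-in `str.encode`.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single code point by hand, following the bit layout table."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point <= 0x7F:        # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0xFFFF:      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


encoded = utf8_encode(0x1AB0)
print(" ".join(f"{byte:08b}" for byte in encoded))  # 11100001 10101010 10110000
print(encoded == "\u1ab0".encode("utf-8"))          # True
```

The shifts by 12 and 6 select the 4-bit and 6-bit groups from Step 3, and the bitwise OR adds the fixed prefix of each byte.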
COMBINING DOUBLED CIRCUMFLEX ACCENT·U+1AB0
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF8 | E1 AA B0 | 11100001 10101010 10110000 |
UTF16 (big Endian) | 1A B0 | 00011010 10110000 |
UTF16 (little Endian) | B0 1A | 10110000 00011010 |
UTF32 (big Endian) | 00 00 1A B0 | 00000000 00000000 00011010 10110000 |
UTF32 (little Endian) | B0 1A 00 00 | 10110000 00011010 00000000 00000000 |
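The rows of this table can be reproduced with Python's standard codecs. The sketch below assumes the explicit-endianness codec names (`utf-16-be`, `utf-32-le`, and so on), which emit no byte order mark; the `rows` dictionary is only an illustrative container.

```python
char = "\u1ab0"  # COMBINING DOUBLED CIRCUMFLEX ACCENT

encodings = [
    ("UTF8", "utf-8"),
    ("UTF16 (big Endian)", "utf-16-be"),
    ("UTF16 (little Endian)", "utf-16-le"),
    ("UTF32 (big Endian)", "utf-32-be"),
    ("UTF32 (little Endian)", "utf-32-le"),
]

rows = {}
for label, codec in encodings:
    data = char.encode(codec)
    hex_form = data.hex(" ").upper()                       # e.g. "E1 AA B0"
    binary_form = " ".join(f"{byte:08b}" for byte in data)
    rows[label] = (hex_form, binary_form)
    print(f"{label} | {hex_form} | {binary_form} |")
```

Note that `bytes.hex` accepts a separator argument only from Python 3.8 onward.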
Description
The Unicode character U+1AB0 is COMBINING DOUBLED CIRCUMFLEX ACCENT, a nonspacing combining mark in the Combining Diacritical Marks Extended block. It is placed above the preceding base character and renders as a doubled circumflex; the ordinary circumflex is a diacritical mark familiar from French and Romanian. For example, the base letter "i" (U+0069 LATIN SMALL LETTER I) followed by U+1AB0 displays as "i᪰". Its role is to mark a specific phonetic or orthographic distinction on the base character in texts that need it. The mark is rarely used and font support is limited, so it may not render correctly on every device or platform.
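As a small illustration of how a combining mark attaches to a base character, the sketch below uses Python's standard `unicodedata` module. The base letter "a" is an arbitrary choice, and the name lookup assumes a Python build whose Unicode database includes U+1AB0.

```python
import unicodedata

mark = "\u1ab0"
combined = "a" + mark  # the base character comes first, then the combining mark

print(unicodedata.name(mark))      # COMBINING DOUBLED CIRCUMFLEX ACCENT
print(unicodedata.category(mark))  # Mn (nonspacing mark)
print(len(combined))               # 2: still two code points, shown as one glyph

# There is no precomposed character for this pair, so NFC leaves it unchanged.
print(unicodedata.normalize("NFC", combined) == combined)  # True
```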
How to type the ᪰ symbol on Windows
Hold Alt and type 6832 (the decimal value of hexadecimal 1AB0) on the numpad. Alternatively, use Character Map.
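The decimal number used in the Alt code is the code point converted from hexadecimal, which can be checked with a short Python sketch (the variable name `alt_code` is only illustrative):

```python
alt_code = int("1AB0", 16)        # parse the hexadecimal code point
print(alt_code)                   # 6832
print(f"U+{alt_code:04X}")        # U+1AB0
print(chr(alt_code) == "\u1ab0")  # True
```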