Step 1: Determine the UTF-8 encoding bit layout
The character has the Unicode code point U+0089. In UTF-8, it is encoded using 2 bytes because its code point is in the range 0x0080 to 0x07FF.
The encoding therefore carries 11 payload bits within its 16 bits and has the format: 110xxxxx 10xxxxxx
Where the x are the payload bits.
UTF-8 encoding bit layout by code point range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0089 to binary:
10001001. Those are the payload bits; left-padded with zeros to the 11 bits of the 2-byte layout, they are 00010001001.
Step 3: Fill in the bits to match the bit pattern:
Obtain the final bytes by arranging the payload bits to match the bit layout:
11000010 10001001
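The three steps above can be sketched in Python. This is a minimal illustration that handles only the 2-byte range; the function name `utf8_encode_two_byte` is hypothetical and not part of any library.

```python
def utf8_encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF using the 110xxxxx 10xxxxxx layout."""
    # Step 1: confirm the code point falls in the 2-byte range.
    if not 0x0080 <= code_point <= 0x07FF:
        raise ValueError("code point is outside the 2-byte UTF-8 range")
    # Step 2: split the 11 payload bits into the top 5 and the bottom 6.
    high_bits = code_point >> 6
    low_bits = code_point & 0b111111
    # Step 3: place the payload bits behind the 110 and 10 prefixes.
    return bytes([0b11000000 | high_bits, 0b10000000 | low_bits])


encoded = utf8_encode_two_byte(0x0089)
bits = " ".join(f"{byte:08b}" for byte in encoded)
print(bits)  # 11000010 10001001
```

The result can be cross-checked against Python's built-in codec with `chr(0x0089).encode("utf-8")`, which returns the same two bytes.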
<control>·U+0089
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF8 | C2 89 | 11000010 10001001 |
UTF16 (big Endian) | 00 89 | 00000000 10001001 |
UTF16 (little Endian) | 89 00 | 10001001 00000000 |
UTF32 (big Endian) | 00 00 00 89 | 00000000 00000000 00000000 10001001 |
UTF32 (little Endian) | 89 00 00 00 | 10001001 00000000 00000000 00000000 |
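The hex column of this table can be reproduced with Python's standard codecs. This is a small sketch; the variable names `encodings` and `rows` are illustrative, and the `-be`/`-le` codec variants are used so that no byte order mark is prepended.

```python
char = chr(0x0089)

# Map each table label to the name of the matching Python codec.
encodings = {
    "UTF8": "utf-8",
    "UTF16 (big Endian)": "utf-16-be",
    "UTF16 (little Endian)": "utf-16-le",
    "UTF32 (big Endian)": "utf-32-be",
    "UTF32 (little Endian)": "utf-32-le",
}

# Encode the character and format the bytes as space-separated uppercase hex.
rows = {label: char.encode(codec).hex(" ").upper() for label, codec in encodings.items()}

for label, hex_bytes in rows.items():
    print(f"{label} | {hex_bytes} |")
```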
Description
The Unicode character U+0089 (decimal 137) is a C1 control code. It has no printable glyph and no formal character name of its own, which is why it is listed as <control>; its Unicode name alias is CHARACTER TABULATION WITH JUSTIFICATION (abbreviated HTJ), a function inherited from ISO/IEC 6429. It is not part of the Control Pictures block (U+2400-243F) or of any symbol or emoji range. It belongs to the Latin-1 Supplement block (U+0080-00FF), a block of 128 code points that extends the basic Latin character set: the first 32 (U+0080-009F) are the C1 controls, and the rest are printable characters such as the pilcrow (¶) and the section sign (§). U+0089 is rarely used in modern text. When byte 0x89 appears in a document, it is often a Windows-1252 per mille sign (‰, U+2030) that was decoded as Latin-1.
How to type the symbol on Windows
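Holding Alt and typing 0137 on the numpad enters position 137 of the system's ANSI code page. On systems that use Windows-1252, this produces ‰ (U+2030) rather than the control character U+0089. Because U+0089 is a non-printing control code, Character Map may not list it, and it is usually easier to insert it programmatically.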