Step 1: Determine the UTF-8 encoding bit layout
The character has the Unicode code point U+0011. In UTF-8, it is encoded using 1 byte because its codepoint is in the range of
0x0000
to0x007f
.
Therefore we know that the UTF-8 encoding will be done over 7 bits within the final 8 bits and that it will have the format:0xxxxxxx
Where thex
are the payload bits.UTF-8 Encoding bit layout by codepoint range Codepoint Range Bytes Bit pattern Payload length U+0000 - U+007F 1 0xxxxxxx 7 bits U+0080 - U+07FF 2 110xxxxx 10xxxxxx 11 bits U+0800 - U+FFFF 3 1110xxxx 10xxxxxx 10xxxxxx 16 bits U+10000 - U+10FFFF 4 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 21 bits Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+0011 to binary:
00010001
. Those are the payload bits.Step 3: Fill in the bits to match the bit pattern:
Obtain the final bytes by arranging the paylod bits to match the bit layout:
00010001
<control>·U+0011
Character Information
Character Representations
Click elements to copyEncoding | Hex | Binary |
---|---|---|
UTF8 | 11 | 00010001 |
UTF16 (big Endian) | 00 11 | 00000000 00010001 |
UTF16 (little Endian) | 11 00 | 00010001 00000000 |
UTF32 (big Endian) | 00 00 00 11 | 00000000 00000000 00000000 00010001 |
UTF32 (little Endian) | 11 00 00 00 | 00010001 00000000 00000000 00000000 |
Description
U+0011, also known as "CHARACTER 0011," is a significant character within the Unicode Standard that primarily functions to ensure proper line and paragraph separation in text formatting, particularly within the CJK (Chinese, Japanese, and Korean) Unicode blocks. This character plays a crucial role in maintaining readability, flow, and efficient typographic presentation of multilingual texts across various digital platforms and applications. Although it is not associated with any specific language or cultural context, U+0011's primary function lies in its utilization for text formatting purposes within the Basic Latin Unicode block (U+0000 to U+007F), serving as a foundation for other Unicode blocks. This essential character set encompasses control codes and special symbols, many of which are indispensable components for programming languages, text documents, and various applications. Despite its historical roots in the ASCII character set, the Basic Latin Unicode block has evolved to meet modern needs and remains an integral part of digital communication worldwide, facilitating accurate and clear communication across multiple platforms and devices.
How to type the symbol on Windows
Hold Alt and type 0017 on the numpad. Or use Character Map.