Step 1: Determine the UTF-8 encoding bit layout
The character has the Unicode code point U+009C. In UTF-8, it is encoded using 2 bytes because its code point is in the range 0x0080 to 0x07FF.
Therefore we know that the UTF-8 encoding will carry 11 payload bits within the final 16 bits and that it will have the format: 110xxxxx 10xxxxxx
where the x are the payload bits.
UTF-8 encoding bit layout by code point range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits:
Convert the hexadecimal code point U+009C to binary: 10011100. Left-padded with zeros to the 11 bits available in the 2-byte layout, this becomes 00010011100. Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern:
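Obtain the final bytes by arranging the payload bits to match the bit layout. The first 5 payload bits (00010) follow the 110 prefix and the last 6 (011100) follow the 10 prefix:
11000010 10011100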
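The three steps above can be sketched in a few lines of Python. This is a minimal illustration that handles only the 2-byte range; the function name `encode_two_byte_utf8` is a hypothetical helper chosen for this example, and the result is cross-checked against Python's built-in encoder.

```python
def encode_two_byte_utf8(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes (illustrative helper)."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point is outside the 2-byte UTF-8 range")
    # The high 5 payload bits follow the 110 prefix of the leading byte.
    lead = 0b11000000 | (code_point >> 6)
    # The low 6 payload bits follow the 10 prefix of the continuation byte.
    continuation = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, continuation])


encoded = encode_two_byte_utf8(0x9C)

# Step 2: the payload, zero-padded to the 11 bits of the 2-byte layout.
payload_bits = format(0x9C, "011b")
# Step 3: the final bytes written as bit strings.
bit_string = " ".join(format(byte, "08b") for byte in encoded)

print(payload_bits)                          # 00010011100
print(bit_string)                            # 11000010 10011100
print(encoded == "\u009c".encode("utf-8"))   # True
```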
<control>·U+009C
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | C2 9C | 11000010 10011100 |
UTF-16 (big-endian) | 00 9C | 00000000 10011100 |
UTF-16 (little-endian) | 9C 00 | 10011100 00000000 |
UTF-32 (big-endian) | 00 00 00 9C | 00000000 00000000 00000000 10011100 |
UTF-32 (little-endian) | 9C 00 00 00 | 10011100 00000000 00000000 00000000 |
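The rows of this table can be reproduced with Python's standard codecs. The snippet below is a small sketch; the `rows` dictionary and its labels are names chosen for this example only.

```python
ch = "\u009c"

# Map a display label to the Python codec name that yields that byte order.
codecs_by_label = {
    "UTF-8": "utf-8",
    "UTF-16 (big-endian)": "utf-16-be",
    "UTF-16 (little-endian)": "utf-16-le",
    "UTF-32 (big-endian)": "utf-32-be",
    "UTF-32 (little-endian)": "utf-32-le",
}

rows = {}
for label, codec in codecs_by_label.items():
    raw = ch.encode(codec)
    hex_form = " ".join(f"{byte:02X}" for byte in raw)
    binary_form = " ".join(f"{byte:08b}" for byte in raw)
    rows[label] = (hex_form, binary_form)
    print(f"{label} | {hex_form} | {binary_form} |")
```

The explicit `-be` and `-le` codec variants are used so that no byte order mark is prepended to the output.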
Description
The Unicode character U+009C is a C1 control character. Unicode gives control characters no formal name, so it is listed as <control>; its standard alias is STRING TERMINATOR (ST). It is a non-printing character with no visible glyph.
In ISO/IEC 6429 (ECMA-48), ST marks the end of a control string opened by one of the introducers DCS, OSC, PM, APC, or SOS. In 7-bit environments the same function is expressed as the two-character escape sequence ESC \ (0x1B 0x5C).
This character belongs to the Latin-1 Supplement Unicode block, which spans code points 128 to 255 (U+0080 to U+00FF). The first 32 code points of that block, U+0080 to U+009F, are the C1 controls.
In terms of technical properties, U+009C has the General Category (Gc) Cc (Control) and the Bidi Class BN (Boundary Neutral). Its canonical combining class is 0, it has no decomposition type or mapping, and it is not mirrored. Its numeric value, digit value, and ISO/IEC 10646 comment are null.
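The technical properties listed above can be checked against the Unicode database that ships with Python. This is a quick sketch using the standard `unicodedata` module. The exact database version depends on the Python release, but these properties of U+009C are stable.

```python
import unicodedata

ch = "\u009c"

category = unicodedata.category(ch)            # General Category
bidi_class = unicodedata.bidirectional(ch)     # Bidi Class
combining = unicodedata.combining(ch)          # canonical combining class
decomposition = unicodedata.decomposition(ch)  # empty string when there is none
mirrored = unicodedata.mirrored(ch)            # 0 when not mirrored
# Control characters have no formal name, so a default is supplied here.
name = unicodedata.name(ch, "<control>")
# No numeric value is defined, so the default None is returned.
numeric = unicodedata.numeric(ch, None)

print(category, bidi_class, combining, repr(decomposition), mirrored, name, numeric)
```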
How to type the symbol on Windows
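U+009C is a non-printing control character, so there is no visible symbol to type. Holding Alt and typing 0156 on the numpad enters byte 156 of the active ANSI code page; on systems using Windows-1252 this produces œ (U+0153), not U+009C. To insert U+009C itself, use an escape sequence in code, such as "\u009C".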