Step 1: Determine the UTF-8 encoding bit layout
The character ท has the Unicode code point U+0E17. In UTF-8, it is encoded using 3 bytes because its code point is in the range 0x0800 to 0xFFFF.
Therefore the encoding carries 16 payload bits within a total of 24 bits (3 bytes), in the format:
1110xxxx 10xxxxxx 10xxxxxx
Where the x are the payload bits.
UTF-8 Encoding bit layout by codepoint range
Codepoint Range | Bytes | Bit pattern | Payload length |
---|---|---|---|
U+0000 - U+007F | 1 | 0xxxxxxx | 7 bits |
U+0080 - U+07FF | 2 | 110xxxxx 10xxxxxx | 11 bits |
U+0800 - U+FFFF | 3 | 1110xxxx 10xxxxxx 10xxxxxx | 16 bits |
U+10000 - U+10FFFF | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 bits |
Step 2: Obtain the payload bits
Convert the hexadecimal code point U+0E17 to binary:
00001110 00010111
Those are the payload bits.
Step 3: Fill in the bits to match the bit pattern
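Obtain the final bytes by arranging the payload bits to match the bit layout. Split the 16 payload bits into groups of 4, 6, and 6 bits (0000, 111000, 010111) and place each group after the fixed prefix of its byte (1110, 10, 10):
11100000 10111000 10010111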
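The three steps above can be sketched in Python. This is a minimal illustration, not a replacement for the built-in codec; the function name `utf8_encode` is hypothetical and chosen only for this example. It picks the byte count from the code point range, then fills the payload bits into the pattern from the least significant end.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single Unicode code point to UTF-8 by hand (illustrative sketch)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    # Step 1: choose the bit layout from the code point range.
    if code_point <= 0x7F:
        return bytes([code_point])            # 0xxxxxxx
    if code_point <= 0x7FF:
        lead, n_cont = 0b11000000, 1          # 110xxxxx + 1 continuation byte
    elif code_point <= 0xFFFF:
        lead, n_cont = 0b11100000, 2          # 1110xxxx + 2 continuation bytes
    else:
        lead, n_cont = 0b11110000, 3          # 11110xxx + 3 continuation bytes

    # Steps 2 and 3: peel off 6 payload bits at a time for each
    # continuation byte (10xxxxxx), starting with the lowest bits.
    continuation = []
    for _ in range(n_cont):
        continuation.append(0b10000000 | (code_point & 0b111111))
        code_point >>= 6

    # The remaining high bits go into the lead byte.
    return bytes([lead | code_point] + continuation[::-1])


encoded = utf8_encode(0x0E17)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100000 10111000 10010111
print(" ".join(f"{b:02X}" for b in encoded))  # E0 B8 97
```

In practice `"ท".encode("utf-8")` does the same job; the manual version is only meant to make the bit layout visible.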
THAI CHARACTER THO THAHAN·U+0E17
Character Information
Character Representations
Encoding | Hex | Binary |
---|---|---|
UTF-8 | E0 B8 97 | 11100000 10111000 10010111 |
UTF-16 (big-endian) | 0E 17 | 00001110 00010111 |
UTF-16 (little-endian) | 17 0E | 00010111 00001110 |
UTF-32 (big-endian) | 00 00 0E 17 | 00000000 00000000 00001110 00010111 |
UTF-32 (little-endian) | 17 0E 00 00 | 00010111 00001110 00000000 00000000 |
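The rows of this table can be reproduced with Python's built-in codecs. The sketch below assumes the standard codec names `utf-16-be`, `utf-16-le`, `utf-32-be`, and `utf-32-le`, which encode without a byte order mark; the `rows` dictionary is just a convenience for this example.

```python
char = "\u0e17"  # ท, THAI CHARACTER THO THAHAN

codecs_to_show = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]

rows = {}
for codec in codecs_to_show:
    data = char.encode(codec)
    hex_form = " ".join(f"{b:02X}" for b in data)   # e.g. "E0 B8 97"
    bin_form = " ".join(f"{b:08b}" for b in data)   # e.g. "11100000 ..."
    rows[codec] = (hex_form, bin_form)
    print(f"{codec} | {hex_form} | {bin_form} |")
```

Using the plain `utf-16` or `utf-32` codec names instead would prepend a byte order mark, so the output would no longer match the table.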
Description
U+0E17, the Thai character Tho Thahan, is a consonant letter of the Thai script. It represents a specific consonant sound in spoken Thai, so it is needed to write the language accurately. Encoding it correctly keeps Thai text, and the culture and history recorded in it, accessible in digital form to speakers worldwide. As Thailand's role in global trade and technology grows, accurate encoding and handling of characters like U+0E17 matter increasingly for digital communication within and beyond the country's borders.
How to type the ท symbol on Windows
Hold Alt and type 3607 on the numeric keypad, or use Character Map.
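The number 3607 is the decimal value of the hexadecimal code point 0E17. A short Python check, included only to show where the number comes from:

```python
alt_code = int("0E17", 16)   # hexadecimal code point to decimal
print(alt_code)              # 3607
print(chr(alt_code))         # ท
```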