Unicode Converter
Convert between Unicode escape sequences and readable text online. Free Unicode converter supporting UTF-8, UTF-16, and code points.
About Unicode Converter
Convert between text and Unicode code points. Supports multiple output formats including hex (U+XXXX), HTML entities, CSS escape sequences, and JavaScript notation.
How to Use Unicode Converter
Enter text or codepoint
Type a character to see its Unicode codepoint, or paste a codepoint like U+1F600 to see the matching character. The conversion runs in both directions from the same input field.
View all representations
You'll see the codepoint, the UTF-8 hex bytes, the JavaScript escape \u{...}, the HTML entity &#NNN;, the URL-encoded percent form %XX, and several other formats side by side.
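As a rough illustration of how these representations relate to each other, here is a minimal Python sketch. The function name `representations` and the dictionary keys are made up for this example; this is not the converter's actual implementation.

```python
from urllib.parse import quote


def representations(ch: str) -> dict[str, str]:
    """Return several common notations for a single character (hypothetical helper)."""
    cp = ord(ch)  # the Unicode code point as an integer
    utf8 = ch.encode("utf-8")
    return {
        "codepoint": f"U+{cp:04X}",
        "utf8_hex": " ".join(f"{b:02X}" for b in utf8),
        "javascript": f"\\u{{{cp:X}}}",
        "html_entity": f"&#{cp};",
        "percent": quote(ch, safe=""),  # percent-encodes the UTF-8 bytes
    }


reps = representations("\U0001F600")
for name, value in reps.items():
    print(f"{name}: {value}")
```

Every format is derived from the same two facts: the code point number and its UTF-8 byte sequence.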
Copy the format you need
Pick the right format for your destination. Source code wants the language-specific escape, HTML templates want the entity form, and URLs need percent encoding. The converter covers each common case so you don't have to translate by hand.
Inspect emoji structure
Compound emoji like 👨‍👩‍👧‍👦 decompose into multiple codepoints linked by zero-width joiners. The converter shows the full breakdown, which is the easiest way to see why these characters take so much storage and behave oddly with naive length functions.
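The breakdown can be reproduced with a short Python sketch. It assumes the family emoji is the man, woman, girl, boy sequence; the variable names are illustrative only.

```python
import unicodedata

# Family emoji: four person emoji joined by U+200D (zero-width joiner).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"

# List each code point with its Unicode name.
for ch in family:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch, '<unnamed>')}")

code_points = len(family)                            # Python counts code points
utf16_units = len(family.encode("utf-16-le")) // 2   # what JavaScript's length reports
utf8_bytes = len(family.encode("utf-8"))             # storage size in UTF-8
print(code_points, utf16_units, utf8_bytes)
```

One visible glyph here is 7 code points, 11 UTF-16 code units, and 25 UTF-8 bytes, which is why length checks and column limits can surprise you.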
When to Use Unicode Converter
Embedding special characters in code
Some editors and build pipelines still flinch at raw Unicode in source files. Escape sequences sidestep the problem (\u{1F600} for JavaScript, \U0001F600 for Python, and so on), letting you keep emoji, symbols, and translated strings in plain ASCII while the runtime restores them to their full glory.
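A small Python sketch of that round trip, using the standard `backslashreplace` error handler and the `unicode_escape` codec. It assumes the text contains no literal backslashes, which would be misread when decoding.

```python
text = "caf\u00e9 \U0001F600"  # "café" followed by the grinning-face emoji

# Replace every non-ASCII character with a Python-style backslash escape.
ascii_only = text.encode("ascii", "backslashreplace")
print(ascii_only)

# Interpreting the escapes again restores the original string.
restored = ascii_only.decode("unicode_escape")
print(restored == text)
```

The escaped form is pure ASCII, so it passes through tools that cannot handle raw Unicode in source files.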
Debugging encoding issues
When an apostrophe shows up as â€™, the bytes are correct but the program reading them picked the wrong encoding. Looking at the raw codepoints exposes that mismatch quickly and usually points the finger at UTF-8 being interpreted as Latin-1 or Windows-1252 somewhere along the path.
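The mismatch is easy to reproduce. This Python sketch assumes the wrong decoder was Windows-1252 (`cp1252`), a common culprit; other legacy encodings produce different garbage.

```python
original = "\u2019"  # RIGHT SINGLE QUOTATION MARK, the curly apostrophe

raw = original.encode("utf-8")   # three bytes: E2 80 99
garbled = raw.decode("cp1252")   # each byte read as its own character

# Reversing the wrong step recovers the text, as long as no bytes were lost.
repaired = garbled.encode("cp1252").decode("utf-8")

print(raw.hex(" "), garbled, repaired)
```

If the repair step raises an error instead, the text has probably been through more than one bad conversion.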
Embedding emoji in HTML
Modern HTML accepts Unicode directly, but some templating layers and legacy systems mangle anything outside ASCII. Falling back to entity references like &#128512; for 😀 routes around those filters and keeps your output readable across environments.
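In Python, one way to produce that fallback is the standard `xmlcharrefreplace` error handler, sketched below. It emits decimal references and leaves ASCII untouched; it does not escape `<`, `>`, or `&`, so it is not a substitute for normal HTML escaping.

```python
import html

text = "Hi \U0001F600"

# Swap each non-ASCII character for a decimal character reference.
escaped = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(escaped)

# html.unescape turns references back into characters.
restored = html.unescape(escaped)
print(restored == text)
```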
Working with internationalized domain names
Domains containing non-ASCII characters travel through DNS in Punycode form, so café.com becomes xn--caf-dma.com on the wire. Translating between the two views matters whenever you're configuring an IDN, validating user input, or auditing how a domain renders in different clients.
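Python's built-in `idna` codec can show both views, as in the sketch below. Note the assumption: that codec follows the older IDNA 2003 rules, so results for some characters may differ from what current registries and browsers use.

```python
domain = "caf\u00e9.com"  # café.com

# Unicode form to the ASCII form used in DNS.
ascii_form = domain.encode("idna").decode("ascii")

# And back again for display.
unicode_form = ascii_form.encode("ascii").decode("idna")

print(ascii_form, unicode_form)
```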
Unicode Converter Examples
Emoji to codepoint
😀 (grinning face)
Codepoint: U+1F600
UTF-8: F0 9F 98 80 (4 bytes)
JavaScript: \u{1F600}
HTML: &#128512;
One emoji, four equivalent identifiers. Reach for the codepoint when writing documentation, the UTF-8 bytes when you're estimating storage, the JavaScript escape when patching source files, and the HTML entity when templating engines won't pass raw Unicode through.
Surrogate pair issue
JS String.length of '😀'
String.length = 2 (not 1!)
Reason: surrogate pair
JavaScript measures strings in UTF-16 code units, and any codepoint above U+FFFF needs two of them: a surrogate pair. That's why most emoji return a length of 2 instead of 1, and why Array.from(str).length, which counts codepoints, is the safer way to count actual characters. Compound emoji joined by zero-width joiners still count as several codepoints.
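The surrogate pair arithmetic can be sketched in Python, which counts code points natively, so the UTF-16 view has to be computed explicitly. The helper names `utf16_code_units` and `surrogate_pair` are invented for this example.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units for text, the units JavaScript counts."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its high and low surrogates."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


emoji = "\U0001F600"
units = utf16_code_units(emoji)
print(len(emoji), len(units))          # code points vs. UTF-16 code units
print([f"{u:04X}" for u in units])     # the two surrogates
print(surrogate_pair(ord(emoji)) == tuple(units))
```

For U+1F600 the pair works out to D83D DE00, which is the \uD83D\uDE00 form seen in older JavaScript source.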
Chinese character
中 (Chinese 'middle')
Codepoint: U+4E2D
UTF-8: E4 B8 AD (3 bytes)
UTF-16: 4E 2D (2 bytes)
BMP characters (anything below U+10000) take one to three bytes in UTF-8, three for CJK characters like this one, but always two bytes in UTF-16. Since JavaScript uses UTF-16 internally, String.length reports the expected 1 for this character.
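To compare encoded sizes across code point ranges, a quick Python check like the following may help. The sample characters are chosen for illustration.

```python
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]  # A, é, 中, 😀

# Map each code point to (UTF-8 byte count, UTF-16 byte count).
sizes = {
    f"U+{ord(ch):04X}": (len(ch.encode("utf-8")), len(ch.encode("utf-16-be")))
    for ch in samples
}

for cp, (utf8_len, utf16_len) in sizes.items():
    print(f"{cp}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

UTF-8 grows from one to four bytes as the code point rises, while UTF-16 stays at two bytes until the code point leaves the BMP.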
Tips & Best Practices for Unicode Converter
1. Treat UTF-8 as the default for any new file or API you build. It's the web standard, it handles every codepoint, and it sidesteps the parade of incompatibilities that Latin-1 and other legacy encodings keep alive.
2. Excel and Word sometimes save text and CSV files with a leading byte-order mark (U+FEFF) even though UTF-8 doesn't require one. Strip that BOM at the boundary, otherwise the first column header in your import quietly carries a phantom character.
3. In JavaScript, Array.from(str).length counts codepoints, which is closer to the number of visible characters than plain String.length. String.length returns UTF-16 code units, which means emoji and other non-BMP characters report inflated lengths. Compound emoji built with zero-width joiners still count as several codepoints; Intl.Segmenter can count those as single graphemes.
4. Encoding bugs are almost always a three-way mismatch. The file is saved one way, the Content-Type header or meta charset declares another, and your code reads it as a third. Spot which step disagrees and the mojibake usually clears up.
5. For IDNs, decide on a canonical form and store everything that way. Punycode is great for databases and DNS lookups, while Unicode is friendlier in user-facing displays; pick one for storage and convert at the edges.
6. Emoji art varies wildly across platforms. Apple, Google, Microsoft, and Twitter all draw the same codepoint differently, so any test that depends on a specific rendering is fragile. The codepoint itself is what's portable.
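For the BOM tip above, here is a minimal Python sketch of stripping the mark at the boundary with the standard `utf-8-sig` codec. The CSV header row is made-up sample data.

```python
import codecs

# Simulated file contents: a UTF-8 BOM followed by a CSV header row.
raw = codecs.BOM_UTF8 + "name,email\n".encode("utf-8")

plain = raw.decode("utf-8")       # keeps U+FEFF at the start
clean = raw.decode("utf-8-sig")   # drops a leading BOM if there is one

first_header_plain = plain.split(",")[0]
first_header_clean = clean.split(",")[0]
print(repr(first_header_plain), repr(first_header_clean))
```

Because `utf-8-sig` also accepts input without a BOM, it is a reasonable default when reading files of unknown origin.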
Related Tools
Morse Code Converter
Convert text to Morse code and back online. Free Morse code translator with audio playback and dot-dash encoding support.
Base64 Encoder
Encode text to Base64 format online instantly. Free Base64 encoder for converting strings, data URIs, and binary content safely.
Base64 Decoder
Decode Base64 strings back to readable text online instantly. Free Base64 decoder for converting encoded data and content safely.
Hex to Text
Convert hexadecimal values to readable text online. Free hex to text decoder for parsing hex-encoded strings and binary data.
Binary Text Converter
Convert between binary and text online with bidirectional support. Free binary text converter for encoding and decoding binary data.
Text to Binary Converter
Convert text to binary code online with space-separated byte output. Free text to binary encoder for learning and data conversion.