Byte Counter

Calculate string byte size in UTF-8, UTF-16, and other encodings online. Free byte counter for measuring text payload and storage.


Encoding Notes:

  • UTF-8: 1-4 bytes per code point; ASCII characters take 1 byte
  • UTF-16: 2 or 4 bytes per code point; used internally by JavaScript and Java strings
  • UTF-32: fixed 4 bytes per code point, simple indexing
  • In UTF-8, CJK characters take 3 bytes and most emoji take 4 bytes, versus 1 byte for ASCII
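
The notes above can be checked with a few lines of Python. This is a minimal sketch, not the tool's own implementation; the helper name `byte_sizes` is an illustrative assumption. It uses the little-endian codec variants so that no byte order mark is added to the count.

```python
def byte_sizes(text: str) -> dict:
    """Return the encoded size of `text` in bytes for three Unicode encodings."""
    return {
        # The "-le" codecs encode without a byte order mark (BOM),
        # so only the text itself is measured.
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# One sample per width class: ASCII, accented Latin, CJK, emoji.
samples = ["A", "\u00e9", "\u4f60", "\U0001F44B"]
table = {ch: byte_sizes(ch) for ch in samples}
```

For these four samples the UTF-8 sizes come out as 1, 2, 3, and 4 bytes, while UTF-32 is always 4 bytes.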

About Byte Counter

Calculate the byte size of text in different character encodings. Useful for understanding string storage requirements and optimizing data transmission.

How to Use Byte Counter

1. Paste your text

Paste or type the text you want to measure. Byte size updates instantly as you type, calculated as UTF-8 encoding.

2. Compare bytes vs characters

The counter shows both 'characters' (the count a user sees) and 'bytes' (storage and transmission size). In UTF-8, accented Latin letters take 2 bytes, CJK characters take 3 bytes, and most emoji take 4 bytes.

3. Apply to your context

Use the byte count when working with byte-limited contexts: SMS, database VARCHAR limits, HTTP headers, cookies, JWT tokens, file metadata.

4. Optimize if needed

If the text exceeds a limit, reduce it by removing emoji (most save 4 bytes each), simplifying accented characters (Café → Cafe saves 1 byte), removing unnecessary spaces, or using ASCII only where possible.
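
As a rough illustration of the accent-simplifying step, the sketch below decomposes each character and drops the combining marks. The function names `strip_accents` and `utf8_len` are illustrative assumptions, and the transformation is lossy: it changes spelling, so it only suits contexts where that is acceptable.

```python
import unicodedata


def utf8_len(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (lossy)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters
    # and a non-zero class for combining marks such as U+0301.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


before = "Caf\u00e9"                          # "Café" with a precomposed é
after = strip_accents(before)                 # "Cafe"
saved = utf8_len(before) - utf8_len(after)    # 5 bytes - 4 bytes
```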

When to Use Byte Counter

Database field size validation

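Some databases define VARCHAR(N) limits in bytes rather than characters. Oracle's VARCHAR2(N) uses byte semantics by default and SQL Server's varchar(N) counts bytes, whereas PostgreSQL's varchar(N) counts characters. Before saving multi-language or emoji-containing text to a byte-limited column, verify that it fits in the allocated space. A 255-byte limit holds at most 85 Chinese characters at 3 bytes each in UTF-8, or 63 emoji at 4 bytes each.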
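
A pre-save check for a byte-limited column could look like the following sketch. The helper names `fits` and `truncate_utf8` are hypothetical, and the truncation only guarantees valid UTF-8; it can still cut between a base letter and a following combining mark or inside an emoji sequence.

```python
def fits(text: str, max_bytes: int) -> bool:
    """True if `text` encoded as UTF-8 fits in `max_bytes`."""
    return len(text.encode("utf-8")) <= max_bytes


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut `text` to at most `max_bytes` of UTF-8 without splitting a code point."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Slicing may leave an incomplete multi-byte sequence at the end;
    # errors="ignore" drops those trailing bytes when decoding.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

For example, `truncate_utf8("你好", 4)` keeps only the first character, because the second one would need bytes 4 to 6.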

HTTP header and cookie size

Many servers cap request header size at around 8 KB by default, and browsers typically limit each cookie to about 4 KB (4096 bytes including the name, value, and attributes). When storing user preferences, session data, or auth tokens in cookies, calculating the exact byte size prevents unexpected request failures or cookie rejections by browsers.
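
The sketch below estimates the size of a cookie's name=value pair. It assumes the value is percent-encoded before being stored, which is a common convention rather than a requirement, and it treats 4096 bytes as the limit; the constant and function names are illustrative. Attributes such as Path or Expires are not counted here.

```python
from urllib.parse import quote

# Assumption: a 4096-byte per-cookie limit, as commonly applied by browsers.
COOKIE_LIMIT_BYTES = 4096


def cookie_pair_size(name: str, value: str) -> int:
    """Bytes used by `name=value` once the value is percent-encoded."""
    encoded_value = quote(value, safe="")
    return len(f"{name}={encoded_value}".encode("utf-8"))


def fits_in_cookie(name: str, value: str) -> bool:
    """True if the name=value pair alone stays within the assumed limit."""
    return cookie_pair_size(name, value) <= COOKIE_LIMIT_BYTES
```

Percent-encoding triples the cost of every non-ASCII byte: a single é becomes `%C3%A9`, which is 6 bytes instead of 2.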

SMS encoding optimization

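GSM-7 encoding packs 160 seven-bit characters into the 140-byte SMS payload. UCS-2 encoding (used when the message contains any character outside the GSM-7 alphabet, such as emoji or most non-Latin scripts) uses 2 bytes per character, so the same 140 bytes hold only 70 characters. The byte counter helps keep SMS content within GSM-7 where possible, which fits more text into each message.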
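
The encoding decision can be sketched as follows. The character tables follow the GSM 03.38 default alphabet and its extension table (extension characters cost two septets), and the multi-part capacities of 153 and 67 units reflect the usual concatenation header. The function name `sms_info` is an illustrative assumption, and the segment count is an estimate: carriers differ, and the sketch ignores the rule that a two-septet character cannot straddle a segment boundary.

```python
# GSM 03.38 default alphabet (the escape code itself is omitted).
GSM7_BASIC = (
    "@\u00a3$\u00a5\u00e8\u00e9\u00f9\u00ec\u00f2\u00c7\n\u00d8\u00f8\r\u00c5\u00e5"
    "\u0394_\u03a6\u0393\u039b\u03a9\u03a0\u03a8\u03a3\u0398\u039e\u00c6\u00e6\u00df\u00c9"
    " !\"#\u00a4%&'()*+,-./0123456789:;<=>?"
    "\u00a1ABCDEFGHIJKLMNOPQRSTUVWXYZ\u00c4\u00d6\u00d1\u00dc\u00a7"
    "\u00bfabcdefghijklmnopqrstuvwxyz\u00e4\u00f6\u00f1\u00fc\u00e0"
)
# Extension table: each of these costs two septets (escape + character).
GSM7_EXTENDED = "^{}\\[~]|\u20ac\f"


def sms_info(text: str):
    """Return (encoding, units used, estimated segment count) for `text`."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        encoding = "GSM-7"
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi = 160, 153
    else:
        encoding = "UCS-2"
        # Count 16-bit units; characters outside the BMP (most emoji)
        # are sent as surrogate pairs and therefore cost two units.
        units = len(text.encode("utf-16-be")) // 2
        single, multi = 70, 67
    segments = 1 if units <= single else -(-units // multi)  # ceiling division
    return encoding, units, segments
```

Adding a single emoji to an otherwise plain message is enough to switch the whole message to UCS-2.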

JWT token size tracking

JWTs are typically sent with every authenticated request in the Authorization header, so each byte adds to request size. Tokens with many claims, custom data, or namespace-prefixed claims grow quickly. Tracking byte size helps balance feature richness against request overhead.
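
To see how claims affect token size, the following sketch estimates the length of a compact JWT without signing anything. It assumes an HS256 header and a 32-byte signature; the function names and defaults are illustrative, and a real library may serialize the JSON slightly differently.

```python
import base64
import json


def b64url_len(raw: bytes) -> int:
    """Length of `raw` after unpadded base64url encoding, as used in JWTs."""
    return len(base64.urlsafe_b64encode(raw).rstrip(b"="))


def estimate_jwt_size(claims: dict, signature_bytes: int = 32) -> int:
    """Estimate the byte length of `header.payload.signature`."""
    # Assumption: a minimal HS256 header, serialized without whitespace.
    header = json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":"))
    payload = json.dumps(claims, separators=(",", ":"), ensure_ascii=False)
    # Unpadded base64 length of n bytes is ceil(4n / 3).
    signature_len = -(-(signature_bytes * 4) // 3)
    return (
        b64url_len(header.encode("utf-8"))
        + 1  # "." separator
        + b64url_len(payload.encode("utf-8"))
        + 1  # "." separator
        + signature_len
    )
```

Because base64url turns every 3 bytes into 4 characters, each extra byte of claims costs roughly 1.33 bytes in the token.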

Byte Counter Examples

ASCII text

Input
Hello, World!
Output
Bytes: 13
Characters: 13

Each ASCII character is 1 byte in UTF-8, so bytes equal characters for plain English text. Standard messaging, code identifiers, and URL paths typically use only ASCII.

Accented characters

Input
Café résumé
Output
Bytes: 14
Characters: 11

Café and résumé contain three é characters, each 2 bytes in UTF-8, so 11 visible characters require 14 bytes. This matters when storing text in fields with byte limits or transmitting it over byte-limited channels.

Emoji and CJK

Input
Hello 👋 你好
Output
Bytes: 17
Characters: 10

Emoji 👋 is 4 bytes; Chinese characters 你好 are 3 bytes each. The 10-character message (counting the two spaces) requires 17 bytes in total. This is critical for SMS encoding decisions, database field sizing, and any byte-limited context.
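
The three sample inputs can be measured independently with a short script. This is a sketch that counts characters as Unicode code points and bytes as UTF-8; the helper name `counts` is an assumption, and the strings use escapes so the accented letters are unambiguously precomposed.

```python
def counts(text: str):
    """Return (code points, UTF-8 bytes) for `text`."""
    return len(text), len(text.encode("utf-8"))


samples = {
    "ascii": "Hello, World!",
    "accented": "Caf\u00e9 r\u00e9sum\u00e9",          # precomposed é (U+00E9)
    "emoji_cjk": "Hello \U0001F44B \u4f60\u597d",      # waving hand, then two CJK characters
}

results = {name: counts(text) for name, text in samples.items()}
for name, (chars, size) in results.items():
    print(f"{name}: {chars} characters, {size} bytes")
```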

Tips & Best Practices for Byte Counter

  • 1. Use byte count, not character count, when limits are specified in bytes. Cookies, HTTP headers, some database columns, and many file format metadata fields use byte limits.
  • 2. For SMS, stay within the GSM-7 alphabet when possible. A message containing any character outside it (emoji, most non-Latin scripts, and accented letters that GSM-7 lacks, such as ê or ã) switches to UCS-2, which cuts capacity from 160 to 70 characters per message.
  • 3. Watch for character normalization differences. 'é' (U+00E9) is 1 code point (2 bytes in UTF-8); the decomposed form (e + combining acute U+0301) is 2 code points (3 bytes in UTF-8). The two are visually identical but differ in bytes.
  • 4. ZWJ (zero-width joiner) emoji such as 👨‍👩‍👧‍👦 (family) join several emoji with joiners; this one takes 25 bytes in UTF-8. Such sequences appear as a single 'character' but are surprisingly large in storage.
  • 5. When designing database schemas with byte-based limits, account for UTF-8 width: up to 3 bytes per BMP character and 4 bytes for characters outside the BMP, including most emoji. A 100-byte name column holds 100 ASCII characters but only 33 Chinese characters. Plan for the worst case.
  • 6. For cookies or headers approaching limits, consider compressing values. Gzip followed by Base64 can shrink large, repetitive JSON, but the fixed gzip overhead and Base64's roughly 33% expansion can make short values larger, so measure before adopting it.
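
Tips 3, 4, and 6 can be verified with the sketch below. The helper names `utf8_len` and `packed_len` and the sample JSON values are illustrative assumptions; actual compression results depend on the data.

```python
import base64
import gzip
import json
import unicodedata


def utf8_len(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Tip 3: same visible character, different byte counts.
composed = "\u00e9"        # precomposed é
decomposed = "e\u0301"     # e + combining acute accent
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Tip 4: family emoji = four emoji joined by three zero-width joiners.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"


# Tip 6: compare raw size with gzip + Base64 size.
def packed_len(value: str) -> int:
    """Length of `value` after gzip compression and Base64 encoding."""
    return len(base64.b64encode(gzip.compress(value.encode("utf-8"))))


small = json.dumps({"theme": "dark"})
large = json.dumps({"items": ["example-value"] * 200})
```

Normalizing input to NFC before measuring or storing keeps the byte count of visually identical text consistent. For the compression tip, the short value grows when packed while the long repetitive one shrinks.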

Frequently Asked Questions

Byte Counter measures the byte size of text: how many bytes the text occupies when stored as UTF-8, the standard web encoding. This differs from character count: 'café' is 4 characters but 5 bytes, because the é takes 2 bytes in UTF-8. The distinction is critical in byte-limited contexts such as HTTP headers, file size limits, and byte-based database column limits.