encoding base32 base58 base85 punycode dev-tech

Beyond Base64: A Guide to Base32, Base58, Base85, and Punycode

Explore alternative binary-to-text encodings. Learn when to use Base58 for Bitcoin, Base85 for PDF/Git, and Punycode for internationalized domains.

2026-04-11

Binary-to-Text Encoding Guide: Base64, Base58, Punycode, and Beyond

In computing, we often need to transport binary data (like images or executable files) over systems that only support text. This is where binary-to-text encoding comes in. These schemes represent binary data using a specific set of printable characters.

1. The Base Family: Efficiency and Readability

Base64 (The Standard)

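Base64 is the most common binary-to-text encoding, used in email (MIME) and for embedding images in HTML/CSS as data URIs. Its alphabet has 64 characters (A-Z, a-z, 0-9, +, and /), with = used as padding. Every 3 bytes of input become 4 characters of output, so encoded data is about 33% larger than the original.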
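As a quick illustration, here is a minimal sketch using Python's standard base64 module. The sample string is arbitrary; the point is the round trip and the growth in size.

```python
import base64

raw = b"Hello, world!"                # 13 bytes of sample data
encoded = base64.b64encode(raw)       # bytes made up only of Base64 characters
decoded = base64.b64decode(encoded)   # back to the original bytes

print(encoded.decode("ascii"))        # SGVsbG8sIHdvcmxkIQ==
print(len(raw), "->", len(encoded))   # 13 -> 20
```

The trailing == is padding: 13 is not a multiple of 3, so the last group is filled out to a full 4 characters.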

Base32

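Base32 uses 32 characters (A-Z and 2-7). It is often used in human-entered codes (like Google Authenticator secret keys) because it is case-insensitive and leaves out the digits 0 and 1, which are easily confused with the letters O and I. The trade-off is size: every 5 bytes become 8 characters, a 60% increase.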
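The case-insensitivity can be seen in a short sketch with Python's standard base64 module. The secret value here is a made-up placeholder, not a real key.

```python
import base64

secret = b"hello"                       # placeholder bytes, not a real MFA secret
encoded = base64.b32encode(secret)      # b"NBSWY3DP"

# A user might type the code in lowercase; casefold=True accepts that.
typed = encoded.decode("ascii").lower() # "nbswy3dp"
decoded = base64.b32decode(typed, casefold=True)

print(encoded, decoded)
```

Without casefold=True, Python's decoder rejects lowercase input, so code that accepts hand-typed values usually needs to normalize case one way or the other.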

Base58

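Popularized by Bitcoin, Base58 is similar to Base64 but drops the visually similar characters 0 (zero), O (capital o), I (capital i), and l (lowercase L), along with the non-alphanumeric + and /. That leaves 58 characters and makes it well suited to wallet addresses, which people often copy by hand.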
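Python's standard library has no Base58 codec, so the following is a minimal sketch of plain Base58 using the Bitcoin alphabet. The function names b58encode and b58decode are chosen here for illustration, and the sketch omits the checksum that Bitcoin's Base58Check adds on top.

```python
# Bitcoin's Base58 alphabet: no 0, O, I, or l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 by treating them as one big-endian integer."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(B58_ALPHABET[rem])
    # Leading zero bytes vanish in the integer, so each is written as "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits))

def b58decode(text: str) -> bytes:
    """Reverse of b58encode; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body

sample = bytes.fromhex("0000287fb4cd")
print(b58encode(sample))   # 11233QC4
```

Unlike Base64, this is not a fixed-width bit regrouping: the whole input is converted as a single number, which is fine for short values such as addresses but slow for large payloads.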

Base85 (ASCII85)

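Ascii85 is used in Adobe PostScript and PDF files, and a related Base85 variant with a different alphabet is used in Git binary patches. It is more efficient than Base64: every 4 bytes become 5 characters, which is 25% overhead compared with about 33% for Base64.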
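To make the size difference concrete, this sketch compares encoded lengths using Python's standard base64 module, which provides both a85encode (Ascii85) and b85encode (the variant its documentation describes as used in git-style binary diffs). The 256-byte sample is arbitrary.

```python
import base64

raw = bytes(range(256))            # 256 bytes of sample data

b64 = base64.b64encode(raw)
a85 = base64.a85encode(raw)        # Ascii85 alphabet ('!' through 'u')
b85 = base64.b85encode(raw)        # alternative 85-character alphabet

print(len(raw), len(b64), len(a85), len(b85))   # 256 344 320 320

# Same efficiency, different alphabets: the two outputs are not interchangeable.
print(a85[:5], b85[:5])
```

Passing adobe=True to a85encode wraps the output in the <~ and ~> markers used by Adobe's format.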


2. Specialized Web Encodings

Punycode

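Punycode represents Unicode characters in the Domain Name System (DNS), where hostnames are limited to ASCII letters, digits, and hyphens. Each internationalized label is converted to ASCII and given the prefix xn--, so "münchen.example" is stored in DNS as "xn--mnchen-3ya.example".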
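Python ships with codecs for both layers, which makes the conversion easy to try. This is a sketch only: the built-in "idna" codec follows the older IDNA 2003 rules, so production code may want a dedicated library instead.

```python
label = "münchen"

# Raw Punycode for a single label (no xn-- prefix).
puny = label.encode("punycode")               # b"mnchen-3ya"

# The "idna" codec handles whole hostnames and adds the xn-- prefix.
host = "münchen.example".encode("idna")       # b"xn--mnchen-3ya.example"
back = host.decode("idna")                    # "münchen.example"

print(puny, host, back)
```

The ASCII letters of the label come first, followed by a hyphen and a compact code describing which non-ASCII characters go where.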

Percent-encoding (URL Encoding)

Used to encode reserved or non-ASCII characters in a URL. Each byte is written as % followed by two hexadecimal digits (e.g., a space becomes %20).
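A short sketch with Python's urllib.parse shows the effect. The sample value is made up; safe="" is passed so that every reserved character, including /, is encoded.

```python
from urllib.parse import quote, unquote

value = "hello world&x=1"
encoded = quote(value, safe="")   # "hello%20world%26x%3D1"
decoded = unquote(encoded)        # back to the original string

print(encoded)
print(quote("é", safe=""))        # %C3%A9 (the two UTF-8 bytes of é)
```

By default quote leaves / unencoded, which suits whole paths; encoding a single query value is the case where safe="" matters.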

Quoted-Printable

Used in email for data that is mostly text but contains some non-ASCII characters. It keeps the text readable for humans even in its encoded form.
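The readability claim is easy to check with Python's standard quopri module. In this sketch (sample text is arbitrary), only the non-ASCII bytes are replaced by =XX escapes while the rest stays as typed.

```python
import quopri

text = "Café au lait"
encoded = quopri.encodestring(text.encode("utf-8"))   # b"Caf=C3=A9 au lait"
decoded = quopri.decodestring(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Compare this with Base64, where the same string would be unreadable without decoding.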


3. Legacy and Niche Encodings

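  • uuencode: An early Unix utility for sending binary files over email. Like Base64, it packs 3 bytes into 4 printable characters.
  • yEnc: Developed to replace uuencode on Usenet newsgroups. It passes most byte values through unchanged over 8-bit channels, so its overhead is only about 1-2%.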
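For a feel of what uuencoded data looks like, this sketch encodes a single line with Python's binascii module (the standard library has no yEnc codec, so only uuencode is shown). The first character of the output encodes the line's byte count.

```python
import binascii

line = binascii.b2a_uu(b"Cat")     # b"#0V%T\n"
restored = binascii.a2b_uu(line)   # b"Cat"

print(line, restored)
```

Here "#" stands for a length of 3, and the four characters after it carry the three input bytes.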

4. Communication and Symbolic Codes

Morse Code

A method used in telecommunication to encode text characters as standardized sequences of two different signal durations, called dots and dashes.
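As a toy illustration, Morse encoding is a table lookup. The table below is deliberately partial (four letters only) and the function name to_morse is made up for this sketch.

```python
# Partial table for illustration only; real Morse covers all letters and digits.
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

def to_morse(text: str) -> str:
    """Encode text letter by letter, separating letters with a space."""
    return " ".join(MORSE[ch] for ch in text.upper())

print(to_morse("sos"))   # ... --- ...
```

The space between letters matters: without it, "..." followed by "." would be indistinguishable from "....".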

NATO Phonetic Alphabet

The most widely used radiotelephony spelling alphabet (Alfa, Bravo, Charlie...), designed so that letters and numbers are pronounced and understood correctly even over a poor connection.

Braille

A tactile writing system used by people who are visually impaired. While not "binary-to-text" in a computer sense, it is a fascinating example of character encoding.


5. Classic Ciphers (Substitution)

These are simple methods for obscuring text, often used for puzzles or for hiding spoilers. They provide no real security and should not be mistaken for encryption.

ROT13 & ROT47

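ROT13 ("rotate by 13 places") is a simple substitution cipher that replaces each letter with the 13th letter after it in the alphabet, wrapping around at Z. Because the alphabet has 26 letters, applying it twice returns the original text, so it is its own inverse. ROT47 applies the same idea to the 94 printable ASCII characters from ! to ~, rotating each by 47 positions, so digits and symbols are scrambled as well.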
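Both rotations fit in a few lines of Python. ROT13 is available through the built-in "rot_13" codec; the rot47 function below is a hand-written sketch, since the standard library does not include one.

```python
import codecs

def rot47(text: str) -> str:
    """Rotate printable ASCII ('!' to '~', codes 33-126) by 47 positions."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            out.append(chr(33 + (code - 33 + 47) % 94))
        else:
            out.append(ch)          # spaces and non-ASCII pass through unchanged
    return "".join(out)

print(codecs.encode("Hello", "rot_13"))   # Uryyb
print(rot47("Hello"))                     # w6==@
print(rot47(rot47("Hello")))              # Hello
```

Since 47 is half of 94, just as 13 is half of 26, the same function both encodes and decodes.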

Caesar Cipher

One of the oldest known substitution ciphers, named after Julius Caesar. It shifts each letter by a fixed number of positions down the alphabet, wrapping around at the end. ROT13 is a Caesar cipher with a shift of 13.
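The shift can be sketched as follows. The function name caesar and the choice to leave non-letters untouched are assumptions made for this example.

```python
def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + (ord(ch) - base + shift) % 26))
        else:
            out.append(ch)          # digits, spaces, punctuation are left as-is
    return "".join(out)

secret = caesar("Attack at dawn", 3)
print(secret)                # Dwwdfn dw gdzq
print(caesar(secret, -3))    # Attack at dawn
```

With only 25 useful shifts, trying every key takes moments, which is why the cipher is suitable for puzzles and nothing more.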


Comparison Table

Encoding | Alphabet Size | Best Use Case
---------|---------------|----------------------
Base64   | 64            | Web data, Email
Base58   | 58            | Crypto addresses
Base32   | 32            | MFA Keys, human entry
Punycode | N/A           | International Domains
Base85   | 85            | PDF, Git

Conclusion

Understanding these encoding schemes is useful for developers and security professionals alike. Whether you are embedding assets in a web page with Base64, making blockchain addresses easier to transcribe with Base58, or ensuring domain compatibility with Punycode, choosing the right encoding is key to data integrity and system interoperability. Keep in mind that none of these encodings provides secrecy: anyone can reverse them.