Why Binary Data Needs Encoding

Text protocols and formats (JSON, XML, HTML, HTTP headers, email) are built to carry characters, not raw bytes. A JSON payload or an HTML page expects valid text in a known character encoding. If you embed raw binary data (an image, a compiled binary, a PDF) directly in these formats, the parser or the transport may corrupt or reject it.

The problem: a byte can hold any value from 0 to 255, but most text formats only guarantee safe passage for printable ASCII characters (32–126). Bytes 0–31 are control characters such as NUL, newline, and escape, and bytes 128–255 are not valid ASCII at all, so text encoders may mangle or reject them. Embedding raw bytes directly is not safe.

Base64 is the solution. It maps any arbitrary binary sequence to a string made entirely of safe, printable ASCII characters. The receiver decodes the string back to the original bytes. This is called binary-to-text encoding.

How Base64 Encoding Works

Base64 converts binary data to a string using a 64-character alphabet:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

The process breaks input into 3-byte groups (24 bits). Each group is then split into four 6-bit values. Each 6-bit value maps to one character in the alphabet:

Input:  "H" (ASCII 72)  = binary 01001000
        "i" (ASCII 105) = binary 01101001
        "!" (ASCII 33)  = binary 00100001

Combined:  010010 000110 100100 100001
           ↑      ↑      ↑      ↑
           18     6      36     33

Map to alphabet:
  18 → S   (index 18)
   6 → G   (index  6)
  36 → k   (index 36)
  33 → h   (index 33)

Result: "SGkh"
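
The bit-splitting steps above can be reproduced in a few lines of Python. This is a minimal sketch for illustration, not how production libraries implement it: the helper name encode_group is made up for this example, and it only handles full 3-byte groups (no padding).

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_group(group: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (hypothetical helper)."""
    if len(group) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into one 24-bit integer.
    bits = int.from_bytes(group, "big")
    # Peel off four 6-bit values, most significant first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

result = encode_group(b"Hi!")
print(result)  # SGkh
# Cross-check against the standard library.
print(result == base64.b64encode(b"Hi!").decode("ascii"))  # True
```

Real encoders also handle short final groups and padding; in practice, use the standard library rather than a hand-rolled version.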

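When the input length isn't a multiple of 3 bytes, the final group is short and padding characters (=) fill the output out to four characters: a 1-byte final group yields 2 Base64 characters plus "==", and a 2-byte final group yields 3 Base64 characters plus "=".

Because every 3 input bytes become 4 output characters, the output is always about 133% of the original size. That's the cost of the ASCII-only representation.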
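
The padding behavior and the 4:3 size ratio can be checked with a short Python sketch. The helper encoded_length is a hypothetical name introduced here; it assumes padded output with no line breaks.

```python
import base64
import math

def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes (hypothetical helper)."""
    return 4 * math.ceil(n / 3)

# 1, 2, and 3 input bytes: two pads, one pad, no pad.
for data in (b"H", b"Hi", b"Hi!"):
    encoded = base64.b64encode(data).decode("ascii")
    print(len(data), encoded, len(encoded) == encoded_length(len(data)))
# 1 SA== True
# 2 SGk= True
# 3 SGkh True

# A 300 KB input grows to 400 KB.
print(encoded_length(300 * 1024) // 1024)  # 400
```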

Base64 vs. URL-Safe Base64

Standard Base64 uses + and / in its output. These characters have special meaning in URLs and filenames — + means "space" in query strings, and / separates path segments. Embedding standard Base64 in a URL requires percent-encoding those characters, which is messy.

URL-safe Base64 (RFC 4648, section 5) replaces the two problematic characters: + becomes - and / becomes _.

Padding is also typically omitted in URL-safe Base64, since the decoder can derive it from the string length. Use this variant whenever you're embedding encoded data in a URL path or query parameter. JWTs use unpadded URL-safe Base64 for exactly this reason.
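
To see the two alphabets side by side, here is a small Python sketch. The input bytes are chosen only because their encoding happens to contain both + and /, and decode_unpadded is a hypothetical helper name.

```python
import base64

raw = bytes([0xFB, 0xFF])  # chosen so the output contains both + and /

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # +/8=
print(url_safe)  # -_8=

# Python keeps the padding; strip it if the consumer expects unpadded base64url.
unpadded = url_safe.rstrip("=")

def decode_unpadded(text: str) -> bytes:
    """Restore padding to a multiple of 4, then decode (hypothetical helper)."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(decode_unpadded(unpadded) == raw)  # True
```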

Common Use Cases

JWT Token Payloads

JSON Web Tokens have the form header.payload.signature, where the header and payload are JSON objects encoded as URL-safe Base64 without padding. Decode the middle segment of any JWT and you'll get a readable JSON object.
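
As a rough illustration, the sketch below builds a made-up, unsigned token and reads its payload back. The helper names (b64url_encode, b64url_decode, read_payload) and the claim values are invented for this example, and the code deliberately does not verify the signature, so it is only suitable for inspection, never for authentication.

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 without padding (hypothetical helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode (hypothetical helper)."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

# Build a made-up token so the example is self-contained.
header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode("utf-8"))
payload = b64url_encode(json.dumps({"sub": "1234", "name": "Example"}).encode("utf-8"))
token = f"{header}.{payload}.signature-placeholder"

def read_payload(jwt: str) -> dict:
    """Decode the middle segment WITHOUT verifying the signature."""
    middle = jwt.split(".")[1]
    return json.loads(b64url_decode(middle))

print(read_payload(token))  # {'sub': '1234', 'name': 'Example'}
```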

Embedding Images in CSS or HTML

Data URIs embed small images directly in HTML or CSS as data:image/png;base64,iVBORw0KGgo.... This eliminates an HTTP request for tiny assets — icons, small decorative images. For anything larger than 1–2 KB, the 33% size overhead and the inability to cache make this a poor trade.
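
A data URI like the one above can be assembled with a few lines of Python. This is a sketch under simple assumptions: to_data_uri is a hypothetical helper, and the 8-byte PNG signature stands in for a real image file.

```python
import base64

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# The 8-byte PNG signature stands in for a real image file here.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature, "image/png")
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

Every PNG file starts with these 8 bytes, which is why Base64-encoded PNGs begin with the same iVBORw0KGgo prefix.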

HTTP Basic Authentication

When you send an Authorization: Basic <credentials> header, the username:password string is Base64-encoded. This is why Basic Auth is not encryption: anyone who sees the header can decode the string and read the credentials.
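
The sketch below shows both directions in Python: building the header value and reversing it without any key. The function name basic_auth_header and the sample credentials are made up for illustration.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header (hypothetical helper)."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header_value = basic_auth_header("user", "pass")
print(header_value)  # Basic dXNlcjpwYXNz

# Anyone who sees the header can reverse it without a key.
recovered = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")
print(recovered)  # user:pass
```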

Embedding Binary in JSON APIs

APIs sometimes embed small file payloads (thumbnails, small attachments) directly in JSON responses as Base64 strings. This avoids multipart encoding or separate file endpoints. For anything larger than a few KB, it's usually better to use a separate binary endpoint and return a URL instead.
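
A minimal round trip might look like the following Python sketch. The field names filename and content_base64 are assumptions made for this example, not part of any standard, and the 16 sample bytes stand in for a small file.

```python
import base64
import json

thumbnail = bytes(range(16))  # stand-in for a small binary file

# Sender: JSON cannot hold raw bytes, so encode them as a Base64 string.
body = json.dumps({
    "filename": "thumb.bin",
    "content_base64": base64.b64encode(thumbnail).decode("ascii"),
})

# Receiver: parse the JSON, then decode the string back to bytes.
received = json.loads(body)
restored = base64.b64decode(received["content_base64"])
print(restored == thumbnail)  # True
```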

Code Examples

JavaScript (browsers and Node.js)

// Strings: btoa/atob (available in all browsers and Node.js 16+)
const plain = "Hello, world!";
const encoded = btoa(plain);       // "SGVsbG8sIHdvcmxkIQ=="
const decoded = atob(encoded);     // "Hello, world!"

// Binary data (Uint8Array) in Node.js
const buf = Buffer.from([0x48, 0x69]);  // bytes [72, 105]
const b64 = buf.toString('base64');     // "SGk="
const back = Buffer.from(b64, 'base64'); // Uint8Array [72, 105]

// Browser: encode binary string
const bytes = new Uint8Array([0x48, 0x69, 0x21]);
const binaryStr = Array.from(bytes).map(b => String.fromCharCode(b)).join('');
const b64str = btoa(binaryStr);  // "SGkh"

// URL-safe Base64 (Node.js)
const urlSafe = buf.toString('base64url');  // "SGk" (no padding)
const fromUrlSafe = Buffer.from(urlSafe, 'base64url');  // Uint8Array [72, 105]

Python

import base64

# String encoding
plain = "Hello, world!"
encoded = base64.b64encode(plain.encode('utf-8')).decode('ascii')
# "SGVsbG8sIHdvcmxkIQ=="

decoded = base64.b64decode(encoded).decode('utf-8')
# "Hello, world!"

# Binary data (bytes)
raw = b'\x48\x69'  # the two bytes of "Hi"
b64 = base64.b64encode(raw).decode('ascii')
# "SGk="

# URL-safe Base64 (- and _ instead of + and /; padding is still emitted)
url_b64 = base64.urlsafe_b64encode(raw).decode('ascii')
# "SGk=" (identical here; differs only when the output would contain + or /)

# Strip the padding explicitly if you need the unpadded form:
url_b64 = base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')
# "SGk"

# URL-safe decode: restore padding to a multiple of 4 first
decoded = base64.urlsafe_b64decode(url_b64 + '=' * (-len(url_b64) % 4))
# b'Hi'

Command Line

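# Encode a file (redirection works with both GNU coreutils and macOS base64)
base64 < image.png > image.txt
# macOS also accepts: base64 -i image.png -o image.txt

# Decode (older macOS releases use -D instead of -d)
base64 -d < image.txt > image_decoded.png

# One-liner: encode a string (printf avoids the trailing newline that echo adds)
printf '%s' "Hello" | base64
# SGVsbG8=

# With echo, the trailing newline is encoded too
echo "Hello" | base64
# SGVsbG8K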

# One-liner: echo decode
echo "SGVsbG8=" | base64 -d
# Hello

Comparison Table

| Property | Standard Base64 | URL-Safe Base64 |
| --- | --- | --- |
| Characters used | A–Z, a–z, 0–9, +, / | A–Z, a–z, 0–9, -, _ |
| Padding | Required (= or ==) | Typically omitted |
| Safe in URLs | No (requires percent-encoding) | Yes |
| Use case | MIME email, JSON, binary files | JWT tokens, URL parameters, path segments |
| RFC | RFC 4648, section 4 | RFC 4648, section 5 |
| Size overhead | ~33% larger than input | ~33% larger than input |

Common Pitfalls

Using Base64 to hide sensitive data

Base64 is not encryption. It is trivial to decode: atob(), base64 -d, and online decoders reverse it instantly. If you need to protect data, encrypt it with AES or use a proper secret management solution. Never rely on Base64 to conceal passwords, API keys, or tokens you want to keep secret.

Encoding UTF-8 strings with btoa in browsers

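btoa() only accepts strings whose characters all fall in the Latin-1 (ISO-8859-1) range, U+0000 to U+00FF. Passing a string with characters outside that range (emoji, CJK, most non-Latin scripts) throws a DOMException (InvalidCharacterError). Accented Latin-1 letters such as é do not throw, but they are encoded as single Latin-1 bytes rather than as UTF-8. The fix: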

// Encode Unicode string to Base64
function encodeUnicode(str) {
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
    (_, p1) => String.fromCharCode(parseInt(p1, 16))));
}

// Decode Base64 to Unicode string
function decodeUnicode(str) {
  return decodeURIComponent(atob(str).split('').map(c =>
    '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
  ).join(''));
}

encodeUnicode("Hello, 世界!");  // "SGVsbG8sIOS4lueVjCE="

Forgetting charset in data URIs

A malformed data URI: data:text,hello world. Correct: data:text/plain;charset=utf-8,hello%20world. The MIME type and charset must match the actual content, otherwise browsers interpret the data incorrectly.
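
For non-Base64 data URIs, the text after the comma must be percent-encoded. The Python sketch below reproduces the corrected URI; text_data_uri is a hypothetical helper name, and it assumes the content is plain text that should be declared as UTF-8.

```python
from urllib.parse import quote

def text_data_uri(text: str) -> str:
    """Build a plain-text data URI with an explicit charset (hypothetical helper)."""
    # quote() percent-encodes the UTF-8 bytes of the text; safe="" also escapes "/".
    return "data:text/plain;charset=utf-8," + quote(text, safe="")

print(text_data_uri("hello world"))  # data:text/plain;charset=utf-8,hello%20world
```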

Size warning: Base64 is 33% larger than the original binary. A 300 KB image becomes 400 KB when Base64-encoded and embedded in a page. For images on the web, this increases download size, and the image can be neither cached separately from the page that embeds it nor fetched in parallel with it. Only inline Base64 images that are genuinely critical-path and very small (< 1–2 KB).

Frequently Asked Questions

What is Base64 encoding?
Base64 is a binary-to-text encoding scheme that represents binary data using 64 printable ASCII characters (A-Z, a-z, 0-9, +, /). Every 3 bytes of input become 4 Base64 characters, with padding characters (=) filling out the final group when needed. It is used to embed binary data safely in text-based formats like JSON, HTML, email, and URLs.
When should you use Base64 encoding?
Use Base64 when you need to embed binary data (images, files, protocol messages) in a text format that only accepts ASCII: JSON APIs, HTML pages, CSS files, email (MIME), XML, or URL query parameters. It is also the standard format for HTTP Basic Auth credentials and JWT token payloads.
Is Base64 a form of encryption?
No. Base64 is encoding, not encryption. It provides zero secrecy — the output can be reversed trivially. Never use Base64 to hide passwords, tokens, API keys, or any sensitive data. For that, use AES, RSA, or a proper secrets manager.
What does Base64 encoding look like?
The string "Hello" encodes to "SGVsbG8=". The output contains only ASCII characters from the Base64 alphabet. The trailing "=" is padding when the input length isn't divisible by 3.
Why does Base64 increase file size by ~33%?
Base64 converts 3 bytes (24 bits) into 4 characters (32 bits). Since each character is 8 bits, the output is 4/3 the size of the input — a 33% overhead. This is the cost of representing arbitrary binary data as printable ASCII text.
How do you encode and decode Base64 in JavaScript?
For Latin-1 strings in browsers and Node.js: use btoa(str) to encode and atob(str) to decode. For binary data (Uint8Array) in Node.js: Buffer.from(data).toString('base64') to encode, Buffer.from(str, 'base64') to decode. For URL-safe Base64: buf.toString('base64url') and Buffer.from(str, 'base64url').
What is the Base64 alphabet?
The 64 characters are: A–Z (0–25), a–z (26–51), 0–9 (52–61), + (62), / (63). Each 6-bit chunk of the input maps to one of these characters. The padding character (=) is not part of the 64; one or two of them fill out the final group when the input doesn't divide evenly into 3-byte groups.
What is the difference between Base64 and URL-safe Base64?
Standard Base64 uses +, /, and =, which have special meaning in URLs and need percent-encoding. URL-safe Base64 (RFC 4648 base64url) replaces + with -, / with _, and typically omits padding. Always use URL-safe Base64 when the encoded string will appear in a URL path or query parameter.
Should you Base64-encode images for web pages?
Rarely. Inline Base64 images (data URIs) eliminate HTTP requests but add 33% size overhead, prevent caching, and block parallel resource loading. For anything over 1–2 KB, the performance cost exceeds the benefit. Use regular <img> tags for actual images. Only inline Base64 for critically-rendered assets below ~1 KB.
What is the padding character '=' in Base64?
Base64 processes input in 3-byte groups. A 1-byte final group produces 2 Base64 chars + "==" padding. A 2-byte final group produces 3 Base64 chars + "=" padding. URL-safe Base64 typically omits padding, because the number of missing characters can be inferred from the string length modulo 4.