Why Binary Data Needs Encoding
Text protocols and formats (JSON, XML, HTML, HTTP headers, email) are built to carry characters, not raw bytes. A JSON payload or an HTML page must be valid text in a known character encoding. If you embed raw binary data (an image, a compiled binary, a PDF) directly into one of these formats, the bytes can be corrupted in transit or make the document unparseable.
The problem: a byte can hold any value from 0 to 255, but many text formats and transports only guarantee safe passage for printable ASCII characters (roughly 32–126). Bytes 0–31 are control characters that can terminate or restructure the text, and bytes 128–255 may be rejected as invalid or re-encoded by a text layer that assumes a particular character set. Embedding raw bytes directly is not safe.
Base64 is the solution. It maps any arbitrary binary sequence to a string made entirely of safe, printable ASCII characters. The receiver decodes the string back to the original bytes. This is called binary-to-text encoding.
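As a quick illustration of that round trip, here is a minimal Python sketch using the standard library's base64 module. The input (every possible byte value) is chosen only for demonstration:

```python
import base64

# Every possible byte value, including control bytes (0-31) and 128-255.
raw = bytes(range(256))

# b64encode returns bytes that contain only ASCII characters.
text = base64.b64encode(raw).decode("ascii")

# Every character of the encoded form is printable ASCII, so it can be
# placed in JSON, XML, or a header without being altered.
only_printable = all(32 < ord(ch) < 127 for ch in text)

# Decoding restores the original bytes exactly.
restored = base64.b64decode(text)
```

The encoded string is longer than the input, but nothing in it depends on how a text layer treats non-ASCII bytes.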
How Base64 Encoding Works
Base64 converts binary data to a string using a 64-character alphabet:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
The process breaks input into 3-byte groups (24 bits). Each group is then split into four 6-bit values. Each 6-bit value maps to one character in the alphabet:
Input:    "H" (ASCII 72)  = binary 01001000
          "i" (ASCII 105) = binary 01101001
          "!" (ASCII 33)  = binary 00100001
Combined: 010010 000110 100100 100001
            ↑      ↑      ↑      ↑
            18     6      36     33
Map to alphabet:
18 → S (index 18)
6 → G (index 6)
36 → k (index 36)
33 → h (index 33)
Result: "SGkh"
When the input length isn't a multiple of 3 bytes, padding characters (=) fill the last group:
- 1 byte remaining → produces 2 Base64 chars + == padding
- 2 bytes remaining → produces 3 Base64 chars + = padding
Every 3 input bytes become 4 output characters, so padded output is exactly 4 × ⌈n/3⌉ characters for n input bytes, or about 133% of the original size. That's the cost of the ASCII-only representation.
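The grouping and padding rules can be sketched as a small hand-written encoder. This is an illustrative implementation only (the function name encode_base64 is made up for this example); in real code you would call base64.b64encode, which the sketch is checked against:

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_base64(data: bytes) -> str:
    """Hypothetical teaching helper: encode bytes as padded standard Base64."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Pack the group into a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(group.ljust(3, b"\x00"), "big")
        # Split into four 6-bit indexes, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A group of k bytes yields k + 1 real characters; the rest is padding.
        keep = len(group) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

print(encode_base64(b"Hi!"))  # SGkh
print(encode_base64(b"Hi"))   # SGk=
print(encode_base64(b"H"))    # SA==

# The sketch agrees with the standard library on a sample input.
sample = bytes(range(50))
matches_stdlib = encode_base64(sample) == base64.b64encode(sample).decode("ascii")
```

Because each loop iteration emits exactly four characters, the output length is always a multiple of 4.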
Base64 vs. URL-Safe Base64
Standard Base64 uses + and / in its output. These characters have
special meaning in URLs and filenames — + means "space" in query strings, and
/ separates path segments. Embedding standard Base64 in a URL requires percent-encoding
those characters, which is messy.
URL-safe Base64 (RFC 4648, section 5) replaces the two problematic characters:
- + → - (minus)
- / → _ (underscore)
Padding is also commonly omitted in URL-safe Base64, since the decoder can derive the missing characters from the string length and = would itself need percent-encoding in a URL. Use this variant whenever you're embedding encoded data in a URL path or query parameter. JWTs use unpadded URL-safe Base64 for exactly this reason.
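To make the difference concrete, the following Python sketch encodes two bytes chosen because they produce both + and / in standard Base64. The helper names b64url_encode and b64url_decode are hypothetical and exist only for this example. Python's urlsafe_b64encode keeps the padding, so the helpers strip and restore it explicitly:

```python
import base64

raw = b"\xfb\xff"  # chosen so the standard encoding contains '+' and '/'

standard = base64.b64encode(raw).decode("ascii")          # "+/8="
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # "-_8="

def b64url_encode(data: bytes) -> str:
    """Hypothetical helper: URL-safe Base64 with the padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Hypothetical helper: restore padding to a multiple of 4, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

unpadded = b64url_encode(raw)        # "-_8"
round_trip = b64url_decode(unpadded)
```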
Common Use Cases
JWT Token Payloads
JSON Web Tokens have the form header.payload.signature, where the header and payload are JSON objects encoded with unpadded URL-safe Base64. Decode the middle segment of any signed JWT and you'll get a readable JSON object.
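Here is a minimal sketch of reading a JWT payload in Python. The token is built inside the example from made-up claims, and its signature is a placeholder string. The code only decodes the payload, it does not verify anything, so never trust claims read this way without checking the signature with a proper JWT library:

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    # Hypothetical helper: unpadded URL-safe Base64, as used in JWT segments.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    # Hypothetical helper: restore the stripped padding before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

# A made-up token for illustration; the third segment is not a real signature.
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890", "name": "Example User"}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(payload).encode("utf-8")),
    "signature-placeholder",
])

# Reading the payload needs no key: split on dots and decode the middle part.
header_segment, payload_segment, _signature = token.split(".")
decoded_payload = json.loads(b64url_decode(payload_segment))
print(decoded_payload["name"])  # Example User
```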
Embedding Images in CSS or HTML
Data URIs embed small images directly in HTML or CSS as
data:image/png;base64,iVBORw0KGgo.... This eliminates an HTTP request for
tiny assets — icons, small decorative images. For anything larger than 1–2 KB, the 33% size
overhead and the inability to cache make this a poor trade.
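Building such a URI is a one-line string operation. The Python sketch below uses a hypothetical helper, to_data_uri, and feeds it only the 8-byte PNG file signature rather than a complete image, which is enough to show where the familiar iVBORw0KGgo prefix comes from:

```python
import base64

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Hypothetical helper: wrap raw bytes in a Base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# The fixed 8-byte signature that starts every PNG file (not a full image).
png_signature = b"\x89PNG\r\n\x1a\n"

uri = to_data_uri(png_signature, "image/png")
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```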
HTTP Basic Authentication
When you send an Authorization: Basic <credentials> header, the credentials are the username:password string encoded with Base64. This is why Basic Auth provides no confidentiality on its own: anyone who sees the header can decode it and read the credentials.
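The sketch below builds a Basic Auth header value in Python and then reverses it. It uses the sample credentials from the HTTP Basic Auth specification; the helper name basic_auth_header is hypothetical:

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Hypothetical helper: build the value of an Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header_value = basic_auth_header("Aladdin", "open sesame")
print(header_value)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# Anyone who can see the header can reverse it without any key.
_scheme, _, token = header_value.partition(" ")
recovered_user, _, recovered_password = (
    base64.b64decode(token).decode("utf-8").partition(":")
)
```

Since the decoding step needs no secret, Basic Auth should only be sent over a connection that is itself encrypted, such as HTTPS.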
Embedding Binary in JSON APIs
APIs sometimes embed small file payloads (thumbnails, small attachments) directly in JSON responses as Base64 strings. This avoids multipart encoding or separate file endpoints. For anything larger than a few KB, it's usually better to use a separate binary endpoint and return a URL instead.
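A minimal Python sketch of that pattern follows. The field names (filename, content_type, data) and the placeholder bytes are assumptions made for this example, not part of any particular API:

```python
import base64
import json

# Placeholder bytes standing in for a small file; not a complete image.
thumbnail = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# Server side: JSON cannot hold raw bytes, so the payload travels as a string.
response_body = json.dumps({
    "filename": "thumb.jpg",
    "content_type": "image/jpeg",
    "data": base64.b64encode(thumbnail).decode("ascii"),
})

# Client side: parse the JSON, then decode the string back to bytes.
parsed = json.loads(response_body)
restored = base64.b64decode(parsed["data"])
```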
Code Examples
JavaScript (browsers and Node.js)
// Strings: btoa/atob (available in all browsers and Node.js 16+)
const plain = "Hello, world!";
const encoded = btoa(plain); // "SGVsbG8sIHdvcmxkIQ=="
const decoded = atob(encoded); // "Hello, world!"
// Binary data in Node.js (Buffer is a subclass of Uint8Array)
const buf = Buffer.from([0x48, 0x69]); // bytes [72, 105]
const b64 = buf.toString('base64'); // "SGk="
const back = Buffer.from(b64, 'base64'); // <Buffer 48 69>
// Browser: encode binary string
const bytes = new Uint8Array([0x48, 0x69, 0x21]);
const binaryStr = Array.from(bytes).map(b => String.fromCharCode(b)).join('');
const b64str = btoa(binaryStr); // "SGkh"
// URL-safe Base64 (Node.js; the 'base64url' encoding needs a recent Node.js release)
const urlSafe = buf.toString('base64url'); // "SGk" (no padding)
const fromUrlSafe = Buffer.from(urlSafe, 'base64url'); // <Buffer 48 69>
Python
import base64
# String encoding
plain = "Hello, world!"
encoded = base64.b64encode(plain.encode('utf-8')).decode('ascii')
# "SGVsbG8sIHdvcmxkIQ=="
decoded = base64.b64decode(encoded).decode('utf-8')
# "Hello, world!"
# Binary data (bytes)
raw = b'\x48\x69' # bytes object, same as b'Hi'
b64 = base64.b64encode(raw).decode('ascii')
# "SGk="
# URL-safe Base64 (- and _ instead of + and /; padding is still included)
url_b64 = base64.urlsafe_b64encode(raw).decode('ascii')
# "SGk=" (same as standard here; differs when the output would contain + or /)
# To drop the padding, strip it explicitly:
url_b64 = base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')
# "SGk"
# URL-safe decode: restore the padding to a multiple of 4 first
padded = url_b64 + '=' * (-len(url_b64) % 4)
decoded = base64.urlsafe_b64decode(padded)
# b'Hi'
Command Line
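# Encode a file (redirection works with both GNU coreutils and macOS base64)
base64 < image.png > image.txt
# Decode (GNU coreutils uses -d; some macOS versions need -D instead)
base64 -d < image.txt > image_decoded.png
# One-liner: encode a string (printf avoids the trailing newline that echo adds)
printf 'Hello' | base64
# SGVsbG8=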
# One-liner: echo decode
echo "SGVsbG8=" | base64 -d
# Hello
Comparison Table
| Property | Standard Base64 | URL-Safe Base64 |
|---|---|---|
| Characters used | A–Z, a–z, 0–9, +, / | A–Z, a–z, 0–9, -, _ |
| Padding | Required (= or ==) | Typically omitted |
| Safe in URLs | No — requires percent-encoding | Yes |
| Use case | MIME email, JSON, binary files | JWT tokens, URL parameters, path segments |
| RFC | RFC 4648, section 4 | RFC 4648, section 5 |
| Size overhead | ~33% larger than input | ~33% larger than input |
Common Pitfalls
Using Base64 to hide sensitive data
Base64 is not encryption. It is trivial to decode: atob(), base64 -d, and online decoders reverse it instantly. If you need to protect data, encrypt it with an established cipher such as AES and keep the keys in a proper secret management solution. Never rely on Base64 alone to protect passwords, API keys, or tokens you want to keep secret.
Encoding UTF-8 strings with btoa in browsers
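btoa() only accepts strings whose characters all fall in the Latin-1 (ISO-8859-1) range, U+0000 to U+00FF. Passing a string with any character outside that range (emoji, CJK, most non-Western-European letters) throws a DOMException (InvalidCharacterError). The fix is to convert the string to UTF-8 bytes first: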
// Encode Unicode string to Base64
function encodeUnicode(str) {
return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
(_, p1) => String.fromCharCode(parseInt(p1, 16))));
}
// Decode Base64 to Unicode string
function decodeUnicode(str) {
return decodeURIComponent(atob(str).split('').map(c =>
'%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
).join(''));
}
encodeUnicode("Hello, 世界!"); // "SGVsbG8sIOS4lueVjCE="
Forgetting charset in data URIs
A malformed data URI: data:text,hello world. Correct:
data:text/plain;charset=utf-8,hello%20world. The MIME type and charset
must match the actual content, otherwise browsers interpret the data incorrectly.
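For plain-text data URIs, the percent-encoding step is easy to get right with a library call. This Python sketch uses urllib.parse.quote and a hypothetical helper name, text_data_uri, and assumes the text should be declared as UTF-8:

```python
from urllib.parse import quote, unquote

def text_data_uri(text: str) -> str:
    """Hypothetical helper: build a percent-encoded text/plain data URI."""
    # quote() encodes the text as UTF-8 and escapes everything unsafe;
    # safe="" makes it escape "/" as well.
    return "data:text/plain;charset=utf-8," + quote(text, safe="")

uri = text_data_uri("hello world")
print(uri)  # data:text/plain;charset=utf-8,hello%20world

# The part after the first comma decodes back to the original text.
original = unquote(uri.split(",", 1)[1])
```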