v0.1 — interactive · real SHA-256 from scratch · birthday-attack collision finder

SHA & HMAC, hand-cranked

Hashes are the workhorse of cryptography — every signature, every KDF, every commitment scheme is built on top of one. Watch SHA-256's Merkle-Damgård compression chew through a message round by round, see the SHA-3 sponge in diagram form, build HMAC by hand, and find a real collision in a truncated hash through the birthday paradox.

/01

what hashes do

// property 1

Deterministic

Same input always yields the same output. Every time, on every machine.

// property 2

Fixed-size output

Hash a single byte or a 4 GB file — output is the same length. SHA-256 is always 32 bytes.

// property 3

Avalanche

Flip one input bit and roughly half the output bits flip — unpredictably scattered.

// property 4

Pre-image resistant

Given a hash, no efficient way to find any input that produces it. One-way.

// property 5

Collision resistant

No efficient way to find two different inputs that produce the same hash.

That last property is the most interesting cryptographically. The output space of SHA-256 is 2256 values; the input space is infinite, so collisions must exist. The guarantee is just that you can't find one — even with arbitrary computational resources by today's standards.

The bound isn't 2256 tries, though. By the birthday paradox, you only need ~2n/2 tries to find a collision in an n-bit hash. So SHA-256's collision resistance is really 128 bits of work — which is still far beyond reach, but it's why nobody seriously proposes a 128-bit cryptographic hash.

We'll find a real collision in a truncated SHA-256 below. With 24 bits of output, the birthday-attack expected work is just 212 = 4096 tries — a fraction of a second.

/02

SHA-256, hand-cranked

SHA-256 reads a message in 512-bit (64-byte) blocks and threads a 256-bit state through them. The structure is Merkle-Damgård: an initial hash H₀, a compression function C(state, block) → state, and a final output. The compression function is where all the cryptographic action happens.

Merkle-Damgård flow SHA-256
// padded message (519 bytes)
Block 0 · Round 0 of 64 INITIAL
step 0/64
// what's happening

SHA-256 in one paragraph

SHA-256 maintains 8 32-bit working variables (a through h) initialized to fixed constants (H₀ — the fractional parts of square roots of small primes). For each 64-byte block, the message is expanded into 64 32-bit words (W₀…W₆₃), and 64 rounds of mixing run.

Each round combines the current state with a round constant K[i] (from cube roots of small primes) and the corresponding message word W[i], mostly through 32-bit XORs, rotations, and additions modulo 232. After 64 rounds, the new working state is added back to the previous H and we move to the next block.

Press Next to step through one round at a time. Skip block jumps to the end of the current block's compression. The 8 working variables are shown in the grid above; cells highlight when their value just changed.

/03

the sponge — SHA-3

SHA-3 isn't a faster SHA-2. It's a totally different construction designed as a hedge: if anyone ever found a flaw in Merkle-Damgård, the world would have a fallback that doesn't share the same mathematical structure.

Instead of a small state threaded through compression rounds, SHA-3 uses a giant sponge: a 1600-bit state that absorbs message blocks by XOR'ing them in, then squeezes hash output bits out, with a permutation applied between operations. The state is organized as a 5×5×64 cube; the permutation does five things (theta, rho, pi, chi, iota) for 24 rounds.

// absorb XOR each message block into the rate; permute; repeat M₁ M₂ M₃ rate (1088 b) capacity (512 b) rate capacity rate capacity f 24 rds f 24 rds f 24 rds // squeeze read output from the rate; permute again if you need more rate 256b SHA3-256 hash

For SHA3-256 specifically, the rate is 1088 bits (136 bytes) and the capacity is 512 bits. Each message block is 136 bytes; the permutation runs after each absorb. To produce 256 bits of output, a single squeeze of the rate is enough.

A consequence of the structure: SHA-3 is naturally extensible. The same primitive supports arbitrary output lengths (SHAKE-128, SHAKE-256) just by squeezing more bits, with extra permutations in between. ML-KEM and ML-DSA use SHAKE internally as their pseudo-random generator — which is why post-quantum cryptography quietly depends on SHA-3 even though it wasn't strictly required to.

Performance-wise, SHA-3 in software is somewhat slower than SHA-256. In hardware (where Keccak's bitwise structure shines), the gap narrows. For most applications, SHA-256 remains the practical default; SHA-3 is the strategic backup.

/04

HMAC — authenticate with a key

A plain hash is unkeyed: anyone can compute it, anyone can verify it. That's useless for authentication — a man-in-the-middle could rewrite a message and recompute the hash. HMAC adds a secret key to a hash, producing an authenticator that only someone with the key can generate or verify.

The construction is bizarrely simple: hash the message twice, with a key XOR'd against two different fixed pads. That's it. The cleverness is in why it's secure — the double-hash structure is what saves it from length-extension attacks that affect plain SHA-256(key || message).

// the formula

Why two passes?

HMAC(K, m) = H((K' ⊕ opad) ∥ H((K' ⊕ ipad) ∥ m))

Where K' is the key normalized to the hash's block size (64 bytes for SHA-256): hashed if too long, zero-padded if too short. ipad = 0x36 repeated 64 times, opad = 0x5C repeated 64 times. Two specific constants chosen so that ipad ⊕ opad has high Hamming weight — which the security proof needs.

The two passes break length-extension: the inner hash produces a 32-byte digest with no key material in the visible output; the outer hash adds the key again so the final tag depends on the key in a way that can't be extended without knowing the key. With plain H(K ∥ m), an attacker who saw a tag could append bytes and produce a valid tag for the longer message — a real attack that broke MD5-based MACs in the wild.

HMAC is what TLS uses for cookie integrity, what JWT uses for token signing, what HKDF uses internally as its PRF. It's the de facto way to combine a key and a message into an authenticator built from a hash.

/05

birthday-attack collision finder

The birthday paradox: in a room of 23 people, there's a 50% chance two share a birthday. The math: with N possible birthdays and k people, the probability of a collision is ~k²/2N. Set that to 0.5 and you get k ≈ √N = √365 ≈ 19 for actual birthdays (closer to 23 once you account for higher-order terms).

For an n-bit hash function, N = 2n, so you find a collision after roughly 2n/2 tries — not the 2n you might naively expect. That's why a 128-bit hash gives only 64 bits of collision resistance.

Below: real SHA-256, truncated to whatever bit length you choose. Click Find collision and watch the simulator hash random inputs until two of them produce the same truncated digest.

tries so far
0
birthday expected
unique digests seen
0
collision
searching…
// collision found
message A:
SHA-256(A):
message B:
SHA-256(B):
truncated:
// what this proves

Real SHA-256 is fine; truncated SHA-256 is not

The collisions you find above aren't bugs in SHA-256 — they're a consequence of throwing away most of the digest. With 24 bits of output (3 bytes), the birthday bound is just 4096 tries. With the full 256 bits, it's 2128 tries, which (at modern bitcoin-network speeds of ~1020 hashes per second) would take ~1018 years — about 100 million times the age of the universe.

That's why hash truncation is hazardous. SHA-256/128 (truncate to 128 bits) gives 64-bit collision resistance — barely enough today, definitely not for a 50-year secret. MD5 (128 bits) was once expected to give 64-bit collision resistance; in 2004 Wang found explicit collisions in seconds via a non-birthday attack, and MD5 has been retired from cryptographic use ever since.

SHA-1 (160 bits) followed in 2017, when Google demonstrated the first practical SHA-1 collision (the "SHAttered" attack). For new applications, SHA-256 minimum; SHA-384 or SHA-512 for things that need to last decades.

/06

the hash families

name output size family status notes
MD5 128 bit Merkle-Damgård BROKEN Wang 2004 collision attack. Still widely used as a non-cryptographic checksum; never use for security.
SHA-1 160 bit Merkle-Damgård BROKEN SHAttered 2017 — Google found two PDFs with the same SHA-1. Retired from TLS, certificates, signatures. Still in some legacy git contexts (transitioning to SHA-256).
SHA-256 256 bit Merkle-Damgård (SHA-2) GOOD Workhorse of TLS, signatures, blockchains, Git (since 2.42). Vulnerable to length-extension; use HMAC-SHA-256 if that matters.
SHA-384 / SHA-512 384 / 512 bit Merkle-Damgård (SHA-2) GOOD Larger SHA-2 variants. Use when you need post-quantum hash collision resistance (Grover halves the bound, so 256-bit hash gives 128-bit PQ collision resistance).
SHA3-256 / SHA3-512 256 / 512 bit Sponge (Keccak) GOOD NIST's structural alternative to SHA-2. Standardized 2015 (FIPS 202). Used internally by ML-KEM and ML-DSA via SHAKE.
SHAKE-128 / SHAKE-256 arbitrary Sponge (Keccak) GOOD Variable-length output extendable-output functions. Same Keccak permutation, squeeze more bits as needed. The PRF for ML-KEM/ML-DSA.
BLAKE2 / BLAKE3 256 / 256 bit Compression tree GOOD Faster than SHA-2 in software. BLAKE2 in libsodium and OpenSSH; BLAKE3 in Cargo, content-addressable storage. Not standardized by NIST.
// when to use which

Practical guidance

For most things, use SHA-256. It's standardized everywhere, hardware-accelerated on every modern CPU (SHA-NI on x86, ARMv8 crypto extensions), and well-understood after 25 years of analysis.

For HMAC, use HMAC-SHA-256. The pad-and-double-hash construction immunizes against length-extension and gives you a proper keyed pseudo-random function.

For post-quantum applications, use SHA-384 or larger. Grover's algorithm halves the effective collision-resistance bound, so a 256-bit hash provides 128-bit post-quantum collision resistance — enough for now, but a 384-bit hash gives 192-bit margin without much extra cost.

For variable-length output (key derivation, commitment schemes, lattice cryptography), use SHAKE-128 or SHAKE-256. The sponge structure handles arbitrary lengths natively.