v0.1 — interactive · ciphers from the BC era · breakable in milliseconds

Caesar & Vigenère, broken in your browser

Two thousand years of cryptography start here — with a single letter shift and the polyalphabetic key that hid messages for three centuries before frequency analysis caught up. Encrypt with both, then watch a browser-side attacker strip them down to plaintext using nothing but the statistics of English.

/01

what's a substitution cipher?

classical

A substitution cipher replaces each letter of the plaintext with a different letter, according to a fixed rule. The rule — the key — is what the sender and receiver share; everything else is public. That separation between the algorithm and the secret is one of the oldest and most important ideas in cryptography.

Caesar's cipher (used by Julius Caesar around 50 BC) is the simplest possible substitution: shift every letter by a fixed amount. The key is a single number from 0 to 25. Shift 0 leaves the text unchanged, so there are only 25 useful keys, and the entire keyspace fits in a sentence.

Vigenère's cipher (described by Giovan Battista Bellaso in 1553 and later misattributed to Blaise de Vigenère, who published a related cipher in 1586) makes the shift change with each letter, cycling through the letters of a keyword. For three centuries it was called le chiffre indéchiffrable (the indecipherable cipher), until Charles Babbage broke it around 1854 and Friedrich Kasiski independently published a general attack in 1863.

Both ciphers are completely broken today by tools that fit in a few hundred lines of code. We'll watch them break below — but the important takeaway isn't that they're weak. It's that the design failure that breaks them — leaking the statistical structure of the underlying language — is the same failure mode that haunts modern cryptography too. AES is engineered precisely to avoid this leak.

/02

Caesar — shift every letter

substitution

// interactive demo: plaintext and shift k (default 3) in; plaintext alphabet, shifted alphabet, plaintext, and ciphertext out

// what's happening

Each letter gets the same shift

For a shift of k, plaintext letter P maps to ciphertext letter C = (P + k) mod 26. With k = 3 — Caesar's actual shift, according to Suetonius — A becomes D, B becomes E, and so on, with X, Y, Z wrapping around to A, B, C.

Decryption is just the inverse shift: P = (C − k) mod 26. Encryption and decryption are the same operation with opposite signs, so decrypting with k is the same as encrypting with 26 − k.
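
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, assuming uppercase A to Z with A = 0; the function name `caesar` and the choice to pass non-letters through unchanged are assumptions of this sketch, not something the demo specifies.

```python
import string

ALPHABET = string.ascii_uppercase  # 26 letters, A=0 ... Z=25


def caesar(text: str, k: int) -> str:
    """Shift every letter of `text` by k positions; non-letters pass through."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            # C = (P + k) mod 26; a negative k undoes the shift.
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)
    return "".join(out)


ciphertext = caesar("ATTACK AT DAWN", 3)
plaintext = caesar(ciphertext, -3)  # decryption is the same routine with -k
print(ciphertext)  # DWWDFN DW GDZQ
print(plaintext)   # ATTACK AT DAWN
```

Python's `%` always returns a non-negative result for a positive modulus, which is why the same function handles the wrap-around in both directions.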

Notice the keyspace: 25 non-trivial shifts (shift 0 is the identity). An attacker can try them all in microseconds without any statistical model of the language, then pick out the one candidate that reads as text. A keyspace this small makes brute force trivially feasible.
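
A brute-force pass is short enough to show in full. The sketch below simply prints every candidate and leaves the "which one reads as language?" step to the reader; `all_shifts` is a hypothetical helper name.

```python
def caesar(text: str, k: int) -> str:
    """Shift A-Z letters by k (mod 26); leave everything else unchanged."""
    return "".join(
        chr((ord(c) - 65 + k) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def all_shifts(ciphertext: str) -> list[tuple[int, str]]:
    """Return (k, candidate plaintext) for every possible shift k."""
    return [(k, caesar(ciphertext, -k)) for k in range(26)]


for k, candidate in all_shifts("DWWDFN DW GDZQ"):
    print(f"k={k:2d}  {candidate}")
```

Only the line for k = 3 is readable English (ATTACK AT DAWN); the other 25 are gibberish.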

/03

breaking Caesar — frequency analysis

attacker

Even without trying every shift, you can spot the right one instantly: letter frequencies in English are wildly uneven. E is ~12.7%, T is ~9.1%, A is ~8.2%; while Z is 0.07%, Q is 0.10%, J is 0.15%. A Caesar shift just rotates the histogram — the shape stays identical.
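
The "rotated histogram" claim is easy to check directly. A small sketch, assuming A to Z only; `histogram` and `rotate` are illustrative names.

```python
import string
from collections import Counter


def caesar(text: str, k: int) -> str:
    """Shift A-Z letters by k (mod 26); leave everything else unchanged."""
    return "".join(
        chr((ord(c) - 65 + k) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def histogram(text: str) -> list[int]:
    """Letter counts in A..Z order."""
    counts = Counter(c for c in text.upper() if c in string.ascii_uppercase)
    return [counts[c] for c in string.ascii_uppercase]


def rotate(values: list[int], k: int) -> list[int]:
    """Move every entry k places to the right, wrapping around."""
    k %= len(values)
    return values[-k:] + values[:-k]


plain = "MEET ME AT THE USUAL PLACE AT TEN"
k = 3
h_plain = histogram(plain)
h_cipher = histogram(caesar(plain, k))
print(rotate(h_plain, k) == h_cipher)  # True
```

The counts themselves never change; only the letter each count is attached to does.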

So the attack is: for each candidate shift, undo it and compare the resulting letter histogram to the expected English histogram. The shift that minimizes the difference (measured by the chi-squared statistic) is almost always the right one, given more than a few dozen letters of ciphertext.
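
Here is one way the chi-squared attack might look in Python. It is a sketch under assumptions: the frequency table holds commonly quoted approximate percentages for English (exact values vary by corpus), and the names `chi_squared` and `break_caesar` are made up for this example.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent, A through Z (assumed table).
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]


def caesar(text: str, k: int) -> str:
    """Shift A-Z letters by k (mod 26); leave everything else unchanged."""
    return "".join(
        chr((ord(c) - 65 + k) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def chi_squared(text: str) -> float:
    """Sum over letters of (observed - expected)^2 / expected; lower is more English."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n == 0:
        return float("inf")
    counts = Counter(letters)
    score = 0.0
    for letter, pct in zip(string.ascii_uppercase, ENGLISH_PERCENT):
        expected = n * pct / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score


def break_caesar(ciphertext: str) -> tuple[int, str]:
    """Return the shift whose removal scores best, and the resulting text."""
    k = min(range(26), key=lambda s: chi_squared(caesar(ciphertext, -s)))
    return k, caesar(ciphertext, -k)


MESSAGE = (
    "IT WAS THE BEST OF TIMES IT WAS THE WORST OF TIMES "
    "IT WAS THE AGE OF WISDOM IT WAS THE AGE OF FOOLISHNESS"
)
shift, recovered = break_caesar(caesar(MESSAGE, 3))
print(shift, recovered)
```

On very short ciphertexts the histogram is too noisy for this to be reliable, which is why the sample message runs to a few dozen words.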

// interactive demo: ciphertext in; bar chart of observed letter frequency in the ciphertext vs. expected English frequency

// why this works

Statistics > brute force

Even if Caesar had used a thousand-letter alphabet (1000 possible keys), the attack still works in microseconds. The keyspace size doesn't matter — what matters is whether the ciphertext leaks statistical structure.

Modern stream ciphers (ChaCha20, or AES run in counter mode as AES-CTR) are designed so that the ciphertext is computationally indistinguishable from uniform random noise, regardless of the plaintext. No matter how many bytes you intercept, the frequency histogram looks flat. That's the property classical ciphers fail at, and the property modern ones are built to provide.

/04

Vigenère — a key that cycles

polyalphabetic

Vigenère's clever idea: instead of one shift for every letter, use a repeating keyword. Each plaintext letter is shifted by the corresponding letter of the key (cycled to match plaintext length). Different positions get different shifts, so the same plaintext letter doesn't always become the same ciphertext letter.

That destroys the simple frequency histogram that breaks Caesar. The character "E" in plaintext might map to "M" in one position and "Q" in another, depending on which key letter is currently active. For a 5-letter key, every position uses one of 5 different Caesar shifts, cycling.

// interactive demo: plaintext and key in; plaintext, cycled key, and ciphertext shown letter by letter

// what's happening

One Caesar shift per key position

For each plaintext letter at position i, the cipher uses the key letter at position i mod len(key) as the shift. Mathematically: C[i] = (P[i] + K[i mod n]) mod 26.

With key LEMON (L=11, E=4, M=12, O=14, N=13), position 0 of the plaintext is shifted by 11, position 1 by 4, position 2 by 12, position 3 by 14, position 4 by 13, then position 5 by 11 again, and so on.
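
The LEMON walk-through translates directly into code. A minimal sketch, assuming a non-empty key made of letters only; letting spaces and punctuation pass through without consuming a key letter is a convention chosen for this example, and other implementations differ.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """C[i] = (P[i] + K[i mod n]) mod 26, or the inverse when decrypt=True."""
    sign = -1 if decrypt else 1
    out = []
    i = 0  # counts letters only, so spaces do not use up key letters
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = ord(key[i % len(key)].upper()) - ord("A")
            out.append(chr((ord(ch) - 65 + sign * shift) % 26 + 65))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


print(vigenere("ATTACKATDAWN", "LEMON"))                # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN
```

Note how the two T's at positions 1 and 2 become X and F, and the A's at positions 0, 3, and 6 become L, O, and E: the same plaintext letter no longer maps to a single ciphertext letter.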

For 300 years this was thought unbreakable. The flattened frequency distribution of the ciphertext (letter counts are far more even than in English, though not perfectly uniform) hid the underlying English structure, at least until someone realized that the repeating key, and in particular its length, was the chink in the armor.

/05

breaking Vigenère — Kasiski + frequency

attacker

Friedrich Kasiski (1863) and Charles Babbage (~1854, unpublished) independently noticed that if you can recover the key length, Vigenère collapses into a stack of independent Caesar ciphers — one per key position — and each falls to plain frequency analysis.

Kasiski's own method measured the distances between repeated fragments of ciphertext, which tend to be multiples of the key length. The modern way to find the key length is the Index of Coincidence (IoC): the probability that two randomly chosen letters of the text are the same. English text has IoC ≈ 0.067. Uniformly random text has IoC ≈ 1/26 ≈ 0.0385. If you split a Vigenère ciphertext into n streams (every nth letter into the same stream), the streams will have English-like IoC only when n is a multiple of the key length. Otherwise each stream mixes several shifts and its IoC drops toward the random value.
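
The key-length search can be sketched as follows. Everything named here is an assumption of the example: the sample text, the helper names, and especially the cutoff, which is simply taken halfway between the English and uniform IoC values quoted above. The smallest n that clears the cutoff is returned, since multiples of the key length clear it too.

```python
from collections import Counter
from typing import Optional

ENGLISH_IOC, UNIFORM_IOC = 0.067, 1 / 26


def letters_only(text: str) -> str:
    return "".join(c for c in text.upper() if "A" <= c <= "Z")


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement are equal."""
    s = letters_only(text)
    n = len(s)
    if n < 2:
        return 0.0
    return sum(f * (f - 1) for f in Counter(s).values()) / (n * (n - 1))


def average_stream_ioc(ciphertext: str, n: int) -> float:
    """Split into n streams (every nth letter) and average their IoC."""
    s = letters_only(ciphertext)
    return sum(index_of_coincidence(s[i::n]) for i in range(n)) / n


def guess_key_length(ciphertext: str, max_len: int = 10) -> Optional[int]:
    threshold = (ENGLISH_IOC + UNIFORM_IOC) / 2  # assumed cutoff
    for n in range(1, max_len + 1):
        if average_stream_ioc(ciphertext, n) > threshold:
            return n
    return None


def vigenere_encrypt(plaintext: str, key: str) -> str:
    s = letters_only(plaintext)
    return "".join(
        chr((ord(c) - 65 + ord(key[i % len(key)]) - 65) % 26 + 65)
        for i, c in enumerate(s)
    )


SAMPLE = (
    "Cryptography is the practice of protecting information so that only the "
    "intended reader can understand it. For most of history that meant replacing "
    "the letters of a message with other letters according to a secret rule shared "
    "between the sender and the receiver. Anyone who intercepted the message would "
    "see only a meaningless string of characters, unless they could work out the "
    "rule that had been used to create it. The story of classical ciphers is the "
    "story of attackers learning to recover that rule from the ciphertext alone, "
    "using nothing more than patience and the statistics of the language."
)
ciphertext = vigenere_encrypt(SAMPLE, "LEMON")
for n in range(1, 11):
    print(n, round(average_stream_ioc(ciphertext, n), 4))
print("guessed key length:", guess_key_length(ciphertext))
```

With a few hundred letters of ciphertext the averages for n = 5 and n = 10 should stand out clearly from the rest; with much shorter texts the streams get too small for the statistic to be trustworthy.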

// interactive demo: ciphertext input

// why this works

Vigenère wasn't a new cipher. It was 5 Caesars in a coat.

Once you know the key length is, say, 5, you treat the ciphertext as five interleaved Caesar ciphertexts: positions 0, 5, 10, 15, … all used the same shift, and so on for each offset. Each of those streams is just a Caesar cipher, breakable by frequency analysis as before.

Stitch the recovered shifts back together in the right order, convert each shift to a letter (0=A, 1=B, …), and you have the keyword. Decrypt with the keyword, and you've recovered the plaintext.
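
Putting the pieces together, a key-recovery pass might look like the sketch below. It assumes the key length is already known (5 here), reuses an approximate English frequency table, and uses illustrative names such as `best_shift` and `recover_key`; it is one possible arrangement of the steps described above, not the page's own implementation.

```python
import string
from collections import Counter

A = string.ascii_uppercase
# Approximate English letter frequencies in percent, A through Z (assumed table).
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]


def letters_only(text: str) -> str:
    return "".join(c for c in text.upper() if c in A)


def vigenere(text: str, key: str, sign: int = 1) -> str:
    """Shift the i-th letter by sign * key[i mod n]; expects uppercase letters only."""
    return "".join(
        A[(A.index(c) + sign * A.index(key[i % len(key)])) % 26]
        for i, c in enumerate(text)
    )


def chi_squared(stream: str) -> float:
    """Distance between the stream's histogram and English; lower is closer."""
    n = len(stream)
    counts = Counter(stream)
    return sum(
        (counts[c] - n * p / 100) ** 2 / (n * p / 100)
        for c, p in zip(A, ENGLISH_PERCENT)
    )


def best_shift(stream: str) -> int:
    """Caesar shift whose removal makes the stream look most like English."""
    return min(range(26), key=lambda k: chi_squared(vigenere(stream, A[k], -1)))


def recover_key(ciphertext: str, key_length: int) -> str:
    """Break each interleaved stream as a Caesar cipher, then map shifts to letters."""
    s = letters_only(ciphertext)
    return "".join(A[best_shift(s[i::key_length])] for i in range(key_length))


SAMPLE = (
    "Cryptography is the practice of protecting information so that only the "
    "intended reader can understand it. For most of history that meant replacing "
    "the letters of a message with other letters according to a secret rule shared "
    "between the sender and the receiver. Anyone who intercepted the message would "
    "see only a meaningless string of characters, unless they could work out the "
    "rule that had been used to create it. The story of classical ciphers is the "
    "story of attackers learning to recover that rule from the ciphertext alone, "
    "using nothing more than patience and the statistics of the language."
)
ciphertext = vigenere(letters_only(SAMPLE), "LEMON")
key = recover_key(ciphertext, 5)
print(key)
print(vigenere(ciphertext, key, -1)[:40])
```

Each stream here has roughly a hundred letters, which is comfortably enough for the chi-squared comparison to single out the right shift.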

Total computational cost: tens of thousands of simple operations for a message of a few hundred letters. The cipher that held the title of indéchiffrable for 300 years now falls in a fraction of a second on a phone.

/06

what survives, what doesn't

Caesar and Vigenère are useless for hiding messages today, but two of their ideas survive verbatim in modern cryptography:

1. The algorithm-key separation. Caesar's attacker knew the algorithm (shift letters) but didn't know the shift. That framework is now Kerckhoffs's principle: a cryptosystem should be secure even if everything but the key is public. AES, RSA, and ML-KEM all follow this; their algorithms are published in open standards and only the keys are secret.

2. The polyalphabetic idea. Vigenère's "different shift per position" matures into the stream cipher: a key-derived sequence of bytes XOR'd into the plaintext, where each byte's "shift" is essentially random. ChaCha20 and AES-CTR are 21st-century Vigenères whose keystream is computationally indistinguishable from random and does not repeat within a message, so the IoC trick can't find anything to grab onto.
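
To make the second point concrete, here is a toy XOR stream cipher. It is only an illustration of the shape of the idea: SHAKE-256 from Python's standard `hashlib` stands in for a real keystream generator, there is no nonce (so one key must never encrypt two messages), and this is not ChaCha20 or AES-CTR. The names `keystream` and `xor_stream` are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: `length` bytes of SHAKE-256 output derived from the key."""
    return hashlib.shake_256(key).digest(length)


def xor_stream(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the matching keystream byte; the same call decrypts."""
    return bytes(d ^ s for d, s in zip(data, keystream(key, len(data))))


message = b"ATTACK AT DAWN " * 20
ciphertext = xor_stream(message, b"LEMON")
print(len(set(message)), "distinct byte values in the plaintext")
print(len(set(ciphertext)), "distinct byte values in the ciphertext")
print(xor_stream(ciphertext, b"LEMON") == message)  # True
```

Compared with Vigenère, the "key" being cycled is as long as the message and each position's shift is unrelated to the others, so splitting the ciphertext into streams no longer produces anything that looks like a Caesar cipher.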

What dies with Caesar and Vigenère is the assumption that small keyspaces or simple structures can hide plaintext from a determined adversary with statistical tools. Every modern primitive is tested by the same question: "could the ciphertext distribution be distinguished from random with any computational budget short of the keyspace size?" The answer for AES, after 25 years of attack, is still no. The answer for Caesar after about 5 minutes was yes.

// the lineage

From letters to bits

The arc from Caesar to AES is a long one: 2000+ years of "what if we made the alphabet bigger / the key longer / the shift smarter." Every step taught the field something. Caesar taught the value of a secret rule. Vigenère taught the limits of polyalphabetic thinking. The Enigma machine pushed polyalphabetic to its electromechanical extreme. Shannon (1949) finally formalized what "perfect secrecy" means and proved that it requires a key at least as long as the message, as in the one-time pad.

Modern cryptography is the art of getting security that is computationally indistinguishable from a one-time pad's, with practical key sizes. AES, ChaCha20, and HMAC are believed to meet their security goals against classical attackers. ML-KEM and ML-DSA are designed to hold up against quantum ones as well. None of this would have made sense without first understanding why Caesar fails.