Simple substitution ciphers substitute one letter of the alphabet for another, in some random arrangement. Here is an example of a simple substitution cipher:
A | → | D |
B | → | V |
C | → | N |
D | → | X |
E | → | I |
F | → | K |
G | → | S |
H | → | W |
I | → | Y |
J | → | Z |
K | → | T |
L | → | Q |
M | → | M |
N | → | H |
O | → | P |
P | → | A |
Q | → | U |
R | → | F |
S | → | L |
T | → | G |
U | → | E |
V | → | O |
W | → | B |
X | → | C |
Y | → | R |
Z | → | J |
To use this substitution cipher table to write a coded message, substitute each letter in the original message with the substitute letter from the table. For example, the message:
WHO WROTE THIS MESSAGE
would be coded as:
BWP BFPGI GWYL MILLDSI
On the other hand, if someone got the coded message:
BWIFI DFI RPE
they could use the cipher table to decode the message as:
WHERE ARE YOU
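The table lookup described above can also be sketched in a few lines of Python. This is only an illustration: the names `PLAIN`, `CIPHER`, `encode` and `decode` are made up for this example, and `CIPHER` is simply the second column of the first table read from top to bottom.

```python
import string

PLAIN = string.ascii_uppercase
# Second column of the first table, read top to bottom (A→D, B→V, ..., Z→J).
CIPHER = "DVNXIKSWYZTQMHPAUFLGEOBCRJ"

# Translation tables for both directions; characters not in the
# table (such as spaces) are left unchanged by str.translate.
ENCODE_TABLE = str.maketrans(PLAIN, CIPHER)
DECODE_TABLE = str.maketrans(CIPHER, PLAIN)

def encode(message):
    """Replace each letter with its substitute from the table."""
    return message.upper().translate(ENCODE_TABLE)

def decode(coded):
    """Replace each coded letter with the original letter."""
    return coded.upper().translate(DECODE_TABLE)

print(encode("WHO WROTE THIS MESSAGE"))  # BWP BFPGI GWYL MILLDSI
print(decode("BWIFI DFI RPE"))           # WHERE ARE YOU
```

Decoding is just the same lookup run backwards, which is why the two translation tables are built from the same pair of strings in opposite order.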
Julius Caesar used a simple substitution cipher to send secret messages. His substitution cipher consisted of shifting the letters of the alphabet 3 to the left, with the first 3 letters shifting to the end of the alphabet:
A | → | X |
B | → | Y |
C | → | Z |
D | → | A |
E | → | B |
F | → | C |
G | → | D |
H | → | E |
I | → | F |
J | → | G |
K | → | H |
L | → | I |
M | → | J |
N | → | K |
O | → | L |
P | → | M |
Q | → | N |
R | → | O |
S | → | P |
T | → | Q |
U | → | R |
V | → | S |
W | → | T |
X | → | U |
Y | → | V |
Z | → | W |
In The Adventure of the Dancing Men, by Arthur Conan Doyle, Sherlock Holmes is shown a page from a notebook with the following markings made in pencil:
Holmes guesses correctly that this is a simple substitution cipher in which the little dancing figures are substituted for letters of the alphabet. Using his knowledge of how often different letters occur in English, Holmes is able to decode the message.
Letters of the alphabet can be chosen at random to make the second column of a simple substitution cipher.
(1) For example, the 26 letters of the alphabet, written on pieces of card, can be placed in a container which is shaken hard, and a letter chosen at random. The first letter chosen substitutes for “A”. That letter is not put back in the container, which is shaken again and a new letter drawn. That letter substitutes for “B”, and so on.
(2) Another way to produce a random simple substitution cipher is to use a spreadsheet. In a spreadsheet make a column with the letters of the alphabet in order, as shown in the first table above. Then copy that column into an adjoining column in the spreadsheet. In the next column, produce a random decimal number by using the command “=rand()”. Copy this formula down the 26 rows of that column so that all letters have a random decimal number next to them. Now select columns 2 and 3 and sort in order by column 3. This sorts the second column of letters randomly and gives us the second column of the simple substitution cipher.
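Both methods amount to drawing the 26 letters in a random order without putting any back. A minimal sketch of the same idea in Python, assuming the standard `random` module is random enough for classroom use (the function name `random_cipher_table` is made up for this example):

```python
import random
import string

def random_cipher_table(rng=random):
    """Pair each letter A-Z with a letter drawn at random without replacement."""
    letters = list(string.ascii_uppercase)
    # sample() with k equal to the list length returns all 26 letters in
    # random order, like drawing every card from the container in turn.
    shuffled = rng.sample(letters, k=len(letters))
    return dict(zip(letters, shuffled))

table = random_cipher_table()
for plain, coded in table.items():
    print(plain, "→", coded)
```

As with the cards, a letter can occasionally end up substituting for itself, as “M” does in the first table above.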
You can construct a simple substitution cipher using Julius Caesar’s method by shifting all the letters of the alphabet along by a fixed amount.
(1) For example, in Julius Caesar’s simple substitution cipher all letters were shifted to the left by 3, with the 3 letters at the beginning of the alphabet, A, B, C, wrapping around to the end: X, Y, Z.
This is like moving the letters 3 spaces anti-clockwise around a circle:
You can use shifts by other amounts.
(2) Another way to make a shift cipher like this is first to represent all the letters of the alphabet by whole numbers, in order, starting from 0. So A is represented by 0, B by 1, C by 2, and so on through Z by 25.
A shift cipher where every letter is shifted 2 to the right can be made by adding 2 to each number, except that when the answer is 26 or more we subtract 26 from it. So Y (24) becomes 26 − 26 = 0, which is A, and Z (25) becomes 27 − 26 = 1, which is B. You can easily do this by hand.
A shift cipher can also be constructed in most spreadsheets, using the formula “=mod(“number”+2,26)”. To construct a shift cipher in which all letters, represented by numbers, are shifted 2 to the right, first create a column of the numbers 0 through 25. Do this by putting the number 0 in the first row. In the second row enter a formula to add 1 to the previous number. Then copy that formula down to the 26th row. In the first row of the second column enter the formula “=mod(“number”+2,26)”, where “number” points to the first row of the first column. Then copy that formula down to the 26th row.
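The same arithmetic can be sketched in Python, where the `%` operator plays the role of the spreadsheet’s mod. The names `shift_letter` and `shift_message` are invented for this illustration; a negative shift moves letters to the left.

```python
import string

ALPHABET = string.ascii_uppercase  # A is at position 0, Z at position 25

def shift_letter(letter, shift):
    """Shift one capital letter; % 26 wraps past Z back round to A."""
    number = ALPHABET.index(letter)
    return ALPHABET[(number + shift) % 26]

def shift_message(message, shift):
    """Shift every letter of the message, leaving spaces and punctuation alone."""
    return "".join(
        shift_letter(ch, shift) if ch in ALPHABET else ch
        for ch in message.upper()
    )

print(shift_message("XYZ", 2))   # ZAB  (2 to the right)
print(shift_message("ABC", -3))  # XYZ  (3 to the left)
```

In Python the result of `% 26` is never negative, so the same function handles shifts in either direction.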
When we know the table for a simple substitution cipher, it’s easy to decode a coded message. But if we just have the message and do not know the cipher, how can we decode it? How did Sherlock Holmes crack the code of the dancing men?
If the coded message is very short, and we only have one message, it is hard to crack a simple substitution cipher. But if we have a long enough message, or several messages made with the same substitution cipher, then we have a good chance of cracking the code.
We do this by counting how many times each coded letter appears in the message, or messages. How does this help us? It helps because the approximate frequency of letters in English is known. Sherlock Holmes used the following rough guide: E, T, A, O, N, R, I, S, H, D, L, F, C, M, U, G, Y, P, W, B, V, K, X, J, Q, Z. This means that “E” is the most commonly occurring letter in English, with “T” or “A” following next, and so on, with “J, Q, Z” being the least commonly occurring letters.
Let’s say we get the coded message
M MQEKMRI CSX AMPP JMRH MX JEMVPC WMQTPI XS GVEGO XLMW QIWWEKI AVMXXIR E WMQTPI WYFWXMXYXMSR GSHI EJXIV EPP CSY EVI TVIXXC WQEVX WXYHIRXW
and we suspect this was coded by a simple substitution cipher.
The first thing to do is to count how often each letter appears in the message, and arrange the letters in order by frequency of occurrence:
Frequency | Letter |
15 | X |
13 | M |
11 | I |
10 | W |
9 | E |
7 | P |
7 | V |
5 | Q |
5 | R |
5 | S |
4 | C |
4 | Y |
3 | G |
3 | H |
3 | J |
3 | T |
2 | A |
2 | K |
1 | F |
1 | L |
1 | O |
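Counting by hand is slow and easy to get wrong, so here is a minimal sketch of the same count in Python using `collections.Counter`. The function name `letter_frequencies` is made up for this example; `MESSAGE` is the coded message given above.

```python
from collections import Counter

MESSAGE = ("M MQEKMRI CSX AMPP JMRH MX JEMVPC WMQTPI XS GVEGO XLMW QIWWEKI "
           "AVMXXIR E WMQTPI WYFWXMXYXMSR GSHI EJXIV EPP CSY EVI TVIXXC "
           "WQEVX WXYHIRXW")

def letter_frequencies(text):
    """Return (letter, count) pairs, most frequent first, ignoring spaces."""
    counts = Counter(ch for ch in text.upper() if ch.isalpha())
    return counts.most_common()

for letter, count in letter_frequencies(MESSAGE):
    print(count, "|", letter)
```

Letters with the same count may come out in a different order from a hand-made table, since ties have no natural ordering.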
It’s a fair guess that “X” is code for either “E” or “T”.
Let’s test for a shift cipher.
If “X” is code for “E” then all letters have been shifted 19 to the right or, what amounts to the same thing, 7 to the left. If that’s what happened, then the original message begins T TXLRTYP.
That doesn’t look right, so let’s see if we have a shift cipher in which the letter “X” is code for “T”. In this case all letters are shifted 4 to the right. If that’s what happened, then the original message begins I IMAGINE … . This looks like it IS what happened, and we can easily decode the rest of the message.
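Testing a guess like this is mechanical, so it can be sketched in Python. The helper names `shift_from_guess` and `undo_shift` are invented for this illustration: the first works out how far to the right the letters must have been shifted if a given coded letter stands for a given original letter, and the second shifts the coded text back by that amount.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_from_guess(coded_letter, plain_letter):
    """How far right the alphabet was shifted if plain_letter became coded_letter."""
    return (ALPHABET.index(coded_letter) - ALPHABET.index(plain_letter)) % 26

def undo_shift(coded, shift):
    """Move every letter back to the left by the given shift."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in coded.upper()
    )

opening = "M MQEKMRI"  # the first two words of the coded message
for guess in "ET":     # the two most likely meanings of "X"
    shift = shift_from_guess("X", guess)
    print("X =", guess, "| shift", shift, "|", undo_shift(opening, shift))
```

With the guess “T” the shift comes out as 4 and the opening decodes to I IMAGINE. Since there are only 26 possible shifts, the same loop could simply try them all.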
Even when a message is coded by a simple substitution cipher, the shape of words can be a strong clue as to which letters have been substituted. For example, a single letter occurring on its own could only be A or I. This is a big clue, so coded messages are usually all run together, like this:
MMQEKMRICSXAMPPJMRHMXJEMVPCWMQTPIXSGVEGOXLMWQIWWEKIAVMXXIREWMQTPIWYFWXMXYXMSRGSHIEJXIVEPPCSYEVITVIXXCWQEVXWXYHIRXW
Sometimes they are broken into fixed length pieces, like this:
MMQ EKM RIC SXA MPP JMR HMX JEM VPC WMQ TPI XSG VEG OXL MWQ IWW EKI AVM XXI REW MQT PIW YFW XMX YXM SRG SHI EJX IVE PPC SYE VIT VIX XCW QEV XWX YHI RXW
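Hiding the word shapes in this way takes two small steps: remove the spaces, then cut the letters into equal pieces. A minimal Python sketch, with the function names `run_together` and `in_groups` chosen just for this example:

```python
def run_together(coded):
    """Drop spaces and anything else that is not a letter."""
    return "".join(ch for ch in coded if ch.isalpha())

def in_groups(coded, size=3):
    """Break the run-together letters into pieces of a fixed length."""
    letters = run_together(coded)
    return " ".join(letters[i:i + size] for i in range(0, len(letters), size))

print(in_groups("M MQEKMRI CSX AMPP JMRH"))  # MMQ EKM RIC SXA MPP JMR H
```

The last piece may be shorter than the others when the number of letters is not a multiple of the group size.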
Because simple substitution ciphers are relatively easy to crack, much effort has gone into developing ciphers that are harder to crack. Some of these are hard for the average person to crack, but relatively easy for people who know enough mathematics. Ciphers are used every day to code business transactions, including those using debit and credit cards. The ciphers used for these business messages are very hard to crack indeed.
Top Secret: A Handbook of Codes, Ciphers and Secret Writing. Edited by Paul Janeczko
The Magic of Numbers by Benedict Gross & Joe Harris (this book has two chapters on codes)
In Code: A Mathematical Journey, by Sarah Flannery
The Book of Codes: Understanding the World of Hidden Messages. Edited by Paul Lunde
The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography, by Simon Singh
Some things students can be expected to learn by writing and attempting to break simple substitution ciphers:
More advanced: