
How to Déchiffre Le Chiffre Indéchiffrable

We can't apply frequency analysis straight away, because the message has been coded with several different key letters rather than just one. This means that each letter of the original text can be represented by several different letters, depending on the key letter at that point, and each letter in the coded text can stand for several different letters in the original text. For example:


key:         PAULPAULPAULPAULPAULPAULPAU
message:     meetmeforapintatelevenforty
coded text:  BEYEBEZZGAJTCTUETLYGTNZZGTS

The most common letter in the coded text is T, but we can't say that it stands for e, since T represents i, t and e at different points in the message.
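The example above can be reproduced with a short sketch. The function name `vigenere_encrypt` and the `stands_for` table are illustrative choices, not part of the original page, and the code assumes the message contains only the letters a to z.

```python
from collections import Counter, defaultdict

def vigenere_encrypt(message, key):
    """Shift each message letter by the matching key letter (A=0, B=1, ...)."""
    out = []
    for i, ch in enumerate(message.upper()):
        # The key is repeated along the message, one key letter per message letter.
        shift = ord(key[i % len(key)].upper()) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

message = "meetmeforapintatelevenforty"
coded = vigenere_encrypt(message, "PAUL")
print(coded)  # BEYEBEZZGAJTCTUETLYGTNZZGTS

# Record which original letters each coded letter stands for.
stands_for = defaultdict(set)
for plain, code in zip(message, coded):
    stands_for[code].add(plain)

print(Counter(coded).most_common(1))  # [('T', 5)]
print(sorted(stands_for["T"]))        # ['e', 'i', 't']
```

The last two lines show why a single frequency count is misleading here: T is the most common coded letter, yet it covers three different original letters.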

However, with one extra piece of information, we can still use the technique. If we knew how long the keyword used in encryption was, we would know how many rows of the Vigenère square had been used (one for each letter in the keyword). We could then split the coded message into that many parts and use frequency analysis on each one. So, for instance, if the keyword were five letters long, we would take the 1st, 6th, 11th, ... letters and work out the most common letters in that set. Then we would do the same for the 2nd, 7th, 12th, ... letters, and so on for each of the five positions in the keyword.
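As a rough sketch of the splitting step, the snippet below divides the coded text from the example into one part per key letter and counts the letters in each part. The helper name `split_by_key_position` is hypothetical, and the key length of 4 is simply taken from the example key PAUL rather than worked out from the coded text.

```python
from collections import Counter

def split_by_key_position(coded, key_length):
    """Return key_length strings; string i holds letters i, i + key_length, ..."""
    return [coded[i::key_length] for i in range(key_length)]

coded = "BEYEBEZZGAJTCTUETLYGTNZZGTS"

# Assumed key length: 4, because the example key PAUL has four letters.
parts = split_by_key_position(coded, 4)
for i, part in enumerate(parts):
    # Each part was coded with a single key letter, so it can be
    # treated like a simple one-key substitution on its own.
    print(i, part, Counter(part).most_common(2))
```

Each part here has only six or seven letters, far too few for reliable counts, but the same slicing applies unchanged to a longer message.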

So in the example above, we need to find out the length of the keyword. To do this, we look for repeated groups of letters in the coded text, preferably of four letters or more. Since this is only a very short message, there aren't any repeated groups of four, but there is a repeated group of three - ZZG appears twice, 16 letters apart. Since only certain letters commonly appear next to each other (part of the frequency analysis used before), it is likely (and is the case this time) that both occurrences of ZZG represent the same letters of the original text, coded with the same part of the key. This suggests that the two occurrences of ZZG are separated by a whole number of repetitions of the keyword, so that the same key letter is above each one.
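The search for repeated groups can be sketched as follows. The function name `repeated_groups` is an illustrative choice, and positions are counted from 0, so the first letter of the coded text is at position 0.

```python
from collections import defaultdict

def repeated_groups(coded, size):
    """Map each group of `size` letters that occurs more than once
    to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(coded) - size + 1):
        positions[coded[i:i + size]].append(i)
    return {group: starts for group, starts in positions.items() if len(starts) > 1}

coded = "BEYEBEZZGAJTCTUETLYGTNZZGTS"

print(repeated_groups(coded, 4))  # {}
print(repeated_groups(coded, 3))  # {'ZZG': [6, 22]}

# Gap between consecutive occurrences of each repeated group of three.
for group, starts in repeated_groups(coded, 3).items():
    print(group, [b - a for a, b in zip(starts, starts[1:])])  # ZZG [16]
```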

Since the two occurrences of ZZG are 16 letters apart, the length of the key must divide 16, so the key must be 16, 8, 4 or 2 letters long (leaving aside a single-letter key). With a longer message, we could find other repeats to narrow this down - the best we can do with this short one is BE, which appears twice near the start, 4 letters apart, suggesting that the key is 4 or 2 letters long. Two-letter repeats are weak evidence on their own, because they can easily occur by chance: GT also appears twice, 5 letters apart, purely by coincidence. Assuming a two-letter key hasn't been used, we conclude that the key is four letters long. We can now split the message, taking every fourth letter starting from the first, all of which must have been coded using the same key letter (P in this case), and perform frequency analysis on them to find that key letter. Obviously this works better with a longer message, but it explains the technique used.
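The arithmetic in this step can be sketched in a few lines. The helpers `divisors` and `key_letter_if_e` are hypothetical names, positions are counted from 0, and the last part rests on the assumption that the most common coded letter in a part stands for e, which is only a guess for so short a message.

```python
from collections import Counter
from math import gcd

def divisors(n):
    """All divisors of n greater than 1, largest first."""
    return [d for d in range(n, 1, -1) if n % d == 0]

def key_letter_if_e(coded_letter):
    """Key letter that would turn the original letter e into coded_letter."""
    return chr((ord(coded_letter) - ord("E")) % 26 + ord("A"))

coded = "BEYEBEZZGAJTCTUETLYGTNZZGTS"

zzg_gap = 22 - 6  # ZZG starts at positions 6 and 22
be_gap = 4 - 0    # BE starts at positions 0 and 4

print(divisors(zzg_gap))               # [16, 8, 4, 2]
print(divisors(gcd(zzg_gap, be_gap)))  # [4, 2]

# Every fourth letter, starting from the first: all coded with one key letter.
first_part = coded[0::4]
counts = Counter(first_part)
top = max(counts.values())

# For each of the most common letters, the key letter implied if it stands for e.
candidates = {c: key_letter_if_e(c) for c in counts if counts[c] == top}
print(first_part, candidates)  # BBGCTTG {'B': 'X', 'G': 'C', 'T': 'P'}
```

With only seven letters in the part, three coded letters tie for most common, and only one of them (T) leads to the true key letter P. This is the sense in which the method needs a longer message to work well.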



