For some reason the program used for this stage was called heavy. The program builds the key square in a stepwise fashion. The user has the option to specify the initial part of the key. For instance, if we know that the key starts with the letters APE we can tell the program to only try key squares beginning with the three letters APE. Given these preset letters of the key the program tries all possible combinations of the next six letters in the key. The most likely candidates of the possible keys are then stored in a heap of specified size. For each of the keys in the heap all possible choices of the next letter in the key is evaluated, and the keys with the highest score are saved in another heap.
The crucial part of the algorithm is thus to find a good evaluation function. To evaluate a partial key, heavy uses a file containing the log-probability of all bigrams occurring in English text, where in-frame repeats AA have been substituted with AXA. Assume that the following key square should be evaluated.
THECO DBK## ##### ##### #####To eliminate the bias due to the frequencies of the key letters in English text we first compute the mean m of the log-probabilities of the bigrams that may be constructed with the letters in the partial key. For a bigram to be included we required that both the plaintext bigram and the corresponding ciphertext bigram should be a subset of the letters in the partial key. In the example above this means that the bigram DE is included but not OK. Finally, to reduce noise, we only included bigrams with log-probability higher than -9 when computing the mean.
Using the partial key the program then deciphers as much as possible of the ciphertext. The longer the partial key, the larger part of the ciphertext will be deciphered. For all bigrams we then sum the log-probabilities of all bigrams occurring in the text. If we let sbe this sum and n the number of bigrams that were the detected, we finally set the score of the partial key to s - nm.