Our method closely follows the way a legitimate receiver who knows the correct key would decipher the message. We assumed the length of the key codeword (which is the same as the width of the code matrix) to be 8. Decipherment consists of the following steps.
The table for the simple substitution has 36 entries (6 by 6) for a normal ADFGVX cipher and 64 entries (8 by 8) for a CEMOPRTU cipher. We initially guessed that the characters used in the message were the 52 lower- and upper-case letters, the 10 digits, the space, and the full stop, which makes 64 characters in all. This guess later turned out to be correct.
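The size of the table follows directly from the number of cipher symbols, since each plaintext character is replaced by a pair of them. The following is a minimal sketch only: the fill order of the square is an assumption (the real table is part of the key and was unknown to us at this point), and the names `make_square` and `fractionate` are hypothetical helpers introduced for illustration.

```python
import string

SYMBOLS = "CEMOPRTU"  # the 8 cipher symbols; ADFGVX would use "ADFGVX"

# Guessed plaintext alphabet: 52 letters, 10 digits, space, full stop.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + " ."


def make_square(alphabet, symbols):
    """Map each plaintext character to a (row symbol, column symbol) bigram.

    The fill order used here is an assumption; the real table is keyed.
    """
    n = len(symbols)
    if len(alphabet) != n * n:
        raise ValueError("alphabet must have exactly len(symbols)**2 characters")
    return {ch: symbols[i // n] + symbols[i % n] for i, ch in enumerate(alphabet)}


def fractionate(text, square):
    """Apply the simple substitution: one character becomes two symbols."""
    return "".join(square[ch] for ch in text)


square = make_square(ALPHABET, SYMBOLS)
print(len(square))                 # 64 table entries
print(fractionate("Hi.", square))  # three characters become six symbols
```

Calling `make_square` with the 26 letters plus 10 digits and the symbols "ADFGVX" gives the 36-entry table of the normal cipher in the same way.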
Since simple substitution ciphers are generally easy to break, the permutation of columns in the matrix is really the only thing we had to worry about.
After a plaintext has been encrypted by a simple substitution, the distribution of characters is still very uneven. Because each plaintext character becomes one bigram of cipher symbols and the matrix width is even, every row of the correctly ordered matrix holds 4 complete bigrams. We therefore knew that some permutations of the columns (including the correct one!) would result in a distinctly uneven distribution of bigrams. This had to hold for each of the 4 pairs of columns individually, and the distribution of bigrams also had to be approximately the same for each column pair.
We wrote a program that graphically showed the 28 bigram distributions resulting from all possible pairs of the 8 columns. It was quick and easy to spot the correct pairing of columns from this output.
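The pairing step can also be sketched numerically. The code below is not the graphical program described above; it swaps in the index of coincidence as a stand-in measure of how uneven each bigram distribution is. It assumes that the ciphertext length is a multiple of the matrix width, so that all columns have the same length, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def split_columns(ciphertext, width=8):
    """Cut the ciphertext into `width` equal-length columns.

    Assumes the length is a multiple of `width` (a completely filled matrix).
    """
    if len(ciphertext) % width:
        raise ValueError("ciphertext length is not a multiple of the width")
    rows = len(ciphertext) // width
    return [ciphertext[k * rows:(k + 1) * rows] for k in range(width)]


def index_of_coincidence(items):
    """Chance that two items drawn without replacement are equal."""
    n = len(items)
    if n < 2:
        return 0.0
    counts = Counter(items)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def pair_scores(columns):
    """Score the bigram distribution of every unordered pair of columns."""
    scores = {}
    for i, j in combinations(range(len(columns)), 2):
        bigrams = [a + b for a, b in zip(columns[i], columns[j])]
        scores[(i, j)] = index_of_coincidence(bigrams)
    return scores
```

For 8 columns `pair_scores` returns 28 values, and the correctly paired columns should stand out with clearly higher scores than the rest. The score is the same whichever column of a pair comes first, so it cannot settle the order within a pair; that still requires comparing the distributions of the pairs with one another.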
Notice that if the initial guess for the code matrix width had been wrong, the results at this stage would not have been any good. The width could still have been 2 or 4, however, but re-running the program under each of these assumptions excluded this possibility.
We could now rewrite the matrix as a four-column matrix of bigrams, and at this stage we actually replaced the bigrams with characters of the assumed input alphabet. Since one of the 24 possible permutations of these columns had to be a simple substitution of the plaintext, we decided that the best course of action was to examine each of the 24 possibilities individually.
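Generating the 24 candidates is mechanical. The sketch below assumes the four bigram columns are given as lists of two-symbol strings and that bigrams are relabelled in order of first appearance; both choices, and the names `relabel` and `candidates`, are illustrative assumptions rather than a description of our actual program.

```python
from itertools import permutations


def relabel(bigram_columns, alphabet):
    """Replace each distinct bigram by one character of the assumed alphabet.

    Labels are handed out in order of first appearance, so the result is
    still a simple substitution of the plaintext, not the plaintext itself.
    """
    labels = {}
    out = []
    for column in bigram_columns:
        new_column = []
        for bigram in column:
            if bigram not in labels:
                labels[bigram] = alphabet[len(labels)]
            new_column.append(labels[bigram])
        out.append(new_column)
    return out


def candidates(columns):
    """Yield (order, text) for every column ordering, read row by row."""
    for order in permutations(range(len(columns))):
        rows = zip(*(columns[k] for k in order))
        yield order, "".join("".join(row) for row in rows)
```

With four columns, `candidates` yields 4! = 24 texts, exactly one of which is a simple substitution of the plaintext; each can then be attacked with the usual methods for simple substitution ciphers.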