Monday, April 09, 2007

Writing An "Unbreakable" Code

In Dan Brown's Digital Fortress (see my review), the plot centered on the fact of the NSA having a secret massive super-computer that could try every possible combination of even very large digital keys in a matter of minutes, and thus crack any code. While it's mathematically true that any clear-text message encoded using public key encryption could eventually be cracked given enough time, that presupposes that you started with a clear-text message. In the universe of Digital Fortress, the super-computer (called "Translator") tries every possible key until it finds one that results in clear-text. Presumably it recognizes clear text by doing a dictionary check and looking for recognizable words (which, being a massive super-computer, it could also do very quickly). But what if the originally encrypted content wasn't "clear"? Suppose I had a message I didn't want anyone, including the NSA, to see. I could simply write the message in Double-Dutch, then use standard 128-bit public key encryption, and even the mighty Translator wouldn’t be able to crack it, because it wouldn't recognize when it had succeeded. Even when Translator tried the right key, it would read:
thibisibismibysibecribetmibessibage
which would just look like gibberish, so it would zoom right past it and keep on trying other keys. Brown is aware of this problem, but seems under the misconception that it's very difficult to do.

Technically, Double-Dutch would be a poor choice. The huge preponderance of I's and B's would call attention to themselves and quickly get recognized as noise (even if the cryptographer hadn't learned Double-Dutch in grade school). But you could do something only slightly more complicated, like this:
thinisthismebysegecrinetmniessngage
I took the Double-Dutch algorithm of adding two fill characters before each vowel, but instead of always using "ib", I used letters drawn from another text (e.g., the Bible). Try reading it this way:
thINisTHismEBysEGecrINetmNIessNGage
This is grade-school simple, and yet it's actually very sophisticated in that it doesn't follow any inherent cycle. The fill characters don't repeat, and their insertion points vary with the original text rather than following any set pattern. This would probably be quite enough to bring Translator to its knees. For good measure, you could compress your input using a program like "zip" before encrypting it.

Alternatively, who says that my "clear text" is even text? I could handwrite my message, scan it to a JPEG image, and then encrypt that. Or I could speak my message and capture the audio in a WAV file, zip it, and then encrypt it. Good luck to Translator trying to figure that out. It doesn't take a supercomputer to generate something that a supercomputer would be unable to solve. It just takes a little creativity.

3 comments:

Anonymous said...

Hi Tom,
I haven't read Dan Brown's book, but there are a couple reasons that your schemes won't rock the cryptographic world.
The deepest reason is that your decoding function, how your friends get back to a file from the sequence of bits that they receive, has to be completely known to them. Also, it has to be, in principle, known to your enemies. In other words, your code has to be published.
If you could have whispered to your friends "Pst. I'll be using double-dutch," in some secure way, then you should just tell them the message the same way.
So, you have to make it absolutely public that you are using double-dutch, or your Bible encoding, or some other scheme, which means that the NSA will just decrypt it using the supercomputer, then run it through a double-dutch machine, then look at it to see if it's clear text.
It's an even clearer answer when you encode it in, say, a zipped .WAV file. You would have to make it known to your friends (and, depending on how you tell them) the NSA that you are sending a zipped .WAV file. So the NSA would use the supercomputer with a key, then try to unzip it and see it it was a wav file.
Also, zipped files and wav files have special prefixes that tell the computer what sort of files they are, how long they are, etc. The NSA could just check every decryption for that prefix.
Also, on a less theoretical level, most public key encryptions actually have a few possible decryptions (Rabin does, I believe that RSA does as well). In order to get around this problem, many schemes require prefixing the message with a particular sequence of bits, so that when the decryptor gets the set of possible decryptions, it knows which to pick. This in and of itself would allow the supercomputer to work, then pass its output to something that would decrypt double-dutch, the bible, etc.
I'm not a particularly advanced cryptography student, so I could be wrong about any of this, but let me know if you don't agree with any of it!

-Joe

Anonymous said...

Hi Tom,

Came across your site while browsing around…cool stuff u have going on here. Also I thought I’d tell u about something I came across, thought u might find it useful, bcoz ur in Technology…it’s this site called Myndnet…u should check it out..the link is here http://www.myndnet.com/login.jsp?referral=alpa83&channel=al679

It’s this cool place where u get paid for responding to queries…very cool stuff!! http://www.myndnet.com/login.jsp?referral=alpa83&channel=al679

Sign up n lemme know what u think…my mail id is barot.alpa@gmail.com

Cheers
Alpa

Anonymous said...

Hello,

I write encryption for our company and I wonder how their super computer would do with what we use: a 1024-byte randomly calculated key with a 95x95 random encryption matrix (similar to a combination of Vigenere and Bazere's cylinder). The key is to have the matrix since the key is calculated uniquely for each message and each character is encrypted using a unique offset so it never repeats.

Just curious.

Tommy Martin