Why aren't people using this stuff yet?
Totally, completely, 100% perfect, unbreakable encryption is easy, and I mean really easy. It doesn't require an advanced education to understand and it doesn't require powerful computers to calculate. So why isn't everyone who is concerned about data privacy using it? It has one minor hitch: the two parties exchanging information must meet in person at least once. That one meeting will be enough to ensure a lifetime of unbreakable encryption between the two parties. Of course, if you are just encrypting your own files for protection purposes, this hitch doesn't apply and you can begin today. And if you are trying to communicate with someone you can meet even once, just briefly for a few minutes (nothing fancy is required here), then you're in business.
Strong encryption gets a fair amount of press. There are weird laws preventing the export of encryption technology using keys longer than a certain length, because it's difficult for the US government to break messages encrypted with long keys and that makes strong encryption technology a security threat. Then there was the fiasco with PGP (Pretty Good Privacy), which made it easy for anyone anywhere to use reasonably strong encryption. Quite the legal hullabaloo erupted over that. To make fun of the export laws, people started wearing T-shirts with encryption algorithms printed on them (and then wearing these shirts onto planes, get it?).
Here's what's so silly about this. Perfect encryption is easy, and widely known. When I say perfect, I mean perfect. It isn't just really, really, really hard to break. It's absolutely impossible. The most powerful government with the most advanced technology and unlimited resources couldn't break it. Aliens that are millions of years more advanced than us and can push stars around at will couldn't break it.
The basic cipher
So, how's it work? This kind of encryption is called the one-time pad. The first way to introduce it is to remember the silly little cipher that we all learned when we were kids. Choose a letter of the alphabet, say 'F'. Now 'F' has the numerical value of 6 because it is the sixth letter of the alphabet. To encrypt a message you simply shift every letter forward 6 spaces (wrapping around at the end). To decipher the message just shift each letter back 6 spaces. The "key" is the letter 'F'. A slightly harder version of this cipher consists of mapping each letter of the alphabet to a different letter using a random mixup of letters.
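The shift cipher described above can be sketched in a few lines of Python. This is only an illustration: the function names (`shift_letter`, `caesar_encrypt`, `caesar_decrypt`) and the sample message are made up here, and the sketch assumes uppercase English letters, leaving spaces and punctuation untouched.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' through 'Z'

def shift_letter(ch, distance):
    """Shift one uppercase letter by `distance`, wrapping around at the end."""
    if ch not in ALPHABET:
        return ch  # leave spaces and punctuation alone
    return ALPHABET[(ALPHABET.index(ch) + distance) % 26]

def caesar_encrypt(message, key_letter):
    # 'A' counts as 1, so 'F' gives a shift of 6, as in the text.
    distance = ALPHABET.index(key_letter) + 1
    return "".join(shift_letter(ch, distance) for ch in message.upper())

def caesar_decrypt(ciphertext, key_letter):
    # Decrypting is the same shift in the opposite direction.
    distance = ALPHABET.index(key_letter) + 1
    return "".join(shift_letter(ch, -distance) for ch in ciphertext.upper())

secret = caesar_encrypt("MEET ME AT NOON", "F")
print(secret)                       # SKKZ SK GZ TUUT
print(caesar_decrypt(secret, "F"))  # MEET ME AT NOON
```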
Now, these are both pretty stupid ciphers. If someone knows you're using the original cipher, all they have to do is try shifting the message back by each of the 25 possible distances. One of them will hit the original message perfectly. This wouldn't work very easily for the second cipher, but it's still pretty easy to break. You can pull all kinds of data out of the encrypted message that can be used to break the cipher. For example, the most common letter in the English language is 'E', so the most common letter in an encrypted message (assuming the message is long enough to exhibit probabilistic patterns) is probably the one that stands for 'E'. Certain letter patterns are common too, for example 'THE'. Bottom line, these ciphers are stupid and pointless.
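To see how little work the attack on the shift cipher takes, here is a rough Python sketch. It tries all 25 shifts and also counts letters, which is the starting point for attacking the mixed-alphabet version. The helper names and the sample ciphertext are invented for this example, and it assumes uppercase English text.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def shift_back(ciphertext, distance):
    """Undo a forward shift of `distance` on every letter."""
    out = []
    for ch in ciphertext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - distance) % 26])
        else:
            out.append(ch)
    return "".join(out)

def all_candidates(ciphertext):
    """Return the 25 possible decryptions, keyed by shift distance."""
    return {d: shift_back(ciphertext, d) for d in range(1, 26)}

def most_common_letter(ciphertext):
    """The most frequent letter, a likely stand-in for 'E' in long messages."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    return counts.most_common(1)[0][0]

ciphertext = "SKKZ SK GZ TUUT"
for distance, guess in all_candidates(ciphertext).items():
    print(distance, guess)  # a human can spot the readable line at a glance

print(most_common_letter(ciphertext))
```

A message this short is too small for letter counts to be reliable in general, so treat the frequency part as a demonstration of the idea and not as a working codebreaker.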
The perfect cipher
What's cool is that you can turn this almost useless cipher into an unbreakable form of encryption with just one tiny little change. It's based more on the original cipher, where each letter shifts forward by a certain numerical distance. Instead of using the same shift for every letter in the message, you generate a key that shifts each letter by a different amount. The key is simply a random stream of letters, for example, 'EIBHAKDJBL'. So, using this key, the first letter of the message would be shifted forward 5 spaces because of the 'E' in the first position of the key, the second letter would be shifted forward 9 spaces because of the 'I', and so on down the line.
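That's it. No one can break this cipher. Since each letter is shifted by a different amount and the key is random, there are no patterns to be discerned. In fact, for any other message of the same length there is some key that would produce exactly the same encrypted text, so the encrypted text by itself tells an eavesdropper nothing about which message was sent.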
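Here is a small Python sketch of the per-letter shifting just described, using the example key 'EIBHAKDJBL'. The function names and the message "HELLOWORLD" are invented for illustration, and the sketch assumes the message contains only uppercase letters and is no longer than the key.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_amount(key_letter):
    # 'A' counts as 1, so 'E' means "shift forward 5 spaces".
    return ALPHABET.index(key_letter) + 1

def pad_encrypt(message, key):
    if len(key) < len(message):
        raise ValueError("the key must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(m) + shift_amount(k)) % 26]
        for m, k in zip(message, key)
    )

def pad_decrypt(ciphertext, key):
    if len(key) < len(ciphertext):
        raise ValueError("the key must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - shift_amount(k)) % 26]
        for c, k in zip(ciphertext, key)
    )

key = "EIBHAKDJBL"
secret = pad_encrypt("HELLOWORLD", key)
print(secret)                    # MNNTPHSBNP
print(pad_decrypt(secret, key))  # HELLOWORLD
```

Notice that the two L's in the middle of "HELLOWORLD" come out as different letters ('N' and 'T'), which is exactly why counting letters no longer helps an attacker.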
Now, let me show you how this same method can be adapted for any general kind of data. Letters are pretty limiting. For example, a picture isn't made out of letters, and neither is a sound file. Every kind of data stored on a computer is stored as a stream of bits, 0s and 1s. This is essentially an alphabet of 2 letters if you think about it. If a 0 gets a shift value of 0 and a 1 gets a shift value of 1, then there are four possible occurrences when encrypting a given bit in the message: a 0 shifted by 0 stays 0, a 0 shifted by 1 becomes 1, a 1 shifted by 0 stays 1, and a 1 shifted by 1 wraps around to 0.
There is a more formal way to specify this kind of relationship. This is the XOR bitwise operator, which stands for "exclusive or". Given a message bit and a key bit, the encrypted bit will be a 0 if the message bit and the key bit are the same (both 0 or both 1), and the encrypted bit will be a 1 if the message bit and the key bit are different:
message bit   key bit   encrypted bit
0             0         0
0             1         1
1             0         1
1             1         0
To decrypt an encrypted bit, you simply reverse the process. You can think of this using either the shift distances or the XOR operator; they'll both work. Using the XOR method (since it's more formal and all), you simply XOR the encrypted bit with the key bit to get the original message bit back. Look at the table above again and you'll notice that it works both forward and backward.
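The bit-level rule is small enough to check exhaustively. The following Python sketch (the function name `xor_bit` is just a label chosen for this example) prints all four combinations, confirms that XOR matches the "shift and wrap around" description, and shows that XORing with the key bit a second time gives the message bit back.

```python
def xor_bit(message_bit, key_bit):
    """Return 0 if the two bits are the same and 1 if they differ."""
    return message_bit ^ key_bit

print("message key encrypted decrypted")
for message_bit in (0, 1):
    for key_bit in (0, 1):
        encrypted = xor_bit(message_bit, key_bit)
        # Shifting a bit by 0 or 1 and wrapping around gives the same answer.
        assert encrypted == (message_bit + key_bit) % 2
        # Applying the same key bit again undoes the encryption.
        decrypted = xor_bit(encrypted, key_bit)
        print(message_bit, key_bit, encrypted, decrypted)
```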
So now you can encrypt not only text messages but any kind of digital data, that is, any data that you store on your computer: text, pictures, sound, video, take your pick. The key is a stream of bits, the encryption algorithm is simply an XOR of the data and the key, and the decryption is identical. Writing this as a computer program would probably require about ten lines of code, and most of that would be opening and closing files. Pretty cool, eh?
The reason that this is called a one-time pad is that using the same key on two different data sets reveals patterns that can be used to break both of them: XOR the two encrypted data sets together and the key cancels out, leaving the XOR of the two original data sets, which is often enough to recover them. So for every data set that you want to encrypt, you must generate a new key that is as long as the data set. The key bits also have to be genuinely unpredictable. An ordinary pseudo-random number generator won't do, because anyone who works out its starting state can regenerate the whole key. This may be something of a hassle, but that's the price you pay for perfect, totally unbreakable encryption.
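As a rough sketch of how short such a program can be, here is a Python version that works on bytes in memory instead of files. The names `make_key` and `xor_bytes` and the sample messages are invented for this example. It draws the key from Python's `secrets` module, which uses the operating system's cryptographic random source; whether that counts as random enough for a true one-time pad is an assumption of the sketch, not something it proves. The last few lines show why a key must never be used twice.

```python
import secrets

def make_key(length):
    """Ask the operating system for `length` unpredictable bytes."""
    return secrets.token_bytes(length)

def xor_bytes(data, key):
    """XOR data with key; the same call both encrypts and decrypts."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = make_key(len(message))

ciphertext = xor_bytes(message, key)
print(ciphertext.hex())
print(xor_bytes(ciphertext, key))  # b'attack at dawn'

# What goes wrong if the same key is used for a second message:
other_message = b"retreat at six"
other_ciphertext = xor_bytes(other_message, key)

# XORing the two ciphertexts cancels the key completely.
leak = xor_bytes(ciphertext, other_ciphertext)
print(leak == xor_bytes(message, other_message))  # True
```

An eavesdropper who holds both ciphertexts therefore learns the XOR of the two messages without ever seeing the key.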
If your only goal is to protect your own personal data, you're all set. You can create a key and encrypt any data you want. The data will sit encrypted on your hard drive, totally protected from anyone who may wish to take a peek. When you want to see the data again, you can simply decrypt it. You'll have to store the key somewhere, however. Tricky tricky. More on this problem below. Now, if your goal is to communicate with another person, in other words to send encrypted data to another person through the mail or over the internet, you have a problem. Quite simply, the person at the other end won't be able to decrypt the data unless they have the key. The two of you are going to have to have the same key.
What to do with the key
Here's the solution. Say you own some huge mega-bucks business in New York and you want to perform secure data-transfer and communication with an associate in Tokyo. Here's what you do. You meet just once. You're rich, you can afford a plane trip. When you meet, you give your associate a stack of DVDs pressed with random bits. That's it. So long as you can store your key securely and you trust your associate to do the same, you can trade encrypted data with your associate until you run out of key bits, and a stack of DVDs is a hell of a lot of bits.
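One practical detail in this arrangement is keeping track of which key bits have already been spent, so that no part of the key is ever used twice. The Python sketch below shows one way that bookkeeping might look. The `Pad` class, the `decrypt` helper, and the idea of sending the starting position along with each message are all assumptions made for this illustration, and a small in-memory key stands in for the stack of DVDs.

```python
import secrets

class Pad:
    """A shared block of key bytes plus a marker for how much has been used."""

    def __init__(self, key_bytes):
        self.key_bytes = key_bytes
        self.used = 0  # number of key bytes already spent

    def encrypt(self, message):
        start = self.used
        end = start + len(message)
        if end > len(self.key_bytes):
            raise ValueError("not enough unused key left")
        chunk = self.key_bytes[start:end]
        self.used = end  # never hand out these bytes again
        return start, bytes(m ^ k for m, k in zip(message, chunk))

def decrypt(key_bytes, start, ciphertext):
    """Decrypt using the key bytes that begin at position `start`."""
    chunk = key_bytes[start:start + len(ciphertext)]
    return bytes(c ^ k for c, k in zip(ciphertext, chunk))

shared_key = secrets.token_bytes(64)  # stand-in for the DVDs
new_york = Pad(shared_key)

start1, ciphertext1 = new_york.encrypt(b"wire the funds")
start2, ciphertext2 = new_york.encrypt(b"on friday")

# The associate in Tokyo holds the same key bytes.
print(decrypt(shared_key, start1, ciphertext1))  # b'wire the funds'
print(decrypt(shared_key, start2, ciphertext2))  # b'on friday'
```

If both parties send messages, they would also need some agreement about who uses which part of the key (say, one DVD each), since otherwise they could both spend the same bytes.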
Of course, having these keys lying around kind of causes a problem, doesn't it? Now strictly speaking, there's no solution to this problem. However, I have a partial solution that's pretty good. Since it isn't perfect, it kind of fails the goal of attaining unbreakable encryption. However, since it doesn't entail leaving keys out in the open, it might not be less secure than the original method. What you do is memorize a list of movies and use the bits on those DVDs as your key, or you could use CDs and memorize albums instead, of course. DVDs provide more key data, but there are many more CDs in the world than DVDs, thus providing better security. Even if someone figures out that your trick is to use CDs, they have to find the right one out of millions. You could even beef up the unpredictability a touch by using every seventh bit or so. Using this method would require some careful software that skips over the various track headers and such that are stored on a CD, because these headers aren't random and could even provide information that identifies which CD is the key.
Nevertheless, this is pretty good since it doesn't require leaving an obvious key lying around in plain sight. The reason this isn't perfect is that CDs don't actually contain random bits. They contain a fairly organized pattern of bits that encodes the waveform of a piece of music. It is possible that these patterns can be identified and the key can be broken. I think that's unlikely in practice, but there is no guarantee. Whether this method is better or worse than a key of truly random bits depends on whether you think a key of bits can be stored securely.
So that's it. Go to it. Don't tell them I told you.
I would really like to hear what people think of this. If you prefer private feedback, you can email me at firstname.lastname@example.org.