Asymmetric ciphers use two keys: a public key and a private key. The public key is made public, while the private key is kept private; hence the clever names. Any message that is encrypted with the public key can only be decrypted with the private key. This removes the issue of key distribution—public keys are public, and by using the public key, a message can be encrypted for the corresponding private key. Unlike symmetric ciphers, there's no need for an out-of-band communication channel to transmit the secret key. However, asymmetric ciphers tend to be quite a bit slower than symmetric ciphers.
1. RSA
RSA is one of the more popular asymmetric algorithms. The security of RSA is based on the difficulty of factoring large numbers. First, two prime numbers are chosen, P and Q, and their product, N, is computed:
- N = P · Q
Then, the number of numbers between 1 and N – 1 that are relatively prime to N must be calculated (two numbers are relatively prime if their greatest common divisor is 1). This is known as Euler's totient function, and it is usually denoted by the lowercase Greek letter phi (φ).
For example, φ(9) = 6, since 1, 2, 4, 5, 7, and 8 are relatively prime to 9. It should be easy to notice that if N is prime, φ(N ) will be N –1. A somewhat less obvious fact is that if N is the product of exactly two prime numbers, Pand Q, then φ(P · Q) = (P –1) · (Q –1). This comes in handy, since φ(N) must be calculated for RSA.
An encryption key, E, that is relatively prime to φ(N), must be chosen at random. Then a decryption key must be found that satisfies the following equation, where S is any integer:
-
E · D = S · φ(N) + 1
This can be solved with the extended Euclidean algorithm. The Euclidean algorithm is a very old algorithm that happens to be a very fast way to calculate the greatest common divisor (GCD) of two numbers. The larger of the two numbers is divided by the smaller number, paying attention only to the remainder. Then, the smaller number is divided by the remainder, and the process is repeated until the remainder is zero. The last value for the remainder before it reaches zero is the greatest common divisor of the two original numbers. This algorithm is quite fast, with a run time of O(log10N). That means that it should take about as many steps to find the answer as the number of digits in the larger number.
In the table below, the GCD of 7253 and 120, written as gcd(7253, 120), will be calculated. The table starts by putting the two numbers in the columns A and B, with the larger number in column A. Then A is divided by B, and the remainder is put in column R. On the next line, the old B becomes the new A, and the old R becomes the new B. R is calculated again, and this process is repeated until the remainder is zero. The last value of R before zero is the greatest common divisor.
gcd(7253, 120) |
|
---|
A | B | R |
---|
7253 | 120 | 53 |
120 | 53 | 14 |
53 | 14 | 11 |
14 | 11 | 3 |
11 | 3 | 2 |
3 | 2 | 1 |
2 | 1 | 0 |
So, the greatest common divisor of 7243 and 120 is 1. That means that 7250 and 120 are relatively prime to each other.
The extended Euclidean algorithm deals with finding two integers, J and K, such that
- J · A + K · B = R
when gcd(A, B) = R.
This is done by working the Euclidean algorithm backward. In this case, though, the quotients are important. Here is the math from the prior example, with the quotients:
- 7253 = 60 · 120 + 53
- 120 = 2 · 53 + 14
- 53 = 3 · 14 + 11
- 14 = 1 · 11 + 3
- 11 = 3 · 3 + 2
- 3 = 1 · 2 + 1
With a little bit of basic algebra, the terms can be moved around for each line so the remainder (shown in bold) is by itself on the left of the equal sign:
- 53 = 7253 – 60 · 120
- 14 = 120 – 2 · 53
- 11 = 53 – 3 · 14
- 3 = 14 – 1 · 11
- 2 = 11 – 3 · 3
- 1 = 3 – 1 · 2
Starting from the bottom, it's clear that:
- 1 = 3 – 1 · 2
The line above that, though, is 2 = 11 –3 · 3, which gives a substitution for 2:
- 1 = 3 – 1 · (11 – 3 · 3)
- 1 = 4 · 3 – 1 · 11
The line above that shows that 3 = 14 –1 · 11, which can also be substituted in for 3:
- 1 = 4 · (14 – 1 · 11) – 1 · 11
- 1 = 4 · 14 – 5 · 11
Of course, the line above that shows that 11 = 53 –3 · 14, prompting another substitution:
- 1 = 4 · 14 – 5 · (53 – 3 · 14)
- 1 = 19 · 14 – 5 · 53
Following the pattern, we use the line that shows 14 = 120 –2 · 53, resulting in another substitution:
- 1 = 19 · (120 – 2 · 53) – 5 · 53
- 1 = 19 · 120 – 43 · 53
And finally, the top line shows that 53 = 7253 –60 · 120, for a final substitution:
- 1 = 19 · 120 – 43 · (7253 – 60 · 120)
- 1 = 2599 · 120 – 43 · 7253
- 2599 · 120 + – 43 · 7253 = 1
This shows that J and K would be 2599 and –43, respectively.
The numbers in the previous example were chosen for their relevance to RSA. Assuming the values for P and Q are 11 and 13, N would be 143. Therefore, φ(N) = 120 = (11 –1) · (13 –1). Since 7253 is relatively prime to 120, that number makes an excellent value for E.
If you recall, the goal was to find a value for D that satisfies the following equation:
- E · D = S · φ(N) + 1
Some basic algebra puts it in a more familiar form:
- D · E + S · φ(N) = 1
- D · 7253 ± S · 120 = 1
Using the values from the extended Euclidean algorithm, it's apparent that D = –43. The value for S doesn't really matter, which means this math is done modulo φ(N), or modulo 120. That, in turn, means that a positive equivalent value for D is 77, since 120 –43 = 77. This can be put into the prior equation from above:
- E · D = S · φ(N) + 1
- 7253 · 77 = 4654 · 120 + 1
The values for N and E are distributed as the public key, while D is kept secret as the private key. P and Q are discarded. The encryption and decryption functions are fairly simple.
- Encryption: C = ME(modN)
- Decryption: M = CD(modN)
For example, if the message, M, is 98, encryption would be as follows:
- 987253 = 76(mod143)
The ciphertext would be 76. Then, only someone who knew the value for D could decrypt the message and recover the number 98 from the number 76, as follows:
- 7677 = 98(mod143)
Obviously, if the message, M, is larger than N, it must be broken down into chunks that are smaller than N.
This process is made possible by Euler's totient theorem. It states that if M and N are relatively prime, with M being the smaller number, then when M is multiplied by itself φ(N) times and divided by N, the remainder will always be 1:
- If gcd(M, N) = 1 and M < N then Mφ(N) = 1(modN)
Since this is all done modulo N, the following is also true, due to the way multiplication works in modulus arithmetic:
- Mφ(N) · Mφ(N) = 1 ·1(modN)
- M2 · φ(N) = 1(modN)
This process could be repeated again and again S times to produce this:
- MS · φ(N) = 1(modN)
If both sides are multiplied by M, the result is:
- MS · φ(N) · M = 1 ·M(modN)
- MS · φ(N) + 1 = M(modN)
This equation is basically the core of RSA. A number, M, raised to a power modulo N, produces the original number M again. This is basically a function that returns its own input, which isn't all that interesting by itself. But if this equation could be broken up into two separate parts, then one part could be used to encrypt and the other to decrypt, producing the original message again. This can be done by finding two numbers, E and D, that multiplied together equal S times φ(N) plus 1. Then this value can be substituted into the previous equation:
- E · D = S ·φ(N) + 1
- ME · D = M(modN)
This is equivalent to:
- MED = (MmodN)
which can be broken up into two steps:
- ME = C(modN)
- CD = M(modN)
And that's basically RSA. The security of the algorithm is tied to keepingD secret. But since N and E are both public values, if N can be factored into the original P and Q, then φ(N) can easily be calculated with (P –1) · (Q –1), and then D can be determined with the extended Euclidean algorithm. Therefore, the key sizes for RSA must be chosen with the best-known factoring algorithm in mind to maintain computational security. Currently, the best-known factoring algorithm for large numbers is the number field sieve (NFS). This algorithm has a subexponential run time, which is pretty good, but still not fast enough to crack a 2,048-bit RSA key in a reasonable amount of time.
2. Peter Shor's Quantum Factoring Algorithm
Once again, quantum computation promises amazing increases in computation potential. Peter Shor was able to take advantage of the massive parallelism of quantum computers to efficiently factor numbers using an old number-theory trick.
The algorithm is actually quite simple. Take a number, N, to factor. Choose a value, A, that is less than N. This value should also be relatively prime to N, but assuming that N is the product of two prime numbers (which will always be the case when trying to factor numbers to break RSA), if A isn't relatively prime to N, then A is one of N's factors.
Next, load up the superposition with sequential numbers counting up from 1 and feed every one of those values through the function f(x) = Ax(modN). This is all done at the same time, through the magic of quantum computation. A repeating pattern will emerge in the results, and the period of this repetition must be found. Luckily, this can be done quickly on a quantum computer with a Fourier transform. This period will be called R.
Then, simply calculate gcd(AR/2 + 1, N) and gcd(AR/2 –1, N). At least one of these values should be a factor of N. This is possible because AR = 1(modN) and is further explained below.
- AR = 1(modN)
- (AR/2)2 = 1(modN)
- (AR/2)2 –1 = 0(modN)
- (AR/2 –1) · (AR/2 + 1) = 0(modN)
This means that (AR/2 –1) · (AR/2 + 1) is an integer multiple of N. As long as these values don't zero themselves out, one of them will have a factor in common with N.
To crack the previous RSA example, the public value N must be factored. In this case N equals 143. Next, a value for A is chosen that is relatively prime to and less than N, so A equals 21. The function will look like f(x) = 21x(mod143). Every sequential value from 1 up to as high as the quantum computer will allow will be put through this function.
To keep this brief, the assumption will be that the quantum computer has three quantum bits, so the superposition can hold eight values.
x = 1 | 211(mod143) = 21 |
x = 2 | 212(mod143) = 12 |
x = 3 | 213(mod143) = 109 |
x = 4 | 214(mod143) = 1 |
x = 5 | 215(mod143) = 21 |
x = 6 | 216(mod143) = 12 |
x = 7 | 217(mod143) = 109 |
x = 8 | 218(mod143) = 1 |
Here the period is easy to determine by eye: R is 4. Armed with this information, gcd(212 –1143) and gcd(212 + 1143) should produce at least one of the factors. This time, both factors actually appear, since gcd(440, 143) = 11 and gcd(442, 142) = 13. These factors can then be used to recalculate the private key for the previous RSA example.