Fuelled by insatiable curiosity

Miscellaneous » Academic » RSA encryption

The RSA encryption algorithm was publically described in 1978 by three researchers at MIT. RSA was the first known algorithm suitable for signing as well as encryption. Compared to other encryption schemes, RSA does not require secure transmission of decryption keys. This makes RSA suitable for several applications requiring rapid and inexpensive encryption, such as email and money transactions. It is regarded as one of the greatest advances in public key cryptography.

RSA involves a public key and a private key. The public key is available to others and is used to encrypt messages. Only the holders of the private key are able to decrypt the messages encoded by the public key.

At the crux of RSA is the idea that generating the product of two large prime numbers is easy and this result can be made public. Conversely, going the other way of finding the two numbers from the result is very difficult. Prime numbers are used here as they have no common factors between them, and therefore the result can only be factored by knowing either one of the two initial numbers.

Even if you're not a mathematician/scientist/engineer, you probably know what a prime number is - it is a number that it has no factors other than 1 and itself. Odds are you know how to test if a number is prime, by seeing if any of the smaller counting numbers (up to n/2) divide it wholely. This scheme is fine for evaluating small prime numbers. However, in order to keep your bank transactions and emails from prying eyes, an key part of RSA is to generate prime numbers with several hundred digits. It would take a super computer years to find a suitably large prime with this method!

Fortunately much smarter methods exist for discovering prime numbers. One such method is to generate a large random number and test whether this number is prime using some criterion.

The idea is to firstly generate a large random odd number. Large random numbers can be made by concatenating the result of several calls to a random number generator. If the number is even, make it odd by adding one to it. Call this the candidate and suppose it is a prime. One criterion test is to find the greatest common divisor (gcd) with another smaller random number. If it turns out the greatest common factor between the two is 1, the candidate could be a prime.

An important factor towards the speed of the test is the implementation of algorithm. One way is to observe that gcd(a, b) = gcd(b, r), where a is the candidate, b is the test number, and r is the remainder when a is divided by b (i.e., r is result of a mod b). Using this, continue until the remainder is 0. The previous remainder turns out to the greatest common factor. If this remainder ends up being 1, it means that the candidate could either be a prime, or that the candidate and the test number are co-prime.

This process can be sped up by looking at quadratic residues, but I'll leave this as research to the keen reader! Wikipedia also has an informative link about the primality test.

The odds that the candidate is non-prime is reduced by carrying the experiment out more times against other random numbers. If the candidate has passed the required number of tests, it is therefore a prime. Otherwise it is incremented by two (i.e., check the next largest odd number) and the process is repeated. This is a so called probablistic method and is much better than the exhaustive test.

The public key consists of two positive integers, the modulus n and the public (or encryption) exponent e. The private key consists of one positive integer, the private (or decryption) exponent d. The modulus n is also used in decrypting the message.

The keys for the RSA algorithm are generated as follows:

1. Find the product of two randomly generated large prime numbers p and q. This product gives the modulus n which is used in both the public and private keys. Prime numbers are chosen here, so that the resulting product n is essentially un‐factorable, thus keeping p and q private.

2. Compute ϕ(n) = (p – 1)(q – 1), where ϕ is Euler's totient function. The exponent e is then found by choosing an integer between 1 and ϕ(n), and that it is also co‐prime with ϕ(n).

3. The decryption exponent d is found by satisfying the equation: d∙e ≡ 1 (mod ϕ(n)). Note that this is often computed by using the extended Euclidean algorithm.

Before encryption, a message needs to be represented as an integer between 0 and n – 1. Often messages are longer than this, so using a standard representation the message is broken up into a series of blocks, which are turned into integers. The encrypted message (ciphertext C) is generated by: C ≡ Me (mod n), were e and n are the public keys, and M is the message. To decrypt the ciphertext: M = Cd (mod n).

I wrote an implementation of the RSA algorithm (in C), which can be found on my github repository. Note that it was not vigorously tested and is likely to have bugs in it. If you are after a proper RSA implementation I suggest you have a look at the GNU crypto library.