RSA algorithm
The RSA algorithm is the best known public key encryption algorithm. Like any public key system, it can be used to create digital signatures as well as for secrecy.
It is named for its inventors Ron Rivest, Adi Shamir and Leonard Adeleman. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" [1]. by those three authors.
How it works
To generate an RSA key pair, the system first finds two primes p, q and the product N = pq. Take p-1 and q-1 and find the least common multiple T = lcm( p-1, q-1). Then choose d, e such that d*e == 1 modulo T. The public key is then the pair (N,e) and the private key (N,d).
The strength parameter of the system is the length of N in bits. As of 2008, 1024 bits is considered secure but some users choose larger sizes to be on the safe side.
Using t for plaintext and c for ciphertext, encryption is then:
c = te modulo N
and decryption, using m for the decrypted message is:
m = cd modulo N
so we have:
m = (te)d modulo N m = tde modulo N
whence (via the proof below)
m = t modulo N
That is, the encrypt/decrypt pair of operations always give the correct result.
Proof
The proof is based on theorems proved by Fermat, back in the 17th century,
For prime p and any x:
xp == x modulo p
and for non-zero x:
xp-1 == 1 modulo p
Whence, for any k and non-zero x:
xk(p-1) == 1 modulo p
so with two primes and x non-zero modulo both:
x(p-1)(q-1) == 1 modulo p or modulo q, hence modulo pq xk(p-1)(q-1) == 1 modulo pq
Whether or not x is zero modulo either prime, we get:
xk(p-1)(q-1)+1 == x modulo pq
but we have:
de == 1 mod T de = k(p-1)(q-1)+1 for some k
and
m = tde modulo N which is modulo pq
so
m = t in all cases
Implementation differences
T can be calculated in two ways. The original RSA paper, and other references such as Schneier and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). That is a common implementation practice and what we give above. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.
However it can lead to interoperation problems if two implementations do it differently. In an example from IPsec, the FreeS/WAN implementer used the product while PGPnet used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.
RSA and factoring
Given an efficient solution to the integer factorisation problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1 and T. He knows e and can calculate its inverse mod T. That gives him d and he already has N, so now he knows the private key (N,d). The cryptosystem would be rendered worthless.
The problem with that is that no efficient solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.
However, while no really efficient (polynomial in the number of bits in N) methods are known, there are various methods that do work (see integer factorisation), those methods are improving, and computers get faster all the time (see Moore's Law). It is advisable to use large RSA moduli to provide a safety factor against these changes.
RSA Laboratories ran a factoring contest for some years, with large cash prizes, to test the security of RSA. This ended in 2007.