RSA algorithm: Difference between revisions

From Citizendium
Jump to navigation Jump to search
imported>Sandy Harris
mNo edit summary
 
(37 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{subpages}}
{{PropDel}}<br><br>{{subpages}}
{{TOC|right}}


The '''RSA algorithm''' is the best known [[public key]] encryption algorithm. Like any public key system, it can be used to create [[digital signatures]] as well as for secrecy.
The '''RSA algorithm''' is the best known and most widely used [[public key]] encryption algorithm. Like any public key system, it can be used to create [[digital signature]]s as well as for secrecy.


It is named for its inventors [[Ron Rivest]], [[Adi Shamir]] and [[Leonard Adeleman]]. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"  <ref>Rivest et al. [http://people.csail.mit.edu/rivest/Rsapaper.pdf "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"]</ref>. by those three authors.
It is named for its inventors [[Ron Rivest]], [[Adi Shamir]] and [[Leonard Adleman]]. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"  <ref>Rivest et al. [http://people.csail.mit.edu/rivest/Rsapaper.pdf "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"]</ref>  <ref>Communications of the ACM, Vol. 21 (2), pp.120&ndash;126. 1978</ref> <ref>Previously released as an MIT "Technical Memo" in April 1977, and published in Martin Gardner's ''Scientific American'' "Mathematical Recreations" column</ref>. by those three authors.
 
The three authors were joint recipients of the 2002 [[Turing Award]], primarily for that work.
 
The algorithm was patented <ref>{{US patent|4405829}}</ref> in 1983 by [[MIT]], but released into the public domain in 2000.
 
The inventors started a company, [[RSA Laboratories]], which became a major player in the [[information security]] industry. [[Ron Rivest | Rivest]] in particular went on to invent additional systems which the company marketed, including the [[MD4]] and [[MD5]] hash algorithms and [[Rivest ciphers|various encryption algorithms]] with names of the form RC''n'' including [[RC4]], a widely used [[stream cipher]].
 
Staff at RSA Laboratories wrote a specification for use of the RSA algorithm in Internet protocols. The current version is RFC 2437. RSA is used in many protocols, including [[IPsec]], [[Open PGP]] and [[DNS security]].


== How it works ==
== How it works ==


To generate an RSA key pair, the system first finds two primes p, q and the product N = pq. Take p-1 and q-1 and find the least common multiple T = lcm( p-1, q-1). Then choose d, e such that d*e == 1 [[modulo]] T. The public key is then the pair (N,e) and the private key (N,d).
Any public key system requires that keys be created in matched pairs, such that if one key encrypts then only the other key from that pair can decrypt. One key from a pair can be published as the user's '''public key'''; the other becomes his or her '''private key''', and must be kept secret.
 
To generate an RSA key pair, the system first finds two distinct primes p, q and the product N = pq. Take p-1 and q-1 and find the [[totient function]] of N, the product T = (p-1)(q-1). Alternately, find the least common multiple T = lcm( p-1, q-1) ([[#Implementation differences | discussion]]). Then choose encryption exponent e and decryption exponent d such that d*e == 1 [[modulo]] T. The public key is then the pair (N,e) and the private key (N,d).


The strength parameter of the system is the length of N in bits. As of 2008, 1024 bits is considered secure but some users choose larger sizes to be on the safe side.  
The strength parameter of the system is the length of N in bits. As of mid-2010, 1024-bit keys are in widespread use but are generally in the process of being replaced by 2048-bit keys for greater security. Some users choose larger sizes to be on the safe side.  


Using t for plaintext and c for ciphertext, encryption is then:
Using t for plaintext and c for ciphertext, encryption is then:
   c = t<sup>e</sup>  modulo N
   c = t<sup>e</sup>  modulo N
and decryption, using m for the decrypted message is:
and decryption, using m for the decrypted message, is:
   m = c<sup>d</sup> modulo N
   m = c<sup>d</sup> modulo N
so we have:
so we have:
Line 23: Line 34:
That is, the encrypt/decrypt pair of operations always give the correct result.
That is, the encrypt/decrypt pair of operations always give the correct result.


== Proof ==
=== Proof ===


The proof is based on theorems proved by [[Fermat]], back in the 17th century,
The proof is based on theorems proved by [[Fermat]], back in the 17th century,
Line 34: Line 45:
Whence, for any k and non-zero x:
Whence, for any k and non-zero x:
   x<sup>k(p-1)</sup> == 1  modulo p
   x<sup>k(p-1)</sup> == 1  modulo p
so with two primes and x non-zero modulo both:
so for non-zero x:
   x<sup>(p-1)(q-1)</sup> == 1  modulo p or modulo q, hence modulo pq
   x<sup>k(p-1)(q-1)</sup> == 1  modulo p
   x<sup>k(p-1)(q-1)</sup> == 1   modulo pq
and
   x<sup>k(p-1)(q-1)+1</sup> == x   modulo p
but that also holds if x is zero modulo p, since then both sides are zero.


Whether or not x is zero modulo either prime, we get:
Similarly,
  x<sup>k(p-1)(q-1)+1</sup> == x  modulo q
so if p and q are distinct primes
   x<sup>k(p-1)(q-1)+1</sup> == x  modulo pq
   x<sup>k(p-1)(q-1)+1</sup> == x  modulo pq
but we have:
 
But we have:
   de == 1 mod T
   de == 1 mod T
   de = k(p-1)(q-1)+1  for some k
   de = k(p-1)(q-1)+1  for some k
Line 48: Line 64:
   m = t      in all cases
   m = t      in all cases


== Implementation differences ==
== Security ==
 
A few considerations are critical for the security of RSA:


T can be calculated in two ways. The original RSA paper, and other references such as Schneier and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). That is a common implementation practice and what we give above. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.
* Generating primes which an enemy cannot guess requires a strong [[random number generator]]; any weakness in the generator reduces security.
* Using the system securely requires that the private key never be revealed; losing that key to an enemy instantly reduces security to zero.
* Discovery of an efficient solution to the [[integer factorisation]] problem would break RSA. See [[#RSA and factoring | below]].


However it can lead to interoperation problems if two implementations do it differently. In an [[IPsec#Complications | example]] from [[IPsec]], the [[FreeSWAN |FreeS/WAN]] implementer used the product while [[PGPnet]] used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.
For more subtle problems, see [[Attacks on RSA]].


== RSA and factoring ==
=== RSA and factoring ===


Given an efficient solution to the [[integer factorisation]] problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1 and T. He knows e and can calculate its inverse mod T. That gives him d and he already has N, so now he knows the private key (N,d). The cryptosystem would be rendered worthless.
Given an efficient solution to the [[integer factorisation]] problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1, and T. He knows e and can calculate its inverse mod T using the efficient [[Extended Euclidean algorithm]]. That gives him d and he already has N, so now he knows the private key (N,d). With the private key he can both read encrypted messages and forge the digital signature of the key owner. The cryptosystem would be rendered worthless.


The problem with that is that no efficient solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.
The problem with that is that no really efficient (polynomial in the number of bits in N) solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.


However, while no really efficient (polynomial in the number of bits in N) methods are known, there are various methods that do work (see  [[integer factorisation]]), those methods are improving, and computers get faster all the time (see [[Moore's Law]]). It is advisable to use large RSA moduli to provide a safety factor against these changes.
However, while no really efficient methods are known, there are various methods that do work (see  [[integer factorisation]]), those methods are improving, and computers get faster all the time (see [[Moore's Law]]). It is advisable to use large RSA moduli to provide a safety factor against these changes.


[[RSA Laboratories]] ran a factoring [http://www.rsa.com/rsalabs/node.asp?id=2092 contest] for some years, with large cash prizes, to test the security of RSA. This ended in 2007.
[[RSA Laboratories]] ran a factoring [http://www.rsa.com/rsalabs/node.asp?id=2092 contest] for some years, with large cash prizes, to test the security of RSA. This ended in 2007.
== Implementation differences ==
T can be calculated in two ways. The original RSA paper, and other references such as Schneier <ref name="schneierbook">{{citation
| first = Bruce | last = Schneier
| title = Applied Cryptography
| date = 2nd edition, 1996,
| publisher = John Wiley & Sons
|ISBN =0-471-11709-9}}</ref> and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). Menezes at al. give both <ref name=HAC>{{citation
| first1 = AJ | last1 = Menezes | first2 = PC | last2= van Oorschot | first3= SA | last3 = Vanstone 
| url=http://www.cacr.math.uwaterloo.ca/hac/
| title = Handbook of Applied Cryptography
| date = Fifth Edition, 2001
| ISBN=0-8493-8523-7}}</ref>. Using the least common multiple has now become the usual implementation practice. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.
However it can lead to interoperation problems if two implementations do it differently. In an [[IPsec#Complications | example]] from [[IPsec]], the [[FreeSWAN |FreeS/WAN]] implementer used the product while [[PGPnet]] used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.


==References==
==References==
{{reflist|2}}
{{reflist|2}}[[Category:Suggestion Bot Tag]]

Latest revision as of 11:00, 9 October 2024

This article may be deleted soon.
To oppose or discuss a nomination, please go to CZ:Proposed for deletion and follow the instructions.

For the monthly nomination lists, see
Category:Articles for deletion.


The RSA algorithm is the best known and most widely used public key encryption algorithm. Like any public key system, it can be used to create digital signatures as well as for secrecy.

It is named for its inventors Ron Rivest, Adi Shamir and Leonard Adleman. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" [1] [2] [3]. by those three authors.

The three authors were joint recipients of the 2002 Turing Award, primarily for that work.

The algorithm was patented [4] in 1983 by MIT, but released into the public domain in 2000.

The inventors started a company, RSA Laboratories, which became a major player in the information security industry. Rivest in particular went on to invent additional systems which the company marketed, including the MD4 and MD5 hash algorithms and various encryption algorithms with names of the form RCn including RC4, a widely used stream cipher.

Staff at RSA Laboratories wrote a specification for use of the RSA algorithm in Internet protocols. The current version is RFC 2437. RSA is used in many protocols, including IPsec, Open PGP and DNS security.

How it works

Any public key system requires that keys be created in matched pairs, such that if one key encrypts then only the other key from that pair can decrypt. One key from a pair can be published as the user's public key; the other becomes his or her private key, and must be kept secret.

To generate an RSA key pair, the system first finds two distinct primes p, q and the product N = pq. Take p-1 and q-1 and find the totient function of N, the product T = (p-1)(q-1). Alternately, find the least common multiple T = lcm( p-1, q-1) ( discussion). Then choose encryption exponent e and decryption exponent d such that d*e == 1 modulo T. The public key is then the pair (N,e) and the private key (N,d).

The strength parameter of the system is the length of N in bits. As of mid-2010, 1024-bit keys are in widespread use but are generally in the process of being replaced by 2048-bit keys for greater security. Some users choose larger sizes to be on the safe side.

Using t for plaintext and c for ciphertext, encryption is then:

 c = te  modulo N

and decryption, using m for the decrypted message, is:

 m = cd modulo N

so we have:

 m = (te)d modulo N
 m = tde modulo N

whence (via the proof below)

 m = t   modulo N

That is, the encrypt/decrypt pair of operations always give the correct result.

Proof

The proof is based on theorems proved by Fermat, back in the 17th century,

For prime p and any x:

 xp == x   modulo p

and for non-zero x:

 xp-1 == 1   modulo p

Whence, for any k and non-zero x:

 xk(p-1) == 1   modulo p

so for non-zero x:

 xk(p-1)(q-1) == 1   modulo p

and

 xk(p-1)(q-1)+1 == x   modulo p

but that also holds if x is zero modulo p, since then both sides are zero.

Similarly,

 xk(p-1)(q-1)+1 == x   modulo q

so if p and q are distinct primes

 xk(p-1)(q-1)+1 == x   modulo pq

But we have:

 de == 1 mod T
 de = k(p-1)(q-1)+1   for some k

and

 m = tde    modulo N which is modulo pq

so

 m = t      in all cases

Security

A few considerations are critical for the security of RSA:

  • Generating primes which an enemy cannot guess requires a strong random number generator; any weakness in the generator reduces security.
  • Using the system securely requires that the private key never be revealed; losing that key to an enemy instantly reduces security to zero.
  • Discovery of an efficient solution to the integer factorisation problem would break RSA. See below.

For more subtle problems, see Attacks on RSA.

RSA and factoring

Given an efficient solution to the integer factorisation problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1, and T. He knows e and can calculate its inverse mod T using the efficient Extended Euclidean algorithm. That gives him d and he already has N, so now he knows the private key (N,d). With the private key he can both read encrypted messages and forge the digital signature of the key owner. The cryptosystem would be rendered worthless.

The problem with that is that no really efficient (polynomial in the number of bits in N) solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.

However, while no really efficient methods are known, there are various methods that do work (see integer factorisation), those methods are improving, and computers get faster all the time (see Moore's Law). It is advisable to use large RSA moduli to provide a safety factor against these changes.

RSA Laboratories ran a factoring contest for some years, with large cash prizes, to test the security of RSA. This ended in 2007.

Implementation differences

T can be calculated in two ways. The original RSA paper, and other references such as Schneier [5] and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). Menezes at al. give both [6]. Using the least common multiple has now become the usual implementation practice. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.

However it can lead to interoperation problems if two implementations do it differently. In an example from IPsec, the FreeS/WAN implementer used the product while PGPnet used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.

References

  1. Rivest et al. "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"
  2. Communications of the ACM, Vol. 21 (2), pp.120–126. 1978
  3. Previously released as an MIT "Technical Memo" in April 1977, and published in Martin Gardner's Scientific American "Mathematical Recreations" column
  4. U.S. Patent 4,405,829, PDF
  5. Schneier, Bruce (2nd edition, 1996,), Applied Cryptography, John Wiley & Sons, ISBN 0-471-11709-9
  6. Menezes, AJ; PC van Oorschot & SA Vanstone (Fifth Edition, 2001), Handbook of Applied Cryptography, ISBN 0-8493-8523-7