RSA and IDEA Security Algorithms
RSA and IDEA Security Algorithms
Prepared By:
1040589 1040766
Supervised By:
Directed By:
BIRZEIT
June – 2008
Table of Contents
Item page
List of Figures 4
List of tables 5
Dedications 6
ا 7
Abstract 8
1. Chapter 1 Introduction 9
2.4.1 Introduction 14
14
2.4.2 Numbers generation 14
2.7 Mathematics 19
3.1 RSA 21
3.1.1 Introduction 21
2
3.1.2 RSA Algorithm 21
23
3.1.4 RSA Usage
24
3.1.5 RSA Security
24
3.1.6 RSA Possible Attacks
3.2 IDEA 27
3.2.1 Introduction 27
6.Chapter 6 Conclusions 55
6.1 Conclusions 55
Appendices 56
A. Bibliography 56
B. Code 58
3
List of Figures
Figure 5.4 IDEA file size vs. time encryption for encryption/decryption 52
Figure 5.7 RSA128, RSA512 and IDEA file size before and after encryption 54
4
List of Tables
Table 4.2 Input and output word status across IDEA operators. 44
Table 5.3 Character size before and after encryption for RSA128 bit 52
5
Dedications
We dedicate this project:
• To our teachers who helped, encouraged, endured and advised us all along the way.
• To our instructor Miss. Muna Al‐Khayyat for her continuous support and fruitful advice.
6
ﺍﻝﻤﺴﺘﺨﻠﺹ
ﺍﻝﻬﺩﻑ ﺍﻻﺴﺎﺴﻲ ﻤﻥ ﻤﺸﺭﻭﻋﻨﺎ ﻫﻭ ﺩﺭﺍﺴﺔ ﻭﺘﻁﺒﻴﻕ ﺨﻭﺍﺭﺯﻤﻴﺘﺎﻥ ﺘﻌﻤﻼﻥ ﻓﻲ ﻤﺠﺎل ﺃﻨﻅﻤﺔ ﺤﻤﺎﻴﺔ ﺍﻝﻤﻌﻠﻭﻤﺎﺕ ,ﻭﺘﺤﻠﻴﻠﻬﻤﺎ
ﻭﺍﻝﻤﻘﺎﺭﻨﺔ ﺒﻴﻨﻬﻤﺎ ﻤﻊ ﻤﺤﺎﻭﻝﺔ ﺘﻁﺒﻴﻕ ﺒﻌﺽ ﺍﻝﻬﺠﻤﺎﺕ ﻝﻠﻜﺸﻑ ﻋﻥ ﻜﻔﺎﺀﺓ ﻜل ﻤﻨﻬﻤﺎ ﻓﻲ ﺍﻝﺼﻤﻭﺩ ﺃﻤﺎﻡ ﻤﺤﺎﻭﻻﺕ ﺍﺨﺘﺭﺍﻗﻬﻤﺎ.
ﺇﻥ ﺍﻝﻔﻜﺭﺓ ﺍﻷﺴﺎﺴﻴﺔ ﻭﺭﺍﺀ ﻜل ﻤﻨﻬﻤﺎ ﻫﻭ ﺇﺭﺴﺎل ﺍﻝﻤﻌﻠﻭﻤﺎﺕ ﺒﺤﻴﺙ ﺘﺒﺩﻭ ﻝﻠﻘﺎﺭﺉ ﺒﺸﻜل ﻏﻴﺭ ﻤﻔﻬﻭﻡ ,ﻭﻻ ﻴﺘﻤﻜﻥ ﻤﻥ ﻗﺭﺍﺀﺓ
ﺍﻝﻤﺤﺘﻭﻯ ﺇﻻ ﺍﻝﻤﺭﺴل ﺇﻝﻴﻪ ﻋﻥ ﻁﺭﻴﻕ ﻤﻌﺎﻝﺠﺔ ﺍﻝﻤﻌﻠﻭﻤﺎﺕ ﻗﺒل ﺇﺭﺴﺎﻝﻬﺎ ﻭﻤﻥ ﺜﻡ ﻤﻌﺎﻝﺠﺘﻬﺎ ﺒﻌﺩ ﺍﺴﺘﻘﺒﺎﻝﻬﺎ ﺤﺘﻰ ﺘﺼﺒﺢ ﺒﺸﻜل
ﻴﻤﻜﻥ ﻝﻠﻘﺎﺭﺉ ﻓﻬﻤﻪ.
ﺒﻨﺎﺀ ﻋﻠﻰ ﻫﺫﻩ ﺍﻝﺩﺭﺍﺴﺔ ﻭﻜﻨﺘﻴﺠﺔ ﻝﻤﺎ ﺃﻨﺠﺯ ﻤﻥ ﺘﺤﻠﻴل ﻨﺴﺘﻁﻴﻊ ﺃﻥ ﻨﺤﺩﺩ ﻨﻘﺎﻁ ﻗﻭﺓ ﻫﺎﺘﻴﻥ ﺍﻝﺨﻭﺍﺭﺯﻤﻴﺘﻴﻥ ﻤﻥ ﺃﺠل ﻤﺤﺎﻭﻝﺔ
ﺩﻤﺠﻬﻤﺎ ﻤﻌﺎ ﻝﺘﻜﻭﻴﻥ ﻨﻅﺎﻡ ﺤﻤﺎﻴﺔ ﻝﻠﻤﻌﻠﻭﻤﺎﺕ.
7
ABSTRACT
The aim of our project is to study and implement two security algorithms Rivest-Shamir-
Adleman (RSA) and the International Data Encryption Algorithm (IDEA) to make a
comparative analysis between them with some possible attacks.
The idea behind those two algorithms is to transmit data in a non clear text format ( encrypted
one ) to ensure security transmission , at the same time, they ensure a full understand of
the message by the receiver (interested party) who can decrypt the message using secret
special key along with a specific algorithm to return it back to its original form and
understand it.
Finally, a new implementation that combines between the two algorithms could be presented
according to the analysis that has been done on those two algorithms.
8
Chapter 1
1.1Overview
As the internet users are growing rapidly meanwhile the world is directed toward the
electronic data communication to facilitate their life. However, hackers find their chance to
defect data. Fortunately, cryptography appears as a key technology to handle these problems
in electronic security systems.
This seminar manly presents two security algorithms Rivest-Shamir-Adleman (RSA) and
the International Data Encryption Algorithm (IDEA), their objective is to transmit data in a
non clear text format ( encrypted one ) to ensure security transmission ( keep the data
security ), in the same time they ensure a full understand of the message by the receiver
who can decrypt the message using secret special key along with some algorithm.
In chapter two, introduces the basic concepts for cryptography types, goals, security and
attacks, with few details to provide background knowledge to understand the general
concepts and mathematics that is used in our project.
In chapter three, will describe the two algorithms: RSA and IDEA briefly; their structure,
keys, security and possible attacks.
The fourth chapter contains the description of the implementation of both algorithms, and a
chosen attacks done on them.
The comparative analysis of the two algorithms comes in the fifth chapter, along with the
analytical results for each one of them.
Finally, in chapter six a conclusion of our work result is mentioned, and what future work
might be done.
9
Chapter 2
Information Security and Cryptography
Cryptography “is the science of keeping secrets secret”1, in another words Cryptography is
the" science of information and communication security" 2
Information refers to any understood quantity, and Information Security tries to keep your
information secret to all parties except the authorized ones, that what we will try to achieve
using cryptography.
Cryptography originally was defined as the science of secret codes (where the code is a set
of defined symbols enable the party's to understand the messages being transmitted).Modern
Cryptography has a wider sense, being defined as the science of information protection
against unauthorized parties by preventing unauthorized alteration of use. Cryptographic
algorithms are the mathematical algorithms which enforce protection and cryptography
becomes more than encryption. (Vaudenay Serge, 2006).
1
Delfs Hans, Knebl Helmut, "Introduction to Cryptography Principles and Applications", Springer, (2007),
Second edition, ISBN-13 978-3-540-49243-6 , page 1
2
Vaudenay Serge, "A CLASSICAL INTRODUCTION TO CRYPTOGRAPHY Applications for
Communications Security", Springer, (2006), Page: preamble
10
2.3 Cryptography Types
Modern cryptography can be divided into two main classes: Symmetric, Asymmetric.
The symmetric (secret key) is the more traditional way of cryptography; it uses the same
key for both encryption and decryption, as shown in figure 2.1.
That is, for obtaining a cipher text c from a message m, there should be a key k and some
encryption algorithm E.
c = E (k, m)
And for decrypting the message to its original form, a decryption algorithm D is used
along with the same key used in encryption, that is:
m = D (k, c)
Thus, the encrypted plaintext m can be uniquely recovered from the ciphertext c. This
means that for a fixed key k, the encryption map must be bijective (one-to_one and onto
function in the same time).
E: K × M → C
11
Ek is the encryption function with respect to the key k.
The inverse function Dk: = Ek-1 is the decryption function. It is assumed that efficient
algorithms to compute Ek and Dk exist. (Delfs, et. al., 2007)
Symmetric-key encryption scheme main issue is that it requires a method by which the
two parties can communicate safely through the courier (the phone system, or some other
transmission medium) to prevent anyone to overhear the key in transit and later modify
and forge all the messages encrypted by that key, that what is called “key distribution”
problem.
The most common techniques in secret-key cryptography are: block cipher and stream
cipher
The Block ciphers operates on a blocks of data that have a fixed length of bits per
block, it takes the plain text and cuts it into blocks, then apply the
encryption/decryption on each block separately.
In the data breaking process, the need of a data pad may arise, that is, if the last
separated block hasn’t the same size as the others, it should be filled with data
(padding) in order to complete the encryption successfully, thus, when the data is
decrypted that padding should be discovered and ignored.
12
2.3.1.2 Stream ciphers
In order to use the XOR, the data used should be binary; thus, plaintexts, keys and
ciphertexts are bit strings. To encrypt a message m: = m1 m2 m3…, where mi є {0,
1}, a key stream k: = k1 K2 k3…, with ki є {0, 1}, is needed. Encryption and
decryption are given by bitwise XORing with the key stream:
Unlike the symmetric cryptography where the sender and the receiver of some message
has the same key; Asymmetric cryptography has a couple of keys to solve the key
management problem that the symmetric cryptography has.
The Asymmetric cryptography (Public key cryptography) was introduced in 1976, its
idea was to separate each key into two parts: a public key for encryption available to
everyone, and a secret key for decryption which is kept secret by the owner, moreover
the private key always linked mathematically to the public key.
In other words, every key is presented in a pair k = (pk, sk), where pk is a public key, and
sk is a secret key; to encrypt a message m to a ciphertext c using some encryption
algorithm E, the pk should be used along with it :
c = E (pk, m)
On the other hand, decrypting the ciphertext requires the secret key sk to exist.
3
Key-stream: is a series of random numbers that is generated according to the key when there is a need to.
13
“Public key techniques have to be used for a secure distribution of secret keys, and at
least some important forms of authentication and non-repudiation also require public-key
methods, such as digital signatures.” (Delfs, et. al., 2007)
2.4.1 Introduction
A major issue in using the different cryptographic techniques is the management of the
keys. Key management deals with the secure generation, distribution, and storage of
keys, this security is not only related to those who use it -that is in keeping it a secret-,
but it also referred to the way of generating that key.
Secure methods of key management are extremely important, since most of the attacks
on public-key systems are aimed at the key management level, rather than at the
cryptographic algorithm itself, although the key was generated randomly, but there are
always a way to find out the key through different algorithms, as brute force algorithm.
Since then, the key usually have a limited lifetime, given that the repetitive usage of the
key allows attacker to build up a store of ciphertexts (and possibly plaintexts) which may
prove sufficient for a successful cryptanalysis of the key value.(RSA Laboratories,
2000)
In the secret key cryptography, the key is a number that is generated randomly with a
specific size, but even though the numbers generated are random, there still be some way
to find a relationship between them, for example, converting these numbers to binary and
find out some frequently appears patters of 0s and 1s.
Since then, there are algorithms and devices that could generate probably random
numbers such as: RNG & PRNG.
RNGs are devices that works by gathering numbers from various kinds of
unpredictable inputs, such as by measuring radioactive decay, examining
atmospheric conditions in the vicinity, or calculating minute variances in
electrical current. These numbers pass the tests of randomness.(RSA
Laboratories, 2000)
14
Every generated group of numbers will never be repeated, since the output
depends on a variable input.
The session key here is the same as the encryption key, thus, if the session key has
broken, the information could be decrypted, that is, there is the session key and then the
Key Encryption Key, which known as (KEK).
KEK is based on password, thus, there is no need to store it and matter about its
protection, whenever the KEK is needed to encrypt or decrypt the data, it could be
generated, used then thrown away, it is generated using the password based encryption
(PBE) which gives the same value every time its generated since it depends on the
password given.
Using a session key for bulk data and protecting it with PBE means that users don’t have
to share passwords, and everyone could have his own password.
In order to encrypt the session key, the required bits of the mixing result is taken for
the KEK and used with some symmetric key algorithm, and after the encryption is
done there will be no need for the KEK and the password, so they thrown away, but
the salt used should be saved, since it is necessary for decryption.
For decrypting the data the same mixing algorithm that is used in encryption should
be used here, the password and the salt that has been used in encryption are given to
it in order to generate the KEK, then this KEK is used in the right symmetric key
algorithm to generate the session key.
In generating the session key again for decryption, the salt, password, and mixing
algorithm should be the same as those used in encryption, if one or more of them are
different, the result will be a wrong KEK.
The importance of the salt comes from its role to prevent pre-computations and foils
the dictionary attacks. If the KEK is generated using only the password, there is a
possibility for some attacker to create a dictionary of common passwords and their
associated keys, thus, attacker would try the pre-computed keys to find out the
password, but if the salt is used, there is no way to create this kind of dictionaries,
since different keys is produced by the same password
"There are many techniques in practice by which authentic public keys can be
distributed, including exchanging keys over a trusted channel, using a trusted public
file, using an on-line trusted server, and using an off-line server and certificates"
(A.Menezes, et al. , 1997).
Cryptography has many benefits and goals, and it is not viewed as only a mean to provide
information security but it is a set of techniques, and here four goals form a framework upon
which the others will be derived:
16
1. Confidentiality" is a service used to keep the content of information from all but those
authorized to have it. Secrecy is a term synonymous with confidentiality and privacy".
(A.Menezes, et al. , 1997).
2. Data integrity is a service which addresses any changes done by an unauthorized party on
the information, these changes includes (insertion, deletion and substitution). To assure data
integrity, one must have the ability to detect data manipulation by unauthorized parties.
(A.Menezes, et al. , 1997).
A fundamental goal of cryptography is to adequately address these four areas in both theory
and practice. Cryptography is about the prevention and detection of cheating and other
malicious activities. (A.Menezes, et. al. , 1997).
Attacks can be thought as the trials to break part or all of a cryptosystem, and the security of
your algorithm can be measured as the immunity against the attempts to know the decrypted
message contents by people rather than the recipients.
The security of a cryptographic algorithm should not rely on its secrecy, in other words
keeping your algorithm secret does not guarantee that it will be secure and trusted, but it
depends on how it is easy for an attacker to understand and alter your message without
having the key in advance.
17
There are two types of attacks can be implemented upon the different security algorithms, the
first type is known as the brute force where the attacker try to break the algorithm using
algorithms to try every possible key in turn until the correct key is identified, and the other is
by finding a weaknesses point in the algorithm structure.
“Cryptanalytic attacks can be mounted not only against encryption algorithms, but also,
analogously, against digital signature algorithms." (RSA Laboratories, 2000)
The security of a cryptographic algorithm (method) may depends on a mathematical problem
like the Difficulty of Factoring Integers, Difficulty of Discrete Logarithm, Difficulty of the
Subset Problem(Knapsack Problem), Algorithm Code Theory, Difficulty of the discrete
logarithm for elliptic curve or the Probabilistic Public Key Encryption.
The remaining unbroken problems are:
1. Difficulty of Factoring Integers, where the strength of the algorithm depends on how
hard for an attacker to get the prime factors for large numbers, RSA and Rabin
algorithms get their strength from this assumption.
2. Difficulty of Discrete Logarithm, the discrete logarithm assumption essentially states
that Exponent cannot be inverted by an attacker for all but a negligible fraction of
the inputs. Therefore, we call Exponent a family of one-way functions4, example: El-
Gamal algorithm. (Delfs Hans, et al., 2007)
3. Difficulty of the discrete logarithm for elliptic curve (discrete for infinite fields), for
example the ElGamal encryption and signature schemes. (beddo takmeleh??)
2.7 Mathematics:
The following are that basic mathematics needed to understand the algorithms presented in
this project.
Prime numbers are integers that are greater than 1 and are only divisible by themselves and
by 1, formally:
Definition 2.1 An integer p is said to be prime if its only positive divisors are 1 and p.
Otherwise, p is called composite.
4
A one-way function: is a function that is easy to compute but hard to invert.
[http://en.wikipedia.org/wiki/One-way_function].
18
The following are some well known facts about prime numbers.
Fact 2.1(fundamental theorem of arithmetic) every integer n ≥ 2 has a factorization as a
product of prime powers:
Where the pi is distinct primes and the ei are positive integers. Furthermore, the factorization
is unique up to rearrangement of factors.
Fact 2.2:
If a = p1e1 p 2e2 ... p kek , b = p1f1 p 2f 2 ... p kf k , where each ei ≥ 0 and fi ≥ 0, then
gcd( a, b) = p1min( e1 , f1 ) p 2min( e2 , f 2 ) ... p kmin( ek , f k )
An important point in cryptography is the Prime Numbers Generation where one can find
many algorithms to create prime numbers in a random way with large size to achieve security
Definition 2.2 (Integer Factorization Problem) which is finding number's prime factors, so
for a positive integer n, find its prime factors: n = p1 p2 ... pi where pi is positive distinct
prime number. Example: 257603 = 41 * 61 * 103.
Definition 2.5 Two integers a and b are said to be relatively prime or coprime if gcd(a, b) = 1
Definition 2.6 If a and b are integers, then a is said to be congruent to b modulo n, written
a b (mod n), if n divides (a − b). The integer n is called the modulus of the congruence.
Example
(i) 24 9 (mod 5) since 24 − 9 = 3.5.
(ii) −11 17 (mod 7) since −11 − 17 = −4 .7.
19
Chapter 3
RSA and IDEA Algorithms
3.1 RSA
3.1.1 Introduction:
The RSA is a public-key cryptosystem that offers both encryption and digital signatures
(authentication).Ronald Rivest, Adi Shamir, and Leonard Adleman developed it in 1977
and the name stands for the first letter in each of its inventors’ last names.
RSA is "Patent free since 2000". (Pekka Riikonen, 2002)."The basic technique was first
discovered in 1973 by Clifford Cocks, but this was a secret until 1997". (DI Management
Services, 2002)
Suppose A wants to send a message m to B, then A should have a pair of keys (public and
secret keys) generated by:
1. Generate two large random primes, p and q, of approximately equal size such that
their product n = pq is of the required bit length, e.g. 1024 bits, and they have
different values.
2. Compute n = pq and phi (φ) = (p-1) (q-1).
3. Choose an integer e, 1 < e < φ, such that φ and e has no common factors except 1,
in other words gcd(e, φ) = 1.
4. Compute the secret exponent d, 1 < d < φ, such that ed ≡ 1 (mod φ) which means
find d such that de % φ =1, which means d = e-1 mod (φ).
5. The public key is (n, e) and the private key is (n, d).
6. The values of p, q, and φ should also be kept secret.
Where:
20
• d is known as the secret exponent or decryption exponent
After the keys are generated, A can publish his public key but keep his secret key
secret, any message m now can be encrypted by senders.
2. Encryption Algorithm
Sender B does the following:-
Since only the recipient A knows d, only A can decrypt this message.
3. Decryption Algorithm
Recipient A does the following:-
The same algorithm can be used to sign documents, using the secret key to sign
documents and the public to verify, thus encryption and authentication take place
without any sharing of private keys: each person uses only another’s public key or
their own private key. Anyone can send an encrypted message or verify a signed
message, but only someone in possession of the correct private key can decrypt or
sign a message.
o In the RSA key generation algorithm the Extended Euclidean method used to find
the greatest common divisor gcd of two integers a and b, and the integers x and y
satisfying ax + by = d in the same time obtain d .
de mod φ
Using definition2.6
Then
1 = φ k + de ,
o The Miller-Rabin Probabilistic Test: algorithm used to test the randomly generated
numbers p and q
These days RSA is deployed in many commercial systems, it is used for email privacy
and authenticity, also to secure remote login sessions, and it is at the heart of
electronic credit-card payment systems.(Boneh Dan, 2000)
22
In operating systems the RSA is currently built by Microsoft, Apple, Sun, and Novell
in their products.
In hardware it can be found in secure telephones, on Ethernet network cards, and on
smart cards.
For secure Internet communications it is used in internet protocols, including
S/MIME, SSL (Secure Socket Layer) used for TCP/IP connections over the World
Wide Web, and S/WAN. (RSA Laboratories, 2000)
Also RSA is used in PGP (Pretty Good Privacy) software for e-mail protection. (H. T.
Riele, 1999)
RSA gets its security from the difficulty of obtaining the private key d from the public
key (n, e), so one need to factor n into p and q then calculate the private key d.
That is what is called the "RSA factorization problem" has not been proven to be
intractable (i.e., difficult to solve in an efficient manner). Rather, it is believed to be
intractable because years of intensive study by leading mathematicians and computer
scientists have failed to yield efficient algorithms5 for solving them. And till now it is
an assumption that it is hard to factor n into p and q. Thus the security of the RSA
system is based on the assumption that factoring is difficult.
Twenty years of research to break RSA produces types of attacks, each considering
away to decrypt the ciphertext without having any secret information, like attacks
related to the factoring, Small encryption exponent e, Small decryption exponent d,
Forward search attack, Multiplicative properties, Common Modulus Attack, Cycling
attacks and Message Concealing:
Breaking the RSA system can be done by discovering the private key corresponding
to a given public would enable an attacker to read all encrypted messages and to forge
signature and the obvious way to do that is to factor the modulus, n, into its prime
5
Efficient Algorithm : an algorithm that finds the best way to get the best solution.
23
factors, p and q then calculate d the secret exponent using p, q and e, this method can
be considered as one of the ways to attack the RSA, it is known as the " RSA
factorization problem" 6
In order to decrease the encryption time therefore increase the encryption efficiency,
one can choose small public exponent e, but if the same message m was encrypted
with same e to be sent to three different recipients whose modulus n and secret
exponent d are different using the Gauss’s algorithm and the Chinese remainder
theorem to get m from the different encryptions of the message.
As the previous case it may be desirable to choose small d, in order to decrease the
decryption time and increase the decryption efficiency7, "a small d can improve
performance by at least a factor of 10 (for a 1024 bit modulus)"( Boneh Dan, 2000)
" if d has up to approximately one-quarter as many bits as the modulus n, then there is
an efficient algorithm for computing d from the public information (n; e)" .This
attack can be avoided by choosing d and n with similar sizes. (A.Menezes, et al. ,
1997).
If the message is small or can be predicted, an attacker can encrypt all possible
plaintext messages and compare them with the encrypted message c until a similar it
is obtained. Salting the message with additional spaces can prevent that attack.
If the modulus n was shared between many recipients even though e and d have
different values discovering a pair of d and e will lead to factorization of n, hence the
attacker could subsequently determine the secret exponents of all other entities in the
6
See 3.1.5
7
In this case, one would select d first and then compute e with special algorithms , for more details see
24
network so cracking all the messages between all these recipients, and that what
forces each entity to choose its own modulus n.
Another point here is that if the same message was encrypted with n and different e
values an attacker can recover it most likely just using the publicly known
information.
me m (mod n).
There are always some messages which are unconcealed (for example m = 0, m = 1,
and m = n − 1). (A.Menezes, et al. , 1997).
25
3.2 IDEA
3.2.1 Introduction:
The IDEA algorithm is a secret key cryptosystem, was developed in a joint project
between the Swiss Institute of Technology in Zurich (ETH) and Ascom AG of
Switzerland.
It was first introduced in 1990 be Lai and Massay as the Proposed Encryption
Standard (PES), then they modified PES with Murphy and called it Improved PES
(IPES), in 1992 it was renamed International Data Encryption Algorithm (IDEA).
(Leong, et. al., 2003)
The IDEA requires 52 different keys of 16 bit for each Z i(r), where (i) is the number of
the sub-key and r is the round used in it, 6 sub-keys are used in each round of the 8,
and 4 sub-keys are used in the final half round; the 52 sub-keys that are used in
encryption are generated according to the main 128 bit key, while the 52 sub-keys that
are used in decryption are generated based on the encryption sub-keys, the table 3.1
shows the various keys that are used in the encryption process.
26
Table 3.1: The IDEA Encryption sub-keys.
The key schedule is the algorithm that is used for generating the different 52 sub-keys
for both encryption and decryption process.
The generation of the encryption keys depends on the IDEA 128 bit master key; those
keys are produced using these steps:
‐ The first 8 sub-keys Z1 – Z8 produced by dividing the 128 bit key into 8 parts,
each part of these has 16 bit.
‐ The next 8 sub-keys are generated by shifting the key with 25 bits, then dividing
the resulting 128 bit into an eight 16 bits keys again.
‐ The keys for each round are generated using the same way as the second round.
The last 4 keys which are used for the output transformation (OT) are taken from the
lower 64 bits of the master 128 bit key.
The table 3.2 shows the various dependencies for the sub-keys bits on main the 128 bit
keys.
27
Table 3.2: The key schedule algorithm of IDEA (Biham, et. al., 2001)
The decryption sub-keys are derived from the encryption sub-keys, according to the
table 3.3, which is done using the multiplicative inverse8 and the additive inverse9.
8
The multiplicative inverse (modulo 216 + 1) of the key Z is Z-1, where, Z AND Z-1 = 1
9
The additive inverse (modulo 216 + 1) of the key Z is - Z, where, -Z OR Z = 0.
28
3.2.2.2 IDEA Encryption / Decryption:
The IDEA block cipher uses a 128 bit key, and processes on 8 Bytes input to produce
an 8 Bytes output, using 8 identical blocks (rounds) and an additional half round, each
one of them uses only three algebraic operations which includes:
The same rounds are used in both encryption and decryption, each round takes a 64
bits input which is divided into four 16 bits blocks (X1 – X4) as an input, applies the
operations on them, and transform them into a four 16 bits block (Y1 – Y4). The only
difference between the rounds –beside the input- is the keys that are used in the round.
10 2n
The Fermat Prime: is a positive integer with a form: Fn = 2 + 1 which is supposed to be prime, where n
is a nonnegative integer.
29
Figure 3.1: IDEA cipher encryption/decryption process.
30
3.2.3 IDEA Security:
‐ Confusion:
‐ Diffusion
The round function is said to be complete, since each bit of the output depends on
each plaintext bit and each key bit for that round.
The MA structure transforms two 16 bit sub-blocks (U1, U2) into two 16 bit sub-
blocks (V1, V2) using another two 16 bits sub-keys (Z1, Z2).
11
Refer to 2.6 for more details.
31
‐ Perfect secrecy of a onetime key:
The IDEA routine uses completely different keys of 16 bits for encryption and
decryption, besides, the keys used in encryption process are different from those
used in decryption, and every key of the 52 encryption and decryption keys is
used only once.
Since the IDEA was introduced in 1991, it endured all the known attacks and the developed
attacks too, in spite of the fact that several attacks has broken quite a few number of its
rounds, however no attack has broken the full IDEA rounds (8.5 round)
In 1993, W. Meier was the first one to publish an attack based on differential cryptanalysis
against up to 2.5 rounds running faster than an exhaustive search. Then, Borst, Knudsen and
Rijmen presented a differential-linear attack against 3 rounds and a truncated differential
attack on 3.5 rounds; then many attack were introduced to break the cipher based on different
basics as finding mathematical proofs, weaknesses of the key used, finding relations between
the IDEA operations.
“No published attack (with the exception of attacks on weak keys) is better than exhaustive
search on the 128-bit key space, which is computationally infeasible. The security of IDEA
appears bounded only by the weaknesses arising from the relatively small (compared to its
key length) block length of 64 bits.” (Hammalainen, et. al., 2002)
32
Table 3.4: Selected known attacks on IDEA (Biham, et. al., 2001)
33
Chapter 4
Code Implementation and Attacks
4.1 RSA Algorithm Implementation:
The first task is to generate the prime numbers p and q using the "SecureRandom" class
built in "security" library in java, encapsulated with a "BigInteger" object to handle
numbers of any size.
When constructing a BigInteger we specify a bit-length and the amount of times t, that we
want the Miller-Rabin12 probabilistic test to run on the BigInteger, as well as supplying a
random set of bits for these tests. This will generate a random integer which is probably
prime with the specified bit-length. The probability that the new BigInteger represents a
prime number will exceed (1 -1 / 4t).
After the generation of prime p and q the values of n and φ13 can be easily generated as
follows:
N = P.multiply(Q);
M = (P.subtract(BigInteger.ONE)).multiply(Q.subtract(BigInteger.ONE));
The next step is to calculate e which must be co-prime to m, i.e. gcd(e, φ) = 1.Start from e
= 3, if gcd(e, φ) 6= 1 we let e be the next odd number. We continue in this fashion until the
gcd(e, φ) = 1.
The reason of starting e from 3 and increment by 2 is to get odd numbers because φ will
always be even so therefore no even number will be co-prime to it.
12
See Chapter3 for more details
13
Φ replaced by the symbol M in our implementation
34
The gcd found using the Extended Euclid Algorithm14 which finds the gcd(e, φ) and the
integers x and y that satisfy ax + by = gcd to be used in the next step.
// a=e, b=φ
public BigInteger[] EE(BigInteger a, BigInteger b, BigInteger c, BigInteger
d, BigInteger e, BigInteger f)
{
if (b.compareTo(BigInteger.ZERO)==0)
{
BigInteger[] ret = {BigInteger.ZERO,
BigInteger.ZERO,BigInteger.ZERO};
ret[0] = a; // gcd(a,b)
ret[1] = e; // coefficient of a
ret[2] = f; // coefficient of b
return ret;
}
else
{
return EE(b, a.mod(b), e.subtract((a.divide(b)).multiply(c)),
f.subtract((a.divide(b)).multiply(d)), c, d);
}
}
The last step in the key generation is to get d, which equals the value of x obtained from
Extended Euclid Algorithm.
Encryption Algorithm
Decryption Algorithm
Also the decryption done using the same method used for encryption
There are basically two types of factoring algorithms, special-purpose and general-purpose.
Special-purpose factoring algorithms attempt to benefit from special features of the number n
being factored. In contrast, the running time of general-purpose factoring algorithms depend
only on the size of n.
One of the most powerful special-purpose factoring algorithms is the Elliptic Curve
Factoring Method (ECM) that was invented in 1985 by Hendrik Lenstra Jr. The running time
of this method depends on the size of the prime factors of n, and hence the algorithm finds
small factors first.( Certicom Corporation, 2000)
14
See chapter3 to find out the algorithm.
35
The largest factor found by the Elliptic Curve Method has 67 digits, found by B. Dodson on
August 24, 20061 (Dr.Ir. Riele, 2005)
Just prior to the development of the RSA cryptosystem, the best general-purpose factoring
algorithm was the continued fraction algorithm the best general-purpose algorithms used
today: the quadratic sieve (QS) and the number field sieve (NFS). These algorithms allow
factoring on distributed networks of workstations so it decreases the need to large mainframe
computers or supercomputers. ( Certicom Corporation, 2000)
The largest integer factored with a general-purpose algorithm (GNFS15) is RSA200 (200
digits), which was factored on May 9, 2005 by Bahr, Boehm, Franke and Kleinjung (Dr.Ir.
Riele, 2005)
15
http://www.loria.fr/~zimmerma/records/rsa200
16
One MIPS year is the equivalent of a computation during one full year at a sustained
36
“Factoring expert Richard Brent from the Oxford University Computing Laboratory has
given an experimental formula which ‘predicts’ the year Y in which we may expect a number
of D decimal digits to be factored:
In the derivation of this formula, Brent took into account the known factoring world record
data, the known heuristic complexity formula for NFS and Moore’s law (computing power
doubles every 18 months). According to this formula, a 768-bit number (D = 231) will be
factored by the year 2010, and a 1024-bit number (D = 309) by the year 2018". ( H. T. Riele,
1999)
In order to be in the safe side, RSA Laboratories currently recommends key sizes of 1024 bits
for organization use and 2048 bits for extremely long term security.(RSA Laboratories,
2000)
In this project we try to implement some attacks to test our implementation of the RSA algorithm by
trying to factor some modules n generated by this implementation.
First we try to factor n for key length 128 bit (39 digit) using the elliptic curve 20method and we get
results in 3 seconds, but it works for more than two days to factor 512 bit without results.
On the another hand, we try to factor n using the Trial division attack21 it was useful to factor numbers
less than (64) bits, but it works for 24 hours to factor the (70) bits without giving results.
Key generation:
The 128 bit key is generated based on the random operation in the “Math” library in java,
which generates 16 random characters to be the IDEA key.
key = "";
double randomNumber;
double randomNumberSetup;
char randomCharacter;
20
We use an applet implementation for the method, see websites[3]
21
See appendix Bto find out the code
37
while( key.length() < 16 )
{
randomNumber = Math.random();
randomNumberSetup = (randomNumber*26 + '0');
randomCharacter = (char) randomNumberSetup;
key += randomCharacter;
}
The first 8 sub-keys ( Z1 - Z8 ) for the first round are generated by dividing the 128 into 8
parts, each is of 16 bits.
In other words, since the key is represented in 16 ASCII characters, the first sub-key would
be the value of the first two characters of the key; that is, the higher 8 bits of the sub-key
value is the ASCII of the second character, and the lower 8 bits value is the first character’s
ASCII.
The rest of the first 8 sub keys are done using the same way as the first sub-key has been
done.
The keys ( Z9 – Z48 ) are generated gradually in 6 rounds, 8 in each one, the 8 keys in each
round is generated as the following:
‐ The first key is taken from the first 7 bits of the second key from the previous
round and the last 9 bits from the second key in the previous round.
‐ The second key is taken from the first 7 bits of the second key from the previous
round and the last 9 bits from the third key in the previous round
38
‐ The 7th key of the round is taken from the first 7 bits of the last key from the
previous round and the last 9 bits from the first key in the previous round.
‐ The 8th key of the round is taken from the first 7 bits of the first key from the
previous round and the last 9 bits from the second key in the previous round.
‐ Z49 is generated from the first 7 bits of Z42, and the last 9 bits of Z41.
‐ Z50 is generated from the first 7 bits of Z43, and the last 9 bits of Z42.
IDEA round:
As we’ve mentioned before, the encryption and the decryption processes are essentially the
same, the only difference between them is the sub-keys used in each process.
IDEA has eight rounds; assuming that (X1-X4) represents the input block, (Y1-Y4)
represents the output block, and (Z1-Z6) are the first 6 sub-keys, then the first round would
be represented as the following:
39
X1 x Z1 -> d1
X2 + Z2 -> d2
X3 + Z3 -> d3
X4 x Z4 -> d4
d1 XOR d3 -> d5
d2 XOR d4 -> d6
d5 x Z5 -> d7
d6 + d7 -> d8
d8 x Z6 -> d9
d7 + d9 -> d10
d1 XOR d9 -> d11
d3 XOR d9 -> d12
d2 XOR d10 -> d13
d4 XOR d10 -> d14
repeating those steps 7 more times using a for loop, to give a total rounds of 8, each uses the
output of the previous round as an input, that is (d11-d14) will be the input of the second
round.
After the 8th round completed, it’s output would be (s1-s4), which will be the input of the last
half round (OT) along with the last 4 sub-keys, and the final output would be (Y1-Y4).
s1 x Z49 -> Y1
s2 + Z50 -> Y2
s3 + Z51 -> Y3
s4 x Z52 -> Y4
Since the IDEA is a block cipher, the last taken block from the file may not be a complete
block, therefore, we have filled the deficiency if the block with white spaces when it happens.
1. Prediction: given two plaintexts with some XOR value, we can predict the XOR
of the cipher texts.
2. Distinguishing: take many plaintexts with the input difference, check how many
satisfy the output difference.
40
3. Key Recovery: combine distinguishing with auxiliary techniques.
41
Definition 1: (Word Status)
An active word stands for an n-bit quantity which assumes all 2n possible values.
An active word contains, therefore, a permutation of 2n values: 0, …, 2n - 1.
Analogously, a passive word always assumes a fixed value. Words which are neither
active nor passive are termed garbled.
For Square, n = 8, the Square analysis will be initially performed with n = 16.
Definition 2: (h-set)
A h -set is a set of 2n text blocks whose consecutive n-bit words are either active,
passive or garbled.
the word xi, at the i-th position, is said to be balanced over the given h-set.
Taking into account that:
‐ Both active and passive words are always balanced.
‐ Garbled words can also be balanced. but not always.
A Square attack starts by choosing a h -set such that the balanced words propagate
for as many rounds as possible across the cipher.
By following the propagation of balanced words through multiple rounds, it is
possible to identify a pattern of active, passive and garbled words.
The h -sets in successive rounds should have at least one balanced word.
42
In the round just following the one in which no more balanced words are found, an
attack is made by using the property of balanced words to distinguish sub-keys in the
round right after the last balanced words.
Table 4.2: Input and output word status across IDEA operators
Chains of 4-tuples will represent status of data words across multiple rounds. For
example, (P P A A) (P A A A) (A A A A) means that (P P A A) (P A A A)
and (P A A A) (A A A A) across two consecutive rounds.
Also, let Pi = (Pi1; Pi2; Pi3; Pi4) denote the i-th plaintext block and Ci =
(Ci1;Ci2;Ci3;Ci4) the corresponding ith cipher-text block in a h set. The number of
attacked rounds will be made clear from the context.
43
Definition 4: (An nR-Attack)
An nR-attack stands for a Square attack on n rounds of a cipher, using the property of
balanced words to distinguish some sub-key bits surrounding a chain of h-sets. Here,
n could be fractional in case half-rounds are included in the attack.
The best Square attacks founded on IDEA, employing 16-bit words, use the following
h-set chains:
(P A P P) (A A * A) (? ? ? ?) (1)
(A P A P) (A A P P) (? ? ? ?) (2)
A 0.5R-attack on 2.5 rounds of IDEA can be made using chain (1). This attack
exploits that the property is balanced, to discover two
sub-keys Z3(3) and Z4(3) . The most significant bit of Z3(3), though, cannot be uniquely
determined. In order to filter out wrong 31-bit key candidates, two h-set are used.
(See Fig. 4.2).
That is a total of 2.216 chosen-plaintexts and 216.231+216 .215 ≈ 247 half-round IDEA
decryptions.
44
Figure 4.2: 0.5R-attack on 2.5 rounds of IDEA using chain (1)
Chain (2) can be used in an attack that exploits further the kind of permutation behind
an active word. A 1R-attack can be made on 2.5 rounds of IDEA, using (2), consisting
of a 0.5R-attack at the beginning and a 0.5R-attack at the end of 2.5 rounds.
‐ The attack uses a h-set with the first and third words both active and
containing the same permutation.
‐ Sub-keys Z1(1) and Z3(1) are guessed by multiplying the first active word in the
−1
h-set by (a candidate subkey for) Z 1(1) and adding the third active word with
(a candidate subkey for) -Z3(1.
45
‐ When the correct subkey pair (Z1(1), Z3(1)) is found, the active word, which is
the XOR of the corresponding outputs after the subkey mixing, will be equal
and so will cancel out, generating a passive word. This passive word will
propagate through the MA structure inside the first round and will not affect
any of the two original active words.
‐ At the end of the second round, the following two values might be active for
the correct keys: and
(See Fig. 4.3). Therefore, it is possible to discover two more subkey pairs.
‐ When the correct ( Z1(1), Z3(1) ) is used, both ( Z1(3), Z2(3) ) and ( Z3(3), Z4(3 ))
can be computed separately.
46
‐ According to the key schedule of IDEA, Z1(1) and Z3(3) share bits 0-8 of the
22
master key, and similarly, Z1(1) and Z4(3)share bits 9-15 . Therefore, the 4-
tuple (Z1(1) , Z3(1), Z3(3) , Z4(3) ) consists of 48 non-overlapping key bits.
Nonetheless, the MSB23 of the additive subkeys Z3(1), Z2(3), and Z3(3) are not
uniquely determined.
‐ Therefore, to discover the initial, correct 46 key bits with high probability,
three h-sets are used. This reduces the chance of a wrong subkey being
filtered to (2-16)3 = 2-48.
The initial computational cost of this attack is 3. 216 chosen-plaintexts and 216. 246 +
216. 230 + 216. 214 ≈ 262 half-round IDEA decryptions. Once ( Z1(1) , Z3(1) ) are
discovered, they can be used to find (Z1(3) , Z2(3) ) - except for the MSB of Z2(3) - with
two h -sets and 216.231+216.215 ≈ 247 half-round IDEA decryptions. Notice that (Z1(3),
Z2(3) ) and (Z1(1) , Z3(1) ) do not share any key bits via the key schedule.
22
Refer to Table 1.
23
MSB: most significant bit.
47
Chapter 5
Comparative Analysis
5.1 RSA vs. IDEA
Let's begin with RSA, its speed as mentioned in the RSA laboratory articles, depends on the
typical modular exponentiation algorithms used in its implementation, so they approximate
the public key operations in terms of the number of bits (k) used in the modular, to be O(k2)
steps, and the private key operations to take O(k3) steps, and the key generation will take
O(k4) steps. They also states that block ciphers are much faster than the RSA algorithm.
According to Bruce Schneie 2002, "RSA will never approach the speed of symmetric
algorithms", and he supported his speech with the following table:
Our implementation testing starts after fixing all the variables except the one that is being
studied ,for that purpose we test the two algorithms on the same pc, with processor Intel(R)
Core(TM)2 Duo CPU 1.60GHz, and we had used the same files with sizes ranges from 1 KB
up to 5 MB, the time described here is the time taken by the algorithms to make a specific
process in milliseconds and the encryption/decryption time includes the time needed to
read/write from/to the original/encrypted file.
48
Let’s start from the key Generation time for the RSA algorithm the RSA takes few
milliseconds to generate a key, as shown in figure 5.1, the time needed to generate a key in
the RSA differs within trials; because we create keys from random numbers24.
Key Generation
300
250
200
Time (ms)
150
100
50
0
1 2 3 4 5 6 7 8 9 10 11 12 13
Trail #
On the other hand the key generation in has an almost zero second generation time, that is
because the master key is generated randomly and the 52 encryption / decryption keys are
generated using the bitwise shifting operation, which is the fastest operation.
Continuing to the algorithms tests, for the RSA algorithm, we have file with size 20KB and
Key length 16 bit, along with the algorithm gives the sample output, as shown in table5.2
24
See Chapter3 for more details about the RSA key generation algorithm.
49
A number of files is prepared to be tested with sizes in KB are (50, 100, 150, 200, 300, 400,
800, 1000, and 1500) in both of the algorithms.
Starting with the RSA and 128 bit key length appears, it seems that the algorithm takes a long
time to decrypt the files and less time to encrypt them, and approximately no time to generate
the keys, that is relatively to the encryption / decryption time, see figure 5.2
250000
200000
Time (ms)
150000
100000
50000
Figure 5.2: RSA 128 file size vs. time for encryption/decryption.
Moreover, the figure (5.3) shows the various sizes of the encrypted files compared with the
original files.
18000
16000
14000
12000
Size (KB)
10000
8000
6000
4000
2000
0
1 2 3 4 5 6 7 8 9 10 11 12 13
File #
Figure 5.3: RSA file size change before and after encryption
The rapid increase in the data size after the encryption process that is going to be fed to the
decryption algorithm, Table 5.3 shows how the file size increases after encryption, because
the character size is grown after the encryption.
50
Test.txt // File Name
Enter key length : 128
Encryption Time : 0
Letter Before Encryption : 116 //ASCII for the letter "t"
Letter After encryption : 21003416576
Table 5.3: Character size before and after encryption for RSA128 bit
Then the same files were tested in encryption / decryption with both IDEA of block sizes 8
and 16 bytes, in order to notice the difference between them, the results are shown in figure
5.4
(A) I DEA 8 Bytes Block Size (B) IDEA 8 Bytes Block Size
Figure 5.4: IDEA file size vs. time encryption for encryption/decryption.
We could see clearly that the time taken for encryption is approximately the same time taken
for decryption, since the operations done in encryption are the ones which done in decryption,
and the number of processed blocks are the same in both encryption and decryption, see
Table 5.2
51
File Name: Test.txt
key length: 128
Output Block:
Output 1: 111
Output 2: 32
Output 3: 87
Output 4: 111
Output Block:
Output 1: 72
Output 2: 101
Output 3: 108
Output 4: 108
Besides, the time taken for the encryption using the block size 16 Byte was performed faster
than the block size 8 Bytes. See figure 5.4
52
Figure 5.6: IDEA Encrypted Files Size.
As a result of the encryption process, we could see the output ( encrypted ) files sizes, (see
figure 5.5) notice that when using the block size of 16B, the encrypted file size will be the
same as the original file, which is a good result.
As a conclusion, IDEA algorithm behaves significantly faster than RSA algorithm since data
size in IDEA remain constant and doesn't change in both encryption and decryption
processes while it increases rapidly as the key size increases in RSA algorithm.
35000
30000
File Size(K.B.)
25000
20000
15000
10000
5000
0
Trial #
File Size RSA128 RSA512 IDEA
Figure 5.7: RSA128, RSA512 and IDEA file size before and after encryption
The above figure for files sizes 150, 200, 300, 400, 800, 1000, and 1500 KB, shows that
RSA128 results in smaller sizes than RSA512 and both are larger than IDEA; because the
IDEA algorithm keeps the data size fixed before and after encryption, in contrast to the RSA
algorithm, in which the numbers after encryption grows with the exponent operations without
restrictions on their size.
53
Results for attacks:
RSA
We can factor numbers with a key length that is not used today, so our implementation for
the RSA algorithm is in the safe side.
As mentioned by Boneh Dan that Two decades of research into inverting the RSA function
produced some insightful attacks, but no devastating attack has ever been found (Boneh
Dan, 2000)
IDEA
Although that several attacks has been performed against IDEA, but no attacks till now has
broken all its rounds, thus, IDEA is a perfect security algorithm.
54
Chapter 6
Conclusions and Future work
6.1 Conclusions
‐ RSA algorithm is a reliable algorithm, since it faces two decades of attacks without
finding an attack that could weaken its structure and each key had been broken they
can use a larger one, bur several restrictions should be taken into account to get a high
level of security while implementing the RSA algorithm.
‐ Because of the new complex and secure design of IDEA, it is nearly impossible to
attack it completely with its full 8.5 rounds, Still there are several attacks that has
broken the reduced IDEA, that is about 4 founds.
‐ Combining the two algorithms in a one powerful security algorithm, that to derives its
benefits from the strengths of each one of the used algorithms.
‐ A development of some of the theoretical attacks for both RSA and IDEA, on the
other hand thinking over new developments that could handle out those attacks.
55
Appendices
A. Bibliography
Books
[1]. Menezes A., Oorschot P.van, and Vanstone S., "HAND BOOK OF APPLIED
CRYPTOGRAPHY", CRC Press, (1997).
[4]. RSA Laboratories, "RSA Laboratories’ Frequently Asked Questions about Today’s
Cryptography", RSA Security Inc, (2000), Version 4.1
[6]. Stamp Mark, Low Richard M., “Applied Cryptanalysis: Breaking Ciphers in the Real
World”, Wiley, (2007), ISBN: 9780-470-11486-5.
Papers
[1] Certicom Corporation, " Remarks on the Security of the Elliptic Curve Cryptosystem ",
Certicom, (July 2000)
[2] Riele, H. T. "Factoring Large Numbers: Fun or Applied Science?", Research project:
Computational Number Theory and Data Security, CWI-AR 1999
[3]. Boneh Dan, "Twenty Years of Attacks on the RSA Cryptosystem", American
Mathematical Society (AMS), Vol. 46, No. 2, pp. 203-213, 1999
56
[4]. Biham E., Dunkelman, O. and Keller, N. , "A New Attack on 6-Round IDEA", Springer-
Verlag, 2001.
[5]. Nakahara J., Barreto P., Preneel B., and Vandewalle J., "Square Attacks on Reduced-
Round PES and IDEA Block Ciphers", In Proceedings of the 23rd Symposium on
Information Theory in the Benelux, B. Macq, and J. Quisquater (eds.), Werkgemeenschap
voor Informatie- en Communicatietheorie, 2002.
[6]. Leong, M.P. Cheung, O.Y.H. Tsoi, K.H. Leong, P.H.W. , “A bit-serial
implementation of the international data encryptionalgorithm IDEA”, Dept. of Comput.
Sci. & Eng., Chinese Univ. of Hong Kong, Shatin, 2003.
[7]. Antti Hammalainen, Matti Tommiska, and Jorma Skytta, “6.78 Gigabits per Second
Implementation of the IDEA Cryptographic Algorithm”, In Proceedings of the 12th
Conference on Field-Programmable Logic and Applications (FPL 2002). Montpellier, France,
2002.
Websites
57
B. Code
58