Secure Data Communication and Cryptography Based On DNA Based Message Encoding
ABSTRACT
Information that flows through a network may be of local or global scope. It must be secured against unauthorized access by any node along the path. Many users and organizations want to protect their crucial data from attackers and hackers. For the network to be reliable, the privacy, integrity and confidentiality of its data must also be ensured. To achieve this, data must be encoded before it is sent over unreliable communication channels, so that it is unreadable in transit. This is where cryptography comes into the picture. Various cryptographic systems have been developed in past years, and the latest development in this field is DNA cryptography. The concept emerged after the computational ability of deoxyribonucleic acid (DNA) was demonstrated. Much of the research in DNA cryptography aims to make the computation harder for an unauthorized user. The field is still in its development phase and needs a great deal of work and research before it reaches an established stage. In this paper, a proposal is given in which the concept of DNA is used in the encryption and decryption process. The theoretical analysis shows this method to be efficient in computation, storage and transmission, and strong against certain attacks. This paper also proposes a secure symmetric key generation scheme that produces a primary cipher, which is then converted into a final cipher using DNA sequences to make it still harder to read. Finally, the implementation methodology and experimental results are presented, and the conclusion and future work are described in the last section.

Keywords
Security; Encryption; Decryption; Key generation; Cipher text; DNA cryptography.

1. INTRODUCTION
System security is essential nowadays. With the growth of information technology and the emergence of new techniques, the number of threats a user has to deal with has grown sharply. It does not matter whether we talk about bank accounts, social security numbers or a simple telephone call: it is important that the information is known only to the intended persons, usually the sender and the receiver. This is where cryptography comes into the picture. Cryptography is the basis of security for all information.

2. BACKGROUND
2.1 Cryptography
Cryptography is the art and science of achieving security by encoding a plain message to make it unreadable [3] [20]. In the typical cryptographic scenario shown in Fig. 1, Bob (the sender) wants to send messages secretly to Alice (the intended receiver), and Eve is a third party who tries to read the message but does not succeed, because the message is protected by a cryptographic algorithm. The message to be sent, written in ordinary language understood by all, is called the plaintext. The process of converting plaintext into a form that cannot be understood without special information is called encryption. The unreadable form is called the cipher text, and the special information is called the encryption key. Converting cipher text back into plaintext is called decryption, and the special knowledge it requires is called the decryption key. Only the receiver has the decryption key, so only the receiver can decrypt the cipher text.

There are two basic types of cryptography, distinguished by how plaintext is converted to cipher text and back: symmetric and asymmetric. In symmetric cryptography the sender and receiver use the same key for encryption and decryption of text, whereas asymmetric cryptography uses two keys, a public key and a private key. As long as the private key is kept safe, the data remain safe. The disadvantage of asymmetric algorithms is that they are computationally intensive. Therefore, the proposed system uses symmetric key cryptography, with the intention of achieving high data security at low computational cost.

Cryptographic mechanisms depend on the degree of randomness and uncertainty in generating the cipher text from the plaintext. Depending on the underlying phenomenon, there are several types of cryptography. Modern cryptography is based on difficult mathematical problems such as prime factorization and matrix manipulation. Elliptic curve cryptography makes use of elliptic curve problems. Quantum cryptography uses the uncertainty inherent in quantum states. DNA cryptography depends on difficult biological processes in the field of DNA technology [13] [21].
International Journal of Computer Applications (0975 – 8887)
Volume 98– No.16, July 2014
DNA computing came into the picture in 1994 [15], when L. Adleman [14]-[16] solved an instance of the directed Hamiltonian path problem and laid the foundation for research in the field of biocomputing. In the field of cryptography, Ashish Gehani et al. introduced the first algorithm for DNA-based cryptography [1] [2] [11].

3. DNA BIO-LOGICAL THEOREM
Deoxyribonucleic acid (DNA) is a biomolecule present in almost all living things [21]. Most DNA is located in the cell nucleus, but a small amount is also found in the mitochondria. DNA carries the genetic information of a living thing, including the genetic instructions used in constructing other cells [12]. In 1953, James Watson and Francis Crick described the structure of DNA. A DNA molecule is composed of two single strands that form a double helix, as shown in Fig. 2.

Fig 2: Basic DNA Structure [8]

A DNA sequence uses only four letters: adenine (A), cytosine (C), guanine (G) and thymine (T). Each letter corresponds to a nucleotide. Watson and Crick proposed a complementary rule for DNA sequences: A pairs only with T, through a double bond (A=T), and C pairs only with G, through a triple bond (C≡G) [12]. DNA is the main carrier of genetic information for all kinds of organisms in the biosphere. It is composed of two long strands of nucleotides, each of which includes a deoxyribose sugar and a phosphate group. DNA sequences are responsible for the transfer of complex information. The sequence of the bases determines the information available for building an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences [12]. DNA passes information to messenger RNA (mRNA) through the process of transcription, and mRNA then passes the information on to protein through the process of translation. These two processes play an important role in transferring information from one generation to the next.

4. DNA CRYPTOGRAPHY AND RELATED WORKS
DNA cryptography is based on DNA computing, in which a message is encrypted in the form of a DNA nucleotide sequence. DNA computing can serve as a conceptual platform for data encryption and decryption with symmetric or asymmetric keys. At present it is not as effective as traditional cryptography, but it can provide hybrid security when combined with it [21], since DNA logic can be implemented alongside traditional cryptography [13]-[21]. The ultimate goal is to scramble data so that a person who does not know the key cannot read or modify it.

In recent years, a few qualitative and quantitative analyses of DNA-based cryptography, as well as many new cryptographic techniques, have been proposed. Bibhash Roy et al. [5] [6] [7] proposed a DNA sequencing based encryption and decryption process, together with a unique cipher text generation procedure and a new key generation procedure. Their experimental results, however, show that the encryption process has high time complexity. Tushar Mandge et al. [21] designed a DNA encryption technique based on 4×4 matrix manipulations and a key generation scheme that makes data more secure. Miki Hirabayashi and Akio Nishikawa [18] presented a theoretical and empirical analysis of applications of DNA cryptography. Pankaj Rakheja [20] designed a new method that integrates DNA computing into IDEA. Such conceptual work can help this newborn cryptographic technology mature to meet future security requirements.

5. PROPOSED METHOD
In practice, DNA cryptography is still far from realization, because at present it can be performed only in laboratories using chemical operations. To provide better security and reliable data transmission, an effective method of DNA-based cryptography is proposed here. The method mixes mathematical and biological concepts to produce encrypted data in the form of DNA sequences. Its benefit is that it makes the data (plaintext) difficult to read or guess. The proposed algorithm has two consecutive phases: primary cipher text generation using a substitution method, followed by final cipher text generation using DNA digital coding.
In the primary cipher text generation phase, the encryption algorithm uses a one-time-pad (OTP) key generation scheme, since one key per piece of information is sufficient to give the encoding considerable strength. The intended receiver randomly generates an 8-bit symmetric key and provides it to the sender. The sender thus has only partial knowledge of the private key and generates the remaining parts of the keys (private keys Level 1 and Level 2) to encode the information. Byte values are extracted from the input file or message, called the plaintext, and the rest of the encryption process works on these unsigned byte values. Each byte value is replaced by a combination of letters and special symbols using a substitution method, and the substituted values are then converted to binary. To add further security, extra bits are padded at both ends of the primary cipher text. These extra bits carry the file size information, which is provided to the receiver through the Level 2 key. The secret key and the primer pair information are thus shared between sender and receiver through a secret channel.

In the DNA digital coding phase, the final cipher text is generated from the primary cipher text using a DNA digital encoding technique. Computationally, DNA molecules cannot be processed directly as letters, so DNA sequence encoding is used to convert binary data into DNA format and vice versa. The four nucleotide bases of the DNA molecule, adenine (A), guanine (G), cytosine (C) and thymine (T), are each assigned a 2-bit binary value, for example A: 0 (00), T: 1 (01), C: 2 (10), G: 3 (11). There are 4! = 24 possible coding patterns of this kind. However, according to the Watson-Crick complementarity rule, the two strands of the double helix are held together by complementary bases, i.e. A with T and C with G.

For DNA digital coding to reflect this biological characteristic of the four nucleotide bases, the binary complement rule (~0 = 1 and ~1 = 0) is proposed in [17]. Under this rule, 3 (11) is the complement of 0 (00) and 2 (10) is the complement of 1 (01). Among the 24 patterns, only 8 (0123/CTAG, 0123/CATG, 0123/GTAC, 0123/GATC, 0123/TCGA, 0123/TGCA, 0123/ACGT and 0123/AGCT) satisfy the complementary rule of the nucleotide bases. The coding pattern 0123/CTAG is suggested as the best for nucleotide bases [17]. In the proposed method, A and T correspond to '00' and '11' respectively, and C and G to '01' and '10' respectively. The substitution rule is therefore A=00, C=01, G=10 and T=11, as illustrated in Table 1.

Table 1. DNA Digital Coding

DNA nucleotide   Decimal   Binary
A                0         00
C                1         01
G                2         10
T                3         11

6. ALGORITHMIC PRESENTATION
The encryption scheme proposed in this paper has the following phases:
Abbreviations used are: PT: Plain Text; PK: Public Key; PK1: Private Key Level 1; PK2: Private Key Level 2; SPM: Starting Primer; EPM: Ending Primer; PCT: Primary Cipher Text; FCT: Final Cipher Text.

6.1 Format of Cipher Text
From the plain text (PT), the primary cipher text (PCT) is obtained by using the encryption algorithm and the Level-1 private key (PK1).

STEPS TO OBTAIN THE FINAL CIPHER TEXT
Begin
Step 1: Encrypt the plain text with Private Key Level 1 (PK1).
Step 2: Calculate Private Key Level 2 (PK2) and attach the SPM (starting primers) and EPM (ending primers) to the primary cipher text (PCT) obtained from Step 1.
Step 3: Apply the DNA digital encoding technique to get the final cipher text (FCT) (Fig. 3).
End

Fig 3: Final Cipher Format

6.2 Procedure for Level1 Private Key Generation at both sides
Procedure: Sender's side Computation
Begin
Step 1: The receiver first sends a number as the public key (PK) through a channel (private or public). This key may be any positive number in the range 1 to 255.
Step 2: The sender generates one random number (R).
Step 3: The random number is written in binary and complemented, and the complement is converted back to decimal for use as the encryption key (EK).
        (E.g. let the public key be PK = 7 and the random number R = 5. Binary representation of R = 101. Complement of R = 010, which is 2 in decimal. This 2 is used as the encryption key (EK = 2).)
Step 4: The sender computes the Level-1 private key (PK1) as follows:
        Remainder computation (r): (PK * R) % 16 = (7 * 5) % 16 = 3, hexadecimal notation = 3.
        Quotient computation (c): (PK * R) / 16 = (7 * 5) / 16 = 2 (integer division), hexadecimal notation = 2.
        Concatenating these two hexadecimal digits gives rc = 32.
Step 5: The sender sends rc as the Level-1 private key (PK1), together with the Level-2 private key (PK2), through the private channel.
End
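As an illustration, the Level-1 key exchange can be sketched in Python. The sketch follows the worked example (PK = 7, R = 5) on the sender's side and the inversion X = (16 * c) + r, K1 = X / PK given in the receiver's procedure. The function names are hypothetical. Two points are assumptions rather than statements of the paper: the complement is taken over the significant bits only (so 101 becomes 010), and PK * R is required to be below 256 so that r and c each fit in one hexadecimal digit.

```python
from typing import Tuple


def complement_key(value: int) -> int:
    """Flip every significant bit of a positive integer (assumption:
    no leading zeros are included), e.g. 5 = 101 -> 010 = 2."""
    if value <= 0:
        raise ValueError("value must be positive")
    mask = (1 << value.bit_length()) - 1
    return value ^ mask


def sender_level1_key(pk: int, r_rand: int) -> Tuple[str, int]:
    """Sender side: return (rc, EK) for public key pk and random number r_rand."""
    product = pk * r_rand
    # Assumption: the product must fit in two hexadecimal digits.
    if not 0 < product < 256:
        raise ValueError("PK * R must be between 1 and 255")
    r = product % 16   # remainder digit
    c = product // 16  # quotient digit
    rc = f"{r:X}{c:X}"  # remainder first, then quotient, as in the example
    return rc, complement_key(r_rand)


def receiver_level1_key(rc: str, pk: int) -> int:
    """Receiver side: recover the Level-1 key from rc and the public key."""
    r, c = int(rc[0], 16), int(rc[1], 16)
    x = 16 * c + r
    k1, remainder = divmod(x, pk)
    if remainder != 0:
        raise ValueError("rc is not consistent with this public key")
    return complement_key(k1)


if __name__ == "__main__":
    rc, ek = sender_level1_key(pk=7, r_rand=5)
    print(rc, ek, receiver_level1_key(rc, pk=7))
```

With PK = 7 and R = 5 the sender obtains rc = 32 and EK = 2, and the receiver recovers the same key 2 from rc and PK.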
Procedure: Receiver's side Computation
Begin
Step 1: The receiver receives rc = 32, separates the digits r and c, and converts them to decimal notation.
Step 2: The receiver computes the decryption key as follows:
        Decimal value computation (X): X = (16 * c) + r
        (e.g. X = (16 * 2) + 3 = 35)
        Intermediate key computation (K1): K1 = X / PK
        (e.g. K1 = 35 / 7 = 5, where PK = 7)
Step 3: Convert K1 to binary form and complement it. E.g. binary of 5 = 101; its complement is 010, which is 2 in decimal notation.
Step 4: Therefore, 2 is the Level-1 private key (PK1) to be used for decryption.
End

6.3 Procedure for Level2 Private Key Generation
Procedure: Sender's side Computation

The Level-2 private key gives information about the length of the primers. The primers are added at the start and end of the primary cipher text (PCT). The sum of the digits of the sender's file length is taken as the input (PK2) to decide the primer length. To encode this primer information, use the following procedure:
Begin
Step 1: Let P be an array that will hold the secondary level of keys.
Step 2: Take a variable N and initialize it with the file length.
Step 3: Repeat the following steps from 1 to the number of digits in N.
Step 4: Perform a digit-wise XOR of N from left to right (i.e. from MSB to LSB) as shown in Fig. 3.

Step 3: To obtain the array indexes, the negative byte codes must be changed into positive values by adding +128 to each byte value. For example, -120 becomes +8. The range of the byte codes thus becomes 0 to 255.
Step 4: Each byte is split as n1 = (byte_code / 16) and n2 = (byte_code % 16). For example, if byte_code = 92, then n1 = 92 / 16 = 5 and n2 = 92 % 16 = 12.
        (Note: 16 is used in the calculation because it is the size of each array. As there are two arrays, the range of the byte codes is 16 x 16 = 256.)
Step 5: The key is added to n1 and n2 to get the new indexes q and r: q = (n1 + k1) % 16 and r = (n2 + k1) % 16, where k1 is the key value. For example, let the key be k1 = 7. Then q = (5 + 7) % 16 = 12 % 16 = 12 and r = (12 + 7) % 16 = 19 % 16 = 3.
Step 6: The numbers q and r are used as indexes into the static arrays Q and R. E.g. here q = 12 and r = 3. We get:

        Thus the byte code '92' is converted into 'M@'.
Step 7: After obtaining the cipher of each byte code, concatenate them in order to get the primary cipher text (PCT), which is the result of the first part of the algorithm. (Note: the primary cipher differs each time for the same plaintext because of the random key generation scheme. This makes the data safer, because its OTP nature gives no hint about the plaintext.)
Step 8: Convert the byte codes obtained into binary bits. Use the primer pairs to reshape the primary cipher sequence, and then perform DNA digital coding on the reshaped data to get the final cipher text as DNA sequences.
End
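The index arithmetic of Steps 3 to 5 and the DNA digital coding of Step 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, the contents of the static arrays Q and R are not reproduced (so the lookup of Step 6 is omitted), the primer handling is left out, and each symbol of the primary cipher text is assumed to be written as 8-bit ASCII before being mapped to bases with the rule of Table 1 (A=00, C=01, G=10, T=11).

```python
from typing import Tuple

# Table 1 of the paper: A=00, C=01, G=10, T=11.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def shift_signed_byte(value: int) -> int:
    """Step 3: map a signed byte (-128..127) to 0..255 by adding 128."""
    if not -128 <= value <= 127:
        raise ValueError("value must be a signed byte")
    return value + 128


def substitution_indices(byte_code: int, k1: int) -> Tuple[int, int]:
    """Steps 4-5: split the byte code into n1 and n2, then shift both
    by the key k1 modulo 16 to get the array indexes q and r."""
    if not 0 <= byte_code <= 255:
        raise ValueError("byte_code must be in 0..255")
    n1, n2 = divmod(byte_code, 16)
    return (n1 + k1) % 16, (n2 + k1) % 16


def bits_to_dna(bits: str) -> str:
    """Step 8 (coding only): map each pair of bits to one nucleotide."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def text_to_dna(primary_cipher: str) -> str:
    """Assumption: each cipher symbol is encoded as 8-bit ASCII."""
    bits = "".join(f"{b:08b}" for b in primary_cipher.encode("ascii"))
    return bits_to_dna(bits)


if __name__ == "__main__":
    print(shift_signed_byte(-120))
    print(substitution_indices(92, 7))
    print(text_to_dna("M@"))
```

For the worked example, the byte code 92 with k1 = 7 yields the indexes q = 12 and r = 3. Under the 8-bit ASCII assumption, the two symbols 'M@' (01001101 01000000) are coded as the DNA string CATCCAAA.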
[10] G. Cui, L. Qin, Y. Wang, and X. Zhang, "An encryption scheme using DNA technology," in IEEE 3rd International Conference on Bio-Inspired Computing: Theories and Applications (BICTA08), Adelaide, SA, Australia, 2008, pp. 37–42.
[11] A. Gehani, T. H. LaBean, and J. H. Reif, "DNA-Based Cryptography," Department of Computer Science, Duke University, June 1999.
[12] Genetics Home Reference, a service of the U.S. National Library of Medicine, http://ghr.nlm.nih.gov/handbook/basics/dna, 2012.
[13] G. Cui, L. Qin, Y. Wang, and X. Zhang, "An encryption scheme using DNA technology," Bio-Inspired Computing: Theories and Applications, 2008 (BICTA 2008), 3rd International Conference on, Sept. 28 - Oct. 1, 2008, ISBN 978-1-4244-2724-6, pp. 37–42, Adelaide, SA.
[14] G. Cui, L. Qin, Y. Wang, and X. Zhang, "An Encryption Scheme Using DNA Technology," IEEE, 978-1-4244-2724-6/08, 2008.
[15] L. Adleman, "Molecular computation of solutions to combinatorial problems," Science, JSTOR, vol. 266, pp. 1021–1025, 1994.
[16] L. Eschenauer and V. D. Gligor, "A key-management scheme for distributed sensor networks," Proceedings of the 9th ACM Conference on Computer and Communications Security, Washington, DC, USA, pp. 41–47, November 18-22, 2002.
[17] L. Hood and D. Galas, "The digital code of DNA," Nature, vol. 421, no. 6921, 2003.
[18] M. Hirabayashi and A. Nishikawa, "Analysis on Secure and Effective Applications of a DNA-Based Cryptosystem," IEEE Computer Society, 978-0-7695-4514-1/11, 2011.
[19] Nucleotide base pairing of strands, http://dedunn.edblogs.org, 2012.
[20] P. Rakheja, "Integrating DNA Computing in International Data Encryption Algorithm," IJCA, Volume 26, No. 3, 2011.
[21] T. Mandge and V. Choudhary, "A DNA Encryption Technique Based on Matrix Manipulation and Secure Key Generation Scheme," ICICES Journal, 2013.
IJCATM : www.ijcaonline.org