RITS-MHT_Relative_indexed_and_time_stamped_Merkle_

Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

Journal of Network and Computer Applications 84 (2017) 1–13

Contents lists available at ScienceDirect

Journal of Network and Computer Applications


journal homepage: www.elsevier.com/locate/jnca

RITS-MHT: Relative indexed and time stamped Merkle hash tree based MARK
data auditing protocol for cloud computing

Neenu Garg , Seema Bawa
Department of Computer Science and Engineering, Thapar University, Patiala, India

A R T I C L E I N F O A BS T RAC T

Keywords: Cloud computing is a recent phrase in marketing for a concept which has been known for years: Outsourcing.
Proof of possession Cloud computing provides cornucopian of gratuities for both individual users and business organizations.
Data integrity audit ‘Cloud’ is more of a notion where files and data are hosted online and can be accessed when required via a
Third party audit number of methods, anywhere, and at any time. That is the gist of it. Providing storage space to cloud users is an
Cloud computing
appealing service of cloud computing. Although cloud storage provides benefits of location independence access
to data, reduced burden for hardware and software maintenance and many more, yet this service has several
challenges in the range of security and preserving data. Cloud servers may exist in white puffy shapes in the
azure, but these are not immune to temporal errors. To ensure that outsourced data is secure and is not
tampered, cloud provider must allow data proprietor to periodically audit data integrity. Numerous Remote
Data Auditing (9+( ) protocols have been proposed by researchers so far. In presented work, to cope with this
problem we analyze the efficiency issues of a current protocol for data integrity auditing in cloud storage and
propose an approach based on Relative Index and Time Stamped Merkle Hash Tree (90;: − − 4 /; ) which
integrates 4 /; with relative index of a node resulting in reduction of computation cost of searching a data
block from O(n) in Wang's protocol to O(log n) and time of last modification to data, thereby guarantying
freshness of data respectively. 90;: − − 4 /; ensures that the outsourced data has not been polluted as well
as it assures that the recent copy of data is reclaimed. This protocol supports public auditing of data, and
efficiently supports data dynamic operations like insertion, modification, deletion of outsourced data at minimal
computational cost. The security of proposed protocol has been proved in Random Oracle Model (964 ). As
compared to Wang's protocol, 90;: − − 4 /; is more efficient in terms of computation cost.

1. Introduction a cloud user is data security. Data security is a paradox issue. It is both
appealing as well as a vulnerable spot in cloud computing. Once a user
A rising trend in market termed as “Cloud computing” has greatly has shifted data from local systems to cloud, its data is exposed to
metamorphosed the way organizations interacts with their customers, cloud. There can be external and internal dangers to data once it is on
vendors and employees. Various internet services like Google, YouTube cloud. For example, a Cloud Service Provider (*:7 ), facing sporadic
have modified the way people communicate, shop, educate, and much complex breakdowns, may choose to conceal data errors from users for
more. Services offered by cloud are: Software as a Service (: aa: ), personal benefits (Garg and Bawa, 2016). For economical benefits or
Platform as a Service (7 aa: ), and Infrastructure as a Service (0aa: ) storage space benefits, a *:7 may intentionally erase files which are
(Mell and Grance, 2011). As infrastructure, cloud provides a remote occasionally accessed. Data in cloud can be hacked by an external
storage location, other than user's computer's hard drive or any other attacker who can destroy or delete it for some reason. The traditional
local and private storage infrastructure for keeping data and is integrity auditing methods like hash functions and digital signatures
maintained by cloud itself. Thus the pressure of keeping the data safe are not appropriate in cloud as the local copy of data is not available.
and dealing with daily increasing needs of organizations and indivi- Moreover downloading entire data for integrity verification also seems
duals for storage is handled by cloud provider. Access to data is impractical due to resource limitations of users. The afore stated
facilitated via an Internet connection through a web based interface or situation gets worse if the users are mobile users. Thus due to the
some special Application Programming Interface ((70 ). Apart from voluminous outsourced data and resource constraints, a cloud service
the storage benefits it provides, the top reason to stay away from being consumer may be unable to perform an efficient integrity auditing on


Corresponding author.
E-mail addresses: neenu.garg@thapar.edu (N. Garg), seema@thapar.edu (S. Bawa).

http://dx.doi.org/10.1016/j.jnca.2017.02.005
Received 11 May 2016; Received in revised form 28 January 2017; Accepted 12 February 2017
Available online 21 February 2017
1084-8045/ © 2017 Elsevier Ltd. All rights reserved.
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

its data periodically and hence relies on a Third Party Auditor (;7( ) encoding of each data copy separately. However in their subsequent
for the same. The ;7( service is efficiently designed to prove data work, Chen et al. (2010) designed a framework for distributed data
integrity without requiring downloading of complete data. Few of the storages based on network coding. One of its major flaw is that it is not
data integrity auditing protocol proposed so far require Data Proprietor publicly auditable and the code is unsystematic and the framework is
(+7 ) to perform data auditing and few facilitates public auditing of inappropriate for online storages. Moreover, accessing few parts of a
data. Public auditing of data implies that anyone authorized by +7 can file requires regeneration of complete data file. Recently Wang et al.
perform data auditing. Practically, it is more rational to combine public (2015) proposed a protocol which supports the correctness and
auditing with auditing protocol for efficiency consideration. Moreover, completeness of proof of possession by server. However the protocol
the data in cloud may be frequently accessed by +7 for not just is provable secure in an encrypted outsourced database but its
reading but for any updating including modify, insert, delete or application is constrained to static files only. Despite the existing
append. A data integrity auditing protocol should be such that it protocols target at providing data integrity auditing for cloud systems,
provides support for data dynamic operations as well. In our work, the the requirement for public audit and data dynamics has yet to be
deficiencies in the integrity auditing protocol proposed by Wang et al. fulfilled. A efficient design of a data integrity auditing protocol which is
(2011) have been analyzed and a Relative Indexed and Time Stamped secure and integrates above two features seamlessly remains an open
Merkle Hash Tree (90;: − − 4 /; ) based data integrity auditing problem for researchers in cloud computing.
protocol which supports public auditing of data and data dynamic
procedures along with ensuring data freshness has been proposed. The 1.2. Our contribution
issue of privacy preservation of outsourced data is orthogonal to the
proposed protocol. The contributions of proposed work are summarized as below:

1.1. Related work 1. The data auditing model in cloud computing has been analyzed and
an auditing protocol is proposed, which efficiently reduces the
To ensure data integrity in cloud systems, numerous RDA protocols computational complexity for integrity verification from O(n) in
have been suggested (Ateniese et al., 2007, 2008; Bowers et al., 2009; Wang et al. (2011) to O(log n).
Shacham and Waters, 2013; Wang et al., 2009; Erway et al., 2015; 2. 90;: − − 4 /; provides assurance not only for data integrity but
Mohammad Etemad, 2016) so far. Among all above suggested proto- also for data freshness. A time stamp has been integrated with a field
cols (Ateniese et al., 2007) worked on the subject of publicly auditable in proof of possession to assure that most recent copy of data is
outsourced data and data integrity assurance. (Ateniese et al., 2007) retrieved.
suggested two PDP constructions as: Sampling(S)-PDP; Efficient(E)- 3. 90;: − − 4 /; has been proven secure in ROM with the
PDP. S-PDP assures firm probabilistic possession of data, however E- assumption of toughness of Computational Diffie Hellman (*+/ )
PDP assures efficient auditing mechanism but weaker data possession problem.
assurance. To provide public auditing of data in Ateniese et al. (2007), 4. The proposed protocol has been efficiently designed to support
RSA based homomorphic tags have been used. However, their protocol public auditing of data as well as data dynamics.
lacks support for data dynamic procedures and its application is limited
to static data only. Erway et al. (2015) is a dynamic version of Ateniese 1.3. Organization
et al. (2007). Again in Erway et al. (2015) number of audits is bounded.
Similar to Erway et al. (2015), Juels and Kaliski (2007) bounds the Section 2 presents the system model of the proposed data integrity
number of audits and does not provides public auditing of data. In auditing protocol along with some threats to data posed by various
continuation to Juels and Kaliski (2007), Shacham and Waters (2013) entities in system model. Section 3 highlights the background methods
proposed two PoR constructions. For their work, they use homo- used to realize the proposed protocol. These include )3: signatures,
morphic authenticators which are publicly auditable. Their work relies bilinear pairings, *+/ problem and classic 4 /; . The efficiency
on BLS signatures, and is also publicly auditable. Shacham and Waters issues in Wang et al. (2011) have been discussed in Section 4. The
(2013) is efficient in terms of time complexity of an auditing procedure details of proposed protocol are given in Section 5. Section 6 represents
however communication complexity of the protocol varies linearly as the support to data dynamic procedures by 90;: − − 4 /; . Section
outsourced data block size increases. Apart from the space efficiency 7 gives an adversarial model for proposed protocol with detailed
(due to BLS signatures) provided by Shacham and Waters (2013), its security proof in Section 8. Section 9 accesses the performance of
usage is constrained to static data. The mechanism for handling data proposed protocol which includes parameters like probability of
dynamic procedures were addressed by Erway et al. (2015) which relies detecting server malpractice, computation cost of auditor, storage
on skip lists where the skip list provide rank based authentication. To and communication costs. This section also includes experimental
assist insertion operations in data blocks, Erway et al. (2015) modifies evaluation of proposed protocol and the results obtained have been
the technique of generating tags in the tag generation phase of Ateniese plotted. Conclusions have been drawn in Section 10.
et al. (2007). The efficiency of Erway et al. (2015) is still uncertain.
Mohammad Etemad (2016) provides a generic framework DPoR from 2. System model of proposed data integrity auditing model
Erway et al. (2015), Ateniese et al. (2007) and erasure codes. They use
logs to store all update information. Their protocol requires main- The system model of proposed data auditing protocol is illustrated
tenance of all logs to ensure data retrievability if any update to data in Fig. 1. This auditing model possess three entities defined as follows:
fails. They provide a mechanism to perform auditing of the logs. Data
has been stored twice at server, one for providing authenticity of data (i) Data Proprietor (+7 ): is a person, a firm, or a business
by reading, and other in log store. Any update command is sent to organization which has sufficient amount of data files for out-
memory checking part for execution. Along with this client prepares an sourcing and may update this data later by carrying modify,
update log to be appended to existing logs in log store. The protocol in delete, insert or append operations. +7 is a dependent entity on
Bowers et al. (2009) provides a framework which involves a mobile cloud provider for data maintenance. Generally, it is a resource
adversary. Their work is constrained to static data. In other related bounded entity.
work, Curtmola et al. (2008) designed a framework for data integrity (ii) Cloud Service Provider (*:7 ): is an entity having sufficient
assurance of numerous data copies widely distributed across storage computationally capable resources and storage servers which have
servers. Their work preserves numerous data copies without requiring unlimited storage space. *:7 is accountable for keeping and

2
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

bilinear pairing, *+/ problem and 4 /; .

3.1. Boneh Lynn Shacham ()3: ) signatures


)3: signature method is used for authentication of signer of a
message. )3: signatures use bilinear pairings for signature authenti-
cation and the signatures are group elements of some elliptic curve.
The main property of a )3: signature is that multiple signatures
produced for multiple messages with multiple public keys can be
aggregated into a single signature. Therefore, the )3: signature of ‘n’
data blocks (d[1], d[2],……, d[n]) of file ‘F’ are generated as follows:
n
ψagg = ∏ (H (d[i])κ
i =1

Fig. 1. System model of 90;: − − 4 /; . where H: {0, 1}*→ G being a hash function and G is a prime order
group having generator g. Here κ denotes secret key ∈p which is ring
of integer modulo p. The )3: signature method consists of three
functions: key generation, message sign and verify as described below:
preserving outsourced data. A *:7 is considered to be an
untrusted entity. (i) keygeneration: Generate secret key κ ∈ p . Publish gκ as public
(iii) Third Party Auditor (;7( ): is a skilled entity to perform audit on key and save secret key.
+7 's data. A ;7( is an entity trusted by both, *:7 and +7 . (ii) message sign (F, κ): ψ→ H(F)κ
;7( minimizes computational pressure of data auditing on +7 . (iii) verify (F, ψ): e(ψ, g)= ? e(H(F), g κ )Boneh et al. (2001) provide
The numbers in Fig. 1 represents the sequence of activities sufficient details on BLS signatures.
taking place in an auditing process. It has been assumed that
before an audit process is initiated, +7 has hired a *:7 and has 3.2. Bilinear pairings
outsourced data at cloud server.
Consider (G1, +), (G2,.) be two cyclic groups having prime order r,
2.1. Threat model where G1 is an additive and G2 is a multiplicative group. A pairing is a
map defined as e: G1 × G1 → G2 satisfying the properties as follows:
As previously discussed, a ;7( is a trusted entity and a *:7 is an
untrusted entity. However, both ;7( and *:7 can cause some threats (i) Bilinear: ∀P ∈G1, ∀Q ∈G2 , ∀a, b ∈Zp , a map ‘e’ is said to be bilinear
to +7 's data (Garg and Bawa, 2016) as discussed below: if e (Pa, Qb)=e (P, Q)ab
(ii) Non-degenerate: ∀P, Q ∈G1, e(P, Q) ≠ 1
(i) Threats caused by ;7( : A +7 relies on ;7( for assuring data (iii) Computable: the map ‘e’ is computable
integrity, assuming that this independent entity is genuine and These pairings have applications in cryptographic schemes.
reliable. However there is always a possibility for ;7( to be These allow users to create new schemes such as )3: signature
curious about +7 's data. Thus the privacy of data can be scheme. Among all the signature methods in cryptography, )3:
compromised by ;7( in a public auditing protocol. Therefore signatures have shortest length (Zhang et al., 2004). Boneh and
along with data integrity preservation, there is a requirement for Franklin (2001) provide descriptive study of bilinear parings.
privacy maintenance to prevent ;7( from gaining any knowledge
from +7 's data. 3.3. The *+/ problem
In the proposed work, the issue of data privacy has not been The *+/ assumption states that a particular computational
addressed and ;7( is assumed to be loyal. problem in a cyclic group is hard. Assume a cyclic group G with order
(ii) Threats caused by *:7 : Some threats to +7 's data caused by g. For given (g , g a , g b ) with unknown a, b ∈Zp ; it is computationally
*:7 are as follows: challenging to establish the value gab.
(a) A *:7 may deliberately remove rarely accessed data for
saving server space without notifying +7 . 3.4. Classical Merkle Hash Tree (4 /; )
(b) A *:7 can cause some processing error on data and this may 4 /; is a renown binary tree data structure where every node other
damage +7 's data permanently. than leaf node is concatenation of its leaf nodes. The node at the top is
(iii) Some external threats: termed as root node. Leaf nodes in a 4 /; represents hashes of data
(a) A legitimate user can access outsourced data via an (70 blocks. Benefits of 4 /; includes the secure verification of data which
offered by *:7 . Weak (70 may put +7 's data at risk. A is large in size. Authentication of root node provides integrity assurance
stranger may use login credentials of some legitimate user of all the leaf nodes. Fig. 2 depicts a 4 /; with 8 leaf nodes. For
and may pollute or delete data without being exposed. example, if an auditor requests the prover for integrity verification of
(b) A former administrator at *:7 may break into a cloud server data node at position 3, the prover shares the auditor with the Auxiliary
and provide harm to stored data. Information (AI) as AI(d[3]): {(HC , L ), H (d[3]), (H (d[4]), R ), (HB, R )}.
Thus, along with offering variety of cloud services, securing data Now the auditor can generate root HR as follows:
accessibility and preventing external attacks to outsourced data
are important for *:7 . (i) Compute HD ← (H (d[3])∥ H (d[4])).
(ii) Compute HA ← (HC ∥ HD ).
3. Preliminaries (iii) Finally compute root HR ← (HA ∥ HB ).

In this section, the background concepts used in realizing By verifying authenticity of root node, all blocks are automatically
90;: − − 4 /; are presented commencing with a brief description authenticated. In this paper, we have modified the classic 4 /; in
of Boneh Lynn Shacham ()3: ) signature method which has been used which every node has now two pieces of information: hash value and
for authenticating block signatures followed by brief discussions on

3
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Fig. 2. Classic merkle hash tree.

relative index. Moreover, the root node in modified 4 /; is con- 4 /; has been modified to store two pieces of information, one is the
catenated with the date and time of creation of tree to ensure data hash value of the data block and the other is the relative index of a
freshness. Merkle (1980) provides more details on 4 /; . node. The relative index associated to a node P specifies the number of
leaf nodes belonging to the subtree of P. In modified 4 /; the relative
4. Efficiency issues in Wang et al. (2011) index of leaf nodes is set to 1. For example, if L and R are left and right
child nodes with hash values as Ha, Hb and relative index field values
Wang et al. (2011) is an efficient protocol for auditing cloud data. as ra and rb to a parent node P, then the node P has hash value as
Their protocol supports public auditing of data and data dynamic (Ha ∥ Hb ) and relative index field value as (ra + rb ). A time stamp field is
procedures also. However, some major flaws in their protocol are associated with the root node of 4 /; . This time stamp is the time
discussed below: and date of creation of 4 /; concatenated to the root hash value.
Fig. 3 shows an example of a modified 4 /; . Here, HR=(HA ∥ HB ∥ dt ),
(i) Computational complexity: The computational complexity of where dt is date and time of tree creation. As HR is always updated for
searching ith leaf node in a particular data integrity auditing any modification done to any block in a tree and so time and date of last
instance is computed as O(n) in Wang et al. (2011), where n modification is reflected in HR, which provides data freshness assur-
denotes number of leaf nodes in 4 /; . If the storage structure ance.
used is continuous, then any operation of insertion, deletion has
O(n) computational complexity, which is quite high. If link list 5.2. Definition
data structure is used then also for reaching ith leaf node, the list
of 4 /; leaf nodes is traversed one by one, leading to computa- This section provides definitions of various algorithms composing
tional complexity of O(n). 90;: − − 4 /; :
(ii) Position authentication: After insertion, deletion operation, the
sequence number of nodes probably gets modified and the height (i) Keygen(1λ): This algorithm is executed by +7 . Here λ denotes a
of tree also becomes imbalanced. The proof of any data dynamic security parameter, and the output is a key pair (pubkey, seckey)
procedure in Wang et al. (2011) is specified as: ←(η, κ).
Pf = {ψ , (H (d[i ]), AI (i ))i ∈ Q, ρ , R′}, which clearly includes no in- (ii) FileTagGen(fname, κ, n, dt): This algorithm is executed by +7 to
formation for authenticating position of block in question. generate tag for file F. Input to this algorithm is the name of file to
Significance of each field in Pf has been explained in Section 5. be outsourced, secret key, number of block partitions and date and
(iii) Empty proofs: There is no mechanism provided to handle the time of preprocessing of file. This file tag is denoted as τ.
integrity auditing instance of a block which actually does not (iii) BlockSigGen(κ, H(d[i ]), dt, u): This algorithm is executed by +7 .
exists in the 4 /; . Its input is secret key κ, hash of file blocks, date and time of
(iv) Data freshness: Wang et al. (2011) provide assurance that the preprocessing of file and a random element u ∈G. Its output is θ,
outsourced data is intact. However, it does not provides any which is an ordered collection of )3: signatures {ψi}1≤ i ≤ n on hash
freshness guarantees for data. of file blocks.
(iv) Challenge: This algorithm is executed by ;7( to produce a
5. The proposed protocol (90;: − − 4 /; ) challenge message for *:7 after delegation of auditing by +7 .
In this section, 90;: − − 4 /; based protocol for performing a (v) GenProof(F, θ, C): This algorithm is executed by *:7 as soon as a
data integrity audit in cloud computing has been detailed. Firstly, the challenge message C is received from ;7( . *:7 generates a
modification in classic 4 /; has been discussed followed with the proof Pf and communicates it to ;7( for verification. Input to
definition of various algorithms comprising the protocol. Then, the this algorithm is file F, signature set θ and challenge message C.
protocol construction is presented in detail. (vi) VerifyProof(C, Pf, η): This algorithm is executed by ;7( to verify
the proof Pf received from *:7 . Input to this algorithm is
5.1. Modified 4 /; challenge message C, Pf message from *:7 and public key g κ .
To reduce the computation complexity of searching a node in Output of this algorithm is {TRUE, FALSE} depending upon
4 /; during a data integrity audit instance, every node of classic whether verification is successful or a failure.

4
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Fig. 3. Relative indexed and time stamped merkle hash tree.

5.3. Detailed construction ;7( , *:7 uses search algorithm to search ith node ∀ i ∈ C in the
hash tree. While searching the node, *:7 can store the informa-
Consider the file to be outsourced is ‘F’ and is split into ‘n’ data tion of every sibling node on the search path of the ith node. This
blocks (d[1], d[2],……, d[n]). Assume e: G × G → GT denotes a bilinear information will serve as AI of the node being searched. For
map with ‘g’ being generator and ‘p’ being prime order of group G. Let example, if ith node is say 5 (Fig. 3), the AI is specified as:
H : {0, 1}* → G be a cryptographic hash function. The proposed proto- [(HA, 4, L ), (H (d[6]), 1, R ), (HF , 2, R )] where L specifies a left
col is described in detail as follows: sibling and R specifies a right sibling. Algorithm 5 also produces
a field SIB as one of the output. SIB specifies the number of sibling
(i) Keygen(1λ): +7 must first generate set of keys (pubkey, seckey) nodes positioned at left to the node in question. For example, in
by using key generation procedures shown in Algorithm 1. The above case SIB(5)=4. The pseudo code for searching a node in
algorithm first chooses a random element κ ∈ Zp as a secret key 90;: − − 4 /; is given in Algorithm 5. If the ith node doesn't
and produces η = g κ as the public key. exists in 4 /; , then *:7 reports message of *block not found*
(ii) FileTagGen: +7 now generates tag for file F. For this, +7 first to ;7( and halts auditing process. However, if the block does ex-
generates a random element say u ∈ G . +7 generates system date ists then it continues the proof generation further and produces a
and time denoted as dt. The system date and time is concatenated complete proof of possession. *:7 now computes:
in generating file tag τ to ensure freshness of data file. Let rk
h=(fname ∥ n ∥ u ∥ dt ) where h ∈ G and τ=sigκ (h ) be the file tag ψ← ∏ ψibi ∈ G
for F. The concatenated string ‘h’ is stored locally for future i = r1
verification of file tag. The pseudo code for generating file tag is
and,
given in Algorithm 2.
rk
(iii) BlockSigGen: After generating file tag τ, +7 generates )3:
signatures of each ith file block d[i] as ψi=(H (d[i ]). u d[i])κ where i
μ← ∑ bid[i] ∈ Zp
i = r1
∈{1, 2,…, n} and u ∈G. Here H(d[i]) denotes hash of file block by
using some cryptographic hash function. In proposed protocol we After computing ψ and μ, *:7 responds ;7( with the following
use SHA256 hash algorithm. The set of signatures is denoted as as proof of possession: Proof Pf = {μ, ψ , (H (d[i]), AI (i ))i ∈ Q , SIB(i ), ρ}
θ = {ψi}, 1 ≤ i ≤ n . After generating block signatures, +7 then Here AI(i) is the auxiliary information of ‘i’ which consists
produces a 90;: − − 4 /; using H(d[i]) ∀i ∈{1, n} as leaves small information about the nodes which are siblings of ‘i’ on the
and generates a root node HR. On achieving HR, +7 concate- path from root node to H(d[i]) as explained above.
nates HR with date and time of system to ensure data freshness. (vi) VerifyProof: Before the actual verification process begins, ;7(
Thus HR=HR ∥ dt . Now +7 authenticates root HR using κ. Finally authenticates the file tag τ as follows:
the signature of root is denoted as ρ ← H (R )κ . +7 will send {F, τ, (a) Recover locally stored file tag hash h.
θ, ρ} information to the *:7 for storage and deletes the same (b) Recover file tag τ from server.
information from local storage. The pseudo code for generating (c) Check whether following equation holds or not:
block signatures and root signature is given in Algorithm 3 and
e(h , g κ ) =? e(τ , g) (1)
The pseudo code for generating hash tree has been given in
Algorithm 4.
(iv) Challenge: A +7 may wish to check integrity of outsourced file F (d) Output TRUE if verification is successful and recover u and
later. For this +7 delegates auditing task to ;7( by sharing file dt.
details, root signature and security keys over a secure and (e) Halt otherwise.
authentic channel. To start with audit, ;7( picks random k- If the authentication verification of file tag τ in Eq. (1)
element subset Q={r1, r2, …, rk} ∈ [1, n] and ∀i ∈Q, ;7( selects would have failed, ;7( will halt further auditing process and
random element bi ∈ Zp . The ;7( sends the challenge message C reports *integrity audit process failure* message to +7 .
←(i , bi )i ∈ Q to *:7 . However if the authentication of file tag is successful, then
(v) GenProof: As soon as the challenge message C is received from ;7( recovers u and dt field, which has to be used in

5
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

authentication of root ρ received in Pf from *:7 . The pseudo 1: procedure GENERATE_FILE_TAG


code for authenticating file tag is given in Algorithm 6. Now Input: File name fname, Secret key κ number of file blocks n,
the ;7( progresses the auditing task by authenticating root date and time dt
of 4 /; . Using (H(d[i]), AI(i)), ;7( first generates a root Output: Publish file tag (τ). Save file tag hash h.
of the tree and denote it as HR. Now ;7( concatenates HR 2: Initialize pairing:
with dt field recovered in Algorithm 6. Then ;7( verifies 3: pbc_demo_pairing_init(pairing, count, param)
root by using following equation: 4: Initialize a random element u and h of G
5: element_init_G(u, pairing);
e(H (R ), g κ ) =? e(ρ , g) (2)
6: element_init_G(h, pairing);
Now, e(H(R), gκ)=e(H (R )κ , g)=e(ρ,g). If ρ is polluted then the 7: Generate u
above verification would not have been successful. On verification 8: g ← element_random(u);
failure, ;7( halts the auditing process and reports authentication 9: Read system date and time
failure to +7 . On authentication verification success, ;7( 10: dt← ctime(t);
proceeds the auditing process and verifies the following equation: 11: Concatenate fname, n, u, dt
⎛ rk ⎞ 12: file_tag_concatenation ←fname ∥ n ∥ u ∥ dt
e⎜⎜∏ H (d[i ])bi . u μ, g κ ⎟⎟ =? e(ψ , g) 13: convert file_tag_concatenation to an element ∈ G say
⎝ i = r1 ⎠ (3) h;
14: h ← element_from_hash(file_tag_concatenation);
Now consider L.H.S of Eq. (3) 15: Generate file tag τ
⎛ rk ⎞ 16: τ← element_pow_Zp(τ, file_tag_concatenation, κ);
e⎜⎜∏ H (d[i ])bi . u μ, g κ ⎟⎟ Algorithm 3. Block signature generation algorithm.
⎝ i = r1 ⎠
⎛ rk ⎞ 1: procedure GENERATE_BLOCK_SIG
= e⎜⎜∏ H (d[i ])bi . u bid[i], g κ ⎟⎟ Input: number of blocks n, secret key κ hash of file blocks
⎝ i = r1 ⎠ (H (d[i ])), date and time dt
⎛ rk ⎞ Output: set of block signatures ψ, root signature ρ
= e⎜⎜∏ (H (d[i ]). u d[i])bi , g κ ⎟⎟ 2: Initialize pairing:
⎝ i = r1 ⎠ 3: pbc_demo_pairing_init(pairing, count, param)
⎛ rk ⎞ 4: Initialize random element array h[i], temp[i],
= e⎜⎜∏ (H (d[i ]). u d[i])biκ , g⎟⎟ roothash of G ∀i ∈n, u ∈G
⎝ i = r1 ⎠ 5: element_init_G(h[i], pairing);
⎛ rk ⎞ ⎛ rk ⎞ 6: element_init_G(temp[i], pairing);
= e⎜⎜∏ ((H (d[i ]). u d[i])κ )bi , g⎟⎟ = e⎜⎜∏ ψibi, g⎟⎟ 7: element_init_G(u, pairing);
⎝ i = r1 ⎠ ⎝ i = r1 ⎠ 8: element_init_G(roothash, pairing);
= e (ψ , g ) = R . H . S 9: Read hash of blocks ∀i ∈n in hash[i]
10: convert hash[i] to an element ∈G ∀i ∈n
Thus, it can be seen that the proposed protocol is accurate if *:7
11: h[i] ←element_from_hash(hash[i])
religiously responds to Pf. On the other hand, if any file block
12: Generate u d[i]
would have been deleted or any modifications to a file block had
not been done accurately, the above verification equation would 13: temp[i] ← element_pow_Zp(temp[i], u, d[i ]);
not have been hold. The pseudo code for verifying above equation 14: Generate h[i].u d[i]
is given in Algorithm 7. 15: h[i] ← element_mul(h[i], h[i], temp[i]);
Authentication algorithm of ith leaf node: One of the 16: Generate block signature as ψi
field in the Pf message by the *:7 is the number of siblings (SIB) 17: ψ←
i element_pow_Zp(ψi, h[i], κ);
of node (key) being searched. If SIB=key – 1, leaf node is 18: Generate MHT using Algorithm 4
confirmed to be an authenticated one. Otherwise position of node 19: roothash ←create_tree(hash);
is not authenticated and the verification fails. This completes one 20: Concatenation roothash with dt
integrity auditing instance in 90;: − − 4 /; . 21: roothash ← roothash ∥ dt ;
Algorithm 1. Key generation algorithm. 22: Generate root signature as ρ
23: ρ← element_pow_Zp(ρ, roothash, κ);
1: procedure KEYGEN(1λ) Algorithm 4. 4 /; creation algorithm.
Input: Security parameter λ
Output: Publish system parameters(G, g, η). Save κ. 1: procedure MERKLE_HASH_TREE_CREATION
2: Initialize pairing: Input: Structure element mt, hash of blocks (H(d[i])), header
3: pbc_demo_pairing_init(pairing, count, param) file merkle_tree.h
4: Initialize elements of G and Zp Output: root node (roothash)
5: element_init_G(g, pairing); 2: Initialize leaf_start to number of blocks:
6: element_init_G(η, pairing); 3: leaf_start ← (1⪡(mt → treeheight − 1));
7: element_init_Zp(κ, pairing); 4: Initialize number of nodes in tree
8: Generate system parameters and private key 5: mt → n← (leaf_start + mt → data_blocks - 1);
9: g ← element_random(g); 6: malloc memory for mt → n in mt → node
10: κ← element_random(κ); 7: Initialize count to leaf_start
11: Generate public key 8: while count is less than equal to mt →n
12: η← element_pow_Zp(η, g, κ); 9: set mt → node[count].index to 1;
Algorithm 2. File tag generation algorithm. 10: set mt → node[count].hash to H(d[count - leaf_start ]);
11: set mt → node[count].left to NULL;

6
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

12: set mt → node[count].right to NULL; 5: element_init_G(temp1, pairing);


13: increment count by 1; 6: element_init_G(temp2, pairing);
14: end while 7: element_init_G(temp3, pairing);
15: Initialize count to leaf_start - 1 8: element_init_G(H(d[i]), pairing);
16: while count is greater than 0 9: element_init_G(Power[i], pairing);
17: set mt → node[count].hash to NULL value; 10: element_init_Zr(b[i], pairing);
18: set mt → node[count].hash to mt → node[2*count].hash + 11: element_set1(temp1);
mt → node[2*count+1].hash; 12: Read H(d[i]), μ from server and b[i], u from C
19: set mt → node[count].index to mt → node[2*count].index + Generate Hd ([i ])b[i]. u μ ∀ i ∈Q
mt → node[2*count+1].index; 13: element_pow_Zp(Power[i], H(d[i]), b[i]);
20: set mt → node[count].left to mt → node[2*count].left; 14: element_pow_Zp(Power1[i], u, μ)
21: set mt → node[count].right to mt → node[2*count].right; 15: Generate ∏rk (H (d[i ])bi . u μ)
i = r1
22: decrement count by 1;
23: if count equal to 0; 16: element_mul(Power[i], Power[i], Power1[i])
24: return address of mt → node; 17: element_mul(temp1, temp1, Power[i]);
25: end while 18: Apply pairing on temp1
Algorithm 5. Searching algorithm. 19: pairing_apply(temp2, temp1, η, pairing);
20: Apply pairing on ψ
1: procedure SEARCH_NODE(ROOT, KEY, SIB) 21: pairing_apply(temp3, ψ, g, pairing);
Input: F, Root of MHT, node to be searched(key) 22: Compare temp2 and temp3
Output: Authentication information of i AAI(key), number of 23: result=element_cmp(temp2, temp3)
siblings of key SIB(key) 24: if result is 0 then output TRUE
2: Initialize left index and right index variables: 25: elseoutput FALSE
3: left_index ← 0; right_index ← 0; SIB ← 0
4: if key is 1 then
5: return(root) 6. Data dynamic procedures
6: else if key is 0 then
7: return(NULL) Allowing data dynamic procedures is an important feature of a data
8: else integrity auditing protocol. This feature is, however, missing in most of
9: left_index=root → left → index; right_index=root → right the existing protocols. In this section we will describe how
→ index 90;: − − 4 /; supports data dynamic procedures which includes
10: if key ≤ left_index then data modification procedure (+4 7 ), data insertion procedure (+07 ),
11: SEARCH_NODE(root → left, key, SIB) data append procedure (+(7 ) and lastly data deletion procedure
12: else (++7 ). In the description provided below, we have assumed that the
13: SIB=SIB + left_index data file F, the file block signatures set θ and the root ρ have been
14: key=key - left_index stored at the server.
15: SEARCH_NODE(root → right, key, SIB)
Algorithm 6. File tag authentication. 6.1. Data modification procedure (+4 7 )
+4 7 is a vital feature of 90;: − − 4 /; , which allows +7 to
1: procedure FILE_TAG_AUTHENTICATION update selected file blocks without any need of downloading entire data
Input: System parameter g, public key η, local file hash h blocks. For modifying ith file block, +7 initiates the +4 7 as follows:
Output: TRUE if file tag is authenticated; Recover dt and u
else FALSE a) +7 will first produces the signature for the new block as
2: Initialize pairing: ψ’=(H (d[i ]′). u d[i] ′)κ .
3: pbc_demo_pairing_init(pairing, count, param) b) +7 then produces a new file tag τ’ ←sigκ (fname ∥ n ∥ u ∥ dt ). New tag
is generated to authenticate date and time of modification in order
4: Initialize random elements temp1, temp2 of G
to ensure freshness of data.
5: element_init_G(temp1, pairing);
6: element_init_G(temp2, pairing);
Frame a data modification message as (4 , i, d[i ]′, H (d[i ]′), ψ’, τ’) and
7: Read file tag τ from server Apply pairing on τ
transfer this message to the *:7 . Here 4 specifies +4 7 and ‘i’
8: pairing_apply(temp2, τ, g, pairing);
represents the block to be modified with i’. Upon reception of above
9: Apply pairing on h
message, *:7 performs following replacements:
10: pairing_apply(temp1, h, η, pairing);
11: Verify file tag
a) replace di with di’
12: result=element_cmp(temp1, temp2)
b) replace ψ with ψ’
13: if result not 0 then output TRUE
c) replace τ with τ’
14: Recover dt, Recover u;
d) replace H(d[i]) with H(d[i]’)*:7 then generates new root hash R’
15: elseoutput FALSE
using 4 /; reconstruction with H (d[i ]′). The *:7 now delivers the
Algorithm 7. Integrity verification.
proof of modification procedure to +7 for verification as
Pf = {ψ , (H (d[i ]), AI (i ), ρ , SIB(i ), R′}. On reception of proof of +4 7
1: procedure INTEGRITY_VERIFICATION
from *:7 , +7 first authenticates τ. Then +7 authenticates root by
Input: Pf, η, C
using (H (d[i ]), AI (i ))i , ρ) by using Eq. (2). If the authentication suc-
Output: TRUE if file blocks are authenticated one else FALSE
ceeds, +7 computes new root using (H (d[i ]′), AI (i ))i ) and compares
2: Initialize pairing:
newly generated root with R’, if the comparison is SUCCESSFUL, +7
3: pbc_demo_pairing_init(pairing,count,param)
authenticates R’ by signing with secret key κ and thus generating
4: Initialize few random elements temp1, temp2, temp3,
ρ’=H (R′)κ and sends it to *:7 for storage. Finally, +7 runs the default
H(d[i]), Power[i] of G and b[i] of Zp where i ∈Q

7
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Fig. 4. 90;: − − 4 /; after +07 of d * block after d[2] block.

integrity auditing protocol for new block. If result is TRUE, +7 can 7. Adversarial model for proposed protocol
delete {d[i]’, τ’, ρ’} from its local system.
In this section the adversarial model for the security of proposed
protocol has been presented. The security of proposed protocol against
6.2. Data insertion procedure (+07 )
an existential forgery under selective data block attack in a 964 is
Assume +7 wants to insert a data block d * at a particular position
established by simulating a game (GameI) between an adversary ( I
(Fig. 4), it has to execute following +07 :
and an auditor ( .
We will prove that if (0 is able to produce a forgery with non-
a) +7 will first produce signature for the new block as
* negligible success probability on a signature of data block, then ∃ an
ψ *=(H (d *). u d )κ .
algorithm B capable of generating solution of *+/ problem in cyclic
b) +7 will then produce a new file tag τ* ← sigκ (fname ∥ n ∥ u ∥ dt ).
groups.
In GameI, ( I may access following three oracle queries:
Now +7 frames a data insertion message as (0 , V, i, d *, ψ’, τ’) and
transfer this message to *:7 . Here 0 specifies +07 . The field ’V’ (i) Reveal_System_Parameter_Query: (0 may request set of system
specifies position of new block to be inserted, V ←A specifies insertion
parameters from ( . As a result, ( returns system parameters to
after ith position and V ←B specifies insertion before ith position. (0 .
Upon receiving insertion message, *:7 saves d * and the corre-
(ii) Hash_Query: (0 may request hash queries at any time. (
sponding leaf node H (d *). Now *:7 uses Algorithm 5 and finds H(d[i])
maintains a list to respond to this query.
in the 4 /; , and preserves AI(i) and inserts leaf node H (d *). If the ‘V’
field is set ‘A’ then an internal node with hash value (H (d[i ])∥ H (d *)) (iii) Sign_Query: (0 may request signature queries for a data block
will be joined to the original tree, or if the ‘V’ field is set ‘B’ then an d[i]. The auditor responds to this query with a signature σi.
internal node with hash value (H (d *)∥ H (d[i ])) will be added to the
original tree with index set to 2. After this the detail of each node lying There are three phases in GameI, discussed as below:
on the track from ith node to the topmost node (root) are modified. Phase I : In this phase, (0 generate a
Now *:7 produces a fresh root R due to the regenerated 4 /; . Reveal_System_Parameter_Query. To answer this query, ( executes
Finally, *:7 delivers +7 with a proof for insertion procedure as, Keygen algorithm. ( send parameters to (0 keeping its master key as
Pf = {ψ , (H (d[i ]), AI (i ), ρ , SIB(i ), R′}, where AI(i) specifies the auxiliary secret.
information of d[i ] in the previous tree. On reception of proof for Phase II : (0 can make Hash_Query and Sign_Query at this phase of
insertion procedure, +7 first authenticates τ. Upon successful authen- oracle model.
tication of τ it produces root using {AI(i), H (d[i ])} and then authenti- Phase III : At this stage, after sufficient number of oracle queries, (0 is
cates this newly generated root by verifying following Eq. (2). If Eq. (2) able to generate a pair 〈σi*, d[i ]〉.
holds TRUE, +7 can now verify whether *:7 has accomplished the (0 wins GameI iff:
insertion correctly by further producing a fresh value for root using
{AI(i), H(d[i]), H (d *)} and comparing it with R’. If the outcome is (i) σi* is a genuine signature on data block d[i].
successful, +7 authenticates R’ as (H (R′))κ and transfers (H (R′))κ to (ii) (0 has issued no Sign_Query for data block d[i].
*:7 for updation. Finally, +7 runs the default integrity auditing
protocol for new block. If result is TRUE, +7 can erase {ρ’, d *, τ’} from The security of proposed protocol is defined as follows:
its local storage.
Definition 1. The proposed data integrity auditing protocol based on
)3: signatures is existentially unforgeable in an adaptively selected
6.3. Data deletion procedure (++7 ) data block attack type iff the probability of success of a probabilistic
Initial few steps of ++7 are similar to +07 . The structure of a polynomial time adversary ( I defined in above GameI is negligible.
data deletion message is (+ , V, i, τ’). If the ‘V’ field is set ‘A’ then a node
with hash value H(d[i+1]) is removed from the original tree, or if the ‘V’
8. Security proofs
field is set ’B’ then a node with hash value H(d[i−1]) is removed from
the original tree. The relative index value and hash value of parent node
is also updated (See Fig. 5). After this all the steps of ++7 are similar Theorem 1. If an adversary (0 having an advantage ϵ to generate a
to that of +07 and are thus omitted here. forgery in time span ‘t’ on a data block d[i] by simulating GameI and
making qH queries to Hash_Query, qS queries to Sign_Query and
qsys queries toReveal_System_Parameter_Query in a 964 , then a
6.4. Data append procedure(+(7 ) *+/ problem can be solved with probability:
+(7 is similar to data insertion at the end of a file. The details of this
procedure is thus omitted here. ϵ′ ≥ (1/(qS + 1)e)ϵ

8
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Fig. 5. 90;: − − 4 /; after ++7 of d[3] block after d[2] block.

and in polynomial time: assumption of firmness of *+/ problem.


tc ≥ t + tsm(qH + 2qS ) Further, the following proof shows that the *+/ problem can be
solved by B with probability:
where tsm is time for one scalar multiplication in G and e is base of ϵ′ ≥ (1/(qS + 1). e)ϵ
natural logarithm.
B requires three events in order to succeed:
Proof. Let ( be an attacker and is provided a specimen (g , g a , g b ) of
1: Any Sign_Query by ( I does not aborts B.
*+/ problem in cyclic group G having prime order q. ( interacts with
2 : ( I generates a genuine and non trivial signature σk on data block
(0 to fix *+/ problem by simulating GameI using an algorithm B.
dk.
Here the hash function H is treated as random oracle. B maintains a list
3: ( I produces a valid and non trivial forgery and B does not aborts is
LH which is initially empty.□
probable.
(0 makes queries at various stages of GameI and B responds to all
The probability that B succeeds after these three events happens is:
queries of (0 throughout the game. B is maintained by ( .
P[1 ∩ 2 ∩ 3] = P[1]. P[2|1]. P[3|1 ∩ 2]
(i) Reveal_System_Parameter_Query: At the beginning (0 requests
Claim 1: The probability that any Sign_Query by ( I does not aborts B
B for setup of system. B answers this query as follows:
is atleast (1 − δ )qS .
(a) B picks a random r ∈p as it secret key.
Thus,
(b) B generate its public key as gr ∈ G1.
(c) B then send (g , gr , G1) to (0 keeping r as secret. P[1] ≥ (1 − δ )qS
(ii) Hash_Query: (0 can make a Hash_Query at any time. To keep
Proof: As described above P[cl=1]=(1−δ), therefore the probability that
track of each block d[i] ever queried by (0 , B makes a list LH:
B does not abort is (1−δ). As it takes almost qS Sign_Query, so
{d[i ], wi , ki, cl }, which is initially empty. To answer this query B
probability of B not aborting after queries by (0 is atleast (1 − δ )qS .
behaves as follows:
Claim 2: The probability that B does not abort after ( I 's
(a) B checks LH for any previous query on d[i]. If found then B
Sign_Query and will generate a valid signature on d[k] P[2|1] ≥ ϵ.
responds H(d[i])=wi as answer.
Claim 3: The probability that ( I generate a valid and non trivial
(b) Otherwise, B flips a coin cl∈ {0, 1}, such that the probability
forgery after events 1 and 2 have occurred and B does not abort is δ.
for cl=0 (Pcl =0 = δ ) and the probability for cl=1 (Pcl =1 = 1 − δ ).
Thus P[3|1 ∩ 2] ≥ δ .
(c) B chooses a random k i ∈ p and compute
wi = (g b )(1− cl )ψ (g)ki ∈ G1. Proof. Now,
(d) B adds a tuple {d[i ], wi , ki, cl } to LH and responds H(d[i])=wi
P[1 ∩ 2 ∩ 3] = P[1]. P[2|1]. P[3|1 ∩ 2] = (1 − δ )qS . ϵ. δ
as answer to (0 .
(iii) Sign_Query: (0 can make a Sign_Query for a data block d[i] and Thus,
B answers to this query as follows:
ϵ′ = δ . (1 − δ )qS . ϵ (4)
(a) B first issues a Hash_Query to obtain list LH, and checks the
value of cl. To find minimum value of ϵ’, differentiating both sides of Eq. (4):
(b) If cl=0, B reports failure and halts.
(c) Otherwise, if cl=1, then wi=ψ (g)ki and set σi=(ψ (g a )ki )(ψ (g)rki ). d ϵ′/dδ = d (δ . (1 − δ )qS . ϵ)/dδ = (1. (1 − δ )qS )ϵ
Observe that σi=wiawir =wi(a + r ) + (δ. qS (1 − δ )(qS −1) . ( − 1))ϵ = (1 − δ )(qS −1)ϵ((1 − δ ) − δ. qS )
(d) B responds σi to (0 .Forgery : Eventually (0 generates a pair = (1 − δ )(qS −1)ϵ(1 − δ (1 + qS ))
<d[k ], σk*>. Assume (0 generates a forged signature σk* for a data
block d[k] such that no Sign_Query has been issued for d[k]. Now B Substitute d ϵ′/ dδ =0,
performs following steps to generate a forgery:
(1 − δ )(qS −1)ϵ(1 − δ (1 + qS )) = 0
(a) B issues a Hash_Query for d[k] to fetch list LH. B checks the value We get,
of cl. If cl=1, B halts.
(b) If cl=0, H(d[k])=wi=g bψ (g)k . δopt = 1/(1 + qS )

Thus Eq. (4) becomes:


Hence,
ϵ′ ≥ δ. (1 − δ )qS . ϵ ≥ 1/(qS + 1)[1 − 1/(qS + 1)]qS . ϵ
σk* = (g bψ (g)k )(a + r ) = g b(a + r )(ψ (g)k )(a + r ) = g abgrbψ (g ak )ψ (grk )
For large value of qS, [1 − 1/(qS + 1)]→ 1/e, therefore the probability
⟹g ab = σk*/(grbψ (g ak )ψ (grk )) becomes,
B knows the values of σk*, gb, r, k. Thus B can compute gab or B can fix ϵ′ ≥ [1/(qS + 1). e]ϵ
*+/ problem in G1 in polynomial time, which is a contradiction to the
Further, the following proof shows that B solves the *+/ problem in

9
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

polynomial time ⎛1 1⎞ x2 4xa 4xa2 4 ⎛1 1⎞


1 − xa⎜ + ⎟ + a ≤ 1 − + −⎜ + ⎟
tc ≥ t + tsm(qH + 2qS ) ⎝A B⎠ AB A+B (A + B )2 A + B ⎝ A B⎠
⎛ 4 1 ⎞ 4AB − (A + B )2
≤ xa⎜ − ⎟
⎝ (A + B )2
AB ⎠ (A + B )AB
(a) Running time of algorithm B=running time of (0 + time to ⎛ 4AB − (A + B )2 ⎞ −(A − B )2
respond to (qH + qS ) Hash_Query + time to respond to qS ≤ xa⎜ ⎟⟹
Sign_Query. ⎝ AB(A + B ) ⎠ 2
(A + B )AB
(b) Each query requires computation of an exponentiation operation ⎛ −(A − B )2 ⎞
in group G which is equal to tsm. ≤ xa⎜ ⎟
⎝ AB(A + B )2 ⎠

Using (a) and (b), total time required by algorithm B to solve *+/ As long as 0 ≤xa≤ (A + B)
problem is: ⎛ ⎞2
⎜ ⎟
tc ≥ t + tsm(qH + 2qS ) ⎛ x ⎞⎛ x ⎞ ⎜ xa ⎟
Applying ⎜1 − a ⎟ ⎜1 − a ⎟ ≤⎜1 − , we get
⎝ A⎠ ⎝ B⎠ ⎛ A + B⎞ ⎟
This completes the proof of Theorem 1. ⎜ ⎜ 2⎟ ⎟
⎛ ⎝ ⎝ 2 ⎞⎠⎠
⎜ ⎟
⎛ x ⎞⎛ xa ⎞ ⎜ xa ⎟
⎜1 − a ⎟⎜1 − ⎟ ≤ ⎜1 −
9. Performance analysis ⎝ n ⎠⎝ n − t + 1⎠ ⎛ t − 1⎞⎟
⎜ n−⎜ ⎟⎟
⎝ ⎝ 2 ⎠⎠
The performance of 90;: − − 4 /; has been analyzed in this
section. The analysis of probability of malpractice at *:7 has been and,
studied. We also discussed the computation complexity of
90;: − − 4 /; with reference to number of corrupted blocks and ⎛ ⎞2
⎜ ⎟
number of blocks modified. The computation cost of various algorithms ⎛ xa ⎞ ⎛ xa ⎞ ⎜ xa ⎟
⎜1 − ⎟⎜1 − ⎟ ≤ ⎜1 −
comprising the proposed protocol have also been given followed with ⎝ n − 1 ⎠⎝ n − t + 2⎠ ⎛t − 1⎞⎟
⎜ n−⎜ ⎟⎟
the storage cost and communication overhead in 90;: − − 4 /; . ⎝ ⎝ 2 ⎠⎠
The results of comparison between 90;: − − 4 /; and two prior
data integrity auditing methods presented by Tan and Jia (2014) and Similarly finding other results and multiplying all results, we get
Wang et al. (2011) have been given. ⎛ x ⎞⎛ xa ⎞⎛ xa ⎞ ⎛ xa ⎞
⎜1 − a ⎟⎜1 − ⎟⎜1 − ⎟ − − − − ⎜1 − ⎟
⎝ n ⎠⎝ n − 1 ⎠⎝ n − 2⎠ ⎝ n − t + 1⎠
9.1. Probability of detecting malpractice at *:7
⎛ ⎞t
The proposed protocol relies on probability sampling method to ⎜ ⎟
reduce computation burden. The sampling method partitions +7 's ⎜ xa ⎟
≤ ⎜1 −
data file ‘F’ in ‘n’ number of blocks and then selects randomly ‘t’ blocks ⎛ t − 1⎞⎟
⎜ n−⎜ ⎟⎟
out of ‘n’ blocks to check for malpractice if any. Using this, we will ⎝ ⎝ 2 ⎠⎠
study the probability of malpractice detection at *:7 .
Consider that *:7 alters ‘xa’ blocks out of the total ‘n’ blocks, then Hence,
the probability that randomly checked block is a corrupted block is: ⎛ ⎞t
x ⎜ ⎟
p0= a xa
n P ≥ 1 − ⎜1 − ⎟
Now, assume that the probability of detecting malpractice in at least ⎜ t − 1⎟
n−
one block is specified as P. ⎝ 2 ⎠
Therefore, P=1 – Probability of not detecting malpractice For finding maximum value of P, consider
Probability of not detecting malpractice in 1st block
n − xa x ⎛ xa ⎞ ⎛ xa ⎞
checked= =1 - a ⎜1 − ⎟ ≥ ⎜1 − ⎟
n n ⎝ n − i⎠ ⎝ n − t + 1⎠
Probability of not detecting malpractice in 2nd block
n − xa − 1 xa
checked= =1 - t −1
⎛ xa ⎞ ⎛ xa ⎞t t −1
⎛ xa ⎞
n−1 n−1 ⟹ ∏ ⎜1 − ⎟ ≥ ⎜1 − ⎟ ⟹1 − ∏ ⎜1 − ⎟
i =0 ⎝ n − i⎠ ⎝ n − t + 1⎠ ⎝ n − i⎠
Probability of not detecting malpractice in 3rd block i =0
n − xa − 2 xa
checked= =1 - ⎛ xa ⎞t
n−2 n−2 ≤ ⎜1 − ⎟
Probability of not detecting malpractice in tth block ⎝ n − t + 1⎠
n − xa − t + 1 xa
checked= =1 - Hence,
n−t+1 n−t+1
Therefore,
⎛ xa ⎞t
⎛ x ⎞⎛ xa ⎞ ⎛ xa ⎞ P ≤ ⎜1 − ⎟
P = 1 − ⎜ 1 − a ⎟ ⎜1 − ⎟ − − − − ⎜1 − ⎟ ⎝ n − t + 1⎠
⎝ n ⎠⎝ n − 1⎠ ⎝ n − t + 1⎠
Now we conclude that probability of malpractice at *:7 is:
t −1
⎛ xa ⎞
=1− ∏ ⎜1 − ⎟
⎛ ⎞t
⎝ n − i⎠
i =0
⎜ xa ⎟ ⎛ xa ⎞t
1 − ⎜1 − ⎟ ≤ P ≤ ⎜1 − ⎟
For finding minimum value of P, consider ⎜ t − 1⎟ ⎝ n − t + 1⎠
n−
⎝ 2 ⎠ (5)
⎛ ⎞2
⎜ ⎟ Assume the +7 partitions a 1 GB file in 62500 data blocks with each
⎛ x ⎞⎛ x ⎞ ⎜ xa ⎟
⎜1 − a ⎟⎜1 − a ⎟ ≤ ⎜1 − ; A, B ≥ 0 block having size equal to 16 KB. Fig. 6 shows the number of blocks(t)
⎝ A ⎠⎝ B⎠ ⎛ A + B⎞ ⎟
⎜ ⎜ ⎟⎟ required in challenge message to detect number of blocks corrupted by
⎝ ⎝ 2 ⎠⎠
considering probability of malpractice detection (P) from a set Px={0.7,

10
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Table 2
Notations with descriptions.

Notation Description

H(G) Hash a value into group G


Add(G) Addition task in G
Mul(G) Multiplication task in group G
Exp(G) Exponentiation task in group G
Pair(G) Pairing task in group G

Table 3
Computation cost of various algorithms in 90;: − − 4 /; .

Algorithm Addition Multiplication Hash Exponentiation Pairing

KeyGen 0 0 0 Exp(G) 0
FileTagGen 0 0 0 Exp(G) 0
BlockSigGen 0 0 nH(G) (n+1)Exp(G) 0
GenProof tAdd(G) (t−1)Mul(G) 0 tExp(G) 0
Fig. 6. Number of challenged blocks required w.r.t 90% percent of corrupted blocks.
VerifyProof 0 tMul(G) 0 tExp(G) 4

Table 1 n: total number of blocks; t: number of challenged blocks.


Number of challenged blocks upto 90% of corrupted blocks.

Table 4
Number of corrupted blocks (xa) Number of challenged blocks (t)
Comparison of computation cost.
6250 23
Protocol Server Auditor Time
12 500 12
complexity
18 750 10
25 000 7
Wang protocol (Wang et al., tExp(G) + (t 2Pair + (t+2)Exp O(n)
31 250 5
2011) −1)Mul(G) (G) + (t+1)Mul
37 500 4
(G)
43 750 4
Tan protocol (Tan and Jia, 2tExp(G) + 3Pair + (t+1)Exp O(n)
50 000 3
2014) (2t−2)Mul (G) + tMul
56 250 2
(G)
90;: − − 4 /; tExp(G) + (t 4Pair + tExp(G) O(logn)
−1)Mul(G) + tMul(G)
0.8, 0.9, 0.99, 0.99999}. Thus if *:7 modifies 10% of total outsourced
blocks (n), then the ;7( needs only 68 random blocks in challenge set n: total number of blocks; t: number of challenged blocks.
to detect a server malpractice with 0.999 probability. Table 1 shows
number of blocks required in challenge message with various percent of in searching a node in 4 /; is reduced to O(logn). The computational
corrupted blocks to detect a malpractice with probability 0.9. Also with complexity of proposed search algorithm is similar to searching an
an increase in number of corrupted blocks, the number of challenged element in a Binary Search Tree ():; ) with (2n - 1) nodes and the
blocks decreases. For example, if *:7 modifies 40% of total out- computational complexity in Wang et al. (2011) is similar to that of a
sourced blocks (n), then the ;7( needs only 18 random blocks in linear search algorithm. The performance of proposed protocol in
challenge set to detect a server malpractice with 0.999 probability. terms of frequent data modifications is depicted in Fig. 7. An experi-
ment for updating a file of 1 GB size has been conducted where the
number of blocks updated varies between 100 and 1000. Results
9.2. Computation cost
depicted are of 10 trials. From Fig. 7, it is clear that as the number
of blocks modified increases, the computation cost in Wang et al.
Computation cost is viewed differently by different entities in an
(2011) increases at higher rate as compared to the proposed protocol.
auditing model. For a ;7( , computational cost specifies the resources
required by ;7( to perform an integrity audit. For a *:7 , computa-
tion cost corresponds to the time required to process the challenge
message and produce a proof of possession. The computation cost
incurred in a data dynamic procedure is different from a regular data
integrity audit. In a update procedure, ;7( requires time to generate
block tag for fresh block, *:7 requires time to regenerate 4 /; for
regenerating root hash, updating file tag. All these tasks contributes to
computational burden on respective entities. Following this section, we
present computation cost of 90;: − − 4 /; and provide a compar-
ison with Wang et al. (2011) and Tan and Jia (2014) protocol. For the
convenience of discussion, we use notations presented in Table 2 for
our analysis. Table 3 depicts the computation cost of each algorithm in
90;: − − 4 /; . Table 4 provides the comparison of computation
cost of 90;: − − 4 /; protocol with Wang et al. (2011) protocol
and Tan and Jia (2014) protocol. From Table 4 it is clear that
90;: − − 4 /; is efficient w.r.t performance than the other two
schemes. In Wang et al. (2011) the computation time in searching a
blocks is high and varies linear w.r.t the number of blocks i.e. O(n).
However in 90;: − − 4 /; , due to relative index field, complexity Fig. 7. Computation cost vs Number of blocks modified.

11
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

Table 5
Storage cost in 90;: − − 4 /; .

Entity Storage cost (bits)

+7 k+2 |G|+l
;7( k+|G|+l
*:7 n+l+(n+1)|G|

n: total number of blocks; k: security parameter; l: size of file tag; |G|: size of an element
∈G

In Tan and Jia (2014), the data dynamic operations are not supported.

9.3. Storage cost

The storage and communication cost of 90;: − − 4 /; has been


analyzed in this section. +7 stores key pair (η, κ ), file tag hash h. The
delegated ;7( stores public key η, file tag τ. *:7 stores +7 's data file Fig. 8. Computation cost vs Number of challenged blocks.
F, file block tags θ, root signature ρ, file tag τ. Thus the size of metadata
generated due to preprocess of a file plays an important role in storage 9.5. Experimental evaluation
overhead. The metadata generated in protocols which are based on
RSA is of the size of RSA modulo. Consider 9:( modulo be 2048 bit The experiment was carried out on a system with processor
in size and public key size be 6168 bit prime number, then protocol configuration Intel core (TM) 2 Duo processor at 2.20 GHz and
(Ateniese et al., 2008) concludes that the size of a data block should installed memory as 2 GB RAM using C language. We use Pairing
remain ≤ e/2. The protocol may become unsecured if size is < e/2. Based Cryptography (PBC) library (Lynn, 2016) version 0.5.14 for
Therefore, a data of size 64 MB is required to be partitioned into deploying algorithms using pairings and cryptographic library Openssl
minimum 87,056 data blocks and every block possessing length of version 1.0.2e (openssl Project, 2016) for implementing hash opera-
metadata as 2048 bits. Thus size of total metadata of around 21.2 MB. tions (SHA256). We have practiced 160 bit group order elliptic curve
In 90;: − − 4 /; , we have chosen )3: signatures based homo- for our construction. The simulation platform in 90;: − − 4 /; is
morphic tags due to their shorter length in comparison to 9:( ubuntu-10.04 operating system. Pairing parameters with type A have
signatures based homomorphic tags. It is clear from Table 5 that the been chosen from the “param” subdirectory of the PBC record to key in
storage overhead due to metadata at +7 and ;7( is constant. pairing. We choose Type A pairing as it is fastest. The results plotted
However it varies linearly at *:7 . are achieved by mean of 10 trials. For experimental demonstration, we
use file size varying from 10 KB to 100 KB. Fig. 8 shows the variation in
number of challenged blocks with computation cost of an integrity
audit on ;7( . It is clear from the figure that 90;: − − 4 /; incurs
9.4. Communication cost
minimum computational burden on the ;7( . To study the impact of
an update operation, in Wang et al. (2011), the ;7( first search for
The volume of data transferred during an auditing instance between
the location of block in the 4 /; whose complexity varies linearly
*:7 and a ;7( constitutes communication cost. In
with respect to number of blocks in a tree as previously discussed in
90;: − − 4 /; , communication cost majorly arises from the chal-
Section 4. Thus the search operation adds a computational overhead on
lenge message sent by ;7( to *:7 and the proof information sent by
the ;7( . However in 90;: − − 4 /; , the computational complex-
*:7 to ;7( . The communication cost incurred in delegating auditing
ity of searching a node is reduced logarithmically. It is clear from Fig. 8
task to ;7( by +7 is constant. Table 6 shows communication cost
that the computation cost of in Wang et al. (2011) and Tan and Jia
involved between various entities in 90;: − − 4 /; . +7 first
(2014) protocol increases as the file size increases. Our protocol incurs
delegates auditing task to ;7( , for this +7 shares keys and file tag,
minimum computational overhead on the ;7( . Thus,
root signature with ;7( . Therefore the communication cost equals
90;: − − 4 /; marks an impact on the computational overhead
size of keys, root signature, file tag and which is only one time for every
when a +7 outsources a large data file in cloud or updates a
audit. ;7( sends challenge message to *:7 which includes a k
outsourced file frequently in cloud.
element subset Q and random element bi ∈ Zp for i ∈Q. Now as in Lynn
(2016), Q can be represented as {log|F | + 3log(1/err )} bits, with |F |
specifying data file size and ‘err’ being error probability. Normally for a
10. Conclusions
file size ≤ 1024 TB and 2 ≤ err ≤ 80, size of Q can be expressed in 283
bits. So the communication cost in challenge phase is maximum t . |Zp| +
In this paper an integrity auditing protocol for cloud based data,
283 bits in 90;: − − 4 /; for a data file with size ≤ 1024 TB.
relying on 4 /; construction and )3: signatures has been pro-
posed. The relative index of a node has been integrated with the hash
Table 6
Communication cost in 90;: − − 4 /; . value of the node in tree to reduce searching complexity. The remark-
able feature of this protocol is the time stamp field concatenated with
Phase Communication cost (bits) the root hash which guarantees data freshness. Our construction
guarantees that whenever a data file is accessed by a user it is the
Delegation of audit task +7→ TPA: 2 |G|+l
Challenge phase ;7( → *:7 : t. |Zp| + 283 most recent copy of data. The proposed protocol efficiently supports
Responding proof *:7 → ;7( : |G| + |Zp| + (t +2)256 public auditing of data, data dynamic procedures and data freshness,
which are missing in most of the existing techniques. Experimental
size of element ∈Zp; t: number of blocks challenged; l: size of file tag; |G|: size of an results and security proofs have proved that 90;: − − 4 /; is
element ∈G; |Zp|. highly secure and efficient one.

12
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13

References Juels, A., Kaliski Jr, B.S., 2007. Pors: Proofs of retrievability for large files. In:
Proceedings of the 14th ACM Conference on Computer and Communications
Security. ACM, in Alexandria, Virginia, USA pp. 584–597.
Ateniese, G., Burns, R., Curtmola, R., Herring, J., Kissner, L., Peterson, Z., Song, D., Lynn, B., Accessed: 11 January 2016. Pairing-based cryprography library.〈http://crypto.
2007. Provable data possession at untrusted stores. In: Proceedings of the 14th ACM stanford.edu/pbc/〉.
Conference on Computer and Communications Security. ACM, Alexandria, Virginia, Mell, P., Grance, T., 2011. The Nist Definition of Cloud Computing.
USA pp. 598–609. Merkle, R.C., 1980. Protocols for public key cryptosystems. In: Null. IEEE, in Oakland
Ateniese, G., Di Pietro, R., Mancini, L.V., Tsudik, G., 2008. Scalable and efficient provable california p. 122.
data possession. In: Proceedings of the 4th International Conference on Security and Mohammad Etemad, M., Küpçü, A., 2016. Generic efficient dynamic proofs of
Privacy in Communication Networks. ACM, in Istanbul, Turkey USA p. 9. retrievability. In: Proceedings of the 2016 ACM on Cloud Computing Security
Boneh, D., Franklin, M., 2001. Identity-based encryption from the weil pairing. In: Workshop. ACM, in Vienna, Austria pp. 85-96.
Advances in CryptologyCRYPTO 2001. Springer, in Santa Barbara, California, pp. openssl Project, T., Accessed: 11 January 2016. Openssl library. 〈http://www.openssl.
213–229. org/docs/manmaster/crypto/crypto.html〉.
Boneh, D., Lynn, B., Shacham, H., 2001. Short signatures from the weil pairing. In: Shacham, H., Waters, B., 2013. Compact proofs of retrievability. J. Cryptol. 26 (3),
Advances in CryptologyASIACRYPT 2001. Springer, in Gold Coast, Australia pp. 442–483.
514–532. Tan, S., Jia, Y., 2014. Naepasc: a novel and efficient public auditing scheme for cloud
Bowers, K.D., Juels, A., Oprea, A., 2009. Hail: a high-availability and integrity layer for data. J. Zhejiang Univ. Sci. C 15 (9), 794–804.
cloud storage. In: Proceedings of the 16th ACM conference on Computer and Wang, Q., Wang, C., Li, J., Ren, K., Lou, W., 2009. Enabling public verifiability and data
communications security. ACM, pp. 187–198. dynamics for storage security in cloud computing. In: Computer Security–ESORICS
Chen, B., Curtmola, R., Ateniese, G., Burns, R., 2010. Remote data checking for network 2009. Springer, in Saint-Malo, France pp. 355–370.
coding-based distributed storage systems. In: Proceedings of the 2010 ACM Wang, Q., Wang, C., Ren, K., Lou, W., Li, J., 2011. Enabling public auditability and data
Workshop on Cloud Computing Security Workshop. ACM, in Chicago, Illinois, USA dynamics for storage security in cloud computing. IEEE Trans. Parallel Distrib. Syst.
pp. 31–42. 22 (5), 847–859.
Curtmola, R., Khan, O., Burns, R., Ateniese, G., 2008. Mr-pdp: Multiple-replica provable Wang, J., Chen, X., Huang, X., You, I., Xiang, Y., 2015. Verifiable auditing for outsourced
data possession. In: The 28th International Conference on Distributed Computing database in cloud computing. IEEE Trans. Comput. 64 (11), 3293–3303.
Systems, 2008. ICDCS'08. IEEE, in Beijing, China pp. 411–420. Zhang, F., Safavi-Naini, R., Susilo, W., 2004. An efficient signature scheme from bilinear
Erway, C.C., Küpçü, A., Papamanthou, C., Tamassia, R., 2015. Dynamic provable data pairings and its applications. In: Public Key Cryptography–PKC 2004. Springer,
possession. ACM Trans. Inf. Syst. Secur. (TISSEC) 17 (4), 15. in Singapore pp. 277–290.
Garg, N., Bawa, S., 2016. Comparative analysis of cloud data integrity auditing protocols.
J. Netw. Comput. Appl. 66, 17–32.

13

You might also like