RITS-MHT_Relative_indexed_and_time_stamped_Merkle_
RITS-MHT_Relative_indexed_and_time_stamped_Merkle_
RITS-MHT_Relative_indexed_and_time_stamped_Merkle_
RITS-MHT: Relative indexed and time stamped Merkle hash tree based MARK
data auditing protocol for cloud computing
⁎
Neenu Garg , Seema Bawa
Department of Computer Science and Engineering, Thapar University, Patiala, India
A R T I C L E I N F O A BS T RAC T
Keywords: Cloud computing is a recent phrase in marketing for a concept which has been known for years: Outsourcing.
Proof of possession Cloud computing provides cornucopian of gratuities for both individual users and business organizations.
Data integrity audit ‘Cloud’ is more of a notion where files and data are hosted online and can be accessed when required via a
Third party audit number of methods, anywhere, and at any time. That is the gist of it. Providing storage space to cloud users is an
Cloud computing
appealing service of cloud computing. Although cloud storage provides benefits of location independence access
to data, reduced burden for hardware and software maintenance and many more, yet this service has several
challenges in the range of security and preserving data. Cloud servers may exist in white puffy shapes in the
azure, but these are not immune to temporal errors. To ensure that outsourced data is secure and is not
tampered, cloud provider must allow data proprietor to periodically audit data integrity. Numerous Remote
Data Auditing (9+( ) protocols have been proposed by researchers so far. In presented work, to cope with this
problem we analyze the efficiency issues of a current protocol for data integrity auditing in cloud storage and
propose an approach based on Relative Index and Time Stamped Merkle Hash Tree (90;: − − 4 /; ) which
integrates 4 /; with relative index of a node resulting in reduction of computation cost of searching a data
block from O(n) in Wang's protocol to O(log n) and time of last modification to data, thereby guarantying
freshness of data respectively. 90;: − − 4 /; ensures that the outsourced data has not been polluted as well
as it assures that the recent copy of data is reclaimed. This protocol supports public auditing of data, and
efficiently supports data dynamic operations like insertion, modification, deletion of outsourced data at minimal
computational cost. The security of proposed protocol has been proved in Random Oracle Model (964 ). As
compared to Wang's protocol, 90;: − − 4 /; is more efficient in terms of computation cost.
1. Introduction a cloud user is data security. Data security is a paradox issue. It is both
appealing as well as a vulnerable spot in cloud computing. Once a user
A rising trend in market termed as “Cloud computing” has greatly has shifted data from local systems to cloud, its data is exposed to
metamorphosed the way organizations interacts with their customers, cloud. There can be external and internal dangers to data once it is on
vendors and employees. Various internet services like Google, YouTube cloud. For example, a Cloud Service Provider (*:7 ), facing sporadic
have modified the way people communicate, shop, educate, and much complex breakdowns, may choose to conceal data errors from users for
more. Services offered by cloud are: Software as a Service (: aa: ), personal benefits (Garg and Bawa, 2016). For economical benefits or
Platform as a Service (7 aa: ), and Infrastructure as a Service (0aa: ) storage space benefits, a *:7 may intentionally erase files which are
(Mell and Grance, 2011). As infrastructure, cloud provides a remote occasionally accessed. Data in cloud can be hacked by an external
storage location, other than user's computer's hard drive or any other attacker who can destroy or delete it for some reason. The traditional
local and private storage infrastructure for keeping data and is integrity auditing methods like hash functions and digital signatures
maintained by cloud itself. Thus the pressure of keeping the data safe are not appropriate in cloud as the local copy of data is not available.
and dealing with daily increasing needs of organizations and indivi- Moreover downloading entire data for integrity verification also seems
duals for storage is handled by cloud provider. Access to data is impractical due to resource limitations of users. The afore stated
facilitated via an Internet connection through a web based interface or situation gets worse if the users are mobile users. Thus due to the
some special Application Programming Interface ((70 ). Apart from voluminous outsourced data and resource constraints, a cloud service
the storage benefits it provides, the top reason to stay away from being consumer may be unable to perform an efficient integrity auditing on
⁎
Corresponding author.
E-mail addresses: neenu.garg@thapar.edu (N. Garg), seema@thapar.edu (S. Bawa).
http://dx.doi.org/10.1016/j.jnca.2017.02.005
Received 11 May 2016; Received in revised form 28 January 2017; Accepted 12 February 2017
Available online 21 February 2017
1084-8045/ © 2017 Elsevier Ltd. All rights reserved.
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
its data periodically and hence relies on a Third Party Auditor (;7( ) encoding of each data copy separately. However in their subsequent
for the same. The ;7( service is efficiently designed to prove data work, Chen et al. (2010) designed a framework for distributed data
integrity without requiring downloading of complete data. Few of the storages based on network coding. One of its major flaw is that it is not
data integrity auditing protocol proposed so far require Data Proprietor publicly auditable and the code is unsystematic and the framework is
(+7 ) to perform data auditing and few facilitates public auditing of inappropriate for online storages. Moreover, accessing few parts of a
data. Public auditing of data implies that anyone authorized by +7 can file requires regeneration of complete data file. Recently Wang et al.
perform data auditing. Practically, it is more rational to combine public (2015) proposed a protocol which supports the correctness and
auditing with auditing protocol for efficiency consideration. Moreover, completeness of proof of possession by server. However the protocol
the data in cloud may be frequently accessed by +7 for not just is provable secure in an encrypted outsourced database but its
reading but for any updating including modify, insert, delete or application is constrained to static files only. Despite the existing
append. A data integrity auditing protocol should be such that it protocols target at providing data integrity auditing for cloud systems,
provides support for data dynamic operations as well. In our work, the the requirement for public audit and data dynamics has yet to be
deficiencies in the integrity auditing protocol proposed by Wang et al. fulfilled. A efficient design of a data integrity auditing protocol which is
(2011) have been analyzed and a Relative Indexed and Time Stamped secure and integrates above two features seamlessly remains an open
Merkle Hash Tree (90;: − − 4 /; ) based data integrity auditing problem for researchers in cloud computing.
protocol which supports public auditing of data and data dynamic
procedures along with ensuring data freshness has been proposed. The 1.2. Our contribution
issue of privacy preservation of outsourced data is orthogonal to the
proposed protocol. The contributions of proposed work are summarized as below:
1.1. Related work 1. The data auditing model in cloud computing has been analyzed and
an auditing protocol is proposed, which efficiently reduces the
To ensure data integrity in cloud systems, numerous RDA protocols computational complexity for integrity verification from O(n) in
have been suggested (Ateniese et al., 2007, 2008; Bowers et al., 2009; Wang et al. (2011) to O(log n).
Shacham and Waters, 2013; Wang et al., 2009; Erway et al., 2015; 2. 90;: − − 4 /; provides assurance not only for data integrity but
Mohammad Etemad, 2016) so far. Among all above suggested proto- also for data freshness. A time stamp has been integrated with a field
cols (Ateniese et al., 2007) worked on the subject of publicly auditable in proof of possession to assure that most recent copy of data is
outsourced data and data integrity assurance. (Ateniese et al., 2007) retrieved.
suggested two PDP constructions as: Sampling(S)-PDP; Efficient(E)- 3. 90;: − − 4 /; has been proven secure in ROM with the
PDP. S-PDP assures firm probabilistic possession of data, however E- assumption of toughness of Computational Diffie Hellman (*+/ )
PDP assures efficient auditing mechanism but weaker data possession problem.
assurance. To provide public auditing of data in Ateniese et al. (2007), 4. The proposed protocol has been efficiently designed to support
RSA based homomorphic tags have been used. However, their protocol public auditing of data as well as data dynamics.
lacks support for data dynamic procedures and its application is limited
to static data only. Erway et al. (2015) is a dynamic version of Ateniese 1.3. Organization
et al. (2007). Again in Erway et al. (2015) number of audits is bounded.
Similar to Erway et al. (2015), Juels and Kaliski (2007) bounds the Section 2 presents the system model of the proposed data integrity
number of audits and does not provides public auditing of data. In auditing protocol along with some threats to data posed by various
continuation to Juels and Kaliski (2007), Shacham and Waters (2013) entities in system model. Section 3 highlights the background methods
proposed two PoR constructions. For their work, they use homo- used to realize the proposed protocol. These include )3: signatures,
morphic authenticators which are publicly auditable. Their work relies bilinear pairings, *+/ problem and classic 4 /; . The efficiency
on BLS signatures, and is also publicly auditable. Shacham and Waters issues in Wang et al. (2011) have been discussed in Section 4. The
(2013) is efficient in terms of time complexity of an auditing procedure details of proposed protocol are given in Section 5. Section 6 represents
however communication complexity of the protocol varies linearly as the support to data dynamic procedures by 90;: − − 4 /; . Section
outsourced data block size increases. Apart from the space efficiency 7 gives an adversarial model for proposed protocol with detailed
(due to BLS signatures) provided by Shacham and Waters (2013), its security proof in Section 8. Section 9 accesses the performance of
usage is constrained to static data. The mechanism for handling data proposed protocol which includes parameters like probability of
dynamic procedures were addressed by Erway et al. (2015) which relies detecting server malpractice, computation cost of auditor, storage
on skip lists where the skip list provide rank based authentication. To and communication costs. This section also includes experimental
assist insertion operations in data blocks, Erway et al. (2015) modifies evaluation of proposed protocol and the results obtained have been
the technique of generating tags in the tag generation phase of Ateniese plotted. Conclusions have been drawn in Section 10.
et al. (2007). The efficiency of Erway et al. (2015) is still uncertain.
Mohammad Etemad (2016) provides a generic framework DPoR from 2. System model of proposed data integrity auditing model
Erway et al. (2015), Ateniese et al. (2007) and erasure codes. They use
logs to store all update information. Their protocol requires main- The system model of proposed data auditing protocol is illustrated
tenance of all logs to ensure data retrievability if any update to data in Fig. 1. This auditing model possess three entities defined as follows:
fails. They provide a mechanism to perform auditing of the logs. Data
has been stored twice at server, one for providing authenticity of data (i) Data Proprietor (+7 ): is a person, a firm, or a business
by reading, and other in log store. Any update command is sent to organization which has sufficient amount of data files for out-
memory checking part for execution. Along with this client prepares an sourcing and may update this data later by carrying modify,
update log to be appended to existing logs in log store. The protocol in delete, insert or append operations. +7 is a dependent entity on
Bowers et al. (2009) provides a framework which involves a mobile cloud provider for data maintenance. Generally, it is a resource
adversary. Their work is constrained to static data. In other related bounded entity.
work, Curtmola et al. (2008) designed a framework for data integrity (ii) Cloud Service Provider (*:7 ): is an entity having sufficient
assurance of numerous data copies widely distributed across storage computationally capable resources and storage servers which have
servers. Their work preserves numerous data copies without requiring unlimited storage space. *:7 is accountable for keeping and
2
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
Fig. 1. System model of 90;: − − 4 /; . where H: {0, 1}*→ G being a hash function and G is a prime order
group having generator g. Here κ denotes secret key ∈p which is ring
of integer modulo p. The )3: signature method consists of three
functions: key generation, message sign and verify as described below:
preserving outsourced data. A *:7 is considered to be an
untrusted entity. (i) keygeneration: Generate secret key κ ∈ p . Publish gκ as public
(iii) Third Party Auditor (;7( ): is a skilled entity to perform audit on key and save secret key.
+7 's data. A ;7( is an entity trusted by both, *:7 and +7 . (ii) message sign (F, κ): ψ→ H(F)κ
;7( minimizes computational pressure of data auditing on +7 . (iii) verify (F, ψ): e(ψ, g)= ? e(H(F), g κ )Boneh et al. (2001) provide
The numbers in Fig. 1 represents the sequence of activities sufficient details on BLS signatures.
taking place in an auditing process. It has been assumed that
before an audit process is initiated, +7 has hired a *:7 and has 3.2. Bilinear pairings
outsourced data at cloud server.
Consider (G1, +), (G2,.) be two cyclic groups having prime order r,
2.1. Threat model where G1 is an additive and G2 is a multiplicative group. A pairing is a
map defined as e: G1 × G1 → G2 satisfying the properties as follows:
As previously discussed, a ;7( is a trusted entity and a *:7 is an
untrusted entity. However, both ;7( and *:7 can cause some threats (i) Bilinear: ∀P ∈G1, ∀Q ∈G2 , ∀a, b ∈Zp , a map ‘e’ is said to be bilinear
to +7 's data (Garg and Bawa, 2016) as discussed below: if e (Pa, Qb)=e (P, Q)ab
(ii) Non-degenerate: ∀P, Q ∈G1, e(P, Q) ≠ 1
(i) Threats caused by ;7( : A +7 relies on ;7( for assuring data (iii) Computable: the map ‘e’ is computable
integrity, assuming that this independent entity is genuine and These pairings have applications in cryptographic schemes.
reliable. However there is always a possibility for ;7( to be These allow users to create new schemes such as )3: signature
curious about +7 's data. Thus the privacy of data can be scheme. Among all the signature methods in cryptography, )3:
compromised by ;7( in a public auditing protocol. Therefore signatures have shortest length (Zhang et al., 2004). Boneh and
along with data integrity preservation, there is a requirement for Franklin (2001) provide descriptive study of bilinear parings.
privacy maintenance to prevent ;7( from gaining any knowledge
from +7 's data. 3.3. The *+/ problem
In the proposed work, the issue of data privacy has not been The *+/ assumption states that a particular computational
addressed and ;7( is assumed to be loyal. problem in a cyclic group is hard. Assume a cyclic group G with order
(ii) Threats caused by *:7 : Some threats to +7 's data caused by g. For given (g , g a , g b ) with unknown a, b ∈Zp ; it is computationally
*:7 are as follows: challenging to establish the value gab.
(a) A *:7 may deliberately remove rarely accessed data for
saving server space without notifying +7 . 3.4. Classical Merkle Hash Tree (4 /; )
(b) A *:7 can cause some processing error on data and this may 4 /; is a renown binary tree data structure where every node other
damage +7 's data permanently. than leaf node is concatenation of its leaf nodes. The node at the top is
(iii) Some external threats: termed as root node. Leaf nodes in a 4 /; represents hashes of data
(a) A legitimate user can access outsourced data via an (70 blocks. Benefits of 4 /; includes the secure verification of data which
offered by *:7 . Weak (70 may put +7 's data at risk. A is large in size. Authentication of root node provides integrity assurance
stranger may use login credentials of some legitimate user of all the leaf nodes. Fig. 2 depicts a 4 /; with 8 leaf nodes. For
and may pollute or delete data without being exposed. example, if an auditor requests the prover for integrity verification of
(b) A former administrator at *:7 may break into a cloud server data node at position 3, the prover shares the auditor with the Auxiliary
and provide harm to stored data. Information (AI) as AI(d[3]): {(HC , L ), H (d[3]), (H (d[4]), R ), (HB, R )}.
Thus, along with offering variety of cloud services, securing data Now the auditor can generate root HR as follows:
accessibility and preventing external attacks to outsourced data
are important for *:7 . (i) Compute HD ← (H (d[3])∥ H (d[4])).
(ii) Compute HA ← (HC ∥ HD ).
3. Preliminaries (iii) Finally compute root HR ← (HA ∥ HB ).
In this section, the background concepts used in realizing By verifying authenticity of root node, all blocks are automatically
90;: − − 4 /; are presented commencing with a brief description authenticated. In this paper, we have modified the classic 4 /; in
of Boneh Lynn Shacham ()3: ) signature method which has been used which every node has now two pieces of information: hash value and
for authenticating block signatures followed by brief discussions on
3
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
relative index. Moreover, the root node in modified 4 /; is con- 4 /; has been modified to store two pieces of information, one is the
catenated with the date and time of creation of tree to ensure data hash value of the data block and the other is the relative index of a
freshness. Merkle (1980) provides more details on 4 /; . node. The relative index associated to a node P specifies the number of
leaf nodes belonging to the subtree of P. In modified 4 /; the relative
4. Efficiency issues in Wang et al. (2011) index of leaf nodes is set to 1. For example, if L and R are left and right
child nodes with hash values as Ha, Hb and relative index field values
Wang et al. (2011) is an efficient protocol for auditing cloud data. as ra and rb to a parent node P, then the node P has hash value as
Their protocol supports public auditing of data and data dynamic (Ha ∥ Hb ) and relative index field value as (ra + rb ). A time stamp field is
procedures also. However, some major flaws in their protocol are associated with the root node of 4 /; . This time stamp is the time
discussed below: and date of creation of 4 /; concatenated to the root hash value.
Fig. 3 shows an example of a modified 4 /; . Here, HR=(HA ∥ HB ∥ dt ),
(i) Computational complexity: The computational complexity of where dt is date and time of tree creation. As HR is always updated for
searching ith leaf node in a particular data integrity auditing any modification done to any block in a tree and so time and date of last
instance is computed as O(n) in Wang et al. (2011), where n modification is reflected in HR, which provides data freshness assur-
denotes number of leaf nodes in 4 /; . If the storage structure ance.
used is continuous, then any operation of insertion, deletion has
O(n) computational complexity, which is quite high. If link list 5.2. Definition
data structure is used then also for reaching ith leaf node, the list
of 4 /; leaf nodes is traversed one by one, leading to computa- This section provides definitions of various algorithms composing
tional complexity of O(n). 90;: − − 4 /; :
(ii) Position authentication: After insertion, deletion operation, the
sequence number of nodes probably gets modified and the height (i) Keygen(1λ): This algorithm is executed by +7 . Here λ denotes a
of tree also becomes imbalanced. The proof of any data dynamic security parameter, and the output is a key pair (pubkey, seckey)
procedure in Wang et al. (2011) is specified as: ←(η, κ).
Pf = {ψ , (H (d[i ]), AI (i ))i ∈ Q, ρ , R′}, which clearly includes no in- (ii) FileTagGen(fname, κ, n, dt): This algorithm is executed by +7 to
formation for authenticating position of block in question. generate tag for file F. Input to this algorithm is the name of file to
Significance of each field in Pf has been explained in Section 5. be outsourced, secret key, number of block partitions and date and
(iii) Empty proofs: There is no mechanism provided to handle the time of preprocessing of file. This file tag is denoted as τ.
integrity auditing instance of a block which actually does not (iii) BlockSigGen(κ, H(d[i ]), dt, u): This algorithm is executed by +7 .
exists in the 4 /; . Its input is secret key κ, hash of file blocks, date and time of
(iv) Data freshness: Wang et al. (2011) provide assurance that the preprocessing of file and a random element u ∈G. Its output is θ,
outsourced data is intact. However, it does not provides any which is an ordered collection of )3: signatures {ψi}1≤ i ≤ n on hash
freshness guarantees for data. of file blocks.
(iv) Challenge: This algorithm is executed by ;7( to produce a
5. The proposed protocol (90;: − − 4 /; ) challenge message for *:7 after delegation of auditing by +7 .
In this section, 90;: − − 4 /; based protocol for performing a (v) GenProof(F, θ, C): This algorithm is executed by *:7 as soon as a
data integrity audit in cloud computing has been detailed. Firstly, the challenge message C is received from ;7( . *:7 generates a
modification in classic 4 /; has been discussed followed with the proof Pf and communicates it to ;7( for verification. Input to
definition of various algorithms comprising the protocol. Then, the this algorithm is file F, signature set θ and challenge message C.
protocol construction is presented in detail. (vi) VerifyProof(C, Pf, η): This algorithm is executed by ;7( to verify
the proof Pf received from *:7 . Input to this algorithm is
5.1. Modified 4 /; challenge message C, Pf message from *:7 and public key g κ .
To reduce the computation complexity of searching a node in Output of this algorithm is {TRUE, FALSE} depending upon
4 /; during a data integrity audit instance, every node of classic whether verification is successful or a failure.
4
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
5.3. Detailed construction ;7( , *:7 uses search algorithm to search ith node ∀ i ∈ C in the
hash tree. While searching the node, *:7 can store the informa-
Consider the file to be outsourced is ‘F’ and is split into ‘n’ data tion of every sibling node on the search path of the ith node. This
blocks (d[1], d[2],……, d[n]). Assume e: G × G → GT denotes a bilinear information will serve as AI of the node being searched. For
map with ‘g’ being generator and ‘p’ being prime order of group G. Let example, if ith node is say 5 (Fig. 3), the AI is specified as:
H : {0, 1}* → G be a cryptographic hash function. The proposed proto- [(HA, 4, L ), (H (d[6]), 1, R ), (HF , 2, R )] where L specifies a left
col is described in detail as follows: sibling and R specifies a right sibling. Algorithm 5 also produces
a field SIB as one of the output. SIB specifies the number of sibling
(i) Keygen(1λ): +7 must first generate set of keys (pubkey, seckey) nodes positioned at left to the node in question. For example, in
by using key generation procedures shown in Algorithm 1. The above case SIB(5)=4. The pseudo code for searching a node in
algorithm first chooses a random element κ ∈ Zp as a secret key 90;: − − 4 /; is given in Algorithm 5. If the ith node doesn't
and produces η = g κ as the public key. exists in 4 /; , then *:7 reports message of *block not found*
(ii) FileTagGen: +7 now generates tag for file F. For this, +7 first to ;7( and halts auditing process. However, if the block does ex-
generates a random element say u ∈ G . +7 generates system date ists then it continues the proof generation further and produces a
and time denoted as dt. The system date and time is concatenated complete proof of possession. *:7 now computes:
in generating file tag τ to ensure freshness of data file. Let rk
h=(fname ∥ n ∥ u ∥ dt ) where h ∈ G and τ=sigκ (h ) be the file tag ψ← ∏ ψibi ∈ G
for F. The concatenated string ‘h’ is stored locally for future i = r1
verification of file tag. The pseudo code for generating file tag is
and,
given in Algorithm 2.
rk
(iii) BlockSigGen: After generating file tag τ, +7 generates )3:
signatures of each ith file block d[i] as ψi=(H (d[i ]). u d[i])κ where i
μ← ∑ bid[i] ∈ Zp
i = r1
∈{1, 2,…, n} and u ∈G. Here H(d[i]) denotes hash of file block by
using some cryptographic hash function. In proposed protocol we After computing ψ and μ, *:7 responds ;7( with the following
use SHA256 hash algorithm. The set of signatures is denoted as as proof of possession: Proof Pf = {μ, ψ , (H (d[i]), AI (i ))i ∈ Q , SIB(i ), ρ}
θ = {ψi}, 1 ≤ i ≤ n . After generating block signatures, +7 then Here AI(i) is the auxiliary information of ‘i’ which consists
produces a 90;: − − 4 /; using H(d[i]) ∀i ∈{1, n} as leaves small information about the nodes which are siblings of ‘i’ on the
and generates a root node HR. On achieving HR, +7 concate- path from root node to H(d[i]) as explained above.
nates HR with date and time of system to ensure data freshness. (vi) VerifyProof: Before the actual verification process begins, ;7(
Thus HR=HR ∥ dt . Now +7 authenticates root HR using κ. Finally authenticates the file tag τ as follows:
the signature of root is denoted as ρ ← H (R )κ . +7 will send {F, τ, (a) Recover locally stored file tag hash h.
θ, ρ} information to the *:7 for storage and deletes the same (b) Recover file tag τ from server.
information from local storage. The pseudo code for generating (c) Check whether following equation holds or not:
block signatures and root signature is given in Algorithm 3 and
e(h , g κ ) =? e(τ , g) (1)
The pseudo code for generating hash tree has been given in
Algorithm 4.
(iv) Challenge: A +7 may wish to check integrity of outsourced file F (d) Output TRUE if verification is successful and recover u and
later. For this +7 delegates auditing task to ;7( by sharing file dt.
details, root signature and security keys over a secure and (e) Halt otherwise.
authentic channel. To start with audit, ;7( picks random k- If the authentication verification of file tag τ in Eq. (1)
element subset Q={r1, r2, …, rk} ∈ [1, n] and ∀i ∈Q, ;7( selects would have failed, ;7( will halt further auditing process and
random element bi ∈ Zp . The ;7( sends the challenge message C reports *integrity audit process failure* message to +7 .
←(i , bi )i ∈ Q to *:7 . However if the authentication of file tag is successful, then
(v) GenProof: As soon as the challenge message C is received from ;7( recovers u and dt field, which has to be used in
5
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
6
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
7
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
integrity auditing protocol for new block. If result is TRUE, +7 can 7. Adversarial model for proposed protocol
delete {d[i]’, τ’, ρ’} from its local system.
In this section the adversarial model for the security of proposed
protocol has been presented. The security of proposed protocol against
6.2. Data insertion procedure (+07 )
an existential forgery under selective data block attack in a 964 is
Assume +7 wants to insert a data block d * at a particular position
established by simulating a game (GameI) between an adversary ( I
(Fig. 4), it has to execute following +07 :
and an auditor ( .
We will prove that if (0 is able to produce a forgery with non-
a) +7 will first produce signature for the new block as
* negligible success probability on a signature of data block, then ∃ an
ψ *=(H (d *). u d )κ .
algorithm B capable of generating solution of *+/ problem in cyclic
b) +7 will then produce a new file tag τ* ← sigκ (fname ∥ n ∥ u ∥ dt ).
groups.
In GameI, ( I may access following three oracle queries:
Now +7 frames a data insertion message as (0 , V, i, d *, ψ’, τ’) and
transfer this message to *:7 . Here 0 specifies +07 . The field ’V’ (i) Reveal_System_Parameter_Query: (0 may request set of system
specifies position of new block to be inserted, V ←A specifies insertion
parameters from ( . As a result, ( returns system parameters to
after ith position and V ←B specifies insertion before ith position. (0 .
Upon receiving insertion message, *:7 saves d * and the corre-
(ii) Hash_Query: (0 may request hash queries at any time. (
sponding leaf node H (d *). Now *:7 uses Algorithm 5 and finds H(d[i])
maintains a list to respond to this query.
in the 4 /; , and preserves AI(i) and inserts leaf node H (d *). If the ‘V’
field is set ‘A’ then an internal node with hash value (H (d[i ])∥ H (d *)) (iii) Sign_Query: (0 may request signature queries for a data block
will be joined to the original tree, or if the ‘V’ field is set ‘B’ then an d[i]. The auditor responds to this query with a signature σi.
internal node with hash value (H (d *)∥ H (d[i ])) will be added to the
original tree with index set to 2. After this the detail of each node lying There are three phases in GameI, discussed as below:
on the track from ith node to the topmost node (root) are modified. Phase I : In this phase, (0 generate a
Now *:7 produces a fresh root R due to the regenerated 4 /; . Reveal_System_Parameter_Query. To answer this query, ( executes
Finally, *:7 delivers +7 with a proof for insertion procedure as, Keygen algorithm. ( send parameters to (0 keeping its master key as
Pf = {ψ , (H (d[i ]), AI (i ), ρ , SIB(i ), R′}, where AI(i) specifies the auxiliary secret.
information of d[i ] in the previous tree. On reception of proof for Phase II : (0 can make Hash_Query and Sign_Query at this phase of
insertion procedure, +7 first authenticates τ. Upon successful authen- oracle model.
tication of τ it produces root using {AI(i), H (d[i ])} and then authenti- Phase III : At this stage, after sufficient number of oracle queries, (0 is
cates this newly generated root by verifying following Eq. (2). If Eq. (2) able to generate a pair 〈σi*, d[i ]〉.
holds TRUE, +7 can now verify whether *:7 has accomplished the (0 wins GameI iff:
insertion correctly by further producing a fresh value for root using
{AI(i), H(d[i]), H (d *)} and comparing it with R’. If the outcome is (i) σi* is a genuine signature on data block d[i].
successful, +7 authenticates R’ as (H (R′))κ and transfers (H (R′))κ to (ii) (0 has issued no Sign_Query for data block d[i].
*:7 for updation. Finally, +7 runs the default integrity auditing
protocol for new block. If result is TRUE, +7 can erase {ρ’, d *, τ’} from The security of proposed protocol is defined as follows:
its local storage.
Definition 1. The proposed data integrity auditing protocol based on
)3: signatures is existentially unforgeable in an adaptively selected
6.3. Data deletion procedure (++7 ) data block attack type iff the probability of success of a probabilistic
Initial few steps of ++7 are similar to +07 . The structure of a polynomial time adversary ( I defined in above GameI is negligible.
data deletion message is (+ , V, i, τ’). If the ‘V’ field is set ‘A’ then a node
with hash value H(d[i+1]) is removed from the original tree, or if the ‘V’
8. Security proofs
field is set ’B’ then a node with hash value H(d[i−1]) is removed from
the original tree. The relative index value and hash value of parent node
is also updated (See Fig. 5). After this all the steps of ++7 are similar Theorem 1. If an adversary (0 having an advantage ϵ to generate a
to that of +07 and are thus omitted here. forgery in time span ‘t’ on a data block d[i] by simulating GameI and
making qH queries to Hash_Query, qS queries to Sign_Query and
qsys queries toReveal_System_Parameter_Query in a 964 , then a
6.4. Data append procedure(+(7 ) *+/ problem can be solved with probability:
+(7 is similar to data insertion at the end of a file. The details of this
procedure is thus omitted here. ϵ′ ≥ (1/(qS + 1)e)ϵ
8
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
9
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
Using (a) and (b), total time required by algorithm B to solve *+/ As long as 0 ≤xa≤ (A + B)
problem is: ⎛ ⎞2
⎜ ⎟
tc ≥ t + tsm(qH + 2qS ) ⎛ x ⎞⎛ x ⎞ ⎜ xa ⎟
Applying ⎜1 − a ⎟ ⎜1 − a ⎟ ≤⎜1 − , we get
⎝ A⎠ ⎝ B⎠ ⎛ A + B⎞ ⎟
This completes the proof of Theorem 1. ⎜ ⎜ 2⎟ ⎟
⎛ ⎝ ⎝ 2 ⎞⎠⎠
⎜ ⎟
⎛ x ⎞⎛ xa ⎞ ⎜ xa ⎟
⎜1 − a ⎟⎜1 − ⎟ ≤ ⎜1 −
9. Performance analysis ⎝ n ⎠⎝ n − t + 1⎠ ⎛ t − 1⎞⎟
⎜ n−⎜ ⎟⎟
⎝ ⎝ 2 ⎠⎠
The performance of 90;: − − 4 /; has been analyzed in this
section. The analysis of probability of malpractice at *:7 has been and,
studied. We also discussed the computation complexity of
90;: − − 4 /; with reference to number of corrupted blocks and ⎛ ⎞2
⎜ ⎟
number of blocks modified. The computation cost of various algorithms ⎛ xa ⎞ ⎛ xa ⎞ ⎜ xa ⎟
⎜1 − ⎟⎜1 − ⎟ ≤ ⎜1 −
comprising the proposed protocol have also been given followed with ⎝ n − 1 ⎠⎝ n − t + 2⎠ ⎛t − 1⎞⎟
⎜ n−⎜ ⎟⎟
the storage cost and communication overhead in 90;: − − 4 /; . ⎝ ⎝ 2 ⎠⎠
The results of comparison between 90;: − − 4 /; and two prior
data integrity auditing methods presented by Tan and Jia (2014) and Similarly finding other results and multiplying all results, we get
Wang et al. (2011) have been given. ⎛ x ⎞⎛ xa ⎞⎛ xa ⎞ ⎛ xa ⎞
⎜1 − a ⎟⎜1 − ⎟⎜1 − ⎟ − − − − ⎜1 − ⎟
⎝ n ⎠⎝ n − 1 ⎠⎝ n − 2⎠ ⎝ n − t + 1⎠
9.1. Probability of detecting malpractice at *:7
⎛ ⎞t
The proposed protocol relies on probability sampling method to ⎜ ⎟
reduce computation burden. The sampling method partitions +7 's ⎜ xa ⎟
≤ ⎜1 −
data file ‘F’ in ‘n’ number of blocks and then selects randomly ‘t’ blocks ⎛ t − 1⎞⎟
⎜ n−⎜ ⎟⎟
out of ‘n’ blocks to check for malpractice if any. Using this, we will ⎝ ⎝ 2 ⎠⎠
study the probability of malpractice detection at *:7 .
Consider that *:7 alters ‘xa’ blocks out of the total ‘n’ blocks, then Hence,
the probability that randomly checked block is a corrupted block is: ⎛ ⎞t
x ⎜ ⎟
p0= a xa
n P ≥ 1 − ⎜1 − ⎟
Now, assume that the probability of detecting malpractice in at least ⎜ t − 1⎟
n−
one block is specified as P. ⎝ 2 ⎠
Therefore, P=1 – Probability of not detecting malpractice For finding maximum value of P, consider
Probability of not detecting malpractice in 1st block
n − xa x ⎛ xa ⎞ ⎛ xa ⎞
checked= =1 - a ⎜1 − ⎟ ≥ ⎜1 − ⎟
n n ⎝ n − i⎠ ⎝ n − t + 1⎠
Probability of not detecting malpractice in 2nd block
n − xa − 1 xa
checked= =1 - t −1
⎛ xa ⎞ ⎛ xa ⎞t t −1
⎛ xa ⎞
n−1 n−1 ⟹ ∏ ⎜1 − ⎟ ≥ ⎜1 − ⎟ ⟹1 − ∏ ⎜1 − ⎟
i =0 ⎝ n − i⎠ ⎝ n − t + 1⎠ ⎝ n − i⎠
Probability of not detecting malpractice in 3rd block i =0
n − xa − 2 xa
checked= =1 - ⎛ xa ⎞t
n−2 n−2 ≤ ⎜1 − ⎟
Probability of not detecting malpractice in tth block ⎝ n − t + 1⎠
n − xa − t + 1 xa
checked= =1 - Hence,
n−t+1 n−t+1
Therefore,
⎛ xa ⎞t
⎛ x ⎞⎛ xa ⎞ ⎛ xa ⎞ P ≤ ⎜1 − ⎟
P = 1 − ⎜ 1 − a ⎟ ⎜1 − ⎟ − − − − ⎜1 − ⎟ ⎝ n − t + 1⎠
⎝ n ⎠⎝ n − 1⎠ ⎝ n − t + 1⎠
Now we conclude that probability of malpractice at *:7 is:
t −1
⎛ xa ⎞
=1− ∏ ⎜1 − ⎟
⎛ ⎞t
⎝ n − i⎠
i =0
⎜ xa ⎟ ⎛ xa ⎞t
1 − ⎜1 − ⎟ ≤ P ≤ ⎜1 − ⎟
For finding minimum value of P, consider ⎜ t − 1⎟ ⎝ n − t + 1⎠
n−
⎝ 2 ⎠ (5)
⎛ ⎞2
⎜ ⎟ Assume the +7 partitions a 1 GB file in 62500 data blocks with each
⎛ x ⎞⎛ x ⎞ ⎜ xa ⎟
⎜1 − a ⎟⎜1 − a ⎟ ≤ ⎜1 − ; A, B ≥ 0 block having size equal to 16 KB. Fig. 6 shows the number of blocks(t)
⎝ A ⎠⎝ B⎠ ⎛ A + B⎞ ⎟
⎜ ⎜ ⎟⎟ required in challenge message to detect number of blocks corrupted by
⎝ ⎝ 2 ⎠⎠
considering probability of malpractice detection (P) from a set Px={0.7,
10
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
Table 2
Notations with descriptions.
Notation Description
Table 3
Computation cost of various algorithms in 90;: − − 4 /; .
KeyGen 0 0 0 Exp(G) 0
FileTagGen 0 0 0 Exp(G) 0
BlockSigGen 0 0 nH(G) (n+1)Exp(G) 0
GenProof tAdd(G) (t−1)Mul(G) 0 tExp(G) 0
Fig. 6. Number of challenged blocks required w.r.t 90% percent of corrupted blocks.
VerifyProof 0 tMul(G) 0 tExp(G) 4
Table 4
Number of corrupted blocks (xa) Number of challenged blocks (t)
Comparison of computation cost.
6250 23
Protocol Server Auditor Time
12 500 12
complexity
18 750 10
25 000 7
Wang protocol (Wang et al., tExp(G) + (t 2Pair + (t+2)Exp O(n)
31 250 5
2011) −1)Mul(G) (G) + (t+1)Mul
37 500 4
(G)
43 750 4
Tan protocol (Tan and Jia, 2tExp(G) + 3Pair + (t+1)Exp O(n)
50 000 3
2014) (2t−2)Mul (G) + tMul
56 250 2
(G)
90;: − − 4 /; tExp(G) + (t 4Pair + tExp(G) O(logn)
−1)Mul(G) + tMul(G)
0.8, 0.9, 0.99, 0.99999}. Thus if *:7 modifies 10% of total outsourced
blocks (n), then the ;7( needs only 68 random blocks in challenge set n: total number of blocks; t: number of challenged blocks.
to detect a server malpractice with 0.999 probability. Table 1 shows
number of blocks required in challenge message with various percent of in searching a node in 4 /; is reduced to O(logn). The computational
corrupted blocks to detect a malpractice with probability 0.9. Also with complexity of proposed search algorithm is similar to searching an
an increase in number of corrupted blocks, the number of challenged element in a Binary Search Tree ():; ) with (2n - 1) nodes and the
blocks decreases. For example, if *:7 modifies 40% of total out- computational complexity in Wang et al. (2011) is similar to that of a
sourced blocks (n), then the ;7( needs only 18 random blocks in linear search algorithm. The performance of proposed protocol in
challenge set to detect a server malpractice with 0.999 probability. terms of frequent data modifications is depicted in Fig. 7. An experi-
ment for updating a file of 1 GB size has been conducted where the
number of blocks updated varies between 100 and 1000. Results
9.2. Computation cost
depicted are of 10 trials. From Fig. 7, it is clear that as the number
of blocks modified increases, the computation cost in Wang et al.
Computation cost is viewed differently by different entities in an
(2011) increases at higher rate as compared to the proposed protocol.
auditing model. For a ;7( , computational cost specifies the resources
required by ;7( to perform an integrity audit. For a *:7 , computa-
tion cost corresponds to the time required to process the challenge
message and produce a proof of possession. The computation cost
incurred in a data dynamic procedure is different from a regular data
integrity audit. In a update procedure, ;7( requires time to generate
block tag for fresh block, *:7 requires time to regenerate 4 /; for
regenerating root hash, updating file tag. All these tasks contributes to
computational burden on respective entities. Following this section, we
present computation cost of 90;: − − 4 /; and provide a compar-
ison with Wang et al. (2011) and Tan and Jia (2014) protocol. For the
convenience of discussion, we use notations presented in Table 2 for
our analysis. Table 3 depicts the computation cost of each algorithm in
90;: − − 4 /; . Table 4 provides the comparison of computation
cost of 90;: − − 4 /; protocol with Wang et al. (2011) protocol
and Tan and Jia (2014) protocol. From Table 4 it is clear that
90;: − − 4 /; is efficient w.r.t performance than the other two
schemes. In Wang et al. (2011) the computation time in searching a
blocks is high and varies linear w.r.t the number of blocks i.e. O(n).
However in 90;: − − 4 /; , due to relative index field, complexity Fig. 7. Computation cost vs Number of blocks modified.
11
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
Table 5
Storage cost in 90;: − − 4 /; .
+7 k+2 |G|+l
;7( k+|G|+l
*:7 n+l+(n+1)|G|
n: total number of blocks; k: security parameter; l: size of file tag; |G|: size of an element
∈G
In Tan and Jia (2014), the data dynamic operations are not supported.
12
N. Garg, S. Bawa Journal of Network and Computer Applications 84 (2017) 1–13
References Juels, A., Kaliski Jr, B.S., 2007. Pors: Proofs of retrievability for large files. In:
Proceedings of the 14th ACM Conference on Computer and Communications
Security. ACM, in Alexandria, Virginia, USA pp. 584–597.
Ateniese, G., Burns, R., Curtmola, R., Herring, J., Kissner, L., Peterson, Z., Song, D., Lynn, B., Accessed: 11 January 2016. Pairing-based cryprography library.〈http://crypto.
2007. Provable data possession at untrusted stores. In: Proceedings of the 14th ACM stanford.edu/pbc/〉.
Conference on Computer and Communications Security. ACM, Alexandria, Virginia, Mell, P., Grance, T., 2011. The Nist Definition of Cloud Computing.
USA pp. 598–609. Merkle, R.C., 1980. Protocols for public key cryptosystems. In: Null. IEEE, in Oakland
Ateniese, G., Di Pietro, R., Mancini, L.V., Tsudik, G., 2008. Scalable and efficient provable california p. 122.
data possession. In: Proceedings of the 4th International Conference on Security and Mohammad Etemad, M., Küpçü, A., 2016. Generic efficient dynamic proofs of
Privacy in Communication Networks. ACM, in Istanbul, Turkey USA p. 9. retrievability. In: Proceedings of the 2016 ACM on Cloud Computing Security
Boneh, D., Franklin, M., 2001. Identity-based encryption from the weil pairing. In: Workshop. ACM, in Vienna, Austria pp. 85-96.
Advances in CryptologyCRYPTO 2001. Springer, in Santa Barbara, California, pp. openssl Project, T., Accessed: 11 January 2016. Openssl library. 〈http://www.openssl.
213–229. org/docs/manmaster/crypto/crypto.html〉.
Boneh, D., Lynn, B., Shacham, H., 2001. Short signatures from the weil pairing. In: Shacham, H., Waters, B., 2013. Compact proofs of retrievability. J. Cryptol. 26 (3),
Advances in CryptologyASIACRYPT 2001. Springer, in Gold Coast, Australia pp. 442–483.
514–532. Tan, S., Jia, Y., 2014. Naepasc: a novel and efficient public auditing scheme for cloud
Bowers, K.D., Juels, A., Oprea, A., 2009. Hail: a high-availability and integrity layer for data. J. Zhejiang Univ. Sci. C 15 (9), 794–804.
cloud storage. In: Proceedings of the 16th ACM conference on Computer and Wang, Q., Wang, C., Li, J., Ren, K., Lou, W., 2009. Enabling public verifiability and data
communications security. ACM, pp. 187–198. dynamics for storage security in cloud computing. In: Computer Security–ESORICS
Chen, B., Curtmola, R., Ateniese, G., Burns, R., 2010. Remote data checking for network 2009. Springer, in Saint-Malo, France pp. 355–370.
coding-based distributed storage systems. In: Proceedings of the 2010 ACM Wang, Q., Wang, C., Ren, K., Lou, W., Li, J., 2011. Enabling public auditability and data
Workshop on Cloud Computing Security Workshop. ACM, in Chicago, Illinois, USA dynamics for storage security in cloud computing. IEEE Trans. Parallel Distrib. Syst.
pp. 31–42. 22 (5), 847–859.
Curtmola, R., Khan, O., Burns, R., Ateniese, G., 2008. Mr-pdp: Multiple-replica provable Wang, J., Chen, X., Huang, X., You, I., Xiang, Y., 2015. Verifiable auditing for outsourced
data possession. In: The 28th International Conference on Distributed Computing database in cloud computing. IEEE Trans. Comput. 64 (11), 3293–3303.
Systems, 2008. ICDCS'08. IEEE, in Beijing, China pp. 411–420. Zhang, F., Safavi-Naini, R., Susilo, W., 2004. An efficient signature scheme from bilinear
Erway, C.C., Küpçü, A., Papamanthou, C., Tamassia, R., 2015. Dynamic provable data pairings and its applications. In: Public Key Cryptography–PKC 2004. Springer,
possession. ACM Trans. Inf. Syst. Secur. (TISSEC) 17 (4), 15. in Singapore pp. 277–290.
Garg, N., Bawa, S., 2016. Comparative analysis of cloud data integrity auditing protocols.
J. Netw. Comput. Appl. 66, 17–32.
13