0% found this document useful (0 votes)

9 views

Erasure Coding For Distributed Storage An Overview

Uploaded by

tan tran

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views

Erasure Coding For Distributed Storage An Overview

Uploaded by

tan tran

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 45

SCIENCE CHINA

Information Sciences
. REVIEW . October 2018, Vol. 61 100301:1–100301:45
https://doi.org/10.1007/s11432-018-9482-6

Special Focus on Distributed Storage Coding

Erasure coding for distributed storage: an overview†

S. B. BALAJI1 , M. Nikhil KRISHNAN1 , Myna VAJHA1 , Vinayak RAMKUMAR1 ,

Birenjith SASIDHARAN1 & P. Vijay KUMAR1,2
1Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560012, India;
2Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles CA90089, USA

Received 5 April 2018/Revised 4 May 2018/Accepted 7 July 2018/Published online 6 September 2018

Abstract In a distributed storage system, code symbols are dispersed across space in nodes or storage
units as opposed to time. In settings such as that of a large data center, an important consideration is
the efficient repair of a failed node. Efficient repair calls for erasure codes that in the face of node failure,
are efficient in terms of minimizing the amount of repair data transferred over the network, the amount of
data accessed at a helper node as well as the number of helper nodes contacted. Coding theory has evolved
to handle these challenges by introducing two new classes of erasure codes, namely regenerating codes and
locally recoverable codes as well as by coming up with novel ways to repair the ubiquitous Reed-Solomon
code. This survey provides an overview of the efforts in this direction that have taken place over the past
decade.
Keywords distributed storage, regenerating codes, locally recoverable codes, codes with locality, erasure
codes, node repair
Citation Balaji S B, Krishnan M N, Vajha M, et al. Erasure coding for distributed storage: an overview. Sci
China Inf Sci, 2018, 61(10): 100301, https://doi.org/10.1007/s11432-018-9482-6

1 Introduction
This survey article deals with the use of erasure coding for the reliable and efficient storage of large
amounts of data in settings such as that of a data center. The amount of data stored in a single data
center can run into tens or hundreds of petabytes. Reliability of data storage is ensured in part by
introducing redundancy in some form, ranging from simple replication to the use of more sophisticated
erasure-coding schemes such as Reed-Solomon (RS) codes. Minimizing the storage overhead that comes
with ensuring reliability is a key consideration in the choice of erasure-coding scheme. More recently
a second problem has surfaced, namely, that of node repair. In [1, 2], the authors study the Facebook
warehouse cluster and analyze the frequency of node failures as well as the resultant network traffic
relating to node repair. It was observed in [1] that a median of 50 nodes is unavailable per day and that a
median of 180 TB of cross-rack traffic is generated as a result of node unavailability. It was also reported
that 98.08% of the cases have exactly one block missing in a stripe. The erasure code that was deployed
in this instance was an [n = 14, k = 10] RS code. Here n denotes the block length of the code and k
the dimension. The conventional repair of an [n, k] RS code is inefficient in that the repair of a single
node, calls for contacting k other (helper) nodes and downloading k times the amount of data stored in
the failed node, which is clearly inefficient. Thus there is significant practical interest in the design of
erasure-coding techniques that offer both low overhead and which can also be repaired efficiently.
* Corresponding author (email: pvk1729@gmail.com)
† Invited article

c Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018 info.scichina.com link.springer.com
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:2

rδ

Figure 1 (Color online) An overview of the different classes of codes for distributed storage discussed in this survey
article.

Coding theorists have responded to this need by coming up with two new classes of codes, namely
regenerating (RG) and locally recoverable (LR) codes. The focus in an RG code is on minimizing the
amount of data download needed to repair a failed node, termed the repair bandwidth while LR codes
seek to minimize the number of helper nodes contacted for node repair, termed the repair degree. In a
different direction, coding theorists have also re-examined the problem of node repair in RS codes and
have come up with new and more efficient repair techniques. This survey provides an overview of these
recent developments. An outline of the survey itself appears in Figure 1.
RG codes are discussed in Section 2. The two principal classes of RG codes, namely minimum band-
width regenerating (MBR) and minimum storage regeneration (MSR) appear in the two sections that
follow. These two classes of codes are at the two extreme ends of a tradeoff known as the storage-repair
bandwidth (S-RB) tradeoff. A discussion on codes that correspond to the interior points of this tradeoff
appears in Section 5. The theory of RG codes has been extended in several directions and these are
explored in Section 6. Section 7 examines LR codes. There have been several approaches at extending
the theory of LR codes to handle multiple erasures and these are dealt with in Section 8. A class of
codes known as locally regenerating (LRG) codes that offer both low repair bandwidth and low repair
degree within a single erasure code is discussed in Section 9. This is followed by Section 10 that discusses
recent advances in the repair of RS codes. A brief description of a different approach based on capacity
considerations and leading to the development of a liquid cloud storage system appears in Section 11.
The final section, discusses practical evaluations and implementations.
Disclaimer. This survey is presented from the perspective of the authors and is biased in this respect.
Given the explosion of research activity in this area, the survey also does not claim to be comprehensive
and we offer our apologies to the authors whose work has inadvertently or for lack of space, not been
appropriately cited. We direct the interested reader to some of the excellent surveys of codes on distributed
storage contained in the literature including [3–6].

2 RG codes

Definition 1 ([7]). Let Fq denote a finite field of size q. Then an RG code C over Fq having integer
parameter set ((n, k, d), (α, β), B) where 1 6 k 6 n − 1, k 6 d 6 n − 1, β 6 α, maps a file u ∈ FB
q on to
a collection {ci }ni=1 of n α-tuples over Fq using an encoding map

E(u) = [cT T T T
1 , c2 , . . . , cn ]
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:3

c
S d
k
d
k
d

d
n n

Figure 2 (Color online) An illustration of the data col- Figure 3 (Color online) The graph behind the cut-set file
lection and node repair properties of an RG code. size bound.

with the α components of ci stored on the i-th node in such a way that the following two properties (see
Figure 2) are satisfied:
• Data collection. the message u can be uniquely recovered from the contents {cij }kj=1 of any k nodes.
• Node repair. If the f -th node storing cf fails, then a replacement node can (1) contact any subset
D ⊆ [n] \ {f } of the remaining (n − 1) nodes of size |D| = d; (2) map the α contents ch of each helper
node h ∈ D on to a collection of β repair symbols aD β
h,f ∈ Fq ; (3) pool together the dβ repair symbols
thus computed to use them to create a replacement vector ĉf ∈ Fα q whose α components are stored in
the replacement node, in such a way that the contents of the resultant nodes, with the replacement node
replacing the failed node, once again forms an RG code.
An RG code is said to be exact-repair (ER) RG code if the contents of the replacement node are exactly
same as that of the failed node, ie., ĉf = cf . Else the code is said to be functional-repair (FR) RG code.
An RG code is said to be linear if (1) E(u1 + θu2 ) = E(u1 ) + θE(u2 ), u1 , u2 ∈ FB q , θ ∈ Fq and (2) the
map mapping the contents ch of the h-th helper node on to the corresponding β repair symbols aD h,f is
linear over Fq .
Thus an RG code is a code over a vector alphabet Fα q and the quantity α is termed the sub-packetization
level of the RG code. The total number dβ of Fq symbols to be transferred for repair of failure node is
B
called the repair bandwidth of the RG code. The rate of the RG code is given by R = nα . Its reciprocal
nα
B is the storage overhead.

2.1 Cut-set bound

Let us assume that C is an FR RG code having parameter set ((n, k, d), (α, β), B). Since an ER
RG code is also an FR code, this subsumes the case when C is an ER RG code. Over time, nodes will
undergo failures and every failed node will be replaced by a replacement node. Let us assume to begin
with, that we are only interested in the behavior of the RG code over a finite-but-large number N ≫ n
of node repairs. For simplicity, we assume that repair is carried out instantaneously. Then at any given
time instant t, there are n functioning nodes whose contents taken together comprise an RG code. At
this time instant, a data collector could connect to k nodes, download all of their contents and decode to

recover underlying message vector u. Thus in all, there are at most N nk distinct data collectors which
are distinguished based on the particular set of k nodes to which the data collector connects.
Next, we create a source node that possesses the B message symbols {ui }B
i=1 , and draw edges connecting
the source to the initial set of n nodes. We also draw edges between the d helper nodes that assist a
replacement node and the replacement node itself as well as edges connecting each data collector with
the corresponding set of k nodes from which the data collector downloads data. All edges are directed in
the direction of information flow. We associate a capacity β with edges emanating from a helper node to
a replacement node and an ∞ capacity with all other edges. Each node can only store α symbols over Fq .
We take this constraint into account using a standard graph-theory construct, in which a node is replaced
by 2 nodes separated by a directed edge (leading towards a data collector) of capacity α. We have in this

way, arrived at a graph (Figure 3) in which there is one source S and at most N nk sinks {Ti }.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:4

5000

4000

dβ
3000

2000

1000
600 800 1000 1200
α

Figure 4 (Color online) Storage-repair bandwidth tradeoff. Here, n = 60, k = 51, d = 58, B = 33660.

Each sink Ti would like to be able to reconstruct all the B source symbols {ui } from the symbols it
receives. This is precisely the multicast setting of network coding. A principal result in network coding
tells us that in a multicast setting, one can transmit messages along the edges of the graph in such a way
that each sink Ti is able to reconstruct the source data, provided that the minimum capacity of a cut
separating S from Ti is > B. A cut separating S from Ti is simply a partition of the nodes of the network
into 2 sets: Ai containing S and Aci containing Ti . The capacity of the cut is the sum of capacities of the
edges leading from a node in Ai to a node in Aci . A careful examination of the graph will reveal that the
Pk−1
minimum capacity Q of a cut separating a sink Ti from source S is given by Q = i=0 min{α, (d − i)β}
(Figure 3 shows an example cut separating source from sink). This leads to the following upper bound
on file size [7]:

k−1
X
B 6 min{α, (d − i)β}. (1)
i=0

Network coding also tells us that when only a finite number of regenerations take place, this bound
is achievable and furthermore achievable using linear network coding, i.e., using only linear operations
at each node in the network when the size q of the finite field Fq is sufficiently large. In a subsequent
result [8], Wu established using the specific structure of the graph, that even in the case when the number
of sinks is infinite, the upper bound in (1) continues to be achievable using linear network coding.
In summary, by drawing upon network coding, we have been able to characterize the maximum file size
of an RG code given parameters {k, d, α, β} for the case of functional repair when there is constraint placed
on the size q of the finite field Fq . Note interestingly, that the upper bound on file size is independent
of n. Quite possibly, the role played by n is that of determining the smallest value of field size q for
which a linear network code can be found having file size B satisfying (1). A functional RG code having
parameters ((n, k, d), (α, β), B) is said to be optimal provided (a) the file size B achieves the bound
in (1) with equality and (b) reducing either α or β will cause the bound in (1) to be violated.

2.2 Storage-repair bandwidth tradeoff

We have thus far, specified code parameters (k, d)(α, β) and asked what is the largest possible value of file
size B. If however, we fix parameters (n, k, d, B) and ask instead what are the smallest values of (α, β)
for which one can hope to achieve (1), it turns out, as might be evident from the form of the summands
on the RHS of (1), that there are several pairs (α, β) for which equality holds in (1). In other words,
there are different flavors of optimality.
For a given file size B, the storage overhead and normalized repair bandwidth are given respectively
dβ
by nα
B and B . Thus α reflects the amount of storage overhead while β determines the normalized repair
bandwidth. The several pairs (α, β) for which equality holds in (1), represent a tradeoff between storage
overhead on the one hand and normalized repair bandwidth on the other as can be seen from the example
plot in Figure 4. Clearly, the smallest value of α for which the equality can hold in (1) is given by α = Bk.
Given α = B k , the smallest permissible value of β is given by β = α
d−k+1 . This represents the MSR point
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:5

and codes achieving (1) with α = B α

k and β = d−k+1 are known as MSR codes. At the other end of the
tradeoff, we have the MBR code whose associated (α, β) values are given by β = dk−B k , α = dβ.
(2)
Remark 1. Since an RG code can tolerate (n − k) erasures by the data collection property, it follows
that the minimum Hamming weight dmin of an RG code must satisfy dmin > (n − k + 1). By the
singleton bound, the largest size M of a code of block length n and minimum distance dmin is given by
M 6 Qn−dmin +1 6 Qk , where Q is the size of alphabet of the code. Since Q = q α in the case of RG
code, it follows that the size M of an RG code must satisfy M 6 q kα , or equivalently q B 6 q kα , i.e.,
B 6 kα. But B = kα in the case of an MSR code and it follows that an MSR code is a maximum
distance separable (MDS) code over a vector alphabet. Such codes also go by the name MDS array code.
From a practical perspective, ER RG codes are easier to implement as the contents of the n nodes in
operation do not change with time. Partly for this reason and partly for reasons of tractability, with few
exceptions, most constructions of RG codes belong to the class of ER RG codes. Examples of FR RG
code include the d = (k + 1) construction in [9] as well as the construction in [10].
Early constructions of RG codes focused on the two extreme points of the S-RB tradeoff, namely the
MSR and MBR points. The various constructions of MBR and MSR codes are described in Sections 3
and 4. Not surprisingly, given the vast amount of data stored, the storage industry places a premium on
low storage overhead. In this connection, we note that the maximum rate of an MBR code is given by

B (dk − k2 )β dk − k2
RMBR = = = ,
nα ndβ nd

which can be shown to be upper bounded by RMBR 6 12 and is achieved when k = d = (n − 1). In the
case of MSR codes, there is no such limitation and MSR codes can have rates approaching 1.
An RG code is said to be a help-by-transfer (HBT) RG code if repair of a failed node can be accom-
plished without incurring any computation at a helper node. If no computation is required at either
helper node or at the replacement node, then the code is termed a repair-by-transfer (RBT) RG code.
Clearly, an RBT RG code is also an HBT RG code.

3 MBR codes

Remark 2. If the B message symbols are drawn randomly with uniform distribution from FB q , it can
be shown that in any RG code achieving the cut-set bound, the contents of each node correspond to a
random variable that is uniform over Fα q . In an MBR code, repair is accomplished by downloading a total
of just α symbols which clearly, is the minimum possible.
Remark 3. Let C be an MBR code. If C has the RBT property, it trivially follows that all scalar code-
symbols of C are replicated at least twice. In [11], it is shown that for an MBR code it is not possible to
have even a single scalar code-symbol replicated more than twice. Thus the RBT property implies that
the collection of nα scalar code-symbols associated with a codeword represent a set of nα 2 distinct code
symbols, each repeated twice. The converse is not true in general. However when d = (n − 1), it can be
shown that the two properties are equivalent.
Remark 4. In [12], it is shown that for d < (n − 1), it is not possible to construct an MBR code that
has the HBT property.

3.1 Polygonal MBR codes

In the following, we describe with the help of an example, one of the first explicit families of MBR
codes [13]. We term these codes as polygonal MBR codes. The construction holds for parameters
k 6 d = n − 1, β = 1 and the constructed MBR codes possess the RBT property.

Example 1. Consider the parameters n = 5, k = 3, d = 4 and β = 1. Thus B = kdβ − k2 β = 9. First

construct a complete graph with n = 5 vertices and N = 52 = 10 edges. The nine message symbols
are then encoded using a [10, 9] MDS code to produce ten code-symbols. Each code-symbol is then
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:6

2,5,6,P
P 6

1,4,9,P 1 1,3,6,7
5 2
4 3
9 7

3,5,8,9 8 2,4,7,8

Figure 5 (Color online) An example RBT MBR code for the parameters n = 5, k = 3, d = 4. Here file size is 9.

uniquely assigned an edge. Each node of the MBR code stores the code-symbols corresponding to the
edges incident on that node (Figure 5). The data collection property follows as any collection of k = 3
nodes yields nine distinct (MDS) code-symbols. If a node fails, the replacement node can download from
each of the remaining four nodes, the code-symbol corresponding to the edge it shares with the failed
node. Hence repair is accomplished by merely transferring the data without any computation (RBT).
Remark 5. For the general construction, in order to construct an [n, k, d = n − 1], β = 1 MBR code,
one first forms the complete graph on n vertices. Each edge is then mapped to a code-symbol of an

[N, B] MDS code, where N = n2 and B is the file size parameter. An O(n2 ) field-size requirement is
thus imposed by the underlying scalar MDS code.

3.2 Product-matrix (PM) MBR codes

A second, general construction for MBR codes is the PM construction [14] which derives its name from
the fact that the contents of n nodes can be expressed in the form of a product of two matrices. The
two matrices are respectively an encoding matrix and a second, message matrix containing the message
symbols. This construction yields MBR codes for all feasible parameters k 6 d 6 n − 1, β = 1, with an
O(n) field-size requirement. The (n × d) encoding matrix ψ is of the form: ψ = [φ ∆], where φ, ∆ are
(n × k), (n × (d − k)) matrices, respectively. Let the i-th row of ψ be denoted by ψiT . The sub-matrices
φ and ∆ are here chosen such that any d rows of ψ and any k rows of φ are linearly independent. The

(d × d) symmetric message matrix M is derived from the B = kd − k2 message symbols as M = [ VST V0 ],
where S is a symmetric (k × k) matrix and V a (k × (d − k)) matrix.
The i-th node, under the PM-MBR construction, stores the matrix product ψiT M . The repair data
passed on by helper node j to replacement node i is given by ψjT M ψi .

3.3 Other work

In [15], the authors introduce a family of RBT MBR codes for d = n − 1, that are constructed based
on a congruent transformation applied to a skew-symmetric matrix of message symbols. In comparison
with the O(n2 ) field requirement of polygonal MBR codes, in this construction, a field-size of O(n)
suffices. In [16], the authors stay within the PM framework, but provide a different set of encoding
matrices for MSR and MBR codes that have least-possible update complexity within the PM framework.
The authors of [16] also analyze the codes for their ability to correct errors and provide corresponding
decoding algorithms. Ref. [12] proves the non-existence of HBT MBR codes with d < (n − 1). The
paper also provides PM-based constructions for two relaxations, namely (i) any failed node which is a
part of a collection of systematic nodes can be recovered in HBT fashion from any d other nodes and
(ii) for every failed node, there exists a corresponding set of d helper nodes which permit HBT repair.
Ref. [11] provides binary MBR constructions for the parameters (k = d = n − 2), (k + 1 = d = n − 2)
and studies the existence of MBR codes with inherent double replication, for all parameters. In [17],
the authors provide regenerating-code constructions that asymptotically achieve the MSR or MBR point
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:7

as k increases and these codes can be constructed over any field, provided the file size is large enough.
In [18], the authors introduce some extensions to the classical MBR framework by permitting the presence
of a certain number of error-prone nodes during repair/reconstruction and by introducing flexibility in
choosing the parameter d during node repair.
Open problems 1. Determine the smallest possible field size q of an MBR code for given {(n, k, d),
(α, β)}.

4 MSR codes

Among the class of RG codes, MSR codes have received the greatest attention, and the reasons include
the fact that (a) the storage overhead of an MSR code can be made as small as desired, (b) MSR codes
are MDS codes and (c) MSR codes have been challenging to construct.

4.1 Introduction

α
As noted previously, an MSR code with parameters (n, k, d, α) has file size B = kα and β = d−k+1 .
Although MSR codes are vector MDS codes that have optimum repair-bandwidth of dβ for the repair
of any node among the n nodes, there are papers in the literature that refer to a code as an MSR code
even if optimal repair holds only for the systematic nodes. In the current paper, we refer to such codes
as systematic MSR codes. While only β symbols are sent by each of the d helper nodes, the number
of symbols accessed by the helper node in order to generate these β symbols could be > β. The class
of MSR codes that access at each helper node, only as many symbols as are transferred, are termed
optimal-access MSR codes. MSR codes that alter a minimum number of parity symbols while updating
a single, systematic symbol, are called update-optimal MSR codes.
There are several ER MSR constructions available in the literature. Shah et al. [9] show that inter-
ference alignment (IA) is necessarily present in every ER MSR code, and use IA techniques to construct
systematic MSR codes, known as MISER codes, for d = n − 1 > 2k − 1. The IA condition in the
context of MSR codes (observed earlier in [19]) demands that the interference components in the data
passed by helper nodes must be aligned so that they can be cancelled at the replacement node by data
received from the systematic helper nodes. In [20], Suh et al. build on [9] to construct MSR codes for
d > 2k − 1 with optimal repair bandwidth for all nodes, under the condition that the helper-node set
necessarily includes systematic nodes. In [14], the well-known PM framework is introduced to provide
MSR constructions for d > 2k − 1, thereby settling the problem of MSR code construction in the low-rate
regime, k/n 6 0.5. While the method adopted in [14] to provide a construction for d > 2k − 1 is to
suitably shorten a code for d = 2k − 1, an extension of the PM framework that yields constructions
for any d > 2k − 1 in a single step is provided in [21]. Apart from a few notable constructions such as
the Hadamard-design-based code [22] for (k + 2, k) and its generalization for (n − k) > 2 for systematic
node-repair, the problem of high-rate constructions (i.e., k/n > 0.5) for all-node repair remained open.
The first major result in this direction, is due to Cadambe et al. [23] where the authors apply the notion
of symbol extension in IA where multiple symbols are grouped together to form a single vector symbol, to
jointly achieve IA. The symbol-extension viewpoint is then used to show that ER MSR codes exist for all
(n, k, d), as B goes to infinity. The second major development was the zigzag code construction [24, 25],
the first non-asymptotic high-rate MSR code construction with d = (n − 1) permitting rates as close as 1
as desired, with additional desirable properties such as optimal access and optimal update. Zigzag codes
however, require a sub-packetization level (α) that grows exponentially with k and a very large finite
field size, while the earlier PM codes for the low-rate regime, have α = (k + 1) and field-size that is linear
2
in n. In a subsequent work [26], the authors present a systematic MSR construction having α = k4 and
k
rate R = 2/3. A second systematic MSR code with α = r r+1 is presented in [27]. A lower bound on
k−1
sub-packetization level α of a general MSR code is derived in [28]. The same paper shows that α > r r
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:8

in the case of an optimal-access MSR code. An improved lower bound for general MSR codes

2 log2 α log( r−1
r
)α+1 +1 > k (2)

appears in [29]. These developments made it clear that the ultimate goal in MSR code construction was
to construct a high-rate MSR code that simultaneously had low sub-packetization level α, low field-size
q, arbitrary repair degree d and the optimal-access property.
In [30], a parity-check viewpoint is adopted to construct a high-rate MSR code for d = n − 1 with a
n
sub-packetization level r r , requiring however, a large field-size. The construction was extended in [31],
to d satisfying k 6 d 6 n − 1. In [32], the authors provide a construction of MSR codes that holds
for all k 6 d 6 n − 1, but which once again required large field size. In [33], the authors provide a
construction for an optimal-access systematic MSR code that holds for any parameter set (n, k, d = n− 1)
having sub-packetization α matching the lower bound given in [28]. In [24–27, 30–33], combinatorial
nullstellensatz [34] is used to prove the MDS property due to which the codes are non-explicit and have
large field sizes.
In [35], an explicit optimal-access, systematic MSR code is constructed with optimal α, but for limited
values of n − k = 2, 3. In [36], the authors present two different classes of explicit MSR constructions,
one of which possessed the optimal-access property. Both constructions are for any (n, k, d) with sub-
packetization level growing exponential in n.
In a major advance, Ye and Barg [37] present an explicit construction of a high-rate, optimal-access
n
MSR code with α = r⌈ r ⌉ , field size no larger than r⌈ nr ⌉, and d = (n − 1). Essentially the same
construction was independently rediscovered in [38] from a different coupled-layer perspective, where
layers of an arbitrary MDS codes are coupled by a simple pairwise coupling transform to yield an MSR
code. Just prior to the appearance of these two papers, in an earlier version of [39], the authors show
how a systematic MSR code can be converted into an MSR code by increasing the sub-packetization
level by a factor of r = (n − k) using a pairwise symbol transformation. This result is then extended
in [39], to present a technique that takes an MDS code, increases sub-packetization level by a factor of r
and converts it into a code in which the optimal repair of r nodes can be carried out. By applying this
transform repeatedly ⌈ nr ⌉ times, it is shown that any scalar MDS code can be transformed into an MSR
code. It turns out that the three papers [37–39], either explicitly or implicitly, employed as a key part of
the construction, essentially the same pairwise-coupling transform.
n
Let s = (d − k + 1). More recently, the lower bound α > s s was derived in [40] for optimal-access MSR
codes. The same paper also shows that the sub-packetization level of an MDS code that can optimally
w
repair any w of the n nodes must satisfy α > s⌈ s ⌉ . These results established that the earlier constructions
in [30, 31, 37–39, 41] were optimal in terms of sub-packetization level α. It is also shown in [40], that a
vector MDS code that can repair failed nodes belonging to a fixed set of Q nodes with minimum repair
n
bandwidth and in optimal-access fashion, and having minimum sub-packetization level α = s s must
necessarily have a coupled-layer structure, similar to that found in [37–39]. An explicit construction of
n
MSR codes for d < (n − 1) with α achieving the lower bound α > s s for s = 2, 3, 4 was recently provided
in [41]. Please refer to Table 1 for a summary of MSR code constructions in existing literature.
Open problems 2. Derive a tight lower bound on the sub-packetization level of MSR codes and provide
matching constructions.
Open problems 3. Constructions for explicit optimal-access MSR codes for any (n, k, d) with optimal
sub-packetization.

4.2 Constructions of MSR codes

Product matrix construction [14]. We provide a brief description of the PM construction for param-
eter set (n, k, d = 2k − 2), (α = k − 1, β = 1, B = k(k − 1)). The message symbols {ui }B
i=1 are arranged
T
in the form of a (d × α) matrix M : M = [S1 S2 ] , where the S1 , S2 are symmetric (k − 1) × (k − 1)
matrices containing the B = k(k − 1) message symbols. Encoding is carried out using a (n × d) matrix
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:9

Table 1 A list of MSR constructions and the parameters. In the table r = n − k, s = d − k + 1 and when all node repair is
No, the constructions are systematic MSR. By ‘non-explicit’ field-size, we mean that the order of the size of the field from
which coefficients are picked is not given explicitly
MSR code Parameters α Field size All node repair Optimal access Notes
Shah et al. [9] (n, k, d = n − 1 > 2k − 1) r 2r No Yes IA framework
Suh et al. [20] (n, k, d > 2k − 1) s 2r Yes No IA framework
(n, k 6 3, d)
Rashmi et al. [14] (n > 2k − 1, k, d) r n Yes No Product matrix framework
Papailiopoulos et al. [22] (n, k, d = n − 1) rk non-explicit No No High rate systematic MSR
Tamo et al. [24] (n, k, d = n − 1) r k+16 4 when r 6 3, Yes Yes High rate MSR
Wang et al. [25] else non-explicit known as Zigzag codes
Cadambe et al. [26] (n > 3k
2
, k, d = n − 1) O(k2 ) non-explicit No Yes
n r
Introduced parity-check
Sasidharan et al. [30] (n, k, d = n − 1) r⌈ r ⌉ O(n ) Yes Yes
viewpoint, optimal α
r

Goparaju et al. [32] (n, k, d) k
s s – No Yes Very large field-size needed
See Sec. IV in [32] for details
n
Rawat et al. [31] (n, k, d) s⌈ s ⌉ O(nr ) Yes Yes Extended [30] for d < n − 1
Ye et al. [36] (n, k, d) sn sn Yes No
n−1
(n, k, d) s n+1 Yes Yes
Ye et al. [37]
n
Sasidharan et al. [38] (n, k, d = n − 1) r⌈ r ⌉ r⌈ n
r⌉ Yes Yes Optimal α
Li et al. [39] for optimal-access MSR
Vajha et al. [41] (n, k, d)
⌈n⌉ O(n) Yes Yes
d ∈ {k + 1, k + 2, k + 3} s
s

Ψ = [Φ ΛΦ], where Φ is an n × (k − 1) matrix and Λ is a diagonal matrix. Let the i-th row of Ψ be
ψiT , the i-th row of Φ be φT i and the i-th diagonal element in Λ be λi . The α symbols stored in node i
are given by ci = ψi M = φT
T T T
i S1 + λi φi S2 . The matrix Ψ is required to satisfy the properties (1) any d
rows of Ψ are linearly independent, (2) any α rows of Φ are linearly independent and (3) the n diagonal
elements of Λ are distinct.
• Node repair. Let f be the index of failed node, thus the aim is to reconstruct cf . The i-th helper
node, hi , i ∈ [d], passes on the information: cT T
hi φf = ψhi M φf . Upon aggregating the repair information
T
we obtain the vector [ψh1 ψh2 · · · ψhd ] [M φf ]. As any d-rows of Ψ are linearly independent, the vector
M φf can be recovered. From M φf , we can obtain S1 φf and S2 φf . Since S1 and S2 are symmetric, we
can recover the contents cT T T
f = φf S1 + λf φf S2 of the replacement node.
• Data collection. Let ΨDC = [ΦDC ΛDC ΦDC ] be the (k × d) sub matrix of Ψ corresponding to the k
nodes contacted for data collection. We wish to retrieve M from ΨDC M = ΦDC S1 + ΛDC ΦDC S2 . This
can be done in three steps.
(1) First compute ΨDC M ΦT T T T
DC = ΦDC S1 ΦDC +ΛDC ΦDC S2 ΦDC and set P = ΦDC S1 ΦDC , Q = ΦDC S2 ΦDC .
T

(2) It is clear that P, Q are symmetric. Thus we know both Pij + λi Qij and Pij + λj Qij . Since λi = 6 λj
for i =6 j, we can recover Pij and Qij for all i 6= j.
(3) Since we know Pij for j 6= i, we can compute the vector φT i S1 [φ1 , . . . , φi−1 , φi+1 , . . . , φk ]. Since any
α rows of Φ are linearly independent, we can recover {φT i S 1 |1 6 i 6 k}. For any set of α distinct elements
φTi , we can compute [φ1 · · · φ α ] T
S 1 , from which S 1 can be recovered. S2 can be similarly recovered from
Q. The present description assumes data collection from the first k nodes, while a similar argument holds
true for any arbitrary set of k nodes.
Coupled layer code. We present here the constructions in [37–39] from a coupled-layer perspective.
We explain the construction here only for parameter sets of the form

(n = st, k = s(t − 1), d = n − 1), (α = st , β = st−1 ), q > n,

where s > 1, t > 2. (The construction can however, be extended to yield MSR codes for any (n, k, d =
n − 1) using a technique called shortening). The coupled-layer code can be constructed in two steps:
(a) we layer α, (n, k) MDS codewords to form an uncoupled data-cube; (b) the symbols within the
uncoupled-data cube are transformed using a pairwise-forward-transform (PFT) to obtain the coupled
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:10

y
x

−Z=(0,0,0)

PFT

−Z
PRT

−Z=(0,0,1)

−Z=(1,1,1) Uncoupled data cube Coupled data cube

Figure 6 (Color online) Uncoupled data cube for s = 2, Figure 7 (Color online) Paired symbols are shown us-
t = 3. The red dots represent plane-index z. ing yellow rectangles connected by dotted lines. Uncoupled
symbols are transformed using PFT to get the coupled sym-
bols in the coupled data cube.

layer code. While we discuss only the case when the MDS code employed in the layers is a scalar MDS
code, there is a straightforward extension that permits the use of vector MDS codes [38].
Let us first consider the nα symbols {U (x, y, z) | (x, y) ∈ Zs × Zt , z ∈ Zts } of an uncoupled code U
where each code symbol U (x, y, ·) is a vector of α symbols in Fq . These nα symbols can be organized
to form a three-dimensional (3D) data cube (Figure 6), where (x, y) ∈ Zs × Zt is the node index and
where z ∈ Zts serves to index the contents of a node. For fixed z ∈ Zts , we think of the symbols
{U (x, y, z) | (x, y) ∈ Zs × Zt } as forming a plane or a layer and thus the value of z may be regarded as
identifying a plane or layer. The symbols in each layer of the uncoupled data cube form an (n, k) MDS
code.
Let Θ be the ((n − k) × n) parity check (p-c) matrix of an arbitrarily chosen (n, k) scalar MDS code
defined over Fq . Let θx,y (ℓ) denote the element of Θ lying in the ℓ-th row, and (x, y)-th column. Then
the symbols of the uncoupled code satisfy the p-c equations
X
θx,y (ℓ)U (x, y; z) = 0, ∀ℓ ∈ [0, n − k − 1], ∀z ∈ Zts . (3)
(x,y)∈Zs ×Zt

Next, consider an identical data-cube (Figure 7) containing the nα symbols {C(x, y, z) | (x, y) ∈
Zs ×Zt , z ∈ Zts } corresponding to the coupled-layer code. This data-cube will be referred to as the coupled
data cube. The symbols of the coupled data cube are derived from the symbols of the uncoupled data cube
as follows. Let γ be an element in Fq \{0}, γ 2 6= 1. Let us define z(y, x) = (z0 , . . . , zy−1 , x, zy+1 , . . . , zt−1 ).
Each symbol C(x, y, z) which is such that zy 6= x is paired with a symbol C(zy , y, z(y, x)). The values
of the symbols so paired, are derived from those of their counterparts in the uncoupled data cube as per
the (2 × 2) linear transformation given below, termed as the PFT
" # " #−1 " #
C(x, y, z) 1 γ U (x, y, z)
= . (4)
C(zy , y, z(y, x)) γ 1 U (zy , y, z(y, x))

In the case of the symbols C(x, y, z) when zy = x, the relation between symbols in the two data cubes
is even simpler and given by C(x, y, z) = U (x, y, z). The pairwise reverse transform (PRT) is simply the
inverse of the PFT and is used to obtain the uncoupled symbols U (·) from the coupled symbols C(·).
The p-c equations satisfied by the coupled-layer code can be derived using the p-c equations (3) satisfied
by the symbols in the uncoupled data cube and the PRT
X X X
θx,y (ℓ)C(x, y, z) + γθx,y (ℓ)C(zy , y, z(y, x)) = 0, ∀z ∈ Zts , ℓ ∈ [0, n − k − 1]. (5)
(x,y)∈Zs ×Zt y∈Zt x6=zy
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:11

• Node repair. Let (x0 , y0 ) be the failed node. To recover the symbols {C(x0 , y0 , z) | z ∈ Zts }, each of
the remaining nodes (x, y) = 6 (x0 , y0 ) sends helper information {C(x, y, z) | z ∈ Zts , zy0 = x0 }. Focusing
on (5) for z such that zy0 = x0 and retaining on the left side the unknown symbols, leads to equations of
the form
X
θx0 ,y0 (ℓ)C(x0 , y0 , z) + γθx,y0 (ℓ)C(x0 , y0 , z(y0 , x)) = κ∗ , ∀ℓ ∈ [0, n − k − 1], (6)
x6=x0

where κ∗ is a known value. These equations can be solved for the contents of the replacement node.
• Data collection. Please refer to [38] for the proof of data collection property.
Ye-Barg codes [36]. In [36], the authors present two constructions, for non optimal-access MSR
and optimal-access MSR codes, respectively. These are the only known MSR constructions that are
explicit and yield MSR codes for any parameter set (n, k, d). The same codes are also optimal for the
repair of multiple nodes. We describe here, for simplicity, the construction of (n, k, d) MSR codes having
parameters (n, k, d), (α = sn , β = sn−1 ), q > sn where s = d − k + 1, defined over finite field Fq for s > 1.
Let {C(i, z) | i ∈ [n], z ∈ Zns } be the collection of nα symbols of a codeword, where i is the node index
and z is the scalar symbol index. The code is defined via the p-c equations given as follows:
X
λℓi,zi C(i; z) = 0, ∀z ∈ Zns , ℓ ∈ [0, n − k − 1], (7)
i∈[n]

where the {λi,j , i ∈ [n], j ∈ [0, s − 1]} are all distinct, thereby requiring a field size q > sn.
• Node repair. Let f be the failed node, D be the set of d helper nodes. The helper information sent
Ps−1
by a node i ∈ D is given by {µif (z) = j=0 C(i, z(f, j)) | z ∈ Zns , zf = 0}. Next, fixing zi , ∀i ∈ [n] \ {f }
and summing equations (7) over the values of zf , we get
s−1
X X
λℓf,zf C(f, z) + λℓi,zi µif (z) = 0, ∀ℓ ∈ [0, n − k − 1]. (8)
zf =0 i∈[n]\{f }

It can be shown that the collection of symbols {µif (z)|i ∈ [n] \ {f }} form an [n − 1, d] MDS code.
Therefore, all the µif (z) can be computed from the known d values supplied by the helper nodes and the
symbols {C(f, z) | z ∈ Zns } can thus be recovered from (8).
• Data collection. For every z ∈ Zns , the collection {C(i, z)|i ∈ [n]} forms an (n, k) MDS code.
Therefore, any (n − k) erased symbols can be recovered.
Multiple node repair. Let 1 6 t 6 n − k be the number of erasures to be recovered. It was shown
in [23] that the minimum repair bandwidth required to repair t erasures in an MDS code having sub-
packetization level α is lower bounded by γt > t(n−t)α
n−k . Given that k 6 d 6 n − t is the number of helper
tdα
nodes that need to be contacted during the repair of t nodes, γt is lower bounded by γt > d+t−k . The Ye-
Barg code presented above achieves this bound [36]. The t node repair discussed here assumes a centra-
lized repair setting whereas an alternate, cooperative repair approach is discussed in Subsection 6.1.
Adaptive repair. Adaptive-repair (n, k) MSR codes are MSR codes that can repair a failed node by
connecting to any d nodes, for any d ∈ [k, n − 1] and can reconstruct the failed node by downloading
α
d−k+1 symbols each from the d helper nodes. Constructions of MSR codes with adaptive repair can be
found in [32, 36, 42].

5 On the S-RB tradeoff under exact repair

We distinguish between the S-RB tradeoffs for exact and FR RG code, by referring to them as the ER
and FR tradeoff respectively. The file size B under exact repair cannot exceed that in the FR case since
ER may be regarded as a trivial instance of FR. However, unlike in the case of FR codes, the data
collection problem in the ER setting, cannot be identified with a multicast problem simply because each
replacement node for a failed node acts as a sink for a different set of data. Thus it is not clear that
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:12

X Y Z

WX
X

Y SY

Z SZY

Figure 8 (Color online) The repair matrix.

the cut-set bound for FR can be achieved under ER, leaving the door open for an S-RB tradeoff in the
case of ER that lies strictly above and to the right of the FR tradeoff in the (α, β)-plane. There do exist
constructions of ER MBR and MSR codes meeting the cut-set bound with equality, showing that the ER
tradeoff coincides with the FR tradeoff at the extreme MSR and MBR points.

5.1 The non-existence of ER codes achieving FR tradeoff

The first major result on the ER tradeoff was the result in [43], showing that apart from the MBR point
and a small region adjacent to the MSR point, there do not exist ER codes whose (α, β) values lie on the
interior point of the FR tradeoff. We set αMSR = β(d − k + 1) to be the value of α at the MSR point.
Theorem 1. For any given values of (n, k > 3, d), ER codes having parameters (α, β, B) corresponding
to an interior point on the FR tradeoff do not exist, except possibly for α in the range

1
αMSR 6 α 6 αMSR 1 + (9)
αMSR (αMSR + 1)
corresponding to a small region in the neighborhood of the MSR point.
Proof. (Sketch) By restricting attention to any (d + 1) symbols of an RG code having parameter set
(n, k, d, (α, β), B) one obtains a second RG code with parameter set ((d + 1), k, d, (α, β), B) in which
all the remaining nodes participate in the repair of a failed node. This simplifies the analysis of the
repair setting and with this in mind, in the proof, we set n = (d + 1). When the message vector u
is picked uniformly at random, we have associated nodal random variables {Wi | i ∈ [n]} and repair
data variables {Sij | i ∈ [n] \ j}, where Sij denotes the data passed from node i to replacement node j.
The repair matrix S (Figure 8) is an (n × n) matrix whose (i, j)-th entry i 6= j, is Sij . The diagonal
elements of S do not figure in the discussion and maybe set equal to 0. Given subsets H, N ⊂ [n], we
set WN = {Wi | i ∈ N }, SH N
= {Sij | i ∈ H, j ∈ N }. We introduce the index sets X = {1, 2, . . . , m},
Y = [k] \ X and Z = [k + 1, d + 1] for m 6 k. The file size B can be expressed in terms of the joint
entropy of the node and repair-data variables (with logs computed to base q)

B = H(WX , WY ) = H(WX , SYY ∪Z ) (10)

k
X
j
= H(WX ) + H(SYY ∪Z | WX ) 6 H(WX ) + H S[m+1,j−1]∪Z | WX (11)
j=m+1
k
X
6 mα + (d − i + 1)β := Bm , m = 0, 1, . . . , k. (12)
i=m+1

The cut-set bound in (1) corresponds to the inequalities: B 6 minm=0,1,...,k Bm . For the bound to
hold with equality, the joint random variables SYY and SZY must have maximum entropy. However it can
be shown that the entropy of a row in the repair matrix is limited by β if the cut-set bound holds with
equality. This leads to a contradiction, concluding the proof.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:13

Theorem 1 does not however, rule out the possibility of an ER code having tradeoff approaching the
FR tradeoff asymptotically i.e., as the file size B → ∞.

5.2 The S-RB tradeoff for (4, 3, 3)

It is possible that the entropies of the random variables involved satisfy Shannon inequalities other than
the ones we have noted and which shed light on the ER tradeoff. For the particular case (n, k, d) = (4, 3, 3),
Tian [44] was able to identify such an inequality with the help of a modified version of the information
theory inequality prover (ITIP) [45, 46].
α β
Let ᾱ = B , β̄ = B represent the normalization of α and β with respect to file size B. A point
(ᾱ, β̄) is said to be achievable if for any ǫ > 0, there exists an ER-RG code whose (ᾱ1 , β̄1 ) is ǫ-close
to (ᾱ, β̄). The normalized tradeoff, i.e., the tradeoff expressed in terms of ᾱ and β̄ allows compar-
ison of codes across file sizes B. In the limit as B → ∞, the S-RB tradeoff becomes a smooth
curve. Let C1 , C2 be RG codes over Fq having respective parameter sets (n, k, d, (α1 , β1 ), B1 ) and
(n, k, d, (α2 , β2 ), B2 ). Consider a codeword array c obtained by vertically stacking M1 codeword arrays
of C1 and M2 codeword arrays of C2 . The code C comprising of all such arrays is said to be the space-
shared code of C1 and C2 . Then C is also an RG code with parameter set (n, k, d, (M1 α1 + M2 α2 , M1 β1 +
M2 β2 ), M1 B1 + M2 B2 ). The notion of space-sharing clearly extends to multiple codes.
Theorem 2. For (n, k, d) = (4, 3, 3), the achievable region R is given by

R = (ᾱ, β̄) 3ᾱ > 1, 2ᾱ + β̄ > 1, 6β̄ > 1, 4ᾱ + 3β̄ > 1 . (13)

Proof. Of the four inequalities listed, the first 3 follow the entropy constraints listed in (12) above. The
last inequality 4ᾱ + 3β̄ > 1 does not follow from (12), and was found in [44] using an ITIP. It remains
to construct a code that operate on points on the (ᾱ, β̄)-plane, satisfying the inequalities with equality.
A [4, 3] single parity-check code serves as an MSR code C1 for (4, 3, 3). A (4, 3, 3) MBR code C2 can be
constructed using the polygonal construction described in Section 3. A hand-crafted code C3 operating
at the interior point of deflection (Figure 9) is given in [44]. Every point on the lines determined by
equality in (13) is achieved by a code obtained by space-sharing among C1 , C2 and C3 .

5.3 Layered codes for interior points

A simple code-construction technique based on the layering of MDS codes turns out to provide codes
that perform well with respect to file size in the interior region of the S-RB tradeoff. Let CMDS be an
n

MDS code having parameters [w + γ, w, γ + 1]. Let n be such that w + γ 6 n and L = w+γ . Let
{Si ⊂ [n] | i = 1, 2, . . . , L} denote an ordering of the collection of all possible (w + γ) subsets of [n].
Let ui ∈ Fwq , i = 1, 2, . . . , L be L message vectors, not necessarily distinct, and ci be the codeword in
CMDS associated with ui . We create an (L × n) array in which we place the symbols of codeword ci in
the location specified by subset Si . It turns out that this array represents an array code which possesses
the data collection property of an RG code, but not the repair property. By replicating the array a
certain number V of times, it turns out that one obtains an RG code with parameters (n, k = n − γ,
d = k, B0 = LV w), operating between the MSR and MBR points. Further details can be found in [47].
We will refer to this code as the canonical layered code Ccan (Figure 10). The canonical layered-code
construction has been extended to construct codes with k < d by making use of an outer code designed
using linearized polynomials. An alternate generalization of the canonical code to the case of k < d
involved adding additional layers consisting of carefully designed parity symbols. Such an approach leads
to the improved layered codes in [48], that turn out to be optimal for the set of parameters (n, k = 3,
d = n − 1).

5.4 ER tradeoff strictly away from FR tradeoff for all (n, k, d)

In [49], it was shown that the ER tradeoff cannot approach the FR tradeoff even when B → ∞ for any
value of (n, k, d). This was established by deriving a positive lower bound 0 < δ < β on the gap between
the ER and FR tradeoffs.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:14

0.32 Functional repair tradeoff

New outer bound
0.30 Achievable by layered code [48]
1 2 3 4 5
0.28
c11 c12 c13
β→

0.26
c21 c22 c23
0.24
B

0.22
cm1 cm2 cm3
0.20
.. . . .. ..
0.18 . .. .. . .

0.35 0.40 0.45 0.50

α
−→
B

Figure 9 (Color online) The (4,3,3) normalized tradeoff. Figure 10 (Color online) An (n = 5, k = 4, d = 4)
canonical layered code.

Theorem 3. The ER tradeoff between ᾱ and β̄ for any ER RG code, with k > 3 is strictly separated
from the FR tradeoff, apart from the MSR and MBR endpoints as well as the region surrounding the
MSR point appearing in (9).
The proof the theorem involves identifying contradicting bounds on the entropy of various trapezoidal-
shaped subsets within the repair matrix. Subsequent papers [50, 51] derive better bounds, thereby im-
proving the gap δ to go beyond β. In [52], the authors adopt a different approach by first providing
three different expression for the entropy B of the data file involving mutual information between various
repair-data variables, and taking a linear combination of these expressions that leads to a significantly
tighter bound on B:
p(2(d−k)+p+1)β
(3k − 2p)α + 2 + (d − k + 1) min{α, pβ}
B 6 min . (14)
06p6k 3

The authors in [53] improve upon the result in (14) using repair-matrix techniques, in combination
with the bound in Theorem 3, leading to the best-known outer bound on the ER tradeoff. For the case
of (n, k = 3, d = n − 1), the outer bound is achieved by the improved layered codes, thus characterizing
the ER tradeoff. The bound also characterizes certain interior points when k = 4 [49].

5.5 Determinant codes for interior points

k
k−1
k+1

The construction given in [54] has parameters α = m , β = m−1 and file size B = m m+1 , where
k

m ∈ {1, 2, . . . , k} is an auxiliary parameter. The message symbols are first precoded to obtain k m
symbols, and these are then arranged in a data matrix M of size (k × α) in a particular manner. The
codeword array is then obtained as in the case of the product matrix framework introduced in [14],
by setting Cn×α = ψn×k Mk×α , where ψn×k is a Vandermonde matrix. The data collection and repair
properties of the code are proved by making use of the Laplace expansion of determinants, and the
codes for this reason, are called determinant codes. The codes achieve an outer bound discussed in the
next subsection, and thus form an optimal family of codes for parameters (n, k, k). An extension of the
construction to include the parameter set (n, k, d = k + 1) can be found in [55].

5.6 ER tradeoff under linear setting

In [51, 56, 57], the authors characterize the ER tradeoff for (n, k = n − 1, d = n − 1) for the subclass of
linear codes, using an approach that involves lower bounding the rank of the parity-check matrix of an
RG code. The upper bound in [56] holds in general for any (n, k, d = k).
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:15

Table 2 Parameters of explicit constructions of cooperative RG codes

Type Code parameters Ref.
MBCR n, k, k 6 d 6 (n − t), t > 1 [61]
MSCR n = d + 2, k = t = 2 [63]
MSCR n = 2k, d = n − 2, k > 2, t = 2 [62]
MSCR n = 2k, d = n − t, k > 2, k > t > 2 [62]
(Repair of systematic nodes only)
MSCR n, k, k 6 d 6 (n − t), t > 1 [64]

Theorem 4. Consider an ER linear RG code with parameters {(n > 4, k, d), (α, β)} and file size
B = nα − ρ. Then

2rnα − n(n − 1)β dβ dβ


 2
, 6α6 , 2 6 r 6 n − 2,
r +r r r−1
ρ > (15)
 dβ dβ

 2α − β, 6α6 .
n−1 n−2
The corresponding bound on file size B coincides with the achievable region of layered codes when
k = d = (n − 1). Determinant codes achieve the above bound in general for (n, k, k), thus characterizing
the linear ER tradeoff in this case.
Open problems 4. Characterization of ER tradeoff for general (n, k, d) in both the linear and non-
linear settings.

6 Variations on the theme of RG codes

6.1 Cooperative repair

This subsection was contributed at the request of the authors, by Kenneth Shum. The potential benefit
of allowing data exchange among the nodes being regenerated while repairing multiple node failures
simultaneously, was first investigated by Hu et al. [58]. The cooperative-repair process consists of two
phases. In the first phase, each of the new nodes selects a set of d surviving nodes, and downloads a
total of dβ1 symbols from them. In the second phase, a new node downloads β2 symbols from each of
the other new nodes. If t new nodes are re-built at the same time, the repair bandwidth per new node
is dβ1 + (t − 1)β2 . As in the non-cooperative case, there is a tradeoff between the amount of data stored
in a node and the repair bandwidth. In the following, we denote the repair bandwidth per new node by
γ. The minimum-storage cooperative regenerating (MSCR) point and minimum-bandwidth cooperative
regenerating (MBCR) point are determined in [59, 60], and are given by

B B(d + t − 1) B(2d + t − 1)
(αMSCR , γMSCR ) = , , (αMBCR , γMBCR ) = (1, 1),
k k(d + t − k) k(2d + t − k)

where t is the number of nodes to be repaired simultaneously. When t = 1, they reduce to the corre-
sponding operative points for single-node repair. The full FR tradeoff curve between storage and repair
bandwidth per node is derived in [60]. In the case of exact repair, the explicit construction of cooperative
RG codes for all parameters at the minimum-bandwidth point was first presented in [61]. The construc-
tion in [61] is presented in an alternate way in [62]. Constructions for minimum-storage cooperative codes
are relatively rare (e.g., [62, 63]). Table 2 summarizes the existing constructions of MSCR and MBCR
codes. We note that the MSCR codes in [62] share the same encoding method as in [9, 20]. It is shown
in [62] that with the MSR codes in [9,20], we can repair multiple systematic nodes with repair bandwidth
achieving the MSCR point. In [64], the authors present constructions for any (n, k, k 6 d 6 n − t, t)
MSCR codes.
The cooperative repair model was extended to partial cooperative repair in [65]. The first phase of
repair is the same as described above. Each of the t new nodes contacts d other nodes and download a
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:16

total of β1 data packets. In the second phase, a new node exchanges β2 data packets with t − s other
new nodes, where s is a system parameter between 1 and t. When s = t, it is the original single-loss
repair model. When s = 1, it reduces to the cooperative repair model. The minimum-storage and
minimum bandwidth point are derived in [65]. With partial collaboration, the minimum-storage and
minimum-bandwidth operating points are given respectively by

B B(d + t − s) B(2d + t − s)
(α, γ) = , and (α, γ) = (1, 1).
k k(d − k + t − s + 1) k(2d − k + t − s + 1)

Two explicit codes for partial collaborative repair are presented in [66]. The code construction in [62]
for MBCR codes can be extended to achieve all minimum-bandwidth points with partial collaboration.
The security of cooperative RG codes is investigated in [67, 68].

6.2 MDS codes with repair capability

We discuss in this subsection, vector MDS codes that are not MSR, which nevertheless offer some sav-
ings in repair bandwidth in comparison to the conventional repair of RS codes while keeping the sub-
packetization level α small. The piggybacking framework introduced in [69], was one of the first such
efforts. In [70], the authors introduce codes that offer a choice of sub-packetization levels, namely, α = rp
for 1 6 p < ⌈ nr ⌉. The corresponding repair download from each helper node is given by β = (1 + 1p )rp−1 .
When p = ⌈ nr ⌉ these codes coincide with the construction in [30]. A similar approach was followed by the
k
authors of [71] where they provide constructions for MDS codes for any given 1 6 α 6 r⌈ r ⌉ . However, the
constructions here are restricted to systematic node repair and the bandwidth needed from each helper
k
node is not uniform. These constructions are motivated by the systematic MSR code with α = r⌈ r ⌉
appearing in [33]. In more recent work [72], the ǫ-MSR framework was introduced to construct MDS
codes that somewhat surprisingly, have sub-packetization α that is logarithmic in n for a modest increase
in repair bandwidth by a multiplicative factor (1 + ǫ).
Piggybacking framework. The piggybacking framework [69] begins with a collection of α codewords
drawn form an MDS code and proceeds to modify the code symbols as described below. Let C be an
MDS code and let (f1 (u), f2 (u), . . . , fn (u)) represent the codeword corresponding to message u. Next,
consider codewords of C corresponding to α distinct messages, u1 , . . . , uα . The α code symbols fj (ui ),
i = 1, 2, . . . , α are stored on node j. We first modify the code by adding a function gij (u1 , . . . , ui−1 ) to the
j-th symbol of i-th codeword fj (ui ), for all i ∈ {2, . . . , α}, j ∈ {1, . . . , n}. The values so added are termed
as piggybacks. This modification does not affect our ability to decode the code, if the codewords are
decoded in sequence. Applying an invertible linear transform Ti to the α code symbols in the i-th node,
similarly does not affect our ability to decode the α codewords, nor a node’s ability to serve as a helper
node. By carefully choosing the piggybacking functions and the set Ti of invertible linear transformations
it possible to reduce the repair bandwidth for the collective repair of the α MDS codewords in comparison
with the repair bandwidth needed for the conventional repair of α MDS codewords. Three families of
piggybacking-based MDS codes with reduced repair bandwidth and disk read are constructed in [69]. The
piggybacking framework typically provides savings between 25% to 50% depending up on the parameters
and choice of piggybacking functions. For example, Figure 11 shows modification of a [4, 2] MDS code
with sub-packetization level 2 in such a way that the systematic nodes can be repaired by reading 3
symbols (instead of the 4 symbols required for MDS decoding), resulting in a 25% repair bandwidth and
disk read saving.
ǫ-MSR framework. The motivation for constructing ǫ-MSR codes [72] is the larger sub-packetization
level of an MSR code, which could possibly prove to be a hurdle in its practical implementation. The
authors of [72] provide a generic way to transform an MSR code into an ǫ-MSR code.
Definition 2. An MDS code C with sub-packetization α over a finite field B is said to be an (n, k,
d = n − 1, α)B ǫ−MSR code, ǫ > 0, if for every i ∈ [n] there exists a linear repair scheme for the code
ℓ
symbol ci which downloads βij 6 (1 + ǫ) n−k symbols over B from the (n − 1) nodes storing code symbols
cj , for j ∈ [n] \ {i}.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:17

a1 b1 a1 b1 a1
X Xb1 a1 b1
a2 b2 a2 b2 a2 b2 a2
X b2
X
a1+a2 b1+b2 a1+a2 b1+b2 a1+a2 b1+b2 a1+a2 b1+b2
a1+2a2 b1+2b2 2a2−2b2−b1 b1+2b2+a1 2a2−2b2−b1 b1+2b2+a1 2a2−2b2−b1 b1+2b2+a1

Figure 11 (Color online) Here two codewords of a [4,2] MDS code are piggybacked. The first systematic node can be
repaired by reading b2 , b1 + b2 and b1 + 2b2 + a1 , whereas the second systematic node repair requires b1 , b1 + b2 and
2a2 − 2b2 − b1 .

The construction of an ǫ-MSR code presented in [72] combines a short block-length MSR code with a
code having large minimum distance. Let CI be an (n = k + r, k, d = n − 1, α)B MSR code having parity
check matrix  
H1,1 H1,2 . . . H1,n
 . .. .. 
H= . . ,

 . .
Hr,1 Hr,2 . . . Hr,n
where the sub-matrices Hi,j are of size (α × α). Next, let CII be a (not necessarily linear) code having
block length N , size M and minimum distance D = δN over an alphabet G of size |G| 6 n. Let us
associate with every codeword c = (c1 , . . . , cN ) of CII , an (rN α × N α) matrix
 
u1,c Diag(H1,c1 , . . . , H1,cN )
 .. 
Hc = 
 .
,

ur,c Diag(Hr,c1 , . . . , Hr,cN )

where the {ui,c } are non-zero coefficients, drawn from B. Next, using the fact that the number of
codewords in CII is M , let us form an (rN α × M N α) matrix H with each of the M ‘thick’ columns Hc
corresponding to a different codeword c ∈ CII . It can be shown that the code having H as its parity-check
matrix is an (M , M − r, d = M − 1, N α)B ǫ-MSR code, where ǫ = (r − 1)(1 − δ). Ensuring this
requires judicious selection of the base MSR code CI as well as the non-zero scalars {ui,c }. An additional
requirement is that for a given ǫ > 0, the code CII should be chosen such that the parameter δ satisfies
ǫ
δ > 1 − r−1 . The ǫ-MSR codes constructed using this approach can have sub-packetization level scaling
logarithmically in the block length.
In [72], ǫ-MSR codes are constructed by picking the non-optimal-access MSR constructions as CI . For
instance, using CI with parameters (n = 3, k = 1, d = 2, α = 23 = 8) and CII with parameters N = 20,
M = 27 and D = 13 over F3 one can construct a (M = 27, M − r = 25, M − 1 = 26, N α = 160) ǫ-MSR
code. Note that the MSR code CI with parameters (n = 27, k = 25, d = 26) requires a sub-packetization
level of 227 , whereas this ǫ-MSR code has sub-packetization level of 160 (≪ 227 ) and repair bandwidth is
within 1.35 times that of the MSR code.

6.3 Fractional repetition codes

Fractional repetition codes [73] are regarded as codes that generalize the RBT MBR construction in [13].
A fractional repetition code is associated with the parameter set {n, k, α, ρ}, where n is the number of
nodes and k is the smallest number such that one can retrieve the entire data file from connecting to
any set of that many nodes. Let K be the file size of the fractional repetition code. To encode and store
data, a fractional repetition code begins by encoding a collection {u1 , . . . , uK } of message symbols drawn
from a finite field Fq using a scalar [N, K] MDS code A, also referred to as the DRESS code in [74]. Let
(v1 , v2 , . . . , vN ) denote the symbols of a codeword in A. Each of the N scalar code symbols is replicated ρ
times and the resultant ρN symbols are stored across the n nodes in such a way that there are α symbols
per node and each code symbol is present in precisely ρ distinct nodes. Combinatorial techniques such as
t-designs are used to make such an assignment possible. For this to happen, we must have that nα = N ρ.
In order to be able to recover the entire data file by connecting to any k nodes we must clearly have that
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:18

1 2

c2 c4
4 c7 5
7
6
3
c1 c6 c5

Figure 12 (Color online) Each of the seven lines in the Fano plane indicates a node and points within a line denote the
code symbols stored in the corresponding node. For instance, N1 = {c1 , c2 , c3 }.

RC (k) , minJ⊆[n]:|J|=k | ∪j∈J Nj | > K, where Nj indicates the set of α code symbols stored in j-th node,
j ∈ [n]. Note that RC (k) is defined with respect to a given collection {Nj }nj=1 . Let Cfr (n, k, α, ρ) denote
the maximum RC (k) possible across all possibilities of {Nj }nj=1 , which conform to the parameters n, α
and ρ. Hence a fractional repetition code is said to be k-optimal [75], if it satisfies K = Cfr (n, k, α, ρ).
In contrast to an MBR code, a fractional repetition code requires the existence of just a single set of
d = α helper nodes to perform RBT. However it follows naturally from the ρ-replication of code symbols
that such a set of d helper nodes is available, even in the presence of (ρ − 1) node failures.
Example 2 ([73]). Consider a fractional repetition code C with parameters n = 7, k = 3, d = 3, ρ = 3.
The code is described using the Fano plane as shown in Figure 12. Here RC (k) = 6. By choosing the
outer MDS code to be the [7, 6] single parity check code, data collection property follows. As each symbol
is shared by three lines, ρ = 3 and hence C permits RBT up to 2 node failures.
The following bound on the maximum rate of a fractional repetition code with parameters (n, k, α, ρ),
is derived in [73]. ( )

n−ρ
nα
Cfr (n, k, α, ρ) 6 min 1 − nk , g(n, k, α, ρ) ,
ρ k

where g(n, 1, α, ρ) = 1, and g(n, k + 1, α, ρ) = g(n, k, α, ρ) + α − ⌈ ρg(n,k,α,ρ)−kα

n−k ⌉.
Ref. [75] considers fractional repetition codes with parameters α > k, β = 1 and provides several
k-optimal constructions. Ref. [76] considers fractional repetition codes with parameter β > 1 and also
introduces a certain notion of locally recoverable fractional repetition codes where the parameter α < k.
In [77], the authors study fractional repetition codes that have α much larger than replication degree, ρ.
In [78], the authors identify necessary and sufficient conditions for the existence of fractional repetition
codes.

6.4 Secure RG codes

Three secrecy models in the context of an RG code are introduced in [79]: (a) a passive eavesdropper
model, where the eavesdropper can read the contents of any ℓ nodes but cannot modify the content of
these nodes, (b) an active omniscient adversary model, where the adversary can read the content of ℓ = k
nodes and can also modify the content of b nodes where 2b 6 k, and (c) an active limited-knowledge
adversary model, where the adversary can read the content of ℓ < k nodes and can modify the content of
b 6 ℓ nodes. In the case of a passive eavesdropper, the secrecy capacity (Bs ) is the maximum amount of
information that can be stored without any information being revealed to the eavesdropper. In the active
eavesdropper model, the resiliency capacity (Br ) is the maximum amount of information that can be
stored such that it can be reliably made available to a legitimate data collector, in spite of the tampering
on the data in b nodes done by the eavesdropper. In [79], the following upper bound on secrecy capacity
of the passive eavesdropper model was derived:
k
X
Bs (α, γ = dβ) 6 min{(d − i + 1)β, α}. (16)
i=ℓ+1
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:19

If α is not constrained, then the resultant bandwidth-limited secrecy capacity Bs,BL becomes a function of
(k, d, β) alone. The value of Bs,BL is determined [79] for d = (n − 1) by providing a bound and an optimal
P
construction. It was also shown that the resiliency capacity satisfies Br (α, γ) 6 ki=i0 min{(d−i+1)β, α},
where i0 is equal to 2b + 1 for omniscient case and b + 1 for the limited knowledge case.
In an alternate setting, Rashmi et al. [80] assume a noisy channel for transmission of data during repair
and reconstruction, and introduce the notion of an (s, t)-resilient RG code that can correct up to t errors
and s errors during both repair and reconstruction. The model is aligned with the active eavesdropper
model where the eavesdropper can tamper the contents of b nodes. An (s, t)-resilient RG code is shown
Pk
to satisfy B 6 i=1 min{(d − i + 1)β, α} where, d = ∆ − 2t − s, k = κ − 2t − s and ∆, κ are the number
of nodes contacted during repair and reconstruction respectively. Constructions of MSR and MBR codes
that are (s, t) resilient are also provided in [80]. In [36], the authors extend this model to the repair of
multiple nodes and provide MSR constructions that are resilient to t errors during repair. In [81], the
authors extend the passive eavesdropper model to the setting where out of the ℓ nodes accessed, the
eavesdropper can read the contents of ℓ1 nodes and can observe the information passed on for the repair
of ℓ2 = ℓ − ℓ1 nodes. The upper bound in (16) also holds for this extended case. In the case of an MBR
code, since the amount of data stored equals the amount of data received for node repair, the breakup
between ℓ1 , ℓ2 is immaterial.
However in the case of an MSR code, dβ > α. In [81], the authors provide explicit, secure MBR,
and low-rate MSR code constructions that achieve the upper bound (16) for ℓ2 = 0. The secure MSR
construction from [81] provides a lower bound to the secure file size of an MSR code Bs > (k − ℓ)(α − ℓ2 β)
for ℓ2 > 0.
The upper bound on secure MSR file size Bs 6 (k − ℓ)α given by (16) is improved in [82–85]. In [86],
1
Rawat established that the secrecy capacity of an MSR codes is given by Bs = (k − ℓ)(1 − n−k )ℓ2 α by
providing an MSR construction. An upper bound that matches with Rawat’s construction is proved by
Goparaju et.al [84] under the constraint of linearity. In [87], secure MSR codes with smaller field sizes
for all parameters were constructed. In [88, 89], the ER tradeoff is studied for secure RG codes.

7 Locally recoverable codes

The earliest-known appearance of LR codes can be found in [90,91]. A construction for a code with locality
appears in [92]. A formal treatment of codes with locality with a bound on minimum distance (discussed
below) appears in [93]. The extension to the non-linear case for all-symbol (AS) and information-symbol
(IS) locality appear in [94, 95], respectively.
Let C be an [n, k] linear code over Fq . For a subset S ⊆ [n], we use C|S to denote the restriction
of C to the coordinates in S. Let G be a (k × n) generator matrix for C having columns {gi }ni=1 , i.e.,
G = [g1 , g 2 , . . . , gn ]. An information set E = {e1 , e2 , . . . , ek } is any subset of [n] of size k satisfying
rk(G|E ) = rk[g e , . . . , ge ] = k. An [n, k] code C is said to have (r, δ) IS locality if there is an information
1 k
set E = {e1 , e2 , . . . , ek } such that for every ei ∈ E, there exists a subset Si ⊆ [n], with ei ∈ Si ,

dim(C|Si ) 6 r, dmin (C|Si ) > δ, (17)

C is said to have (r, δ) AS locality if for every coordinate i ∈ [n], there exists a subset Si ⊆ [n] with i ∈ Si ,
such that (17) holds. Clearly, a code with AS locality also possesses IS locality.

7.1 Bound on minimum distance

A major result in the theory of LR codes is the minimum distance bound derived in [93], which in the
context of the theorem below, was derived for δ = 2. An analogous proof for δ = 2 and nonlinear codes
can be found in [94, 95]. The bound in [93] was extended adopting the same approach as in [93], to the
general case δ > 2 in [96] and appears in Theorem 5 below. The extension to codes over a vector alphabet
can be found in [97].
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:20

Theorem 5 ([96]). Let C be an [n, k] linear code over Fq having (r, δ) IS locality. Then

k
dmin 6 (n − k + 1) − − 1 (δ − 1). (18)
r

Our proof will make use of Lemma 1.

Lemma 1. Let C be an [n, k] code and let S ⊆ [n] such that rk(G|S ) 6 k − 1. Then dmin (C) 6 n − |S|.
Proof. Since rk(G|S ) 6 k−1, it follows that there exists a nonzero message vector u such that uT G|S = 0.
Let c = uT G, then 0 < wt(c) 6 n − |S| and the result follows.
Proof. (of Theorem 5) Let E = {e1 , e2 , . . . , ek } be the information set with respect to which C has IS
locality. Let the subsets Si ⊆ [n], 1 6 i 6 k, be such that ei ∈ Si , C|Si is an (r, δ) code, i.e., dim(C|Si ) 6 r,
dmin (C|Si ) > δ. Let Vi denote the column space of G|Si . Next, over the course of several iterations, we
incrementally build up a set S, beginning with S = φ. We use j to indicate the iteration number and begin
with j = 1. On the j-th iteration, j > 1, we first search for an index i such that Vi 6⊂ Col(G|S ) (Col(A)
refers to the column space of A). This will always be possible, as we always ensure rk(G|S ) 6 k − 1.
Having found such an index i, we next examine the rk(G|S∪Si ). If rk(G|S∪Si ) 6 k − 2, we set

aj = |S ∪ Si | − |S|, γj = rk(G|S∪Si ) − rk(G|S ), S = S ∪ Si , j = j + 1 (19)

in order from left to right, and repeat the procedure in (j + 1)-th iteration by searching for an index i
such that Vi 6⊂ Col(G|S ). If at the j-th iteration, for any j, we find that
Case (i). rk(G|S∪Si ) = k − 1, we then replace the procedure in (19) with the steps aj = |S ∪ Si | − |S|,
γj = rk(G|S∪Si ) − rk(G|S ), S = S ∪ Si , m = j, and terminate the program.
Case (ii). rk(G|S∪Si ) = k. In this case, we replace the procedure in (19) by selecting a subset
Ti ⊆ Si such that rk(G|S∪Ti ) = k − 1 (this can always be done), and then setting aj = |S ∪ Ti | − |S|,
γj = rk(G|S∪Ti ) − rk(G|S ), S = S ∪ Ti , m = j, and then terminating the program.
Thus m indicates the number of iterations that took place before the program was terminated. Note
that since for every i, rk(G|Si ) 6 r, we have that γj 6 r. Let j > 1. At the j-th iteration, let i be the
index chosen such that Vi 6⊂ Col(G|S ) and Let Ri ⊆ Si \S be such that |Ri | = γj −1 and rk(G|Ri ) = γj −1.
Since the code having generator matrix G|Si has minimum distance > δ and since rk(G|(S∩Si )∪Ri ) 6 r−1,
by Lemma 1, δ 6 |Si | − |(S ∩ Si ) ∪ Ri | = |Si | − |(S ∩ Si )| − |Ri | = |Si \ S| − (γj − 1). It follows from this
that aj > γj + (δ − 1).
• Algorithm terminates under Case (i). Since the incremental rank is at most r, it follows that the
number of iterations m satisfies m > ⌈ k−1 r ⌉. We thus have

m m
X X k−1
|S| = aj > (γj + δ − 1) = (k − 1) + (δ − 1)m > (k − 1) + (δ − 1).
j=1 j=1
r

• Algorithm terminates under Case (ii). Arguing similarly, we have that m > ⌈ kr ⌉ and
m m−1
X X k
|S| = ai > (γj + δ − 1) + γm = (k − 1) + (δ − 1)(m − 1) > (k − 1) + − 1 (δ − 1).
j=1 j=1
r

Case (ii) leads to a smaller lower bound on |S|. Hence from Lemma 1 it follows that

k
dmin 6 (n − k + 1) − − 1 (δ − 1).
r

We note the following:

(1) Setting δ = 1 (i.e., no locality constraint) in (18), one recovers the classical Singleton bound. For
this reason, the bound in (18) is commonly referred to in the context of locality as the Singleton bound.
(2) The pyramid-code construction in Subsection 7.2.1 provides a general construction of codes with
IS locality that achieves the Singleton bound for all parameters (n, k, r, δ).
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:21

(3) For many parameter sets, one can construct codes with AS locality that achieve the bound in (18),
include all cases where (r + 1)|n, see Subsection 7.2.2 below.
(4) For δ = 2, bounds for AS locality that are tighter than the Singleton bound for IS locality appearing
in (18), can be found in [98–101]. Constructions for codes achieving the tightened bound in [99] for the
n
case of n1 > n2 where n1 = ⌈ r+1 ⌉, n2 = n1 (r + 1) − n and having exponential field size can also be found
there.
(5) It is shown in [102] that one can construct codes with AS locality and field size of order n whose
minimum distance is within 1 of the bound in (18) provided r ∤ k, n 6= 1 (mod r + 1). In [103], it is shown
that this can be achieved for any parameter set if one permits the field size to be exponential in n.

7.2 Constructions

7.2.1 Pyramid code construction

The pyramid code construction technique which appeared in [91], allows us to construct for any given
parameter set {n, k, r, δ} a code with (r, δ) IS locality achieving the dmin bound in (18). We sketch the
construction for the case k = 2r. The general case k = ar, a > 2 or even when r ∤ k, follows along similar
lines. The construction begins with the systematic generator matrix GMDS of an [n1 , k] scalar MDS code
C MDS having block length n1 = n − (δ − 1). It then reorganizes the sub-matrices of GMDS to create the
generator matrix GPYR of the pyramid code:
 
Ir P1 Q1 " #
  Ir P1 Q1
GMDS =  Ir P2 Q2  ⇒ GPYR = ,
|{z} |{z} Ir P2 Q2
(r×(δ−1)) (r×s)

where s = n1 − 2r − (δ − 1). It is not hard to show that the [n, k] code C PYR generated by GPYR has (r, δ)
IS locality and that dmin (C PYR ) > dmin (C MDS ). It follows that dmin (CPYR ) > dmin (CMDS ) = n1 − k + 1 =
(n − k + 1) − (δ − 1), and the code C PYR is thus optimal with respect to the dmin bound in (18).

7.2.2 The Tamo-Barg construction

The construction below by Tamo and Barg [102], provides a construction for LR codes with AS locality.
While for simplicity, we present the construction for the case δ = 2, the construction has a natural
extension to the general case δ > 2 [102]. We will refer to the construction in the sequel as the Tamo-
Barg (T-B) construction.
Theorem 6. Let Fq be a finite field of size q, let r > 2, n = m(r+1) 6 q, with m > 2 and 2 6 k 6 (n−1).
Set k = ar + b, 0 6 b 6 (r − 1). Let A = {θ1 , θ2 , . . . , θn } ⊆ Fq and Ai ⊂ A, 1 6 i 6 m, |Ai | = (r + 1),
Ai ∩ Aj = φ, i 6= j represent a partitioning A = ∪m i=1 Ai of A. Let g(x) be a ‘good’ polynomial, by which
is meant, a polynomial over Fq that is constant on each Ai and of degree (r + 1). Let
a−1
XX r−1 b−1
XX
f (x) = aij [g(x)]j xi + aij [g(x)]j xi , (20)
j=0 i=0 j=a i=0

where the aij ∈ Fq are the message symbols and where the second term is vacuous for b = 0, i.e., when
r|k. Consider the code C of block length n and dimension k where the code symbols are obtained through
evaluation of the above collection of polynomials at the elements in A. Then C is an (r, δ) AS locality
code with δ = 2 and is optimal with respect to the dmin bound in (18). The i-th local code has support
set Ai .
Proof. In (20), it can be checked that by varying {aij }, one obtains a collection of k linearly independent
polynomials and since k < n, it follows that the code has dimension k. Let g(θ) = γℓ , all θ ∈ Aℓ . Next,
let θ ∈ Aℓ . Then we have
a−1
XX r−1 b−1
XX
f (x)|θ∈Aℓ = aij [γℓ ]j xi + aij [γℓ ]j xi ,
j=0 i=0 j=a i=0
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:22

f (Pa)

f (Pb)

f (Pc)

Figure 13 (Color online) In the T-B construction, code symbols in the local codes of length (r + 1) correspond to the
evaluations of polynomials of degree 6 (r − 1). Here, r = 2 implying evaluation at 3 points of a linear polynomial.

which is a polynomial of degree 6 (r − 1), see Figure 13, and hence the corresponding evaluation code,
when restricted to Ai has dmin > 2, leading to the desired locality and ability to recover from a single
erasure. To determine dmin , assume b > 1. The maximum degree of a polynomial f (x) then equals

k
a(r + 1) + b − 1 = (ar + b) + (a − 1) = k + − 2.
r
When b = 0 and hence k = ar, the maximum degree equals

k
(a − 1)(r + 1) + (r − 1) = (ar) + (a − 2) = k + − 2.
r
It follows that the code is optimal as dmin > (n − k + 1) − (⌈ kr ⌉ − 1).
An example of how good polynomials may be constructed is given below, corresponding to the anni-
hilator polynomial of a multiplicative subgroup G of F∗q .
Example 3. Let H < G 6 F∗q be a chain of cyclic subgroups, where |H| = (r + 1), |G| = n so that
(r + 1)|n|(q − 1). Let n = (r + 1)t. Let {Ai = γi H | i ∈ {1, 2, . . . , t}} be the t multiplicative cosets of H
in G, with γ1 being the multiplicative identity so that A1 = H. It follows that
Y
(x − β) = xr+1 − γir+1 ,
β∈Ai

so that xr+1 is constant on all the cosets of H in G and may be selected as the good polynomial g(x)
i.e., g(x) = xr+1 is one possible choice of good polynomial based on multiplicative group H.
Further examples may be found in [102, 104, 105]. For constructions meeting the Singleton bound with
field size of O(n) and more flexible value of r [106]. Construction of LR codes meeting the Singleton
bound with O(n) field size can also be found in [107].

7.3 Alphabet-size dependent bounds on code rate

7.3.1 General bound

The bound in Theorem 5 as well as the bounds for non-linear and vector codes derived in [94, 97] hold
regardless of the size q of the underlying finite field. The theorem below takes the size q of the code symbol
alphabet into account and provides a tighter upper bound on the dimension of a code with locality that is
valid even for nonlinear codes. The ‘dimension’ of a nonlinear code C over an alphabet Q of size q = |Q|
is defined to be the quantity k = logq (|C|).
Theorem 7 ([108]). For any (n, k, d) code C that is an LR code with parameter r over an alphabet Q
of size q = |Q|,
h i
(q)
k 6 min tr + kopt (n − t(r + 1), d) , (21)
t∈Z+

(q)
where kopt (n−t(r+1), d) is the largest possible dimension of a code over Q having block length (n−t(r+1))
and minimum distance d.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:23

Table 3 A comparison of upper bounds on the dimension k of binary LR code, for given (n, dmin , r, q)
n = 31, q = 2, dmin = 5
r (locality) 2 3 4 5 6
Bound (21) 17 19 20 20 20
Bound in [112] 15 18 20 22 23
Bound (22) 16 18 19 20 20

Proof. (Sketch of proof) The bound holds for linear as well as nonlinear codes. In the linear case, with
Q = Fq , the derivation proceeds as follows. Let G be a (k × n) generator matrix of the LR code C. Then
it can be shown that for any integer t > 0, there exists an index set I such that |I| = min(t(r + 1), n)
and rank(G|I ) = s 6 tr. This implies that C has a generator matrix of the form (after permutation of
columns):
 
A B
 |{z} 
G =  (s×|I|) .
[0] D

In turn, this implies that the rowspace of D defines an [n − t(r + 1), k − s > k − tr, d] code over Fq , if
(q)
k − tr > 0. It follows that k 6 tr + kopt (n − t(r + 1), d) and the result follows. Note that the row space of
D corresponds to a shortening C S of C with respect to the coordinates I ⊆ [n]. The proof in the general
case is a (nontrivial) extension to the nonlinear setting.
Remark 6. The above bound was obtained by showing that shortening of an [n, k, d] LR code with
parameter r, leads to an [n − t(r + 1), > k − tr, d] code. Classical bounds on coding theory can be applied
to this shortened code, to yield “lifted” bounds on the parent code having locality. This shortening
approach, presented for the first time in [108], has since been employed in subsequent papers in [109,110].
An alphabet-size-dependent bound on dmin (based on the shortening approach in [108]), and which uses
upper bounds on generalized Hamming weights (GHW) [111] of the dual code derived in [98], appears
in [110]. The approach in [110] can also be used to derive the following upper bound on dimension which
is in general tighter than (21)
h i
(q)
k 6 min ei − i + kopt (n − ei , d) . (22)
{i:ei <n−d+1}

The integers {ei }i appearing here can be recursively computed for a given (n, r), and represent upper
bounds on the GHW of the dual code (Subsection 8.2.3). A bound on the dimension of a binary LR code
for a given (n, r, dmin ) based on the Hamming bound for dmin > 5 and 2 6 r 6 n2 − 2 appears in [112].
This bound is shown to be tighter than (21) for some cases including 5 6 dmin 6 8 for n large.
In [109], the authors employ the shortening approach to derive an alphabet-size-dependent bound on
the minimum distance and dimension of codes having IS locality. An example comparison of the bounds
on dimension for linear LR codes in (21), (22) and the Hamming-bound based bound in [112] is presented
in Table 3.

7.3.2 Bounds with disjoint repair groups

Bounds on the dimension of a binary LR code C for a given n, r, dmin under the assumption that the
local codes (C|Si ) have pairwise disjoint support appear in [112–114]. The bound in [112] make use of
the Hamming bound and is shown to be tighter than (21) for some cases. A tightening of this bound
appears in [113]. The tightest known bounds for this setting appear in [114] and are based on linear
programming.

7.3.3 Bounds on the dimension of cyclic LR code

A linear-programming-based upper bound on the dimension of cyclic LR codes appears in [115]. Other
bounds can be found in [116, 117].
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:24

7.3.4 Asymptotic bounds

Upper bounds on asymptotic rate Rq (r, 1, ∆) (see Subsection 8.2.5 for a definition) for a given fractional
minimum distance of a binary LR code appear in [114], that represent a slight tightening of the asymptotic
version of the bound in (21). An achievable asymptotic Gilbert-Varshamov type lower bound for LR code
appear in [118] to be

q 1 r+1 r+1
R (r, 1, ∆) > 1 − min logq ((1 + (q − 1)s) + (q − 1)(1 − s) ) − ∆ logq (s) . (23)
0<s61 r+1

Constructions achieving the lower bound (23) can also be found in [108]. An improved lower bound
obtained via a construction that makes use of algebraic-geometric codes based on the Garcia-Stichtenoth
curves appear in [119]
√
r q+r √
Rq (r, 1, ∆) > 1−∆− for (r + 1)|( q + 1).
r+1 q−1

Constructions based on algebraic geometry and covering a wider range of parameters can be found
in [120]. The algebraic-geometry-based constructions improve upon the GV-type bound in (23) for some
selected range of parameters.

7.4 Small-alphabet constructions

7.4.1 Construction of binary codes

Constructions for binary codes that achieve the bound on dimension given in (21) for binary codes,
appear in [121–123]. While [121, 123] provide constructions for dmin = 4 and dmin = 6 respectively, the
constructions in [122] handle the case of larger minimum distance but have locality parameter restricted
to r ∈ {2, 3}. In [109], the authors give optimal binary constructions with information and all symbol
locality with dmin ∈ {3, 4}. The construction is optimal with respect to a bound similar to (21) derived
in [109]. Constructions achieving the bound on dimension appearing in [112] and the further tightened
bound for disjoint repair groups given in [113] for binary codes, appear respectively, in [112, 113]. These
constructions are for the case dmin = 6. In [123], the authors present a characterization of binary LR
codes that achieve the Singleton bound (18). In [124], the authors present constructions of binary codes
meeting the Singleton bound. These codes are a subclass of the codes characterized in [123] for the case
dmin 6 4.

7.4.2 Constructions with small, non-binary alphabet

In [125], the authors characterize ternary LR codes achieving the Singleton bound (18). In [123,124,126],
the authors provide constructions for codes over a field of size O(r) that achieve the Singleton bound in
(18) for dmin 6 5. Some codes from algebraic geometry achieving the Singleton bound (18) for restricted
parameter sets are presented in [127].

7.4.3 Construction of cyclic LR codes

Cyclic LR codes can be constructed by carefully selecting the generator polynomial g(x) of the cyclic
code. We illustrate a key idea behind the construction of a cyclic LR code by means of an example.
Example 4. Let α be a primitive element of F16 satisfying x4 + x + 1 = 0. Let C 1 be a cyclic
[n = 15, k = 10] code having generator polynomial g1 (x) = (x + 1)(x4 + x + 1). Since the consecutive
powers {1, α, α2 } of α are zeros of g1 (x), it follows that dmin (C) > 3 + 1 = 4 by the BCH bound.
Suppose we desire to ensure that a code C having generator polynomial g(x) has dmin > 4 and in
n
addition, is locally recoverable with parameter (r + 1) = 5, then we do the following. Set s = (r+1) = 3.
Qs−1=2 5l
Let g2 (x) = l=0 (x − α ) and g(x) = lcm{g1 (x), g2 (x)} = g1 (x)g2 (x)/(x + 1). It follows that
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:25

g1 (x)g2 (x)
Figure 14 Zeros of the generator polynomial g(x) = (x+1)
of the cyclic code in Example 4 are identified by circles.
The unshaded circles along with the shaded circle corresponding to α0 = 1 indicate the zeros {1, α, α2 , α4 , α8 } of g1 (x)
selected to impart the code with dmin > 4. The shaded circles indicate the periodic train of zeros {1, α5 , α10 } introduced
to cause the code to be locally recoverable with parameter (r + 1) = 5. The common element 1 is helpful both to impart
increased minimum distance as well as locality.

P14 5lt
t=0 ct α = 0, l = 0, 1, 2. Summing over l we obtain

2 X
X 14 X
ct α5lt = 0 ⇒ ct = 0.
l=0 t=0 t:t=0 (mod 3)

It follows that the symbols {ct |t = 0 (mod 3)} of C form a local code as they satisfy the constraint
of an overall parity-check. Since the code C is cyclic the same holds for the code symbols {ct+τ |t = 0
(mod 3)}, for τ = 0, 1, 2. Thus through this selection of generator polynomial g(x), we have obtained a
code that has both locality and dmin > 4. The zeros of g(x) are illustrated in Figure 14. The code C has
parameters [n = 15, k = 8, dmin > 4] and r = 4. Note that the price we pay for introduction of locality
2 (x)
is a loss in code dimension, equal to the degree of the polynomial gcd{g1g(x),g 2 (x)}
. Thus an efficient code
will choose the zeros of g1 (x), g2 (x) for maximum overlap.
The above idea of constructing cyclic LR code was introduced in [116] and extended in [115, 117, 128,
129]. In [130], the use of locality for reducing the complexity of decoding a cyclic code is explored. The
same paper also makes a connection with earlier work [131] that can be interpreted in terms of locality of
a cyclic code. In [116], a construction of binary cyclic LR codes for r = 2 an dmin ∈ {2, 6, 10} achieving a
bound derived within the same paper for binary codes is provided. In [128], the authors give constructions
of optimal binary, ternary codes meeting the Singleton bound (18) for dmin = 4, r ∈ {1, 3} and dmin = 6,
r = 2 as well as a construction of a binary code meeting the bound given in [112] for dmin = 6, r = 2
based on concatenating cyclic codes. A discussion on the locality of classical binary cyclic codes as well
as of codes derived from them through simple operations such as shortening, can be found in [109, 132].
The principal idea here is that any cyclic code has locality d⊥ − 1 where d⊥ is the minimum distance of
the dual code C ⊥ . In [117], the authors construct optimal cyclic codes under the constraint that the local
code is either a simplex code or else, a Reed-Muller code. In [115], the authors provide a construction
of cyclic codes with field size O(n) achieving the Singleton bound (18) and also study the locality of
subfield subcodes as well as their duals, the trace codes. In [129], constructions of cyclic LR codes with
dmin ∈ {3, 4} for any q and flexible n are provided.

7.5 Maximal recoverable (MR) codes

An [n, k] MDS code can recover from any pattern of (n − k) erasures. MR codes [133] are codes that
operate under some pre-specified linearity constraints and which can recover from any pattern of (n − k)
erasures that is not precluded by the pre-specified linearity constraints imposed. In the context of locality,
these constraints are the ones imposed on the local codes. A different perspective of MR codes based on
k-core subsets (defined below) is given in [93].
Definition 3. Let H0 be an (ρ × n) matrix over Fq whose row space has m = q ρ − 1 nonzero vectors
with respective support sets Ai ⊆ [n], i = 1, 2, . . . , m. We view H0 as the matrix that imposes locality
constraints. Let us define a subset S ⊂ [n] to be a k-core with respect to H0 if |S| = k and |Ai ∩ S c | > 1,
for all i = 1, 2, . . . , m. Then with respect to H0 , an MR code is an [n, k, H0 , q] code C possessing a (k × n)
generator matrix G with k 6 n − ρ satisfying the property that H0 GT = [0] and for any k-core S,

rank (G|S ) = k. (24)

Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:26

Remark 7. Let H = [ H H1 ] denote the parity-check matrix of the MR code, where H1 represents the
0

additional parity-checks that need to be imposed to satisfy the requirements of an MR code. It could
happen that the elements of H0 belong to a small base field B and over that field it is not possible to
find a matrix H1 which will result in an MR code. It turns out that in such instances, one can always
choose the elements of H1 to lie in a suitable extension field Fq of B, resulting in an MR code over Fq .
Remark 8. The condition in (24) imposed on the k-core subsets S is equivalent to the following
condition. Let B ⊆ [n] be such that |B c ∩ Ai | > 1, ∀i = 1, 2, . . . , m. Then G|B is a generator matrix of an
[n = |B|, k] MDS code. This follows since any k columns of G|B are required to be linearly independent.

7.5.1 General construction with exponential field size

The following construction is based on parity check matrix. There is an equivalent construction based
on generator matrix which is presented in [93]. Saying that S is a k-core is equivalent to saying that
S is an information set since the k underlying message symbols can be uniquely recovered from the k
code symbols {ci |i ∈ S}. From the perspective of the parity check matrix H, S is a k-core if and only if
rk (H |S c ) = (n − k). This suggests a construction technique. Setting H = [ H 0
H1
] as earlier, we regard the
symbols in the ((n−k−ρ)×n) matrix H1 as variables. We need to select H1 such that any (n−k)×(n−k)
sub-matrix of H corresponding to the complement S c of a k-core, has nonzero determinant. Let P (H1 )
be the polynomial in the symbols of H1 obtained by taking the product of these determinants. Note that
the definition of a k-core ensures that each of these determinants are non-zero polynomials. The product
polynomial is a polynomial in the entries (variables) of the matrix H1 and each variable appears with
n−1

degree at most n−k−1 . By the combinatorial nullstellensatz [34], it follows that there is a field of size
n−1

q > n−k−1 such that this product of determinants can be made nonzero. Thus an MR code always
n−1

exists of field size q > n−k−1 . The interest is of course, in explicit constructions of MR codes having
low field size q. It is also possible to use linearized polynomials to construct MR codes, but while this
results in an explicit construction, the field size is still in general, of exponential size.

7.5.2 Partial MDS codes

In the literature, the focus motivated by practical considerations, is on the following subclass of MR
codes, also sometimes termed as partial MDS (P-MDS) codes [134].
Definition 4. An (r, δ, s) MR code or P-MDS code is defined as an [n = m(r + δ), k = mr − s] code
over Fq in which the n code symbols can be arranged as an array of (m × (r + δ)) code symbols in such
a way that each row in the array forms an [r + δ, r, δ + 1] MDS code and upon puncturing any δ code
symbols from each row of the array, the resulting code becomes an [mr, mr − s] MDS code.
A tabular listing of some constructions of P-MDS codes [107, 134–141] appears in Table 4. In [142],
the authors characterize the weight enumerators and higher support weights of an (r, 1, s) MR code.

8 LR codes for multiple erasures

We begin with an overview of the different classes (Figure 15) of LR codes that are capable of recovering
from multiple erasures proposed in the literature. All the codes defined in this section are over the finite
field Fq .

8.1 Various classes of multiple-erasure LR codes

• Sequential-recovery LR codes. An (n, k, r, t) sequential-recovery LR code (abbreviated as S-LR code)

is an [n, k] linear code C having the following property. Given a collection of s 6 t erased code symbols,
there is an ordering (ci1 , ci2 , . . . , cis ) of these s erased symbols such that for each index ij , there exists a
subset Sj ⊆ [n] satisfying
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:27

Table 4 Constructions for P-MDS codes

Parameters of
Ref. Field size
MR code
General r, δ, s
[135] (r, δ, s) (q ′ )mr where q ′ is a prime power > r + δ.
[136] (r, δ, s) > max((q ′ )δ+s ms−1 , (q ′ )s(δ+s) ) with q ′ a prime power > r + δ.
δ=1
[134] (r, 1, s) O(2n )
1 m+s m+s−1
O(m⌈(s−1)(1− 2r )⌉ ) or > n 2 for m + s even and > 2n 2 for m + s odd, when r + 1 and m are powers
[137] (r, 1, s)
of 2.
1
> (q ′ )⌊(1− m )s⌋+m−1 (q ′ is prime power > n) and for some special case, the field size of their construction
[138] (r, 1, s) 1 s
is > (q ′ )⌊(1− m )s⌋+m−2 . For m = 2, 4|s, > (q ′ ) 2 where q ′ > n is a power of 2.
[136] (r, 1, s) >2 ℓ (1+(s−1)⌈log2ℓ (m)⌉) where ℓ = ⌈ s+1 ⌉⌈log2 (r + δ)⌉.
2
s=1
[134] (r, δ, 1) O(max(m, r + δ))
[139] (r, δ, 1) O(r + δ)
s=2
[140] (r, 1, 2) O(n)
[141] (r, δ, 2) > m((δ + 1)(r − 1) + 1) ≈ δ × n
[107] (r, δ, 2) O(n)
s=3
3
[137] (r, 1, 3) O(k 2 )
[136] (r, δ, 3) if m < (r + δ)3 then O((r + δ)3(δ+3) ) otherwise O((r + δ)δ+3 m1.5 )
s=4
7
[137] (r, 1, 4) O(k 3 )

Codes with sequential recovery Codes with

hierarchical locality
Codes with parallel recovery
Codes with
Availability cooperative Codes with
codes recovery (r, δ ) locality

Figure 15 The various code classes corresponding to different approaches to recovery from multiple erasures.

(i) |Sj | 6 r,
(ii) Sj ∩ {ij , ij+1 , . . . , is } = φ, (25)
X
(iii) cij = u ℓ cℓ , u ℓ ∈ F q .
ℓ∈Sj

It follows from the definition that an (n, k, r, t) S-LR code can recover from the erasure of s code symbols
ci1 , ci2 , . . . , cis , for 1 6 s 6 t by using (25) to recover the symbols cij , j = 1, 2, . . . , s, in succession.
• Parallel-recovery LR codes. If in the definition of the S-LR code, we replace the condition (ii) in (25)
by the more stringent requirement Sj ∩{i1 , i2 , . . . , is } = φ, then the LR code will be referred to as a parallel
recovery LR code, abbreviated as P-LR code. Clearly the class of P-LR codes is a subclass of S-LR codes.
From a practical perspective, P-LR codes are preferred since as the name suggests, the erased symbols
can be recovered in parallel. However, this will in general come at the expense of storage overhead.
We note that under parallel recovery, depending upon the specific code, this may require the same
helper (i.e., non-erased) code symbol to participate in the repair of more than one erased symbol cij .
• Availability codes. An (n, k, r, t) availability LR code, is an LR code having the property that in the
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:28

event of a single but arbitrary erased symbol ci , there exist t recovery sets {Rji }tj=1 which are pair-wise
disjoint and of size |Rji | 6 r with Rji ⊆ [n] − {i} such that for each j, 1 6 j 6 t, ci can be expressed as
X
ci = aiℓ cℓ , aiℓ ∈ Fq .
ℓ∈Rij

An (n, k, r, t) availability code is also an (n, k, r, t) P-LR code. This follows because the presence of
at most t erasures implies, that there will be at least one recovery set for each erased code symbol all
of whose symbols remain unerased. If the t disjoint recovery sets are available only for code symbols
corresponding to an information set, the code is said to be an IS availability code as opposed to the AS
availability implicit in the previous definition.
• (r, δ) codes. Recovery from t erasures can also be accomplished by using the codes with (r, δ) locality
introduced in Section 7, if one ensures that the code has dmin > t + 1. However in this case, repair is
local only in those cases where the erasure pattern is such that the number of erasures ei within each
local code satisfies ei 6 δ − 1. Thus one may regard (r, δ) codes as offering probabilistic guarantees of
local recovery in the presence of 6 t erasures in exchange for a potential increase in code rate. Of course,
one could always employ an (r, δ) locality with each local code being an MDS code and δ > t + 1, but
this would result in a significant rate penalty.
• Cooperative recovery codes. A cooperative recovery (n, k, r, t) LR (C-LR) code is an LR code
such that if a subset (ci1 , ci2 , . . . , cis ), 1 6 s 6 t of symbols are erased, then there exists a subset
{cj1 , cj2 , . . . , cjr } of r other code symbols (i.e., ia 6= jb for any a, b) such that for all a ∈ [s], cia =
Pr
b=1 θa,b cjb , θa,b ∈ Fq . Clearly an (n, k, r, t) C-LR code is also an (n, k, r, t) P-LR code, but the r in the
case of a C-LR code will tend to be significantly larger. One may regard C-LR codes as codes that seek
to minimize the number of unerased symbols contacted per erased symbol on average, rather than insist
that each code symbol be repaired by contacting r other code symbols.

8.2 Availability codes

8.2.1 Bounds on code rate

The following upper bound on the rate of an availability code was given in [118].
Theorem 8 ([118]). If C is an (n, k, r, t) availability code, then its rate R must satisfy

k 1
R= 6 Qt 1
. (26)
n j=1 (1 + jr )

The parity check matrix of an availability code can be written in the form H T = [HaT HbT ] where the
rows of Ha are the distinct parity checks associated with the recovery sets Rji , ∀i ∈ [n], j ∈ [t] and where
the matrix Hb contains all the remaining parity checks. Clearly the Hamming weight of each row of Ha
is 6 (r + 1) and the column weight > t.
• Codes with strict availability. Codes with strict availability (SA-LR codes) are simply the subclass
of availability codes where each row of Ha has weight equal to (r + 1) and each column of Ha has weight
equal to t. Thus the number m of rows of Ha must satisfy m(r + 1) = nt. Further, if the support sets of
(i)
the rows in Ha having a non-zero entry in the i-th column are given respectively by Sj , j = 1, 2, . . . , t,
(i) (i)
then we must have by the disjointness of the recovery sets, that Sj ∩ Sl = {i}, ∀ 1 6 j =6 l 6 t. Each
code symbol ci in an SA-LR code is thus protected by a collection of t ‘orthogonal’ parity checks, each
of weight (r + 1).
k
Theorem 9 ([110]). Let R = n be the maximum possible rate of an (n, k, r, t) SA-LR code. Then R
must satisfy the upper bound
!
t t 1
R 6 1− + Qr+1 1
. (27)
r+1 r+1 j=1 (1 + j(t−1) )
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:29

The above bound (27), derived in [110], is tighter than (26) as r increases for any fixed t. An upper
bound on rate of an (n, k, r = 2, t) SA-LR code over F2 that for large t, becomes tighter in comparison
with the bounds in either (26) or (27), is presented in [143]. Also contained in [143], is an upper bound
on the rate of an (n, k, r, 3) SA-LR code over F2 which is tighter than the bound in either (26) or (27)
for r > 72 and which makes use of the “transpose”-based rate equation appearing in [110].

8.2.2 Constructions
• The product code. Consider the [(r+1)t , rt ] product code in t dimensions. Clearly this is an (n = (r+1)t ,
r t
k = rt , r, t) availability code, having rate R = ( r+1 ).
• The Wang et al. [144] construction. For any given parameter pair (r, t), Wang et al. [144] provide a
construction for an (n, k, r, t) availability code which is defined through its parity-check matrix. Let S be
a set of m = (r + t) elements. Then in the construction, each row of H corresponds to a distinct subset
of S of cardinality (t − 1) and each column, to a distinct subset of S of cardinality t. We set hij = 1 if
m

the i-th (t − 1)-subset belongs to the j-th t-subset and zero otherwise. Thus H is of size t−1 × m t . It
is easy to verify that each row of H has constant row weight (r + 1) and each column of H has constant

weight t. It turns out that the rank of H is given by m−1 t−1 and that H defines an (n, k, r, t) availability

code, having parameters n = m m m−1
t , k = t − t−1 and
r
rate R = r+t . Thus this code provides improved
r+t
t
rate in comparison with the product code. Since t < (r + 1) , the code has smaller block length as
well.
• Direct-sum construction. It is shown in [143] that the direct sum of m copies of the [7, 3] simplex
code yields an SA-LR code with parameters (7m, 3m, 2, 3) having maximum possible rate for n = 7m,
r = 2, t = 3, q = 2.

8.2.3 Bounds on minimum distance

Let dmin (n, k, r, t) be the maximum possible minimum distance of an (n, k, r, t) availability code. In [145],
the following bound on the minimum distance of an information symbol availability code (and hence
applicable to the case of AS availability codes as well) was presented

t(k − 1) + 1
dmin (n, k, r, t) 6 n − k + 2 − . (28)
t(r − 1) + 1

This bound was derived by adopting the approach employed in Gopalan et al. [93] to bound the
minimum distance of an availability code. An improved minimum-distance estimate appears in [118]
t
X k−1
dmin (n, k, r, t) 6 n − . (29)
i=0
ri

• Approach via minimum support weights. The next bound on minimum distance relies upon an easy-
to-compute sequence that represents upper bounds on the GHW of the dual of an availability code. Let
there be b subsets {S1 , . . . , Sb } of [n], each of size at most r + 1. We assume that [n] = ∪bi=1 Si . Let fi be
the minimum size of the union of any i out of the b subsets, i.e., fi = min{T :T ⊆[b]:|T |=i} | ∪j∈T Sj |. Then
fi 6 ei [98] where the {ei }bi=1 are recursively calculated in the reverse direction as follows: set eb = n,
and for 2 6 i 6 b, set

2ei
ei−1 = min ei , ei − +r+1 . (30)
i

From the definition of ej , it is clear that ej is an upper bound on the j-th minimum support weight or
j-th GHW of a code containing b linearly independent codewords with the i-th codeword having support
Si , i ∈ [b]. We will refer to the sequence {ei } associated with a given parameter set (n, r, b) as the
minimum-support-weight (MSW) sequence associated to (n, r, b). The bound below in (32) appeared
in [110] and makes use of the fact that shortening of an (n, k, r, t) availability code results in a second
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:30

availability code with parameters (n − ∆n , k − ∆k , r, t) having the same or larger dmin . By applying the
bound in (29) to the shortened code, one often obtains a bound on the original code (i.e., the parent code
before shortening) that is significantly tighter. To estimate (∆n , ∆k ), the bound makes use of the MSW
sequence discussed above.
Theorem 10 ([110]). Let b = ⌈n(1 − ρ(r, t))⌉ and ei be calculated as per (30), where
 r

 , if t ∈ {1, 2},


 r+t


 r2
ρ(r, t) = (r + 1)2 , if t = 3, (31)




 1
 Qt (1 + 1 ) , if t > 3.


j=1 jr

Then,
 
 t
X k + i − ei − 1 
dmin (n, k, r, t) 6 min n−k−i+1− . (32)
16i6b, ei −i<k  rj 
j=1

The calculation of ρ(r, t) for t = 1 was not explicitly stated in [110] but is well known. Also contained
in [110] is an improved upper bound on dmin in the case of codes with strict availability.

8.2.4 Alphabet-size dependent bounds on dmin

Let dqmin (n, k, r, t) be the maximum possible minimum distance of an (n, k, r, t) availability code over Fq .
In [109], the authors provide a bound on minimum distance of an (n, k, r, t) IS availability code (the
bound thus also applies to AS availability codes as well) that depends on the size q of the underlying
finite field Fq

dqmin (n, k, r, t) 6 min dq (n − B(r, x, y), k − A(r, x, y)), (33)

k
1 6 x 6 ⌈ (r−1)t+1 ⌉,
x ∈ Z+ , y ∈ [t]x , A(r, x, y) < k

P P
where A(r, x, y) = xj=1 (r − 1)yj + x, B(r, x, y) = xj=1 ryj + x and dq (n, k) is the maximum possible
minimum distance of a classical (i.e., no locality necessary) [n, k] block code over Fq . There is a similar
bound on the dimension of an availability code with parameters n, r, t, dmin over Fq .
The following bound on the minimum distance of an (n, k, r, t) availability code over Fq that is tighter
than the bound in (33) appears in [110] and is currently the tightest-known bound on dqmin (n, k, r, t):

dqmin (n, k, r, t) 6 min dqmin (n − ei , k + i − ei , r, t), (34)

i∈S

where S = {i : ei − i < k, 1 6 i 6 b} and b = ⌈n(1 − ρ(r, t))⌉ and ei is calculated as per (30). This bound
is also based on the shortening approach introduced in [108].

8.2.5 Asymptotic bounds on rate

log (A (n,r,t,⌈∆n⌉))
Let Rq (r, t, ∆) = lim supn→∞ q q n , where Aq (n, r, t, d) is the maximum number of codewords
in an availability code with parameters (n, r, t) with minimum distance d over Fq . The only known upper
bounds on supq Rq (r, t, ∆) are based on converting the minimum distance bounds appearing in (28), (29)
and (32) into asymptotic bounds. There are constructions which provide lower bounds on Rq (r, t, ∆).
A lower bound on supq Rq (r, t, ∆) for any r > t is provided in [118]. For the specific case t = 2, [118]
provides lower bounds on Rq (r, 2, ∆):
!
q r 1 (2)
R (r, 2, ∆) > − min r+2 logq (gq (s)) − ∆ logq (s)
valid for any q, (35)
r + 2 0<s61 2
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:31

r+2
1 X r+2
(1 + s)( 2 )−i(r+2−i) (1 − s)i(r+2−i)
(2) r+2
g2 (s) = valid only for q = 2. (36)
2r+2 i=0
i

(2)
The reader is referred to [118] for an expression for gq (s) for general q as well as a lower bound on
supq Rq (r, t, ∆) for any r > t. A further lower bound on Rq (r, t, ∆) for the case t = 2 and based on
algebraic geometry codes appears in [119].

8.3 Codes with sequential recovery

Somewhat surprisingly, the maximum possible rate of an (n, k, r, t) S-LR code has been precisely deter-
mined via a tight upper bound and a matching construction. The case t = 2, 3 is respectively settled
in [98, 146], where the authors derive the respective bounds
& '
2k 2k + ⌈ kr ⌉
n>k+ for t = 2, n > k + for t = 3,
r r

and provide matching constructions in each case. Matching constructions for the t = 2 case can be derived
either from complete graphs or Turan graphs [98]. Interestingly, the construction based on Turan graphs
turns out to be optimal with respect to GHW as well. The general t > 4 case was settled in [147, 148]
and is presented below.
Theorem 11 ([147, 148]). Let C be an (n, k, r, t) S-LR code over a finite field Fq . Let r > 3. Then
 t
 r2


 t P 2t −1 i , t even,
k r 2 + 2 r
6 i=0 (37)
n   rs

 P , for t odd,
rs + 2 s−1 i
i=1 r + 1

where s = t+1 2 . Moreover, there exist binary codes (i.e., codes over Fq with q = 2) that achieve this
bound.
The rate bound given in (37) proves a conjecture given in [149] for maximum achievable rate of an
(n, k, r, t) S-LRC. The proof of the bound (37) given in [147, 148], shows that a code achieving the above
rate bound must have a parity check matrix (upto a permutation of rows and columns) with a specific,
sparse, staircase structure. An example of this for the case t = 8 is shown in Figure 16. Based on this,
it can shown that a binary code achieving the rate bound (37) and hence having parity check matrix of
the form as shown in Figure 16, must be based on a tree-like graph with girth > t + 1 with degree r + 1
for most nodes, where each edge of the graph represents a code symbol and each node represents a parity
check of the code symbols incident on it. Codes achieving the rate bound (37) appeared in [147, 148, 150]
and are based on constructing these tree-like graphs with girth > t + 1.
We note that a construction of codes based on (r + 1)-regular bipartite graphs having girth t + 1 and
achieving rate close to (37) was suggested earlier in [151]. It was noted that these codes have rate > r−1
r+1 .
1
It is not hard to show that these codes have rate equal to r−1 r+1 + n [147]. For certain n, the resultant
codes achieve the rate bound in (37). However these values of n correspond to the existence of Moore
graphs of degree r + 1, and girth = t + 1 with that number n of edges. For r > 2, Moore graphs exist
only for t ∈ {2, 3, 4, 5, 7, 11} [152].
In Figure 17, we compare the tight bound in (37) on the rate of an S-LR code with the upper bound in
(26), due to Tamo et al. [118] on the rate of a code with availability. The plots suggest that codes with
sequential recovery offer a significant rate advantage.

8.4 (r, δ) codes

The (Singleton) bound on the minimum distance of a code with (r, δ) locality was presented above in
(18). We collect together in this subsection, other results on this class of codes that have appeared in the
literature.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:32

0.80
0.75
D0 A1 0 0 0 0.70
0.65

Rate
0 D1 A2 0 0 0.60
0.55 Achievable bound on rate for t=10
0 0 D2 A3 0 for sequential recovery
0.50
0.45 Tamo et al. upper bound on rate
0 0 0 D3 C for t=10 for availability
0.40
3 4 5 6 7 8 9 10
r (locality parameter)

Figure 16 The general form [147,148] of the parity-check Figure 17 (Color online) Comparison of rate bounds on
matrix H of a rate-optimal S-LR code for t = 8. codes with sequential recovery (37) and codes with avail-
ability (26) for t = 10.

8.4.1 Constructions and characterization of distance optimal (r, δ) codes

We focus here only on optimal constructions having low field size. A detailed investigation of codes which
achieve the Singleton bound on minimum distance of a code with (r, δ) locality for all symbols appears
in [153] (see in particular, Figure 2 of [153] which provides a characterization of the existence of codes
achieving the Singleton bound). In [102], a construction of codes achieving (18) with field size O(n) for
the case (r + δ − 1)|n is provided. A construction of cyclic codes with (r, δ) locality achieving the bound
(18) for (r + δ − 1)|n and field size of O(n) appears in [154].

8.4.2 (r, δ) codes with small alphabet size

• Upper bounds on dimension. Several alphabet-size dependent bounds on dimension for a code with
(r, δ) AS locality and given minimum distance dmin appear in [114]. The bounds take on the form

n−d+1
k6 + 1 logq (B(r + δ − 1, δ)),
r+δ−1
where B(r + δ − 1, δ) is an upper bound on the number of codewords in a code of block length (r + δ − 1)
and minimum distance δ and is log-convex in the block length. The different bounds are obtained by
substituting various bounds for B(r + δ − 1, δ). The authors also present bounds for disjoint local codes
derived based on association schemes and linear programming which provide the tightest-known bounds
in the literature on codes with (r, δ) locality with disjoint local codes.
• Binary codes with (r, δ) locality. In [155], distance-optimal (codes achieving the Singleton bound)
binary codes are characterized and the authors of [155], prove that there are only 2 classes of binary,
distance-optimal codes for δ > 2. They make use of the fact in their proof that since the code is binary
and achieves the Singleton bound on minimum distance, the code after shortening a sufficient number of
selected symbols must be an [ℓ, 1, ℓ] MDS code for some ℓ < n.

8.4.3 Achievability results on asymptotic rate

In [119], the following GV-type bound is derived

r logq (bδ (s))
Rq (r, δ, ∆) > − min − ∆ logq (s) ,
r + δ − 1 0<s61 r+δ−1
where ∆ denotes the fractional minimum distance and δ is the parameter associated with (r, δ) locality
and where
X r + δ − 1
r+δ−1 X w − 1
w−δ
bδ (s) = 1 + (q − 1) w w−δ
s q (−q)−j .
w j=0
j
w=δ
√
A second lower bound, based on a construction appearing in [119] applies whenever r + δ − 1 = q and
r 3
improves upon the above GV-type bound in some parameter range Rq (r, δ, ∆) > r+δ−1 (1 − ∆ − √q+1 ).
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:33

[24,14,6]

[12,8,3] [12,8,3]

[4,3,2] [4,3,2] [4,3,2] [4,3,2] [4,3,2] [4,3,2]

Figure 18 (Color online) Illustration of a code with hierarchical locality. Each code symbol is protected by a [4, 3, 2] local
code. Each local code is contained in a [12, 8, 3] middle code.

8.5 Codes with hierarchical locality

Codes with hierarchical locality are codes proposed in [156] having multiple tiers of locality. We restrict
the discussion for simplicity here to 2 tiers. The motivation here is that in a code with 2-tier locality,
the higher probability single-erasure event can be repaired with the help of a short local code, while the
lower-probability, multiple-erasure event can be handled by accessing a larger number of symbols from
the next level local code, termed here as the ‘middle’ code. A hierarchical topology of local codes as
illustrated by the example shown in Figure 18 is proposed in [156] and a bound on the minimum distance
derived for the general case. The bound for a two-level hierarchy is presented below.
Theorem 12. Let C be an [n, k, d]-linear code with hierarchical locality with the local and middle codes
having dimensions at most r1 , r2 respectively, and minimum distances at least δ1 , δ2 respectively. Then

k k
d 6 n−k+1− − 1 (δ2 − 1) − − 1 (δ1 − δ2 ). (38)
r2 r1

Optimal constructions are provided in [156,157]. We note that in the context of a practical distributed-
storage system, the authors in [158] had previously suggested the topology of hierarchical codes and
compared hierarchical codes with RS codes in terms of repair-efficiency using real data.

8.6 LR code with cooperative recovery (C-LR code)

Let dmin (n, k, r, t) be the maximum possible minimum distance of a C-LR code with parameters (n, k, r, t).
In [151], the authors introduce the notion of cooperative local repair and provide the following bound on
minimum distance for both linear as well as non-linear codes

k−t
dmin (n, k, r, t) 6 n − k + 1 − t .
r

They also give a second bound for r > t. The paper also contains the following alphabet-size dependent
bound on dimension

k6 min rγ + logq (Aq (n − γ(r + t), d)),

n
γ6min(⌊ r+t ⌋,⌊ k−1
r ⌋)

where Aq (n, d) is the maximum size of a q-ary code of block length n and minimum distance d.
Open problems 5 (Codes for multiple erasures). (1) For a given (n, k, r, δ), what is the maximum
achievable minimum distance of codes having (r, δ) locality for a given constraint on field size?
(2) For a given (n, k, r), what is the minimum field size over which we can construct a code with locality
(δ = 2) meeting the Singleton bound?
(3) The construction of codes with locality (δ = 2) over a field Fq of size q = O(1) for a larger range
of (dmin , r) (say large dmin , r) which are dmin optimal over Fq .
(4) The construction of MR codes with smaller field size for a wide range of parameters.
(5) What is the maximum achievable rate nk for a given (r, t) of codes with availability and C-LR
codes?
(6) For a given (n, k, r, t), what is the maximum achievable minimum distance of an S-LR code, a code
with availability, or a C-LR code?
(7) Questions 5 and 6 when restricted to a finite field Fq .
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:34

(8) All the above questions on minimum distance can be rephrased as a question on maximum achiev-
able dimension for a given (n, dmin , r, t) over a finite field Fq .

9 Locally RG codes

As is clear from the discussion in the preceding sections, while RG codes aim to minimize the repair
bandwidth, LR codes focus in keeping the repair degree low. It is natural to ask if it is possible to
construct codes that possess both low repair bandwidth and repair degree. The class of LRG codes
introduced independently in [83, 159], answers this question in the affirmative. These codes are perhaps
best viewed as codes with locality in which the local codes are RG codes.

9.1 Locality in vector codes

We begin by studying the notion of locality in a vector code, i.e., a code over a vector alphabet. Let C be
an [[n, K, dmin , α]] vector code over the vector alphabet Fα
q having block length n and minimum Hamming
distance dmin . Let K be the dimension of the code viewing the code as a vector space over Fq . Let Cs
be the scalar code of length nα obtained from C by replacing each vector symbol by the corresponding
α scalar symbols. Let G be a generator matrix for Cs , where the first α columns correspond to the first
vector code symbol of C and so on. For 1 6 i 6 n, we use the terminology i-th thick column to denote
the set of columns [(i − 1)α + 1, iα] of G corresponding to the i-th vector code symbol of C. Clearly, the
scalar code Cs has dimension K.
For a subset S ⊆ [n], of indices, let C|S denote the vector code obtained by restricting the code C to
the thick columns associated with the indices in S. We similarly define G|S to be the restriction of G to
the thick columns associated to S. The definition below is a natural extension of the notion of locality
to a code over vector alphabet.
Definition 5. For i ∈ [0, n − 1] and δ > 2, the i-th vector code symbol of C is said to have (r, δ) locality
if there exists a set Si ⊆ [n] such that i ∈ Si , |Si | 6 r + δ − 1 and dmin (C|Si ) > δ. The restriction of C to
S, i.e., code C|Si will be referred to as the local code associated to Si .
Definition 6. A vector code C is said to have (r, δ) IS locality if there exists I ⊆ [n] such that
rank(G|I ) = K and the i-th vector code symbol of C has (r, δ) locality for all i ∈ I.
C is said to have (r, δ) AS locality if I can be set to be [n] in the definition above. If for a code having
(r, δ) AS locality, Si = Sj or |Si ∩ Sj | = 0, for all i 6= j, then the code is said to have disjoint locality.
Definition 7. An [[n, K, dmin , α]] vector code C is said to have the uniform rank accumulation (URA)
property if there exists a sequence {ai }ni=1 of non-negative integers satisfying (i) a1 = α, (ii) rank(G|I ) =
Pi
j=1 ai , ∀I ⊆ [n] : |I| = i. The integer sequence {ai , i ∈ [n]} is referred to as the rank profile of C.
Remark 9. It is shown in [12] that both MSR and MBR codes possess the URA property. The rank
profile in the case of ((n, k, d), (α, β), K) MSR, MBR codes, are respectively given by
( (
α, 1 6 i 6 k, α − (i − 1)β, 1 6 i 6 k,
ai = ai =
|{z} 0, (k + 1) 6 i 6 n, |{z} 0, (k + 1) 6 i 6 n.
MSR MBR

Definition 8. An [[n, K, dmin , α]] vector code C is said to have URA locality, if the code has either
information or AS locality and if in addition, local codes are [[nℓ , Kℓ , dℓ , α]] vector codes having the URA
property with identical rank profiles.
Consider the vector code C having URA locality with parameters as in Definition 8. The rank profile
for any given [[nℓ , Kℓ , dℓ , α]] local code is denoted by {ai }ni=1
ℓ
. Let {bi }∞
i=1 be a periodic sequence, where
Ps
bi = ai for 1 6 i 6 nℓ and bnℓ +j = bj for j > 1. Define P (s) , i=1 bi : s > 1. For x > 1, set P (inv) (x)
to be the smallest integer y such that P (y) > x, i.e., P (inv) (x) = y.
Theorem 13 ([159]). Let C be an [[n, K, dmin , α]] code with URA locality, where the local codes have
parameter set [[nℓ , Kℓ , dℓ , α]]. Then, we have dmin (C) 6 n − P (inv) (K) + 1.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:35

2,5, 2,5, 2,5,

6, P 6, P 6, P
P 6 P 6 P 6

1,4, 1 1,3, 1,4, 1 1,3, 1,4, 1 1,3,

9, P 5 2 6,7 9, P 5 2 6,7 9, P 5 2 6,7
4 3 4 3 4 3
9 7 9 7 9 7

3,5, 2,4, 3,5, 2,4, 3,5, 2,4,

8,9 8 7,8 8,9 8 7,8 8,9 8 7,8

Local code 1 Local code 2 Local code 3

Scalar all-symbol locality code

1 2 3 4 5 6 7 8 9 P 1 2 3 4 5 6 7 8 9 P 1 2 3 4 5 6 7 8 9 P
Local code 1 Local code 2 Local code 3

Figure 19 (Color online) An [[n = 15, K = 20, dmin = 5, α = 4]] LRG code C where local codes are MBR codes. Here the
local codes are ((nℓ = 5, r = 3, d = 4), (α = 4, β = 1), Kℓ = 9) MBR codes.

Corollary 1. Consider the case of a vector with locality, where the local codes are ((nℓ , r, d), (α, β), Kℓ )
MSR codes. Using Remark 9 and Theorem 13, it follows that [83, 159]

K K
dmin (C) 6 n − +1− − 1 (δ − 1).
α αr
In [159], the authors give minimum-distance bounds for general vector codes with locality and a tighter
bound for the case when the local codes have the URA property. LRG codes with MSR or MBR AS
locality, and IS locality that meet the minimum-distance bound, are provided for various parameters.
The field-size requirement here is at least O(n2 ) for the AS locality code constructions. In [83], the
authors present an explicit construction of a vector code with MSR AS locality, that requires a field-size
that is exponential in n. In [160], the authors construct a related family of vector codes with IS locality,
where the local codes are vector MDS codes with near-optimal bandwidth and small sub-packetization
(α) levels. In [161, 162], the authors consider vector codes with locality featuring functional repair and
achieving a reduction in repair bandwidth by carefully choosing for each failed node, a set of r 6 k helper
nodes. In [163], the authors provide linear field-size constructions for LRG codes with AS locality, where
the local codes are either MSR or MBR.

9.2 Codes where local codes are MSR/MBR

It is possible to construct LRG codes which are minimum-distance optimal where the local codes are
MSR or MBR using the T-B construction of optimal scalar LR codes.
Example 5 ([159]). An LRG code C having parameters [[n = 15, K = 20, dmin = 5, α = 4]] where
the local codes are ((nℓ = 5, r = 3, d = 4), (α = 4, β = 1), Kℓ = 9) MBR codes, can be constructed as

follows. Let Nℓ , n2ℓ = 10, δ ′ = Nℓ − Kℓ + 1 = 2, ν = nnℓ = 3. Take a minimum-distance optimal
[νNℓ = 30, K = 20, 9] scalar T-B code C ′ with (Kℓ = 9, δ ′ = 2) AS locality. Note that each local code of
C ′ is a [Nℓ = 10, Kℓ = 9] MDS code. The LRG code with the required parameters is obtained by mapping
each such local MDS code to an MBR code, using the polygonal MBR construction. The resultant code
(Figure 19) is shown to be minimum-distance optimal in [159].
Example 6 ([163]). From the discussion in Subsection 7.2.2, it can be inferred that each local code
in a T-B code is an MDS code. Let (nℓ − r)|r and nℓ |n. In order to construct a code with MSR local
nℓ
regeneration, we initially stack α = (nℓ − r) nℓ −r independent layers of codewords from an [n, k, dTB ] T-B
code with (r, δ) AS locality. We then perform the pairwise forward transform (introduced in Subsection
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:36

4.2) independently, for each local code. This results in an [[n, K = kα, dmin , α]] LRG code C where local
codes are ((nℓ , r, d), (α, β), Kℓ ) MSR codes, with d = nℓ − 1. Let dTB denote the (optimal) minimum-
distance of the underlying T-B code. The code will be minimum-distance optimal if dTB 6 2(nℓ − r + 1).

10 Repairing RS codes

The conventional repair of an [n, k] scalar MDS code treats each code symbol as an indivisible unit and
leads to a total repair bandwidth of k times the amount of data stored in the failed node, where k is
the dimension of the code. Over the past couple of years, new techniques have surfaced that present a
different picture for the repair of scalar MDS codes, particularly for RS codes. These techniques realize
that the code symbols (say, over Fq ) of a scalar MDS code can be viewed as vectors whose entries are
over some subfield, B ⊆ Fq . For example, consider the [16, 8] RS code obtained by evaluating message
polynomials of degree 6 7 over all the elements in F24 . Under the traditional repair, the repair bandwidth
will be 8 code symbols over F24 , which is equivalent to 32 bits. As we will shortly see, it is possible to
perform single-node repair in this instance, by downloading just 1 bit from each of the fifteen surviving
nodes. This results in a repair bandwidth of 15 bits, which is a clear improvement over the 32 bits
downloaded under the conventional scheme. This line of work which vectorizes scalar MDS codes and
performs repair operations over a suitable subfield B for bandwidth gains, began with the pioneering work
of Shanmugam et al. [164] who showed the existence of an efficient repair scheme for systematic node
repair, when k = n − 2, that improved up on the traditional repair bandwidth. In a subsequent paper,
Guruswami and Wootters [165] consider generalized Reed-Solomon (GRS) codes and all-node repair.
There have been other papers since as well.
Let t be the degree of the field extension [Fq : B]. Clearly, through vector representation over the
subfield B of over F, t can be regarded as the sub-packetization level of the MDS code. Traditional RS
codes have code lengths typically on the order of |Fq | corresponding to a sub-packetization level which
is logarithmic in code-length. On the other hand, there are fundamental bounds (Subsection 4.1) that
require the sub-packetization to be exponential in code length (for a fixed r) in order to achieve the cut-set
bound. This leads to the natural and interesting question: what is the least possible repair bandwidth
that can be achieved in a low-sub-packetization-level setting?

10.1 Linear repair schemes for scalar MDS codes

In this subsection, we consider the single-node repair of linear, scalar, MDS codes over Fq , where q = pt
for p a prime power and t a positive integer. Let B be a subfield of Fq of size |B| = p. In this setting,
by linear repair scheme, we will mean that all repair operations correspond to linear operations over B.
For i ∈ [n], let bi denote the least possible repair bandwidth (measured by the number of B-symbols
downloaded) to repair the i-th code symbol. The repair bandwidth b is then defined as: b , maxi∈[n] bi .
In the discussion below, by dimension we will throughout mean dimension as a vector space over B.
Theorem 14 ([165]). Let C be a scalar MDS code. Then a linear repair scheme for C with repair
bandwidth b exists iff for each code coordinate i ∈ [n], there exists a subset Ai of t codewords in the dual
code C ⊥ such that
 
X
dim(hai , a ∈ Ai i) = t and max  dim(haj , a ∈ Ai i) 6 b.
i∈[n]
j∈[n]\i

It is easy to see the ‘if’ part above. The trace function from Fq to B is the B-linear map given by
Pt−1 m
T (γ) = m=0 γ p . Given a basis {ρm }tm=1 for Fq over B, it is known [166] that there always exists a
second basis {γm }tm=1 for Fq over B, termed the trace-dual basis of {ρm }, such that any x ∈ Fq can be
Pt
expressed in the form x = m=1 T (xρm )γm . Let Ai be as defined in Theorem 14. For a ∈ Ai ⊆ C ⊥ and
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:37

Pn
c ∈ C, we have that ci ai = − j=1,j6=i cj aj . Hence

n
X
T (ci ai ) = − T (cj aj ), for a ∈ Ai . (39)
j=1,j6=i

The definition of Ai implies that dim(hai , a ∈ Ai i) = t. Let bij denote the dimension of the set
{aj }a∈Ai and let Bij denote a basis for the vector space spanned by {aj }a∈Ai . Using the B-linearity
of the trace function, it suffices to compute the bij trace values {T (cj x) : x ∈ Bij } which can be used
Pn
to obtain {T (cj aj ) : a ∈ Ai }. Hence by downloading j=1,j6=i bij symbols over B, one can compute
{T (ci ai ) : a ∈ Ai } using (39). Using the trace-dual basis, ci can be reconstructed from these t traces.
Next, consider the specific case of an [n, k] GRS code C, whose symbols are n (scaled) evaluations of
message polynomials f (x) ∈ F[x] of degree 6 k − 1. Let the evaluation points be denoted by the set
A = {αj }nj=1 . As the dual of a GRS code is a GRS code, codewords in the dual are scaled evaluations of
message polynomials of degree 6 (n − k − 1). Thus in the context of a GRS code and ignoring w.o.l.o.g.
the scaling coefficients, (39) takes on the form
n
X
T (f (αi )g(αi )) = − T (f (αj )g(αj )), for all g(x) ∈ Pi , (40)
j=1,j6=i

where f (x) and g(x) are polynomials having degrees at most k − 1 and n − k − 1, respectively, Pi is the set
of t message polynomials having degree at most n − k − 1 corresponding to the t dual codewords in Ai .

10.2 Guruswami-Wootters GRS repair scheme

Let k 6 n − pt−1 for a GRS code. Then it is possible repair each code-symbol (say, i-th) by downloading
just one symbol over B each from the remaining (n − 1) nodes. The scheme is as follows. Consider the
set of t polynomials Pi = {gi,m (x)}tm=1 and a basis {ρm }tm=1 , where
t−1
T ρm (x − αi ) X s s
gi,m (x) = = ρpm (x − αi )p −1 .
(x − αi ) s=0

Each polynomial gi,m (x) has degree pt−1 − 1 6 n − k − 1. Hence the evaluations of this polynomial
represent a codeword in C ⊥ . Note that {gi,m (αi ) = ρm , m ∈ [t]} forms a basis for F over B, i.e.,
dimF h{gi,m (αi ), m ∈ [t]}i = t. Also, dimF h{gi.m (αj ), m ∈ [t]}i = 1 ∀j ∈ [n] \ {i}.
Theorem 15. Let C be an [n, k] MDS code over Fq . For any linear repair scheme for C over B, the
repair bandwidth, b (counted according to the number of symbols from B) satisfies
!
n−1
b > (n − 1) log|B| .
n − k + k−1
|F |

By Theorem 15, the repair scheme discussed above is optimal when A = Fq and n − k = pt−1 .

10.3 Other related work

In [167], the authors improve the Guruswami-Wootters approach to a larger class of parameters. In [168],
the authors provide a family of RS codes that has asymptotically optimal repair bandwidth with respect to
the cut-set bound. This result is further developed in [169] to reduce the sub-packetization levels. In [170],
the authors present RS codes that meet the MSR point for all parameters k < d < n − 1. Bandwidth-
efficient recovery from multiple erasures in RS codes is addressed in [171] and is further extended to
include general scalar MDS codes in [172]. In [173], the authors present codes that universally achieve
the optimal bandwidth points for all parameters h 6 n − k and k 6 d 6 n − h simultaneously (see Table 5
for a summary of RS repair schemes appearing in the literature).
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:38

Table 5 A summary of schemes appearing in the literature for the repair of RS codes

Ref. Bandwidth Sub-packetization Cut-set bound achievability Remarks

[165] n−1 logp n No Single node repair; (n − k) > pt−1
[167] (n − 1)t(1 − logn (n − k)) logp n No Single node repair; (n − k) > pℓ ; ℓ ∈ [t − 1]
[168] < t(n+1)
n−k
(n − k)n Asymptotically Single node repair
t(n−1+3(n−k))
[169] < n−k
; (n − k) = sm sm+n−1 Asymptotically Single node repair
td
[170] d−k+1
≅ nn Yes Codes exist for any given d ∈ [k, n − 1]
[171] 2(n − 1)(2-erasures); logp n No Distributed repair
3(n − 1)(3-erasures)
[171] 2(n − 2)(2-erasures); logp n No Centralized repair
3(n − 3)(3-erasures)
[172] h(n − h) − (p − 1)(h − 1) logp n No Centralized repair
√
(h-erasures) h 6 log n
tdh
[173] d−k+h
≅ nn Yes; bound in [23] Code works simultaneously for any
given number of failures, h : h ∈ [1, n − k]
and any d : d ∈ [k, n − h]

11 An information capacity approach

• Capacity bounds. In [174], a generic distributed storage system model is introduced and fundamental
limits presented. The notion of information capacity of a distributed system is introduced. Let m denote
the source data size in bits. Consider a distributed storage system with N nodes, each storing s bits of
s
data. If ∆ denotes the average time between node failures, the erasure rate ǫ can be defined as ǫ = ∆ .
When a node failure takes place, a repairer carries out node repair in a manner which ensures that the
source data can be recovered from the data in the surviving nodes at any point of time. The mean time
to data loss (MTTDL) is the average amount of time over which the source data can be recovered. Let γ
denote the repair rate, which is the rate at which the repairer reads and writes data. Let σ = γǫ denote
the repair rate to erasure rate ratio. The information capacity of a distributed storage system is then
defined as the largest amount of source data m for which a large MTTDL is possible. In [174], it is shown
1
that the information capacity approaches (1 − 2σ )N s bits as σ and N grow.
• Liquid storage. In [175], the idea of liquid cloud storage was proposed in which codes of large block
length (for example, authors use a code of block length 3010 in one of their simulations) are used to spread
data stored pertaining to every object over a large number of nodes. Liquid storage employs a lazy repair
strategy where the repair runs slowly in the background. The authors present simulation results that
shows that liquid storage gives better MTTDL performance in comparison with systems based on small
block length codes. The performance of liquid storage systems is shown to approach the fundamental
limits proved in [174].

12 Codes in practice

Distributed systems such as Hadoop, Google file system and Windows Azure have evolved to include
support for erasure codes within their systems, in order to enjoy the benefits of improved storage efficiency
in comparison with triple replication. However, the use of traditional erasure codes results in additional
repair traffic resulting in larger repair times. This led to several theoretical code constructions for efficient
node repair and these were discussed in the preceding sections of this article. Among the biggest success
stories is undoubtedly the adoption of LR codes in the Windows Azure production cluster.
• LR codes. In [176], the authors compare performance-evaluation results of an (n = 16, k = 12, r = 6)
LR code with that of an [n = 16, k = 12] RS code in the Azure production cluster and demonstrate the
repair savings offered by the LR code. Subsequently, the authors implemented an (n = 18, k = 14, r = 7)
LR code in Windows Azure storage and showed that this code has repair degree comparable to that of
an [9, 6] RS code, but has storage overhead 1.29 versus 1.5 in the case of the RS code. This code has
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:39

reportedly resulted in the savings of millions of dollars for Microsoft [177]. The authors of [2] implemented
HDFS-Xorbas which uses LR codes in place of RS codes in HDFS-RAID. Xorbas LR code is build on top
of an RS code by adding extra local XOR parties. The experimental evaluation of Xorbas was carried
out in Amazon EC2 and a cluster in Facebook, in which the repair performance of (n = 16, k = 10, r = 5)
LR code was compared against a [14, 10] RS code. A second distributed storage system that has an LR
code plug-in [178] is Ceph.
• MDS codes with bandwidth savings. The Hitchhiker erasure coded system presented in [179] is a
practical implementation of the piggybacking framework introduced in [69]. The authors implemented
the Hitchhiker in HDFS and evaluated its performance on a data-warehouse cluster at Facebook. The
Hitchhiker has now been incorporated into Apache Hadoop. In [180], the HDFS implementation of a
class of MDS array codes called HashTag codes is discussed. The theoretical framework of HashTag
codes was presented in [71]. These codes allow low sub-packetization levels at the expense of increased
repair bandwidth and are designed to efficiently repair systematic nodes.
• RG codes. The NCCloud [10] is one of the earliest works that dealt with the practical performance
evaluation of RG codes. The NCCloud storage system is build on top of a 2-parity functional MSR
code. In [181], the performance of the pentagon code (which is a RBT MBR code) and a heptagon-
local code (which is an LRG code) in a Hadoop setting are studied. These two codes possess inherent
double replication of code symbols, have storage overhead slightly greater than 2 and their performance is
compared against double and triple replication. In [182], the authors present an optimal-access version of
the PM MSR code, which they refer to as the PM-RBT code. The results of an experimental evaluation
of a rate 12 PM-RBT code on Amazon EC2 instances is reported. In [183], the authors introduced
erasure codes termed Beehive that are built on top of MSR codes. These codes repair multiple failures
simultaneously and are implemented using the PM MSR in C++ using the Intel storage acceleration
library (ISAL). In [184], the authors present the evaluation of a high-rate MSR code known as the
butterfly code in both Ceph and HDFS. This code is a simplified version of the MSR codes presented
in [185] corresponding to the presence of two parity nodes. This code possesses the optimal-access property
except in the case of the repair of a single parity node, and has sub-packetization level α = 2k−1 . More
recently in [186], the authors present Clay code that corresponds to the codes in [37–39]. The Clay code
is implemented over Ceph based on the coupled-layer perspective in [38] and is evaluated over an Amazon
AWS cluster. The Clay code is simultaneously optimal in terms of storage overhead, repair bandwidth,
optimal access and sub-packetization level. As a part of this work, vector code support has been added
to Ceph and the Clay code is under consideration to become a part of Ceph’s master code-base.

Acknowledgements This work was supported in part by National Science Foundation of USA (Grant No.
1421848) and in part by an India-Israel UGC-ISF Joint Research Program Grant.

References
1 Rashmi K V, Shah N B, Gu D, et al. A solution to the network challenges of data recovery in erasure-coded distributed
storage systems: a study on the facebook warehouse cluster. In: Proceedings of the 5th USENIX Workshop on Hot
Topics in Storage and File Systems, San Jose, 2013
2 Sathiamoorthy M, Asteris M, Papailiopoulos D, et al. XORing elephants. Proc VLDB Endow, 2013, 6: 325–336
3 Dimakis A G, Ramchandran K, Wu Y N, et al. A survey on network codes for distributed storage. Proc IEEE, 2011,
99: 476–489
4 Datta A, Oggier F. An overview of codes tailor-made for better repairability in networked distributed storage systems.
SIGACT News, 2013, 44: 89
5 Li J, Li B. Erasure coding for cloud storage systems: a survey. Tinshhua Sci Technol, 2013, 18: 259–272
6 Liu S Q, Oggier F. An overview of coding for distributed storage systems. In: Network Coding and Subspace Designs.
Berlin: Springer, 2018. 363–383
7 Dimakis A G, Godfrey P B, Wu Y, et al. Network coding for distributed storage systems. IEEE Trans Inf Theory,
2010, 56: 4539–4551
8 Wu Y. Existence and construction of capacity-achieving network codes for distributed storage. IEEE J Sel Areas
Commun, 2010, 28: 277–288
9 Shah N B, Rashmi K V, Kumar P V, et al. Interference alignment in regenerating codes for distributed storage:
necessity and code constructions. IEEE Trans Inf Theory, 2012, 58: 2134–2158
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:40

10 Hu Y C, Chen H C H, Lee P P C, et al. NCCloud: applying network coding for the storage repair in a cloud-of-clouds.
In: Proceedings of the 10th USENIX Conference on File and Storage Technologies, San Jose, 2012
11 Krishnan M N, Kumar P V. On MBR codes with replication. In: Proceedings of IEEE International Symposium on
Information Theory, Barcelona, 2016. 71–75
12 Shah N B. On minimizing data-read and download for storage-node recovery. IEEE Commun Lett, 2013, 17: 964–967
13 Rashmi K V, Shah N B, Kumar P V, et al. Explicit construction of optimal exact regenerating codes for distributed
storage. In: Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing,
Monticello, 2009. 1243–1249
14 Rashmi K V, Shah N B, Kumar P V. Optimal exact-regenerating codes for distributed storage at the MSR and MBR
points via a product-matrix construction. IEEE Trans Inf Theory, 2011, 57: 5227–5239
15 Lin S J, Chung W H. Novel repair-by-transfer codes and systematic exact-mbr codes with lower complexities and
smaller field sizes. IEEE Trans Parallel Distrib Syst, 2014, 25: 3232–3241
16 Han Y S, Pai H T, Zheng R, et al. Update-efficient error-correcting product-matrix codes. IEEE Trans Commun,
2015, 63: 1925–1938
17 Raviv N. Asymptotically optimal regenerating codes over any field. In: Proceedings of IEEE International Symposium
on Information Theory, Aachen, 2017. 1416–1420
18 Mahdaviani K, Khisti A, Mohajer S. Bandwidth adaptive & error resilient MBR exact repair regenerating codes.
2017. ArXiv:1711.02770
19 Wu Y, Dimakis A G. Reducing repair traffic for erasure coding-based storage via interference alignment. In: Pro-
ceedings of IEEE International Symposium on Information Theory, Seoul, 2009. 2276–2280
20 Suh C, Ramchandran K. Exact-repair mds code construction using interference alignment. IEEE Trans Inf Theory,
2011, 57: 1425–1442
21 Lin S J, Chung W H, Han Y S, et al. A unified form of exact-MSR codes via product-matrix frameworks. IEEE
Trans Inf Theory, 2015, 61: 873–886
22 Papailiopoulos D S, Dimakis A G, Cadambe V R. Repair optimal erasure codes through Hadamard designs. IEEE
Trans Inf Theory, 2013, 59: 3021–3037
23 Cadambe V R, Jafar S A, Maleki H, et al. Asymptotic interference alignment for optimal repair of MDS codes in
distributed storage. IEEE Trans Inf Theory, 2013, 59: 2974–2987
24 Tamo I, Wang Z, Bruck J. Zigzag codes: MDS array codes with optimal rebuilding. IEEE Trans Inf Theory, 2013,
59: 1597–1616
25 Wang Z Y, Tamo I, Bruck J. On codes for optimal rebuilding access. In: Proceedings of the 49th Annual Allerton
Conference on Communication, Control, and Computing, Monticello, 2011. 1374–1381
26 Cadambe V R, Huang C, Li J, et al. Polynomial length MDS codes with optimal repair in distributed storage. In:
Proceedings of the 45th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, 2011. 1850–1854
27 Wang Z Y, Tamo I, Bruck J. Long MDS codes for optimal repair bandwidth. In: Proceedings of IEEE International
Symposium on Information Theory, Cambridge, 2012. 1182–1186
28 Tamo I, Wang Z Y, Bruck J. Access versus bandwidth in codes for storage. IEEE Trans Inf Theory, 2014, 60:
2028–2037
29 Goparaju S, Tamo I, Calderbank R. An improved sub-packetization bound for minimum storage regenerating codes.
IEEE Trans Inf Theory, 2014, 60: 2770–2779
30 Sasidharan B, Agarwal G K, Kumar P V. A high-rate MSR code with polynomial sub-packetization level. In:
Proceedings of IEEE International Symposium on Information Theory, Hong Kong, 2015. 2051–2055
31 Rawat A S, Koyluoglu O O, Vishwanath S. Progress on high-rate MSR codes: enabling arbitrary number of helper
nodes. In: Proceedings of Information Theory and Applications Workshop, La Jolla, 2016
32 Goparaju S, Fazeli A, Vardy A. Minimum storage regenerating codes for all parameters. IEEE Trans Inf Theory,
2017, 63: 6318–6328
33 Agarwal G K, Sasidharan B, Kumar P V. An alternate construction of an access-optimal regenerating code with
optimal sub-packetization level. In: Proceedings of the 21st National Conference on Communications, Mumbai, 2015
34 Alon N. Combinatorial nullstellensatz. Combinator Probab Comp, 1999, 8: 7–29
35 Raviv N, Silberstein N, Etzion T. Constructions of high-rate minimum storage regenerating codes over small fields.
IEEE Trans Inf Theory, 2017, 63: 2015–2038
36 Ye M, Barg A. Explicit constructions of high-rate MDS array codes with optimal repair bandwidth. IEEE Trans Inf
Theory, 2017, 63: 2001–2014
37 Ye M, Barg A. Explicit constructions of optimal-access MDS codes with nearly optimal sub-packetization. IEEE
Trans Inf Theory, 2017, 63: 6307–6317
38 Sasidharan B, Vajha M, Kumar P V. An explicit, coupled-layer construction of a high-rate MSR code with low
sub-packetization level, small field size and all-node repair. 2016. ArXiv:1607.07335
39 Li J, Tang X H, Tian C. A generic transformation for optimal repair bandwidth and rebuilding access in MDS codes.
In: Proceedings of IEEE International Symposium on Information Theory, Aachen, 2017. 1623–1627
40 Balaji S B, Kumar P V. A tight lower bound on the sub-packetization level of optimal-access MSR and MDS codes.
2018. ArXiv:1710.05876
41 Vajha M, Balaji S B, Kumar P V. Explicit MSR codes with optimal access, optimal sub-packetization and small field
size for d = k + 1, k + 2, k + 3. 2018. ArXiv:1804.00598
42 Mahdaviani K, Mohajer S, Khisti A. Product matrix MSR codes with bandwidth adaptive exact repair. IEEE Trans
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:41

Inf Theory, 2018, 64: 3121–3135

43 Shah N B, Rashmi K V, Kumar P V, et al. Distributed storage codes with repair-by-transfer and nonachievability
of interior points on the storage-bandwidth tradeoff. IEEE Trans Inf Theory, 2012, 58: 1837–1852
44 Tian C. Characterizing the rate region of the (4,3,3) exact-repair regenerating codes. IEEE J Sel Areas Commun,
2014, 32: 967–975
45 Information theory inequality prover. 2016. http://user-www.ie.cuhk.edu.hk/∼ ITIP/
46 Yeung R W. A framework for linear information inequalities. IEEE Trans Inf Theory, 1997, 43: 1924–1934
47 Tian C, Sasidharan B, Aggarwal V, et al. Layered exact-repair regenerating codes via embedded error correction and
block designs. IEEE Trans Inf Theory, 2015, 61: 1933–1947
48 Senthoor K, Sasidharan B, Kumar P V. Improved layered regenerating codes characterizing the exact-repair storage-
repair bandwidth tradeoff for certain parameter sets. In: Proceedings of IEEE Information Theory Workshop,
Jerusalem, 2015
49 Sasidharan B, Senthoor K, Kumar P V. An improved outer bound on the storage repair-bandwidth tradeoff of exact-
repair regenerating codes. In: Proceedings of IEEE International Symposium on Information Theory, Honolulu, 2014.
2430–2434
50 Duursma I M. Outer bounds for exact repair codes. 2014. ArXiv:1406.4852
51 Duursma I M. Shortened regenerating codes. IEEE Trans Inf Theory (Early Access), 2018. doi: 10.1109/TIT.2018.
2840995
52 Mohajer S, Tandon R. New bounds on the (n, k, d) storage systems with exact repair. In: Proceedings of IEEE
International Symposium on Information Theory, Hong Kong, 2015. 2056–2060
53 Sasidharan B, Prakash N, Krishnan M N, et al. Outer bounds on the storage-repair bandwidth trade-off of exact-repair
regenerating codes. Int J Inf Coding Theory, 2016, 3: 255–298
54 Elyasi M, Mohajer S. Determinant coding: a novel framework for exact-repair regenerating codes. IEEE Trans Inf
Theory, 2016, 62: 6683–6697
55 Elyasi M, Mohajer S. Exact-repair trade-off for (n, k = d − 1, d) regenerating codes. In: Proceedings of the 55th
Annual Allerton Conference on Communication, Control, and Computing, Monticello, 2017. 934–941
56 Prakash N, Krishnan M N. The storage-repair-bandwidth trade-off of exact repair linear regenerating codes for the
case d = k = n − 1. In: Proceedings of IEEE International Symposium on Information Theory, Hong Kong, 2015.
859–863
57 Elyasi M, Mohajer S, Tandon R. Linear exact repair rate region of (k + 1, k, k) distributed storage systems: a new
approach. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Hong Kong, 2015.
2061–2065
58 Hu Y, Xu Y, Wang X, et al. Cooperative recovery of distributed storage systems from multiple losses with network
coding. IEEE J Sel Areas Commun, 2010, 28: 268–276
59 Kermarrec A M, Scouarnec N L, Straub G. Repairing multiple failures with coordinated and adaptive regenerating
codes. In: Proceedings of International Symposium on Networking Coding, Beijing, 2011
60 Shum K W, Hu Y. Cooperative regenerating codes. IEEE Trans Inf Theory, 2013, 59: 7229–7258
61 Wang A Y, Zhang Z F. Exact cooperative regenerating codes with minimum-repair-bandwidth for distributed storage.
In: Proceedings of IEEE INFOCOM, Turin, 2013. 400–404
62 Shum K W, Chen J. Cooperative repair of multiple node failures in distributed storage systems. Int J Inf Coding
Theory, 2016, 3: 299–323
63 Scouarnec N L. Exact scalar minimum storage coordinated regenerating codes. In: Proceedings of IEEE International
Symposium on Information Theory, Cambridge, 2012. 1197–1201
64 Ye M, Barg A. Optimal MDS codes for cooperative repair. 2018. ArXiv:1801.09665
65 Liu S Q, Oggier F E. On storage codes allowing partially collaborative repairs. In: Proceedings of IEEE International
Symposium on Information Theory Proceedings, Honolulu, 2014. 2440–2444
66 Liu S Q, Oggier F E. Two storage code constructions allowing partially collaborative repairs. In: Proceedings of
International Symposium on Information Theory and its Applications, Melbourne, 2014. 378–382
67 Koyluoglu O O, Rawat A S, Vishwanath S. Secure cooperative regenerating codes for distributed storage systems.
IEEE Trans Inf Theory, 2014, 60: 5228–5244
68 Huang K, Parampalli U, Xian M. Security concerns in minimum storage cooperative regenerating codes. IEEE Trans
Inf Theory, 2016, 62: 6218–6232
69 Rashmi K V, Shah N B, Ramchandran K. A piggybacking design framework for read-and download-efficient dis-
tributed storage codes. IEEE Trans Inf Theory, 2017, 63: 5802–5820
70 Guruswami V, Rawat A S. MDS code constructions with small sub-packetization and near-optimal repair bandwidth.
In: Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, Barcelona, 2017. 2109–2122
71 Kralevska K, Gligoroski D, Øverby H. General sub-packetized access-optimal regenerating codes. IEEE Commun
Lett, 2016, 20: 1281–1284
72 Rawat A S, Tamo I, Guruswami V, et al. ǫ-MSR codes with small sub-packetization. In: Proceedings of IEEE
International Symposium on Information Theory, Aachen, 2017. 2043–2047
73 Rouayheb S Y E, Ramchandran K. Fractional repetition codes for repair in distributed storage systems. In: Pro-
ceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing, Allerton, 2010
74 Pawar S, Noorshams N, Rouayheb S Y E, et al. DRESS codes for the storage cloud: simple randomized construc-
tions. In: Proceedings of IEEE International Symposium on Information Theory Proceedings, St. Petersburg, 2011.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:42

2338–2342
75 Silberstein N, Etzion T. Optimal fractional repetition codes based on graphs and designs. IEEE Trans Inf Theory,
2015, 61: 4164–4180
76 Olmez O, Ramamoorthy A. Fractional repetition codes with flexible repair from combinatorial designs. IEEE Trans
Inf Theory, 2016, 62: 1565–1591
77 Koo J C, Gill J T. Scalable constructions of fractional repetition codes in distributed storage systems. In: Proceedings
of the 49th Annual Allerton Conference on Communication, Control, and Computing, Monticello, 2011. 1366–1373
78 Ernvall T. The existence of fractional repetition codes. 2012. ArXiv:1201.3547
79 Pawar S, Rouayheb S Y E, Ramchandran K. Securing dynamic distributed storage systems against eavesdropping
and adversarial attacks. IEEE Trans Inf Theory, 2011, 57: 6734–6753
80 Rashmi K V, Shah N B, Ramchandran K, et al. Regenerating codes for errors and erasures in distributed storage.
In: Proceeding of IEEE International Symposium on Information Theory Proceedings, Cambridge, 2012. 1202–1206
81 Rashmi K V, Shah N B, Ramchandran K, et al. Information-theoretically secure erasure codes for distributed storage.
IEEE Trans Inf Theory, 2018, 64: 1621–1646
82 Tandon R, Amuru S D, Clancy T C, et al. Toward optimal secure distributed storage systems with exact repair.
IEEE Trans Inf Theory, 2016, 62: 3477–3492
83 Rawat A S, Koyluoglu O O, Silberstein N, et al. Optimal locally repairable and secure codes for distributed storage
systems. IEEE Trans Inf Theory, 2014, 60: 212–236
84 Goparaju S, Rouayheb S Y E, Calderbank R, et al. Data secrecy in distributed storage systems under exact repair.
In: Proceedings of International Symposium on Network Coding, Calgary, 2013
85 Huang K, Parampalli U, Xian M. On secrecy capacity of minimum storage regenerating codes. IEEE Trans Inf
Theory, 2017, 63: 1510–1524
86 Rawat A S. Secrecy capacity of minimum storage regenerating codes. In: Proceedings of IEEE International Sympo-
sium on Information Theory, Aachen, 2017. 1406–1410
87 Kadhe S, Sprintson A. Security for minimum storage regenerating codes and locally repairable codes. In: Proceedings
of IEEE International Symposium on Information Theory, Aachen, 2017. 1028–1032
88 Ye F, Shum K W, Yeung R W. The rate region for secure distributed storage systems. IEEE Trans Inf Theory, 2017,
63: 7038–7051
89 Shao S, Liu T, Tian C, et al. On the tradeoff region of secure exact-repair regenerating codes. IEEE Trans Inf
Theory, 2017, 63: 7253–7266
90 Han J, Lastras-Montano L A. Reliable memories with subline accesses. In: Proceedings of IEEE International
Symposium on Information Theory, Nice, 2007. 2531–2535
91 Huang C, Chen M H, Li J. Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage
systems. In: Proceedings of the 6th IEEE International Symposium on Network Computing and Applications,
Cambridge, 2007. 79–86
92 Oggier F, Datta A. Self-repairing homomorphic codes for distributed storage systems. In: Proceedings of IEEE
INFOCOM, Shanghai, 2011. 1215–1223
93 Gopalan P, Huang C, Simitci H, et al. On the locality of codeword symbols. IEEE Trans Inf Theory, 2012, 58:
6925–6934
94 Papailiopoulos D, Dimakis A. Locally repairable codes. In: Proceedings of IEEE International Symposium on Infor-
mation Theory, Cambridge, 2012. 2771–2775
95 Forbes M, Yekhanin S. On the locality of codeword symbols in non-linear codes. Discrete Math, 2014, 324: 78–84
96 Prakash N, Kamath G M, Lalitha V, et al. Optimal linear codes with a local-error-correction property. In: Proceedings
of IEEE International Symposium on Information Theory Proceedings, Cambridge, 2012. 2776–2780
97 Silberstein N, Rawat A S, Koyluoglu O O, et al. Optimal locally repairable codes via rank-metric codes. In:
Proceedings of IEEE International Symposium on Information Theory, Istanbul, 2013. 1819–1823
98 Prakash N, Lalitha V, Kumar P V. Codes with locality for two erasures. In: Proceedings of IEEE International
Symposium on Information Theory, Honolulu, 2014. 1962–1966
99 Wang A, Zhang Z. An integer programming-based bound for locally repairable codes. IEEE Trans Inf Theory, 2015,
61: 5280–5294
100 Zhang J, Wang X, Ge G N. Some improvements on locally repairable codes. 2015. ArXiv:1506.04822
101 Mehrabi M, Ardakani M. On minimum distance of locally repairable codes. In: Proceedings of the 15th Canadian
Workshop on Information Theory, Quebec, 2017
102 Tamo I, Barg A. A family of optimal locally recoverable codes. IEEE Trans Inf Theory, 2014, 60: 4661–4676
103 Ernvall T, Westerback T, Hollanti C. Constructions of optimal and almost optimal locally repairable codes. In:
Proceedings of the 4th International Conference on Wireless Communications, Vehicular Technology, Information
Theory and Aerospace Electronic Systems, Aalborg, 2014
104 Liu J, Mesnager S, Chen L. New constructions of optimal locally recoverable codes via good polynomials. IEEE
Trans Inf Theory, 2018, 64: 889–899
105 Kolosov O, Barg A, Tamo I, et al. Optimal LRC codes for all lenghts n 6 q. 2018. ArXiv:1802.00157
106 Jin L F, Ma L M, Xing C P. Construction of optimal locally repairable codes via automorphism groups of rational
function fields. 2017. ArXiv:1710.09638
107 Balaji S B, Kumar P V. On partial maximally-recoverable and maximally-recoverable codes. In: Proceedings of IEEE
International Symposium on Information Theory, Hong Kong, 2015. 1881–1885
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:43

108 Cadambe V R, Mazumdar A. Bounds on the size of locally recoverable codes. IEEE Trans Inf Theory, 2015, 61:
5787–5794
109 Huang P, Yaakobi E, Uchikawa H, et al. Binary linear locally repairable codes. IEEE Trans Inf Theory, 2016, 62:
6268–6283
110 Balaji S B, Kumar P V. Bounds on the rate and minimum distance of codes with availability. In: Proceedings of
IEEE International Symposium on Information Theory, Aachen, 2017. 3155–3159
111 Wei V K. Generalized Hamming weights for linear codes. IEEE Trans Inf Theory, 1991, 37: 1412–1418
112 Wang A Y, Zhang Z F, Lin D D. Bounds and constructions for linear locally repairable codes over binary fields. In:
Proceedings of IEEE International Symposium on Information Theory, Aachen, 2017. 2033–2037
113 Ma J X, Ge G N. Optimal binary linear locally repairable codes with disjoint repair groups. 2017. ArXiv:1711.07138
114 Agarwal A, Barg A, Hu S, et al. Combinatorial alphabet-dependent bounds for locally recoverable codes. IEEE
Trans Inf Theory, 2018, 64: 3481–3492
115 Tamo I, Barg A, Goparaju S, et al. Cyclic LRC codes, binary LRC codes, and upper bounds on the distance of cyclic
codes. 2016. ArXiv:1603.08878
116 Goparaju S, Calderbank R. Binary cyclic codes that are locally repairable. In: Proceedings of IEEE International
Symposium on Information Theory, Honolulu, 2014. 676–680
117 Zeh A, Yaakobi E. Optimal linear and cyclic locally repairable codes over small fields. In: Proceedings of IEEE
Information Theory Workshop, Jerusalem, 2015
118 Tamo I, Barg A, Frolov A. Bounds on the parameters of locally recoverable codes. IEEE Trans Inf Theory, 2016, 62:
3070–3083
119 Barg A, Tamo I, Vladut S. Locally recoverable codes on algebraic curves. IEEE Trans Inf Theory, 2017, 63: 4928–4939
120 Li X D, Ma L M, Xing C P. Construction of asymptotically good locally repairable codes via automorphism groups
of function fields. 2017. ArXiv:1711.07703
121 Nam M Y, Song H Y. Binary locally repairable codes with minimum distance at least six based on partial t-spreads.
IEEE Commun Lett, 2017, 21: 1683–1686
122 Silberstein N, Zeh A. Optimal binary locally repairable codes via anticodes. In: Proceedings of IEEE International
Symposium on Information Theory, Hong Kong, 2015. 1247–1251
123 Hao J, Xia S T, Chen B. Some results on optimal locally repairable codes. In: Proceedings of IEEE International
Symposium on Information Theory, Barcelona, 2016. 440–444
124 Shahabinejad M, Khabbazian M, Ardakani M. A class of binary locally repairable codes. IEEE Trans Commun, 2016,
64: 3182–3193
125 Hao J, Xia S T, Chen B. On optimal ternary locally repairable codes. In: Proceedings of IEEE International
Symposium on Information Theory, Aachen, 2017. 171–175
126 Hao J, Xia S. Bounds and constructions of locally repairable codes: parity-check matrix approach. 2016.
ArXiv:1601.05595
127 Li X D, Ma L M, Xing C P. Optimal locally repairable codes via elliptic curves. 2017. ArXiv:1712.03744
128 Kim C, No J S. New constructions of binary and ternary locally repairable codes using cyclic codes. IEEE Commun
Lett, 2018, 22: 228–231
129 Luo Y, Xing C P, Yuan C. Optimal locally repairable codes of distance 3 and 4 via cyclic codes. 2018.
ArXiv:1801.03623
130 Krishnan M N, Puranik B, Kumar P V, et al. Exploiting locality for improved decoding of binary cyclic codes. IEEE
Trans Commun, 2018, 66: 2346–2358
131 Vardy A, Be’ery Y. Maximum-likelihood soft decision decoding of BCH codes. IEEE Trans Inf Theory, 1994, 40:
546–554
132 Huang P, Yaakobi E, Uchikawa H, et al. Cyclic linear binary locally repairable codes. In: Proceedings of IEEE
Information Theory Workshop, Jerusalem, 2015
133 Chen M H, Huang C, Li J. On the maximally recoverable property for multi-protection group codes. In: Proceedings
of IEEE International Symposium on Information Theory, Nice, 2007. 486–490
134 Blaum M, Hafner J L, Hetzler S. Partial-MDS codes and their application to RAID type of architectures. IEEE
Trans Inf Theory, 2013, 59: 4510–4519
135 Calis G, Koyluoglu O O. A general construction for PMDS codes. IEEE Commun Lett, 2017, 21: 452–455
136 Gabrys R, Yaakobi E, Blaum M, et al. Constructions of partial MDS codes over small fields. In: Proceedings of
IEEE International Symposium on Information Theory, Aachen, 2017
137 Gopalan P, Huang C, Jenkins B, et al. Explicit maximally recoverable codes with locality. IEEE Trans Inf Theory,
2014, 60: 5245–5256
138 Hu G, Yekhanin S. New constructions of SD and MR codes over small finite fields. In: Proceedings of IEEE
International Symposium on Information Theory, Barcelona, 2016. 1591–1595
139 Chen J Y, Shum K W, Yu Q, et al. Sector-disk codes and partial MDS codes with up to three global parities. In:
Proceedings of IEEE International Symposium on Information Theory, Hong Kong, 2015. 1876–1880
140 Blaum M. Construction of PMDS and SD codes extending RAID 5. 2013. ArXiv:1305.0032
141 Blaum M, Plank J S, Schwartz M, et al. Construction of partial MDS and sector-disk codes with two global parity
symbols. IEEE Trans Inf Theory, 2016, 62: 2673–2681
142 Lalitha V, Lokam S V. Weight enumerators and higher support weights of maximally recoverable codes. In: Pro-
ceedings of the 53rd Annual Allerton Conference on Communication, Control, and Computing, Monticello, 2015.
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:44

835–842
143 Kadhe S, Calderbank R. Rate optimal binary linear locally repairable codes with small availability. 2017.
ArXiv:1701.02456
144 Wang A Y, Zhang Z F, Liu M L. Achieving arbitrary locality and availability in binary codes. In: Proceedings of
IEEE International Symposium on Information Theory, Hong Kong, 2015. 1866–1870
145 Wang A Y, Zhang Z F. Repair locality with multiple erasure tolerance. IEEE Trans Inf Theory, 2014, 60: 6979–6987
146 Song W T, Yuen C. Locally repairable codes with functional repair and multiple erasure tolerance. 2015.
ArXiv:1507.02796
147 Balaji S B, Kini G R, Kumar P V. A tight rate bound and a matching construction for locally recoverable codes
with sequential recovery from any number of multiple erasures. In: Proceedings of IEEE International Symposium
on Information Theory, Aachen, 2017. 1778–1782
148 Balaji S B, Kini G R, Kumar P V. A bound on rate of codes with locality with sequential recovery from multiple
erasures. 2016. ArXiv:1611.08561
149 Song W T, Cai K, Yuen C, et al. On sequential locally repairable codes. IEEE Trans Inf Theory, 2018, 64: 3513–3527
150 Balaji S B, Kini G R, Kumar P V. A rate-optimal construction of codes with sequential recovery with low block
length. In: Proceedings of National Conference on Communications, Hyderabad, 2018
151 Rawat A S, Mazumdar A, Vishwanath S. Cooperative local repair in distributed storage. 2014. ArXiv:1409.3900
152 Exoo G, Jajcay R. Dynamic cage survey. The Electronic Journal Combinatorics, 2013. http://pdfs.semanticscholar.
org/43b8/2016a2ef8f394f2cb1954439248198d2c274.pdf
153 Song W, Dau S H, Yuen C, et al. Optimal locally repairable linear codes. IEEE J Sel Areas Commun, 2014, 32:
1019–1036
154 Chen B, Xia S T, Hao J, et al. Constructions of optimal cyclic (r, δ) locally repairable codes. IEEE Trans Inf Theory,
2018, 64: 2499–2511
155 Hao J, Xia S T, Chen B. On the linear codes with (r, δ)-locality for distributed storage. In: Proceedings of IEEE
International Conference on Communications, Paris, 2017
156 Sasidharan B, Agarwal G K, Kumar P V. Codes with hierarchical locality. In: Proceedings of International Sympo-
sium on Information Theory (ISIT), Hong Kong, 2015. 1257–1261
157 Ballentine S, Barg A. Codes on curves with hierarchical locality. In: Proceedings of IEEE International Symposium
on Information Theory (accepted), Hong Kong, 2018
158 Duminuco A, Biersack E. Hierarchical codes: how to make erasure codes attractive for peer-to-peer storage systems.
In: Proceedings of the 8th International Conference on Peer-to-Peer Computing, Aachen, 2008. 89–98
159 Kamath G M, Prakash N, Lalitha V, et al. Codes with local regeneration and erasure correction. IEEE Trans Inf
Theory, 2014, 60: 4637–4660
160 Gligoroski D, Kralevska K, Jensen R E, et al. Repair duality with locally repairable and locally regenerating codes.
2017. ArXiv:1701.06664
161 Hollmann H D L. On the minimum storage overhead of distributed storage codes with a given repair locality. In:
Proceedings of IEEE International Symposium on Information Theory, Honolulu, 2014. 1041–1045
162 Ahmad I, Wang C C. When locally repairable codes meet regenerating codes — what if some helpers are unavailable.
In: Proceedings of IEEE International Symposium on Information Theory, Hong Kong, 2015. 849–853
163 Krishnan M N, Anantha N R, Kumar P V. Codes with combined locality and regeneration having optimal Rate, dmin
and linear field size. 2018. ArXiv:1804.00564
164 Shanmugam K, Papailiopoulos D S, Dimakis A G, et al. A repair framework for scalar MDS codes. IEEE J Sel Areas
Commun, 2014, 32: 998–1007
165 Guruswami V, Wootters M. Repairing Reed-Solomon codes. IEEE Trans Inf Theory, 2017, 63: 5684–5698
166 MacWilliams F J, Sloane N J A. The theory of error-correcting codes. Elsevier, 1977, 68: 185–186
167 Dau H, Milenkovic O. Optimal repair schemes for some families of full-length Reed-Solomon codes. In: Proceedings
of IEEE International Symposium on Information Theory, Aachen, 2017. 346–350
168 Ye M, Barg A. Explicit constructions of MDS array codes and RS codes with optimal repair bandwidth. In: Pro-
ceedings of IEEE International Symposium on Information Theory, Barcelona, 2016. 1202–1206
169 Chowdhury A, Vardy A. Improved schemes for asymptotically optimal repair of MDS codes. In: Proceedings of the
55th Annual Allerton Conference on Communication, Control, and Computing, Monticello, 2017. 950–957
170 Tamo I, Ye M, Barg A. Optimal repair of reed-solomon codes: achieving the cut-set bound. In: Proceedings of the
58th IEEE Annual Symposium on Foundations of Computer Science, Berkeley, 2017. 216–227
171 Dau S H, Duursma I M, Kiah H M, et al. Repairing Reed-Solomon codes with multiple erasures. 2016.
ArXiv:1612.01361
172 Bartan B, Wootters M. Repairing multiple failures for scalar MDS codes. In: Proceedings of the 55th Annual Allerton
Conference on Communication, Control, and Computing, Monticello, 2017. 1145–1152
173 Ye M, Barg A. Repairing Reed-Solomon codes: universally achieving the cut-set bound for any number of erasures.
2017. ArXiv:1710.07216
174 Luby M. Capacity bounds for distributed storage. 2016. ArXiv:1610.03541
175 Luby M, Padovani R, Richardson T J, et al. Liquid cloud storage. 2017. ArXiv:1705.07983
176 Huang C, Simitci H, Xu Y, et al. Erasure coding in windows azure storage. In: Proceedings of USENIX Annual
Technical Conference, Boston, 2012. 15–26
177 Gantenbein D. A better way to store data. Microsoft research blog. https://www.microsoft.com/en-us/research/
Balaji S B, et al. Sci China Inf Sci October 2018 Vol. 61 100301:45

blog/better-way-store-data/
178 CEPH. Locally repairable erasure code plugin. http://docs.ceph.com/docs/master/rados/operations/erasure-code-
lrc/
179 Rashmi K V, Shah N B, Gu D, et al. A “hitchhiker’s” guide to fast and efficient data reconstruction in erasure-coded
data centers. In: Proceedings of ACM SIGCOMM Conference, Chicago, 2014. 331–342
180 Kralevska K, Gligoroski D, Jensen R E, et al. Hashtag erasure codes: from theory to practice. IEEE Trans Big Data,
2017. doi: 10.1109/TBDATA.2017.2749255
181 Krishnan M N, Prakash N, Lalitha V, et al. Evaluation of codes with inherent double replication for Hadoop. In:
Proceedings of the 6th USENIX Workshop on Hot Topics in Storage and File Systems, Philadelphia, 2014
182 Rashmi K V, Nakkiran P, Wang J, et al. Having your cake and eating it too: jointly optimal erasure codes for I/O,
storage, and network-bandwidth. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies,
Santa Clara, 2015. 81–94
183 Li J, Li B. Beehive: erasure codes for fixing multiple failures in distributed storage systems. IEEE Trans Parallel
Distrib Syst, 2017, 28: 1257–1270
184 Pamies-Juarez L, Blagojevic F, Mateescu R, et al. Opening the chrysalis: on the real repair performance of MSR
codes. In: Proceedings of the 14th USENIX Conference on File and Storage Technologies, Santa Clara, 2016. 81–94
185 Gad E E, Mateescu R, Blagojevic F, et al. Repair-optimal MDS array codes over GF(2). In: Proceedings of IEEE
International Symposium on Information Theory, Istanbul, 2013. 887–891
186 Vajha M, Ramkumar V, Puranik B, et al. Clay codes: moulding MDS codes to yield an MSR code. In: Proceedings
of the 16th USENIX Conference on File and Storage Technologies, Oakland, 2018. 139–154