MIT Open Access Articles
Rateless Coding for Gaussian Channels
Citation: Erez, Uri, Mitchell D. Trott, and Gregory W. Wornell. “Rateless Coding for Gaussian
Channels.” IEEE Transactions on Information Theory 58.2 (2012): 530–547. © Copyright 2012
IEEE
As Published: http://dx.doi.org/10.1109/tit.2011.2173242
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Persistent URL: http://hdl.handle.net/1721.1/73609
Version: Author's final manuscript: final author's manuscript post peer review, without
publisher's formatting or copy editing
Terms of use: Creative Commons Attribution-Noncommercial-Share Alike 3.0
Rateless Coding for Gaussian Channels
arXiv:0708.2575v2 [cs.IT] 3 Jun 2011
Uri Erez, Member, IEEE, Mitchell D. Trott, Fellow, IEEE, Gregory W. Wornell, Fellow, IEEE
Abstract—A rateless code—i.e., a rate-compatible family of
codes—has the property that codewords of the higher rate codes
are prefixes of those of the lower rate ones. A perfect family
of such codes is one in which each of the codes in the family is
capacity-achieving. We show by construction that perfect rateless
codes with low-complexity decoding algorithms exist for additive
white Gaussian noise channels. Our construction involves the
use of layered encoding and successive decoding, together with
repetition using time-varying layer weights. As an illustration of
our framework, we design a practical three-rate code family. We
further construct rich sets of near-perfect rateless codes within
our architecture that require either significantly fewer layers or
lower complexity than their perfect counterparts. Variations of
the basic construction are also developed, including one for time-varying channels in which there is no a priori stochastic model.
Index Terms—Incremental redundancy, rate-compatible punctured codes, hybrid ARQ (H-ARQ), static broadcasting.
I. INTRODUCTION
THE design of effective “rateless” codes has received renewed strong interest in the coding community, motivated
by a number of emerging applications. Such codes have a
long history, and have gone by various names over time,
among them incremental redundancy codes, rate-compatible
punctured codes, hybrid automatic repeat request (ARQ) type
II codes, and static broadcast codes [1]–[10]. This paper
focuses on the design of such codes for average power limited
additive white Gaussian noise (AWGN) channels. Specifically,
we develop techniques for mapping standard good single-rate
codes for the AWGN channel into good rateless codes that are
efficient, practical, and can operate at rates of multiple b/s/Hz.
As such, they represent an attractive alternative to traditional
hybrid ARQ solutions for a variety of wireless and related
applications.
More specifically, we show that the successful techniques
employed to construct low-complexity codes for the standard
AWGN channel, such as those arising out of turbo and low-density parity check (LDPC) codes, can be leveraged to construct rateless codes. In particular, we develop an architecture
in which a single codebook designed to operate at a single
SNR is used in a straightforward manner to build a rateless
codebook that operates at many SNRs.

Manuscript received August 2007, revised December 2010. This work
was supported in part by the National Science Foundation under Grant
No. CCF-0515122, Draper Laboratory, MITRE Corp., and by Hewlett-Packard
Co. through the MIT/HP Alliance. This work was presented in part at the
Information Theory and Applications Workshop, University of California, San
Diego, Feb. 2006, and at the International Symposium on Information Theory,
Seattle, WA, July 2006.
U. Erez is with the Department of Electrical Engineering - Systems, Tel
Aviv University, Ramat Aviv, 69978, Israel (Email: uri@eng.tau.ac.il).
M. D. Trott is with Hewlett-Packard Laboratories, Palo Alto, CA, 94304
(Email: mitchell.trott@hp.com).
G. W. Wornell is with the Department of Electrical Engineering and
Computer Science, Massachusetts Institute of Technology, Cambridge, MA
02139 (Email: gww@mit.edu).

The encoding in our architecture exploits three key ingredients: layering, repetition, and time-varying weighting. By layering, we mean the creation of a code by a linear combination
of subcodes. By repetition, we mean the use of simple linear
redundancy. Finally, by time-varying weighting, we mean that
the (complex) weights in the linear combinations in each copy
are different. We show that with the appropriate combination
of these ingredients, if the base codes are capacity-achieving,
so will be the resulting rateless code.
In addition to achieving capacity in our architecture, we
seek to ensure that if the base code can be decoded with low
complexity, so can the rateless code. This is accomplished
by imposing the constraint that the layered encoding be
successively decodable—i.e., that the layers can be decoded
one at a time, treating as yet undecoded layers as noise.
Hence, our main result is the construction of capacity-achieving, low-complexity rateless codes, i.e., rateless codes
constructed from layering, repetition, and time-varying weighting, that are successively decodable.
The paper is organized as follows. In Section II we put
the problem in context and summarize related work and
approaches. In Section III we introduce the channel and
system model. In Section IV we motivate and illustrate our
construction with a simple special-case example. In Section V
we develop our general construction and show that within it
exist perfect rateless codes for at least some ranges of interest,
and in Section VI we develop and analyze specific instances of
our codes generated numerically. In Section VII, we show that
within the constraints of our construction rateless codes for any
target ceiling and range can be constructed that are arbitrarily
close to perfect in an appropriate sense. In Section VIII we
make some comments on design and implementation issues,
and in Section IX we describe the results of simulations with
our constructions. In Section X, we discuss and develop simple
extensions of our basic construction to time-varying channels.
Finally, Section XI provides some concluding remarks.
II. BACKGROUND
From a purely information theoretic perspective the problem
of rateless transmission is well understood; see Shulman [11]
for a comprehensive treatment. Indeed, for channels having
one maximizing input distribution, a codebook drawn independently and identically distributed (i.i.d.) at random from this
distribution will be good with high probability, when truncated
to (a finite number of) different lengths. Phrased differently,
in such cases random codes are rateless codes.
Constructing good codes that also have computationally
efficient encoders and decoders requires more effort. A remarkable example of such codes for erasure channels are the
recent Raptor codes of Shokrollahi [12], which build on the
LT codes of Luby [13], [14]. An erasure channel model (for
packets) is most appropriate for rateless coding architectures
anchored at the application layer, where there is little or no
access to the physical layer.
Apart from erasure channels, there is a growing interest in
exploiting rateless codes closer to the physical layer, where
AWGN models are more natural; see, e.g., [15] and the
references therein. Much less is known about the limits of
what is possible in this realm, which has been the focus of
traditional hybrid ARQ research.
One line of work involves extending Raptor code constructions to binary-input AWGN channels (among others). In this
area, [16], [17] have shown that no degree distribution allows
such codes to approach capacity simultaneously at different
signal to noise ratios (SNRs). Nevertheless, this does not rule
out the possibility that such codes, when suitably designed,
can be near capacity at multiple SNRs.
A second approach is based on puncturing of low-rate
capacity-approaching binary codes such as turbo and LDPC
codes [3], [8], [9], [15], [18], [19], or extending a higher-rate such code, or using a combination of code puncturing
and extension [20]. When iterative decoding is involved, such
approaches lead to performance tradeoffs at different rates—
improving performance at one rate comes at the expense of
the performance at other rates. While practical codes have
been constructed in this manner [3], [20], it remains to be
understood how close, in principle, one can come to capacity
simultaneously at multiple SNRs, particularly when not all
SNRs are low.
Finally, for the higher rates typically of interest, which
necessitate higher-order (e.g., 16-QAM and larger) constellations, the modulation used with such binary codes becomes
important. In turn, such modulation tends to further complicate
the iterative decoding, imposing additional code design challenges. Constellation rearrangement and other techniques have
been developed to at least partially address such challenges
[21]–[24], but as yet do not offer a complete solution. Alternatively, suitably designed binary codes can be, in principle,
combined with bit-interleaved coded modulation (BICM) for
such applications; for example, [25] explores the design of
Raptor codes for this purpose, and shows by example that the
gaps to capacity need not be too large, at least provided the
rates are not too high.
From the perspective of the broader body of related work
described above, the present paper represents somewhat of
a departure in approach to the design of rateless codes and
hybrid ARQ systems. However, with this departure come
additional complementary insights, as we will develop.
III. CHANNEL AND SYSTEM MODEL
The codes we construct are designed for a complex AWGN
channel
$$\mathbf{y}_m = \beta\,\mathbf{x}_m + \mathbf{z}_m, \qquad m = 1, 2, \ldots, \tag{1}$$
where $\beta$ is a channel gain,$^{1}$ $\mathbf{x}_m$ is a vector of N input symbols,
$\mathbf{y}_m$ is the vector of channel output symbols, and $\mathbf{z}_m$ is a noise
vector of N i.i.d. complex, circularly-symmetric Gaussian
random variables of variance $\sigma^2$, independent across blocks
m = 1, 2, . . . . The channel input is limited to average power
P per symbol. In our model, the channel gain $\beta$ and noise
variance $\sigma^2$ are known a priori at the receiver but not at the
transmitter.$^{2}$

1 More general models for β will be discussed later in the paper.

The block length N has no important role in the analysis
that follows. It is, however, the block length of the base code
used in the rateless construction. As the base code performance
controls the overall code performance, to approach channel
capacity N must be large.
The encoder transmits a message w by generating a sequence of code blocks (incremental redundancy blocks) x1 (w),
x2 (w), . . . . The receiver accumulates sufficiently many received blocks y1 , y2 , . . . to recover w. The channel gain β
may be viewed as a variable parameter in the model; more
incremental redundancy is needed to recover w when β is
small than when β is large.
An important feature of this model is that the receiver
always starts receiving blocks from index m = 1. It does not
receive an arbitrary subsequence of blocks, as might be the
case if one were modeling a broadcast channel that permits
“tuning in” to an ongoing transmission.
We now define some basic terminology and notation. Unless
noted otherwise, all logarithms are base 2, all symbols denote
complex quantities, and all rates are in bits per complex
symbol (channel use), i.e., b/s/Hz. We use $(\cdot)^T$ for transpose
and $(\cdot)^\dagger$ for Hermitian (conjugate transpose) operators. Vectors
and matrices are denoted using bold face, random variables
are denoted using sans-serif fonts, while sample values use
regular (serif) fonts.
We define the ceiling rate of the rateless code as the highest
rate R at which the code can operate, i.e., the effective rate
if the message is decoded from the single received block y1 ;
hence, a message consists of N R information bits. Associated
with this rate is an SNR threshold, which is the minimum SNR
required in the realized channel for decoding to be possible
from this single block. This SNR threshold can equivalently be
expressed in the form of a channel gain threshold. Similarly,
if the message is decoded from m ≥ 2 received blocks,
the corresponding effective code rate is R/m, and there is
a corresponding SNR (and channel gain) threshold. Thus, for
a rateless encoding consisting of M blocks, there is a sequence
of M associated SNR thresholds.
Finally, as in the introduction, we refer to the code out of
which our rateless construction is built as the base code, and
the associated rate of this code as simply the base code rate.
At points in our analysis we assume that a good base code is
used in the code design, i.e., that the base code is capacity-achieving for the AWGN channel, and thus has the associated
properties of such codes. This allows us to distinguish losses
due to the code architecture from those due to the choice of
base code.
2 An equivalent model would be a broadcast channel in which a single
encoding of a common message is being sent to a multiplicity of receivers,
each experiencing a different SNR.
IV. MOTIVATING EXAMPLE
To develop initial insights, we construct a simple low-complexity perfect rateless code that employs two layers of
coding to support a total of two redundancy blocks.
We begin by noting that for the case of a rateless code with
two redundancy blocks the channel gain |β| may be classified
into three intervals based on the number of blocks needed for
decoding. Let α1 and α2 denote the two associated channel
gain thresholds. When |β| ≥ α1 decoding requires only one
block. When α1 > |β| ≥ α2 decoding requires two blocks.
When α2 > |β| decoding is not possible. The interesting cases
occur when the gain is as small as possible to permit decoding.
At these threshold values, for one-block decoding the decoder
sees (aside from an unimportant phase shift)
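$$\mathbf{y}_1 = \alpha_1 \mathbf{x}_1 + \mathbf{z}_1, \tag{2}$$
while for two-block decoding the decoder sees
$$\mathbf{y}_1 = \alpha_2 \mathbf{x}_1 + \mathbf{z}_1, \tag{3}$$
$$\mathbf{y}_2 = \alpha_2 \mathbf{x}_2 + \mathbf{z}_2. \tag{4}$$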
In general, given any particular choice of the ceiling rate
R for the code, we would like the resulting SNR thresholds
to be as low as possible. To determine lower bounds on these
thresholds, let
$$\mathrm{SNR}_m = P\alpha_m^2/\sigma^2, \tag{5}$$
and note that the capacity of the one-block channel is
$$I_1 = \log(1 + \mathrm{SNR}_1), \tag{6}$$
while for the two-block channel the capacity is
$$I_2 = 2\log(1 + \mathrm{SNR}_2) \tag{7}$$
bits per channel use. A “channel use” in the second case
consists of a pair of transmitted symbols, one from each block.
In turn, since we deliver the same message to the receiver
for both the one- and two-block cases, the smallest values of
α1 and α2 we can hope to achieve occur when
$$I_1 = I_2 = R. \tag{8}$$
Thus, we say that the code is perfect if it is decodable at these
limits.
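
As a small numerical illustration of condition (8), the sketch below solves m log(1 + SNR) = R for the threshold SNR with m = 1 and m = 2 blocks. The function name `perfect_snr_threshold` and the ceiling rate R = 4 b/s/Hz are assumptions chosen only for this example; they do not appear in the paper.

```python
import math

def perfect_snr_threshold(R: float, m: int) -> float:
    """Smallest SNR for which m blocks can carry R bits: m * log2(1 + SNR) = R."""
    return 2.0 ** (R / m) - 1.0

R = 4.0  # illustrative ceiling rate in b/s/Hz (assumed value, not from the paper)
for m in (1, 2):
    snr = perfect_snr_threshold(R, m)
    print(f"m = {m}: SNR_m = {snr:.3f} ({10 * math.log10(snr):.2f} dB)")
```

The gap between the two thresholds is what the second redundancy block buys: the same message becomes decodable at a lower SNR.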
We next impose that the construction be a layered code, and
that the layers be successively decodable.
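Layering means that we require the transmitted blocks to
be linear combinations of two base codewords $\mathbf{c}_1 \in \mathcal{C}_1$ and
$\mathbf{c}_2 \in \mathcal{C}_2$:$^{3}$
$$\mathbf{x}_1 = g_{11}\mathbf{c}_1 + g_{12}\mathbf{c}_2, \tag{9}$$
$$\mathbf{x}_2 = g_{21}\mathbf{c}_1 + g_{22}\mathbf{c}_2. \tag{10}$$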
Base codebook C1 has rate R1 and base codebook C2 has
rate R2 , where R1 + R2 = R, so that total rate of the
two codebooks equals the ceiling rate. We assume for this
example that both codebooks are capacity-achieving, so that
the codeword components are i.i.d. Gaussian. Furthermore, for
convenience, we scale the codebooks to have unit power, so
the power constraint instead enters through the constraints
$$|g_{11}|^2 + |g_{12}|^2 = P, \tag{11}$$
$$|g_{21}|^2 + |g_{22}|^2 = P. \tag{12}$$
Finally, the successive decoding constraint in our system
means that the layers are decoded one at a time to keep
complexity low (on order of the base code complexity).
Specifically, the decoder first recovers c2 while treating c1
as additive Gaussian noise, then recovers c1 using c2 as side
information.
We now show that perfect rateless codes are possible within
these constraints by constructing a matrix G = [gml ] so that
the resulting code satisfies (8). Finding an admissible G is
simply a matter of some algebra: in the one-block case we
need
$$R_1 = I_{\alpha_1}(c_1; y_1 \mid c_2) \tag{13}$$
$$R_2 = I_{\alpha_1}(c_2; y_1), \tag{14}$$
and in the two-block case we need
$$R_1 = I_{\alpha_2}(c_1; y_1, y_2 \mid c_2) \tag{15}$$
$$R_2 = I_{\alpha_2}(c_2; y_1, y_2). \tag{16}$$
The subscripts α1 and α2 are a reminder that these mutual
information expressions depend on the channel gain, and the
scalar variables denote individual components from the input
and output vectors.
While evaluating (13)–(15) is straightforward, calculating
the more complicated (16), which corresponds to decoding
c2 in the two-block case, can be circumvented by a little
additional insight. In particular, while c1 causes the effective noise in the two blocks to be correlated, observe that
a capacity-achieving code requires x1 and x2 to be i.i.d.
Gaussian. As c1 and c2 are Gaussian, independent, and equal
in power by assumption, this occurs only if the rows of G
are orthogonal. Moreover, the power constraint P ensures that
these orthogonal rows have the same norm, which implies that
G is a scaled unitary matrix.
The unitary constraint has an immediate important consequence: the per-layer rates R1 and R2 must be equal, i.e.,
$$R_1 = R_2 = R/2. \tag{17}$$
This occurs because the two-block case decomposes into two
parallel orthogonal channels of equal SNR. We see in the
next section that a comparable result holds for any number
of layers.
From the definitions of SNR1 and I1 [cf. (5) and (6)], and
the equality I1 = R (8), we find that
$$P\alpha_1^2/\sigma^2 = 2^{R} - 1. \tag{18}$$
Also, from (13) and (17), we find that
$$|g_{11}|^2\alpha_1^2/\sigma^2 = 2^{R/2} - 1. \tag{19}$$
Combining (18) and (19) yields
$$|g_{11}|^2 = P\,\frac{2^{R/2} - 1}{2^{R} - 1} = \frac{P}{2^{R/2} + 1}. \tag{20}$$

3 In practice, the codebooks C1 and C2 should not be identical, though they
can for example be derived from a common base codebook via scrambling.
This point is discussed further in Section VIII.

The constraint that G be a scaled unitary matrix, together with
the power constraint P, implies
$$|g_{12}|^2 = P - |g_{11}|^2, \tag{21}$$
$$|g_{21}|^2 = P - |g_{11}|^2, \tag{22}$$
$$|g_{22}|^2 = |g_{11}|^2, \tag{23}$$
which completely determines the squared modulus of the
entries of G.
Now, the mutual information expressions (13)–(16) are
unaffected by applying a common complex phase shift to any
row or column of G, so without loss of generality we take the
first row and first column of G to be real and positive. For
G to be a scaled unitary matrix, g22 must then be real and
negative. We have thus shown that, if a solution to (13)–(16)
exists, it must have the form
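$$\mathbf{G} = \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} = \sqrt{\frac{P}{2^{R/2} + 1}} \begin{bmatrix} 1 & 2^{R/4} \\ 2^{R/4} & -1 \end{bmatrix}. \tag{24}$$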
Conversely, it is straightforward to verify that (13)–(16) are
satisfied with this selection. Thus (24) characterizes the (essentially) unique solution G.4
In summary, we have constructed a 2-layer, 2-block perfect
rateless code from linear combinations of codewords drawn
from equal-rate codebooks. Moreover, decoding can proceed
one layer at a time with no loss in performance, provided
the decoder is cognizant of the correlated noise caused by
undecoded layers. In the sequel we consider the generalization
of our construction to an arbitrary number of layers and
redundancy blocks.
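
The claim that (24) satisfies (13)–(16) can be checked numerically. The following is a minimal sketch, assuming Gaussian codebooks so that each mutual information reduces to a log-determinant; the helper names (`two_layer_gain_matrix`, `log2det`) and the values R = 4, P = 1, σ² = 1 are illustrative choices, not taken from the paper. The rate of layer 2 is obtained by the chain rule as the joint mutual information minus the rate of layer 1.

```python
import numpy as np

def two_layer_gain_matrix(R: float, P: float) -> np.ndarray:
    """Gain matrix of (24) for ceiling rate R and per-block power P."""
    scale = np.sqrt(P / (2.0 ** (R / 2) + 1.0))
    t = 2.0 ** (R / 4)
    return scale * np.array([[1.0, t], [t, -1.0]])

def log2det(A: np.ndarray) -> float:
    """log2 of the determinant of a positive-definite matrix."""
    _, logdet = np.linalg.slogdet(A)
    return logdet / np.log(2.0)

R, P, sigma2 = 4.0, 1.0, 1.0                  # illustrative values (assumed)
G = two_layer_gain_matrix(R, P)
a1 = (2.0 ** R - 1.0) * sigma2 / P            # alpha_1^2, from (18)
a2 = (2.0 ** (R / 2) - 1.0) * sigma2 / P      # alpha_2^2, from I_2 = R in (7), (8)

# One block: (13) directly, (14) via the chain rule.
r1_one = np.log2(1.0 + a1 * G[0, 0] ** 2 / sigma2)
joint_one = np.log2(1.0 + a1 * P / sigma2)
rates_one_block = (r1_one, joint_one - r1_one)

# Two blocks: (15) directly, (16) via the chain rule.
eye2 = np.eye(2)
r1_two = log2det(eye2 + (a2 / sigma2) * np.outer(G[:, 0], G[:, 0].conj()))
joint_two = log2det(eye2 + (a2 / sigma2) * G @ G.conj().T)
rates_two_blocks = (r1_two, joint_two - r1_two)

print(rates_one_block, rates_two_blocks)      # every entry should equal R/2
```

Under these assumptions all four rates come out equal to R/2, which is the equal-rate property (17).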
V. RATELESS CODES WITH LAYERED ENCODING AND SUCCESSIVE DECODING
The rateless code construction we pursue is as follows
[26]. First, we choose the range (maximum number M of
redundancy blocks), the ceiling rate R, the number of layers
L, and finally the associated codebooks C1 , . . . , CL . We will
see presently that the L base codebooks must have equal rate
R/L when constructing perfect rateless codes with M = L,
and in any case using equal rates has the advantage of allowing
the codebooks for each layer to be derived from a single base
code.
Given codewords cl ∈ Cl , l = 1, . . . , L, the redundancy
blocks x1 , . . . , xM take the form
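$$\begin{bmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_M \end{bmatrix} = \mathbf{G} \begin{bmatrix} \mathbf{c}_1 \\ \vdots \\ \mathbf{c}_L \end{bmatrix}, \tag{25}$$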
where G is an M × L matrix of complex gains and where
xm for each m and cl for each l are row vectors of length
N . The power constraint enters by limiting the rows of G
to have squared norm P and by normalizing the codebooks
to have unit power. With this notation, the elements of the
mth row of G are the weights used in constructing the mth
redundancy block from the L codewords.$^{5}$ In the sequel we
use $g_{ml}$ to denote the (m, l)th entry of G and $\mathbf{G}_{m,l}$ to denote
the upper-left m × l submatrix of G.$^{6}$

4 Interestingly, the symmetry in (24) implies that the construction remains
perfect even if the two redundancy blocks are received in swapped order. This
is not true of our other constructions.

[Figure 1: three redundancy blocks arranged along a horizontal time axis, each drawn as a vertical stack of four layers; block m contains the weighted codewords g_m1 c1, g_m2 c2, g_m3 c3, g_m4 c4 for m = 1, 2, 3.]
Fig. 1. A rateless code construction with 4 layers and 3 blocks of redundancy.
Each block is a weighted linear combination of the (N-element) base
codewords c1, c2, . . . , c4, where $g_{ml}$, the (m, l)th element of G, denotes
the weight for layer l of block m. In this illustration, the thickness of a layer
is a graphical depiction of the magnitude of its associated gain (power).

An example of this layered rateless code structure is depicted in Fig. 1. Each redundancy block contains a repetition
of the codewords used in the earlier blocks, but with a different
complex scaling factor. The code structure may therefore be
viewed as a hybrid of layering and repetition. Note that, absent
assumptions on the decoder, the order of the layers is not
important.
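
The layered repetition structure (25), followed by the channel (1), amounts to one matrix product plus noise. The sketch below is only an illustration under stated assumptions: the gain matrix is a random placeholder with rows rescaled to squared norm P (not an optimized design), the "codewords" are i.i.d. unit-power Gaussian stand-ins for real base codewords, and the sizes M, L, N and the channel parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_gaussian(shape, variance, rng):
    """Circularly-symmetric complex Gaussian samples with the given variance."""
    std = np.sqrt(variance / 2.0)
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

M, L, N, P = 3, 4, 20000, 2.0      # blocks, layers, block length, power (assumed)
beta, sigma2 = 0.8, 1.0            # channel gain and noise variance (assumed)

# Hypothetical gain matrix: random entries, each row rescaled to squared norm P.
G = complex_gaussian((M, L), 1.0, rng)
G *= np.sqrt(P) / np.linalg.norm(G, axis=1, keepdims=True)

# Stand-ins for the base codewords c_1, ..., c_L (rows), normalized to unit power.
C = complex_gaussian((L, N), 1.0, rng)

X = G @ C                                               # redundancy blocks, (25)
Y = beta * X + complex_gaussian((M, N), sigma2, rng)    # channel outputs, (1)

block_power = np.mean(np.abs(X) ** 2, axis=1)           # each should be close to P
print(block_power)
```

Row m of `X` is the mth redundancy block, and column l of `G` plays the role of the spreading signature of layer l.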
In addition to the layered code structure, there is additional
decoding structure, namely that the layered code be successively decodable. Specifically, to recover the message, we first
decode $\mathbf{c}_L$, treating $\mathbf{G}[\mathbf{c}_1^T \cdots \mathbf{c}_{L-1}^T]^T$ as (colored) noise, then
decode $\mathbf{c}_{L-1}$, treating $\mathbf{G}[\mathbf{c}_1^T \cdots \mathbf{c}_{L-2}^T]^T$ as noise, and so on.
Thus, our aim is to select G so that capacity is achieved
for any number m = 1, . . . , M of redundancy blocks subject
to the successive decoding constraint. Minimum mean-square
error (MMSE) combining of the available redundancy blocks
conveniently exploits the repetition structure in the code when
decoding each layer.
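
To make the role of MMSE combining concrete, the following sketch computes the rate each layer can support under successive decoding, assuming Gaussian codebooks. For layer l, the undecoded layers 1, . . . , l − 1 are treated as colored Gaussian noise, and the post-combining SINR gives the layer rate. The function name `layer_rates`, the random gain matrix, and the numerical values are assumptions made for illustration. The final lines check that the per-layer rates sum to the joint mutual information, as the chain rule requires.

```python
import numpy as np

def layer_rates(G: np.ndarray, beta2: float, sigma2: float) -> np.ndarray:
    """Per-layer rates (bits/symbol) when layers L, L-1, ..., 1 are decoded in turn.

    Layer l is recovered by MMSE combining of the m received blocks, with
    layers 1..l-1 treated as colored noise and layers l+1..L already removed.
    """
    m, L = G.shape
    rates = np.zeros(L)
    for l in range(L, 0, -1):
        g = G[:, l - 1]                    # signature (column) of layer l
        Gi = G[:, : l - 1]                 # signatures of the undecoded layers
        K = sigma2 * np.eye(m) + beta2 * Gi @ Gi.conj().T   # interference plus noise
        sinr = beta2 * np.real(g.conj() @ np.linalg.solve(K, g))
        rates[l - 1] = np.log2(1.0 + sinr)
    return rates

rng = np.random.default_rng(1)
G = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))  # hypothetical gains
beta2, sigma2 = 0.5, 1.0                                            # assumed values
rates = layer_rates(G, beta2, sigma2)

# Chain rule: the per-layer rates add up to log2 det(I + (beta^2/sigma^2) G G^dagger).
_, logdet = np.linalg.slogdet(np.eye(3) + (beta2 / sigma2) * G @ G.conj().T)
total = logdet / np.log(2.0)
print(rates, rates.sum(), total)
```

Because the sum is fixed by the joint mutual information, the design freedom in G only affects how that total is split across layers and across the number of received blocks.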
Both the layered repetition structure (25) and the successive decoding constraint impact the degree to which we
can approach a perfect code. Accordingly, we examine the
consequences of each in turn.
We begin by examining the implications of the layered
repetition structure (25). When the number of layers L is at
least as large as the number of redundancy blocks M , such
layering does not limit code performance. But when L < M ,
it does. In particular, whenever the number m of redundancy
blocks required by the realized channel exceeds L, there is
necessarily a gap between the code performance and capacity.
To see this, observe that (25) with (1), restricted to the first
m blocks, defines a linear L-input m-output AWGN channel,
the capacity of which is at most
$$I'_m = \begin{cases} m \log\left(1 + \dfrac{|\beta|^2 P}{\sigma^2}\right) & \text{for } m \le L, \\[2ex] L \log\left(1 + \dfrac{m}{L}\,\dfrac{|\beta|^2 P}{\sigma^2}\right) & \text{for } m > L. \end{cases} \tag{26}$$

5 The lth column of G also has a useful interpretation. In particular, one can
interpret the construction as equivalent to a “virtual” code-division multiple-access (CDMA) system with L users, each corresponding to one layer of the
rateless code. With this interpretation, the signature (spreading) sequence for
the lth virtual user is the lth column of G.
6 Where necessary, we adopt the convention that $\mathbf{G}_{m,0} = 0$.

TABLE I
Losses α′m/αm in dB due to layered structure imposed on a rateless code of ceiling rate R = 5 b/s/Hz, as a function of the number of layers L and redundancy blocks m.

Only for m ≤ L does this match the capacity of a general
m-block AWGN channel, viz.,
$$I_m = m \log\left(1 + \frac{|\beta|^2 P}{\sigma^2}\right). \tag{27}$$
Ultimately, for m > L the problem is that an L-fold linear
combination cannot fill all degrees of freedom afforded by the
m-block channel.
An additional penalty occurs when we combine the layered
repetition structure with the requirement that the code be rateless. Specifically, for M > L, there is no choice of gain matrix
G that permits (26) to be met with equality simultaneously for
all m = 1, . . . , M . A necessary and sufficient condition for
equality is that the rows of Gm,L be orthogonal for m ≤ L
and the columns of Gm,L be orthogonal for m > L. This
follows because reaching (26) for m ≤ L requires that the
linear combination of L codebooks create an i.i.d. Gaussian
sequence. In contrast, reaching (26) for m > L requires that
the linear combination inject the L codebooks into orthogonal
subspaces, so that a fraction L/m of the available degrees
of freedom are occupied by i.i.d. Gaussians (the rest being
empty).
Unfortunately, the columns of Gm,L cannot be orthogonal
simultaneously for all m > L; orthogonal m-dimensional
vectors (with nonzero entries) cannot remain orthogonal when
truncated to their first m−1 dimensions. Thus (26) determines
only a lower bound on the loss due to the layering structure
(25). Fortunately, the additional loss encountered in practice
turns out to be quite small, as we demonstrate numerically as
part of the next section.
When M = L, the orthogonality requirement forces G to be
a scaled unitary matrix. Upon receiving the final redundancy
block m = M , the problem decomposes into L parallel
channels with equal SNR, which in turn implies that the rate
of each layer must equal R/L.
A lower bound on loss incurred by the use of insufficiently
many layers is readily obtained by comparing (26) and (27).
Given a choice of ceiling rate R for the rateless code,
(26) implies that for rateless codes constructed using linear
combinations of L base codes, the smallest channel gain α′m
for which it’s possible to decode with m blocks is
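$$\alpha_m'^2 = \begin{cases} \left(2^{R/m} - 1\right) \dfrac{\sigma^2}{P} & \text{for } m \le L, \\[2ex] \left(2^{R/L} - 1\right) \dfrac{L}{m}\,\dfrac{\sigma^2}{P} & \text{for } m > L. \end{cases} \tag{28}$$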
By comparison, (27) implies that without the layering constraint the corresponding channel gain thresholds αm are
$$\alpha_m^2 = \left(2^{R/m} - 1\right) \frac{\sigma^2}{P}. \tag{29}$$
The resulting performance loss α′m /αm caused by the
layered structure as calculated from (28) and (29) is shown
in dB in Table I for a target ceiling rate of R = 5 bits/symbol.
For example, if an application requires M = 10 redundancy
blocks, a 3-layer code has a loss of less than 2 dB at m = 10,
while a 5-layer code has a loss of less than 0.82 dB at m = 10.

TABLE I (loss α′m/αm in dB for ceiling rate R = 5 b/s/Hz)
                      Redundancy blocks m
          2     3     4     5     6     7     8     9    10
L = 1   5.22  6.77  7.50  7.92  8.20  8.40  8.54  8.65  8.74
L = 2   0.00  1.55  2.28  2.70  2.98  3.17  3.32  3.43  3.52
L = 3   0.00  0.00  0.73  1.16  1.43  1.63  1.77  1.88  1.97
L = 4   0.00  0.00  0.00  0.42  0.70  0.90  1.04  1.15  1.24
L = 5   0.00  0.00  0.00  0.00  0.28  0.47  0.62  0.73  0.82
L = 6   0.00  0.00  0.00  0.00  0.00  0.20  0.34  0.45  0.54
L = 7   0.00  0.00  0.00  0.00  0.00  0.00  0.14  0.26  0.35
L = 8   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.11  0.20
L = 9   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.09

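
The entries of Table I follow directly from (28) and (29). A minimal sketch of that calculation is given below; the function name `layering_loss_db` is an assumption for this example, and the loss is reported as 10 log10 of the ratio of squared gain thresholds, which is how the dB values in the table appear to be defined.

```python
import math

def layering_loss_db(R: float, L: int, m: int) -> float:
    """Loss 10*log10(alpha'_m^2 / alpha_m^2) in dB, computed from (28) and (29)."""
    unconstrained = 2.0 ** (R / m) - 1.0             # (29), in units of sigma^2 / P
    if m <= L:
        layered = unconstrained                      # (28), case m <= L
    else:
        layered = (2.0 ** (R / L) - 1.0) * L / m     # (28), case m > L
    return 10.0 * math.log10(layered / unconstrained)

R = 5.0   # ceiling rate used in Table I
for L in range(1, 10):
    row = " ".join(f"{layering_loss_db(R, L, m):5.2f}" for m in range(2, 11))
    print(f"L = {L}: {row}")
```

The common factor σ²/P cancels in the ratio, so the loss depends only on R, L, and m.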
As Table I reflects—and as can be readily verified
analytically—for a fixed number of layers L and a fixed base
code rate R/L, the performance loss α′m /αm attributable to
the imposition of layered encoding grows monotonically with
the number of blocks m, approaching the limit
$$\frac{\alpha_\infty'^2}{\alpha_\infty^2} = \frac{2^{R/L} - 1}{(R/L) \ln 2}. \tag{30}$$
Thus, in applications where the number of incremental redundancy blocks is very large, it’s advantageous to keep the base
code rate small. For example, with a base code rate of 1/2
bit per complex symbol (implemented, for example, using a
rate-1/4 binary code) the loss due to layering is at most 0.78
dB, while with a base code rate of 1 bit per complex symbol
the loss is at most 1.6 dB.
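
The limit (30) can be obtained from (28) and (29) in two steps. The short derivation below is a worked illustration added here, using only a first-order expansion of the exponential for large m.

```latex
\begin{align*}
\frac{\alpha_m'^2}{\alpha_m^2}
  &= \frac{\left(2^{R/L}-1\right)\dfrac{L}{m}}{2^{R/m}-1}, \qquad m > L, \\
2^{R/m}-1
  &= \frac{R\ln 2}{m} + O\!\left(m^{-2}\right) \qquad (m \to \infty), \\
\lim_{m\to\infty}\frac{\alpha_m'^2}{\alpha_m^2}
  &= \frac{\left(2^{R/L}-1\right)L}{R\ln 2}
   = \frac{2^{R/L}-1}{(R/L)\ln 2}.
\end{align*}
```

For R/L = 1/2 the limit evaluates to roughly 1.195, or about 0.77 dB, and for R/L = 1 it evaluates to 1/ln 2 ≈ 1.443, or about 1.59 dB, consistent with the two bounds quoted above.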
We now determine the additional impact the successive
decoding requirement has on our ability to approach capacity,
and more generally what constraints it imposes on G. We
continue to incorporate the power constraint by taking the rate-R/L codebooks C1, . . . , CL to have unit power and the rows
of G to have squared norm P. Since our aim is to employ
codebooks designed for (non-fading) Gaussian channels, we
make the further assumption that the codebooks have constant
power, i.e., that they satisfy the per-symbol energy constraint
$E\left[|c_{l,n}(w)|^2\right] \le 1$ for all layers l and time indices n =
1, . . . , N, where the expectation is taken over equiprobable
messages $w \in \{1, \ldots, 2^{NR/L}\}$. Additional constraints on G
now follow from the requirement that the mutual information
accumulated through any block m at each layer l be large
enough to permit successive decoding.
Concretely, suppose we have received blocks 1, . . . , m. Let
the optimal threshold channel gain αm be defined as in (29).
Suppose further that layers l+1, . . . , L have been successfully
decoded, and define
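$$\begin{bmatrix} \mathbf{v}_1 \\ \vdots \\ \mathbf{v}_m \end{bmatrix} = \beta\,\mathbf{G}_{m,l} \begin{bmatrix} \mathbf{c}_1 \\ \vdots \\ \mathbf{c}_l \end{bmatrix} + \begin{bmatrix} \mathbf{z}_1 \\ \vdots \\ \mathbf{z}_m \end{bmatrix} \tag{31}$$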
as the received vectors without the contribution from layers
l + 1, . . . , L.
Then, following standard arguments, with independent
equiprobable messages for each layer, the probability of decoding error for layers 1, . . . , l can be made vanishingly small
with increasing block length only if the mutual information
between input and output is at least as large as the combined
rate lR/L of the codes C1 , . . . , Cl . That is, when β equals the
optimal threshold gain αm , successive decoding requires
\begin{align}
lR/L &\le (1/N)\, I(\mathbf{c}_1, \ldots, \mathbf{c}_l;\ \mathbf{y}_1, \ldots, \mathbf{y}_m \mid \mathbf{c}_{l+1}^{L}) \tag{32} \\
&= (1/N)\, I(\mathbf{c}_1, \ldots, \mathbf{c}_l;\ \mathbf{v}_1, \ldots, \mathbf{v}_m) \tag{33} \\
&= (1/N)\left(H(\mathbf{v}_1, \ldots, \mathbf{v}_m) - H(\mathbf{v}_1, \ldots, \mathbf{v}_m \mid \mathbf{c}_1, \ldots, \mathbf{c}_l)\right) \tag{34} \\
&\le \log\det\left(\sigma^2 \mathbf{I} + \alpha_m^2 \mathbf{G}_{m,l}\mathbf{G}_{m,l}^\dagger\right) - \log\det\left(\sigma^2 \mathbf{I}\right) \tag{35} \\
&= \log\det\left(\mathbf{I} + (\alpha_m^2/\sigma^2)\,\mathbf{G}_{m,l}\mathbf{G}_{m,l}^\dagger\right), \tag{36}
\end{align}
where I is an appropriately sized (m × m) identity matrix.
The inequality (35) relies on the assumption that the codebooks have constant power, and it holds with equality if the
components of $\mathbf{G}_{m,l}[\mathbf{c}_1^T, \ldots, \mathbf{c}_l^T]^T$ are jointly Gaussian, which
by Cramér’s theorem requires the components of c1, . . . , cl to
be jointly Gaussian.

Our ability to choose G to either exactly or approximately
satisfy (36) for all l = 1, . . . , L and each m = 1, . . . , M
determines the degree to which we can approach capacity. It
is straightforward to see that there is no slack in the problem;
(36) can be satisfied simultaneously for all l and m only if the
inequalities are all met with equality. Beyond this observation,
however, the conditions under which (36) may be satisfied are
not obvious.

Characterizing the set of solutions for G when L = M = 2
was done in Section IV (see (24)). Characterizing the set of
solutions when L = M = 3 requires more work. It is shown
in Appendix A that, when it exists, a solution G must have
the form
$$\mathbf{G} = \sqrt{x - 1} \cdot \begin{bmatrix} \sqrt{x+1} & \sqrt{x^2(x+1)} & \sqrt{x^4(x+1)} \\ \sqrt{x^3(x+1)} & e^{j\theta_1}\sqrt{x^5+1} & e^{j\theta_2}\sqrt{x(x+1)} \\ \sqrt{x^2(x^3+1)} & e^{j\theta_3}\sqrt{x(x^3+1)} & e^{j\theta_4}\sqrt{x^3+1} \end{bmatrix} \tag{37}$$
where $x = 2^{R/6}$ and where $e^{j\theta_i}$, i = 1, . . . , 4 are complex
phasors. The desired phasors (or a proof of nonexistence)
may be determined from the requirement that G be a scaled
unitary matrix. Using this observation, it is shown in Appendix A that a solution G exists and is unique (up to complex
conjugate) for all $R \le 3(\log(7 + 3\sqrt{5}) - 1) \approx 8.33$ bits per
complex symbol, but no choice of phasors results in a unitary
G for larger values of R.

For example, using (37) with R = 6 bits/symbol we find
that:
$$P = 63, \quad \alpha_1 = 1, \quad \alpha_2 = \sqrt{1/9}, \quad \alpha_3 = \sqrt{1/21},$$
$$\mathbf{G} = \begin{bmatrix} \sqrt{3} & \sqrt{12} & \sqrt{48} \\ \sqrt{24} & \sqrt{33}\,e^{j\theta_1} & \sqrt{6}\,e^{j\theta_2} \\ \sqrt{36} & \sqrt{18}\,e^{j\theta_3} & \sqrt{9}\,e^{j\theta_4} \end{bmatrix},$$
where
$$\theta_1 = \arccos\frac{-5}{2\sqrt{22}}, \qquad \theta_2 = 2\pi - \arctan 3\sqrt{7},$$
$$\theta_3 = -\arctan\sqrt{7}, \qquad \theta_4 = \pi - \arctan\left(\sqrt{7}/3\right).$$

For M > 3 the algebra becomes daunting, though we
conjecture that exact solutions and hence perfect rateless codes
exist for all L = M, for at least some nontrivial values of R.$^{7}$

7 In recent calculations following the above approach, Ayal Hitron at Tel
Aviv University has determined that exact solutions exist in the M = L = 4
case for rates in the range R ≤ 10.549757.

For L < M perfect constructions cannot exist. As developed earlier in this section, even if we replace the optimum
threshold channel gains $\alpha_m$ defined via (29) with suboptimal
gains $\alpha'_m$ of (28) determined by the layering bound (26), it is
still not possible to satisfy (36). However, one can come close.
While the associated analysis is nontrivial, such behavior is
easily demonstrated numerically, which we show as part of
the next section.

VI. NUMERICAL EXAMPLES
In this section, we consider numerical constructions both
for the case L = M and for the case L < M. Specifically,
we have experimented with numerical optimization methods
to satisfy (36) for up to M = 10 redundancy blocks, using the
threshold channel gains $\alpha'_m$ defined via (28) in place of those
defined via (29) as appropriate when the number of blocks M
exceeds the number of layers L.

For the case L = M, for each of M = 2, 3, . . . , 10, we
found constructions with R/L = 2 bits/symbol that come
within 0.1% of satisfying (36) subject to (29), and often the
solutions come within 0.01%. This provides powerful evidence
that perfect rateless codes exist for a wide range of parameter
choices.

For the case L < M, despite the fact that there do not
exist perfect codes, in most cases of interest one can come
remarkably close to satisfying (36) subject to (28). Evidently
mutual information for Gaussian channels is quite insensitive
to modest deviations of the noise covariance away from a
scaled identity matrix.

As an example, Table II shows the rate shortfall in meeting
the mutual information constraints (36) for an L = 3 layer
code with M = 10 redundancy blocks, and a target ceiling
rate R = 5. The associated complex gain matrix is
$$\mathbf{G} = \begin{bmatrix}
1.4747 & 2.6277 & 4.6819 \\
3.5075 & 3.7794\,e^{j2.0510} & 2.1009\,e^{-j1.9486} \\
4.0648 & 3.1298\,e^{-j0.9531} & 2.1637\,e^{j2.5732} \\
3.2146 & 3.1322\,e^{j3.0765} & 3.2949\,e^{j0.9132} \\
3.2146 & 3.3328\,e^{-j1.6547} & 3.0918\,e^{-j1.4248} \\
3.2146 & 3.1049\,e^{j0.9409} & 3.3206\,e^{j2.8982} \\
3.2146 & 3.3248\,e^{j1.2506} & 3.1004\,e^{-j0.2027} \\
3.2146 & 3.0980\,e^{-j1.4196} & 3.3270\,e^{j1.9403} \\
3.2146 & 3.2880\,e^{-j2.9449} & 3.1394\,e^{-j1.9243} \\
3.2146 & 3.1795\,e^{j0.7839} & 3.2492\,e^{j0.3413}
\end{bmatrix}.$$
The worst case loss is less than 1.5%; this example is typical
in its efficiency.
7
TABLE II
PERCENT SHORTFALL IN RATE FOR A NUMERICALLY-OPTIMIZED RATELESS CODE WITH M = 10 BLOCKS, L = 3 LAYERS, AND A CEILING RATE OF R = 5 B/S/HZ.

| Redundancy blocks m | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| l = 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| l = 2 | 0.00 | 0.28 | 1.23 | 1.46 | 1.39 | 0.44 | 0.59 | 0.48 | 0.16 | 0.23 |
| l = 3 | 0.00 | 0.29 | 1.23 | 1.48 | 1.40 | 0.43 | 0.54 | 0.51 | 0.15 | 0.23 |
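As a quick sanity check on the gain matrix quoted for this example, the following sketch verifies that every row (i.e., every redundancy block) carries the same power. It assumes the normalization $\alpha_1^2 = \sigma^2 = 1$, under which P = 2^R − 1 = 31 for R = 5; the variable names are ours, and the small deviations simply reflect the four-digit rounding of the printed entries.

```python
import cmath

# (magnitude, phase in radians) of each entry of the 10 x 3 gain matrix
G_POLAR = [
    [(1.4747, 0.0), (2.6277, 0.0), (4.6819, 0.0)],
    [(3.5075, 0.0), (3.7794, 2.0510), (2.1009, -1.9486)],
    [(4.0648, 0.0), (3.1298, -0.9531), (2.1637, 2.5732)],
    [(3.2146, 0.0), (3.1322, 3.0765), (3.2949, 0.9132)],
    [(3.2146, 0.0), (3.3328, -1.6547), (3.0918, -1.4248)],
    [(3.2146, 0.0), (3.1049, 0.9409), (3.3206, 2.8982)],
    [(3.2146, 0.0), (3.3248, 1.2506), (3.1004, -0.2027)],
    [(3.2146, 0.0), (3.0980, -1.4196), (3.3270, 1.9403)],
    [(3.2146, 0.0), (3.2880, -2.9449), (3.1394, -1.9243)],
    [(3.2146, 0.0), (3.1795, 0.7839), (3.2492, 0.3413)],
]

# Complex gain matrix G, one row per redundancy block, one column per layer
G = [[cmath.rect(mag, phase) for mag, phase in row] for row in G_POLAR]

R = 5
P_target = 2 ** R - 1  # assumes alpha_1^2 = sigma^2 = 1

# Squared norm of each row; the power constraint asks for P in every block
row_power = [sum(abs(g) ** 2 for g in row) for row in G]
for m, p in enumerate(row_power, start=1):
    print(f"block {m:2d}: row power = {p:.4f} (target {P_target})")
```

Note that this only checks the power constraint; reproducing the shortfalls of Table II would additionally require the threshold gains defined via (28).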
The total loss of the designed code relative to a perfect
rateless code is, of course, the sum of the successive decoding
and layered encoding constraint losses. Hence, the losses
in Tables I and II are cumulative. As a practical matter,
however, when L < M , the layered encoding constraint loss
dwarfs that due to the successive decoding constraint: the
overall performance loss arises almost entirely from the code’s
inability to occupy all available degrees of freedom in the
channel. Thus, this overall loss can be estimated quite closely
by comparing (27) and (26). Indeed this is reflected in our
example, where the loss of Table I dominates over that of
Table II.
VII. EXISTENCE OF NEAR-PERFECT RATELESS CODES
While the closed-form construction of perfect rateless codes
subject to layered encoding and successive decoding becomes
more challenging with increasing code range M , the construction of codes that are at least nearly perfect is comparatively
straightforward. In the preceding section, we demonstrated
this numerically. In this section, we prove this analytically. In
particular, we construct rateless codes that are arbitrarily close
to perfect in an appropriate sense, provided enough layers are
used. We term these near-perfect rateless codes. The code
construction we present is applicable to arbitrarily large M
and also allows for simpler decoding than that required in the
preceding development.
The near-perfect codes we develop in this section [27] are
closely related to those in Section V. However, there are
a few differences. We retain the layered construction, but
instead of using a single complex weight for the codeword
at each layer (and block), we use a single weight magnitude
for each codeword and vary the phase of the weight from
symbol to symbol within the codeword in each layer (and
block). Moreover, in our analysis, the phases are chosen
randomly, corresponding to evaluating an ensemble of codes.
The realizations of these random phases are known to and
exploited by the associated decoders. As with the usual random
coding development, we establish the existence of good codes
in the ensemble by showing that the average performance is
good.
These modifications, and in particular the additional degrees
of freedom in the code design, simplify the analysis—at
the expense of some slightly more cumbersome notation.
Additionally, because of these differences, the particular gain
matrices in this section cannot be easily compared with those
of Section V, but we do not require such comparisons.
A. Encoding
As discussed above, in our approach to perfect constructions
in Section V, we made each redundancy block a linear
combination of the base codewords, where the weights are
the corresponding row of the combining matrix G, as (25)
indicates. Each individual symbol of a particular redundancy
block is, therefore, a linear combination of the corresponding
symbols in the respective base codewords, with the combining
matrix being the same for all such symbols.
Since for the codes of this section we allow the combining
matrix to vary from symbol to symbol in the construction of
each redundancy block, we augment our notation. In particular,
using cl (n) and xm (n) to denote the nth elements of codeword
cl and redundancy block xm , respectively, we have [cf. (25)]
$$\begin{bmatrix} x_1(n) \\ \vdots \\ x_M(n) \end{bmatrix} = G(n) \begin{bmatrix} c_1(n) \\ \vdots \\ c_L(n) \end{bmatrix}, \quad n = 1, 2, \ldots, N. \tag{38}$$
The value of M plays no role in our development and may
be taken arbitrarily large. Moreover, as before, the power
constraint enters by limiting the rows of G(n) to have a
squared norm P and by normalizing the codebooks to have
unit power.
It suffices to restrict our attention to G(n) of the form
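$$G(n) = P \odot D(n), \tag{39}$$
where P is an M × L (deterministic) power allocation matrix with entries $\sqrt{p_{m,l}}$ that do not vary within a block,
$$P = \begin{bmatrix} \sqrt{p_{1,1}} & \cdots & \sqrt{p_{1,L}} \\ \vdots & \ddots & \vdots \\ \sqrt{p_{M,1}} & \cdots & \sqrt{p_{M,L}} \end{bmatrix}, \tag{40}$$
and D(n) is a (random) phase-only “dither” matrix of the form
$$D(n) = \begin{bmatrix} d_{1,1}(n) & \cdots & d_{1,L}(n) \\ \vdots & \ddots & \vdots \\ d_{M,1}(n) & \cdots & d_{M,L}(n) \end{bmatrix}, \tag{41}$$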
with ⊙ denoting elementwise multiplication. In our analysis,
the dij (n) are all i.i.d. in i, j, and n, and are independent
of all other random variables, including noises, messages, and
codebooks. As we shall see below, the role of the dither is
to decorrelate pairs of random variables, hence it suffices for
dij (n) to take values +1 and −1 with equal probability.
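The encoding rule (38)–(41) can be sketched in a few lines. The following is a minimal illustration under assumptions of our own: the power allocation values are hypothetical, the "codewords" are random unit-power QPSK symbols standing in for a real base code, and the function name `encode_blocks` is not from the paper.

```python
import math
import random

def encode_blocks(power, codewords, rng):
    """Form x_m(n) = sum_l sqrt(p_{m,l}) * d_{m,l}(n) * c_l(n), cf. (38)-(41).

    power[m][l]     -- entries p_{m,l} of the power allocation
    codewords[l][n] -- nth symbol of the layer-l codeword
    Returns the redundancy blocks and the dither realizations, which the
    decoder is assumed to know.
    """
    M, L = len(power), len(power[0])
    N = len(codewords[0])
    # D(n): i.i.d. +1/-1 entries, one M x L matrix per symbol index n
    dither = [[[rng.choice((1, -1)) for _ in range(L)] for _ in range(M)]
              for _ in range(N)]
    blocks = [[sum(math.sqrt(power[m][l]) * dither[n][m][l] * codewords[l][n]
                   for l in range(L))
               for n in range(N)]
              for m in range(M)]
    return blocks, dither

# Hypothetical example: M = 3 blocks, L = 2 layers, N = 8 symbols, P = 1.
rng = random.Random(0)
power = [[0.75, 0.25], [0.40, 0.60], [0.50, 0.50]]  # each row sums to P
qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
codewords = [[rng.choice(qpsk) for _ in range(8)] for _ in range(2)]
blocks, dither = encode_blocks(power, codewords, rng)
```

Because the dither only flips signs, each row of G(n) keeps squared norm P regardless of the realization.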
B. Decoding
To obtain a near-perfect rateless code, it is sufficient to
employ a successive cancellation decoder with simple maximal
ratio combining (MRC) of the redundancy blocks. While, in
principle, an MMSE-based successive cancellation decoder
enables higher performance, as we will see, an MRC-based
one is sufficient for our purposes, and simplifies the analysis.
Indeed, although the encoding we choose creates a per-layer channel that is time-varying, the MRC-based successive
cancellation decoder effectively transforms the channel back
into a time-invariant one, for which any of the traditional
low-complexity capacity-approaching codes for the AWGN
channel are suitable as a base code in the design.8
The decoder operation is as follows, assuming the SNR is
such that decoding is possible from m redundancy blocks. To
decode the Lth (top) layer, the dithering is first removed from
the received waveform by multiplying by the conjugate dither
sequence for that layer. Then, the m blocks are combined
into a single block via the appropriate MRC for that layer.
The message in this Lth layer is then decoded, treating the
undecoded layers as noise, and its contribution subtracted
from the received waveform. The (L − 1)st layer is now the
top layer, and the process is repeated, until all layers have
been decoded. Note that the use of MRC in decoding is
equivalent to treating the undecoded layers as white (rather
than structured) noise, which is the natural approach when the
dither sequence structure in those undecoded (lower) layers is
ignored in decoding the current layer of interest.
We now introduce notation that allows the operation of the
decoder to be expressed more precisely. We then determine
the effective SNR seen by the decoder at each layer of each
redundancy block.
Since G(n) is drawn i.i.d., the overall channel is i.i.d., and
thus we may express the channel model in terms of an arbitrary
individual element in the block. Specifically, our received
waveform can be expressed as [cf. (1) and (25)]
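$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix} = \beta G \begin{bmatrix} c_1 \\ \vdots \\ c_L \end{bmatrix} + \begin{bmatrix} z_1 \\ \vdots \\ z_M \end{bmatrix}, \tag{42}$$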
where G = P⊙D, with G denoting the arbitrary element in the
sequence G(n), and where ym is the corresponding received
symbol from redundancy block m (and similarly for cl , zm ,
D).
If layers l + 1, l + 2, . . . , L have been successively decoded
from m redundancy blocks, and their effects subtracted from
the received waveform, the residual waveform is denoted by
$$v_{m,l} = \beta G_{m,l} \begin{bmatrix} c_1 \\ \vdots \\ c_l \end{bmatrix} + \begin{bmatrix} z_1 \\ \vdots \\ z_m \end{bmatrix}, \tag{43}$$
where we continue to let Gm,l denote the m × l upper-left
submatrix of G, and likewise for Dm,l and Pm,l . As additional
notation, we let gm,l denote the m-vector formed from the
upper m rows of the lth column of G, whence
$$G_{m,l} = \begin{bmatrix} g_{m,1} & g_{m,2} & \cdots & g_{m,l} \end{bmatrix}, \tag{44}$$
and likewise for dm,l and pm,l .
8 More generally, the MRC-based decoder is particularly attractive for practical implementation. Indeed, as each redundancy block arrives a sufficient statistic for decoding can be accumulated without the need to retain earlier blocks in buffers. The computational cost of decoding thus grows linearly with block length while the memory requirements do not grow at all. This is much less complex than the MMSE decoder discussed in the development of the codes of Section V.
With such notation, the decoding can be expressed as follows. Starting with $v_{m,L} = y$, decoding proceeds. After layers l + 1 and higher have been decoded and removed, we decode from $v_{m,l}$. Writing
$$v_{m,l} = \beta(d_{m,l} \odot p_{m,l})c_l + v_{m,l-1}, \tag{45}$$
the operation of removing the dither can be expressed as
$$d^*_{m,l} \odot v_{m,l} = \beta p_{m,l} c_l + v'_{m,l-1}, \tag{46}$$
where
$$v'_{m,l-1} = d^*_{m,l} \odot v_{m,l-1}. \tag{47}$$
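To make the dither-removal step (45)–(47) and the subsequent combining concrete, here is a single-symbol sketch. It is an illustration under our own assumptions: the numerical values are made up, the lower layers and noise are lumped into one residual term, and the combiner uses the textbook MRC weights (conjugate gain divided by noise variance), which the text does not spell out.

```python
def remove_dither(v, d):
    """Elementwise d* (.) v, as in (46); dither entries have unit modulus."""
    return [dm.conjugate() * vm for dm, vm in zip(d, v)]

def mrc_combine(u, beta, sqrt_p, noise_var):
    """Maximal ratio combining of the m de-dithered observations of layer l.

    noise_var[k] is the variance of the effective noise in block k + 1.
    The output is normalized so that a clean input returns c_l itself.
    """
    gains = [beta * s for s in sqrt_p]
    num = sum(g.conjugate() * um / nv for g, um, nv in zip(gains, u, noise_var))
    den = sum(abs(g) ** 2 / nv for g, nv in zip(gains, noise_var))
    return num / den

# Illustrative single-symbol example with m = 3 blocks (values are made up).
beta = 0.8 + 0.3j
sqrt_p = [1.0, 2.0, 1.5]                    # sqrt(p_{m',l}) for m' = 1, 2, 3
d = [1, -1, -1]                             # dither d_{m',l}(n) for this symbol
c_l = 0.6 - 0.8j                            # transmitted symbol of layer l
v_prev = [0.05 + 0j, -0.02j, 0.01 + 0.01j]  # lower layers plus noise, lumped
v = [beta * dm * s * c_l + w for dm, s, w in zip(d, sqrt_p, v_prev)]

u = remove_dither(v, d)                     # equals beta * sqrt_p * c_l + v'
estimate = mrc_combine(u, beta, sqrt_p, noise_var=[1.0, 1.0, 1.0])
print("estimate of c_l:", estimate)
```

Since the numerator and denominator are plain sums over blocks, they can be accumulated as blocks arrive, which is the buffering advantage noted in footnote 8.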
The MRC decoder treats the dither in the same manner
as noise, i.e., as a random process with known statistics
but unknown realization. Because the entries of the dither
matrix are chosen to be i.i.d. random phases independent of the messages, the entries of $D_{m,l}$ and $c_1, \ldots, c_{l-1}$ are jointly and individually uncorrelated, and the effective noise $v'_{m,l-1}$ seen by the MRC decoder has diagonal covariance $K_{v'_{m,l-1}} = E[v'_{m,l-1} v'^{\dagger}_{m,l-1}]$.
The effective SNR at which this lth layer is decoded from
m blocks via MRC is thus
$$\mathrm{SNR}_{\mathrm{MRC}} = \sum_{m'=1}^{m} \mathrm{SNR}_{m',l}(\beta), \tag{48}$$
where
$$\mathrm{SNR}_{m',l}(\beta) = \frac{|\beta|^2 p_{m',l}}{|\beta|^2 (p_{m',1} + \cdots + p_{m',l-1}) + \sigma^2}. \tag{49}$$
Note that we have made explicit the dependency of these per-layer per-block SNRs on β.
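The per-block and post-MRC SNRs of (48) and (49) are simple to evaluate numerically. The sketch below uses function names and a power allocation of our own choosing, purely for illustration.

```python
def snr_block_layer(beta, p_row, l, sigma2=1.0):
    """Per-layer per-block SNR of (49) for layer l (1-based).

    p_row holds the powers p_{m',1}, ..., p_{m',L} of one redundancy block;
    layers 1, ..., l-1 are still undecoded and act as noise.
    """
    gain = abs(beta) ** 2
    return gain * p_row[l - 1] / (gain * sum(p_row[:l - 1]) + sigma2)

def snr_mrc(beta, power, m, l, sigma2=1.0):
    """Post-MRC SNR of (48): the sum of the per-block SNRs over m blocks."""
    return sum(snr_block_layer(beta, power[k], l, sigma2) for k in range(m))

# Hypothetical allocation with M = 2 blocks and L = 3 layers.
power = [[1.0, 2.0, 4.0],
         [3.0, 2.0, 2.0]]
for l in (3, 2, 1):  # decoding order: top layer first
    print(f"layer {l}: SNR after 1 block = {snr_mrc(1.0, power, 1, l):.3f}, "
          f"after 2 blocks = {snr_mrc(1.0, power, 2, l):.3f}")
```

With unit gain and unit noise variance, the first row (1, 2, 4) gives every layer the same SNR of 1 in the first block, which is the equal-rate-per-layer behavior the layered design aims for.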
C. Efficiency
The use of random dither at the encoder and MRC at the
decoder both cause some loss in performance relative to the
perfect rateless codes presented earlier. In this section we show
that these losses can be made small.
When a coding scheme is not perfect, its efficiency quantifies
how close the scheme is to perfect. There are ultimately several
ways one could measure efficiency that are potentially useful
for engineering design. Among these, we choose the following
efficiency notion:
1) We find the ideal thresholds {αm } for a perfect code of
rate R.
2) We determine the highest rate R′ such that an imperfect
code designed at rate R′ is decodable with m redundancy blocks when the channel gain is αm , for all
m = 1, 2, . . . .
3) We measure efficiency η by the ratio R′ /R, which is
always less than unity.
With this notion of efficiency, we further define a coding
scheme as near-perfect if the efficiency so-defined approaches
unity when sufficiently many layers L are employed.
The efficiency of our scheme ultimately depends on the
choice of our power allocation matrix (40). We now show
the main result of this section: provided there exists a power
allocation matrix such that for each l and m
$$\frac{R}{L} = \sum_{m'=1}^{m} \log(1 + \mathrm{SNR}_{m',l}(\alpha_m)), \tag{50}$$
with $\mathrm{SNR}_{m,l}(\cdot)$ as defined in (49), a near-perfect rateless coding scheme results. We prove the existence of such a power allocation, and develop an interpretation of (50), in Appendix B, and thus focus on our main result in the sequel.
We establish our main result by finding a lower bound on the average mutual information between the input and output of the channel. Upon receiving m blocks with channel gain $\alpha_m$, and assuming layers l + 1, . . . , L are successfully decoded, let $I'_{m,l}$ be the mutual information between the input to the lth layer and the channel output. Then
\begin{align}
I'_{m,l} &= I(c_l; v_{m,l} \mid d_{m,l}) \tag{51} \\
&= I(c_l; \alpha_m p_{m,l} c_l + v'_{m,l-1} \mid d_{m,l}) \tag{52} \\
&\ge I(c_l; \alpha_m p_{m,l} c_l + v'_{m,l-1}) \tag{53} \\
&\ge I(c_l; \alpha_m p_{m,l} c_l + v''_{m,l-1}) \tag{54} \\
&= \log(1 + \mathrm{SNR}_{\mathrm{MRC}}) \tag{55}
\end{align}
where (52) follows from (46)–(47), (53) follows from the independence of $c_l$ and $d_{m,l}$, and (54) obtains by replacing $v'_{m,l-1}$ with a Gaussian random vector $v''_{m,l-1}$ of covariance $K_{v'_{m,l-1}}$. Lastly, to obtain (55) we have used (48) for the post-MRC SNR.
Now, if the assumption (50) is satisfied, then the right-hand side of (55) is further bounded for all m by
$$I'_{m,l} \ge \log\left(1 + (\ln 2)\frac{R}{L}\right), \tag{56}$$
where we have applied the inequality ln(1 + u) ≤ u (valid for u > 0) to (50) to conclude that $(\ln 2)R/L \le \sum_{m'=1}^{m} \mathrm{SNR}_{m',l}(\alpha_m)$. Note that the lower bound (56) may be quite loose; for example, $I'_{m,l} = R/L$ when m = 1.
Thus, if we design each layer of the code for a base code rate of
$$\frac{R''}{L} = \log\left(1 + (\ln 2)\frac{R}{L}\right), \tag{57}$$
(56) ensures decodability after m blocks are received when the channel gain is $\alpha_m$, for m = 1, 2, . . . .
Finally, rewriting (57) as
$$\frac{R}{L} = \frac{2^{R''/L} - 1}{\ln 2}, \tag{58}$$
the efficiency η of the conservatively-designed layered repetition code is bounded by
$$\eta = \frac{R''}{R} = \frac{(\ln 2)R''/L}{2^{R''/L} - 1} \ge 1 - \frac{\ln 2}{2}\,\frac{R''}{L}, \tag{59}$$
which approaches unity as L → ∞ as claimed.
In Fig. 2, the efficiency bounds (59) are plotted as a function of the base code rate R′′/L. As a practical matter, our bound implies, for instance, that to obtain 90% efficiency requires a base code of rate of roughly 1/3 bits per complex symbol. Note, too, that when the number of layers is sufficiently large that the SNR per layer is low, a binary code may be used instead of a Gaussian codebook, which may be convenient for implementation. For example, a code with rate 1/3 bits per complex symbol may be implemented using a rate-1/6 LDPC code with binary antipodal signaling.
[Figure 2: efficiency bound (fraction of capacity) versus base code rate (b/s/Hz).]
Fig. 2. Lower bound on efficiency of the near-perfect rateless code. The top and bottom curves are the middle and right-hand bounds of (59), respectively.
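The two bounds in (59) are easy to tabulate, which gives a rough numerical stand-in for the curves of Fig. 2. The following is a small sketch (the function name is ours) that evaluates the middle and right-hand expressions of (59) as a function of the base code rate R′′/L in bits per complex symbol.

```python
import math

def efficiency_bounds(base_rate):
    """Return the middle and right-hand expressions of (59).

    base_rate is R''/L in bits per complex symbol (must be positive).
    """
    t = math.log(2) * base_rate
    middle = t / math.expm1(t)  # (ln 2 * R''/L) / (2^(R''/L) - 1)
    right = 1 - t / 2           # 1 - (ln 2 / 2) * R''/L
    return middle, right

for rate in (0.05, 0.1, 0.2, 1 / 3, 0.5):
    middle, right = efficiency_bounds(rate)
    print(f"R''/L = {rate:.3f}: middle bound = {middle:.4f}, "
          f"right-hand bound = {right:.4f}")
```

Both expressions tend to one as the base code rate goes to zero, i.e., as the number of layers grows for a fixed ceiling rate.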
VIII. DESIGN AND IMPLEMENTATION ISSUES
In this section, we comment on some issues that arise
in the development and implementation of our rateless code
constructions; additional implementation issues are addressed
in [28].
First, one consequence of our development of perfect
rateless codes for M = L is that all layers must have
the same rate R/L. This does not seem to be a serious
limitation, as it allows a single base codebook to serve as the
template for all layers, which in turn generally decreases the
implementation complexity of the encoder and decoder. The
codebooks C1 , . . . , CL used for the L layers should not be
identical, however, for otherwise a naive successive decoder
might inadvertently swap messages from two layers or face
other difficulties that increase the probability of decoding error.
A simple cure to this problem is to apply pseudorandom
phase scrambling to a single base codebook C to generate
the different codebooks needed for each layer. Pseudorandom
interleaving would have a similar effect.
Second, it should be emphasized that a layered code designed with the successive decoding constraint (36) can be
decoded in a variety of ways. Because the undecoded layers
act as colored noise, an optimal decoder should take this
into account, for example by using an MMSE combiner on
the received blocks {ym } as mentioned in Section V. The
MMSE combining weights change as each layer is stripped
off. Alternatively, some or all of the layers could be decoded
jointly; this might make sense when the decoder for the base
codebook is already iterative, and could potentially
accelerate convergence compared to a decoder that treats the
layers sequentially.
Third, a comparatively simple receiver is possible when all
M blocks have been received from a perfect rateless code in
which M = L. In this special case the linear combinations
applied to the layers are orthogonal, hence the optimal receiver can decode each layer independently, without successive
decoding. This property is advantageous in a multicasting
scenario because it allows the introduction of users with
simplified receivers that function only at certain rates, in this
case the lowest supported one.
Finally, we note that with an ideal rateless code, every prefix
of the code is a capacity-achieving code. This corresponds to a
maximally dense set of SNR thresholds at which decoding can
occur. By contrast, our focus in the paper has been on rateless
codes that are capacity-achieving only for prefixes whose
lengths are an integer multiple of the base block length. The
associated sparseness of SNR thresholds can be undesirable
in some applications, since when the realized SNR is between
thresholds, there is no guarantee that capacity is achieved:
the only realized rate promised by the construction is that
corresponding to the next lower SNR threshold.
However, as will be apparent from the simulations described
in Section IX, performance is generally much better than this
pessimistic assessment. In particular, partial blocks provide
essentially all the necessary redundancy to allow an appropriately generalized decoder to operate as close to capacity as
happens with full blocks.
Nevertheless, when precise control over the performance at
a dense set of SNR thresholds is required, other approaches
can be used. For example, when the target ceiling rate is R,
we can use our rateless code construction to design a code of
ceiling rate κR, where 1 ≤ κ ≤ M , and have the decoder
collect at least κ blocks before attempting to decode. With
this approach, the associated rate thresholds are R, Rκ/(κ +
1), Rκ/(κ + 2), . . . , Rκ/M . Hence, by choosing larger values
of κ, one can increase the density of SNR thresholds.
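The effect of κ on the density of thresholds can be seen by listing the rates directly. The sketch below (function name and example values are ours) uses exact fractions to enumerate R, Rκ/(κ + 1), . . . , Rκ/M.

```python
from fractions import Fraction

def rate_thresholds(R, kappa, M):
    """Rates kappa*R/m for m = kappa, ..., M.

    Models a code of ceiling rate kappa*R whose decoder waits for at least
    kappa blocks before attempting to decode.
    """
    return [Fraction(R) * kappa / m for m in range(kappa, M + 1)]

# Two hypothetical designs with the same ceiling rate R = 2 and the same
# lowest rate R*kappa/M = 1, but different threshold densities.
for kappa, M in [(1, 2), (3, 6)]:
    rates = rate_thresholds(2, kappa, M)
    print(f"kappa = {kappa}, M = {M}: " + ", ".join(str(r) for r in rates))
```

In this example, moving from κ = 1 to κ = 3 inserts two intermediate rates between the ceiling rate and the lowest supported rate.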
IX. SIMULATIONS
Implicit in our analysis is the use of perfect base codes
and ideal (maximum likelihood) decoding. In this section,
we present simulations that further validate our rateless code
design with practical coding and decoding.
In our simulations, we use as our base code the turbo code
specified in the 3GPP LTE wireless standard [21], [22]. This
parallel-concatenated convolutional code constructed from a
pair of 8-state constituent encoders has a rate of 2/3 bits per
complex symbol. This code is used in conjunction with the
iterative turbo-decoding algorithm for which it was designed.
The base code is used in both 3- and 4-layer rateless
constructions, corresponding to ceiling rates of R = 2 and
R = 8/3 b/s/Hz, respectively. Moreover, there are a total
of 6144 information bits per layer, corresponding to a block
length of N = 9216 complex symbols.
Encoding proceeds as follows. Since the base code is not
ideal, it has a bit-error rate that rolls off with the operating
SNR. Let SNR◦(ε) denote the SNR at which the base code achieves a bit-error rate of ε. Then, using a definition analogous to that used in Section VII-C, the efficiency of the base code is9
$$\eta_\circ(\epsilon) = \frac{R/L}{\log(1 + \mathrm{SNR}_\circ(\epsilon))}.$$
9 One can equivalently measure the efficiency of the base code in terms of
its gap to capacity at a particular target bit-error rate. However, our chosen
measure is more natural when relating the efficiency of the base code to the
rateless code constructed from it.
TABLE III
RATE 2/3 B/S/HZ 3GPP LTE BASE CODE EFFICIENCIES

| Bit-Error Rate ε | 10^−2 | 10^−3 | 10^−4 | 10^−5 |
|---|---|---|---|---|
| Efficiency η◦ | 88.9% | 87.1% | 85.7% | 84.7% |
Thus, in computing the gain matrix G, we prescale the target
rate, replacing R with R/η◦(ε). Note that as a result, G
depends on the target rate and the base code properties only.
For the particular base code used in the simulations, the
efficiencies are as given in Table III.
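The prescaling step can be illustrated numerically. The sketch below takes the efficiencies of Table III, inverts the definition of η◦(ε) to recover the operating SNR of the base code, and computes the prescaled design rate R/η◦(ε). The function names are ours, and logarithms are taken base 2 on the assumption that rates are measured in bits.

```python
import math

# Base code efficiencies from Table III, keyed by target bit-error rate.
TABLE_III = {1e-2: 0.889, 1e-3: 0.871, 1e-4: 0.857, 1e-5: 0.847}

def base_code_snr(rate_per_layer, eta):
    """Invert eta = (R/L) / log2(1 + SNR) to get the base code's SNR."""
    return 2 ** (rate_per_layer / eta) - 1

def prescaled_rate(R, eta):
    """Design rate used when computing G: R is replaced by R / eta."""
    return R / eta

R, L, ber = 2.0, 3, 1e-4  # 3-layer construction, target BER 1e-4
eta = TABLE_III[ber]
snr = base_code_snr(R / L, eta)
print(f"base code operating SNR: {10 * math.log10(snr):.2f} dB")
print(f"design rate for G: {prescaled_rate(R, eta):.3f} b/s/Hz")
```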
In our simulation, we decode not only from integer numbers
of redundancy blocks, but also from noninteger numbers,
corresponding to partial blocks. In general, MMSE combining
is applied on a symbol-by-symbol basis, in conjunction with
our usual successive cancellation. In particular, when the
number of incremental redundancy blocks m is noninteger,
then the MMSE combiner for the first N (m − ⌊m⌋) symbols
of the codeword in a given layer l is constructed from
the submatrix G⌊m⌋+1,l , while the MMSE combiner for the
remaining N (1 + ⌊m⌋ − m) symbols of the codeword is
constructed from the submatrix G⌊m⌋,l .
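The bookkeeping for partial blocks can be sketched as follows. The helper below (a hypothetical name, with rounding to the nearest symbol as our own choice) returns how many symbols of each codeword are combined using ⌊m⌋ + 1 blocks and how many using ⌊m⌋ blocks.

```python
import math

def combiner_split(m, N):
    """Split a length-N codeword for a noninteger number of blocks m.

    Returns [(count, blocks), (count, blocks)]: the first N*(m - floor(m))
    symbols are seen in floor(m) + 1 blocks, the remaining symbols in
    floor(m) blocks.
    """
    full = math.floor(m)
    n_first = round(N * (m - full))
    return [(n_first, full + 1), (N - n_first, full)]

# Example with the block length used in the simulations, N = 9216.
for m in (1.0, 1.5, 2.25):
    (n_a, blocks_a), (n_b, blocks_b) = combiner_split(m, 9216)
    print(f"m = {m}: {n_a} symbols use {blocks_a} blocks, "
          f"{n_b} symbols use {blocks_b} blocks")
```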
Following combining (and cancellation), turbo decoding is
applied to the layer of interest, where the initial log-likelihood
ratios are calculated treating the symbols as corrupted by
Gaussian noise with variance determined by the effective
SNR. This effective SNR is determined from the (reciprocal
of the unbiased) mean-square error resulting from MMSE
combining, taking into account the successive cancellation.
Thus, when m is noninteger, the initial log-likelihood ratios
take on one value for the symbols in the first part of the
codeword, and a different value in the second part.
The overall efficiency η of the resulting rateless code, i.e.,
the fraction of capacity at which it operates, is a function of the
number of incremental redundancy blocks m (or equivalently
the realized SNR in the channel). We calculate η for the
general case where m may be noninteger as follows. First,
for a given value of m, the roll-off of the bit-error rate
of the overall rateless code as a function of the SNR can
be generated, where for each SNR value, the corresponding
MMSE combiner with successive cancellation is used. As
above, when m is noninteger two MMSE combiners are
involved. The resulting bit error rate is averaged over both
the N symbols within the codeword at every layer and the L
layers, so that error propagation effects are taken into account.
We then let SNR(m, ε) denote the SNR at which the target bit-error rate ε is attained for this particular value of m, from which the efficiency of the rateless code is
$$\eta(m, \epsilon) = \frac{R/m}{\log(1 + \mathrm{SNR}(m, \epsilon))}, \tag{60}$$
where we have used a notion of efficiency consistent with
earlier definitions.
The resulting efficiency plots are depicted in Fig. 3. Several
features are noteworthy. First, the efficiencies for m = 1, 2, . . .
redundancy blocks are quite close to those of the base code
shown in Table III; typically they are at most 2-3% lower. This suggests, at least for codes with few layers, that losses due to the rateless architecture itself, as well as the use of iterative decoding in the face of non-Gaussian noise from undecoded layers, are negligible in practice, and that good base codes will yield good rateless codes.
Second, the efficiencies do not vary significantly with the number of redundancy blocks m. Moreover, even when partial redundancy blocks are used, the efficiency does not deteriorate. This suggests that our rateless code constructions can operate over a much finer-grained set of rates than our design prescribed. However, it should be emphasized that this holds only when at least one full redundancy block is used. When less redundancy is used, Fig. 3 shows that efficiency falls off rapidly.
[Figure 3: two panels plotting Efficiency (%) versus No. Blocks. (a) 3-layers, 3-blocks (rate range: 2/3 to 2 b/s/Hz). (b) 4-layers, 4-blocks (rate range: 8/9 to 8/3 b/s/Hz).]
Fig. 3. Practical efficiencies achieved using a rateless construction in conjunction with rate 2/3 base code. The successively lower curves correspond to target bit-error rates of 10^−2, 10^−3, 10^−4, and 10^−5, respectively.
X. EXTENSIONS TO TIME-VARYING CHANNELS
The framework of this paper can be extended to time-varying channels in a variety of ways. As one example, the time-varying channel can be viewed as an instance of parallel channels, and thus a solution can be developed from a solution to the problem of rateless coding for parallel channels. Initial work in this direction is described in, e.g., [29]–[31], [33], though much remains to be understood about the performance limits of various constructions. Another approach is based on the observation that feedback about the past channel state can significantly simplify the problem of encoding for future transmissions [34]. It is this approach we describe here as an illustration of potential. In particular, we show that the natural generalization of our architecture is perfect (i.e., capacity-achieving), enabling the message to be recovered with the minimum possible number of blocks for the realized channel.
For the time-varying channel we consider, the observations take the form
$$y_m = \beta_m x_m + z_m, \quad m = 1, 2, \ldots, \tag{61}$$
where the {βm } denote a sequence of complex channel gains.
The βm continue to be known a priori at the receiver but not
at the transmitter.
The encoder transmits a message w by generating a sequence of incremental redundancy blocks x1 (w), x2 (w), . . . .
The receiver accumulates sufficiently many received blocks y1 ,
y2 , . . . to recover w. Immediately following the transmission of
block xm , the encoder is notified of βm . Thus, knowledge of
β1 , . . . , βm can be used in the construction of the redundancy
block xm+1 (w).
In this context, a perfect rateless code is then one in
which capacity is achieved for any number m = 1, . . . , M
of redundancy blocks, i.e., whenever the (realized) channel
gains are such that
$$R \le \sum_{m'=1}^{m} \log\left(1 + \frac{P}{\sigma^2}|\beta_{m'}|^2\right), \tag{62}$$
the message can be recovered with high probability.
In this development, for values of m such that the right side
of (62) is less than R, it is convenient to define target channel
gains αm+1 required for successful decoding once block m+1
is obtained. In particular, αm+1 is defined via
$$R = \log\left(1 + \frac{P}{\sigma^2}\alpha_{m+1}^2\right) + \sum_{m'=1}^{m} \log\left(1 + \frac{P}{\sigma^2}|\beta_{m'}|^2\right), \tag{63}$$
whenever αm > |βm |.
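Solving (63) for the target gain is a one-line computation. The sketch below (function name and example values are ours; logarithms are base 2 on the assumption that R is in bits) returns $\alpha_{m+1}$ from the gains realized so far, or `None` once (62) already holds.

```python
import math

def next_target_gain(R, P, betas, sigma2=1.0):
    """Target gain alpha_{m+1} defined via (63).

    betas holds the realized gains beta_1, ..., beta_m (possibly empty).
    Returns None when the accumulated mutual information already reaches R,
    i.e., when (62) is satisfied and no further target is needed.
    """
    accumulated = sum(math.log2(1 + P * abs(b) ** 2 / sigma2) for b in betas)
    if accumulated >= R:
        return None
    return math.sqrt(sigma2 * (2 ** (R - accumulated) - 1) / P)

# Example with R = 6 and P = 63: with no blocks sent the target is alpha_1 = 1.
print("alpha_1 =", next_target_gain(6, 63, []))
# A weak first block (|beta_1| = 0.2) leaves most of the rate to be made up.
print("alpha_2 =", next_target_gain(6, 63, [0.2]))
```

When every realized gain equals its own target, this recursion reproduces the time-invariant thresholds; for instance, $\beta_1 = 1/3$ gives $\alpha_2 = 1/3$ in the example above.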
Generalizing our construction for the time-invariant case,
we first choose the range M , the ceiling rate R, the number of layers L, and finally the associated base codebooks
C1 , . . . , CL . We assume a priori that the base codebooks all
have equal rate R/L.
As with our time-invariant construction, the redundancy
blocks x1 , . . . , xM take the form (25). We emphasize that
the mth row of G, which constitutes the weights used in
constructing the mth redundancy block from the L codewords,
will in general be a function of the (realized) channel gains
β1 , . . . , βm−1 . Specifically, the mth row is designed for the
channel gain sequence {β1 , . . . , βm−1 , αm }, i.e., we substitute
the target gain αm for the (as yet unknown) channel gain βm .
Finally, in addition to the layered code structure, we continue
to impose the constraint that the layered code be successively
decodable.
Our aim is to select G so that the code is perfect as defined
earlier. From the layered repetition encoding structure, we
require as in the time-invariant development that the rows of G
be orthogonal, while from the successive decoding constraint
we have the requirement [cf. (36)] that
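$$\frac{lR}{L} \le \log\det\left(I + \frac{1}{\sigma^2} B_m G_{m,l} G_{m,l}^{\dagger} B_m^{\dagger}\right) \tag{64}$$
for all l = 1, . . . , L and m = 1, . . . , M, with
$$B_m = \mathrm{diag}(\beta_1, \ldots, \beta_{m-1}, \alpha_m). \tag{65}$$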
With this model, in Appendix C we construct in closed
form perfect rateless codes for the case of M = 2 redundancy blocks and L = 3 layers for rates in the range $R < \log(2 + \sqrt{5}) \approx 2.08$ bits per complex symbol. This construction can be viewed as the natural time-varying generalization of that in Section IV. Establishing the existence of perfect rateless codes for larger values of M and/or L requires more effort. However, following an approach analogous to that used in the corresponding development for the time-invariant case in Section VII, we show in Appendix D that in the limit of a large number of layers L, asymptotically perfect codes for all values of M are possible.
XI. CONCLUDING REMARKS
In this paper, motivated by hybrid ARQ requirements in
wireless and related applications, our focus has been on the
development of a lossless framework for transforming a code
good for the AWGN channel at a single SNR into one good
simultaneously at multiple SNRs. There are a variety of
worthwhile directions for further research.
First, while beyond the scope of the present paper, a
comparative evaluation of methods described herein relative
to, for example, those described in Section II is likely to
reveal additional insight, and uncover opportunities for further
progress.
Second, while we have developed some preliminary results
on the extension of our framework to time-varying channels,
clearly this is just a beginning. For example, when M > 2,
there is flexibility in the problem formulation, and thus in
how the available degrees of freedom are allocated. As another
example, one could consider other time-variation models, such
as one that would allow β to vary deterministically so long
as the pattern of variation is known in advance. Then, for one
block the code would be designed for a gain of [α1,1 ], for two
blocks the target gains would be [α2,1 α2,2 ], for three blocks
the gains would be [α3,1 α3,2 α3,3 ], and so on. Still another
example would involve the development of solutions for time-varying channels without requiring SNR feedback, either with
or without a stochastic model for β.
Other worthwhile directions include more fully developing
rateless constructions for the AWGN channel that allow decoding to begin at any received block, and/or to exploit an
arbitrary subset of the subsequent blocks. Initial efforts in this
direction include the faster-than-Nyquist constructions in [27],
[29], and the diagonal subblock layering approach described
in [29].
Beyond the single-input, single-output (SISO) channel,
many multiterminal and multiuser extensions are also of
considerable interest. Examples of preliminary developments
along these lines include the rateless space-time code constructions in [35], the rateless codes for multiple-access channels
developed in [36], and the approaches to rateless coding
for parallel channels examined in [29]–[31]. Indeed, such
research may lead to efficient rateless orthogonal frequency-division multiplexing (OFDM) systems and efficient rateless
multi-input, multi-output (MIMO) codes with wide-ranging
applications.
Finally, extending the layered approach to rateless coding
developed in this paper beyond the Gaussian channel is also
a potentially rich direction for further research. A notable
example would be the binary symmetric channel, where good
rateless solutions remain elusive. Preliminary work in this
direction is described in [32].
APPENDIX A
PERFECT RATELESS SOLUTION FOR L = M = 3
Determining the set of solutions
$$G = \begin{bmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{bmatrix} \tag{66}$$
to (36) when L = M = 3 as a function of the ceiling rate R
is a matter of lengthy if routine algebra.
We begin by observing that any row or any column of G
may be multiplied by a common phasor without changing
GG† . Without loss of generality we may therefore take the
first row and first column of G to be real and positive. Each
G thus represents a set of solutions D1 GD2 , where D1 and
D2 are diagonal matrices in which the diagonal entries have
modulus 1. The solutions in the set are equivalent for most
engineering purposes and we shall therefore not distinguish
them further.
We know that G must be a scaled unitary matrix, scaled so
that the row and column norms are $\sqrt{P}$. Thus, if we somehow
determine the first two rows of G, there is always a choice for
the third row: it’s the unique vector orthogonal to the first two
rows which meets the power constraint and which has first
component real and positive. Conversely, it’s easy to see that
any appropriately scaled unitary matrix G that satisfies (36)
for m = 1 and m = 2 (and all l = 1, 2, 3) necessarily satisfies
(36) for m = 3. We may therefore without loss of generality
restrict our attention to determining the set of solutions to the
first two rows of G; the third row comes “for free” from the
constraint that G be a scaled unitary matrix.
Assume, again without loss of generality, that $\alpha_1^2 = 1$ and
$\sigma^2 = 1$. Via (36), the first row of G (which controls the first
redundancy block) must satisfy
\[ R/3 = \log(1 + g_{11}^2) \tag{67} \]
\[ 2R/3 = \log(1 + g_{11}^2 + g_{12}^2) \tag{68} \]
\[ 3R/3 = \log(1 + g_{11}^2 + g_{12}^2 + g_{13}^2) \tag{69} \]
and must also satisfy the power constraint
\[ P = g_{11}^2 + g_{12}^2 + g_{13}^2. \tag{70} \]
Thus
\[ P = 2^{R} - 1 = x^{6} - 1 \]
and
\[ g_{11}^2 = 2^{R/3} - 1 = x^2 - 1, \tag{71} \]
\[ g_{12}^2 = 2^{R/3}(2^{R/3} - 1) = x^2(x^2 - 1), \tag{72} \]
\[ g_{13}^2 = 2^{2R/3}(2^{R/3} - 1) = x^4(x^2 - 1), \tag{73} \]
where for convenience we have introduced the change of
variables $x = 2^{R/6}$.
The first column of G (which controls the first layer of
each redundancy block) is also straightforward. Via (29) with
m = 2 and m = 3, we have
\[ \alpha_2^2 = \frac{1}{x^3 + 1}, \tag{74} \]
\[ \alpha_3^2 = \frac{1}{x^4 + x^2 + 1}. \tag{75} \]
Using (36) for l = 1 and m = 2 yields
\[ R/3 = \log(1 + \alpha_2^2 (g_{11}^2 + g_{21}^2)). \tag{76} \]
Substituting the previously computed expressions (71) and
(74) for $g_{11}^2$ and $\alpha_2^2$ into (76) and solving for $g_{21}$ yields
\[ g_{21}^2 = x^3(x^2 - 1). \tag{77} \]
To solve for the second row of G we use (36) with m =
l = 2 together with the requirement that the first and second
rows be orthogonal. It is useful at this stage to switch to polar
coordinates, i.e., $g_{22} = |g_{22}|e^{j\theta_1}$ and $g_{23} = |g_{23}|e^{j\theta_2}$.
Orthogonality of the first and second rows means that
\[ 0 = g_{11} g_{21} + g_{12}|g_{22}|e^{j\theta_1} + g_{13}|g_{23}|e^{j\theta_2}. \tag{78} \]
Complex conjugation is not needed here because the first row
is real. The three terms in the above expression may be viewed
as the legs of a triangle, so by the law of cosines
\[ 2 g_{11} g_{21} g_{12} |g_{22}| \cos\theta_1 = g_{13}^2 |g_{23}|^2 - g_{11}^2 g_{21}^2 - g_{12}^2 |g_{22}|^2. \tag{79} \]
We now use (36) with m = l = 2 to infer that
\[ 2^{2R/3} = x^4 = \det(I + \alpha_2^2 G_{2,2} G_{2,2}^{\dagger}). \tag{80} \]
To expand this expression, we compute
\[
G_{2,2} G_{2,2}^{\dagger} = \begin{bmatrix} g_{11}^2 + g_{12}^2 & g_{11} g_{21} + g_{12}|g_{22}|e^{-j\theta_1} \\ (*) & g_{21}^2 + |g_{22}|^2 \end{bmatrix}, \tag{81}
\]
where (∗) is the complex conjugate of the upper right entry,
from which we find
\[
\det(I + \alpha_2^2 G_{2,2} G_{2,2}^{\dagger}) = \alpha_2^4 (g_{11}^2 |g_{22}|^2 + g_{12}^2 g_{21}^2 - 2 g_{11} g_{21} g_{12} |g_{22}| \cos\theta_1) + \alpha_2^2 (g_{11}^2 + g_{12}^2 + g_{21}^2 + |g_{22}|^2) + 1. \tag{82}
\]
Substituting (79) into (82) to eliminate the cosine term and
using (80) yields
\[
x^4 = \alpha_2^4 (g_{11}^2 |g_{22}|^2 + g_{12}^2 g_{21}^2 - g_{13}^2 |g_{23}|^2 + g_{11}^2 g_{21}^2 + g_{12}^2 |g_{22}|^2) + \alpha_2^2 (g_{11}^2 + g_{12}^2 + g_{21}^2 + |g_{22}|^2) + 1. \tag{83}
\]
Finally, substituting the expressions for $g_{11}^2$, $g_{12}^2$, $g_{13}^2$, $g_{21}^2$,
and $\alpha_2^2$ computed above, using the power constraint
\[ |g_{23}|^2 = P - |g_{22}|^2 - g_{21}^2, \tag{84} \]
solving for $|g_{22}|^2$, and simplifying, we arrive at
\[ |g_{22}|^2 = (x^5 + 1)(x - 1). \tag{85} \]
The power constraint (84) then immediately yields
\[ |g_{23}|^2 = x(x^2 - 1). \tag{86} \]
The squared moduli of the entries of the last row of G
follow immediately from the norm constraint on the columns:
\[ g_{31}^2 = P - g_{21}^2 - g_{11}^2 = x^2(x^2 - x + 1)(x^2 - 1), \tag{87} \]
\[ |g_{32}|^2 = P - |g_{22}|^2 - g_{12}^2 = x(x^3 + 1)(x - 1), \tag{88} \]
and
\[ |g_{33}|^2 = P - |g_{23}|^2 - g_{13}^2 = (x^3 + 1)(x - 1). \tag{89} \]
This completes the calculation of the squared modulus of the
entries of G. In summary, we have shown that G has the form
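\[
G = \sqrt{x - 1} \cdot \begin{bmatrix}
\sqrt{x + 1} & \sqrt{x^2 (x + 1)} & \sqrt{x^4 (x + 1)} \\
\sqrt{x^3 (x + 1)} & e^{j\theta_1} \sqrt{x^5 + 1} & e^{j\theta_2} \sqrt{x (x + 1)} \\
\sqrt{x^2 (x^3 + 1)} & e^{j\theta_3} \sqrt{x (x^3 + 1)} & e^{j\theta_4} \sqrt{x^3 + 1}
\end{bmatrix} \tag{90}
\]
where $x = 2^{R/6}$.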
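As a quick sanity check on the algebra, the squared moduli in (90) can be evaluated numerically. The sketch below is illustrative only: the function name `squared_moduli` and the value R = 4 are assumptions, and log is taken to be base 2, consistent with x = 2^{R/6}. It checks that every row and every column of squared moduli sums to P = x^6 − 1, and that the first row meets (67)–(69).

```python
import math

def squared_moduli(R):
    """Return (P, M) with M[m][l] = |g_{m+1,l+1}|^2 taken from (71)-(89)."""
    x = 2.0 ** (R / 6.0)
    P = x**6 - 1.0
    M = [
        [x**2 - 1, x**2 * (x**2 - 1), x**4 * (x**2 - 1)],
        [x**3 * (x**2 - 1), (x**5 + 1) * (x - 1), x * (x**2 - 1)],
        [x**2 * (x**3 + 1) * (x - 1), x * (x**3 + 1) * (x - 1), (x**3 + 1) * (x - 1)],
    ]
    return P, M

R = 4.0  # hypothetical ceiling rate, bits per complex symbol
P, M = squared_moduli(R)

# Scaled-unitary structure: each row and each column has squared norm P.
row_sums = [sum(row) for row in M]
col_sums = [sum(M[m][l] for m in range(3)) for l in range(3)]

# Rates accumulated over layers 1..l in the first block, cf. (67)-(69).
first_block_rates = [math.log2(1 + sum(M[0][: l + 1])) for l in range(3)]
```

The check covers only the moduli; the phases θ1, ..., θ4 are a separate question, treated next in the text.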
We must now establish the existence of suitable θ1 , . . . , θ4 .
To resolve this question it suffices to consider the consequences of the orthogonality constraint (78) on θ1 and θ2 .
As remarked at the start of this section, the last row of G and
hence θ3 and θ4 come “for free” once we have the first two
rows of G.
Substituting the expressions for $|g_{ml}|^2$ determined above
into (78) and canceling common terms yields
\[ 0 = \sqrt{x} + e^{j\theta_1} \sqrt{x^4 - x^3 + x^2 - x + 1} + e^{j\theta_2} \sqrt{x^3}. \tag{91} \]
The right-hand side is a sum of three phasors of predetermined
magnitude, two of which can be freely adjusted in phase. In
geometric terms, the equation has a solution if we can arrange
the three complex phasors into a triangle, which is possible if
and only if the longest side of the triangle is no longer than the
sum of the lengths of the shorter sides. The resulting triangle
is unique (up to complex conjugation of all the phasors). Now,
the middle term of (91) grows faster in x than the others, so
for large x we cannot possibly construct the desired triangle.
A necessary condition for a solution is thus
\[ \sqrt{x} + \sqrt{x^3} \ge \sqrt{x^4 - x^3 + x^2 - x + 1}, \tag{92} \]
where equality can be shown (after some manipulation) to hold
at the largest root of $x^2 - 3x + 1$, i.e., at $x = (3 + \sqrt{5})/2$, or
equivalently $R = 6 \log_2 x = 6 \log_2 (3 + \sqrt{5}) - 6$. It becomes
evident by numerically plotting the quantities involved that this
necessary condition is also sufficient, i.e., a unique solution to
(91) exists for all values of x in the range $1 < x \le (3 + \sqrt{5})/2$
and no others. Establishing this fact algebraically is an
unrewarding though straightforward exercise.
A relatively compact formula for θ1 may be found by
applying the law of cosines to (91):
\[ \cos(\pi - \theta_1) = \frac{x^4 - 2x^3 + x^2 + 1}{2 \sqrt{x (x^4 - x^3 + x^2 - x + 1)}}. \tag{93} \]
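The triangle argument behind (91)–(93) is easy to try numerically. The following is a minimal sketch, not the authors' procedure: the helper name `phases` and the test value x = 2 are assumptions. It returns one of the two conjugate solutions when (92) holds and `None` otherwise.

```python
import cmath
import math

def phases(x):
    """Solve (91) for (theta1, theta2); return None when (92) fails."""
    a = math.sqrt(x)                              # fixed real phasor
    b = math.sqrt(x**4 - x**3 + x**2 - x + 1)     # magnitude of the theta1 term
    c = math.sqrt(x**3)                           # magnitude of the theta2 term
    if b > a + c:                                 # longest side too long: no triangle
        return None
    # Law of cosines applied to |a + b e^{j theta1}| = c.
    cos_t1 = (c * c - a * a - b * b) / (2 * a * b)
    theta1 = math.acos(max(-1.0, min(1.0, cos_t1)))
    # theta2 is whatever phase closes the triangle.
    theta2 = cmath.phase(-(a + b * cmath.exp(1j * theta1)))
    return theta1, theta2

# Largest x (and ceiling rate R) for which the triangle closes.
x_max = (3 + math.sqrt(5)) / 2
R_max = 6 * math.log2(x_max)

x = 2.0  # hypothetical operating point inside the valid range
theta1, theta2 = phases(x)
residual = abs(
    math.sqrt(x)
    + cmath.exp(1j * theta1) * math.sqrt(x**4 - x**3 + x**2 - x + 1)
    + cmath.exp(1j * theta2) * math.sqrt(x**3)
)
```

The other solution is obtained by negating both angles, which corresponds to conjugating all the phasors.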
Similar formulas may be derived for θ2, θ3, and θ4.

APPENDIX B
POWER ALLOCATION
The power allocation satisfying the property (50) can be obtained as the solution to a different but closely related rateless
code optimization problem. Specifically, let us retain the block
structuring and layering of the code of Section VII-A, but
instead of using repetition and dithering in the construction,
let us consider a code where the codebooks in a given layer
are independent from block to block. While such a code is still
successively decodable, it does not retain other characteristics
that make decoding possible with low complexity. However,
the complexity characteristic is not of interest. What does
matter to us is that the per-layer, per-block SNRs that result
from a particular power allocation will be identical to those
of the code of Section VII-A for the same power allocation.
Thus, in tailoring our code in this Appendix to meet (50), we
simultaneously ensure our code of Section VII-A will as well.
We begin by recalling a useful property of layered codes in
general that we will apply. Consider an AWGN channel with
gain β and noise z of variance $\sigma^2$, and consider an L-layer
block code that is successively decodable. If the constituent
codes are capacity-achieving i.i.d. Gaussian codes, and MMSE
successive cancellation is used, then the overall code will be
capacity achieving. More specifically, for any choice of powers
$p_l$ for layers l = 1, 2, . . . , L that sum to the power constraint
P, the associated rates $I_l$ for these layers will sum to the
corresponding capacity $\log(1 + |\beta|^2 P/\sigma^2)$. Equivalently, for
any choice of rates $I_l$ that sum to capacity, the associated
powers $p_l$ will sum to the corresponding power constraint. In
this latter case, any rate allocation that yields powers that are
all nonnegative is a valid one.
To see this, let the relevant codebooks for the layers be
$\tilde{C}_1, \ldots, \tilde{C}_L$, and let the overall codeword be denoted
\[ \tilde{c} = \tilde{c}_1 + \cdots + \tilde{c}_L, \tag{94} \]
where the $\tilde{c}_l \in \tilde{C}_l$ are independently selected codewords drawn
for each layer. The overall code rate is the sum of the rates
of the individual codes. The overall power of the code is
$P = p_1 + \cdots + p_L$.
From the mutual information decomposition
\[ I(\tilde{c}; y) = \sum_{l=1}^{L} I_l \tag{95} \]
where
\[ I_l = I(\tilde{c}_l ; \tilde{c}_1 + \cdots + \tilde{c}_L + z \mid \tilde{c}_{l+1}^{L}), \]
with $\tilde{c}_{l+1}^{L} = (\tilde{c}_{l+1}, \tilde{c}_{l+2}, \ldots, \tilde{c}_L)$, we see that the overall codebook power constraint P can be met by apportioning power
to layers in any way desired, so long as $p_1 + \cdots + p_L = P$.
Since the undecoded layers are treated as noise, the maximum
codebook rate for the lth layer is then
\[ I_l = \log(1 + \mathrm{SNR}_l) \tag{96} \]
where
\[ \mathrm{SNR}_l = \frac{|\beta|^2 p_l}{|\beta|^2 p_1 + |\beta|^2 p_2 + \cdots + |\beta|^2 p_{l-1} + \sigma^2} \tag{97} \]
is the effective SNR when decoding the lth layer. Straightforward algebra, which amounts to a special-case recalculation
of (95), confirms that I1 + · · · + IL = log(1 + |β|2 P/σ 2 ) for
any selection of powers {pl }.
Alternatively, instead of selecting per-layer powers and
computing corresponding rates, one can select per-layer rates
and compute the corresponding powers. The rates {Il } for
each level may be set in any way desired so long as the
total rate I1 + · · · + IL does not exceed the channel capacity
log(1 + |β|2 P/σ 2 ). The required powers {pl } may then be
found using (96) and (97) recursively for l = 1, . . . , L. There
is no need to verify the power constraint: it follows from
(95) that the powers computed in this way sum to P . Thus
it remains only to check that the {pl } are all nonnegative to
ensure that the rate allocation is a valid one.
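The two directions described above (powers to rates, and rates to powers) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function names are hypothetical, log is taken as base 2, and the example powers are chosen only so that the arithmetic is easy to follow.

```python
import math

def layer_rates(powers, beta2=1.0, sigma2=1.0):
    """Rates I_l from (96)-(97): layer l treats layers 1..l-1 as noise."""
    rates, below = [], 0.0
    for p in powers:
        snr = beta2 * p / (beta2 * below + sigma2)
        rates.append(math.log2(1.0 + snr))
        below += p
    return rates

def layer_powers(rates, beta2=1.0, sigma2=1.0):
    """Invert (96)-(97) recursively for l = 1, ..., L."""
    powers, below = [], 0.0
    for r in rates:
        p = (2.0**r - 1.0) * (below + sigma2 / beta2)
        powers.append(p)
        below += p
    return powers

# Any split of P = 255 across layers gives rates summing to log2(1 + P).
rates = layer_rates([3.0, 12.0, 48.0, 192.0])
capacity = math.log2(1.0 + 255.0)

# Conversely, any rates summing to capacity give powers summing to P.
uneven = layer_powers([1.0, 3.0, 0.5, 3.5])
```

In the second direction nothing forces the computed powers to be nonnegative in general, which is why the text singles that out as the only remaining condition to check.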
We now apply this insight to our rateless context. The target
ceiling rate for our rateless code is R, and, as before, αm ,
m = 1, 2, . . . , denotes the threshold channel gains as obtained
via (29).
Comparing (50) with (96) and (97) reveals that (50) can be
rewritten as
\[ R_l = \sum_{m'=1}^{m} I_{m',l}(\alpha_m), \tag{98} \]
for all l = 1, 2, . . . , L and m = 1, 2, . . ., where
\[ R_l = R/L \tag{99} \]
and Im′ ,l (αm ) is the mutual information in layer l from block
m′ when the realized channel gain is αm . Thus, meeting (50) is
equivalent to finding powers pm′ ,l for each code block m′ and
layer l so that for the given rate allocation Rl (a) the powers
are nonnegative, (b) the power constraint is met, and (c) when
the channel gain is αm , the mutual information accumulated
at the lth layer after receiving code blocks 1, 2, . . . , m equals
Rl .
Since the power constraint is automatically satisfied by any
assignment of powers that achieves the target rates, it suffices
to establish that (98) has a solution with nonnegative per-layer
powers.
The solution exists and is unique, as can be established by
induction on m. Specifically, for m = 1 the rateless code is
an ordinary layered code and the powers p1,1 , . . . , p1,L may
be computed recursively from [cf. (98)]
\[ R_l = \sum_{m'=1}^{m} \log(1 + \mathrm{SNR}_{m',l}(\alpha_m)), \tag{100} \]
with $\mathrm{SNR}_{m,l}(\alpha_m)$ as given in (49) for l = 1, . . . , L.

TABLE IV
PER-LAYER POWER ASSIGNMENTS $p_{m,l}$ AND CHANNEL GAIN THRESHOLDS $\alpha_m$ FOR THE INITIAL BLOCKS OF AN L = 4 LAYER RATELESS CODE WITH TOTAL POWER P = 255, NOISE VARIANCE $\sigma^2 = 1$, AND PER-LAYER RATE R/L = 1 b/s/Hz.

         gain (dB)    l = 1    l = 2    l = 3    l = 4
m = 1       0.00       3.00    12.00    48.00   192.00
m = 2     -12.30      40.80    86.70    86.70    40.80
m = 3     -16.78      48.98    61.21    81.32    63.48
m = 4     -19.29      55.77    60.58    71.48    67.16
m = 5     -20.99      58.79    61.65    67.50    67.06

For the induction hypothesis, assume we have a power
assignment for the first m blocks that satisfies (100). To
find the power assignment for the (m + 1)st block, observe
that when the channel gain decreases from αm to αm+1
the per-layer mutual information of every block decreases.
A nonnegative power must be assigned to every layer in the
(m + 1)st code block to compensate for the shortfall.
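The induction step can be sketched numerically as follows. Several things here are assumptions rather than statements from the text: the function names; the rate convention, taken as ½·log2(1 + SNR) because that is the convention under which a per-layer rate of 1 and a factor of the form 2^{2Δ} match the powers tabulated in Table IV; and the threshold gain, taken to solve m·rate(α²P/σ²) = R as our reading of (29).

```python
import math

def rate(snr):
    # Assumed convention: 1/2 log2(1 + SNR).
    return 0.5 * math.log2(1.0 + snr)

def snr(alpha2, row, l, sigma2=1.0):
    # Layer l (0-based) treats layers 0..l-1 of the same block as noise.
    return alpha2 * row[l] / (alpha2 * sum(row[:l]) + sigma2)

def allocate(P, L, R_l, blocks, sigma2=1.0):
    """Per-block threshold gains (squared) and per-layer powers."""
    R = L * R_l
    gains, powers = [], []
    for m in range(1, blocks + 1):
        # Assumed threshold: m * rate(alpha2 * P / sigma2) = R.
        alpha2 = (2.0 ** (2.0 * R / m) - 1.0) * sigma2 / P
        row = []
        for l in range(L):
            # Shortfall left by the earlier blocks at the new, lower gain.
            shortfall = R_l - sum(rate(snr(alpha2, prev, l, sigma2)) for prev in powers)
            # Power that makes up the shortfall given the layers already set.
            row.append((2.0 ** (2.0 * shortfall) - 1.0) * (sum(row) + sigma2 / alpha2))
        gains.append(alpha2)
        powers.append(row)
    return gains, powers

gains, powers = allocate(P=255.0, L=4, R_l=1.0, blocks=5)
gains_db = [10.0 * math.log10(g) for g in gains]
```

With these assumptions the first block comes out as 3, 12, 48, 192, and each block's powers sum to P without the constraint being imposed explicitly.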
The mutual information shortfall in the lth layer is
\[ \Delta_{m+1,l} = R_l - \sum_{m'=1}^{m} \log(1 + \mathrm{SNR}_{m',l}(\alpha_{m+1})), \tag{101} \]
and the power $p_{m+1,l}$ needed to make up for this shortfall is
the solution to
\[ \Delta_{m+1,l} = \log(1 + \mathrm{SNR}_{m+1,l}(\alpha_{m+1})), \tag{102} \]
viz.,
\[ p_{m+1,l} = (2^{2\Delta_{m+1,l}} - 1) \cdot \left( p_{m+1,1} + \cdots + p_{m+1,l-1} + \frac{\sigma^2}{\alpha_{m+1}^2} \right). \tag{103} \]
This completes the induction. Perhaps counter to intuition,
even if the per-layer rates $R_1, \ldots, R_L$ are set equal, the per-layer shortfalls $\Delta_{m+1,1}, \ldots, \Delta_{m+1,L}$ will not be equal. Thus,
within a layer the effective SNR and mutual information will
vary from block to block.
Eqs. (101) and (103) are easily evaluated numerically. An
example is given in Table IV (see footnote 10).
Finally, since this result holds regardless of the choice of
the constituent $R_l$, it will hold for the particular choice (99),
whence (50).

Footnote 10: If one were aiming to use a rateless code of the type described in
Section VII in practice, in calculating a power allocation one should take into
account the gap to capacity of the particular base code being used. Details
of this procedure for the case of perfect rateless codes are given as part of
the description of the simulations in Section IX. For the case of near perfect
codes, the corresponding procedure is described in [28].

APPENDIX C
PERFECT L = 3, M = 2 RATELESS SOLUTION FOR TIME-VARYING CHANNEL
As the simplest example, for the case of M = 2 redundancy
blocks and L = 3 layers the constraints (64) can be met, i.e.,
a perfect rateless code is possible provided R is not too large.
In this case, we determine our gain matrix
\[ G = \begin{bmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \end{bmatrix} \tag{104} \]
as a function of the ceiling rate R, where the second row also
depends on the realized channel gain β1 experienced by the
first incremental redundancy block.
As in the time-invariant case, we may without loss of generality take the first row and column to be real and nonnegative.
Assume, also without loss of generality, that $\alpha_1 = 1$ and
$\sigma^2 = 1$. Then the first row of G, which corresponds to the first
redundancy block, is computed exactly as in the time-invariant
case. In particular, from (64) with m = 1, it must satisfy
\[ R/3 = \log(1 + g_{11}^2) \tag{105} \]
\[ 2R/3 = \log(1 + g_{11}^2 + g_{12}^2) \tag{106} \]
\[ 3R/3 = \log(1 + g_{11}^2 + g_{12}^2 + g_{13}^2) \tag{107} \]
together with the power constraint
\[ P = g_{11}^2 + g_{12}^2 + g_{13}^2. \tag{108} \]
Thus, with $x \triangleq 2^{R/6}$, we have
\[ P = 2^{R} - 1 = x^6 - 1 \tag{109} \]
and
\[ g_{11}^2 = 2^{R/3} - 1 = x^2 - 1, \tag{110} \]
\[ g_{12}^2 = 2^{R/3}(2^{R/3} - 1) = x^2 (x^2 - 1), \tag{111} \]
\[ g_{13}^2 = 2^{2R/3}(2^{R/3} - 1) = x^4 (x^2 - 1). \tag{112} \]
The derivation now departs from the time-invariant case.
Recall that β1 is the realized channel gain for the first block.
A second redundancy block is thus needed when |β1 | < α1 .
The target gain α2 is the solution to [cf. (63)]
\[ R = \log(1 + P |\beta_1|^2) + \log(1 + P \alpha_2^2), \tag{113} \]
which is
\[ \alpha_2^2 = \frac{1 - |\beta_1|^2}{1 + P |\beta_1|^2}. \tag{114} \]
Using (64) for m = 2 and l = 1 yields
\[ R/3 = \log(1 + |\beta_1|^2 g_{11}^2 + \alpha_2^2 g_{21}^2). \tag{115} \]
Substituting the previously computed expressions (110) and
(114) for $g_{11}^2$ and $\alpha_2^2$ and solving for $g_{21}$ yields
\[ g_{21}^2 = (x^2 - 1)(1 + P |\beta_1|^2). \tag{116} \]
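Expressions (113)–(116) are simple enough to verify numerically. The sketch below is illustrative: the function name and the values of R and |β1|² are hypothetical, and log is taken as base 2 with σ² = 1 and α1 = 1 as in the text.

```python
import math

def second_block_targets(R, beta1_sq):
    """Evaluate (109), (114), (110) and (116) for a realized gain |beta1|^2 < 1."""
    x = 2.0 ** (R / 6.0)
    P = x**6 - 1.0                                         # (109)
    alpha2_sq = (1.0 - beta1_sq) / (1.0 + P * beta1_sq)    # (114)
    g11_sq = x**2 - 1.0                                    # (110)
    g21_sq = (x**2 - 1.0) * (1.0 + P * beta1_sq)           # (116)
    return P, alpha2_sq, g11_sq, g21_sq

R, beta1_sq = 1.5, 0.4   # hypothetical ceiling rate and realized first-block gain
P, alpha2_sq, g11_sq, g21_sq = second_block_targets(R, beta1_sq)

# (113): the two blocks together deliver exactly R at the target gain.
total_rate = math.log2(1 + P * beta1_sq) + math.log2(1 + P * alpha2_sq)

# (115): the first layer accumulates exactly R/3 over the two blocks.
layer1_rate = math.log2(1 + beta1_sq * g11_sq + alpha2_sq * g21_sq)
```

Note that the second-row entry depends on the realized gain, so it can only be chosen after the first block's channel state is known.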
As in the time-invariant case, to solve for the rest of the
second row of G we use (64) with m = l = 2 together with
the requirement that the first and second rows be orthogonal.
It is useful at this stage to switch to polar coordinates, i.e.,
$g_{22} = |g_{22}|e^{j\theta_1}$ and $g_{23} = |g_{23}|e^{j\theta_2}$.
Orthogonality of the first and second rows means that
\[ 0 = g_{11} g_{21} + g_{12}|g_{22}|e^{j\theta_1} + g_{13}|g_{23}|e^{j\theta_2}. \tag{117} \]
The three terms in the above expression may be viewed as the
legs of a triangle, so by the law of cosines
\[ 2 g_{11} g_{21} g_{12} |g_{22}| \cos\theta_1 = g_{13}^2 |g_{23}|^2 - g_{11}^2 g_{21}^2 - g_{12}^2 |g_{22}|^2. \tag{118} \]
We now use (64) with m = l = 2 to infer that
\[ 2^{2R/3} = x^4 = \det(I + \mathrm{diag}\{|\beta_1|^2, \alpha_2^2\} G_{2,2} G_{2,2}^{\dagger}). \tag{119} \]
To expand this expression, we compute
\[
G_{2,2} G_{2,2}^{\dagger} = \begin{bmatrix} g_{11}^2 + g_{12}^2 & g_{11} g_{21} + g_{12}|g_{22}|e^{-j\theta_1} \\ (*) & g_{21}^2 + |g_{22}|^2 \end{bmatrix}, \tag{120}
\]
where (∗) is the complex conjugate of the upper right entry,
from which we find
\[
\det(I + \mathrm{diag}\{|\beta_1|^2, \alpha_2^2\} G_{2,2} G_{2,2}^{\dagger}) = |\beta_1|^2 \alpha_2^2 (g_{11}^2 |g_{22}|^2 + g_{12}^2 g_{21}^2 - 2 g_{11} g_{21} g_{12} |g_{22}| \cos\theta_1) + |\beta_1|^2 (g_{11}^2 + g_{12}^2) + \alpha_2^2 (g_{21}^2 + |g_{22}|^2) + 1. \tag{121}
\]
Substituting (118) into (121) and using (119) yields
\begin{align}
x^4 &= |\beta_1|^2 \alpha_2^2 \cdot (g_{11}^2 |g_{22}|^2 + g_{12}^2 g_{21}^2 - g_{13}^2 |g_{23}|^2 + g_{11}^2 g_{21}^2 + g_{12}^2 |g_{22}|^2) \tag{122} \\
&\quad + |\beta_1|^2 (g_{11}^2 + g_{12}^2) + \alpha_2^2 (g_{21}^2 + |g_{22}|^2) + 1. \tag{123}
\end{align}
Finally, substituting the expressions for $g_{11}^2$, $g_{12}^2$, $g_{13}^2$, $g_{21}^2$,
and $\alpha_2^2$ computed above, using the power constraint
\[ |g_{23}|^2 = P - |g_{22}|^2 - g_{21}^2, \tag{124} \]
solving for $|g_{22}|^2$, and simplifying terms, we arrive at
\[
|g_{22}|^2 = \frac{x^2 - 1}{1 + (x^6 - 1)|\beta_1|^2} \cdot \left[ x^2 + |\beta_1|^2 (x^{10} + x^8 - x^6 - x^4 - x^2 + 1) - |\beta_1|^4 (x^6 - 1)^2 \right]. \tag{125}
\]
Evidently, a necessary condition for the existence of a
solution for G is that $g_{21}^2 + |g_{22}|^2 < P$. It can be shown that
the sum of the powers on the first two layers is maximized
when $|\beta_1| \to 1$, and then the necessary condition simplifies to
\[ 2^{R+1} - 2^{2R/3+1} < 2^{R} - 1, \tag{126} \]
which may be shown to hold for all $R < \log(2 + \sqrt{5}) \approx 2.08$
bits per complex symbol.
The final step, a straightforward exercise whose details
we omit, is to apply the triangle inequality to (117) to
prove that the required triangle exists, and thus the phases θ1
and θ2.

APPENDIX D
NEAR-PERFECT RATELESS CODES FOR TIME-VARYING CHANNELS
Our construction is a slight generalization of the corresponding construction in Section VII for time-invariant channels. First, we fix M, R, L, and the associated codebooks
$C_1, \ldots, C_L$ each of rate $R'/L$ for some $R' < R$ to be
determined. Using $c_l(n)$ and $x_m(n)$ to denote the nth elements
of codeword $c_l$ and redundancy block $x_m$, respectively, we
again have (38).

Power Allocation
As in the corresponding development for the time-invariant
case in Appendix B, a suitable power allocation for our construction is obtained as that which is optimum for a slightly
different construction, which we now develop. In this section,
different (independent) codebooks are used for different redundancy blocks, and we take G(n) to be independent of n,
so that G(n) = P, where P is as given in (40).
The mutual information in the lth layer of the mth block is
then
\[ I_{m,l} = \log(1 + \mathrm{SNR}_{m,l}(\beta_m)), \tag{127} \]
where
\[ \mathrm{SNR}_{m,l}(\beta_m) = \frac{|\beta_m|^2 p_{m,l}}{|\beta_m|^2 (p_{m,1} + \cdots + p_{m,l-1}) + 1} \tag{128} \]
is the associated per-layer SNR experienced during successive
decoding.
We now obtain the elements of P recursively. We proceed
from the first block m = 1 to block M, where in each block m
we start by determining $P_{m,1}$ and proceed up through $P_{m,L}$.
By definition of α1, we have
\[ \log\left( 1 + \alpha_1^2 \sum_{l=1}^{L} P_{1,l} \right) = R. \]
Viewing the layering as superposition coding for a multiaccess channel, it is clear that any rate vector is achievable as
long as its sum-rate is R. We may therefore obtain an equal
rate per layer by taking $P_{1,1}, \ldots, P_{1,L}$ such that
\[ \log(1 + P_{1,l} \alpha_1^2) = R/L, \quad l = 1, \ldots, L. \tag{129} \]
Upon receiving knowledge of |β1| we proceed to determine the
power allocation for block m = 2. More generally, suppose the
power allocations through block m − 1 have been determined
and we have now acquired channel state knowledge through
$\beta_{m-1}$. To determine the allocation for block m, we first
compute the mutual information shortfall in layer l as
\[ \Delta_{m,l} = \frac{R}{L} - \sum_{m'=1}^{m-1} \log(1 + \mathrm{SNR}_{m',l}(\beta_{m'})). \tag{130} \]
By the induction hypothesis, had the realized channel gain
been $|\beta_{m-1}| = \alpha_{m-1}$, then $\Delta_{m,l}$ would be zero for all l =
1, . . . , L. Now since we have $|\beta_{m-1}| < \alpha_{m-1}$, clearly the
shortfall is positive for all layers. Also, by definition of $\alpha_m$,
we have
\[ \Delta_m = \sum_{l=1}^{L} \Delta_{m,l} = \log(1 + P \alpha_m^2). \tag{131} \]
We then solve for $p_{m,1}, \ldots, p_{m,L}$, in order, via
\[ \log(1 + \mathrm{SNR}_{m,l}(\alpha_m)) = \Delta_{m,l}. \tag{132} \]
The resulting power allocation ensures that the aggregate
mutual information per layer is at least R/L if $|\beta_m| > \alpha_m$
when i.i.d. Gaussian codebooks are used for all layers and blocks.
However, we wish to use the same set of L codebooks for
all redundancy blocks, to keep decoding complexity low. We
return to this problem next, but in doing so will exploit this
power allocation.

Encoding
In our construction we restrict our attention to an encoding
of the form described in Section VII-A. In particular, the G(n)
are of the form (39) with (40) and (41), with the dm,l (n) all
i.i.d. random variables in m, l, and n, and drawn independently
of all other random variables, including noises, messages, and
codebooks. As before, it is sufficient for dm,l (n) to take on
only values ±1, and with equal probability.
Decoding
Decoding proceeds in a manner analogous to that described
in Section VII-B for the time-invariant case. In particular, since
G(n) is drawn i.i.d., the overall channel is i.i.d., and thus
we may express the channel model in terms of an arbitrary
individual element in the block. Specifically, assume that the
channel gain for block m is the minimal required βm = αm ,
then our received symbol can be expressed as [cf. (42)]
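\[
\mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} = B_m G \begin{bmatrix} c_1 \\ \vdots \\ c_L \end{bmatrix} + \begin{bmatrix} z_1 \\ \vdots \\ z_m \end{bmatrix},
\]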
where G = P ⊙ D, with G denoting the arbitrary element
in the sequence G(n), and where ym′ is the corresponding
received symbol from redundancy block m′ (and similarly for
cm′ , zm′ , D).
As in the time-invariant case, it is sufficient to employ
successive cancellation decoding with simple maximal ratio
combining (MRC) of the redundancy blocks. In this case, the
effective SNR at which this lth layer is decoded from m blocks
via such MRC decoding is given by [cf. (48)]
\[ \mathrm{SNR}_{\mathrm{MRC}} = \sum_{m'=1}^{m} \mathrm{SNR}_{m',l}(\beta_m), \tag{133} \]
with $\mathrm{SNR}_{m',l}(\beta_m)$ as given in (128).
Efficiency Analysis
To show that the resulting scheme is asymptotically perfect,
we first note that when random dither encoding, MRC decoding, and capacity-achieving base codes are used, the mutual
information $I'_{m,l}$ satisfies [cf. (55)]
\[ I'_{m,l} \ge \log(1 + \mathrm{SNR}_{\mathrm{MRC}}(\beta_m)) \tag{134} \]
with $\mathrm{SNR}_{\mathrm{MRC}}(\beta_m)$ as in (133).
Again the efficiency of the scheme depends on the choice
of power allocation matrix (40). Recall that we may further
bound $I'_{m,l}$ for all m by (56). Thus, if we choose the rate
$R''/L$ of the base code in each layer to be (57), then (56)
ensures decodability after m blocks are received when the
channel gain satisfies |βm | ≥ αm , as required. Moreover, the
efficiency R′′ /R can be made as close as desired to one by
taking L sufficiently large.
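One way to build intuition for this efficiency claim is to compare the rate supported by MRC, log(1 + Σ SNR) as in (133)–(134), with the sum of per-block mutual informations that independent codebooks would collect, as in (127). The sketch below is only an illustration under assumptions: the function names and SNR values are hypothetical, log is base 2, and the link to large L (more layers meaning smaller per-layer SNRs) is our reading rather than a statement from this appendix.

```python
import math

def mrc_rate(snrs):
    """Rate supported after maximal ratio combining, cf. (133)-(134)."""
    return math.log2(1.0 + sum(snrs))

def independent_rate(snrs):
    """Sum of per-block mutual informations with independent codebooks, cf. (127)."""
    return sum(math.log2(1.0 + s) for s in snrs)

def efficiency(snrs):
    """Fraction of the independent-codebook rate retained by MRC."""
    return mrc_rate(snrs) / independent_rate(snrs)

high_snr = efficiency([3.0, 3.0])   # two blocks, strong per-layer SNR
low_snr = efficiency([0.1, 0.1])    # two blocks, weak per-layer SNR
```

The ratio never exceeds one, and it moves toward one as the per-block SNRs of a layer become small, which is the regime that many low-power layers would produce.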
ACKNOWLEDGMENTS
We thank the reviewers and associate editor for their helpful
comments, which led to a number of improvements in the
manuscript. We also acknowledge the efforts of Yuval Har-Zion, whose project at Tel Aviv University provided the
simulation results depicted in Fig. 3.
R EFERENCES
[1] G. Caire and D. Tuninetti, “The throughput of hybrid-ARQ protocols
for the Gaussian collision channel,” IEEE Trans. Inform. Theory, vol.
47, no. 5, July 2001.
[2] D. Chase, “Code combining—a maximum-likelihood decoding approach
for combining an arbitrary number of noisy packets,” IEEE Trans.
Commun., vol. 33, no. 5, pp. 385–393, May 1985.
[3] J. Ha, J. Kim and S. W. McLaughlin, “Rate-compatible puncturing of
low-density parity-check codes,” IEEE Trans. Inform. Theory, vol. 50,
no. 11, pp. 2824–2836, Nov. 2004.
[4] J. Hagenauer, “Rate-compatible punctured convolutional codes (RCPC
codes) and their applications,” IEEE Trans. Commun., vol. 36, no. 4,
pp. 389–400, Apr. 1988.
[5] T. Ji and W. Stark, “Rate-adaptive transmission over correlated fading
channels,” IEEE Trans. Commun., vol. 53, no. 10, pp. 1663–1670, Oct.
2005.
[6] S. Lin, D. Costello, and M. Miller, “Automatic-repeat-request error-control schemes,” IEEE Commun. Magazine, vol. 22, no. 12, pp. 5–17,
Dec. 1984.
[7] D. M. Mandelbaum, “An adaptive-feedback coding scheme using incremental redundancy,” IEEE Trans. Inform. Theory, vol. 20, no. 3, pp.
388–389, May 1974.
[8] D. N. Rowitch and L. B. Milstein, “On the performance of hybrid
FEC/ARQ systems using rate compatible punctured turbo (RCPT)
codes,” IEEE Trans. Commun., vol. 48, no. 6, pp. 948–959, June 2000.
[9] S. Sesia, G. Caire, and G. Vivier, “Incremental redundancy hybrid
ARQ schemes based on low-density parity-check codes,” IEEE Trans.
Commun., vol. 52, no. 8, pp. 1311–1321, Aug. 2004.
[10] N. Shulman and M. Feder, “Static broadcasting,” in Proc. Int. Symp.
Inform. Theory (ISIT), (Sorrento, Italy), p. 23, June 2000.
[11] N. Shulman, Universal channel coding, Ph.D. thesis, Tel-Aviv University, Ramat Aviv, Israel, 2004.
[12] A. Shokrollahi, “Raptor codes,” IEEE Trans. Inform. Theory, vol. 52,
no. 6, pp. 2551–2567, June 2006.
[13] J. W. Byers, M. Luby, and M. Mitzenmacher, “A digital fountain
approach to asynchronous reliable multicast,” IEEE J. Select. Areas
Commun., vol. 20, pp. 1528–1540, Oct. 2002.
[14] M. Luby, “Information additive code generator and decoder for communication systems,” U.S. Pat. No. 6307487, Oct. 23, 2001.
[15] A. Soljanin, N. Varnica, and P. Whiting, “Incremental redundancy hybrid
ARQ with LDPC and raptor codes,” submitted to IEEE Trans. Inform.
Theory (preprint, 2005).
[16] O. Etesami and A. Shokrollahi, “Raptor codes on binary memoryless
symmetric channels,” IEEE Trans. Inform. Theory, vol. 52, no. 5, pp.
2033–2051, May 2006.
[17] R. Palanki and J.S. Yedidia, “Rateless codes on noisy channels,” in Proc.
Int. Symp. Inform. Theory (ISIT), (Chicago, IL), p. 37, June 2004.
[18] A. S. Barbulescu and S. S. Pietrobon, “Rate compatible turbo codes,”
Electron. Lett., pp. 535–536, 1995.
[19] R. Mantha and F. R. Kschischang, “A capacity approaching hybrid ARQ
scheme using Turbo codes,” Proc. GLOBECOM, pp. 2341-2345, 1999.
[20] J. Li and K. R. Narayanan, “Rate-compatible low density parity check
codes for capacity-approaching ARQ scheme in packet data,” in Int.
Conf. Commun., Internet, Inform. Technol. (CIIT), (St. Thomas, USVI),
Nov. 2002.
[21] 3rd Generation Partnership Project (3GPP), “Evolved Universal Terrestrial Radio Access (E-UTRA); physical channels and modulation,”
3rd Generation Partnership Project, 3GPP TS 36.211, 2009. [Online].
Available: http://www.3gpp.org/ftp/Specs/html-info/36211.htm.
[22] 3rd Generation Partnership Project (3GPP), “Evolved Universal Terrestrial Radio Access (E-UTRA); multiplexing and channel coding,”
3rd Generation Partnership Project, 3GPP TS 36.212, 2009. [Online].
Available: http://www.3gpp.org/ftp/Specs/html-info/36212.htm.
[23] M. Doettling, T. Grundler, and A Seeger, “Incremental Redundancy
and Bit-Mapping Techniques for High-Speed Downlink Packet Access,”
in Proc. Global Commun. Conf. (GLOBECOM), (San Francisco, CA),
pp. 908–912, Dec. 2003.
[24] C. Wengerter, A. G. E. von Elbwart, E. Seidel, G. Velev, and
M. P. Schmitt, “Advanced hybrid ARQ technique employing a signal
constellation rearrangement,” in Proc. Vehic. Technol. Conf. (VTC),
(Vancouver, Canada), vol. 4, pp. 2002-2006, Sep. 2002.
[25] R. J. Barron, C. K. Lo, and J. M. Shapiro, “Global design methods
for Raptor codes using binary and higher-order modulations,” in Proc.
Military Commun. Conf. (MILCOM), (Boston, MA), Oct. 2009.
[26] U. Erez, G. W. Wornell, and M. D. Trott, “Rateless coding and perfect
rate-compatible codes for Gaussian channels,” in Proc. Int. Symp.
Inform. Theory (ISIT), (Seattle, WA), pp. 528–532, July 2006.
[27] U. Erez, G. W. Wornell, and M. D. Trott, “Coding for faster-than-Nyquist signaling: the merits of a regime change,” in Proc. Allerton
Conf. Commun., Contr., Computing, (Monticello, IL), Sep. 2004.
[28] J. M. Shapiro, R. J. Barron, and G. W. Wornell, “Practical layered
rateless codes for the Gaussian channel: power allocation and implementation,” in Int. Workshop Signal Processing Adv. Wireless Commun.
(SPAWC), (Helsinki, Finland), June 2007.
[29] M. Shanechi, Universal codes for parallel Gaussian channels, S.M.
thesis, Massachusetts Institute of Technology, Cambridge, MA, 2006.
[30] M. Shanechi, U. Erez, and G. W. Wornell, “On universal coding for
parallel Gaussian channels,” in Proc. Int. Zurich Seminar Commun.,
(Zurich, Switzerland), Mar. 2008.
[31] M. Shanechi, U. Erez, G. W. Wornell, and K. P. Boyle, “Time-invariant
rateless codes for MIMO channels,” in Proc. Int. Symp. Inform. Theory,
(Toronto, Canada), July 2008.
[32] J. Jiang and K. R. Narayanan, “Multilevel coding for channels with
non-uniform inputs and rateless transmission over the BSC,” Proc. Int.
Symp. Inform. Theory (ISIT), (Seattle, WA), pp. 518–521, July 2006.
[33] M. Shanechi, U. Erez, and G. W. Wornell, “Rateless coding for MIMO
channels,” in Proc. Int. Global Commun. Conf. (GLOBECOM), (New
Orleans, LA), Nov. 2008.
[34] U. Erez, M. D. Trott, and G. W. Wornell, “An efficient ARQ scheme
with SNR feedback,” in Proc. Int. Zurich Seminar Commun., (Zurich,
Switzerland), Mar. 2008.
[35] U. Erez, G. W. Wornell, and M. D. Trott, “Rateless space-time coding,”
in Proc. Int. Symp. Inform. Theory (ISIT), (Adelaide, Australia), pp.
1937–1941, Sep. 2005.
[36] U. Niesen, U. Erez, D. Shah, and G. W. Wornell, “Rateless codes for the
Gaussian multiple-access channel,” in Proc. IEEE GLOBECOM, Nov.
2006.