Turbo Codes
Manjunatha. P
manjup.jnnce@gmail.com
Professor
Dept. of ECE
Note:
Slides are prepared for classroom use and may be used as reference material.
All the slides are prepared based on the reference material.
Most of the figures used in this material are redrawn; some of the
figures/pictures are downloaded from the Internet.
This material is not for commercial purposes.
This material is prepared for the Advanced Digital Communication course
of the DECS M.Tech programme as per the Visvesvaraya Technological University
(VTU) syllabus (Karnataka State, India).
Introduction
[1, 2, 3, 4, 5, 6]
Concatenated coding schemes were first proposed by Forney as a method for achieving
large coding gains by combining two or more relatively simple building block or
component codes
The resulting codes had the error-correction capability of much longer codes, and they
were endowed with a structure that permitted relatively easy to moderately complex
decoding
A turbo code can be thought of as a refinement of the concatenated encoding structure
plus an iterative algorithm for decoding the associated code sequence
Turbo codes were first introduced in 1993 by Berrou, Glavieux, and Thitimajshima
a scheme is described that achieves a bit-error probability of $10^{-5}$ using a rate-1/2 code
over an additive white Gaussian noise (AWGN) channel and BPSK modulation at an
$E_b/N_0$ of 0.7 dB
The codes are constructed by using two or more component codes on different interleaved
versions of the same information sequence
For a system with two component codes, the concept behind turbo decoding is to pass
soft decisions from the output of one decoder to the input of the other decoder, and to
iterate this process several times so as to produce more reliable decisions.
Likelihood functions
For communications engineering, where applications involving an AWGN channel are of
great interest, the most useful form of Bayes' theorem expresses the a posteriori
probability (APP) of a decision in terms of a
continuous-valued random variable x in the following form:

$P(d = i \mid x) = \dfrac{p(x \mid d = i)\, P(d = i)}{p(x)}, \qquad i = 1, \ldots, M \qquad (1)$
$P(d = i)$, called the a priori probability, is the probability of occurrence of the ith signal
class.
$p(x)$ is the pdf of the received signal x, yielding the test statistic, taken over the entire space of
signal classes:

$p(x) = \sum_{i=1}^{M} p(x \mid d = i)\, P(d = i) \qquad (2)$
A line subtended from $x_k$ intercepts the two likelihood functions, yielding two likelihood
values $\ell_1 = p(x_k \mid d_k = +1)$ and $\ell_2 = p(x_k \mid d_k = -1)$
A well-known hard-decision rule, known as maximum likelihood, is to choose the data $d_k =
+1$ or $d_k = -1$ associated with the larger of the two intercept values, $\ell_1$ or $\ell_2$, respectively.
A similar decision rule, known as maximum a posteriori (MAP), which can be shown to be
a minimum-probability-of-error rule, takes into account the a priori probabilities of the
data:

$P(d = +1 \mid x) \underset{H_2}{\overset{H_1}{\gtrless}} P(d = -1 \mid x) \qquad (3)$

Equation (3) states that you should choose hypothesis $H_1$ ($d = +1$) if the APP
$P(d = +1 \mid x)$ is greater than the APP $P(d = -1 \mid x)$; otherwise, you should choose
hypothesis $H_2$ ($d = -1$)
Equation (3) can be rewritten using Bayes' theorem as

$p(x \mid d = +1)\, P(d = +1) \underset{H_2}{\overset{H_1}{\gtrless}} p(x \mid d = -1)\, P(d = -1) \qquad (4)$

Equation (4) is generally expressed in terms of a ratio, yielding the likelihood ratio test:

$\dfrac{p(x \mid d = +1)}{p(x \mid d = -1)} \underset{H_2}{\overset{H_1}{\gtrless}} \dfrac{P(d = -1)}{P(d = +1)} \qquad (5)$
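As a numerical illustration (not from the source), a minimal Python sketch of the maximum-likelihood and MAP rules for antipodal signals in Gaussian noise follows; the values of x, sigma, and prior_plus are assumptions for the example.

import math

def gaussian_pdf(x, mean, sigma):
    # Likelihood p(x | d) for AWGN with the given mean and standard deviation
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def ml_decision(x, sigma):
    # Maximum likelihood: pick the larger of l1 = p(x | d = +1), l2 = p(x | d = -1)
    l1 = gaussian_pdf(x, +1.0, sigma)
    l2 = gaussian_pdf(x, -1.0, sigma)
    return +1 if l1 > l2 else -1

def map_decision(x, sigma, prior_plus):
    # MAP (eqn 4): weight each likelihood by its a priori probability
    l1 = gaussian_pdf(x, +1.0, sigma) * prior_plus
    l2 = gaussian_pdf(x, -1.0, sigma) * (1.0 - prior_plus)
    return +1 if l1 > l2 else -1

x, sigma = -0.1, 1.0
print(ml_decision(x, sigma))        # -1: x lies below the midpoint of the two means
print(map_decision(x, sigma, 0.8))  # +1: a strong prior for d = +1 shifts the threshold

The two rules disagree near the decision boundary, which is exactly the effect of the a priori term in equation (5).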
Log-Likelihood Ratio
By taking the logarithm of the likelihood ratio developed in Equations (3) through (5), we
obtain a useful metric called the log-likelihood ratio (LLR)
It is a real number representing a soft decision out of a detector, designated as follows:

$L(d \mid x) = \log\left[\dfrac{P(d = +1 \mid x)}{P(d = -1 \mid x)}\right] = \log\left[\dfrac{p(x \mid d = +1)}{p(x \mid d = -1)}\right] + \log\left[\dfrac{P(d = +1)}{P(d = -1)}\right] = L(x \mid d) + L(d) \qquad (6)$

where $L(x \mid d)$ is the LLR of the test statistic x obtained by measurements of the channel
output x under the alternate conditions that $d = +1$ or $d = -1$ may have been
transmitted, and $L(d)$ is the a priori LLR of the data bit d.
the output LLR $L(\hat{d})$ of the decoder is

$L(\hat{d}) = L_c(x) + L(d) + L_e(\hat{d})$

The sign of $L(\hat{d})$ denotes the hard decision; that is, for positive values of $L(\hat{d})$ decide that
$d = +1$, and for negative values decide that $d = -1$. The magnitude of $L(\hat{d})$ denotes the
reliability of that decision. Often, the value of $L_e(\hat{d})$ due to the decoding has the same
sign as $L_c(x) + L(d)$, and therefore acts to improve the reliability of $L(\hat{d})$.
Principles of Iterative (Turbo) Decoding
In a typical communications receiver, a demodulator is often designed to produce soft
decisions, which are then transferred to a decoder
The error-performance improvement of systems utilizing such soft decisions compared to
hard decisions is typically approximated as 2 dB in AWGN. Such a decoder could be
called a soft-input/hard-output decoder
With turbo codes, where two or more component codes are used, and decoding involves
feeding outputs from one decoder to the inputs of other decoders in an iterative fashion, a
hard-output decoder would not be suitable
for the decoding of turbo codes, a soft-input/soft-output decoder is needed
$u_k = \sum_{i=0}^{K-1} g_{1i}\, d_{k-i} \bmod 2, \qquad g_{1i} = 0, 1 \qquad (1)$

$v_k = \sum_{i=0}^{K-1} g_{2i}\, d_{k-i} \bmod 2, \qquad g_{2i} = 0, 1 \qquad (2)$
$G_1 = \{g_{1i}\}$ and $G_2 = \{g_{2i}\}$ are the code generators, and $d_k$ is represented as a binary digit
The nonsystematic convolutional (NSC ) code is shown in figure
$a_k = d_k + \sum_{i=1}^{K-1} g_i'\, a_{k-i} \bmod 2 \qquad (3)$
The free distance and trellis structures are identical for the RSC code and the NSC code
The two output sequences {uk } and {vk } do not correspond to the same input sequence {dk } for RSC and NSC codes
Encoding the input data sequence $\{d_k\}$ = 1 1 1 0 involves the following steps, as given in the table below
1 At any instant of time, a data bit $d_k$ is transformed to $a_k$ by summing it (mod 2) to the bits $a_{k-1}$ and $a_{k-2}$ on the same
row
2 For example, at time k=2, the data bit dk =1 is transformed to ak = 0 by summing it to the bits ak−1 and ak−2 on
the same row
The resulting output, uk vk =10 dictated by the encoder logic circuitry, is the coded-bit sequence associated with time
k=2
At time k=2, the contents, 10, of the rightmost two stages, ak−1 ak−2 represents the state of the machine at the start
of that transition
The state at the end of that transition is seen as the contents, 01, in the two leftmost stages, ak ak−1 on that same row
Each row can be described in the same way. Thus the encoded sequence is 1 1 1 0 1 1 0 0
k    dk    ak    ak−1    ak−2    uk    vk
1    1     1     0       0       1     1
2    1     0     1       0       1     0
3    1     0     0       1       1     1
4    0     0     0       0       0     0
5    -     -     0       0       -     -
(row 5 shows only the final register contents $a_{k-1}\, a_{k-2}$ = 0 0)
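The table can be reproduced with a short Python sketch of the encoder; the feedback taps (1, 1, 1) and parity taps (1, 0, 1) are assumptions inferred from the worked example, since the encoder figure is not reproduced here.

def rsc_encode(data):
    # Rate-1/2 systematic recursive encoder: u_k = d_k (systematic),
    # a_k = d_k + a_{k-1} + a_{k-2} (mod 2), parity v_k = a_k + a_{k-2} (mod 2).
    a1 = a2 = 0                 # shift-register contents a_{k-1}, a_{k-2}
    out = []
    for d in data:
        a = (d + a1 + a2) % 2   # recursive bit a_k (eqn 3 with assumed taps)
        u = d                   # systematic output u_k
        v = (a + a2) % 2        # parity output v_k
        out.extend([u, v])
        a1, a2 = a, a1          # shift the register
    return out

print(rsc_encode([1, 1, 1, 0]))  # [1, 1, 1, 0, 1, 1, 0, 0]

The printed sequence 1 1 1 0 1 1 0 0 agrees with the encoded sequence stated above.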
The switch yielding vk makes the overall code rate 1/2, otherwise the code rate would be 1/3
There is no limit to the number of encoders that may be concatenated, and, in general, the component codes need not
be identical with regard to constraint length and rate
The goal in designing turbo codes is to choose the best component codes by maximizing the effective free distance of
the code; at low values of $E_b/N_0$, the weight distribution of the codewords must be optimized
The weight distribution for the codewords depends on how the codewords from one of the component encoders are
combined with codewords from the other encoder
The pairing of low-weight codewords can be avoided by proper design of the interleaver
However, the interleaver would not influence the output codeword weight distribution if the component codes were not
recursive
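As an illustration of the interleaving idea (not an interleaver design from the source), a minimal Python sketch of a pseudo-random block interleaver and its inverse follows; practical turbo interleavers are designed far more carefully to break up pairings of low-weight codewords.

import random

def make_interleaver(n, seed=42):
    # A pseudo-random permutation of block length n (illustrative only)
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm

def interleave(bits, perm):
    return [bits[p] for p in perm]

def deinterleave(bits, perm):
    out = [0] * len(perm)
    for i, p in enumerate(perm):
        out[p] = bits[i]        # undo the permutation
    return out

perm = make_interleaver(8)
d = [1, 0, 1, 1, 0, 0, 1, 0]
assert deinterleave(interleave(d, perm), perm) == d  # round trip recovers the data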
A Feedback Decoder
A decoded bit dk = i can be derived from the joint probability as given in equation 4
$\lambda_k^{i,m} = P\left\{d_k = i,\, S_k = m \mid R_1^N\right\} \qquad (4)$
where $S_k = m$ is the encoder state at time k, and $R_1^N$ is a received binary sequence from time k = 1 through some time N
Thus the APP that a decoded data bit dk = i, represented as a binary digit, is obtained by summing the joint
probability over all states as given in equation 5
$P\left\{d_k = i \mid R_1^N\right\} = \sum_m \lambda_k^{i,m}, \qquad i = 0, 1 \qquad (5)$
The log-likelihood ratio (LLR) is written as the logarithm of the ratio of APPs, as given in equation 6

$L\left(\hat{d}_k\right) = \log \dfrac{\sum_m \lambda_k^{1,m}}{\sum_m \lambda_k^{0,m}} \qquad (6)$
The decoder makes a decision, known as the maximum a posteriori (MAP) decision rule, by comparing $L(\hat{d}_k)$ to a zero
threshold as given in equations 7 and 8

$\hat{d}_k = 1 \quad \text{if} \quad L(\hat{d}_k) > 0 \qquad (7)$

$\hat{d}_k = 0 \quad \text{if} \quad L(\hat{d}_k) < 0 \qquad (8)$
For a systematic code, the LLR $L(\hat{d}_k)$ associated with each decoded bit $\hat{d}_k$ can be described as the sum of the LLR of
$\hat{d}_k$ out of the demodulator and of other LLRs generated by the decoder
Consider the detection of a noisy data sequence that comes from the encoder shown in figure 5 with the use of a
decoder shown in figure
The decoder input is made up of a set $R_k$ of two random variables $x_k$ and $y_k$, expressed as in equations 9 and 10
$x_k = (2d_k - 1) + i_k \qquad (9)$

$y_k = (2v_k - 1) + q_k \qquad (10)$

$i_k$ and $q_k$ are two statistically independent random variables accounting for the noise contribution
The redundant information, yk , is demultiplexed and sent to decoder DEC1 as y1k when vk = v1k , and to decoder
DEC2 as y2k when vk = v2k
The output of DEC1 is interleaved using an interleaver identical to the one used at the transmitter between the two encoders.
This is because the information processed by DEC1 is the noninterleaved output of C1 (corrupted by channel noise).
Conversely, the information processed by DEC2 is the noisy output of C2, whose input is the same data going into C1,
but permuted by the interleaver.
$L(\hat{d}_k) = L_c(x_k) + L_e(\hat{d}_k) \qquad (11)$

$L(\hat{d}_k) = \log\left[\dfrac{p(x_k \mid d_k = 1)}{p(x_k \mid d_k = 0)}\right] + L_e(\hat{d}_k) \qquad (12)$
$L_c(x_k)$ and $L_e(\hat{d}_k)$ are corrupted by uncorrelated noise, and thus $L_e(\hat{d}_k)$ may be used as a new observation of $d_k$
by another decoder; the principle here is that a decoder should never be supplied with information that stems from
its own input
$L_c(x_k) = -\dfrac{1}{2}\left(\dfrac{x_k - 1}{\sigma}\right)^2 + \dfrac{1}{2}\left(\dfrac{x_k + 1}{\sigma}\right)^2 = \dfrac{2}{\sigma^2}\, x_k \qquad (13)$
If the inputs $L_1(\hat{d}_k)$ and $y_{2k}$ to decoder DEC2 are statistically independent, then the LLR $L_2(\hat{d}_k)$ at the output of
DEC2 is

$L_2(\hat{d}_k) = f\left[L_1(\hat{d}_k)\right] + L_{e2}(\hat{d}_k) \qquad (14)$

with

$L_1(\hat{d}_k) = \dfrac{2}{\sigma_0^2}\, x_k + L_{e1}(\hat{d}_k) \qquad (15)$
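A deliberately simplified Python skeleton of this feedback structure may help; siso1 and siso2 are hypothetical placeholder soft-in/soft-out decoders (their internals are not specified here), and the interleaving between decoders is omitted because only a single bit position is tracked.

def iterate_llrs(x_k, sigma, siso1, siso2, n_iter=4):
    # Skeleton of the eqn 11-15 exchange for one bit: each decoder receives the
    # channel LLR plus the other decoder's extrinsic term, and passes on only
    # extrinsic information, never information stemming from its own input.
    L_c = 2.0 * x_k / sigma ** 2   # channel LLR, eqn 13
    Le1 = Le2 = 0.0
    for _ in range(n_iter):
        Le1 = siso1(L_c + Le2)     # DEC1: channel value plus DEC2's extrinsic LLR
        Le2 = siso2(L_c + Le1)     # DEC2: channel value plus DEC1's extrinsic LLR
    return L_c + Le1 + Le2         # final soft output; its sign is the hard decision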
Introduction
In traditional decoding, the demodulator makes a hard decision on each received symbol.
Its disadvantage is that if the value of some bit is determined with greater certainty, the decoder cannot make use of
this information.
In turbo codes, Soft-In Soft-Out (SISO) decoding is used.
Since turbo coding has two encoders, there will be two decoders, one for the output of each encoder.
For each data bit, the decoder outputs an estimate expressing the probability of the transmitted data bit (i.e., the a
posteriori probability, APP) over a number of iterations.
At each iteration, the decoder re-evaluates this estimate using information from the other decoder, and only in the final
stage are hard decisions made.
where $\lambda_k^{i,m}$ is the joint probability that $d_k = i$ and state $S_k = m$, conditioned on the received binary sequence $R_1^N$, given
by eqn 18

$\lambda_k^{i,m} = P\left(d_k = i,\, S_k = m \mid R_1^N\right) \qquad (18)$
$R_1^N = \left\{R_1^{k-1},\, R_k,\, R_{k+1}^N\right\} \qquad (19)$

$\lambda_k^{i,m} = P\left(d_k = i,\, S_k = m \mid R_1^{k-1},\, R_k,\, R_{k+1}^N\right) \qquad (20)$
Defining the first numerator factor of eqn 23 as the forward state metric at time k and state m, denoted $\alpha_k^m$, gives

$P\left(R_1^{k-1} \mid d_k = i,\, S_k = m,\, R_k^N\right) = P\left(R_1^{k-1} \mid S_k = m\right) \triangleq \alpha_k^m \qquad (24)$
In eqn 24 the conditioning terms $d_k = i$ and $R_k^N$ are irrelevant because the past is not affected by the future; i.e., $R_1^{k-1}$ is
independent of $d_k = i$ and of the sequence $R_k^N$.
Since the encoder has memory, the state $S_k = m$ is based on the past, so this term is relevant.
The second numerator factor of eqn 23 represents a reverse state metric $\beta_k^m$ at time k and state m, given by

$P\left(R_{k+1}^N \mid d_k = i,\, S_k = m,\, R_k\right) = P\left(R_{k+1}^N \mid S_{k+1} = f(i, m)\right) \triangleq \beta_{k+1}^{f(i,m)} \qquad (25)$

where f(i, m) is the next state, given an input i and state m, and $\beta_{k+1}^{f(i,m)}$ is the reverse state metric at time k+1 and
state f(i, m).
The third numerator factor of eqn 23 represents a branch metric $\delta_k^{i,m}$ at time k and state m, given by

$P\left(d_k = i,\, S_k = m,\, R_k\right) \triangleq \delta_k^{i,m} \qquad (26)$
Substituting eqns 24 through 26 into eqn 23 gives a more compact expression for the joint probability, as follows

$\lambda_k^{i,m} = \dfrac{\alpha_k^m\, \delta_k^{i,m}\, \beta_{k+1}^{f(i,m)}}{P\left(R_1^N\right)} \qquad (27)$
$\Lambda(\hat{d}_k) = \dfrac{\sum_m \alpha_k^m\, \delta_k^{1,m}\, \beta_{k+1}^{f(1,m)}}{\sum_m \alpha_k^m\, \delta_k^{0,m}\, \beta_{k+1}^{f(0,m)}} \qquad (28)$

and

$L(\hat{d}_k) = \log_e \dfrac{\sum_m \alpha_k^m\, \delta_k^{1,m}\, \beta_{k+1}^{f(1,m)}}{\sum_m \alpha_k^m\, \delta_k^{0,m}\, \beta_{k+1}^{f(0,m)}} \qquad (29)$
where $\Lambda(\hat{d}_k)$ and $L(\hat{d}_k)$ are the likelihood ratio and the log-likelihood ratio of the k-th data bit, respectively.
$\alpha_k^m = \sum_{m'} \sum_{j=0}^{1} P\left(d_{k-1} = j,\, S_{k-1} = m',\, R_1^{k-1} \mid S_k = m\right) \qquad (30)$

$\alpha_k^m = \sum_{m'} \sum_{j=0}^{1} P\left(R_1^{k-2} \mid S_k = m,\, d_{k-1} = j,\, S_{k-1} = m',\, R_{k-1}\right) \qquad (31)$

$\qquad\qquad \times\; P\left(d_{k-1} = j,\, S_{k-1} = m',\, R_{k-1} \mid S_k = m\right) \qquad (32)$

$= \sum_{j=0}^{1} P\left(R_1^{k-2} \mid S_{k-1} = b(j, m)\right) P\left(d_{k-1} = j,\, S_{k-1} = b(j, m),\, R_{k-1}\right) \qquad (33)$

$= \sum_{j=0}^{1} \alpha_{k-1}^{b(j,m)}\, \delta_{k-1}^{j,\, b(j,m)} \qquad (34)$
where b(j,m) is the state going backwards in time from state m, via the previous branch corresponding to input j.
Eqn 34 indicates that a new forward state metric at time k and state m is obtained by summing two weighted state
metrics from time k−1, corresponding to data bits 0 and 1.
The two possible transitions from the previous time terminate on the same state m at time k, as shown in the figure
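A minimal Python sketch of this forward recursion follows; the branch metrics delta[k][j][m] and the backward state function b(j, m) are assumed to be supplied by the caller for the particular trellis, and the trellis is assumed to start in state 0.

def forward_metrics(delta, b, n_states, N):
    # Forward recursion of eqn 34: alpha_k^m = sum over j of
    # alpha_{k-1}^{b(j,m)} * delta_{k-1}^{j, b(j,m)}, scaled each step
    # to avoid numerical underflow (scaling cancels in the LLR ratio).
    alpha = [[0.0] * n_states for _ in range(N + 1)]
    alpha[0][0] = 1.0  # assumed: encoder starts in the all-zeros state
    for k in range(1, N + 1):
        for m in range(n_states):
            alpha[k][m] = sum(alpha[k - 1][b(j, m)] * delta[k - 1][j][b(j, m)]
                              for j in (0, 1))
        s = sum(alpha[k]) or 1.0
        alpha[k] = [a / s for a in alpha[k]]  # per-step scaling
    return alpha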
Starting from eqn 25, where $\beta_{k+1}^{f(i,m)} = P\left(R_{k+1}^N \mid S_{k+1} = f(i, m)\right)$, we have

$\beta_k^m = P\left(R_k^N \mid S_k = m\right) = P\left(R_k,\, R_{k+1}^N \mid S_k = m\right) \qquad (35)$
We can express $\beta_k^m$ as the summation of possible transition probabilities to time k+1, as
$\beta_k^m = \sum_{m'} \sum_{j=0}^{1} P\left(d_k = j,\, S_{k+1} = m',\, R_k,\, R_{k+1}^N \mid S_k = m\right) \qquad (36)$

$\beta_k^m = \sum_{m'} \sum_{j=0}^{1} P\left(R_{k+1}^N \mid S_k = m,\, d_k = j,\, S_{k+1} = m',\, R_k\right) \qquad (37)$

$\qquad\qquad \times\; P\left(d_k = j,\, S_{k+1} = m',\, R_k \mid S_k = m\right) \qquad (38)$
$S_k = m$ and $d_k = j$ in the first factor of eqn 37 define the path resulting in $S_{k+1} = f(j, m)$, the next state given an
input j and state m.
$\beta_k^m = \sum_{j=0}^{1} P\left(R_{k+1}^N \mid S_{k+1} = f(j, m)\right) P\left(d_k = j,\, S_k = m,\, R_k\right) \qquad (39)$

$= \sum_{j=0}^{1} \delta_k^{j,m}\, \beta_{k+1}^{f(j,m)} \qquad (40)$
Eqn 40 indicates that a new reverse state metric at time k and state m is obtained by summing two weighted state
metrics from time k+1, associated with transitions corresponding to data bits 0 and 1, as shown in the figure
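A matching Python sketch of the reverse recursion of eqn 40, assuming the next-state function f(j, m) is supplied by the caller and the final encoder state is unknown (uniform initialization):

def backward_metrics(delta, f, n_states, N):
    # Reverse recursion of eqn 40: beta_k^m = sum over j of
    # delta_k^{j,m} * beta_{k+1}^{f(j,m)}, scaled as in the forward pass.
    beta = [[0.0] * n_states for _ in range(N + 1)]
    beta[N] = [1.0 / n_states] * n_states  # assumed: trellis not terminated
    for k in range(N - 1, -1, -1):
        for m in range(n_states):
            beta[k][m] = sum(delta[k][j][m] * beta[k + 1][f(j, m)] for j in (0, 1))
        s = sum(beta[k]) or 1.0
        beta[k] = [x / s for x in beta[k]]  # per-step scaling
    return beta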
$\delta_k^{i,m} = P\left(d_k = i,\, S_k = m,\, R_k\right) \qquad (41)$

$= P\left(R_k \mid d_k = i,\, S_k = m\right) P\left(S_k = m \mid d_k = i\right) P\left(d_k = i\right) \qquad (42)$

$P\left(S_k = m \mid d_k = i\right) = \dfrac{1}{2^\nu} \qquad (43)$

$\delta_k^{i,m} = P\left(x_k \mid d_k = i,\, S_k = m\right) P\left(y_k \mid d_k = i,\, S_k = m\right) \dfrac{\pi_k^i}{2^\nu} \qquad (44)$
The probability $P(X_k = x_k)$ of a random variable $X_k$ is related to the pdf $p_{X_k}(x_k)$ as follows

$P\left(X_k = x_k\right) = p_{X_k}(x_k)\, dx_k \qquad (45)$
For an AWGN channel, where the noise has zero mean and variance $\sigma^2$, using eqn 45 to replace the probability terms in
eqn 44 with their pdf equivalents gives
$\delta_k^{i,m} = \dfrac{\pi_k^i}{2^\nu}\, \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{x_k - u_k^i}{\sigma}\right)^2\right] dx_k\, \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{y_k - v_k^{i,m}}{\sigma}\right)^2\right] dy_k \qquad (46)$
where $u_k^i$ and $v_k^{i,m}$ represent the transmitted data bits and parity bits; $dx_k$ and $dy_k$ are the differentials of $x_k$ and $y_k$ and
get absorbed into the constant $A_k$.
$u_k^i$ is independent of state m, while $v_k^{i,m}$ is dependent on state m.
Simplifying eqn 46, we get
$\delta_k^{i,m} = A_k\, \pi_k^i \exp\left[\dfrac{1}{\sigma^2}\left(x_k u_k^i + y_k v_k^{i,m}\right)\right] \qquad (47)$
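A one-line Python sketch of eqn 47, with the constant $A_k$ dropped since it cancels in the likelihood ratio of eqn 48; u and v are the antipodal (+1/−1) code symbols, and prior is the a priori probability $\pi_k^i$.

import math

def branch_metric(x_k, y_k, u, v, prior, sigma):
    # delta_k^{i,m} of eqn 47 with A_k omitted (it cancels in the LLR ratio)
    return prior * math.exp((x_k * u + y_k * v) / sigma ** 2)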
$\Lambda(\hat{d}_k) = \pi_k \exp\left(\dfrac{2 x_k}{\sigma^2}\right) \dfrac{\sum_m \alpha_k^m \exp\left(\dfrac{y_k v_k^{1,m}}{\sigma^2}\right) \beta_{k+1}^{f(1,m)}}{\sum_m \alpha_k^m \exp\left(\dfrac{y_k v_k^{0,m}}{\sigma^2}\right) \beta_{k+1}^{f(0,m)}} \qquad (48)$

$= \pi_k \exp\left(\dfrac{2 x_k}{\sigma^2}\right) \pi_k^e \qquad (49)$
where $\pi_k = \pi_k^1 / \pi_k^0$ is the input a priori probability ratio and $\pi_k^e$ is the output extrinsic likelihood, which is the
correction term.
Taking the logarithm of eqn 49 gives

$L(\hat{d}_k) = \log \pi_k + \dfrac{2 x_k}{\sigma^2} + \log \pi_k^e \qquad (50)$

Eqn 50 gives the a priori LLR, the channel measurement LLR, and the extrinsic LLR.
The MAP algorithm can be implemented in terms of a likelihood ratio using eqns 48 and 49.
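Putting the pieces together, a minimal Python sketch of the per-bit soft output of eqn 29, taking alpha_k, delta_k, and beta_k1 from the recursions sketched earlier and the next-state function f(j, m) from the trellis:

import math

def map_llr(alpha_k, delta_k, beta_k1, f):
    # L(d^_k) of eqn 29: log of the ratio of the j = 1 and j = 0 sums of
    # alpha_k^m * delta_k^{j,m} * beta_{k+1}^{f(j,m)} over all states m.
    n_states = len(alpha_k)
    num = sum(alpha_k[m] * delta_k[1][m] * beta_k1[f(1, m)] for m in range(n_states))
    den = sum(alpha_k[m] * delta_k[0][m] * beta_k1[f(0, m)] for m in range(n_states))
    return math.log(num / den)  # positive: decide 1; negative: decide 0 (eqns 7, 8)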
[1] B. Sklar and P. K. Ray, Digital Communications: Fundamentals and Applications, 2nd ed.
Prentice Hall International Editions, 2001.
[2] S. Lin and D. J. Costello, Jr., Error Control Coding, 2nd ed. Pearson/Prentice Hall, 2004.
[3] R. Blahut, Theory and Practice of Error Control Codes, 2nd ed. Addison-Wesley, 1984.
[4] J. G. Proakis and M. Salehi, Communication Systems Engineering, 2nd ed. Prentice Hall,
2002.