
Arithmetic coding

Irina Bocharova,
University of Aerospace Instrumentation,
St.-Petersburg, Russia

Outline

• Shannon-Fano-Elias coding
• Gilbert-Moore coding
• Arithmetic coding as a generalization of SFE and GM coding
• Implementation of arithmetic coding



Shannon-Fano-Elias coding

Let x ∈ X = {1, . . . , M}, p(x) > 0,
p(1) ≥ p(2) ≥ · · · ≥ p(M).
The cumulative sum associated with the symbol x is

Q(x) = Σ_{a<x} p(a),

that is,

Q(1) = 0, Q(2) = p(1), . . . , Q(M) = Σ_{i=1}^{M−1} p(i).

Then the codeword for m is ⌊Q(m)⌋_{l_m}, the binary expansion of Q(m) truncated to l_m bits,

where l_m = ⌈−log2 p(m)⌉.



x   p(x)   Q(x)   Q(x) in binary   l(x)   codeword
1   0.6    0.0    0.0              1      0
2   0.3    0.6    0.1001. . .      2      10
3   0.1    0.9    0.1110. . .      4      1110

L = 1.6 bits, H(X) = 1.3 bits
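
As a concrete check, here is a minimal Python sketch of this construction (floating-point arithmetic for illustration only; sfe_code is a hypothetical helper name, not from the slides):

    from math import ceil, log2

    def sfe_code(probs):
        # Codeword for symbol m: the first l_m = ceil(-log2 p(m)) bits of
        # the binary expansion of Q(m). Assumes probs are sorted in
        # descending order, as required above.
        code, Q = {}, 0.0
        for m, p in enumerate(probs, start=1):
            l = ceil(-log2(p))
            code[m] = format(int(Q * 2 ** l), f'0{l}b')  # truncate Q(m) to l bits
            Q += p
        return code

    print(sfe_code([0.6, 0.3, 0.1]))  # {1: '0', 2: '10', 3: '1110'}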



If l_m binary symbols have already been transmitted, then the length of the interval of uncertainty is 2^{−l_m}. Thus we can decode uniquely if

2^{−l_m} ≤ p(m),
or
l_m ≥ −log2 p(m).

In choosing the length l_m we used only the segment to the right of the point Q(m). This segment is never longer than the corresponding left segment, since the symbol probabilities are ordered in descending order.

H(X) ≤ L < H(X) + 1.



Gilbert-Moore coding

Let x ∈ X = {1, . . . , M}, p(x) > 0 (no ordering of the probabilities is assumed).
The cumulative sum associated with the symbol x is

Q(x) = Σ_{a<x} p(a),

that is,

Q(1) = 0, Q(2) = p(1), . . . , Q(M) = Σ_{i=1}^{M−1} p(i).

Introduce

σ(x) = Q(x) + p(x)/2.

Then the codeword for m is σ̂(m) = ⌊σ(m)⌋_{l_m}, the binary expansion of σ(m) truncated to l_m bits,

where l_m = ⌈−log2(p(m)/2)⌉.



We place the point σ(m) = Q(m) + p(m)/2 at the center of the segment corresponding to m, and we choose the codeword length so that after l_m binary symbols have been transmitted, the length of the interval of uncertainty is at most p(m)/2.



x   p(x)   Q (binary)    σ (binary)    l   GM codeword   ShFE codeword
1   0.1    0.0           0.00001. . .  5   00001         0000
2   0.6    0.0001. . .   0.01100. . .  2   01            0
3   0.3    0.10110. . .  0.11011. . .  3   110           10

Note that without the descending ordering the ShFE codewords are not prefix-free (0 is a prefix of 0000), while the GM codewords are.

L = 2.6 bits (Gilbert-Moore), H(X) = 1.3 bits
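
The same style of sketch reproduces the Gilbert-Moore column of the table (floating point for illustration; gm_code is a hypothetical helper name):

    from math import ceil, log2

    def gm_code(probs):
        # Codeword for symbol m: the first l_m = ceil(-log2(p(m)/2)) bits
        # of sigma(m) = Q(m) + p(m)/2. No ordering of the probabilities
        # is required.
        code, Q = {}, 0.0
        for m, p in enumerate(probs, start=1):
            sigma = Q + p / 2
            l = ceil(-log2(p / 2))
            code[m] = format(int(sigma * 2 ** l), f'0{l}b')
            Q += p
        return code

    print(gm_code([0.1, 0.6, 0.3]))  # {1: '00001', 2: '01', 3: '110'}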



Let i < j; then σ(j) > σ(i):

σ(j) − σ(i) = (Σ_{l=1}^{j−1} p(l) − Σ_{l=1}^{i−1} p(l)) + (p(j) − p(i))/2

= Σ_{l=i}^{j−1} p(l) + (p(j) − p(i))/2

≥ p(i) + (p(j) − p(i))/2 = (p(i) + p(j))/2 ≥ max{p(i), p(j)}/2.

Since l_m = ⌈−log2(p(m)/2)⌉ ≥ −log2(p(m)/2), i.e. 2^{−l_m} ≤ p(m)/2, we obtain

σ(j) − σ(i) ≥ max{p(i), p(j)}/2 ≥ 2^{−min{l_i, l_j}},

so the truncated codewords σ̂(i) and σ̂(j) are distinct.

H(X) + 1 ≤ L < H(X) + 2.



When is symbol-by-symbol coding not efficient?

1. Memoryless source

For symbol-by-symbol coding
R = H(X) + α,
where α is the coding redundancy.

For block coding
R = (H(X^n) + α)/n = (nH(X) + α)/n = H(X) + α/n,
where H(X^n) denotes the entropy of n random variables.

If H(X) << 1, then R ≥ 1 for symbol-by-symbol coding, since every codeword occupies at least one bit. For a binary memoryless source with p(0) = 0.99, p(1) = 0.01, we have H(X) = 0.081 bits; we can easily construct a Huffman code with R = 1 bit, but it is impossible to obtain R < 1 bit.

2. Source with memory

H(X^n) ≤ nH(X) and
R = (H(X^n) + α)/n ≤ H(X) + α/n.
R → H∞(X) as n → ∞, where H∞(X) denotes the entropy rate.
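
The entropy figure quoted above is easy to verify numerically (a minimal sketch; H is a hypothetical helper name):

    from math import log2

    def H(probs):
        # Entropy of a memoryless source, in bits per symbol.
        return -sum(p * log2(p) for p in probs if p > 0)

    print(H([0.99, 0.01]))  # ~0.0808 bits, yet any symbol code spends >= 1 bit/symbol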



How to implement block coding?

Let x ∈ X = {1, . . . , M}, and suppose we encode sequences x = (x1, . . . , xn) that appear at the output of X during n consecutive time moments.

We can consider a new source X^n whose symbols correspond to the sequences x = (x1, . . . , xn) of length n, and apply any method of symbol-by-symbol coding to these symbols. We obtain

R = H(X^n)/n + α/n,

where α depends on the chosen coding procedure.

The problem is coding complexity: the alphabet of the new source has size M^n. For example, if M = 2^8 = 256, then for n = 2 we get M^2 = 65536, and for n = 3 we get M^3 = 16777216.

Arithmetic coding provides redundancy 2/n with complexity of order n^2.



Arithmetic coding is a direct extension of the Gilbert-Moore coding scheme.

Let x = (x1, x2, . . . , xn) be an M-ary sequence of length n. We construct the modified cumulative distribution function

σ(x) = Σ_{a≺x} p(a) + p(x)/2 = Q(x) + p(x)/2,

where a ≺ x means that a is lexicographically less than x, and set l(x) = ⌈−log2(p(x)/2)⌉.

The code rate R satisfies

R = (1/n) Σ_x p(x) l(x) = (1/n) Σ_x p(x) (⌈log2(1/p(x))⌉ + 1) < (H(X^n) + 2)/n.

If the source generates symbols independently we obtain

R < H(X) + 2/n.

For a source with memory,

R → H∞(X) as n → ∞.



Consider

Q(x[1,n]) = Σ_{a≺x} p(a) = Σ_{a: a[1,n−1] ≺ x[1,n−1]} p(a) + Σ_{a: a[1,n−1] = x[1,n−1], a_n ≺ x_n} p(a),

where x[1,i] = (x1, x2, . . . , xi). It is easy to see that

Q(x[1,n]) = Q(x[1,n−1]) + Σ_{a: a[1,n−1] = x[1,n−1], a_n ≺ x_n} p(a)

= Q(x[1,n−1]) + p(x[1,n−1]) Σ_{a_n ≺ x_n} p(a_n | x[1,n−1]).

If the source generates symbols independently,

p(x[1,n−1]) = Π_{i=1}^{n−1} p(x_i),

Σ_{a_n ≺ x_n} p(a_n | x[1,n−1]) = Σ_{a_n ≺ x_n} p(a_n) = Q(x_n),

where Q(x_i) denotes the cumulative probability of x_i.



[Figure: binary sequences 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, . . . listed in lexicographic order.]


We obtain the following recurrence equations:

Q(x[1,n]) = Q(x[1,n−1]) + p(x[1,n−1])Q(xn ),

p(x[1,n−1]) = p(x[1,n−2])p(xn−1).



Coding procedure

x = (x1, . . . , xn)

Initialization:

F = 0; G = 1; Q(1) = 0;
for j = 2 : M
    Q(j) = Q(j − 1) + p(j − 1);
end;

for i = 1 : n
    F ← F + Q(xi) × G;
    G ← G × p(xi);
end;

F ← F + G/2; l = ⌈−log2(G/2)⌉; F̂ ← ⌊F × 2^l⌋;



X = {a, b, c},
p(a) = 0.1, p(b) = 0.6, p(c) = 0.3

x = (b, c, b, a, b), n = 5

i   xi   p(xi)   Q(xi)   F        G
0   -    -       -       0.0000   1.0000
1   b    0.6     0.1     0.1000   0.6000
2   c    0.3     0.7     0.5200   0.1800
3   b    0.6     0.1     0.5380   0.1080
4   a    0.1     0.0     0.5380   0.0108
5   b    0.6     0.1     0.5391   0.0065

Codeword length l = ⌈−log2 G⌉ + 1 = 9

F + G/2 = 0.5423 . . . = 0.100010101 . . . in binary, and the codeword consists of its first l = 9 bits: F̂ = 100010101

H(X) = 1.3 bits, R = 9/5 = 1.8 bits/symbol
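
The whole procedure, including the trace above, fits in a short Python sketch (arith_encode is a hypothetical name; exact rational arithmetic with Fraction stands in for the unbounded precision the algorithm theoretically requires):

    from math import ceil, log2
    from fractions import Fraction

    def arith_encode(x, p, Q):
        F, G = Fraction(0), Fraction(1)
        for xi in x:
            F += Q[xi] * G       # F <- F + Q(x_i) * G
            G *= p[xi]           # G <- G * p(x_i)
        F += G / 2               # midpoint of the final interval
        l = ceil(-log2(G / 2))   # codeword length
        return format(int(F * 2 ** l), f'0{l}b')  # first l bits of F

    p = {'a': Fraction(1, 10), 'b': Fraction(6, 10), 'c': Fraction(3, 10)}
    Q = {'a': Fraction(0), 'b': Fraction(1, 10), 'c': Fraction(7, 10)}
    print(arith_encode('bcbab', p, Q))  # 100010101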



At each step of the coding algorithm we perform 1 addition and 2 multiplications.

Let p(1), . . . , p(M) be numbers with binary representations of length d. Then after the first step, F and G are numbers with binary representations of length 2d. The following steps require representations of length 3d, . . . , nd.

The complexity of the coding procedure can therefore be estimated as

d + 2d + · · · + nd = n(n + 1)d/2.

PROBLEMS

1. The algorithm requires high computational accuracy (theoretically infinite precision).

2. The computational delay equals the length of the sequence to be encoded.



Decoding of Gilbert-Moore code

Q(m), m = 1, . . . , M are known.

Input: σ̂.

Set m = 1;
while Q(m + 1) < σ̂
    m ← m + 1;
end;
Output: x(m)

Example.

σ̂ = 0.01 (binary) → σ̂ = 0.25

Q(2) = 0.1 < 0.25, so m = 2.

Q(3) = 0.7 > 0.25, so we stop with m = 2.
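
The same search in Python (a minimal sketch; gm_decode is a hypothetical name, with Q passed as a 0-based list so that Q[m] plays the role of Q(m + 1)):

    def gm_decode(sigma_hat, Q):
        # m starts at 1 and advances while Q(m+1) < sigma_hat.
        m = 1
        while m < len(Q) and Q[m] < sigma_hat:
            m += 1
        return m

    print(gm_decode(0.25, [0.0, 0.1, 0.7]))  # 2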



Decoding procedure:

F̂ ← F̂/2^l; S = 0; G = 1;

for i = 1 : n
    j = 1;
    while j < M and S + Q(j + 1) × G < F̂
        j ← j + 1;
    end;
    S ← S + Q(j) × G;
    G ← G × p(j);
    xi = j;
end;

After the ith step, G = p(x[1,i]) and S = Q(x[1,i]).
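
A Python sketch of this decoder, checked against the example that follows (arith_decode is a hypothetical name; exact rational arithmetic again replaces infinite precision):

    from fractions import Fraction

    def arith_decode(F_hat, n, p, Q, symbols):
        # Maintains the invariants S = Q(x[1,i]) and G = p(x[1,i]).
        S, G = Fraction(0), Fraction(1)
        out = []
        for _ in range(n):
            j = 0
            # advance while the next cumulative threshold stays below F-hat
            while j + 1 < len(symbols) and S + Q[symbols[j + 1]] * G < F_hat:
                j += 1
            s = symbols[j]
            S += Q[s] * G
            G *= p[s]
            out.append(s)
        return ''.join(out)

    p = {'a': Fraction(1, 10), 'b': Fraction(6, 10), 'c': Fraction(3, 10)}
    Q = {'a': Fraction(0), 'b': Fraction(1, 10), 'c': Fraction(7, 10)}
    F_hat = Fraction(int('100010101', 2), 2 ** 9)  # 0.100010101 binary = 0.541015625
    print(arith_decode(F_hat, 5, p, Q, 'abc'))     # bcbab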



X = {a, b, c}, p(a) = 0.1, p(b) = 0.6, p(c) = 0.3

Codeword 100010101, F̂ = 0.541

F̂ = 0.100010101 (binary)

S        G        Hyp.   Q(j)   S + Q(j)G      xi   p(xi)
0.0000   1.0000   a      0.0    0.0000 < F̂
                  b      0.1    0.1000 < F̂     b    0.6
                  c      0.7    0.7000 > F̂
0.1000   0.6000   a      0.0    0.1000 < F̂
                  b      0.1    0.1600 < F̂     c    0.3
                  c      0.7    0.5200 < F̂
0.5200   0.1800   a      0.0    0.5200 < F̂
                  b      0.1    0.5380 < F̂     b    0.6
                  c      0.7    0.6460 > F̂
0.5380   0.1080   a      0.0    0.5380 < F̂
                  b      0.1    0.5488 > F̂     a    0.1
0.5380   0.0108   a      0.0    0.5380 < F̂
                  b      0.1    0.5391 < F̂     b    0.6
                  c      0.7    0.5456 > F̂



Implementation of arithmetic coding

1. High < 0.5

Bit = 0;

Normalization:

Low ← Low × 2
High ← High × 2

Example:
Low = 0; High = 0.00011000001
Bit = 0; High = 0.0011000001
Bit = 0; High = 0.011000001
Bit = 0; High = 0.11000001

2. Low > 0.5

Bit = 1;

Normalization:

Low ← (Low − 0.5) × 2
High ← (High − 0.5) × 2

Example:
Low = 0.11000011
Bit = 1; Low = 0.1000011
Bit = 1; Low = 0.000011
Lund, Sweden, February, 2005 21:22


3. Low < 0.5, High > 0.5

The interval straddles the point 0.5, so the next bit is not yet determined: a value of the form 0.011111. . .1 can continue as 0.01111. . .10 or as 0.10000. . .01. The undetermined bit is deferred and counted.

Example:
Low = 0.0110 < 0.5, High = 0.1010 > 0.5
Count = 1;
Read the next symbol:
Low = 0.10001 = 0.0110 + 0.00101
High = 0.10101
Now Low > 0.5: Bit = 1, followed by Count complementary bits. Output: 10
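
The three cases combine into one renormalization loop. The sketch below is a minimal illustration, not the slides' exact routine: renormalize is a hypothetical name, floats stand in for fixed-point registers, the usual convention Low ≥ 0.5 replaces the strict Low > 0.5 above, and the calls simply replay the numbers from this example:

    def renormalize(low, high, count, out):
        # Emit bits that are already determined, defer undetermined ones
        # via count, and rescale [low, high) after each emitted/deferred bit.
        while True:
            if high < 0.5:                      # case 1: emit 0 (+ pending 1s)
                out.append('0' + '1' * count); count = 0
            elif low >= 0.5:                    # case 2: emit 1 (+ pending 0s)
                out.append('1' + '0' * count); count = 0
                low -= 0.5; high -= 0.5
            elif low >= 0.25 and high < 0.75:   # case 3: straddles 0.5, defer
                count += 1
                low -= 0.25; high -= 0.25
            else:
                return low, high, count
            low *= 2; high *= 2

    out = []
    _, _, count = renormalize(0.375, 0.625, 0, out)    # Low = 0.0110, High = 0.1010
    print(out, count)                                  # [] 1 -- bit deferred, Count = 1
    renormalize(0.53125, 0.65625, count, out)          # Low = 0.10001, High = 0.10101
    print(''.join(out))                                # begins with 10, as above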
