
CSEP 590

Data Compression
Autumn 2007

Course Policies
Introduction to Data Compression
Entropy
Variable Length Codes
Instructors
• Instructor
– Richard Ladner
– ladner@cs.washington.edu
– 206 543-9347
• TA
– Rahul Vanam
– rahulv@u.washington.edu

CSEP 590 - Lecture 1 - Autumn 2007 2


Helpful Knowledge
• Algorithm Design and Analysis
• Probability

CSEP 590 - Lecture 1 - Autumn 2007 3


Resources
• Text Book
– Khalid Sayood, Introduction to Data Compression,
Third Edition, Morgan Kaufmann Publishers, 2006.
• Course Web Page
– http://www.cs.washington.edu/csep590a

• Papers and Sections from Books


• Discussion Board
– For discussion

CSEP 590 - Lecture 1 - Autumn 2007 4


Engagement by Students
• Weekly Assignments
– Understand compression methodology
– Due in class on Fridays (except midterm Friday)
– No late assignments accepted except with prior
approval
• Programming Projects
– Bi-level arithmetic coder and decoder.
– Build code and experiment

CSEP 590 - Lecture 1 - Autumn 2007 5


Final Exam and Grading
• 6:30-8:20 p.m. Thursday, Dec. 13, 2007
• Percentages
– Weekly assignments (50%)
– Project (20%)
– Final exam (30%)

CSEP 590 - Lecture 1 - Autumn 2007 6


Logistics
• I will be gone the week of October 15th. We’ll
need to have a make-up class.
• There is no class Thanksgiving week,
November 19th.
• We have some guest speakers toward the
end of the quarter.

CSEP 590 - Lecture 1 - Autumn 2007 7


Basic Data Compression Concepts
original x → Encoder → compressed y → Decoder → decompressed x̂

• Lossless compression: x = x̂
– Also called entropy coding, reversible coding.
• Lossy compression: x ≠ x̂
– Also called irreversible coding.
• Compression ratio = |x| / |y|
– |x| is the number of bits in x.

CSEP 590 - Lecture 1 - Autumn 2007 8


Why Compress
• Conserve storage space
• Reduce time for transmission
– Faster to encode, send, then decode than to send
the original
• Progressive transmission
– Some compression techniques allow us to send
the most important bits first so we can get a low
resolution version of some data before getting the
high fidelity version
• Reduce computation
– Use less data to achieve an approximate answer

CSEP 590 - Lecture 1 - Autumn 2007 9


Braille
• System to read text by feeling raised dots on
paper (or on electronic displays). Invented in
1820s by Louis Braille, a French blind man.

[Braille cells shown for the letters a, b, c, z; the single-cell contractions and, the, with, mother; and the letter groups th, ch, gh]

CSEP 590 - Lecture 1 - Autumn 2007 10


Braille Example
Clear text:
Call me Ishmael. Some years ago -- never mind how
long precisely -- having little or no money in my purse,
and nothing particular to interest me on shore, I thought
I would sail about a little and see the watery part of the
world. (238 characters)
Grade 2 Braille in ASCII:
,call me ,i%mael4 ,``s ye>s ago -- n``e m9d h[ l;g
precisely -- hav+ ll or no m``oy 9 my purse1 & no?+
``picul> 6 9t]e/ me on %ore1 ,i ?``| ,i wd sail
ab a ll & see ! wat]y ``p ( ! _w4 (203 characters)

Compression ratio = 238/203 = 1.17


CSEP 590 - Lecture 1 - Autumn 2007 11
Lossless Compression
• Data is not lost - the original is really needed.
– text compression
– compression of computer binary files
• Compression ratio typically no better than 4:1 for
lossless compression on many kinds of files.
• Statistical Techniques
– Huffman coding
– Arithmetic coding
– Golomb coding
• Dictionary techniques
– LZW, LZ77
– Sequitur
– Burrows-Wheeler Method
• Standards - Morse code, Braille, Unix compress, gzip,
zip, bzip, GIF, JBIG, Lossless JPEG
CSEP 590 - Lecture 1 - Autumn 2007 12
Lossy Compression
• Data is lost, but not too much.
– audio
– video
– still images, medical images, photographs
• Compression ratios of 10:1 often yield quite
high fidelity results.
• Major techniques include
– Vector Quantization
– Wavelets
– Block transforms
– Standards - JPEG, JPEG2000, MPEG 2, H.264

CSEP 590 - Lecture 1 - Autumn 2007 13


Why is Data Compression Possible
• Most data from nature has redundancy
– There is more data than the actual information
contained in the data.
– Squeezing out the excess data amounts to
compression.
– However, unsqueezing is necessary to be able to
figure out what the data means.
• Information theory is needed to understand
the limits of compression and give clues on
how to compress well.
CSEP 590 - Lecture 1 - Autumn 2007 14
What is Information
• Analog data
– Also called continuous data
– Represented by real numbers (or complex
numbers)
• Digital data
– Finite set of symbols {a1, a2, ... , am}
– All data represented as sequences (strings) in the
symbol set.
– Example: {a,b,c,d,r} abracadabra
– Digital data can be an approximation to analog
data

CSEP 590 - Lecture 1 - Autumn 2007 15


Symbols
• Roman alphabet plus punctuation
• ASCII - 128 symbols (256 with 8-bit extensions)
• Binary - {0,1}
– 0 and 1 are called bits
– All digital information can be represented
efficiently in binary
– {a,b,c,d} fixed length representation
symbol a b c d
binary 00 01 10 11
– 2 bits per symbol
CSEP 590 - Lecture 1 - Autumn 2007 16
Exercise - How Many Bits Per
Symbol?
• Suppose we have n symbols. How many bits
(as a function of n) are needed to
represent a symbol in binary?
– First try n a power of 2.

CSEP 590 - Lecture 1 - Autumn 2007 17


Discussion: Non-Powers of Two
• Can we do better than a fixed length
representation for non-powers of two?

CSEP 590 - Lecture 1 - Autumn 2007 18


Information Theory
• Developed by Shannon in the 1940’s and 50’s
• Attempts to explain the limits of communication
using probability theory.
• Example: Suppose English text is being sent
– It is much more likely to receive an “e” than a “z”.
– In some sense “z” has more information than “e”.

CSEP 590 - Lecture 1 - Autumn 2007 19


First-order Information
• Suppose we are given symbols {a1, a2, ... , am}.
• P(ai) = probability of symbol ai occurring in the
absence of any other information.
P(a1) + P(a2) + ... + P(am) = 1
• inf(ai) = log2(1/P(ai)) is the information of ai in bits.

[Plot: y = -log2(x) for x between 0 and 1, showing that the information of a symbol grows as its probability shrinks.]

CSEP 590 - Lecture 1 - Autumn 2007 20


Example
• {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8
– inf(a) = log2(8) = 3
– inf(b) = log2(4) = 2
– inf(c) = log2(8/5) = .678
• Receiving an “a” has more information than
receiving a “b” or “c”.

CSEP 590 - Lecture 1 - Autumn 2007 21


First Order Entropy
• The first order entropy is defined for a probability
distribution over symbols {a1, a2, ... , am}.
H = Σ i=1..m P(ai) log2(1/P(ai))
• H is the average number of bits required to code up a
symbol, given all we know is the probability distribution
of the symbols.
• H is the Shannon lower bound on the average number of
bits to code a symbol in this “source model”.
• Stronger models of entropy include context.

CSEP 590 - Lecture 1 - Autumn 2007 22


Entropy Examples
• {a, b, c} with a 1/8, b 1/4, c 5/8.
– H = 1/8 *3 + 1/4 *2 + 5/8* .678 = 1.3 bits/symbol

• {a, b, c} with a 1/3, b 1/3, c 1/3. (worst case)


– H = 3* (1/3)*log2(3) = 1.6 bits/symbol

• Note that a standard code takes 2 bits per symbol:
symbol a b c
binary code 00 01 10

CSEP 590 - Lecture 1 - Autumn 2007 23
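
These numbers are easy to check mechanically. Below is a minimal sketch (in Python; not part of the original slides) that computes first order entropy from a list of probabilities:

import math

def entropy(probs):
    # First order entropy: sum of p * log2(1/p) over the symbols.
    # Symbols with probability 0 contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([1/8, 1/4, 5/8]))   # ~1.30 bits/symbol
print(entropy([1/3, 1/3, 1/3]))   # ~1.58 bits/symbol (log2(3), rounded to 1.6 above)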


An Extreme Case
• {a, b, c} with a 1, b 0, c 0
– H=?

CSEP 590 - Lecture 1 - Autumn 2007 24


Entropy Curve
• Suppose we have two symbols with probabilities
x and 1-x, respectively.
[Plot: the binary entropy -(x log2 x + (1-x) log2(1-x)) as a function of the probability x of the first symbol; the maximum entropy of 1 bit occurs at x = 0.5.]

CSEP 590 - Lecture 1 - Autumn 2007 25


A Simple Prefix Code
• {a, b, c} with a 1/8, b 1/4, c 5/8.
• A prefix code is defined by a binary tree
• Prefix code property
– no output is a prefix of another
binary tree: 0 → (0 → a, 1 → b), 1 → c

input output
a 00
b 01
c 1

ccabccbccc encodes as 1 1 00 01 1 1 01 1 1 1
CSEP 590 - Lecture 1 - Autumn 2007 26
Binary Tree Terminology
[Diagram: a binary tree with its root, an internal node, and a leaf labeled.]

1. Each node, except the root, has a unique parent.


2. Each internal node has exactly two children.

CSEP 590 - Lecture 1 - Autumn 2007 27


Decoding a Prefix Code
Using the tree 0 → (0 → a, 1 → b), 1 → c:

repeat
    start at root of tree
    repeat
        if read bit = 1 then go right
        else go left
    until node is a leaf
    report leaf's symbol
until end of the code

Input: 11000111100

CSEP 590 - Lecture 1 - Autumn 2007 28
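
The decoding loop maps directly onto code. Here is a small sketch (Python; the nested-dictionary tree layout is my own choice, not from the slides) that decodes a bit string with the tree 0 → (0 → a, 1 → b), 1 → c:

def decode_prefix(bits, tree):
    # tree: nested dicts; an internal node maps '0'/'1' to a child,
    # a leaf is just the symbol itself.
    out = []
    node = tree
    for bit in bits:
        node = node[bit]                 # go left on 0, right on 1
        if not isinstance(node, dict):   # reached a leaf
            out.append(node)
            node = tree                  # restart at the root
    return "".join(out)

tree = {'0': {'0': 'a', '1': 'b'}, '1': 'c'}
print(decode_prefix("11000111100", tree))   # ccabccca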


Decoding a Prefix Code (trace)

Using the same tree (0 → (0 → a, 1 → b), 1 → c), decoding 11000111100 proceeds one codeword at a time:
1 → c, 1 → c, 00 → a, 01 → b, 1 → c, 1 → c, 1 → c, 00 → a
Partial outputs: c, cc, cca, ccab, ..., and the final decoded string is ccabccca.

CSEP 590 - Lecture 1 - Autumn 2007 29-39


Exercise Encode/Decode

Tree: 0 → a, 1 → (0 → (0 → b, 1 → c), 1 → d), i.e. a = 0, b = 100, c = 101, d = 11

• Player 1: Encode a symbol string


• Player 2: Decode the string
• Check for equality

CSEP 590 - Lecture 1 - Autumn 2007 40


How Good is the Code

Tree: a = 00 (1/8), b = 01 (1/4), c = 1 (5/8)

bit rate = (1/8) x 2 + (1/4) x 2 + (5/8) x 1 = 11/8 = 1.375 bps


Entropy = 1.3 bps
Standard code = 2 bps

(bps = bits per symbol)

CSEP 590 - Lecture 1 - Autumn 2007 41


Design a Prefix Code 1
• abracadabra
• Design a prefix code for the 5 symbols
{a,b,r,c,d} which compresses this string the
most.

CSEP 590 - Lecture 1 - Autumn 2007 42


Design a Prefix Code 2
• Suppose we have n symbols each with
probability 1/n. Design a prefix code with
minimum average bit rate.
• Consider n = 2,3,4,5,6 first.

CSEP 590 - Lecture 1 - Autumn 2007 43


Huffman Coding
• Huffman (1951)
• Uses frequencies of symbols in a string to build a
variable rate prefix code.
– Each symbol is mapped to a binary string.
– More frequent symbols have shorter codes.
– No code is a prefix of another.
• Example: a 0, b 100, c 101, d 11
(tree: 0 → a, 1 → (0 → (0 → b, 1 → c), 1 → d))
CSEP 590 - Lecture 1 - Autumn 2007 44
Variable Rate Code Example
• Example: a 0, b 100, c 101, d 11
• Coding:
– aabddcaa = 16 bits with the fixed 2-bit code
– 0 0 100 11 11 101 0 0 = 14 bits with the variable code
• Prefix code ensures unique decodability.
– 00100111110100

– aabddcaa

CSEP 590 - Lecture 1 - Autumn 2007 45


Cost of a Huffman Tree
• Let p1, p2, ... , pm be the probabilities for the
symbols a1, a2, ... ,am, respectively.
• Define the cost of the Huffman tree T to be
C(T) = Σ i=1..m pi ri
where ri is the length of the path from the root
to ai.
• C(T) is the expected length of the code of a
symbol coded by the tree T. C(T) is the bit
rate of the code.
CSEP 590 - Lecture 1 - Autumn 2007 46
Example of Cost
• Example: a 1/2, b 1/8, c 1/8, d 1/4
Tree T: a at depth 1 (code 0), d at depth 2 (code 11), b and c at depth 3 (codes 100, 101)

C(T) = 1 x 1/2 + 3 x 1/8 + 3 x 1/8 + 2 x 1/4 = 1.75
CSEP 590 - Lecture 1 - Autumn 2007 47
Huffman Tree
• Input: Probabilities p1, p2, ... , pm for symbols
a1, a2, ... ,am, respectively.
• Output: A tree that minimizes the average
number of bits (bit rate) to code a symbol.
That is, minimizes
HC(T) = Σ i=1..m pi ri (the bit rate)
where ri is the length of the path from the root
to ai. This is the Huffman tree or Huffman
code
CSEP 590 - Lecture 1 - Autumn 2007 48
Optimality Principle 1
• In a Huffman tree a lowest probability symbol
has maximum distance from the root.
– If not, exchanging a lowest probability symbol with
one at maximum distance will lower the cost.

[Diagram: in T the smallest-probability symbol p sits at depth k and symbol q sits at depth h, with p < q and k < h; T’ swaps p and q.]

C(T’) = C(T) + hp - hq + kq - kp = C(T) - (h-k)(q-p) < C(T)


CSEP 590 - Lecture 1 - Autumn 2007 49
Optimality Principle 2
• The second lowest probability is a sibling of
the smallest in some Huffman tree.
– If not, we can move it there without raising the cost.

[Diagram: in T the second smallest symbol q sits at depth k and symbol r is the sibling of the smallest symbol p at maximum depth h, with q < r and k < h; T’ swaps q and r.]

C(T’) = C(T) + hq - hr + kr - kq = C(T) - (h-k)(r-q) < C(T)


CSEP 590 - Lecture 1 - Autumn 2007 50
Optimality Principle 3
• Assuming we have a Huffman tree T whose two
lowest probability symbols are siblings at
maximum depth, they can be replaced by a new
symbol whose probability is the sum of their
probabilities.
– The resulting tree is optimal for the new symbol set.
[Diagram: in T the siblings p (smallest) and q (second smallest) sit at depth h; in T’ they are replaced by a single leaf of probability p+q at depth h-1.]

C(T’) = C(T) + (h-1)(p+q) - hp - hq = C(T) - (p+q)
CSEP 590 - Lecture 1 - Autumn 2007 51
Optimality Principle 3 (cont’)
• If T’ were not optimal then we could find a
lower cost tree T’’. This will lead to a lower
cost tree T’’’ for the original alphabet.

[Diagram: T’’ is a supposed lower-cost tree for the reduced alphabet containing the combined symbol p+q; T’’’ expands that leaf back into siblings p and q.]

C(T’’’) = C(T’’) + p + q < C(T’) + p + q = C(T), which is a contradiction

CSEP 590 - Lecture 1 - Autumn 2007 52


Recursive Huffman Tree Algorithm
1. If there is just one symbol, a tree with one
node is optimal. Otherwise
2. Find the two lowest probability symbols with
probabilities p and q respectively.
3. Replace these with a new symbol with
probability p + q.
4. Solve the problem recursively for new symbols.
5. Replace the leaf for the new symbol with an
internal node whose two children are the old symbols.

CSEP 590 - Lecture 1 - Autumn 2007 53


Iterative Huffman Tree Algorithm
form a node for each symbol ai with weight pi;
insert the nodes in a min priority queue ordered by probability;
while the priority queue has more than one element do
min1 := delete-min;
min2 := delete-min;
create a new node n;
n.weight := min1.weight + min2.weight;
n.left := min1;
n.right := min2;
insert(n)
return the last node in the priority queue.

CSEP 590 - Lecture 1 - Autumn 2007 54
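
The same algorithm in runnable form, as a sketch (Python; the heap layout and tie-breaking counter are my own assumptions). It produces codes with the same lengths as the example that follows, though tie-breaking may swap codewords of equal-probability symbols:

import heapq

def huffman_code(probs):
    # probs: dict symbol -> probability; returns dict symbol -> bit string.
    heap = [(p, i, sym) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)      # smallest weight
        p2, _, right = heapq.heappop(heap)     # second smallest weight
        heapq.heappush(heap, (p1 + p2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):            # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                  # leaf symbol
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_code({'a': .4, 'b': .1, 'c': .3, 'd': .1, 'e': .1}))
# average length .4*1 + .3*2 + .1*3 + .1*4 + .1*4 = 2.1 bits/symbol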


Example of Huffman Tree Algorithm (1)
• P(a) =.4, P(b)=.1, P(c)=.3, P(d)=.1, P(e)=.1

Start: a .4, b .1, c .3, d .1, e .1

Merge the two smallest, b (.1) and e (.1), into a node of weight .2:
a .4, (b,e) .2, c .3, d .1

CSEP 590 - Lecture 1 - Autumn 2007 55


Example of Huffman Tree Algorithm (2)
a .4, (b,e) .2, c .3, d .1

Merge the two smallest, d (.1) and (b,e) (.2), into a node of weight .3:
a .4, c .3, ((b,e),d) .3

CSEP 590 - Lecture 1 - Autumn 2007 56


Example of Huffman Tree Algorithm (3)

a .4, c .3, ((b,e),d) .3

Merge the two smallest, c (.3) and ((b,e),d) (.3), into a node of weight .6:
a .4, (c,((b,e),d)) .6

CSEP 590 - Lecture 1 - Autumn 2007 57


Example of Huffman Tree Algorithm (4)

a .4, (c,((b,e),d)) .6

Merge the remaining two nodes into the root (weight 1.0); the Huffman tree is complete.

CSEP 590 - Lecture 1 - Autumn 2007 58


Huffman Code

Code: a 0, c 10, d 110, b 1110, e 1111
(tree: 0 → a, 1 → (0 → c, 1 → (0 → d, 1 → (0 → b, 1 → e))))

average number of bits per symbol is
.4 x 1 + .1 x 4 + .3 x 2 + .1 x 3 + .1 x 4 = 2.1

CSEP 590 - Lecture 1 - Autumn 2007 59


Optimal Huffman Code vs. Entropy
• P(a) =.4, P(b)=.1, P(c)=.3, P(d)=.1, P(e)=.1
Entropy

H = -(.4 x log2(.4) + .1 x log2(.1) + .3 x log2(.3) + .1 x log2(.1) + .1 x log2(.1))
= 2.05 bits per symbol

Huffman Code

HC = .4 x 1 + .1 x 4 + .3 x 2 + .1 x 3 + .1 x 4
= 2.1 bits per symbol
pretty good!
CSEP 590 - Lecture 1 - Autumn 2007 60
In Class Exercise
• P(a) = 1/2, P(b) = 1/4, P(c) = 1/8, P(d) = 1/16,
P(e) = 1/16
• Compute the Optimal Huffman tree and its
average bit rate.
• Compute the Entropy
• Compare
• Hint: For the tree change probabilities to be
integers: a:8, b:4, c:2, d:1, e:1. Normalize at
the end.

CSEP 590 - Lecture 1 - Autumn 2007 61


Quality of the Huffman Code
• The Huffman code is within one bit of the entropy
lower bound.
H ≤ HC ≤ H + 1
• Huffman code does not work well with a two symbol
alphabet.
– Example: P(0) = 1/100, P(1) = 99/100
– HC = 1 bit/symbol (any code for two symbols needs at least 1 bit each)
– H = -((1/100)*log2(1/100) + (99/100)*log2(99/100))
= .08 bits/symbol

CSEP 590 - Lecture 1 - Autumn 2007 62


Powers of Two
• If all the probabilities are powers of two then
HC = H
• Proof by induction on the number of symbols.
Let p1 ≤ p2 ≤ ... ≤ pn be the probabilities that add up
to 1.
If n = 1 then HC = H (both are zero).
If n > 1 then p1 = p2 = 2^(-k) for some k, otherwise the
sum cannot add up to 1.
Combine the first two symbols into a new symbol of
probability 2^(-k) + 2^(-k) = 2^(-k+1).
CSEP 590 - Lecture 1 - Autumn 2007 63
Powers of Two (Cont.)
By the induction hypothesis
HC(p1 + p2, p3, ..., pn) = H(p1 + p2, p3, ..., pn)
= -(p1 + p2) log2(p1 + p2) - Σ i=3..n pi log2(pi)
= -2^(-k+1) log2(2^(-k+1)) - Σ i=3..n pi log2(pi)
= -2^(-k+1) (log2(2^(-k)) + 1) - Σ i=3..n pi log2(pi)
= -2^(-k) log2(2^(-k)) - 2^(-k) log2(2^(-k)) - 2^(-k) - 2^(-k) - Σ i=3..n pi log2(pi)
= -Σ i=1..n pi log2(pi) - (p1 + p2)
= H(p1, p2, ..., pn) - (p1 + p2)

CSEP 590 - Lecture 1 - Autumn 2007 64


Powers of Two (Cont.)
By the previous page,

HC(p1 + p2, p3, ..., pn) = H(p1, p2, ..., pn) - (p1 + p2)

By the properties of Huffman trees (principle 3),

HC(p1, p2, ..., pn) = HC(p1 + p2, p3, ..., pn) + (p1 + p2)

Hence,
HC(p1, p2, ..., pn) = H(p1, p2, ..., pn)

CSEP 590 - Lecture 1 - Autumn 2007 65


Extending the Alphabet
• Assuming independence P(ab) = P(a)P(b), so
we can lump symbols together.
• Example: P(0) = 1/100, P(1) = 99/100
– P(00) = 1/10000, P(01) = P(10) = 99/10000,
P(11) = 9801/10000.

Huffman code on pairs (e.g. 11 → 0, 10 → 10, 01 → 110, 00 → 111):
HC = 1.03 bits per 2-bit symbol
= .515 bits/bit
Still not that close to H = .08 bits/bit
CSEP 590 - Lecture 1 - Autumn 2007 66
Quality of Extended Alphabet
• Suppose we extend the alphabet to symbols
of length k then

H ≤ HC ≤ H + 1/k
• Pros and Cons of Extending the alphabet
+ Better compression
- 2^k symbols
- padding needed to make the length of the input
divisible by k

CSEP 590 - Lecture 1 - Autumn 2007 67


Huffman Codes with Context
• Suppose we add a one symbol context. That is in
compressing a string x1x2...xn we want to take into
account xk-1 when encoding xk.
– New model, so entropy based on just independent
probabilities of the symbols doesn’t hold. The new entropy
model (2nd order entropy) has for each symbol a probability
for each other symbol following it.
– Example: {a,b,c}
next
a b c
a .4 .2 .4
prev b .1 .9 0
c .1 .1 .8

CSEP 590 - Lecture 1 - Autumn 2007 68


Multiple Codes
Code for the first symbol: a 00, b 01, c 10

Conditional probabilities (prev → next): a: .4 .2 .4; b: .1 .9 0; c: .1 .1 .8

One Huffman code per context (previous symbol):
prev a: a → 1, b → 00, c → 01
prev b: b → 0, a → 1 (c never follows b)
prev c: c → 0, a → 10, b → 11

Example: abbacc encodes as 00 00 0 1 01 0

CSEP 590 - Lecture 1 - Autumn 2007 69


Average Bit Rate for Code
• P(a) = .4 P(a) + .1 P(b) + .1 P(c)
P(b) = .2 P(a) + .9 P(b) + .1 P(c)
1 = P(a) + P(b) + P(c)
• 0 = -.6 P(a) + .1 P(b) + .1 P(c)
0 = .2 P(a) - .1 P(b) + .1 P(c)
1 = P(a) + P(b) + P(c)
• P(a) = 1/7, P(b) = 4/7, P(c) = 2/7

CSEP 590 - Lecture 1 - Autumn 2007 70
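
The same stationary probabilities can be obtained numerically. A small sketch (using numpy; the variable names are my own) that solves the linear system above:

import numpy as np

# Transition matrix: row = previous symbol, column = next symbol (order a, b, c).
T = np.array([[.4, .2, .4],
              [.1, .9, .0],
              [.1, .1, .8]])

# Solve pi = pi T together with sum(pi) = 1 (least squares over the 4 equations).
A = np.vstack([T.T - np.eye(3), np.ones(3)])
b = np.array([0., 0., 0., 1.])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)   # approximately [1/7, 4/7, 2/7] = [0.143, 0.571, 0.286]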


Average Bit Rate for Code
Weighting the per-context codes by the stationary probabilities P(a) = 1/7, P(b) = 4/7, P(c) = 2/7:

ABR = 1/7 (.6 x 2 + .4) + 4/7 (1) + 2/7 (.2 x 2 + .8)
= 8/7 = 1.14 bps

CSEP 590 - Lecture 1 - Autumn 2007 71


Complexity of Huffman Code Design
• Time to design Huffman Code is O(n log n)
where n is the number of symbols.
– Each step consists of a constant number of priority
queue operations (2 deletemin’s and 1 insert)

CSEP 590 - Lecture 1 - Autumn 2007 72


Approaches to Huffman Codes
1. Frequencies computed for each input
– Must transmit the Huffman code or
frequencies as well as the compressed input
– Requires two passes
2. Fixed Huffman tree designed from training data
– Do not have to transmit the Huffman tree
because it is known to the decoder.
– H.263 video coder
3. Adaptive Huffman code
– One pass
– Huffman tree changes as frequencies change

CSEP 590 - Lecture 1 - Autumn 2007 73


Run-Length Coding
• Lots of 0’s and not too many 1’s.
– Fax of letters
– Graphics
• Simple run-length code
– Input
00000010000000001000000000010001001.....
– Symbols
6 9 10 3 2 ...
– Code the bits as a sequence of integers
– Problem: How long should the integers be?
CSEP 590 - Lecture 1 - Autumn 2007 74
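
As a concrete illustration (Python; my own sketch, not from the slides), turning such a bit string into the run lengths of 0s between successive 1s:

def run_lengths(bits):
    # Length of each run of 0s, where every run is terminated by a 1.
    runs, count = [], 0
    for b in bits:
        if b == '0':
            count += 1
        else:               # a 1 ends the current run
            runs.append(count)
            count = 0
    return runs

print(run_lengths("00000010000000001000000000010001001"))
# [6, 9, 10, 3, 2], the symbols from the slide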
Golomb Code of Order m
Variable Length Code for Integers
• Let n = qm + r where 0 ≤ r < m.
– Divide m into n to get the quotient q and
remainder r.
• Code for n has two parts:
1. q is coded in unary
2. r is coded as a fixed prefix code
Example: m = 5. The fixed prefix code for r:
r:    0  1  2  3   4
code: 00 01 10 110 111
CSEP 590 - Lecture 1 - Autumn 2007 75
Example
• n = qm + r is represented by q ones, then a 0, then r̂:
11...1 0 r̂
where r̂ is the fixed prefix code for r.
• Example (m = 5):
2 → 010, 6 → 1001, 9 → 10111, 10 → 11000, 27 → 11111010

CSEP 590 - Lecture 1 - Autumn 2007 76
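
A sketch of an order-m Golomb encoder (Python; my own illustration). It assumes the fixed prefix code for the remainder is the usual truncated binary code, which reproduces the m = 5 code shown above:

def golomb_encode(n, m):
    # Write n = q*m + r with 0 <= r < m.  q is sent in unary
    # (q ones followed by a 0); r is sent in truncated binary.
    q, r = divmod(n, m)
    k = (m - 1).bit_length()     # ceil(log2 m)
    u = (1 << k) - m             # number of remainders that get k-1 bits
    if r < u:
        tail = format(r, "0{}b".format(k - 1)) if k > 1 else ""
    else:
        tail = format(r + u, "0{}b".format(k))
    return "1" * q + "0" + tail

for n in (2, 6, 9, 10, 27):
    print(n, golomb_encode(n, 5))
# 2 010, 6 1001, 9 10111, 10 11000, 27 11111010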


Alternative Explanation
Golomb Code of order 5, viewed as a mapping from runs of 0s to codewords:

input output
00000 1
00001 0111
0001  0110
001   010
01    001
1     000

Variable length to variable length code.

CSEP 590 - Lecture 1 - Autumn 2007 77


Run Length Example: m = 5
00000010000000001000000000010001001.....
00000 → 1
01 → 001
00000 → 1
00001 → 0111

In this example we coded 17 bits in only 9 bits.

CSEP 590 - Lecture 1 - Autumn 2007 78


Choosing m
• Suppose that 0 has the probability p and 1
has probability 1-p.
• The probability of 0^n 1 is p^n (1-p). The Golomb
code of order
m = ⌈-1 / log2(p)⌉
is optimal.
• Example: p = 127/128.
m = ⌈-1 / log2(127/128)⌉ = 89

CSEP 590 - Lecture 1 - Autumn 2007 79


Average Bit Rate for Golomb Code

Average Bit Rate = (average output code length) / (average input code length)

• m = 4 as an example. With p as the probability of 0.


ABR = [p^4 + 3p^3(1-p) + 3p^2(1-p) + 3p(1-p) + 3(1-p)] /
[4p^4 + 4p^3(1-p) + 3p^2(1-p) + 2p(1-p) + (1-p)]

input  0000  0001      001       01      1
output 1     011       010       001     000
weight p^4   p^3(1-p)  p^2(1-p)  p(1-p)  1-p
CSEP 590 - Lecture 1 - Autumn 2007 80
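
Evaluating this expression is mechanical; a small sketch (Python; my own illustration) for the m = 4 table above:

def golomb4_abr(p):
    # Average output bits per input bit for the order-4 Golomb code,
    # using the input/output/weight table above.
    out_len = [1, 3, 3, 3, 3]          # outputs 1, 011, 010, 001, 000
    in_len  = [4, 4, 3, 2, 1]          # inputs 0000, 0001, 001, 01, 1
    weight  = [p**4, p**3*(1-p), p**2*(1-p), p*(1-p), 1-p]
    num = sum(o * w for o, w in zip(out_len, weight))
    den = sum(i * w for i, w in zip(in_len, weight))
    return num / den

print(round(golomb4_abr(0.9), 3))   # about 0.49 output bits per input bit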
Comparison of GC with Entropy

[Plot comparing the Golomb code (GC) bit rate with the entropy as a function of the order.]

CSEP 590 - Lecture 1 - Autumn 2007 81


Notes on Golomb codes
• Useful for binary compression when one symbol is
much more likely than another.
– binary images
– fax documents
– bit planes for wavelet image compression
• Need a parameter (the order)
– training
– adaptively learn the right parameter
• Variable-to-variable length code
• Last symbol needs to be a 1
– coder always adds a 1
– decoder always removes a 1
CSEP 590 - Lecture 1 - Autumn 2007 82
Tunstall Codes
• Variable-to-fixed length code
• Example
input output
a     000
b     001
ca    010
cb    011
cca   100
ccb   101
ccc   110

Example: a b cca cb ccc ... encodes as 000 001 100 011 110 ...

CSEP 590 - Lecture 1 - Autumn 2007 83


Tunstall code Properties
1. No input code is a prefix of another to
assure unique encodability.
2. Minimize the number of bits per symbol.

CSEP 590 - Lecture 1 - Autumn 2007 84


Prefix Code Property

Parse tree: the root has children a, b, c; the c node has children a, b, c; the cc node has children a, b, c. The leaves and their codes:

a 000, b 001, ca 010, cb 011, cca 100, ccb 101, ccc 110

Unused output code is 111.

CSEP 590 - Lecture 1 - Autumn 2007 85


Use for unused code
• Consider the string “cc” occurring at the end of the
data: it does not have a code.
• Send the unused code and some fixed code
for the cc.
• Generally, if there are k internal nodes in the
prefix tree then there is a need for k-1 fixed
codes.

CSEP 590 - Lecture 1 - Autumn 2007 86


Designing a Tunstall Code
• Suppose there are m initial symbols.
• Choose a target output length n where 2^n > m.

1. Form a tree with a root and m children with
edges labeled with the symbols.
2. If the number of leaves is > 2^n - m then halt.*
3. Find the leaf with highest probability and
expand it to have m children.** Go to 2.

* In the next step we will add m-1 more leaves.


** The probability is the product of the probabilities
of the symbols on the root to leaf path.
CSEP 590 - Lecture 1 - Autumn 2007 87
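
A sketch of this construction (Python; the data structures and alphabetical code assignment are my own choices). For the example on the next slides it produces the same seven codewords:

import heapq

def tunstall(probs, n):
    # probs: dict symbol -> probability; n: output code length in bits.
    # Repeatedly expand the most probable leaf while the leaf count
    # stays at or below 2^n - m (each expansion adds m - 1 leaves).
    heap = [(-p, s) for s, p in probs.items()]   # max-heap via negated probs
    heapq.heapify(heap)
    m = len(probs)
    leaves = m
    while leaves <= 2**n - m:
        p, s = heapq.heappop(heap)               # most probable leaf
        for sym, q in probs.items():
            heapq.heappush(heap, (p * q, s + sym))
        leaves += m - 1
    words = sorted(s for _, s in heap)
    return {w: format(i, "0{}b".format(n)) for i, w in enumerate(words)}

print(tunstall({'a': .7, 'b': .2, 'c': .1}, 3))
# {'aaa': '000', 'aab': '001', 'aac': '010', 'ab': '011',
#  'ac': '100', 'b': '101', 'c': '110'}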
Example
• P(a) = .7, P(b) = .2, P(c) = .1
• n=3
Initial tree: leaves a (.7), b (.2), c (.1)

CSEP 590 - Lecture 1 - Autumn 2007 88


Example
• P(a) = .7, P(b) = .2, P(c) = .1
• n=3
Expand the most probable leaf, a: leaves aa (.49), ab (.14), ac (.07), b (.2), c (.1)

CSEP 590 - Lecture 1 - Autumn 2007 89


Example
• P(a) = .7, P(b) = .2, P(c) = .1
• n=3
Expand the most probable leaf, aa: leaves aaa (.343), aab (.098), aac (.049), ab (.14), ac (.07), b (.2), c (.1)

Final code:
aaa 000, aab 001, aac 010, ab 011, ac 100, b 101, c 110

CSEP 590 - Lecture 1 - Autumn 2007 90


Bit Rate of Tunstall
• The length of the output code divided by the
average length of the input code.
• Let pi be the probability of input code i and ri its length
(1 ≤ i ≤ s), and let n be the length of the output code.

Average bit rate = n / (Σ i=1..s pi ri)

CSEP 590 - Lecture 1 - Autumn 2007 91


Example
input  prob  output
aaa    .343  000
aab    .098  001
aac    .049  010
ab     .14   011
ac     .07   100
b      .2    101
c      .1    110

ABR = 3/[3 (.343 + .098 + .049) + 2 (.14 + .07) + .2 + .1]


= 1.37 bits per symbol
Entropy = 1.16 bits per symbol

CSEP 590 - Lecture 1 - Autumn 2007 92


Notes on Tunstall Codes
• Variable-to-fixed length code
• Error resilient
– A flipped bit will introduce just one error in the
output
– Huffman is not error resilient. A single bit flip can
destroy the code.

CSEP 590 - Lecture 1 - Autumn 2007 93
