Chapter 3 Multimedia Data Compression
Take, for example, a video signal with a resolution of 320 x 240 pixels, 256 colors (8 bits per pixel), and 30
frames per second. Raw bit rate = 320 x 240 x 8 x 30
= 18,432,000 bits per second (about 2.3 MB per second)
A 90-minute movie would then take about 2.3 MB x 60 x 90 = 12.44 GB. Without compression, data
storage and transmission would pose serious problems!
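As a quick check of this arithmetic, a minimal Python sketch (the variable names are illustrative; megabytes and gigabytes are counted in powers of ten, as in the figures above):

# Raw bit rate for 320 x 240 pixels, 8 bits per pixel, 30 frames per second
width, height, bits_per_pixel, fps = 320, 240, 8, 30
bit_rate = width * height * bits_per_pixel * fps      # 18,432,000 bits per second
byte_rate_mb = bit_rate / 8 / 1_000_000               # about 2.3 MB per second

# Storage for a 90-minute movie without compression
movie_seconds = 90 * 60
movie_gb = byte_rate_mb * movie_seconds / 1000        # about 12.44 GB
print(bit_rate, round(byte_rate_mb, 2), round(movie_gb, 2))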
Data compression is about finding ways to reduce the number of bits or bytes used to store or
transmit the content of multimedia data. It is the process of encoding information using fewer
bits, e.g. the ZIP file format. As with any communication, compressed data communication only
works when both the sender and receiver of the information understand the encoding scheme.
Is compression useful?
Compression is useful because it helps reduce the consumption of resources, such as hard disk
space or transmission bandwidth.
On the downside, compressed data must be decompressed to be used, and this extra processing
may be harmful to some applications. For instance, a compression scheme for video may require
expensive hardware for the video to be decompressed fast enough to be viewed as it's being
decompressed. The option of decompressing the video in full before watching it may be
inconvenient, and requires storage space for the decompressed video.
The design of data compression schemes therefore involves trade-offs among various factors,
including:
The degree of compression: to what extent should the data be compressed?
The amount of distortion introduced: to what extent is quality loss tolerated?
The computational resources required to compress and decompress the data: do we have
enough memory and processing power for compressing and decompressing the data?
Types of Compression
Lossless Compression
The original content of the data is not lost or changed when it is compressed (encoded). It is used
mainly for compressing symbolic data such as database records, spreadsheets, texts, executable
programs, etc., where exact replication of the original is essential and changing even a single bit
cannot be tolerated. Lossless compression recovers the exact original data after decompression.
Examples: Run-Length Encoding (RLE), Lempel-Ziv (LZ), Huffman coding.
Lossy Compression
The original content of the data is lost to a certain degree when compressed. For visual and audio
data, some loss of quality can be tolerated without losing the essential nature of the data. Lossy
compression is used for image compression in digital cameras (JPEG), audio compression (MP3),
and video compression in DVDs (MPEG).
GIF image files and WinZip use lossless compression. For this reason zip software is popular for
compressing program and data files. Lossless compression does not lose any data in the
compression process.
The advantage is that the compressed file will decompress to an exact duplicate of the
original file, mirroring its quality.
The disadvantage is that the compression ratio is not all that high, precisely because no
data is lost.
To get a higher compression ratio -- to reduce a file significantly beyond 50% -- you must
use lossy compression.
Lossless and lossy compression have become part of our everyday vocabulary due to the
popularity of MP3 music files, JPEG image files, and MPEG video files. A sound file in WAV format
converted to an MP3 file will lose much data, as MP3 employs lossy compression. JPEG uses
lossy compression, while GIF follows lossless compression techniques.
An example of lossless vs. lossy compression is the following string: 25.888888888. This string
can be compressed losslessly as 25.[9]8 and interpreted as "twenty five point 9 eights"; the original
string is perfectly recreated, just written in a smaller form. A lossy scheme might instead round
the string to, say, 25.9, in which case the original data is lost, at the benefit of a smaller file size.
The above is a very simple example of run-length encoding.
Data often contains sequences of identical bytes. By replacing these repeated byte sequences
with the number of occurrences, a substantial reduction of data can be achieved. In run-length
encoding, long runs of consecutive identical data values are replaced by a simple code consisting
of the data value and the length of the run, i.e. (dataValue, lengthOfRun).
This encoding scheme tallies each data value Xi together with its run length, i.e. (Xi, Length_of_Xi).
It compresses data by storing runs of data (that is, sequences in which the same data value occurs
in many consecutive data elements) as a single data value and count. For example, consider the
following image row with long runs of white pixels (W) and short runs of black pixels (B):
WWWWWWWWWWBWWWWWWWWWBBBWWWWWWWWWWWW
Applying the RLE data compression algorithm, the compressed code is: 10W1B9W3B12W (interpreted
as ten W's, one B, nine W's, three B's, twelve W's)
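A minimal Python sketch of run-length encoding and decoding in the style of this example (the function names rle_encode and rle_decode are illustrative, not part of any standard library):

from itertools import groupby

def rle_encode(data: str) -> str:
    # Replace each run of identical characters with "<count><char>"
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(data))

def rle_decode(code: str) -> str:
    # Parse "<count><char>" pairs back into the original runs
    out, count = [], ""
    for ch in code:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

data = "WWWWWWWWWWBWWWWWWWWWBBBWWWWWWWWWWWW"
print(rle_encode(data))                      # 10W1B9W3B12W
assert rle_decode(rle_encode(data)) == data  # lossless: the original is recovered exactly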
Lossless compression schemes are reversible, so that the original data can be
reconstructed exactly.
Lossy schemes accept some loss of data in order to achieve higher compression.
These lossy data compression methods typically offer a three-way trade-off between
compression speed, compressed size, and quality loss.
Lossless compression is a method of reducing the size of computer files without losing any
information. That means when you compress a file, it will take up less space, but when you
decompress it, it will still have exactly the same information. The idea is to get rid of any
redundancy in the information; this is exactly what is done in ZIP and GIF files. This
differs from lossy compression, such as in JPEG files, which loses some information that is not
very noticeable.
Entropy coding:
E.g. Huffman coding: estimate the probabilities of the symbols, code one symbol at a time, and
assign shorter codes to symbols with higher probabilities.
Huffman coding
Developed in the 1950s by David Huffman, Huffman coding is widely used for text compression,
multimedia coding, and message transmission.
The problem: given a set of n symbols and their weights (or frequencies), construct a tree
structure (a binary tree for a binary code) with the objective of reducing memory space and
decoding time per symbol. For instance, Huffman coding for text is constructed based on the
frequency of occurrence of letters in text documents.
The output of the Huffman encoder is determined by the model (probabilities). The higher the
probability of occurrence of a symbol, the shorter the code assigned to that symbol, and vice
versa. This allows the most frequently occurring symbols in the data to be encoded with fewer
bits and also reduces the time taken to decode each symbol.
Step 1: List the symbols with their frequencies (probabilities).
Step 2: Repeatedly combine the two symbols (or subtrees) with the smallest frequencies into a
new subtree whose weight is their sum, until a single tree remains.
Step 3: Associate the binary code 1 with the right branch and 0 with the left branch.
Step 4: Create a unique codeword for each symbol by traversing the tree from the root to the
leaf.
Example: Consider the 6-symbol alphabet given in the table below and construct the Huffman
code.
The Huffman encoding algorithm repeatedly picks the two symbols (or subtrees) with the smallest
frequencies and combines them.
Using the Huffman tree, a code table can be constructed by working down the tree from the root,
left to right. This gives the binary equivalent for each symbol in terms of 1s and 0s.
Character:  a   b   c   d   e   f
Frequency:  5   9   12  13  16  45
Binary Code:
a: 1100, b: 1101, c: 100, d: 101, e: 111, f: 0
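This construction can be sketched in Python with a priority queue; with the tie-breaking used below it reproduces the codes listed above, although other valid Huffman trees may assign different (equally short) codewords:

import heapq

def huffman_codes(freqs: dict) -> dict:
    # Each heap entry is (frequency, tie_breaker, {symbol: code_so_far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Combine the two subtrees with the smallest total frequencies
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

print(huffman_codes({"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}))
# {'f': '0', 'c': '100', 'd': '101', 'a': '1100', 'b': '1101', 'e': '111'}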
Exercise: construct the tree & binary code by using Huffman coding
Shannon-Fano Coding
1. For a given list of symbols, develop a corresponding list of frequency counts (or probabilities).
2. Sort the list of symbols in decreasing order of frequency, with the most frequently occurring
   symbols first.
3. Divide the list into two halves, with the total frequency counts of each half being as close as
   possible to each other.
4. The right half is assigned a code of 1 and the left half a code of 0.
5. Recursively apply steps 3 and 4 to each of the halves, until each symbol has become a
   corresponding code leaf on the tree. That is, treat each split as a list and apply splitting and
   code assigning until you are left with lists of single elements.
ShannonFano(sequence S)
    if S has two letters
        attach 0 to the codeword of one letter and 1 to the codeword of the other;
    else if S has more than two letters
        divide S into two subsequences S1 and S2 with the minimal difference between the total
        probabilities of the two subsequences;
        extend the codeword of each letter in S1 by attaching 0, and of each letter in S2 by attaching 1;
        ShannonFano(S1);
        ShannonFano(S2);
Example: Given five symbols A to E with their frequencies being 15, 7, 6, 6 & 5; encode
them using Shannon-Fano entropy encoding
Symbol:  A   B   C   D   E
Count:   15  7   6   6   5
Step 1: Say we are given five symbols (A to E) that can occur in a source, with their frequencies
being 15, 7, 6, 6 and 5. First sort the symbols in decreasing order of frequency.
Step 2: Divide the list into two halves such that the total counts of the two halves are as close as
possible to each other. In this case we split the list between B and C, and assign 0 to the first half
and 1 to the second.
Step 3: We recursively repeat the steps of splitting and assigning codes until each symbol becomes a
code leaf on the tree. That is, treat each split as a list and apply splitting and code assigning until
you are left with lists of single elements.
Step 4: Note that we split the list containing C, D and E between C and D because the difference
between the split lists is 11 - 6 = 5; if we had divided between D and E, the difference would have
been 12 - 5 = 7.
Step 5: We complete the algorithm and as a result have codes assigned to the symbols.
Example: Suppose a source with the following symbols and related probabilities:
S={A,B,C,D,E}
P={0.35,0.17,0.17,0.16,0.15}
Message to be encoded = "ABCDE"
The probability is already arranged in non-increasing order. First we divide the message into AB
and CDE. Why? This gives the smallest difference between the total probabilities of the two
groups.
S1={A,B} P={0.35,0.17}=0.52
S2={C,D,E} P={0.17,0.16,0.15}=0.48
The difference is only 0.52-0.48=0.04. This is the smallest possible difference when we divide
the message. Attach 0 to S1 and 1 to S2. Subdivide S1 into sub groups.
S11={A} attach 0 to this
S12={B} attach 1 to this
Again subdivide S2 into subgroups considering the probability again.
S21={C} P={0.17}=0.17
S22={D,E} P={0.16,0.15}=0.31
Attach 0 to S21 and 1 to S22. Since S22 has more than one letter in it, we have to subdivide it.
S22.1={D} attach 0
S22.2={E} attach 1
The message is transmitted using the following code (by traversing the tree)
A=00 B=01
C=10 D=110
E=111
Instead of transmitting ABCDE, we transmit 000110110111.
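A Python sketch of the Shannon-Fano procedure described above (splitting where the difference between the totals of the two halves is smallest); with the counts 15, 7, 6, 6, 5 from the first example it reproduces the codes A=00, B=01, C=10, D=110, E=111:

def shannon_fano(symbols, codes=None):
    """symbols: list of (symbol, frequency) pairs sorted by decreasing frequency."""
    if codes is None:
        codes = {s: "" for s, _ in symbols}
    if len(symbols) == 1:
        return codes
    # Find the split point that makes the totals of the two halves as close as possible
    total = sum(f for _, f in symbols)
    running, best_i, best_diff = 0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_i, best_diff = i, diff
    first, second = symbols[:best_i], symbols[best_i:]
    for s, _ in first:
        codes[s] += "0"       # left half gets 0
    for s, _ in second:
        codes[s] += "1"       # right half gets 1
    shannon_fano(first, codes)
    shannon_fano(second, codes)
    return codes

print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}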
Exercise: S={A,B,C,D,E} P={0.35,0.17,0.17,0.16,0.15}
Given the following symbols and their corresponding frequency of occurrence, find an optimal
binary code for compression:
Character:  a   b   c   d   e   f
Frequency:  6   5   12  17  10  25
a. Using the Huffman algorithm
Adaptive Huffman Coding
Adaptive Huffman coding, also known as dynamic Huffman coding, builds and updates the code tree
as the data is processed, in a single pass. The benefit of a one-pass procedure is that the source
can be encoded in real time, though it becomes more sensitive to transmission errors, since a single
loss can ruin the whole code. The implementation described here uses the Vitter algorithm.
Adaptive Huffman coding for a string of alphabetic characters:
Let m be the total number of letters in the alphabet, so m = 26.
For the Vitter algorithm, find parameters e and r such that
m = 2^e + r, with 0 <= r < 2^e
Therefore, for m = 26 we get e = 4 and r = 10.
There are two types of code: the NYT code and the fixed code.
NYT code = the path obtained by traversing the tree from the root node to the NYT (not yet
transmitted) node.
The fixed code is calculated from the following two conditions (first compute the binary
representation, then update the tree):
1. If 1 <= k <= 2r, then the letter Sk is encoded as the binary representation of (k - 1) in (e + 1) bits
(where k is the position of the letter in sorted alphabetical order).
2. Else, the letter Sk is encoded as the binary representation of (k - r - 1) in e bits.
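A small Python sketch of these two rules for e = 4 and r = 10 (the helper names are illustrative; k is the 1-based position of the letter in the alphabet):

def vitter_params(m: int):
    # Choose e and r so that m = 2**e + r with 0 <= r < 2**e
    e = m.bit_length() - 1
    return e, m - 2 ** e

def fixed_code(letter: str, e: int = 4, r: int = 10) -> str:
    k = ord(letter) - ord('a') + 1           # 1-based position in the alphabet
    if k <= 2 * r:                           # condition (1): (k-1) in e+1 bits
        return format(k - 1, f"0{e + 1}b")
    return format(k - r - 1, f"0{e}b")       # condition (2): (k-r-1) in e bits

print(vitter_params(26))    # (4, 10)
print(fixed_code('a'))      # 00000
print(fixed_code('v'))      # 1011  (k = 22 > 2r)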
Tree Update
Nodes are numbered in increasing order, i.e., by level and from left to right.
Nodes that have the same weight and the same type (internal or leaf) together form a block.
Blocks are ordered by increasing weight.
An internal node is represented by an oval shape; the weight of an internal node is the sum of its
child node weights.
An external (leaf) node is represented by a square shape; the weight of an external node is
initially 1 and is increased by 1 each time its symbol is repeated.
Example
code = "aardvark"
The final code we get is:
00000 1 010001 0000011 0001011 0 10 110001010
a a r d v a r k
Explanation:
For the string "aardvark", e = 4 and r = 10.
1. The tree is initialized with an NYT node of weight 0. Read symbol 'a' (k = 1), which is not yet
in the tree, so transmit the NYT code (empty, since the NYT node is the root) followed by the fixed
code for 'a': the binary representation of (k - 1 = 0) in e + 1 = 5 bits, i.e. 00000. Update the tree.
2. Read symbol 'a', which already exists in the tree. Traversing the tree up to symbol 'a', we get
code = "1".
The Huffman code for 'a' is "1"; now the tree is updated.
3. Read symbol 'r' (k = 18), which is not in the tree. The NYT code is "0" and, since 1 <= k <= 2r,
the fixed code is the binary representation of (k - 1 = 17) in 5 bits, i.e. 10001.
The Huffman code for 'r' is therefore "010001"; now the tree is updated.
4. Read symbol 'd' (k = 4), which is not in the tree. The NYT code is "00" and the fixed code is the
binary representation of (k - 1 = 3) in 5 bits, i.e. 00011, giving the code "0000011". Update the tree.
5. Read symbol 'v' (k = 22), which is not in the tree. The NYT code is "000".
For the fixed code: as k > 2r, i.e. 22 > 2 x 10, condition (2) is satisfied,
so the fixed code is the binary representation of (k - r - 1 = 11) in e = 4 bits, i.e. 1011, giving the
code "0001011".
Note: swap the left and right sub-trees where the tree violates the ordering property, then update the tree.
6. Read symbol 'a', which already exists in the tree. Traversing the tree up to symbol 'a', we get
code = "0".
The Huffman code for 'a' is "0"; now update the tree.
7. Read symbol 'r', which already exists in the tree. Traversing the tree up to symbol 'r', we get
code = "10".
The Huffman code for 'r' is "10"; now update the tree.
8. Read symbol 'k' (k = 11), which is not in the tree. The NYT code is "1100" and the fixed code is
the binary representation of (k - 1 = 10) in 5 bits, i.e. 01010.
The Huffman code for 'k' is therefore "110001010"; now update the tree.
Decoding
Steps for decoding:
1. Read the first e bits and convert them to decimal.
2. If the value is less than r (condition 1), read one more bit, i.e. e + 1 bits in total; the decimal
value of these e + 1 bits plus 1 gives the position k of the letter in the alphabet.
3. Otherwise (condition 2), the value of the e bits plus r plus 1 gives the position k.
4. Update the tree. For the following symbols, traverse the tree with the incoming bits: reaching a
leaf decodes that symbol, while reaching the NYT node means a new symbol follows, so read its fixed
code as in steps 1-3. Update the tree after every decoded symbol.
Example:
code = "00000101000100000110001011010110001010"
Begin decoding by reading the first e bits. The first 4 bits are 0000, which is 0 in decimal.
Since 0 < r, i.e. 0 < 10, condition (1) is satisfied.
According to condition (1), convert the first e + 1 = 5 bits to decimal and add 1:
00000 = 0
0 + 1 = 1, which is the value for the letter a.
Update the tree and add a node for the symbol 'a' in the tree.
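The fixed-code part of this decoding can be expressed as a small helper; the following Python sketch (a hypothetical decode_new_symbol function, with e = 4 and r = 10 as above) decodes the first symbol of the example code:

def decode_new_symbol(bits: str, pos: int, e: int = 4, r: int = 10):
    """Decode the fixed code starting at bits[pos]; return (letter, next position)."""
    value = int(bits[pos:pos + e], 2)
    if value < r:                                  # condition (1): take e+1 bits, add 1
        k = int(bits[pos:pos + e + 1], 2) + 1
        pos += e + 1
    else:                                          # condition (2): value + r + 1
        k = value + r + 1
        pos += e
    return chr(ord('a') + k - 1), pos

code = "00000101000100000110001011010110001010"
print(decode_new_symbol(code, 0))                  # ('a', 5) -- the first decoded symbol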
Read the next bit in the given code and traverse the tree. We reach the external leaf node 'a',
so the next decoded symbol is 'a'.
Update the tree.
Read the next bits of the given code and traverse the tree. We reach the NYT node at 0. After
reaching the NYT node, read e bits, which are 1000; 1000 in decimal is 8. As 8 < r,
condition (1) is satisfied.
Now convert the e + 1 bits to decimal and add 1:
10001 = 17
17 + 1 = 18, which is the value for the letter r.
Update the tree and add a node for the symbol 'r' in the tree.
Reading the next bits and traversing the tree, we reach the NYT node at 00. Read e bits,
which are 0001; 0001 in decimal is 1. As 1 < r, condition (1) is satisfied.
Now convert the e + 1 bits to decimal and add 1:
00011 = 3
3 + 1 = 4, which is the value for the letter d.
Update the tree and add a node for the symbol 'd' in the tree.
Reading the next bits and traversing the tree, we reach the NYT node at 000. Read e bits,
which are 1011; 1011 in decimal is 11. As 11 > r, condition (2) is satisfied.
Now add r + 1 to this value to decode the symbol:
11 + 10 + 1 = 22, which is the value for the letter v.
Update the tree and add a node for the symbol 'v' in the tree.
Reading the next bits and traversing the tree, we get symbol 'a' at 0. Update the tree.
Reading the next bits and traversing the tree, we get symbol 'r' at 10. Update the tree.
Reading the next bits and traversing the tree, we reach the NYT node at 1100. Read e bits,
which are 0101; 0101 in decimal is 5. As 5 < r, condition (1) is satisfied.
Now read e + 1 bits, 01010, which is 10 in decimal; adding 1 gives 11, which is the value for the
letter k. Update the tree and add a node for the symbol 'k'. The decoded message is therefore
"aardvark".
Lempel-Ziv Encoding
Data compression up until the late 1970s was mainly directed towards creating better methodologies
for Huffman coding. An innovative, radically different method was introduced in 1977 by
Abraham Lempel and Jacob Ziv. The zip and unzip utilities use the LZH technique, while UNIX's
compress methods belong to the LZW and LZC classes.
Lempel-Ziv compression
The problem with Huffman coding is that it requires knowledge about the data before encoding
takes place: Huffman coding requires the frequencies of symbol occurrence before codewords can be
assigned to symbols.
Lempel-Ziv compression does not rely on previous knowledge about the data; rather, it builds this
knowledge in the course of data transmission or data storage. The Lempel-Ziv algorithm (LZ)
uses a table of codewords created during data transmission, and it transmits the index of the
symbol/word instead of the word itself. Each time, it replaces strings of characters with a
reference to a previous occurrence of the string.
The multi-symbol patterns are of the form C0C1...Cn-1Cn. The prefix of a pattern
consists of all the pattern symbols except the last: C0C1...Cn-1.
Lempel-Ziv output: there are three options in assigning a code to each pattern in the list:
If a single symbol C is not yet in the dictionary, assign (0, C) and add C to the dictionary.
If a multi-symbol pattern C0C1...Cn is not in the dictionary, assign (dictionary index of the
prefix C0C1...Cn-1, Cn) and add the pattern to the dictionary.
If the last input symbol or the last pattern is already in the dictionary, assign
(dictionary prefix index, ) with no extra symbol.
Eg: Encode (i.e., compress) the string ABBCBCABABCAABCAAB using the LZ algorithm.
The compressed message is: (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)
Note: The above is just a representation; the commas and parentheses are not transmitted.
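A minimal Python sketch of this LZ encoder (an LZ78-style dictionary builder; the function name is illustrative), which reproduces the pairs shown above:

def lz78_encode(text: str):
    dictionary = {"": 0}          # index 0 is the empty prefix
    output, pattern = [], ""
    for ch in text:
        if pattern + ch in dictionary:
            pattern += ch          # keep extending the current pattern
        else:
            # Emit (index of prefix, new symbol) and add the pattern to the dictionary
            output.append((dictionary[pattern], ch))
            dictionary[pattern + ch] = len(dictionary)
            pattern = ""
    if pattern:                    # input ended while the pattern was already in the dictionary
        output.append((dictionary[pattern],))
    return output

print(lz78_encode("ABBCBCABABCAABCAAB"))
# [(0, 'A'), (0, 'B'), (2, 'C'), (3, 'A'), (2, 'A'), (4, 'A'), (6, 'B')]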
Exercise: Encode (i.e., compress) the following string using the Lempel-Ziv algorithm:
SATATASACITASA