0% found this document useful (0 votes)
3 views

Huffman coding

Uploaded by

tayyabxardar007
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views

Huffman coding

Uploaded by

tayyabxardar007
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

CUI Abbottabad

Department of Computer Science

Data Structures and Algorithms


Lecture No.
Applications of Tree

COMSATS University Islamabad, Abbottabad Campus


Agenda
❖ Applications of Tree
▪ Expression Tree
▪ Huffman Coding

COMSATS University Islamabad, Abbottabad Campus


Expression Trees

▪ Binary tree has many forms and applications. One form of


binary tree is the Expression tree which is very important in
compiler construction.

▪ An expression tree is defined as a binary tree that is built up


from arithmetical or logical expression by placing the operands
as the leave nodes and the operators as the interior nodes.

▪ For each binary operator, the left subtree consists of its left
operand and the right subtree consists of its right operand.
▪ For unary operators, one of the subtrees is empty.

3
Expression Trees
Expression Tree

(a+b*c)+((d*e+f)*g)
+

+ *

a * + g

b c * f

d e

5
Expression Tree

▪ The inner nodes contain operators while leaf


nodes contain operands.

+ *

a * + g

b c * f

d e
Expression Tree

▪ The tree is binary because the operators are


binary.

+ *

a * + g

b c * f

d e
Expression Tree

▪ Inorder traversal yields: a+b*c+d*e+f*g

+ *

a * + g

b c * f

d e
Enforcing Parenthesis
/* inorder traversal routine using the parenthesis */
void inorder(TreeNode treeNode)
{
if( treeNode != NULL )
{
if(treeNode.left != NULL && treeNode.right !=
NULL) //if not leaf
system.out.print("(“);
inorder(treeNode.left);
system.out.print(treeNode.data+" “);
inorder(treeNode.right);
if(treeNode.left != NULL && treeNode.right!=
NULL) //if not leaf
system.out.print(“)“); }
}
Expression Tree

▪ Inorder : (a+(b*c))+(((d*e)+f)*g)

+ *

a * + g

b c * f

d e
Expression Tree

▪ Postorder traversal: a b c * + d e * f + g * +
which is the postfix form.
+

+ *

a * + g

b c * f

d e
Constructing Expression Tree

▪ Algorithm to convert postfix expression into an


expression tree.
▪ We already have an expression to convert an
infix expression to postfix.
▪ Read a symbol from the postfix expression.
▪ If symbol is an operand, put it in a one node tree
and push it on a stack.
▪ If symbol is an operator, pop two trees from the
stack, form a new tree with operator as the root
and T1 and T2 as left and right subtrees and push
this tree on the stack.
Constructing Expression Tree

▪ ab+cde+**

stack
Constructing Expression Tree

▪ ab+cde+**
top
If symbol is an
operand, put it in a
one node tree and
a b push it on a stack.

Stack is growing left to right


Constructing Expression Tree

▪ ab+cde+**

If symbol is an operator,
pop two trees from the
stack, form a new tree
with operator as the root
+
and T1 and T2 as left and
right subtrees and push
a b this tree on the stack.

Stack is growing left to right


Constructing Expression Tree

▪ ab+cde+**

+ c d e

a b
Constructing Expression Tree

▪ ab+cde+**

+ c +

a b d e
Constructing Expression Tree

▪ ab+cde+**

+ *

a b c
+

d e
Constructing Expression Tree

▪ ab+cde+**

+ *

a b c
+

d e
Huffman coding
Huffman Encoding
▪ Data compression plays a significant role in
computer networks.
▪ To transmit data to its destination faster, it is
necessary to either increase the data rate of the
transmission media or to simply send less data.
▪ Improvements with regard to the transmission
media has led to increase in the rate.
▪ The other options is to send less data by means
of data compression.
▪ Compression methods are used for text, images,
voice and other types of data.
Huffman Encoding
▪ Huffman code is method for the compression for
standard text documents.
▪ It makes use of a binary tree to develop codes of
varying lengths for the letters used in the
original message.
▪ Huffman code is also part of the JPEG image
compression scheme.
▪ The algorithm was introduced by David
Huffman in 1952 as part of a course assignment
at MIT.
Huffman Encoding
▪ To understand Huffman encoding, it is best to
use a simple example.
▪ Encoding the 32-character phrase: "traversing
threaded binary trees",
▪ If we send the phrase as a message in a network
using standard 8-bit ASCII codes, we would have
to send 8*32= 256 bits.
▪ Using the Huffman algorithm, we can send the
message with only 116 bits.
Huffman Encoding
▪ List all the letters used, including the "space"
character, along with the frequency with which
they occur in the message.
▪ Consider each of these (character,frequency)
pairs to be nodes; they are actually leaf nodes, as
we will see.
▪ Pick the two nodes with the lowest frequency,
and if there is a tie, pick randomly amongst
those with equal frequencies.
Huffman Encoding
▪ Make a new node out of these two, and make the
two nodes its children.

▪ This new node is assigned the sum of the


frequencies of its children.

▪ Continue the process of combining the two


nodes of lowest frequency until only one node,
the root, remains.
Greedy Algorithm: Huffman Encoding

26
Greedy Algorithm: Huffman Encoding

27
ABBRA CADA BBRA
Total bit=15 x 8=120 bits
ASCII
HC bit=?

28
29
30
31
Huffman Code Problem
• Left tree represents a fixed length encoding scheme
• Right tree represents a Huffman encoding scheme

32
Example

33
Huffman Encoding
Original text:
traversing threaded binary trees
size: 33 characters (space and newline)

NL : 1 i: 2
SP : 3
a: 3 n: 2
b: 1 r: 5
d: 2 s: 2
e: 5
g: 1
t: 3
h: 1 v: 1
y: 1
Huffman Encoding

2 is equal to sum
of the frequencies of
the two children nodes.

e r
a t
3 3 5 5

d i n s 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

There a number of ways to combine


nodes. We have chosen just one such
way.

e r
a t
3 3 5 5

d i n s 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

e r
a t
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

e r
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding
33

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
Huffman Encoding

▪ Start at the root. Assign 0 to left branch and 1 to


the right branch.
▪ Repeat the process down the left and right
subtrees.
▪ To get the code for a character, traverse the tree
from the root to the character leaf node and read
off the 0 and 1 along the path.

43
Huffman Encoding
33
0 1

14 19

9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
44
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8

4 e r 5
a t 4 4
3 3 5 5

d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
45
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8
0 1 0 1
0 1 0 1
4 e r 5
a t 4 4
3 3 5 5
0 1 0 1 0 1 0 1
d i n s 2 2 2 SP
2 2 2 2 3
NL b g h v y
1 1 1 1 1 1
46
Huffman Encoding
33
0 1

14 19
0 1 0 1
9 10
6 8
0 1 0 1
0 1 0 1
4 e r 5
a t 4 4
3 3 5 5
0 1 0 1 0 1 0 1
d i n s 2 2 2 SP
2 2 2 2 0 1 0 1 0 1 3
NL b g h v y
1 1 1 1 1 1
47
Huffman Encoding

Huffman character codes

NL  10000 • Notice that the code is


SP  1111 variable length.
a  000
b  10001 • Letters with higher
d  0100 frequencies have shorter
e  101 codes.
g  10010
h  10011 • The tree could have been
i  0101 built in a number of ways;
n  0110 each would yielded
r  110
s  0111 different codes but the
t  001 code would still be
v  11100 minimal.
y  11101
48
Huffman Encoding
Original: traversing threaded binary trees

Encoded:

0011100001110010111001110101011010010111100
110011110101000010010101001111100001010110
00011011101111100111010110101110000

49
Huffman Encoding

Original: traversing threaded binary trees


With 8 bits per character, length is 264.

Encoded:
0011100001110010111001110101011010010111100
110011110101000010010101001111100001010110
00011011101111100111010110101110000

Compressed into 122 bits, 54% reduction.

50
Summary
❖ Summary

COMSATS University Islamabad, Abbottabad Campus


THANK YOU

Q&A

COMSATS University Islamabad, Abbottabad Campus

You might also like