Lab Sheet: Faculty of Engineering
Lab Sheet: Faculty of Engineering
Lab Sheet: Faculty of Engineering
FACULTY OF ENGINEERING
LAB SHEET
ETN2126
TRIMESTER 2 (2017/2018)
Note: Students are advised to read through this lab sheet before doing experiment. On-
the-spot evaluation may be carried out during or at the end of the experiment. Your
performance, teamwork effort, and learning attitude will count towards the marks.
0
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
Multimedia University
Experiment 1: Huffman Coding
1. OBJECTIVE
To apply source coding techniques using Huffman Codes.
To design and generate Huffman Codes with MATLAB software.
2. SOFTWARE REQUIRED
MATLAB
3. INTRODUCTION TO MATLAB
3.1 MATLAB
MATLAB is a powerful collection of tools for algorithm development, computation and
visualization. It provides more control and flexibility compared to a traditional high-
level programming language. Unlike such languages, MATLAB is compact and easy to
learn, letting you express algorithms in concise, readable code. In addition, MATLAB
provides an extensive set of ready-to-use functions including mathematical and matrix
operations, graphics, color and sound control, and low-level file I/O. MATLAB is readily
extendable - you can use the MATLAB language to easily create functions that operate as
part of the MATLAB environment.
The MATLAB User's Guide describes how to work with the MATLAB language. It
discusses how to enter and manipulate data and use MATLAB's extensive collection of
functions. It explains how to perform command-line computations and how to create
your own functions and scripts. The MATLAB Reference Guide provides reference
descriptions of supplied MATLAB functions and commands.
3.2 SIMULINK
SIMULINK is a dynamic system simulation environment. It allows you to represent
systems as block diagrams, which you build using your mouse to connect blocks and
your keyboard to edit block parameters.
The SIMULINK User's Guide describes how to work with SIMULINK. It also provides
reference descriptions of each block in the standard SIMULINK libraries.
1
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
You can use the toolbox directly from the MATLAB workspace. You can use the
SIMULINK environment to construct a simulation block diagram for your
communication system. For convenience, the Communications Toolbox provides online
interactive examples that use any of the function found in this toolbox.
4. MATLAB FUNDAMENTALS
In this section, we will illustrate some commands used for this experiment. (For additional
information on the individual MATLAB commands, we refer you to the MATLAB User's Guide or
the online help facility.)
>>a = 4/3
a=
1.3333
In general, it is better to use appropriate and memorable names for variables. MATLAB
recognizes the first 19 characters of a variable name but the first character in a variable
name be a letter. MATLAB is case sensitive and all MATLAB commands are written in
lowercase letters.
2
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
If you prefer to create a new variable but do not wish to see the MATLAB response, type
a semicolon at the end of the expression. For example,
>>b = 4+7;
will create a new variable b whose value is 11, but MATLAB will not display the value
of b.
Then type magphase at the MATLAB prompt will yield the following MATLAB
response:
mag=
0.4472
phase=
-0.4636
which are the magnitude and phase of the transfer function G(j) = l/(j + 2) evaluated at
= 1. By entering help magphase will give you back the text in the comment lines at
the beginning of the file:
It should be obvious that it is very desirable to include at least some brief comments as a
header to each m-file.
3
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
4.3 Matrices
Matrices are entered into MATLAB by listing the elements of the matrix and enclosing
them within a pair of brackets. Commas or blanks separate elements of a single row, and
semicolons or carriage returns separate rows. For example,
>>A = [1 2; 3 4]
A=
1 2
3 4
To get the dimensions of a matrix, we can use the size command, e.g.,
>>[K,N] = size(A)
K=
2
N=
2
Special vectors can be created using the ':' operator. For example, the command
>>k= 1:10
Using the simple commands described thus far, you can easily manipulate both matrices
and vectors. For example, to add a row onto matrix A we type,
To extract the submatrix of A that consists of the first through second columns of the
second through third rows, use vectors as indices as follows:
>>B = A(2:3,1:2)
B=
3 4
7 8
MATLAB has commands for generating special matrices. For example, you can create a
n x n identity matrix with the eye(n). The zeros, ones and rand commands work
4
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
similarly to eye and make matrices with elements equal to zero, elements equal to one,
and random elements, respectively. These commands can also be used to create non-
square matrices. For example, zeros(2,4) generates a 2 x 4 matrix of zeros.
A) HUFFMANDICT
DICT = HUFFMANDICT(SYM, PROB)
Generates a binary Huffman code dictionary using the maximum variance algorithm for
the distinct symbols given by the SYM vector. The symbols can be represented as a
numeric vector or single-dimensional alphanumeric cell array. The second input, PROB,
represents the probability of occurrence for each of these symbols. SYM and PROB must
be of same length.
B) HUFFMANENCO
ENCO = HUFFMANENCO(SIG, DICT)
Encodes the input signal, SIG, based on the code dictionary, DICT. The code dictionary is
generated using the HUFFMANDICT function. Each of the symbols appearing in SIG
must be present in the code dictionary, DICT. The SIG input can be a numeric vector or a
single-dimensional cell array containing alphanumeric values.
C) HUFFMANDECO
HUFFMANDECO(COMP, DICT)
Decodes the numeric Huffman code vector COMP using the code dictionary, DICT. The
encoded signal is generated by the HUFFMANENCO function. The code dictionary can
be generated using the HUFFMANDICT function. The decoded signal will be a numeric
vector if the original signals are only numeric. If any signal value in DICT is
alphanumeric, then the decoded signal will be represented as a single-dimensional cell
5
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
array.
5. THEORY
For a source encoder to be efficient, we require knowledge of the statistics of the source.
In particular, if some source symbols are known to be more probable than the others, then
we may exploit this feature in the generation of a source code by assigning short code
words to frequent source symbols, and long code words to rare source symbols. We refer
to such a source code as a variable-length code.
An efficient source encoder should satisfy the following two functional requirements:
1. The code words produced by the encoder are in binary form.
2. The source code is uniquely decodable, so that the original source sequence can be
reconstructed perfectly from the encoded binary sequence.
L = pklk k = 0, 1, …, K-1
L represents the average number of bits per source symbol used in the encoding process.
Let Lmin denote the minimum possible value of L. We then define the coding efficiency
of the source encoder as
= Lmin /L
With L Lmin, we clearly have 1. The source encoder is said to be efficient when
approaches unity.
L H(S)
6
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
Accordingly, the entropy H(S) represents a fundamental limit on the average number of
bits per source symbol necessary to represent a discrete memoryless source. In other
words, L cannot be made smaller than H(S).
Thus with Lmin = H(S), we may rewrite the efficiency of a source encoder in terms of the
entropy H(S) as
= H(S) /L
6. PROCEDURE
Step 1: Start the MATLAB program.
Step 2: In the Command Window, type the MATLAB code line by line, as given in
Example (see Section 7). Observe the result after typing each line.
Alternatively, you can type the entire MATLAB code in an m-file (this is actually the
preferred method). By saving the m-file, you can run the code at a later date, or you can
make minor modifications without retyping the entire code, or you can copy the code to
another computer to be executed there.
7
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
Step 5: Save the m-file by going to File -> Save As. Choose a short name for your m-
file. Make sure it contains no spaces.
Step 6: To run the m-file from the MATLAB Editor window, go to Tools -> Run. You
can also execute the m-file from the Command Window by typing the m-file
name (without the “.m”) on the command line.
7. EXAMPLE
STEP 1: Generating the message sequence
Let us generate a message which consists of one sentence of text: 'information theory is
interesting'. The message will be stored in a variable called msg. In the MATLAB
command window, type
msg = 'information theory is interesting'
Our message contains a total of 33 characters, including spaces.
Observe that we have enclosed the elements of symb with curly braces { } instead of the
usual square brackets [ ]. Doing so indicates that symb is a cell array, instead of a
character array. The variable symb will become an input to a later function called
huffmandict, which requires that the source alphabet be of cell data type.
8
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
Like the variable symb previously, the variable dict is also a cell array. In order to
access the elements in dict, we again use curly braces { }. To see the Huffman code for
the first symbol, type:
dict{1,:}
To view the Huffman code for the rest of the symbols, simply increment the subscript
value:
dict{2,:}
dict{3,:}
…
8. PROBLEM
1. Generate a message sequence that incorporates your name. The message should be
phrased as
'my name is <insert name here>'
Store the message in a variable called myname.
2. Estimate the source alphabet and source statistics of your message. Store the source
alphabet in the variable salpha, and the source statistics in the variable sstat.
3. Draw the Huffman tree corresponding to the information found in part 2 (do this by
9
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
hand). Trace the branches in the Huffman tree to construct the Huffman codebook.
4. Type the MATLAB code that automatically generates the Huffman codebook.
Compare this codebook with the one constructed manually in part 3. Is there any
difference between the two codebooks? If so, why?
5. Encode your message sequence and convert it to a binary stream.
6. Answer the following questions:
(a) How many bits were used to encode your message?
(b) If 7-bit ASCII were employed instead of Huffman, what would the length of
your binary stream be?
(c) Estimate the percentage of compression that is achieved by Huffman coding
relative to using 7-bit ASCII.
(d) Calculate the efficiency of your Huffman coder.
9. REPORT
Print out and submit your report 2 weeks after the laboratory session. The report should
contain the following:
Introduction
Procedures
Results and Discussion
Conclusions
Appendices (MATLAB code)
Lab report writing is an individual work. Fabricating results and copying the report of
others are strictly prohibited. Severe penalties will be imposed for any such offences.
10
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
FACULTY OF ENGINEERING
11
IT1: HUFFMAN CODING
ETN2126 INFORMATION THEORY AND ERROR CODING
12
IT1: HUFFMAN CODING