SW Lab 3 Fixed Point Simulation EE 462
Lab Goals
• Understand fixed-point representation of numbers and fixed-point computations.
• Compare fixed-point and floating-point implementations of a simple function through simulation in Matlab.
• Understand subtle issues related to quantization of the coefficients in an FIR filter.
• Handle overflow issues.
1 Introduction
The processor you will use in the hardware labs, the TMS320C5515, is a fixed-point processor. Fixed-point
arithmetic is generally used when hardware resources are limited and a reduction in accuracy can be tolerated
in return for higher execution speed. Fixed-point processors are typically 16-bit or 24-bit devices, while floating-point
processors are usually 32-bit devices. A typical 16-bit processor, such as the TMS320C55x, stores data as a 16-bit
integer or as a fraction in a fixed range. Although signals are stored with only 16 bits, intermediate values
during arithmetic operations may be kept at 32 bits or more. This is done using the internal 40-bit accumulator,
which helps reduce cumulative round-off errors. Fixed-point DSP devices are usually cheaper and faster than
their floating-point counterparts because they use less silicon, have lower power consumption, and require fewer
external pins. Most high volume low cost embedded applications, such as appliance control, cellular phones,
hard disk drives, modems, audio players, and digital cameras use fixed-point processors.
Floating-point arithmetic greatly expands the dynamic range of numbers. A typical 32-bit floating-point DSP
processor, such as the TMS320C67x, represents numbers with a 24-bit mantissa and an 8-bit exponent. The
mantissa represents a fraction in the range -1.0 to +1.0, while the exponent is an integer that represents the
number of places that the binary point must be shifted left or right in order to obtain the true value. For
example, in the decimal number system the number 10.2345 can be written in the form
0.102345 × 10². In this format, 0.102345 is called the mantissa and 2 is called the exponent.
A 32-bit floating-point format covers a large dynamic range. Thus the data dynamic range restriction may be
virtually ignored in a design using floating-point DSP processors. But in fixed-point format, the designer has
to apply scaling factors and other techniques to prevent arithmetic overflow, which is usually a difficult and
time-consuming process. As a result, floating-point DSP processors are generally easier to program and use,
but are more expensive and have higher power consumption. In this session, you will learn about the issues related
to fixed-point arithmetic.
2 Fixed-point Formats
Two’s complement Integer Representation
This is the most common method for representing a signed integer. The two's complement of a binary
number is defined as the value obtained by subtracting the number from a large power of two (specifically,
from 2^{N} for an N-bit word); i.e., a negative number x is represented as 2^{N} + x. With N = 4,
x = (−6)₁₀ is represented as 16 + (−6) = (10)₁₀ = 1010₂. As far as the hardware is concerned, fixed-point
number systems represent data as B-bit integers. The two's-complement number system usually used is:
k = { binary integer representation of k,      if 0 ≤ k ≤ 2^{B−1} − 1
    { (bitwise complement of |k|) + 1,         if −2^{B−1} ≤ k < 0        (1)
The most significant bit is known as the sign bit. It is 0 when the number is non-negative and 1 when the
number is negative.
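This encoding rule can be sketched in a few lines (an illustrative Python snippet; the lab itself uses Matlab):

```python
def to_twos_complement(k, B=4):
    """Encode integer k as a B-bit two's-complement bit string."""
    if not -(2 ** (B - 1)) <= k <= 2 ** (B - 1) - 1:
        raise ValueError(f"{k} is out of range for {B} bits")
    if k < 0:
        k += 2 ** B          # a negative k is represented as 2^B + k
    return format(k, f"0{B}b")

print(to_twos_complement(-6))   # '1010', matching the N = 4 example above
print(to_twos_complement(5))    # '0101'
```

Note that the most significant bit of the result is 1 exactly when the input is negative, as described above.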
Figure 1 is an easy way to visualize two’s complement representation.
The word-length is B (= M + 1) bits, i.e., M magnitude bits and one sign bit. The most significant bit (MSB)
is the sign bit b_0, which represents the sign of the number as follows:

b_0 = { 0, if x ≥ 0
      { 1, otherwise        (2)

The remaining M bits give the magnitude of the number. The rightmost bit b_M is called the least significant
bit (LSB), which determines the precision of the number.
The decimal value corresponding to a fractional two's-complement binary number x can be expressed as

x = −b_0 + Σ_{m=1}^{M} b_m 2^{−m}        (4)
Figure 3 provides a visual representation of a 3-bit fractional representation.
For example, from Figure 2, the easiest way to convert a normalized 16-bit fractional binary number into an
integer that can be used by the C55x assembler is to move the binary point to the right by 15 bits (to the right of
b_M). Since shifting the binary point 1 bit to the right is equivalent to multiplying the fractional number by 2, moving
the binary point to the right by 15 bits can be done by multiplying the decimal value by 2¹⁵ = 32768.
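This conversion can be written out as a short sketch (in Python for illustration; the lab itself uses Matlab):

```python
def q15_from_float(x):
    """Convert a fraction in [-1, 1) to its 16-bit Q15 integer (round to nearest)."""
    if not -1.0 <= x < 1.0:
        raise ValueError("Q15 can only represent values in [-1, 1)")
    # shift the binary point right by 15 bits, i.e. multiply by 2^15,
    # clamping at the largest representable Q15 value
    return min(int(round(x * 32768)), 32767)

def q15_to_float(k):
    """Recover the fractional value represented by a Q15 integer."""
    return k / 32768.0

print(q15_from_float(0.5))    # 16384
print(q15_from_float(-1.0))   # -32768
print(q15_to_float(16384))    # 0.5
```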
Q-format
An alternate format for storing rational numbers is Qn.m format as illustrated in Figure 4. There are n bits
to the left of the binary point representing integer portion, and m bits to the right, representing a fractional
value. We have N=n+m+1, where N = word length in bits (ex. 16).
The choice of n and m controls the trade-off between the range and the resolution of the representation. The most
popular fractional number format is Q0.15 (n = 0 and m = 15), which is simply referred to as Q15
format since there are 15 fractional bits.
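The range/resolution trade-off can be made concrete with a small sketch (illustrative Python; the lab itself uses Matlab's fi objects):

```python
def q_format_properties(n, m):
    """Range and resolution of a signed Qn.m number (1 sign bit, n integer bits, m fraction bits)."""
    resolution = 2.0 ** -m              # value of one LSB
    max_val = 2.0 ** n - resolution     # largest representable value
    min_val = -(2.0 ** n)               # most negative representable value
    return min_val, max_val, resolution

# Q0.15 (Q15): covers [-1, 1) with fine resolution 2^-15
print(q_format_properties(0, 15))   # (-1.0, 0.999969482421875, 3.0517578125e-05)
# Q3.12: wider range, coarser resolution
print(q_format_properties(3, 12))   # (-8.0, 7.999755859375, 0.000244140625)
```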
4 Round-off Error in Fixed Point Implementation
Fixed-point arithmetic is often used with DSP hardware for real-time processing because it offers fast operation
and relatively economical implementation. Its drawbacks include a smaller dynamic range and lower resolution. A
fixed-point representation also introduces arithmetic errors which, if not handled correctly, will lead to erroneous
operation. These errors have to be handled so as to strike a trade-off among accuracy, speed,
and cost.
Consider the multiplication of two binary fractions, as shown in Figure 4. From this calculation, we see that
full-precision multiplication almost doubles the number of bits. If we wish to return the product to a b-bit
representation, we must truncate the (b − 1) least significant bits; this introduces truncation error. A closely
related error is round-off error, where the product is rounded to the nearest b-bit fractional value
rather than truncated. Both types of error occur after the multiplication.
Another useful and important aspect that needs to be associated with a fixed-point object created using fi is
its behaviour during arithmetic operations. This is controlled by the fimath setting, which will be discussed
in the next section, and is specified as shown next:
>> fm = fimath;
>> x = 3.147;
>> xf = fi(x, 1, 16, 2, fm);
In this example, we are using the default settings of fimath, so this behaves similarly to the earlier case where
we did not specify fm when creating the fixed-point object xf. Note that you can access several of the
object's properties using the 'dot' form, as shown next.
>> xf.bin
or
>> xf.WordLength
will display the binary form of xf and other fixed-point properties. Try typing the tab key after placing a '.'
to see other properties that can be useful.
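As a rough Python analogue of what fi(x, 1, 16, 2) does (a sketch only; Matlab's default fimath also rounds to nearest and saturates on overflow):

```python
def quantize(x, signed=1, word_length=16, frac_length=2):
    """Quantize x like a fixed-point object: round to nearest, saturate on overflow."""
    lsb = 2.0 ** -frac_length
    if signed:
        lo, hi = -(2 ** (word_length - 1)), 2 ** (word_length - 1) - 1
    else:
        lo, hi = 0, 2 ** word_length - 1
    k = round(x / lsb)                  # stored integer
    k = max(lo, min(hi, k))             # saturate to the representable range
    return k * lsb                      # quantized real value

print(quantize(3.147))   # 3.25 -- the nearest multiple of 2^-2
```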
6 Lab Exercise
6.1 Computing y = a x + b
1. We will compare fixed-point computation and floating-point computation by considering the operation
y = a x + b. Go through the given Matlab file floating_sim.m and run it. This is a full-precision
computation using Matlab's default double-precision representation. Note down the range of values for each
of the variables of interest; these ranges will be used to identify the required fraction and integer
lengths in the fixed-point simulation.
2. Use fixed_sim_template.m to simulate the computation using fixed-point arithmetic. Choose fraction
lengths for the variables in the file so that your fixed-point result (yf) closely matches the floating-point
result (y) from the previous step. You need to complete the code with appropriate options to perform
valid fixed-point computations; the comments in the file can be used to complete the code. Compare
the floating-point result with the fixed-point result by plotting the error, or difference, between y and yf.
3. For this part use the file overflow_sim_template.m. The computation is similar to the previous parts, but
the result will overflow. Hence, appropriately set the fixed-point overflow mode in addition to making an
appropriate choice of fraction lengths. Compare the floating-point result with the fixed-point result by
plotting the error, or difference, between y and yf. This needs to be done for two different values of b,
as can be seen in the template file.
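The effect of the overflow mode can be sketched as follows (illustrative Python; in the lab the mode is set through Matlab's fimath, e.g. its OverflowAction property):

```python
def fit_to_int16(k, mode="saturate"):
    """Force an integer into 16-bit two's-complement range, saturating or wrapping."""
    lo, hi = -32768, 32767
    if mode == "saturate":
        return max(lo, min(hi, k))
    # "wrap": keep only the low 16 bits, reinterpreted as two's complement
    k &= 0xFFFF
    return k - 65536 if k > hi else k

print(fit_to_int16(40000, "saturate"))  # 32767 -- clipped to the largest value
print(fit_to_int16(40000, "wrap"))      # -25536 -- wraps around to a negative value
```

Wrapping is cheap in hardware but produces wildly wrong values on overflow; saturation clips to the nearest representable value, which is usually the less damaging behaviour for signals.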
2. Use the Set quantization parameters option in the FD to use fixed-point arithmetic. Use a word-length
(W) of 16 and a fraction-length (F) of 15. In the Input/Output options, set the word length to 16 and the input
range to ±1. All other options can be left as is. Design this quantized filter by pressing the Apply
button. You will see the frequency responses Lowpass Reference and Lowpass Quantized overlaid in
the frequency response panel. Export these coefficients as b_16.
3. Do as in the previous part, but use W = 8 and F = 7. Export these coefficients as b_8.
4. Now use W = 4 and F = 3. Export these coefficients as b_4.
5. Save all these coefficients in a MAT file, so that you can compare the responses.
>> save(’LPF_coefficients’, ’b_ref’, ’b_16’, ’b_8’, ’b_4’)
6. Demonstrate to your TA that the filters meet the specifications. Also indicate the difference in gain/attenuation
due to the use of quantized coefficients.
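The coefficient quantization itself can be sketched in a few lines (illustrative Python; b_ref below is a hypothetical stand-in for the exported reference coefficients, not the actual lab values):

```python
def quantize_coeffs(b, W, F):
    """Round each coefficient to the nearest multiple of 2^-F, saturating to W bits."""
    lsb = 2.0 ** -F
    hi = 2 ** (W - 1) - 1
    lo = -(2 ** (W - 1))
    return [max(lo, min(hi, round(c / lsb))) * lsb for c in b]

# hypothetical 5-tap lowpass coefficients, for illustration only
b_ref = [0.0625, 0.25, 0.375, 0.25, 0.0625]
for W, F in [(16, 15), (8, 7), (4, 3)]:
    bq = quantize_coeffs(b_ref, W, F)
    dc_gain = sum(bq)        # filter gain at 0 Hz
    print(W, dc_gain)
```

At small word lengths, coefficients near zero round away entirely, which is one source of the gain/attenuation differences you are asked to report.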
We will use the above filters to filter a signal that is the sum of two sinusoidal input signals, to verify the filter
performance. You can use the provided FIR_template.m to complete this part.
1. Use the FIR_template.m file to filter a signal x, which is the sum of two sinusoids x1 and x2. Signal x1 has
frequency f1 = 200 Hz and x2 has frequency f2 = 2000 Hz.
2. Filter the signal x through each of the filters designed above to obtain the outputs y_ref, y16, y8, and y4.
Verify that the output signals are as expected. Plot and compare the output signals and the error signals
(|y_ref - y16|, |y_ref - y8|, and |y_ref - y4|). Demonstrate this to your TA.
3. Compare the mean-square errors (MSE) of the outputs y16, y8, and y4 against the
double-precision output y_ref. While doing this, you can ignore the initial outputs, depending on the
filter length. Compare the three MSEs in dB and show these to your TA.
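The MSE-in-dB comparison can be sketched as follows (illustrative Python; y_ref and y16 below are made-up stand-ins for the lab's Matlab vectors, and skip stands for the filter transient length):

```python
import math

def mse_db(y_ref, y, skip=0):
    """Mean-square error between two sequences in dB, ignoring the first `skip` samples."""
    errs = [(a - b) ** 2 for a, b in zip(y_ref[skip:], y[skip:])]
    mse = sum(errs) / len(errs)
    return 10 * math.log10(mse)

y_ref = [0.0, 0.5, 1.0, 0.5, 0.0]
y16 = [0.0, 0.501, 0.999, 0.5, 0.001]
print(mse_db(y_ref, y16, skip=1))   # around -61 dB for this toy data
```

Larger (more negative) values in dB indicate a smaller error; you should see the MSE grow as the coefficient word length shrinks from 16 to 8 to 4 bits.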