EEE4120F Project
Prepared by:
Best Nkumeleni (NKHBES001)
Kananelo Chabeli (CHBKAN001)
Malefetsane Lenka (LNKMA001)
Rumbidzai Mashumba (MSHRUM006)
Prepared for:
EEE4120F
Department of Electrical Engineering
University of Cape Town
1. I know that plagiarism is wrong. Plagiarism is to use another’s work and pretend that it is one’s
own.
2. I have used the IEEE convention for citation and referencing. Each contribution to, and quotation
in, this report from the work(s) of other people has been attributed, and has been cited and
referenced.
4. I have not allowed, and will not allow, anyone to copy my work with the intention of passing it
off as their own work or part thereof.
December 22, 2024
Contents
List of Figures
1 Introduction
1.1 Background Theory
1.2 Objectives
1.3 Scope & Limitation
1.4 Report Outline
1.4.1 git repo
2 Literature Review
2.1 Image Filtering Techniques
2.1.1 Linear Filtering Techniques
2.1.2 Non-Linear Filtering Techniques
2.2 Median Filter Typologies
2.3 Field-Programmable Gate Array (FPGA) Realization of Median Filter
2.4 Conclusion
3 System Design
3.1 System Specifications
3.2 Prototype Design
3.2.1 Controller Module
3.2.2 Memory Module
3.2.3 Filter Module
3.2.4 Integrated System
3.3 FPGA
3.4 Performance Evaluation
3.4.1 Benchmarking Suite
5 Discussion
5.1 Comments on Unit Test Results
5.1.1 Controller Module
5.1.2 Filter Module
5.1.3 Memory Module Test
5.2 Comments on Integration Test Results
5.3 Comments on Benchmark Results
5.4 Comments on FPGA Simulation Results
Glossary
Bibliography
List of Figures
Chapter 1
Introduction
Many image processing algorithms have been proposed in the literature, and they are generally
classified as linear or nonlinear [2]. Nonlinear filtering algorithms are often preferred
because of their robustness and denoising power [1]. Median filtering is widely used for its
ability to preserve important image features while reducing the effects of noise. The filter replaces each
pixel value with the median value of the neighbouring pixels in a chosen window. Figure 1.1 illustrates this
algorithm.
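The windowed replacement described above can be sketched in a few lines of Python. This is an illustrative software model only, not the hardware design presented later in the report. The function name median_filter and the choice to clamp coordinates at the image border are assumptions made for this example.

```python
from statistics import median_low


def median_filter(image, window_size=3):
    """Return a median-filtered copy of a 2-D list of pixel values."""
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd")
    height, width = len(image), len(image[0])
    half = window_size // 2
    output = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    # Assumption: coordinates outside the image are clamped
                    # to the nearest border pixel.
                    r = min(max(row + dr, 0), height - 1)
                    c = min(max(col + dc, 0), width - 1)
                    window.append(image[r][c])
            # The window holds an odd number of values, so median_low
            # returns the true median.
            output[row][col] = median_low(window)
    return output


# A single bright outlier in a flat region is removed by a 3x3 window.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = median_filter(noisy, window_size=3)
```

In this toy input the isolated value 255 never forms a majority in any window, so every output pixel takes the background value.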
In [2], the author investigates the effect of window size on denoising and finds that, for small
window sizes, the filter performs well at low noise densities but poorly when the noise density is high.
Also, as can be seen from Figure 1.1, the filter involves several operations, such as sorting
and two-dimensional computation, that demand considerable computing speed and
memory. FPGA implementation is therefore an important option for deploying image filtering
algorithms in practice. The authors of [1] argue that SRAM-based FPGAs offer the highest performance.
This report presents an analysis of the median filtering algorithm, its implementation in
hardware, and a comparative performance evaluation across different computational platforms.
Through this exploration, we aim to provide insight into the efficacy and versatility of
hardware-accelerated median filtering for noise reduction in digital images.
1.2 Objectives
Therefore, the objectives of the project are to:
• Implement and optimise a median filter and an edge detection algorithm in Verilog, either on an
FPGA or in simulation.
• Evaluate the performance of the median filter in terms of processing speed and overall image
quality improvement.
• Handle PNG, JPEG, and JPG images of arbitrary size.
• Realize the filter with a window approach, with the window size left as a user-adjustable
parameter.
https://github.com/BestNkhumeleni/Yodap roject
Chapter 2
Literature Review
Image filtering is a fundamental operation in image processing, used for tasks such as noise
reduction, edge enhancement, and image sharpening. The many available filtering techniques
can be broadly categorized into linear and non-linear filters. This review explores the landscape
of image filtering, focusing particularly on median filtering and its implementation on
field-programmable gate arrays (FPGAs).
The mean filter, also known as an averaging filter, is a simple linear filter used primarily for reducing
noise and smoothing images. It replaces a pixel’s value with the average of its neighbors. While
effective for noise reduction, it tends to blur sharp edges [3].
Gaussian Filter
A Gaussian filter is a type of linear filter used for smoothing (blurring) images and reducing noise. It
applies a convolution operation with a Gaussian function, which weights neighboring pixels according
to a normal distribution, offering smoother noise reduction but with an edge-blurring effect [4].
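As a brief illustration of the weighting described above, the following Python sketch builds a normalised one-dimensional Gaussian kernel. The function name gaussian_kernel_1d and the parameters radius and sigma are chosen for this example; a two-dimensional kernel could be formed from the outer product of two such vectors.

```python
import math


def gaussian_kernel_1d(radius, sigma):
    """Return 2*radius + 1 Gaussian weights that sum to 1."""
    weights = [
        math.exp(-(x * x) / (2.0 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    total = sum(weights)
    # Normalise so that filtering does not change the average brightness.
    return [w / total for w in weights]


kernel = gaussian_kernel_1d(radius=2, sigma=1.0)
```

The centre weight is the largest and the weights fall off symmetrically, which is why nearby pixels influence the result more than distant ones.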
Wiener Filter
The Wiener filter is an adaptive filter used for noise reduction and signal restoration. It operates on
the principle of minimizing the mean square error between the estimated and desired signals. While
it is statistically optimal for specific noise models, it requires knowledge of the noise characteristics,
which may not always be available [5].
The median filter is a non-linear digital filtering technique commonly used to remove noise while
preserving edges. Unlike linear filters such as the mean filter, the median filter replaces each pixel’s
value with the median value of the neighboring pixels. This approach effectively removes outliers and
noise while maintaining the overall structure and detail of the image [6].
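The contrast between the two filters can be seen on a small one-dimensional signal. The sketch below is illustrative: the helper name sliding, the sample values, and the choice to truncate the window at the ends of the signal are assumptions made for this example.

```python
from statistics import mean, median


def sliding(signal, size, reducer):
    """Apply reducer to a window of the given size centred on each sample."""
    half = size // 2
    out = []
    for i in range(len(signal)):
        # Assumption: the window is truncated at the ends of the signal.
        lo = max(i - half, 0)
        hi = min(i + half + 1, len(signal))
        out.append(reducer(signal[lo:hi]))
    return out


# A flat region with one impulse (200), followed by a step edge up to 90.
signal = [10, 10, 10, 200, 10, 10, 90, 90, 90]
mean_out = sliding(signal, 3, mean)
median_out = sliding(signal, 3, median)
```

Here the median output removes the impulse and keeps the step from 10 to 90 sharp, whereas the mean output spreads the impulse over its neighbours and smears the step.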
2.4 Conclusion
Image filtering is crucial in image processing, with median filtering standing out for its edge-preserving
noise reduction capabilities. FPGA implementations provide a hardware-centric approach for efficient
median filter execution, and ongoing research continues to explore further optimization techniques. As
the demands of image processing grow, the development of more sophisticated filtering methods and
their hardware implementations remains an active area of research.
Chapter 3
System Design
• Must apply a median filter to the input image file using the given window size.
By default, these parameters are set to handle a 512x512 RGB image. The Controller communicates
with the Memory module using MEM_ADDR and MEM_RW. The MEM_ADDR is a BUS_WIDTH-bit wide address
bus that specifies the address from which the Filter module reads pixel values. Moreover, MEM_RW is a
2-bit signal that determines whether to read (0b10) or write (0b11).
Additionally, the Controller interfaces with the Filter module through three lines: FILT_DATA, which
carries the output filtered pixel value; FILT_DNE, a signaling line indicating that the filtering is complete
and the data on FILT_DATA line is valid; and FILT_EN, an enable line that keeps the Filter module active
as long as this line is set HIGH. Figure 3.1 shows the state machine of the Controller module, which
operates in three states: WAIT, WINDOW_PROMPT, and FILE_WRITE. The module enters the WAIT state upon
reset and remains there until a pulse is received on the START line. Upon detecting a positive pulse on
START, the Controller moves into the WINDOW_PROMPT state, where it calculates the memory address for
the next pixel in the filtering window until all window pixels have been read by the Filter Module.
After exiting this state, it transitions to FILE_WRITE upon receiving a positive pulse on FILT_DNE. In
this state, it disables the filter, captures the value from FILT_DATA, and writes it to the output file.
The Controller remains in the FILE_WRITE state for approximately one clock cycle. During this cycle
it either calculates the address of the next pixel to be filtered and re-enables the filter or, if no
pixels remain, sends a positive pulse on SYS_DNE to signal that the entire image has been filtered.
The methodology for calculating pixel addresses is described later in this chapter.
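The state transitions described above can be summarised in a short behavioural sketch. It is written in Python for readability and is not the Verilog source. The function name next_state and the pixels_left flag are hypothetical, and returning to WAIT after the last pixel is an assumption, since the text does not state which state follows the SYS_DNE pulse.

```python
WAIT, WINDOW_PROMPT, FILE_WRITE = "WAIT", "WINDOW_PROMPT", "FILE_WRITE"


def next_state(state, start=False, filt_dne=False, pixels_left=True):
    """Return the next Controller state for the given input pulses."""
    if state == WAIT:
        # Leave WAIT only when a pulse arrives on START.
        return WINDOW_PROMPT if start else WAIT
    if state == WINDOW_PROMPT:
        # Stay here until the Filter signals completion on FILT_DNE.
        return FILE_WRITE if filt_dne else WINDOW_PROMPT
    if state == FILE_WRITE:
        # Assumption: go back to WAIT once the whole image is filtered.
        return WINDOW_PROMPT if pixels_left else WAIT
    raise ValueError("unknown state: " + str(state))


# Walk through one window of an image that has a single pixel left.
state = WAIT
trace = [state]
for inputs in ({"start": True}, {}, {"filt_dne": True}, {"pixels_left": False}):
    state = next_state(state, **inputs)
    trace.append(state)
```

The trace passes through WINDOW_PROMPT twice because the Controller holds that state until FILT_DNE is pulsed.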
This module is designed to load the image file specified by FILENAME into the MEMORY buffer upon reset.
Read and write operations are processed at the rising edge of Mem_CLK depending on the value of Mem_RW.
The READ operation is indicated by setting Mem_RW to 0b10 during the rising edge of the clock, while
the WRITE operation is indicated by 0b01. Mem_ADDR is the address bus used for reading or writing
data. During a READ operation, the requested data is loaded onto Mem_ODR followed by a pulse on
Mem_DRDY which the Filter module uses to read pixel values.
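The read and write behaviour described above can be modelled in a few lines of Python. This is a behavioural sketch only, using the Mem_RW codes given in this paragraph (0b10 for READ and 0b01 for WRITE). The class name MemoryModel and its attribute names are hypothetical, and loading from FILENAME on reset is omitted.

```python
READ, WRITE = 0b10, 0b01  # Mem_RW codes as given in the paragraph above


class MemoryModel:
    """Behavioural model of the Memory module's clocked read and write."""

    def __init__(self, size):
        self.cells = [0] * size  # stands in for the MEMORY buffer
        self.odr = 0             # value presented on Mem_ODR
        self.drdy = False        # level of Mem_DRDY after the last edge

    def clock(self, rw, addr, idr=0):
        """Model one rising edge of Mem_CLK."""
        self.drdy = False
        if rw == READ:
            self.odr = self.cells[addr]
            self.drdy = True  # tell the Filter that Mem_ODR is valid
        elif rw == WRITE:
            self.cells[addr] = idr  # store the value on Mem_IDR


mem = MemoryModel(size=16)
mem.clock(WRITE, addr=3, idr=0xAB)
mem.clock(READ, addr=3)
read_value, ready_after_read = mem.odr, mem.drdy
mem.clock(0b00, addr=0)  # any other Mem_RW value leaves the memory idle
ready_after_idle = mem.drdy
```

Writing a value and reading the same address back should return that value with the ready flag raised, and the flag should drop again on an idle cycle.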
port, in which case it transitions into WINDOW_READ. The module remains in this state and reads window
pixel values from memory on each positive pulse of the MEM_DRDY line. This data is transmitted from
Memory over the Filt_MEMDATA data bus and added to the internal window buffer until all window pixels
have been read. The system then transitions into WINDOW_SORT, where it sorts the window buffer
and extracts the median value. This median value is loaded onto the Filt_RES data bus and a positive
pulse is sent on Filt_DNE. The window buffer is sorted using bubble sort, implemented as follows:
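// Bubble sort of the module-level window buffer, in place.
// The function header is not shown in the listing; the header and the
// dummy input below are assumptions (Verilog-2001 functions need at
// least one input). temp and window are module-level registers.
function sortWindow;
    input dummy;
    integer k, j; // iterators
    begin
        for (k = 0; k < WINDOW_SIZE*WINDOW_SIZE-1; k = k+1) begin
            // After pass k, the largest k+1 values are in their final places.
            for (j = 0; j < WINDOW_SIZE*WINDOW_SIZE-1-k; j = j+1) begin
                if (window[j] > window[j+1]) begin
                    // Swap adjacent elements that are out of order.
                    temp = window[j];
                    window[j] = window[j+1];
                    window[j+1] = temp;
                end
            end
        end
        sortWindow = 0;
    end
endfunction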
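For readers who want to check the sorting logic outside a simulator, the same loop structure can be mirrored in Python. This is a sketch for verification only. The name sort_window and the sample pixel values are made up for this example, and taking the middle element of the sorted buffer as the median assumes an odd number of window pixels.

```python
def sort_window(window):
    """Bubble-sort the window buffer in place and return it."""
    n = len(window)
    for k in range(n - 1):
        # Each pass moves the largest remaining value to the end.
        for j in range(n - 1 - k):
            if window[j] > window[j + 1]:
                window[j], window[j + 1] = window[j + 1], window[j]
    return window


# Nine sample values, as in a 3x3 window.
window = [52, 7, 255, 13, 0, 91, 13, 200, 64]
sort_window(window)
median_value = window[len(window) // 2]  # middle element of the sorted buffer
```

With these sample values the sorted buffer is 0, 7, 13, 13, 52, 64, 91, 200, 255 and the median is 52.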
by sending a positive pulse on the START line. The Controller then computes the address of the starting
pixel of the filter window and sends a pulse on the FILT_EN line to enable the filter. On each subsequent
negative edge of the clock, the Filter module reads the data value on the MEM_ODATA bus upon receipt of a
pulse on MEM_RDY. At the same time, the Controller module computes the address of the next pixel
in that particular window. When all pixels within the window have been read, the Filter module
sorts the window pixels, writes the resulting median pixel to the FILT_DATA data bus, and signals
the Controller on the FILT_DNE line. The Controller then captures the value on FILT_DATA and
writes it to the output file given at the start of the simulation.
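One plausible way to generate the sequence of window pixel addresses is sketched below in Python. The layout is an assumption, not taken from the design: one value per address, stored row by row, with coordinates clamped at the image border. The function name window_addresses is hypothetical.

```python
def window_addresses(row, col, width, height, window_size):
    """List the memory addresses of the window centred on (row, col)."""
    half = window_size // 2
    addresses = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            # Assumption: clamp at the border instead of padding with zeros.
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            # Assumption: row-major layout, one value per address.
            addresses.append(r * width + c)
    return addresses


# 3x3 window centred on pixel (1, 1) of a 4x4 image.
addrs = window_addresses(row=1, col=1, width=4, height=4, window_size=3)
# The same window centred on the top-left corner repeats border pixels.
corner = window_addresses(row=0, col=0, width=4, height=4, window_size=3)
```

Under these assumptions an interior window touches window_size consecutive addresses in each of window_size rows, with the rows spaced one image width apart.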
3.3 FPGA
This module was designed to transmit filtered image data from the FPGA to the PC at a baud rate of
9600 or 115200. The UART module has four states: IDLE, START, SEND_BIT, and STOP. The
machine remains in the IDLE state while the data available flag is low. In the START state, the TX
line is held low for one bit period, that is, for the configured number of clock cycles per bit.
SEND_BIT follows, where the byte to be transmitted is sent bit by bit. In the STOP state, the TX line
is held high for one bit period, and then the machine returns to IDLE.
Figure 3.4 shows the state transition diagram.
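The framing performed by these states can be illustrated with a short Python sketch. It assumes eight data bits, no parity bit, and one stop bit, with the least significant bit sent first. The 100 MHz clock used in the example is the FPGA clock quoted in the test procedures. The function names and the rounding of the divider are assumptions made for this example.

```python
def clocks_per_bit(clock_hz, baud):
    """Clock cycles for which each bit is held on the TX line (rounded)."""
    return round(clock_hz / baud)


def uart_frame(byte):
    """Return the TX line levels for one byte: start, 8 data bits, stop."""
    data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]  # START holds TX low, STOP holds TX high


cycles_9600 = clocks_per_bit(100_000_000, 9600)
cycles_115200 = clocks_per_bit(100_000_000, 115200)
frame = uart_frame(0xAA)
```

With these values each bit lasts about 10417 clock cycles at 9600 baud and about 868 cycles at 115200 baud, and the byte 0xAA appears on the line as alternating levels between the start and stop bits.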
The chosen golden measure for benchmarking was OpenCV. It is a widely used image-processing library.
Using OpenCV provided optimised and robust image processing algorithms. This made it an ideal
reference point.
• OpenCV is well tested and it has been widely used in academia and in the industry. It has been
proven to be accurate and reliable.
• OpenCV has been optimised and this allows it to provide a high-performance baseline for
comparison.
• OpenCV is easy to use because many supporting resources are available. This
ease of use made the benchmarking efficient.
Chapter 4
Testing Procedure: Create a Memory_tb.v file to be used as a testbench for the Memory.v
module. Instantiate the module and set the parameters appropriately. Write a series of data streams to
memory and read them back. Analyse the resulting waveforms in GTKWave to confirm that they are as expected.
Testing Procedure: Create a separate testbench called Filter_tb.v and, in it, instantiate the Memory.v
and Filter.v modules. Load memory with data from external data files. Analyse the signal lines in
GTKWave to confirm that the design runs as expected.
Testing Procedure: With the same testbench used for UTP02, add the following for loop to the
Filter.v module after sorting the window buffer:
Observe the pixels displayed on the console to check that they appear in sorted order.
Testing Procedure: Follow the test procedures of UTP02 and UTP03. Use GTKWave to
confirm that the data written on the Filt_DATA data bus is the median value, and that there is a pulse on
the Filt_DNE line.
Testing Procedure: Test the module in simulation using Vivado. Then write a Python script to
read the data. Use SW0 to SW7 on the Nexys A7-100T as the data byte and SW8 as the data available flag.
Synthesise the code, run implementation, generate the bitstream, and load the bit file onto the FPGA. Run
the Python code and observe the output. The FPGA must be running at 100 MHz and the UART baud rate
should be 9600 bps.
Testing Procedure: Load the data in Table 4.2 into a text file and read it in as the image using
the $readmemh system task. Within a testbench Integrate_tb.v, instantiate all modules and collect
simulation data, dumping it to an external dump file. Analyse this data to ensure that the signal lines
carry data at the expected times.
Testing Procedure: The testing procedure is the same as in ITP01. However, a real image might be
too big to analyse on the waveform analyser, so the final image pixels are written to an external
file and converted back to a PNG image using a C++ program developed for this purpose.
Testing Procedure: The following procedure was used to obtain timing data for the OpenCV
median filtering.
• Output Verification to see if the image is changing as expected with different filtering windows.
• Timing was done by counting the clock cycles taken to filter an image and multiplying
by the time unit of 1 ns to obtain the simulation time, which was then expressed in seconds.
• Output Verification to see if the image is changing as expected with different filtering windows.
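The timing arithmetic in the procedure above can be written out as follows. This Python sketch uses hypothetical cycle counts and reference times purely to show the calculation; they are not measured results. The convention that speedup is the reference time divided by the measured time is also an assumption.

```python
TIME_UNIT_S = 1e-9  # simulation time unit of 1 ns


def simulation_time_s(clock_cycles, time_unit_s=TIME_UNIT_S):
    """Convert a cycle count to seconds using the simulation time unit."""
    return clock_cycles * time_unit_s


def speedup(reference_time_s, measured_time_s):
    """Assumed convention: values above 1 mean the measured run is faster."""
    return reference_time_s / measured_time_s


# Hypothetical numbers for illustration only, not measured results.
example_cycles = 2_500_000
example_time = simulation_time_s(example_cycles)
example_speedup = speedup(reference_time_s=0.01, measured_time_s=example_time)
```

With these placeholder inputs, 2 500 000 cycles correspond to 2.5 ms, and a reference run of 10 ms would give a speedup of about 4.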
Figure 4.1 shows how the input and output registers of the controller change when:
// Initialize Inputs
Control_RST = 0;
Control_STRT = 0;
Control_FDNE = 0;
Control_FDATA = 0;
// Initialize Inputs
Filt_RST = 0;
Filt_EN = 0;
Filt_MEMRDY = 0;
Filt_MEMDATA = 0;

// Apply reset
#(CLOCK_PERIOD);
Filt_RST = 1;
#(CLOCK_PERIOD);
Filt_RST = 0;
After the filter is applied, the data is sent through the send window wire. The highlighted section in
Figure 4.2 shows when the filter is enabled.
// Initialize Inputs
Mem_RST = 0;
Mem_RW = 0;
Mem_IDR = 0;
Mem_ADDR = 0;
// Apply reset
#(CLOCK_PERIOD);
Mem_RST = 1;
#(CLOCK_PERIOD);
Mem_RST = 0;
In the figure above (4.5.2), dummy data is written into the memory and then read back to simulate
information transfer.
Figure 4.4 shows the UART transmitter simulation results in Vivado. This figure shows that state
transition is occurring correctly as data 0xAA is transmitted correctly when the data available flag
is high. Also, the data is transmitted least significant bit first as per UART protocol data frame
specification.
except KeyboardInterrupt:
    ser.close() # Close the serial port
Figure 4.5 shows the FPGA configuration when the data available flag is held low, as illustrated by the
red highlighted switch. The Python test program output in Figure 4.6 verifies that no data is
transmitted when the flag is held low. Moreover, the UART receiver LED (the receiver of the USB-to-UART
chip, which is connected to the FPGA transmitter) is off, verifying that the UART is in an idle state.
The FPGA successfully transmitted 0b11111111, as indicated by SW0-SW7 in Figure 4.7, and the UART status
LED is on. Finally, the Python output in Figure 4.8 verifies that the number 255, which is 0b11111111
in binary, is received.
Filtering window
To get a good idea of performance, the filtering window was varied. This variation helped in understanding
how the filters perform under different window sizes. Each image was processed using 5x5, 9x9, and 15x15
windows. It was expected that, for both filters, increasing the window size would increase smoothing: the
larger the window, the blurrier the image.
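Window size also affects the sorting workload. For a bubble sort with the nested-loop structure used in the Filter module, the number of comparisons per window can be counted directly, as the following Python sketch shows. The function name is hypothetical, and the count is of comparisons only, not a timing prediction.

```python
def bubble_sort_comparisons(window_size):
    """Comparisons made by a full bubble sort of one square window."""
    n = window_size * window_size  # pixels in the window
    # The inner loop runs n-1, n-2, ..., 1 times, which sums to n(n-1)/2.
    return n * (n - 1) // 2


comparisons = {size: bubble_sort_comparisons(size) for size in (5, 9, 15)}
```

For the three window sizes used here this gives 300, 3240, and 25200 comparisons per output pixel respectively, so the sorting effort grows much faster than the window side length.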
Image size
To evaluate the performance of the filtering algorithms, different image sizes were used in the
benchmarking. This was helpful in understanding how size affects the performance and scalability of the
filter. It was expected that the execution time would increase with image size because more
pixels must be processed.
the image outputs of the different filters with Figure 3.4 and 3.8 as the reference original images.
(a) Varying window size (tree) (b) Varying window size (dog)
Chapter 5
Discussion
5.3 Comments on Benchmark Results
Figures 4.9 to 4.12 show side-by-side comparisons of the Verilog implementation and the OpenCV
implementation. From these images, it can be seen that the results of the two filters are consistent,
showing that the Verilog algorithm is indeed filtering the images accurately.
Figure 4.13 shows the speedup graph, with (a) depicting the speedup of the filtering algorithm on
an image of a tree with varying window sizes and (b) illustrating the speedup of the algorithm on
an image of a dog under the same conditions. The results are not as expected; instead, the graphs
show fluctuations in performance, indicating irregular speedup rather than a
smooth, predictable pattern.
Figure 4.14 is the speedup graph for varying image sizes. A window of 5 was used to maintain
consistency and eliminate other factors that may alter the results. It was expected that the speedup
would increase consistently as the image size increases but, as in the graphs mentioned above, the
speedup is irregular.
Chapter 6
This report detailed the design and implementation of a median filter on an FPGA platform. The
literature review on image filtering techniques is presented in chapter 2, followed by a detailed system
design procedure in chapter 3. Among the objectives of the project were to implement a median
filter on an FPGA platform and to analyse its performance. The system was coded in Verilog HDL, and an
attempt was made to run it on a Nexys A7-100T board, but this failed for the reasons explained in
chapter 5. This could be overcome by using a more powerful FPGA board such as the Digilent Zybo Z7,
which has an SoC on board. Moreover, a hardware description language such as VHDL or SystemVerilog
could be used instead of Verilog, as these are more expressive and robust.
The unit tests demonstrated that the submodules and the overall system behave as expected. The
memory module stores information and makes it accessible at all times, resetting when prompted. The
filter processes data correctly, and the controller successfully triggers the other modules as intended.
Glossary
Bibliography
[1] L. A. Aranda, P. Reviriego, and J. A. Maestro, “Error detection technique for a median filter,”
IEEE Transactions on Nuclear Science, vol. 64, no. 8, pp. 2219–2226, 2017.
[2] S. Sudan, “Median filter performance based on different window sizes for salt and pepper noise
removal in gray and RGB images,” International Journal of Signal Processing, Image Processing and
Pattern Recognition, vol. 8, no. 10, pp. 343–352, 2015.
[3] B. Justusson, “Median filtering: Statistical properties,” Two-Dimensional Digital Signal Processing II:
Transforms and Median Filters, pp. 161–196, 2006.
[4] G. Deng and L. Cahill, “An adaptive Gaussian filter for noise reduction and edge detection,” in
1993 IEEE Conference Record Nuclear Science Symposium and Medical Imaging Conference. IEEE,
1993, pp. 1615–1619.
[5] J. Benesty, S. Makino, J. Chen, J. Benesty, J. Chen, Y. Huang, and S. Doclo, “Study of the Wiener
filter for noise reduction,” Speech Enhancement, pp. 9–41, 2005.
[6] L. Yin, R. Yang, M. Gabbouj, and Y. Neuvo, “Weighted median filters: a tutorial,” IEEE
Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 43, no. 3, pp.
157–192, 1996.
[7] Y. Hu and H. Ji, “Research on image median filtering algorithm and its FPGA implementation,” in
2009 WRI Global Congress on Intelligent Systems, vol. 3. IEEE, 2009, pp. 226–230.
[8] A. Nieminen and Y. Neuvo, “Comments on ‘Theoretical analysis of the max/median filter’ by G. R.
Arce and M. P. McLaughlin,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36,
no. 5, pp. 826–827, 1988.
[9] G. Bates and S. Nooshabadi, “FPGA implementation of a median filter,” in TENCON ’97 Brisbane -
Australia. Proceedings of IEEE TENCON ’97. IEEE Region 10 Annual Conference. Speech and
Image Technologies for Computing and Telecommunications (Cat. No.97CH36162), vol. 2, 1997, pp.
437–440.