Parallel Computing
Abstract.
– 1. This project report presents the implementation and performance evaluation of a 2D convolution algorithm using the Gaussian blur filter on parallel computing systems. Gaussian blur, a widely used technique for noise reduction in image processing, demands high computational power. We compare the performance of multi-core CPUs and GPUs, employing the Google Colaboratory platform to analyze the efficiency and speedup achieved through parallel computing.
– 2. This study leverages the parallel processing capabilities of multi-core CPUs and GPUs to enhance the performance of the Gaussian blur filter. By distributing the computational workload across multiple processors, the project aims to achieve significant reductions in processing time. The Google Colaboratory platform is utilized to run the experiments, providing access to powerful GPU resources and enabling efficient execution of parallel algorithms.
– 3. The report details the methodology for implementing the Gaussian blur filter using Python libraries for both the CPU (pymp) and the GPU (PyCUDA). A comprehensive performance comparison is conducted, evaluating the processing times for various image resolutions and kernel sizes. The results demonstrate a substantial speedup when using GPU parallelization compared to multi-core CPU implementations. The findings underscore the potential of parallel computing to meet the demands of modern image processing tasks, making it a viable solution for applications requiring high computational throughput.
– 4. To maximize efficiency, optimization techniques such as shared memory utilization on GPUs, memory coalescing, and minimizing thread divergence are employed. These techniques reduce memory access latency and improve the overall throughput of the Gaussian blur operation.
– 5. The parallel implementation shows good speedup compared to the serial run, especially for large images. Performance grows with the number of processing units, which makes the approach well suited to real applications that require high-resolution image processing. Our experimental results illustrate the advantages of parallel programming for fast and effective Gaussian blur. A study by Munesh Singh Chauhan (2018) showcases the potential of optimizing the Gaussian blur filter using the CUDA parallel framework, which our implementation strives to emulate.
Keywords: Gaussian Blur, OpenMP, pymp, PyCUDA, CUDA.
1 Introduction
– Section I: Introduction.
– Section II: Preliminaries.
– Section III: Experimental Setup.
– Section IV: Conclusion.
1.5 Conclusion
In this report, we outline the relevant and essential approaches for addressing the challenges of implementing Gaussian blur efficiently, leveraging multi-core CPU parallelization and GPU acceleration to reduce processing time for high-resolution images.
2 Preliminaries
2.1 CPU and GPU
A multicore CPU is a single computing component with more than one independent core. OpenMP (Open Multi-Processing) and TBB (Threading Building Blocks) are widely used application programming interfaces (APIs) for exploiting multicore CPUs efficiently (Polesel et al., 2000). In this study, the Python pymp tool is used to parallelize the algorithm on a general-purpose platform. On the other hand, a GPU is a single-instruction, multiple-data (SIMD) stream architecture, which suits applications where the same instruction runs in parallel on different data elements. In image convolution, the image pixels are treated as separate data elements, which makes the GPU architecture particularly suitable for parallelizing the application (Polesel et al., 2000). In this study, we use the CUDA platform with the GPU, since CUDA is the most popular platform for increasing GPU utilization. The main difference between a CPU and a GPU, as shown in Figure 1 (Reddy et al., 2017), is the number of processing units: a CPU has fewer processing units together with large cache and control units, whereas a GPU has many more processing units, each group with its own cache and control units. GPUs contain hundreds of cores, which enables far higher parallelism than CPUs.
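For reference, the weights of a Gaussian kernel follow G(x, y) = (1/(2πσ²)) · e^(−(x² + y²)/(2σ²)), normalized so that they sum to one. The per-pixel independence described above can be made concrete with a minimal serial sketch in Python (an illustration only; the array layout, edge handling, and function names are our assumptions, not the report's actual code):

import numpy as np

def gaussian_kernel(size, sigma):
    # Build a normalized (size x size) Gaussian kernel.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur_serial(image, kernel):
    # Direct 2D convolution: every output pixel only reads its own
    # neighborhood, so all pixels can be computed independently.
    h, w = image.shape
    ks = kernel.shape[0]
    r = ks // 2
    padded = np.pad(image, r, mode="edge")  # replicate the borders
    out = np.empty((h, w), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + ks, x:x + ks] * kernel)
    return out

Since every iteration of the double loop is independent, rows can be handed to different CPU cores and individual pixels to different GPU threads without any synchronization, which is exactly the SIMD-friendly structure discussed above.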
3 Experimental Setup
In our experiments, we use Google Colaboratory ("Colab") to run our code. Google Colaboratory is a free, cloud-based Jupyter notebook environment that provides access to NVIDIA GPUs, allowing us to execute CUDA code from Python. All code is written in Python. The Python pymp library is used for the multiprocessor (CPU) code. This package brings OpenMP-like functionality to Python: it takes the good qualities of OpenMP, such as minimal code changes and high efficiency, and combines them with the Python Zen of code clarity and ease of use.
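As an illustration of this approach, the following minimal pymp sketch splits the output rows of the convolution across threads (the row-wise decomposition, thread count, and function name are our assumptions, not necessarily the report's exact code):

import numpy as np
import pymp  # installed as the pymp-pypi package

def gaussian_blur_pymp(image, kernel, num_threads=4):
    # Distribute output rows across OpenMP-style threads.
    h, w = image.shape
    ks = kernel.shape[0]
    r = ks // 2
    padded = np.pad(image, r, mode="edge")
    out = pymp.shared.array((h, w), dtype="float64")  # shared result buffer
    with pymp.Parallel(num_threads) as p:
        for y in p.range(0, h):  # pymp splits these rows among the threads
            for x in range(w):
                out[y, x] = np.sum(padded[y:y + ks, x:x + ks] * kernel)
    return np.array(out)

Note how close this stays to the serial loop: as with OpenMP, only the loop header and the shared output array change.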
The Python PyCUDA library is used to call CUDA kernels from our Python code; PyCUDA gives easy, Pythonic access to NVIDIA's CUDA parallel computation API.
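As a minimal PyCUDA sketch of the GPU path, the following maps one CUDA thread to one output pixel (the kernel source, the 16×16 block size, and clamp-to-edge border handling are our assumptions; the report's actual kernel, for instance a shared-memory tiled variant as mentioned in the abstract, may differ):

import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void gaussian_blur(const float *img, const float *kern,
                              float *out, int h, int w, int ks)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // row
    if (x >= w || y >= h) return;
    int r = ks / 2;
    float acc = 0.0f;
    for (int ky = 0; ky < ks; ++ky) {
        for (int kx = 0; kx < ks; ++kx) {
            int iy = min(max(y + ky - r, 0), h - 1);  // clamp to edge
            int ix = min(max(x + kx - r, 0), w - 1);
            acc += img[iy * w + ix] * kern[ky * ks + kx];
        }
    }
    out[y * w + x] = acc;  // one thread writes one output pixel
}
""")

def gaussian_blur_gpu(image, kernel):
    image = np.ascontiguousarray(image, dtype=np.float32)
    kernel = np.ascontiguousarray(kernel, dtype=np.float32)
    h, w = image.shape
    out = np.empty_like(image)
    func = mod.get_function("gaussian_blur")
    block = (16, 16, 1)  # 256 threads per block
    grid = ((w + 15) // 16, (h + 15) // 16)
    func(cuda.In(image), cuda.In(kernel), cuda.Out(out),
         np.int32(h), np.int32(w), np.int32(kernel.shape[0]),
         block=block, grid=grid)
    return out

Because adjacent threads in a warp read adjacent pixels of a row, most global-memory accesses are coalesced; a shared-memory tiled variant would cut the redundant global loads further, in line with the optimization techniques listed in the abstract.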
4 Conclusion
Fig. 4: Processing Time.
Fig. 5: Speedup.
In this report, we implemented the Gaussian blur filter as a 2D convolution and evaluated it on parallel computing systems, using the Python pymp library on a multi-core CPU and the PyCUDA library on a GPU, with Google Colaboratory as the experimental platform. Processing times were compared for various image resolutions and kernel sizes. As shown in Figs. 4 and 5, the GPU implementation achieves a substantial speedup over the multi-core CPU implementation, and the advantage grows with image size. These results underscore the potential of parallel computing for modern image processing tasks that require high computational throughput.
5 References
– Andrea Polesel et al., 2000. Image Enhancement via Adaptive Unsharp Masking. IEEE Transactions on Image Processing, Vol. 9, No. 3.
– Ritesh Reddy et al., 2017. Digital Image Processing through Parallel Computing in Single-Core and Multi-Core Systems using MATLAB. IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT), India.
– Jan Novák et al. GPU Computing: Image Convolution. Karlsruhe Institute of Technology.
– Shrenik Lad et al. Hybrid Multi-Core Algorithms for Regular Image Filtering Applications. International Institute of Information Technology, Hyderabad, India.
– Ferhat Bozkurt et al., 2015. Effective Gaussian Blurring Process on Graphics Processing Unit with CUDA. International Journal of Machine Learning and Computing, Vol. 5, No. 1, Singapore.
– Munesh Singh Chauhan, 2018. Optimizing Gaussian Blur Filter using CUDA Parallel Framework. Information Technology Department, College of Applied Sciences, Ibri, Sultanate of Oman.
– B. N. Manjunatha Reddy et al., 2017. Performance Analysis of GPU V/S CPU for Image Processing Applications. International Journal for Research in Applied Science and Engineering Technology (IJRASET), India.
– Ridho Dwisyah Putra et al., 2017. A Review of Image Enhancement Methods. International Journal of Applied Engineering Research, ISSN 0973-4562, Vol. 12, No. 23, pp. 13596–13603, India.
– Nvidia Corporation, 2016. NVIDIA Tesla P100 GPU Accelerator. NVIDIA data sheet.