A comprehensive resource for exploring media processing, from fundamental concepts to advanced techniques.
- Introduction to Digital Images
- Load & Save Images
- Interpolating Images
- Apply Geometric Transformations
- Intensity Transformations
- Histogram Processing
- Spatial Filtering using Convolution
- Frequency Filtering using Fourier & Cosine Transform
- Multi-Resolution Analysis (Wavelet Transform)
- Image Compression (JPEG Codec)
- Morphological Processing
A collection of concepts and tools utilized in the main notebooks
- Color Spaces
- Introductions to Compression
- Prerequisites and Introductions to Convolution
- Quality Assessment Metrics
- Prerequisites and Introductions to Frequency Transforms
- Programming Fundamentals
  - Proficiency in Python (data types, control structures, functions, classes, etc.).
    - My Python Workshop: github.com/mr-pylin/python-workshop
  - Experience with libraries like NumPy and Matplotlib.
    - My NumPy Workshop: github.com/mr-pylin/numpy-workshop
    - My Data Visualization Workshop: github.com/mr-pylin/data-visualization-workshop
- Mathematics for Machine Learning
  - Linear Algebra: Vectors, matrices, matrix operations.
    - Linear Algebra Review and Reference written by Zico Kolter.
    - Notes on Linear Algebra written by Peter J. Cameron.
    - MATH 233 - Linear Algebra I Lecture Notes written by Cesar O. Aguilar.
  - Probability & Statistics: Probability distributions, mean/variance, etc.
- Digital Signal Processing Knowledge
This project requires Python v3.10 or higher. It was developed and tested using Python v3.13.1. If you encounter issues with the pinned dependency versions, consider using Python v3.13.1.
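As a quick sanity check before installing anything, the version requirement above can be tested from the interpreter itself. The following is only a minimal sketch; the names `MIN_VERSION` and `python_is_supported` are hypothetical helpers introduced for illustration and are not part of this repository.

```python
import sys

# Minimum version stated in this README; the constant name is hypothetical.
MIN_VERSION = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when (major, minor) of the interpreter meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```

Comparing only the first two fields keeps the check independent of patch releases.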
Use Poetry for dependency management. It handles dependencies, virtual environments, and locking versions more efficiently than pip.
To install exact dependency versions specified in poetry.lock for consistent environments without installing the current project as a package:
poetry install --no-root
Install all dependencies listed in requirements.txt using pip:
pip install -r requirements.txt
- Open the root folder with VS Code (Ctrl/Cmd + K followed by Ctrl/Cmd + O).
- Open .ipynb files using the Jupyter extension integrated with VS Code.
- Select the correct Python kernel and virtual environment where the dependencies were installed.
- Allow VS Code to install any recommended dependencies for working with Jupyter Notebooks.
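If a notebook fails on import, one way to confirm that the selected kernel actually sees the installed dependencies is to look the modules up without importing them. This is a hedged sketch: the module list below is an assumption based on the libraries named in this README (`cv2` is the import name of OpenCV), and requirements.txt remains the authoritative list. The helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Assumed import names; requirements.txt is the authoritative dependency list.
EXPECTED_MODULES = ("numpy", "matplotlib", "cv2")


def missing_modules(names=EXPECTED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Running this in a notebook cell reports on the kernel's environment, which may differ from the terminal's.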
Notes:
- It is highly recommended to stick with the exact dependency versions specified in poetry.lock or requirements.txt rather than using the latest package versions. The repository has been tested on these versions to ensure compatibility and stability.
- This repository is actively maintained, and dependencies are updated regularly to the latest stable versions.
- The table of contents embedded in the notebooks may not function correctly on GitHub.
- For an improved experience, open the notebooks locally or view them via nbviewer.
- ffmpeg & ffprobe:
  - ffmpeg is a Swiss Army knife for media, converting and manipulating audio and video files in a wide range of formats.
  - Link: github.com/BtbN/FFmpeg-Builds
- Video Quality Measurement Tool (VQMT):
  - It is a software program designed to analyze the quality of digital video and images.
  - Link: compression.ru/video/quality_measure
- yuv-player:
  - A lightweight YUV player that supports various YUV formats.
  - Link: github.com/Tee0125/yuvplayer
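For the tools above, ffprobe is often the quickest way to inspect a media file from Python. The sketch below assumes ffprobe is on the PATH; the helper names `build_ffprobe_command` and `probe` are hypothetical and not part of this repository.

```python
import json
import shutil
import subprocess


def build_ffprobe_command(path):
    """Build an ffprobe call that prints container and stream info as JSON."""
    return [
        "ffprobe",
        "-v", "error",      # only report errors on stderr
        "-show_format",     # container-level metadata
        "-show_streams",    # per-stream metadata (codec, resolution, ...)
        "-of", "json",      # machine-readable output
        path,
    ]


def probe(path):
    """Run ffprobe on `path`; return parsed JSON, or None if ffprobe is not installed."""
    if shutil.which("ffprobe") is None:
        return None
    result = subprocess.run(
        build_ffprobe_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Keeping the command builder separate from the call makes it possible to inspect or log the exact command without running it.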
- DIP3/e Book Images:
  - A collection of all images and videos used in the Digital Image Processing (3rd Edition) book written by Rafael C. Gonzalez & Richard E. Woods.
  - Permission is required from the owner of a © image if the image is used for other than personal educational or research purposes.
  - Link: imageprocessingplace.com/DIP-3E/dip3e_book_images_downloads.htm
- YUV4MPEG Videos:
  - Derf's video collection provides uncompressed YUV4MPEG clips for testing video codecs.
  - Link: media.xiph.org/video/derf
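YUV4MPEG2 clips such as those above begin with a single plain-text header line of space-separated tagged fields (for example `W` for width, `H` for height, `F` for frame rate). A minimal sketch of reading that line is shown here; `parse_y4m_header` is a hypothetical helper, it handles only a few tags, and the 4:2:0 default for a missing `C` tag is an assumption.

```python
from fractions import Fraction


def parse_y4m_header(line):
    """Parse the first line of a YUV4MPEG2 stream into a dict of basic properties."""
    if isinstance(line, bytes):
        line = line.decode("ascii")
    tokens = line.strip().split(" ")
    if tokens[0] != "YUV4MPEG2":
        raise ValueError("not a YUV4MPEG2 stream")

    info = {"chroma": "420"}  # assumed default when no C tag is present
    for token in tokens[1:]:
        if not token:
            continue
        tag, value = token[0], token[1:]
        if tag == "W":
            info["width"] = int(value)
        elif tag == "H":
            info["height"] = int(value)
        elif tag == "F":
            numerator, denominator = value.split(":")
            info["fps"] = Fraction(int(numerator), int(denominator))
        elif tag == "C":
            info["chroma"] = value
        # Other tags (interlacing, aspect ratio, comments) are ignored in this sketch.
    return info
```

With a real file, the header can be obtained with `open(path, "rb").readline()` and passed to the function directly.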
- Codecs are algorithms that compress and decompress signals such as video, enabling efficient storage and transmission while preserving quality.
- For detailed information on popular image/video codecs, refer to ./codecs/README.md.
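The compress/decompress round trip that every codec performs can be illustrated with Python's standard library. Note that this sketch uses zlib (DEFLATE), a general-purpose lossless compressor, as a stand-in; it is not an image or video codec, and the synthetic data and the `roundtrip` helper are made up for illustration.

```python
import zlib


def roundtrip(data):
    """Compress `data` losslessly, then restore it; return both byte strings."""
    compressed = zlib.compress(data, level=9)
    restored = zlib.decompress(compressed)
    return compressed, restored


# Synthetic, highly redundant "image": 32 rows of 64 black then 64 white pixels.
pixels = bytes([0] * 64 + [255] * 64) * 32
compressed, restored = roundtrip(pixels)

print(f"original: {len(pixels)} bytes, compressed: {len(compressed)} bytes")
print("lossless:", restored == pixels)
```

Lossy codecs such as JPEG trade exact reconstruction for much higher compression, so the restored data would generally differ from the original.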
- A fundamental package for scientific computing in Python, providing support for arrays, matrices, and a large collection of mathematical functions.
- Official site: numpy.org
- A comprehensive collection of Python libraries for creating static, animated, and interactive visualizations: Matplotlib, Seaborn, and Plotly.
- Official sites: matplotlib.org | seaborn.pydata.org | plotly.com
- A powerful open-source library (primarily written in C++) for computer vision and image processing tasks.
- Supports a wide range of functionalities, including image and video processing, object detection, facial recognition, and more.
- Compatible with multiple programming languages, including Python, C++, and Java.
- Official site: opencv.org
Any mistakes, suggestions, or contributions? Feel free to reach out to me at:
I look forward to connecting with you!
This project is licensed under the Apache License 2.0.
You are free to use, modify, and distribute this code, but you must include copies of both the LICENSE and NOTICE files in any distribution of your work.
- Original Images:
  - The images located in the ./assets/images/original/ folder are licensed under CC BY-ND 4.0.
  - Note: This license restricts derivative works, meaning you may share these images but cannot modify them.
- The images located in the ./assets/images/dip_3rd/ folder are licensed as listed in the table below:

  | Image | Copyright Owner | Address |
  | --- | --- | --- |
  | CH02_Fig0222(b)(cameraman).tif | Massachusetts Institute of Technology | MIT.edu |
  | CH03_Fig0309(a)(washed_out_aerial_image).tif | NASA | nasa.gov |
  | CH03_Fig0326(a)(embedded_square_noisy_512).tif | - | imageprocessingplace.com |
  | CH03_Fig0354(a)(einstein_orig).tif | Public domain | - |
  | CH06_Fig0638(a)(lenna_RGB).tif | Public domain | - |
  | CH06_FigP0606(color_bars).tif | - | - |

- Third-Party Assets:
  - Additional images located in ./assets/images/third_party/ are used with permission or according to their original licenses.
  - Attributions and references to the files included in ./assets/images/third_party/ are included in the code where these images are used.
- Miscellaneous assets:
  - The images found in ./assets/images/misc/ are modified versions of the ones listed above.