Skip to content

Commit 954e2f9

Browse files
committed
Tutorial Discrete Fourier Transform
1 parent 13317bd commit 954e2f9

File tree

5 files changed

+382
-93
lines changed

5 files changed

+382
-93
lines changed

doc/tutorials/core/discrete_fourier_transform/discrete_fourier_transform.markdown

Lines changed: 171 additions & 89 deletions
Original file line numberDiff line numberDiff line change
@@ -1,28 +1,59 @@
11
Discrete Fourier Transform {#tutorial_discrete_fourier_transform}
22
==========================
33

4+
@prev_tutorial{tutorial_random_generator_and_text}
5+
@next_tutorial{tutorial_file_input_output_with_xml_yml}
6+
47
Goal
58
----
69

710
We'll seek answers for the following questions:
811

912
- What is a Fourier transform and why use it?
1013
- How to do it in OpenCV?
11-
- Usage of functions such as: @ref cv::copyMakeBorder() , @ref cv::merge() , @ref cv::dft() , @ref
12-
cv::getOptimalDFTSize() , @ref cv::log() and @ref cv::normalize() .
14+
- Usage of functions such as: **copyMakeBorder()** , **merge()** , **dft()** ,
15+
**getOptimalDFTSize()** , **log()** and **normalize()** .
1316

1417
Source code
1518
-----------
1619

20+
@add_toggle_cpp
1721
You can [download this from here
18-
](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp) or
22+
](https://raw.githubusercontent.com/opencv/opencv/master/samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp) or
1923
find it in the
2024
`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` of the
2125
OpenCV source code library.
26+
@end_toggle
27+
28+
@add_toggle_java
29+
You can [download this from here
30+
](https://raw.githubusercontent.com/opencv/opencv/master/samples/java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java) or
31+
find it in the
32+
`samples/java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java` of the
33+
OpenCV source code library.
34+
@end_toggle
35+
36+
@add_toggle_python
37+
You can [download this from here
38+
](https://raw.githubusercontent.com/opencv/opencv/master/samples/python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py) or
39+
find it in the
40+
`samples/python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py` of the
41+
OpenCV source code library.
42+
@end_toggle
43+
44+
Here's a sample usage of **dft()** :
45+
46+
@add_toggle_cpp
47+
@include cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
48+
@end_toggle
2249

23-
Here's a sample usage of @ref cv::dft() :
50+
@add_toggle_java
51+
@include java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java
52+
@end_toggle
2453

25-
@includelineno cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
54+
@add_toggle_python
55+
@include python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py
56+
@end_toggle
2657

2758
Explanation
2859
-----------
@@ -49,89 +80,140 @@ Fourier Transform too needs to be of a discrete type resulting in a Discrete Fou
4980
(*DFT*). You'll want to use this whenever you need to determine the structure of an image from a
5081
geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I*):
5182

52-
-# **Expand the image to an optimal size**. The performance of a DFT is dependent of the image
53-
size. It tends to be the fastest for image sizes that are multiple of the numbers two, three and
54-
five. Therefore, to achieve maximal performance it is generally a good idea to pad border values
55-
to the image to get a size with such traits. The @ref cv::getOptimalDFTSize() returns this
56-
optimal size and we can use the @ref cv::copyMakeBorder() function to expand the borders of an
57-
image:
58-
@code{.cpp}
59-
Mat padded; //expand input image to optimal size
60-
int m = getOptimalDFTSize( I.rows );
61-
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
62-
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
63-
@endcode
64-
The appended pixels are initialized with zero.
65-
66-
-# **Make place for both the complex and the real values**. The result of a Fourier Transform is
67-
complex. This implies that for each image value the result is two image values (one per
68-
component). Moreover, the frequency domains range is much larger than its spatial counterpart.
69-
Therefore, we store these usually at least in a *float* format. Therefore we'll convert our
70-
input image to this type and expand it with another channel to hold the complex values:
71-
@code{.cpp}
72-
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
73-
Mat complexI;
74-
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
75-
@endcode
76-
-# **Make the Discrete Fourier Transform**. It's possible an in-place calculation (same input as
77-
output):
78-
@code{.cpp}
79-
dft(complexI, complexI); // this way the result may fit in the source matrix
80-
@endcode
81-
-# **Transform the real and complex values to magnitude**. A complex number has a real (*Re*) and a
82-
complex (imaginary - *Im*) part. The results of a DFT are complex numbers. The magnitude of a
83-
DFT is:
84-
85-
\f[M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}\f]
86-
87-
Translated to OpenCV code:
88-
@code{.cpp}
89-
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
90-
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
91-
Mat magI = planes[0];
92-
@endcode
93-
-# **Switch to a logarithmic scale**. It turns out that the dynamic range of the Fourier
94-
coefficients is too large to be displayed on the screen. We have some small and some high
95-
changing values that we can't observe like this. Therefore the high values will all turn out as
96-
white points, while the small ones as black. To use the gray scale values to for visualization
97-
we can transform our linear scale to a logarithmic one:
98-
99-
\f[M_1 = \log{(1 + M)}\f]
100-
101-
Translated to OpenCV code:
102-
@code{.cpp}
103-
magI += Scalar::all(1); // switch to logarithmic scale
104-
log(magI, magI);
105-
@endcode
106-
-# **Crop and rearrange**. Remember, that at the first step, we expanded the image? Well, it's time
107-
to throw away the newly introduced values. For visualization purposes we may also rearrange the
108-
quadrants of the result, so that the origin (zero, zero) corresponds with the image center.
109-
@code{.cpp}
110-
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
111-
int cx = magI.cols/2;
112-
int cy = magI.rows/2;
113-
114-
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
115-
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
116-
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
117-
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
118-
119-
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
120-
q0.copyTo(tmp);
121-
q3.copyTo(q0);
122-
tmp.copyTo(q3);
123-
124-
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
125-
q2.copyTo(q1);
126-
tmp.copyTo(q2);
127-
@endcode
128-
-# **Normalize**. This is done again for visualization purposes. We now have the magnitudes,
129-
however this are still out of our image display range of zero to one. We normalize our values to
130-
this range using the @ref cv::normalize() function.
131-
@code{.cpp}
132-
normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
133-
// viewable image form (float between values 0 and 1).
134-
@endcode
83+
#### Expand the image to an optimal size
84+
85+
The performance of a DFT is dependent of the image
86+
size. It tends to be the fastest for image sizes that are multiple of the numbers two, three and
87+
five. Therefore, to achieve maximal performance it is generally a good idea to pad border values
88+
to the image to get a size with such traits. The **getOptimalDFTSize()** returns this
89+
optimal size and we can use the **copyMakeBorder()** function to expand the borders of an
90+
image (the appended pixels are initialized with zero):
91+
92+
@add_toggle_cpp
93+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp expand
94+
@end_toggle
95+
96+
@add_toggle_java
97+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java expand
98+
@end_toggle
99+
100+
@add_toggle_python
101+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py expand
102+
@end_toggle
103+
104+
#### Make place for both the complex and the real values
105+
106+
The result of a Fourier Transform is
107+
complex. This implies that for each image value the result is two image values (one per
108+
component). Moreover, the frequency domains range is much larger than its spatial counterpart.
109+
Therefore, we store these usually at least in a *float* format. Therefore we'll convert our
110+
input image to this type and expand it with another channel to hold the complex values:
111+
112+
@add_toggle_cpp
113+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp complex_and_real
114+
@end_toggle
115+
116+
@add_toggle_java
117+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java complex_and_real
118+
@end_toggle
119+
120+
@add_toggle_python
121+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py complex_and_real
122+
@end_toggle
123+
124+
#### Make the Discrete Fourier Transform
125+
It's possible an in-place calculation (same input as
126+
output):
127+
128+
@add_toggle_cpp
129+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp dft
130+
@end_toggle
131+
132+
@add_toggle_java
133+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java dft
134+
@end_toggle
135+
136+
@add_toggle_python
137+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py dft
138+
@end_toggle
139+
140+
#### Transform the real and complex values to magnitude
141+
A complex number has a real (*Re*) and a
142+
complex (imaginary - *Im*) part. The results of a DFT are complex numbers. The magnitude of a
143+
DFT is:
144+
145+
\f[M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}\f]
146+
147+
Translated to OpenCV code:
148+
149+
@add_toggle_cpp
150+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp magnitude
151+
@end_toggle
152+
153+
@add_toggle_java
154+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java magnitude
155+
@end_toggle
156+
157+
@add_toggle_python
158+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py magnitude
159+
@end_toggle
160+
161+
#### Switch to a logarithmic scale
162+
It turns out that the dynamic range of the Fourier
163+
coefficients is too large to be displayed on the screen. We have some small and some high
164+
changing values that we can't observe like this. Therefore the high values will all turn out as
165+
white points, while the small ones as black. To use the gray scale values to for visualization
166+
we can transform our linear scale to a logarithmic one:
167+
168+
\f[M_1 = \log{(1 + M)}\f]
169+
170+
Translated to OpenCV code:
171+
172+
@add_toggle_cpp
173+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp log
174+
@end_toggle
175+
176+
@add_toggle_java
177+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java log
178+
@end_toggle
179+
180+
@add_toggle_python
181+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py log
182+
@end_toggle
183+
184+
#### Crop and rearrange
185+
Remember, that at the first step, we expanded the image? Well, it's time
186+
to throw away the newly introduced values. For visualization purposes we may also rearrange the
187+
quadrants of the result, so that the origin (zero, zero) corresponds with the image center.
188+
189+
@add_toggle_cpp
190+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp crop_rearrange
191+
@end_toggle
192+
193+
@add_toggle_java
194+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java crop_rearrange
195+
@end_toggle
196+
197+
@add_toggle_python
198+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py crop_rearrange
199+
@end_toggle
200+
201+
#### Normalize
202+
This is done again for visualization purposes. We now have the magnitudes,
203+
however this are still out of our image display range of zero to one. We normalize our values to
204+
this range using the @ref cv::normalize() function.
205+
206+
@add_toggle_cpp
207+
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp normalize
208+
@end_toggle
209+
210+
@add_toggle_java
211+
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java normalize
212+
@end_toggle
213+
214+
@add_toggle_python
215+
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py normalize
216+
@end_toggle
135217

136218
Result
137219
------
@@ -140,7 +222,7 @@ An application idea would be to determine the geometrical orientation present in
140222
example, let us find out if a text is horizontal or not? Looking at some text you'll notice that the
141223
text lines sort of form also horizontal lines and the letters form sort of vertical lines. These two
142224
main components of a text snippet may be also seen in case of the Fourier transform. Let us use
143-
[this horizontal ](https://github.com/opencv/opencv/tree/master/samples/data/imageTextN.png) and [this rotated](https://github.com/opencv/opencv/tree/master/samples/data/imageTextR.png)
225+
[this horizontal ](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/imageTextN.png) and [this rotated](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/imageTextR.png)
144226
image about a text.
145227

146228
In case of the horizontal text:

doc/tutorials/core/table_of_content_core.markdown

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,8 @@ understanding how to manipulate the images on a pixel level.
7676

7777
- @subpage tutorial_discrete_fourier_transform
7878

79+
*Languages:* C++, Java, Python
80+
7981
*Compatibility:* \> OpenCV 2.0
8082

8183
*Author:* Bernát Gábor

samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp

Lines changed: 20 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -8,45 +8,58 @@
88
using namespace cv;
99
using namespace std;
1010

11-
static void help(char* progName)
11+
static void help(void)
1212
{
1313
cout << endl
1414
<< "This program demonstrated the use of the discrete Fourier transform (DFT). " << endl
1515
<< "The dft of an image is taken and it's power spectrum is displayed." << endl
1616
<< "Usage:" << endl
17-
<< progName << " [image_name -- default ../data/lena.jpg] " << endl << endl;
17+
<< "./discrete_fourier_transform [image_name -- default ../data/lena.jpg]" << endl;
1818
}
1919

2020
int main(int argc, char ** argv)
2121
{
22-
help(argv[0]);
22+
help();
2323

2424
const char* filename = argc >=2 ? argv[1] : "../data/lena.jpg";
2525

2626
Mat I = imread(filename, IMREAD_GRAYSCALE);
27-
if( I.empty())
27+
if( I.empty()){
28+
cout << "Error opening image" << endl;
2829
return -1;
30+
}
2931

32+
//! [expand]
3033
Mat padded; //expand input image to optimal size
3134
int m = getOptimalDFTSize( I.rows );
3235
int n = getOptimalDFTSize( I.cols ); // on the border add zero values
3336
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
37+
//! [expand]
3438

39+
//! [complex_and_real]
3540
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
3641
Mat complexI;
3742
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
43+
//! [complex_and_real]
3844

45+
//! [dft]
3946
dft(complexI, complexI); // this way the result may fit in the source matrix
47+
//! [dft]
4048

4149
// compute the magnitude and switch to logarithmic scale
4250
// => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
51+
//! [magnitude]
4352
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
4453
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
4554
Mat magI = planes[0];
55+
//! [magnitude]
4656

57+
//! [log]
4758
magI += Scalar::all(1); // switch to logarithmic scale
4859
log(magI, magI);
60+
//! [log]
4961

62+
//! [crop_rearrange]
5063
// crop the spectrum, if it has an odd number of rows or columns
5164
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
5265

@@ -67,9 +80,12 @@ int main(int argc, char ** argv)
6780
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
6881
q2.copyTo(q1);
6982
tmp.copyTo(q2);
83+
//! [crop_rearrange]
7084

85+
//! [normalize]
7186
normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
7287
// viewable image form (float between values 0 and 1).
88+
//! [normalize]
7389

7490
imshow("Input Image" , I ); // Show the result
7591
imshow("spectrum magnitude", magI);

0 commit comments

Comments
 (0)