
imgproc: Bilateral filter performance improvement #27433

Open: wants to merge 3 commits into base: 4.x

madanm3
Contributor

@madanm3 madanm3 commented Jun 11, 2025

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@madanm3
Contributor Author

madanm3 commented Jun 11, 2025

Name of Test                                                          base     patch  base/patch (gain in x)
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   4.42 ms   3.42 ms  1.3x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)   6.94 ms   3.42 ms  2.0x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)   7.17 ms    6.1 ms  1.2x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)   7.91 ms    6.1 ms  1.3x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)  11.22 ms   8.75 ms  1.3x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)  18.51 ms   8.74 ms  2.1x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)  17.16 ms  15.03 ms  1.1x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)  19.18 ms  15.03 ms  1.3x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)     0.66 ms   0.51 ms  1.3x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)     1.03 ms   0.51 ms  2.0x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)     1.06 ms    0.9 ms  1.2x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)     1.16 ms    0.9 ms  1.3x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)     1.68 ms    1.3 ms  1.3x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)     2.74 ms   1.29 ms  2.1x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)     2.54 ms   2.22 ms  1.1x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)     2.83 ms   2.22 ms  1.3x
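
The gain column is the base time divided by the patched time, so a value above 1 means the patch is faster. A minimal Python sketch of that arithmetic, using three rows copied from the table above (the `gain` helper is a hypothetical name, not part of OpenCV's perf tooling):

```python
def gain(base_ms: float, patch_ms: float) -> float:
    """Return base/patch; values above 1.0 mean the patch is faster."""
    if patch_ms <= 0:
        raise ValueError("patch time must be positive")
    return base_ms / patch_ms


# (test parameters, base time in ms, patch time in ms) from the table above.
rows = [
    ("1920x1080, 3, CV_8UC1, 1", 4.42, 3.42),
    ("1920x1080, 3, CV_8UC1, 5", 6.94, 3.42),
    ("1920x1080, 5, CV_8UC3, 1", 17.16, 15.03),
]

for name, base_ms, patch_ms in rows:
    print(f"{name}: {gain(base_ms, patch_ms):.1f}x")
```

Rounded to one decimal place, the recomputed ratios agree with the gain column for these rows.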

@madanm3 madanm3 force-pushed the fast_bilateral_simd branch 2 times, most recently from b250df5 to 3a86990 Compare June 16, 2025 04:49
@madanm3
Contributor Author

madanm3 commented Jun 16, 2025

Name of Test                                                           ref     patch  ref/patch (gain in x)
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)  6.75 ms   6.58 ms  1.03x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 5)  6.97 ms   6.83 ms  1.02x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 1) 13.87 ms  13.69 ms  1.01x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 5) 13.91 ms  13.76 ms  1.01x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   4.42 ms   3.39 ms  1.30x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)   6.94 ms   3.39 ms  2.05x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)   7.17 ms   6.82 ms  1.05x
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)   7.91 ms   6.83 ms  1.16x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1) 18.12 ms  18.07 ms  1.00x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 5) 18.78 ms  18.74 ms  1.00x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 1) 33.38 ms  32.99 ms  1.01x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 5)  33.4 ms  33.06 ms  1.01x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)  11.22 ms   8.35 ms  1.34x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)  18.51 ms   8.32 ms  2.22x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)  17.16 ms   16.6 ms  1.03x
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)  19.18 ms  16.62 ms  1.15x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 1)    0.99 ms   0.97 ms  1.02x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 5)    1.02 ms      1 ms  1.02x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 1)     1.9 ms   1.93 ms  0.98x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 5)     1.9 ms   1.94 ms  0.98x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)     0.66 ms    0.5 ms  1.32x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)     1.03 ms    0.5 ms  2.06x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)     1.06 ms      1 ms  1.06x
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)     1.16 ms      1 ms  1.16x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 1)    2.63 ms   2.67 ms  0.99x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 5)    2.72 ms   2.76 ms  0.99x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 1)     4.8 ms    4.8 ms  1.00x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 5)    4.79 ms    4.8 ms  1.00x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)     1.68 ms   1.24 ms  1.35x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)     2.74 ms   1.23 ms  2.23x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)     2.54 ms   2.45 ms  1.04x
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)     2.83 ms   2.46 ms  1.15x

@madanm3 madanm3 force-pushed the fast_bilateral_simd branch from 46c1fd3 to 74a0ad3 Compare June 16, 2025 07:49
- Use of additional buffer removed.
- Performance improved by avoiding memory store and load.
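
The commit message above says an additional buffer was removed so that intermediate values are no longer stored and reloaded. The patch source is not shown in this thread, so the following is only a generic 1-D Python illustration of that idea, not the PR's SIMD code: one variant writes every weighted tap to a scratch list before reducing it, the other accumulates straight into two running sums. The function names, the replicate border, and the sample signal are all assumptions made for the sketch.

```python
import math


def _weight(k, diff, sigma_space, sigma_color):
    # Product of the spatial and range Gaussians used by a bilateral filter.
    return math.exp(-(k * k) / (2.0 * sigma_space ** 2)
                    - (diff * diff) / (2.0 * sigma_color ** 2))


def bilateral_1d_buffered(src, radius, sigma_color, sigma_space):
    """Store each weighted tap in a scratch buffer, then reduce it."""
    n, out = len(src), []
    for i in range(n):
        scratch = []  # extra buffer: one store and one later load per tap
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), n - 1)  # replicate border (assumption)
            w = _weight(k, src[j] - src[i], sigma_space, sigma_color)
            scratch.append((w, w * src[j]))
        wsum = sum(w for w, _ in scratch)
        vsum = sum(v for _, v in scratch)
        out.append(vsum / wsum)
    return out


def bilateral_1d_fused(src, radius, sigma_color, sigma_space):
    """Accumulate into two running sums; no scratch buffer is needed."""
    n, out = len(src), []
    for i in range(n):
        wsum = vsum = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), n - 1)  # replicate border (assumption)
            w = _weight(k, src[j] - src[i], sigma_space, sigma_color)
            wsum += w
            vsum += w * src[j]
        out.append(vsum / wsum)
    return out


signal = [10, 12, 11, 200, 205, 198, 15, 14]
a = bilateral_1d_buffered(signal, radius=2, sigma_color=25.0, sigma_space=1.5)
b = bilateral_1d_fused(signal, radius=2, sigma_color=25.0, sigma_space=1.5)
# The two variants should agree up to floating-point rounding.
print(all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(a, b)))
```

Both variants compute the same weighted average; the fused one simply never materializes the per-tap values, which is the kind of saving the commit message describes.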
@madanm3 madanm3 force-pushed the fast_bilateral_simd branch from 74a0ad3 to 5198e0f Compare June 16, 2025 07:53
@madanm3 madanm3 force-pushed the fast_bilateral_simd branch from e8fc8e0 to 85a3b0e Compare June 16, 2025 09:14
@madanm3
Contributor Author

madanm3 commented Jun 23, 2025

  • For the 8U data type, a 42% gain has been observed with the patch on Genoa.
  • Both the 8U and 32F filters have been modified to compute the filter in place.
  • Further 32F filter improvements will be added in the next PR.
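
The comment does not say how the 42% figure was derived. One possible reading, which is purely an assumption here, is an average over the x-factors of the sixteen 8U rows in the June 16 table earlier in this thread. A small Python sketch that computes both the arithmetic and the geometric mean of those factors:

```python
from statistics import fmean, geometric_mean

# ref/patch x-factors of the sixteen CV_8UC1 and CV_8UC3 rows
# in the June 16 table (copied by hand; 32F rows are excluded).
factors_8u = [
    1.30, 2.05, 1.05, 1.16, 1.34, 2.22, 1.03, 1.15,  # 1920x1080
    1.32, 2.06, 1.06, 1.16, 1.35, 2.23, 1.04, 1.15,  # 640x480
]

arith = fmean(factors_8u)
geo = geometric_mean(factors_8u)

print(f"arithmetic mean: {arith:.2f}x")
print(f"geometric mean:  {geo:.2f}x")
```

The arithmetic mean of these factors comes out at about 1.42x, which happens to match the quoted 42%. The geometric mean is generally the safer way to average ratios and gives a somewhat lower number.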

@madanm3
Contributor Author

madanm3 commented Jun 23, 2025

@asmorkalov Please let me know if any changes are required, thanks.

@asmorkalov
Contributor

@fengyuentau @mshabunin could you take a look?

@asmorkalov asmorkalov requested a review from vpisarev June 27, 2025 07:10
@asmorkalov asmorkalov added this to the 4.13.0 milestone Jun 27, 2025
Member

@fengyuentau fengyuentau left a comment

My perf stats:

i7-12700K:

                          Name of Test                            i7-base i7-patch  i7-patch
                                                                                       vs
                                                                                    i7-base
                                                                                   (x-factor)
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)     0.217   0.109      2.00
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)     0.577   0.081      7.14
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 1)    0.281   0.289      0.97
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 5)    0.318   0.322      0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)     0.211   0.226      0.93
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)     0.386   0.238      1.62
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 1)    0.691   0.724      0.95
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 5)    0.679   0.718      0.95
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)     0.538   0.168      3.21
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)     1.540   0.166      9.25
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 1)    0.492   0.520      0.95
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 5)    0.578   0.611      0.95
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)     0.408   0.436      0.94
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)     0.952   0.421      2.26
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 1)    1.021   1.053      0.97
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 5)    1.023   1.058      0.97
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   1.669   0.538      3.10
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)   4.356   0.533      8.17
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)  2.249   2.219      1.01
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 5)  2.518   2.437      1.03
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)   1.087   1.078      1.01
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)   2.457   1.075      2.29
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 1)  7.366   8.181      0.90
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 5)  7.395   8.180      0.90
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)   4.873   1.112      4.38
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)   9.479   1.171      8.09
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1)  5.369   5.276      1.02
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 5)  6.962   6.835      1.02
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)   2.013   2.022      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)   6.663   2.034      3.28
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 1)  9.325   9.236      1.01
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 5)  9.812   8.985      1.09

m2:

                          Name of Test                            m2-base m2-patch  m2-patch
                                                                                       vs
                                                                                    m2-base
                                                                                   (x-factor)
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)     0.250   0.212      1.18
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)     0.240   0.223      1.08
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 1)    0.410   0.387      1.06
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 5)    0.402   0.407      0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)     0.421   0.381      1.11
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)     0.420   0.377      1.11
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 1)    0.973   1.125      0.87
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 5)    0.961   0.984      0.98
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)     0.504   0.453      1.11
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)     0.498   0.448      1.11
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 1)    0.888   0.849      1.05
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 5)    0.889   0.863      1.03
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)     0.917   0.818      1.12
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)     0.905   0.820      1.10
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 1)    1.719   1.844      0.93
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 5)    1.717   1.826      0.94
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   0.980   0.851      1.15
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)   0.977   0.858      1.14
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)  2.329   2.251      1.03
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 5)  2.344   2.251      1.04
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)   2.116   1.827      1.16
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)   2.015   1.829      1.10
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 1)  4.744   4.968      0.95
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 5)  4.769   4.940      0.97
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)   2.020   1.841      1.10
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)   2.077   1.831      1.13
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1)  4.431   4.245      1.04
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 5)  4.406   4.208      1.05
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)   3.971   3.643      1.09
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)   4.016   3.684      1.09
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 1)  7.990   8.692      0.92
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 5)  7.919   8.646      0.92

K1 (RISC-V):

GCC

                          Name of Test                            base-gcc patch-gcc patch-gcc
                                                                                         vs
                                                                                      base-gcc
                                                                                     (x-factor)
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)     0.829     0.829      1.00
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)     0.832     0.840      0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 1)    2.301     2.272      1.01
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 5)    2.227     2.223      1.00
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)     2.181     2.205      0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)     2.214     2.240      0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 1)    10.406   10.414      1.00
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 5)    10.439   10.447      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)     1.584     1.592      0.99
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)     1.580     1.586      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 1)    4.317     4.316      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 5)    4.327     4.337      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)     4.471     4.451      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)     4.623     4.628      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 1)    23.164   23.141      1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 5)    23.283   23.268      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   5.850     5.838      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)   5.885     5.872      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)  19.223   24.505      0.78
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 5)  19.261   24.559      0.78
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)   20.550   20.696      0.99
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)   21.531   21.531      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 1)  64.941   65.103      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 5)  64.396   64.549      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)   10.840   10.854      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)   10.769   10.791      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1)  37.066   53.972      0.69
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 5)  37.288   53.857      0.69
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)   44.346   44.966      0.99
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)   46.038   46.246      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 1) 140.903   141.787     0.99
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 5) 140.954   141.242     1.00

Clang

                          Name of Test                            base-clang patch-clang patch-clang
                                                                                             vs
                                                                                         base-clang
                                                                                         (x-factor)
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 1)      0.794       0.789       1.01
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC1, 5)      0.789       0.794       0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 1)     2.114       2.158       0.98
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC1, 5)     2.155       2.138       1.01
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 1)      2.008       2.022       0.99
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_8UC3, 5)      2.012       2.106       0.96
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 1)     10.111     10.110       1.00
BilateralFilter::TestBilateralFilter::(640x480, 3, CV_32FC3, 5)     10.098     10.177       0.99
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 1)      1.479       1.474       1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC1, 5)      1.489       1.477       1.01
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 1)     4.053       4.026       1.01
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC1, 5)     4.079       4.054       1.01
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 1)      3.998       3.989       1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_8UC3, 5)      4.003       4.228       0.95
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 1)     22.711     22.615       1.00
BilateralFilter::TestBilateralFilter::(640x480, 5, CV_32FC3, 5)     22.657     22.750       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)    5.492       5.486       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 5)    5.490       5.509       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)   21.069     23.356       0.90
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 5)   20.874     23.308       0.90
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 1)    19.265     19.432       0.99
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC3, 5)    19.286     19.260       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 1)   62.823     62.852       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC3, 5)   62.446     62.767       0.99
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 1)    10.134     10.115       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC1, 5)    10.091     10.074       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1)   41.929     51.640       0.81
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 5)   42.109     51.780       0.81
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)    41.756     41.321       1.01
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 5)    41.532     41.543       1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 1)  138.652     139.128      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC3, 5)  138.680     139.114      1.00
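
In the K1 tables most rows sit at about 1.00, but the 1920x1080 CV_32FC1 rows drop to between 0.69 and 0.90. A hedged Python sketch of how such rows could be picked out automatically from text in this layout: `parse_row` and the 0.95 threshold are hypothetical choices for illustration, and the four sample rows are copied from the K1 GCC table.

```python
def parse_row(line):
    """Split one result row into (name, base, patch, x_factor)."""
    # The test name contains spaces, so split only the last three fields.
    name, base, patch, factor = line.rsplit(None, 3)
    return name.strip(), float(base), float(patch), float(factor)


# Four rows copied from the K1 GCC table.
ROWS = """\
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_8UC1, 1)   5.850     5.838      1.00
BilateralFilter::TestBilateralFilter::(1920x1080, 3, CV_32FC1, 1)  19.223   24.505      0.78
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_32FC1, 1)  37.066   53.972      0.69
BilateralFilter::TestBilateralFilter::(1920x1080, 5, CV_8UC3, 1)   44.346   44.966      0.99
"""

THRESHOLD = 0.95  # hypothetical cut-off, not an OpenCV convention

parsed = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
regressions = [row for row in parsed if row[3] < THRESHOLD]

for name, base, patch, factor in regressions:
    print(f"{name}: {base} -> {patch} ({factor:.2f}x)")
```

With this threshold only the two CV_32FC1 rows are reported; rows within a few percent of 1.00 are treated as measurement noise.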

@asmorkalov
Contributor

Hm, is IPP enabled in your Intel build? What is the build CPU baseline?

@fengyuentau
Member

fengyuentau commented Jun 27, 2025

> Hm, is IPP enabled in your Intel build? What is the build CPU baseline?

cmake -G Ninja -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_INSTALL_PREFIX=build/install -DWITH_IPP=OFF .

Baseline is git reset --hard HEAD~3 on this branch.

@fengyuentau
Member

Added performance results for RISC-V.

@madanm3
Contributor Author

madanm3 commented Jul 23, 2025

@fengyuentau, @asmorkalov
I would like to know whether any changes or improvements are expected for this PR. Thank you for the performance checks on i7, M2 and RISC-V.

@asmorkalov
Contributor

Hello @madanm3, thanks a lot for the contribution! Looks like the ball is in my court. I ran several experiments and got contradictory results, and I cannot reproduce a speedup similar to yours. Please give me some time to debug the issue or confirm my findings.

@madanm3
Contributor Author

madanm3 commented Jul 25, 2025

@asmorkalov, the performance gains I observed on AMD Genoa differ significantly from the i7-12700K results.
I'll wait for your findings, but if you need any support with the experiments from my side, I will be happy to assist.

@madanm3 madanm3 changed the title Bilateral filter performance improvement imgproc: Bilateral filter performance improvement Aug 5, 2025