-
-
Notifications
You must be signed in to change notification settings - Fork 699
colour: use floating-point APIs of lcms #3373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Avoids the need to pack the buffer of floats into lcms's fixed-point formats and ensures full precision internally in lcms. Resolves: libvips#3150. Supersedes: libvips#3368.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm amazed lcms has such different behaviour on this code path.
Well done for finding this Kleis.
I benchmarked like this. Before:
After:
So we're maybe 4x more CPU with this change, which is unfortunate :( However, the accuracy improvement is so dramatic, I don't think we have a choice. The high CPU cost seems alarming, but it's interesting that overall run time is unaffected, since libvips hides the extra processing with extra concurrency (phew!). |
I've done some quick performance testing/profiling and can confirm the CPU cost (ignoring concurrency) of both approaches is ~4x, with ~30% of the time spent calculating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having slept on it, I agree this PR is the better solution, especially as it doesn't prevent lcms from internally optimising the icc_transform
operation.
Ah, I was unaware of the additional CPU cost (I only tested overall run time). Let me check if there's another solution. I remembered that one of the libjxl maintainers said that it did a |
I had a quick go with fast float using this diff to this patch: diff --git a/libvips/colour/icc_transform.c b/libvips/colour/icc_transform.c
index 31f0c954c..e7ed52d32 100644
--- a/libvips/colour/icc_transform.c
+++ b/libvips/colour/icc_transform.c
@@ -91,6 +91,7 @@
/* Has to be before VIPS to avoid nameclashes.
*/
#include <lcms2.h>
+#include <lcms2_fast_float.h>
#include <vips/vips.h>
diff --git a/libvips/iofuncs/init.c b/libvips/iofuncs/init.c
index 93f0b17ba..25089d0bc 100644
--- a/libvips/iofuncs/init.c
+++ b/libvips/iofuncs/init.c
@@ -114,6 +114,11 @@
#include <vips/vips7compat.h>
#endif
+#ifdef HAVE_LCMS2
+#include <lcms2.h>
+#include <lcms2_fast_float.h>
+#endif /*HAVE_LCMS2*/
+
/* abort() on the first warning or error.
*/
int vips__fatal = 0;
@@ -469,6 +474,10 @@ vips_init( const char *argv0 )
return( 0 );
started = TRUE;
+#ifdef HAVE_LCMS2
+ cmsPlugin(cmsFastFloatExtensions());
+#endif /*HAVE_LCMS2*/
+
/* Try to set a minimum stacksize, default 2mb. We need to do this
* before any threads start.
*/ But I don't see a speedup :(
I've probably messed up somewhere. |
You're right, I don't see any speedup here either with the fast float plugin :( I did some digging and it looks like this is being caused by the --- a/src/cmsopt.c
+++ b/src/cmsopt.c
@@ -1803,8 +1803,7 @@ static _cmsOptimizationCollection DefaultOptimization[] = {
{ OptimizeByJoiningCurves, &DefaultOptimization[1] },
{ OptimizeMatrixShaper, &DefaultOptimization[2] },
- { OptimizeByComputingLinearization, &DefaultOptimization[3] },
- { OptimizeByResampling, NULL }
+ { OptimizeByComputingLinearization, NULL },
};
// The linked list head I see (click to enlarge): $ vipsthumbnail php-vips-192.jpg -s 1536x1024 -e srgb -o x.jpg
And: $ vipsthumbnail php-vips-169.jpg --linear -s 1920x1080 -o x.jpg
So, it looks like commit kleisauke@3e93bb0 might also help here, but only when that builtin lcms optimization is disabled. |
Oh very cool! And does changing the lcms optimisation table improve performance significantly? |
It seems that the For example: $ time vips icc_import php-vips-169.jpg t1.v --pcs xyz
real 0m0.106s
user 0m0.105s
sys 0m0.055s
$ time vips icc_import php-vips-169.jpg t1.v --pcs xyz
real 0m0.122s
user 0m0.247s
sys 0m0.043s This PR: $ time vips icc_import php-vips-169.jpg t1.v --pcs xyz
real 0m0.124s
user 0m0.257s
sys 0m0.046s Commit kleisauke@3e93bb0 + lcms patch: $ time vips icc_import php-vips-169.jpg t1.v --pcs xyz
real 0m0.123s
user 0m0.267s
sys 0m0.034s |
Great detective work Kleis. Did you see that the accuracy of the LUTs in |
Yes, I also noticed that. I initially tried these two patches: --- a/libvips/colour/icc_transform.c
+++ b/libvips/colour/icc_transform.c
@@ -444,6 +444,9 @@ vips_icc_build( VipsObject *object )
if( icc->black_point_compensation )
flags |= cmsFLAGS_BLACKPOINTCOMPENSATION;
+ if( icc->pcs == VIPS_PCS_XYZ )
+ flags |= cmsFLAGS_HIGHRESPRECALC;
+
if( !(icc->trans = cmsCreateTransform(
icc->in_profile, icc->in_icc_format,
icc->out_profile, icc->out_icc_format, --- a/libvips/colour/icc_transform.c
+++ b/libvips/colour/icc_transform.c
@@ -444,6 +444,9 @@ vips_icc_build( VipsObject *object )
if( icc->black_point_compensation )
flags |= cmsFLAGS_BLACKPOINTCOMPENSATION;
+ if( icc->pcs == VIPS_PCS_XYZ )
+ flags |= cmsFLAGS_GRIDPOINTS( 49 );
+
if( !(icc->trans = cmsCreateTransform(
icc->in_profile, icc->in_icc_format,
icc->out_profile, icc->out_icc_format, But both didn't work, unfortunately. |
I tried removing |
libvips uses a LUT for tone curves -- 1500 elements (from memory?) then linear piecewise interpolation between them, which is accurate enough, I think. Maybe we could make our own lcms plugin that does something similar? Or perhaps we need to swap to skcms :( |
Oh heh no we don't, it's much more complex, I'd forgotten. |
IIRC, skcms doesn't support conversion to LUT-based or CMYK profiles, so switching to that is not an option. |
Ah shame. |
It looks like this issue occurs only when exporting ( Here are some Details
$ time vipsthumbnail php-vips-169.jpg --linear -s 1920x1080 -o x.jpg
real 0m0.107s
user 0m0.168s
sys 0m0.024s
$ time vipsthumbnail php-vips-169.jpg -s 1920x1080 -e srgb -o x.jpg
real 0m0.095s
user 0m0.094s
sys 0m0.028s
$ time vipsthumbnail php-vips-192.jpg -s 1536x1024 -e srgb -o x.jpg
real 0m0.099s
user 0m0.121s
sys 0m0.025s
$ time vipsthumbnail php-vips-192.jpg -s 1536x1024 -i srgb -e srgb -o x.jpg
real 0m0.091s
user 0m0.086s
sys 0m0.023s This PR: $ time vipsthumbnail php-vips-169.jpg --linear -s 1920x1080 -o x.jpg
real 0m0.132s
user 0m0.344s
sys 0m0.023s
$ time vipsthumbnail php-vips-169.jpg -s 1920x1080 -e srgb -o x.jpg
real 0m0.095s
user 0m0.094s
sys 0m0.029s
$ time vipsthumbnail php-vips-192.jpg -s 1536x1024 -e srgb -o x.jpg
real 0m0.104s
user 0m0.174s
sys 0m0.027s
$ time vipsthumbnail php-vips-192.jpg -s 1536x1024 -i srgb -e srgb -o x.jpg
real 0m0.091s
user 0m0.083s
sys 0m0.026s
So, only processing in |
Ensures compilers may optimise.
Avoids the need to pack the buffer of floats into lcms's fixed-point formats and ensures full precision internally in lcms.
Resolves: #3150.
Supersedes: #3368.