export img decoders/encoders #8511

brian-armstrong-discord · 2017-04-03T22:17:03Z

This pull request changes the image codecs API so that findDecoder and findEncoder are available to users. This allows users to inspect the header of an image before decoding it so that they can decide whether they want to decode the image before they do so. For example, the image might describe itself as being 1000000 x 1000000 pixels, in which case openCV will go ahead and allocate an extremely large buffer to hold this image. This new API gives the user the ability to decide they don't want to do that.

In addition, this patch attempts to comment the API a little and make it safer to use by doing bounds checking in readData so that a too-small buffer given to readData will not yield some kind of bad write outside array bounds.

brian-armstrong-discord · 2017-04-03T22:47:45Z

added string description to each decoder so that the caller can decide if they don't want to touch the header once the magic bytes have been read. for example, tiffs are quite prone to CVEs, so we might not want to touch them at all.

this builds on opencv#8492 and opencv#8511 to add a new method, .orientation(), to ImageDecoder. The JPEG subclass now overrides this to do jpeg-specific exif tag reading we also now export a method, OrientationTransform, which uses this orientation value to do the actual matrix operations

brian-armstrong-discord · 2017-04-19T01:06:31Z

This patch resolves #8605 and #8606

alalek · 2017-04-19T13:09:39Z

New OpenCV public API should not contain complex classes with "virtual" methods. This is needed to avoid ABI (application binary interface) violations and producing hacks like this - just want to add new method.

Please follow "pImpl design pattern" instead. Example: ocl::Device
You would still have these "virtual classes", but they will go into the private scope (into some .hpp file from src folder: without exposing them public) - probably you should just reuse existed BaseImageEncoder as "internal" implementation.

Please avoid(or minimize) using of STL objects in interfaces (std::vector/std::map/std::string/etc) - this may raise ABI problems which is very hard to debug (especially for newbie C++ developers): #6198 #5227 #6165

brian-armstrong-discord · 2017-04-19T18:11:21Z

@alalek 👍 Good stuff. I think I thought I was accomplishing the pimpl pattern by making the virtual base class and returning Ptr but this makes a lot more sense. Will make those changes soon.

brian-armstrong-discord · 2017-04-20T21:24:16Z

@alalek What would you think about leaving ImageEncoder's write params arg as const std::vector<int>& params? imencode already takes this and it seems like it would make the most sense to have parity with that function, since the argument types are the same.

Actually, for that matter, imencode also uses std::vector<uchar>& buf for its output. So to do that with ImageEncoder here would match imencode as well.

As far as I can tell those are the only places in this patch where STL types are exposed in the ABI.

alalek · 2017-04-20T23:07:01Z

Probably we can try to use InputArray wrapper instead of "std::vector& params" (the latest changes support C++ std::array<> argument type).

void setDestination( std::vector& buf )

I believe Mat& (of 8UC1) should work in this case too.

brian-armstrong-discord · 2017-04-21T18:26:03Z

@alalek is it possible to run the tests locally? i tried pulling down the sample data but i wasn't really sure where it needs to go or how to invoke it, so i'm dependent on the tester here to check for now :(

brian-armstrong-discord · 2017-04-21T19:11:27Z

@alalek Ok, nevermind, I put together a small test program.

brian-armstrong-discord · 2017-04-21T21:19:50Z

@alalek So this passes the tests, but I'm not so good at opencv-fu, and I needed to allocate a new Mat and then copy that Mat back to the destination vector in imencode once the encode finished. This is the only way I know of to maintain the backwards compat here with std::vector.

Is there a way around that that we can avoid? I have briefly tried using OutputArray but I'm not really sure how to do the operations I need to on it, including when it's of type std::vector

brian-armstrong-discord · 2017-04-25T01:10:17Z

@alalek I folded #8547 into this patch - it wasn't a huge change, just exported the Exif stuff on top of Decoder/Encoder. It's now part of the JPEGDecoder

alalek · 2017-04-25T11:05:01Z

@brian-armstrong-discord There is some conflicts with master branch. Please rebase this patch to actual master branch.

Something like these:

https://help.github.com/articles/configuring-a-remote-for-a-fork/
git fetch upstream master
git rebase -i upstream/master
on conflict: resolve conflict and run git add <conflict file> and then git rebase --continue. Repeat
git push -f origin <branch name> (-f means force)

BTW, We prefer to have patches with single commit to eliminate intermediate "fixup" commits from git history. Sometimes it is better to squash your commits before rebase to upstream code (git rebase -i HEAD~11 (11 is number of commits); "pick" the first commit, "f" or "s" - others)

Rebased branch for reference:

https://github.com/alalek/opencv/tree/pr8511_r (rebased)
https://github.com/alalek/opencv/tree/pr8511_rs (squashed & rebased)

brian-armstrong-discord · 2017-05-02T20:49:17Z

@alalek Looks like the conflicts are gone now

brian-armstrong-discord · 2017-05-24T18:40:01Z

Hi @alalek

Have you had a chance to look at this recently?

alalek · 2017-05-24T20:24:30Z

Hi @brian-armstrong-discord !
I'm sorry for huge delay. I need more time to review this (to test some suggestions on public interfaces).

Turim · 2017-06-09T08:51:46Z

@alalek @brian-armstrong-discord I also found the idea -to get the header data- very promising. If one might ask, what's the state so far?

P.S. It doesn't seem like you need some help, but if you are - feel free to text me :)
P.P.S.

The stated patch intention is to provide, say, lazy evaluation for image processing. But to do so the export whole bunch of ImageEncoders/Decoders is seen as little redundand (4 me ofc) - e.g. we could have added just +1 method imreadheader() or like so. (as C-like interface extension)
I wonder shall or shall not Encoders be visible with only simmetric (AFAIK) reasoning
I have some concerns about imageDescription(). As for me, the downside is the case -what if I want to know, what the exact format of the image?-. Should I compare the strings? Or shall we introduce an enum instead?
Also, it seems, ctors like Decoder(const Decoder& oth, PtrDecoder::Impl) could be avoided.

brian-armstrong-discord · 2017-06-21T22:47:41Z

Hi @alalek

Any updates in this PR? I would love to put this work behind me finally :)

Anecdotally, this patch is working well in production. I know you will likely have some more requests on the interface of it, though.

brian-armstrong-discord · 2017-06-21T22:49:12Z

@Turim an imreadheader would, among other issues, force us to re-read the header bytes when we go to decode. it also opens the possibility of imread() and imreadheader() doing different things. this would be a brittle approach to the solution.

i'm open to using enums, but internally opencv uses strings, and it made sense to me to make this change as lightweight as possible.

Turim · 2017-06-23T07:49:12Z

@brian-armstrong-discord ,
Frankly speaking, I didn't see here strong reasoning against.

Yup, re-read() is possible. Ok, we could live even with that or provide within imreadheader() an out param, which could be used to read the image but header.
About possibilities. A strange argument, as for me, coz we could have had tests for the case. Besides, we could implement imread() via imreadheader() and that's it, isn't it?

i'm open to using enums, but internally opencv uses strings, and it made sense to me to make this change as lightweight as possible.

Can I ask you to be more specific? In competition 'enum vs strings' where's the pros of the string usage?
(Say, at present to the question -Which format does the image have?- I see the overhead in terms of string comparisons and have no clue -how to avoid it-).

brian-armstrong-discord · 2017-06-23T18:29:44Z

@Turim You're more than welcome to try whatever you think will work well. I'd like to stick with the approach that's in this PR because the Decoder and Encoder classes do the same work that was being done before and offer the user a relatively clean interface.

brian-armstrong-discord · 2017-09-20T20:05:26Z

@alalek Do you think there's any hope for this PR? I'm wondering where I should set my expectations.

sturkmen72 · 2017-11-14T16:26:37Z

i want to share my humble thoughts about this PR
i completely agree that there should be a function or class giving image information only reading file header.
when adding this functionality we also should keep in mind some related improvements
1.catching internal errors while loading images (see #3314)
2.when saving an image keeping EXIF information or adding new EXIF tags

personally i think a new class using existing decoders and encoders internally can be a good candidate to the solution

olalonde · 2017-11-14T18:09:48Z

https://blog.discordapp.com/how-discord-resizes-150-million-images-every-day-with-go-and-c-c9e98731c65d

alalek · 2017-11-14T19:14:58Z

Sorry for huge delay!

Current interface is very questionable (for example, setScale(), orientation() and nextPage() methods). Such interfaces should be polished accurately:

minimal "required" set of parameters.
easily extendable interface
usability (construct image decoder, call readHeader(), check result and then maybe get image size via width and height - this long story we should to hide from users)

Perhaps, we could start with simple set of functions, for example:

/** Query image parameters

@sa cv::imread

Throws exception if image is not supported.

@param filename Name of file to be loaded
@param flags Flag that can take values of cv::ImreadModes
@param[out] type (optional) type of decoded cv::Mat returned by cv::imtead
@param[out] decoderName (optional) name of used image decoder for cv::imread
@returns size of image
*/
Size imquery(const cv::String& filename, int flags = IMREAD_COLOR,
                CV_OUT int* type = NULL, CV_OUT cv::String* decoderName = NULL);

I believe it is able to handle declared set of problems from the PR description.

Turim · 2017-11-15T19:42:40Z

@alalek , IMHO

nextPage() can't be ignored as just redundand for nowadays we're able to handle multipage TIFF images;
I hope imageHeader/width and imageHeader/height still can be a subject for a discussion;

I think the case could be mantained for the imgcodecs to be more clean. Nowadays the only what a imgcodecs' user can is try to load the image via, say, imread and check the retval. But in the prod code it's just sort of gambling with user's memory (i.e. working set) - every try to load image, say, 10000x10000 or even bigger leads to a memory consumption peak with empty retval.

Am I right that you prefer not to add struct/classes?
The another way to mantain the case could be code like that:

/**
@param filename Name of file to be loaded
@param flags Flag that can take values of cv::ImreadModes
@param[out] type (optional) type of decoded cv::Mat returned by cv::imread
@param[out] width (optional) width of decoded cv::Mat returned by cv::imread
@param[out] height (optional) width of decoded cv::Mat returned by cv::imread
*/
CV_EXPORTS_W bool imreadheader(const String& filename, int flags, 
                                CV_OUT int* type = NULL, CV_OUT int* width = NULL, CV_OUT int* height = NULL);

/**
^like the previous
*/
struct ImageSize // or pair_of_int or tuple_of_int
{
int width;
int height;
};
CV_EXPORTS_W bool imreadheadermulti(const String& filename, int flags, 
                                CV_OUT int* type = NULL, CV_OUT std::vector<ImageSize>* sizes = NULL);

As for mentioned imquery() - yes, it gives you the option to distinguish JPEG from TIFF, but the image size is an open question still - so you basically don't know if there any way to load it.

P.S. @brian-armstrong-discord I apologize for I've just mentioned yet another interface for the case. Quite unsure should I create a separate discussion for that. Feel free to provide the feedback on that.
P.P.S. About 'to throw or not to throw' - I'm pretty unsure that it has any connection to the iface. To enable throwing means just to break the backward compatibility and previously written error handling.
P.P.P.S. Yup, both imreadheader() and imquery() aren't so good regarding the perf - on imread() you eventually reread the header. I think we can't fix that without classes/output_token_parameters which could use the state. But even with current perf restrictions, it's good just to have functions like those in the imgcodecs interface.

alalek · 2017-11-15T20:05:54Z

but the image size is an open question still

In case of imquery proposal - size is return value of mentioned function.

multipage TIFF

I prefer to handle the most universal cases and I prefer not to add some rarely used hacks (like setScale() or readNext()) into widely used interfaces.
Image is a single surface of pixels (check "wiki"-like definitions).
If you want to work with multipage images - welcome to VideoCapture interface, which is initially designed to handle set of images, including cases of different frames resolutions.

Throw

Check amount of useful information between "bool" result and "exception" with message and potentially stack-trace to the reason of problem.

Turim · 2017-11-15T20:13:49Z

@alalek , thanks for clarifying.

size as the retval

You're correct, my mistake.

I prefer not to add some rarely used hacks

AFAIK imreadmulti() supports it and, I presume, it's the only goal.
So we're able to read multiple images, but can get the Size only for the 1st - I presume - the more clean would be to add imquerymulti() or to deprecate imreadmulti().

brian-armstrong-discord · 2018-04-25T23:22:54Z

Should I go ahead and just abandon this? The imquery approach isn't very appetizing, and if this isn't mergeable, we might as well kill it.

brian-armstrong-discord · 2018-08-01T22:30:02Z

Are there any plans to add something like imquery? We'd love to get off this fork, especially now that opencv has changed substantially.

asenyaev · 2021-04-07T10:12:05Z

jenkins cn please retry a build

ocpalo · 2022-05-23T23:13:06Z

Hi,

I think this pull request covers my GSoC proposal :). Before the coding period started, I was doing some research and found myself here.

I am going to do a meeting with my mentor about this topic but after seeing lots of people discuss this topic, I wanted to hear about you if you are still interested. After re-read your comments, I came to this conclusion.

I think we should provide an API for reading image header-only using the approach c-like interface. If the user knows about the image data, he would not use this API to confirm. Only users would use this API who do not know about its image details. So for these users, re-reading the header should not be a problem. Because their concerns are probably about limited memory. They could sacrifice some CPU time to verify that their software is not going to die in the middle of the process.

I think it should be something like this

 int imreadheader(String const& file, int flags, <Struct Header Info>);
 int imreadheader(String const& file, int flags, <Struct Header Info>, Mat &img, int width, int height); // Read image to the img if image size within the given limits

The second function does a different thing than what it says but I think it is good to provide this API.

Please let me know what you think about this :). This pull request is opened in 2017 but is still open. So I did not bother to create a new one since there exists one on this topic. I hope this won't be a problem.

alalek · 2022-05-24T08:31:04Z

Having imreadheader or similar calls for reading of the image header are not enough.
Usually this call is followed by reading the actual image data.
Main point is that OpenCV and it s API should be effective and we should not read the input file twice.

ocpalo · 2022-05-24T14:45:22Z

/** Query image parameters

@sa cv::imread

Throws exception if image is not supported.

@param filename Name of file to be loaded
@param flags Flag that can take values of cv::ImreadModes
@param[out] type (optional) type of decoded cv::Mat returned by cv::imtead
@param[out] decoderName (optional) name of used image decoder for cv::imread
@returns size of image
*/
Size imquery(const cv::String& filename, int flags = IMREAD_COLOR,
                CV_OUT int* type = NULL, CV_OUT cv::String* decoderName = NULL);

You suggested this in 2017. I think I can implement this and overload the imread_ method. But somehow I can't escape from these lines in imread.

    // in imread_ function
    ImageDecoder decoder;

#ifdef HAVE_GDAL
    if(flags != IMREAD_UNCHANGED && (flags & IMREAD_LOAD_GDAL) == IMREAD_LOAD_GDAL ){
        decoder = GdalDecoder().newDecoder();
    }else{
#endif
        decoder = findDecoder( filename );
#ifdef HAVE_GDAL
    }
#endif

We can overload the imread function to read image data using the result of imquery call.
We can implement the overloaded imread_ function as

bool imread(String const& filename, int type, String const& decoderName, int flags, Size size) {
        ImageDecoder decoder;

       // Add a new method to ImageDecoder to find appropiate decoder by decoderName?
#ifdef HAVE_GDAL
    if(flags != IMREAD_UNCHANGED && (flags & IMREAD_LOAD_GDAL) == IMREAD_LOAD_GDAL ){
        decoder = GdalDecoder().newDecoder();
    }else{
#endif
        decoder = findDecoder( filename );
#ifdef HAVE_GDAL
    }
#endif

    /// if no decoder was found, return nothing.
    if( !decoder ){
        return 0;
    }
    
    int scale_denom = 1;
    if( flags > IMREAD_LOAD_GDAL )
    {
        if( flags & IMREAD_REDUCED_GRAYSCALE_2 )
            scale_denom = 2;
        else if( flags & IMREAD_REDUCED_GRAYSCALE_4 )
            scale_denom = 4;
        else if( flags & IMREAD_REDUCED_GRAYSCALE_8 )
            scale_denom = 8;
    }
    
        /// set the scale_denom in the driver
    decoder->setScale( scale_denom );

    /// set the filename in the driver
    decoder->setSource( filename );
    
    mat.create( size.height, size.width, type );
    
    // read the image data
    bool success = false;
    try
    {
        success = decoder->readData(mat);
    }
    catch (const cv::Exception& e)
    {
        std::cerr << "imread_('" << filename << "'): can't read data: " << e.what() << std::endl << std::flush;
    }
    catch (...)
    {
        std::cerr << "imread_('" << filename << "'): can't read data: unknown exception" << std::endl << std::flush;
    }
    if (!success)
    {
        mat.release();
        return false;
    }

    if( decoder->setScale( scale_denom ) > 1 ) // if decoder is JpegDecoder then decoder->setScale always returns 1
    {
        resize( mat, mat, Size( size.width / scale_denom, size.height / scale_denom ), 0, 0, INTER_LINEAR_EXACT);
    }

    /// optionally rotate the data if EXIF orientation flag says so
    if (!mat.empty() && (flags & IMREAD_IGNORE_ORIENTATION) == 0 && flags != IMREAD_UNCHANGED )
    {
        ApplyExifOrientation(decoder->getExifTag(ORIENTATION), mat);
    }
    
    return true;

What do you think about it? I think the function name imreaddata would be more appropriate.

We need a new function static bool findDecoderByName(String const& decoderName) method to find the right decoder in order to use imquery result decoderName. Then we can change the findDecoder part of overloaded imread function i guess.

brian-armstrong-discord force-pushed the export_img_decoder_encoder branch from efa3ea7 to 02eee62 Compare April 3, 2017 22:46

brian-armstrong-discord force-pushed the export_img_decoder_encoder branch from 02eee62 to 61aab2f Compare April 4, 2017 01:05

brian-armstrong-discord mentioned this pull request Apr 7, 2017

Export exif #8547

Closed

brian-armstrong-discord force-pushed the export_img_decoder_encoder branch 2 times, most recently from 172aa54 to c2e329e Compare April 25, 2017 18:29

export img decoders/encoders

ca5afce

brian-armstrong-discord force-pushed the export_img_decoder_encoder branch from c2e329e to ca5afce Compare April 25, 2017 21:15

Merge branch 'master' into export_img_decoder_encoder

85eb0d6

brian-armstrong-discord referenced this pull request Nov 2, 2017

imgcodecs: refactoring, improve code quality

999f41f

alalek mentioned this pull request Dec 20, 2017

imwrite multipage TIFF files #10367

Merged

brian-armstrong-discord mentioned this pull request Apr 25, 2018

nondeterministic libpng error: IDAT: unknown compression method discord/lilliput#30

Closed

Turim mentioned this pull request Apr 28, 2018

imgcodecs iface: add imquery() to get image size & decoder_type from the header #11415

Closed

alalek mentioned this pull request Dec 8, 2018

try to solve #13381 #13382

Merged

sturkmen72 mentioned this pull request Jul 29, 2020

imquery() and overloaded imread() functions #17753

Closed

6 tasks

brian-armstrong-discord closed this May 25, 2022

brian-armstrong-discord deleted the export_img_decoder_encoder branch May 25, 2022 00:15

Uh oh!

export img decoders/encoders #8511

export img decoders/encoders #8511

Uh oh!

Conversation

brian-armstrong-discord commented Apr 3, 2017

Uh oh!

brian-armstrong-discord commented Apr 3, 2017

Uh oh!

brian-armstrong-discord commented Apr 19, 2017

Uh oh!

alalek commented Apr 19, 2017

Uh oh!

brian-armstrong-discord commented Apr 19, 2017

Uh oh!

brian-armstrong-discord commented Apr 20, 2017

Uh oh!

alalek commented Apr 20, 2017

Uh oh!

brian-armstrong-discord commented Apr 21, 2017

Uh oh!

brian-armstrong-discord commented Apr 21, 2017

Uh oh!

brian-armstrong-discord commented Apr 21, 2017

Uh oh!

brian-armstrong-discord commented Apr 25, 2017

Uh oh!

alalek commented Apr 25, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brian-armstrong-discord commented May 2, 2017

Uh oh!

brian-armstrong-discord commented May 24, 2017

Uh oh!

alalek commented May 24, 2017

Uh oh!

Turim commented Jun 9, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brian-armstrong-discord commented Jun 21, 2017

Uh oh!

brian-armstrong-discord commented Jun 21, 2017

Uh oh!

Turim commented Jun 23, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brian-armstrong-discord commented Jun 23, 2017

Uh oh!

brian-armstrong-discord commented Sep 20, 2017

Uh oh!

sturkmen72 commented Nov 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

olalonde commented Nov 14, 2017

Uh oh!

alalek commented Nov 14, 2017

Uh oh!

Turim commented Nov 15, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alalek commented Nov 15, 2017

Uh oh!

Turim commented Nov 15, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brian-armstrong-discord commented Apr 25, 2018

Uh oh!

brian-armstrong-discord commented Aug 1, 2018

Uh oh!

asenyaev commented Apr 7, 2021

Uh oh!

ocpalo commented May 23, 2022

Uh oh!

alalek commented May 24, 2022

Uh oh!

ocpalo commented May 24, 2022

Uh oh!

Uh oh!

alalek commented Apr 25, 2017 •

edited

Loading

Turim commented Jun 9, 2017 •

edited

Loading

Turim commented Jun 23, 2017 •

edited

Loading

sturkmen72 commented Nov 14, 2017 •

edited

Loading

Turim commented Nov 15, 2017 •

edited

Loading

Turim commented Nov 15, 2017 •

edited

Loading