-
-
Notifications
You must be signed in to change notification settings - Fork 475
Don't apply any text encoding to binary input data #557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't apply any text encoding to binary input data #557
Conversation
I have not looked into why this CI test fails. Have you? |
2bc0afe
to
bd3dba0
Compare
Should hopefully be fixed now, the decoder side and the test data were both also applying UTF-8 for |
First I thought, there might very well be something off with the encoder because I did not spend too much time and thought in this area. Now that I see what you changed on the decoding side I believe I understand the issue. My current mental model says that the code is correct as it is. The problem is that you used the PDFEncoder incorrectly. For the decoder, the API is defined to return UTF8 encoded bits in the I understand that this is not very user friendly and also counter intuitive but your change would break the API as specified. We could add some kind of |
For reference, that's the user code in question: https://invent.kde.org/frameworks/prison/-/blob/master/src/lib/pdf417barcode.cpp - this used to work for binary content pre-2.0 as-is. I'll try if applying UTF-8 encoding to the binary data works, but that sounds very wrong and error-prone. |
No luck so far, I either end up with an exception going back to an |
I would believe you would need to use the I checked how zint handles this. The library treats every input as UTF-8, except when the I could personally also live with a 'hack' that is somewhat similar to your approach above with the difference, that it would only be triggered if there is an explicit use of |
This fixes barcode generation for arbitrary binary input failing entirely or producing incomplete/incorrect results. Observed for example with Hungarian domestic railway tickets (zlib compressed data in PDF417).
bd3dba0
to
cdc8153
Compare
Using the Bypassing this by using the Updated the PR to a more minimal version that only applies to And yes, I agree that an explicit API for binary data would probably be best. |
In my above mentioned mental model the |
Correct, my problem is also fixed without the I don't have a minimized test case unfortunately, but I'll send you the full 202 byte sequence that fails here. |
Thanks for the data. I think I have an idea what is going wrong. Your user code
it should work, at least the bailing out of the zueci code stops. |
No luck with that so far, still the same "Unexpected charcode" exception. However, this does make a difference in the input data, now each character is indeed between 0 and 255 as one would expect, checking the same on the previous code fails (contained chars < 0, as you pointed out). So that's definitely a step in the right direction. I'll do some more checking tomorrow to be sure we don't mess up the input data anywhere else. |
I know what is missing (forgot to mention it): you need to explicitly specify the character set:
|
Ah, that works! Thank you very much for the help, and sorry for (again) having bothered you with what turned out to be a bug on our side! |
There's two issues here: - The QByteArray to std::wstring conversion has to use unsigned bytes, the previous code did an implicit conversion to int producing wrong results for value above 127. - We now have to explicitly specify we are inputting binary data. See also zxing-cpp/zxing-cpp#557. This fixes for example domestic Hungarian railway ticket barcodes not being shown in KDE Itinerary anymore.
There's two issues here: - The QByteArray to std::wstring conversion has to use unsigned bytes, the previous code did an implicit conversion to int producing wrong results for value above 127. - We now have to explicitly specify we are inputting binary data. See also zxing-cpp/zxing-cpp#557. This fixes for example domestic Hungarian railway ticket barcodes not being shown in KDE Itinerary anymore.
There's two issues here: - The QByteArray to std::wstring conversion has to use unsigned bytes, the previous code did an implicit conversion to int producing wrong results for value above 127. - We now have to explicitly specify we are inputting binary data. See also zxing-cpp/zxing-cpp#557. This fixes for example domestic Hungarian railway ticket barcodes not being shown in KDE Itinerary anymore. (cherry picked from commit 6b234bc)
This fixes barcode generation for arbitrary binary input failing entirely or producing incomplete/incorrect results.
Observed for example with Hungarian domestic railway tickets (zlib compressed data in PDF417).