MultiFormatWriter: fix encoding binary std::string #599
While `MultiFormatWriter::encode(std::wstring)` can encode binary data just fine, `MultiFormatWriter::encode(std::string)` corrupts the data by trying to interpret it as UTF-8. This may also break immediately because not all byte combinations are allowed in UTF-8.

So when `_encoding` is set to `CharacterSet::BINARY`, the `std::string` needs to be converted to a `std::wstring` of unsigned values, because `TextEncoder::GetBytes(std::wstring)` calls `ToUtf8()`, which cannot handle negative values.

For example, without this commit, try:

    auto writer = MultiFormatWriter(BarcodeFormat::QRCode)
        .setEncoding(CharacterSet::BINARY);
    auto bitmap = writer.encode(std::string("\x7e\x7f\x80\x81"), 200, 200);

Which will result in: "ValueError: Unexpected charcode".

Please note ZXingWriter is NOT affected by this! This works just fine:

    $ printf "\\x7e\\x7f\\x80\\x81" > file
    $ example/ZXingWriter -binary QRCode file out.png

Because:

1) it's calling `MultiFormatWriter::encode(std::wstring)` and
2) it builds the `std::wstring` from `uint8_t`
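The conversion described above can be sketched in isolation, without any ZXing types. `BytesToWide` is a hypothetical helper name used only for illustration; it is not part of the library and not necessarily how this commit implements the fix. The point is the intermediate cast to `unsigned char`, which keeps bytes at or above `0x80` from turning into negative values on platforms where `char` is signed.

```cpp
#include <string>

// Hypothetical helper (not part of the ZXing API): widen every byte as an
// unsigned value, so 0x80..0xFF stay 0x80..0xFF instead of becoming negative
// when plain char is a signed type.
std::wstring BytesToWide(const std::string& bytes)
{
    std::wstring wide;
    wide.reserve(bytes.size());
    for (char c : bytes)
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    return wide;
}
```

Calling `BytesToWide(std::string("\x7e\x7f\x80\x81", 4))` yields four wide characters with the values `0x7e`, `0x7f`, `0x80` and `0x81`, which is the form the `std::wstring` overload is said to handle correctly.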
Your fix/workaround is exactly what I did in
Okay, I look forward to the new API then 😉
@axxel DataMatrix seems to be especially problematic here, probably because of this transformation. What surprised me a little is that even UTF-8 sequences apparently cannot be encoded at the moment (at least with the stock
Here's a little sample for trying
On the other hand,
So before taking action I thought it would be wise to check your opinion on that.
Exactly. This has indeed come up already earlier this year (see here).
Very wise ;). I have to admit that my progress on the new Writer API has stalled recently. I have an early prototypish hack lying around, but there was a lack of time and there are still quite a few open questions regarding the types used in the API (as mentioned in #332). If you need this 'now' I see two options:
Thanks for the quick reply! I think I'll try to hack something together then as it is a bit urgent, but of course, I would always happily be the first to alpha-test your solution! 😉 If you maybe already have a concrete idea, please tell me so 😉
For a QRCode, `MultiFormatWriter::encode(std::string("\x7e\x7f\x80\x81"))` will run `ToUtf8(str)`; `str` will now be `0x7e, 0x7f, 0xff, 0xbf, 0xbe, 0x80, 0xff, 0xbf, 0xbe, 0x81`.
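The four input bytes in this example are not valid UTF-8 to begin with, which is why treating a binary `std::string` as UTF-8 text cannot round-trip. A minimal sketch of a structural check makes this concrete. `LooksLikeUtf8` is a hypothetical name, not a ZXing function, and it deliberately checks only lead and continuation byte patterns; it does not reject overlong encodings, surrogates, or code points above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, illustrative check (not a ZXing function): verifies only that
// each lead byte is followed by the right number of 10xxxxxx continuation bytes.
bool LooksLikeUtf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (lead < 0x80)
            extra = 0; // single-byte ASCII
        else if ((lead & 0xE0) == 0xC0)
            extra = 1; // 110xxxxx starts a 2-byte sequence
        else if ((lead & 0xF0) == 0xE0)
            extra = 2; // 1110xxxx starts a 3-byte sequence
        else if ((lead & 0xF8) == 0xF0)
            extra = 3; // 11110xxx starts a 4-byte sequence
        else
            return false; // stray continuation byte or invalid lead byte

        if (extra > s.size() - i - 1)
            return false; // sequence is cut off at the end of the input

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80)
                return false; // expected a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

For `std::string("\x7e\x7f\x80\x81", 4)` the check fails at the third byte, because `0x80` is a continuation byte with no lead byte in front of it.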