-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Implement latin_1 in Rust #3046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yea, so far this looks good! That is the correct way to add a new encoding; if that's what you were looking for. I'm not sure if unicode_encode_ucs1
is reusable for other encodings, since anything >ucs1
would be some way of splitting the ucs2/ucs4 into parts, at which point it's just a different encoding.
While I was implementing this, I came across something that stymied me. In PyPy, the |
643eb35
to
4b8e8a0
Compare
Originally posted this in #3118 but it makes more sense here re: the ucs1 helper function, I think it makes more sense to abstract over utf8/ascii than ascii/latin1, since we use a utf8 str type (so we can just validate + extend_from_slice for utf8/ascii) whereas CPython has UCS1 as a buffer (so they can just validate + memcpy for ascii/latin1). Finishing up #3118 made me remember why I did that; in our implementation of codecs, ascii and utf8 just naturally have a lot of shared code while the alternative is true in CPython. |
This implementation is patterned off of the ascii codec.
edee586
to
0f889ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. @coolreader18 could you review this again?
Based on the PyPy implementation. The encoding function needs some work with respect to error handling.EDIT: Now based on @coolreader18'sascii
codec.Requesting @coolreader18 as a potential reviewer, as they wrote the architecture for the
encodings
module and the error handling function(s).