Conversation
coolreader18
left a comment
There was a problem hiding this comment.
Oh yea, so far this looks good! That is the correct way to add a new encoding; if that's what you were looking for. I'm not sure if unicode_encode_ucs1 is reusable for other encodings, since anything >ucs1 would be some way of splitting the ucs2/ucs4 into parts, at which point it's just a different encoding.
While I was implementing this, I came across something that stymied me. In PyPy, the |
643eb35 to
4b8e8a0
Compare
|
Originally posted this in #3118 but it makes more sense here re: the ucs1 helper function, I think it makes more sense to abstract over utf8/ascii than ascii/latin1, since we use a utf8 str type (so we can just validate + extend_from_slice for utf8/ascii) whereas CPython has UCS1 as a buffer (so they can just validate + memcpy for ascii/latin1). Finishing up #3118 made me remember why I did that; in our implementation of codecs, ascii and utf8 just naturally have a lot of shared code while the alternative is true in CPython. |
This implementation is patterned off of the ascii codec.
edee586 to
0f889ce
Compare
youknowone
left a comment
There was a problem hiding this comment.
looks good to me. @coolreader18 could you review this again?
Based on the PyPy implementation. The encoding function needs some work with respect to error handling.EDIT: Now based on @coolreader18'sasciicodec.Requesting @coolreader18 as a potential reviewer, as they wrote the architecture for the
encodingsmodule and the error handling function(s).