Stream/string/bytearray transformations in Python 3

Posted by Craig McQueen on Stack Overflow
Published on 2009-07-29T01:04:56Z Indexed on 2011/02/10 23:26 UTC

Python 3 cleans up Python's handling of Unicode strings. I assume it is as part of this effort that the codecs in Python 3 have become more restrictive, judging by a comparison of the Python 3 documentation with the Python 2 documentation.

For example, codecs that conceptually convert a bytestream to a different form of bytestream have been removed:

  • base64_codec
  • bz2_codec
  • hex_codec
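For reference, here is a minimal sketch of the same bytes-to-bytes transformations done with the standard library modules `base64`, `bz2` and `binascii`, rather than through the codec machinery. The variable names are just for illustration, and this is one possible approach, not necessarily the "right way" the question asks about.

```python
import base64
import binascii
import bz2

data = b"hello world"

# base64_codec equivalent: bytes in, bytes out
b64 = base64.b64encode(data)
b64_roundtrip = base64.b64decode(b64)

# bz2_codec equivalent: compress and decompress a bytes object
compressed = bz2.compress(data)
bz2_roundtrip = bz2.decompress(compressed)

# hex_codec equivalent: two ASCII hex digits per input byte
hexed = binascii.hexlify(data)
hex_roundtrip = binascii.unhexlify(hexed)
```

Each of these functions takes and returns `bytes`, which matches the "bytestream to bytestream" view of the removed codecs.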

Codecs that conceptually convert Unicode to a different form of Unicode have also been removed (in Python 2 this one actually went between Unicode and a bytestream, but conceptually it's really Unicode to Unicode, I reckon):

  • rot_13
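To illustrate the "Unicode to Unicode" idea, rot_13 can be sketched as a plain `str` to `str` function built on `str.maketrans` and `str.translate`. The names `ROT13_TABLE` and `rot13` are hypothetical helpers made up for this example, not part of any library.

```python
import string

_lower = string.ascii_lowercase
_upper = string.ascii_uppercase

# Map each ASCII letter to the letter 13 places further on, wrapping around.
ROT13_TABLE = str.maketrans(
    _lower + _upper,
    _lower[13:] + _lower[:13] + _upper[13:] + _upper[:13],
)


def rot13(text: str) -> str:
    """Return text with ASCII letters rotated by 13; other characters pass through."""
    return text.translate(ROT13_TABLE)


example = rot13("Hello")
```

Because the alphabet has 26 letters, applying `rot13` twice gives back the original string, and no bytes are involved at any point.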

My main question is: what is the "right way" in Python 3 to do what these removed codecs used to do? They're not codecs in the strict sense but "transformations", although their interface and implementation would be very similar to those of codecs.

I don't care about rot_13, but I'm interested to know the "best way" to implement a transformation of line ending styles (Unix line endings vs Windows line endings). That should really be a Unicode-to-Unicode transformation done before encoding to a byte stream, especially when UTF-16 is being used, as discussed in this other SO question.
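As a rough sketch of what "transform the text first, then encode" could look like, the example below converts line endings on `str` objects and only afterwards encodes to UTF-16. The helper names `to_windows_newlines` and `to_unix_newlines` are hypothetical. It also shows `io.TextIOWrapper` with its `newline` parameter, which performs the same translation on write before the encoder sees the text.

```python
import io


def to_windows_newlines(text: str) -> str:
    """Convert line endings to CRLF, normalising first so existing CRLF is not doubled."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def to_unix_newlines(text: str) -> str:
    """Convert CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


text = "one\ntwo\n"

# Transform as text, then encode; utf-16-le is used here so no BOM is written.
encoded = to_windows_newlines(text).encode("utf-16-le")

# The same result via a text wrapper that translates "\n" on write.
buffer = io.BytesIO()
wrapper = io.TextIOWrapper(buffer, encoding="utf-16-le", newline="\r\n")
wrapper.write(text)
wrapper.flush()
via_wrapper = buffer.getvalue()
```

Doing the replacement on the `str` means each line ending becomes the two code points CR and LF before encoding, so in UTF-16 each of them is encoded as a full 16-bit unit rather than a single stray byte being inserted into the byte stream.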

© Stack Overflow or respective owner
