Windows cmd encoding change causes Python crash.

Posted by Alex on Stack Overflow
Published on 2009-05-18T17:52:10Z Indexed on 2011/01/09 4:53 UTC

First I change the Windows cmd code page to UTF-8 and run the Python interpreter:

    chcp 65001
    python

Then I try to print a Unicode string inside it, and when I do this Python crashes in a peculiar way (I just get the cmd prompt back in the same window).

    >>> import sys
    >>> print u'ëèæîð'.encode(sys.stdin.encoding)

Any ideas why it happens and how to make it work?

UPD: sys.stdin.encoding is 'cp65001'.

UPD2: It just occurred to me that the issue might be connected to the fact that UTF-8 is a multi-byte encoding (kcwu made a good point on that). I tried running the whole example with 'windows-1250' and got 'ëeaî?'. Windows-1250 is a single-byte encoding, so it worked for the characters it understands. However, I still have no idea how to make 'utf-8' work here.
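
To make the multi-byte versus single-byte difference concrete, here is a minimal sketch that encodes the same string both ways. It runs independently of the console code page; the 'replace' error handler is my own choice for illustration, and it only approximates what the console did in the experiment above (the console printed 'e' and 'a' for two of the characters, so its conversion is evidently not identical).

```python
text = u'\xeb\xe8\xe6\xee\xf0'  # the string 'ëèæîð' from the example above

# UTF-8 needs two bytes for each of these characters.
utf8_bytes = text.encode('utf-8')

# windows-1250 uses one byte per character; characters it does not
# contain are replaced with '?' because of the 'replace' error handler.
cp1250_bytes = text.encode('windows-1250', 'replace')

print(len(text), len(utf8_bytes), len(cp1250_bytes))
print(repr(cp1250_bytes))
```

The five characters become ten bytes in UTF-8 but five bytes in windows-1250, where only 'ë' and 'î' survive and the rest turn into '?'.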

UPD3: Oh, I found out it is a known Python bug. I guess what happens is that Python copies the cmd encoding as 'cp65001' to sys.stdin.encoding and tries to apply it to all input. Since it fails to understand 'cp65001', it crashes on any input that contains non-ASCII characters.
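
If the unknown codec name is really the cause, one possible workaround is to teach Python that 'cp65001' means UTF-8 by registering a codec search function. This is only a sketch under that assumption: the function name cp65001_as_utf8 is hypothetical, and the code addresses the codec lookup only, so printing to a console set to code page 65001 may still misbehave for other reasons.

```python
import codecs


def cp65001_as_utf8(name):
    """Hypothetical search function: resolve 'cp65001' to the UTF-8 codec."""
    if name.lower() == 'cp65001':
        return codecs.lookup('utf-8')
    return None  # leave every other name to the remaining search functions


# Register before anything tries to encode or decode with 'cp65001'.
codecs.register(cp65001_as_utf8)

# The lookup now succeeds and produces ordinary UTF-8 bytes.
encoded = u'\xeb\xe8\xe6\xee\xf0'.encode('cp65001')
print(repr(encoded))
```

To affect the interactive interpreter, this would have to run at startup, before the first non-ASCII input is processed. Newer Python versions may already recognise 'cp65001', in which case the registration is harmless.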
