Python and Unicode: How everything should be Unicode

Posted by A A on Stack Overflow
Published on 2010-12-27T18:15:29Z Indexed on 2010/12/27 18:53 UTC

Filed under:

bytestring

Forgive me if this is a long question:

I have been programming in Python for around six months. I am self-taught, starting with the Python tutorial, then SO, and then just using Google for stuff.

Here is the sad part: no one told me all strings should be Unicode. No, I am not lying or making this up, but where does the tutorial mention it? Most examples I see also just use byte strings instead of Unicode strings. I was just browsing and came across this question on SO, which says that every string in Python should be a Unicode string. This pretty much made me cry!

I read that every string in Python 3.0 is Unicode by default, so my questions are for 2.x:

1. Should I write print u'Some text' or just print 'Text'?
2. If everything should be Unicode, does that mean that a tuple such as t = ('First', 'Second') should be t = (u'First', u'Second')? I read that I can do a from __future__ import unicode_literals and then every string will be a Unicode string, but should I also do this for strings inside a container?
3. When reading from or writing to a file, should I use the codecs module? Or should I just use the standard way of reading/writing and encode or decode where required?
4. If I get a string from, say, raw_input(), should I convert that to Unicode as well?
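
To make the file and raw_input() questions concrete, here is a minimal sketch of the "decode on the way in, encode on the way out" pattern. It is only an illustration under a few assumptions: the data is UTF-8, the helper names (to_text, write_text, read_text) and the demo file name are made up for this example, and raw_input() is simulated with a bytes literal so the snippet runs on both 2.x and 3.x. It uses io.open; codecs.open takes a similar encoding argument.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io
import os
import tempfile


def to_text(value, encoding="utf-8"):
    """Decode a byte string to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def write_text(path, text, encoding="utf-8"):
    # io.open encodes the text while writing, so callers only handle text.
    with io.open(path, "w", encoding=encoding) as handle:
        handle.write(text)


def read_text(path, encoding="utf-8"):
    # io.open decodes the bytes while reading and returns a text string.
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


# Hypothetical demo file in the system temp directory.
demo_path = os.path.join(tempfile.gettempdir(), "unicode_demo.txt")
write_text(demo_path, "caf\u00e9\n")
round_trip = read_text(demo_path)
os.remove(demo_path)

# raw_input() in 2.x returns a byte string; a bytes literal stands in for it
# here. The UTF-8 encoding is an assumption about the terminal.
typed = to_text(b"caf\xc3\xa9")
```

In a real 2.x program the encoding for raw_input() would more likely come from sys.stdin.encoding than from a hard-coded default, and that value can be missing when input is piped, so a fallback is worth having.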

What is the common approach to handling all of the above issues in 2.x? The from __future__ import unicode_literals statement?
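
As a small sketch of what that statement does: from __future__ import unicode_literals applies to every unprefixed string literal in the module, including literals inside tuples, lists, and dicts, while b'...' literals stay byte strings. The variable names below are illustrative only.

```python
from __future__ import unicode_literals

# Unprefixed literals inside a container are text strings too.
t = ('First', 'Second')

# type('') is unicode in 2.x with the import above, and str in 3.x.
text_type = type('')
all_text = all(isinstance(item, text_type) for item in t)

# An explicit b prefix still gives a byte string.
raw = b'bytes stay bytes'
raw_is_text = isinstance(raw, text_type)
```

The import only changes literals written in that source file; strings that arrive from files, sockets, or raw_input() are unaffected and still need decoding.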

Sorry for being such a noob, but this changes what I have been doing for a long time, so clearly I am confused.
