pydev - Page 5 - Developer IT

Debugging Django project problem.

- by Wasim

Hi all, I asked this question before, but had no replies, maybe I wasn't so clear. I'm trying to debug a django project using MySQL database. If I run the admin or trying to use the shell to communicate to the data base every thing is well and I can do every thing. I installed MySQLdb for Python 2.6. I installed PyDev on my Apatana studio. Configured the Debugging with runserver 8001 --noreload. When I start debugging , When I arrive to the following code in C:\Python26\Lib\site-packages\django\db\backends\mysql\base.py try: import MySQLdb as Database except ImportError, e: from django.core.exceptions import ImproperlyConfigured raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e) I get an import error : django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: DLL load failed: The specified module could not be found. I trying to ge more deeply with the import MySQLdb as Database line , it goes to the C:\Python26\Lib\site-packages\MySQLdb__init__.py and fail in the line import _mysql. I can't understand the problem. When running the Django admin every thing is ok, but with debugging it fails to work. Any help please. Thanks in advance.

Read the article

eclipse django using wrong settings.py in pythonpath

- by user1290264

I have pydev/django installed in eclipse, and it runs fine. However, after adding a second django project to eclipse and running the server ('http://127.0.0.1:8000') the pythonpath seems to be stuck on project2 even when I run project1. As a summary, I have two django projects: project1, project2. When I run the django server for project1 I get: Validating models... 0 errors found Django version 1.5, using settings 'project1.settings' Development server is running at 'http://127.0.0.1:8000/' Quit the server with CTRL-BREAK. The above seems to suggest that django is using the correct settings file; however, when I go to 'http://127.0.0.1:8000/' it displays the urls from project2. Also, if I go to 'http://127.0.0.1:8000/admin' the models are getting pulled from the sqlite.db file in project2 as well. I've even tried removing project2 from eclipse entirely and now at 'http://127.0.0.1:8000/admin' I get this error: Python Path: ['C:\Users\Brad\workspaces\In Progress\project2', 'C:\Users\Brad\workspaces\In Progress\project2', 'C:\Python27\DLLs', 'C:\Python27\lib', 'C:\Python27\lib\plat-win', 'C:\Python27\lib\lib-tk', 'C:\Python27', 'C:\Python27\lib\site-packages', 'C:\Windows\system32\python27.zip'] If I run the server on a different port with project1 the path seems to be fine: runserver 7000 --noreload Then 'http://127.0.0.1:7000/' uses project1's paths, but it doesn't seem like I should have to do this. Note: I have setup the run configurations as correctly as I know how. In the main tab, the project and main module both point to the correct project (project1), and the "PYTHONPATH that will be used in the run:" includes project1. Also, I have cleared my browser history, cookies, and everything that chrome would let me delete.

Read the article

Why can't I include these data files in a Python distribution using distutils?

- by froadie

I'm writing a setup.py file for a Python project so that I can distribute it. The aim is to eventually create a .egg file, but I'm trying to get it to work first with distutils and a regular .zip. This is an eclipse pydev project and my file structure is something like this: ProjectName src somePackage module1.py module2.py ... config propsFile1.ini propsFile2.ini propsFile3.ini setup.py Here's my setup.py code so far: from distutils.core import setup setup(name='ProjectName', version='1.0', packages=['somePackage'], data_files = [('config', ['..\config\propsFile1.ini', '..\config\propsFile2.ini', '..\config\propsFile3.ini'])] ) When I run this (with sdist as a command line parameter), a .zip file gets generated with all the python files - but the config files are not included. I thought that this code: data_files = [('config', ['..\config\propsFile1.ini', '..\config\propsFile2.ini', '..\config\propsFile3.ini'])] indicates that those 3 specified config files should be copied to a "config" directory in the zip distribution. Why is this code not accomplishing anything? What am I doing wrong? (I have also tried playing around with the paths of the config files... But nothing seems to help. Would Python throw an error or warning if the path was incorrect / file was not found?)

Read the article

What are simple instructions for creating a Python package structure and egg?

- by froadie

I just completed my first (minor) Python project, and my boss wants me to package it nicely so that it can be distributed and called from other programs easily. He suggested I look into eggs. I've been googling and reading, but I'm just getting confused. Most of the sites I'm looking at explain how to use Python eggs that were already created, or how to create an egg from a setup.py file (which I don't yet have). All I have now is an Eclipse pydev project with about 4 modules and a settings/configuration file. In easy steps, how do I go about structuring it into folders/packages and compiling it into an egg? And once it's an egg, what do I have to know about deploying/building/using it? I'm really starting from scratch here, so don't assume I know anything; simple step-by-step instructions would be really helpful... These are some of the sites that I've been looking at so far: http://peak.telecommunity.com/DevCenter/PythonEggs http://www.packtpub.com/article/writing-a-package-in-python http://www.ibm.com/developerworks/library/l-cppeak3.html#N10232 I've also browsed a few SO questions but haven't really found what I need. Thanks!

Read the article

wxPython: How to handle event binding and Show() properly.

- by Gopal

Hi all, I'm just starting out with wxPython and this is what I would like to do: a) Show a Frame (with Panel inside it) and a button on that panel. b) When I press the button, a dialog box pops up (where I can select from a choice). c) When I press ok on dialog box, the dialog box should disappear (destroyed), but the original Frame+Panel+button are still there. d) If I press that button again, the dialog box will reappear. My code is given below. Unfortunately, I get the reverse effect. That is, a) The Selection-Dialog box shows up first (i.e., without clicking on any button since the TopLevelframe+button is never shown). b) When I click ok on dialog box, then the frame with button appears. c) Clicking on button again has no effect (i.e., dialog box does not show up again). What am I doing wrong ? It seems that as soon as the frame is initialized (even before the .Show() is called), the dialog box is initialized and shown automatically. I am doing this using Eclipse+Pydev on WindowsXP with Python 2.6 ============File:MainFile.py=============== import wx import MyDialog #This is implemented in another file: MyDialog.py class TopLevelFrame(wx.Frame): def __init__(self,parent,id): wx.Frame.__init__(self,parent,id,"Test",size=(300,200)) panel=wx.Panel(self) button=wx.Button(panel, label='Show Dialog', pos=(130,20), size=(60,20)) # Bind EVENTS --> HANDLERS. button.Bind(wx.EVT_BUTTON, MyDialog.start(self)) # Run the main loop to start program. if __name__=='__main__': app=wx.PySimpleApp() TopLevelFrame(parent=None, id=-1).Show() app.MainLoop() ============File:MyDialog.py=============== import wx def start(parent): inputbox = wx.SingleChoiceDialog(None,'Choose Fruit', 'Selection Title', ['apple','banana','orange','papaya']) if inputbox.ShowModal()==wx.ID_OK: answer = inputbox.GetStringSelection() inputbox.Destroy()

Read the article

MySQLDB query not returning all rows

- by RBK

I am trying to do a simple fetch using MySQLDB in Python. I have 2 tables(Accounts & Products). I have to look up Accounts table, get acc_id from it & query the Products table using it. The Products tables has more than 10 rows. But when I run this code it randomly returns between 0 & 6 rows each time I run it. Here's the code snippet: # Set up connection con = mdb.connect('db.xxxxx.com', 'user', 'password', 'mydb') # Create cursor cur = con.cursor() # Execute query cur.execute("SELECT acc_id FROM Accounts WHERE ext_acc = '%s'" % account_num ) # account_num is alpha-numberic and is got from preceding part of the program # A tuple is returned, so get the 0th item from it acc_id = cur.fetchone()[0] print "account_id = ", acc_id # Close the cursor - I was not sure if I can reuse it cur.close() # Reopen the cursor cur = con.cursor() # Second query cur.execute("SELECT * FROM Products WHERE account_id = %d" % acc_id) keys = cur.fetchall() print cur.rowcount # This prints incorrect row count for key in keys: # Does not print all rows. Tried to directly print keys instead of iterating - same result :( print key # Closing the cursor & connection cur.close() con.close() The weird part is, I tried to step through the code using a debugger(PyDev on Eclipse) and it correctly gets all rows(both the value stored in the variable 'keys' as well as console output are correct). I am sure my DB has correct data since I ran the same SQL on MySQL console & got the correct result. Just to be sure I was not improperly closing the connection, I tried using with con instead of manually closing the connection and it's the same result. I did RTFM but I couldn't find much in it to help me with this issue. Where am I going wrong? Thank you. EDIT: I noticed another weird thing now. In the line cur.execute("SELECT * FROM Products WHERE account_id = %d" % acc_id), I hard-coded the acc_id value, i.e made it cur.execute("SELECT * FROM Products WHERE account_id = %d" % 322) and it returns all rows

Read the article

What IDE to use for Python

- by husayt

As a Python newbie, it is interesting to know what IDE's ("GUIs/editors") others use for Python coding. If you can just give the name (e.g. Textpad, Eclipse ..) that will be enough. If it is already mentioned, you can just vote for it. But if you can also give some more comparative information, that will be much appreciated. Thanks. Update: Results so far PyDev with Eclipse (CP, F, AC, PD, EM, SI, MLS, UML, SC, UT, LN, CF, BM) Komodo (CP, C/F, MLS, PD, AC, SC, SI, BM, LN, CF, CT) Emacs (CP, F, AC, MLS, PD, EM, SC, SI, BM, LN, CF, CT, UT, UML) Vim (CP, F, AC, MLS, SI, BM, LN, CF ) TextMate (Mac, CT, CF, MLS, SI, BM, LN) Gedit (Linux, F, AC, MLS, BM, LN, CT [sort of]) Idle (CP, F, AC) PIDA (Linux, CP, F, AC, MLS, SI, BM, LN, CF)(VIM Based) NotePad++ (Windows) BlueFish (Linux) JEdit (CP, F, BM, LN, CF, MLS) E-Texteditor (TextMate Clone for Windows) WingIde (CP, C, AC, MLS (support for C), PD, EM, SC, SI, BM, LN, CF, CT, UT) Eric Ide (CP, F, AC, PD, EM, SI, LN, CF, UT) Pyscripter (Windows, F, AC, PD, EM, SI, LN, CT, UT) ConTEXT (Windows, C) SPE (F, AC, UML) SciTE (CP, F, MLS, EM, BM, LN, CF, CT, SH) Zeus (W, C, BM, LN, CF, SI, SC, CT) NetBeans (CP, F, PD, UML, AC, MLS, SC, SI, BM, LN, CF, CT, UT, RAD) DABO (CP) BlackAdder (C, CP, CF, SI) PythonWin (W, F, AC, PD, SI, BM, CF) Geany (CP, F, very limited AC, MLS, SI, BM, LN, CF) UliPad (CP, F, AC, PD, MLS, SI, LI, CT, UT, BM) Boa Constructor (CP, F, AC, PD, EM, SI, BM, LN, UML, CF, CT) ScriptDev (W, C, AC, MLS, PD, EM, SI, BM, LN, CF, CT) Spider (CP, F, AC) Editra (CP, F, AC, MLS, SC, SI, BM, LN, CF) Pfaide (Windows, C, AC, MLS, SI, BM, LN, CF, CT) KDevelop (CP, F, MLS, SC, SI, BM, LN, CF) Acronyms used: CP - Cross Platfom C - Commercial F - Free AC - Automatic Code-completion MLS - Multi-Language Support PD - Integrated Python Debugging EM - ErrorMarkup SC - Source Control integration SI - Smart Indent BM - Bracket Matching LN - Line Numbering UML - UML editing / viewing CF - Code Folding CT - Code Templates UT - Unit Testing UID - Gui Designer (e.g. QT, Eric, ..) DB - integrated database support RAD - Rapid app development support I don't mention basics like Syntax highlighting as I expect these by default. This is a just dry list reflecting your feedback and comments, I am not advocating any of these tools. I will keep updating this list as you keep posting your answers. PS. Can you help me to add features of the above editors to the list (like autocomplete, debugging, or etc)?

Read the article

Which credentials should I put in for Google App Engine BulkLoader at development server?

- by Hoang Pham

Hello everyone, I would like to ask which kind of credentials do I need to put on for importing data using the Google App Engine BulkLoader class appcfg.py upload_data --config_file=models.py --filename=listcountries.csv --kind=CMSCountry --url=http://localhost:8178/remote_api vit/ And then it asks me for credentials: Please enter login credentials for localhost Here is an extraction of the content of the models.py, I use this listcountries.csv file class CMSCountry(db.Model): sortorder = db.StringProperty() name = db.StringProperty(required=True) formalname = db.StringProperty() type = db.StringProperty() subtype = db.StringProperty() sovereignt = db.StringProperty() capital = db.StringProperty() currencycode = db.StringProperty() currencyname = db.StringProperty() telephonecode = db.StringProperty() lettercode = db.StringProperty() lettercode2 = db.StringProperty() number = db.StringProperty() countrycode = db.StringProperty() class CMSCountryLoader(bulkloader.Loader): def __init__(self): bulkloader.Loader.__init__(self, 'CMSCountry', [('sortorder', str), ('name', str), ('formalname', str), ('type', str), ('subtype', str), ('sovereignt', str), ('capital', str), ('currencycode', str), ('currencyname', str), ('telephonecode', str), ('lettercode', str), ('lettercode2', str), ('number', str), ('countrycode', str) ]) loaders = [CMSCountryLoader] Every tries to enter the email and password result in "Authentication Failed", so I could not import the data to the development server. I don't think that I have any problem with my files neither my models because I have successfully uploaded the data to the appspot.com application. So what should I put in for localhost credentials? I also tried to use Eclipse with Pydev but I still got the same message :( Here is the output: Uploading data records. [INFO ] Logging to bulkloader-log-20090820.121659 [INFO ] Opening database: bulkloader-progress-20090820.121659.sql3 [INFO ] [Thread-1] WorkerThread: started [INFO ] [Thread-2] WorkerThread: started [INFO ] [Thread-3] WorkerThread: started [INFO ] [Thread-4] WorkerThread: started [INFO ] [Thread-5] WorkerThread: started [INFO ] [Thread-6] WorkerThread: started [INFO ] [Thread-7] WorkerThread: started [INFO ] [Thread-8] WorkerThread: started [INFO ] [Thread-9] WorkerThread: started [INFO ] [Thread-10] WorkerThread: started Password for [email protected]: [DEBUG ] Configuring remote_api. url_path = /remote_api, servername = localhost:8178 [DEBUG ] Bulkloader using app_id: abc [INFO ] Connecting to /remote_api [ERROR ] Exception during authentication Traceback (most recent call last): File "D:\Projects\GoogleAppEngine\google_appengine\google\appengine\tools\bulkloader.py", line 2802, in Run request_manager.Authenticate() File "D:\Projects\GoogleAppEngine\google_appengine\google\appengine\tools\bulkloader.py", line 1126, in Authenticate remote_api_stub.MaybeInvokeAuthentication() File "D:\Projects\GoogleAppEngine\google_appengine\google\appengine\ext\remote_api\remote_api_stub.py", line 488, in MaybeInvokeAuthentication datastore_stub._server.Send(datastore_stub._path, payload=None) File "D:\Projects\GoogleAppEngine\google_appengine\google\appengine\tools\appengine_rpc.py", line 344, in Send f = self.opener.open(req) File "C:\Python25\lib\urllib2.py", line 381, in open response = self._open(req, data) File "C:\Python25\lib\urllib2.py", line 399, in _open '_open', req) File "C:\Python25\lib\urllib2.py", line 360, in _call_chain result = func(*args) File "C:\Python25\lib\urllib2.py", line 1107, in http_open return self.do_open(httplib.HTTPConnection, req) File "C:\Python25\lib\urllib2.py", line 1082, in do_open raise URLError(err) URLError: <urlopen error (10061, 'Connection refused')> [INFO ] Authentication Failed Thank you!

Read the article

Python - Bitmap won't draw/display

- by Wallter

I have been working on this project for some time now - it was originally supposed to be a test to see if, using wxPython, I could build a button 'from scratch.' From scratch means: that i would have full control over all the aspects of the button (i.e. controlling the BMP's that are displayed... what the event handlers did... etc.) I have run into several problems (as this is my first major python project.) Now, when the all the code is working for the life of me I can't get an image to display. Basic code - not working dc = wx.BufferedPaintDC(self) dc.SetFont(self.GetFont()) dc.SetBackground(wx.Brush(self.GetBackgroundColour())) dc.Clear() dc.DrawBitmap(wx.Bitmap("/home/wallter/Desktop/Mouseover.bmp"), 100, 100) self.Refresh() self.Update() Full Main.py import wx from Custom_Button import Custom_Button from wxPython.wx import * ID_ABOUT = 101 ID_EXIT = 102 class MyFrame(wx.Frame): def __init__(self, parent, ID, title): wxFrame.__init__(self, parent, ID, title, wxDefaultPosition, wxSize(400, 400)) self.CreateStatusBar() self.SetStatusText("Program testing custom button overlays") menu = wxMenu() menu.Append(ID_ABOUT, "&About", "More information about this program") menu.AppendSeparator() menu.Append(ID_EXIT, "E&xit", "Terminate the program") menuBar = wxMenuBar() menuBar.Append(menu, "&File"); self.SetMenuBar(menuBar) # The call for the 'Experiential button' self.Button1 = Custom_Button(parent, -1, wx.Point(100, 100), wx.Bitmap("/home/wallter/Desktop/Mouseover.bmp"), wx.Bitmap("/home/wallter/Desktop/Normal.bmp"), wx.Bitmap("/home/wallter/Desktop/Click.bmp")) # The following three lines of code are in place to try to get the # Button1 to display (trying to trigger the Paint event (the _onPaint.) # Because that is where the 'draw' functions are. self.Button1.Show(true) self.Refresh() self.Update() # Because the Above three lines of code did not work, I added the # following four lines to trigger the 'draw' functions to test if the # '_onPaint' method actually worked. # These lines do not work. dc = wx.BufferedPaintDC(self) dc.SetFont(self.GetFont()) dc.SetBackground(wx.Brush(self.GetBackgroundColour())) dc.DrawBitmap(wx.Bitmap("/home/wallter/Desktop/Mouseover.bmp"), 100, 100) EVT_MENU(self, ID_ABOUT, self.OnAbout) EVT_MENU(self, ID_EXIT, self.TimeToQuit) def OnAbout(self, event): dlg = wxMessageDialog(self, "Testing the functions of custom " "buttons using pyDev and wxPython", "About", wxOK | wxICON_INFORMATION) dlg.ShowModal() dlg.Destroy() def TimeToQuit(self, event): self.Close(true) class MyApp(wx.App): def OnInit(self): frame = MyFrame(NULL, -1, "wxPython | Buttons") frame.Show(true) self.SetTopWindow(frame) return true app = MyApp(0) app.MainLoop() Full CustomButton.py import wx from wxPython.wx import * class Custom_Button(wx.PyControl): def __init__(self, parent, id, Pos, Over_BMP, Norm_BMP, Push_BMP, **kwargs): wx.PyControl.__init__(self,parent, id, **kwargs) self.Bind(wx.EVT_LEFT_DOWN, self._onMouseDown) self.Bind(wx.EVT_LEFT_UP, self._onMouseUp) self.Bind(wx.EVT_LEAVE_WINDOW, self._onMouseLeave) self.Bind(wx.EVT_ENTER_WINDOW, self._onMouseEnter) self.Bind(wx.EVT_ERASE_BACKGROUND,self._onEraseBackground) self.Bind(wx.EVT_PAINT,self._onPaint) self.pos = Pos self.Over_bmp = Over_BMP self.Norm_bmp = Norm_BMP self.Push_bmp = Push_BMP self._mouseIn = False self._mouseDown = False def _onMouseEnter(self, event): self._mouseIn = True def _onMouseLeave(self, event): self._mouseIn = False def _onMouseDown(self, event): self._mouseDown = True def _onMouseUp(self, event): self._mouseDown = False self.sendButtonEvent() def sendButtonEvent(self): event = wx.CommandEvent(wx.wxEVT_COMMAND_BUTTON_CLICKED, self.GetId()) event.SetInt(0) event.SetEventObject(self) self.GetEventHandler().ProcessEvent(event) def _onEraseBackground(self,event): # reduce flicker pass def Iz(self): dc = wx.BufferedPaintDC(self) dc.DrawBitmap(self.Norm_bmp, 100, 100) def _onPaint(self, event): # The printing functions, they should work... but don't. dc = wx.BufferedPaintDC(self) dc.SetFont(self.GetFont()) dc.SetBackground(wx.Brush(self.GetBackgroundColour())) dc.Clear() dc.DrawBitmap(self.Norm_bmp) # This never printed... I don't know if that means if the EVT # is triggering or what. print '_onPaint' # draw whatever you want to draw # draw glossy bitmaps e.g. dc.DrawBitmap if self._mouseIn: # If the Mouse is over the button dc.DrawBitmap(self.Over_bmp, self.pos) else: # Since the mouse isn't over it Print the normal one # This is adding on the above code to draw the bmp # in an attempt to get the bmp to display; to no avail. dc.DrawBitmap(self.Norm_bmp, self.pos) if self._mouseDown: # If the Mouse clicks the button dc.DrawBitmap(self.Push_bmp, self.pos) This code won't work? I get no BMP displayed why? How do i get one? I've gotten the staticBitmap(...) to display one, but it won't move, resize, or anything for that matter... - it's only in the top left corner of the frame? Note: the frame is 400pxl X 400pxl - and the "/home/wallter/Desktop/Mouseover.bmp"

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Developer IT