Should I strip the XML declaration from suds output before parsing with lxml?

Posted by mikl on Stack Overflow
Published on 2010-03-16T21:21:05Z

Filed under: python, unicode, lxml, soap, suds

I’m trying to implement a SOAP webservice in Python 2.6 using the suds library. That is working well, but I’ve run into a problem when trying to parse the output with lxml.

Suds returns a suds.sax.text.Text object containing the reply from the SOAP service. The suds.sax.text.Text class is a subclass of Python's built-in unicode type, so the reply is essentially equivalent to this Python expression:

u'<?xml version="1.0" encoding="utf-8" ?><root><lotsofelements /></root>'

This is incongruous: if the XML declaration were correct, the contents would be UTF-8 encoded bytes, not a Python unicode object (which holds already-decoded characters in some internal representation such as UCS-4).

lxml refuses to parse this (it raises a ValueError), as documented, since for a unicode string that carries an encoding declaration there is no clear answer as to which encoding it should be interpreted in.

As I see it, there are two ways out of this bind:

1. Strip the <?xml ... ?> declaration, including the encoding.
2. Convert the output from Suds into a bytestring, using the specified encoding.
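
For illustration, the two options could be sketched roughly as follows. This is a minimal sketch, not a tested recipe: the helper names strip_xml_declaration and encode_as_declared are hypothetical, the regular expressions assume the declaration sits at the very start of the reply, and the reply string is the toy example from above rather than real suds output.

```python
import re

# Matches a leading XML declaration such as <?xml version="1.0" encoding="utf-8" ?>.
XML_DECLARATION = re.compile(r'\A\s*<\?xml\s[^>]*\?>\s*')

# Captures the encoding named inside that declaration, if there is one.
DECLARED_ENCODING = re.compile(
    r'''\A\s*<\?xml\s[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']''')


def strip_xml_declaration(text):
    """Option 1 (hypothetical helper): drop the leading XML declaration."""
    return XML_DECLARATION.sub(u'', text, count=1)


def encode_as_declared(text, default='utf-8'):
    """Option 2 (hypothetical helper): encode the text using the declared encoding.

    Falls back to `default` when the declaration names no encoding.
    """
    match = DECLARED_ENCODING.match(text)
    encoding = match.group(1) if match else default
    return text.encode(encoding)


reply = u'<?xml version="1.0" encoding="utf-8" ?><root><lotsofelements /></root>'

without_declaration = strip_xml_declaration(reply)  # still a unicode string
as_bytes = encode_as_declared(reply)                # a bytestring, declaration intact
```

As far as I understand the lxml documentation, either result should be acceptable to lxml.etree.fromstring: a unicode string with no encoding declaration, or a bytestring whose declaration matches its actual encoding.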

Currently, the data I’m receiving from the webservice is within the ASCII range, so either way will work. But both feel very much like ugly hacks to me, and I’m not quite sure what would happen if I started to receive data that needs a wider range of Unicode characters.
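
To make that worry concrete, here is a small sketch of what encoding does once the reply leaves the ASCII range. The reply below is made up for illustration, and ISO-8859-1 stands in for any encoding narrower than UTF-8. The assumption is that the bytestring route stays safe only as long as the encoding used can represent every character in the reply.

```python
# Hypothetical reply with characters outside ASCII; the snowman (U+2603)
# is also outside Latin-1.
reply = (u'<?xml version="1.0" encoding="utf-8" ?>'
         u'<root><name>Sm\u00f8rrebr\u00f8d \u2603</name></root>')

# UTF-8 covers the full Unicode range, so encoding and decoding
# gives back exactly the same text.
as_utf8 = reply.encode('utf-8')
round_trip_ok = as_utf8.decode('utf-8') == reply

# A narrower encoding cannot represent every character in the reply.
try:
    reply.encode('iso-8859-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

Stripping the declaration sidesteps this question entirely, since the parser is then handed text that has already been decoded.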

Any good ideas? I can’t imagine I’m the first one in this position…
