Is there any python module to convert PDF files into text? I tried one piece of code found in Activestate which uses pypdf but the text generated had no space between and was of no use.
I am using python lxml library to parse html pages:
import lxml.html
# this might run indefinitely
page = lxml.html.parse('http://stackoverflow.com/')
Is there any way to set timeout for parsing?
Hey im new to python. How do you get a portion of a list by the relative value of its sorting key.
example...
list = [11,12,13,14,15,16,1,2,3,4,5,6,7,8,9,10]
list.sort()
newList = list.split("all numbers that are over 13")
assert newList == [14,15,16]
I wanted to know if there was a way I can get my python script located on a shared web hosting provider to read the contents of a folder on my desktop and list out the contents?
Can this be done using tempfiles?
Hi,
I have a script where I launch with popen a shell command.
The problem is that the script don't wait that popen command is finished and go forward.
om_points = os.popen(command, "w")
.....
How can I tell to my python script to wait until the shell command has finished?
Thanks.
pt=[2]
pt[0]=raw_input()
when i do this , and give an input suppose 1011 , it says list indexing error- " list assignment index out of range" . may i know why? i think i am not able to assign a list properly . how to assign an array of 2 elements in python then?
I use the following method to break the double loop in Python.
for word1 in buf1:
find = False
for word2 in buf2:
...
if res == res1:
print "BINGO " + word1 + ":" + word2
find = True
if find:
break
Is there better way to break the double loop?
Hi, I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. the command is ".import .....". but it seems that it cannot work like this. Can anyone give me an example of how to do it in sqlite3? I am using windows just in case.
Thanks
If possible I would like to use the following structure for a command however I can't seem to figure out how to achieve this in Python:
./somescript.py arg <optional argument> -- "some long argument"
Would it be possible to achieve this in a feasible manner without too much dirty code? Or should I just reconsider the syntax (which is primarily preference).
Thanks!
I have a SimpleXMLRPCServer server (Python).
How can I get the IP address of the client in the request handler?
This information appears in the log. However, I am not sure how to access this information from within the request handler.
I'm trying to rewrite the equivalent of the python replace() function without using regexp. Using this code, i've managed to get it to work with single chars, but not with more than one character:
def Replacer(self, find_char, replace_char):
s = []
for char in self.base_string:
if char == find_char:
char = replace_char
#print char
s.append(char)
s = ''.join(s)
my_string.Replacer('a','E')
Anybody have any pointers how to make this work with more than one character? example:
my_string.Replacer('kl', 'lll')
Ok, so I've read both PEP 8 and PEP 257, and I've written lots of docstrings for functions and classes, but I'm a little unsure about what should go in a module docstring. I figured, at a minimum, it should document the functions and classes that the module exports, but I've also seen a few modules that list author names, copyright information, etc. Does anyone have an example of how a good python docstring should be structured?
I'm looking for a way to programatically control a browser on a Mac (i.e. not IE) using Python.
The actions I need include following links, checking if elements exist in a page, and submitting forms.
Which solution would you recommend?
Thanks!
What tools are good to use for code analysis in python?
I have a large source repository split across multiple projects, and I would like to be able to run tools across the directories to see details like Cyclomatic Complexity, and perhaps be able to spot errors using static analysis.
Ideally, I would like to be able to produce a report about the health of the source code, so we can spot problem areas that need to be addressed.
Ignoring all the characteristics of each languages and focusing SOLELY on speed, which language is better performance-wise?
You'd think this would be a rather simple question to answer, but I haven't found a decent one.
I'm aware that some types of operations may be faster with python, and vice-versa, but I cannot find any detailed information on this. Can anyone shed some light on the performance differences?
I have a Python script that pulls in data from many sources (databases, files, etc.). Supposedly, all the strings are unicode, but what I end up getting is any variation on the following theme (as returned by repr()):
u'D\\xc3\\xa9cor'
u'D\xc3\xa9cor'
'D\\xc3\\xa9cor'
'D\xc3\xa9cor'
Is there a reliable way to take any four of the above strings and return the proper unicode string?
u'D\xe9cor' # --> Décor
The only way I can think of right now uses eval(), replace(), and a deep, burning shame that will never wash away.
I'm writing python package/module and would like the logging messages mention what module/class/function they come from. I.e. if I run this code:
import mymodule.utils.worker as worker
w = worker.Worker()
w.run()
I'd like to logging messages looks like this:
2010-06-07 15:15:29 INFO mymodule.utils.worker.Worker.run <pid/threadid>: Hello from worker
How can I accomplish this?
Thanks.
Hello,
Sometimes I break long conditions in IFs to several lines. The most obvious way to do this is:
if (cond1 == 'val1' and cond2 == 'val2' and
cond3 == 'val3' and cond4 == 'val4'):
do_something
Isn't very very appealing visually, because the action blends with the conditions. However, it is the natural way using correct Python indentation of 4 spaces.
Edit:
By the way, for the moment I'm using:
if ( cond1 == 'val1' and cond2 == 'val2' and
cond3 == 'val3' and cond4 == 'val4'):
do_something
Not very pretty, I know :-)
Can you recommend an alternative way ?
I have a variable in Python containing a floating point number (e.g. num = 24654.123), and I'd like to determine the number's precision and scale values (in the Oracle sense), so 123.45678 should give me (8,5), 12.76 should give me (4,2), etc.
I was first thinking about using the string representation (via str or repr), but those fail for large numbers:
>>> num = 1234567890.0987654321
>>> str(num) = 1234567890.1
>>> repr(num) = 1234567890.0987654
i have to send F2 key to telnet host how do i send it using python...using getch() i found that this '<' used for F2 key but while sending its not working..i think there will be some way to send special function keys but i am not able to find it..if somebody knows please help me.thanks in advance
Hi all,
I need to create daemon that will monitor certain directory and will process every file that's written to that particular path.
My choice is either java or python.
Did you guys have any experience using both technology? what is the best one?
Is there any free Python to C translator? for example capable to translate such lib as lib for Fast content-aware image resizing (which already depends on some C libs) to C classes and files?
Given a Python package, how can I automatically find all its sub-packages?
I used to have a function that would just browse the file system, looking for folders that have an __init__.py* file in them, but now I need a method that would work even if the whole package is in an egg.
If I point Firefox at http://bitbucket.org/tortoisehg/stable/wiki/Home/ReleaseNotes, I get a page of HTML. But if I try this in Python:
import urllib
site = 'http://bitbucket.org/tortoisehg/stable/wiki/Home/ReleaseNotes'
req = urllib.urlopen(site)
text = req.read()
I get the following:
500 Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
What am I doing wrong?