Search Results

Search found 13693 results on 548 pages for 'python metaprogramming'.

Page 137/548 | < Previous Page | 133 134 135 136 137 138 139 140 141 142 143 144  | Next Page >

  • Python web scraping involving HTML tags with attributes

    - by rohanbk
    I'm trying to make a web scraper that will parse a web page of publications and extract the authors. The skeletal structure of the web page is the following:

        <html>
        <body>
        <div id="container">
        <div id="contents">
        <table>
        <tbody>
        <tr>
        <td class="author">####I want whatever is located here ###</td>
        </tr>
        </tbody>
        </table>
        </div>
        </div>
        </body>
        </html>

    I've been trying to use BeautifulSoup and lxml thus far to accomplish this task, but I'm not sure how to handle the two div tags and the td tag because they have attributes. In addition, I'm not sure whether I should rely more on BeautifulSoup or lxml or a combination of both. What should I do? At the moment, my code looks like this:

        import re
        import urllib2,sys
        import lxml
        from lxml import etree
        from lxml.html.soupparser import fromstring
        from lxml.etree import tostring
        from lxml.cssselect import CSSSelector
        from BeautifulSoup import BeautifulSoup, NavigableString

        address='http://www.example.com/'
        html = urllib2.urlopen(address).read()
        soup = BeautifulSoup(html)
        html=soup.prettify()
        html=html.replace('&nbsp', '&#160')
        html=html.replace('&iacute','&#237')
        root=fromstring(html)

    I realize that a lot of the import statements may be redundant, but I just copied whatever I currently had in my source file. EDIT: I suppose that I didn't make this quite clear, but I have multiple tags in the page that I want to scrape.
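One possible way to pull out every author cell from markup shaped like the skeleton in this question is sketched below. It is not the asker's BeautifulSoup/lxml approach: it swaps in the standard library's `html.parser` and assumes Python 3, so that it runs without extra packages. The names `AuthorCollector` and `extract_authors` are made up for illustration, and the match on `class="author"` is exact (a cell with several classes would be missed).

```python
from html.parser import HTMLParser


class AuthorCollector(HTMLParser):
    """Collect the text of every <td class="author"> cell."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self._in_author_cell = False

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "td" and dict(attrs).get("class") == "author":
            self._in_author_cell = True
            self.authors.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_author_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so accumulate it.
        if self._in_author_cell:
            self.authors[-1] += data


def extract_authors(html):
    """Return the stripped text of all author cells, in document order."""
    parser = AuthorCollector()
    parser.feed(html)
    parser.close()
    return [author.strip() for author in parser.authors]
```

The enclosing div tags need no special handling here, because the parser reacts only to the td cells. With the old BeautifulSoup 3 API used in the question, a lookup such as `soup.findAll("td", {"class": "author"})` should play the same role.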

  • Common Pitfalls in Python

    - by Anurag Uniyal
    Today I was bitten again by "mutable default arguments" after many years. I usually don't use mutable default arguments unless needed, but I think with time I forgot about that, and today in the application I added tocElements=[] to a PDF generation function's argument list, and now the 'Table of Content' gets longer and longer after each invocation of "generate pdf" :) My question is: what other things should I add to my list of things I MUST avoid?

    * Mutable default arguments.
    * Always import modules the same way, e.g. from y import x and import x are different things; they are treated as different modules.
    * Do not use range in place of lists, because range() will become an iterator anyway. The following will fail:

          myIndexList = [0,1,3]
          isListSorted = myIndexList == range(3) # will fail in 3.0
          isListSorted = myIndexList == list(range(3)) # will not

      The same mistake can be made with xrange: `myIndexList == xrange(3)`.
    * Catching multiple exceptions:

          try:
              raise KeyError("hmm bug")
          except KeyError,TypeError:
              print TypeError

      It prints "hmm bug", though it is not a bug. It looks like we are catching exceptions of type KeyError and TypeError, but instead we are catching KeyError only, bound to the variable TypeError. Use this instead:

          try:
              raise KeyError("hmm bug")
          except (KeyError,TypeError):
              print TypeError
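The first and last pitfalls in this list can be reproduced in a few lines. The sketch below assumes Python 3 (where the exception clause is spelled `except (A, B) as name`); the function names are hypothetical and stand in for the PDF function from the question.

```python
def add_entry_buggy(entry, toc_elements=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits toc_elements.
    toc_elements.append(entry)
    return toc_elements


def add_entry(entry, toc_elements=None):
    # Use None as a sentinel and build a fresh list on each call.
    if toc_elements is None:
        toc_elements = []
    toc_elements.append(entry)
    return toc_elements


def classify(exc_to_raise):
    # A tuple catches either type; "as exc" binds the caught instance.
    try:
        raise exc_to_raise
    except (KeyError, TypeError) as exc:
        return type(exc).__name__
```

Calling `add_entry_buggy` twice without a second argument returns a list that keeps growing, which is the behaviour described for the table of contents; `add_entry` starts from an empty list each time.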

  • python raw_input odd behavior with accents containing strings

    - by Ryan
    I'm writing a program that asks the user for input that contains accents. The user input string is tested to see if it matches a string declared in the program. As you can see below, my code is not working.

    Code:

        # -*- coding: utf-8 -*-
        testList = ['má']
        myInput = raw_input('enter something here: ')
        print myInput, repr(myInput)
        print testList[0], repr(testList[0])
        print myInput in testList

    Output in Eclipse with PyDev:

        enter something here: má
        mv° 'm\xe2\x88\x9a\xc2\xb0'
        má 'm\xc3\xa1'
        False

    Output in IDLE:

        enter something here: má
        má u'm\xe1'
        má 'm\xc3\xa1'
        Warning (from warnings module):
          File "/Users/ryanculkin/Desktop/delete.py", line 8
            print myInput in testList
        UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
        False

    How can I get my code to print True when comparing the two strings? Additionally, I note that the result of running this code on the same input is different depending on whether I use Eclipse or IDLE. Why is this? My eventual goal is to put my program on the web; is there anything that I need to be aware of, since the result seems to be so volatile?
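The IDLE output in this question compares a unicode value (`u'm\xe1'`) with UTF-8 encoded bytes (`'m\xc3\xa1'`), which suggests the two sides should be brought to the same text type before comparing. A minimal sketch of that idea follows, assuming Python 3 and assuming the bytes really are UTF-8; `same_text` is a hypothetical helper, and the NFC normalization step is an extra precaution that the question does not mention.

```python
import unicodedata


def same_text(a, b, encoding="utf-8"):
    """Compare two values as text, decoding bytes first."""

    def to_text(value):
        if isinstance(value, bytes):
            value = value.decode(encoding)
        # NFC makes "a" + combining accent equal to the precomposed letter.
        return unicodedata.normalize("NFC", value)

    return to_text(a) == to_text(b)
```

The Eclipse output looks like bytes that were decoded with the wrong console encoding before the program saw them; this sketch would not repair that, since the encoding has to be right at the point where the input is read.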

  • python search replace using wildcards

    - by tom smith
    Hi, I'm somewhat confused, but I'm trying to do a search/replace using wildcards. If I have something like:

        <blah.... ssf ff>
        <bl.... ssf dfggg ff>
        <b.... ssf ghhjj fhf>

    and I want to replace all of the above strings with, say, <hh >t, any thoughts/comments on how this can be accomplished? Thanks.

    Update (thanks for the comments!): I'm missing something... My initial sample texts are:

        Soo Choi</span>LONGEDITBOX">Apryl Berney
        Soo Choi</span>LONGEDITBOX">Joel Franks
        Joel Franks</span>GEDITBOX">Alexander Yamato

    and I'm trying to get:

        Soo Choi foo Apryl Berney
        Soo Choi foo Joel Franks
        Joel Franks foo Alexander Yamato

    I've tried derivations of

        name=re.sub("</s[^>]*\">"," foo ",name)

    but I'm missing something... Thoughts? Thanks
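A likely reason the attempted pattern never matches is that `[^>]*` cannot cross the `>` that closes `</span>`, so the following `\">` is never reached. One way to write it, sketched here against the three sample strings from the update only, is to match `</span>` literally and let the wildcard cover just the text up to the next `">`. The names `SEPARATOR` and `replace_separator` are illustrative.

```python
import re

# "</span>", then any run of characters other than ">", then the closing '">'.
SEPARATOR = re.compile(r'</span>[^>]*">')


def replace_separator(text, replacement=" foo "):
    """Replace every '</span>...">' separator in text."""
    return SEPARATOR.sub(replacement, text)
```

For example, `replace_separator('Soo Choi</span>LONGEDITBOX">Apryl Berney')` gives `'Soo Choi foo Apryl Berney'`.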

  • Python: how to enclose strings in a list with < and >

    - by Michael Konietzny
    Hello, I would like to enclose the strings inside a list in < and > (formatted like <%s>). The current code does the following:

        def create_worker (general_logger, general_config):
            arguments = ["worker_name", "worker_module", "worker_class"]
            __check_arguments(arguments)

        def __check_arguments(arguments):
            if len(sys.argv) < 2 + len(arguments):
                print "Usage: %s delete-project %s" % (__file__," ".join(arguments))
                sys.exit(10)

    The current output looks like this:

        Usage: ...\handler_scripts.py delete-project worker_name worker_module worker_class

    and should look like this:

        Usage: ...\handler_scripts.py delete-project <worker_name> <worker_module> <worker_class>

    Is there any short way to do this? Greetings, Michael
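A short way to get the bracketed form is to format each name before joining. The sketch below assumes Python 3 and uses a hypothetical helper, `usage_line`, that returns the message instead of printing it.

```python
def usage_line(script, arguments):
    """Build the usage message with every argument wrapped in < and >."""
    wrapped = " ".join("<%s>" % name for name in arguments)
    return "Usage: %s delete-project %s" % (script, wrapped)
```

In the code from the question, the same generator expression could replace `" ".join(arguments)` directly.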

  • serializing JSON files with newlines in Python

    - by user248237
    I am using json and jsonpickle sometimes to serialize objects to files, using the following function:

        def json_serialize(obj, filename, use_jsonpickle=True):
            f = open(filename, 'w')
            if use_jsonpickle:
                import jsonpickle
                json_obj = jsonpickle.encode(obj)
                f.write(json_obj)
            else:
                simplejson.dump(obj, f)
            f.close()

    The problem is that if I serialize a dictionary for example, using "json_serialize(mydict, myfilename)" then the entire serialization gets put on one line. This means that I can't grep the file for entries to be inspected by hand, like I would a CSV file. Is there a way to make it so each element of an object (e.g. each entry in a dict, or each element in a list) is placed on a separate line in the JSON output file? thanks.
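For the plain `json` branch, the `indent` argument is one way to get one entry per line. The sketch below assumes Python 3 and the standard library `json` module (not simplejson), and it leaves the jsonpickle branch out; `to_pretty_json` is a hypothetical helper name.

```python
import json


def to_pretty_json(obj):
    # indent=2 puts each dict entry and list element on its own line;
    # sort_keys gives a stable order, which helps with grep and diff.
    return json.dumps(obj, indent=2, sort_keys=True)


def json_serialize(obj, filename):
    """Write obj to filename as indented JSON."""
    with open(filename, "w") as f:
        f.write(to_pretty_json(obj))
        f.write("\n")
```

Nested containers are indented one level further, so a grep for a key still lands on a single short line.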

  • mouse rollover event in Python (VPython)

    - by kame
    Is there something similar to scene.mouse.getclick in the visual module (VPython)? I need it for a rollover. Thanks in advance. EDIT: I need a function for doing something when the mouse moves inside a special area without clicking.

  • Python TKinter connect variable to entry widget

    - by Sano98
    Hi everyone, I'm trying to associate a variable with a Tkinter entry widget, in such a way that:

    1. Whenever I change the value (the "content") of the entry, mainly by typing something into it, the variable automatically gets assigned the value of what I've typed, without me having to push an "Update value" button or something like that first.
    2. Whenever the variable gets changed (by some other part of the program), the entry value displayed is adjusted automatically.

    I believe that this could work via the textvariable. I read the example on http://effbot.org/tkinterbook/entry.htm, but it is not exactly helping me with what I have in mind. I have a feeling that there is a way of ensuring the first condition by using the entry's "validate". Any ideas? Thank you for your input! Sano

  • Shared value in parallel python

    - by Jonathan
    Hey all- I'm using ParallelPython to develop a performance-critical script. I'd like to share one value between the 8 processes running on the system. Please excuse the trivial example but this illustrates my question.

        def findMin(listOfElements):
            for el in listOfElements:
                if el < min:
                    min = el

        import pp
        min = 0
        myList = range(100000)
        job_server = pp.Server()

        f1 = job_server.submit(findMin, myList[0:25000])
        f2 = job_server.submit(findMin, myList[25000:50000])
        f3 = job_server.submit(findMin, myList[50000:75000])
        f4 = job_server.submit(findMin, myList[75000:100000])

    The pp docs don't seem to describe a way to share data across processes. Is it possible? If so, is there a standard locking mechanism (like in the threading module) to confirm that only one update is done at a time?

        l = Lock()
        if(el < min):
            l.acquire
            if(el < min):
                min = el
            l.release

    I understand I could keep a local min and compare the 4 in the main thread once returned, but by sharing the value I can do some better pruning of my BFS binary tree and potentially save a lot of loop iterations. Thanks- Jonathan

  • Python: unable to inherit from a C extension.

    - by celil
    I am trying to add a few extra methods to a matrix type from the pysparse library. Apart from that, I want the new class to behave exactly like the original, so I chose to implement the changes using inheritance. However, when I try

        from pysparse import spmatrix
        class ll_mat(spmatrix.ll_mat):
            pass

    this results in the following error:

        TypeError: Error when calling the metaclass bases
            cannot create 'builtin_function_or_method' instances

    What is causing this error? Is there a way to use delegation so that my new class behaves exactly the same way as the original?
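The error text suggests that `spmatrix.ll_mat` is a built-in function that creates matrices rather than a class, which would explain why it cannot be used as a base. Delegation can be sketched generically as below; this assumes Python 3, wraps an arbitrary object rather than a real pysparse matrix, and the names `Delegating` and `describe` are hypothetical.

```python
class Delegating(object):
    """Wrap an object and forward unknown attribute lookups to it."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Called only when normal lookup fails, so methods defined on this
        # class win and everything else falls through to the wrapped object.
        return getattr(self._wrapped, name)

    # Special methods are looked up on the type, not through __getattr__,
    # so operators such as indexing have to be forwarded explicitly.
    def __getitem__(self, key):
        return self._wrapped[key]

    def __setitem__(self, key, value):
        self._wrapped[key] = value

    def describe(self):
        # Hypothetical extra method added on top of the wrapped object.
        return "wrapper around %s" % type(self._wrapped).__name__
```

In the pysparse case the wrapped object would presumably be whatever `spmatrix.ll_mat(...)` returns. The wrapper then behaves like the original only for the operations that are forwarded, so any further operators the matrix supports would need their own explicit methods.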

  • Python - urllib2 & cookielib

    - by Adrian
    I am trying to open the following website and retrieve the initial cookie and use it for the second url-open BUT if you run the following code it outputs 2 different cookies. How do I use the initial cookie for the second url-open?

        import cookielib, urllib2

        cj = cookielib.CookieJar()
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

        home = opener.open('https://www.idcourts.us/repository/start.do')
        print cj

        search = opener.open('https://www.idcourts.us/repository/partySearch.do')
        print cj

    Output shows 2 different cookies every time as you can see:

        <cookielib.CookieJar[<Cookie JSESSIONID=0DEEE8331DE7D0DFDC22E860E065085F for www.idcourts.us/repository>]>
        <cookielib.CookieJar[<Cookie JSESSIONID=E01C2BE8323632A32DA467F8A9B22A51 for www.idcourts.us/repository>]>

  • Encoding in python with lxml - complex solution

    - by Vojtech R.
    Hi, I need to download and parse a webpage with lxml and build UTF-8 XML output. I think a schema in pseudocode is more illustrative:

        from lxml import etree

        webfile = urllib2.urlopen(url)
        root = etree.parse(webfile.read(), parser=etree.HTMLParser(recover=True))

        txt = my_process_text(etree.tostring(root.xpath('/html/body'), encoding=utf8))

        output = etree.Element("out")
        output.text = txt

        outputfile.write(etree.tostring(output, encoding=utf8))

    So webfile can be in any encoding (lxml should handle this). The output file has to be in UTF-8. I'm not sure where to use encoding/coding. Is this schema OK? (I can't find a good tutorial about lxml and encoding, but I can find many problems with this...) I need a robust, approved solution, so I'm asking you seniors. Many thanks

  • python: using __import__ to import a module which in turn generates an ImportError

    - by bbb
    Hi there, I have a funny problem I'd like to ask you guys ('n gals) about. I'm importing some module A that is importing some non-existent module B. Of course this will result in an ImportError. This is what A.py looks like:

        import B

    Now let's import A:

        >>> import A
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/tmp/importtest/A.py", line 1, in <module>
            import B
        ImportError: No module named B

    Alright, on to the problem. How can I know whether this ImportError results from importing A or from some corrupt import inside A, without looking at the error's string representation? The difference is that either A is not there or it has incorrect import statements. Hope you can help me out... Cheers bb
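On Python 3.3 and later, which is newer than the interpreter shown in this question, `ImportError` carries a `name` attribute holding the module that could not be imported, so the two cases can be told apart without reading the message. A minimal sketch under that assumption follows; `try_import` and its status strings are hypothetical.

```python
import importlib


def try_import(module_name):
    """Return (module, status); status is 'ok', 'missing' or 'broken'."""
    try:
        return importlib.import_module(module_name), "ok"
    except ImportError as exc:
        missing = exc.name  # may be None if the error was raised by hand
        # The requested module itself (or one of its parent packages)
        # could not be found.
        if missing and (module_name == missing
                        or module_name.startswith(missing + ".")):
            return None, "missing"
        # Otherwise the failure came from an import inside the module.
        return None, "broken"
```

With the files from the question, `try_import("A")` would be expected to report `"broken"` because the missing name is `B`, while a request for a module that does not exist at all reports `"missing"`.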

  • Using Python and Mechanize with ASP Forms

    - by tchaymore
    I'm trying to submit a form on an .asp page but Mechanize does not recognize the name of the control. The form code is:

        <form id="form1" name="frmSearchQuick" method="post">
        ....
        <input type="button" name="btSearchTop" value="SEARCH" class="buttonctl" onClick="uf_Browse('dledir_search_quick.asp');" >

    My code is as follows:

        br = mechanize.Browser()
        br.open(BASE_URL)
        br.select_form(name='frmSearchQuick')
        resp = br.click(name='btSearchTop')

    I've also tried the last line as:

        resp = br.submit(name='btSearchTop')

    The error I get is:

        raise ControlNotFoundError("no control matching "+description)
        ControlNotFoundError: no control matching name 'btSearchTop', kind 'clickable'

    If I print br I get this:

        IgnoreControl(btSearchTop=)

    But I don't see that anywhere in the HTML. Any advice on how to submit this form?

  • how to send file via http with python

    - by ep45
    Hello, I have a problem. I use Apache with mod_wsgi and webpy, and when I send a file over HTTP, a lot of packets are lost. This is my code:

        web.header('Content-Type','video/x-flv')
        web.header('Content-length',sizeFile)
        f = file(FILE_PATH, 'rb')
        while True:
            buffer = f.read(4*1024)
            if buffer :
                yield buffer
            else :
                break
        f.close()

    What is wrong in my code? Thanks.

  • Using Tkinter in python to edit the title bar

    - by Dan
    I am trying to add a custom title to a window, but I am having trouble with it. I know my code isn't right, but when I run it, it creates 2 windows instead: one with just the title tk and another, bigger window with "Simple Prog". How do I make it so that the tk window has the title "Simple Prog" instead of having a new additional window? I don't think I'm supposed to have the Tk() part, because when I have that in my complete code, there's an error.

        from tkinter import Tk, Button, Frame, Entry, END

        class ABC(Frame):
            def __init__(self,parent=None):
                Frame.__init__(self,parent)
                self.parent = parent
                self.pack()
                ABC.make_widgets(self)

            def make_widgets(self):
                self.root = Tk()
                self.root.title("Simple Prog")

  • Python fit polynomial, power law and exponential from data

    - by Nadir
    I have some data (x and y coordinates) coming from a study, and I have to plot them and find the curve that best fits the data. My candidate curves are: polynomial up to 6th degree; power law; and exponential. I am able to find the best fit for the polynomial with

        while(i < 6):
            coefs, val = poly.polyfit(x, y, i, full=True)

    and I take the degree that minimizes val. When I have to fit a power law (the most probable in my study), I do not know how to do it correctly. This is what I have done: I applied the log function to all x and y and tried to fit the result with a linear polynomial. If the error (val) is lower than for the other polynomials tried before, I choose the power law function. Am I correct? Now, how can I reconstruct my power law starting from the line y = mx + q in order to draw it with the original points? I also need to display the function found. I have tried with:

        def power_law(x, m, q):
            return q * (x**m)

    using

        x_new = np.linspace(x[0], x[-1], num=len(x)*10)
        y1 = power_law(x_new, coefs[0], coefs[1])
        popt, pcov = curve_fit(power_law, x_new, y1)

    but it does not seem to work well.
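If the straight line was fitted to the logged data, it reads log y = m·log x + q, so the power law in the original coordinates is y = e^q · x^m: the intercept has to be exponentiated before it is used as the prefactor. The sketch below illustrates that back-transform with a plain least-squares line written out by hand (Python 3, standard library only) instead of the `polyfit` call from the question; it assumes all x and y are positive, and the names `fit_power_law` and `a` are illustrative.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**m via a straight line in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    m = sxy / sxx            # slope of the fitted line
    q = mean_y - m * mean_x  # intercept of the fitted line
    a = math.exp(q)          # log y = m*log x + q  =>  y = e**q * x**m
    return a, m


def power_law(x, a, m):
    return a * x ** m
```

Two further points may matter for the code in the question. If `poly` there is `numpy.polynomial.polynomial`, its `polyfit` returns coefficients from lowest to highest degree, so `coefs[0]` would be the intercept q and `coefs[1]` the slope m. Also, a residual computed on logged data is not on the same scale as the residuals of the polynomial fits, so comparing the two `val` numbers directly may be misleading.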

  • Converting string to datetime object in python

    - by Gussi
    Given this string: "Fri, 09 Apr 2010 14:10:50 +0000" how does one convert it to a datetime object? After doing some reading I feel like this should work, but it doesn't...

        >>> from datetime import datetime
        >>>
        >>> str = 'Fri, 09 Apr 2010 14:10:50 +0000'
        >>> fmt = '%a, %d %b %Y %H:%M:%S %z'
        >>> datetime.strptime(str, fmt)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/lib64/python2.6/_strptime.py", line 317, in _strptime
            (bad_directive, format))
        ValueError: 'z' is a bad directive in format '%a, %d %b %Y %H:%M:%S %z'

    It should be noted that this works without a problem

        >>> from datetime import datetime
        >>>
        >>> str = 'Fri, 09 Apr 2010 14:10:50'
        >>> fmt = '%a, %d %b %Y %H:%M:%S'
        >>> datetime.strptime(str, fmt)
        datetime.datetime(2010, 4, 9, 14, 10, 50)

    But I'm stuck with "Fri, 09 Apr 2010 14:10:50 +0000", I would prefer to convert exactly that without changing (or slicing) that string in any way.
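The string in this question is in the RFC 2822 date format used by email headers. On Python 3, which is an assumption here since the traceback shows Python 2.6, the standard library can parse it unchanged in two ways, sketched below with hypothetical function names.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_rfc2822(text):
    # Parses email-style dates and keeps the UTC offset as tzinfo.
    return parsedate_to_datetime(text)


def parse_with_strptime(text):
    # Python 3's strptime understands %z; %a and %b depend on the locale.
    return datetime.strptime(text, "%a, %d %b %Y %H:%M:%S %z")
```

Both return a timezone-aware `datetime` for `"Fri, 09 Apr 2010 14:10:50 +0000"`, so neither needs the offset to be sliced off first.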

  • Sending and receiving async over multiprocessing.Pipe() in Python

    - by dcolish
    I'm having some issues getting the Pipe.send to work in this code. What I would ultimately like to do is send and receive messages to and from the foreign process while it's running in a fork. This is eventually going to be integrated into a pexpect loop for talking to interpreter processes.

        from multiprocessing import Process, Pipe

        def f(conn):
            cmd = ''
            if conn.poll():
                cmd = conn.recv()
            i = 1
            i += 1
            conn.send([42 + i, cmd, 'hello'])

        if __name__ == '__main__':
            parent_conn, child_conn = Pipe()
            p = Process(target=f, args=(child_conn,))
            p.start()
            from pdb import set_trace; set_trace()
            while parent_conn.poll():
                print parent_conn.recv() # prints "[42, None, 'hello']"
                parent_conn.send('OHHAI')
            p.join()
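In the posted code the child calls `conn.poll()` once, most likely before the parent has sent anything, then replies a single time and exits, so a later `send` from the parent has no reader. One common alternative is a worker that blocks on `recv()` in a loop until it sees a sentinel. The sketch below assumes Python 3 and, to stay easy to run anywhere, starts the worker in a thread; the assumption is that `multiprocessing.Process(target=worker, args=(child_conn,))` would take the thread's place in real code. The names `worker` and `run_demo` and the `None` sentinel are illustrative.

```python
import threading
from multiprocessing import Pipe


def worker(conn):
    """Answer every message until the sentinel None arrives."""
    while True:
        cmd = conn.recv()  # blocks until the other end sends something
        if cmd is None:
            break
        conn.send([42, cmd, "hello"])
    conn.close()


def run_demo(messages):
    parent_conn, child_conn = Pipe()
    # Assumption: a thread stands in for the child process in this sketch.
    t = threading.Thread(target=worker, args=(child_conn,))
    t.start()
    replies = []
    for msg in messages:
        parent_conn.send(msg)
        replies.append(parent_conn.recv())
    parent_conn.send(None)  # tell the worker to stop
    t.join()
    return replies
```

Each `send` from the parent is paired with one `recv`, so the conversation can go on for as many messages as needed before the sentinel ends it.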

  • Text-based game graphics in Python

    - by Jasper
    Hi, I'm pretty new to programming, and I'm creating a simple text-based game. I'm wondering if there is a simple way to create my own terminal-type window in which I can place coloured input etc. Is there a graphics module well suited to this? I'm using a Mac, but I would like it to work on Windows as well. Thanks

  • Python SQLite: database is locked

    - by user322683
    I'm trying this code:

        import sqlite
        connection = sqlite.connect('cache.db')
        cur = connection.cursor()
        cur.execute('''create table item (id integer primary key, itemno text unique,
            scancode text, descr text, price real)''')
        connection.commit()
        cur.close()

    I'm catching this exception:

        Traceback (most recent call last):
          File "cache_storage.py", line 7, in <module>
            scancode text, descr text, price real)''')
          File "/usr/lib/python2.6/dist-packages/sqlite/main.py", line 237, in execute
            self.con._begin()
          File "/usr/lib/python2.6/dist-packages/sqlite/main.py", line 503, in _begin
            self.db.execute("BEGIN")
        _sqlite.OperationalError: database is locked

    Permissions for cache.db are ok. Any ideas?
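The traceback points at the older `sqlite` package rather than the `sqlite3` module that ships with Python. The sketch below shows the same table being created with `sqlite3` on Python 3, using an in-memory database by default so that it can be run safely. It does not diagnose the lock itself, which may come from another process or connection holding the file; the function name `create_item_table` is hypothetical.

```python
import sqlite3


def create_item_table(path=":memory:"):
    """Open the database at path and create the item table if needed."""
    connection = sqlite3.connect(path)
    with connection:  # commits on success, rolls back on error
        connection.execute(
            """create table if not exists item (
                   id integer primary key,
                   itemno text unique,
                   scancode text,
                   descr text,
                   price real)"""
        )
    return connection
```

Passing `'cache.db'` as `path` would target the file from the question; whether that also avoids the lock depends on what else has the file open.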
