Search Results

Search found 18899 results on 756 pages for 'python c extension'.

Page 149/756 | < Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156  | Next Page >

  • Python web scraping involving HTML tags with attributes

    - by rohanbk
    I'm trying to make a web scraper that will parse a web-page of publications and extract the authors. The skeletal structure of the web-page is the following: <html> <body> <div id="container"> <div id="contents"> <table> <tbody> <tr> <td class="author">####I want whatever is located here ###</td> </tr> </tbody> </table> </div> </div> </body> </html> I've been trying to use BeautifulSoup and lxml thus far to accomplish this task, but I'm not sure how to handle the two div tags and td tag because they have attributes. In addition to this, I'm not sure whether I should rely more on BeautifulSoup or lxml or a combination of both. What should I do? At the moment, my code looks like what is below: import re import urllib2,sys import lxml from lxml import etree from lxml.html.soupparser import fromstring from lxml.etree import tostring from lxml.cssselect import CSSSelector from BeautifulSoup import BeautifulSoup, NavigableString address='http://www.example.com/' html = urllib2.urlopen(address).read() soup = BeautifulSoup(html) html=soup.prettify() html=html.replace('&nbsp', '&#160') html=html.replace('&iacute','&#237') root=fromstring(html) I realize that a lot of the import statements may be redundant, but I just copied whatever I currently had in more source file. EDIT: I suppose that I didn't make this quite clear, but I have multiple tags in page that I want to scrape.

    Read the article

  • python search replace using wildcards

    - by tom smith
    hi somewhat confused.. but trying to do a search/repace using wildcards if i have something like: <blah.... ssf ff> <bl.... ssf dfggg ff> <b.... ssf ghhjj fhf> and i want to replace all of the above strings with say, <hh >t any thoughts/comments on how this can be accomplished? thanks update (thanks for the comments!) i'm missing something... my initial sample text are: Soo Choi</span>LONGEDITBOX">Apryl Berney Soo Choi</span>LONGEDITBOX">Joel Franks Joel Franks</span>GEDITBOX">Alexander Yamato and i'm trying to get Soo Choi foo Apryl Berney Soo Choi foo Joel Franks Joel Franks foo Alexander Yamato i've tried derivations of name=re.sub("</s[^>]*\">"," foo ",name) but i'm missing something... thoughts... thanks

    Read the article

  • Encoding in python with lxml - complex solution

    - by Vojtech R.
    Hi, I need to download and parse webpage with lxml and build UTF-8 xml output. I thing schema in pseudocode is more illustrative: from lxml import etree webfile = urllib2.urlopen(url) root = etree.parse(webfile.read(), parser=etree.HTMLParser(recover=True)) txt = my_process_text(etree.tostring(root.xpath('/html/body'), encoding=utf8)) output = etree.Element("out") output.text = txt outputfile.write(etree.tostring(output, encoding=utf8)) So webfile can be in any encoding (lxml should handle this). Outputfile have to be in utf-8. I'm not sure where to use encoding/coding. Is this schema ok? (I cant find good tutorial about lxml and encoding, but I can find many problems with this...) I need robust approved solution so I ask you seniors. Many thanks

    Read the article

  • Shared value in parallel python

    - by Jonathan
    Hey all- I'm using ParallelPython to develop a performance-critical script. I'd like to share one value between the 8 processes running on the system. Please excuse the trivial example but this illustrates my question. def findMin(listOfElements): for el in listOfElements: if el < min: min = el import pp min = 0 myList = range(100000) job_server = pp.Server() f1 = job_server.submit(findMin, myList[0:25000]) f2 = job_server.submit(findMin, myList[25000:50000]) f3 = job_server.submit(findMin, myList[50000:75000]) f4 = job_server.submit(findMin, myList[75000:100000]) The pp docs don't seem to describe a way to share data across processes. Is it possible? If so, is there a standard locking mechanism (like in the threading module) to confirm that only one update is done at a time? l = Lock() if(el < min): l.acquire if(el < min): min = el l.release I understand I could keep a local min and compare the 4 in the main thread once returned, but by sharing the value I can do some better pruning of my BFS binary tree and potentially save a lot of loop iterations. Thanks- Jonathan

    Read the article

  • python: using __import__ to import a module which in turn generates an ImportError

    - by bbb
    Hi there, I have a funny problem I'd like to ask you guys ('n gals) about. I'm importing some module A that is importing some non-existent module B. Of course this will result in an ImportError. This is what A.py looks like import B Now let's import A >>> import A Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/importtest/A.py", line 1, in <module> import B ImportError: No module named B Alright, on to the problem. How can I know if this ImportError results from importing A or from some corrupt import inside A without looking at the error's string representation. The difference is that either A is not there or does have incorrect import statements. Hope you can help me out... Cheers bb

    Read the article

  • Using Tkinter in python to edit the title bar

    - by Dan
    I am trying to add a custom title to a window but I am having troubles with it. I know my code isn't right but when I run it, it creates 2 windows instead, one with just the title tk and another bigger window with "Simple Prog". How do I make it so that the tk window has the title "Simple Prog" instead of having a new additional window. I dont think I'm suppose to have the Tk() part because when i have that in my complete code, there's an error from tkinter import Tk, Button, Frame, Entry, END class ABC(Frame): def __init__(self,parent=None): Frame.__init__(self,parent) self.parent = parent self.pack() ABC.make_widgets(self) def make_widgets(self): self.root = Tk() self.root.title("Simple Prog")

    Read the article

  • Python fit polynomial, power law and exponential from data

    - by Nadir
    I have some data (x and y coordinates) coming from a study and I have to plot them and to find the best curve that fits data. My curves are: polynomial up to 6th degree; power law; and exponential. I am able to find the best fit for polynomial with while(i < 6): coefs, val = poly.polyfit(x, y, i, full=True) and I take the degree that minimizes val. When I have to fit a power law (the most probable in my study), I do not know how to do it correctly. This is what I have done. I have applied the log function to all x and y and I have tried to fit it with a linear polynomial. If the error (val) is lower than the others polynomial tried before, I have chosen the power law function. Am I correct? Now how can I reconstruct my power law starting from the line y = mx + q in order to draw it with the original points? I need also to display the function found. I have tried with: def power_law(x, m, q): return q * (x**m) using x_new = np.linspace(x[0], x[-1], num=len(x)*10) y1 = power_law(x_new, coefs[0], coefs[1]) popt, pcov = curve_fit(power_law, x_new, y1) but it seems not to work well.

    Read the article

  • Converting string to datetime object in python

    - by Gussi
    Given this string: "Fri, 09 Apr 2010 14:10:50 +0000" how does one convert it to a datetime object? After doing some reading I feel like this should work, but it doesn't... >>> from datetime import datetime >>> >>> str = 'Fri, 09 Apr 2010 14:10:50 +0000' >>> fmt = '%a, %d %b %Y %H:%M:%S %z' >>> datetime.strptime(str, fmt) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib64/python2.6/_strptime.py", line 317, in _strptime (bad_directive, format)) ValueError: 'z' is a bad directive in format '%a, %d %b %Y %H:%M:%S %z' It should be noted that this works without a problem >>> from datetime import datetime >>> >>> str = 'Fri, 09 Apr 2010 14:10:50' >>> fmt = '%a, %d %b %Y %H:%M:%S' >>> datetime.strptime(str, fmt) datetime.datetime(2010, 4, 9, 14, 10, 50) But I'm stuck with "Fri, 09 Apr 2010 14:10:50 +0000", I would prefer to convert exactly that without changing (or slicing) that string in any way.

    Read the article

  • Using Python and Mechanize with ASP Forms

    - by tchaymore
    I'm trying to submit a form on an .asp page but Mechanize does not recognize the name of the control. The form code is: <form id="form1" name="frmSearchQuick" method="post"> .... <input type="button" name="btSearchTop" value="SEARCH" class="buttonctl" onClick="uf_Browse('dledir_search_quick.asp');" > My code is as follows: br = mechanize.Browser() br.open(BASE_URL) br.select_form(name='frmSearchQuick') resp = br.click(name='btSearchTop') I've also tried the last line as: resp = br.submit(name='btSearchTop') The error I get is: raise ControlNotFoundError("no control matching "+description) ControlNotFoundError: no control matching name 'btSearchTop', kind 'clickable' If I print br I get this: IgnoreControl(btSearchTop=) But I don't see that anywhere in the HTML. Any advice on how to submit this form?

    Read the article

  • how to send file via http with python

    - by ep45
    Hello, I have a problem. I use Apache with mod_wsgi and webpy, and when i send a file on http, a lot packets are lost. This is my code : web.header('Content-Type','video/x-flv') web.header('Content-length',sizeFile) f = file(FILE_PATH, 'rb') while True: buffer = f.read(4*1024) if buffer : yield buffer else : break f.close() What in my code is wrong ? thanks.

    Read the article

  • Python turtle module confusion

    - by John
    Hi, I'm trying to to add more lines to the triangle, so instead of 3 leading off there will be 5 depending on the parameter given but I really have no idea what to do at this stage and any help would be very welcome. Thanks in advance!:) def draw_sierpinski_triangle(tracer_on, colour, initial_modulus, line_width, initial_heading,initial_x, initial_y, steps): turtle=Turtle() turtle.name = 'Mother of all turtles' turtle.reset () turtle.tracer (tracer_on) turtle.speed ('fastest') turtle.color (colour) turtle.width (line_width) turtle.up() turtle.goto (initial_x, initial_y) turtle.down() turtle.set_heading (initial_heading) draw_sub_pattern (tracer_on, turtle, initial_modulus, 0, steps) def draw_sub_pattern (tracer_on, turtle, modulus, depth, steps): if (depth >= steps): return; x, y = turtle.position () heading = turtle.heading () # draw the pattern turtle.up() turtle.down() turtle.forward (modulus) draw_sub_pattern(tracer_on, turtle, modulus * 0.5, depth + 1, steps) turtle.up() turtle.goto(x, y) turtle.down() turtle.set_heading (heading + 120) turtle.forward (modulus) draw_sub_pattern(tracer_on, turtle, modulus * 0.5, depth + 1, steps) turtle.up() turtle.goto(x, y) turtle.down() turtle.set_heading (heading + 240) turtle.forward (modulus) draw_sub_pattern(tracer_on, turtle, modulus * 0.5, depth + 1, steps)

    Read the article

  • Sending and receiving async over multiprocessing.Pipe() in Python

    - by dcolish
    I'm having some issues getting the Pipe.send to work in this code. What I would ultimately like to do is send and receive messages to and from the foreign process while its running in a fork. This is eventually going to be integrated into a pexpect loop for talking to interpreter processes. ` from multiprocessing import Process, Pipe def f(conn): cmd = '' if conn.poll(): cmd = conn.recv() i = 1 i += 1 conn.send([42 + i, cmd, 'hello']) if __name__ == '__main__': parent_conn, child_conn = Pipe() p = Process(target=f, args=(child_conn,)) p.start() from pdb import set_trace; set_trace() while parent_conn.poll(): print parent_conn.recv() # prints "[42, None, 'hello']" parent_conn.send('OHHAI') p.join() `

    Read the article

  • Text-based game graphics in Python

    - by Jasper
    Hi, i'm pretty new 2 programming, and I'm creating a simple text-based game I'm wondering if there is a simple way to create my own terminal-type window with which I can place coloured input etc. Is there a graphics module well suited to this? I'm using Mac, but I would like it to work on Windows as well Thanks

    Read the article

  • Python SQLite: database is locked

    - by user322683
    I'm trying this code: import sqlite connection = sqlite.connect('cache.db') cur = connection.cursor() cur.execute('''create table item (id integer primary key, itemno text unique, scancode text, descr text, price real)''') connection.commit() cur.close() I'm catching this exception: Traceback (most recent call last): File "cache_storage.py", line 7, in <module> scancode text, descr text, price real)''') File "/usr/lib/python2.6/dist-packages/sqlite/main.py", line 237, in execute self.con._begin() File "/usr/lib/python2.6/dist-packages/sqlite/main.py", line 503, in _begin self.db.execute("BEGIN") _sqlite.OperationalError: database is locked Permissions for cache.db are ok. Any ideas?

    Read the article

  • Python Least-Squares Natural Splines

    - by Eldila
    I am trying to find a numerical package which will fit a natural which minimizes weighted least squares. There is a package in scipy which does what I want for unnatural splines. import numpy as np import matplotlib.pyplot as plt from scipy import interpolate import random x = np.arange(0,5,1.0/2) xs = np.arange(0,5,1.0/500) y = np.sin(x+1) for i in range(len(y)): y[i] += .2*random.random() - .1 knots = np.array([1,2,3,4]) tck = interpolate.splrep(x,y,s=1,k=3,t=knots,task=-1) ynew = interpolate.splev(xs,tck,der=0) plt.figure() plt.plot(xs,ynew,x,y,'x')

    Read the article

  • Why can you reference an imported module using the importing module in python

    - by noam
    I am trying to understand why any import can be referenced using the importing module, e.g #module master.py import slave and then >>>import master >>>print master.slave gives <module 'slave' from 'C:\Documents and Settings....'> What is the purpose of the feature? I can see how it can be helpful in a package's __init__.py file, but nothing else. Is it a side effect of the fact that every import is added to the module's namespace and that the module's namespace is visible from the outside? If so, why didn't they make an exception with data imported from other modules (e.g don't show it as part of the module's namespace for other modules)?

    Read the article

< Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156  | Next Page >