Search Results

Search found 13816 results on 553 pages for 'python larry'.

Page 359/553 | < Previous Page | 355 356 357 358 359 360 361 362 363 364 365 366  | Next Page >

  • Scraping paginated items from a website using scrapy

    - by Mridang Agarwalla
    I'm using scrapy to scrape items from a site. I'm not being able to implement this scraping pattern. The site I'm trying to scrape is a forum and I scrape the site once a day. Each page has a table containing posts. New posts are added to the top of the table and as more and more posts are posted to the site, the older posts go further into the pages due to pagination. This is a very simple scenario and we will assume that the order of the posts never change. I would like to scrape this site and scrape all the "new" records until the last scraped post from yesterday is encountered. I have configured my spider to paginate endlessly and when it encounters yesterday's last scraped post, it should stop. How can implement this? (My Scrapy installation works with my Django installation using django-dynamic-scraper )

    Read the article

  • Graphics glitch when drawing to a Cairo context obtained from a gtk.DrawingArea inside a gtk.Viewport.

    - by user410023
    I am trying to redraw the part of the DrawingArea that is visible in the Viewport in the expose-event handler. However, it seems that I am doing something wrong with the coordinates that are passed to the event handler because there is garbage at the edge of the Viewport when scrolling. Can anyone tell what I am doing wrong? Here is a small example: import pygtk pygtk.require("2.0") import gtk from numpy import array from math import pi class Circle(object): def init(self, position = [0., 0.], radius = 0., edge = (0., 0., 0.), fill = None): self.position = position self.radius = radius self.edge = edge self.fill = fill def draw(self, ctx): rect = array(ctx.clip_extents()) rect[2] -= rect[0] rect[3] -= rect[1] center = rect[2:4] / 2 ctx.arc(center[0], center[1], self.radius, 0., 2. * pi) if self.fill != None: ctx.set_source_rgb(*self.fill) ctx.fill_preserve() ctx.set_source_rgb(*self.edge) ctx.stroke() class Scene(object): class Proxy(object): directory = {} def init(self, target, layers = set()): self.target = target self.layers = layers Scene.Proxy.directory[target] = self def __init__(self, viewport): self.objects = {} self.layers = [set()] self.viewport = viewport self.signals = {} def draw(self, ctx): x = self.viewport.get_hadjustment().value y = self.viewport.get_vadjustment().value ctx.set_source_rgb(1., 1., 1.) ctx.paint() ctx.translate(x, y) for obj in self: obj.draw(ctx) def add(self, item, layer = 0): item = Scene.Proxy(item, layers = set((layer,))) assert(hasattr(item.target, "draw")) assert(isinstance(layer, int)) item.layers.add(layer) while not layer < len(self.layers): self.layers.append(set()) self.layers[layer].add(item) if not item in self.objects: self.objects[item] = set() self.objects[item].add(layer) def remove(self, item, layers = None): item = Scene.Proxy.directory[item] if layers == None: layers = self.objects[item] for layer in layers: layer.remove(item) item.layers.remove(layer) if len(item.layers) == 0: self.objects.remove(item) def __iter__(self): for layer in self.layers: for item in layer: yield item.target class App(object): def init(self): signals = { "canvas_exposed": self.update_canvas, "gtk_main_quit": gtk.main_quit } self.builder = gtk.Builder() self.builder.add_from_file("graphics_glitch.glade") self.window = self.builder.get_object("window") self.viewport = self.builder.get_object("viewport") self.canvas = self.builder.get_object("canvas") self.scene = Scene(self.viewport) signals.update(self.scene.signals) self.builder.connect_signals(signals) self.window.show() def update_canvas(self, widget, event): ctx = self.canvas.window.cairo_create() self.scene.draw(ctx) ctx.clip() if name == "main": app = App() scene = app.scene scene.add(Circle((0., 0.), 10.)) gtk.main() And the Glade file "graphics_glitch.glade": <?xml version="1.0"?> <interface> <requires lib="gtk+" version="2.16"/> <!-- interface-naming-policy project-wide --> <object class="GtkWindow" id="window"> <property name="width_request">200</property> <property name="height_request">200</property> <property name="visible">True</property> <signal name="destroy" handler="gtk_main_quit"/> <child> <object class="GtkScrolledWindow" id="scrolledwindow1"> <property name="visible">True</property> <property name="can_focus">True</property> <property name="hadjustment">h_adjust</property> <property name="vadjustment">v_adjust</property> <property name="hscrollbar_policy">automatic</property> <property name="vscrollbar_policy">automatic</property> <child> <object class="GtkViewport" id="viewport"> <property name="visible">True</property> <property name="resize_mode">queue</property> <child> <object class="GtkDrawingArea" id="canvas"> <property name="width_request">640</property> <property name="height_request">480</property> <property name="visible">True</property> <signal name="expose_event" handler="canvas_exposed"/> </object> </child> </object> </child> </object> </child> </object> <object class="GtkAdjustment" id="h_adjust"> <property name="lower">-1000</property> <property name="upper">1000</property> <property name="step_increment">1</property> <property name="page_increment">25</property> <property name="page_size">25</property> </object> <object class="GtkAdjustment" id="v_adjust"> <property name="lower">-1000</property> <property name="upper">1000</property> <property name="step_increment">1</property> <property name="page_increment">25</property> <property name="page_size">25</property> </object> </interface> Thanks! --Dan

    Read the article

  • Programatically Determining Bin Path

    - by Andy
    I'm working on a web app called pj and there is a bin file and a src folder. The relative paths before I deploy the app will look something like: pj/bin and pj/src/pj/script.py. However, after deployment, the relative paths will look like: pj_dep/deployed/bin and pj_dep/deployed/lib/python2.6/site-packages/pj/script.py Question: Within script.py, I am trying to find the path of a file in the bin directory. This leads to 2 different behaviors in the dev and deployment environment. If I do os.path.join(os.path.dirname(__file__), 'bin') to try to get the path for the dev environment, I will have a different path for the deployment environment. Is there a more generalized way I can find the bin directory so that I do not need to rely on an if statement to determine how many directories to go up based on the current env? This doesn't seem flexible and might cause other issues later on when the code is moved.

    Read the article

  • the error "invalid literal for int() with base 10:" keeps coming up

    - by ratce003
    I'm trying to write a very simple program, I want to print out the sum of all the multiples of 3 and 5 below 100, but, an error keeps accuring, saying "invalid literal for int() with base 10:" my program is as follows: sum = "" sum_int = int(sum) for i in range(1, 101): if i % 5 == 0: sum += i elif i % 3 == 0: sum += i else: sum += "" print sum Any help would be much appreciated.

    Read the article

  • Django database caching

    - by hekevintran
    I have a Django form that uses an integer field to lookup a model object by its primary key. The form has a save() method that uses the model object referred to by the integer field. The model's manager's get() method is called twice, once in the clean method and once in the save() method: class MyForm(forms.Form): id_a = fields.IntegerField() def clean_id_a(user_id): id_a = self.cleaned_data['id_a'] try: # here is the first call to get MyModel.objects.get(id=id_a) except User.DoesNotExist: raise ValidationError('Object does not exist') def save(self): id_a = self.cleaned_data['id_a'] # here is the second call to get my_model_object = MyModel.objects.get(id=id_a) # do other stuff I wasn't sure whether this hits the database two times or one time so I returned the object itself in the clean method so that I could avoid a second get() call. Does calling get() hit the database two times? Or is the object cached in the thread? class MyForm(forms.Form): id_a = fields.IntegerField() def clean_id_a(user_id): id_a = self.cleaned_data['id_a'] try: # here is my workaround return MyModel.objects.get(id=id_a) except User.DoesNotExist: raise ValidationError('Object does not exist') def save(self): # looking up the cleaned value returns the model object my_model_object = self.cleaned_data['id_a'] # do other stuff

    Read the article

  • Non-global middleware in Django

    - by hekevintran
    In Django there is a settings file that defines the middleware to be run on each request. This middleware setting is global. Is there a way to specify a set of middleware on a per-view basis? I want to have specific urls use a set of middleware different from the global set.

    Read the article

  • In Elixir or SQLAlchemy, is there a way to also store a comment for a/each field in my entities?

    - by kchau
    Our project is basically a web interface to several systems of record. We have many tables mapped, and the names of each column aren't as well named and intuitive as we'd like... The users would like to know what data fields are available (i.e. what's been mapped from the database). But, it's pointless to just give them column names like: USER_REF1, USER_REF2, etc. So, I was wondering, is there a way to provide a comment in the declaration of my field? E.g. class SegregationCode(Entity): using_options(tablename="SEGREGATION_CODES") segCode = Field(String(20), colname="CODE", ... primary_key=True) #Have a comment attr too? If not, any suggestions?

    Read the article

  • 'NoneType' object has no attribute 'data'

    - by Bill Jordan
    Hello guys, I am sending a SOAP request to my server and getting the response back. sample of the response string is shown below: <?xml version = '1.0' ?> <env:Envelope xmlns:env=http:////www.w3.org/2003/05/soap-envelop . .. .. <env:Body> <epas:get-all-config-resp xmlns:epas="urn:organization:epas:soap"> ^M ... ... <epas:property name="Tom">12</epas:property> > > <epas:property name="Alice">34</epas:property> > > <epas:property name="John">56</epas:property> > > <epas:property name="Danial">78</epas:property> > > <epas:property name="George">90</epas:property> > > <epas:property name="Luise">11</epas:property> ... ^M </env:Body? </env:Envelop> What I noticed in the response is that there is an extra character shown in the body which is "^M". Not sure if this could be the issue. Note the ^M shown! when I tried parsing the string returned from the server to get the names and values using the code sample: elements = minidom.parseString(xmldoc).getElementsByTagName("property") myDict = {} for element in elements: myDict[element.getAttribute('name')] = element.firstChild.data But, I am getting this error: 'NoneType' object has no attribute 'data'. May be its something to do with the "^M" shown on the xml response back! Any ideas/comments would be appreciated, Cheers

    Read the article

  • How to clone a mercurial repository over an ssh connection initiated by fabric when http authorizati

    - by Monika Sulik
    I'm attempting to use fabric for the first time and I really like it so far, but at a certain point in my deployment script I want to clone a mercurial repository. When I get to that point I get an error: err: abort: http authorization required My repository requires http authorization and fabric doesn't prompt me for the user and password. I can get around this by changing my repository address from: https://hostname/repository to: https://user:password@hostname/repository But for various reasons I would prefer not to go this route. Are there any other ways in which I could bypass this problem?

    Read the article

  • downloading archives response corrupts files

    - by panchicore
    wrapper = FileWrapper(file("C:/pics.zip")) content_type = mimetypes.guess_type(result.files)[0] response = HttpResponse(wrapper, content_type=content_type) response['Content-Length'] = os.path.getsize("C:/pics.zip") response['Content-Disposition'] = "attachment; filename=pics.zip" return response pics.zip is a valid file with 3 pictures inside. server response the download, but when I am going to open the zip, winrar says This archive is either in unknown format or damaged! If I change the file path and the file name to a valid image C:/pic.jpg is downloaded damaged too. What Im missing in this download view?

    Read the article

  • Selective emboldeing of text in a webpage

    - by Eknath Iyer
    while printing out utf-8 characters onto a webpage, if encapsulate them with they get emboldened, but anything else, the page turns blank. Why? def main(): print "Content-type: text/html\r\n\r\n"; print '<html>' print '<head>' print '<style type="text/css">' print '.highlight { background-color: yellow }' print '.color1 { color: green; }' print '.color2 { color: blue; }' print '.color3 { color: purple; }' print '.color4 { color: red; }' print '.color5 { color: teal; }' print '.color6 { color: yellow; }' print '.color7 { color: orange; }' print '.color8 { color: violet; }' print '</style></head>' print '<body>' form = cgi.FieldStorage() ch = form.getvalue('choice') if ch == 'English': in_sent = form.getvalue('f1') in_sent = in_sent.lower() cho=0 elif ch == 'Hindi': in_sent = trans_he(form.getvalue('transl1').decode("utf-8")).strip() cho=1 #cho = 0 for english #cho = 1 for hindi adict=[] print '<center><u> User Input Sentence ==> <b>', in_sent,'</b></u></center><br>' in_sent=in_sent.strip().split(' ') colordict={} counter=1 for word in in_sent: colordict[word]=counter counter = counter + 1 f = open('bidirectional.alignment.txt','rb').read() records=f.strip().split('\n\n\n') for record in records: el=[] el2 = [] #basic file processing is done here. record = record.strip().split('\n') source = record[cho] target = record[(cho+1)%2] source_sent = source.split(' # ')[1] target_sent = target.split(' # ')[1] source_words = source_sent.strip().split(' ') target_words = target_sent.strip().split(' ') trans_index = source.split(' # ')[2].strip().split(' ') for word in in_sent: if word in source_words: if int(trans_index[source_words.index(word)]) > 0: tword=target_words[(int(trans_index[source_words.index(word)])-1)] target_sent = target_sent.replace(tword+' ','<b>'+tword+' </b>') # When the <b> tag is used here(for the 'target_sent = ...' statement). it is fine. But when <b> is replaced by something like in the next line or even <i> or <u>, it doesn't show an output at all source_sent = source_sent.replace(word+' ','<span class="color1">'+word+' </span>') el2.append(source_sent) el2.append(target_sent) el.append(target_sent.count('<b>')) el.append(el2) if target_sent.count('<b>') > 0: adict.append(el) print '<table><tr><td><center><h1>SOURCE LANGUAGE</h1></center></td><td><center> <h1>TARGET LANGUAGE</h1></center></td></tr>' for entry in adict: print '<tr><td>',entry[1][0],'</td><td>',trans_eh(entry[1][1]).encode("utf-8"),'</td> </tr>' print '</table></body>' print '</html>' main()

    Read the article

  • How to create instances of related models in Django

    - by sevennineteen
    I'm working on a CMSy app for which I've implemented a set of models which allow for creation of custom Template instances, made up of a number of Fields and tied to a specific Customer. The end-goal is that one or more templates with a set of custom fields can be defined through the Admin interface and associated to a customer, so that customer can then create content objects in the format prescribed by the template. I seem to have gotten this hooked up such that I can create any number of Template objects, but I'm struggling with how to create instances - actual content objects - in those templates. For example, I can define a template "Basic Page" for customer "Acme" which has the fields "Title" and "Body", but I haven't figured out how to create Basic Page instances where these fields can be filled in. Here are my (somewhat elided) models... class Customer(models.Model): ... class Field(models.Model): ... class Template(models.Model): label = models.CharField(max_length=255) clients = models.ManyToManyField(Customer, blank=True) fields = models.ManyToManyField(Field, blank=True) class ContentObject(models.Model): label = models.CharField(max_length=255) template = models.ForeignKey(Template) author = models.ForeignKey(User) customer = models.ForeignKey(Customer) mod_date = models.DateTimeField('Modified Date', editable=False) def __unicode__(self): return '%s (%s)' % (self.label, self.template) def save(self): self.mod_date = datetime.datetime.now() super(ContentObject, self).save() Thanks in advance for any advice!

    Read the article

  • Technique to remove common words(and their plural versions) from a string

    - by Jake M
    I am attempting to find tags(keywords) for a recipe by parsing a long string of text. The text contains the recipe ingredients, directions and a short blurb. What do you think would be the most efficient way to remove common words from the tag list? By common words, I mean words like: 'the', 'at', 'there', 'their' etc. I have 2 methodologies I can use, which do you think is more efficient in terms of speed and do you know of a more efficient way I could do this? Methodology 1: - Determine the number of times each word occurs(using the library Collections) - Have a list of common words and remove all 'Common Words' from the Collection object by attempting to delete that key from the Collection object if it exists. - Therefore the speed will be determined by the length of the variable delims import collections from Counter delim = ['there','there\'s','theres','they','they\'re'] # the above will end up being a really long list! word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] return freq.most_common() Methodology 2: - For common words that can be plural, look at each word in the recipe string, and check if it partially contains the non-plural version of a common word. Eg; For the string "There's a test" check each word to see if it contains "there" and delete it if it does. delim = ['this','at','them'] # words that cant be plural partial_delim = ['there','they',] # words that could occur in many forms word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] # really slow for delim in set(partial_delims): for word in word_freq: if word.find(delim) != -1: del word_freq[delim] return freq.most_common()

    Read the article

  • Sending data from one Protocol to another Protocol in Twisted?

    - by veb
    Hi! One of my protocols is connected to a server, and with the output of that I'd like to send it to the other protocol. I need to access the 'msg' method in ClassA from ClassB but I keep getting: exceptions.AttributeError: 'NoneType' object has no attribute 'write' Actual code: http://pastebin.com/MQPhduSY Any ideas please? :-)

    Read the article

  • Sort a list of tuples without case sensitivity

    - by dound
    How can I efficiently and easily sort a list of tuples without being sensitive to case? For example this: [('a', 'c'), ('A', 'b'), ('a', 'a'), ('a', 5)] Should look like this once sorted: [('a', 5), ('a', 'a'), ('A', 'b'), ('a', 'c')] The regular lexicographic sort will put 'A' before 'a' and yield this: [('A', 'b'), ('a', 5), ('a', 'a'), ('a', 'c')]

    Read the article

  • BeautifulSoup, but for CSS?

    - by MTsoul
    BeautifulSoup parses HTML and offers various ways to manipulate and search within HTML. Is there something similar for CSS? Specifically, I'd like to know if a given HTML text is rendered as bold. Either it has an ancestor that is the <strong> or the <bold> tag (which can be done with BeautifulSoup), or it has an ancestor (or itself) that has CSS attributes with font-weight: bold. Is this possible without resulting to writing my own library?

    Read the article

  • I am trying to move a rectangle in Pygame using coordinates but won't work

    - by user1821449
    this is my code import pygame from pygame.locals import * import sys pygame.init() pygame.display.set_caption("*no current mission*") size = (1280, 750) screen = pygame.display.set_mode(size) clock = pygame.time.Clock() bg = pygame.image.load("bg1.png") guy = pygame.image.load("hero_stand.png") rect = guy.get_rect() x = 10 y = 10 while True: for event in pygame.event.get(): if event.type == pygame.QUIT: sys.exit() if event.type == KEYDOWN: _if event.key == K_RIGHT: x += 5 rect.move(x,y)_ rect.move(x,y) screen.blit(bg,(0,0)) screen.blit(guy, rect) pygame.display.flip() it is just a simple test to see if i can get a rectangle to move. Everything seems to work except the code I put in italic.

    Read the article

  • how to import a.py not a folder

    - by zjm1126
    zjm_code |-----a.py |-----a |----- __init__.py |-----b.py in a.py is : c='ccc' in b.py is : import a print dir(a) when i execute b.py ,it show (it import 'a' folder): ['__builtins__', '__doc__', '__file__', '__name__', '__path__'] and when i delete a folder, it show ,(it import a.py): ['__builtins__', '__doc__', '__file__', '__name__', 'c'] so my question is : how to import a.py via not delete a folder thanks

    Read the article

< Previous Page | 355 356 357 358 359 360 361 362 363 364 365 366  | Next Page >