Search Results

Search found 17029 results on 682 pages for 'python modules'.

Page 392/682 | < Previous Page | 388 389 390 391 392 393 394 395 396 397 398 399  | Next Page >

  • Url open encoding

    - by badc0re
    I have the following code for urllib and BeautifulSoup: getSite = urllib.urlopen(pageName) # open current site getSitesoup = BeautifulSoup(getSite.read()) # reading the site content print getSitesoup.originalEncoding for value in getSitesoup.find_all('link'): # extract all <a> tags defLinks.append(value.get('href')) The result of it: /usr/lib/python2.6/site-packages/bs4/dammit.py:231: UnicodeWarning: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER. "Some characters could not be decoded, and were " And when i try to read the site i get: ?7?e????0*"I??G?H????F??????9-??????;??E?YÞBs????????????4i???)?????^W?????`w?Ke??%??*9?.'OQB???V??@?????]???(P??^??q?$?S5???tT*?Z

    Read the article

  • Regex and unicode

    - by dbr
    I have a script that parses the filenames of TV episodes (show.name.s01e02.avi for example), grabs the episode name (from the www.thetvdb.com API) and automatically renames them into something nicer (Show Name - [01x02].avi) The script works fine, that is until you try and use it on files that have Unicode show-names (something I never really thought about, since all the files I have are English, so mostly pretty-much all fall within [a-zA-Z0-9'\-]) How can I allow the regular expressions to match accented characters and the likes? Currently the regex's config section looks like.. config['valid_filename_chars'] = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@£$%^&*()_+=-[]{}"'.,<>`~? """ config['valid_filename_chars_regex'] = re.escape(config['valid_filename_chars']) config['name_parse'] = [ # foo_[s01]_[e01] re.compile('''^([%s]+?)[ \._\-]\[[Ss]([0-9]+?)\]_\[[Ee]([0-9]+?)\]?[^\\/]*$'''% (config['valid_filename_chars_regex'])), # foo.1x09* re.compile('''^([%s]+?)[ \._\-]\[?([0-9]+)x([0-9]+)[^\\/]*$''' % (config['valid_filename_chars_regex'])), # foo.s01.e01, foo.s01_e01 re.compile('''^([%s]+?)[ \._\-][Ss]([0-9]+)[\.\- ]?[Ee]([0-9]+)[^\\/]*$''' % (config['valid_filename_chars_regex'])), # foo.103* re.compile('''^([%s]+)[ \._\-]([0-9]{1})([0-9]{2})[\._ -][^\\/]*$''' % (config['valid_filename_chars_regex'])), # foo.0103* re.compile('''^([%s]+)[ \._\-]([0-9]{2})([0-9]{2,3})[\._ -][^\\/]*$''' % (config['valid_filename_chars_regex'])), ]

    Read the article

  • How can you dispatch on request method in Django URLpatterns?

    - by rcampbell
    It's clear how to create a URLPattern which dispatches from a URL regex: (r'^books/$', books), where books can further dispatch on request method: def books(request): if request.method == 'POST': ... else ... I'd like to know if there is an idiomatic way to include the request method inside the URLPattern, keeping all dispatch/route information in a single location, such as: (r'^books/$', GET, retrieve-book), (r'^books/$', POST, update-books), (r'^books/$', PUT, create-books),

    Read the article

  • Writing a post search algorithm.

    - by MdaG
    I'm trying to write a free text search algorithm for finding specific posts on a wall (similar kind of wall as Facebook uses). A user is suppose to be able to write some words in a search field and get hits on posts that contain the words; with the best match on top and then other posts in decreasing order according to match score. I'm using the edit distance (Levenshtein) "e(x, y) = e" to calculate the score for each post when compared to the query word "x" and post word "y" according to: score(x, y) = 2^(2 - e)(1 - min(e, |x|) / |x|) Each word in a post contributes to the total score for that specific post. This approach seems to work well when the posts are of roughly the same size, but sometime certain large posts manages to rack up score solely on having a lot of words in them while in practice not being relevant to the query. Am I approaching this problem in the wrong way or is there some way to normalize the score that I haven't thought of?

    Read the article

  • remove border of a gtk.button

    - by spanctus
    Hi, I want to remove the border of the gtk.button, but i Don't know how to do it. I tried with : button = gtk.Button() button.set_style("inner-border",0) but i have an error : the property doesn't exist. I tried too to create a new gtk.Style and use it for the button, but same error. Anyone has an idea ? Thanks

    Read the article

  • How to reset Scrapy parameters? (always running under same parameters)

    - by Jean Ventura
    I've been running my Scrapy project with a couple of accounts (the project scrapes a especific site that requieres login credentials), but no matter the parameters I set, it always runs with the same ones (same credentials). I'm running under virtualenv. Is there a variable or setting I'm missing? Edit: It seems that this problem is Twisted related. Even when I run: scrapy crawl -a user='user' -a password='pass' -o items.json -t json SpiderName I still get an error saying: ERROR: twisted.internet.error.ReactorNotRestartable And all the information I get, is the last 'succesful' run of the spider.

    Read the article

  • Google App Engine - Document Editor Creation/Tap Into Google Docs?

    - by Josh Patton
    What is the best way to create a custom document editor in GAE? I'm making a website meant for a School Robotics Club (With support for any other organization - DRY). We currently use Google services for online collaboration, I'm wondering if there is a way to tap into Google Docs and allow users to edit a Google Document without using Google Accounts or the Google Doc interface. If that is not possible (I've researched and I don't think it is), what is the best way to make a document editor? I want it completely on the website I'm creating, so I'm assuming just some javascript editor like TinyMCE + Ajax + Datastore. Is their anything that replicates Google Doc's/Microsoft Offices's/OpenOffice.org's feature set as far as fonts, spacing, alignment, justification, etc.?

    Read the article

  • How to use regular expressions to pull a substring? (screen scraping)

    - by Diego
    Hey guys, i'm really trying to understand regular expressions while scraping a site, i've been using it in my code enough to pull the following, but am stuck here. I need to quickly grab this: http://www.example.com/online/store/TitleDetail?detail&sku=123456789 from this: ('<a href="javascript:if(handleDoubleClick(this.id)){window.location=\'http://www.example.com/online/store/TitleDetail?detail&sku=123456789\';}" id="getTitleDetails_123456789">\r\n\t\t\t \tcheck store inventory\r\n\t\t\t </a>', 1) This is where I got confused. any ideas?

    Read the article

  • how to model a follower stream in appengine?

    - by molicule
    I am trying to design tables to buildout a follower relationship. Say I have a stream of 140char records that have user, hashtag and other text. Users follow other users, and can also follow hashtags. I am outlining the way I've designed this below, but there are two limitaions in my design. I was wondering if others had smarter ways to accomplish the same goal. The issues with this are The list of followers is copied in for each record If a new follower is added or one removed, 'all' the records have to be updated. The code class HashtagFollowers(db.Model): """ This table contains the followers for each hashtag """ hashtag = db.StringProperty() followers = db.StringListProperty() class UserFollowers(db.Model): """ This table contains the followers for each user """ username = db.StringProperty() followers = db.StringListProperty() class stream(db.Model): """ This table contains the data stream """ username = db.StringProperty() hashtag = db.StringProperty() text = db.TextProperty() def save(self): """ On each save all the followers for each hashtag and user are added into a another table with this record as the parent """ super(stream, self).save() hfs = HashtagFollowers.all().filter("hashtag =", self.hashtag).fetch(10) for hf in hfs: sh = streamHashtags(parent=self, followers=hf.followers) sh.save() ufs = UserFollowers.all().filter("username =", self.username).fetch(10) for uf in ufs: uh = streamUsers(parent=self, followers=uf.followers) uh.save() class streamHashtags(db.Model): """ The stream record is the parent of this record """ followers = db.StringListProperty() class streamUsers(db.Model): """ The stream record is the parent of this record """ followers = db.StringListProperty() Now, to get the stream of followed hastags indexes = db.GqlQuery("""SELECT __key__ from streamHashtags where followers = 'myusername'""") keys = [k,parent() for k in indexes[offset:numresults]] return db.get(keys) Is there a smarter way to do this?

    Read the article

  • ahow can I resolve Django Error: str' object has no attribute 'autoescape'?

    - by Angelbit
    Hi have tried to create a inclusion tag on Django but don't work and return str' object has no attribute 'autoescape' this is the code of custom tag: from django import template from quotes.models import Quotes register = template.Library() def show_quote(): quote = Quotes.objects.values('quote', 'author').get(id=0) return {'quote': quote['quote']} register.inclusion_tag('quotes.html')(show_quote) EDIT: Quote class from django.db import models class Quotes(models.Model): quote = models.CharField(max_length=255) author = models.CharField(max_length=100) class Meta: db_table = 'quotes' quotes.html <blockquote id="quotes">{{ quote }}</blockquote>

    Read the article

  • Segment string into array, regex?

    - by Hanpan
    I have a string which looks like this: "[segment1][segment2][segment2]" What I'd like is to be able to split the string into an array, so I'd end up with: Array[0] = "segment1", Array[1] = "segment2", Array[2] = "segment3" I've tried using the string split function, but it doesn't seem to do exactly what I want. I was wondering if anyone has some regex which might help me out? Thanks in advance

    Read the article

  • sqlite3 'database is locked' won't go away with retries

    - by Azarias
    I have a sqlite3 database that is accessed by a few threads (3-4). I am aware of the general limitations of sqlite3 with regards to concurrency as stated http://www.sqlite.org/faq.html#q6 , but I am convinced that is not the problem. All of the threads both read and write from this database. Whenever I do a write, I have the following construct: try: Cursor.execute(q, params) Connection.commit() except sqlite3.IntegrityError: Notify except sqlite3.OperationalError: print sys.exc_info() print("DATABASE LOCKED; sleeping for 3 seconds and trying again") time.sleep(3) Retry On some runs, I won't even hit this block, but when I do, it never comes out of it (keeps retrying, but I keep getting the 'database is locked' error from exc_info. If I understand the reader/writer lock usage correctly, some amount of waiting should help with the contention. What this sounds like is deadlock, but I do not use any transactions in my code, and every SELECT or INSERT is simply a one off. Some threads, however, keep the same connection when they do their operation (which includes a mix of SELECTS and INSERTS and other modifiers). I would appericiate it if you could shade a light on this, and also ways around fixing it (besides using a different database engine.)

    Read the article

  • Convert object to DateRange

    - by user655832
    I'm querying an underlying PostgreSQL database using Pandas 0.8. Pandas is returning the DataFrame properly but the underlying timestamp column in my database is being returned as a generic "object" type in Pandas. As I would eventually like to seasonal normalization of my data I am curious as to how to convert this generic "object" column to something that is appropriate for analysis. Here is my current code to retrieve the data: # get records from db example import pandas.io.sql as psql import psycopg2 # define query to get all subs created this year QRY = """ select i i, i * random() f, case when random() > 0.5 then true else false end t, (current_date - (i*random())::int)::timestamp with time zone tsz from generate_series(1,1000) as s(i) order by 4 ; """ CONN_STRING = "host='localhost' port=5432 dbname='postgres' user='postgres'" # connect to db conn = psycopg2.connect(CONN_STRING) # get some data set index on relid column df = psql.frame_query(QRY, con=conn) print "Row count retrieved: %i" % (len(df),) Thanks for any help you can render. M

    Read the article

  • Does urllib2.urlopen() actually fetch the page?

    - by beagleguy
    hi all, I was condering when I use urllib2.urlopen() does it just to header reads or does it actually bring back the entire webpage? IE does the HTML page actually get fetch on the urlopen call or the read() call? handle = urllib2.urlopen(url) html = handle.read() The reason I ask is for this workflow... I have a list of urls (some of them with short url services) I only want to read the webpage if I haven't seen that url before I need to call urlopen() and use geturl() to get the final page that link goes to (after the 302 redirects) so I know if I've crawled it yet or not. I don't want to incur the overhead of having to grab the html if I've already parsed that page. thanks!

    Read the article

  • Does this introduce security vulnerabilities?

    - by mcmt
    I don't think I'm missing anything. Then again I'm kind of a newbie. def GET(self, filename): name = urllib.unquote(filename) full = path.abspath(path.join(STATIC_PATH, filename)) #Make sure request is not tricksy and tries to get out of #the directory, e.g. filename = "../.ssh/id_rsa". GET OUTTA HERE assert full[:len(STATIC_PATH)] == STATIC_PATH, "bad path" return open(full).read()

    Read the article

  • In SqlAlchemy, how to ignore m2m relationship attributes when merge?

    - by ablmf
    There is a m2m relation in my models, User and Role. I want to merge a role, but i DO NOT want this merge has any effect on user and role relation-ship. Unfortunately, for some complicate reason, role.users if not empty. I tried to set role.users = None, but SA complains None is not a list. At this moment, I use sqlalchemy.orm.attributes.del_attribute, but I don't know if it's provided for this purpose.

    Read the article

< Previous Page | 388 389 390 391 392 393 394 395 396 397 398 399  | Next Page >