Search Results

Search found 20092 results on 804 pages for 'python import'.

Page 175/804 | < Previous Page | 171 172 173 174 175 176 177 178 179 180 181 182  | Next Page >

  • How to find links and modify an Html using BeautifulSoup in Python

    - by systempuntoout
    Starting from an Html input like this: <p> <a href="http://www.foo.com">this if foo</a> <a href="http://www.bar.com">this if bar</a> </p> using BeautifulSoup, i would like to change this Html in: <p> <a href="http://www.foo.com">this if foo[1]</a> <a href="http://www.bar.com">this if bar[2]</a> </p> saving parsed links in a dictionary with a result like this: links_dict = {"1":"http://www.foo.com","2":"http://www.bar.com"} Is it possible to do this using BeautifulSoup? Any valid alternative?

    Read the article

  • encapsulation in python list (want to use " instead of ')

    - by Codehai
    I have a list of users users["pirates"] and they're stored in the format ['pirate1','pirate2']. If I hand the list over to a def and query for it in MongoDB, it returns data based on the first index (e.g. pirate1) only. If I hand over a list in the format ["pirate1","pirate"], it returns data based on all the elements in the list. So I think there's something wrong with the encapsulation of the elements in the list. My question: can I change the encapsulation from ' to " without replacing every ' on every element with a loop manually? Short Example: aList = list() # get pirate Stuff # users["pirates"] is a list returned by a former query # so e.g. users["pirates"][0] may be peter without any quotes for pirate in users["pirates"]: aList.append(pirate) aVar = pirateDef(aList) print(aVar) the definition: def pirateDef(inputList = list()): # prepare query col = mongoConnect().MYCOL # query for pirates Arrrr pirates = col.find({ "_id" : {"$in" : inputList}} ).sort("_id",1).limit(50) # loop over users userList = list() for person in pirates: # do stuff that has nothing to do with the problem # append user to userlist userList.append(person) return userList If the given list has ' encapsulation it returns: 'pirates': [{'pirate': 'Arrr', '_id': 'blabla'}] If capsulated with " it returns: 'pirates' : [{'_id': 'blabla', 'pirate' : 'Arrr'}, {'_id': 'blabla2', 'pirate' : 'cheers'}] EDIT: I tried figuring out, that the problem has to be in the MongoDB query. The list is handed over to the Def correctly, but after querying pirates only consists of 1 element... Thanks for helping me Codehai

    Read the article

  • How to replace only part of the match with python re.sub

    - by Arty
    I need to match two cases by one reg expression and do replacement 'long.file.name.jpg' - 'long.file.name_suff.jpg' 'long.file.name_a.jpg' - 'long.file.name_suff.jpg' I'm trying to do the following re.sub('(\_a)?\.[^\.]*$' , '_suff.',"long.file.name.jpg") But this is cut the extension '.jpg' and I'm getting long.file.name_suff. instead of long.file.name_suff.jpg I understand that this is because of [^.]*$ part, but I can't exclude it, because I have to find last occurance of '_a' to replace or last '.' Is there a way to replace only part of the match?

    Read the article

  • building a pairwise matrix in scipy/numpy in Python from dictionaries

    - by user248237
    I have a dictionary whose keys are strings and values are numpy arrays, e.g.: data = {'a': array([1,2,3]), 'b': array([4,5,6]), 'c': array([7,8,9])} I want to compute a statistic between all pairs of values in 'data' and build an n by x matrix that stores the result. Assume that I know the order of the keys, i.e. I have a list of "labels": labels = ['a', 'b', 'c'] What's the most efficient way to compute this matrix? I can compute the statistic for all pairs like this: result = [] for elt1, elt2 in itertools.product(labels, labels): result.append(compute_statistic(data[elt1], data[elt2])) But I want result to be a n by n matrix, corresponding to "labels" by "labels". How can I record the results as this matrix? thanks.

    Read the article

  • Python method to remove iterability

    - by Debilski
    Suppose I have a function which can either take an iterable/iterator or a non-iterable as an argument. Iterability is checked with try: iter(arg). Depending whether the input is an iterable or not, the outcome of the method will be different. Not when I want to pass a non-iterable as iterable input, it is easy to do: I’ll just wrap it with a tuple. What do I do when I want to pass an iterable (a string for example) but want the function to take it as if it’s non-iterable? E.g. make that iter(str) fails.

    Read the article

  • python destructuring-bind dictionary contents

    - by Stephen
    Hi, I am trying to 'destructure' a dictionary and associate values with variables names after its keys. Something like params = {'a':1,'b':2} a,b = params.values() but since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a,b). Is there a nice way to do this? Thanks

    Read the article

  • Plotting a cumulative graph of python datetimes

    - by ventolin
    Say I have a list of datetimes, and we know each datetime to be the recorded time of an event happening. Is it possible in matplotlib to graph the frequency of this event occuring over time, showing this data in a cumulative graph (so that each point is greater or equal to all of the points that went before it), without preprocessing this list? (e.g. passing datetime objects directly to some wonderful matplotlib function) Or do I need to turn this list of datetimes into a list of dictionary items, such as: {"year": 1998, "month": 12, "date": 15, "events": 92} and then generate a graph from this list? Sorry if this seems like a silly question - I'm not all too familiar with matplotlib, and would like to save myself the effort of doing this the latter way if matplotlib can already deal with datetime objects itself.

    Read the article

  • Optimizing python link matching regular expression

    - by Matt
    I have a regular expression, links = re.compile('<a(.+?)href=(?:"|\')?((?:https?://|/)[^\'"]+)(?:"|\')?(.*?)>(.+?)</a>',re.I).findall(data) to find links in some html, it is taking a long time on certain html, any optimization advice? One that it chokes on is http://freeyourmindonline.net/Blog/

    Read the article

  • Python: eliminating stack traces into library code?

    - by Mark Harrison
    When I get a runtime exception from the standard library, it's almost always a problem in my code and not in the library code. Is there a way to truncate the exception stack trace so that it doesn't show the guts of the library package? For example, I would like to get this: Traceback (most recent call last): File "./lmd3-mkhead.py", line 71, in <module> main() File "./lmd3-mkhead.py", line 66, in main create() File "./lmd3-mkhead.py", line 41, in create headver1[depotFile]=rev TypeError: Data values must be of type string or None. and not this: Traceback (most recent call last): File "./lmd3-mkhead.py", line 71, in <module> main() File "./lmd3-mkhead.py", line 66, in main create() File "./lmd3-mkhead.py", line 41, in create headver1[depotFile]=rev File "/usr/anim/modsquad/oses/fc11/lib/python2.6/bsddb/__init__.py", line 276, in __setitem__ _DeadlockWrap(wrapF) # self.db[key] = value File "/usr/anim/modsquad/oses/fc11/lib/python2.6/bsddb/dbutils.py", line 68, in DeadlockWrap return function(*_args, **_kwargs) File "/usr/anim/modsquad/oses/fc11/lib/python2.6/bsddb/__init__.py", line 275, in wrapF self.db[key] = value TypeError: Data values must be of type string or None.

    Read the article

  • Extract anything that looks like links from large amount of data in python

    - by Riz
    Hi, I have around 5 GB of html data which I want to process to find links to a set of websites and perform some additional filtering. Right now I use simple regexp for each site and iterate over them, searching for matches. In my case links can be outside of "a" tags and be not well formed in many ways(like "\n" in the middle of link) so I try to grab as much "links" as I can and check them later in other scripts(so no BeatifulSoup\lxml\etc). The problem is that my script is pretty slow, so I am thinking about any ways to speed it up. I am writing a set of test to check different approaches, but hope to get some advices :) Right now I am thinking about getting all links without filtering first(maybe using C module or standalone app, which doesn't use regexp but simple search to get start and end of every link) and then using regexp to match ones I need.

    Read the article

  • Easy Python input question

    - by Josh K
    I'd like to have something similar to the following pseudo code: while input is not None and timer < 5: input = getChar() timer = time.time() - start if timer >= 5: print "took too long" else: print input Anyway to do this without threading? I would like an input method that returns whatever has been entered since the last time it was called, or None (null) if nothing was entered.

    Read the article

  • Python: Comparing specific columns in two csv files

    - by coder999
    Say that I have two CSV files (file1 and file2) with contents as shown below: file1: fred,43,Male,"23,45",blue,"1, bedrock avenue" file2: fred,39,Male,"23,45",blue,"1, bedrock avenue" I would like to compare these two CSV records to see if columns 0,2,3,4, and 5 are the same. I don't care about column 1. What's the most pythonic way of doing this? EDIT: Some example code would be appreciated.

    Read the article

  • optimize python code

    - by user283405
    i have code that uses BeautifulSoup library for parsing. But it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me about this? I am using beautifulsoup library for parsing and than save in DB. if i comment the save statement, than still it takes time so there is no problem with database. def parse(self,text): soup = BeautifulSoup(text) arr = soup.findAll('tbody') for i in range(0,len(arr)-1): data=Data() soup2 = BeautifulSoup(str(arr[i])) arr2 = soup2.findAll('td') c=0 for j in arr2: if str(j).find("<a href=") > 0: data.sourceURL = self.getAttributeValue(str(j),'<a href="') else: if c == 2: data.Hits=j.renderContents() #and few others... #... c = c+1 data.save() Any suggestions? Note: I already ask this question here but that was closed due to incomplete information.

    Read the article

  • Python finding substring between certain characters using regex and replace()

    - by jCuga
    Suppose I have a string with lots of random stuff in it like the following: strJunk ="asdf2adsf29Value=five&lakl23ljk43asdldl" And I'm interested in obtaining the substring sitting between 'Value=' and '&', which in this example would be 'five'. I can use a regex like the following: match = re.search(r'Value=?([^&>]+)', strJunk) >>> print match.group(0) Value=five >>> print match.group(1) five How come match.group(0) is the whole thing 'Value=five' and group(1) is just 'five'? And is there a way for me to just get 'five' as the only result? (This question stems from me only having a tenuous grasp of regex) I am also going to have to make a substitution in this string such such as the following: val1 = match.group(1) strJunk.replace(val1, "six", 1) Which yields: 'asdf2adsf29Value=six&lakl23ljk43asdldl' Considering that I plan on performing the above two tasks (finding the string between 'Value=' and '&', as well as replacing that value) over and over, I was wondering if there are any other more efficient ways of looking for the substring and replacing it in the original string. I'm fine sticking with what I've got but I just want to make sure that I'm not taking up more time than I have to be if better methods are out there.

    Read the article

  • Join a list of lists together into 1 list in Python

    - by dotty
    Hay All. I have a list which consists of many lists, here is an example [ [Obj, Obj, Obj, Obj], [Obj], [Obj], [ [Obj,Obj], [Obj,Obj,Obj] ] ] Is there a way to join all these items together as 1 list, so the output will be something like [Obj,Obj,Obj,Obj,Obj,Obj,Obj,Obj,Obj,Obj,Obj] Thanks

    Read the article

  • speed up calling lot of entities, and getting unique values, google app engine python

    - by user291071
    OK this is a 2 part question, I've seen and searched for several methods to get a list of unique values for a class and haven't been practically happy with any method so far. So anyone have a simple example code of getting unique values for instance for this code. Here is my super slow example. class LinkRating2(db.Model): user = db.StringProperty() link = db.StringProperty() rating2 = db.FloatProperty() def uniqueLinkGet(tabl): start = time.time() dic = {} query = tabl.all() for obj in query: dic[obj.link]=1 end = time.time() print end-start return dic My second question is calling for instance an iterator instead of fetch slower? Is there a faster method to do this code below? Especially if the number of elements called be larger than 1000? query = LinkRating2.all() link1 = 'some random string' a = query.filter('link = ', link1) adic ={} for itema in a: adic[itema.user]=itema.rating2

    Read the article

  • Working with a list, performing arithmetic logic in Python

    - by haea ohoh
    Suppose I have made a large list of numbers, and I want to make another one which I will add, pairwise, with the first list. Here's the first list, A: [109, 77, 57, 34, 94, 68, 96, 72, 39, 67, 49, 71, 121, 89, 61, 84, 45, 40, 104, 68, 54, 60, 68, 62, 91, 45, 41, 118, 44, 35, 53, 86, 41, 63, 111, 112, 54, 34, 52, 72, 111, 113, 47, 91, 107, 114, 105, 91, 57, 86, 32, 109, 84, 85, 114, 48, 105, 109, 68, 57, 78, 111, 64, 55, 97, 85, 40, 100, 74, 34, 94, 78, 57, 77, 94, 46, 95, 60, 42, 44, 68, 89, 113, 66, 112, 60, 40, 110, 89, 105, 113, 90, 73, 44, 39, 55, 108, 110, 64, 108] And here's B: [35, 106, 55, 61, 81, 109, 82, 85, 71, 55, 59, 38, 112, 92, 59, 37, 46, 55, 89, 63, 73, 119, 70, 76, 100, 49, 117, 77, 37, 62, 65, 115, 93, 34, 107, 102, 91, 58, 82, 119, 75, 117, 34, 112, 121, 58, 79, 69, 68, 72, 110, 43, 111, 51, 102, 39, 52, 62, 75, 118, 62, 46, 74, 77, 82, 81, 36, 87, 80, 56, 47, 41, 92, 102, 101, 66, 109, 108, 97, 49, 72, 74, 93, 114, 55, 116, 66, 93, 56, 56, 93, 99, 96, 115, 93, 111, 57, 105, 35, 99] How might I generate the arithmatic addition logic, processing each pairwise value one by one (A[0] and B[0], through A[99], B[99]) and producing the list C (A[0] + B[0] through A[99]+ B[99])?

    Read the article

< Previous Page | 171 172 173 174 175 176 177 178 179 180 181 182  | Next Page >