Search Results

Search found 19662 results on 787 pages for 'python module'.

  • Extract anything that looks like links from large amount of data in python

    - by Riz
    Hi, I have around 5 GB of HTML data which I want to process to find links to a set of websites and perform some additional filtering. Right now I use a simple regexp for each site and iterate over them, searching for matches. In my case links can be outside of "a" tags and can be malformed in many ways (like "\n" in the middle of a link), so I try to grab as many "links" as I can and check them later in other scripts (so no BeautifulSoup\lxml\etc). The problem is that my script is pretty slow, so I am looking for ways to speed it up. I am writing a set of tests to check different approaches, but hope to get some advice :) Right now I am thinking about getting all links without filtering first (maybe using a C module or standalone app, which doesn't use regexp but a simple search to get the start and end of every link) and then using regexp to match the ones I need.

  • Apply function to one element of a list in Python

    - by user189637
    I'm looking for a concise and functional style way to apply a function to one element of a tuple and return the new tuple, in Python. For example, for the following input: inp = ("hello", "my", "friend") I would like to be able to get the following output: out = ("hello", "MY", "friend") I came up with two solutions which I'm not satisfied with. One uses a higher-order function. def apply_at(arr, func, i): return arr[0:i] + [func(arr[i])] + arr[i+1:] apply_at(inp, lambda x: x.upper(), 1) One uses list comprehensions (this one assumes the length of the tuple is known). [(a,b.upper(),c) for a,b,c in [inp]][0] Is there a better way? Thanks!
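
    One possible answer, as a minimal sketch: rebuild the tuple with a generator expression so the input is never mutated and any sequence is accepted. The name apply_at is reused from the question; the rest is an assumption about what "concise and functional" should mean here.

```python
def apply_at(seq, func, i):
    """Return a new tuple like seq, with func applied only to the item at index i."""
    return tuple(func(item) if index == i else item
                 for index, item in enumerate(seq))


inp = ("hello", "my", "friend")
out = apply_at(inp, str.upper, 1)
print(out)  # ('hello', 'MY', 'friend')
```

    Passing str.upper directly avoids the lambda, and because a new tuple is built the original inp stays unchanged.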

  • Python scope problems only when _assigning_ to a variable

    - by wallacoloo
    So I'm having a very strange error right now. I found where it happens, and here's the simplest code that can reproduce it. def parse_ops(str_in): c_type = "operator" def c_dat_check_type(t): print c_type #c_type = t c_dat_check_type("number") >>> parse_ops("12+a*2.5") If you run it as-is, it prints "operator". But if you uncomment that line, it gives an error: Traceback (most recent call last): File "<pyshell#212>", line 1, in <module> parse_ops("12+a*2.5") File "<pyshell#211>", line 7, in parse_ops c_dat_check_type("number") File "<pyshell#211>", line 4, in c_dat_check_type print c_type UnboundLocalError: local variable 'c_type' referenced before assignment Notice the error occurs on the line that worked just fine before. Any ideas what causes this and how I can fix this? I'm using Python 2.6.1.
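
    The behaviour follows from Python's scoping rule: an assignment anywhere in a function body makes that name local for the whole function, so the earlier print sees an unassigned local. A minimal sketch of one fix, assuming Python 3 (nonlocal does not exist in Python 2.6, where a common workaround is to keep the value in a mutable container such as a one-element list). The seen list is a hypothetical addition used only to make the result inspectable.

```python
def parse_ops(str_in):
    c_type = "operator"
    seen = []  # hypothetical helper: records what the inner function observed

    def c_dat_check_type(t):
        nonlocal c_type      # rebind the enclosing variable instead of creating a local
        seen.append(c_type)  # reads "operator" without raising UnboundLocalError
        c_type = t

    c_dat_check_type("number")
    return seen, c_type


result = parse_ops("12+a*2.5")
print(result)  # (['operator'], 'number')
```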

  • Problems trying to format currency with Python (Django)

    - by h3
    I have the following code in Django: import locale locale.setlocale( locale.LC_ALL, '' ) def format_currency(i): return locale.currency(float(i), grouping=True) It works on some computers in dev mode, but as soon as I try to deploy it on production I get this error: Exception Type: TemplateSyntaxError Exception Value: Caught ValueError while rendering: Currency formatting is not possible using the 'C' locale. Exception Location: /usr/lib/python2.6/locale.py in currency, line 240 The weird thing is that I can do this on the production server and it will work without any errors: python manage.py shell >>> import locale >>> locale.setlocale( locale.LC_ALL, '' ) 'en_CA.UTF-8' >>> locale.currency(1, grouping=True) '$1.00' I... don't get it.

  • Dealing with Windows line-endings in Python

    - by Adam Nelson
    I've got a 700MB XML file coming from a Windows provider. As one might expect, the line endings are '\r\n' (or ^M in vi). What is the most efficient way to deal with this situation aside from getting the supplier to send over '\n' :-) Use os.linesep Use rstrip() (requiring opening the file ... which seems crazy) Using Universal newline support is not standard on my Mac Snow Leopard - so isn't an option. I'm open to anything that requires Python 2.6+ but it needs to work on Snow Leopard and Ubuntu 9.10 with minimal external requirements. I don't mind a small performance penalty but I am looking for the standard best way to deal with this.
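
    A minimal sketch of one approach, assuming the file can be read in text mode: with newline=None the built-in open() translates '\r\n' to '\n' while iterating, so the file is streamed line by line rather than loaded whole. This is written for Python 3; the io module (with io.open and the same newline parameter) has been in the standard library since Python 2.6. The temporary file and its contents are stand-ins for the real XML file.

```python
import os
import tempfile

# Stand-in for the Windows-produced file: three lines ending in \r\n.
with tempfile.NamedTemporaryFile(delete=False, suffix=".xml") as tmp:
    tmp.write(b"<a>\r\n<b/>\r\n</a>\r\n")
    path = tmp.name

# newline=None enables universal newline translation on read.
with open(path, "r", newline=None, encoding="utf-8") as handle:
    lines = [line for line in handle]

os.remove(path)
print(lines)  # ['<a>\n', '<b/>\n', '</a>\n']
```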

  • Regular expressions in python unicode

    - by Remy
    I need to remove all the HTML tags from a given webpage's data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE) But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?
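
    Two things probably go wrong here, judging only from the excerpt: the fourth positional argument of re.sub is count, so re.UNICODE must be passed as flags=..., and find_all returns tag objects rather than strings, which would explain "expected string or buffer". A minimal Python 3 sketch of both fixes follows; the rows list uses plain strings as a stand-in for the BeautifulSoup results, and strip_tags is a hypothetical helper name.

```python
import re

# Compile once; the flag goes to compile(), not into the count slot of sub().
TAG_RE = re.compile(r"<[^>]*>", flags=re.UNICODE)


def strip_tags(markup):
    """Remove anything that looks like a tag; str() converts tag-like objects to text."""
    return TAG_RE.sub("", str(markup))


rows = ["<p>caf\u00e9 <b>au</b> lait</p>", "<br/>plain"]  # stand-in input
cleaned = [strip_tags(row) for row in rows]
print(cleaned)
```

    With a real BeautifulSoup tag, calling its get_text() method is usually a simpler way to drop markup than a regular expression.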

  • Python programming. Accessing Windows right click menu options

    - by Zack
    I'm hoping to automate a few tasks at work. One of them is combining and converting PowerPoint files to PDFs. I'm a bit of a newbie (I just finished Magnus Lie Hetland's Beginning Python), so I'm not entirely sure what I'm specifically asking. On Windows, one can select multiple files, right click, and select "combine as Adobe PDF". I've figured out the 'grouping' of the files I want to convert (I traverse the dir and nest the files inside a list based on their names), but I'm unsure how to pursue the next step (the right-click/combine command). Googling has led me to things like win32api, pywinauto, and ctypes. But as I read over what they do, my newbieness prevents me from knowing which is the tool I need. Could anyone suggest a few good resources or tips?

  • Subtracting two lists in Python

    - by wich
    In Python, how can one subtract two non-unique, unordered lists? Say we have a = [0,1,2,1,0] and b = [0, 1, 1]. I'd like to do something like c = a - b and have c be [2, 0] or [0, 2]; order doesn't matter to me. This should throw an exception if a does not contain all elements in b. Note this is different from sets! I'm not interested in finding the difference of the sets of elements in a and b, I'm interested in the difference between the actual collections of elements in a and b. I can probably work this out with a for loop, looking up the first element of b in a and then removing the element from b and from a, etc. But this doesn't appeal to me; I'd like to do this with a list comprehension in a nice and easy way. Is this possible?
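
    A minimal sketch using collections.Counter as a multiset, which is one way (not the only one) to get this behaviour. The function name subtract_lists and the choice of ValueError are assumptions for illustration.

```python
from collections import Counter


def subtract_lists(a, b):
    """Multiset difference a - b; raises ValueError if b has items a lacks."""
    remaining = Counter(a)
    needed = Counter(b)
    missing = needed - remaining  # keeps only counts that a cannot cover
    if missing:
        raise ValueError("a does not contain: %r" % list(missing.elements()))
    remaining.subtract(needed)
    return list(remaining.elements())  # elements() skips zero counts


a = [0, 1, 2, 1, 0]
b = [0, 1, 1]
c = subtract_lists(a, b)
print(sorted(c))  # [0, 2]
```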

  • Return numerical array in python

    - by khan
    Okay, this is kind of an interesting question. I have a PHP form through which the user enters values for x and y like this: X: [1,3,4] Y: [2,4,5] These values are stored in the database as varchars. From there, they are read by a Python program which is supposed to use them as numerical (numpy) arrays. However, they arrive as plain strings, which means that calculations cannot be performed on them. Is there a way to convert them into numerical arrays before processing, or is something else wrong? Help!
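
    A minimal sketch, assuming the stored text is always a bracketed list of numbers as in the example: such text is valid JSON, so json.loads turns it into a list of numbers. The variable names are hypothetical. If numpy is installed, the resulting lists can then be wrapped with numpy.array; that step is left out so the sketch needs only the standard library.

```python
import json

x_text = "[1,3,4]"  # as it would come back from the varchar column
y_text = "[2,4,5]"

x = json.loads(x_text)  # [1, 3, 4] as real integers
y = json.loads(y_text)

# Arithmetic now works element by element.
total = [a + b for a, b in zip(x, y)]
print(total)  # [3, 7, 9]
```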

  • How do I splice a python string programmatically?

    - by Robin Welch
    Very simple question, hopefully. So, in Python you can split up strings using indices as follows: >>> a="abcdefg" >>> print a[2:4] cd but how do you do this if the indices are based on variables? E.g. >>> j=2 >>> h=4 >>> print a[j,h] Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: string indices must be integers
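
    The error comes from the comma: a[j,h] indexes with the tuple (j, h), while slicing needs a colon. A short sketch with the question's own variables; the slice object is an optional extra for when the bounds need to be stored or passed around.

```python
a = "abcdefg"
j = 2
h = 4

piece = a[j:h]         # colon, not comma: same as a[2:4]
span = slice(j, h)     # a reusable slice built from the variables
same_piece = a[span]

print(piece)  # cd
```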

  • Find the min max and average of one column of data in python

    - by user1440194
    I have a set of data that looks like this: 201206040210 -3461.00000000 -8134.00000000 -4514.00000000 -4394.00000000 0 201206040211 -3580.00000000 -7967.00000000 -4614.00000000 -7876.00000000 0 201206040212 -3031.00000000 -9989.00000000 -9989.00000000 -3419.00000000 0 201206040213 -1199.00000000 -6961.00000000 -3798.00000000 -5822.00000000 0 201206040214 -2940.00000000 -5524.00000000 -5492.00000000 -3394.00000000 0 I want to take the second to last column and find the min, max, and average. I'm a little confused about how to use split when the columns are delimited by a space and a "-". I figure once I do that I can use the min() and max() functions. I have written a shell script to do the same here: #!/bin/ksh awk '{print substr($5,2);}' data > data1 sort -n data1 > data2 tail -1 data2 head -1 data2 awk '{sum+=$1} END {print "average = ",sum/NR}' data2 I'm just not sure how to do this in Python. Thanks
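
    A minimal sketch: split() with no arguments splits on whitespace only, so the "-" stays attached to each number as its sign and float() parses it directly. The sample rows are copied from the question into a string; reading them from a file is left out. Note one difference from the shell script, which strips the minus sign with substr: this sketch keeps the values negative.

```python
data = """\
201206040210 -3461.00000000 -8134.00000000 -4514.00000000 -4394.00000000 0
201206040211 -3580.00000000 -7967.00000000 -4614.00000000 -7876.00000000 0
201206040212 -3031.00000000 -9989.00000000 -9989.00000000 -3419.00000000 0
201206040213 -1199.00000000 -6961.00000000 -3798.00000000 -5822.00000000 0
201206040214 -2940.00000000 -5524.00000000 -5492.00000000 -3394.00000000 0
"""

# Index -2 is the second to last whitespace-separated field.
values = [float(line.split()[-2]) for line in data.splitlines() if line.strip()]

lowest = min(values)
highest = max(values)
average = sum(values) / len(values)
print(lowest, highest, average)  # -7876.0 -3394.0 -4981.0
```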

  • Python win32api not moving mouse cursor in VirtualBox

    - by wes
    I'm trying to use this Python code: import math for i in xrange(500): x = 500 + math.sin(math.pi * i / 100) * 500 y = 500 + math.cos(i) * 100 x, y = int(x), int(y) win32api.SetCursorPos((x, y)) time.sleep(.01) taken from here to move the mouse cursor in an XP VirtualBox. The mouse icon will flicker to the appropriate graphic (when it hits the edge of a window it turns into the <- resize image, for instance), but it doesn't actually move the visible cursor. I can move the mouse around while the code is running. Same result using the ctypes example in the above link. It works fine in the Win7 host. I have Guest Additions installed, if that matters.

  • Summary count for Python logging

    - by Craig McQueen
    At the end of my Python program, I'd like to be able to get a summary of the number of items logged through the standard logging module. I'd specifically like to be able to get a count for each specified name (and possibly its children). E.g. if I have: input_logger = getLogger('input') input_logger.debug("got input1") input_logger.debug("got input2") input_logger.debug("got input3") network_input_logger = getLogger('input.network') network_input_logger.debug("got network input1") network_input_logger.debug("got network input2") output_logger = getLogger('output') output_logger.debug("sent output1") Then at the end I'd like to get a summary such as: input: 5 input.network: 2 output: 1 Perhaps by calling a getcount() method for a logger or a handler. What would be a good way to achieve this? I imagine it would involve a sub-class of one of the logging classes, but I'm not sure which one would be best.
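
    One way this could be done, as a sketch rather than a definitive design: subclass logging.Handler, attach a single instance to the root logger, and credit every record to its own logger name and to each ancestor name. CountingHandler and its counts attribute are hypothetical names, not part of the logging module.

```python
import logging
from collections import Counter


class CountingHandler(logging.Handler):
    """Counts records per logger name, also crediting each parent name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        parts = record.name.split(".")
        for depth in range(1, len(parts) + 1):
            self.counts[".".join(parts[:depth])] += 1


counter = CountingHandler()
root = logging.getLogger()
root.addHandler(counter)       # records from all loggers propagate to root
root.setLevel(logging.DEBUG)   # let debug messages through

input_logger = logging.getLogger("input")
for n in (1, 2, 3):
    input_logger.debug("got input%d", n)

network_input_logger = logging.getLogger("input.network")
network_input_logger.debug("got network input1")
network_input_logger.debug("got network input2")

output_logger = logging.getLogger("output")
output_logger.debug("sent output1")

for name in sorted(counter.counts):
    print("%s: %d" % (name, counter.counts[name]))
```

    A handler was chosen over a Logger subclass here because one handler on the root sees every propagated record, so no logger needs to be created specially.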

  • Python | How to create dynamic and expandable dictionaries

    - by MMRUser
    I want to create a Python dictionary which holds values in a multidimensional structure and which should be able to expand. This is the structure in which the values should be stored: userdata = {'data':[{'username':'Ronny Leech','age':'22','country':'Siberia'},{'username':'Cronulla James','age':'34','country':'USA'}]} Let's say I want to add another user: def user_list(): users = [] for i in xrange(5, 0, -1): lonlatuser.append(('username','%s %s' % firstn, lastn)) lonlatuser.append(('age',age)) lonlatuser.append(('country',country)) return dict(user) This will only return a dictionary with a single value in it (since the key names are the same, values will be overwritten). So how do I append a set of values to this dictionary? Note: assume age, firstn, lastn and country are dynamically generated. Thanks.
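
    A minimal sketch of one approach: keep one dictionary per user and append it to the list stored under 'data', so repeated keys never collide. The helper add_user and the third user's details are made up for illustration.

```python
userdata = {
    "data": [
        {"username": "Ronny Leech", "age": "22", "country": "Siberia"},
        {"username": "Cronulla James", "age": "34", "country": "USA"},
    ]
}


def add_user(store, firstn, lastn, age, country):
    """Append one user record; each record is its own dict, so keys can repeat."""
    store["data"].append({
        "username": "%s %s" % (firstn, lastn),  # note the tuple after %
        "age": age,
        "country": country,
    })


add_user(userdata, "Ada", "Example", "41", "Norway")  # hypothetical values
print(len(userdata["data"]))  # 3
```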

  • Python Parse CSV Correctly

    - by cornerstone
    I am very new to Python. I want to parse a CSV file such that it will recognize quoted values. For example, 1997,Ford,E350,"Super, luxurious truck" should be split as ('1997', 'Ford', 'E350', 'Super, luxurious truck') and NOT ('1997', 'Ford', 'E350', '"Super', ' luxurious truck"'). The latter is what I get if I use something like str.split(). How do I do this? Also, would it be best to store these values in an array or some other data structure? After I get these values from the CSV I want to be able to easily choose, let's say, any two of the columns and store them as another array or some other data structure. Thanks in advance.
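
    A minimal sketch with the standard csv module, which handles quoted fields. The sample line is fed from a string through io.StringIO (Python 3) so the sketch is self-contained; with a real file, the open file object would be passed to csv.reader instead. Storing rows as a list of lists and picking columns by index is one simple option, not the only one.

```python
import csv
import io

text = '1997,Ford,E350,"Super, luxurious truck"\n'

rows = list(csv.reader(io.StringIO(text)))  # one list of fields per line
first = tuple(rows[0])
print(first)  # ('1997', 'Ford', 'E350', 'Super, luxurious truck')

# Picking any two columns, here year (0) and model (2):
year_and_model = [(row[0], row[2]) for row in rows]
```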

  • What are "named tuples" in Python?

    - by Denilson Sá
    Reading the changes in Python 3.1, I found something... unexpected: The sys.version_info tuple is now a named tuple: I never heard about named tuples before, and I thought elements could either be indexed by numbers (like in tuples and lists) or by keys (like in dicts). I never expected they could be indexed both ways. Thus, my questions are: What are named tuples? How to use them? Why/when should I use named tuples instead of normal tuples? Why/when should I use normal tuples instead of named tuples? Is there any kind of "named list" (a mutable version of the named tuple)?
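
    A short sketch of collections.namedtuple, which creates a tuple subclass whose fields can be read by position or by name. The Point type and its fields are hypothetical examples.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # a new tuple subclass with named fields

p = Point(x=1, y=2)
by_index = p[0]   # works like an ordinary tuple
by_name = p.x     # and like an object with attributes
a, b = p          # unpacking still works

# Named tuples are immutable; _replace returns a modified copy.
moved = p._replace(x=10)
print(p, moved)  # Point(x=1, y=2) Point(x=10, y=2)
```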

  • Identifying a function call in a python script line in runtime

    - by Dani
    I have a python script that I run with 'exec'. The script's string has calls to functions. When a function is called, I would like it to know the line number and offset in line for that call in the script (in the string I fed exec with). Here is an example. If my script is: foo1(); foo2(); foo1() foo3() And if I have code that prints (line,offset) in every function, I should get (0,0), (0,8), (0,16), (1,0) In most cases this can be easily done by getting the stack frame, because it contains the line number and the function name. The only problem is when there are two functions with the same name in a certain line. Unfortunately this is a common case for me. Any ideas?

  • Rearrange a python list into n lists, by column

    - by Ben R
    Trying to solve this at this hour has gotten my mind into a tail-spin: I want to rearrange a list l into a list of n lists, where n is the number of columns. E.g., l = [1,2,3,4,5,6,7,8] n = 5 ==> [[1,6],[2,7],[3,8],[4],[5]] Another example: l = [1,2,3,4,5,6,7,8,9,10] n = 4 ==> [[1,5,9],[2,6,10],[3,7],[4,8]] Can someone please help me out with an algorithm? Feel free to use any Python awesomeness that's available; I'm sure there's some cool mechanism that's a good fit for this, I just can't think of it.
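
    A minimal sketch using extended slicing: l[i::n] takes every n-th item starting at position i, which is exactly column i. The function name columns is an assumption.

```python
def columns(items, n):
    """Split items into n lists, dealing them out column by column."""
    return [items[i::n] for i in range(n)]


print(columns([1, 2, 3, 4, 5, 6, 7, 8], 5))
# [[1, 6], [2, 7], [3, 8], [4], [5]]
print(columns([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4))
# [[1, 5, 9], [2, 6, 10], [3, 7], [4, 8]]
```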

  • Sympy python circumference

    - by Mattia Villani
    I need to display a circumference. In order to do that I thought I could calculate, for a lot of x values, the two values of y, so I did: import sympy as sy from sympy.abc import x,y f = x**2 + y**2 - 1 a = x - 0.5 sy.solve([f,a],[x,y]) and this is what I get: Traceback (most recent call last): File "<input>", line 1, in <module> File "/usr/lib/python2.7/dist-packages/sympy/solvers/solvers.py", line 484, in solve solution = _solve(f, *symbols, **flags) File "/usr/lib/python2.7/dist-packages/sympy/solvers/solvers.py", line 749, in _solve result = solve_poly_system(polys) File "/usr/lib/python2.7/dist-packages/sympy/solvers/polysys.py", line 40, in solve_poly_system return solve_biquadratic(f, g, opt) File "/usr/lib/python2.7/dist-packages/sympy/solvers/polysys.py", line 48, in solve_biquadratic G = groebner([f, g]) File "/usr/lib/python2.7/dist-packages/sympy/polys/polytools.py", line 5308, in groebner raise DomainError("can't compute a Groebner basis over %s" % domain) DomainError: can't compute a Groebner basis over RR How can I calculate the values of y?

  • Python .app doesn't read .txt file like it should

    - by Bambo
    This question relates to this one: Python app which reads and writes into its current working directory as a .app/exe. I got the path to the .txt file fine; however, now when I try to open it and read the contents it seems that it doesn't extract the data properly. Here's my code - http://pastie.org/4876896 These are the errors I'm getting: 30/09/2012 10:28:49.103 [0x0-0x4e04e].org.pythonmac.unspecified.main: for index, item in enumerate( lines ): # iterate through lines 30/09/2012 10:28:49.103 [0x0-0x4e04e].org.pythonmac.unspecified.main: TypeError: 'NoneType' object is not iterable I kind of understand what the errors mean; however, I'm not sure why they are being flagged up, because if I run my script when it is not in .app form it doesn't get these errors and extracts the data fine.

  • stuck in while loop python

    - by user1717330
    I am creating a chat server in Python and have gotten quite far for a noob in the language. I have one problem at the moment which I want to solve before I go further, but I cannot seem to find out how to solve it. It is about a while loop that keeps running; the code below is where it goes wrong: while 1: try: data = self.channel.recv ( 1024 ) print "Message from client: ", data if "exit" in data: self.channel.send("You have closed your connection.\n") break except KeyboardInterrupt: break except: raise When this piece of code gets executed, on my client I need to enter "exit" to quit the connection. This works like a charm, but when I use CTRL+C to exit the connection, my server prints "Message from client: " a couple of thousand times. Where am I going wrong?
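
    A likely cause, judging only from the excerpt: when the client goes away, recv() returns an empty result immediately instead of blocking, so the loop spins. A minimal Python 3 sketch that breaks on empty data; the serve function is a hypothetical stand-in for the server method, and socket.socketpair() plays the role of the client connection.

```python
import socket


def serve(channel):
    """Read until the peer says exit or closes; return the chunks received."""
    received = []
    while True:
        data = channel.recv(1024)
        if not data:  # empty bytes: the peer closed the connection
            break
        received.append(data)
        if b"exit" in data:
            channel.sendall(b"You have closed your connection.\n")
            break
    channel.close()
    return received


server_end, client_end = socket.socketpair()
client_end.sendall(b"hello")
client_end.close()  # simulates the client being killed with CTRL+C
messages = serve(server_end)
print(b"".join(messages))  # b'hello'
```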

  • Python unittest with expensive setup

    - by Staale
    My test file is basically: class Test(unittest.TestCase): def testOk(): pass if __name__ == "__main__": expensiveSetup() try: unittest.main() finally: cleanUp() However, I do wish to run my test through Netbeans testing tools, and to do that I need unittests that don't rely on an environment setup done in main. Looking at http://stackoverflow.com/questions/402483/caching-result-of-setup-using-python-unittest - it recommends using Nose. However, I don't think Netbeans supports this. I didn't find any information indicating that it does. Additionally, I am the only one here actually writing tests, so I don't want to introduce additional dependencies for the other 2 developers unless they are needed. How can I do the setup and cleanup once for all the tests in my TestSuite? The expensive setup here is creating some files with dummy data, as well as setting up and tearing down a simple xml-rpc server. I also have 2 test classes, one testing locally and one testing all methods over xml-rpc.

  • Problems inserting file data into sqlite database using python

    - by tylerc230
    I'm trying to open an image file in python and add that data to an sqlite table. I created the table using: "CREATE TABLE "images" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL , "description" VARCHAR, "image" BLOB );" I am trying to add the image to the db using: imageFile = open(imageName, 'rb') b = sqlite3.Binary(imageFile.read()) targetCursor.execute("INSERT INTO images (image) values(?)", (b,)) targetCursor.execute("SELECT id from images") for id in targetCursor: imageid= id[0] targetCursor.execute("INSERT INTO %s (questionID,imageID) values(?,?)" % table, (questionId, imageid)) When I print the value of 'b' it looks like binary data but when I call: 'select image from images where id = 1' I get '????' printed to the console. Anyone know what I'm doing wrong?
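
    The insert itself may well be fine: '????' is plausibly just how a console renders raw binary bytes. A minimal sketch for checking that, assuming Python 3 and an in-memory database: store a blob, read it back, and compare the bytes. The payload is a stand-in for the image file contents, and cursor.lastrowid replaces the loop over all ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "description VARCHAR, image BLOB)"
)

payload = bytes(range(256))  # stand-in for imageFile.read()
cursor = conn.execute(
    "INSERT INTO images (image) VALUES (?)", (sqlite3.Binary(payload),)
)
image_id = cursor.lastrowid  # id of the row just inserted

stored = conn.execute(
    "SELECT image FROM images WHERE id = ?", (image_id,)
).fetchone()[0]
conn.close()

print(bytes(stored) == payload)  # True
```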

  • Python CSV file processing

    - by kingwarchief
    I just got introduced to python, the first language I get to learn, and I have this question below: I have an excel based CSV file with two columns (or rows, Pythonically) that I am working on. What I need to do is to perform some operations so that I can compare the two data entries in each 'row'. To be more precise, one column has constant numbers all the way down, whereas the other column varies. So I need to count the number of times the varying column data entry values crosses the constant value on the other column. For example:
        Varying Column; Constant Column
        24      25
        26      25    crosses
        27      25
        26      25
        25.5    25
        23      25    crossed
        26      25    crossed
    So in this case the number of times there is a cross
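
    A minimal sketch of one reading of the task: a "cross" is counted whenever the varying value moves from one side of the constant to the other, and values exactly equal to the constant are ignored. That definition and the name count_crossings are assumptions; the rows are the example values from the question, and reading them from the CSV file is left out.

```python
def count_crossings(rows):
    """Count how often the varying value switches sides of the constant."""
    crossings = 0
    previous_side = 0  # -1 below, +1 above, 0 not yet known
    for varying, constant in rows:
        diff = varying - constant
        side = (diff > 0) - (diff < 0)
        if side and previous_side and side != previous_side:
            crossings += 1
        if side:
            previous_side = side
    return crossings


rows = [(24, 25), (26, 25), (27, 25), (26, 25), (25.5, 25), (23, 25), (26, 25)]
print(count_crossings(rows))  # 3
```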

  • Using Python simplejson for transmitting JSON to another server results in unicode encoding problems

    - by Mark
    Hi there, I'm encoding a string with Python's simplejson library with special characters: hello testing spécißl characters plusses: +++++ special chars :œ∑´®†¥¨ˆøπ“ß∂ƒ©˙∆˚¬Ω≈ç√∫˜µ≤≥ However, when I encode it and transmit it to the other machine (using POST), it turns out like this: {'message': ['{"body": "hello testing sp\\u00e9ci\\u00dfl characters\\n\\nplusses: \\n\\nspecial chars :\\u0153\\u2211\\u00b4\\u00ae\\u2020\\u00a5\\u00a8\\u02c6\\u00f8\\u03c0\\u201c\\u00df\\u2202\\u0192\\u00a9\\u02d9\\u2206\\u02da\\u00ac\\u03a9\\u2248\\u00e7\\u221a\\u222b\\u02dc\\u00b5\\u2264\\u2265"}']} The + signs are completely stripped and the rest are in this unicode(?) format. My code for this is: data = {'body': data_string} data_encoded = json.dumps(data) Any ideas? Thanks! Edit: I've tried using json.dumps(data, ensure_ascii=False) but it results in a UnicodeError ordinal not in range error.
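
    Two separate effects seem to be at work, judging from the excerpt. The \uXXXX sequences are ordinary JSON escapes and decode back to the original characters. The missing plus signs are probably a form-encoding issue: in a POST form body "+" means a space, so the JSON text needs to be percent-encoded before sending. A minimal Python 3 sketch of the round trip, using the standard json module in place of simplejson; the field name message is taken from the received output, and parse_qs stands in for the receiving server.

```python
import json
from urllib.parse import parse_qs, urlencode

data_string = "plusses: +++++ sp\u00e9ci\u00dfl"
data_encoded = json.dumps({"body": data_string})  # non-ASCII becomes \uXXXX

# urlencode percent-encodes the value, turning each "+" into %2B.
post_body = urlencode({"message": data_encoded})

# What the receiving side would do with the form body:
received = parse_qs(post_body)["message"][0]
round_tripped = json.loads(received)["body"]

print(round_tripped == data_string)  # True
```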
