Search Results

Search found 13534 results on 542 pages for 'python'.

Page 73/542 | < Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80  | Next Page >

  • maintaining a large list in python

    - by Oren
    I need to maintain a large list of python pickleable objects that will quickly execute the following algorithm: The list will have 4 pointers and can: Read\Write each of the 4 pointed nodes Insert new nodes before the pointed nodes Increase a pointer by 1 Pop nodes from the beginning of the list (without overlapping the pointers) Add nodes at the end of the list The list can be very large (300-400 MB), so it shouldnt be contained in the RAM. The nodes are small (100-200 bytes) but can contain various types of data. The most efficient way it can be done, in my opinion, is to implement some paging-mechanism. On the harddrive there will be some database of a linked-list of pages where in every moment up to 5 of them are loaded in the memory. On insertion, if some max-page-size was reached, the page will be splitted to 2 small pages, and on pointer increment, if a pointer is increased beyond its loaded page, the following page will be loaded instead. This is a good solution, but I don't have the time to write it, especially if I want to make it generic and implement all the python-list features. Using a SQL tables is not good either, since I'll need to implement a complex index-key mechanism. I tried ZODB and zc.blist, which implement a BTree based list that can be stored in a ZODB database file, but I don't know how to configure it so the above features will run in reasonable time. I don't need all the multi-threading\transactioning features. No one else will touch the database-file except for my single-thread program. Can anyone explain me how to configure the ZODB\zc.blist so the above features will run fast, or show me a different large-list implementation?

    Read the article

  • Producing an view of a text's revision history in Python

    - by hekevintran
    I have two versions of a piece of text and I want to produce an HTML view of its revision similar to what Google Docs or Stack Overflow displays. I need to do this in Python. I don't know what this technique is called but I assume that it has a name and hopefully there is a Python library that can do it. Version 1: William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. Version 2: William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American. The desired output: William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American. Using the diff command doesn't work because it tells me which lines are different but not which columns/words are different. $ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen.' > oldfile $ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.' > newfile $ diff -u oldfile newfile --- oldfile 2010-04-30 13:32:43.000000000 -0700 +++ newfile 2010-04-30 13:33:09.000000000 -0700 @@ -1 +1 @@ -William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. +William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.' > oldfile

    Read the article

  • Python: fetching SVG file using urllib is returning binary when I need ASCII

    - by Drew Dara-Abrams
    I'm using urllib (in Python) to fetch an SVG file: import urllib urllib.urlopen('http://alpha.vectors.cloudmade.com/BC9A493B41014CAABB98F0471D759707/-122.2487,37.87588,-122.265823,37.868054?styleid=1&viewport=400x231').read() which produces output of the sort: xb6\xf6\x00\xb3\xfb2\xff\xda\xc5\xf2\xc2\x14\xef\xcd\x82\x0b\xdbU\xb0\x81\xcaF\xd8\x1a\xf6\xdf[i)\xba\xcf\x80\xab\xd6\x8c\xe3l_\xe7\n\xed2,\xbdm\xa0_|\xbb\x12\xff\xb6\xf8\xda\xd9\xc3\xd9\t\xde\x9a\xf8\xae\xb3T\xa3\r`\x8a\x08!T\xfb8\x92\x95\x0c\xdd\x8b!\x02P\xea@\x98\x1c^\xc7\xda\\\xec\xe3\xe1\xbe,0\xcd\xbeZ~\x92\xb3\xfa\xdd\xfcbyu\xb8\x83\xbb\xbdS\x0f\x82\x0b\xfe\xf5_\xdawn\xff\xef_\xff\xe5\xfa\x1f?\xbf\xffoZ\x0f\x8b\xbfV\xf4\x04\x00' when I was expecting more like this: <?xml version='1.0' encoding='UTF-8'?> <svg xmlns="http://www.w3.org/2000/svg" xmlns:cm="http://cloudmade.com/" width="400" height="231"> <rect width="100%" height="100%" fill="#eae8dd" opacity="1"/> <g transform="scale(0.209849975856)"> <g transform="translate(13610569, 4561906)" flood-opacity="0.1" flood-color="grey"> <path d="M -13610027.720000000670552 -4562403.660000000149012 I guess this is an issue of binary vs. ASCII. Can anyone help me (a Python newbie) with the appropriate conversion so that I can get on with parsing and manipulating the SVG code?

    Read the article

  • python - tkinter - update label from variable

    - by Tom
    I wrote a python script that does some stuff to generate and then keep changing some text stored as a string variable. This works, and I can print the string each time it gets changed. Problems have arisen while trying to display that output in a GUI (just as a basic label) using tkinter. I can get the label to display the string for the first time... but it never updates. This is really the first time I have tried to use tkinter, so it's likely I'm making a foolish error. What I've got looks logical to me, but I'm evidently going wrong somewhere! from tkinter import * outputText = 'Ready' counter = int(0) root = Tk() root.maxsize(400, 400) var = StringVar() l = Label(root, textvariable=var, anchor=NW, justify=LEFT, wraplength=398) l.pack() var.set(outputText) while True: counter = counter + 1 #do some stuff that generates string as variable 'result' outputText = result #do some more stuff that generates new string as variable 'result' outputText = result #do some more stuff that generates new string as variable 'result' outputText = result if counter == 5: break root.mainloop() I also tried: from tkinter import * outputText = 'Ready' counter = int(0) root = Tk() root.maxsize(400, 400) var = StringVar() l = Label(root, textvariable=var, anchor=NW, justify=LEFT, wraplength=398) l.pack() var.set(outputText) while True: counter = counter + 1 #do some stuff that generates string as variable 'result' outputText = result var.set(outputText) #do some more stuff that generates new string as variable 'result' outputText = result var.set(outputText) #do some more stuff that generates new string as variable 'result' outputText = result var.set(outputText) if counter == 5: break root.mainloop() In both cases, the label will show 'Ready' but won't update to change that to the strings as they're generated later. After a fair bit of googling and looking through answers on this site, I thought the solution might be to use update_idletasks - I tried putting that in after each time the variable was changed, but it didn't help. It also seems possible I am meant to be using trace and callback somehow to make all this work...but I can't get my head around how that works (I tried, but didn't manage to make anything that even looked like it would do something, let alone actually worked). I'm still very new to both python and especially tkinter, so, any help much appreciated but please spell it out for me if you can :)

    Read the article

  • using python 'with' statement with iterators?

    - by Stephen
    Hi, I'm using Python 2.5. I'm trying to use this 'with' statement. from __future__ import with_statement a = [] with open('exampletxt.txt','r') as f: while True: a.append(f.next().strip().split()) print a The contents of 'exampletxt.txt' are simple: a b In this case, I get the error: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/python-7036sVf.py", line 5, in <module> a.append(f.next().strip().split()) StopIteration And if I replace f.next() with f.read(), it seems to be caught in an infinite loop. I wonder if I have to write a decorator class that accepts the iterator object as an argument, and define an __exit__ method for it? I know it's more pythonic to use a for-loop for iterators, but I wanted to implement a while loop within a generator that's called by a for-loop... something like def g(f): while True: x = f.next() if test1(x): a = x elif test2(x): b = f.next() yield [a,x,b] a = [] with open(filename) as f: for x in g(f): a.append(x)

    Read the article

  • Python API for VirtualBox

    - by jessica
    I have made a command-line interface for virtualbox such that the virtualbox can be controlled from a remote machine. now I am trying to implement the commmand-line interface using python virtualbox api. For that I have downloaded the pyvb package (python api documentation shows functions that can be used for implementing this under pyvb package). but when I give pyvb.vb.VB.startVM(instance of VB class,pyvb.vm.vbVM) SERVER SIDE CODE IS from pyvb.constants import * from pyvb.vm import * from pyvb.vb import * import xpcom import pyvb import os import socket import threading class ClientThread ( threading.Thread ): # Override Thread's init method to accept the parameters needed: def init ( self, channel, details ): self.channel = channel self.details = details threading.Thread.__init__ ( self ) def run ( self ): print 'Received connection:', self.details [ 0 ] while 1: s= self.channel.recv ( 1024 ) if(s!='end'): if(s=='start'): v=VB() pyvb.vb.VB.startVM(v,pyvb.vm.vbVM) else: self.channel.close() break print 'Closed connection:', self.details [ 0 ] server = socket.socket ( socket.AF_INET, socket.SOCK_STREAM ) server.bind ( ( '127.0.0.1', 2897 ) ) server.listen ( 5 ) while True: channel, details = server.accept() ClientThread ( channel, details ).start() it shows an error Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner self.run() File "news.py", line 27, in run pyvb.vb.VB.startVM(v,pyvb.vm.vbVM.getUUID(m)) File "/usr/lib/python2.5/site-packages/pyvb-0.0.2-py2.5.egg/pyvb/vb.py", line 65, in startVM cmd='%s %s'%(VB_COMMAND_STARTVM, vm.getUUID()) AttributeError: 'str' object has no attribute 'getUUID'

    Read the article

  • How do you invoke a python script inside a jar file using python ?

    - by Trevor
    I'm working on an application that intersperses a bunch of jython and java code. Due to the nature of the program (using wsadmin) we are really restricted to Python 2.1 We currently have a jar containing both java source and .py modules. The code is currently invoked using java, but I'd like to remove this in favor of migrating as much functionality as possible to jython. The problem I have is that I want to either import or execute python modules inside the existing jar file from a calling jython script. I've tried a couple of different ways without success. My directory structure looks like: application.jar |-- com |--example |-- action |-- MyAction.class |-- pre_myAction.py The 1st approach I tried was to do imports from the jar. I added the jar to my sys.path and tried to import the module using both import com.example.action.myAction and import myAction. No success however, even when I put init.py files into the directory at each level. The 2nd approach I tried was to load the resource using the java class. So I wrote the below code: import sys import os import com.example.action.MyAction as MyAction scriptName = str(MyAction.getResource('/com/example/action/myAction.py')) scriptStr = MyAction.getResourceAsStream('/com/example/action/myAction.py') try: print execfile(scriptStr) except: print "failed 1" try: print execfile(scriptName) except: print "failed 2" Both of these failed. I'm at a bit of a loss now as to how I should proceed. Any ideas ? cheers, Trevor

    Read the article

  • Create a VPN with Python

    - by user213060
    I want to make a device "tunnel box" that you plug an input ethernet line, and an output ethernet line, and all the traffic that goes through it gets modified in a special way. This is similar to how a firewall, IDS, VPN, or similar boxes are connected inline in a network. I think you can just assume that I am writing a custom VPN in Python for the purpose of this question: LAN computer <--\ LAN computer <---> [LAN switch] <--> ["tunnel box"] <--> [internet modem] <--> LAN computer <--/ My question is, what is a good way to program this "tunnel box" from python? My application needs to see TCP flows at the network layer, not as individual ethernet frames. Non-TCP/IP traffic such as ICPM and other types should just be passed through. Example Twisted-like Code for my "tunnel box" tunnel appliance: from my_code import special_data_conversion_function class StreamInterceptor(twisted.Protocol): def dataReceived(self,data): data=special_data_conversion_function(data) self.outbound_connection.send(data) My initial guesses: TUN/TAP with twisted.pair.tuntap.py - Problem: This seems to only work at the ethernet frame level, not like my example? Socks proxy - Problem: Not transparent as in my diagram. Programs have to be specifically setup for it. Thanks!

    Read the article

  • Python doctest error

    - by user74283
    Hi I recently started experimenting with python currently reading "Think like a computer scientist: Learning python v2nd edition" I have been having some trouble with doctest. I use a windows 7 machine and Eclipse IDE with pydev. My question is when i run the script below i get the error below. Said script is below the the error message Traceback (most recent call last): File "C:\Users\shaytac\PythonProjects\test.py", line 21, in doctest.testmod() File "C:\Python26\lib\doctest.py", line 1829, in testmod for test in finder.find(m, name, globs=globs, extraglobs=extraglobs): File "C:\Python26\lib\doctest.py", line 852, in find self._find(tests, obj, name, module, source_lines, globs, {}) File "C:\Python26\lib\doctest.py", line 906, in _find globs, seen) File "C:\Python26\lib\doctest.py", line 894, in _find test = self._get_test(obj, name, module, globs, source_lines) File "C:\Python26\lib\doctest.py", line 978, in _get_test filename, lineno) File "C:\Python26\lib\doctest.py", line 597, in get_doctest return DocTest(self.get_examples(string, name), globs, File "C:\Python26\lib\doctest.py", line 611, in get_examples return [x for x in self.parse(string, name) File "C:\Python26\lib\doctest.py", line 573, in parse self._parse_example(m, name, lineno) File "C:\Python26\lib\doctest.py", line 631, in _parse_example self._check_prompt_blank(source_lines, indent, name, lineno) File "C:\Python26\lib\doctest.py", line 718, in _check_prompt_blank line[indent:indent+3], line)) ValueError: line 2 of the docstring for main.compare lacks blank after : 'compare(5, 4) ' def compare(a, b): """ >>>compare(5, 4) 1 >>>compare(7, 7) 0 >>>compare(2, 3) -1 >>>compare(42, 1) 1 """ if a > b : return 1 if a == b : return 0 if a < b : return -1 if __name__ == '__main__': import doctest doctest.testmod()

    Read the article

  • python: nonblocking subprocess, check stdout

    - by Will Cavanagh
    Ok so the problem I'm trying to solve is this: I need to run a program with some flags set, check on its progress and report back to a server. So I need my script to avoid blocking while the program executes, but I also need to be able to read the output. Unfortunately, I don't think any of the methods available from Popen will read the output without blocking. I tried the following, which is a bit hack-y (are we allowed to read and write to the same file from two different objects?) import time import subprocess from subprocess import * with open("stdout.txt", "wb") as outf: with open("stderr.txt", "wb") as errf: command = ['Path\\To\\Program.exe', 'para', 'met', 'ers'] p = subprocess.Popen(command, stdout=outf, stderr=errf) isdone = False while not isdone : with open("stdout.txt", "rb") as readoutf: #this feels wrong for line in readoutf: print(line) print("waiting...\\r\\n") if(p.poll() != None) : done = True time.sleep(1) output = p.communicate()[0] print(output) Unfortunately, Popen doesn't seem to write to my file until after the command terminates. Does anyone know of a way to do this? I'm not dedicated to using python, but I do need to send POST requests to a server in the same script, so python seemed like an easier choice than, say, shell scripting. Thanks! Will

    Read the article

  • Help me find an appropriate ruby/python parser generator

    - by Geo
    The first parser generator I've worked with was Parse::RecDescent, and the guides/tutorials available for it were great, but the most useful feature it has was it's debugging tools, specifically the tracing capabilities ( activated by setting $RD_TRACE to 1 ). I am looking for a parser generator that can help you debug it's rules. The thing is, it has to be written in python or in ruby, and have a verbose mode/trace mode or very helpful debugging techniques. Does anyone know such a parser generator ? EDIT: when I said debugging, I wasn't referring to debugging python or ruby. I was referring to debugging the parser generator, see what it's doing at every step, see every char it's reading, rules it's trying to match. Hope you get the point. BOUNTY EDIT: to win the bounty, please show a parser generator framework, and illustrate some of it's debugging features. I repeat, I'm not interested in pdb, but in parser's debugging framework. Also, please don't mention treetop. I'm not interested in it.

    Read the article

  • Python with Wiimote using pywiiuse module

    - by Anon
    After seeing the abilities and hackibility of wiimotes I really want to use it in my 'Intro to programming' final. Everyone must make a python program and present it to the class. I want to make a game with pygame incorporating a wiimote. I found pywiiuse which is a very basic wrapper for the wiiuse library using c types. I can NOT get anything beyond LEDs and vibrating to work. Buttons, IR, motion sensing, nothing. I've tried different versions of wiiuse, pywiiuse, even python. I can't even get the examples that came with it to run. Here's the code I made as a simple test. I copied some of the example C++ code. from pywiiuse import * from time import sleep #Init wiimotes = wiiuse_init() #Find and start the wiimote found = wiiuse_find(wiimotes,1,5) #Make the variable wiimote to the first wiimote init() found wiimote = wiimotes.contents #Set Leds wiiuse_set_leds(wiimote,WIIMOTE_LED_1) #Rumble for 1 second wiiuse_rumble(wiimote,1) sleep(1) wiiuse_rumble(wiimote,0) #Turn motion sensing on(supposedly) wiiuse_motion_sensing(wiimote,1) while 1: #Poll the wiimotes to get the status like pitch or roll if(wiiuse_poll(wiimote,1)): print 'EVENT' And here's the output when I run it. wiiuse version 0.9 wiiuse api version 8 [INFO] Found wiimote [assigned wiimote id 1]. EVENT EVENT Traceback (most recent call last): File "C:\Documents and Settings\Nick\Desktop\wiimotetext.py", line 26, in <mod ule> if(wiiuse_poll(wiimote,1)): WindowsError: exception: access violation reading 0x00000004 It seems each time I run it, it prints out EVENT 2-5 times until the trace back. I have no clue what to do at this point, I've been trying for the past two days to get it working. Thanks!

    Read the article

  • Python Pre-testing for exceptions when coverage fails

    - by Tal Weiss
    I recently came across a simple but nasty bug. I had a list and I wanted to find the smallest member in it. I used Python's built-in min(). Everything worked great until in some strange scenario the list was empty (due to strange user input I could not have anticipated). My application crashed with a ValueError (BTW - not documented in the official docs). I have very extensive unit tests and I regularly check coverage to avoid surprises like this. I also use Pylint (everything is integrated in PyDev) and I never ignore warnings, yet I failed to catch this bug before my users did. Is there anything I can change in my methodology to avoid these kind of runtime errors? (which would have been caught at compile time in Java / C#?). I'm looking for something more than wrapping my code with a big try-except. What else can I do? How many other build in Python functions are hiding nasty surprises like this???

    Read the article

  • Building a python module and linking it against a MacOSX framework

    - by madflo
    I'm trying to build a Python extension on MacOSX 10.6 and to link it against several frameworks (i386 only). I made a setup.py file, using distutils and the Extension object. I order to link against my frameworks, my LDFLAGS env var should look like : LDFLAGS = -lc -arch i386 -framework fwk1 -framework fwk2 As I did not find any 'framework' keyword in the Extension module documentation, I used the extra_link_args keyword instead. Extension('test', define_macros = [('MAJOR_VERSION', '1'), ,('MINOR_VERSION', '0')], include_dirs = ['/usr/local/include', 'include/', 'include/vitale'], extra_link_args = ['-arch i386', '-framework fwk1', '-framework fwk2'], sources = "testmodule.cpp", language = 'c++' ) Everything is compiling and linking fine. If I remove the -framework line from the extra_link_args, my linker fails, as expected. Here is the last two lines produced by a python setup.py build : /usr/bin/g++-4.2 -arch x86_64 -arch i386 -isysroot / -L/opt/local/lib -arch x86_64 -arch i386 -bundle -undefined dynamic_lookup build/temp.macosx-10.6-intel-2.6/testmodule.o -o build/lib.macosx-10.6-intel-2.6/test.so -arch i386 -framework sgdosx -framework srtosx -framework ssvosx -framework stsosx Unfortunately, the .so that I just produced is unable to find several symbols provided by this framework. I tried to check the linked framework with otool. None of them is appearing. $ otool -L test.so test.so: /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) There is the output of otool run on a test binary, made with g++ and ldd using the LDFLAGS described at the top of my post. On this example, the -framework did work. $ otool -L vitaosx vitaosx: /Library/Frameworks/sgdosx.framework/Versions/A/sgdosx (compatibility version 1.0.0, current version 1.0.0) /Library/Frameworks/ssvosx.framework/Versions/A/ssvosx (compatibility version 1.0.0, current version 1.0.0) /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) May this issue be linked to the "-undefined dynamic_lookup" flag on the linking step ? I'm a little bit confused by the few lines of documentation that I'm finding on Google. Cheers,

    Read the article

  • Endless problems with a very simple python subprocess.Popen task

    - by Thomas
    I'd like python to send around a half-million integers in the range 0-255 each to an executable written in C++. This executable will then respond with a few thousand integers. Each on one line. This seems like it should be very simple to do with subprocess but i've had endless troubles. Right now im testing with code: // main() u32 num; std::cin >> num; u8* data = new u8[num]; for (u32 i = 0; i < num; ++i) std::cin >> data[i]; // test output / spit it back out for (u32 i = 0; i < num; ++i) std::cout << data[i] << std::endl; return 0; Building an array of strings ("data"), each like "255\n", in python and then using: output = proc.communicate("".join(data))[0] ...doesn't work (says stdin is closed, maybe too much at one time). Neither has using proc.stdin and proc.stdout worked. This should be so very simple, but I'm getting constant exceptions, and/or no output data returned to me. My Popen is currently: proc = Popen('aux/test_cpp_program', stdin=PIPE, stdout=PIPE, bufsize=1) Advise me before I pull my hair out. ;)

    Read the article

  • Getting all new messages from a Maildir in python

    - by Jesper
    I have a mail dir: foo@foo:~/Maildir$ ls -l total 288 drwx------ 2 foo foo 155648 2010-04-19 15:19 cur -rw------- 1 foo foo 440 2010-03-20 08:50 dovecot.index.log -rw------- 1 foo foo 112 2010-03-20 08:49 dovecot-uidlist -rw------- 1 foo foo 8 2010-03-20 08:49 dovecot-uidvalidity -rw------- 1 foo foo 0 2010-03-20 08:49 dovecot-uidvalidity.4ba48c0e drwx------ 2 foo foo 114688 2010-04-19 16:07 new drwx------ 2 foo foo 4096 2010-04-19 16:07 tmp And in python I'm trying to get all new messages (Python 2.6.5rc2). First, getting "Maildir" works: >>> import mailbox >>> md = mailbox.Maildir('/home/foo/Maildir') >>> md.iterkeys().next() '1269924477.Vfc01I4249fM708004.foo' But how do I access "Maildir/new"? This does not work: >>> md = mailbox.Maildir('/home/foo/Maildir/new') >>> md.iterkeys().next() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.6/mailbox.py", line 346, in iterkeys self._refresh() File "/usr/lib/python2.6/mailbox.py", line 467, in _refresh for entry in os.listdir(subdir_path): OSError: [Errno 2] No such file or directory: '/home/foo/Maildir/new/new' >>> Any ideas?

    Read the article

  • Converting python collaborative filtering code to use Map Reduce

    - by Neil Kodner
    Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

    Read the article

  • Decoding tcp packets using python

    - by mikip
    Hello I am trying to decode data received over a tcp connection. The packets are small, no more than 100 bytes. However when there is a lot of them I receive some of the the packets joined together. Is there a way to prevent this. I am using python I have tried to separate the packets, my source is below. The packets start with STX byte and end with ETX bytes, the byte following the STX is the packet length, (packet lengths less than 5 are invalid) the checksum is the last bytes before the ETX def decode(data): while True: start = data.find(STX) if start == -1: #no stx in message pkt = '' data = '' break #stx found , next byte is the length pktlen = ord(data[1]) #check message ends in ETX (pktken -1) or checksum invalid if pktlen < 5 or data[pktlen-1] != ETX or checksum_valid(data[start:pktlen]) == False: print "Invalid Pkt" data = data[start+1:] continue else: pkt = data[start:pktlen] data = data[pktlen:] break return data , pkt I use it like this #process reports try: data = sock.recv(256) except: continue else: while data: data, pkt = decode(data) if pkt: process(pkt) Also if there are multiple packets in the data stream, is it best to return the packets as a collection of lists or just return the first packet I am not that familiar with python, only C, is this method OK. Any advice would be most appreciated. Thanks in advance Thanks

    Read the article

  • Python: puzzling behaviour inside httplib

    - by Anna
    I have added one line ( import pdb; pdb.set_trace() ) to httplib's HTTPConnection.putheader, so I can see what's going on inside. httplib.py, line 489: def putheader(self, header, value): """Send a request header line to the server. For example: h.putheader('Accept', 'text/html') """ import pdb; pdb.set_trace() if self.__state != _CS_REQ_STARTED: raise CannotSendHeader() str = '%s: %s' % (header, value) self._output(str) then ran this from the interpreter import urllib2 urllib2.urlopen('http://www.ioerror.us/ip/headers') ... and as expected PDB kicks in: > c:\python26\lib\httplib.py(858)putheader() -> if self.__state != _CS_REQ_STARTED: (Pdb) in PDB I have the luxury of evaluating expressions on the fly, so I have tried to enter self.__state: (Pdb) self.__state *** AttributeError: HTTPConnection instance has no attribute '__state' Alas, there is no __state of this instance. However when I enter step, the debugger gets past the if self.__state != _CS_REQ_STARTED: line without a problem. Why is this happening? If the self.__state doesn't exist python would have to raise an exception as it did when I entered the expression. Python version: 2.6.4 on win32

    Read the article

  • Speed vs security vs compatibility over methods to do string concatenation in Python

    - by Cawas
    Similar questions have been brought (good speed comparison there) on this same subject. Hopefully this question is different and updated to Python 2.6 and 3.0. So far I believe the faster and most compatible method (among different Python versions) is the plain simple + sign: text = "whatever" + " you " + SAY But I keep hearing and reading it's not secure and / or advisable. I'm not even sure how many methods are there to manipulate strings! I could count only about 4: There's interpolation and all its sub-options such as % and format and then there's the simple ones, join and +. Finally, the new approach to string formatting, which is with format, is certainly not good for backwards compatibility at same time making % not good for forward compatibility. But should it be used for every string manipulation, including every concatenation, whenever we restrict ourselves to 3.x only? Well, maybe this is more of a wiki than a question, but I do wish to have an answer on which is the proper usage of each string manipulation method. And which one could be generally used with each focus in mind (best all around for compatibility, for speed and for security). Thanks.

    Read the article

  • Reading UTF-8 XML and writing it to a file with Python

    - by Harri
    I'm trying to parse UTF-8 XML file and save some parts of it to another file. Problem is, that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write non-ascii character to a file, but it can print it to command prompt (at least in some level) Here's the XML (from the parts that matter at least, it's a *.resx file which contains UI strings) <?xml version="1.0" encoding="utf-8"?> <root> <resheader name="foo"> <value>bar</value> </resheader> <data name="lorem" xml:space="preserve"> <value>ipsum öä</value> </data> </root> And here's my python script from xml.dom.minidom import parse names = [] values = [] def getStrings(path): dom = parse(path) data = dom.getElementsByTagName("data") for i in range(len(data)): name = data[i].getAttribute("name") names.append(name) value = data[i].getElementsByTagName("value") values.append(value[0].firstChild.nodeValue.encode("utf-8")) def writeToFile(): with open("uiStrings-fi.py", "w") as f: for i in range(len(names)): line = names[i] + '="'+ values[i] + '"' #varName='varValue' f.write(line) f.write("\n") getStrings("ResourceFile.fi-FI.resx") writeToFile() And here's the traceback: Traceback (most recent call last): File "GenerateLanguageFiles.py", line 24, in writeToFile() File "GenerateLanguageFiles.py", line 19, in writeToFile line = names[i] + '="'+ values[i] + '"' #varName='varValue' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in ran ge(128) How should I fix my script so it would read and write UTF-8 characters properly? The files I'm trying to generate would be used in test automation with Robots Framework.

    Read the article

  • Replicating SQL's 'Join' in Python

    - by Daniel Mathews
    I'm in the process of trying to switch from R to Python (mainly issues around general flexibility). With Numpy, matplotlib and ipython, I've am able to cover all my use cases save for merging 'datasets'. I would like to simulate SQL's join by clause (inner, outer, full) purely in python. R handles this with the 'merge' function. I've tried the numpy.lib.recfunctions join_by, but it critical issues with duplicates along the 'key': join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2', defaults=None, usemask=True, asrecarray=False) Join arrays r1 and r2 on key key. The key should be either a string or a sequence of string corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm. source: http://presbrey.mit.edu:1234/numpy.lib.recfunctions.html Any pointers or help will be most appreciated!

    Read the article

  • Python byte per byte XOR decryption

    - by neurino
    I have an XOR encypted file by a VB.net program using this function to scramble: Public Class Crypter ... 'This Will convert String to bytes, then call the other function. Public Function Crypt(ByVal Data As String) As String Return Encoding.Default.GetString(Crypt(Encoding.Default.GetBytes(Data))) End Function 'This calls XorCrypt giving Key converted to bytes Public Function Crypt(ByVal Data() As Byte) As Byte() Return XorCrypt(Data, Encoding.Default.GetBytes(Me.Key)) End Function 'Xor Encryption. Private Function XorCrypt(ByVal Data() As Byte, ByVal Key() As Byte) As Byte() Dim i As Integer If Key.Length <> 0 Then For i = 0 To Data.Length - 1 Data(i) = Data(i) Xor Key(i Mod Key.Length) Next End If Return Data End Function End Class and saved this way: Dim Crypter As New Cryptic(Key) 'open destination file Dim objWriter As New StreamWriter(fileName) 'write crypted content objWriter.Write(Crypter.Crypt(data)) Now I have to reopen the file with Python but I have troubles getting single bytes, this is the XOR function in python: def crypto(self, data): 'crypto(self, data) -> str' return ''.join(chr((ord(x) ^ ord(y)) % 256) \ for (x, y) in izip(data.decode('utf-8'), cycle(self.key)) I had to add the % 256 since sometimes x is 256 i.e. not a single byte. This thing of two bytes being passed does not break the decryption because the key keeps "paired" with the following data. The problem is some decrypted character in the conversion is wrong. These chars are all accented letters like à, è, ì but just a few of the overall accented letters. The others are all correctly restored. I guess it could be due to the 256 mod but without it I of course get a chr exception... Thanks for your support

    Read the article

  • python histogram one-liner

    - by mykhal
    there are many ways, how to code histogram in Python. by histogram, i mean function, counting objects in an interable, resulting in the count table (i.e. dict). e.g.: >>> L = 'abracadabra' >>> histogram(L) {'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2} it can be written like this: def histogram(L): d = {} for x in L: if x in d: d[x] += 1 else: d[x] = 1 return d ..however, there are much less ways, how do this in a single expression. if we had "dict comprehensions" in python, we would write: >>> { x: L.count(x) for x in set(L) } but we don't have them, so we have to write: >>> dict([(x, L.count(x)) for x in set(L)]) however, this approach may yet be readable, but is not efficient - L is walked-through multiple times, so this won't work for single-life generators.. the function should iterate well also through gen(), where: def gen(): for x in L: yield x we can go with reduce (R.I.P.): >>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong! oops, does not work, the key name is 'x', not x :( i ended with: >>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {}) (in py3k, we would have to write list(d.items()) instead of d.items(), but it's hypothethical, since there is no reduce there) please beat me with a better one-liner, more readable! ;)

    Read the article

  • S3 Backup Memory Usage in Python

    - by danpalmer
    I currently use WebFaction for my hosting with the basic package that gives us 80MB of RAM. This is more than adequate for our needs at the moment, apart from our backups. We do our own backups to S3 once a day. The backup process is this: dump the database, tar.gz all the files into one backup named with the correct date of the backup, upload to S3 using the python library provided by Amazon. Unfortunately, it appears (although I don't know this for certain) that either my code for reading the file or the S3 code is loading the entire file in to memory. As the file is approximately 320MB (for today's backup) it is using about 320MB just for the backup. This causes WebFaction to quit all our processes meaning the backup doesn't happen and our site goes down. So this is the question: Is there any way to not load the whole file in to memory, or are there any other python S3 libraries that are much better with RAM usage. Ideally it needs to be about 60MB at the most! If this can't be done, how can I split the file and upload separate parts? Thanks for your help. This is the section of code (in my backup script) that caused the processes to be quit: filedata = open(filename, 'rb').read() content_type = mimetypes.guess_type(filename)[0] if not content_type: content_type = 'text/plain' print 'Uploading to S3...' response = connection.put(BUCKET_NAME, 'daily/%s' % filename, S3.S3Object(filedata), {'x-amz-acl': 'public-read', 'Content-Type': content_type})

    Read the article

< Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80  | Next Page >