Search Results

Search found 15224 results on 609 pages for 'parallel python'.


How to improve the speed of a loop containing a sqlalchemy query statement as conditional

- by LtPinback

This loop checks whether a record is in the SQLite database, builds a list of dictionaries for the records that are missing, and then executes a multiple-insert statement with the list. This works, but it is very slow (at least I think it is slow): it takes 5 minutes to loop over 3500 queries. I am a complete newbie in Python, SQLite and SQLAlchemy, so I wonder if there is a faster way of doing this.

    list_dict = []
    session = Session()
    for data in data_list:
        if session.query(Class_object).filter(Class_object.column_name_01 == data[2]).filter(Class_object.column_name_00 == an_id).count() == 0:
            list_dict.append({'column_name_00':a_id, 'column_name_01':data[2]})
    conn = engine.connect()
    conn.execute(prices.insert(),list_dict)
    conn.close()
    session.close()

Edit: I moved session = Session() outside the loop. It did not make a difference.
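One common way to speed up a loop like this is to replace the per-row COUNT query with a single query that loads the existing values into a set, and then test membership in memory. The following is only a sketch of that idea: it uses the standard-library sqlite3 module rather than SQLAlchemy so that it is self-contained, the table and column names mirror the question, and find_missing and the sample data are hypothetical.

```python
import sqlite3

def find_missing(conn, an_id, data_list):
    """Return insert-ready dicts for values in data_list not yet stored for an_id."""
    # One query fetches every existing value for this id.
    rows = conn.execute(
        "SELECT column_name_01 FROM prices WHERE column_name_00 = ?", (an_id,)
    )
    existing = {row[0] for row in rows}
    missing = []
    for data in data_list:
        if data[2] not in existing:
            missing.append({"column_name_00": an_id, "column_name_01": data[2]})
            existing.add(data[2])  # avoid queuing the same value twice
    return missing

# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (column_name_00 INTEGER, column_name_01 TEXT)")
conn.execute("INSERT INTO prices VALUES (1, 'a')")
data_list = [(0, 0, "a"), (0, 0, "b"), (0, 0, "c")]
missing = find_missing(conn, 1, data_list)
conn.executemany(
    "INSERT INTO prices VALUES (:column_name_00, :column_name_01)", missing
)
conn.commit()
```

Unlike the loop in the question, this sketch also skips values that appear twice in data_list; drop the existing.add call if that is not wanted.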

If a command line program is unsure of stdout's encoding, what encoding should it output?

- by mackstann

I have a command line program written in Python, and when I pipe it through another program on the command line, sys.stdout.encoding is None. This makes sense, I suppose -- the output could be another program, or a file you're redirecting it into, or whatever, and it doesn't know what encoding is desired. But neither do I! This program will be used by many different people (humor me) in different ways. Should I play it safe and output only ascii (replacing non-ascii chars with question marks)? Or should I output UTF-8, since it's so widespread these days?
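One possible policy, sketched here in Python 3 syntax, is to honour the stream's declared encoding when there is one and fall back to UTF-8 otherwise. The names choose_encoding and write_text are hypothetical, and the UTF-8 fallback is an assumption rather than a universal rule.

```python
import io
import sys

def choose_encoding(stream, fallback="utf-8"):
    """Return the stream's declared encoding, or the fallback when it is unknown."""
    return getattr(stream, "encoding", None) or fallback

def write_text(text, stream=None):
    """Encode text for the stream, replacing characters that cannot be encoded."""
    stream = stream if stream is not None else sys.stdout
    data = text.encode(choose_encoding(stream), errors="replace")
    # Text streams expose the underlying byte stream as .buffer; byte streams do not.
    target = getattr(stream, "buffer", stream)
    target.write(data)

# Example with a byte stream that declares no encoding, as a pipe might.
pipe_like = io.BytesIO()
write_text("caf\u00e9", pipe_like)
```

With errors="replace", characters that the chosen encoding cannot represent come out as question marks, which covers the ASCII-only case as well.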

Hadoop Map Reduce job never finishes

- by rohanbk

I am running a Hadoop Map Reduce job using a Python Mapper and Reducer script, and Hadoop Streaming. Both my Map and Reduce jobs run till they are both 100%, but the job doesn't end. I know that when things go sour, Hadoop will terminate the job, but in this case, both stages reach a 100% and just never end. Has anyone else encountered anything similar? Also, how do I debug my program to figure out where things are going wrong? If I use a smaller input file, and I just run something like: $> cat input_file | mapper.py | sort | reduce.py >> output_file everything works perfectly fine. However, when I use Hadoop, things don't work out.

Take advantage of multiple cores executing SQL statements

- by willvv

I have a small application that reads XML files and inserts the information into a SQL DB. There are ~300,000 files to import, each one with ~1000 records. I started the application on 20% of the files and it has been running for 18 hours now; I hope I can improve this time for the rest of the files. I'm not using a multi-threaded approach, but since the computer I'm running the process on has 4 cores, I was thinking of doing so to improve performance (although I guess the main problem is the I/O and not only the processing). I was thinking of using the BeginExecuteNonQuery() method on the SqlCommand object I create for each insertion, but I don't know if I should limit the maximum number of simultaneous threads (nor do I know how to do it). What's your advice to get the best CPU utilization? Thanks

Adding a method to a function object at runtime

- by Carson Myers

I read a question earlier asking if there was a times method in Python that would allow a function to be called n times in a row. Everyone suggested for _ in range(n): foo(), but I wanted to try to code a different solution using a function decorator. Here's what I have:

    def times(self, n, *args, **kwargs):
        for _ in range(n):
            self.__call__(*args, **kwargs)

    import new
    def repeatable(func):
        func.times = new.instancemethod(times, func, func.__class__)

    @repeatable
    def threeArgs(one, two, three):
        print one, two, three

    threeArgs.times(7, "one", two="rawr", three="foo")

When I run the program, I get the following exception:

    Traceback (most recent call last):
      File "", line 244, in run_nodebug
      File "C:\py\repeatable.py", line 24, in
        threeArgs.times(7, "one", two="rawr", three="foo")
    AttributeError: 'NoneType' object has no attribute 'times'

So I suppose the decorator didn't work? How can I fix this?
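The AttributeError is consistent with the decorator not returning anything: a decorator's return value replaces the decorated name, so a decorator without a return statement binds that name to None. A minimal sketch of a corrected version follows, written in Python 3 syntax and using a closure instead of new.instancemethod (the new module does not exist in Python 3); three_args and calls are illustrative names only.

```python
def repeatable(func):
    """Attach a times(n, *args, **kwargs) helper to func and return func."""
    def times(n, *args, **kwargs):
        for _ in range(n):
            func(*args, **kwargs)
    func.times = times
    return func  # without this return, the decorated name is bound to None

calls = []

@repeatable
def three_args(one, two, three):
    calls.append((one, two, three))

three_args.times(7, "one", two="rawr", three="foo")
```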

String contains all the elements of a list

- by CSSS

I am shifting to Python, and am still relatively new to the pythonic approach. I want to write a function that takes a string and a list and returns true if all the elements in the list occur in the string. This seemed fairly simple. However, I am facing some difficulties with it. The code goes something like this:

    def myfun(str,list):
        for a in list:
            if not a in str:
                return False
        return True

Example:

    myfun('tomato',['t','o','m','a']) should return true
    myfun('potato',['t','o','m','a']) should return false
    myfun('tomato',['t','o','m']) should return true

Also, I was hoping someone could suggest a possible regex approach here. I am trying my hand at those too.
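For comparison, the same check can be sketched with the built-in all(), and a regex variant can be built from one lookahead per element. Both function names below are hypothetical, and the parameters are renamed so they do not shadow the built-ins str and list.

```python
import re

def contains_all(text, items):
    """True if every element of items occurs somewhere in text."""
    return all(item in text for item in items)

def contains_all_regex(text, items):
    """Same check with one lookahead per item; re.escape guards special characters."""
    pattern = "".join("(?=.*{})".format(re.escape(item)) for item in items)
    return re.match(pattern, text, re.DOTALL) is not None
```

The regex version is mostly of interest as an exercise; the all() version is shorter and easier to read.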

Running the script for the 2-nd time, the messages are not retrieved from the mail server

- by Max Li

I read the mails from my Gmail account with the code below.

    import poplib
    pop_conn = poplib.POP3_SSL('pop.gmail.com')
    pop_conn.user('user')  # result: '+OK send PASS'
    pop_conn.pass_('password')  # result: '+OK Welcome.'
    print pop_conn.list()[1]
    pop_conn.quit()

It shows me 1 message as expected. However, if I run this script a second time, I get 0 messages as the result. On the server the message is still there and unread. How can I get all the messages when running the script a second time as well? To me it behaves like an email client that doesn't download the same mail twice. Is there some flag to force the program to download everything again? I use Python 2.7.x on Ubuntu 12.10.

Why is memory management so visible in Java?

- by Emil

I'm playing around with writing some simple Spring-based web apps and deploying them to Tomcat. Almost immediately, I run into the need to customize the Tomcat's JVM settings with -XX:MaxPermSize (and -Xmx and -Xms); without this, the server easily runs out of PermGen space. Why is this such an issue for Java compared to other garbage collected languages? Comparing counts of "tune X memory usage" for X in Java, Ruby, Perl and Python, shows that Java has easily an order of magnitude more hits in Google than the other languages combined.

Removing duplicates (within a given tolerance) from a Numpy array of vectors

- by Brendan

I have an Nx5 array containing N vectors of form 'id', 'x', 'y', 'z' and 'energy'. I need to remove duplicate points (i.e. where x, y, z all match) within a tolerance of say 0.1. Ideally I could create a function where I pass in the array, columns that need to match and a tolerance on the match. Following this thread on Scipy-user, I can remove duplicates based on a full array using record arrays, but I need to just match part of an array. Moreover this will not match within a certain tolerance. I could laboriously iterate through with a for loop in Python but is there a better Numponic way?
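The function described here (array, columns to match, tolerance) can be sketched in plain Python as a greedy filter that keeps the first row of each group of near-duplicates. This is a quadratic-time illustration of the matching rule, not a vectorised NumPy solution; drop_near_duplicates and the sample points are hypothetical.

```python
def drop_near_duplicates(rows, cols, tol):
    """Keep the first row of each group whose selected columns agree within tol."""
    kept = []
    for row in rows:
        is_duplicate = any(
            all(abs(row[c] - other[c]) <= tol for c in cols) for other in kept
        )
        if not is_duplicate:
            kept.append(row)
    return kept

# Columns: id, x, y, z, energy (sample values are made up).
points = [
    (1, 0.00, 0.00, 0.00, 5.0),
    (2, 0.05, 0.00, 0.02, 6.0),  # within 0.1 of point 1 on x, y and z
    (3, 1.00, 1.00, 1.00, 7.0),
]
unique = drop_near_duplicates(points, cols=(1, 2, 3), tol=0.1)
```

Because each row is compared against the rows already kept, the result depends on row order when points form chains of near neighbours.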

too many threads due to synch communication

- by MasoudIzzy

I'm using threads and xmlrpclib in Python at the same time. Periodically, I create a bunch of threads to complete a service on a remote server via xmlrpclib. The problem is that there are times when the remote server doesn't answer. This causes the thread to wait forever for a response that it never gets. Over time, the number of threads in this state increases and reaches the maximum number of threads allowed on the system (I'm using Fedora). I tried to use socket.setdefaulttimeout(10), but the exception it raises causes the server to become defunct. I used it on the server side, but it seems that it doesn't work :/ Any idea how I can handle this issue?
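Instead of a process-wide socket.setdefaulttimeout, a timeout can be scoped to the client's own connections by subclassing the transport. The sketch below uses Python 3's xmlrpc.client (the successor of xmlrpclib); TimeoutTransport is a hypothetical name and the URL is a placeholder.

```python
import xmlrpc.client

class TimeoutTransport(xmlrpc.client.Transport):
    """Transport that applies a socket timeout to each HTTP connection it opens."""

    def __init__(self, timeout, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        conn.timeout = self.timeout  # used when the socket is actually opened
        return conn

# Hypothetical endpoint; no connection is made until a method is called.
proxy = xmlrpc.client.ServerProxy(
    "http://localhost:8000/", transport=TimeoutTransport(timeout=10)
)
```

A call that exceeds the timeout then raises socket.timeout in the calling thread, which the thread can catch and use to exit cleanly instead of blocking forever.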

show() doesn't redraw anymore

- by Abruzzo Forte e Gentile

Hi all, I am working on Linux and I don't know why Python with matplotlib draws the chart I want only once. The first time I call show() the plot is drawn without any problem, but not the second time or any time after that. I close the window showing the chart between the two calls. Do you know why, and how to fix it? Thanks, AFG

    from numpy import *
    from pylab import *
    data = array( [ 1,2,3,4,5] )
    plot(data)
    [<matplotlib.lines.Line2D object at 0x90c98ac>]
    show()  # this call shows me a plot
    #..now I close the window...
    data = array( [ 1,2,3,4,5,6] )
    plot(data)
    [<matplotlib.lines.Line2D object at 0x92dafec>]
    show()  # this one doesn't show me anything

symlink files newer than X age, then later remove symlink once file ages?

- by bleomycin

Hello everyone, I have a large number of files/folders coming in each day that are sorted automatically into a wide variety of folders. I'm looking for a way to automatically find these files/folders and create symlinks to all of them within an "incoming" folder. Searching by file age should be sufficient for finding the files, although searching by age and owner would be ideal. Then, once the files/folders being linked to reach a certain age, say 5 days, the symlinks to them should be removed automatically from the "incoming" folder. Is this possible with a simple shell or Python script that can be run from cron? Thanks!
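A rough sketch of such a cron-driven script is shown below. It assumes a POSIX system, uses modification time as the file's age, assumes the incoming folder lives outside the tree being scanned, and links files only (not folders); sync_incoming and its parameters are hypothetical names.

```python
import os
import time

def sync_incoming(source_root, incoming_dir, max_age_days=5):
    """Link files newer than max_age_days into incoming_dir; drop stale links."""
    cutoff = time.time() - max_age_days * 86400
    os.makedirs(incoming_dir, exist_ok=True)

    # Remove links whose target has aged out or no longer exists.
    for name in os.listdir(incoming_dir):
        link = os.path.join(incoming_dir, name)
        if os.path.islink(link):
            if not os.path.exists(link) or os.path.getmtime(link) < cutoff:
                os.remove(link)

    # Create links for recent files that are not linked yet.
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            target = os.path.join(dirpath, name)
            link = os.path.join(incoming_dir, name)
            if os.path.getmtime(target) >= cutoff and not os.path.lexists(link):
                os.symlink(os.path.abspath(target), link)
```

Files with the same name in different source folders would collide in this sketch (the first one found wins), and filtering by owner would need an extra os.stat check.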

Create unique file name and fetching it to commandline argument

- by user343934

Hi everyone, I am working in Python right now and I am a little bit stuck. I have a web form with two options: file upload and a textarea. I can easily pass the file name with the file upload option, but I have a problem with the textarea. When the textarea is used, I first have to save the submitted text to a file in the working directory; after that I can run the command line and pass it the name of the saved file. To do this I have to generate a unique file first and save the text from the textarea in it. Can anybody give me some tips to solve my problem? Any algorithms, suggestions and lines of code are appreciated. Thanks for your help.
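The standard-library tempfile module can create the uniquely named file. The helper below is a minimal sketch; save_text_to_unique_file is a hypothetical name, and the caller is responsible for deleting the file afterwards.

```python
import os
import tempfile

def save_text_to_unique_file(text, directory=None, suffix=".txt"):
    """Write text to a newly created, uniquely named file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path
```

The returned path can then be passed as an argument to the command-line program, for example through subprocess.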

Generate all permutations with sort constraint

- by Moos Hueting

Hi! I have a list consisting of other lists and some zeroes, for example: x = [[1, 1, 2], [1, 1, 1, 2], [1, 1, 2], 0, 0, 0] I would like to generate all the combinations of this list while keeping the order of the inner lists unchanged, so [[1, 1, 2], 0, 0, [1, 1, 1, 2], [1, 1, 2], 0] is fine, but [[1, 1, 1, 2], [1, 1, 2], 0, 0, [1, 1, 2], 0] isn't. I've got the feeling that this should be fairly easy in Python, but I just don't see it. Could somebody help me out?
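One way to read this requirement is as choosing which positions the inner lists occupy, then filling those positions with the lists in their original order. A sketch under that reading follows; ordered_interleavings is a hypothetical name, and the non-list items also keep their relative order (which makes no difference when they are all zeros).

```python
from itertools import combinations

def ordered_interleavings(items):
    """Yield every arrangement that keeps the inner lists in their original order."""
    lists = [item for item in items if isinstance(item, list)]
    others = [item for item in items if not isinstance(item, list)]
    for positions in combinations(range(len(items)), len(lists)):
        chosen = set(positions)
        list_iter, other_iter = iter(lists), iter(others)
        yield [
            next(list_iter) if index in chosen else next(other_iter)
            for index in range(len(items))
        ]

x = [[1, 1, 2], [1, 1, 1, 2], [1, 1, 2], 0, 0, 0]
results = list(ordered_interleavings(x))
```

For the list in the question this produces one arrangement per way of choosing 3 positions out of 6.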

How to get data from a incoming email and then copy data to some directory

- by Zegnhabi

First of all, I have spent some time reading this site and I find it very interesting; it has many questions and they are very entertaining. My question is about handling incoming mail on my server, and it does not matter whether the solution uses PHP, Perl, or Python. What I care about is the result, which should be as close as possible to the following: I send an email to [email protected] with attachments such as photos; when the mail reaches the server, the server processes it and copies the attached files (in this case the photos) to the folder /home/public_html/photos and then, if possible, sends a notification saying whether it was successful or not. Thank you very much in advance. I hope this can be done. ñ_ñ
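The attachment-copying step can be sketched with Python's standard email package. How the raw message reaches the script (for example, piped in by the mail server) depends on the server configuration and is not shown; save_attachments is a hypothetical name.

```python
import os
from email import message_from_bytes, policy

def save_attachments(raw_message, dest_dir):
    """Parse a raw email message and write its attachments into dest_dir."""
    message = message_from_bytes(raw_message, policy=policy.default)
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for part in message.iter_attachments():
        filename = part.get_filename()
        if not filename:
            continue
        # basename() discards any directory components supplied by the sender.
        path = os.path.join(dest_dir, os.path.basename(filename))
        with open(path, "wb") as handle:
            handle.write(part.get_payload(decode=True))
        saved.append(path)
    return saved
```

The returned list of paths could be used to build the success or failure notification.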

Replacing empty csv column values with a zero

- by homerjay

Hey, so I'm dealing with a csv file that has missing values. What I want my script to do is:

    #!/usr/bin/python
    import csv
    import sys

    #1. Place each record of a file in a list.
    #2. Iterate thru each element of the list and get its length.
    #3. If the length is less than one replace with value x.

    reader = csv.reader(open(sys.argv[1], "rb"))
    for row in reader:
        for x in row[:]:
            if len(x)< 1:
                x = 0
            print x
        print row

Here is an example of the data I'm trying it on; ideally it should work for any number of columns.

Before:

    actnum,col2,col4
    xxxxx , ,
    xxxxx , 845 ,
    xxxxx , ,545

After:

    actnum,col2,col4
    xxxxx , 0 , 0
    xxxxx , 845, 0
    xxxxx , 0 ,545

Any guidance would be appreciated.
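In the loop above, assigning to x only rebinds the loop variable and leaves row unchanged, so the row has to be rebuilt. A minimal Python 3 sketch follows; fill_empty_fields is a hypothetical name, and treating whitespace-only fields as empty is an assumption based on the sample data.

```python
import csv
import io

def fill_empty_fields(rows, fill="0"):
    """Yield each row with blank or whitespace-only fields replaced by fill."""
    for row in rows:
        yield [field if field.strip() else fill for field in row]

sample = "actnum,col2,col4\nxxxxx , ,\nxxxxx , 845 ,\nxxxxx , ,545\n"
cleaned = list(fill_empty_fields(csv.reader(io.StringIO(sample))))
```

The cleaned rows can be written back out with csv.writer.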

how to get unique values set from a repeating values list

- by Mariselvam

I need to parse a large log file (flat file) that contains two columns of values (column-A, column-B). Values in both columns repeat. For each unique value in column-A, I need to find the set of column-B values. Can this be done with a Unix shell command, or do I need to write a Perl or Python script? What are the ways this can be done? Example:

    xxxA 2
    xxxA 1
    xxxB 2
    XXXC 3
    XXXA 3
    xxxD 4

Output:

    xxxA - 2,1,3
    xxxB - 2
    xxxC - 3
    xxxD - 4
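In Python this grouping can be sketched with a dictionary that maps each column-A value to the distinct column-B values seen for it. group_unique and the sample lines are hypothetical; keys are compared case-sensitively here, so the mixed-case keys in the example above would need to be normalised first if they are meant to match.

```python
from collections import OrderedDict

def group_unique(lines):
    """Map each first-column key to its distinct second-column values, in order seen."""
    groups = OrderedDict()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        values = groups.setdefault(key, [])
        if value not in values:
            values.append(value)
    return groups

log = ["xxxA 2", "xxxA 1", "xxxB 2", "xxxA 2", "xxxA 3"]
for key, values in group_unique(log).items():
    print("{} - {}".format(key, ",".join(values)))
```

For a real log file, the open file object can be passed in place of the list, since the function only iterates over lines.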

Routing Skype call to another Voip company

- by Anarchist

Hello, as my project for this summer I would like to create a program that answers a Skype call using the Skype API and allows a user to connect to another VoIP provider (through SIP) and make calls by dialling through the caller's Skype application. I understand that the Skype API allows me to answer and receive keypad input, but I'm stuck on actually sending the sound of the call to a SIP client. Is there an API/library that would allow me to take the audio received by Skype as input to the SIP client? Is this even possible? I'm not tied to a language, but I had planned on using Python. Thanks.

How can I convert this string to list of lists?

- by Phrixus

Hi, what I'm trying to do is this: if a user types in [[0,0,0], [0,0,1], [1,1,0]] and presses Enter, the program should convert this string to several lists: one list holding [0][0][0], another for [0][0][1], and the last for [1][1][0]. I thought a tuple would work, but no luck... :( I started Python yesterday (I'm a C / C++ guy) and cannot yet take full advantage of the language... Does Python have a good way to handle this? I need help~ :'(
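For input in exactly this literal form, the standard library's ast.literal_eval parses the text without executing arbitrary code. A minimal sketch, with parse_nested_list as a hypothetical wrapper that also checks the shape of the result:

```python
import ast

def parse_nested_list(text):
    """Safely parse text such as '[[0,0,0], [0,0,1]]' into a list of lists."""
    value = ast.literal_eval(text)
    if not (isinstance(value, list) and all(isinstance(row, list) for row in value)):
        raise ValueError("expected a list of lists")
    return value

rows = parse_nested_list("[[0,0,0], [0,0,1], [1,1,0]]")
first, second, third = rows
```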

Sorting a 2D numpy array by multiple axes

- by perimosocordiae

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example: array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]]) I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look like this: array([[3, 2], [3, 4], [3, 6], [5, 3], [6, 2]]) If this was a normal Python list, I would simply define a comparator to do what I want, but as far as I can tell, numpy's sort function doesn't accept user-defined comparators. Any ideas?
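NumPy's lexsort covers this case without a comparator: it sorts by several keys at once and returns the row order. A short sketch using the array from the question (it assumes NumPy is installed):

```python
import numpy as np

points = np.array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]])
# np.lexsort treats the LAST key as the primary one: x first, then y.
order = np.lexsort((points[:, 1], points[:, 0]))
sorted_points = points[order]
```

The keys are passed in reverse order of priority, which is easy to get backwards.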

create temporary table from cursor

- by Claudiu

Is there any way, in PostgreSQL accessed from Python using SQLObject, to create a temporary table from the results of a cursor? Previously, I had a query, and I created the temporary table directly from the query. I then had many other queries interacting w/ that temporary table. Now I have much more data, so I want to only process 1000 rows at a time or so. However, I can't do CREATE TEMP TABLE ... AS ... from a cursor, not as far as I can see. Is the only thing to do something like:

    rows = cur.fetchmany(1000);
    cur2 = conn.cursor()
    cur2.execute("""CREATE TEMP TABLE foobar (id INTEGER)""")
    for row in rows:
        cur2.execute("""INSERT INTO foobar (%d)""" % row)

or is there a better way? This seems awfully inefficient.

get n records at a time from a temporary table

- by Claudiu

I have a temporary table with about 1 million entries. The temporary table stores the result of a larger query. I want to process these records 1000 at a time, for example. What's the best way to set up queries such that I get the first 1000 rows, then the next 1000, etc.? They are not inherently ordered, but the temporary table just has one column with an ID, so I can order it if necessary. I was thinking of creating an extra column with the temporary table to number all the rows, something like:

    CREATE TEMP TABLE tmptmp AS SELECT ##autonumber somehow##, id FROM .... --complicated query

then I can do:

    SELECT * FROM tmptmp WHERE autonumber>=0 AND autonumber < 1000

etc... How would I actually accomplish this? Or is there a better way? I'm using Python and PostgreSQL.
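One alternative to an extra autonumber column is to page on the id column itself, remembering the last id of each batch (often called keyset pagination). The sketch below assumes the ids are unique and uses the standard-library sqlite3 module as a stand-in so that it runs on its own; with a PostgreSQL driver such as psycopg2 the same SQL applies, with %s placeholders instead of ?. iter_batches is a hypothetical name.

```python
import sqlite3

def iter_batches(conn, batch_size=1000):
    """Yield lists of ids from tmptmp in ascending order, batch_size at a time."""
    last_id = None
    while True:
        if last_id is None:
            cursor = conn.execute(
                "SELECT id FROM tmptmp ORDER BY id LIMIT ?", (batch_size,)
            )
        else:
            cursor = conn.execute(
                "SELECT id FROM tmptmp WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        batch = [row[0] for row in cursor]
        if not batch:
            return
        yield batch
        last_id = batch[-1]

# Stand-in data: 2500 ids in a temporary table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TEMP TABLE tmptmp (id INTEGER)")
conn.executemany("INSERT INTO tmptmp VALUES (?)", [(i,) for i in range(2500)])
batches = list(iter_batches(conn))
```

An index on id keeps each batch query cheap as the table grows.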

matplotlib plot window won't appear

- by user1518837

I'm using 64-bit Python 2.7.3. I installed pandas as well as matplotlib 1.1.1, both for 64-bit. Right now, none of my plots are showing. After attempting to plot from several different dataframes, I gave up in frustration and tried the following first example from http://pandas.pydata.org/pandas-docs/dev/visualization.html:

INPUT:

    import matplotlib.pyplot as plt
    ts = Series(randn(1000), index=date_range ('1/1/2000', periods=1000))
    ts = ts.cumsum()
    ts.plot()
    pylab.show()

OUTPUT:

    Axes(0.125,0.1;0.775x0.8)

And no plot window appeared. Other StackOverflow threads I've read suggested I might be missing DLLs. Any suggestions?

averaging matrix efficiently

- by user248237

In Python, given an n x p matrix, e.g. 4 x 4, how can I return a 4 x 2 matrix that simply averages the first two columns and the last two columns for all 4 rows of the matrix? E.g. given:

    a = array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

return a matrix that has the average of a[:, 0] and a[:, 1] and the average of a[:, 2] and a[:, 3]. I want this to work for an arbitrary n x p matrix, assuming that p is evenly divisible by the number of columns being averaged together. Let me clarify: for each row, I want to take the average of the first two columns, then the average of the last two columns. So it would be:

    (1 + 2) / 2, (3 + 4) / 2 <- row 1 of new matrix
    (5 + 6) / 2, (7 + 8) / 2 <- row 2 of new matrix, etc.

which should yield a 4 x 2 matrix rather than 4 x 4. Thanks.
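With NumPy, one way to sketch this is to reshape the columns into groups and average over the last axis. average_column_groups and group_size are hypothetical names; the example assumes NumPy is installed.

```python
import numpy as np

def average_column_groups(matrix, group_size):
    """Average each run of group_size adjacent columns, row by row."""
    rows, cols = matrix.shape
    if cols % group_size:
        raise ValueError("column count must be divisible by group_size")
    return matrix.reshape(rows, cols // group_size, group_size).mean(axis=2)

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
averaged = average_column_groups(a, 2)
```

For the 4 x 4 input this gives a 4 x 2 result whose first row is the pair of averages (1 + 2) / 2 and (3 + 4) / 2.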

How to find hidden properties/methods in Javascript objects?

- by ramanujan

I would like to automatically determine all of the properties (including the hidden ones) in a given Javascript object, via a generalization of this function:

    function keys(obj) {
        var ll = [];
        for(var pp in obj) {
            ll.push(pp);
        }
        return ll;
    }

This works for user defined objects but fails for many builtins:

    repl> keys({"a":10,"b":2}); // ["a","b"]
    repl> keys(Math) // returns nothing!

Basically, I'd like to write equivalents of Python's dir() and help(), which are really useful in exploring new objects. My understanding is that only the builtin objects have hidden properties (user code evidently can't set the "enumerable" property till HTML5), so one possibility is to simply hardcode the properties of Math, String, etc. into a dir() equivalent (using the lists such as those here). But is there a better way?
