numpy - Page 2 - Developer IT

Numpy ‘smart’ symmetric matrix

- by Debilski

Is there a smart and space-efficient symmetric matrix in numpy which automatically fills [j][i] when [i][j] is written to? a = numpy.symmetric((3, 3)) a[0][1] = 1 print a # [[0 1 0], [1 0 0], [0 0 0]] An automatic Hermitian would also be nice, although I won’t need that at the time of writing.

Read the article

How do I calculate percentiles with python/numpy?

- by Uri

Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array? I am looking for something similar to Excel's percentile function. I looked in NumPy's statistics reference, and couldn't find this. All I could find is the median (50th percentile), but not something more specific.

Read the article

Python 3 with numpy and object refernces

- by user963386

I need to create a large matrix (array) structure (3 axis) and each element should store the reference to a Python object (myclass instance). Is it possible to use numpy to create such an array. Which data type should I use in order to store Python references? The advantage of numpy is the support of slicing at different levels. The alternativee is to create a nested (nested) list but it is a cumbersome solution.

Read the article

Numpy Matrix keeps giving me an Error,

- by uberjumper

Okay this is werid, i keep getting the error, randomly. ValueError: matrix must be 2-dimensional So i tracked it down, and cornered it to basically something like this: a_list = [[(1,100) for _ in range(32)] for _ in range(32)] numpy.matrix(a_list) Whats wrong with this? If i print a_list it is clearly a 2d matrix of tuples, however numpy does not believe so.

Read the article

Writing csv header removes data from numpy array written below

- by user338095

I'm trying to export data to a csv file. It should contain a header (from datastack) and restacked arrays with my data (from datastack). One line in datastack has the same length as dataset. The code below works but it removes parts of the first line from datastack. Any ideas why that could be? s = ','.join(itertools.chain(dataset)) + '\n' newfile = 'export.csv' f = open(newfile,'w') f.write(s) numpy.savetxt(newfile, (numpy.transpose(datastack)), delimiter=', ') f.close()

Read the article

Numpy array dimensions

- by cristian

Hello, I'm currently trying to learn Numpy and Python. Given the following array: import numpy as N a = N.array([[1,2],[1,2]]) Is there a function that returns the dimensions of a (e.g.a is a 2 by 2 array). size() returns 4 and that doesn't help very much. Thanks.

Read the article

List comprehension, map, and numpy.vectorize performance

- by mcstrother

I have a function foo(i) that takes an integer and takes a significant amount of time to execute. Will there be a significant performance difference between any of the following ways of initializing a: a = [foo(i) for i in xrange(100)] a = map(foo, range(100)) vfoo = numpy.vectorize(foo) a = vfoo(range(100)) (I don't care whether the output is a list or a numpy array.) Is there a better way?

Read the article

Efficiently generate numpy array from list comprehension output?

- by shootingstars

Is there a more efficient way than using numpy.asarray() to generate an array from output in the form of a list? This appears to be copying everything in memory, which doesn't seem like it would be that efficient with very large arrays. (Updated) Example: import numpy as np a1 = np.array([1,2,3,4,5,6,7,8,9,10]) # pretend this has thousands of elements a2 = np.array([3,7,8]) results = np.asarray([np.amax(np.where(a1 > element)) for element in a2])

Read the article

Fastest way to generate delimited string from 1d numpy array

- by Abiel

I have a program which needs to turn many large one-dimensional numpy arrays of floats into delimited strings. I am finding this operation quite slow relative to the mathematical operations in my program and am wondering if there is a way to speed it up. For example, consider the following loop, which takes 100,000 random numbers in a numpy array and joins each array into a comma-delimited string. import numpy as np x = np.random.randn(100000) for i in range(100): ",".join(map(str, x)) This loop takes about 20 seconds to complete (total, not each cycle). In contrast, consider that 100 cycles of something like elementwise multiplication (x*x) would take than one 1/10 of a second to complete. Clearly the string join operation creates a large performance bottleneck; in my actual application it will dominate total runtime. This makes me wonder, is there a faster way than ",".join(map(str, x))? Since map() is where almost all the processing time occurs, this comes down to the question of whether there a faster to way convert a very large number of numbers to strings.

Read the article

Error Converting PIL B&W images to Numpy Arrays

- by Elliot

I am getting weird errors when I try to convert a black and white PIL image to a numpy array. An example of the code I am working with is below. if image.mode != '1': image = image.convert('1') #convert to B&W data = np.array(image) #convert data to a numpy array n_lines = data.shape[0] #number of raster passes line_range = range(data.shape[1]) for l in range(n_lines): # process one horizontal line of the image line = data[l] for n in line_range: if line[n] == 1: write_line_to(xl, z+scale*n, speed) #conversion to other program code elif line[n] == 0: run_to(xl, z+scale*n) #conversion to other program code I have tried this using both array and asarray for the conversion, and gotten different errors. If I use array, then the data I get out is nothing like what I put in. It looks like several very shrunken partial images side by side, with the remainder of the image space filled in in black. If I use asarray, then the entirety of python crashes during the raster step (on a random line). If I work with a greyscale image ('L'), then neither of these errors occurs for either array or asarray. Does anyone know what I am doing wrong? Is there something odd about the way PIL encodes B&W images, or something special I need to pass numpy to make it convert properly?

Read the article

Rewriting a for loop in pure NumPy to decrease execution time

- by Statto

I recently asked about trying to optimise a Python loop for a scientific application, and received an excellent, smart way of recoding it within NumPy which reduced execution time by a factor of around 100 for me! However, calculation of the B value is actually nested within a few other loops, because it is evaluated at a regular grid of positions. Is there a similarly smart NumPy rewrite to shave time off this procedure? I suspect the performance gain for this part would be less marked, and the disadvantages would presumably be that it would not be possible to report back to the user on the progress of the calculation, that the results could not be written to the output file until the end of the calculation, and possibly that doing this in one enormous step would have memory implications? Is it possible to circumvent any of these? import numpy as np import time def reshape_vector(v): b = np.empty((3,1)) for i in range(3): b[i][0] = v[i] return b def unit_vectors(r): return r / np.sqrt((r*r).sum(0)) def calculate_dipole(mu, r_i, mom_i): relative = mu - r_i r_unit = unit_vectors(relative) A = 1e-7 num = A*(3*np.sum(mom_i*r_unit, 0)*r_unit - mom_i) den = np.sqrt(np.sum(relative*relative, 0))**3 B = np.sum(num/den, 1) return B N = 20000 # number of dipoles r_i = np.random.random((3,N)) # positions of dipoles mom_i = np.random.random((3,N)) # moments of dipoles a = np.random.random((3,3)) # three basis vectors for this crystal n = [10,10,10] # points at which to evaluate sum gamma_mu = 135.5 # a constant t_start = time.clock() for i in range(n[0]): r_frac_x = np.float(i)/np.float(n[0]) r_test_x = r_frac_x * a[0] for j in range(n[1]): r_frac_y = np.float(j)/np.float(n[1]) r_test_y = r_frac_y * a[1] for k in range(n[2]): r_frac_z = np.float(k)/np.float(n[2]) r_test = r_test_x +r_test_y + r_frac_z * a[2] r_test_fast = reshape_vector(r_test) B = calculate_dipole(r_test_fast, r_i, mom_i) omega = gamma_mu*np.sqrt(np.dot(B,B)) # write r_test, B and omega to a file frac_done = np.float(i+1)/(n[0]+1) t_elapsed = (time.clock()-t_start) t_remain = (1-frac_done)*t_elapsed/frac_done print frac_done*100,'% done in',t_elapsed/60.,'minutes...approximately',t_remain/60.,'minutes remaining'

Read the article

numpy array assignment problem

- by Sujan

Hi All: I have a strange problem in Python 2.6.5 with Numpy. I assign a numpy array, then equate a new variable to it. When I perform any operation to the new array, the original's values also change. Why is that? Please see the example below. Kindly enlighten me, as I'm fairly new to Python, and programming in general. -Sujan >>> import numpy as np >>> a = np.array([[1,2],[3,4]]) >>> b = a >>> b array([[1, 2], [3, 4]]) >>> c = a >>> c array([[1, 2], [3, 4]]) >>> c[:,1] = c[:,1] + 5 >>> c array([[1, 7], [3, 9]]) >>> b array([[1, 7], [3, 9]]) >>> a array([[1, 7], [3, 9]])

Read the article

removing pairs of elements from numpy arrays that are NaN (or another value) in Python

- by user248237

I have an array with two columns in numpy. For example: a = array([[1, 5, nan, 6], [10, 6, 6, nan]]) a = transpose(a) I want to efficiently iterate through the two columns, a[:, 0] and a[:, 1] and remove any pairs that meet a certain condition, in this case if they are NaN. The obvious way I can think of is: new_a = [] for val1, val2 in a: if val2 == nan or val2 == nan: new_a.append([val1, val2]) But that seems clunky. What's the pythonic numpy way of doing this? thanks.

Read the article

hierarchical clustering on correlations in Python scipy/numpy?

- by user248237

How can I run hierarchical clustering on a correlation matrix in scipy/numpy? I have a matrix of 100 rows by 9 columns, and I'd like to hierarchically clustering by correlations of each entry across the 9 conditions. I'd like to use 1-pearson correlation as the distances for clustering. Assuming I have a numpy array "X" that contains the 100 x 9 matrix, how can I do this? I tried using hcluster, based on this example: Y=pdist(X, 'seuclidean') Z=linkage(Y, 'single') dendrogram(Z, color_threshold=0) however, pdist is not what I want since that's euclidean distance. Any ideas? thanks.

Read the article

Numpy modify array in place?

- by User

I have the following code which is attempting to normalize the values of an m x n array (It will be used as input to a neural network, where m is the number of training examples and n is the number of features). However, when I inspect the array in the interpreter after the script runs, I see that the values are not normalized; that is, they still have the original values. I guess this is because the assignment to the array variable inside the function is only seen within the function. How can I do this normalization in place? Or do I have to return a new array from the normalize function? import numpy def normalize(array, imin = -1, imax = 1): """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)""" dmin = array.min() dmax = array.max() array = imin + (imax - imin)*(array - dmin)/(dmax - dmin) print array[0] def main(): array = numpy.loadtxt('test.csv', delimiter=',', skiprows=1) for column in array.T: normalize(column) return array if __name__ == "__main__": a = main()

Read the article

slicing 2d numpy array

- by MedicalMath

I have a 2d numpy array called FilteredOutput that has 2 columns and 10001 rows, though the number of rows is a variable. I am trying to take the 2nd column of FilteredOutput and use it to populate a new 1d numpy array called timeSeriesArray using the following line of code: timeSeriesArray=p.array(FilteredOutput[:,0]) I got this syntax from the following link. But the problem is that I am getting the following error message: TypeError: list indices must be integers, not tuple Can anyone show me the proper syntax for populating the 1d array timeSeriesArray with the contents of the second column of the 2d array FilteredOutput?

Read the article

Values of Variables Matrix NumPy

- by Max Mines

I'm working on a program that determines if lines intersect. I'm using matrices to do this. I understand all the math concepts, but I'm new to Python and NumPy. I want to add my slope variables and yint variables to a new matrix. They are all floats. I can't seem to figure out the correct format for entering them. Here's an example: import numpy as np x = 2 y = 5 w = 9 z = 12 I understand that if I were to just be entering the raw numbers, it would look something like this: matr = np.matrix('2 5; 9 12') My goal, though, is to enter the variable names instead of the ints.

Read the article

append versus resize for numpy array

- by Abruzzo Forte e Gentile

Hi all I would like to append a value at the end of my numpy.array. I saw numpy.append function but this performs an exact copy of the original array adding at last my new value. I would like to avoid copies since my arrays are big. I am using resize method and then set the last index available to the new value. Can you confirm that resize is the best way to append a value at the end? Is it not moving memory around someway? Thanks AFG oldSize = myArray,shape(0) myArray.resize( oldSize + 1 ) myArray[oldSize] = newValue

Read the article

Construct Numpy index given list of starting and ending positions

- by Abiel

I have two identically-sized numpy.array objects (both one-dimensional), one of which contains a list of starting index positions, and the other of which contains a list of ending index positions (alternatively you could say I have a list of starting positions and window lengths). In case it matters, the slices formed by the starting and ending positions are guaranteed to be non-overlapping. I am trying to figure out how to use these starting and ending positions to form an index for another array object, without having to use a loop. For example: import numpy as np start = np.array([1,7,20]) end = np.array([3,10,25]) Want to reference somearray[1,2,7,8,9,20,21,22,23,24])

Read the article

numpy arange with multiple intervals

- by Heiko Westermann

Hi, i have an numpy array which represents multiple x-intervals of a function: In [137]: x_foo Out[137]: array([211, 212, 213, 214, 215, 216, 217, 218, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950]) as you can see, in x_foo are two intervals: one from 211 to 218, and one from 940 to 950. these are intervals, which i want to interpolate with scipy. for this, i need to adjust the spacing, e.g "211.0 211.1 211.2 ..." which you would normaly do with: arange( x_foo[0], x_foo[-1], 0.1 ) in the case of multiple intervals, this is not possible. so heres my question: is there a numpy-thonic way to do this in array-style? or do i need to write a function which loops over the whole array and split if the difference is 1? thanks!

Read the article

Python/Numpy - Save Array with Column AND Row Titles

- by Scott B

I want to save a 2D array to a CSV file with row and column "header" information (like a table). I know that I could use the header argument to numpy.savetxt to save the column names, but is there any easy way to also include some other array (or list) as the first column of data (like row titles)? Below is an example of how I currently do it. Is there a better way to include those row titles, perhaps some trick with savetxt I'm unaware of? import csv import numpy as np data = np.arange(12).reshape(3,4) # Add a '' for the first column because the row titles go there... cols = ['', 'col1', 'col2', 'col3', 'col4'] rows = ['row1', 'row2', 'row3'] with open('test.csv', 'wb') as f: writer = csv.writer(f) writer.writerow(cols) for row_title, data_row in zip(rows, data): writer.writerow([row_title] + data_row.tolist())

Read the article

Sorting a 2D numpy array by multiple axes

- by perimosocordiae

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example: array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]]) I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look like this: array([[3, 2], [3, 4], [3, 6], [5, 3], [6, 2]]) If this was a normal Python list, I would simply define a comparator to do what I want, but as far as I can tell, numpy's sort function doesn't accept user-defined comparators. Any ideas?

Read the article

Numpy zero rank array indexing/broadcasting

- by Lemming

I'm trying to write a function that supports broadcasting and is fast at the same time. However, numpy's zero-rank arrays are causing trouble as usual. I couldn't find anything useful on google, or by searching here. So, I'm asking you. How should I implement broadcasting efficiently and handle zero-rank arrays at the same time? This whole post became larger than anticipated, sorry. Details: To clarify what I'm talking about I'll give a simple example: Say I want to implement a Heaviside step-function. I.e. a function that acts on the real axis, which is 0 on the negative side, 1 on the positive side, and from case to case either 0, 0.5, or 1 at the point 0. Implementation Masking The most efficient way I found so far is the following. It uses boolean arrays as masks to assign the correct values to the corresponding slots in the output vector. from numpy import * def step_mask(x, limit=+1): """Heaviside step-function. y = 0 if x < 0 y = 1 if x > 0 See below for x == 0. Arguments: x Evaluate the function at these points. limit Which limit at x == 0? limit > 0: y = 1 limit == 0: y = 0.5 limit < 0: y = 0 Return: The values corresponding to x. """ b = broadcast(x, limit) out = zeros(b.shape) out[x>0] = 1 mask = (limit > 0) & (x == 0) out[mask] = 1 mask = (limit == 0) & (x == 0) out[mask] = 0.5 mask = (limit < 0) & (x == 0) out[mask] = 0 return out List Comprehension The following-the-numpy-docs way is to use a list comprehension on the flat iterator of the broadcast object. However, list comprehensions become absolutely unreadable for such complicated functions. def step_comprehension(x, limit=+1): b = broadcast(x, limit) out = empty(b.shape) out.flat = [ ( 1 if x_ > 0 else ( 0 if x_ < 0 else ( 1 if l_ > 0 else ( 0.5 if l_ ==0 else ( 0 ))))) for x_, l_ in b ] return out For Loop And finally, the most naive way is a for loop. It's probably the most readable option. However, Python for-loops are anything but fast. And hence, a really bad idea in numerics. def step_for(x, limit=+1): b = broadcast(x, limit) out = empty(b.shape) for i, (x_, l_) in enumerate(b): if x_ > 0: out[i] = 1 elif x_ < 0: out[i] = 0 elif l_ > 0: out[i] = 1 elif l_ < 0: out[i] = 0 else: out[i] = 0.5 return out Test First of all a brief test to see if the output is correct. >>> x = array([-1, -0.1, 0, 0.1, 1]) >>> step_mask(x, +1) array([ 0., 0., 1., 1., 1.]) >>> step_mask(x, 0) array([ 0. , 0. , 0.5, 1. , 1. ]) >>> step_mask(x, -1) array([ 0., 0., 0., 1., 1.]) It is correct, and the other two functions give the same output. Performance How about efficiency? These are the timings: In [45]: xl = linspace(-2, 2, 500001) In [46]: %timeit step_mask(xl) 10 loops, best of 3: 19.5 ms per loop In [47]: %timeit step_comprehension(xl) 1 loops, best of 3: 1.17 s per loop In [48]: %timeit step_for(xl) 1 loops, best of 3: 1.15 s per loop The masked version performs best as expected. However, I'm surprised that the comprehension is on the same level as the for loop. Zero Rank Arrays But, 0-rank arrays pose a problem. Sometimes you want to use a function scalar input. And preferably not have to worry about wrapping all scalars in at least 1-D arrays. >>> step_mask(1) Traceback (most recent call last): File "<ipython-input-50-91c06aa4487b>", line 1, in <module> step_mask(1) File "script.py", line 22, in step_mask out[x>0] = 1 IndexError: 0-d arrays can't be indexed. >>> step_for(1) Traceback (most recent call last): File "<ipython-input-51-4e0de4fcb197>", line 1, in <module> step_for(1) File "script.py", line 55, in step_for out[i] = 1 IndexError: 0-d arrays can't be indexed. >>> step_comprehension(1) array(1.0) Only the list comprehension can handle 0-rank arrays. The other two versions would need special case handling for 0-rank arrays. Numpy gets a bit messy when you want to use the same code for arrays and scalars. However, I really like to have functions that work on as arbitrary input as possible. Who knows which parameters I'll want to iterate over at some point. Question: What is the best way to implement a function as the one above? Is there a way to avoid if scalar then like special cases? I'm not looking for a built-in Heaviside. It's just a simplified example. In my code the above pattern appears in many places to make parameter iteration as simple as possible without littering the client code with for loops or comprehensions. Furthermore, I'm aware of Cython, or weave & Co., or implementation directly in C. However, the performance of the masked version above is sufficient for the moment. And for the moment I would like to keep things as simple as possible.

Read the article

numpy calling sse2 via ctypes

- by Daniel

Hello, In brief, I am trying to call into a shared library from python, more specifically, from numpy. The shared library is implemented in C using sse2 instructions. Enabling optimisation, i.e. building the library with -O2 or –O1, I am facing strange segfaults when calling into the shared library via ctypes. Disabling optimisation (-O0), everything works out as expected, as is the case when linking the library to a c-program directly (optimised or not). Attached you find a snipped which exhibits the delineated behaviour on my system. With optimisation enabled, gdb reports a segfault in __builtin_ia32_loadupd (__P) at emmintrin.h:113. The value of __P is reported as optimised out. test.c: #include <emmintrin.h> #include <complex.h> void test(const int m, const double* x, double complex* y) { int i; __m128d _f, _x, _b; double complex f __attribute__( (aligned(16)) ); double complex b __attribute__( (aligned(16)) ); __m128d* _p; b = 1; _b = _mm_loadu_pd( (double *) &b ); _p = (__m128d*) y; for(i=0; i<m; ++i) { f = cexp(-I*x[i]); _f = _mm_loadu_pd( (double *) &f ); _x = _mm_loadu_pd( (double *) &x[i] ); _f = _mm_shuffle_pd(_f, _f, 1); *_p = _mm_add_pd(*_p, _f); *_p = _mm_add_pd(*_p, _x); *_p = _mm_mul_pd(*_p,_b); _p++; } return; } Compiler flags: gcc -o libtest.so -shared -std=c99 -msse2 -fPIC -O2 -g -lm test.c test.py: import numpy as np import os def zerovec_aligned(nr, dtype=np.float64, boundary=16): '''Create an aligned array of zeros. ''' size = nr * np.dtype(dtype).itemsize tmp = np.zeros(size + boundary, dtype=np.uint8) address = tmp.__array_interface__['data'][0] offset = boundary - address % boundary return tmp[offset:offset + size].view(dtype=dtype) lib = np.ctypeslib.load_library('libtest', '.' ) lib.test.restype = None lib.test.argtypes = [np.ctypeslib.ctypes.c_int, np.ctypeslib.ndpointer(np.float64, flags=('C', 'A') ), np.ctypeslib.ndpointer(np.complex128, flags=('C', 'A', 'W') )] n = 13 y = zerovec_aligned(n, dtype=np.complex128) x = np.ones(n, dtype=np.float64) # x = zerovec_aligned(n, dtype=np.float64) # x[:] = 1. lib.test(n,x,y) My system: Ubuntu Linux i686 2.6.31-22-generic Compiler: gcc (Ubuntu 4.4.1-4ubuntu9) Python: Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) [GCC 4.4.1] Numpy: 1.4.0 I have taken provisions (cf. python code) that y is aligned and the alignment of x should not matter (I think; explicitly aligning x does not solve the problem though). Note also that i use _mm_loadu_pd instead of _mm_load_pd when loading b and f. For the C-only version _mm_load_pd works (as expected). However, when calling the function via ctypes using _mm_load_pd always segfaults (independent of optimisation). I have tried several days to sort out this issue without success ... and I am on the verge beating my monitor to death. Any input welcome. Daniel

Read the article

linear combinations in python/numpy

- by nmaxwell

greetings, I'm not sure if this is a dumb question or not. Lets say I have 3 numpy arrays, A1,A2,A3, and 3 floats, c1,c2,c3 and I'd like to evaluate B = A1*c1+ A2*c2+ A3*c3 will numpy compute this as for example, E1 = A1*c1 E2 = A2*c2 E3 = A3*c3 D1 = E1+E2 B = D1+E3 or is it more clever than that? In c++ I had a neat way to abstract this kind of operation. I defined series of general 'LC' template functions, LC for linear combination like: template<class T,class D> void LC( T & R, T & L0,D C0, T & L1,D C1, T & L2,D C2) { R = L0*C0 +L1*C1 +L2*C2; } and then specialized this for various types, so for instance, for an array the code looked like for (int i=0; i<L0.length; i++) R.array[i] = L0.array[i]*C0 + L1.array[i]*C1 + L2.array[i]*C2; thus avoiding having to create new intermediate arrays. This may look messy but it worked really well. I could do something similar in python, but I'm not sure if its nescesary. Thanks in advance for any insight. -nick

Search Results

Search found 338 results on 14 pages for 'numpy'.

Page 2/14 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Debilski

- by Uri

- by user963386

- by uberjumper

- by user338095

- by cristian

- by mcstrother

- by shootingstars

- by Abiel

- by Elliot

- by Statto

- by Sujan

- by user248237

- by user248237

- by User

- by MedicalMath

- by Max Mines

- by Abruzzo Forte e Gentile

- by Abiel

- by Heiko Westermann

- by Scott B

- by perimosocordiae

- by Lemming

- by Daniel

- by nmaxwell

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >