numpy - Page 6 - Developer IT

"isnotnan" functionality in numpy, can this be more pythonic?

- by Dragan Chupacabrovic

Hello everybody, I need a function that returns the non-NaN values from an array. Currently I am doing it this way:

    >>> a = np.array([np.nan, 1, 2])
    >>> a
    array([ NaN, 1., 2.])
    >>> np.invert(np.isnan(a))
    array([False, True, True], dtype=bool)
    >>> a[np.invert(np.isnan(a))]
    array([ 1., 2.])

Python: 2.6.4
numpy: 1.3.0

Please share if you know a better way. Thank you.
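One shorter spelling, offered as a sketch rather than as the thread's accepted answer: the `~` operator negates a boolean array element-wise, so the mask can be built without calling `np.invert` explicitly. The variable name `non_nan` is made up for this example.

```python
import numpy as np

a = np.array([np.nan, 1, 2])

# ~ flips each element of the boolean mask, like np.invert / np.logical_not
non_nan = a[~np.isnan(a)]
print(non_nan)
```

The result is a new array; `a` itself is left unchanged.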

Numpy array, how to select indices satisfying multiple conditions?

- by Bob

Suppose I have two numpy arrays, x = [5, 2, 3, 1, 4, 5] and y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements in y corresponding to elements in x that are greater than 1 and less than 5. I tried

    x = array([5, 2, 3, 1, 4, 5])
    y = array(['f','o','o','b','a','r'])
    output = y[x > 1 & x < 5] # desired output is ['o','o','a']

but this doesn't work. How would I do this?
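A likely culprit is operator precedence: in Python `&` binds more tightly than `<` and `>`, so `x > 1 & x < 5` is parsed as `x > (1 & x) < 5`. A minimal sketch with each comparison parenthesized:

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# Parenthesize each comparison so & combines two boolean masks
output = y[(x > 1) & (x < 5)]
print(output)
```

`np.logical_and(x > 1, x < 5)` builds the same mask if the operator form feels too terse.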

Ambiguous Evaluation of Lambda Expression on Array

- by Joe

I would like to use a lambda that adds one to x if x is equal to zero. I have tried the following expressions:

    t = map(lambda x: x+1 if x==0 else x, numpy.array)
    t = map(lambda x: x==0 and x+1 or x, numpy.array)
    t = numpy.apply_along_axis(lambda x: x+1 if x==0 else x, 0, numpy.array)

Each of these expressions returns the following error:

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

My understanding of map() and numpy.apply_along_axis() was that they take some function and apply it to each value of an array. From the error it seems that the lambda is being evaluated with x=array, not with some value in the array. What am I doing wrong? I know that I could write a function to accomplish this, but I want to become more familiar with the functional programming aspects of Python.
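A sketch of what may be going on, assuming the array is two-dimensional (the post does not say): `map` iterates over the first axis, so the lambda receives whole rows, and `apply_along_axis` passes 1-D slices, so `x == 0` is itself an array. An element-wise alternative is `np.where`; the name `arr` below is a stand-in for the poster's array.

```python
import numpy as np

arr = np.array([[0, 1], [2, 0]])  # hypothetical sample data

# Pick arr + 1 where the element is zero, otherwise keep the element
t = np.where(arr == 0, arr + 1, arr)
print(t)
```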

Building up an array in numpy/scipy by iteration in Python?

- by user248237

Often, I am building an array by iterating through some data, e.g.:

    my_array = []
    for n in range(1000):
        # do operation, get value
        my_array.append(value)
    # cast to array
    my_array = array(my_array)

I find that I have to first build a list and then cast it (using "array") to an array. Is there a way around this? All these casting calls clutter the code. How can I iteratively build up "my_array", with it being an array from the start? Thanks.
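Two possible patterns, sketched under the assumption that the number of values is known in advance (the square root is only a placeholder for the real operation): preallocate and fill by index, or let `np.fromiter` consume a generator.

```python
import numpy as np

n = 1000

# Option 1: preallocate, then assign by index inside the loop
my_array = np.empty(n, dtype=float)
for i in range(n):
    my_array[i] = i ** 0.5  # placeholder for "do operation, get value"

# Option 2: build directly from a generator; count lets numpy allocate once
from_gen = np.fromiter((i ** 0.5 for i in range(n)), dtype=float, count=n)
```

When the final length is not known up front, appending to a list and converting once at the end is still a reasonable choice, since arrays cannot grow in place.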

Calculate Matrix Rank using scipy

- by Hooked

I'd like to calculate the mathematical rank of a matrix using scipy. The most obvious function, numpy.rank, calculates the dimension of an array (i.e. scalars have dimension 0, vectors 1, matrices 2, etc.). I am aware that the numpy.linalg.lstsq function has this capability, but I was wondering if such a fundamental operation is built into the matrix class somewhere. Here is an explicit example:

    from numpy import matrix, rank
    A = matrix([[1,3,7],[2,8,3],[7,8,1]])
    print rank(A)

This gives 2, the dimension, whereas I'm looking for an answer of 3.
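For reference, newer NumPy releases than the one in the question provide `numpy.linalg.matrix_rank`, which estimates the rank from the singular values. A minimal sketch with the same matrix, written for Python 3:

```python
import numpy as np

A = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]])

# matrix_rank counts singular values above a tolerance
r = np.linalg.matrix_rank(A)
print(r)

# The number of array dimensions is available separately as ndim
print(A.ndim)
```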

Compute divergence of vector field using python

- by nyvltak

Is there a function that can be used to calculate the divergence of a vector field? (In MATLAB: http://www.mathworks.ch/help/techdoc/ref/divergence.html) I would expect it to exist in numpy/scipy, but I cannot find it using Google :(.

I need to calculate div[A * grad(F)], where

    F = np.array([[1,2,3,4],[5,6,7,8]]) # 2D numpy ndarray
    A = np.array([[1,2,3,4],[1,2,3,4]]) # 2D numpy ndarray

so grad(F) is a set of 2D ndarrays.

I know I can calculate the divergence like this: http://en.wikipedia.org/wiki/Divergence#Application_in_Cartesian_coordinates but I do not want to reinvent the wheel (and I also expect there is some optimized function).
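A rough sketch of one way to assemble this from `np.gradient`, assuming unit grid spacing and that component i of the field belongs to array axis i. The helper name `divergence` is made up here; it is not a NumPy function.

```python
import numpy as np

def divergence(field):
    """Sum of d(component_i)/d(axis_i) using finite differences."""
    return sum(np.gradient(comp, axis=i) for i, comp in enumerate(field))

F = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=float)
A = np.array([[1, 2, 3, 4], [1, 2, 3, 4]], dtype=float)

grad_F = np.gradient(F)                      # one array per axis
div = divergence([A * g for g in grad_F])    # div[A * grad(F)]
print(div)
```

With real coordinates, the grid spacing can be passed to `np.gradient` as extra positional arguments.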

What to beware of reading old Numarray tutorials and examples?

- by DarenW

Python currently uses Numpy for heavy duty math and image processing. The earlier Numeric and Numarray are obsolete, but still today there are many tutorials, notes, sample code and other documentation using them. Some of these cover special topics of interest, some are well written but haven't been updated or replaced, or are otherwise of use. Quite a bit is the same between Numeric, Numarray and Numpy, so I usually get good mileage out of these older docs. Occasionally, though, I run into a line of code that results in an error. It does not happen often enough for me to remember how to get around it, but usually I figure it out at the cost of some time. What are the main things to watch out for when relying on such older documentation for current Numpy use? Is there a list of how to translate the differences that exist?

more efficient way to pickle a string

- by gatoatigrado

The pickle module seems to use string escape characters when pickling; this becomes inefficient, e.g. on numpy arrays. Consider the following:

    z = numpy.zeros(1000, numpy.uint8)
    len(z.dumps())
    len(cPickle.dumps(z.dumps()))

The lengths are 1133 characters and 4249 characters respectively. z.dumps() reveals something like "\x00\x00" (actual zeros in string), but pickle seems to be using the string's repr() function, yielding "'\x00\x00'" (zeros being ascii zeros), i.e.

    ("0" in z.dumps() == False) and ("0" in cPickle.dumps(z.dumps()) == True)
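The size difference is consistent with the default text protocol (protocol 0) escaping non-printable bytes. A sketch in Python 3 comparing protocol 0 with a binary protocol; the variable names are illustrative only:

```python
import pickle
import numpy as np

z = np.zeros(1000, np.uint8)

text_pickle = pickle.dumps(z, protocol=0)                          # escaped text form
binary_pickle = pickle.dumps(z, protocol=pickle.HIGHEST_PROTOCOL)  # raw bytes
print(len(text_pickle), len(binary_pickle))

restored = pickle.loads(binary_pickle)
```

In Python 2 the equivalent would be passing `protocol=2` (or `cPickle.HIGHEST_PROTOCOL`) to `cPickle.dumps`.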

Pass data in np.ndarray to Highcharts

- by F.N.B

I'm working with Python 2.7, jinja2, flask and Highcharts. I create two numpy arrays (x1 and x2, type = numpy.ndarray) and pass them to Highcharts. My problem is that the commas between the elements are missing when the arrays are rendered, so Highcharts can't read the data. This is my jinja2 code:

    <script>
    $(function () {
        $('#container').highcharts({
            series: [{
                name: 'Tokyo',
                data: {{ x1 }}
            }, {
                name: 'London',
                data: {{ x2 }}
            }]
        });
    });
    </script>

And this is the rendered output that I see in the Network tab of Chrome's dev tools:

    series: [{
        name: 'Tokyo',
        data: [1 4 5 2 3]
    }, {
        name: 'London',
        data: [3 6 7 4 1]
    }]

Do I need to convert the numpy arrays to Python lists before passing them to Highcharts, or is there a better way? Thanks
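The missing commas come from how numpy arrays are converted to strings. One option, sketched here with the values from the output above, is to convert each array with `tolist()` and serialize it as JSON before it reaches the template:

```python
import json
import numpy as np

x1 = np.array([1, 4, 5, 2, 3])
x2 = np.array([3, 6, 7, 4, 1])

# tolist() yields plain Python ints, which json can serialize
x1_json = json.dumps(x1.tolist())
x2_json = json.dumps(x2.tolist())
print(x1_json, x2_json)
```

The resulting strings would then be passed to the template in place of the raw arrays.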

How can this code be made more Pythonic?

- by usethedeathstar

This next part of code does exactly what I want it to do. dem_rows and dem_cols contain float values for a number of things I can identify in an image, but I need to get the nearest pixel for each of them, and then make sure I only get the unique points, with no duplicates. The problem is that this code is ugly and, as far as I can tell, as unpythonic as it gets. If there were a pure-numpy solution (without for-loops), that would be even better.

    # next part is to make sure that we get the rounding done correctly, and then to get the integer part out of it
    # without the annoying floating-point error, and without duplicates
    fielddic={}
    for i in range(len(dem_rows)):
        # here comes the ugly part: abusing the fact that i overwrite dictionary keys if I get duplicates
        fielddic[int(round(dem_rows[i]) + 0.1), int(round(dem_cols[i]) + 0.1)] = None
    # also very ugly: to make two arrays of integers out of the first and second part of the keys
    field_rows = numpy.zeros((len(fielddic.keys())), int)
    field_cols = numpy.zeros((len(fielddic.keys())), int)
    for i, (r, c) in enumerate(fielddic.keys()):
        field_rows[i] = r
        field_cols[i] = c
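A loop-free sketch, assuming a NumPy version whose `np.unique` accepts the `axis` argument (much newer than this post). The sample coordinates are invented. Note that `np.rint` rounds exact halves to the nearest even integer, which can differ from Python 2's `round` for values ending in .5.

```python
import numpy as np

dem_rows = np.array([0.2, 0.4, 1.6, 2.4, 1.7])  # hypothetical data
dem_cols = np.array([3.1, 2.9, 4.4, 0.6, 4.2])

# Round to the nearest pixel and pair rows with columns
pixels = np.column_stack((np.rint(dem_rows), np.rint(dem_cols))).astype(int)

# Keep each (row, col) pair once; the result is sorted lexicographically
unique_pixels = np.unique(pixels, axis=0)
field_rows, field_cols = unique_pixels[:, 0], unique_pixels[:, 1]
```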

A 3-D grid of regularly spaced points

- by Jack

I want to create a list containing the 3-D coords of a grid of regularly spaced points, each as a 3-element tuple. I'm looking for advice on the most efficient way to do this. In C++ for instance, I simply loop over three nested loops, one for each coordinate. In Matlab, I would probably use the meshgrid function (which would do it in one command). I've read about meshgrid and mgrid in Python, and I've also read that using numpy's broadcasting rules is more efficient. It seems to me that using the zip function in combination with the numpy broadcast rules might be the most efficient way, but zip doesn't seem to be overloaded in numpy.
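As an illustration of the meshgrid route mentioned above (not a claim about which approach is fastest), a short sketch with made-up axis sizes:

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 2)
ys = np.linspace(0.0, 1.0, 3)
zs = np.linspace(0.0, 1.0, 4)

# indexing="ij" keeps the axis order (x, y, z)
gx, gy, gz = np.meshgrid(xs, ys, zs, indexing="ij")

# One row per grid point, columns are x, y, z
points = np.column_stack((gx.ravel(), gy.ravel(), gz.ravel()))

# Convert to a list of 3-element tuples only if tuples are really needed
coords = [tuple(p) for p in points.tolist()]
```

Keeping the `(N, 3)` array and skipping the tuple conversion usually avoids a Python-level loop altogether.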

Open source alternative to MATLAB's fmincon function?

- by dF

Is there an open-source alternative to MATLAB's fmincon function for constrained linear optimization? I'm rewriting a MATLAB program to use Python / NumPy / SciPy and this is the only function I haven't found an equivalent to. A NumPy-based solution would be ideal, but any language will do.

Can't import matplotlib

- by None

I installed matplotlib using the Mac disk image installer for MacOS 10.5 and Python 2.5. I installed numpy, then tried to import matplotlib, but got this error:

    ImportError: numpy 1.1 or later is required; you have 2.0.0.dev8462

It seems to me that version 2.0.0.dev8462 would be later than version 1.1, but I am guessing that matplotlib got confused by the ".dev8462" in the version. Is there any workaround for this?

Euclidean Distances between points

- by R S

I have an array of points in numpy:

    points = rand(dim, n_points)

And I want to:

1. Calculate all the l2 norms (Euclidean distances) between a certain point and all other points.
2. Calculate all pairwise distances.

Preferably all in numpy, with no for loops.
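Both tasks can be sketched with broadcasting, keeping the `(dim, n_points)` layout from the question. The sizes and the random seed are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_points = 3, 5
points = rng.random((dim, n_points))

# 1. Distances from point 0 to every point (column 0 kept 2-D for broadcasting)
d0 = np.linalg.norm(points - points[:, [0]], axis=0)

# 2. All pairwise distances: diff has shape (dim, n_points, n_points)
diff = points[:, :, None] - points[:, None, :]
pairwise = np.sqrt((diff ** 2).sum(axis=0))
```

The intermediate `diff` array grows as dim * n_points**2, so for large inputs a dedicated routine such as `scipy.spatial.distance.cdist` may be a better fit.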

Matplotlib installation problems

- by Werner

Hi, I need to install matplotlib on a remote Linux machine, where I am a normal user. I downloaded the source and ran python setup.py build, but I get errors related to numpy, which is not installed, so I decided to install it first. I downloaded it and compiled it with python setup.py build. My question now is: how do I tell the matplotlib installation where the numpy files have been installed? Thanks

Replicating SQL's 'Join' in Python

- by Daniel Mathews

I'm in the process of trying to switch from R to Python (mainly because of issues around general flexibility). With Numpy, matplotlib and ipython, I've been able to cover all my use cases save for merging 'datasets'. I would like to simulate SQL's join by clause (inner, outer, full) purely in Python. R handles this with the 'merge' function. I've tried the numpy.lib.recfunctions join_by, but it has critical issues with duplicates along the 'key':

    join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2', defaults=None, usemask=True, asrecarray=False)

    Join arrays r1 and r2 on key key. The key should be either a string or a sequence of string corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm.

source: http://presbrey.mit.edu:1234/numpy.lib.recfunctions.html

Any pointers or help will be most appreciated!

Combinatorial optimisation of a distance metric

- by Jose

I have a set of trajectories, made up of points along the trajectory, and with the coordinates associated with each point. I store these in a 3d array (trajectory, point, param). I want to find the set of r trajectories that have the maximum accumulated distance between the possible pairwise combinations of these trajectories. My first attempt, which I think is working, looks like this:

    max_dist = 0
    for h in itertools.combinations ( xrange(num_traj), r):
        for (m,l) in itertools.combinations (h, 2):
            accum = 0.
            for ( i, j ) in itertools.izip ( range(k), range(k) ):
                A = [ (my_mat[m, i, z] - my_mat[l, j, z])**2 \
                      for z in xrange(k) ]
                A = numpy.array( numpy.sqrt (A) ).sum()
                accum += A
        if max_dist < accum:
            selected_trajectories = h

This takes forever, as num_traj can be around 500-1000, and r can be around 5-20. k is arbitrary, but can typically be up to 50. Trying to be super-clever, I have put everything into two nested list comprehensions, making heavy use of itertools:

    chunk = [[ numpy.sqrt((my_mat[m, i, :] - my_mat[l, j, :])**2).sum() \
               for ((m,l),i,j) in \
               itertools.product ( itertools.combinations(h,2), range(k), range(k)) ]\
             for h in itertools.combinations(range(num_traj), r) ]

Apart from being quite unreadable (!!!), it is also taking a long time. Can anyone suggest any ways to improve on this?

ndarray field names for both row and column?

- by Graham Mitchell

I'm a computer science teacher trying to create a little gradebook for myself using NumPy. But I think it would make my code easier to write if I could create an ndarray that uses field names for both the rows and columns. Here's what I've got so far:

    import numpy as np
    num_stud = 23
    num_assign = 2
    grades = np.zeros(num_stud, dtype=[('assign 1','i2'), ('assign 2','i2')]) #etc
    gv = grades.view(dtype='i2').reshape(num_stud,num_assign)

So, if my first student gets a 97 on 'assign 1', I can write either of:

    grades[0]['assign 1'] = 97
    gv[0][0] = 97

Also, I can do the following:

    np.mean( grades['assign 1'] ) # class average for assignment 1
    np.sum( gv[0] ) # total points for student 1

This all works. But what I can't figure out how to do is use a student id number to refer to a particular student (assume that two of my students have student ids as shown):

    grades['123456']['assign 2'] = 95
    grades['314159']['assign 2'] = 83

...or maybe create a second view with the different field names?

    np.sum( gview2['314159'] ) # total points for the student with the given id

I know that I could create a dict mapping student ids to indices, but that seems fragile and crufty, and I'm hoping there's a better way than:

    id2i = { '123456': 0, '314159': 1 }
    np.sum( gv[ id2i['314159'] ] )

I'm also willing to re-architect things if there's a cleaner design. I'm new to NumPy, and I haven't written much code yet, so starting over isn't out of the question if I'm Doing It Wrong. I am going to be needing to sum all the assignment points for over a hundred students once a day, as well as run standard deviations and other stats. Plus, I'll be waiting on the results, so I'd like it to run in only a couple of seconds. Thanks in advance for any suggestions.

Scipy sparse... arrays?

- by spitzanator

Hey, folks. So, I'm doing some Kmeans classification using numpy arrays that are quite sparse-- lots and lots of zeroes. I figured that I'd use scipy's 'sparse' package to reduce the storage overhead, but I'm a little confused about how to create arrays, not matrices. I've gone through this tutorial on how to create sparse matrices: http://www.scipy.org/SciPy_Tutorial#head-c60163f2fd2bab79edd94be43682414f18b90df7 To mimic an array, I just create a 1xN matrix, but as you may guess, Asp.dot(Bsp) doesn't quite work because you can't multiply two 1xN matrices. I'd have to transpose each array to Nx1, and that's pretty lame, since I'd be doing it for every dot-product calculation. Next up, I tried to create an NxN matrix where column 1 == row 1 (such that you can multiply two matrices and just take the top-left corner as the dot product), but that turned out to be really inefficient. I'd love to use scipy's sparse package as a magic replacement for numpy's array(), but as yet, I'm not really sure what to do. Any advice? Thank you very much!

Finding matching submatrices inside a matrix

- by DaveO

I have a 100x200 2D array expressed as a numpy array consisting of black (0) and white (255) cells. It is a bitmap file. I then have 2D shapes (it's easiest to think of them as letters) that are also 2D black and white cells. I know I can naively iterate through the matrix, but this is going to be a 'hot' portion of my code, so speed is a concern. Is there a fast way to perform this in numpy/scipy? I looked briefly at Scipy's correlate function. I am not interested in 'fuzzy matches', only exact matches. I also looked at some academic papers, but they are above my head.

Detecting periodic repetitions in the data stream

- by pulegium

Let's say I have an array of zeros:

    a = numpy.zeros(1000)

I then introduce some repetitive 'events':

    a[range(0, 1000, 30)] = 1

Question is, how do I detect the 'signal' there? Because it's far from the ideal signal, if I do the 'regular' FFT I don't get a clear indication of where my 'true' signal is:

    f = abs(numpy.fft.rfft(a))

Is there a method to detect these repetitions with some degree of certainty? Especially if I have a few of those mixed in, for example here:

    a[range(0, 1000, 30)] = 1
    a[range(0, 1000, 110)] = 1
    a[range(0, 1000, 48)] = 1

I'd like to get three 'spikes' on the resulting data...
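One technique that is sometimes tried for this kind of pulse train is autocorrelation instead of a plain FFT. This is a swapped-in idea, not something from the post, and the sketch below only covers the single-period case; mixed periods produce overlapping peaks that need more careful analysis.

```python
import numpy as np

a = np.zeros(1000)
a[np.arange(0, 1000, 30)] = 1

# Autocorrelation at non-negative lags: ac[lag] counts event pairs 'lag' apart
ac = np.correlate(a, a, mode="full")[len(a) - 1:]

# Skip lag 0 (the signal matched with itself) and take the strongest lag
best_lag = int(np.argmax(ac[1:])) + 1
print(best_lag)
```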

Mean of Sampleset and powered Sampleset

- by Milla Well

I am working on an ICA implementation which is based on the assumption that all source signals are independent. So I checked on the basic concepts of Dependence vs. Correlation and tried to show this example on sample data:

    from numpy import *
    from numpy.random import *
    k = 1000
    s = 10000
    mn = 0
    mnPow = 0
    for i in arange(1,k):
        a = randn(s)
        a = a-mean(a)
        mn = mn + mean(a)
        mnPow = mnPow + mean(a**3)
    print "Mean X: ", mn/k
    print "Mean X^3: ", mnPow/k

But I couldn't reproduce the last step of this example, E(X^3) = 0:

    >> Mean X: -1.11174580826e-18
    >> Mean X^3: -0.00125229267144

The first value I would consider to be zero, but the second value is too large, isn't it? Since I subtract the mean of a, I expected the mean of a^3 to be zero as well. Does the problem lie in the random number generator, in the precision of the numerical values, or in my misunderstanding of the concepts of mean and expected value?
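A back-of-the-envelope check, assuming the samples are standard normal: the variance of X^3 is E[X^6] = 15, so the sample mean of X^3 over k * s draws should scatter around zero with a standard error of about sqrt(15 / (k * s)). The sketch ignores the small effects of the per-batch centering and of the loop running k - 1 times.

```python
import math

k = 1000
s = 10000

# Var(X^3) = E[X^6] - E[X^3]^2 = 15 - 0 for a standard normal X
var_x3 = 15.0

# Standard error of the mean of X^3 over all k * s samples
se = math.sqrt(var_x3 / (k * s))
print(se)
```

Under these assumptions, the reported value of roughly -0.00125 is about one standard error from zero, which is what sampling noise alone would produce.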

ValueError: setting an array element with a sequence.

- by MedicalMath

This code:

    import numpy as p

    def firstfunction():
        UnFilteredDuringExSummaryOfMeansArray = []
        MeanOutputHeader=['TestID','ConditionName','FilterType','RRMean','HRMean','dZdtMaxVoltageMean','BZMean','ZXMean'
            ,'LVETMean','Z0Mean','StrokeVolumeMean','CardiacOutputMean','VelocityIndexMean']
        dataMatrix = BeatByBeatMatrixOfMatrices[column]
        roughTrimmedMatrix = p.array(dataMatrix[1:,1:17])
        trimmedMatrix = p.array(roughTrimmedMatrix,dtype=p.float64)
        myMeans = p.mean(trimmedMatrix,axis=0,dtype=p.float64)
        conditionMeansArray = [TestID,testCondition,'UnfilteredBefore',myMeans[3], myMeans[4], myMeans[6], myMeans[9]
            , myMeans[10], myMeans[11], myMeans[12], myMeans[13], myMeans[14], myMeans[15]]
        UnFilteredDuringExSummaryOfMeansArray.append(conditionMeansArray)
        secondfunction(UnFilteredDuringExSummaryOfMeansArray)
        return

    def secondfunction(UnFilteredDuringExSummaryOfMeansArray):
        RRDuringArray = p.array(UnFilteredDuringExSummaryOfMeansArray,dtype=p.float64)[1:,3]
        return

    firstfunction()

Throws this error message:

    File "mypath\mypythonscript.py", line 3484, in secondfunction
      RRDuringArray = p.array(UnFilteredDuringExSummaryOfMeansArray,dtype=p.float64)[1:,3]
    ValueError: setting an array element with a sequence.

However, this code works:

    import numpy as p
    a=range(24)
    b = p.reshape(a,(6,4))
    c=p.array(b,dtype=p.float64)[:,2]

I re-arranged the code a bit to put it into a cogent posting, but it should more or less have the same result. Can anyone show me what to do to fix the problem in the broken code above so that it stops throwing an error message?

How much RAM used by Python dict or list?

- by Who8MyLunch

My problem: I am writing a simple Python tool to help me visualize my data as a function of many parameters. Each change in parameters involves a non-trivial amount of time, so I would like to cache each step's resulting imagery and supporting data in a dictionary. But then I worry that this dictionary could grow too large over time. Most of my data is in the form of Numpy arrays.

My question: How would one go about computing the total number of bytes used by a Python dictionary? The dictionary itself may contain lists and other dictionaries, each of which contains data stored in Numpy arrays. Ideas?
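A rough estimate can be sketched by walking the container recursively, using `sys.getsizeof` for Python objects and `ndarray.nbytes` for array data. The helper `total_bytes` is hypothetical, and the figure is approximate: it ignores array object headers and counts a view's data as if the view owned it.

```python
import sys
import numpy as np

def total_bytes(obj, seen=None):
    """Approximate bytes held by nested dicts/lists/tuples/sets of arrays."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    if isinstance(obj, np.ndarray):
        return obj.nbytes
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_bytes(k, seen) + total_bytes(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_bytes(item, seen) for item in obj)
    return size

cache = {"image": np.zeros(1000), "meta": [np.zeros(10, dtype=np.uint8)]}
print(total_bytes(cache))
```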

cPickle ImportError: No module named multiarray

- by Rafal

Hello, I'm using cPickle to save my Database into a file. The code looks like this:

    def Save_DataBase():
        import cPickle
        from scipy import *
        from numpy import *
        a=Results.VersionName
        #filename='D:/results/'+a[a.find('/')+1:-a.find('/')-2]+Results.AssType[:3]+str(random.randint(0,100))+Results.Distribution+".lft"
        filename='D:/results/pppp.lft'
        plik=open(filename,'w')
        DataOutput=[[[DataBase.Arrays.Nodes,DataBase.Arrays.Links,DataBase.Arrays.Turns,DataBase.Arrays.Connectors,DataBase.Arrays.Zones],
            [DataBase.Nodes.Data,DataBase.Links.Data,DataBase.Turns.Data,DataBase.OrigConnectors.Data,DataBase.DestConnectors.Data,DataBase.Zones.Data],
            [DataBase.Nodes.DictionaryPy2Vis,DataBase.Links.DictionaryPy2Vis,DataBase.Turns.DictionaryPy2Vis,DataBase.OrigConnectors.DictionaryPy2Vis,DataBase.DestConnectors.DictionaryPy2Vis,DataBase.Zones.DictionaryPy2Vis],
            [DataBase.Nodes.DictionaryVis2Py,DataBase.Links.DictionaryVis2Py,DataBase.Turns.DictionaryVis2Py,DataBase.OrigConnectors.DictionaryVis2Py,DataBase.DestConnectors.DictionaryVis2Py,DataBase.Zones.DictionaryVis2Py],
            [DataBase.Paths.List]],[Results.VersionName,Results.noZones,Results.noNodes,Results.noLinks,Results.noTurns,Results.noTrips,
            Results.Times.VersionLoad,Results.Times.GetData,Results.Times.GetCoords,Results.Times.CrossTheTime,Results.Times.Plot_Cylinder,
            Results.AssType,Results.AssParam,Results.tStart,Results.tEnd,Results.Distribution,Results.tVector]]
        cPickle.dump(DataOutput, plik, protocol=0)
        plik.close()

And it works fine. Most of my Database rows are lists of lists, vector-like, or array-like data sets.

But now when I load the data, an error occurs:

    def Load_DataBase():
        import cPickle
        from scipy import *
        from numpy import *
        filename='D:/results/pppp.lft'
        plik= open(filename, 'rb')
        """ first cPickle load approach """
        A= cPickle.load(plik)
        """ fail """
        """ Another approach - data format exact as in Output step above , also fails"""
        [[[DataBase.Arrays.Nodes,DataBase.Arrays.Links,DataBase.Arrays.Turns,DataBase.Arrays.Connectors,DataBase.Arrays.Zones],
            [DataBase.Nodes.Data,DataBase.Links.Data,DataBase.Turns.Data,DataBase.OrigConnectors.Data,DataBase.DestConnectors.Data,DataBase.Zones.Data],
            [DataBase.Nodes.DictionaryPy2Vis,DataBase.Links.DictionaryPy2Vis,DataBase.Turns.DictionaryPy2Vis,DataBase.OrigConnectors.DictionaryPy2Vis,DataBase.DestConnectors.DictionaryPy2Vis,DataBase.Zones.DictionaryPy2Vis],
            [DataBase.Nodes.DictionaryVis2Py,DataBase.Links.DictionaryVis2Py,DataBase.Turns.DictionaryVis2Py,DataBase.OrigConnectors.DictionaryVis2Py,DataBase.DestConnectors.DictionaryVis2Py,DataBase.Zones.DictionaryVis2Py],
            [DataBase.Paths.List]],[Results.VersionName,Results.noZones,Results.noNodes,Results.noLinks,Results.noTurns,Results.noTrips,
            Results.Times.VersionLoad,Results.Times.GetData,Results.Times.GetCoords,Results.Times.CrossTheTime,Results.Times.Plot_Cylinder,
            Results.AssType,Results.AssParam,Results.tStart,Results.tEnd,Results.Distribution,Results.tVector]]= cPickle.load(plik)

The error is (in both cases):

    A= cPickle.load(plik)
    ImportError: No module named multiarray

Any ideas? PS.
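One thing that may be worth checking, and this is an assumption since the post does not confirm the cause: the file is written in text mode ('w') but read in binary mode ('rb'). On Windows, text mode rewrites line endings, which can corrupt the module names stored in a pickle. A sketch in Python 3 that uses binary mode on both sides, with placeholder data and a temporary path:

```python
import os
import pickle
import tempfile
import numpy as np

data = [np.arange(5), {"name": "demo"}]           # placeholder for DataOutput
path = os.path.join(tempfile.mkdtemp(), "pppp.lft")

# Write and read with the same (binary) mode
with open(path, "wb") as fh:
    pickle.dump(data, fh, protocol=0)

with open(path, "rb") as fh:
    loaded = pickle.load(fh)
```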

Search Results

Search found 338 results on 14 pages for 'numpy'.
