Making a binned boxplot in matplotlib with numpy and scipy in Python

Posted by user248237 on Stack Overflow
Published on 2010-04-26T21:06:51Z Indexed on 2010/04/27 0:03 UTC

I have a 2-d array containing pairs of (x, y) values and I'd like to make a boxplot of the y-values for different bins of the x-values. For example, if the array is:

my_array = array([[1, 40.5], [4.5, 60], ...])

then I'd like to bin my_array[:, 0] and then, for each bin, produce a boxplot of the corresponding my_array[:, 1] values that fall into that bin. So in the end I want the plot to contain as many boxplots as there are bins.

I tried the following:

from numpy import digitize, linspace

min_x = min(my_array[:, 0])
max_x = max(my_array[:, 0])  # both bounds of the x-range come from column 0

num_bins = 3
bins = linspace(min_x, max_x, num_bins)
elts_to_bins = digitize(my_array[:, 0], bins)

However, this gives me values in elts_to_bins that range from 1 to 3. I expected 0-based bin indices, and I only wanted 3 bins. I assume this is due to some trickiness in how bins are represented in linspace vs. digitize.
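
For reference, a minimal sketch with made-up x-values (not the data from the question) shows where the 1-to-3 range comes from. np.linspace(lo, hi, 3) returns 3 edge values, which delimit only 2 intervals. np.digitize with its default right=False returns 0 for values below the first edge and len(bins) for values at or above the last edge, so in-range values start at 1 and the maximum lands on 3:

```python
import numpy as np

# Made-up x-values, for illustration only.
x = np.array([1.0, 2.0, 4.5, 7.0, 10.0])

# Three edge values delimit only two intervals: [1, 5.5) and [5.5, 10).
edges = np.linspace(x.min(), x.max(), 3)

# Index 0 is reserved for values below edges[0]; values >= edges[-1] get len(edges).
idx = np.digitize(x, edges)

print(edges)  # [ 1.   5.5 10. ]
print(idx)    # [1 1 1 2 3]
```

Only the maximum x-value ends up with index 3, because it sits exactly on the last edge.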

What is the easiest way to achieve this? I want num_bins-many equally spaced bins, with the first bin containing the lowest x-values and the last bin containing the highest, i.e., I want each data point to fall into some bin, so that I can make a boxplot.
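
One possible approach, as a sketch under stated assumptions rather than a definitive answer: build num_bins + 1 equally spaced edges, then call np.digitize against the interior edges only, which yields 0-based indices from 0 to num_bins - 1 and puts the maximum x-value in the last bin. The helper names group_y_by_x_bins and plot_binned_boxplot are hypothetical, the sketch assumes num_bins >= 2, and every row of the sample array after the first two is made up for illustration.

```python
import numpy as np

def group_y_by_x_bins(data, num_bins):
    """Split data[:, 1] into num_bins groups using equal-width bins of data[:, 0]."""
    x, y = data[:, 0], data[:, 1]
    # num_bins intervals need num_bins + 1 edges.
    edges = np.linspace(x.min(), x.max(), num_bins + 1)
    # Digitizing against the interior edges only gives indices 0..num_bins-1,
    # so the minimum falls in bin 0 and the maximum in the last bin.
    idx = np.digitize(x, edges[1:-1])
    groups = [y[idx == i] for i in range(num_bins)]
    return edges, groups

def plot_binned_boxplot(data, num_bins):
    """Draw one boxplot per x-bin, positioned at the bin centers."""
    # Imported here so the grouping helper can be used without matplotlib.
    import matplotlib.pyplot as plt
    edges, groups = group_y_by_x_bins(data, num_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    fig, ax = plt.subplots()
    ax.boxplot(groups, positions=centers, widths=0.6 * (edges[1] - edges[0]))
    ax.set_xlabel("x (bin centers)")
    ax.set_ylabel("y")
    return fig, ax

# First two rows are from the question; the rest are made-up sample values.
my_array = np.array([[1, 40.5], [4.5, 60], [2, 35], [7, 80], [10, 95], [6, 70]])
edges, groups = group_y_by_x_bins(my_array, 3)
print(edges)                         # [ 1.  4.  7. 10.]
print([g.tolist() for g in groups])  # [[40.5, 35.0], [60.0, 70.0], [80.0, 95.0]]
```

Calling plot_binned_boxplot(my_array, 3) followed by matplotlib's plt.show() would then display the three boxplots. With real data some bins may be empty, which is worth checking before plotting.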

Thanks.

© Stack Overflow or respective owner
