Parallel computation of the median of a large array

Posted by recipriversexclusion on Stack Overflow See other posts from Stack Overflow or by recipriversexclusion
Published on 2010-05-28T21:02:29Z Indexed on 2010/05/28 21:32 UTC
Read the original article Hit count: 151

Filed under:
|

I got asked this question once and still haven't been able to figure it out:

You have an array of N integers, where N is large, say, a billion. You want to calculate the median value of this array. Assume you have m+1 machines (m workers, one master) to distribute the job to. How would you go about doing this?

Since the median is a nonlinear operator, you can't just find the median in each machine and then take the median of those values.

© Stack Overflow or respective owner

Related posts about parallel-processing

Related posts about median