Enumerating combinations in a distributed manner

Posted by Reyzooti on Stack Overflow See other posts from Stack Overflow or by Reyzooti
Published on 2011-01-15T08:15:36Z Indexed on 2011/01/15 8:53 UTC
Read the original article Hit count: 161

Filed under:
|
|
|
|

I have a problem where I must analyse 500C5 combinations (255244687600) of something. Distributing it over a 10 node cluster where each cluster processes roughly 10^6 combinations per second means the job will be complete in about 7hours.

The problem I have is distributing the 255244687600 combinations over the 10 nodes. I'd like to present each node with 25524468760, however the algorithms I'm using can only produce the combinations sequentially, I'd like to be able to pass the set of elements and a range of combination indicies eg: [0-10^7) or [10^7,2.0 10^7) etc and have the nodes themselves figure out the combinations.

The algorithms I'm using at the moment are from the following:

http://home.roadrunner.com/~hinnant/combinations.html

A logical question

I've considered using a master node, that enumerates each of the combinations and sends work to each of the nodes, however the overhead incurred in iterating the combinations from a single node and communicating back and forth work is enormous, and will subsequently lead to the master node becoming the bottleneck.

Are there any good combination iterating algorithms geared up for efficient/optimal distributed enumeration?

© Stack Overflow or respective owner

Related posts about c++

Related posts about algorithm