Subset generation by rules

Posted by Sazug on Stack Overflow
Published on 2009-12-19T21:10:05Z Indexed on 2010/04/16 12:53 UTC

Let's say that we have 5000 users in a database. Each user row has a sex column, a column for the place where he/she was born, and a status column (married or not married).

How can I generate a random subset (let's say 100 users) that satisfies these conditions:

  • 40% should be male and 60% female
  • 50% should be born in the USA, 20% in the UK, 20% in Canada, and 10% in Australia
  • 70% should be married and 30% not married.

These conditions are independent of one another; that is, we cannot simply multiply the percentages to fix the size of each combination, like this:

  • (0.4 * 0.5 * 0.7) * 100 = 14 users that are males, born in USA and married
  • (0.4 * 0.5 * 0.3) * 100 = 6 users that are males, born in USA and not married.

Is there an algorithm for generating such a subset?
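
One possible approach, offered as a minimal sketch rather than a definitive answer, is a local search: start from a random sample of 100 users, then repeatedly swap one sampled user for one unsampled user, keeping the swap only if the per-column counts do not move further from the targets. Only the totals per column are constrained, so the combinations (male, USA, married, and so on) are left free. The names `TARGETS`, `make_users`, and `sample_by_marginals` are hypothetical, and `make_users` is only a stand-in for the real database table.

```python
import random

# Target counts for a 100-user sample, taken from the percentages in the question.
TARGETS = {
    "sex": {"male": 40, "female": 60},
    "country": {"USA": 50, "UK": 20, "Canada": 20, "Australia": 10},
    "married": {True: 70, False: 30},
}

def make_users(n, rng):
    """Hypothetical stand-in for the database: n users with random attributes."""
    return [
        {
            "sex": rng.choice(["male", "female"]),
            "country": rng.choice(["USA", "UK", "Canada", "Australia"]),
            "married": rng.choice([True, False]),
        }
        for _ in range(n)
    ]

def sample_by_marginals(users, targets, size, rng, max_steps=100_000):
    """Return (sample, deviation); deviation 0 means every target count is met."""
    chosen = rng.sample(range(len(users)), size)  # indices currently in the sample
    in_sample = set(chosen)
    counts = {attr: dict.fromkeys(vals, 0) for attr, vals in targets.items()}

    def bump(user, delta):
        # Add or remove one user from the per-column counts.
        for attr in targets:
            if user[attr] in counts[attr]:
                counts[attr][user[attr]] += delta

    def deviation():
        # Sum of absolute differences between current and target counts.
        return sum(
            abs(counts[attr][value] - target)
            for attr, vals in targets.items()
            for value, target in vals.items()
        )

    for i in chosen:
        bump(users[i], +1)

    best = deviation()
    for _ in range(max_steps):
        if best == 0:
            break
        slot = rng.randrange(size)             # sampled user to swap out
        candidate = rng.randrange(len(users))  # user to swap in
        if candidate in in_sample:
            continue
        bump(users[chosen[slot]], -1)
        bump(users[candidate], +1)
        new = deviation()
        if new <= best:
            # Keep the swap: the counts are no further from the targets.
            in_sample.remove(chosen[slot])
            in_sample.add(candidate)
            chosen[slot] = candidate
            best = new
        else:
            # Undo the swap.
            bump(users[candidate], -1)
            bump(users[chosen[slot]], +1)
    return [users[i] for i in chosen], best
```

Calling `sample_by_marginals(make_users(5000, rng), TARGETS, 100, rng)` with `rng = random.Random(0)` returns the sample together with its remaining deviation. A nonzero deviation would mean the search stopped before meeting every target, for example because the database holds too few users of some kind. The result is random, but this sketch makes no claim that it is uniformly distributed over all valid subsets.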

© Stack Overflow or respective owner
