Generating strongly biased radom numbers for tests

Posted by nobody on Stack Overflow See other posts from Stack Overflow or by nobody
Published on 2012-10-07T02:13:38Z Indexed on 2012/10/07 3:38 UTC
Read the original article Hit count: 142

Filed under:
|
|

I want to run tests with randomized inputs and need to generate 'sensible' random numbers, that is, numbers that match good enough to pass the tested function's preconditions, but hopefully wreak havoc deeper inside its code.

math.random() (I'm using Lua) produces uniformly distributed random numbers. Scaling these up will give far more big numbers than small numbers,
and there will be very few integers.

I would like to skew the random numbers (or generate new ones using the old function as a randomness source) in a way that strongly favors 'simple' numbers, but will still cover the whole range, I.e. extending up to positive/negative infinity (or ±1e309 for double). This means:

  • numbers up to, say, ten should be most common,
  • integers should be more common than fractions,
  • numbers ending in 0.5 should be the most common fractions,
  • followed by 0.25 and 0.75; then 0.125,
  • and so on.

A different description: Fix a base probability x such that probabilities will sum to one and define the probability of a number n as xk where k is the generation in which n is constructed as a surreal number1. That assigns x to 0, x2 to -1 and +1, x3 to -2, -1/2, +1/2 and +2, and so on. This gives a nice description of something close to what I want (it skews a bit too much), but is near-unusable for computing random numbers. The resulting distribution is nowhere continuous (it's fractal!), I'm not sure how to determine the base probability x (I think for infinite precision it would be zero), and computing numbers based on this by iteration is awfully slow (spending near-infinite time to construct large numbers).

Does anyone know of a simple approximation that, given a uniformly distributed randomness source, produces random numbers very roughly distributed as described above?

I would like to run thousands of randomized tests, quantity/speed is more important than quality. Still, better numbers mean less inputs get rejected.

Lua has a JIT, so performance can't be reasonably predicted.
Jumps based on randomness will break every prediction, and many calls to math.random() will be slow, too. This means a closed formula will be better than an iterative or recursive one.


1 Wikipedia has an article on surreal numbers, with a nice picture. A surreal number is a pair of two surreal numbers, i.e. x := {n|m}, and its value is the number in the middle of the pair, i.e. (for finite numbers) {n|m} = (n+m)/2 (as rational). If one side of the pair is empty, that's interpreted as increment (or decrement, if right is empty) by one. If both sides are empty, that's zero. Initially, there are no numbers, so the only number one can build is 0 := { | }. In generation two one can build numbers {0| } =: 1 and { |0} =: -1, in three we get {1| } =: 2, {|1} =: -2, {0|1} =: 1/2 and {-1|0} =: -1/2 (plus some more complex representations of known numbers, e.g. {-1|1} ? 0). Note that e.g. 1/3 is never generated by finite numbers because it is an infinite fraction – the same goes for floats, 1/3 is never represented exactly.

© Stack Overflow or respective owner

Related posts about random

Related posts about lua