ProblemWe would like a function which skews the number line between 0.0 and 1.0 such that the range of numbers closer to 0.0 are denser than those closer to 1.0 or vice versa.
Consider the following statistics gathered from the Python function random.random(), where we shall count the number of random numbers generated that are greater than 0.5 and those that are less out of 10000:
The frequencies are very close to each other. This is because random.random() uses a uniform probability distribution, meaning that each number between 0.0 and 1.0 has an equal chance of being generated. But what if we wanted to bias the frequencies so that more numbers are less than 0.5?
An application for this could be to randomly select an element from an array which is sorted on priority such that you want to randomly select the first element more often than the last. To do this you bias the random number generator towards smaller indices.
SolutionWe need a function which takes the set of inputs from 0.0 to 1.0 and returns numbers in the same range, however skewing the mapping such that a larger part of the number range gets mapped to a smaller part of the output range.
In the graph below, y = x, the mapping is unbiased, no part of the output range has been squashed.
However if we were to bend the line so that it still goes through (0,0) and (1,1) but is curved then we could skew the mapping as demonstrated below:
The graph on the left skews the output such that there is a bias for smaller numbers; all inputs which are less than 0.6 will be mapped to be closer to 0.0 whilst the rest of the inputs will be mapped to be closer to 1.0. On the other hand, the graph on the right skews the output such that there is a bias for larger numbers; only inputs smaller than 0.4 will be mapped to be closer to 0.0 whilst the rest of the inputs will be mapped to be closer to 1.0. Also, in the graph on the left, the smaller the number the greater the chance of it being returned whilst in the graph on the right, the larger the number the greater the chance of it being returned.
We would like to devise a function which will give us this curve, and to some how parameterise the curvature so that the bias can be made stronger or weaker. Such a curve is given by the exponent function
y = x^k (1)
which, given that k is positive, goes through (0,0) and (1,1) and which different values of k will change the curvature such that when k is between 0.0 and 1.0 the curvature will be going flatter and when k is between 1.0 and infinity the curvature will be going steeper.
In order to control the curvature we shall allow the user to specify a point which the curve is to pass through. To find which value of k will go through the point (x',y'), we make k subject of the formula in (1) after substituting x with x' and y with y' and we get
k = ln y' / ln x' (2)
and by substituting (2) into (1) we get
y = x^(ln y' / ln x') (3)
which is useful for describing our curve, provided both y' and x' are between 0 and 1, both not included.
However we would like the way we control the curvature to be more meaningful to the user. So instead we let the user provide just one variable, called the bias, which will be the x value which will be mapped to 0.5, hence all smaller values will be mapped to be less than 0.5 and all greater values will be mapped to be more than 0.5. To do this, we set y' to be 0.5, so that x' will be the bias, that is, the x value which gives 0.5. After modifying equation (3), the new equation is
y = x^(ln 0.5 / ln b) (4)
where b is the bias.
The following graphs are obtained for different values of b:
ResultsThe following statistics demonstrate the validity of the function. Again 10000 random numbers are generated using Python's random.random() except that this time we shall pass the random numbers to the devised funtion as x using different values for b and all cases use a seed of 0:
So using this function we can easily give a bias to random number generators. The only problem with it is that a bias of zero or one will result in a log of zero and division by zero respectively and hence cannot be done.
SummaryTo bias a random number generator which gives random fractions so that it is biased towards numbers closer to 1 or 0, use the equation y = x^(ln 0.5 / ln b) where x is the unbiased random number, b is the fraction of numbers to be less than 0.5 and y is the biased random number.