I've written articles about weighted random numbers in the past, but today I ran into a use I've been meaning to explain for a long time.
For example, when rolling two dice, the most likely total is 7. With four dice, it's 14. In the context of this article these are weighted rolls, because the likely outcomes are not evenly distributed but tend toward some center point.
One of the weighting algorithms I've written about in the past is Banded Inverse Root Nonuniform Scatter. This is the function:
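The dice claim above is easy to check by brute force. The following is a minimal sketch that enumerates every possible roll and reports the most likely total; the function names `total_counts` and `most_likely_totals` are illustrative and not from the original article.

```python
from collections import Counter
from itertools import product


def total_counts(dice, sides):
    """Count how many of the sides**dice equally likely rolls give each total."""
    faces = range(1, sides + 1)
    return Counter(sum(roll) for roll in product(faces, repeat=dice))


def most_likely_totals(dice, sides):
    """Return the total (or totals, if tied) with the most ways to be rolled."""
    counts = total_counts(dice, sides)
    best = max(counts.values())
    return sorted(total for total, ways in counts.items() if ways == best)


print(most_likely_totals(2, 6))  # [7]
print(most_likely_totals(4, 6))  # [14]
```

Enumeration is only practical for small numbers of dice, since the number of rolls grows as sides to the power of dice.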
Where α1 and α2 are random numbers between 0 and 1, and S is the “scatter coefficient”. The root of this function is the banding part.
This weights the roll toward 0, and the larger the value of S, the stronger the pull. Using two of these functions together gives a function with a peak centered at 0 and a range that extends both positive and negative. Note that the last part of the function normalizes the output so it is between 0 and 1. The process will be explained in a bit, but this function will be called nb(S). So in parts, the full function is:
This function can be simplified if the square root is removed. The root makes the curve more gradual, but this isn't needed.
The trick to this function is the use of the -1, +1 in the denominator. This allows the scatter coefficient to have a defined range between negative infinity and positive infinity (i.e. -∞ < S < +∞), although the useful range is 0 ≤ S < +∞.
The normalized function looks like this:
Rebuilding the center-weighted function results in:
So g( α1, α2, S ) is our weighted function. α1 and α2 are random numbers between 0 and 1. S is the scatter coefficient, 0 ≤ S < ∞. The larger the value of S, the more weighted the output is toward 0.
The graph above shows a histogram of the distribution for various scatter values and illustrates how the concentration toward the center increases as the scatter coefficient increases. Note that this function does not create a bell curve (a normal distribution). Instead, it has a sharp point at the center. This means that for larger values of S the likelihood of landing away from the center point diminishes very rapidly, much more than it would with a function that has a normal distribution. So the function favors the center point more strongly than functions that produce a normal distribution.
Now for some of the function's versatility. The function is normally used to generate values in some range.
Here, M is a scale factor (magnitude) and c is an offset that allows the function to have a range such that -(M + c) < v < (M + c). Now a function can be defined to return a value in a given range with some weight.
Where vmin < w( vmin, vmax, S ) < vmax. The floor function makes sure the values are integers, and can be omitted if real numbers are desired. The center point will always be halfway between vmin and vmax.
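As a rough illustration of the range mapping, here is a sketch that scales a center-weighted value in (-1, 1) by a magnitude and shifts it by an offset, assuming the form v = M·g + c. Note the loud assumption: `stand_in_weight` is a hypothetical power-law stand-in (α raised to the power S), not the article's inverse-root function. It is used only because it shares the described properties: S = 1 returns α unchanged, and larger S pulls harder toward 0. All names here are illustrative.

```python
import math
import random


def stand_in_weight(alpha, scatter):
    # ASSUMPTION: a power-law stand-in, NOT the article's weighting function.
    # scatter == 1 returns alpha unchanged; larger values pull toward 0.
    return alpha ** scatter


def centered(alpha1, alpha2, scatter):
    # The difference of two weighted values peaks at 0 and lies in (-1, 1).
    return stand_in_weight(alpha1, scatter) - stand_in_weight(alpha2, scatter)


def weighted_range(v_min, v_max, scatter, rng=random):
    """Return an integer between v_min and v_max, weighted toward the midpoint."""
    magnitude = (v_max - v_min) / 2  # M: half the width of the range
    center = (v_max + v_min) / 2     # c: halfway between v_min and v_max
    g = centered(rng.random(), rng.random(), scatter)
    # Drop math.floor to get real-valued results instead of integers.
    return math.floor(magnitude * g + center)


if __name__ == "__main__":
    rng = random.Random(1)  # seeded only so the demo is repeatable
    print([weighted_range(10, 20, 3, rng) for _ in range(10)])
```

With integer bounds, the floor in this sketch yields integers from v_min up to v_max - 1, so the top of the range needs separate handling if it must be reachable.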
This function can be modified slightly to simulate a dice roll. Let n be the number of dice, and s be the number of sides on each die. Then vmin = n, vmax = n * s. The scatter coefficient (S) can be varied, but the distribution will not be identical to that of an actual dice roll.
Here the floor function is required. n < d( n, s, S ) < n*s.
In this histogram, the difference in distribution can be seen between an actual dice roll (in this case, five 6-sided dice) and the simulated function d( n, s, S ) where S = 3. Note they both peak at the same location (between 17 and 18) with roughly the same likelihood for these numbers. However, the chance of rolling a 15 is greater with a true dice roll and smaller with the simulated one. Likewise, rolling an 8 is less likely with dice and more likely when simulated. Keep in mind that the simulated dice roll can do something an actual dice roll cannot: produce fractional results. If the floor function part of d( n, s, S ) is removed, any real number in the range can be returned. So while an exhaustive check of every dice roll is possible, an exhaustive check of every simulated roll is not. Thus, the graph above used one million samples to produce the simulated histogram.
There are some additional ways the function g( α1, α2, S ) can be used. If an uncentered value is desired, the random input can be fixed.
These histograms show the output of 10,000 samples of the function, where α is a random number (0 ≤ α ≤ 1). Note how in both cases, when S = 1 the distribution is uniform for all values. This is because when S = 1, the weighting function is doing nothing, and the random value α is being returned.
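The two ways of building a histogram mentioned above can be sketched as follows. `exact_counts` performs the exhaustive check for real dice by adding one die at a time, and `sampled_histogram` tallies repeated draws from any sampler. Both names are illustrative, and the sampler in the demo is a true dice roll standing in for whatever function is being studied; it is not the article's d( n, s, S ).

```python
import random
from collections import Counter


def exact_counts(dice, sides):
    """Exact number of ways to roll each total, built up one die at a time."""
    counts = {0: 1}  # one way to total 0 with no dice
    for _ in range(dice):
        next_counts = Counter()
        for total, ways in counts.items():
            for face in range(1, sides + 1):
                next_counts[total + face] += ways
        counts = next_counts
    return dict(counts)


def sampled_histogram(sampler, samples):
    """Tally the results of calling sampler() the given number of times."""
    return Counter(sampler() for _ in range(samples))


if __name__ == "__main__":
    exact = exact_counts(5, 6)
    ways = sum(exact.values())  # 6**5 possible rolls
    rng = random.Random(0)      # seeded only so the demo is repeatable

    def roll():
        return sum(rng.randint(1, 6) for _ in range(5))

    sampled = sampled_histogram(roll, 100_000)
    for total in sorted(exact):
        print(total, exact[total] / ways, sampled[total] / 100_000)
```

For five 6-sided dice the exact counts tie at 17 and 18, which is why the peak sits between those two totals.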