Appendix 2: The Random Number Generator

A sequence is random if knowledge of an initial segment of that sequence provides no information which can be used to predict the next item in the sequence. Thus a mathematical random number generator, which generates random numbers in a deterministic fashion based on some arithmetic algorithm, is not random assuming that the algorithm is known. But if the numbers in the sequence exhibit no (or very little) correlation, so that in the absence of knowledge of the algorithm it is impossible to predict the next number, then the sequence may be called pseudo-random, and the generator is called a pseudo-random number generator, abbreviated "PRNG".

The period of a PRNG is the number of random numbers generated before the sequence begins to repeat itself. There are some PRNGs whose sequences degenerate quickly into some short cycle. A good PRNG should have a very long period, or at least a period longer than the number of random numbers one needs.

For this work it was decided to take a PRNG from among those offered in Press et al. 1992. In early trials the ran1() generator (Press et al. 1992, p.280) was used. This has a period of about 109. It was found that the simulation program required more than this number of random numbers, so ran2() (Press et al. 1992, p.282) was used. This has a period of more than 1018, which is far more than what is needed by the simulation program.

Several programs were developed to test that this PRNG was implemented properly.

(i) The first test program generates a sequence of random numbers and checks for a repetition. Repetitions do sometimes occur, as the sample output below shows, but this does not cause the series of random numbers to repeat itself.

```x[   326] = 0.6915861697796846  iy2[   326] = 1485169932
x[ 49080] = 0.6915861697796846  iy2[ 49080] = 1485169932
x[ 47067] = 0.9751363018930823  iy2[ 47067] = 2094089180
x[ 83888] = 0.9751363018930823  iy2[ 83888] = 2094089180
x[ 86899] = 0.0051710821872307  iy2[ 86899] = 11104814
x = 0.0051710821872307  iy2 = 11104814
x[ 93989] = 0.2621932156786431  iy2[ 93989] = 563055621
x = 0.2621932156786431  iy2 = 563055621
x = 0.0156809600688897  iy2 = 33674604
x = 0.0156809600688897  iy2 = 33674604
x[ 58767] = 0.8290420968404869  iy2[ 58767] = 1780354276
x = 0.8290420968404869  iy2 = 1780354276
x[ 59327] = 0.7235610072979171  iy2[ 59327] = 1553835370
x = 0.7235610072979171  iy2 = 1553835370
x[ 19183] = 0.1803431372759708  iy2[ 19183] = 387283923
x = 0.1803431372759708  iy2 = 387283923
x = 0.7191802249896895  iy2 = 1544427712
x = 0.7191802249896895  iy2 = 1544427712```

Press et al. 1992 state that the period of this PRNG is greater than 1018. This does not imply that a sequence of more than 1018 different random numbers is produced, but states that the sequence does not begin to repeat itself until after 1018 random numbers have been generated. In fact (assuming 32-bit ints) it is impossible for such a sequence to consist of different numbers because ran2() returns the minimum of AM*iy and RNMX, where AM is a constant double and iy is a positive signed int. Assuming that an int is a 32-bit value (as is so for the C compiler used in this study), there are no more than 231 = 2,147,483,647 < 1010 positive signed ints, so in a sequence of 1010 such ints there must be repetitions.

(ii) The second test program is based upon the chi-square test, as described by Knuth (1997, pp.42-48). It performs a specified number (usually in the range 10-100) of tests. In each test it randomly sets up a number k of "bins": 10, 15, 20, 30 or 50 bins. It then chooses a number n of random numbers to be generated, in the range 40,000 through 4,000,000. It then generates n random numbers x (such that 0 < x < 1) and assigns each to bin (int)(k*x) (the bins are numbered from 0 through k-1). On average one expects each bin to receive about as many numbers as any other bin, but there are random deviations from the mean of N/k. These deviations could be either greater or less than expected by chance. To find out, one considers the "chi-square" value given by the sum of the terms ti where ti is the square of the deviation from the expected value divided by the expected value. One then consults a table (Knuth 1997, p.44) to see whether the chi-square value is so low (or so high) that it is expected to occur in less than 5% of cases, or even so low (or so high) that it is expected to occur in less than 1% of cases. When this test is performed 100 times one expects to obtain a chi-square value which is too low in about 5% of cases, and too high in about 5% of cases. This indeed is what is found with this test program (sample output below), indicating that the PRNG has been implemented correctly.

```    1: k = 21, n = 1186596, v = 26.29, nu = 20
2: k = 51, n =  647636, v = 40.39, nu = 50
3: k = 11, n = 3039920, v =  8.05, nu = 10
4: k = 51, n = 1105064, v = 63.54, nu = 50
5: k = 16, n = 3714016, v = 11.11, nu = 15
...
34: k = 16, n =  442270, v = 17.58, nu = 15
35: k = 51, n =  438536, v = 32.12, nu = 50   LOW?
36: k = 11, n = 1830180, v =  5.27, nu = 10
37: k = 16, n = 3156768, v = 12.94, nu = 15
38: k = 11, n = 2470110, v = 21.31, nu = 10   HIGH?
39: k = 16, n = 3074280, v = 13.74, nu = 15
...
61: k = 21, n =  493950, v = 25.47, nu = 20
62: k = 51, n = 1019676, v = 81.11, nu = 50   HIGH???
63: k = 11, n =  984620, v = 17.64, nu = 10
64: k = 51, n = 3031560, v = 31.64, nu = 50   LOW?
65: k = 11, n = 2347464, v =  7.48, nu = 10
...
89: k = 21, n =  947503, v = 28.86, nu = 20
90: k = 31, n = 1130220, v = 20.06, nu = 30
91: k = 31, n = 1318176, v = 19.72, nu = 30
92: k = 11, n =  344808, v =  3.34, nu = 10   LOW?
93: k = 11, n = 1957050, v = 11.09, nu = 10
94: k = 51, n = 3253764, v = 47.88, nu = 50
95: k = 51, n = 2491458, v = 51.82, nu = 50
96: k = 21, n = 3146084, v = 20.85, nu = 20
97: k = 11, n = 3009600, v =  9.74, nu = 10
98: k = 11, n =  699403, v = 13.21, nu = 10
99: k = 21, n = 3584512, v = 29.74, nu = 20
100: k = 11, n = 3273555, v =  5.48, nu = 10

HIGH???, occurred in 2.0% of cases.
??? occurred in 2.0% of cases.
LOW?, occurred in 6.0% of cases.
HIGH?, occurred in 1.0% of cases.
? occurred in 7.0% of cases.
ran2() appears to be satisfactory.```

(iii) The third test program simply calls ran2() repeatedly to test its speed. It was found that on a 330 MHz PC 200 million random numbers were generated by ran2() in 59 seconds, giving a speed of 3.39 million random numbers per second.

 Title page Contents Next: Appendix 3