Possibility and Probability

A Python programmer with a personality thinking about space exploration

2 July 2006

What is the randomness of randint()?

by Nick

Recently I discovered the random.randint() function in Python. You call it with two ints, a low value and a high value, and it returns an integer in that range (inclusive at both ends). I was playing around with it and thought it seemed to be giving me the same number awfully often, so I whipped up a test: call it 1 million times, record the values, then do that whole run 6 times. I’m using randint() to simulate dice, so I’m curious to see whether the distribution is even across the numbers 1 through 6. Below is my test code:

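import random
ONE_MILLION = 1000000

def d6():  # body assumed here; all that is known is that it wraps randint()
    return random.randint(1, 6)

for run in range(6):
    counts = [0,0,0,0,0,0,0,0]  # slots 0 and 7 are guards and should stay 0
    for roll in range(ONE_MILLION):
        counts[d6()] += 1
    for i in counts:
        print i, ',',
    print ''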

Each time d6() (my wrapper around randint()) is called, it returns a number from 1 to 6. That number is used as an index into the counts list, and the count there is incremented by one. I have 0’s on both sides of the 1-6 slots just to make sure it really is returning a correctly bounded value. The numbers in each row should sum to 1 million. By running this 6 times, I should get an idea of where the numbers are falling and whether the distribution is even. (Truly random dice rolls will be spread evenly over the long term; if they are grouping around one number, then the random number generator is not doing a good job.)

I took the total of each column (which should be very close to 1 million) and then found the percent error, ((amount - expected) / expected) * 100, omitting the absolute value that is usually used. The average of the percent errors was 0. (Signed errors were always going to average to zero here, since the six column totals must add up to 6 million; the more telling figure is that no column total is off from 1 million by more than 1,131 rolls.) This leads me to believe that the distribution of random numbers generated by randint() is sufficiently random for my uses. Now that I have stated this, I have no more excuses not to continue coding the game that will use said function in a dice throwing function. :)

Below is the spreadsheet of my data as generated by Google Spreadsheets.
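
| Run | 0 (guard) | 1 | 2 | 3 | 4 | 5 | 6 | 7 (guard) | Row sum |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 166367 | 166368 | 166846 | 166996 | 167006 | 166417 | 0 | 1000000 |
| 2 | 0 | 165463 | 166853 | 166669 | 166644 | 167031 | 167340 | 0 | 1000000 |
| 3 | 0 | 167284 | 166470 | 167052 | 166227 | 166123 | 166844 | 0 | 1000000 |
| 4 | 0 | 166893 | 166893 | 165958 | 166655 | 167011 | 166590 | 0 | 1000000 |
| 5 | 0 | 166887 | 166370 | 166124 | 166672 | 167160 | 166787 | 0 | 1000000 |
| 6 | 0 | 166802 | 167174 | 166724 | 165704 | 166800 | 166796 | 0 | 1000000 |
| Total | | 999696 | 1000128 | 999373 | 998898 | 1001131 | 1000774 | | |
| % Err | | -0.0304 | 0.0128 | -0.0627 | -0.1102 | 0.1131 | 0.0774 | | |
| Avg % Err | | 0 | | | | | | | |
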
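
For anyone who wants to double-check the spreadsheet, here is a minimal sketch that recomputes the column totals and the signed percent errors from the six runs above. It is written for Python 3 rather than the Python 2 used for the test itself, and the names runs, EXPECTED, totals and pct_errors are just illustrative choices.

```python
# Counts for faces 1-6 from the six runs in the table
# (the zero guard columns and the row-sum column are left out).
runs = [
    [166367, 166368, 166846, 166996, 167006, 166417],
    [165463, 166853, 166669, 166644, 167031, 167340],
    [167284, 166470, 167052, 166227, 166123, 166844],
    [166893, 166893, 165958, 166655, 167011, 166590],
    [166887, 166370, 166124, 166672, 167160, 166787],
    [166802, 167174, 166724, 165704, 166800, 166796],
]

# Each face should come up about 1 million times in 6 million rolls.
EXPECTED = 1000000

# Column totals: how often each face came up across all six runs.
totals = [sum(column) for column in zip(*runs)]

# Signed percent error: ((amount - expected) / expected) * 100
pct_errors = [(total - EXPECTED) / EXPECTED * 100 for total in totals]

avg_signed = sum(pct_errors) / len(pct_errors)
max_abs = max(abs(err) for err in pct_errors)

for face, (total, err) in enumerate(zip(totals, pct_errors), start=1):
    print(f"face {face}: total={total}  error={err:+.4f}%")
print(f"average signed error: {avg_signed:.4f}%")
print(f"largest absolute error: {max_abs:.4f}%")
```

Because the six totals always add up to 6 million, the signed errors cancel out and their average lands on zero no matter how even the distribution is, so the largest absolute error is the more informative of the two numbers.
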
By the way, this data was generated with Python 2.4.3.
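
For rerunning the test on a newer interpreter, the same idea can be sketched in Python 3 roughly as follows. This is an approximation of the test rather than the exact script: the body of d6() is assumed (it is only described as a wrapper around randint()), and roll_counts and N_ROLLS are hypothetical names introduced for the example.

```python
import random


def d6():
    # Assumed body of the wrapper: randint() includes both endpoints.
    return random.randint(1, 6)


def roll_counts(n_rolls):
    # Eight slots: indices 0 and 7 are guards that should stay at zero
    # if d6() really only returns 1 through 6.
    counts = [0] * 8
    for _ in range(n_rolls):
        counts[d6()] += 1
    return counts


# Smaller than the 1 million rolls per run used above, so it finishes quickly.
N_ROLLS = 100_000

for run in range(6):
    print(", ".join(str(count) for count in roll_counts(N_ROLLS)))
```

Each printed row should start and end with 0 and add up to N_ROLLS, just like the rows in the spreadsheet; the individual counts will differ from run to run.
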
