Statistics
The Hypergeometric Distribution

In spinning a coin several times, we can regard each spin as identical to all others. The universe has an infinite supply of "heads" and "tails" possibilities. In a fair situation, governed only by chance, getting heads on one spin doesn't increase or decrease your chances of getting heads on the next spin. The spins are independent. Sometimes, however, events in a sequence can interfere with the probabilities of later events in the series. This situation leads to what is called the hypergeometric distribution. Its standard example is drawing counters of different colors from a box (or "urn," or other container). Here is a simple introduction to that arrangement, presented because instances of it are sometimes confused with the Poisson Distribution properly so called.

The Box Model

Instead of spinning a coin, let us put 10 red and 10 white golf balls in a box, 20 golf balls in total, and shake them until they are mixed. Notice that unlike the universe's supply of Heads and Tails, the number of balls in the box is finite, and adding or removing 1 ball would change the profile of available possibilities. That is to say, your next coin spin is free (drawing on an infinite and thus undiminished supply of options), but if somebody ahead of you in the golf game took out a ball, your probabilities are changed. The procedure is to reach in without looking and take out one ball. All the balls feel the same. Your chance of getting a red one on the first try is 10/20 = 0·500, since any of the 10 red balls would serve, and there are 20 balls altogether.

If you put the ball back and shake the box again, your next draw will have the same chance of getting a red as the first draw, since there are still 10 red balls out of 20 total balls. This is the replacive option.

But suppose instead you put the first ball aside after you draw it. Now, on your second draw, there are only 9 red balls, plus the original 10 white balls, for a total of only 19. This is the nonreplacive option. There are 9 red balls in the total of 19, and your chance of getting red under those conditions is 9/19 = 0·474. This is less than the 0·500 chance you had on the first draw. As you keep drawing red balls, and keep not putting them back, your chance of getting a red ball on the next draw continues to decline. After you have (let's suppose) drawn 9 red balls in a row, there will be a total of 11 balls left in the box, of which only 1 is red (you have drawn all the others), and your chance of getting that red ball is thus 1/11 = 0·091.

Here is a table of the number of red balls available at each draw, the total number of balls, and the chance of drawing a red ball. It is assumed that a red ball is drawn each time. The bottom row, "Chance [of getting a red ball]," is calculated by dividing the figure for "[number of] Red [balls]" by the figure for "Total [number of all balls]" in the rows above it:

Draw      1      2      3      4      5      6      7      8      9      10
Red       10     9      8      7      6      5      4      3      2      1
Total     20     19     18     17     16     15     14     13     12     11
Chance    0·500  0·474  0·444  0·412  0·375  0·333  0·286  0·231  0·167  0·091
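The Chance row can be reproduced in a few lines of Python. This is only a sketch, not part of the original page: the function name `chance_row` and its parameters are illustrative assumptions. It assumes, as the table does, that a red ball is drawn and set aside on every draw.

```python
from fractions import Fraction

def chance_row(red=10, total=20, draws=10):
    """Chance of red on each draw, assuming every earlier draw was red
    and no ball is put back (the nonreplacive option)."""
    row = []
    for k in range(draws):
        # k red balls are already gone, so both counts shrink by k
        row.append(Fraction(red - k, total - k))
    return row

# Print draw number, exact fraction, and the three-decimal figure
for draw, p in enumerate(chance_row(), start=1):
    print(draw, p, f"{float(p):.3f}")
```

Exact fractions are used so that rounding happens only when the figures are displayed.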

We have found that successive probabilities multiply. To find the probability of drawing only red balls in the situation described here, we add a row to the table for the Cumulative Chance: the probability of getting only red balls on the first draw and on every draw up to and including the current one. Here is the revised table:

Draw        1      2      3      4      5      6      7      8      9      10
Red         10     9      8      7      6      5      4      3      2      1
Total       20     19     18     17     16     15     14     13     12     11
Chance      0·500  0·474  0·444  0·412  0·375  0·333  0·286  0·231  0·167  0·091
Cum Chance  0·500  0·237  0·105  0·043  0·016  0·005  0·002  0·000  0·000  0·000
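The Cum Chance row is a running product of the Chance row. A minimal Python sketch follows; the function name `cumulative_chance` and its parameters are assumptions made for illustration, and exact fractions are used so that the last entries do not vanish into rounding.

```python
from fractions import Fraction
from itertools import accumulate
from operator import mul

def cumulative_chance(red=10, total=20, draws=10):
    """Running product of the per-draw chances of red, assuming
    each ball drawn is red and is not put back."""
    per_draw = [Fraction(red - k, total - k) for k in range(draws)]
    # accumulate with mul gives p1, p1*p2, p1*p2*p3, ...
    return list(accumulate(per_draw, mul))

cum = cumulative_chance()
print([f"{float(p):.3f}" for p in cum])  # three-decimal figures, as in the table
print(cum[-1])                           # exact chance of 10 reds in a row
```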

Within the three decimal places for which we have room in the table, the cumulative chance declines to less than 0·001, that is, less than 1 in 1,000. The exact figure for all 10 draws is 0·00000541, or 1 in 184,756. This is about 180 times less likely than getting 10 red balls in a row when the probability of each draw is constant at 0·500 throughout. In fact:

P(10r, replacive) = 0·000977 (1 in 1,024)
P(10r, nonreplacive) = 0·00000541 (1 in 184,756)
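
Written out as products, the two cases look like this. The block is a worked restatement of the arithmetic, not a formula taken from the original page:

```latex
P(10r,\ \text{replacive}) = \left(\tfrac{1}{2}\right)^{10} = \frac{1}{1024}

P(10r,\ \text{nonreplacive})
  = \frac{10}{20}\cdot\frac{9}{19}\cdot\frac{8}{18}\cdots\frac{1}{11}
  = \frac{10!\,10!}{20!}
  = \frac{1}{\binom{20}{10}}
  = \frac{1}{184756}
```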

On the next or 11th draw, there will be no chance whatever of getting a red ball, since all the red balls have been drawn and there are only white ones left. That is,

P(11r, nonreplacive) = 0·0000000 (cannot occur)
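
Both results also follow from the standard hypergeometric formula, which this page does not state. The chance of exactly k red balls in n draws without replacement, from N balls of which K are red, is C(K, k) × C(N − K, n − k) / C(N, n). Below is a minimal Python sketch of that formula; the function name `hypergeom_pmf` and the parameter names are chosen here for illustration, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Chance of exactly k red balls in n draws without replacement
    from a box of N balls, K of which are red."""
    if k < 0 or k > n or n > N:
        return Fraction(0)
    # comb(a, b) is 0 when b > a, so impossible cases come out as 0
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

print(hypergeom_pmf(10, 20, 10, 10))  # ten reds in ten draws
print(hypergeom_pmf(11, 20, 10, 11))  # eleven reds: cannot occur
```

The second call returns zero because there is no way to choose 11 red balls from only 10.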

It therefore makes a big difference, in sequences of events, whether the earlier ones interfere with the probabilities of the later ones. This is not a matter of math; it is a matter of the physical setup which the math is reporting. If you are clear about the physical setup, the math is more or less a matter of routine.

Explorations
The Birthday Problem
The Gwodyen Dau/Dv Jing
