First we need some definitions:
An experiment is any process that allows researchers to obtain observations.
A random variable is a variable that has a single numeric value for each outcome of an experiment.
Consider this example: The experiment is rolling a single six-sided die. A random variable x can hold the result of one roll of this die. So x can be 1, 2, 3, 4, 5, or 6. This is an example of a discrete random variable since it has a finite number of values.
Here is another example: The experiment is going to Roberts Common from 11 to 12 on a particular day and counting the number of Casper College students purchasing lunch. A random variable x can hold the result of this count. So x can be (theoretically) any whole number from 0 up to the total number of Casper College students. Again, this random variable is discrete.
Here is one final example: The experiment is to measure the amount of gas in the fuel tank on a particular car. The random variable x can hold the result of this measurement. In this case x can be any real number from 0 to the fuel capacity of the tank. For example, my Ford Explorer has a fuel capacity of 21 gallons, so x could be any real number from 0 to 21 gallons. The possible measurements are neither finite nor countable, so x is a continuous random variable.
Here are the formal definitions of these terms:
A discrete random variable has either a finite number of values or a countable number of values.
A continuous random variable has infinitely many values and those values can be associated with measurements on a continuous scale with no interruptions.
A probability distribution gives the probability for each value of the random variable. For example in the experiment of rolling one die, the possible outcomes are 1,2,3,4,5,6. The probability of each outcome (assuming the die is fair) is 1/6 or approximately 16.67%. We can represent the probability distribution as follows:

| x | P(x) |
|---|---|
| 1 | 0.1667 |
| 2 | 0.1667 |
| 3 | 0.1667 |
| 4 | 0.1667 |
| 5 | 0.1667 |
| 6 | 0.1667 |

Note the values for the random variable are given in the left column and the probability of each outcome is given in the right column.
Here is another example: The experiment is rolling two dice simultaneously. The outcome we are looking for is the sum of the two dice. The possible outcomes are 2,3,4,5,6,7,8,9,10,11, and 12. The outcomes can be summarized in the following chart:
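
| Die #2 (rows) / Die #1 (columns) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | **2** | **3** | **4** | **5** | **6** | **7** |
| 2 | **3** | **4** | **5** | **6** | **7** | **8** |
| 3 | **4** | **5** | **6** | **7** | **8** | **9** |
| 4 | **5** | **6** | **7** | **8** | **9** | **10** |
| 5 | **6** | **7** | **8** | **9** | **10** | **11** |
| 6 | **7** | **8** | **9** | **10** | **11** | **12** |
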
The values in bold in the body of the table are the sums of the two dice faces. Note that there are 36 equally likely outcomes, and the table shows exactly how many times each sum occurs. Here is the probability distribution:

| X | P(X) |
|---|---|
| 2 | 1/36 = 0.028 |
| 3 | 2/36 = 0.056 |
| 4 | 3/36 = 0.083 |
| 5 | 4/36 = 0.111 |
| 6 | 5/36 = 0.139 |
| 7 | 6/36 = 0.167 |
| 8 | 5/36 = 0.139 |
| 9 | 4/36 = 0.111 |
| 10 | 3/36 = 0.083 |
| 11 | 2/36 = 0.056 |
| 12 | 1/36 = 0.028 |

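
The counting behind this table can be reproduced with a short script. This is only a sketch, assuming two fair six-sided dice; the names `counts` and `distribution` are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die 1, die 2) outcomes and count each sum.
faces = range(1, 7)
counts = Counter(a + b for a, b in product(faces, repeat=2))

# Divide each count by the number of outcomes to get P(X) as an exact fraction.
total = sum(counts.values())
distribution = {x: Fraction(n, total) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(f"{x:>2}  {counts[x]}/{total} = {float(p):.3f}")
```

Keeping the probabilities as fractions avoids the small rounding differences that show up when three-decimal values are added together.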
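ALL PROBABILITY DISTRIBUTIONS MUST HAVE:
1. Probabilities that sum to 1: $\sum P(x) = 1$
2. Every probability between 0 and 1 inclusive: $0 \le P(x) \le 1$ for each x
If the chart does not satisfy both of these properties then it is not a probability distribution. There are some excellent examples on pages 188-189 in the book.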
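
A chart can be checked mechanically. The sketch below assumes the two standard requirements (every probability lies between 0 and 1, and the probabilities sum to 1); the function name and the small tolerance for rounded decimals are illustrative choices, not something taken from the book.

```python
from fractions import Fraction

def is_probability_distribution(table, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(table.values())
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = abs(sum(probs) - 1) <= tol
    return in_range and sums_to_one

fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
too_small = {0: 0.5, 1: 0.3, 2: 0.1}   # probabilities sum to 0.9
negative = {1: 1.2, 2: -0.2}           # sums to 1, but values are out of range

print(is_probability_distribution(fair_die))   # True
print(is_probability_distribution(too_small))  # False
print(is_probability_distribution(negative))   # False
```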
MEAN, VARIANCE, and STANDARD DEVIATION
You can calculate the mean, variance and standard deviation of a probability distribution as follows:
MEAN: Multiply x by P(x) for each outcome in the table, then sum these values to get the mean. In notation: $\mu = \sum x \cdot P(x)$
VARIANCE: Multiply $x^2$ by P(x) for each outcome in the table, sum these values, and then subtract the mean squared. In notation: $\sigma^2 = \sum x^2 \cdot P(x) - \mu^2$
STANDARD DEVIATION: Just take the square root of the variance.
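
These three steps translate directly into code. Here is a minimal sketch, applied to the single fair die from earlier; the function name `summarize` is hypothetical.

```python
from fractions import Fraction
from math import sqrt

def summarize(table):
    """Return (mean, variance, standard deviation) of a {x: P(x)} table."""
    # Mean: sum of x * P(x).
    mean = sum(x * p for x, p in table.items())
    # Variance: sum of x^2 * P(x), minus the mean squared.
    variance = sum(x**2 * p for x, p in table.items()) - mean**2
    # Standard deviation: square root of the variance.
    return mean, variance, sqrt(variance)

fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
mean, variance, std_dev = summarize(fair_die)
print(mean, variance, round(std_dev, 3))  # 7/2 35/12 1.708
```

So one fair die has a mean of 3.5, a variance of about 2.917, and a standard deviation of about 1.708.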
Let's do this for the two dice example above:

| X | P(X) | X * P(X) | X^2 * P(X) |
|---|---|---|---|
| 2 | 1/36 = 0.028 | 0.056 | 0.111 |
| 3 | 2/36 = 0.056 | 0.167 | 0.500 |
| 4 | 3/36 = 0.083 | 0.333 | 1.333 |
| 5 | 4/36 = 0.111 | 0.556 | 2.778 |
| 6 | 5/36 = 0.139 | 0.833 | 5.000 |
| 7 | 6/36 = 0.167 | 1.167 | 8.167 |
| 8 | 5/36 = 0.139 | 1.111 | 8.889 |
| 9 | 4/36 = 0.111 | 1.000 | 9.000 |
| 10 | 3/36 = 0.083 | 0.833 | 8.333 |
| 11 | 2/36 = 0.056 | 0.611 | 6.722 |
| 12 | 1/36 = 0.028 | 0.333 | 4.000 |

Each product is computed from the exact fraction and then rounded to three decimal places. Multiplying the already rounded P(X) values instead lets rounding error build up in the totals.
To calculate the mean: sum the third column. This gives 7.000, so the mean is 7.
To calculate the variance: sum the fourth column (this is 54.833) and subtract the mean squared to get: 54.833 - (7)^2 = 5.833
The standard deviation is just the square root of the variance, so we get: 2.415
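
For reference, here is the same calculation carried out with exact fractions instead of rounded decimals, which shows that the mean of the two-dice sum is exactly 7. The numerators are the values of x and x^2 multiplied by the number of ways (out of 36) each sum can occur.

```latex
\begin{align*}
\mu &= \sum x \cdot P(x)
     = \frac{2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12}{36}
     = \frac{252}{36} = 7 \\
\sum x^2 \cdot P(x)
    &= \frac{4 + 18 + 48 + 100 + 180 + 294 + 320 + 324 + 300 + 242 + 144}{36}
     = \frac{1974}{36} \approx 54.833 \\
\sigma^2 &= \frac{1974}{36} - 7^2 = \frac{35}{6} \approx 5.833 \\
\sigma &= \sqrt{35/6} \approx 2.415
\end{align*}
```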
A few observations:
Suppose you play the following gambling game:
You roll a fair die. If the outcome is a 3 or a 5, you win $2. If the outcome is 1, 2, 4, or 6, you lose $1. How much money should you expect to have after playing this game many times?
To help answer this question, we need expected value:
The expected value (or expectation) of a discrete random variable is denoted E, and it represents the average value of the outcomes. It is calculated in the same way as the mean of a probability distribution above: multiply each value by its probability and sum the results.
We can analyze the game above by using the following table:

| Event | X | P(X) | X * P(X) |
|---|---|---|---|
| Win | $2 | 1/3 | 2/3 = 0.667 |
| Lose | -$1 | 2/3 | -2/3 = -0.667 |
| TOTAL | | | $0 |

So after playing this game many times you would expect your total earnings to be nothing. This is sometimes called a fair game. Of course, this is only what we would expect; the actual result could be different. Note that on any single play of this game you will never win $0. You will always win $2 or fork over $1. The expected value is only the average over the long run.
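
The "long run" idea can be seen by simulating the game. The sketch below is only illustrative: the number of games and the fixed random seed are arbitrary choices, and the exact average will vary with them, but it should land close to $0.

```python
import random

def play_once(rng):
    """Roll one fair die: win $2 on a 3 or 5, lose $1 otherwise."""
    roll = rng.randint(1, 6)
    return 2 if roll in (3, 5) else -1

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
n_games = 100_000
winnings = [play_once(rng) for _ in range(n_games)]
average = sum(winnings) / n_games

print(f"average winnings per game: {average:+.3f}")
print(f"distinct single-game results: {sorted(set(winnings))}")
```

Every individual game pays either 2 or -1, yet the average per game settles near the expected value of 0.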
Here is another example:
You are playing a game with a standard deck of cards as follows. The cards are shuffled and a card is chosen at random. If the card is a club you win $10. If the card is not a club you lose $6. What is the expected value?
E = (win)(probability of winning) + (loss)(probability of losing)
= ($10)(1/4) + (-$6)(3/4) = -$2
In the long run your average "winnings" would be a loss of $2 per game.
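
Both games can be checked with the same small helper. This is a sketch, assuming a standard 52-card deck with 13 clubs; `expected_value` is a hypothetical function name.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum payoff * probability over (payoff, probability) pairs."""
    return sum(payoff * prob for payoff, prob in outcomes)

# 13 of the 52 cards are clubs, so P(club) = 13/52 = 1/4.
p_club = Fraction(13, 52)
card_game = [(10, p_club), (-6, 1 - p_club)]

# Die game from earlier: win $2 on 2 of the 6 faces, lose $1 on the other 4.
die_game = [(2, Fraction(2, 6)), (-1, Fraction(4, 6))]

print(expected_value(card_game))  # -2
print(expected_value(die_game))   # 0
```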