CHAPTER 5

Section 2

Lesson material: The standard normal distribution

Motivating review

The standard normal distribution

Z-scores from the standard normal distribution

Objectives (what you should know how to do) after completing these sections

  1. Understand what is meant by a continuous probability distribution and in particular know the normal distribution
  2. Know how to calculate a score or a probability from the standard normal distribution

 

A bit of review to motivate us!

 

Let's review with the following example:

 

Example: According to DuPont Automotive, 15% of sport compact cars are dark green. Assume 10 sport compact cars are randomly selected. Let X be the random variable that is the number of dark green cars in the sample. Find a probability distribution for the random variable.


Solution: Is X binomial? Yes! There are certainly two outcomes (dark green or not dark green). There is a fixed number of trials (10), the trials are independent, and the probability is constant between trials. So using the binomial probability formula (or Statdisk), we can find the probability of each possible value of X from 0 to 10.

 

 

For instance, the probability that exactly 5 cars in the sample of 10 are dark green is 0.00849.
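If you want to see the whole distribution without Statdisk, here is a minimal sketch in Python that applies the binomial probability formula to this example. The function name binomial_pmf is just an illustrative choice, not something from the book or from Statdisk.

```python
from math import comb

def binomial_pmf(n, p):
    """Return the list [P(X=0), ..., P(X=n)] for a binomial random variable."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# n = 10 cars sampled, p = 0.15 chance that any one car is dark green
distribution = binomial_pmf(10, 0.15)

for k, prob in enumerate(distribution):
    print(f"P(X = {k:2d}) = {prob:.5f}")
```

The line for X = 5 should agree with the 0.00849 quoted above, and the eleven probabilities should add up to 1.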

 

The binomial distribution is great for DISCRETE random variables. The only problem is that not all random variables are DISCRETE. Here is another example:

 

Example: Mr. Wildman wishes to purchase a new sport or compact car. He can't seem to decide so being into statistics he decides to select randomly from all the different models available. What is the probability that the car he selects will have an average city miles per gallon that exceeds 42 mpg?

 

Here the random variable X represents the average city mpg of the car - this is not discrete (why??), but a continuous random variable. I can't write down a probability distribution like we did in the binomial example above. So we need some new tools!

 


 

The Standard Normal Distribution

 

Many continuous random variables follow what is known as the normal distribution:

 

A continuous random variable has a normal distribution if that distribution is symmetric and bell-shaped and the distribution fits the equation:
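The equation usually given for the normal curve is written out below in LaTeX. It is assumed here to be the form intended, with \mu standing for the mean and \sigma for the standard deviation of the distribution.

```latex
y = \frac{e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}}{\sigma\sqrt{2\pi}}
```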

 

Yuck! Fortunately for us, we don't need to use this formula. We do need to know, though, that the distribution is bell-shaped. What we need to use and understand is the idea that there is a correspondence between the area under the distribution of a continuous random variable and probability.

 

To see this connection, consider this example:

 

Example: Suppose in a certain manufacturing process the temperature is controlled to range between 0 degrees and 5 degrees, with all values being equally likely. Let X be the temperature in this manufacturing process. Since all values between 0 and 5 degrees are equally likely, we get a relative frequency histogram in which every bar has the same height.

 

 

In other words, we get a relative frequency histogram that has five bars, each of height 0.2 (or 20%). This is an example of what is known as a uniform probability distribution: a probability distribution where every value of the random variable is equally likely. Here is another definition:

The graph of a continuous probability distribution is called a density curve. It has the following properties:

1) The total area under the curve is 1.

2) Every point on the curve has a height of 0 or greater.

 

Certainly the uniform distribution given above has these properties.

 

Example: Consider the uniform distribution given above. What is:

 

a) P(X ≥ 1)

b) P(1 ≤ X ≤ 3)

 

If you think about what each of these problems is asking, it is easy to determine these probabilities. Since we have temperatures evenly distributed between 0 and 5, the probability of getting a temperature greater than or equal to 1 is 4/5. Similarly, the probability of getting a temperature greater than or equal to 1 and less than or equal to 3 should be 2/5.

Notice in the relative frequency graph given above that the total area under the graph is 1 (0.2 * 5 = 1). In part a we want the probability that the temperature we select is greater than or equal to 1. The area under the curve greater than or equal to 1 is 4/5 (0.2 * 4 = 0.8). Notice that P(X ≥ 1) corresponds to the area under the curve from 1 to 5! In part b we want a temperature that is between 1 and 3. The area greater than or equal to 1 and less than or equal to 3 under the curve is 2/5 (0.2 * 2 = 0.4). Again the area under the curve and the probability are the same!
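The "area equals probability" idea for the uniform distribution can be sketched in a few lines of Python. This is only an illustration: the function name uniform_probability and its default limits of 0 and 5 are assumptions chosen to match the temperature example.

```python
def uniform_probability(a, b, low=0.0, high=5.0):
    """Area under a uniform density on [low, high] between a and b."""
    height = 1.0 / (high - low)        # 0.2 for the temperature example
    a, b = max(a, low), min(b, high)   # clip to where the density is nonzero
    return max(b - a, 0.0) * height    # rectangle: width times height

part_a = uniform_probability(1, 5)   # temperature of at least 1
part_b = uniform_probability(1, 3)   # temperature between 1 and 3
print(round(part_a, 4), round(part_b, 4))
```

The two results should match the 4/5 and 2/5 worked out above, since each is just the width of a rectangle times its height of 0.2.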

It is easy to find the area under the curve with a uniform distribution (it is always a rectangle), but what about the normal distribution?

 

If you graph the normal distribution you get a symmetric, bell-shaped curve.

To calculate a probability under this curve we need 2 things:

1) We need a standard to compare against - this is called the standard normal distribution. It is the normal distribution with mean of 0 and standard deviation of 1

2) We have to be able to find the area under the standard normal curve. There are 2 ways to do this: using Table A-2 in your book or using a TI-83 calculator.

 

Here is another example:

 

Suppose we manufacture thermometers that are supposed to give readings of 0 degrees at the freezing point. We test a large sample and find that the mean of the sample is 0 degrees with a standard deviation of 1 degree and that the distribution is bell-shaped. If one thermometer is randomly selected, what is the probability that the reading at the freezing point is between 0 and 1.32 degrees?

 

Solution: Since the distribution is bell-shaped we will assume that it is the normal distribution and since its mean is 0 and the standard deviation 1, we will assume it is the standard normal distribution.

 

To find the probability that a thermometer has a reading between 0 and 1.32 degrees, we need to find the area under the standard normal curve between 0 and 1.32. First we will do this using Table A-2.

 

For an on-line version of Table A-2 take this link

 

Notice that Table A-2 gives the area under the curve between 0 and any point z. Since we want the area under the curve between 0 and 1.32, look down the column labelled z until you get to 1.3, then go across the row until you are under the label .02 (since 1.3 + .02 = 1.32). The entry there, .4066, is the area under the curve between 0 and 1.32, and therefore it is the probability that the thermometer measures between 0 and 1.32 at the freezing point.

 

To do this on a TI-83:

 

Hit 2nd VARS to get the distr menu, choose item 2 for normalcdf

Type in the following

normalcdf(0,1.32) and hit enter to get the answer

 

Note that the calculator reports more decimal places than the four-place table, so its answer is more precise.
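If you have neither the table nor a TI-83 handy, the same area can be found in Python. This is a minimal sketch using NormalDist from Python's standard statistics module. The helper named normalcdf is written here only to mimic the calculator command; it is not a built-in Python function.

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def normalcdf(lower, upper):
    """Area under the standard normal curve between lower and upper."""
    return standard_normal.cdf(upper) - standard_normal.cdf(lower)

area = normalcdf(0, 1.32)
print(f"{area:.4f}")
```

Rounded to four places, the result should agree with the Table A-2 entry for z = 1.32.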

 

To test your skills, find the probability that a thermometer is selected at random and measures between 0 and 1.47.

 

Take this link to see where to find this in the table

 

On the calculator use: normalcdf(0, 1.47) to get the answer

 

More problems:

 

1) What is the probability that one selected measures between -2.57 and 0 degrees?

 

Solution: To determine this you would need the area between -2.57 and 0 on the normal curve, which at first glance does not seem to be in the table. But since the normal curve is symmetric about zero, this is identical to the area between 0 and 2.57. So looking up 2.57 in the table we get .4949. Equivalently, on the calculator we use normalcdf(0,2.57) or normalcdf(-2.57,0).

 

2) What is the probability that one selected measures greater than 1.53 degrees?

 

Solution: Table A-2 gives the area between 0 and 1.53, which is .4370. BUT the area under the whole curve is 1 (since it is a density curve), and so the area under the positive half of the curve is 0.5. Therefore the area greater than 1.53 is 0.5 - .4370 = 0.0630; this is the probability that one selected measures above 1.53.

 

On the calculator, use 2nd VARS to get the distr menu and then select 2 for normalcdf(. Now enter 1.53 and a comma, hit 2nd and then the comma key for the E notation, and enter 99 and a closing parenthesis. When you are finished it should look like this:

normalcdf(1.53,E99)

The E99 stands for 1 x 10^99, a number so large that it acts as infinity, so this gives the area under the normal curve from 1.53 to infinity.
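To see that a huge upper bound really does behave like infinity, here is a small Python sketch (again using the standard library's NormalDist, not the TI-83 itself). It computes the tail area two ways: as 1 minus the area to the left of 1.53, and calculator style with 1e99 as the upper limit.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Everything to the right of 1.53 is 1 minus everything to the left of it
tail_exact = 1 - z.cdf(1.53)

# Calculator style: area from 1.53 up to a very large number (1E99)
tail_e99 = z.cdf(1e99) - z.cdf(1.53)

print(f"{tail_exact:.4f} {tail_e99:.4f}")
```

Both values should round to the 0.0630 found from the table.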

 

3) What is the probability that one is selected and it measures between -1.31 and 1.46 at the freezing point?

 

Solution: Via Table A-2: Look up 1.31. This gives the area between 0 and 1.31, which by the symmetry of the normal curve is the same as the area between -1.31 and 0. The answer you get is 0.4049. Now look up 1.46. This gives the area between 0 and 1.46, which is .4279. Since

Area between -1.31 and 1.46 = Area between -1.31 and 0 + Area between 0 and 1.46

the final answer is 0.4049 + 0.4279 = 0.8328

 

Via the calculator you just need to use normalcdf(-1.31, 1.46)
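The table method and the one-step calculator method should agree. As a rough check, this Python sketch (standard library only; the variable names are just for illustration) splits the area at 0 the way the table forces you to, and compares the sum with the direct answer.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

left_half = z.cdf(0) - z.cdf(-1.31)    # area between -1.31 and 0
mirror = z.cdf(1.31) - z.cdf(0)        # area between 0 and 1.31 (the table lookup)
right_half = z.cdf(1.46) - z.cdf(0)    # area between 0 and 1.46
direct = z.cdf(1.46) - z.cdf(-1.31)    # area between -1.31 and 1.46 in one step

print(f"{left_half:.4f} {mirror:.4f}")            # symmetry: these two match
print(f"{left_half + right_half:.4f} {direct:.4f}")
```

The first line illustrates the symmetry argument, and the second shows that adding the two halves gives the same area as going from -1.31 to 1.46 directly.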

 


 

Finding Z-scores when given Probabilities

 

Sometimes we wish to find a decile, percentile or quartile for a standard normal distribution. Consider these problems:

 

Example 1: Find the temperature of the 80th percentile

 

Solution: Since the 80th percentile means 80% of the scores are below this value and since the area under the curve is 1, the area below the score representing the 80th percentile is 0.8. To use the table, remember that it gives only the positive half of the curve and that the area under the negative half is 0.5. This means you are looking for a score whose area from 0 is 0.8 - 0.5 = 0.3. Looking in the BODY of the table, find the value closest to 0.3. You will see it sits in the row and column that correspond to 0.84. So a z-score of 0.84 is the 80th percentile.

 

It is easier on a calculator: hit 2nd VARS for the distr menu, but select 3 for invNorm, now enter .8 and a parenthesis. You should get this

invNorm(.8) = .84162

 

Notice once again the calculator is a bit more accurate
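Going from a probability back to a z-score can also be sketched in Python. The inv_cdf method of the standard library's NormalDist plays the role of invNorm here. The second calculation just checks the table reasoning, that the area from 0 to 0.84 is close to 0.3.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

z_80 = z.inv_cdf(0.80)              # z-score with 80% of the area below it
area_from_zero = z.cdf(0.84) - 0.5  # what Table A-2 lists for z = 0.84

print(round(z_80, 5), round(area_from_zero, 4))
```

The first number should match the calculator result above, and the second should be the table entry closest to 0.3.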

 

Example 2: Find the temperature corresponding to P10, the 10th percentile

 

Solution: This is the 10th percentile, so 10% of the scores are below this value. The area under the curve below this value is 0.1 (again the total area under the curve is 1, so 0.10 * 1 = 0.1). By symmetry this is the same as looking for the point whose area ABOVE is 0.1. Using the table, we need to find the value closest to 0.5 - 0.1 = 0.4 in the body of the table. The closest entry corresponds to z = 1.28. This is the point with 10% of the scores above it, so -1.28 is the point with 10% of the scores below it.

 

On the calculator just use invNorm(0.1) to get the answer.
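The symmetry argument for the 10th percentile can be checked with a short Python sketch (standard library only; p10 and p90 are just illustrative variable names). The point with 10% of the area below it should be the mirror image of the point with 10% of the area above it.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

p10 = z.inv_cdf(0.10)   # 10% of the area lies below this point
p90 = z.inv_cdf(0.90)   # 10% of the area lies above this point

print(round(p10, 5), round(p90, 5))
```

The two values should be equal in size and opposite in sign, and p10 should round to the -1.28 found from the table.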
