Estimating a population standard deviation or variance

We now move on to the question of estimating the population standard deviation or variance from a sample.

Why is this question important? It is easy to see many situations where an estimate of the mean is valuable, e.g. what is the average men's waist size, or what is the average GPA of Casper College students. What about standard deviation: why would we want to estimate it?

One good reason is that we are often very concerned with consistency. Suppose you manufacture donuts and advertise that each donut in your package is 10 oz. While it is important that your donuts' average size be 10 oz., you will also have unhappy customers if your donuts vary too much from the 10 oz. size. Since standard deviation measures this variation, we need to be able to estimate it.

In the case of taking samples and finding means, the central limit theorem tells us that sample means are approximately normally distributed when the sample size is large enough. The same is true for sample proportions. Unfortunately this is not true for sample variances and standard deviations. When the population is normal, sample variances (suitably scaled) follow another distribution called the chi-squared distribution.

Definition: In a normally distributed population with variance $\sigma^2$, suppose you randomly select samples of size n and compute the sample variance $s^2$ for each sample. The sample statistic $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ has what is called a chi-squared distribution.

Important Facts about chi-squared:

1) $\chi^2$ is never negative (n > 1 always, and both the sample and population variances are squares, so the statistic must be positive or zero)

2) The chi-squared distribution is different for each number of degrees of freedom. The degrees of freedom are given by n-1

3) Chi-squared is NOT symmetric; it is skewed to the right. As the degrees of freedom increase it becomes more like the normal distribution

[Figure: chi-squared distribution for 3 degrees of freedom]

[Figure: chi-squared distribution for 10 degrees of freedom]
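These facts can be checked with a small simulation. The sketch below is only an illustration and is not part of the original notes: it assumes a normal population with mean 0 and standard deviation 2 (arbitrary choices), and the function name simulate_chi_square is made up for this example. For each sample it computes the statistic (n - 1) * s^2 / sigma^2, then reports the smallest value, the mean, and the median for 3 and 10 degrees of freedom.

```python
import random
import statistics

def simulate_chi_square(n, sigma, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population with
    standard deviation sigma and return (n - 1) * s^2 / sigma^2 for each."""
    rng = random.Random(seed)
    values = []
    for _ in range(num_samples):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance (divides by n - 1)
        values.append((n - 1) * s2 / sigma ** 2)
    return values

for n in (4, 11):  # sample sizes giving 3 and 10 degrees of freedom
    vals = simulate_chi_square(n, sigma=2.0, num_samples=5000)
    print(f"df={n - 1}: min={min(vals):.3f}, "
          f"mean={statistics.mean(vals):.2f}, "
          f"median={statistics.median(vals):.2f}")
```

The smallest value should never be negative, the mean should land close to the degrees of freedom, and the median should sit below the mean, which is what a distribution skewed to the right looks like.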

To find the values of chi-squared you can use table A-4 p. 766. Note that the table is organized to give the chi-squared value for a given area to the RIGHT of the value. Since we are working with confidence intervals, we want to find TWO critical values so that, for example, 95% of the area is between the TWO values. Here is an example.

Example: For 9 degrees of freedom find the two critical values for chi-squared at the 95% confidence level

Since we are at the 95% confidence level, we want 2.5% of the area below the left critical value and 2.5% of the area above the right critical value.

To find these critical values from the table, look at the row for 9 degrees of freedom. The left critical value has 0.95 + 0.025 = 0.975 of the area to the right of it, so look in the 0.975 column to find the critical value of 2.700. The right critical value has 0.025 of the area to the right of it, so look in the 0.025 column to find 19.023.

You can also use a calculator program called X2val which is available at the following web site:

http://www.hsu.edu/faculty/lloydm/ti/ti83/83.html

Take the link to probability and statistics and download the following programs

x2val.zip

zzinewt.zip

zzmenu.zip

zzrank.zip

These files are zipped, so you will need an unzipper, and you will need a TI-Graph Link to get the program from your computer to your calculator.

To run the program hit PRGM and find x2val and then hit enter twice. You will be asked for the degrees of freedom and the confidence level (enter as a decimal). Then it will give you the two critical values.
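If neither the table nor the calculator is at hand, the same two critical values can be approximated on a computer. The sketch below is not the x2val program; it is a minimal Python illustration, and the function names chi2_cdf, chi2_quantile and critical_values are made up for this example. It sums a series for the area to the LEFT of a value (the table gives area to the right) and then uses bisection to find the value with the required area.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-squared curve with df degrees
    of freedom (regularized lower incomplete gamma function, by series)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # P(a, z) = z^a * e^(-z) / Gamma(a + 1) * sum of z^k / ((a+1)...(a+k))
    term = total = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    return min(1.0, total * math.exp(log_prefactor))

def chi2_quantile(p, df):
    """Value with area p to its left, found by bisection."""
    low, high = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 50.0
    for _ in range(200):
        mid = (low + high) / 2.0
        if chi2_cdf(mid, df) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

def critical_values(df, confidence):
    """Left and right critical values; confidence is a decimal, e.g. 0.95."""
    alpha = 1.0 - confidence
    return chi2_quantile(alpha / 2.0, df), chi2_quantile(1.0 - alpha / 2.0, df)

left, right = critical_values(9, 0.95)
print(f"left = {left:.3f}, right = {right:.3f}")
```

For 9 degrees of freedom at the 95% level this should agree with the table values from the example above (2.700 and 19.023) to within rounding.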

The formulas for the confidence intervals are as follows

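Variance:

$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$

This is not a misprint: you do use the right critical value $\chi^2_R$ on the left and the left critical value $\chi^2_L$ on the right.

Standard deviation:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$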
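As a rough sketch of how these intervals are computed, the Python below divides (n - 1) * s^2 by the right critical value to get the lower limit and by the left critical value to get the upper limit, then takes square roots for the standard deviation. The function names and the donut data (10 donuts with a sample standard deviation of 0.3 oz.) are hypothetical and chosen only for illustration; the critical values 2.700 and 19.023 are the ones found above for 9 degrees of freedom at the 95% level.

```python
import math

def variance_interval(n, s, chi2_left, chi2_right):
    """Confidence interval for the population variance.

    The RIGHT critical value gives the lower limit and the LEFT
    critical value gives the upper limit.
    """
    numerator = (n - 1) * s ** 2
    return numerator / chi2_right, numerator / chi2_left

def std_dev_interval(n, s, chi2_left, chi2_right):
    """Confidence interval for the population standard deviation:
    the square roots of the variance limits."""
    low, high = variance_interval(n, s, chi2_left, chi2_right)
    return math.sqrt(low), math.sqrt(high)

# Hypothetical data (not from the text): 10 donuts with s = 0.3 oz.
var_low, var_high = variance_interval(10, 0.3, 2.700, 19.023)
sd_low, sd_high = std_dev_interval(10, 0.3, 2.700, 19.023)
print(f"{var_low:.4f} < variance < {var_high:.4f}")
print(f"{sd_low:.3f} < std dev < {sd_high:.3f}")
```

Dividing by the larger critical value produces the smaller limit, which is why the right critical value ends up on the left side of the interval.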

Now for some problems:

Example 1: The National Center for Educational Statistics surveyed college graduates about the lengths of time required to earn their bachelor's degrees. The mean is 5.15 years, and the standard deviation is 1.68 years. Assume that the sample size is 101. Based on this sample data, construct a 99% confidence interval for the standard deviation of the times required by all college graduates

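We first need to find the critical values. The degrees of freedom are n-1 = 101-1 = 100. Since we want the 99% confidence level, 0.005 is the area of each tail. From the table, this yields the left critical value $\chi^2_L = 67.328$ and the right critical value $\chi^2_R = 140.169$, so this gives

$$\sqrt{\frac{(100)(1.68)^2}{140.169}} < \sigma < \sqrt{\frac{(100)(1.68)^2}{67.328}}$$

which works out to $1.42 < \sigma < 2.05$ years.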

Example 2: A NAPA auto parts supplier wants to know how long car owners plan to keep their cars. A random sample of 25 car owners results in a mean of 7.01 years and a standard deviation of 3.74 years. Assuming the sample is drawn from a normally distributed population, find a 95% confidence interval for the population mean and a 95% confidence interval for the standard deviation.

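Here the degrees of freedom are 24. The confidence level is 95%, so we have the following critical values:

12.401 and 39.364

Hence, for the standard deviation:

$$\sqrt{\frac{(24)(3.74)^2}{39.364}} < \sigma < \sqrt{\frac{(24)(3.74)^2}{12.401}}$$

which works out to $2.92 < \sigma < 5.20$ years.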
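The example also asks for a confidence interval for the population mean, which the chi-squared critical values do not give. A minimal sketch follows, assuming the usual t interval for a mean with unknown population standard deviation. The critical value 2.064 is an assumption taken from a standard t table (24 degrees of freedom, 95% confidence); it does not appear in the notes above.

```python
import math

n = 25
mean = 7.01      # sample mean, in years
s = 3.74         # sample standard deviation, in years
t_crit = 2.064   # assumption: t table value for 24 degrees of freedom, 95%

# Margin of error for the mean: t * s / sqrt(n)
margin = t_crit * s / math.sqrt(n)
mean_low, mean_high = mean - margin, mean + margin
print(f"{mean_low:.2f} < mu < {mean_high:.2f}")
```

Under these assumptions the interval for the mean comes out to roughly 5.47 to 8.55 years.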
