CHAPTER 1

INTRODUCTION TO STATISTICS

HELP PAGE

 Some Important Terms

Statistics - a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, interpreting and drawing conclusions from the data

Population - the complete collection of all elements (people, measurements and so on) to be studied

Census - a collection of data from every element of the population

Sample - a collection of data from a subgroup of the population

Parameter - a numerical measurement describing the population

Statistic - a numerical measurement describing the sample

EXAMPLE: Suppose you are interested in the average G.P.A. of all Casper College students. The population is every student enrolled at Casper College (about 5000 people). If you could calculate the average G.P.A. by contacting every Casper College student, this would be an example of a census. However, if you selected a small group of students and calculated the average G.P.A. of these students, you would have a sample. As you will see, a census is usually difficult or impossible, and a carefully selected sample can give you an excellent estimate of the actual average G.P.A. of all Casper College students. The average calculated from the entire population is an example of a parameter, and the average calculated from the sample is an example of a statistic.
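
The census/sample distinction in this example can be sketched in a few lines of Python. This is only an illustration: the G.P.A. values are randomly generated stand-ins (not real Casper College data), and the sample size of 100 is an arbitrary choice.

```python
import random
from statistics import mean

rng = random.Random(2005)  # fixed seed so the run is repeatable

# Hypothetical population: made-up G.P.A.s for 5000 students (not real data).
population = [round(rng.uniform(0.0, 4.0), 2) for _ in range(5000)]

# Census: use every member of the population. The result is a parameter.
parameter = mean(population)

# Sample: use a randomly chosen subgroup of 100 students. The result is a statistic.
sample = rng.sample(population, 100)
statistic = mean(sample)

print(f"Parameter (population mean G.P.A.): {parameter:.2f}")
print(f"Statistic (sample mean G.P.A.):     {statistic:.2f}")
```

The two printed values should be close but usually not identical, which is the point: the statistic estimates the parameter.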

TERMS ABOUT DATA

Data can be:

Quantitative - numbers representing counts or measurements

or

Qualitative - data that can be separated into different categories and are distinguished by some nonnumeric characteristic

Quantitative data can be either:

Discrete - meaning there are a finite or countable number of possible values

or

Continuous - meaning there are infinitely many possible values on a continuous scale, with no gaps between them

EXAMPLE: Data such as gender, brand names of soft drinks or names of states you have visited is qualitative. Data such as weights, heights, gas mileage of cars and grade point averages is quantitative. If the data represents counts, it is discrete, such as the number of students receiving an A on an exam or the number of years in which Republicans were elected president. If the data represents measurements (weights, heights, G.P.A.s), it is continuous.

There is another important way to classify data: by level of measurement (nominal, ordinal, interval or ratio). The following table and example give you a summary.

| Level of Measurement | Description | Examples |
|---|---|---|
| Nominal (name only!) | CATEGORIES ONLY! Data cannot be arranged in an ordering scheme | Types of cars driven by college faculty; zip codes; genders |
| Ordinal (type of ranking) | Data can be arranged in some order, but differences between data values are meaningless | Letter grades; car size categories (compact, full size, etc.) |
| Interval (ratios have no meaning) | Differences between values can be found, but there is no inherent starting point and RATIOS HAVE NO MEANING | Years in which there was a solar eclipse; temperatures |
| Ratio (ratios have meaning) | Like interval but with an inherent starting point, and RATIOS HAVE MEANING | Body weights; heights of buildings in Casper; lengths of commercials |

EXAMPLE: Nominal data is just names of categories. For example, if you say that there are 15 females and 20 males enrolled in this course, this data is nominal. The categories are male and female; even though numbers are included, they have no computational significance. For example, we could not average the 15 females and 20 males. Ordinal data is like a ranking, for example the letter grades (A, B, etc.) received by students in this class. There is a defined order (A is better than B, etc.), but differences are not meaningful or cannot be determined. In interval data you can find differences, but there is no inherent starting point and ratios are meaningless. For example, the average daily temperature in Casper is interval. In ratio measurement, ratios do have meaning, such as in body weight or height.

Most students have trouble differentiating between interval and ratio levels of measurement. Here is a simple test: if one number is twice the other, is the quantity being measured also twice the other quantity? For example, if you have two weights, 140 lbs. and 280 lbs., it should be clear that 280 lbs. is twice as heavy as 140 lbs., so weights are an example of a ratio level of measurement. However, if you have two temperatures, 40 degrees and 80 degrees, 80 degrees is not twice as hot as 40 degrees, so this is an example of an interval level of measurement. Another test is that in the ratio level of measurement zero means absence of the quantity. If you consider weights, 0 lb. means that you have NO weight (so weight is ratio). With the interval level of measurement, such as temperature, 0 degrees Fahrenheit does not mean the absence of heat, which is what temperature measures.
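
One way to see why the "twice as much" test fails for temperature is to change units and check whether the ratio survives. The short Python sketch below is a complementary check, not part of the test described above: it assumes the 40 and 80 degree readings are in Fahrenheit, and the helper function names are made up for this illustration.

```python
import math

def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

def pounds_to_kilograms(lb):
    """Convert a weight in pounds to kilograms."""
    return lb * 0.45359237

# Ratio level: the ratio of two weights is the same in any unit.
weight_ratio_lb = 280 / 140
weight_ratio_kg = pounds_to_kilograms(280) / pounds_to_kilograms(140)

# Interval level: the ratio of two temperatures depends on the unit,
# because the zero point of each scale is arbitrary.
temp_ratio_f = 80 / 40
temp_ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

print("Weight ratio in lb and in kg:", weight_ratio_lb, weight_ratio_kg)
print("Temperature ratio in F and in C:", temp_ratio_f, temp_ratio_c)
```

The weight ratio is 2 in both units, while the temperature ratio is 2 in Fahrenheit but about 6 in Celsius, so "twice as hot" has no fixed meaning.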

Uses and Abuses of Statistics:

Read this section in the book carefully. Pay particular attention to the definition of a self-selected survey. If you design a survey where the respondents themselves decide whether or not they should be included, you could have significant bias in your sample. For example, if you sent out a survey by mail asking taxpayers their opinion of the I.R.S., chances are that most of the respondents would be those who have a strong opinion on this subject. This means your sample will be unfairly skewed towards those with strong opinions. A common type of self-selected survey is one on T.V. or the internet where individuals are asked to respond via phone or e-mail. You are much more likely to respond if the issue is one where you have a strong opinion.
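
The effect of self-selection can be illustrated with a small simulation. This is a rough sketch, not a model of any real survey: the five-point opinion scale and the response rates are made-up assumptions chosen only to show the direction of the bias.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical opinions on a scale from -2 (strongly negative) to +2 (strongly positive).
population = [rng.choice([-2, -1, 0, 1, 2]) for _ in range(10_000)]

# Assumed chance of responding, by strength of opinion (illustrative numbers only).
response_rate = {0: 0.05, 1: 0.20, 2: 0.60}

# Self-selection: each person decides for themselves whether to respond.
respondents = [o for o in population if rng.random() < response_rate[abs(o)]]

def share_strong(opinions):
    """Fraction of opinions that are strongly held (-2 or +2)."""
    return sum(abs(o) == 2 for o in opinions) / len(opinions)

print(f"Strong opinions in the population:  {share_strong(population):.0%}")
print(f"Strong opinions among respondents:  {share_strong(respondents):.0%}")
```

Under these assumptions, strongly held opinions make up a much larger share of the responses than of the population.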

Design of Experiments

There are basically two types of studies: observational studies and experiments. An observational study is one where we observe and measure specific characteristics, but we do not attempt to manipulate or modify the subjects being studied. For example, the annual Christmas bird count is an observational study: the counters simply try to identify the number of different types of birds present. In an experiment we apply some treatment and then proceed to observe its effects on the subjects. For example, in testing a new drug, you may give some patients the drug and others a placebo and then measure the effects.

Most often, we cannot do a census and need to use a sample to determine the information we want about a population. The method of selecting a sample is critical! Data carelessly selected is useless! There are five common methods of sampling; they are summarized in the table below:

| Sampling Method | Description |
|---|---|
| Random | Each member of the population has an equal chance of being selected. For example, we might use a computer to generate ID numbers of Casper College students randomly and make this our sample |
| Stratified | You classify the population into two or more groups and draw a sample from each. For example, we might classify our population into females and males or African-Americans and Caucasians |
| Systematic | Select every kth member. For example, if you have 300 term papers (from the same class) you might select every 5th paper |
| Cluster | You divide the population area into sections, randomly select a few of these sections and then choose all the members from them |
| Convenience | Use results that are readily available. For example, just asking your friends. This method is highly likely to result in bias |
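
The five methods can be compared side by side in a short Python sketch. Everything about the data is hypothetical: the 300 records, the gender and section fields, and the sample sizes are made up for illustration. The random starting point in the systematic sample is also an added assumption.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 300 student records in 10 sections of 30 (made-up data).
population = [
    {"id": i, "gender": "F" if i % 2 == 0 else "M", "section": i // 30}
    for i in range(300)
]

# Random: every member has an equal chance of being selected.
random_sample = rng.sample(population, 30)

# Stratified: split the population into groups (here by gender), then draw from each.
strata = {}
for person in population:
    strata.setdefault(person["gender"], []).append(person)
stratified_sample = [p for group in strata.values() for p in rng.sample(group, 15)]

# Systematic: select every kth member (k = 5), starting from a random offset.
k = 5
start = rng.randrange(k)
systematic_sample = population[start::k]

# Cluster: randomly select a few whole sections and keep everyone in them.
chosen_sections = set(rng.sample(range(10), 2))
cluster_sample = [p for p in population if p["section"] in chosen_sections]

# Convenience: take whatever is easiest to reach (here, simply the first 30 records).
convenience_sample = population[:30]

for name, s in [
    ("random", random_sample),
    ("stratified", stratified_sample),
    ("systematic", systematic_sample),
    ("cluster", cluster_sample),
    ("convenience", convenience_sample),
]:
    print(f"{name:12s} size = {len(s)}")
```

Note that the convenience sample contains only the first section, which is one way the bias mentioned in the table can arise.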
