CHAPTER 2

Sections 1 and 2

Objectives (what you should know how to do after completing these sections)

  1. Be able to find lower class limits, upper class limits, class boundaries, class widths, and class marks for a given frequency table
  2. Construct a frequency table, a relative frequency table, and/or a cumulative frequency table for given data

SUMMARIZING DATA WITH FREQUENCY TABLES

Data is often collected to answer a question or to give us insight into some situation. For example, we might look at college entrance exam scores to determine how well prepared students are for college. Whenever we look at data, we want a variety of tools to help us understand the data set. The material in this chapter gives us those tools.

When analyzing a data set, we should first consider whether the data comes from a complete population (remember, this means everything, for example all the test scores in this class) or from a sample. Methods of descriptive statistics are used to summarize the important characteristics of a set of population data.

One easy way to analyze data is the frequency table. A frequency table lists categories (called classes) of scores along with the number of scores that fall into each category. Consider the example below, which involves the ACT math scores of incoming students at a small college.

ACT Score    Frequency
0-4          3
5-9          10
10-14        17
15-19        35
20-24        25
25-29        15
30-34        5

This frequency table has 7 classes (0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34). The frequency is the number of students whose score falls in that class.

Here are some important terms for frequency tables:

Lower class limits are the smallest numbers that can actually belong to a class (in the table above these are 0, 5, 10, 15, 20, 25, 30).

Upper class limits are the largest numbers that can actually belong to a class (in the table above these are 4, 9, 14, 19, 24, 29, 34).

Class boundaries are the numbers used to separate classes, but without the gaps created by class limits. To get class boundaries, find the size of the gap between the upper class limit of one class and the lower class limit of the next class. Add half of this gap to each upper class limit to find the upper class boundaries, and subtract half of it from each lower class limit to find the lower class boundaries. For example, in the table above the gap between the upper class limit of one class (such as 4) and the lower class limit of the next class (5) is 1, so you add 1/2 to each upper class limit and subtract 1/2 from each lower class limit. Here is a chart of the class boundaries:

Class Boundaries

-0.5 to 4.5

4.5 to 9.5

9.5 to 14.5

14.5 to 19.5

19.5 to 24.5

24.5 to 29.5

29.5 to 34.5
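
If you like to check this kind of arithmetic by computer, the boundary rule can be sketched in a few lines of Python. This is only an illustration: the names limits and class_boundaries are made up for this example, and the sketch assumes the gap between neighboring classes is the same everywhere in the table.

```python
# Class limits (lower, upper) copied from the ACT table above.
limits = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

def class_boundaries(limits):
    """Turn class limits into class boundaries by splitting the gap in half."""
    # Gap between the first upper limit and the second lower limit (5 - 4 = 1).
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in limits]

boundaries = class_boundaries(limits)
for lower, upper in boundaries:
    print(f"{lower} to {upper}")
```

Notice that each upper boundary is also the lower boundary of the next class, which is exactly what "without the gap" means.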

Class marks are the midpoints of the classes. To calculate a class mark, add the lower class limit and the upper class limit and divide by two (in the table above the class marks are 2, 7, 12, 17, 22, 27, 32).

Class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries (in the example above it is 5).
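
Class marks and class width can be checked the same way. The short Python sketch below uses the ACT class limits again; the variable names are made up for this example.

```python
# Class limits (lower, upper) copied from the ACT table above.
limits = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

# Class mark = midpoint of the class = (lower limit + upper limit) / 2.
class_marks = [(lower + upper) / 2 for lower, upper in limits]

# Class width = difference between two consecutive LOWER class limits.
class_width = limits[1][0] - limits[0][0]

print("class marks:", class_marks)
print("class width:", class_width)
```

A common mistake is to subtract the lower limit from the upper limit of the same class (4 - 0 = 4), which gives a width that is too small. Using two consecutive lower limits avoids that.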

HOW TO CREATE YOUR OWN FREQUENCY TABLE

Let's work out the method with an example. Here is some data on distance to school for all kindergarten students at Sunnyflowery Elementary. The distances are given to the nearest 0.1 mile.

Student ID  Miles    Student ID  Miles    Student ID  Miles    Student ID  Miles
1362        1.5      2877        1.0      4355        1.2      6573        0.4
1486        2.1      2964        0.5      4454        1.5      8436        2.8
1587        1.3      3491        0.8      4531        1.7      8592        0.3
1877        0.2      3588        0.3      5482        2.3      8854        0.1
1932        2.5      3711        1.5      5533        1.4      8964        2.2
1946        0.7      3780        0.2      5717        8.5
2103        3.5      3921        1.3      6307        6.2

The ID number data is interesting, but not particularly important. We want to create a frequency table for the miles the students must travel to school.

To create a frequency table, we first need to decide how many classes we want. Normally we choose between 5 and 20 classes (people can't seem to handle more than 20 easily). Looking at the data, I see that students travel from 0.1 to 8.5 miles to school each day. I'm going to choose 15 classes.

We now need to determine the class width. You can do this by taking the range of the data (highest value - lowest value) and dividing by the number of classes. For our example this is (8.5 - 0.1)/15 = 0.56. Always round this number up (not off), to the same number of decimal places as the original data. In this case we get 0.6.

Now choose the lower limit of the first class: either the lowest value or a convenient value slightly below it (but no lower than the lowest data point minus the class width). Our lowest value is 0.1, so I'll choose 0. Add the class width to the starting point to get the second lower class limit (in our case 0.6), and keep doing this for all the classes. Here is what your chart should look like with just lower class limits:

0.0-

0.6-

1.2-

1.8-

2.4-

3.0-

3.6-

4.2-

4.8-

5.4-

6.0-

6.6-

7.2-

7.8-

8.4-

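Now that we have the lower class limits in a column, we can easily identify the upper class limits. Because the data is recorded to the nearest 0.1, each upper class limit is 0.1 less than the next lower class limit: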

0.0-0.5

0.6-1.1

1.2-1.7

1.8-2.3

2.4-2.9

3.0-3.5

3.6-4.1

4.2-4.7

4.8-5.3

5.4-5.9

6.0-6.5

6.6-7.1

7.2-7.7

7.8-8.3

8.4-8.9
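
The class-building steps so far can be sketched in Python. Treat this as one possible way to do it, not the only one: the function build_classes and its parameters are made up for this example, and it uses the standard decimal module so that values like 0.6 and 1.8 stay exact (ordinary floating point would give results such as 1.7999999999999998).

```python
from decimal import Decimal, ROUND_CEILING

def build_classes(lowest, highest, num_classes, start, step=Decimal("0.1")):
    """Return the class width and a list of (lower limit, upper limit) pairs.

    step is the precision of the data (0.1 mile in our example).
    """
    raw_width = (highest - lowest) / num_classes      # (8.5 - 0.1)/15 = 0.56
    # Round UP (not off) to the same number of decimal places as the data.
    width = (raw_width / step).to_integral_value(rounding=ROUND_CEILING) * step
    classes = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width - step   # one step below the next lower limit
        classes.append((lower, upper))
    return width, classes

width, classes = build_classes(
    Decimal("0.1"), Decimal("8.5"), 15, start=Decimal("0.0")
)
print("class width:", width)
for lower, upper in classes:
    print(f"{lower}-{upper}")
```

Rounding up matters: if the width were rounded off to a smaller number, the last class could end before the highest data value.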

Now count the number of students who fall into each class. Here is the finished frequency table:

Miles        Frequency
0.0-0.5      7
0.6-1.1      3
1.2-1.7      8
1.8-2.3      3
2.4-2.9      2
3.0-3.5      1
3.6-4.1      0
4.2-4.7      0
4.8-5.3      0
5.4-5.9      0
6.0-6.5      1
6.6-7.1      0
7.2-7.7      0
7.8-8.3      0
8.4-8.9      1
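
Counting by hand is where most mistakes happen, so here is a small Python sketch of the tallying step. It assumes the choices made above (15 classes, width 0.6, starting at 0.0); the variable names are made up for this example. The distances are typed as text and converted with Decimal so the division into classes is exact.

```python
from decimal import Decimal

# The 26 distances from the kindergarten table, entered as text so that
# Decimal can represent them exactly.
miles = [
    "1.5", "1.0", "1.2", "0.4", "2.1", "0.5", "1.5", "2.8", "1.3",
    "0.8", "1.7", "0.3", "0.2", "0.3", "2.3", "0.1", "2.5", "1.5",
    "1.4", "2.2", "0.7", "0.2", "8.5", "3.5", "1.3", "6.2",
]
data = [Decimal(m) for m in miles]

start = Decimal("0.0")   # lower limit of the first class
width = Decimal("0.6")   # class width
num_classes = 15

frequencies = [0] * num_classes
for x in data:
    # How many whole class widths above the start is this value?
    index = int((x - start) // width)
    frequencies[index] += 1

for i, count in enumerate(frequencies):
    lower = start + i * width
    upper = lower + width - Decimal("0.1")
    print(f"{lower}-{upper}  {count}")
```

Each value lands in exactly one class, and the counts should add up to 26, the number of students.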

RELATIVE FREQUENCY TABLES

A relative frequency table gives the proportion (or percent) of items in each class of the frequency table. For example, in our kindergarten example above, we would calculate the proportion of students in each class. To calculate the relative frequencies, use:

Relative frequency = frequency/total

For example, the class 0.0-0.5 in the table has a count of 7. This means that 7 students live within 0.5 miles of Sunnyflowery Elementary. There are 26 students in the table overall (to get this number, just add up the frequency column). This gives a relative frequency of 7/26 = 0.269 (rounded to three decimal places). Doing this for all the classes yields the following table:

Miles        Relative Frequency
0.0-0.5      0.269
0.6-1.1      0.115
1.2-1.7      0.308
1.8-2.3      0.115
2.4-2.9      0.077
3.0-3.5      0.038
3.6-4.1      0.000
4.2-4.7      0.000
4.8-5.3      0.000
5.4-5.9      0.000
6.0-6.5      0.038
6.6-7.1      0.000
7.2-7.7      0.000
7.8-8.3      0.000
8.4-8.9      0.038

The relative frequencies should add up to 1 (for 100%) or be very close (our chart adds up to 0.998 due to roundoff error).
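
The relative frequency formula is easy to apply by computer. The Python sketch below starts from the frequency column of the kindergarten table and rounds each result to three decimal places; the variable names are made up for this example.

```python
# Frequency column of the kindergarten table, in class order.
frequencies = [7, 3, 8, 3, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]

total = sum(frequencies)   # total number of students

# Relative frequency = frequency / total, rounded to three decimal places.
relative = [round(f / total, 3) for f in frequencies]

for value in relative:
    print(f"{value:.3f}")

# The unrounded values add up to 1; the rounded ones may be slightly off.
print("sum of rounded values:", round(sum(relative), 3))
```

Multiply each relative frequency by 100 if you would rather report percents.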

CUMULATIVE FREQUENCY TABLE

Sometimes we want to know the running total instead of the total for each individual class. In this case we use a cumulative frequency table, in which the frequency column is replaced with a cumulative total. For example, look at the second row of the table below: it adds the number of students from 0.0 to 0.5 (this is 7) and the number from 0.6 to 1.1 (this is 3) to give a total of 10. The next row contains these ten values plus the ones from 1.2 to 1.7, and so on. Note the labels in the first column: "less than 1.8" means 0.0 to 1.7 (the "less than" notation is typical). Here is the cumulative frequency table for our kindergarten data:

Miles            Cumulative Frequency
Less than 0.6    7
Less than 1.2    10
Less than 1.8    18
Less than 2.4    21
Less than 3.0    23
Less than 3.6    24
Less than 4.2    24
Less than 4.8    24
Less than 5.4    24
Less than 6.0    24
Less than 6.6    25
Less than 7.2    25
Less than 7.8    25
Less than 8.4    25
Less than 9.0    26


Of course, the last number in the cumulative frequency column should be the total number of values in the data set.
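
A running total is simple to compute in Python with itertools.accumulate from the standard library. This sketch starts from the frequency column of the kindergarten table and builds the "less than" labels from the class width; the variable names are made up for this example.

```python
from decimal import Decimal
from itertools import accumulate

# Frequency column of the kindergarten table, in class order.
frequencies = [7, 3, 8, 3, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]

start = Decimal("0.0")   # lower limit of the first class
width = Decimal("0.6")   # class width

# accumulate gives the running totals: 7, 7+3, 7+3+8, and so on.
cumulative = list(accumulate(frequencies))

for i, running_total in enumerate(cumulative):
    # Each label is the lower limit of the NEXT class.
    next_lower_limit = start + (i + 1) * width
    print(f"Less than {next_lower_limit}  {running_total}")
```

The last running total equals the sum of the frequency column, which is a quick check that nothing was skipped.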

REMEMBER WHEN CONSTRUCTING TABLES

  1. Classes must be mutually exclusive. Each data point must belong to only ONE class
  2. Include all classes even if the frequency is 0
  3. Try to use the same width for all classes, although open-ended intervals such as "65 or older" are often acceptable
  4. Select some convenient numbers for class limits - try to use numbers relevant to the situation
  5. Use between 5 and 20 classes
  6. The sum of the class frequencies is equal to the number of original data values
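
Several of these rules can be checked automatically. The Python sketch below is a rough checker, not a complete one: the function check_table is made up for this example, it only tests rules 1, 3, 5 and 6, and it assumes the class limits are given in tenths of a mile so that whole numbers can be used.

```python
def check_table(limits, frequencies, n_values):
    """Return a list of problems found in a frequency table (empty if none)."""
    problems = []
    # Rule 5: use between 5 and 20 classes.
    if not 5 <= len(limits) <= 20:
        problems.append("number of classes is outside 5 to 20")
    # Rule 1: classes must not overlap.
    for (lo1, hi1), (lo2, hi2) in zip(limits, limits[1:]):
        if lo2 <= hi1:
            problems.append(f"classes {lo1}-{hi1} and {lo2}-{hi2} overlap")
    # Rule 3: all classes should have the same width.
    widths = {lo2 - lo1 for (lo1, _), (lo2, _) in zip(limits, limits[1:])}
    if len(widths) > 1:
        problems.append("class widths are not all equal")
    # Rule 6: frequencies must add up to the number of data values.
    if sum(frequencies) != n_values:
        problems.append("frequencies do not add up to the number of data values")
    return problems

# Kindergarten table, limits in tenths of a mile: (0, 5) means 0.0-0.5.
limits = [(6 * i, 6 * i + 5) for i in range(15)]
frequencies = [7, 3, 8, 3, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
print(check_table(limits, frequencies, 26))

# Dropping the empty classes (which breaks rule 2) shows up as unequal widths.
kept = [i for i, f in enumerate(frequencies) if f > 0]
short_limits = [limits[i] for i in kept]
short_frequencies = [frequencies[i] for i in kept]
print(check_table(short_limits, short_frequencies, 26))
```

The second call shows why rule 2 matters: leaving out the empty classes hides the long gap between most students and the two who live far away.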
