Surfstat.australia: an online text in introductory Statistics



Ways of presenting discrete (qualitative or quantitative) data.

Frequency Distributions

A simple and effective way of summarising discrete data is by counting the number of observations falling into each category. The number associated with each category is called the frequency and the collection of frequencies over all categories gives the frequency distribution of that variable.
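The counting step described above can be sketched in a few lines of Python. This is only an illustration: the list `observations` is made-up data, not taken from the notes, and `Counter` from the standard library is one of several ways to do the tally.

```python
from collections import Counter

# Hypothetical raw observations (invented for illustration):
# one damage type recorded per carton.
observations = [
    "corner gouge", "tip crushed", "corner gouge", "end smashed",
    "corner gouge", "tip crushed", "puncture", "corner gouge",
]

# Counting the observations in each category gives the frequency
# distribution of the variable "damage type".
frequency = Counter(observations)

for category, count in frequency.most_common():
    print(f"{category:<15}{count}")
```

Each key of `frequency` is a category and each value is that category's frequency; the frequencies always add up to the number of observations.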

The relative frequency is the proportion of observations falling in a given category, that is, the category's frequency divided by the total number of observations. This can be illustrated using the 'damaged cartons' example from the introduction section of the notes:

Observe which category each subject or object belongs to, e.g. for a damaged carton: corner gouge, tip crushed, end smashed.
Count how many observations fall in each category; this gives 'frequency' or 'count' data.
Tabulate the results in a frequency table showing frequencies, relative frequencies or percentages.

Type                      Frequency   Relative frequency   Percentage
A - Flap out                     16               0.0096            1
B - Flap torn                    17               0.0102            1
C - End smashed                 132               0.0793            8
D - Puncture                     95               0.0571            6
E - Glue problem                 87               0.0523            5
F - Corner gouge                984               0.5913           59
G - Compression wrinkle          15               0.0090            1
H - Tip crushed                 303               0.1821           18
I - Total destruction            15               0.0090            1
Total                          1664               0.9999*         100

(* the relative frequencies do not add to 1.0000 due to rounding)

Relative frequency for type A = 16/1664 = 0.0096
Percentage for type A = (16/1664) × 100 = 0.96, which rounds to 1%
The usefulness of relative frequencies and percentages is clear: for example, it is easily seen that corner gouge accounts for 59% of the total number of damages.
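As a check on the arithmetic, the relative frequency and percentage columns can be recomputed from the frequencies in the table. The sketch below is illustrative only; the variable names are our own, and the rounding (four decimal places, whole percentages) is assumed from the values shown in the table.

```python
# Frequencies copied from the damaged-cartons table.
frequency = {
    "A - Flap out": 16,
    "B - Flap torn": 17,
    "C - End smashed": 132,
    "D - Puncture": 95,
    "E - Glue problem": 87,
    "F - Corner gouge": 984,
    "G - Compression wrinkle": 15,
    "H - Tip crushed": 303,
    "I - Total destruction": 15,
}

total = sum(frequency.values())

# Relative frequency = category frequency / total number of observations.
relative = {label: count / total for label, count in frequency.items()}

# Percentage = relative frequency x 100, rounded to a whole number.
percentage = {label: round(100 * rel) for label, rel in relative.items()}

for label, count in frequency.items():
    print(f"{label:<24}{count:>5}{relative[label]:>8.4f}{percentage[label]:>5}")

# The unrounded relative frequencies sum to 1 (up to floating-point error);
# only the values rounded to four decimals fall short of 1.0000.
rounded_sum = round(sum(round(rel, 4) for rel in relative.values()), 4)
print(total, rounded_sum)
```

The last line shows why the table's relative frequency column totals slightly less than 1: the shortfall comes from rounding each entry, not from the data.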

Bar charts and Pareto charts

The frequency distribution of a variable is often presented graphically as a bar chart. For example, the data in the frequency table above can be shown as:

[Figure: bar chart of the damaged-carton frequencies by damage type]

The vertical scale can be frequencies or relative frequencies or percentages.
On the horizontal axis:

all boxes should have the same width
leave gaps between the boxes (because there is no connection between them)
the boxes can be in any order.
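The rules above can be tried out with a rough text-only bar chart. This is a sketch rather than a reproduction of the chart in the notes: the function name `text_bar_chart` and the choice of one `#` per 20 observations are our own assumptions, and the bars are drawn sideways, so the category axis runs down the page instead of across it.

```python
# Frequencies copied from the damaged-cartons table.
frequency = {
    "A - Flap out": 16,
    "B - Flap torn": 17,
    "C - End smashed": 132,
    "D - Puncture": 95,
    "E - Glue problem": 87,
    "F - Corner gouge": 984,
    "G - Compression wrinkle": 15,
    "H - Tip crushed": 303,
    "I - Total destruction": 15,
}


def text_bar_chart(freq, per_symbol=20):
    """Return one text bar per category; each '#' is about per_symbol observations."""
    width = max(len(label) for label in freq)
    lines = []
    for label, count in freq.items():
        # Each bar is exactly one text line thick, so all boxes have the
        # same width; at least one '#' keeps small categories visible.
        bar = "#" * max(1, round(count / per_symbol))
        lines.append(f"{label:<{width}} | {bar} {count}")
    return lines


chart = text_bar_chart(frequency)

# The blank line between bars plays the role of the gap between boxes.
print("\n\n".join(chart))
```

Here the bar length shows the frequency; passing relative frequencies or percentages instead (with a suitable `per_symbol`) would change only the scale, not the shape of the chart.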

Pareto chart: rearrange the boxes of a bar chart in order of importance, starting with the category with the highest frequency.
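The reordering for a Pareto chart amounts to sorting the categories by frequency, largest first. A minimal sketch using the frequencies from the table (the name `pareto_order` is our own):

```python
# Frequencies copied from the damaged-cartons table.
frequency = {
    "A - Flap out": 16,
    "B - Flap torn": 17,
    "C - End smashed": 132,
    "D - Puncture": 95,
    "E - Glue problem": 87,
    "F - Corner gouge": 984,
    "G - Compression wrinkle": 15,
    "H - Tip crushed": 303,
    "I - Total destruction": 15,
}

# Sort the (category, frequency) pairs from highest to lowest frequency.
# Categories with equal frequencies keep their original relative order.
pareto_order = sorted(frequency.items(), key=lambda item: item[1], reverse=True)

for label, count in pareto_order:
    print(f"{label:<24}{count:>5}")
```

Drawing the boxes in this order puts corner gouge first and makes the few categories that account for most of the damage stand out at the start of the chart.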

Progress check

  1. The total of the Relative Frequency column in a frequency distribution table is
  2. The difference between a bar chart and a Pareto chart is that in a Pareto chart
