Surfstat.australia: an online text in introductory Statistics

# SUMMARISING AND PRESENTING DATA

## MEASURES OF VARIABILITY

These are statistics which summarise how spread out the data values are. They are also called measures of dispersion.

### Range

The range is the difference between the lowest value and the highest value: the maximum minus the minimum. For the Commodore data, the maximum is \$29,500 and the minimum is \$2,200:

Range = Maximum - Minimum = 29,500 - 2,200 = 27,300

| | All Commodores | Omitting \$29,500 | Effect |
|---|---|---|---|
| Median | 9,500 | 9,500 | not changed |
| Max | 29,500 | 20,000 | changed |
| Min | 2,200 | 2,200 | |
| Range | 27,300 | 17,800 | changed greatly |
| Mean | 10,080 | 9,555 | changed |

The range depends only on the extreme values in the data set.
Mistakes in data, such as reversing digits (e.g. 52 for 25) or omitting digits (e.g. 12 for 132), may produce extreme values. A measure of spread that is less affected by extreme values than the range can be obtained by moving in from each end of the ordered data, for example 5% in from either end or one quarter in from either end, and taking the difference between the values found there.
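To make this sensitivity concrete, here is a small Python sketch. It uses the nine Commodore prices (in \$'000) that appear in the quartile example later in this section. The value 105.0 is a made-up data-entry error (a misplaced decimal point in 10.5), not part of the Commodore data, and `data_range` is just an illustrative helper name.

```python
import statistics

# Nine Commodore prices in $'000, as listed in the quartile example.
prices = [6.0, 6.7, 3.8, 7.0, 5.8, 9.975, 10.5, 5.99, 20.0]

# Hypothetical data-entry error: 10.5 keyed in as 105.0.
with_error = [105.0 if p == 10.5 else p for p in prices]


def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


range_clean = data_range(prices)
range_error = data_range(with_error)
median_clean = statistics.median(prices)
median_error = statistics.median(with_error)

print(f"Range:  {range_clean:.1f} -> {range_error:.1f}")
print(f"Median: {median_clean:.1f} -> {median_error:.1f}")
```

With the single bad value the range moves from 16.2 to 101.2, while the median stays at 6.7.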

### Quartiles

When the data are arranged in order of magnitude (i.e. they are ranked) the quartiles are 3 numbers which divide the data into four groups each having approximately the same number of values.

### Procedure

1. Order the n data values from smallest to largest.
2. The second quartile, Q2, is the median of the whole data set.
3. If n is even, the first quartile, Q1, is the median of the smallest n/2 observations and the third quartile, Q3, is the median of the largest n/2 observations.
4. If n is odd, Q1 is the median of the smallest (n-1)/2 observations, and Q3 is the median of the largest (n-1)/2 observations.
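The procedure above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `quartiles` is made up for this example, and it assumes that for odd n the middle observation is left out of both halves, which is what the worked example with nine prices does.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(values)                     # step 1: rank the data
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2                                # n/2 if n is even, (n-1)/2 if n is odd
    q2 = statistics.median(ordered)              # step 2: median of all the data
    q1 = statistics.median(ordered[:half])       # median of the smallest half
    q3 = statistics.median(ordered[n - half:])   # median of the largest half
    return q1, q2, q3


# Made-up data with an odd number of values.
q1, q2, q3 = quartiles([7, 1, 3, 9, 5, 11, 13])
print(q1, q2, q3)  # 3 7 11
```

For the seven made-up values, the ordered data are 1, 3, 5, 7, 9, 11, 13, so Q2 = 7, Q1 is the median of 1, 3, 5 and Q3 is the median of 9, 11, 13.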

### Interquartile Range

The interquartile range is defined as   IQR = Q3 - Q1.

EXAMPLE. Consider the first 9 Commodore prices (in \$'000):

6.0,   6.7,   3.8,   7.0,   5.8,   9.975,   10.5,   5.99,   20.0

Arrange these in order of magnitude:

3.8,   5.8,   5.99,   6.0,   6.7,   7.0,   9.975,   10.5,   20.0

The median is Q2 = 6.7 (there are 4 values on either side)

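Q1 = (5.8 + 5.99)/2 = 5.895, or 5.9 to one decimal place (median of the 4 smallest values)

Q3 = (9.975 + 10.5)/2 = 10.2375, or 10.2 to one decimal place (median of the 4 largest values)

IQR = 10.2 - 5.9 = 4.3.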

[Some textbooks and computer programs use slightly different definitions for Q1 and Q3 from the ones given here. The calculated values, however, are usually very similar. Use HELP DESCRIBE to see the MINITAB definition.]
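As an illustration of how definitions can differ, the sketch below uses `statistics.quantiles` from Python's standard library (Python 3.8 or later), which offers two interpolation methods. This is Python's behaviour, not MINITAB's; it is shown only to compare two definitions on the nine prices above.

```python
import statistics

# Nine Commodore prices in $'000 from the example.
prices = [6.0, 6.7, 3.8, 7.0, 5.8, 9.975, 10.5, 5.99, 20.0]

# "exclusive" is the default method; "inclusive" is the alternative offered.
q1_exc, q2_exc, q3_exc = statistics.quantiles(prices, n=4, method="exclusive")
q1_inc, q2_inc, q3_inc = statistics.quantiles(prices, n=4, method="inclusive")

iqr_exc = q3_exc - q1_exc
iqr_inc = q3_inc - q1_inc

print(f"exclusive: Q1={q1_exc:.2f}  Q3={q3_exc:.2f}  IQR={iqr_exc:.2f}")
print(f"inclusive: Q1={q1_inc:.2f}  Q3={q3_inc:.2f}  IQR={iqr_inc:.2f}")
```

For these nine prices the exclusive method happens to reproduce the values in the worked example (Q1 about 5.9, Q3 about 10.2), while the inclusive method gives Q1 = 5.99 and Q3 = 9.975. Both methods agree on the median, 6.7.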

Just as the median is not affected much by extreme values, neither is the IQR. For example, for the Commodore prices MINITAB gives

### Percentiles

Quartiles divide the ordered data into quarters, but we can consider any fractions we please. The most common are "percentiles", where we take hundredths. The first quartile is thus the 25th percentile, the median is the 50th percentile and the third (or upper) quartile is the 75th percentile.

The percentiles most commonly used, after the 50th, are those close to 100. Thus the 90th percentile is the value that is exceeded by only 10% of the sample or the population, and the 99th percentile is exceeded by only 1 in 100.

You will occasionally also see "deciles", which are found by dividing the data into tenths, and "quintiles", which divide the data into fifths. The first quintile is identical to the 20th percentile, the median is the fifth decile, and so on.
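The relationships between these cut points can be checked with a short Python sketch. The data here are made up (the whole numbers 1 to 99, not Commodore prices), and the cut points come from `statistics.quantiles` in Python's standard library, whose default definition may differ slightly from other programs.

```python
import math
import statistics

# Made-up data: the whole numbers 1 to 99.
data = list(range(1, 100))

percentiles = statistics.quantiles(data, n=100)  # 99 cut points
deciles = statistics.quantiles(data, n=10)       # 9 cut points
quintiles = statistics.quantiles(data, n=5)      # 4 cut points
quartiles = statistics.quantiles(data, n=4)      # 3 cut points

# The k-th cut point is stored at index k - 1.
p20, p25, p50, p75, p90 = (percentiles[k - 1] for k in (20, 25, 50, 75, 90))

print("Q1 is the 25th percentile:       ", math.isclose(quartiles[0], p25))
print("Q3 is the 75th percentile:       ", math.isclose(quartiles[2], p75))
print("1st quintile is 20th percentile: ", math.isclose(quintiles[0], p20))
print("5th decile is the median:        ", math.isclose(deciles[4], statistics.median(data)))

# Share of the data lying above the 90th percentile (roughly 10%).
share_above_p90 = sum(x > p90 for x in data) / len(data)
print(f"Share above 90th percentile: {share_above_p90:.1%}")
```

Each comparison prints `True` for this data set, and about 9% of the 99 values lie above the 90th percentile, in line with the "exceeded by only 10%" description.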

#### Progress check

1. Find the interquartile range of the following 6 numbers:   11, 15, 16, 17, 24, 34.
