Surfstat.australia: an online text in introductory Statistics

SUMMARISING AND PRESENTING DATA

MEASURES OF VARIABILITY

Comments

  1. The MINITAB command is STDEV C1; the standard deviation is also given, along with other summary values, by DESCRIBE C1.

  2. Although (n-1) is most commonly used for the denominator, n is sometimes used instead. Many calculators offer both versions; usually the one with n is (wrongly) labelled σ (or σn) and the one with (n-1) is labelled s (or σn-1). Some textbooks start with n and later change to (n-1), e.g. Staudte's "Seeing Through Statistics".

  3. An alternative version of the formula, which is easier to use, is

    e.g. for data set A, n=7

  4. For grouped data with values and corresponding frequencies the formula is

    EXAMPLE - for the student age data (xi = age of student, fi = number of students with age xi)
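Comments 2 to 4 can be checked numerically. The following is a minimal sketch in Python rather than MINITAB. It assumes the usual shortcut form s = sqrt[(Σx² - (Σx)²/n) / (n-1)] and its grouped counterpart with Σfx and Σfx² in place of Σx and Σx², where n = Σf. The function names and both data sets are hypothetical illustrations, not the text's data set A or the student age data.

```python
import math
import statistics

def sd_definition(xs, divisor_offset=1):
    """Definitional form: sum of squared deviations over (n - divisor_offset)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - divisor_offset))

def sd_shortcut(xs):
    """Shortcut form using only n, the sum of x and the sum of x squared."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

def sd_grouped(values, freqs):
    """Grouped form: each value x occurs f times, so n is the sum of f."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

# Hypothetical data (assumed for illustration only).
sample = [2, 4, 4, 5, 6, 7, 7]   # n = 7
ages = [18, 19, 20, 21]          # distinct values xi
counts = [3, 5, 2, 1]            # frequencies fi

print(round(sd_definition(sample), 4))     # divisor n - 1 -> 1.8257
print(round(sd_definition(sample, 0), 4))  # divisor n     -> 1.6903
print(round(sd_shortcut(sample), 4))       # same as the (n - 1) version
print(round(sd_grouped(ages, counts), 4))

# Cross-check against the standard library's (n - 1) implementation.
assert math.isclose(sd_shortcut(sample), statistics.stdev(sample))
```

The version with n is always slightly smaller than the version with (n-1), and the gap shrinks as n grows.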

  5. The standard deviation, like the mean, is strongly influenced by extreme values, e.g. the Commodore prices.

Boxplots

The box contains the middle 50% of the values (from Q1 to Q3). The whiskers show how far the remaining values are spread.
The MINITAB command is BOXPLOT C1.

For all Commodore prices

Omitting the value $29,500
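
The influence of a single extreme value can be seen by computing the summaries with and without it. The following is a minimal Python sketch. The prices are hypothetical stand-ins (only the $29,500 value is taken from the text), not the actual Commodore data, and `summarise` is an illustrative helper name.

```python
import statistics

# Hypothetical car prices in dollars; only 29500 comes from the text.
prices = [9500, 10200, 10800, 11500, 12000, 12500,
          13000, 13900, 14500, 21000, 29500]
trimmed = [p for p in prices if p != 29500]

def summarise(xs):
    """Return (mean, median, sample standard deviation with n - 1)."""
    return statistics.mean(xs), statistics.median(xs), statistics.stdev(xs)

mean_all, median_all, sd_all = summarise(prices)
mean_trim, median_trim, sd_trim = summarise(trimmed)

print(f"all prices:      mean={mean_all:.0f} median={median_all:.0f} sd={sd_all:.0f}")
print(f"without $29,500: mean={mean_trim:.0f} median={median_trim:.0f} sd={sd_trim:.0f}")
```

With these assumed values, dropping the largest price moves the mean and the standard deviation far more than it moves the median.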

By MINITAB convention, an observation more than 3 x IQR (interquartile range) beyond an end of the box is regarded as an "outlier" (possibly a mistake) and is marked as o on the boxplot. An observation between 1.5 x IQR and 3 x IQR beyond an end of the box is a possible outlier and is marked as *. The whiskers of the boxplot do not extend to the *'s and o's.
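
The 1.5 x IQR and 3 x IQR rule can be sketched in Python as follows. Two assumptions apply: the quartiles come from `statistics.quantiles` with its default method, which may not match MINITAB's quartile calculation exactly, and the prices are hypothetical apart from the $29,500 value. The function names are illustrative.

```python
import statistics

def boxplot_flags(data):
    """Map each value to '' (ordinary), '*' (possible outlier) or 'o' (outlier).

    Equal values share one entry, since they receive the same label.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    flags = {}
    for x in data:
        # Distance beyond the nearer end of the box (0 if inside the box).
        beyond = max(q1 - x, x - q3, 0)
        if beyond > 3 * iqr:
            flags[x] = "o"
        elif beyond > 1.5 * iqr:
            flags[x] = "*"
        else:
            flags[x] = ""
    return flags

def whisker_ends(data):
    """Smallest and largest values that are not flagged with * or o."""
    flags = boxplot_flags(data)
    unflagged = [x for x in data if flags[x] == ""]
    return min(unflagged), max(unflagged)

# Hypothetical car prices in dollars; only 29500 comes from the text.
prices = [9500, 10200, 10800, 11500, 12000, 12500,
          13000, 13900, 14500, 21000, 29500]

flags = boxplot_flags(prices)
print({x: mark for x, mark in flags.items() if mark})  # flagged values only
print(whisker_ends(prices))
```

For these assumed prices the box runs from 10800 to 14500 (IQR 3700), so 21000 is marked * and 29500 is marked o, and the upper whisker stops at 14500.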

Five-number summary

The five-number summary of a distribution consists of the median M, the quartiles Q1 and Q3, and the smallest and largest individual observations, written in the order:

minimum, Q1, M, Q3, maximum

This provides a quick overall description of a distribution.
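
As an illustration, the five numbers can be computed in a few lines of Python. This is a sketch under stated assumptions: `five_number_summary` is a hypothetical helper, the quartiles use the default method of `statistics.quantiles` (which may differ slightly from MINITAB's), and the prices are invented apart from the $29,500 value.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, M, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)

# Hypothetical car prices in dollars; only 29500 comes from the text.
prices = [9500, 10200, 10800, 11500, 12000, 12500,
          13000, 13900, 14500, 21000, 29500]

minimum, q1, m, q3, maximum = five_number_summary(prices)
data_range = maximum - minimum   # spread of all the values
iqr = q3 - q1                    # spread of the middle 50%

print(minimum, q1, m, q3, maximum)
print("range =", data_range, "IQR =", iqr)
```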

Progress check

  1. Which of the following is NOT true of a boxplot?
  2. When calculating a sample standard deviation using a statistical calculator that offers both σn and σn-1, it is better to
  3. Which measure of central tendency is included in the "five-number summary"?
  4. Which measures of dispersion can be found from the "five-number summary"?

