Surfstat.australia: an online text in introductory Statistics

SUMMARISING AND PRESENTING DATA

TYPES OF VARIABLE

It is useful to distinguish between two broad types of variables: qualitative and quantitative (or numeric). Each is broken down into two sub-types: qualitative data can be ordinal or nominal, and numeric data can be discrete (often, integer) or continuous.

Because qualitative data always have a limited number of alternative values, such variables are also described as discrete. All qualitative data are discrete, while some numeric data are discrete and some are continuous.

For statistical analysis, qualitative data can be converted into discrete numeric data by counting how many times each distinct value appears.
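As a minimal sketch of this counting step, the Python snippet below tallies a small list of eye colours using the standard library's collections.Counter. The observations are made up for illustration and are not taken from any data set in this text.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (illustrative values only).
eye_colours = ["blue", "brown", "green", "brown", "blue", "brown"]

# Count how many times each distinct category appears.
counts = Counter(eye_colours)

# Print the categories in alphabetical order with their frequencies.
for colour, n in sorted(counts.items()):
    print(colour, n)
```

The resulting counts are discrete numeric data, one count per category, and can be analysed like any other table of frequencies.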

Note: the word "variable" is used in two senses. It can mean an item of data collected on each sampling unit, and it can mean "random variable". A random variable is a variable in the mathematical sense, but one that takes different values according to a probability distribution. The word "variate" is also sometimes used to mean random variable. In Statistics, we use random variables to build probability models for data variables. This makes sense because when data are collected on observational units sampled at random, the values recorded for the data variables can be regarded as realisations of mathematical random variables.

Qualitative Data

Qualitative data arise when the observations fall into separate distinct categories.

Examples are:
Colour of eyes : blue, green, brown etc
Exam result : pass or fail
Socio-economic status : low, middle or high.

Such data are inherently discrete, in that there is a finite number of possible categories into which each observation may fall.

Data are classified as:
nominal if there is no natural order between the categories (eg eye colour), or
ordinal if an ordering exists (eg exam results, socio-economic status).
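The practical difference can be sketched in Python. For an ordinal variable the category order is recorded explicitly, so observations can be sorted or compared. For a nominal variable only equality and counting are meaningful. The names status_order and rank and the observations are hypothetical, chosen only for this illustration.

```python
# Ordinal variable: the natural order of the categories is stated explicitly.
status_order = ["low", "middle", "high"]
rank = {level: i for i, level in enumerate(status_order)}

# Hypothetical observations of socio-economic status.
observations = ["middle", "low", "high", "middle"]

# Sorting by rank uses the ordering, which only makes sense for ordinal data.
sorted_obs = sorted(observations, key=rank.get)
print(sorted_obs)

# Comparisons such as "low is below high" are meaningful for ordinal data.
print(rank["low"] < rank["high"])

# Nominal variable: no order exists, so we only test equality or count matches.
eye_colours = ["blue", "brown", "blue"]
print(eye_colours.count("blue"))
```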

Quantitative Data

Quantitative or numerical data arise when the observations are counts or measurements. The data are said to be discrete if the measurements are integers (eg number of people in a household, number of cigarettes smoked per day) and continuous if the measurements can take on any value, usually within some range (eg weight).

Quantities such as sex and weight are called variables, because the values of these quantities vary from one observation to another. Numbers calculated to describe important features of the data are called statistics. For example, (i) the proportion of females, and (ii) the average age of unemployed persons, in a sample of residents of a town are statistics.

The following table shows a part of some (hypothetical) data on a group of 48 subjects.
'Age' and 'income' are continuous numeric variables,
'age group' is an ordinal qualitative variable,
and 'sex' is a nominal qualitative variable.

The ordinal variable 'age group' is created from the continuous variable 'age' using five categories:
age group = 1 if age is less than 20;
age group = 2 if age is 20 to 29;
age group = 3 if age is 30 to 39;
age group = 4 if age is 40 to 49;
age group = 5 if age is 50 or more.
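The grouping rule above can be sketched as a small Python function. The function name age_group is hypothetical, and the sketch assumes that "20 to 29" means any age of at least 20 and below 30 (and similarly for the other ranges), which is an interpretation rather than something the text states. The ages used are those of the five subjects listed in the table of hypothetical data.

```python
def age_group(age):
    """Return the age-group code (1 to 5) for an age in years.

    Assumes each range includes its lower bound and excludes the next one,
    e.g. group 2 covers ages from 20 up to, but not including, 30.
    """
    if age < 20:
        return 1
    if age < 30:
        return 2
    if age < 40:
        return 3
    if age < 50:
        return 4
    return 5

# Ages of the subjects shown in the table (subjects 1, 2, 3, 47 and 48).
ages = [32, 20, 45, 19, 32]
groups = [age_group(a) for a in ages]
print(groups)
```

The codes produced for these five ages agree with the 'age group' column of the table.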

Table - Hypothetical Data

Subject No   Age (years)   Age Group   Annual Income (x $10,000)   Sex
1            32            3           4.1                         F
2            20            2           1.5                         M
3            45            4           2.3                         F
.            .             .           .                           .
.            .             .           .                           .
47           19            1           0.5                         F
48           32            3           1.9                         F
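To illustrate what is meant by a statistic, the following Python sketch computes the proportion of females and the average age from the rows of the table that are displayed. Only five of the 48 subjects are shown, so these figures describe the displayed rows and not the full data set. The variable names are hypothetical.

```python
# The five rows visible in the table; the other 43 subjects are not shown.
rows = [
    # (subject, age, age_group, income_x10k, sex)
    (1, 32, 3, 4.1, "F"),
    (2, 20, 2, 1.5, "M"),
    (3, 45, 4, 2.3, "F"),
    (47, 19, 1, 0.5, "F"),
    (48, 32, 3, 1.9, "F"),
]

n = len(rows)

# Statistic (i): proportion of females among the displayed subjects.
proportion_female = sum(1 for r in rows if r[4] == "F") / n

# Statistic (ii): average age of the displayed subjects.
mean_age = sum(r[1] for r in rows) / n

print(proportion_female)
print(mean_age)
```

Each result is a single number summarising one variable: a proportion for the nominal variable 'sex' and a mean for the numeric variable 'age'.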

Progress check

  1. A person's highest educational level is which type of variable?
  2. The number of motor-vehicle accidents on a particular stretch of the Pacific Highway in a week is which type of variable?
  3. Nominal data are often analysed in the form of:

Exercises

