$y = a + bx$ is called the regression line of y on x (the least squares regression line).
The idea can be generalised to equations involving more than one predictor, e.g. $y = a + b_1 x_1 + b_2 x_2$.
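Correlation coefficient is $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$, where $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{yy} = \sum (y_i - \bar{y})^2$ and $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$, so x and y are interchangeable.
Regression line is $y = a + bx$ where $a = \bar{y} - b\bar{x}$ and $b = \dfrac{S_{xy}}{S_{xx}}$, so x and y are not interchangeable.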
If you treat x as the response and y as the predictor you get a different line because you would minimise the horizontal distances rather than the vertical distances.
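As a rough illustration of the point above, the following Python sketch fits both regressions to the speed and stride-rate figures quoted later in these notes. The helper name `least_squares_line` and the variable names are assumptions made for this sketch; they are not part of the MINITAB session.

```python
# Speed (x) and stride rate (y) from the data set used in these notes.
speed = [15.86, 16.88, 17.50, 18.62, 19.97, 21.06]
stride = [3.05, 3.12, 3.17, 3.25, 3.36, 3.46]


def least_squares_line(pred, resp):
    """Return (intercept, slope) of the least squares line of resp on pred."""
    n = len(pred)
    pred_mean = sum(pred) / n
    resp_mean = sum(resp) / n
    # Corrected sum of squares of the predictor and corrected sum of products.
    s_pp = sum((p - pred_mean) ** 2 for p in pred)
    s_pr = sum((p - pred_mean) * (r - resp_mean) for p, r in zip(pred, resp))
    slope = s_pr / s_pp
    intercept = resp_mean - slope * pred_mean
    return intercept, slope


# Regression of y on x: minimises the vertical distances.
a, b = least_squares_line(speed, stride)
# Regression of x on y: minimises the horizontal distances.
c, d = least_squares_line(stride, speed)

print(f"y on x: y = {a:.5f} + {b:.6f} x")
# Rewrite x = c + d*y as y = -c/d + (1/d)*x to compare both lines on the same axes.
print(f"x on y, re-expressed: y = {-c / d:.5f} + {1 / d:.6f} x")
```

The two printed lines have slightly different slopes and intercepts, even though the data are very close to linear.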
$r^2 = b\,b'$, where $b$ is the slope of the regression line of y on x and $b'$ is the slope of the regression line of x on y, i.e. the square of the correlation coefficient r is the product of the slopes of these two lines. If the linear relationship is perfect the two lines will be the same: then, because x and y are interchanged, each slope will be the reciprocal of the other, and their product will be 1.
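The relationship between r and the two slopes can be checked numerically. The sketch below is only an illustration using the speed and stride-rate figures from these notes; the variable names (`s_xx`, `slope_y_on_x` and so on) are assumptions chosen for readability.

```python
from math import sqrt

# Speed (x) and stride rate (y) from the data set used in these notes.
x = [15.86, 16.88, 17.50, 18.62, 19.97, 21.06]
y = [3.05, 3.12, 3.17, 3.25, 3.36, 3.46]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and products.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient: symmetric in x and y.
r = s_xy / sqrt(s_xx * s_yy)

# Slopes of the two regression lines: not symmetric in x and y.
slope_y_on_x = s_xy / s_xx
slope_x_on_y = s_xy / s_yy

print(f"r squared         = {r ** 2:.6f}")
print(f"product of slopes = {slope_y_on_x * slope_x_on_y:.6f}")
```

Both printed values agree, and they are slightly below 1 because the points do not lie exactly on a straight line.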
If the plot (or theory) suggests a non-linear relationship, it may be possible either to transform the data so that the relationship becomes linear and then fit a straight line, or else to fit a non-linear function of the predictor variables.
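One common case of the transformation approach is an exponential relationship, which becomes linear after taking logarithms of the response. The sketch below assumes a hypothetical relationship y = 2 exp(0.5 x) and generates noise-free values from it purely for illustration; neither the relationship nor the numbers come from these notes.

```python
from math import exp, log

# Synthetic, noise-free values generated from y = 2 * exp(0.5 * x).
# These are NOT real data; they only illustrate the transformation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * exp(0.5 * xi) for xi in x]

# Taking logs gives log(y) = log(2) + 0.5 * x, which is linear in x.
log_y = [log(yi) for yi in y]

n = len(x)
x_bar = sum(x) / n
ly_bar = sum(log_y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xly = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))

# Least squares line fitted to (x, log y).
slope = s_xly / s_xx
intercept = ly_bar - slope * x_bar

# Back-transform the intercept to recover the multiplier in y = A * exp(k * x).
print(f"k = {slope:.4f}, A = {exp(intercept):.4f}")
```

With real, noisy data the recovered values would only approximate the underlying constants, and the residuals should be examined on the transformed scale.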
Speed (x)         15.86   16.88   17.50   18.62   19.97   21.06
Stride rate (y)    3.05    3.12    3.17    3.25    3.36    3.46
Data were entered into a MINITAB worksheet in columns C5 (speed) and C6 (stride).
MTB > Plot 'stride' 'speed';
SUBC> Symbol 'x'.
Analysis of Variance

SOURCE        DF        SS        MS         F       p
Regression     1   0.11789   0.11789   1807.69   0.000
Error          4   0.00026   0.00007
Total          5   0.11815

MTB > let c3 = 1.79677 + 0.078527 * c5
MTB > let c4 = c6 - c3
MTB > name c3 'fit' c4 'residual'
MTB > print c5 c6 c3 c4

ROW   speed   stride       fit     residual
  1   15.86     3.05   3.04221    0.0077918
  2   16.88     3.12   3.12231   -0.0023060
  3   17.50     3.17   3.17099   -0.0009923
  4   18.62     3.25   3.25894   -0.0089428
  5   19.97     3.36   3.36495   -0.0049543
  6   21.06     3.46   3.45055    0.0094514

MTB > plot c4 c5
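For readers without MINITAB, the fitted values, residuals and analysis of variance quantities above can be reproduced approximately with a short Python sketch. This is an illustration only: the variable names are assumptions, and small differences from the printed output are expected because the session computes the fit from coefficients rounded to six significant figures.

```python
# Speed (C5) and stride rate (C6) as entered in the worksheet.
speed = [15.86, 16.88, 17.50, 18.62, 19.97, 21.06]
stride = [3.05, 3.12, 3.17, 3.25, 3.36, 3.46]

n = len(speed)
x_bar = sum(speed) / n
y_bar = sum(stride) / n
s_xx = sum((x - x_bar) ** 2 for x in speed)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(speed, stride))

# Least squares slope and intercept of stride on speed.
b = s_xy / s_xx
a = y_bar - b * x_bar

# Equivalent of columns 'fit' (C3) and 'residual' (C4).
fit = [a + b * x for x in speed]
residual = [y - f for y, f in zip(stride, fit)]

# Analysis of variance decomposition: total = regression + error.
ss_total = sum((y - y_bar) ** 2 for y in stride)
ss_error = sum(e ** 2 for e in residual)
ss_regression = ss_total - ss_error
df_regression = 1
df_error = n - 2
f_ratio = (ss_regression / df_regression) / (ss_error / df_error)

print(f"stride = {a:.5f} + {b:.6f} * speed")
for row, (x, y, f, e) in enumerate(zip(speed, stride, fit, residual), start=1):
    print(f"{row:3d} {x:7.2f} {y:6.2f} {f:9.5f} {e:11.7f}")
print(f"SS regression = {ss_regression:.5f}, SS error = {ss_error:.5f}, F = {f_ratio:.2f}")
```

Plotting `residual` against `speed`, as the final MINITAB command does, is a check on whether a straight line is adequate; a systematic pattern in the residuals would suggest it is not.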