Linear Regression Problems with Solutions

Linear regression and data modeling problems are presented on this page along with detailed solutions. A linear regression calculator and grapher may also be used to verify answers and generate additional practice examples.

Review: Least Squares Linear Regression

If a plot of \( n \) data pairs \( (x, y) \) suggests a linear relationship between \( x \) and \( y \), the least squares method can be used to determine the best-fitting straight line.

The least squares regression line minimizes the sum of the squares of the vertical distances, \( d_1^2 + d_2^2 + \cdots + d_n^2 \), between the observed data points and the line.

Figure 1. Least squares regression minimizing the sum of squared vertical deviations.
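The quantity being minimized can be computed directly for any candidate line. The sketch below is only an illustration: the function name `sum_squared_deviations` and the comparison line \( y = 0.5x + 0.5 \) are assumptions chosen for this example, while the data and the least squares coefficients are those of Problem 1 on this page.

```python
def sum_squared_deviations(points, a, b):
    """Sum of squared vertical distances from the points to the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in points)


# Data set from Problem 1.
points = [(-2, -1), (1, 1), (3, 2)]

# Least squares line found in the solution to Problem 1.
sse_best = sum_squared_deviations(points, 23 / 38, 5 / 19)

# An arbitrary nearby line, chosen only for comparison.
sse_other = sum_squared_deviations(points, 0.5, 0.5)

# The least squares line gives the smaller sum of squared deviations.
print(sse_best < sse_other)  # True
```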

The equation of the least squares regression line is written in slope–intercept form:

\[ y = ax + b \]

where the coefficients \( a \) and \( b \) are given by:

\[ a = \frac{ n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right) \left(\sum_{i=1}^{n} y_i\right) }{ n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 } \]

\[ b = \frac{1}{n} \left( \sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i \right) \]
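The two formulas translate almost term by term into code. The following is a minimal sketch, not a reference implementation: the function name `least_squares_line` is hypothetical, and `fractions.Fraction` is used (an assumption of this example) so that results stay exact and can be compared with hand calculations.

```python
from fractions import Fraction


def least_squares_line(points):
    """Return (a, b) for the least squares line y = a*x + b through the points."""
    n = len(points)
    sum_x = sum(Fraction(x) for x, _ in points)
    sum_y = sum(Fraction(y) for _, y in points)
    sum_xy = sum(Fraction(x) * Fraction(y) for x, y in points)
    sum_x2 = sum(Fraction(x) ** 2 for x, _ in points)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every x value is the same: the slope is undefined.
        raise ValueError("all x values are equal; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_y - a * sum_x) / n
    return a, b


# Points lying exactly on y = 2x + 1 should give back a = 2 and b = 1.
a, b = least_squares_line([(0, 1), (1, 3), (2, 5)])
print(a, b)  # 2 1
```

The check on the denominator covers the one case in which the formula for \( a \) cannot be evaluated.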

Problems

Problem 1

Consider the set of points \[ \{(-2,-1),(1,1),(3,2)\}. \]

a) Find the least squares regression line.
b) Plot the data points and the regression line on the same set of axes.

Problem 2

Consider the data set \[ \{(-1,0),(0,2),(1,4),(2,5)\}. \]

a) Find the least squares regression line.
b) Plot the data points and the regression line.

Problem 3

The following table shows values of \( x \) and their corresponding values of \( y \).

\[
\begin{array}{c|ccccc}
x & 0 & 1 & 2 & 3 & 4 \\
\hline
y & 2 & 3 & 5 & 4 & 6
\end{array}
\]

a) Find the least squares regression line \( y = ax + b \).
b) Estimate the value of \( y \) when \( x = 10 \).

Problem 4

The sales of a company (in millions of dollars) for each year are shown below.

\[
\begin{array}{l|ccccc}
\text{Year} & 2005 & 2006 & 2007 & 2008 & 2009 \\
\hline
\text{Sales} & 12 & 19 & 29 & 37 & 45
\end{array}
\]

a) Find the least squares regression line.
b) Use the model to estimate the company’s sales in 2012.

Solutions to the Above Problems

Solution to Problem 1

\[
\begin{array}{c|c|c|c}
x & y & xy & x^2 \\
\hline
-2 & -1 & 2 & 4 \\
1 & 1 & 1 & 1 \\
3 & 2 & 6 & 9 \\
\hline
\sum x=2 & \sum y=2 & \sum xy=9 & \sum x^2=14
\end{array}
\]

Using the formulas:

\[ a=\frac{3(9)-2(2)}{3(14)-2^2}=\frac{23}{38}, \qquad b=\frac{1}{3}\left(2-\frac{23}{38}\cdot2\right)=\frac{5}{19} \]

The regression line is:

\[ y=\frac{23}{38}x+\frac{5}{19} \]
Figure 2. Linear regression for Problem 1.
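The table of sums and the resulting coefficients for Problem 1 can be reproduced with a few lines of code. This is a sketch for checking the arithmetic only; the variable names are illustrative, and exact fractions are used so that the output matches the hand calculation.

```python
from fractions import Fraction

points = [(-2, -1), (1, 1), (3, 2)]
n = len(points)

# One row per data point: x, y, x*y, x^2 (the columns of the table above).
rows = [(x, y, x * y, x * x) for x, y in points]

# Column totals: sum of x, sum of y, sum of xy, sum of x^2.
sum_x, sum_y, sum_xy, sum_x2 = (sum(column) for column in zip(*rows))

a = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
b = (sum_y - a * sum_x) / n

print(sum_x, sum_y, sum_xy, sum_x2)  # 2 2 9 14
print(a, b)  # 23/38 5/19
```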

Solution to Problem 2

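\[
\begin{array}{c|c|c|c}
x & y & xy & x^2 \\
\hline
-1 & 0 & 0 & 1 \\
0 & 2 & 0 & 0 \\
1 & 4 & 4 & 1 \\
2 & 5 & 10 & 4 \\
\hline
\sum x=2 & \sum y=11 & \sum xy=14 & \sum x^2=6
\end{array}
\]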

Using the formulas:

\[ a=\frac{4(14)-2(11)}{4(6)-2^2}=\frac{17}{10}=1.7, \qquad b=\frac{1}{4}(11-1.7\cdot2)=1.9 \]

The regression line is:

\[ y=1.7x+1.9 \]
Figure 3. Linear regression for Problem 2.
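One way to sanity-check the coefficients \( a = 1.7 \) and \( b = 1.9 \) is to use a standard property of a least squares line with an intercept: its residuals \( y_i - (a x_i + b) \) sum to zero, and so do the products \( x_i \) times residual. The sketch below applies that check to the Problem 2 data; the variable names are illustrative.

```python
from fractions import Fraction

points = [(-1, 0), (0, 2), (1, 4), (2, 5)]

# Coefficients from the solution, written as exact fractions (1.7 and 1.9).
a, b = Fraction(17, 10), Fraction(19, 10)

# Vertical deviation of each data point from the line y = a*x + b.
residuals = [y - (a * x + b) for x, y in points]

sum_residuals = sum(residuals)
sum_x_residuals = sum(x * r for (x, _), r in zip(points, residuals))

# Both totals are zero when a and b are the least squares coefficients.
print(sum_residuals, sum_x_residuals)  # 0 0
```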

Solution to Problem 3

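From the table, \( n=5 \), \( \sum x=10 \), \( \sum y=20 \), \( \sum xy=49 \) and \( \sum x^2=30 \).

a) Using the formulas:

\[ a=\frac{5(49)-10(20)}{5(30)-10^2}=0.9, \qquad b=\frac{1}{5}(20-0.9\cdot10)=2.2 \]

The regression line is:

\[ y=0.9x+2.2 \]

b) Substituting \( x=10 \):

\[ y(10)=0.9(10)+2.2=11.2 \]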
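The result for Problem 3 can be cross-checked with the equivalent mean-centered form of the coefficients, \( a = \sum (x_i-\bar{x})(y_i-\bar{y}) / \sum (x_i-\bar{x})^2 \) and \( b = \bar{y} - a\bar{x} \). This form is not the one used on this page; it is swapped in here as an independent check, and the variable names are illustrative.

```python
from fractions import Fraction

xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]
n = len(xs)

x_mean = Fraction(sum(xs), n)
y_mean = Fraction(sum(ys), n)

# Mean-centered sums of products and of squares.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

a = s_xy / s_xx
b = y_mean - a * x_mean

# Estimate at x = 10, which lies outside the observed range 0..4.
prediction = a * 10 + b

print(float(a), float(b), float(prediction))  # 0.9 2.2 11.2
```

Because \( x = 10 \) is well beyond the largest observed \( x \), the estimate is an extrapolation and depends on the linear trend continuing.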

Solution to Problem 4

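Let \( t=x-2005 \), where \( x \) is the year, so that \( t \) is the number of years after 2005 and takes the values \( 0, 1, 2, 3, 4 \). Then \( n=5 \), \( \sum t=10 \), \( \sum y=142 \), \( \sum ty=368 \) and \( \sum t^2=30 \).

\[ a=\frac{5(368)-10(142)}{5(30)-10^2}=8.4, \qquad b=\frac{1}{5}(142-8.4\cdot10)=11.6 \]

The regression line is:

\[ y=8.4t+11.6 \]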

For 2012, \( t=7 \):

\[ y=8.4(7)+11.6=70.4 \]

The estimated sales in 2012 are 70.4 million dollars.
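The same steps, including the shift from calendar years to years after 2005, can be sketched in code. The helper name `estimate_sales` and the other variable names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

years = [2005, 2006, 2007, 2008, 2009]
sales = [12, 19, 29, 37, 45]  # millions of dollars

# Shift the years so that t = 0 corresponds to 2005.
base_year = years[0]
ts = [year - base_year for year in years]
n = len(ts)

sum_t = sum(ts)
sum_y = sum(sales)
sum_ty = sum(t * y for t, y in zip(ts, sales))
sum_t2 = sum(t * t for t in ts)

a = Fraction(n * sum_ty - sum_t * sum_y, n * sum_t2 - sum_t ** 2)
b = (sum_y - a * sum_t) / n


def estimate_sales(year):
    """Estimated sales (millions of dollars) from the fitted line."""
    return a * (year - base_year) + b


print(float(a), float(b))  # 8.4 11.6
print(float(estimate_sales(2012)))  # 70.4
```

Shifting the years keeps the sums small; fitting against the raw year values describes the same line but with a different intercept.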

More References and Links

  1. Linear Regression Calculator and Grapher
  2. Linear Least Squares Fitting
  3. Elementary Statistics and Probability