# Chapter 12: Simple Linear Regression

1.  Using the Cars93 data (see the exercises at the end of Chapter 2 for more information about Cars93, if necessary), suppose we want to investigate whether two variables---MPG.city and Horsepower---are related.  As a first step, what is the correlation of these two variables?

1. -0.7398998
2. -0.6726362 X
3. -0.5246562
4. -0.3699499

Solution:

> cor(Cars93$MPG.city, Cars93$Horsepower)

[1] -0.6726362
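
The solutions in this chapter take for granted that Cars93 is already loaded in the session. A minimal sketch of getting to that point, assuming the data frame is the one distributed with the MASS package (Chapter 2 describes the source this text actually uses):

```r
# Assumption: Cars93 is the data frame shipped with the MASS package.
library(MASS)

# Pearson correlation; the order of the two arguments does not matter.
r <- cor(Cars93$MPG.city, Cars93$Horsepower)
r

# cor.test() reports the same estimate alongside a test of zero correlation.
ct <- cor.test(Cars93$MPG.city, Cars93$Horsepower)
ct$estimate
```

The negative sign says that cars with more horsepower tend to get fewer city miles per gallon.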

2. Letting MPG.city be the dependent variable and Horsepower the independent variable, find the total sum of squares, SSy, for the estimated regression equation.

1. 3777.24
2. 2905.57 X
3. 5520.58
4. 2121.07

Solution:

> sum((Cars93$MPG.city - mean(Cars93$MPG.city))^2)

[1] 2905.57

3.  Referring to the preceding exercise, find the regression sum of squares, SSreg.

1. 1314.594 X
2. 1472.346
3. 1077.967
4. 1262.011

Solution:

> slr <- lm(MPG.city ~ Horsepower, data = Cars93)

> sum((predict(slr) - mean(Cars93$MPG.city))^2)

[1] 1314.594

4.  What is the coefficient of determination, r2?

1. 0.5655492
2. 0.3890979
3. 0.5003980
4. 0.4524394 X

Solution:

> sum((predict(slr) - mean(Cars93$MPG.city))^2) / sum((Cars93$MPG.city - mean(Cars93$MPG.city))^2)

[1] 0.4524394
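
Exercises 2 through 4 fit together: the total sum of squares splits into a regression part and a residual part, and in simple linear regression r2 equals the squared correlation from exercise 1. The sketch below checks both identities, assuming Cars93 comes from the MASS package. The names ss_y, ss_reg, and ss_res are illustrative, not from the text.

```r
library(MASS)  # assumption: source of Cars93

slr <- lm(MPG.city ~ Horsepower, data = Cars93)

ss_y   <- sum((Cars93$MPG.city - mean(Cars93$MPG.city))^2)  # total
ss_reg <- sum((predict(slr) - mean(Cars93$MPG.city))^2)     # regression
ss_res <- sum(residuals(slr)^2)                             # residual

# The decomposition SSy = SSreg + SSres holds up to floating-point error.
all.equal(ss_y, ss_reg + ss_res)

# Three routes to the same coefficient of determination.
r2_ratio <- ss_reg / ss_y
r2_cor   <- cor(Cars93$MPG.city, Cars93$Horsepower)^2
r2_lm    <- summary(slr)$r.squared
c(r2_ratio, r2_cor, r2_lm)
```

All three values should agree, which makes summary(slr)$r.squared a convenient check on the hand computation.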

5.  What is the estimated regression coefficient b1?

1. -0.10826150
2. -0.05413075
3. -0.07217434 X
4. -0.13532691

Solution:

> b1 <- sum((Cars93$Horsepower - mean(Cars93$Horsepower)) * (Cars93$MPG.city - mean(Cars93$MPG.city))) / sum((Cars93$Horsepower - mean(Cars93$Horsepower))^2)

> b1

[1] -0.07217434

6.  What is the estimated intercept term b0?

1. 32.7462 X
2. 47.4821
3. 20.6302
4. 15.4726

Solution:

> b0 <- mean(Cars93$MPG.city) - b1 * mean(Cars93$Horsepower)

> b0

[1] 32.7462
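
The hand-computed b1 and b0 from exercises 5 and 6 can be checked against the coefficients that lm() estimates. A sketch, assuming Cars93 comes from the MASS package; it uses cov(x, y) / var(x) for the slope, which is the same ratio of sums as in exercise 5 because the n - 1 divisors cancel:

```r
library(MASS)  # assumption: source of Cars93

x <- Cars93$Horsepower
y <- Cars93$MPG.city

# Slope and intercept from the least-squares formulas.
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# The same two numbers as estimated by lm(): intercept first, then slope.
slr <- lm(MPG.city ~ Horsepower, data = Cars93)
coef(slr)

all.equal(unname(coef(slr)), c(b0, b1))
```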

7.  Which is the estimated regression equation?

1. ŷ = 3.27462 - 0.07217x
2. ŷ = 32.7462 - 0.07217x  X
3. ŷ = 0.07217 - 32.7462x
4. ŷ = 32.7462 - 0.72174x

Solution:

Substitute b0 = 32.7462 and b1 = -0.07217 into the regression equation to obtain

ŷ = 32.7462 - 0.07217x
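
Once the equation is known, it can be used to predict city mileage for a given horsepower. The sketch below compares plugging a value into the equation with calling predict(). The horsepower value of 100 is an arbitrary illustration, and Cars93 is assumed to come from the MASS package.

```r
library(MASS)  # assumption: source of Cars93

slr <- lm(MPG.city ~ Horsepower, data = Cars93)
b <- unname(coef(slr))  # b[1] is the intercept b0, b[2] is the slope b1

hp <- 100  # hypothetical horsepower value, chosen only for illustration

# Prediction from the equation y-hat = b0 + b1 * x ...
by_hand <- b[1] + b[2] * hp

# ... and the same prediction from predict().
by_predict <- unname(predict(slr, newdata = data.frame(Horsepower = hp)))

c(by_hand, by_predict)
```

With the rounded coefficients from this exercise, the prediction is 32.7462 - 0.07217 * 100, or about 25.5 miles per gallon.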

8.  What is the standard error of the regression coefficient b1?

1. 0.008323 X
2. 0.166466
3. 0.058263
4. 0.004161

Solution:

> syx <- sqrt(sum((Cars93$MPG.city - predict(slr))^2 / (nrow(Cars93) - 2)))

> ssx <- sqrt(sum((Cars93$Horsepower - mean(Cars93$Horsepower))^2))

> std_error_b1 <- syx / ssx

> std_error_b1

[1] 0.008323347

9.  What is the t statistic associated with the regression coefficient b1?

1. -4.7692
2. -11.532
3. -8.6713 X
4. -6.9370

Solution:

Dividing b1 (see exercise 5) by std_error_b1 (see exercise 8) gives the t statistic:

> t <- b1 / std_error_b1

> t

[1] -8.671312

10.   What is the p-value associated with the regression coefficient b1?

1. 0.0000000000001536838 X
2. 0.0000000000192104814
3. 0.0000000000000737682
4. 0.0000000000115262967

Solution:

> pvalue <- 2 * pt(-8.671312, 91)

> pvalue

[1] 0.0000000000001536838
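
Exercises 8 through 10 rebuild, one step at a time, a single row of the coefficient table that summary() reports for a fitted model. A sketch of that cross-check, assuming Cars93 comes from the MASS package; the degrees of freedom, 91, are nrow(Cars93) - 2. The names se_b1, t_b1, p_b1, and dof are illustrative.

```r
library(MASS)  # assumption: source of Cars93

slr <- lm(MPG.city ~ Horsepower, data = Cars93)

# Coefficient table: estimate, standard error, t value, and p-value per term.
coefs <- summary(slr)$coefficients
coefs["Horsepower", ]

se_b1 <- coefs["Horsepower", "Std. Error"]
t_b1  <- coefs["Horsepower", "t value"]
p_b1  <- coefs["Horsepower", "Pr(>|t|)"]

# Recompute the two-sided p-value from the t statistic and the
# residual degrees of freedom (n - 2 for simple linear regression).
dof <- df.residual(slr)
p_manual <- 2 * pt(-abs(t_b1), dof)

all.equal(p_b1, p_manual)
```

Reading the row for Horsepower from this table is the usual shortcut in practice; the hand computations show where each entry comes from.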