# Statistics with R

## Student Resources

# Chapter 3: Descriptive Statistics: Numerical Methods

1. These data are taken from Chapter 1: 28, 31, 22, 26, 23, 27, 24, 28, 21, 30, and 41. See Table 1, Section 1.1. (These data values are the ages of 11 individuals in a statistics class.) Use R to answer the following questions. What is the median?

- 28
- 26
- 25
- 27 X

**Solution:**

> age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

> median(age)

[1] 27

2. What is the mode of the set of data used in the preceding exercise?

- 27
- 30
- 28 X
- 26

**Solution:**

> table(age)

age

21 22 23 24 26 27 28 30 31 41

1 1 1 1 1 1 2 1 1 1
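Base R has no built-in function for the statistical mode (`mode()` reports an object's storage type instead), so the answer above is read off the frequency table by eye. A minimal sketch of extracting it programmatically follows; the names `freq` and `modes` are illustrative and do not appear in the original solution.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# Frequency table of the distinct ages
freq <- table(age)

# Keep every value whose count equals the highest count;
# this returns all modes if the data happen to be multimodal
modes <- as.numeric(names(freq)[freq == max(freq)])
modes
```

For these data the only value that occurs more than once is 28, so `modes` holds a single element.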

3. Find the 28th and 73rd percentiles of the data used in the previous exercise.

- 24 and 30 X
- 26 and 28
- 26 and 30
- 25 and 27

**Solution:**

> quantile(age, type = 2, probs = c(0.28, 0.73))

28% 73%

24 30
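To see where these values come from, the `type = 2` rule can be reproduced by hand. This is a sketch under the assumption that neither `n * p` is a whole number, which holds here; in that case the percentile is the sorted observation at position `n * p` rounded up. (When `n * p` is whole, `type = 2` averages two neighbouring observations, a case this sketch does not handle.) The names `sorted`, `pos`, and `idx` are illustrative.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

sorted <- sort(age)
n <- length(sorted)
probs <- c(0.28, 0.73)

# Positions in the sorted data: 11 * 0.28 = 3.08 and 11 * 0.73 = 8.03
pos <- n * probs

# Neither position is a whole number, so round each one up
idx <- ceiling(pos)

# The 4th and 9th sorted observations are the two percentiles
sorted[idx]
```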

4. What is the range of the data used in the preceding exercise?

- 17
- 20 X
- 13
- 22

**Solution:**

> max(age) - min(age)

[1] 20
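Note that R's `range()` function returns the smallest and largest values rather than their difference. A short alternative sketch that builds the range from it:

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# range() gives the minimum and maximum as a two-element vector
range(age)

# diff() subtracts the first element from the second: 41 - 21
diff(range(age))
```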

5. What is the interquartile range of the data used above?

- 4
- 5
- 6
- 7 X

**Solution:**

> IQR(age, type = 2)

[1] 7

or

> quantile(age, type = 2, probs = c(0.75)) - quantile(age, type = 2, probs = c(0.25))

75%

7
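The `type = 2` argument matters for this answer. If it is omitted, `IQR()` and `quantile()` fall back to R's default, `type = 7`, which interpolates between neighbouring observations and gives a different result for these data. A brief comparison sketch (the names `iqr_type2` and `iqr_default` are illustrative):

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# Quartiles as used in the solution above
iqr_type2 <- IQR(age, type = 2)

# Default method (type = 7) interpolates between order statistics
iqr_default <- IQR(age)

c(type2 = iqr_type2, default = iqr_default)
```

With the default method the quartiles are 23.5 and 29, so the interquartile range comes out as 5.5 rather than 7.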

6. What is the variance of the data used in the previous exercises?

- 33.34219
- 27.68511
- 30.85455 X
- 24.33333

**Solution:**

> var(age)

[1] 30.85455
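`var()` computes the sample variance, that is, it divides the sum of squared deviations by `n - 1` rather than `n`. A minimal sketch that reproduces the value step by step (the names `dev` and `manual_var` are illustrative):

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

n <- length(age)

# Deviation of each observation from the sample mean
dev <- age - mean(age)

# Sample variance: sum of squared deviations over n - 1
manual_var <- sum(dev^2) / (n - 1)
manual_var

# Its square root is the sample standard deviation
sqrt(manual_var)
```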

7. Find the standard deviation of the data used above.

- 6.155534
- 5.554687 X
- 6.073421
- 5.323245

**Solution:**

> sd(age)

[1] 5.554687

8. Find the coefficient of variation for the data used above.

- 0.2029952 X
- 0.2430017
- 0.1871433
- 0.1923429

**Solution:**

> sd(age) / mean(age)

[1] 0.2029952

9. What is the largest standardized value?

- 3.654648
- 3.145634
- -2.965606
- 2.454929 X

**Solution:**

> scale(sort(age))

[1,] -1.14563369

[2,] -0.96560554

[3,] -0.78557739

[4,] -0.60554924

[5,] -0.24549293

[6,] -0.06546478

[7,] 0.11456337

[8,] 0.11456337

[9,] 0.47461967

[10,] 0.65464783

[11,] 2.45492935
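Rather than scanning the printed column, the largest standardized value can be pulled out directly. The sketch below computes the z-scores by hand, which is what `scale()` does by default for a single numeric vector: it subtracts the mean and divides by the sample standard deviation. The name `z` is illustrative.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# z-score: distance from the mean in units of the standard deviation
z <- (age - mean(age)) / sd(age)

# Largest standardized value
max(z)

# The observation it belongs to
age[which.max(z)]
```

The largest z-score belongs to the 41-year-old, the observation farthest above the mean.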

10. The eight-day forecast of high and low temperatures (in degrees Fahrenheit) for New York City for the period from June 8 to June 15, 2017 is: 68 and 56; 82 and 64; 84 and 69; 93 and 70; 91 and 71; 90 and 70; 84 and 64; and 78 and 69. Create a data frame consisting of the two variables (name them “high” and “low,” respectively), and name the data frame “temps.” Find the covariance and correlation of variables “high” and “low.”

- 23.15571 and 0.9471028
- 40.27653 and 0.8778212
- 34.60714 and 0.8407997 X
- 46.15235 and 0.9176467

**Solution:**

> high <- c(68, 82, 84, 93, 91, 90, 84, 78)

> low <- c(56, 64, 69, 70, 71, 70, 64, 69)

> temps <- data.frame(high = high, low = low)

> cov(temps$high, temps$low)

[1] 34.60714

> cor(temps$high, temps$low)

[1] 0.8407997
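The two answers are linked: the correlation is the covariance divided by the product of the two standard deviations. A short sketch that checks this relationship on the same vectors (the name `r_manual` is illustrative):

```r
high <- c(68, 82, 84, 93, 91, 90, 84, 78)
low  <- c(56, 64, 69, 70, 71, 70, 64, 69)

# Correlation rebuilt from the covariance and the standard deviations
r_manual <- cov(high, low) / (sd(high) * sd(low))
r_manual

# Built-in Pearson correlation for comparison
cor(high, low)
```

Because the correlation is unit-free, it would stay the same if the temperatures were converted to Celsius, whereas the covariance would change.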
