Chapter 3: Descriptive Statistics: Numerical Methods

1.  These data are taken from Chapter 1: 28, 31, 22, 26, 23, 27, 24, 28, 21, 30, and 41.  See Table 1, Section 1.1.  (These data values are the ages of 11 individuals in a statistics class.)  Use R to answer the following questions.  What is the median?

  1. 28
  2. 26
  3. 25
  4. 27 X

Solution:

> age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

> median(age)

[1] 27

2.  What is the mode of the set of data used in the preceding exercise?

  1. 27
  2. 30
  3. 28 X
  4. 26

Solution:

> table(age)

age

21 22 23 24 26 27 28 30 31 41

 1  1  1  1  1  1  2  1  1  1
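
Reading the mode off the frequency table works for a small data set like this one. As a minimal sketch, the most frequent value can also be pulled out of the table programmatically. The names `counts` and `modes` are introduced here only for illustration.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# Frequency of each distinct value
counts <- table(age)

# Keep every value whose count equals the highest count (this also handles ties)
modes <- as.numeric(names(counts)[counts == max(counts)])
print(modes)  # 28
```

Because the comparison keeps all values tied for the highest count, a data set with more than one mode would return each of them.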

3.  Find the 28th and 73rd percentiles of the data used in the previous exercise.

  1. 24 and 30 X
  2. 26 and 28
  3. 26 and 30
  4. 25 and 27

Solution:

> quantile(age, type = 2, probs = c(0.28, 0.73))

28% 73%

 24  30
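
The `type = 2` argument selects one of several quantile definitions available in `quantile()`. Under this definition, the position n * p is computed; if it is not a whole number, the next sorted value up is taken, and if it is a whole number, the two neighboring sorted values are averaged. The sketch below reproduces that rule by hand. The helper `type2_quantile` is hypothetical (not part of R), and it uses a plain comparison to detect whole-number positions, which is a simplification of what `quantile()` does internally.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# Hypothetical helper: type-2 quantile for a single probability p with 0 < p < 1
type2_quantile <- function(values, p) {
  x <- sort(values)
  pos <- length(x) * p
  j <- floor(pos)
  if (pos > j) {
    x[j + 1]               # position is not a whole number: round up
  } else {
    (x[j] + x[j + 1]) / 2  # position is a whole number: average the neighbors
  }
}

p28 <- type2_quantile(age, 0.28)  # 11 * 0.28 = 3.08, so the 4th sorted value
p73 <- type2_quantile(age, 0.73)  # 11 * 0.73 = 8.03, so the 9th sorted value
print(c(p28, p73))  # 24 30
```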

4.  What is the range of the data used in the preceding exercise?

  1. 17
  2. 20 X
  3. 13
  4. 22

Solution:

> max(age) - min(age)

[1] 20

5.  What is the interquartile range of the data used above?

  1. 4
  2. 5
  3. 6
  4. 7 X

Solution:

> IQR(age, type = 2)

[1] 7

or

> quantile(age, type = 2, probs = c(0.75)) - quantile(age, type = 2, probs = c(0.25))

75%

  7

6.  What is the variance of the data used in the previous exercises?

  1. 33.34219
  2. 27.68511
  3. 30.85455 X
  4. 24.33333

Solution:

> var(age)

[1] 30.85455
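
Note that `var()` computes the sample variance, which divides the sum of squared deviations by n - 1 rather than n. As a quick check, the same figure can be rebuilt from that definition. The names `squared_dev` and `manual_var` are illustrative only.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)
n <- length(age)

# Squared deviation of each observation from the sample mean
squared_dev <- (age - mean(age))^2

# Sample variance: sum of squared deviations divided by n - 1
manual_var <- sum(squared_dev) / (n - 1)
print(manual_var)

# Compare against the built-in function
print(all.equal(manual_var, var(age)))
```

Taking `sqrt(manual_var)` gives the sample standard deviation asked for in the next exercise.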

7.  Find the standard deviation of the data used above.

  1. 6.155534
  2. 5.554687 X
  3. 6.073421
  4. 5.323245

Solution:

> sd(age)

[1] 5.554687

8.  Find the coefficient of variation for the data used above.

  1. 0.2029952 X
  2. 0.2430017
  3. 0.1871433
  4. 0.1923429

Solution:

> sd(age) / mean(age)

[1] 0.2029952

9.  What is the largest standardized value? 

  1. 3.654648
  2. 3.145634
  3. -2.965606
  4. 2.454929 X

Solution:

> scale(sort(age))           

 [1,] -1.14563369

 [2,] -0.96560554

 [3,] -0.78557739

 [4,] -0.60554924

 [5,] -0.24549293

 [6,] -0.06546478

 [7,]  0.11456337

 [8,]  0.11456337

 [9,]  0.47461967

[10,]  0.65464783

[11,]  2.45492935
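
Rather than scanning the sorted output, the largest standardized value can be requested directly. The sketch below computes the z-scores from their definition, (x - mean) / sd, and takes the maximum. The names `z` and `largest_z` are illustrative only.

```r
age <- c(28, 31, 22, 26, 23, 27, 24, 28, 21, 30, 41)

# Standardize each observation: subtract the mean, divide by the sample sd
z <- (age - mean(age)) / sd(age)

# Largest z-score and the observation it belongs to
largest_z <- max(z)
print(largest_z)
print(age[which.max(z)])  # 41
```

This agrees with `max(scale(age))`, since `scale()` centers on the mean and divides by the sample standard deviation by default.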

10.  The eight-day forecast of high and low temperatures (in degrees Fahrenheit) for New York City for the period from June 8 to June 15, 2017, is: 68 and 56; 82 and 64; 84 and 69; 93 and 70; 91 and 71; 90 and 70; 84 and 64; and 78 and 69.  Create a data frame named “temps” consisting of the two variables (name them “high” and “low,” respectively).  Find the covariance and correlation of the variables “high” and “low.”

  1. 23.15571 and 0.9471028
  2. 40.27653 and 0.8778212
  3. 34.60714 and 0.8407997 X
  4. 46.15235 and 0.9176467

Solution:

> high <- c(68, 82, 84, 93, 91, 90, 84, 78)

> low <- c(56, 64, 69, 70, 71, 70, 64, 69)

> temps <- data.frame(high = high, low = low)

> cov(temps$high, temps$low)

[1] 34.60714

> cor(temps$high, temps$low)

[1] 0.8407997
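
The two answers are linked: the correlation is the covariance divided by the product of the two standard deviations. A short sketch of that relationship, using the same vectors (the name `manual_cor` is illustrative only):

```r
high <- c(68, 82, 84, 93, 91, 90, 84, 78)
low <- c(56, 64, 69, 70, 71, 70, 64, 69)

# Correlation rebuilt from the covariance and the two sample standard deviations
manual_cor <- cov(high, low) / (sd(high) * sd(low))
print(manual_cor)

# Compare against the built-in function
print(all.equal(manual_cor, cor(high, low)))
```

Because the correlation is unit-free, it would stay the same if the temperatures were converted to degrees Celsius, whereas the covariance would change.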