4.33 Always plot your data! Table 4.1 presents four sets of data prepared by the statistician Frank
Anscombe to illustrate the dangers of calculating without first plotting the data.12
TABLE 4.1 Four data sets for exploring correlation and regression
Data Set A
x   10     8      13     9      11     14     6      4      12     7      5
y   8.04   6.95   7.58   8.81   8.33   9.96   7.24   4.26   10.84  4.82   5.68
Data Set B
x   10     8      13     9      11     14     6      4      12     7      5
y   9.14   8.14   8.74   8.77   9.26   8.10   6.13   3.10   9.13   7.26   4.74
Data Set C
x   10     8      13     9      11     14     6      4      12     7      5
y   7.46   6.77   12.74  7.11   7.81   8.84   6.08   5.39   8.15   6.42   5.73
Data Set D
x   8      8      8      8      8      8      8      8      8      8      19
y   6.58   5.76   7.71   8.84   8.47   7.04   5.25   5.56   7.91   6.89   12.50
a. Without making scatterplots, find the correlation and the least-squares regression line for all four data
sets. What do you notice? Use the regression line to predict y for x = 10.
b. Make a scatterplot for each of the data sets and add the regression line to each plot.
c. In which of the four cases would you be willing to use the regression line to describe the dependence of
y on x? Explain your answer in each case.
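
The calculations in part (a) could be set up along the following lines. This is only a minimal sketch in plain Python, using the standard formulas for the correlation r and the least-squares slope and intercept. The names `DATA`, `fit`, and `results` are illustrative and do not come from the text. The values are transcribed from Table 4.1 and should be checked against it. The scatterplots in part (b) are not included here.

```python
from math import sqrt

# Values transcribed from Table 4.1 (verify against the table before relying on them).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # Data Sets A, B, and C share these x-values
DATA = {
    "A": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "C": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 19],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 5.56, 7.91, 6.89, 12.50]),
}


def fit(xs, ys):
    """Return (r, intercept, slope) for the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, intercept, slope


results = {name: fit(xs, ys) for name, (xs, ys) in DATA.items()}

for name, (r, intercept, slope) in results.items():
    predicted = intercept + slope * 10  # prediction of y at x = 10
    print(f"Data Set {name}: r = {r:.3f}, "
          f"y-hat = {intercept:.2f} + {slope:.3f}x, "
          f"predicted y at x = 10: {predicted:.2f}")
```

Comparing the four printed lines with one another is the point of part (a). Whether each line is a sensible summary can be judged only from the scatterplots in part (b).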
