1. This is a
a) 2 x 2 table. b) 2 x 5 table. c) 5 x 2 table. d) 5 x 5 table
2. The appropriate null hypothesis for this data is that
a) the distribution of parents’ responses on this question is the same for male and female infants. b) the distribution of gender is the same for each parents’ response to this question. c) gender and parents’ responses on this question are independent. d) gender and parents’ responses on this question are dependent
3. A study to compare two types of infant formula was run at two sites, one in Atlanta and the second in Denver. The study was run over a three-week period. Subjects at both sites were classified as dropouts if they left the study before the conclusion, or completers if they finished the study. The following table gives the number of dropouts and completers at each site. A chi-square test was performed and the result was $X^2$ = 5.101 with p-value = 0.024.
          Responder
Site      Dropout   Completer
Atlanta   16        134
Denver    21        379
The correct conclusion is a) we found evidence to suggest that Atlanta had a greater dropout rate. b) we found evidence to suggest that Denver had a greater dropout rate. c) any differences can be explained by sampling variability. d) there is no association between responder and dropout rate.
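For a 2 x 2 table such as the one in Question 3, the reported chi-square statistic can be checked by hand. The sketch below is a minimal illustration in plain Python. The helper name `chi_square_2x2` is hypothetical, and the code assumes the test was run without a continuity correction, which the source does not state.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Counts from the table: (dropout, completer) at each site.
atlanta = (16, 134)
denver = (21, 379)

x2 = chi_square_2x2(*atlanta, *denver)
atlanta_rate = atlanta[0] / sum(atlanta)   # observed dropout proportion, Atlanta
denver_rate = denver[0] / sum(denver)      # observed dropout proportion, Denver

print(f"X^2 = {x2:.3f}")
print(f"dropout rate: Atlanta {atlanta_rate:.3f}, Denver {denver_rate:.3f}")
```

If the no-correction assumption holds, the printed statistic should agree with the 5.101 quoted in the question. The two observed dropout proportions show the direction of the difference.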
For Questions 4 – 6 A study was performed to examine the personal goals of children in grades 4, 5, and 6. A random sample of students was selected for each of the grades from schools in Georgia. The students received a questionnaire regarding personal goals. They were asked what they would most like to do at school: make good grades, be popular, or be good at sports. Results are presented in the table below by the sex of the child.
        Make good grades   Be popular   Be good in sports
Boys    96                 32           94
Girls   295                45           40
4. The proportion of boys who chose the goal “be good in sports” and the proportion of girls who chose the goal “be good in sports” are
a) proportion of boys = 0.42, proportion of girls = 0.07. b) proportion of boys = 0.70, proportion of girls = 0.30. c) proportion of boys = 0.16, proportion of girls = 0.07. d) proportion of boys = 0.42, proportion of girls = 0.11.
5. Suppose we wish to test the null hypothesis that there is no difference between the proportions of boys and girls choosing each of the three personal goals. Under the null hypothesis, the expected number of boys that would select “be good in sports” is
a) 49.4 b) 67 c) 74 d) 33
6. Suppose we wish to test the null hypothesis that there is no difference between the proportions of boys and girls choosing each of the three personal goals. The value of the chi-square statistic $X^2$ is
a) 0.2893 b) 1.2644 c) 90.0266 d) 45.4335
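The expected counts and the chi-square statistic for a two-way table like the one used in Questions 4 – 6 follow the usual rule: row total times column total divided by the grand total. The sketch below is a minimal plain-Python illustration. The function names `expected_counts` and `chi_square` are hypothetical helpers, not output from any particular software package.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: boys, girls. Columns: make good grades, be popular, be good in sports.
goals = [[96, 32, 94],
         [295, 45, 40]]

expected = expected_counts(goals)
x2 = chi_square(goals)
df = (len(goals) - 1) * (len(goals[0]) - 1)   # (rows - 1) * (columns - 1)

print(f"expected boys choosing sports: {expected[0][2]:.1f}")
print(f"X^2 = {x2:.2f} on {df} degrees of freedom")
```

Rounding the expected counts at intermediate steps of a hand calculation can shift the statistic slightly, so small differences from a printed answer are to be expected.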
For Questions 7 – 13 At what age do babies learn to crawl? Does it take longer to learn in the winter when babies are often bundled in clothes that restrict their movement? Data were collected at the University of Denver Infant Study Center where parents and their babies participated in one of a number of experiments between 1988 and 1991. Parents reported the age (in weeks) at which their child was first able to creep or crawl a distance of four feet within one minute. The researchers also recorded the average outdoor temperature (in °F) six months after each baby’s birthdate. For each month of the year, the researchers selected one baby, at random, born in that month. If we fit the least-squares line to the 12 data points (one for each month) we obtain the following results from a software package. Notice that temperature is taken as the explanatory variable and crawling age as the response.
s = 1.319
Variable Parameter estimate Standard error of estimate
Intercept 35.6781 1.318
Temperature -0.077739 0.0251
Here is a scatterplot of average crawling age versus average outdoor temperature six months after birth followed by a plot of the residuals versus average outdoor temperature six months after birth.
Suppose the researchers test the hypotheses
Ho: the slope of the least-squares regression line = 0
Ha: the slope of the least-squares regression line ≠ 0.
7. The explanatory variable in this study is
a) crawling age
b) the age (in weeks) at which a baby was first able to creep or crawl a distance of four feet within one minute.
c) the extent to which parents honestly reported the age at which their baby was first able to crawl and didn’t exaggerate in order to make their baby appear gifted.
d) the average outdoor temperature six months after a baby’s birthdate
8. The slope of the least-squares regression line is (approximately)
a) 35.68. b) 1.32. c) -0.08. d) -0.80
9. The quantity s = 1.319 is an estimate of the standard deviation of the deviations in the simple linear regression model. The degrees of freedom for $s^2$ are
a) 1.74. b) 10. c) 11. d) 12
10. The value of the t statistic for this test is
a) -0.06. b) -3.10. c) 27.07. d) 3.10
11. Which of the following statements is supported by these plots?
a) There is no striking evidence in these plots that the assumptions for regression are violated. b) There is evidence in these plots that the assumptions for regression are violated. c) There is an influential observation in the plot, which should be deleted. d) There is an outlier in the plots suggesting that our above results must be interpreted with caution.
12. A 90% confidence interval for the slope of the least-squares regression line is (approximately)
a) -0.078 ± 0.041. b) -0.078 ± 0.045. c) -0.078 ± 0.056. d) -0.078 ± 0.059.
13. Suppose we wish to determine the mean crawling age for all babies born when the average outdoor temperature is 25°F six months after birth. We use computer software to do the prediction and obtain the following output.
Temp. Predict Stdev. Mean Predict
25° F 33.735 0.739
95% C.I. for Mean Predict 95% Predict Interval
(32.087, 35.382) (30.364, 37.105)
A 95% interval for mean crawling age is a) (32.087, 35.382). b) (30.364, 37.105). c) 33.735 ± 0.739. d) 33.735 ± 6.741
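The slope inference in Questions 8 – 13 rests on a few pieces of arithmetic: the t statistic is the estimate divided by its standard error, and the confidence interval is the estimate plus or minus a t critical value times that standard error. The sketch below uses only numbers from the regression output above. The critical value 1.812 is an assumption taken from a standard t table for 10 degrees of freedom, not something printed in the output.

```python
# Values copied from the regression output above.
n = 12                    # one baby per birth month
intercept = 35.6781
slope = -0.077739
se_slope = 0.0251

df = n - 2                        # n minus the two estimated coefficients
t_stat = slope / se_slope         # tests Ho: slope = 0

# Assumption: 1.812 is the upper 5% critical value of t with 10 df, read from
# a standard t table; it gives a two-sided 90% interval.
T_CRIT_90 = 1.812
margin = T_CRIT_90 * se_slope

# Point prediction from the fitted line at 25 degrees F.
predicted_age = intercept + slope * 25

print(f"df = {df}, t = {t_stat:.2f}")
print(f"90% CI for slope: {slope:.3f} +/- {margin:.3f}")
print(f"predicted crawling age at 25 F: {predicted_age:.3f} weeks")
```

The point prediction matches the "Predict" value in the software output. The interval for the mean response is narrower than the prediction interval for a single baby, because the latter also includes the variation of an individual observation.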
For Questions 14 – 16 Is there a relationship between brain size and intelligence? The Full Scale IQ scores (FSIQ) and brain sizes (in pixels, as measured by MRI scans) of 39 subjects were measured. Researchers wished to study the relationship between FSIQ and brain size, using brain size to predict FSIQ. However, the researchers believed that brain size is also dependent on body size and that some adjustment for body size might be necessary in order to understand the relation between brain size and intelligence. Therefore, the researchers also measured the heights (in inches) of the 39 subjects and used height as a measure of body size. They then used a multiple regression model to predict FSIQ from brain size and height. They obtained the following results.
Analysis of Variance
Source df Sum of squares Mean Squares F
Model 2 5861.7
Error 36 15805.2
Total 38 21666.9
Variable Parameter estimate Standard error
Intercept 117.22 59.09
Brain size 0.00020957 0.00005816
Height -2.824 1.065
s = 20.95, $R^2$ = 0.271
The researchers assume that the statistical model for the relation between FSIQ, brain size, and height is the multiple linear regression model
$\mathrm{FSIQ}_i = \beta_0 + \beta_1 (\text{brain size})_i + \beta_2 (\text{height})_i + \varepsilon_i$
for i = 1, 2, …, 39. The deviations $\varepsilon_i$ are assumed to be independent and normally distributed with mean 0 and standard deviation σ.
Complete the above ANOVA Table and then answer questions 14 – 16.
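Completing the table uses three relations: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the model mean square to the error mean square, and $R^2$ is the model sum of squares over the total. The sketch below applies them to the entries given above; the variable names are illustrative only.

```python
import math

# Entries given in the ANOVA table above.
ss_model, df_model = 5861.7, 2
ss_error, df_error = 15805.2, 36
ss_total = 21666.9

ms_model = ss_model / df_model      # mean square = sum of squares / df
ms_error = ss_error / df_error
f_stat = ms_model / ms_error        # F statistic for Ho: both slopes are zero
r_squared = ss_model / ss_total     # share of FSIQ variation explained by the model
s = math.sqrt(ms_error)             # estimate of sigma

print(f"MS(model) = {ms_model:.2f}, MS(error) = {ms_error:.2f}")
print(f"F = {f_stat:.2f} on ({df_model}, {df_error}) df")
print(f"R^2 = {r_squared:.3f}, s = {s:.2f}")
```

The computed s and $R^2$ should reproduce the values printed with the output, which is a quick check that the table has been filled in consistently.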
14. One of the subjects had FSIQ = 130, brain size = 866,662, and height = 66.5. The residual for this subject is
a) 18.95. b) 20.95. c) 111.05. d) 5.35.
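A residual is the observed response minus the value predicted by the fitted equation. The sketch below plugs the subject's brain size and height into the estimated coefficients from the table above. The function name `fitted_fsiq` is a hypothetical helper introduced only for this illustration.

```python
# Coefficients from the parameter-estimate table above.
b0, b_brain, b_height = 117.22, 0.00020957, -2.824

def fitted_fsiq(brain_size, height):
    """Predicted FSIQ from the fitted multiple regression equation."""
    return b0 + b_brain * brain_size + b_height * height

observed = 130
fitted = fitted_fsiq(brain_size=866_662, height=66.5)
residual = observed - fitted        # residual = observed - predicted

print(f"fitted = {fitted:.2f}, residual = {residual:.2f}")
```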
15. The p-value of the analysis of variance F test of the hypothesis Ho: $\beta_1 = \beta_2 = 0$ is
a) less than 0.01. b) between 0.01 and 0.05. c) between 0.05 and 0.10. d) greater than 0.10.
16. Based on the above results, we may conclude that
a) the proportion of the variation of FSIQ that is explained by brain size in a multiple linear regression is 0.271.
b) the proportion of the variation of FSIQ that is explained by the variables brain size and height in a multiple linear regression is 0.271.
c) the proportion of the variation of FSIQ that is explained by the variables brain size and height in a multiple linear regression is 0.52.
d) None of the above
For Questions 17 – 20 A manufacturer of infant formula is running an experiment using the standard (control) formula, and two new formulas, A and B. The goal is to boost the immune system in infants. The 120 infants in the study are randomly assigned to one of three groups: group A, group B, and a control group. There are 40 infants per group, and the study is run for 12 weeks. At the end of the study the variable measured is total IGA (in mg per dl), with higher values being more desirable. We are going to run a one-way ANOVA on these data. A partial ANOVA table is given below.
One-Way Analysis of Variance
Analysis of Variance
Source DF SS MS F p
17. Complete the ANOVA table above. What is the value of the F-statistic? a) 5.469 b) 0.002 c) 0.118 d) 4.679
18. The hypotheses tested by the one-way ANOVA F test are
a) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is higher for both treatment groups A and B than the control group.
b) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is higher for at least one of the two treatment groups than the control group.
c) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is not the same for all three formulas.
d) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is lower for at least one of the two treatment groups than the control group.
19. The mean square for groups in this table is a) 0.00786. b) 0.01179. c) 0.02357. d) 0.31841
20. The p-value for testing
Ho: the mean IGA score is the same for all three formulas
Ha: the mean IGA score is not the same for all three formulas
a) is less than 0.001. b) is between 0.001 and 0.025. c) is between 0.05 and 0.1. d) is greater than 0.10.
Extra Credit
Use the following to answer questions 1–5: A study was conducted to monitor the emissions of a noxious substance from a chemical plant and the concentration of the chemical at a location in close proximity to the plant at various times throughout the year. A total of 14 measurements were made. Computer output for the simple linear regression least-squares fit is provided (some entries have been omitted and replaced with *****):
Linear Fit
Concentration = 1.5429211 + 1.8247687 Emissions

Summary of Fit
RSquare                      0.793919
RSquare Adj                  0.776745
Root Mean Square Error       1.513979
Mean of Response             8.810714
Observations (or Sum Wgts)   14

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      **   105.96390        *********     46.229    <.0001
Error      **   ***********      *********
C. Total   **   133.46949

Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   1.5429211   1.142937    ****      0.2019
Emissions   1.8247687   0.268379    ****      <.0001
1. The degrees of freedom for SSM and SSE are, respectively:
a) DFM = 2, DFE = 12. b) DFM = 1, DFE = 12. c) DFM = 1, DFE = 13. d) DFM = 1, DFE = 14.
2. What is the value for the SSE?
a) 27.50559 b) 10.26688 c) 1.142937 d) 2.292
3. What is the estimate of $\sigma^2$?
a) 1.514 b) 1.143 c) 0.794 d) 2.292
4. What is the test statistic and its value to test Ho: $\beta_1 = 0$ against Ha: $\beta_1 \neq 0$?
a) F = 46.2294 b) t = 6.80 c) t = 1.35 d) Either a) or b).
5. What is the 95% confidence interval estimate for $\beta_0$?
a) (1.24, 2.41) b) (-0.49, 3.58) c) (-0.95, 4.03) d) (1.35, 2.39)
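The omitted entries in the output can be recovered from the ones that are shown: the sums of squares add up to the total, the error degrees of freedom are n - 2 in simple linear regression, and each t ratio is an estimate divided by its standard error. The sketch below works through that arithmetic. The critical value 2.179 is an assumption taken from a standard t table for 12 degrees of freedom, not a value printed in the output.

```python
import math

# Values visible in the computer output above.
n = 14
ss_model, ss_total = 105.96390, 133.46949
b0, se_b0 = 1.5429211, 1.142937
b1, se_b1 = 1.8247687, 0.268379

df_model = 1                       # one explanatory variable
df_error = n - 2
ss_error = ss_total - ss_model     # sums of squares add up to the total
ms_error = ss_error / df_error     # estimate of sigma^2
root_mse = math.sqrt(ms_error)     # should match "Root Mean Square Error"
f_ratio = (ss_model / df_model) / ms_error

t_b0 = b0 / se_b0
t_b1 = b1 / se_b1                  # in simple regression, t_b1 squared equals F

# Assumption: 2.179 is the upper 2.5% critical value of t with 12 df,
# read from a standard t table.
T_CRIT_95 = 2.179
ci_b0 = (b0 - T_CRIT_95 * se_b0, b0 + T_CRIT_95 * se_b0)

print(f"DFM = {df_model}, DFE = {df_error}, SSE = {ss_error:.5f}, MSE = {ms_error:.3f}")
print(f"F = {f_ratio:.3f}, t(slope) = {t_b1:.2f}, t(intercept) = {t_b0:.2f}")
print(f"95% CI for intercept: ({ci_b0[0]:.2f}, {ci_b0[1]:.2f})")
```

Two internal checks are available: the square root of the mean square error should reproduce the Root Mean Square Error line, and the squared t ratio for the slope should reproduce the F ratio.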
6. What is the goal of statistics? a) Creating sampling distributions of the sample mean that are approximately normal b) Maximize systematic variance, minimize error variance. c) A density curve has an area underneath it of 1 d) P-values are more informative than the reject-or-not result of a fixed level α test.