1. MULTIPLE REGRESSION. Use the following data set for this question and the next 5 questions.
Y X1 X2 X3 X4
SALES ADV BONUS MKTSHR COMPET
963.50 374.27 230.98 33 202.22
893.00 408.50 236.28 29 252.77
1,057.25 414.31 271.57 34 293.22
1,183.25 448.42 291.20 24 202.22
1,419.50 517.88 282.17 32 303.33
1,547.75 637.60 321.16 29 353.88
1,580.00 635.72 294.32 28 374.11
1,071.50 446.86 305.69 31 404.44
1,078.25 489.59 238.41 20 394.33
1,122.50 500.56 271.38 30 303.33
1,304.75 484.18 332.64 25 333.66
1,552.25 618.07 261.80 34 353.88
1,040.00 453.39 235.63 42 262.88
1,045.25 440.86 249.68 28 333.66
1,102.25 487.79 232.99 28 232.55
1,225.25 537.67 272.20 30 273.00
1,508.00 612.21 266.64 29 323.55
1,564.25 601.46 277.44 32 404.44
1,634.75 585.10 312.35 36 283.11
1,159.25 524.56 292.87 34 222.44
1,202.75 535.17 268.27 31 283.11
1,294.25 486.03 309.85 32 242.66
1,467.50 540.17 291.03 28 333.66
1,583.75 583.85 289.29 27 313.44
1,124.75 499.15 272.55 26 374.11
a. Write the first-order linear model relating mean sales, i.e. E(Y), to the specified explanatory variables.
b. Now fit the model to the dataset (i.e. run the regression). [I] What proportion of the total variation in sales is explained by the regression? [II] What is the name of the statistic used to obtain this information?
2. [a] Please state only the null hypothesis for the overall model utility. [b] What test of significance is used to test this hypothesis? [c] What is the numerical value of this test statistic? [d] What is your conclusion, and the basis for your conclusion? Please express your answers CLEARLY and CAREFULLY in order to earn full credit.
3. If the model is statistically significant, what level of sales is predicted when X1 = $500, X2 = $200, X3 = 38, and X4 = 205?
4. What is the measure of unexplained variation in the regression study? What is its value?
5. Is there sufficient evidence to indicate that Sales is positively related to ADV, once Bonus, Mkt share, and Compet are accounted for? Test using α = 0.05. To answer this question, please [a] state your null hypothesis, [b] state the value of the coefficient, [c] show the value of the test statistic and the corresponding p-value, and then [d] state your conclusion.
6. Test the hypothesis H0: β4 = 0 against HA: β4 ≠ 0. Please give your conclusion.
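The quantities asked for in questions 1 to 6 can be reproduced with any regression package. The block below is only a minimal sketch in plain Python, assuming the table above has been transcribed correctly and that the first-order model is E(Y) = β0 + β1X1 + β2X2 + β3X3 + β4X4. The helper `invert` and all variable names are illustrative, not part of the assignment. It reports the coefficients, R², the overall F statistic, MSE, the t ratios, and the prediction for question 3; p-values still have to be read from t and F tables or from statistical software.

```python
# Data transcribed from the table above: (SALES, ADV, BONUS, MKTSHR, COMPET).
rows = [
    (963.50, 374.27, 230.98, 33, 202.22), (893.00, 408.50, 236.28, 29, 252.77),
    (1057.25, 414.31, 271.57, 34, 293.22), (1183.25, 448.42, 291.20, 24, 202.22),
    (1419.50, 517.88, 282.17, 32, 303.33), (1547.75, 637.60, 321.16, 29, 353.88),
    (1580.00, 635.72, 294.32, 28, 374.11), (1071.50, 446.86, 305.69, 31, 404.44),
    (1078.25, 489.59, 238.41, 20, 394.33), (1122.50, 500.56, 271.38, 30, 303.33),
    (1304.75, 484.18, 332.64, 25, 333.66), (1552.25, 618.07, 261.80, 34, 353.88),
    (1040.00, 453.39, 235.63, 42, 262.88), (1045.25, 440.86, 249.68, 28, 333.66),
    (1102.25, 487.79, 232.99, 28, 232.55), (1225.25, 537.67, 272.20, 30, 273.00),
    (1508.00, 612.21, 266.64, 29, 323.55), (1564.25, 601.46, 277.44, 32, 404.44),
    (1634.75, 585.10, 312.35, 36, 283.11), (1159.25, 524.56, 292.87, 34, 222.44),
    (1202.75, 535.17, 268.27, 31, 283.11), (1294.25, 486.03, 309.85, 32, 242.66),
    (1467.50, 540.17, 291.03, 28, 333.66), (1583.75, 583.85, 289.29, 27, 313.44),
    (1124.75, 499.15, 272.55, 26, 374.11),
]


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(a)
    m = [list(r) + [float(i == j) for j in range(size)] for i, r in enumerate(a)]
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(size):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [r[size:] for r in m]


y = [r[0] for r in rows]
X = [(1.0,) + r[1:] for r in rows]   # leading 1.0 is the intercept column
n, p = len(X), len(X[0])             # p = k + 1 estimated parameters
k = p - 1

# Least-squares estimates from the normal equations: b = (X'X)^-1 X'y.
xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
xtx_inv = invert(xtx)
b = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]

fitted = [sum(bi * xi for bi, xi in zip(b, x)) for x in X]
y_bar = sum(y) / n
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ssr = sst - sse                                          # explained variation

r2 = ssr / sst                       # question 1b
mse = sse / (n - k - 1)              # question 4
f_stat = (ssr / k) / mse             # question 2
se = [(mse * xtx_inv[i][i]) ** 0.5 for i in range(p)]
t_stats = [bi / si for bi, si in zip(b, se)]             # questions 5 and 6
pred = sum(bi * xi for bi, xi in zip(b, (1.0, 500, 200, 38, 205)))  # question 3

print("coefficients:", [round(v, 4) for v in b])
print(f"R^2 = {r2:.4f}, F = {f_stat:.3f}, MSE = {mse:.2f}")
print("t ratios:", [round(v, 3) for v in t_stats])
print(f"predicted sales = {pred:.2f}")
```

The t ratios are printed in the order intercept, ADV, BONUS, MKTSHR, COMPET, each with n − k − 1 = 20 degrees of freedom. Question 5 is a one-sided test, so a two-sided p-value from software would need to be halved when the estimated coefficient is positive.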
7. Which one of the following answers is correct? If the null hypothesis H0: β1 = β2 = β3 = 0 is rejected, the interpretation should be that:
a. There is no linear relationship between Y and any of the 3 independent variables
b. There is a regression relationship between Y and at least one of the 3 independent variables
c. All 3 independent variables have a slope of zero
d. All 3 independent variables have equal slopes
e. There is a regression relationship between Y and all 3 independent variables
8. Consider the following estimated multiple regression equation, with n = 25:
Y = 5 + 10X1 + 20X2.
R2 = 0.90; Sb1 = 3.2; Sb2 = 5.5
Calculate the test statistic to test whether X1 contributes information to the prediction of Y.
e. None of the above
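For question 8, the usual test of whether a single predictor contributes is the t ratio of its estimated coefficient to its standard error. A minimal sketch, assuming Sb1 denotes the standard error of the coefficient on X1 (variable names are illustrative):

```python
n = 25        # sample size given in the question
k = 2         # number of predictors in the fitted equation
b1 = 10.0     # estimated coefficient on X1
s_b1 = 3.2    # standard error of b1 (Sb1)

t_stat = b1 / s_b1    # about 3.125
df = n - k - 1        # degrees of freedom for the t test

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```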
9. An article about banks’ profitability lists four independent variables that may affect profitability. A regression analysis with the four independent variables is carried out. The dataset consists of a random sample of 120 observations. Results of the analysis include: SSE = 4,560 and SSR = 562. [I] Calculate the F statistic. [II] Is there a regression relationship between the dependent variable and the explanatory variables? Explain fully.
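The F statistic in question 9 follows from the standard ANOVA decomposition, F = (SSR / k) / (SSE / (n − k − 1)). A short sketch of the arithmetic, using only the figures stated in the question (variable names are illustrative):

```python
n = 120        # observations
k = 4          # independent variables
sse = 4560.0   # error (unexplained) sum of squares
ssr = 562.0    # regression (explained) sum of squares

msr = ssr / k               # mean square regression
mse = sse / (n - k - 1)     # mean square error, 115 degrees of freedom
f_stat = msr / mse          # about 3.54

print(f"F = {f_stat:.3f} on ({k}, {n - k - 1}) degrees of freedom")
```

The result is compared with the tabulated critical value of F with 4 and 115 degrees of freedom at the chosen α; standard tables give roughly 2.45 at α = 0.05.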
10. For this and the next question: A collector of antique grandfather clocks believes that the price (Y) received for the clocks at an auction increases with age of the clocks (X1) measured in years and also with the number of bidders (X2). The model is then fitted to the data and the following summary results are obtained:
Std. err. of est.: SY.X 133.1365
F (p value) 120.6511 (0.000)
Suppose you wish to test the hypothesis that the auction price increases as the number of bidders increases. Which of the following conclusions would be correct with respect to this inquiry?
a. Since the p value for is less than 0.05, reject H0 and conclude that the mean sales price of the clocks increases as the number of bidders increases, when age is held constant.
b. Since the p value for is less than 0.05, reject H0 and conclude that the mean sales price of the clocks increases as the number of bidders increases, when age is held constant.
c. Since the F statistic is significant, reject H0 and conclude that the mean sales price of the clocks is positively related to both the number of bidders and age of the clocks.
d. The mean price of a clock rises by $12.73 for every one-year increase in the age of the clock.
11. A measure of unexplained variation in the regression is the mean square error (MSE). What is the value of this statistic?
d. Not enough information to determine
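For question 11, a sketch of the arithmetic, assuming that SY.X in the summary output is the usual standard error of the estimate, i.e. the square root of MSE:

```python
s_yx = 133.1365     # standard error of the estimate from the summary output

mse = s_yx ** 2     # MSE is the square of the standard error of the estimate
print(f"MSE = {mse:.2f}")   # about 17725.33
```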
12. Use the following for this and the next question. A multiple regression with 4 independent variables produced the following output:
Coefficient Standard Error
X1 31.00 2.331
X2 1.07 1.011
X3 11.23 8.830
X4 0.0056 0.001
Calculate the F statistic for the regression.
d. Insufficient information
13. The P-value associated with the test of significance for β4 is:
a. > 0.05
b. < 0.05
d. Insufficient information
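For questions 12 and 13, the output shown gives only coefficients and standard errors, so the quantities that can be formed directly are the individual t ratios. The overall F statistic would additionally need the sums of squares, or R² together with n. A minimal sketch (the dictionary layout is illustrative):

```python
# (coefficient, standard error) pairs copied from the output above.
coef = {
    "X1": (31.00, 2.331),
    "X2": (1.07, 1.011),
    "X3": (11.23, 8.830),
    "X4": (0.0056, 0.001),
}

# t ratio for each predictor: coefficient divided by its standard error.
t_ratios = {name: b / se for name, (b, se) in coef.items()}

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.3f}")
```

Turning a t ratio into an exact p-value requires the error degrees of freedom n − k − 1, and the sample size is not stated in the output.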
14. The coefficient of determination, R2, has which one of the following properties?
a. Always negative
b. Ranges from –1 to +1
c. Ratio of unexplained variation to explained variation
d. It has the same sign as the slope of the regression line
e. Ranges from zero to one
15. An auto insurance company wants to predict client claims from the amount of premium paid on each policy. Which of the following variables should be the explanatory variable in the study?
a. Amount of claim
c. Limits per coverage
d. Policy premium
16. In a simple linear regression, when testing H0: β1 = 0 against H1: β1 ≠ 0, failing to reject the null hypothesis means that:
a. the slope of the regression line is not zero
b. the relationship between x and y may be multiplicative
c. there is no linear relationship between x and y
d. there is a linear relationship between x and y
e. None of the above
17. The assumptions of the simple linear regression model include:
a. the errors are normally distributed
b. the error terms have a constant variance
c. the errors have a mean of zero
d. All of the above
e. a and c only
18. The deviation which is explained by the regression line can be expressed as:
a. the difference between each actual Y value and the mean of Y: (Y – Ȳ)
b. the difference between each residual value and the corresponding Y value: (e – Y)
c. the difference between each predicted Y value and the mean of Y: (Ŷ – Ȳ)
d. the difference between each actual Y value and the predicted Y value: (Y – Ŷ)
e. the difference between each x value and each y value: (x – y)
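The deviations named in question 18 are linked by the identity (Y − Ȳ) = (Ŷ − Ȳ) + (Y − Ŷ), and for a least-squares fit their sums of squares satisfy SST = SSR + SSE. The sketch below checks both on a small hypothetical data set; the x and y values are made up purely for illustration and do not come from the exam.

```python
# Hypothetical toy data, chosen only to illustrate the decomposition.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for the simple linear regression.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

total = [yi - y_bar for yi in y]                        # Y - Ybar
explained = [fi - y_bar for fi in y_hat]                # Yhat - Ybar
unexplained = [yi - fi for yi, fi in zip(y, y_hat)]     # Y - Yhat

sst = sum(d ** 2 for d in total)
ssr = sum(d ** 2 for d in explained)
sse = sum(d ** 2 for d in unexplained)

print(f"SST = {sst:.4f}, SSR + SSE = {ssr + sse:.4f}")
```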
19. In testing H0: β1 = β2 = β3 = 0, a p-value of 0.0001 and α of 0.01 would give an indication that:
a. the null hypothesis should not be rejected
b. the null hypothesis should be rejected
c. all three independent variables have a slope of zero
d. there is no linear relationship between y and any of the three independent variables
e. none of the above
20. What is the slope of the following regression line y = 3.4x 2.5?
e. none of the above