The file problem4.xlsx contains data on the following variables from a firm that erects boiler drums,
|Hours||The number of worker hours to erect boiler drums|
|Capacity||Boiler capacity in pounds per hour.|
|Pressure||Boiler pressure in pounds per square inch.|
|Boiler type||Equals 1 if a “utility” boiler; 0 if an “industrial” boiler.|
|Drum type||Equals 1 if a “steam” drum; 0 if a “mud” drum.|
Estimate the following model.
Hours=0 + 1Capacity + 2Pressure + 3Boiler_type + 3Drum_type
(a) (2 pts) Report the coefficient of Capacity and interpret the coefficient.
(b) (2 pts) Report and briefly interpret the R-square value.
(c) (4 pts) Using the estimated regression test the hypothesis that “industrial” Boiler type takes 3000 hrs extra to erect as compared to the “utility” Boiler type.
Write down the null and alternate hypothesis and report your conclusions.
Ford Motor Company purchases millions of tires per year for installation on new vehicles that it manufactures. Ford uses a regression model to help in this process. This regression model is used to assist in negotiating tire prices with suppliers.
Ford purchased a total of 61 different types of tires in a particular year. The data set for this problem (problem5.xlsx) contains information about the price per tire paid by Ford for each of these 61 types of tires. It also contains information about the characteristics of each of these tires and the number of each type of tire purchased by Ford.
|Price||Tire price in dollars per tire.|
|Diameter||Inside diameter of the tire in inches.|
|Ratio||Ratio of height of tire sidewall to width of tread, in percent.|
|Terrain||Equals 1 if the tire is for all Terrain vehicle, 0 otherwise.|
|Weight||Weight of tire in pounds.|
|Tread||Tread width of tire in inches.|
|OWL||Equals 1 if raised Outside White Lettering, 0 otherwise.|
|Volume||The number of tires of this type purchased by Ford in 1997 (thousands).|
|RATING||Indicator variable denoting the speed safety classification of a tire. For simplicity we will consider only two safety classifications (‘normal highway driving’ and ‘high speed driving’). This variable (RATING) equals 1 if rated for normal highway driving, 0 otherwise (i.e. high speed driving).|
|WW||Indicator variable denoting the tire color (tires are either ‘black wall’ or ‘white wall’). This variable equals 1 if a ‘white wall’ tire, 0 otherwise (i.e. a ‘black wall’ tire).|
Estimate the following model, used by Ford.
Note: the prices in this data have been altered to preserve data confidentiality.
Make sure to use the natural log of price in your regression , NOT the price column!!!!
(a) (2 pts) Write the coefficient of WEIGHT and interpret the coefficient.
(b) (2 pts) Write the coefficient of RATING and interpret the coefficient.
(c) (2 pts)What is the impact of color (variable WW) on price?
(d) (3 pts)The coefficient of Diameter is larger than the coefficient of Weight. Does this mean that Diameter is a more important variable than Weight?
(e) (3 pts) What is the elasticity of Price with respect to Weight.?
(f) (6 pts) There is an existing belief in the Ford Motor Company that tires that are rated for normal highway driving are typically 5 pounds heavier than similar tires which are rated for high speed driving. That is, normal highway driving tires that are 5 pounds heavier than similar high speed driving tires (similar on all other aspects) tend to be priced the same.
Please perform appropriate statistical tests to check this belief and state your answer whether your data supports this belief or not.
Problem 6 (25 total points)
In this problem you will work with quarterly sales data of work T-shirts from a sample of 100 stores belonging to a large Retailer. The data is contained in problem6.xlsx , the first column is simply a store-id (STOREID), second column contains the quarterly sales in number of T-shirts sold (SALES), the third column gives you the quarter to which these sales belong (QUARTER) and finally the fourth column tells you whether the T-shirts were on promotion or not (PROM) [PROM = 1 implying that the T-shirts were on promotion].
(a) (3 pts)You need to run a regression with SALES as your dependent variable and QUARTER & PROM as your independent variables. Note that QUARTER is a categorical variable so you need to introduce appropriate dummy variables to represent it, choose the 4rth quarter as your reference when you introduce dummy variables. That is, estimate the following regression model
SALES = 0 + 1QTR1 + 2QTR2 + 3QTR3 + 4PROM
where QTR1, QTR2, QTR3 are dummy variables for the first, second and third quarters respectively.
Report and interpret the estimated coefficient on QTR1
(b) (4 pts)Based on the estimated regression model in part (a) what is your estimate of the fourth quarter sales with and without promotions?
(c) (2 pts) Notice that the p-value for 2, the coefficient on QTR2, is greater than (assuming to be 0.05). In other words, you cannot reject the hypothesis that2= 0. What does it tell you about the impact of the second quarter on sales?
(d) (2 pts)What is the impact of promotions on sales?
(e) (5 pts)Is the impact of promotions on sales the same across all quarters? Please answer this question by showing appropriate steps that you followed to reach your conclusion. You may have to modify your regression equation to answer this and subsequent questions.
(f) (4 pts) Using your working for part(e) please calculate and report the impact of promotions in the third quarter.
(g) (5 pts)Which quarters have similar impact of promotions on Sales? Please support your answer by relevant details.
When searching the internet using the popular search engine “google” you might have noticed a string of sponsored ads/links on the right side and on the top of the search page. There is a whole process how these links (and the particular order in which they appear) get decided. The process is sometimes termed as “keyword advertising”.
The file (problem7.xlsx) contains part of the data from one such keyword advertising campaign conducted over a period of time by a sports retailer selling on the internet. The retailer had purchased multiple sponsored links to be displayed for specific keywords typed by the user. So, based on the search term typed by the user a sponsored link by the retailer gets displayed at various positions during the day (the positions vary based on competitive auction dynamics which we will not explore). Some of these displays result in someone clicking on the link (i.e. results in a click). The data file has information on the following variables.
1) The number of times a particular sponsored link was displayed in a day (IMPRESSIONS)
2) The average of the various positions these sponsored links appeared on during that day (AVGPOS)
3) The number of clicks on that day obtained by these sponsored links (CLICKS).
Please conduct the following regression to answer the following questions.
CLICKS = 0 + 1IMPRESSIONS + 2AVGPOS
(a) (2 pts)Report and interpret your estimates of 0, 1 and 2.
(b) (3 pts)The mean of IMPRESSIONS is equal to 502.18. You can also calculate that in the excel spreadsheet.
Now, please subtract this number (502.18) from all the data you have on IMPRESSIONS. That is, create another column in excel and call it IMPRESSIONS_minus_mean. Put data in this column as per the following formula,
IMPRESSIONS_minus_mean = IMPRESSIONS – 502.18
Similarly, the mean of AVGPOS data is 3.978. Create another column and call it AVGPOS_minus_mean. Put data in this column as per the following formula,
AVGPOS_minus_mean = AVGPOS – 3.978
Now estimate the following regression
CLICKS = 0 + 1IMPRESSIONS_minus_mean + 2AVGPOS_minus_mean
Report your estimates of 0, 1 and 2.
(c) (4 pts)In the regression carried out in part (b) does the intercept have a managerial interpretation? if yes, state the managerial interpretation. If no, then explain why a managerial interpretation does not exist.
You wish to use the characteristics of refrigerators to predict their price. You have data on 37 refrigerators: price (PRICE) in dollars, annual cost of operation (OPCOST) in dollars, total size in cubic feet of the refrigerator (REFRSIZE) [the total size = the size of the freezer + the size of the non-frozen cooling compartment], size in cubic feet of the freezer (FREZSIZE), number of shelves (SHELVES), and number of features like egg racks and ice makers (FEATURES). The data for this problem is contained in (problem8.xlsx).
Use Excel to estimate the model:
PRICE = 0 + 1OPCOST + 2REFRSIZE + 3FREZSIZE + 4SHELVES + 5FEATURES +
(a) (2 pts)Interpret the coefficient of FREZSIZE.
(b) (2 pts)Interpret the coefficient of SHELVES.
(c) (3 pts)LetCOOLSIZE be the size of a refrigerator’s cooling compartment (i.e., the part of the refrigerator that keeps food cold but not frozen). REFRSIZE is the refrigerator size and FREZSIZE is the freezer size. Then REFRSIZE = COOLSIZE + FREZSIZE. In the model of refrigerator price, suppose we deleted REFRSIZE (i.e., drop variable REFRSIZE) and inserted the variable COOLSIZE instead.
Run a regression to estimate this model in which the variable REFRSIZE is deleted and variable COOLSIZE is included.
PRICE = 0 + 1OPCOST + 2COOLSIZE + 3FREZSIZE + 4SHELVES + 5FEATURES +
Report and Interpret the coefficient of FREZSIZE in this model.
(d) (4 pts)Please provide explanation to reconcile the interpretations of the coefficient on FREZSIZE in part (a) and part (c)
(e) (5 pts)If consumers are concerned about the total cost of using a refrigerator, consumers will pay less for refrigerators that are more expensive to operate. Suppose that a refrigerator is expected to last twelve years and that consumers discount future expenditures at 4% (real rate of interest). We would then expect that a one dollar increase in the annual cost of operation would reduce the price a consumer is willing to pay by
Do consumers consider the total cost of ownership in the manner described above when they purchase a refrigerator? Write down and test a null hypothesis to answer this question. Clearly state your conclusions.