# Mathematics

Question 1. [12 marks]

Market research has indicated that customers are likely to bypass Roma tomatoes that weigh less than 70 grams. A produce company produces Roma tomatoes that average 74.0 grams with a standard deviation of 3.2 grams.

(a) [2 marks] Assuming that the normal distribution is a reasonable model for the weights of these tomatoes, what proportion of Roma tomatoes are currently undersize (less than 70g)?

(b) [2 marks] How much must a Roma tomato weigh to be among the heaviest 10%?

(c) [2 marks] The aim of the current research is to reduce the proportion of undersized tomatoes to no more than 2%. One way of reducing this proportion is to reduce the standard deviation. If the average size of the tomatoes remains 74.0 grams, what must the target standard deviation be to achieve the 2% goal?

(d) [3 marks] The company claims that the goal of 2% undersized tomatoes is reached. To test this, a random sample of 25 tomatoes is taken. What is the distribution of the number of undersized tomatoes in this sample if the company’s claim is true? Explain your reasoning.
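As a hedged sketch of how the quantities in parts (a) to (d) might be set up numerically, the Python below uses only the standard library. The variable and function names are illustrative and not part of the question, and part (d) assumes each tomato in the sample is undersized independently with probability 0.02.

```python
from math import comb
from statistics import NormalDist

MEAN_G = 74.0    # mean weight from the question, in grams
SD_G = 3.2       # current standard deviation, in grams
CUTOFF_G = 70.0  # tomatoes lighter than this count as undersized

weights = NormalDist(mu=MEAN_G, sigma=SD_G)

# (a) P(X < 70) under the current distribution
prop_undersized = weights.cdf(CUTOFF_G)

# (b) 90th percentile: the minimum weight of the heaviest 10%
heaviest_10pct_cutoff = weights.inv_cdf(0.90)

# (c) solve (70 - 74) / sigma = z for sigma, where P(Z < z) = 0.02
z_02 = NormalDist().inv_cdf(0.02)
target_sd = (CUTOFF_G - MEAN_G) / z_02

# (d) assumption: independent tomatoes, each undersized with probability 0.02,
# so the count of undersized tomatoes in the sample is Binomial(25, 0.02)
def undersized_pmf(k, n=25, p=0.02):
    """P(exactly k undersized tomatoes among n)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"(a) proportion undersized: {prop_undersized:.4f}")
print(f"(b) heaviest-10% cutoff:   {heaviest_10pct_cutoff:.2f} g")
print(f"(c) target std deviation:  {target_sd:.2f} g")
print(f"(d) P(no undersized in 25): {undersized_pmf(0):.4f}")
```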

Question 2:

In an article in Marketing Science, Silk and Berndt investigate the output of advertising agencies. They describe ad agency output by finding the shares of dollar billing volume coming from various media categories such as network television, spot television, newspaper, radio, and so forth.

Suppose that a random sample of 400 U.S. advertising agencies gives an average percentage share of billing volume from network television equal to 7.46 percent with a standard deviation of 1.42 percent. Further, suppose that a random sample of 400 U.S. advertising agencies gives an average percentage share of billing volume from spot television commercials equal to 12.44 percent with a standard deviation of 1.55 percent.

Using the sample information, does it appear that the mean percentage share of billing volume from spot television commercials for the U.S. advertising agencies is greater than the mean percentage share of billing volume from network television? Explain.

Module #3: Sampling Distributions, Estimates, and Hypothesis Testing

Question 3:

[3] Identify which of these types of sampling is used: random, systematic, convenience, stratified, or cluster.

a) The instructor of this course observed a Walnut Creek Police sobriety checkpoint at which every fifth driver was stopped and interviewed. Some drivers were arrested.

b) The instructor of this course observed professional wine tasters working at a winery in Napa Valley, CA. Assume that a taste test involved three different wines randomly selected from each of five different wineries.

c) The U.S. Department of Corrections collects data about returning prisoners by randomly selecting five federal prisons and surveying all of the prisoners in each of the prisons.

d) In a Gallup poll, 1003 adults were called after their telephone numbers were randomly generated by a computer, and 20% of them said that they get news on the Internet every day.

e) The instructor of this course surveyed all of the students in one of the instructor's statistics classes to obtain sample data consisting of the number of credit cards each student possesses.

Question 4:

[4] In the March 16, 1998, issue of Fortune magazine, the results of a survey of 2,221 MBA students from across the United States conducted by the Stockholm-based academic consulting firm Universum showed that only 20 percent of MBA students expect to stay at their first job five years or more. Source: Shalley Branch, “MBAs: What Do They Really Want,” Fortune (March 16, 1998), p.167.

a) Assuming that a random sample was selected, construct a 98% confidence interval for the proportion of all U.S. MBA students who expect to stay at their first job five years or more.

b) Based on the interval from a), can you conclude that there is strong evidence that less than one-fourth of all U.S. MBA students expect to stay? Explain why.
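One way the interval in part a) might be computed is with the large-sample (normal approximation) formula for a proportion, sketched below in Python. The function name `proportion_ci` is illustrative, and the choice of the normal approximation is an assumption rather than something the question prescribes.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    # two-sided critical value, e.g. z_0.01 for a 98% interval
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# values taken from the question: 20% of 2,221 MBA students
low, high = proportion_ci(p_hat=0.20, n=2221, confidence=0.98)
print(f"98% CI: ({low:.4f}, {high:.4f})")

# part b) can then be judged by comparing the upper limit with 0.25
print("upper limit below one-fourth:", high < 0.25)
```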

Question 5:

[5] An earlier study claims that U.S. adults spend an average of 114 minutes with their families per day. A recently taken sample of 25 adults showed that they spend an average of 109 minutes per day with their families. The sample standard deviation is 11 minutes. Assume that the time spent by adults with their families has an approximate normal distribution. We wish to test whether the mean time spent currently by all adults with their families is less than 114 minutes a day.

a) Construct a 95% confidence interval for the mean time spent by all adults with their families.

b) Does the sample information support that the mean time spent currently by all adults with their families is less than 114 minutes a day? Explain your conclusion in words.

Question 6:

[6] When 40 people used the Weight Watchers diet for one year, their mean weight loss was 3.00 lb. (based on data from “Comparison of the Atkins, Ornish, Weight Watchers, and Zone Diets for Weight Loss and Heart Disease Reduction,” by Dansinger, et al., Journal of the American Medical Association, Vol. 293, No. 1). Assume that the standard deviation of all such weight changes is σ = 4.9 lb. We shall use a 0.01 significance level to test the claim that the mean weight loss is greater than 0.

a) Set up the null and alternative hypotheses, and perform the hypothesis test.

b) Based on these results, does the diet appear to be effective? Does the diet appear to have practical significance?
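Because the population standard deviation is given, part a) can be approached as a right-tailed one-sample z test of H0: μ = 0 against H1: μ > 0. The Python sketch below shows one possible setup; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 40               # sample size from the question
sample_mean = 3.00   # mean weight loss, in lb
sigma = 4.9          # population standard deviation, in lb
mu_0 = 0.0           # mean weight loss under the null hypothesis
alpha = 0.01         # significance level

std_normal = NormalDist()
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))
p_value = 1 - std_normal.cdf(z_stat)        # right-tailed test
critical_z = std_normal.inv_cdf(1 - alpha)  # rejection region: z > critical_z
reject_h0 = z_stat > critical_z

print(f"z = {z_stat:.3f}, critical z = {critical_z:.3f}, p-value = {p_value:.5f}")
print("reject H0:", reject_h0)
```

Statistical significance here says nothing about whether a mean loss of 3.00 lb over a year matters in practice, which is the separate judgment part b) asks for.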

Question 7:

[7] In the case of Castaneda v. Partida, it was found that during a period of 11 years in Hidalgo County, Texas, 870 people were selected for grand jury duty, and 39% of them were Americans of Mexican ancestry. Among the people eligible for grand jury duty, 79.1% were Americans of Mexican ancestry. We shall use a 0.01 significance level to test the claim that the selection process is biased against Americans of Mexican ancestry.

(a) Set up the null and alternative hypotheses, and perform the hypothesis test.

(b) Does the jury selection system appear to be fair?
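A possible setup for part (a) is a left-tailed one-proportion z test of H0: p = 0.791 against H1: p < 0.791, using the normal approximation to the binomial. The Python below is a sketch under that assumption, with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

n = 870        # people selected for grand jury duty
p_hat = 0.39   # sample proportion of Americans of Mexican ancestry
p_0 = 0.791    # proportion among eligible people (null value)
alpha = 0.01   # significance level

# the standard error uses the null proportion p_0, as is usual for this test
std_error = sqrt(p_0 * (1 - p_0) / n)
z_stat = (p_hat - p_0) / std_error

# left-tailed p-value; for a z this extreme it may underflow to 0.0
p_value = NormalDist().cdf(z_stat)
reject_h0 = p_value < alpha

print(f"z = {z_stat:.2f}, reject H0: {reject_h0}")
```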

Question 8:

[8] A local television station has added a consumer spot to its nightly news. The consumer reporter has recently bought sixteen bottles of aspirin from a local drugstore and has counted the aspirins in each bottle. Although the bottles advertised 500 aspirins, the reporter found the following numbers with the mean count 498.8125:

499, 498, 496, 501, 493, 495, 497, 502, 496, 502, 499, 501, 500, 498, 501, 503

The consumer reporter claims that this is an obvious case of the public being taken advantage of. Using a confidence interval estimate method or a hypothesis testing method, do you think that the reporter’s claim is justifiable?
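Either approach can be sketched from the raw counts. The Python below computes the one-sample t statistic against the advertised 500 and a 95% confidence interval for the mean count. The critical value 2.131 is the usual table value for t with 15 degrees of freedom and 0.025 in each tail; it is hard-coded here because the standard library has no t distribution, and the 95% level itself is an assumption, since the question does not state one.

```python
from math import sqrt
from statistics import mean, stdev

counts = [499, 498, 496, 501, 493, 495, 497, 502,
          496, 502, 499, 501, 500, 498, 501, 503]
ADVERTISED = 500
T_CRIT_DF15 = 2.131  # assumed table value: t_{0.025} with 15 degrees of freedom

n = len(counts)
x_bar = mean(counts)
s = stdev(counts)          # sample standard deviation (n - 1 denominator)
std_error = s / sqrt(n)

t_stat = (x_bar - ADVERTISED) / std_error
ci_low = x_bar - T_CRIT_DF15 * std_error
ci_high = x_bar + T_CRIT_DF15 * std_error

print(f"mean = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}")
print(f"95% CI for the mean count: ({ci_low:.2f}, {ci_high:.2f})")
print("interval contains 500:", ci_low < ADVERTISED < ci_high)
```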

Module #4: Two-Sample Tests and Simple Linear Regression

Question 9:

[9] In an article in Marketing Science, Silk and Berndt investigate the output of advertising agencies. They describe ad agency output by finding the shares of dollar billing volume coming from various media categories such as network television, spot television, newspaper, radio, and so forth.

Suppose that a random sample of 400 U.S. advertising agencies gives an average percentage share of billing volume from network television equal to 7.46 percent with a standard deviation of 1.42 percent. Further, suppose that a random sample of 400 U.S. advertising agencies gives an average percentage share of billing volume from spot television commercials equal to 12.44 percent with a standard deviation of 1.55 percent.

Using the sample information, does it appear that the mean percentage share of billing volume from spot television commercials for the U.S. advertising agencies is greater than the mean percentage share of billing volume from network television? Explain.
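With two independent samples of 400 agencies each, one reasonable approach is a large-sample z test for the difference of two means, with H1: μ_spot > μ_network. The Python sketch below assumes the samples are independent and uses the sample standard deviations in place of the population values; the function name `two_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """z statistic for H0: mu_1 = mu_2 using unpooled standard errors."""
    std_error = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return (mean_1 - mean_2) / std_error

# sample 1: spot television; sample 2: network television (percent shares)
z_stat = two_sample_z(12.44, 1.55, 400, 7.46, 1.42, 400)

# right-tailed p-value, computed from the lower tail of -z for accuracy;
# for a z this large it may underflow to 0.0
p_value = NormalDist().cdf(-z_stat)

print(f"z = {z_stat:.2f}, p-value = {p_value:.3g}")
```

The same function could be reused with the summary statistics in Question 10, where a two-tailed p-value would fit the wording "affect" better.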

Question 10:

[10] A random sample of the birth weights of 186 babies has a mean of 3103g and a standard deviation of 696g (based on data from “Cognitive Outcomes of Preschool Children with Prenatal Cocaine Exposure,” by Singer et al., Journal of the American Medical Association, Vol. 291, No. 20). These babies were born to mothers who did not use cocaine during their pregnancies. Further, a random sample of the birth weights of 190 babies born to mothers who used cocaine during their pregnancies has a mean of 2700g and a standard deviation of 645g. Does cocaine use appear to affect the birth weight of a baby? Substantiate your conclusion.

Question 11:

[11] The owner of an intra-city moving company typically has his most experienced manager predict the total number of labor hours that will be required to complete an upcoming move. This approach has proved useful in the past, but he would like to be able to develop a more accurate method of predicting the labor hours by using the amount of cubic feet moved. In a preliminary effort to provide a more accurate method, he has collected data for 36 moves, in which the travel time was an insignificant portion of the labor hours worked.

The data are in the Excel file MOVING.xls, downloadable from File or from the Companion Website at www.pearsonhighered.com/levine under the Excel Data Files link.

a) Set up a scatter diagram.

b) Assuming a linear relationship, find the regression coefficients, b0, b1, and its regression equation.

c) Interpret the meaning of the slope b1 in this problem.

d) Predict the labor hours for moving 500 cubic feet.

e) What factors besides the cubic feet moved might affect labor hours?

f) Determine the coefficient of determination, r², and interpret its meaning.

g) Find the standard error of the estimate.

h) How useful do you think this regression model is for labor hours?

i) Determine if the assumption of normality is violated by using the normal probability plot for residuals.

j) At the 0.05 level of significance, is there evidence of a linear relationship between the number of cubic feet moved and labor hours?

k) Set up a 95% confidence interval estimate of the population slope, β1.

l) Set up a 95% confidence interval estimate of the average labor hours for all moves of 500 cubic feet.

m) Set up a 95% confidence interval of the labor hours of an individual move that has 500 cubic feet.

n) Explain the difference in the results obtained in (l) and (m).

Question 12:

[4] An auto manufacturing company wanted to investigate how the price of one of its car models depreciates with age. The research department at the company took a sample of eight cars of this model and collected the following information on the ages (in years) and prices (in hundreds of dollars) of these cars. The data are in USEDCAR.xls.

| Age (x) | 8 | 3 | 6 | 9 | 2 | 5 | 6 | 3 |
|---|---|---|---|---|---|---|---|---|
| Price (y) | 16 | 74 | 40 | 19 | 124 | 36 | 33 | 89 |

a) Find the value of the linear correlation coefficient r.

b) Find the value of the coefficient of determination r², and interpret the meaning for this problem.

c) At the 0.05 level of significance, is there a significant linear relationship between the two variables?

d) Determine the adequacy of the fit of the model.

e) Evaluate whether the assumptions of regression (LINE) have been seriously violated.

f) If there is a linear correlation, what is the regression equation?

g) Interpret the meaning of the slope b1 in this problem.

h) Interpret the meaning of the Y-intercept b0 in this problem. Will it make sense to you as far as this model is concerned? Explain why.

i) Set up a 95% confidence interval estimate of the population slope.

j) Set up a 95% confidence interval estimate of the average price for all cars of this model after 7 years.

k) Set up a 95% prediction interval of the price of an individual car of this model after 7 years.

l) Explain the difference in the results obtained in (j) and (k).
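The point estimates asked for in parts a), b), and f) can be checked with the least-squares formulas applied to the eight (age, price) pairs above. The Python below is a minimal sketch using only the standard library, with illustrative variable names; the interval estimates and the residual checks in the other parts are not covered.

```python
from math import sqrt

ages = [8, 3, 6, 9, 2, 5, 6, 3]            # x, in years
prices = [16, 74, 40, 19, 124, 36, 33, 89]  # y, in hundreds of dollars

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(prices) / n

# sums of squares and cross-products about the means
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_yy = sum((y - y_bar) ** 2 for y in prices)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, prices))

b1 = s_xy / s_xx               # slope
b0 = y_bar - b1 * x_bar        # Y-intercept
r = s_xy / sqrt(s_xx * s_yy)   # linear correlation coefficient
r_squared = r ** 2             # coefficient of determination

# point prediction used as the center of the intervals in j) and k)
predicted_at_7 = b0 + b1 * 7

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"fitted line: price = {b0:.3f} + ({b1:.3f}) * age")
print(f"predicted price at 7 years: {predicted_at_7:.2f} (hundreds of dollars)")
```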

Question 13: