Review for Test
Suggestion: treat this review set as you would an actual test. Sit down with your one page of notes and your calculator, and give it a try. That way you will know what areas you still need to study.
1) What is the average length of a company’s policy book? Suppose policy books are randomly sampled from 45 medium-sized companies. The average number of pages in the sample books is 213, and the population standard deviation is 48. Use this information to construct a 98% confidence interval to estimate the mean number of pages for the population of medium-sized company policy books.
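One way to check your work on problem 1 is sketched below. It assumes the usual z interval for a mean with known σ (x̄ ± z·σ/√n) and uses only the numbers given in the problem; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 45             # number of companies sampled
x_bar = 213        # sample mean number of pages
sigma = 48         # population standard deviation
confidence = 0.98

# Two-sided critical value: (1 - confidence) / 2 of the area in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z:.3f}, margin of error = {margin:.2f}")
print(f"98% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

A table-rounded z of 2.33 will shift the endpoints slightly in the second decimal place.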
2) A random sample of small-business managers was given a leadership style questionnaire. The results were scaled so that each manager received a score for initiative. Suppose the following data are a random sample of these scores.
a) Assuming σ is 3.0, use these data to construct a 90% confidence interval to estimate the average score on initiative for all small-business managers.
b) What if you didn’t know the σ? Construct a 90% confidence interval for the
c) What assumptions did you have to make to do the analysis in part b? Were
3) Is the environment a major issue with Americans? To answer that question, a researcher conducts a survey of 1255 randomly selected Americans. Suppose 714 of the sampled people replied that the environment is a major issue with them.
a) What is the point estimate of this proportion?
b) Construct a 95% confidence interval to estimate the proportion of Americans who feel that the environment is a major issue with them.
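A minimal sketch for checking problem 3, assuming the large-sample z interval for a proportion (p̂ ± z·√(p̂(1 − p̂)/n)). The names are illustrative, not from the course materials.

```python
from math import sqrt
from statistics import NormalDist

n = 1255           # people surveyed
x = 714            # number who said the environment is a major issue
confidence = 0.95

p_hat = x / n      # part a: point estimate of the proportion
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"point estimate = {p_hat:.4f}")
print(f"95% CI for the proportion: ({lower:.4f}, {upper:.4f})")
```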
4) An entrepreneur wants to open an appliance service repair shop. She would like to know about what the average home repair bill is, including the charge for the service call for appliance repair in the area. She wants the estimate to be within $20 of the actual figure. She believes the range of such bills is between $30 and $600. How large a sample should the entrepreneur take if she wants to be 95% confident of the results?
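The sketch below shows one way to set up problem 4. It assumes the common rule of thumb σ ≈ range / 4, which the problem does not state explicitly, and the sample-size formula n = (z·σ/E)², rounded up.

```python
from math import ceil
from statistics import NormalDist

low, high = 30, 600     # believed range of repair bills, in dollars
error = 20              # desired margin of error, in dollars
confidence = 0.95

# Assumption: estimate sigma as one quarter of the range.
sigma_estimate = (high - low) / 4
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Always round up, since a fractional observation cannot be sampled.
n_required = ceil((z * sigma_estimate / error) ** 2)

print(f"sigma estimate = {sigma_estimate}, required n = {n_required}")
```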
5) According to a survey of 27 establishments by Runzheimer International, the average cost of a fast-food meal (quarter-pound cheeseburger, large fries, medium soft drink, excluding taxes) in Seattle is $4.82 with a standard deviation of $0.37.
a) Construct a 95% confidence interval for the population mean cost for all fast-food meals in Seattle. Assume the cost of a fast-food meal in Seattle is normally distributed.
b) Using your confidence interval as a guide, draw a conclusion about whether the true (population) cost of a fast-food meal has significantly increased from $4.50.
c) Suppose you thought the true cost was $4.75. Does your confidence interval indicate that the true (population) cost of a fast-food meal has significantly increased?
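A hedged sketch for problem 5, assuming the t interval x̄ ± t·s/√n with df = n − 1. The critical value 2.056 is hard-coded from a standard t table (df = 26, 0.025 in each tail) because the Python standard library has no t distribution.

```python
from math import sqrt

n = 27              # establishments surveyed
x_bar = 4.82        # sample mean cost, in dollars
s = 0.37            # sample standard deviation, in dollars
t_crit = 2.056      # t table value for df = 26, 95% confidence (assumed from table)

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")

# Parts b and c: see where each hypothesized value falls relative to the interval.
for claimed in (4.50, 4.75):
    inside = lower <= claimed <= upper
    print(f"${claimed:.2f} inside the interval? {inside}")
```

A value that falls below the whole interval is not a plausible population mean at this confidence level, while a value inside the interval cannot be ruled out.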
6) A research firm has been asked to determine the proportion of all restaurants in the state of Ohio that serve alcoholic beverages. The firm wants to be 98% confident of its results but has no idea of what the actual proportion is. The firm would like to report an error of no more than 0.05. How large a sample should it take?
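For problem 6, a minimal sketch assuming the formula n = z²·π(1 − π)/E² and the conservative planning value π = 0.5, which is the usual choice when nothing is known about the true proportion.

```python
from math import ceil
from statistics import NormalDist

confidence = 0.98
error = 0.05
pi_guess = 0.5      # conservative assumption: maximizes pi * (1 - pi)

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
n_required = ceil(z ** 2 * pi_guess * (1 - pi_guess) / error ** 2)

print(f"z = {z:.4f}, required n = {n_required}")
```

With the unrounded z this gives 542; a table-rounded z of 2.33 gives 543, so either answer can appear depending on the table used.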
7) True or False: A statistic taken from a sample that is used to estimate a population parameter is called a point estimate.
8) True or False: The width of the confidence interval depends only on the desired level of confidence.
9) True or False: A 95% confidence interval for the population mean implies that the probability that the population mean lies within this interval is 0.95.
10) True or False: When the population standard deviation is unknown, the sample standard deviation is used and the interval estimation is based on values from the t- rather than the z-distribution.
11) When a company with thousands of employees reports a sample statistic (a single number) as the estimate of the population mean number of days of work missed per employee due to illness during a year, it is providing __________.
a) a perfect estimate
b) an interval estimate
c) a deterministic estimate
d) a point estimate
12) In a normal distribution, what values of z should we choose if we want 95% of the area under the curve symmetrically distributed around the mean?
a) ± 1.00
b) ± 1.50
c) ± 1.96
d) ± 2.33
13) The z value for a 98% confidence interval around the point estimate is _______.
14) In order to construct a 90% confidence interval for the population mean when the population standard deviation σ is unknown and the sample size is n = 18, the appropriate t-value to use is represented by __________.
15) When developing the interval estimates for the population proportion π, a z-distribution can be used if __________.
a) n > 30
b) n < 30
c) n * π > 5 and n * (1 – π) < 5
d) n * π > 5 and n * (1 – π) > 5
16) The formula to determine the sample size for the estimation of the population proportion to yield a desired error of estimation requires the value of π, the population proportion. When the value of π is unknown, researchers use a value of _______.
17) A computer manufacturer estimates that its line of minicomputers has, on average, 8.4 days of downtime per year. To test this claim, a researcher contacts seven companies that own one of these computers and is allowed to access company computer records. It is determined that, for the sample, the average number of downtime days is 5.6, with a sample standard deviation of 1.3 days. Assuming that the number of downtime days is normally distributed, test to determine whether these minicomputers actually average 8.4 days of downtime in the entire population. Let α = .01.
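A sketch of the test statistic for problem 17, assuming a two-tailed one-sample t test (H0: µ = 8.4 versus H1: µ ≠ 8.4). The critical value 3.707 is hard-coded from a standard t table (df = 6, 0.005 in each tail).

```python
from math import sqrt

n = 7               # companies contacted
x_bar = 5.6         # sample mean downtime, in days
s = 1.3             # sample standard deviation, in days
mu_0 = 8.4          # manufacturer's claimed mean
t_crit = 3.707      # t table value for df = 6, alpha = .01 two-tailed (assumed from table)

t_stat = (x_bar - mu_0) / (s / sqrt(n))
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, critical values = ±{t_crit}")
print(f"Reject H0 at alpha = .01? {reject_h0}")
```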
18) A study of MBA graduates by Universum for the American Graduate Survey 1999 revealed that MBA graduates have several expectations of prospective employers beyond their base pay. In particular, according to the study 46% expect a performance-related bonus, 46% expect stock options, 42% expect a signing bonus, 28% expect profit sharing, 27% expect extra vacation/personal days, 25% expect tuition reimbursement, 24% expect health benefits, and 19% expect guaranteed annual bonuses. Suppose a study was conducted last year to see whether these expectations have changed. If 125 MBA graduates were randomly selected last year, and if 66 expected stock options, does this result provide enough evidence to declare that a significantly higher proportion of MBAs expect stock options? Let α = .05.
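One way to check problem 18, assuming an upper-tailed z test for a single proportion (H0: π = .46 versus H1: π > .46), with the standard error computed from the hypothesized proportion.

```python
from math import sqrt
from statistics import NormalDist

n = 125             # MBA graduates sampled last year
x = 66              # number who expected stock options
p_0 = 0.46          # proportion reported in the 1999 study
alpha = 0.05

p_hat = x / n
z_stat = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha)       # upper-tail critical value
p_value = 1 - NormalDist().cdf(z_stat)         # area to the right of z_stat
reject_h0 = z_stat > z_crit

print(f"p_hat = {p_hat:.3f}, z = {z_stat:.3f}, critical z = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, reject H0? {reject_h0}")
```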
19) Downtime in manufacturing is costly and can result in late deliveries, backlogs, failure to meet orders, and even loss of market share. Suppose a manufacturing plant has been averaging 23 minutes of downtime per day for the past several years, but during the past year, there has been a significant effort by both management and production workers to reduce downtime. In an effort to determine if downtime has been significantly reduced, company productivity researchers have randomly sampled 31 days over the past several months from company records and have recorded the daily downtimes shown below in minutes.
Use these data and a 1% level of significance to determine if downtime has been significantly reduced. Assume that daily downtimes are normally distributed in the population.
20) Suppose you have just been hired as the statistician for a company which manufactures drugs (legal ones, of course). The company has set up the following hypotheses:
Ho: the production run of a drug is of satisfactory quality
H1: the production run of a drug is of poor quality and should not be sold
a) What are the Type I and Type II errors in this situation? Please DO NOT give general definitions of these types of errors — speak in the words of THIS example so that the organization’s CEO can understand the possible errors.
b) What are the ramifications (practical consequences) of making each type of error defined in part “a” above?
c) Based on your answers to parts “a” and “b” above, what value would you choose for alpha?
21) True or False: A null hypothesis must always include the equality sign.
22) True or False: A hypothesis test always contains the possibility of committing one of two types of errors called Type I and Type II errors.
23) True or False: The null hypothesis is rejected if the p-value (i.e., the probability of getting a test statistic at least as extreme as the observed value) is greater than the significance level.
24) The probability of committing a Type I error is called __________.
a) the p-value
b) the power of the test
c) the strength of the test
d) the level of significance
25) The null hypothesis in a hypothesis test contains a statement about the population parameter such as the population mean or population proportion. The decision to reject the null hypothesis will be made if the outcome of the study or the sample statistic such as sample mean or sample proportion is __________.
a) very close to the hypothesized value of the population parameter
b) equal to the hypothesized value of the population parameter
c) somewhat close to the hypothesized value of the population parameter
d) significantly different from the hypothesized value of the population parameter
26) In a hypothesis test, the test statistic computed from the sample data is considered extreme or significant if it is __________.
a) very large
b) very small
c) highly unlikely to occur due to chance
d) highly likely to occur due to chance
27) In hypothesis testing the statistical conclusion is either to reject or not to reject the null hypothesis based on the sample data. Which of the following statements is true?
a) The conclusion is always free from error if the correct procedure is followed.
b) The conclusion will never reject a true null hypothesis.
c) The conclusion will always reject a false null hypothesis.
d) The conclusion is always subject to error because it is based on sample data.
28) The dean of a business school claims that the average starting salary of its graduates is more than 60 (in $000’s). It is known that the population standard deviation is 10 (in $000’s). Sample data on the starting salaries of 64 randomly selected recent graduates yielded a mean of 62 (in $000s). Which of the following sets of hypotheses is correct?
a) H0: µ = 60 and H1: µ ≠ 60
b) H0: µ = 60 and H1: µ < 60
c) H0: µ = 60 and H1: µ > 60
d) H0: µ > 60 and H1: µ < 60
29) If the calculated test statistic falls in the rejection region, the statistical action is to __________.
a) reject the null hypothesis
b) reject the alternate hypothesis
c) reject both the hypotheses
d) accept both the hypotheses
30) The probability of getting a test statistic as extreme as the observed test statistic computed from the sample data under the assumption that the null hypothesis is true is called the __________.
a) critical value
b) extreme value
c) significance level
31) In a one-sided test of the hypotheses about a population proportion, the p-value was found to be 0.045. The null hypothesis in this test would __________.
a) be rejected if the significance level is = 0.05
b) not be rejected if the significance level is = 0.05
c) be rejected if the significance level is = 0.01
d) be rejected if the significance level is = 0.04
32) A researcher interviewed 2,067 people and asked whether they were the primary decision makers in the household about whether to buy a new car last year. Their gender was also recorded. Use these data to determine whether gender is independent of being the primary decision maker in purchasing a car last year.
Primary decision Maker?
33) A study by Market Facts/TeleNation for Personnel Decisions International (PDI) found that the average workweek is getting longer for U.S. full-time workers. Forty-three percent of the responding workers in the survey cited “more work, more business” as the number one reason for this increase in workweek. Suppose you want to test this figure in California to determine whether California workers feel the same way. A random sample of 315 California full-time workers whose workweek has been getting longer is chosen. They are offered a selection of possible reasons for this increase and 120 pick “more work, more business.” Does the 43% U.S. figure for this reason hold true in California? NOTE: please do this example in two ways… first with chapter 9 material and then with chi-square in chapter 10. Note the similarities!
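The two approaches in problem 33 can be compared side by side with a short sketch. It assumes a two-tailed z test for one proportion and a two-category chi-square goodness-of-fit test. The problem gives no significance level, so the sketch stops at the test statistics and the p-value.

```python
from math import sqrt
from statistics import NormalDist

n = 315             # California workers sampled
x = 120             # number who picked "more work, more business"
p_0 = 0.43          # U.S. figure being tested

# Chapter 9 approach: z test for a single proportion.
p_hat = x / n
z_stat = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))   # two-tailed

# Chapter 10 approach: chi-square goodness of fit with two categories.
observed = [x, n - x]
expected = [n * p_0, n * (1 - p_0)]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"z = {z_stat:.4f}, z squared = {z_stat ** 2:.4f}")
print(f"chi-square (df = 1) = {chi_sq:.4f}")
print(f"two-tailed p-value from z = {p_value:.4f}")
```

The similarity to look for: with two categories, the chi-square statistic equals the square of the z statistic, so both tests lead to the same decision.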
34) True or False: The chi-square distribution is continuous and has a range between minus infinity and plus infinity.
35) True or False: Chi-square goodness-of-fit tests are always one-tailed tests.
36) True or False: Chi-square test of independence is not useful for analyzing nominal data.
37) True or False: In a chi-square test, if there are 8 rows and 4 columns, the number of degrees of freedom for the chi-square statistic would be 32.
38) Consider a chi-square goodness-of-fit test in which k = the number of categories and the parameters of the expected distribution (i.e., the distribution used to determine the expected frequencies) are given. The test statistic, χ², is computed by comparing the __________.
a) observed mean to the expected mean
b) observed proportion to the expected proportion
c) observed frequencies to the expected frequencies
d) observed median to the expected median
39) If there is no relationship between two variables, e.g., the brand of soft drink preferred by a customer and the age of the customer, the two variables are __________.
c) mutually exclusive
40) The contingency table in a chi-square test of independence between two variables has four rows and six columns representing the number of categories for each variable. The number of degrees of freedom associated with the test statistic, χ², is __________.
a) 10
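For degrees-of-freedom questions like this one, a small helper can confirm the arithmetic. It assumes the standard rule for a test of independence, df = (rows − 1)(columns − 1); the function name is illustrative.

```python
def chi_square_df(rows: int, cols: int) -> int:
    """Degrees of freedom for a chi-square test of independence."""
    return (rows - 1) * (cols - 1)

print(chi_square_df(4, 6))   # four rows, six columns
print(chi_square_df(8, 4))   # eight rows, four columns
```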
41) A company wants to study the relationship between an employee’s length of employment and their number of workdays absent. Specifically, the company wants to determine if absenteeism can be predicted if they know the length of employment. The consultant hired to help with the study collected the following information on a random sample of seven employees.
Number of workdays absent
Length of employment (years)
a) In this problem, which variable is the dependent variable (Y)?
b) What is the correlation? Interpret its meaning. NOTE: you don’t have to do any calculations…just run the analysis, locate the appropriate figure on the printout, and interpret in practical terms.
d) Suppose you know that a person has worked at the company for 7 years. How many workdays would you predict he/she would be absent?
e) Interpret in practical terms the meaning of the slope in this example.
f) Is there a statistically significant negative correlation between number of years of employment and number of days a person is absent? Test at the 5% level of significance.
42) True or False: If the coefficient of correlation between two numerical variables is -1, it means that the two variables are not related.
43) True or False: The process of constructing a mathematical model to predict or determine one variable by another is called a regression analysis.
44) True or False: In a simple regression analysis the predictor variable is called the independent or explanatory variable.
45) True or False: Usually, the first step in regression analysis is to construct a scatter plot.
46) True or False: The numerical value of the coefficient of determination r² varies from –1 to +1.
47) True or False: If a regression model has an r² of 0.20, it means that about 20% of the variation in the dependent variable is explained by the regression model.
48) A numerical measure that indicates the strength of relationship between matched observations of two variables is __________.
a) the median
b) the standard deviation
c) the mean
d) the correlation coefficient
49) In the equation y = 1.57 + 0.0407 x, which represents a straight line, 0.0407 is the _____.
b) y-intercept of the line
c) slope of the line
d) x-intercept of the line
50) Which of the following equations correctly represents the equation of the regression line for sample data to predict y using x?
a) y = β0 + β1 * x
b) y = β0 + β1* x + ε
c) ŷ = b0 + b1 * x
d) ŷ = b0 + b1 / x
51) A hospital administrator developed a regression line, y = 30 + 2x, to predict y = the number of full-time employees (FTE) needed using x = the number of beds. The slope of this regression line suggests that for a unit increase in beds: __________.
a) the number of FTEs is predicted to increase by 32
b) the number of FTEs is predicted to decrease by 32
c) the number of FTEs is predicted to increase by 2
d) the number of FTEs is predicted to decrease by 2
52) The following regression model was fitted to sample data with 12 observations: Ŷ = 30 + 4.50x. What is the predicted value of y for a given value of x = 6?
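Predictions from a fitted line are a direct substitution, as this short sketch for problem 52 shows. The function name and default arguments are illustrative; the coefficients are the ones given in the problem.

```python
def predict(x: float, b0: float = 30.0, b1: float = 4.50) -> float:
    """Predicted y from the fitted line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

print(predict(6))   # 57.0
```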
53) The proportion of the variability in the dependent variable (y) accounted for or explained by the independent variable (x) is called the __________.
a) correlation coefficient
b) regression slope coefficient
c) regression sum of squares
d) coefficient of determination