# Mathematics

Probability and Statistics

Third Edition

Murray R. Spiegel, PhD Former Professor and Chairman of Mathematics

Rensselaer Polytechnic Institute Hartford Graduate Center

John J. Schiller, PhD Associate Professor of Mathematics

Temple University

R. Alu Srinivasan, PhD Professor of Mathematics

Temple University

Schaum’s Outline Series

New York Chicago San Francisco Lisbon London Madrid Mexico City

Milan New Delhi San Juan Seoul Singapore Sydney Toronto

Copyright © 2009, 2000, 1975 by The McGraw-Hill Companies Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.

ISBN: 978-0-07-154426-9

MHID: 0-07-154426-7

The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-154425-2, MHID: 0-07-154425-9.

All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.

McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact a representative please e-mail us at bulksales@mcgraw-hill.com.

TERMS OF USE

This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.

THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.


Preface to the Third Edition

In the second edition of Probability and Statistics, which appeared in 2000, the guiding principle was to make changes in the first edition only where necessary to bring the work in line with the emphasis on topics in contemporary texts. In addition to refinements throughout the text, a chapter on nonparametric statistics was added to extend the applicability of the text without raising its level. This theme is continued in the third edition, in which the book has been reformatted and a chapter on Bayesian methods has been added. In recent years, the Bayesian paradigm has come to enjoy increased popularity and impact in such areas as economics, environmental science, medicine, and finance. Since Bayesian statistical analysis is highly computational, it is gaining even wider acceptance with advances in computer technology. We feel that an introduction to the basic principles of Bayesian data analysis is therefore in order and is consistent with Professor Murray R. Spiegel’s main purpose in writing the original text: to present a modern introduction to probability and statistics using a background of calculus.

J. SCHILLER R. A. SRINIVASAN

Preface to the Second Edition

The first edition of Schaum’s Probability and Statistics by Murray R. Spiegel appeared in 1975, and it has gone through 21 printings since then. Its close cousin, Schaum’s Statistics by the same author, was described as the clearest introduction to statistics in print by Gian-Carlo Rota in his book Indiscrete Thoughts. So it was with a degree of reverence and some caution that we undertook this revision. Our guiding principle was to make changes only where necessary to bring the text in line with the emphasis of topics in contemporary texts. The extensive treatment of sets, standard introductory material in texts of the 1960s and early 1970s, is considerably reduced. The definition of a continuous random variable is now the standard one, and more emphasis is placed on the cumulative distribution function since it is a more fundamental concept than the probability density function. Also, more emphasis is placed on the P values of hypothesis tests, since technology has made it possible to easily determine these values, which provide more specific information than whether or not tests meet a prespecified level of significance. Technology has also made it possible to eliminate logarithmic tables. A chapter on nonparametric statistics has been added to extend the applicability of the text without raising its level. Some problem sets have been trimmed, but mostly in cases that called for proofs of theorems for which no hints or help of any kind was given. Overall we believe that the main purpose of the first edition (to present a modern introduction to probability and statistics using a background of calculus) and the features that made the first edition such a great success have been preserved, and we hope that this edition can serve an even broader range of students.

J. SCHILLER R. A. SRINIVASAN

Preface to the First Edition

The important and fascinating subject of probability began in the seventeenth century through efforts of such mathematicians as Fermat and Pascal to answer questions concerning games of chance. It was not until the twentieth century that a rigorous mathematical theory based on axioms, definitions, and theorems was developed. As time progressed, probability theory found its way into many applications, not only in engineering, science, and mathematics but in fields ranging from actuarial science, agriculture, and business to medicine and psychology. In many instances the applications themselves contributed to the further development of the theory.

The subject of statistics originated much earlier than probability and dealt mainly with the collection, organization, and presentation of data in tables and charts. With the advent of probability it was realized that statistics could be used in drawing valid conclusions and making reasonable decisions on the basis of analysis of data, such as in sampling theory and prediction or forecasting.

The purpose of this book is to present a modern introduction to probability and statistics using a background of calculus. For convenience the book is divided into two parts. The first deals with probability (and by itself can be used to provide an introduction to the subject), while the second deals with statistics.

The book is designed to be used either as a textbook for a formal course in probability and statistics or as a comprehensive supplement to all current standard texts. It should also be of considerable value as a book of reference for research workers or to those interested in the field for self-study. The book can be used for a one-year course, or by a judicious choice of topics, a one-semester course.

I am grateful to the Literary Executor of the late Sir Ronald A. Fisher, F.R.S., to Dr. Frank Yates, F.R.S., and to Longman Group Ltd., London, for permission to use Table III from their book Statistical Tables for Biological, Agricultural and Medical Research (6th edition, 1974). I also wish to take this opportunity to thank David Beckwith for his outstanding editing and Nicola Monti for his able artwork.

M. R. SPIEGEL


Contents

Part I PROBABILITY

CHAPTER 1 Basic Probability

Random Experiments; Sample Spaces; Events; The Concept of Probability; The Axioms of Probability; Some Important Theorems on Probability; Assignment of Probabilities; Conditional Probability; Theorems on Conditional Probability; Independent Events; Bayes’ Theorem or Rule; Combinatorial Analysis; Fundamental Principle of Counting; Tree Diagrams; Permutations; Combinations; Binomial Coefficients; Stirling’s Approximation to n!

CHAPTER 2 Random Variables and Probability Distributions

Random Variables; Discrete Probability Distributions; Distribution Functions for Random Variables; Distribution Functions for Discrete Random Variables; Continuous Random Variables; Graphical Interpretations; Joint Distributions; Independent Random Variables; Change of Variables; Probability Distributions of Functions of Random Variables; Convolutions; Conditional Distributions; Applications to Geometric Probability

CHAPTER 3 Mathematical Expectation

Definition of Mathematical Expectation; Functions of Random Variables; Some Theorems on Expectation; The Variance and Standard Deviation; Some Theorems on Variance; Standardized Random Variables; Moments; Moment Generating Functions; Some Theorems on Moment Generating Functions; Characteristic Functions; Variance for Joint Distributions. Covariance; Correlation Coefficient; Conditional Expectation, Variance, and Moments; Chebyshev’s Inequality; Law of Large Numbers; Other Measures of Central Tendency; Percentiles; Other Measures of Dispersion; Skewness and Kurtosis

CHAPTER 4 Special Probability Distributions

The Binomial Distribution; Some Properties of the Binomial Distribution; The Law of Large Numbers for Bernoulli Trials; The Normal Distribution; Some Properties of the Normal Distribution; Relation Between Binomial and Normal Distributions; The Poisson Distribution; Some Properties of the Poisson Distribution; Relation Between the Binomial and Poisson Distributions; Relation Between the Poisson and Normal Distributions; The Central Limit Theorem; The Multinomial Distribution; The Hypergeometric Distribution; The Uniform Distribution; The Cauchy Distribution; The Gamma Distribution; The Beta Distribution; The Chi-Square Distribution; Student’s t Distribution; The F Distribution; Relationships Among Chi-Square, t, and F Distributions; The Bivariate Normal Distribution; Miscellaneous Distributions


Part II STATISTICS

CHAPTER 5 Sampling Theory

Population and Sample. Statistical Inference; Sampling With and Without Replacement; Random Samples. Random Numbers; Population Parameters; Sample Statistics; Sampling Distributions; The Sample Mean; Sampling Distribution of Means; Sampling Distribution of Proportions; Sampling Distribution of Differences and Sums; The Sample Variance; Sampling Distribution of Variances; Case Where Population Variance Is Unknown; Sampling Distribution of Ratios of Variances; Other Statistics; Frequency Distributions; Relative Frequency Distributions; Computation of Mean, Variance, and Moments for Grouped Data

CHAPTER 6 Estimation Theory

Unbiased Estimates and Efficient Estimates; Point Estimates and Interval Estimates. Reliability; Confidence Interval Estimates of Population Parameters; Confidence Intervals for Means; Confidence Intervals for Proportions; Confidence Intervals for Differences and Sums; Confidence Intervals for the Variance of a Normal Distribution; Confidence Intervals for Variance Ratios; Maximum Likelihood Estimates

CHAPTER 7 Tests of Hypotheses and Significance

Statistical Decisions; Statistical Hypotheses. Null Hypotheses; Tests of Hypotheses and Significance; Type I and Type II Errors; Level of Significance; Tests Involving the Normal Distribution; One-Tailed and Two-Tailed Tests; P Value; Special Tests of Significance for Large Samples; Special Tests of Significance for Small Samples; Relationship Between Estimation Theory and Hypothesis Testing; Operating Characteristic Curves. Power of a Test; Quality Control Charts; Fitting Theoretical Distributions to Sample Frequency Distributions; The Chi-Square Test for Goodness of Fit; Contingency Tables; Yates’ Correction for Continuity; Coefficient of Contingency

CHAPTER 8 Curve Fitting, Regression, and Correlation

Curve Fitting; Regression; The Method of Least Squares; The Least-Squares Line; The Least-Squares Line in Terms of Sample Variances and Covariance; The Least-Squares Parabola; Multiple Regression; Standard Error of Estimate; The Linear Correlation Coefficient; Generalized Correlation Coefficient; Rank Correlation; Probability Interpretation of Regression; Probability Interpretation of Correlation; Sampling Theory of Regression; Sampling Theory of Correlation; Correlation and Dependence

CHAPTER 9 Analysis of Variance

The Purpose of Analysis of Variance; One-Way Classification or One-Factor Experiments; Total Variation. Variation Within Treatments. Variation Between Treatments; Shortcut Methods for Obtaining Variations; Linear Mathematical Model for Analysis of Variance; Expected Values of the Variations; Distributions of the Variations; The F Test for the Null Hypothesis of Equal Means; Analysis of Variance Tables; Modifications for Unequal Numbers of Observations; Two-Way Classification or Two-Factor Experiments; Notation for Two-Factor Experiments; Variations for Two-Factor Experiments; Analysis of Variance for Two-Factor Experiments; Two-Factor Experiments with Replication; Experimental Design


CHAPTER 10 Nonparametric Tests

Introduction; The Sign Test; The Mann–Whitney U Test; The Kruskal–Wallis H Test; The H Test Corrected for Ties; The Runs Test for Randomness; Further Applications of the Runs Test; Spearman’s Rank Correlation

CHAPTER 11 Bayesian Methods

Subjective Probability; Prior and Posterior Distributions; Sampling From a Binomial Population; Sampling From a Poisson Population; Sampling From a Normal Population with Known Variance; Improper Prior Distributions; Conjugate Prior Distributions; Bayesian Point Estimation; Bayesian Interval Estimation; Bayesian Hypothesis Tests; Bayes Factors; Bayesian Predictive Distributions

APPENDIX A Mathematical Topics

Special Sums; Euler’s Formulas; The Gamma Function; The Beta Function; Special Integrals

APPENDIX B Ordinates y of the Standard Normal Curve at z

APPENDIX C Areas under the Standard Normal Curve from 0 to z

APPENDIX D Percentile Values $t_p$ for Student’s t Distribution with $\nu$ Degrees of Freedom

APPENDIX E Percentile Values $\chi^2_p$ for the Chi-Square Distribution with $\nu$ Degrees of Freedom

APPENDIX F 95th and 99th Percentile Values for the F Distribution with $\nu_1$, $\nu_2$ Degrees of Freedom

APPENDIX G Values of $e^{-\lambda}$

APPENDIX H Random Numbers

SUBJECT INDEX

INDEX FOR SOLVED PROBLEMS


PART I

Probability


CHAPTER 1

Basic Probability

Random Experiments

We are all familiar with the importance of experiments in science and engineering. Experimentation is useful to us because we can assume that if we perform certain experiments under very nearly identical conditions, we will arrive at results that are essentially the same. In these circumstances, we are able to control the value of the variables that affect the outcome of the experiment.

However, in some experiments, we are not able to ascertain or control the value of certain variables so that the results will vary from one performance of the experiment to the next even though most of the conditions are the same. These experiments are described as random. The following are some examples.

EXAMPLE 1.1 If we toss a coin, the result of the experiment is that it will either come up “tails,” symbolized by T (or 0), or “heads,” symbolized by H (or 1), i.e., one of the elements of the set {H, T} (or {0, 1}).

EXAMPLE 1.2 If we toss a die, the result of the experiment is that it will come up with one of the numbers in the set {1, 2, 3, 4, 5, 6}.

EXAMPLE 1.3 If we toss a coin twice, there are four results possible, as indicated by {HH, HT, TH, TT}, i.e., both heads, heads on first and tails on second, etc.

EXAMPLE 1.4 If we are making bolts with a machine, the result of the experiment is that some may be defective. Thus when a bolt is made, it will be a member of the set {defective, nondefective}.

EXAMPLE 1.5 If an experiment consists of measuring “lifetimes” of electric light bulbs produced by a company, then the result of the experiment is a time t in hours that lies in some interval, say $0 \le t \le 4000$, where we assume that no bulb lasts more than 4000 hours.
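The discrete experiments in Examples 1.1 to 1.3 can be imitated on a computer. The following is only a minimal sketch using Python's standard `random` module. The names `perform` and `toss_coin_twice` and the fixed seed are illustrative assumptions, and `rng.choice` treats every outcome as equally likely, which the text has not yet assumed.

```python
import random

COIN = ("H", "T")          # possible results in Example 1.1
DIE = (1, 2, 3, 4, 5, 6)   # possible results in Example 1.2


def perform(outcomes, rng):
    """Perform the experiment once and return the observed result.

    Assumption: each outcome is equally likely.
    """
    return rng.choice(outcomes)


def toss_coin_twice(rng):
    """Example 1.3: two tosses reported as a two-letter string such as 'HT'."""
    return perform(COIN, rng) + perform(COIN, rng)


rng = random.Random(0)  # seed fixed only so that a run can be repeated

coin_results = [perform(COIN, rng) for _ in range(8)]
die_results = [perform(DIE, rng) for _ in range(8)]
double_results = [toss_coin_twice(rng) for _ in range(8)]

# The conditions are the same on every performance, yet the results vary.
print(coin_results)
print(die_results)
print(double_results)
```

Every printed result belongs to the corresponding set of possible results, but which element appears on a given performance cannot be controlled.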

Sample Spaces

A set S that consists of all possible outcomes of a random experiment is called a sample space, and each outcome is called a sample point. Often there will be more than one sample space that can describe outcomes of an experiment, but there is usually only one that will provide the most information.

EXAMPLE 1.6 If we toss a die, one sample space, or set of all possible outcomes, is given by {1, 2, 3, 4, 5, 6} while another is {odd, even}. It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.
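The point of Example 1.6 can be checked mechanically. In the sketch below, the names `fine_space`, `parity`, and `determined` are hypothetical and chosen only for illustration. The code groups the six faces by their {odd, even} label and asks whether the label alone settles the question "divisible by 3?".

```python
fine_space = {1, 2, 3, 4, 5, 6}


def parity(outcome):
    """Map a face of the die to its label in the coarser sample space."""
    return "odd" if outcome % 2 else "even"


coarse_space = {parity(o) for o in fine_space}

# Faces of the die that share each coarse label.
by_label = {
    label: sorted(o for o in fine_space if parity(o) == label)
    for label in coarse_space
}

# Answers to "divisible by 3?" that occur among the faces with each label.
answers = {
    label: {o % 3 == 0 for o in faces}
    for label, faces in by_label.items()
}

# The coarse space is adequate only if each label gives a single answer.
determined = all(len(a) == 1 for a in answers.values())

print(by_label)
print(answers)
print(determined)
```

Both labels contain a face that is divisible by 3 and faces that are not, so `determined` is `False`: knowing only "odd" or "even" does not tell us whether the outcome is divisible by 3.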

It is often useful to portray a sample space graphically. In such cases it is desirable to use numbers in place of letters whenever possible.

EXAMPLE 1.7 If we toss a coin twice and use 0 to represent tails and 1 to represent heads, the sample space (see Example 1.3) can be portrayed by points as in Fig. 1-1 where, for example, (0, 1) represents tails on first toss and heads on second toss, i.e., TH.
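The numerical coding of Example 1.7 is easy to generate. The sketch below uses `itertools.product` from the Python standard library to list the four sample points as pairs. The names `LETTER`, `sample_space`, and `labels` are illustrative assumptions.

```python
from itertools import product

LETTER = {0: "T", 1: "H"}  # coding used in Example 1.7: 0 is tails, 1 is heads

# All ordered pairs (first toss, second toss).
sample_space = list(product((0, 1), repeat=2))

# Translate each point back to the letter notation of Example 1.3.
labels = {point: "".join(LETTER[x] for x in point) for point in sample_space}

for point in sample_space:
    print(point, labels[point])
```

Each pair can be plotted as a point in the plane, with the first coordinate for the first toss and the second coordinate for the second toss.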



If a sample space has a finite number of points, as in Example 1.7, it is called a finite sample space. If it has as many points as there are natural numbers 1, 2, 3, ..., it is called a countably infinite sample space. If it has as many points as there are in some interval on the x axis, such as $0 \le x \le 1$, it is called a noncountably infinite sample space. A sample space that is finite or countably infinite is often called a discrete sample space, while one that is noncountably infinite is called a nondiscrete sample space.

Events

An event is a subset A of the sample space S, i.e., it is a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event A has occurred. An event consisting of a single point of S is often called a simple or elementary event.

EXAMPLE 1.8 If we toss a coin twice, the event that only one head comes up is the subset of the sample space that consists of points (0, 1) and (1, 0), as indicated in Fig. 1-2.
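Because an event is a subset of the sample space, it can be modeled directly with a Python set. The following sketch rebuilds the event of Example 1.8. The names `S`, `A`, `occurred`, and `elementary_events` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Sample space for two coin tosses, coded with 0 for tails and 1 for heads.
S = set(product((0, 1), repeat=2))

# Event A: only one head comes up, i.e., the coordinates sum to 1.
A = {point for point in S if sum(point) == 1}


def occurred(event, outcome):
    """The event has occurred if the observed outcome is one of its elements."""
    return outcome in event


# Simple (elementary) events: the subsets of S that contain a single point.
elementary_events = [{point} for point in sorted(S)]

print(sorted(A))
print(A <= S)               # an event is a subset of the sample space
print(occurred(A, (0, 1)))  # outcome TH
print(occurred(A, (1, 1)))  # outcome HH
```

Here `A` contains exactly the points (0, 1) and (1, 0), so it occurs when the outcome is TH and does not occur when the outcome is HH.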
