Preface xiii
Acknowledgments xv
Abbreviations xvii
About the companion website xxi

1 Introduction 1
1.1 Supervised versus unsupervised learning 2
1.2 Parametric versus nonparametric models 3
1.3 Types of data 4
1.4 Overview of parametric predictive analytics 5

2 Simple linear regression and correlation 7
2.1 Fitting a straight line 9
2.1.1 Least squares (LS) method 9
2.1.2 Linearizing transformations 11
2.1.3 Fitted values and residuals 13
2.1.4 Assessing goodness of fit 14
2.2 Statistical inferences for simple linear regression 17
2.2.1 Simple linear regression model 17
2.2.2 Inferences on β0 and β1 18
2.2.3 Analysis of variance for simple linear regression 19
2.2.4 Pure error versus model error 20
2.2.5 Prediction of future observations 21
2.3 Correlation analysis 24
2.3.1 Bivariate normal distribution 26
2.3.2 Inferences on correlation coefficient 27
2.4 Modern extensions 28
2.5 Technical notes 29
2.5.1 Derivation of the LS estimators 29
2.5.2 Sums of squares 30
2.5.3 Distribution of the LS estimators 30
2.5.4 Prediction interval 32
Exercises 32

3 Multiple linear regression: basics 37
3.1 Multiple linear regression model 39
3.1.1 Model in scalar notation 39
3.1.2 Model in matrix notation 40
3.2 Fitting a multiple regression model 41
3.2.1 Least squares (LS) method 41
3.2.2 Interpretation of regression coefficients 45
3.2.3 Fitted values and residuals 45
3.2.4 Measures of goodness of fit 47
3.2.5 Linearizing transformations 48
3.3 Statistical inferences for multiple regression 49
3.3.1 Analysis of variance for multiple regression 49
3.3.2 Inferences on regression coefficients 51
3.3.3 Confidence ellipsoid for the β vector 52
3.3.4 Extra sum of squares method 54
3.3.5 Prediction of future observations 59
3.4 Weighted and generalized least squares 60
3.4.1 Weighted least squares 60
3.4.2 Generalized least squares 62
3.4.3 Statistical inference on GLS estimator 63
3.5 Partial correlation coefficients 63
3.5.1 Test of significance of partial correlation coefficient 65
3.6 Special topics 66
3.6.1 Dummy variables 66
3.6.2 Interactions 69
3.6.3 Standardized regression 74
3.7 Modern extensions 75
3.7.1 Regression trees 76
3.7.2 Neural nets 78
3.8 Technical notes 81
3.8.1 Derivation of the LS estimators 81
3.8.2 Distribution of the LS estimators 81
3.8.3 Gauss-Markov theorem 82
3.8.4 Properties of fitted values and residuals 83
3.8.5 Geometric interpretation of least squares 83
3.8.6 Confidence ellipsoid for β 85
3.8.7 Population partial correlation coefficient 85
Exercises 86

4 Multiple linear regression: model diagnostics 95
4.1 Model assumptions and distribution of residuals 95
4.2 Checking normality 96
4.3 Checking homoscedasticity 98
4.3.1 Variance stabilizing transformations 99
4.3.2 Box-Cox transformation 100
4.4 Detecting outliers 103
4.5 Checking model misspecification 106
4.6 Checking independence 108
4.6.1 Runs test 109
4.6.2 Durbin-Watson test 109
4.7 Checking influential observations 110
4.7.1 Leverage 111
4.7.2 Cook's distance 111
4.8 Checking multicollinearity 114
4.8.1 Multicollinearity: causes and consequences 114
4.8.2 Multicollinearity diagnostics 115
Exercises 119

5 Multiple linear regression: shrinkage and dimension reduction methods 127
5.1 Ridge regression 128
5.1.1 Ridge problem 128
5.1.2 Choice of λ 129
5.2 Lasso regression 132
5.2.1 Lasso problem 132
5.3 Principal components analysis and regression 135
5.3.1 Principal components analysis (PCA) 135
5.3.2 Principal components regression (PCR) 142
5.4 Partial least squares (PLS) 146
5.4.1 PLS1 algorithm 147
5.5 Technical notes 154
5.5.1 Properties of ridge estimator 154
5.5.2 Derivation of principal components 155
Exercises 156

6 Multiple linear regression: variable selection and model building 159
6.1 Best subset selection 160
6.1.1 Model selection criteria 160
6.2 Stepwise regression 165
6.3 Model building 174
6.4 Technical notes 175
6.4.1 Derivation of the Cp statistic 175
Exercises 177

7 Logistic regression and classification 181
7.1 Simple logistic regression 183
7.1.1 Model 183
7.1.2 Parameter estimation 185
7.1.3 Inferences on parameters 189
7.2 Multiple logistic regression 190
7.2.1 Model and inference 190
7.3 Likelihood ratio (LR) test 194
7.3.1 Deviance 195
7.3.2 Akaike information criterion (AIC) 197
7.3.3 Model selection and diagnostics 197
7.4 Binary classification using logistic regression 201
7.4.1 Measures of correct classification 201
7.4.2 Receiver operating characteristic (ROC) curve 204
7.5 Polytomous logistic regression 207
7.5.1 Nominal logistic regression 208
7.5.2 Ordinal logistic regression 212
7.6 Modern extensions 215
7.6.1 Classification trees 215
7.6.2 Support vector machines 218
7.7 Technical notes 222
Exercises 224

8 Discriminant analysis 233
8.1 Linear discriminant analysis based on Mahalanobis distance 234
8.1.1 Mahalanobis distance 234
8.1.2 Bayesian classification 235
8.2 Fisher's linear discriminant function 239
8.2.1 Two groups 239
8.2.2 Multiple groups 241
8.3 Naive Bayes 243
8.4 Technical notes 244
8.4.1 Calculation of pooled sample covariance matrix 244
8.4.2 Derivation of Fisher's linear discriminant functions 245
8.4.3 Bayes rule 247
Exercises 247

9 Generalized linear models 251
9.1 Exponential family and link function 251
9.1.1 Exponential family 251
9.1.2 Link function 254
9.2 Estimation of parameters of GLM 255
9.2.1 Maximum likelihood estimation 255
9.2.2 Iteratively reweighted least squares (IRWLS) algorithm 256
9.3 Deviance and AIC 258
9.4 Poisson regression 263
9.4.1 Poisson regression for rates 266
9.5 Gamma regression 269
9.6 Technical notes 273
9.6.1 Mean and variance of the exponential family of distributions 273
9.6.2 MLE of β and its evaluation using the IRWLS algorithm 274
Exercises 277

10 Survival analysis 281
10.1 Hazard rate and survival distribution 282
10.2 Kaplan-Meier estimator 283
10.3 Logrank test 286
10.4 Cox's proportional hazards model 289
10.4.1 Estimation 290
10.4.2 Examples 291
10.4.3 Time-dependent covariates 295
10.5 Technical notes 300
10.5.1 ML estimation of the Cox proportional hazards model 300
Exercises 301

Appendix A Primer on matrix algebra and multivariate distributions 305
A.1 Review of matrix algebra 305
A.2 Review of multivariate distributions 307
A.3 Multivariate normal distribution 309

Appendix B Primer on maximum likelihood estimation 311
B.1 Maximum likelihood estimation 311
B.2 Large sample inference on MLEs 313
B.3 Newton-Raphson and Fisher scoring algorithms 315
B.4 Technical notes 317

Appendix C Projects 319
C.1 Project 1 321
C.2 Project 2 322
C.3 Project 3 324

Appendix D Statistical tables 327

References 339
Answers to selected exercises 343
Index 355
Ajit C. Tamhane, PhD, is Professor of Industrial Engineering & Management Sciences, with a courtesy appointment in Statistics, at Northwestern University. He is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the American Association for the Advancement of Science, and an elected member of the International Statistical Institute.