ISBN-13: 9781119830870 / English / Hardcover / 2023
Preface

1 Introduction
  1.1 What Is Regression Analysis?
  1.2 Publicly Available Data Sets
  1.3 Selected Applications of Regression Analysis
    1.3.1 Agricultural Sciences
    1.3.2 Industrial and Labor Relations
    1.3.3 Government
    1.3.4 History
    1.3.5 Environmental Sciences
    1.3.6 Industrial Production
    1.3.7 The Space Shuttle Challenger
    1.3.8 Cost of Health Care
  1.4 Steps in Regression Analysis
    1.4.1 Statement of the Problem
    1.4.2 Selection of Potentially Relevant Variables
    1.4.3 Data Collection
    1.4.4 Model Specification
    1.4.5 Method of Fitting
    1.4.6 Model Fitting
    1.4.7 Model Criticism and Selection
    1.4.8 Objectives of Regression Analysis
  1.5 Scope and Organization of the Book

2 A Brief Introduction to R
  2.1 What Is R and RStudio?
  2.2 Installing R and RStudio
  2.3 Getting Started With R
    2.3.1 Command Level Prompt
    2.3.2 Calculations Using R
    2.3.3 Editing Your R Code
    2.3.4 Best Practice: Object Names in R
  2.4 Data Values and Objects in R
    2.4.1 Types of Data Values in R
    2.4.2 Types (Structures) of Objects in R
    2.4.3 Object Attributes
    2.4.4 Testing (Checking) Object Type
    2.4.5 Changing Object Type
  2.5 R Packages (Libraries)
    2.5.1 Installing R Packages
    2.5.2 Name Spaces
    2.5.3 Updating R
    2.5.4 Datasets in R Packages
  2.6 Importing (Reading) Data into R Workspace
    2.6.1 Best Practice: Working Directory
    2.6.2 Reading ASCII (Text) Files
    2.6.3 Reading CSV Files
    2.6.4 Reading Excel Files
    2.6.5 Reading Files from the Internet
  2.7 Writing (Exporting) Data to Files
    2.7.1 Diverting Normal R Output to a File
    2.7.2 Saving Graphs in Files
    2.7.3 Exporting Data to Files
  2.8 Some Arithmetic and Other Operators
    2.8.1 Vectors
    2.8.2 Matrix Computations
  2.9 Programming in R
    2.9.1 Best Practice: Script Files
    2.9.2 Some Useful Commands or Functions
    2.9.3 Conditional Execution
    2.9.4 Loops
    2.9.5 Functions and Functionals
    2.9.6 User Defined Functions
  2.10 Bibliographic Notes

3 Simple Linear Regression
  3.1 Introduction
  3.2 Covariance and Correlation Coefficient
  3.3 Example: Computer Repair Data
  3.4 The Simple Linear Regression Model
  3.5 Parameter Estimation
  3.6 Tests of Hypotheses
  3.7 Confidence Intervals
  3.8 Predictions
  3.9 Measuring the Quality of Fit
  3.10 Regression Line Through the Origin
  3.11 Trivial Regression Models
  3.12 Bibliographic Notes

4 Multiple Linear Regression
  4.1 Introduction
  4.2 Description of the Data and Model
  4.3 Example: Supervisor Performance Data
  4.4 Parameter Estimation
  4.5 Interpretations of Regression Coefficients
  4.6 Centering and Scaling
    4.6.1 Centering and Scaling in Intercept Models
    4.6.2 Scaling in No-Intercept Models
  4.7 Properties of the Least Squares Estimators
  4.8 Multiple Correlation Coefficient
  4.9 Inference for Individual Regression Coefficients
  4.10 Tests of Hypotheses in a Linear Model
    4.10.1 Testing All Regression Coefficients Equal to Zero
    4.10.2 Testing a Subset of Regression Coefficients Equal to
    4.10.3 Testing the Equality of Regression Coefficients
    4.10.4 Estimating and Testing of Regression Parameters
  4.11 Predictions
  4.12 Summary

5 Regression Diagnostics: Detection of Model Violations
  5.1 Introduction
  5.2 The Standard Regression Assumptions
  5.3 Various Types of Residuals
  5.4 Graphical Methods
  5.5 Graphs Before Fitting a Model
    5.5.1 One-Dimensional Graphs
    5.5.2 Two-Dimensional Graphs
    5.5.3 Rotating Plots
    5.5.4 Dynamic Graphs
  5.6 Graphs After Fitting a Model
  5.7 Checking Linearity and Normality Assumptions
  5.8 Leverage, Influence, and Outliers
    5.8.1 Outliers in the Response Variable
    5.8.2 Outliers in the Predictors
    5.8.3 Masking and Swamping Problems
  5.9 Measures of Influence
    5.9.1 Cook's Distance
    5.9.2 Welsch and Kuh Measure
    5.9.3 Hadi's Influence Measure
  5.10 The Potential-Residual Plot
  5.11 Regression Diagnostics in R
  5.12 What to Do with the Outliers?
  5.13 Role of Variables in a Regression Equation
    5.13.1 Added-Variable Plot
    5.13.2 Residual Plus Component Plot
  5.14 Effects of an Additional Predictor
  5.15 Robust Regression

6 Qualitative Variables as Predictors
  6.1 Introduction
  6.2 Salary Survey Data
  6.3 Interaction Variables
  6.4 Systems of Regression Equations
    6.4.1 Models with Different Slopes and Different Intercepts
    6.4.2 Models with Same Slope and Different Intercepts
    6.4.3 Models with Same Intercept and Different Slopes
  6.5 Other Applications of Indicator Variables
  6.6 Seasonality
  6.7 Stability of Regression Parameters Over Time

7 Transformation of Variables
  7.1 Introduction
  7.2 Transformations to Achieve Linearity
  7.3 Bacteria Deaths Due to X-Ray Radiation
    7.3.1 Inadequacy of a Linear Model
    7.3.2 Logarithmic Transformation for Achieving Linearity
  7.4 Transformations to Stabilize Variance
  7.5 Detection of Heteroscedastic Errors
  7.6 Removal of Heteroscedasticity
  7.7 Weighted Least Squares
  7.8 Logarithmic Transformation of Data
  7.9 Power Transformation
  7.10 Summary

8 Weighted Least Squares
  8.1 Introduction
  8.2 Heteroscedastic Models
    8.2.1 Supervisors Data
    8.2.2 College Expense Data
  8.3 Two-Stage Estimation
  8.4 Education Expenditure Data
  8.5 Fitting a Dose-Response Relationship Curve

9 The Problem of Correlated Errors
  9.1 Introduction: Autocorrelation
  9.2 Consumer Expenditure and Money Stock
  9.3 Durbin-Watson Statistic
  9.4 Removal of Autocorrelation by Transformation
  9.5 Iterative Estimation with Autocorrelated Errors
  9.6 Autocorrelation and Missing Variables
  9.7 Analysis of Housing Starts
  9.8 Limitations of the Durbin-Watson Statistic
  9.9 Indicator Variables to Remove Seasonality
  9.10 Regressing Two Time Series

10 Analysis of Collinear Data
  10.1 Introduction
  10.2 Effects of Collinearity on Inference
  10.3 Effects of Collinearity on Forecasting
  10.4 Detection of Collinearity
    10.4.1 Simple Signs of Collinearity
    10.4.2 Variance Inflation Factors
    10.4.3 The Condition Indices

11 Working With Collinear Data
  11.1 Introduction
  11.2 Principal Components
  11.3 Computations Using Principal Components
  11.4 Imposing Constraints
  11.5 Searching for Linear Functions of the β's
  11.6 Biased Estimation of Regression Coefficients
  11.7 Principal Components Regression
  11.8 Reduction of Collinearity in the Estimation Data
  11.9 Constraints on the Regression Coefficients
  11.10 Principal Components Regression: A Caution
  11.11 Ridge Regression
  11.12 Estimation by the Ridge Method
  11.13 Ridge Regression: Some Remarks
  11.14 Summary
  11.15 Bibliographic Notes

12 Variable Selection Procedures
  12.1 Introduction
  12.2 Formulation of the Problem
  12.3 Consequences of Variables Deletion
  12.4 Uses of Regression Equations
    12.4.1 Description and Model Building
    12.4.2 Estimation and Prediction
    12.4.3 Control
  12.5 Criteria for Evaluating Equations
    12.5.1 Residual Mean Square
    12.5.2 Mallows Cp
    12.5.3 Information Criteria
  12.6 Collinearity and Variable Selection
  12.7 Evaluating All Possible Equations
  12.8 Variable Selection Procedures
    12.8.1 Forward Selection Procedure
    12.8.2 Backward Elimination Procedure
    12.8.3 Stepwise Method
  12.9 General Remarks on Variable Selection Methods
  12.10 A Study of Supervisor Performance
  12.11 Variable Selection with Collinear Data
  12.12 The Homicide Data
  12.13 Variable Selection Using Ridge Regression
  12.14 Selection of Variables in an Air Pollution Study
  12.15 A Possible Strategy for Fitting Regression Models
  12.16 Bibliographic Notes

13 Logistic Regression
  13.1 Introduction
  13.2 Modeling Qualitative Data
  13.3 The Logit Model
  13.4 Example: Estimating Probability of Bankruptcies
  13.5 Logistic Regression Diagnostics
  13.6 Determination of Variables to Retain
  13.7 Judging the Fit of a Logistic Regression
  13.8 The Multinomial Logit Model
    13.8.1 Multinomial Logistic Regression
    13.8.2 Example: Determining Chemical Diabetes
    13.8.3 Ordinal Logistic Regression
    13.8.4 Example: Determining Chemical Diabetes Revisited
  13.9 Classification Problem: Another Approach

14 Further Topics
  14.1 Introduction
  14.2 Generalized Linear Model
  14.3 Poisson Regression Model
  14.4 Introduction of New Drugs
  14.5 Robust Regression
  14.6 Fitting a Quadratic Model
  14.7 Distribution of PCB in U.S. Bays

Exercises
References
Index
Ali S. Hadi, PhD, Fellow ASA (1997), Member ISI (1998), Fellow AAS (2019), is Distinguished University Professor and former Chair of the Department of Mathematics and Actuarial Science at the American University in Cairo (AUC). He is also the Founder of the Actuarial Science Program at AUC (2004), the Founder of the Data Science Program at AUC (2019), and the former Vice Provost and Director of Graduate Studies and Research at AUC. Dr. Hadi is the author or co-author of four other books and numerous articles. For more information, see his website at www1.aucegypt.edu/faculty/hadi.