ISBN-13: 9781119953241 / English / Hardcover / 2014 / 592 pp.
"the book is outstandingly comprehensive and informative, well written, and clear. If the book is adopted for courses in Statistics for not only students in applied fields, but also for students in Statistics, it will provide them with an excellent up-to-date knowledge of the entire spectrum of correspondence analysis. I would also like to recommend the book very strongly to most researchers including seasoned researchers in data analysis, for the book will undoubtedly fill in the gap of their knowledge about versatile correspondence analysis. I learned a lot, reading the book." (Psychometrika 2016)
Foreword
Preface
Part One Introduction
1 Data Visualisation
1.1 A Very Brief Introduction to Data Visualisation
1.1.1 A Very Brief History
1.1.2 Introduction to Visualisation Tools for Numerical Data
1.1.3 Introduction to Visualisation Tools for Univariate Categorical Data
1.2 Data Visualisation for Contingency Tables
1.2.1 Fourfold Displays
1.3 Other Plots
1.4 Studying Exposure to Asbestos
1.4.1 Asbestos and Irving J. Selikoff
1.4.2 Selikoff's Data
1.4.3 Numerical Analysis of Selikoff's Data
1.4.4 A Graphical Analysis of Selikoff's Data
1.4.5 Classical Correspondence Analysis of Selikoff's Data
1.4.6 Other Methods of Graphical Analysis
1.5 Happiness Data
1.6 Correspondence Analysis Now
1.6.1 A Bibliographic Taste
1.6.2 The Increasing Popularity of Correspondence Analysis
1.6.3 The Growth of the Correspondence Analysis Family Tree
1.7 Overview of the Book
1.8 R Code
References
2 Pearson's Chi-Squared Statistic
2.1 Introduction
2.2 Pearson's Chi-Squared Statistic
2.2.1 Notation
2.2.2 Measuring the Departure from Independence
2.2.3 Pearson's Chi-Squared Statistic
2.2.4 Other χ² Measures of Association
2.2.5 The Power Divergence Statistic
2.2.6 Dealing with the Sample Size
2.3 The Goodman-Kruskal Tau Index
2.3.1 Other Measures and Issues
2.4 The 2 × 2 Contingency Table
2.4.1 Yates' Continuity Correction
2.5 Early Contingency Tables
2.5.1 The Impact of Adolph Quetelet
2.5.2 Gavarret's (1840) Legitimate Children Data
2.5.3 Finley's (1884) Tornado Data
2.5.4 Galton's (1892) Fingerprint Data
2.5.5 Final Comments
2.6 R Code
2.6.1 Expectation and Variance of the Pearson Chi-Squared Statistic
2.6.2 Pearson's Chi-Squared Test of Independence
2.6.3 The Cressie-Read Statistic
References
Part Two Correspondence Analysis of Two-Way Contingency Tables
3 Methods of Decomposition
3.1 Introduction
3.2 Reducing Multidimensional Space
3.3 Profiles and Cloud of Points
3.4 Property of Distributional Equivalence
3.5 The Triplet and Classical Reciprocal Averaging
3.5.1 One-Dimensional Reciprocal Averaging
3.5.2 Matrix Form of One-Dimensional Reciprocal Averaging
3.5.3 -Dimensional Reciprocal Averaging
3.5.4 Some Historical Comments
3.6 Solving the Triplet Using Eigen-Decomposition
3.6.1 The Decomposition
3.6.2 Example
3.7 Solving the Triplet Using Singular Value Decomposition
3.7.1 The Standard Decomposition
3.7.2 The Generalised Decomposition
3.8 The Generalised Triplet and Reciprocal Averaging
3.9 Solving the Generalised Triplet Using Gram-Schmidt Process
3.9.1 Ordered Categorical Variables and a priori Scores
3.9.2 On Finding Orthogonalised Vectors
3.9.3 A Recurrence Formulae Approach
3.9.4 Changing the Basis Vector
3.9.5 Generalised Correlations
3.10 Bivariate Moment Decomposition
3.11 Hybrid Decomposition
3.11.1 An Alternative Singly Ordered Approach
3.12 R Code
3.12.1 Eigen-Decomposition in R
3.12.2 Singular Value Decomposition in R
3.12.3 Singular Value Decomposition for Matrix Approximation
3.12.4 Generating Emerson's Polynomials
3.13 A Preliminary Graphical Summary
3.14 Analysis of Analgesic Drugs
References
4 Simple Correspondence Analysis
4.1 Introduction
4.2 Notation
4.3 Measuring Departures from Complete Independence
4.3.1 The Duplication Constant
4.3.2 Pearson Ratios
4.4 Decomposing the Pearson Ratio
4.5 Coordinate Systems
4.5.1 Standard Coordinates
4.5.2 Principal Coordinates
4.5.3 Biplot Coordinates
4.6 Distances
4.6.1 Distance from the Origin
4.6.2 Intra-Variable Distances and the Metric
4.6.3 Inter-Variable Distances
4.7 Transition Formulae
4.8 Moments of the Principal Coordinates
4.8.1 The Mean of
4.8.2 The Variance of
4.8.3 The Skewness of
4.8.4 The Kurtosis of
4.8.5 Moments of the Asbestos Data
4.9 How Many Dimensions to Use?
4.10 R Code
4.11 Other Theoretical Issues
4.12 Some Applications of Correspondence Analysis
4.13 Analysis of a Mother's Attachment to Her Child
References
5 Non-Symmetrical Correspondence Analysis
5.1 Introduction
5.2 The Goodman-Kruskal Tau Index
5.2.1 The Tau Index as a Measure of the Increase in Predictability
5.2.2 The Tau Index in the Context of ANOVA
5.2.3 The Sensitivity of τ
5.2.4 A Demonstration: Revisiting Selikoff's Asbestos Data
5.3 Non-Symmetrical Correspondence Analysis
5.3.1 The Centred Column Profile Matrix
5.3.2 Decomposition of
5.4 The Coordinate Systems
5.4.1 Standard Coordinates
5.4.2 Principal Coordinates
5.4.3 Biplot Coordinates
5.5 Transition Formulae
5.5.1 Supplementary Points
5.5.2 Reconstruction Formulae
5.6 Moments of the Principal Coordinates
5.6.1 The Mean of
5.6.2 The Variance of
5.6.3 The Skewness of
5.6.4 The Kurtosis of
5.7 The Distances
5.7.1 Column Distances
5.7.2 Row Distances
5.8 Comparison with Simple Correspondence Analysis
5.9 R Code
5.10 Analysis of a Mother's Attachment to Her Child
References
6 Ordered Correspondence Analysis
6.1 Introduction
6.2 Pearson's Ratio and Bivariate Moment Decomposition
6.3 Coordinate Systems
6.3.1 Standard Coordinates
6.3.2 The Generalised Correlations
6.3.3 Principal Coordinates
6.3.4 Location, Dispersion and Higher Order Components
6.3.5 The Correspondence Plot and Generalised Correlations
6.3.6 Impact on the Choice of Scores
6.4 Artificial Data Revisited
6.4.1 On the Structure of the Association
6.4.2 A Graphical Summary of the Association
6.4.3 An Interpretation of the Axes and Components
6.4.4 The Impact of the Choice of Scores
6.5 Transition Formulae
6.6 Distance Measures
6.6.1 Distance from the Origin
6.6.2 Intra-Variable Distances
6.7 Singly Ordered Analysis
6.8 R Code
6.8.1 Generalised Correlations and Principal Inertias
6.8.2 Doubly Ordered Correspondence Analysis
References
7 Ordered Non-Symmetrical Correspondence Analysis
7.1 Introduction
7.2 General Considerations
7.2.1 Orthogonal Polynomials Instead of Singular Vectors
7.3 Doubly Ordered Non-Symmetrical Correspondence Analysis
7.3.1 Bivariate Moment Decomposition
7.3.2 Generalised Correlations in Bivariate Moment Decomposition
7.4 Singly Ordered Non-Symmetrical Correspondence Analysis
7.4.1 Hybrid Decomposition for an Ordered Predictor Variable
7.4.2 Hybrid Decomposition in the Case of Ordered Response Variables
7.4.3 Generalised Correlations in Hybrid Decomposition
7.5 Coordinate Systems for Ordered Non-Symmetrical Correspondence Analysis
7.5.1 Polynomial Plots for Doubly Ordered Non-Symmetrical Correspondence Analysis
7.5.2 Polynomial Biplot for Doubly Ordered Non-Symmetrical Correspondence Analysis
7.5.3 Polynomial Plot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Predictor Variable
7.5.4 Polynomial Biplot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Predictor Variable
7.5.5 Polynomial Plot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Response Variable
7.5.6 Polynomial Biplot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Response Variable
7.6 Tests of Asymmetric Association
7.7 Distances in Ordered Non-Symmetrical Correspondence Analysis
7.7.1 Distances in Doubly Ordered Non-Symmetrical Correspondence Analysis
7.7.2 Distances in Singly Ordered Non-Symmetrical Correspondence Analysis
7.8 Doubly Ordered Non-Symmetrical Correspondence of Asbestos Data
7.8.1 Trends
7.9 Singly Ordered Non-Symmetrical Correspondence Analysis of Drug Data
7.9.1 Predictability of Ordered Rows Given Columns
7.10 R Code for Ordered Non-Symmetrical Correspondence Analysis
References
8 External Stability and Confidence Regions
8.1 Introduction
8.2 On the Statistical Significance of a Point
8.3 Circular Confidence Regions for Classical Correspondence Analysis
8.4 Elliptical Confidence Regions for Classical Correspondence Analysis
8.4.1 The Information in the Optimal Correspondence Plot
8.4.2 The Information in the First Two Dimensions
8.4.3 Eccentricity of Elliptical Regions
8.4.4 Comparison of Confidence Regions
8.5 Confidence Regions for Non-Symmetrical Correspondence Analysis
8.5.1 Circular Regions in Non-Symmetrical Correspondence Analysis
8.5.2 Elliptical Regions in Non-Symmetrical Correspondence Analysis
8.6 Approximate p-values and Classical Correspondence Analysis
8.6.1 Approximate p-values Based on Confidence Circles
8.6.2 Approximate p-values Based on Confidence Ellipses
8.7 Approximate p-values and Non-Symmetrical Correspondence Analysis
8.8 Bootstrap Elliptical Confidence Regions
8.9 Ringrose's Bootstrap Confidence Regions
8.9.1 Confidence Ellipses and Covariance Matrix
8.10 Confidence Regions and Selikoff's Asbestos Data
8.11 Confidence Regions and Mother-Child Attachment Data
8.12 R Code
8.12.1 Calculating the Path of a Confidence Ellipse
8.12.2 Constructing Elliptical Regions in a Correspondence Plot
References
9 Variants of Correspondence Analysis
9.1 Introduction
9.2 Correspondence Analysis Using Adjusted Standardised Residuals
9.3 Correspondence Analysis Using the Freeman-Tukey Statistic
9.4 Correspondence Analysis of Ranked Data
9.5 R Code
9.5.1 Adjusted Standardised Residuals
9.5.2 Freeman-Tukey Statistic
9.6 The Correspondence Analysis Family
9.6.1 Detrended Correspondence Analysis
9.6.2 Canonical Correspondence Analysis
9.6.3 Inverse Correspondence Analysis
9.6.4 Ordered Correspondence Analysis
9.6.5 Grade Correspondence Analysis
9.6.6 Symbolic Correspondence Analysis
9.6.7 Correspondence Analysis of Proximity Data
9.6.8 Residual (Scaling) Correspondence Analysis
9.6.9 Log-Ratio Correspondence Analysis
9.6.10 Parametric Correspondence Analysis
9.6.11 Subset Correspondence Analysis
9.6.12 Foucart's Correspondence Analysis
9.7 Other Techniques
References
Part Three Correspondence Analysis of Multi-Way Contingency Tables
10 Coding and Multiple Correspondence Analysis
10.1 Introduction to Coding
10.2 Coding Data
10.2.1 B-Splines
10.2.2 Crisp Coding
10.2.3 Fuzzy Coding
10.3 Coding Ordered Categorical Variables by Orthogonal Polynomials
10.4 Burt Matrix
10.5 An Introduction to Multiple Correspondence Analysis
10.6 Multiple Correspondence Analysis
10.6.1 Notation
10.6.2 Decomposition Methods
10.6.3 Coordinates, Transition Formulae and Adjusted Inertia
10.7 Variants of Multiple Correspondence Analysis
10.7.1 Joint Correspondence Analysis
10.7.2 Stacking and Concatenation
10.8 Ordered Multiple Correspondence Analysis
10.8.1 Orthogonal Polynomials in Multiple Correspondence Analysis
10.8.2 Hybrid Decomposition of Multiple Indicator Tables
10.8.3 Two Ordered Variables and Their Contingency Table
10.8.4 Test of Statistical Significance
10.8.5 Properties of Ordered Multiple Correspondence Analysis
10.8.6 Graphical Displays in Ordered Multiple Correspondence Analysis
10.9 Applications
10.9.1 Customer Satisfaction in Health Care Services
10.9.2 Two Quality Aspects
10.10 R Code
10.10.1 B-Spline Function
10.10.2 Crisp and Fuzzy Coding Using B-Splines in R
10.10.3 Crisp Coding and the Burt Table by Indicator Functions in R
10.10.4 Classical and Multiple Correspondence Analysis in R
References
11 Symmetrical and Non-Symmetrical Three-Way Correspondence Analysis
11.1 Introduction
11.2 Notation
11.3 Symmetric and Asymmetric Association in Three-Way Contingency Tables
11.4 Partitioning Three-Way Measures of Association
11.4.1 Partitioning Pearson's Three-Way Statistic
11.4.2 Partitioning Marcotorchino's and Gray-William's Three-Way Indices
11.4.3 Marcotorchino's Index
11.4.4 Partitioning the Three-Way Delta Index
11.4.5 Three-Way Delta Index
11.5 Formal Tests of Predictability
11.5.1 Testing Pearson's Statistic
11.5.2 Testing the Marcotorchino's Index
11.5.3 Testing the Delta Index
11.5.4 Discussion
11.6 Tucker3 Decomposition for Three-Way Tables
11.7 Correspondence Analysis of Three-Way Contingency Tables
11.7.1 Symmetrically Associated Variables
11.7.2 Asymmetrically Associated Variables
11.7.3 Additional Property
11.8 Modelling of Partial and Marginal Dependence
11.9 Graphical Representation
11.9.1 Interactive Plot
11.9.2 Interactive Biplot
11.9.3 Category Contribution
11.10 On the Application of Partitions
11.10.1 Olive Data: Partitioning the Asymmetric Association
11.10.2 Job Satisfaction Data: Partitioning the Asymmetric Association
11.11 On the Application of Three-Way Correspondence Analysis
11.11.1 Job Satisfaction and Three-Way Symmetrical Correspondence Analysis
11.11.2 Job Satisfaction and Three-Way Non-Symmetrical Correspondence Analysis
11.12 R Code
References
Part Four The Computation of Correspondence Analysis
12 Computing and Correspondence Analysis
12.1 Introduction
12.2 A Look Through Time
12.2.1 Pre-1990
12.2.2 From 1990 to 2000
12.2.3 The Early 2000s
12.3 The Impact of R
12.3.1 Overview of Correspondence Analysis in R
12.3.2 MASS
12.3.3 Nenadić and Greenacre's (2007) ca
12.3.4 Murtagh (2005)
12.3.5 ade4
12.4 Some Stand-Alone Programs
12.4.1 JMP
12.4.2 SPSS
12.4.3 PAST
12.4.4 DtmVic5.6+
References
Index
Eric J. Beh
School of Mathematics & Physical Sciences, University of Newcastle, Australia
Rosaria Lombardo
Department of Economics, Second University of Naples, Italy
A comprehensive overview of the internationalisation of correspondence analysis
Correspondence Analysis: Theory, Practice and New Strategies examines the key issues of correspondence analysis, and discusses the new advances that have been made over the last 20 years.
The main focus of this book is to provide a comprehensive discussion of some of the key technical and practical aspects of correspondence analysis, and to demonstrate how they may be put to use. Particular attention is given to the history and mathematical links of the developments made. These links include not just those major contributions made by researchers in Europe (which is where much of the attention surrounding correspondence analysis has focused) but also the important contributions made by researchers in other parts of the world.
Key features include:
A comprehensive international perspective on the key developments of correspondence analysis.
Discussion of correspondence analysis for nominal and ordinal categorical data.
Discussion of correspondence analysis of contingency tables with varying association structures (symmetric and non-symmetric relationship between two or more categorical variables).
Extensive treatment of many of the members of the correspondence analysis family for two-way, three-way and multiple contingency tables.
Correspondence Analysis offers a comprehensive and detailed overview of this topic which will be of value to academics, postgraduate students and researchers who want to have a better understanding of correspondence analysis. Readers interested in the historical development, internationalisation and diverse applicability of correspondence analysis will also find much to enjoy in this book.