ISBN-13: 9781447110712 / English / Paperback / 2014 / 295 pp.
The classification of patterns is an important area of research which is central to all pattern recognition fields, including speech, image, robotics, and data analysis. Neural networks have been used successfully in a number of these fields, but so far their application has been based on a 'black box' approach, with no real understanding of how they work. In this book, Sarunas Raudys - an internationally respected researcher in the area - provides an excellent mathematical and applied introduction to how neural network classifiers work and how they should be used.
Table of contents:
1. Quick Overview
1.1 The Classifier Design Problem
1.2 Single Layer and Multilayer Perceptrons
1.3 The SLP as the Euclidean Distance and the Fisher Linear Classifiers
1.4 The Generalisation Error of the EDC and the Fisher DF
1.5 Optimal Complexity — The Scissors Effect
1.6 Overtraining in Neural Networks
1.7 Bibliographical and Historical Remarks
2. Taxonomy of Pattern Classification Algorithms
2.1 Principles of Statistical Decision Theory
2.2 Four Parametric Statistical Classifiers
2.2.1 The Quadratic Discriminant Function
2.2.2 The Standard Fisher Linear Discriminant Function
2.2.3 The Euclidean Distance Classifier
2.2.4 The Anderson-Bahadur Linear DF
2.3 Structures of the Covariance Matrices
2.3.1 A Set of Standard Assumptions
2.3.2 Block Diagonal Matrices
2.3.3 The Tree Type Dependence Models
2.3.4 Temporal Dependence Models
2.4 The Bayes Predictive Approach to Design Optimal Classification Rules
2.4.1 A General Theory
2.4.2 Learning the Mean Vector
2.4.3 Learning the Mean Vector and CM
2.4.4 Qualities and Shortcomings
2.5 Modifications of the Standard Linear and Quadratic DF
2.5.1 A Pseudo-Inversion of the Covariance Matrix
2.5.2 Regularised Discriminant Analysis (RDA)
2.5.3 Scaled Rotation Regularisation
2.5.4 Non-Gaussian Densities
2.5.5 Robust Discriminant Analysis
2.6 Nonparametric Local Statistical Classifiers
2.6.1 Methods Based on Mixtures of Densities
2.6.2 Piecewise-Linear Classifiers
2.6.3 The Parzen Window Classifier
2.6.4 The k-NN Rule and a Calculation Speed
2.6.5 Polynomial and Potential Function Classifiers
2.7 Minimum Empirical Error and Maximal Margin Linear Classifiers
2.7.1 The Minimum Empirical Error Classifier
2.7.2 The Maximal Margin Classifier
2.7.3 The Support Vector Machine
2.8 Piecewise-Linear Classifiers
2.8.1 Multimodal Density Based Classifiers
2.8.2 Architectural Approach to Design of the Classifiers
2.8.3 Decision Tree Classifiers
2.9 Classifiers for Categorical Data
2.9.1 Multinomial Classifiers
2.9.2 Estimation of Parameters
2.9.3 Decision Tree and the Multinomial Classifiers
2.9.4 Linear Classifiers
2.9.5 Nonparametric Local Classifiers
2.10 Bibliographical and Historical Remarks
3. Performance and the Generalisation Error
3.1 Bayes, Conditional, Expected, and Asymptotic Probabilities of Misclassification
3.1.1 The Bayes Probability of Misclassification
3.1.2 The Conditional Probability of Misclassification
3.1.3 The Expected Probability of Misclassification
3.1.4 The Asymptotic Probability of Misclassification
3.1.5 Learning Curves: An Overview of Different Analysis Methods
3.1.6 Error Estimation
3.2 Generalisation Error of the Euclidean Distance Classifier
3.2.1 The Classification Algorithm
3.2.2 Double Asymptotics in the Error Analysis
3.2.3 The Spherical Gaussian Case
3.2.3.1 The Case N2 = N1
3.2.3.2 The Case N2 ≠ N1
3.3 Most Favourable and Least Favourable Distributions of the Data
3.3.1 The Non-Spherical Gaussian Case
3.3.2 The Most Favourable Distributions of the Data
3.3.3 The Least Favourable Distributions of the Data
3.3.4 Intrinsic Dimensionality
3.4 Generalisation Errors for Modifications of the Standard Linear Classifier
3.4.1 The Standard Fisher Linear DF
3.4.2 The Double Asymptotics for the Expected Error
3.4.3 The Conditional Probability of Misclassification
3.4.4 A Standard Deviation of the Conditional Error
3.4.5 Favourable and Unfavourable Distributions
3.4.6 Theory and Real-World Problems
3.4.7 The Linear Classifier D for the Diagonal CM
3.4.8 The Pseudo-Fisher Classifier
3.4.9 The Regularised Discriminant Analysis
3.5 Common Parameters in Different Competing Pattern Classes
3.5.1 The Generalisation Error of the Quadratic DF
3.5.2 The Effect of Common Parameters in Two Competing Classes
3.5.3 Unequal Sample Sizes in Plug-In Classifiers
3.6 Minimum Empirical Error and Maximal Margin Classifiers
3.6.1 Favourable Distributions of the Pattern Classes
3.6.2 VC Bounds for the Conditional Generalisation Error
3.6.3 Unfavourable Distributions for the Euclidean Distance and Minimum Empirical Error Classifiers
3.6.4 Generalisation Error in the Spherical Gaussian Case
3.6.5 Intrinsic Dimensionality
3.6.6 The Influence of the Margin
3.6.7 Characteristics of the Learning Curves
3.7 Parzen Window Classifier
3.7.1 The Decision Boundary of the PW Classifier with Spherical Kernels
3.7.2 The Generalisation Error
3.7.3 Intrinsic Dimensionality
3.7.4 Optimal Value of the Smoothing Parameter
3.7.5 The k-NN Rule
3.8 Multinomial Classifier
3.9 Bibliographical and Historical Remarks
4. Neural Network Classifiers
4.1 Training Dynamics of the Single Layer Perceptron
4.1.1 The SLP and its Training Rule
4.1.2 The SLP as Statistical Classifier
4.1.2.1 The Euclidean Distance Classifier
4.1.2.2 The Regularised Discriminant Analysis
4.1.2.3 The Standard Linear Fisher Classifier
4.1.2.4 The Pseudo-Fisher Classifier
4.1.2.5 Dynamics of the Magnitudes of the Weights
4.1.2.6 The Robust Discriminant Analysis
4.1.2.7 The Minimum Empirical Error Classifier
4.1.2.8 The Maximum Margin (Support Vector) Classifier
4.1.3 Training Dynamics and Generalisation
4.2 Non-linear Decision Boundaries
4.2.1 The SLP in Transformed Feature Space
4.2.2 The MLP Classifier
4.2.3 Radial Basis-Function Networks
4.2.4 Learning Vector Quantisation Networks
4.3 Training Peculiarities of the Perceptrons
4.3.1 Cost Function Surfaces of the SLP Classifier
4.3.2 Cost Function Surfaces of the MLP Classifier
4.3.3 The Gradient Minimisation of the Cost Function
4.4 Generalisation of the Perceptrons
4.4.1 Single Layer Perceptron
4.4.1.1 Theoretical Background
4.4.1.2 The Experiment Design
4.4.1.3 The SLP and Parametric Classifiers
4.4.1.4 The SLP and Structural (Nonparametric) Classifiers
4.4.2 Multilayer Perceptron
4.4.2.1 Weights of the Hidden Layer Neurones are Common for all Outputs
4.4.2.2 Intrinsic Dimensionality Problems
4.4.2.3 An Effective Capacity of the Network
4.5 Overtraining and Initialisation
4.5.1 Overtraining
4.5.2 Effect of Initial Values
4.6 Tools to Control Complexity
4.6.1 The Number of Iterations
4.6.2 The Weight Decay Term
4.6.3 The Antiregularisation Technique
4.6.4 Noise Injection
4.6.4.1 Noise Injection into Inputs
4.6.4.2 Noise Injection into the Weights and into the Outputs of the Network
4.6.4.3 "Coloured" Noise Injection into Inputs
4.6.5 Control of Target Values
4.6.6 The Learning Step
4.6.7 Optimal Values of the Training Parameters
4.6.8 Learning Step in the Hidden Layer of MLP
4.6.9 Sigmoid Scaling
4.7 The Co-Operation of the Neural Networks
4.7.1 The Boss Decision Rule
4.7.2 Small Sample Problems and Regularisation
4.8 Bibliographical and Historical Remarks
5. Integration of Statistical and Neural Approaches
5.1 Statistical Methods or Neural Nets?
5.2 Positive and Negative Attributes of Statistical Pattern Recognition
5.3 Positive and Negative Attributes of Artificial Neural Networks
5.4 Merging Statistical Classifiers and Neural Networks
5.4.1 Three Key Points in the Solution
5.4.2 Data Transformation or Statistical Classifier?
5.4.3 The Training Speed and Data Whitening Transformation
5.4.4 Dynamics of the Classifier after the Data Whitening Transformation
5.5 Data Transformations for the Integrated Approach
5.5.1 Linear Transformations
5.5.2 Non-linear Transformations
5.5.3 Performance of the Integrated Classifiers in Solving Real-World Problems
5.6 The Statistical Approach in Multilayer Feed-forward Networks
5.7 Concluding and Bibliographical Remarks
6. Model Selection
6.1 Classification Errors and their Estimation Methods
6.1.1 Types of Classification Error
6.1.2 Taxonomy of Error Rate Estimation Methods
6.1.2.1 Methods for Splitting the Design Set into Training and Validation Sets
6.1.2.2 Practical Aspects of using the Leave-One-Out Method
6.1.2.3 Pattern Error Functions
6.2 Simplified Performance Measures
6.2.1 Performance Criteria for Feature Extraction
6.2.1.1 Unsupervised Feature Extraction
6.2.1.2 Supervised Feature Extraction
6.2.2 Performance Criteria for Feature Selection
6.2.3 Feature Selection Strategies
6.3 Accuracy of Performance Estimates
6.3.1 Error Counting Estimates
6.3.1.1 The Hold-Out Method
6.3.1.2 The Resubstitution Estimator
6.3.1.3 The Leaving-One-Out Estimator
6.3.1.4 The Bootstrap Estimator
6.3.2 Parametric Estimators for the Linear Fisher Classifier
6.3.3 Associations Between the Classification Performance Measures
6.4 Feature Ranking and the Optimal Number of Features
6.4.1 The Complexity of the Classifiers
6.4.2 Feature Ranking
6.4.3 Determining the Optimal Number of Features
6.5 The Accuracy of the Model Selection
6.5.1 True, Apparent and Ideal Classification Errors
6.5.2 An Effect of the Number of Variants
6.5.3 Evaluation of the Bias
6.6 Additional Bibliographical Remarks
Appendices
A.1 Elements of Matrix Algebra
A.2 The First Order Tree Type Dependence Model
A.3 Temporal Dependence Models
A.4 Pikelis Algorithm for Evaluating Means and Variances of the True, Apparent and Ideal Errors in Model Selection
A.5 Matlab Codes (the Non-Linear SLP Training, the First Order Tree Dependence Model, and Data Whitening Transformation)
References