ISBN-13: 9781119835172 / English / Hardcover / 2023 / 672 pp.
Foreword by Ravi Bapna
Foreword by Gareth James
Preface to the Second R Edition
Acknowledgments

Part I Preliminaries

Chapter 1 Introduction
  1.1 What Is Business Analytics?
  1.2 What Is Machine Learning?
  1.3 Machine Learning, AI, and Related Terms
  1.4 Big Data
  1.5 Data Science
  1.6 Why Are There So Many Different Methods?
  1.7 Terminology and Notation
  1.8 Road Maps to This Book
  Order of Topics

Chapter 2 Overview of the Machine Learning Process
  2.1 Introduction
  2.2 Core Ideas in Machine Learning
  2.3 The Steps in a Machine Learning Project
  2.4 Preliminary Steps
  2.5 Predictive Power and Overfitting
  2.6 Building a Predictive Model
  2.7 Using R for Machine Learning on a Local Machine
  2.8 Automating Machine Learning Solutions
  2.9 Ethical Practice in Machine Learning
  Problems

Part II Data Exploration and Dimension Reduction

Chapter 3 Data Visualization
  3.1 Uses of Data Visualization
  3.2 Data Examples
  3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots
  3.4 Multidimensional Visualization
  3.5 Specialized Visualizations
  3.6 Major Visualizations and Operations, by Machine Learning Goal
  Problems

Chapter 4 Dimension Reduction
  4.1 Introduction
  4.2 Curse of Dimensionality
  4.3 Practical Considerations
  4.4 Data Summaries
  4.5 Correlation Analysis
  4.6 Reducing the Number of Categories in Categorical Variables
  4.7 Converting a Categorical Variable to a Numerical Variable
  4.8 Principal Component Analysis
  4.9 Dimension Reduction Using Regression Models
  4.10 Dimension Reduction Using Classification and Regression Trees
  Problems

Part III Performance Evaluation

Chapter 5 Evaluating Predictive Performance
  5.1 Introduction
  5.2 Evaluating Predictive Performance
  5.3 Judging Classifier Performance
  5.4 Judging Ranking Performance
  5.5 Oversampling
  Problems

Part IV Prediction and Classification Methods

Chapter 6 Multiple Linear Regression
  6.1 Introduction
  6.2 Explanatory vs. Predictive Modeling
  6.3 Estimating the Regression Equation and Prediction
  6.4 Variable Selection in Linear Regression
  Problems

Chapter 7 k-Nearest Neighbors (kNN)
  7.1 The k-NN Classifier (Categorical Outcome)
  7.2 k-NN for a Numerical Outcome
  7.3 Advantages and Shortcomings of k-NN Algorithms
  Problems

Chapter 8 The Naive Bayes Classifier
  8.1 Introduction
  8.2 Applying the Full (Exact) Bayesian Classifier
  8.3 Solution: Naive Bayes
  8.4 Advantages and Shortcomings of the Naive Bayes Classifier
  Problems

Chapter 9 Classification and Regression Trees
  9.1 Introduction
  9.2 Classification Trees
  9.3 Evaluating the Performance of a Classification Tree
  9.4 Avoiding Overfitting
  9.5 Classification Rules from Trees
  9.6 Classification Trees for More Than Two Classes
  9.7 Regression Trees
  9.8 Advantages and Weaknesses of a Tree
  9.9 Improving Prediction: Random Forests and Boosted Trees
  Problems

Chapter 10 Logistic Regression
  10.1 Introduction
  10.2 The Logistic Regression Model
  10.3 Example: Acceptance of Personal Loan
  10.4 Evaluating Classification Performance
  10.5 Variable Selection
  10.6 Logistic Regression for Multi-Class Classification
  10.7 Example of Complete Analysis: Predicting Delayed Flights
  Problems

Chapter 11 Neural Nets
  11.1 Introduction
  11.2 Concept and Structure of a Neural Network
  11.3 Fitting a Network to Data
  11.4 Required User Input
  11.5 Exploring the Relationship Between Predictors and Outcome
  11.6 Deep Learning
  11.7 Advantages and Weaknesses of Neural Networks
  Problems

Chapter 12 Discriminant Analysis
  12.1 Introduction
  12.2 Distance of a Record from a Class
  12.3 Fisher's Linear Classification Functions
  12.4 Classification Performance of Discriminant Analysis
  12.5 Prior Probabilities
  12.6 Unequal Misclassification Costs
  12.7 Classifying More Than Two Classes
  12.8 Advantages and Weaknesses
  Problems

Chapter 13 Generating, Comparing, and Combining Multiple Models
  13.1 Ensembles
  13.2 Automated Machine Learning (AutoML)
  13.3 Explaining Model Predictions
  13.4 Summary
  Problems

Part V Intervention and User Feedback

Chapter 14 Interventions: Experiments, Uplift Models, and Reinforcement Learning
  14.1 A/B Testing
  14.2 Uplift (Persuasion) Modeling
  14.3 Reinforcement Learning
  14.4 Summary
  Problems

Part VI Mining Relationships Among Records

Chapter 15 Association Rules and Collaborative Filtering
  15.1 Association Rules
  15.2 Collaborative Filtering
  15.3 Summary
  Problems

Chapter 16 Cluster Analysis
  16.1 Introduction
  16.2 Measuring Distance Between Two Records
  16.3 Measuring Distance Between Two Clusters
  16.4 Hierarchical (Agglomerative) Clustering
  16.5 Non-Hierarchical Clustering: The k-Means Algorithm
  Problems

Part VII Forecasting Time Series

Chapter 17 Handling Time Series
  17.1 Introduction
  17.2 Descriptive vs. Predictive Modeling
  17.3 Popular Forecasting Methods in Business
  17.4 Time Series Components
  17.5 Data Partitioning and Performance Evaluation
  Problems

Chapter 18 Regression-Based Forecasting
  18.1 A Model with Trend
  18.2 A Model with Seasonality
  18.3 A Model with Trend and Seasonality
  18.4 Autocorrelation and ARIMA Models
  Problems

Chapter 19 Smoothing and Deep Learning Methods for Forecasting
  19.1 Smoothing Methods: Introduction
  19.2 Moving Average
  19.3 Simple Exponential Smoothing
  19.4 Advanced Exponential Smoothing
  19.5 Deep Learning for Forecasting
  Problems

Part VIII Data Analytics

Chapter 20 Social Network Analytics
  20.1 Introduction
  20.2 Directed vs. Undirected Networks
  20.3 Visualizing and Analyzing Networks
  20.4 Social Data Metrics and Taxonomy
  20.5 Using Network Metrics in Prediction and Classification
  20.6 Collecting Social Network Data with R
  20.7 Advantages and Disadvantages
  Problems

Chapter 21 Text Mining
  21.1 Introduction
  21.2 The Tabular Representation of Text
  21.3 Bag-of-Words vs. Meaning Extraction at Document Level
  21.4 Preprocessing the Text
  21.5 Implementing Machine Learning Methods
  21.6 Example: Online Discussions on Autos and Electronics
  21.7 Example: Sentiment Analysis of Movie Reviews
  21.8 Summary
  Problems

Chapter 22 Responsible Data Science
  22.1 Introduction
  22.2 Unintentional Harm
  22.3 Legal Considerations
  22.4 Principles of Responsible Data Science
  22.5 A Responsible Data Science Framework
  22.6 Documentation Tools
  22.7 Example: Applying the RDS Framework to the COMPAS Example
  22.8 Summary
  Problems

Part IX Cases

Chapter 23 Cases
  23.1 Charles Book Club
  23.2 German Credit
  23.3 Tayko Software Cataloger
  23.4 Political Persuasion
  23.5 Taxi Cancellations
  23.6 Segmenting Consumers of Bath Soap
  23.7 Direct-Mail Fundraising
  23.8 Catalog Cross-Selling
  23.9 Time Series Case: Forecasting Public Transportation Demand
  23.10 Loan Approval

Index
Galit Shmueli, PhD, is Distinguished Professor and Institute Director at National Tsing Hua University's Institute of Service Science. She has designed and instructed business analytics courses since 2004 at University of Maryland, Statistics.com, The Indian School of Business, and National Tsing Hua University, Taiwan.

Peter C. Bruce is Founder of the Institute for Statistics Education at Statistics.com and Chief Learning Officer at Elder Research, Inc.

Peter Gedeck, PhD, is Senior Data Scientist at Collaborative Drug Discovery and teaches at Statistics.com and the UVA School of Data Science. His specialty is the development of machine learning algorithms to predict biological and physicochemical properties of drug candidates.

Inbal Yahav, PhD, is a Senior Lecturer in The Coller School of Management at Tel Aviv University, Israel. Her work focuses on the development and adaptation of statistical models for use by researchers in the field of information systems.

Nitin R. Patel, PhD, is Co-founder and Lead Researcher at Cytel Inc. He was also a Co-founder of Tata Consultancy Services. A Fellow of the American Statistical Association, Dr. Patel has served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University, USA.