ISBN-13: 9781119828792 / English / Hardcover / 2023 / 540 pp.
Foreword by Ravi Bapna
Preface to the RapidMiner Edition
Acknowledgments

PART I PRELIMINARIES

CHAPTER 1 Introduction
1.1 What Is Business Analytics?
1.2 What Is Machine Learning?
1.3 Machine Learning, AI, and Related Terms
1.4 Big Data
1.5 Data Science
1.6 Why Are There So Many Different Methods?
1.7 Terminology and Notation
1.8 Road Maps to This Book
1.9 Using RapidMiner Studio

CHAPTER 2 Overview of the Machine Learning Process
2.1 Introduction
2.2 Core Ideas in Machine Learning
2.3 The Steps in a Machine Learning Project
2.4 Preliminary Steps
2.5 Predictive Power and Overfitting
2.6 Building a Predictive Model with RapidMiner
2.7 Using RapidMiner for Machine Learning
2.8 Automating Machine Learning Solutions
2.9 Ethical Practice in Machine Learning

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization
3.1 Introduction
3.2 Data Examples
3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots
3.4 Multidimensional Visualization
3.5 Specialized Visualizations
3.6 Summary: Major Visualizations and Operations, by Machine Learning Goal

CHAPTER 4 Dimension Reduction
4.1 Introduction
4.2 Curse of Dimensionality
4.3 Practical Considerations
4.4 Data Summaries
4.5 Correlation Analysis
4.6 Reducing the Number of Categories in Categorical Attributes
4.7 Converting a Categorical Attribute to a Numerical Attribute
4.8 Principal Component Analysis
4.9 Dimension Reduction Using Regression Models
4.10 Dimension Reduction Using Classification and Regression Trees

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance
5.1 Introduction
5.2 Evaluating Predictive Performance
5.3 Judging Classifier Performance
5.4 Judging Ranking Performance
5.5 Oversampling

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression
6.1 Introduction
6.2 Explanatory vs. Predictive Modeling
6.3 Estimating the Regression Equation and Prediction
6.4 Variable Selection in Linear Regression

CHAPTER 7 k-Nearest Neighbors (k-NN)
7.1 The k-NN Classifier (Categorical Label)
7.2 k-NN for a Numerical Label
7.3 Advantages and Shortcomings of k-NN Algorithms

CHAPTER 8 The Naive Bayes Classifier
8.1 Introduction
8.2 Applying the Full (Exact) Bayesian Classifier
8.3 Solution: Naive Bayes
8.4 Advantages and Shortcomings of the Naive Bayes Classifier

CHAPTER 9 Classification and Regression Trees
9.1 Introduction
9.2 Classification Trees
9.3 Evaluating the Performance of a Classification Tree
9.4 Avoiding Overfitting
9.5 Classification Rules from Trees
9.6 Classification Trees for More Than Two Classes
9.7 Regression Trees
9.8 Improving Prediction: Random Forests and Boosted Trees
9.9 Advantages and Weaknesses of a Tree

CHAPTER 10 Logistic Regression
10.1 Introduction
10.2 The Logistic Regression Model
10.3 Example: Acceptance of Personal Loan
10.4 Logistic Regression for Multi-class Classification
10.5 Example of Complete Analysis: Predicting Delayed Flights

CHAPTER 11 Neural Networks
11.1 Introduction
11.2 Concept and Structure of a Neural Network
11.3 Fitting a Network to Data
11.4 Required User Input
11.5 Exploring the Relationship Between Predictors and Target Attribute
11.6 Deep Learning
11.7 Advantages and Weaknesses of Neural Networks

CHAPTER 12 Discriminant Analysis
12.1 Introduction
12.2 Distance of a Record from a Class
12.3 Fisher's Linear Classification Functions
12.4 Classification Performance of Discriminant Analysis
12.5 Prior Probabilities
12.6 Unequal Misclassification Costs
12.7 Classifying More Than Two Classes
12.8 Advantages and Weaknesses

CHAPTER 13 Generating, Comparing, and Combining Multiple Models
13.1 Automated Machine Learning (AutoML)
13.2 Explaining Model Predictions
13.3 Ensembles
13.4 Summary

PART V INTERVENTION AND USER FEEDBACK

CHAPTER 14 Interventions: Experiments, Uplift Models, and Reinforcement Learning
14.1 A/B Testing
14.2 Uplift (Persuasion) Modeling
14.3 Reinforcement Learning
14.4 Summary

PART VI MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 15 Association Rules and Collaborative Filtering
15.1 Association Rules
15.2 Collaborative Filtering
15.3 Summary

CHAPTER 16 Cluster Analysis
16.1 Introduction
16.2 Measuring Distance Between Two Records
16.3 Measuring Distance Between Two Clusters
16.4 Hierarchical (Agglomerative) Clustering
16.5 Non-Hierarchical Clustering: The k-Means Algorithm

PART VII FORECASTING TIME SERIES

CHAPTER 17 Handling Time Series
17.1 Introduction
17.2 Descriptive vs. Predictive Modeling
17.3 Popular Forecasting Methods in Business
17.4 Time Series Components
17.5 Data Partitioning and Performance Evaluation

CHAPTER 18 Regression-Based Forecasting
18.1 A Model with Trend
18.2 A Model with Seasonality
18.3 A Model with Trend and Seasonality
18.4 Autocorrelation and ARIMA Models

CHAPTER 19 Smoothing and Deep Learning Methods for Forecasting
19.1 Smoothing Methods: Introduction
19.2 Moving Average
19.3 Simple Exponential Smoothing
19.4 Advanced Exponential Smoothing
19.5 Deep Learning for Forecasting

PART VIII DATA ANALYTICS

CHAPTER 20 Social Network Analytics
20.1 Introduction
20.2 Directed vs. Undirected Networks
20.3 Visualizing and Analyzing Networks
20.4 Social Data Metrics and Taxonomy
20.5 Using Network Metrics in Prediction and Classification
20.6 Collecting Social Network Data with RapidMiner
20.7 Advantages and Disadvantages

CHAPTER 21 Text Mining
21.1 Introduction
21.2 The Tabular Representation of Text: Term-Document Matrix and "Bag-of-Words"
21.3 Bag-of-Words vs. Meaning Extraction at Document Level
21.4 Preprocessing the Text
21.5 Implementing Machine Learning Methods
21.6 Example: Online Discussions on Autos and Electronics
21.7 Example: Sentiment Analysis of Movie Reviews
21.8 Summary

CHAPTER 22 Responsible Data Science
22.1 Introduction
22.2 Unintentional Harm
22.3 Legal Considerations
22.4 Principles of Responsible Data Science
22.5 A Responsible Data Science Framework
22.6 Documentation Tools
22.7 Example: Applying the RDS Framework to the COMPAS Example
22.8 Summary

PART IX CASES

CHAPTER 23 Cases
23.1 Charles Book Club
23.2 German Credit
23.3 Tayko Software Cataloger
23.4 Political Persuasion
23.5 Taxi Cancellations
23.6 Segmenting Consumers of Bath Soap
23.7 Direct-Mail Fundraising
23.8 Catalog Cross-Selling
23.9 Time Series Case: Forecasting Public Transportation Demand
23.10 Loan Approval

Index
Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University's Institute of Service Science, College of Technology Management. She has designed and instructed business analytics courses since 2004 at University of Maryland, Statistics.com, The Indian School of Business, and National Tsing Hua University, Taiwan.

Peter C. Bruce is Founder of the Institute for Statistics Education at Statistics.com, and Chief Learning Officer at Elder Research, Inc.

Amit V. Deokar, PhD, is Associate Dean of Undergraduate Programs and an Associate Professor of Management Information Systems at the Manning School of Business at University of Massachusetts Lowell. Since 2006, he has developed and taught courses in business analytics, with expertise in using the RapidMiner platform. He is an Association for Information Systems Distinguished Member Cum Laude.

Nitin R. Patel, PhD, is co-founder and lead researcher at Cytel Inc. He was also a co-founder of Tata Consultancy Services. A Fellow of the American Statistical Association, Dr. Patel has served as a visiting professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.