Introduction
Chapter 1 Introduction to Machine Learning
  What Is Machine Learning?
  What Problems Will Machine Learning Be Solving in This Book?
  Classification
  Regression
  Clustering
  Types of Machine Learning Algorithms
  Supervised Learning
  Unsupervised Learning
  Getting the Tools
  Obtaining Anaconda
  Installing Anaconda
  Running Jupyter Notebook for Mac
  Running Jupyter Notebook for Windows
  Creating a New Notebook
  Naming the Notebook
  Adding and Removing Cells
  Running a Cell
  Restarting the Kernel
  Exporting Your Notebook
  Getting Help
Chapter 2 Extending Python Using NumPy
  What Is NumPy?
  Creating NumPy Arrays
  Array Indexing
  Boolean Indexing
  Slicing Arrays
  NumPy Slice Is a Reference
  Reshaping Arrays
  Array Math
  Dot Product
  Matrix
  Cumulative Sum
  NumPy Sorting
  Array Assignment
  Copying by Reference
  Copying by View (Shallow Copy)
  Copying by Value (Deep Copy)
Chapter 3 Manipulating Tabular Data Using Pandas
  What Is Pandas?
  Pandas Series
  Creating a Series Using a Specified Index
  Accessing Elements in a Series
  Specifying a Datetime Range as the Index of a Series
  Date Ranges
  Pandas DataFrame
  Creating a DataFrame
  Specifying the Index in a DataFrame
  Generating Descriptive Statistics on the DataFrame
  Extracting from DataFrames
  Selecting the First and Last Five Rows
  Selecting a Specific Column in a DataFrame
  Slicing Based on Row Number
  Slicing Based on Row and Column Numbers
  Slicing Based on Labels
  Selecting a Single Cell in a DataFrame
  Selecting Based on Cell Value
  Transforming DataFrames
  Checking to See If a Result Is a DataFrame or Series
  Sorting Data in a DataFrame
  Sorting by Index
  Sorting by Value
  Applying Functions to a DataFrame
  Adding and Removing Rows and Columns in a DataFrame
  Adding a Column
  Removing Rows
  Removing Columns
  Generating a Crosstab
Chapter 4 Data Visualization Using matplotlib
  What Is matplotlib?
  Plotting Line Charts
  Adding Title and Labels
  Styling
  Plotting Multiple Lines in the Same Chart
  Adding a Legend
  Plotting Bar Charts
  Adding Another Bar to the Chart
  Changing the Tick Marks
  Plotting Pie Charts
  Exploding the Slices
  Displaying Custom Colors
  Rotating the Pie Chart
  Displaying a Legend
  Saving the Chart
  Plotting Scatter Plots
  Combining Plots
  Subplots
  Plotting Using Seaborn
  Displaying Categorical Plots
  Displaying Lmplots
  Displaying Swarmplots
Chapter 5 Getting Started with Scikit-learn for Machine Learning
  Introduction to Scikit-learn
  Getting Datasets
  Using the Scikit-learn Dataset
  Using the Kaggle Dataset
  Using the UCI (University of California, Irvine) Machine Learning Repository
  Generating Your Own Dataset
  Linearly Distributed Dataset
  Clustered Dataset
  Clustered Dataset Distributed in Circular Fashion
  Getting Started with Scikit-learn
  Using the LinearRegression Class for Fitting the Model
  Making Predictions
  Plotting the Linear Regression Line
  Getting the Gradient and Intercept of the Linear Regression Line
  Examining the Performance of the Model by Calculating the Residual Sum of Squares
  Evaluating the Model Using a Test Dataset
  Persisting the Model
  Data Cleansing
  Cleaning Rows with NaNs
  Replacing NaN with the Mean of the Column
  Removing Rows
  Removing Duplicate Rows
  Normalizing Columns
  Removing Outliers
  Tukey Fences
  Z-Score
Chapter 6 Supervised Learning--Linear Regression
  Types of Linear Regression
  Linear Regression
  Using the Boston Dataset
  Data Cleansing
  Feature Selection
  Multiple Regression
  Training the Model
  Getting the Intercept and Coefficients
  Plotting the 3D Hyperplane
  Polynomial Regression
  Formula for Polynomial Regression
  Polynomial Regression in Scikit-learn
  Understanding Bias and Variance
  Using Polynomial Multiple Regression on the Boston Dataset
  Plotting the 3D Hyperplane
Chapter 7 Supervised Learning--Classification Using Logistic Regression
  What Is Logistic Regression?
  Understanding Odds
  Logit Function
  Sigmoid Curve
  Using the Breast Cancer Wisconsin (Diagnostic) Data Set
  Examining the Relationship Between Features
  Plotting the Features in 2D
  Plotting in 3D
  Training Using One Feature
  Finding the Intercept and Coefficient
  Plotting the Sigmoid Curve
  Making Predictions
  Training the Model Using All Features
  Testing the Model
  Getting the Confusion Matrix
  Computing Accuracy, Recall, Precision, and Other Metrics
  Receiver Operating Characteristic (ROC) Curve
  Plotting the ROC and Finding the Area Under the Curve (AUC)
Chapter 8 Supervised Learning--Classification Using Support Vector Machines
  What Is a Support Vector Machine?
  Maximum Separability
  Support Vectors
  Formula for the Hyperplane
  Using Scikit-learn for SVM
  Plotting the Hyperplane and the Margins
  Making Predictions
  Kernel Trick
  Adding a Third Dimension
  Plotting the 3D Hyperplane
  Types of Kernels
  C
  Radial Basis Function (RBF) Kernel
  Gamma
  Polynomial Kernel
  Using SVM for Real-Life Problems
Chapter 9 Supervised Learning--Classification Using K-Nearest Neighbors (KNN)
  What Is K-Nearest Neighbors?
  Implementing KNN in Python
  Plotting the Points
  Calculating the Distance Between the Points
  Implementing KNN
  Making Predictions
  Visualizing Different Values of K
  Using Scikit-Learn's KNeighborsClassifier Class for KNN
  Exploring Different Values of K
  Cross-Validation
  Parameter-Tuning K
  Finding the Optimal K
Chapter 10 Unsupervised Learning--Clustering Using K-Means
  What Is Unsupervised Learning?
  Unsupervised Learning Using K-Means
  How Clustering in K-Means Works
  Implementing K-Means in Python
  Using K-Means in Scikit-learn
  Evaluating Cluster Size Using the Silhouette Coefficient
  Calculating the Silhouette Coefficient
  Finding the Optimal K
  Using K-Means to Solve Real-Life Problems
  Importing the Data
  Cleaning the Data
  Plotting the Scatter Plot
  Clustering Using K-Means
  Finding the Optimal Size Classes
Chapter 11 Using Azure Machine Learning Studio
  What Is Microsoft Azure Machine Learning Studio?
  An Example Using the Titanic Experiment
  Using Microsoft Azure Machine Learning Studio
  Uploading Your Dataset
  Creating an Experiment
  Filtering the Data and Making Fields Categorical
  Removing the Missing Data
  Splitting the Data for Training and Testing
  Training a Model
  Comparing Against Other Algorithms
  Evaluating Machine Learning Algorithms
  Publishing the Learning Model as a Web Service
  Publishing the Experiment
  Testing the Web Service
  Programmatically Accessing the Web Service
Chapter 12 Deploying Machine Learning Models
  Deploying ML
  Case Study
  Loading the Data
  Cleaning the Data
  Examining the Correlation Between the Features
  Plotting the Correlation Between Features
  Evaluating the Algorithms
  Logistic Regression
  K-Nearest Neighbors
  Support Vector Machines
  Selecting the Best Performing Algorithm
  Training and Saving the Model
  Deploying the Model
  Testing the Model
  Creating the Client Application to Use the Model
Index
Wei-Meng Lee is a technologist and founder of Developer Learning Solutions, a technology company specializing in hands-on training on the latest mobile technologies. Wei-Meng has many years of training experience, and his courses place special emphasis on the learning-by-doing approach. His hands-on approach to learning programming makes understanding the subject much easier than reading books, tutorials, and documentation. His name regularly appears in online and print publications such as DevX.com, MobiForge.com, and CoDe Magazine.