Acknowledgment
Preface

1 Data Munging Basics
    Introduction
    1.1 Filtering and Selecting Data
    1.2 Treating Missing Values
    1.3 Removing Duplicates
    1.4 Concatenating and Transforming Data
    1.5 Grouping and Data Aggregation
    References

2 Data Visualization
    2.1 Creating Standard Plots (Line, Bar, Pie)
    2.2 Defining Elements of a Plot
    2.3 Plot Formatting
    2.4 Creating Labels and Annotations
    2.5 Creating Visualizations from Time Series Data
    2.6 Constructing Histograms, Box Plots, and Scatter Plots
    References

3 Basic Math and Statistics
    3.1 Linear Algebra
    3.2 Calculus
        3.2.1 Differential Calculus
        3.2.2 Integral Calculus
    3.3 Inferential Statistics
        3.3.1 Central Limit Theorem
        3.3.2 Hypothesis Testing
        3.3.3 ANOVA
        3.3.4 Qualitative Data Analysis
    3.4 Using NumPy to Perform Arithmetic Operations on Data
    3.5 Generating Summary Statistics Using Pandas and SciPy
    3.6 Summarizing Categorical Data Using Pandas
    3.7 Starting with Parametric Methods in Pandas and SciPy
    3.8 Delving Into Non-Parametric Methods Using Pandas and SciPy
    3.9 Transforming Dataset Distributions
    References

4 Introduction to Machine Learning
    4.1 Introduction to Machine Learning
    4.2 Types of Machine Learning Algorithms
    4.3 Explanatory Factor Analysis
    4.4 Principal Component Analysis (PCA)
    References

5 Outlier Analysis
    5.1 Extreme Value Analysis Using Univariate Methods
    5.2 Multivariate Analysis for Outlier Detection
    5.3 DBSCAN Clustering to Identify Outliers
    References

6 Cluster Analysis
    6.1 K-Means Algorithm
    6.2 Hierarchical Methods
    6.3 Instance-Based Learning w/ k-Nearest Neighbor
    References

7 Network Analysis with NetworkX
    7.1 Working with Graph Objects
    7.2 Simulating a Social Network (i.e., Directed Network Analysis)
    7.3 Analyzing a Social Network
    References

8 Basic Algorithmic Learning
    8.1 Linear Regression
    8.2 Logistic Regression
    8.3 Naive Bayes Classifiers
    References

9 Web-Based Data Visualizations with Plotly
    9.1 Collaborative Analytics
    9.2 Basic Charts
    9.3 Statistical Charts
    9.4 Plotly Maps
    References

10 Web Scraping with Beautiful Soup
    10.1 The BeautifulSoup Object
    10.2 Exploring NavigableString Objects
    10.3 Data Parsing
    10.4 Web Scraping
    10.5 Ensemble Models with Random Forests
    References

Data Science Projects

11 Covid19 Detection and Prediction
    Bibliography
12 Leaf Disease Detection
    Bibliography
13 Brain Tumor Detection with Data Science
    Bibliography
14 Color Detection with Python
    Bibliography
15 Detecting Parkinson's Disease
    Bibliography
16 Sentiment Analysis
    Bibliography
17 Road Lane Line Detection
    Bibliography
18 Fake News Detection
    Bibliography
19 Speech Emotion Recognition
    Bibliography
20 Gender and Age Detection with Data Science
    Bibliography
21 Diabetic Retinopathy
    Bibliography
22 Driver Drowsiness Detection in Python
    Bibliography
23 Chatbot Using Python
    Bibliography
24 Handwritten Digit Recognition Project
    Bibliography
25 Image Caption Generator Project in Python
    Bibliography
26 Credit Card Fraud Detection Project
    Bibliography
27 Movie Recommendation System
    Bibliography
28 Customer Segmentation
    Bibliography
29 Breast Cancer Classification
    Bibliography
30 Traffic Signs Recognition
    Bibliography

Kolla Bhanu Prakash, PhD, is a Professor and Head of the A.I. & Data Science Research Group at K L University, India. He has published more than 80 research papers in international and national journals and conferences, has authored or edited 12 books, and has seven patents. His research interests include deep learning, data science, and quantum computing.