Using a novel integration of mathematics and Python code, this book illustrates the fundamental concepts that link probability, statistics, and machine learning, so that the reader can not only employ statistical and machine learning models using modern Python modules but also understand their relative strengths and weaknesses. To clearly connect theoretical concepts to practical implementations, the author provides many worked-out examples along with "Programming Tips" that encourage the reader to write quality Python code. The entire text, including all the figures and numerical results, is reproducible using the Python code provided, enabling readers to follow along by experimenting with the same code on their own computers. Modern Python modules such as pandas, SymPy, scikit-learn, statsmodels, SciPy, xarray, TensorFlow, and Keras are used to implement and visualize important machine learning concepts such as the bias/variance trade-off, cross-validation, interpretability, and regularization. Many abstract mathematical ideas, such as modes of convergence in probability, are explained and illustrated with concrete numerical examples. This book is suitable for anyone with undergraduate-level exposure to probability, statistics, or machine learning and rudimentary knowledge of Python programming.
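To give a flavor of the reproducible, code-driven style described above, the following is a minimal sketch of a cross-validation and regularization experiment in scikit-learn; the synthetic dataset, the choice of ridge regression, and the grid of regularization strengths are illustrative assumptions rather than examples taken from the book.

# A minimal illustrative sketch (not drawn from the book): scoring a
# ridge regression over several regularization strengths with 5-fold
# cross-validation in scikit-learn.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data stands in for any real dataset.
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

for alpha in (0.01, 1.0, 100.0):
    # Mean R^2 across the 5 held-out folds for this regularization strength.
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=5, scoring="r2")
    print(f"alpha={alpha:>6}: mean CV R^2 = {scores.mean():.3f}")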
Introduction.- Part 1 Getting Started with Scientific Python.- Installation and Setup.- Numpy.- Matplotlib.- Ipython.- Jupyter Notebook.- Scipy.- Pandas.- Sympy.- Interfacing with Compiled Libraries.- Integrated Development Environments.- Quick Guide to Performance and Parallel Programming.- Other Resources.- Part 2 Probability.- Introduction.- Projection Methods.- Conditional Expectation as Projection.- Conditional Expectation and Mean Squared Error.- Worked Examples of Conditional Expectation and Mean Square Error Optimization.- Useful Distributions.- Information Entropy.- Moment Generating Functions.- Monte Carlo Sampling Methods.- Useful Inequalities.- Part 3 Statistics.- Python Modules for Statistics.- Types of Convergence.- Estimation Using Maximum Likelihood.- Hypothesis Testing and P-Values.- Confidence Intervals.- Linear Regression.- Maximum A-Posteriori.- Robust Statistics.- Bootstrapping.- Gauss Markov.- Nonparametric Methods.- Survival Analysis.- Part 4 Machine Learning.- Introduction.- Python Machine Learning Modules.- Theory of Learning.- Decision Trees.- Boosting Trees.- Logistic Regression.- Generalized Linear Models.- Regularization.- Support Vector Machines.- Dimensionality Reduction.- Clustering.- Ensemble Methods.- Deep Learning.- Notation.- References.- Index.
Dr. José Unpingco completed his PhD at the University of California, San Diego (UCSD) and has since worked in industry as an engineer, consultant, and instructor on a wide variety of advanced data science topics, with deep experience in machine learning. He was the onsite technical director for large-scale signal and image processing for the Department of Defense (DoD), where he also spearheaded the DoD-wide adoption of scientific Python. In his time as the primary scientific Python instructor for the DoD, he taught over 600 scientists and engineers. Dr. Unpingco is currently the Vice President for Machine Learning/Data Science at the Gary and Mary West Health Institute, a non-profit medical research organization in San Diego, California. He is also a lecturer at UCSD in its undergraduate and graduate Machine Learning and Data Science degree programs.
· Features a novel combination of modern Python implementations and underlying mathematics to illustrate and visualize the foundational ideas of probability, statistics, and machine learning;
· Includes meticulously worked-out numerical examples, all reproducible using the Python code provided in the text, that compute and visualize statistical and machine learning models, thus enabling the reader not only to implement these models but also to understand their inherent trade-offs (see the numerical sketch following this list);
· Utilizes modern Python modules such as statsmodels, TensorFlow, Keras, SymPy, and scikit-learn, along with embedded "Programming Tips," to encourage readers to develop quality Python code that implements and illustrates practical concepts.
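As a companion to the numerical examples mentioned above, here is a minimal sketch, assuming only NumPy, of what a concrete illustration of convergence in probability might look like: the fraction of experiments in which the sample mean of n fair coin flips strays more than 0.05 from the true mean 0.5 shrinks as n grows. The sample sizes and number of repetitions are illustrative assumptions, not figures from the book.

# Illustrative sketch (assumptions, not from the book): the weak law of
# large numbers as an example of convergence in probability.
import numpy as np

rng = np.random.default_rng(0)
true_mean, eps = 0.5, 0.05

for n in (10, 100, 1000, 10000):
    # 5000 independent experiments, each averaging n Bernoulli(1/2) draws.
    sample_means = rng.integers(0, 2, size=(5000, n)).mean(axis=1)
    prob_far = np.mean(np.abs(sample_means - true_mean) > eps)
    print(f"n={n:>6}: P(|sample mean - 0.5| > {eps}) ~ {prob_far:.3f}")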