d. Dataset preparation: Outlier detection, imputing missing data
e. Dataset visualization
f. Common measures of the data: mean, variance, and other measures
g. Higher-order measures: skewness and non-Gaussian parameter
h. Summary
3. Model development
a. Introduction
b. Ordinary linear regression
c. Lasso, Ridge, Elastic Net
d. Gradient boost
e. Least Angle Regression
f. Polynomial regression
4. Introduction to machine learning
a. Supervised vs. unsupervised learning
b. Classification vs. regression
c. Classification methods:
1> Support vector machine
2> Random forest
d. Regression methods:
1> Support vector machine
2> Random forest
3> Neural network
e. Summary
5. Quick dive into probabilistic methods
a. Introduction
b. What is probability
c. Central limit theorem
d. Probability models
e. Gaussian Process Regression
6. Optimization
a. Introduction
b. Convex optimization
c. Steepest descent
d. Conjugate Gradient
e. Newton’s, Broyden–Fletcher–Goldfarb–Shanno (BFGS), and Davidon–Fletcher–Powell (DFP) methods
f. Bayesian Optimization
Part III: Application in glass science
7. Property prediction
a. Introduction
b. Regression
c. Case studies
d. Common pitfalls
1> Inadequate data
2> Truncation
3> Overfitting
e. Summary
8. Glass discovery
a. Introduction
b. Optimization: GA, Bayesian Optimization
c. Glass design chart
d. Case studies
e. Common pitfalls
1> Poor extrapolation
2> Problems of non-convexity
3> Convergence
f. Summary
9. Understanding glass physics
a. Introduction
b. Glass transition
c. Composition-dependent properties
d. Glass formability
e. Common pitfalls
f. Summary
10. Atomistic modeling
a. Introduction
b. Development of interatomic potentials
c. Predicting glass structures — MPNN
d. Graph mining
e. Summary
11. Future directions
N. M. Anoop Krishnan is an Associate Professor in the Department of Civil Engineering, IIT Delhi, with a joint affiliation in the Yardi School of Artificial Intelligence, IIT Delhi. Prior to this, he worked as a Lecturer and Postdoctoral Researcher at the University of California, Los Angeles. His primary area of research is data- and physics-based modeling of materials. He has authored more than 100 peer-reviewed publications and won several prestigious awards, including the Google Research Scholar Award (2023), the W. A. Weyl International Glass Science Award, Young Associate 2022 (Indian Academy of Sciences), and the Young Engineer Award 2020 (Indian National Academy of Engineering).
Hariprasad Kodamana is an Associate Professor in the Department of Chemical Engineering, IIT Delhi, with an affiliation in the Yardi School of Artificial Intelligence, IIT Delhi. Prior to this, he worked as an Assistant Professor at IIT Kharagpur, a Postdoctoral Researcher and Sessional Instructor at the University of Alberta, Canada, and a Process Engineer at GE Energy. His primary area of research is data-driven modeling and optimization. He serves as a reviewer for various scientific journals and has won several awards, including the Young Faculty Incentive Fellowship (IIT Delhi) and the IIT Bombay Institute Award for best Ph.D. thesis.
Ravinder Bhattoo is currently a postdoctoral researcher at the University of Wisconsin-Madison. Prior to this, he completed his Ph.D. in the Department of Civil Engineering, IIT Delhi, and his undergraduate degree in civil engineering at IIT Roorkee. He works on machine learning applied to glass science, with a focus on predicting composition–property relationships in glasses. He has won several awards, including the prestigious Prime Minister’s Research Fellowship (PMRF).
Focusing on the fundamentals of machine learning, this book covers broad areas of data-driven modeling, ranging from simple regression to advanced machine learning and optimization methods for applications in materials modeling and discovery. The book explains complex mathematical concepts in a lucid manner so that readers from different materials domains can use these techniques successfully. A unique feature of this book is its hands-on aspect: each method presented is accompanied by code that implements it using open-source tools such as Python. This book is thus aimed at graduate students, researchers, and engineers, to enable the use of data-driven methods for understanding and accelerating the discovery of novel materials.