Preface.- Introduction.- Basic Functions of the SPSS Modeler.- Univariate Statistics.- Multivariate Statistics.- Regression Models.- Factor Analysis.- Cluster Analysis.- Classification Models.- Using R with the Modeler.- Imbalanced Data and Resampling Techniques.- Case Study: Fault Detection in Semiconductor Manufacturing Process.- Appendix.
Dr. Tilo Wendler is a Professor at the University of Applied Sciences HTW Berlin, Germany. He studied mathematics, physics and business information technology. In his doctoral thesis, he examined determinants of user expectations in using information technology. He is also interested in applying complex statistical methods in the banking sector, especially in the field of rating methods. He has been teaching business statistics and data mining for ten years.
Dr. Sören Gröttrup is a Professor of Machine Learning and Statistics at the Technische Hochschule Ingolstadt, Germany. After studying mathematics and computer science, he was awarded a Ph.D. for his research on stochastic models with biological applications. Alongside his doctoral studies, he worked as a data analyst on genomic data at a research institute. For many years, he worked as a data scientist in the airline and data-driven marketing business, and advised various companies on digitalization and machine learning projects.
Now in its second edition, this textbook introduces readers to the IBM SPSS Modeler and guides them through data mining processes and relevant statistical methods. Focusing on step-by-step tutorials and well-documented examples that help demystify complex mathematical algorithms and computer programs, it also features a variety of exercises and solutions, as well as an accompanying website with data sets and SPSS Modeler streams. While the book is intended for students, the simplicity of the Modeler makes it useful for anyone wishing to learn about basic and more advanced data mining and put this knowledge into practice. This revised and updated second edition includes a new chapter on imbalanced data and resampling techniques as well as an extensive case study on the cross-industry standard process for data mining.