An Introduction.- The Case for Programming.- Elements of Programming.- Transforming Data.- Record Linkage.- Exploratory Data Analysis.- Regression Analysis.- Framing Classification.- Three Quantitative Perspectives.- Prediction.- Cluster Analysis.- Spatial Data.- Natural Language.- The Ethics of Data Science.- Developing Data Products.- Building Data Teams.- Appendix A: Planning a Data Product.- Appendix B: Interview Questions.
Jeffrey C. Chen: Affiliated Researcher, Bennett Institute for Public Policy, University of Cambridge; Edward A. Rubin: Assistant Professor, Department of Economics, University of Oregon; Gary J. Cornwall: Research Economist, U.S. Bureau of Economic Analysis
This textbook presents the essential tools and core concepts of data science to public officials, policy analysts, and economists, among others, to further their application in the public sector. Expanding on the quantitative economics frameworks taught in policy and business schools, the book emphasizes the process of asking relevant questions to inform public policy. Its techniques and approaches center on data-driven practices, beginning with the basic programming paradigms that occupy the majority of an analyst’s time and advancing to practical applications of statistical learning and machine learning. The text draws on two divergent, competing perspectives to support its applications, incorporating techniques from both causal inference and prediction. The book also includes open-source data and live code, written in R and presented in notebook form, which readers can use and modify to practice working with data.