Introduction
Assessment Test
Chapter 1 Today's Data Analyst
  Welcome to the World of Analytics
  Data
  Storage
  Computing Power
  Careers in Analytics
  The Analytics Process
  Data Acquisition
  Cleaning and Manipulation
  Analysis
  Visualization
  Reporting and Communication
  Analytics Techniques
  Descriptive Analytics
  Predictive Analytics
  Prescriptive Analytics
  Machine Learning, Artificial Intelligence, and Deep Learning
  Data Governance
  Analytics Tools
  Summary
Chapter 2 Understanding Data
  Exploring Data Types
  Structured Data Types
  Unstructured Data Types
  Categories of Data
  Common Data Structures
  Structured Data
  Unstructured Data
  Semi-structured Data
  Common File Formats
  Text Files
  JavaScript Object Notation
  Extensible Markup Language (XML)
  HyperText Markup Language (HTML)
  Summary
  Exam Essentials
  Review Questions
Chapter 3 Databases and Data Acquisition
  Exploring Databases
  The Relational Model
  Relational Databases
  Nonrelational Databases
  Database Use Cases
  Online Transactional Processing
  Online Analytical Processing
  Schema Concepts
  Data Acquisition Concepts
  Integration
  Data Collection Methods
  Working with Data
  Data Manipulation
  Query Optimization
  Summary
  Exam Essentials
  Review Questions
Chapter 4 Data Quality
  Data Quality Challenges
  Duplicate Data
  Redundant Data
  Missing Values
  Invalid Data
  Nonparametric Data
  Data Outliers
  Specification Mismatch
  Data Type Validation
  Data Manipulation Techniques
  Recoding Data
  Derived Variables
  Data Merge
  Data Blending
  Concatenation
  Data Append
  Imputation
  Reduction
  Aggregation
  Transposition
  Normalization
  Parsing/String Manipulation
  Managing Data Quality
  Circumstances to Check for Quality
  Automated Validation
  Data Quality Dimensions
  Data Quality Rules and Metrics
  Methods to Validate Quality
  Summary
  Exam Essentials
  Review Questions
Chapter 5 Data Analysis and Statistics
  Fundamentals of Statistics
  Descriptive Statistics
  Measures of Frequency
  Measures of Central Tendency
  Measures of Dispersion
  Measures of Position
  Inferential Statistics
  Confidence Intervals
  Hypothesis Testing
  Simple Linear Regression
  Analysis Techniques
  Determine Type of Analysis
  Types of Analysis
  Exploratory Data Analysis
  Summary
  Exam Essentials
  Review Questions
Chapter 6 Data Analytics Tools
  Spreadsheets
  Microsoft Excel
  Programming Languages
  R
  Python
  Structured Query Language (SQL)
  Statistics Packages
  IBM SPSS
  SAS
  Stata
  Minitab
  Machine Learning
  IBM SPSS Modeler
  RapidMiner
  Analytics Suites
  IBM Cognos
  Power BI
  MicroStrategy
  Domo
  Datorama
  AWS QuickSight
  Tableau
  Qlik
  BusinessObjects
  Summary
  Exam Essentials
  Review Questions
Chapter 7 Data Visualization with Reports and Dashboards
  Understanding Business Requirements
  Understanding Report Design Elements
  Report Cover Page
  Executive Summary
  Design Elements
  Documentation Elements
  Understanding Dashboard Development Methods
  Consumer Types
  Data Source Considerations
  Data Type Considerations
  Development Process
  Delivery Considerations
  Operational Considerations
  Exploring Visualization Types
  Charts
  Maps
  Waterfall
  Infographic
  Word Cloud
  Comparing Report Types
  Static and Dynamic
  Ad Hoc
  Self-Service (On-Demand)
  Recurring Reports
  Tactical and Research
  Summary
  Exam Essentials
  Review Questions
Chapter 8 Data Governance
  Data Governance Concepts
  Data Governance Roles
  Access Requirements
  Security Requirements
  Storage Environment Requirements
  Use Requirements
  Entity Relationship Requirements
  Data Classification Requirements
  Jurisdiction Requirements
  Breach Reporting Requirements
  Understanding Master Data Management
  Processes
  Circumstances
  Summary
  Exam Essentials
  Review Questions
Appendix Answers to the Review Questions
  Chapter 2: Understanding Data
  Chapter 3: Databases and Data Acquisition
  Chapter 4: Data Quality
  Chapter 5: Data Analysis and Statistics
  Chapter 6: Data Analytics Tools
  Chapter 7: Data Visualization with Reports and Dashboards
  Chapter 8: Data Governance
Index
ABOUT THE AUTHORS

Mike Chapple, PhD, is Teaching Professor of IT, Analytics, and Operations at the University of Notre Dame. He's a technology professional and educator with over 20 years of experience. Mike provides certification resources at his website, CertMike.com.

Sharif Nijim is Assistant Teaching Professor of IT, Analytics, and Operations in the Mendoza College of Business at the University of Notre Dame. He teaches undergraduate and graduate courses on cloud computing, business analytics, and information technology.