Introduction xxiAssessment Test xxxiiiAnswers to Assessment Test xxxviiiChapter 1: Introduction and Overview 1Introducing Snowflake 2Snowflake Journey 3Snowflake Certifications 3Signing Up for a Snowflake Trial 5Summary 13Exam Essentials 14Review Questions 15Chapter 2: Snowflake Architecture 17Traditional Database Architectures 18Snowflake's Hybrid Architecture 19The Three Layers of Snowflake Architecture 20The Database Storage Layer: Micro-Partitions and Clustering 22The Query Processing Layer: The Concept of a VirtualWarehouse 26Configuring a Virtual Warehouse 30Summary 32Exam Essentials 33Review Questions 34Chapter 3: Interfaces and Connectivity 39Snowflake Web UI 40Web Interface Components 41Snowflake Partners 46Snowflake Connectors and Drivers 46User Defined Functions 47SnowSQL 47New Snowflake Web Interface 50Summary 53Exam Essentials 53Review Questions 54Chapter 4: Loading Data 57Data Loading in Snowflake 59Supported File Formats for Loading Data 60Compression 60Encryption 60What Is a Stage? 61External Stage 62Internal Stages 62Data Loading via Internal Stages 64Loading On-premises Data via the Table Stage 64Loading On-premises Data via the User Stage 67Loading On-premises Data via the Named Internal Stage 70Data Ingestion Using the Named External Stage 73Loading Data from Cloud Storage via the External Stage 74Loading Data via the Snowflake Web UI 75Basic Data Transformations While Ingesting 77External Tables 79Semi-Structured Data 81The VARIANT Data Type and Semi-Structured Data 81Loading NDJSON Data 82Loading JSON Data 85Unloading Data from Snowflake 88File Formats Supported for Unloading 89Compression 89Encryption 90Unload Data to an On-premises System via an Internal Stage 90Load Near-Real- Time Streaming Data 91How Snowpipe Works 92Optimizing Data Loading and Unloading 93Bulk Load Optimization 93Optimizing Snowpipe Loads 93Optimizing Data Unloading 93Summary 94Exam Essentials 94Review Questions 95Chapter 5 Data Pipelines 99Introducing Tasks 100Scheduling a Task 100Connecting Multiple Tasks in a Task Tree 104User-Managed and Serverless Tasks 107Introducing Streams 110Capture Changes Using Streams 110Summary 114Exam Essentials 114Review Questions 115Chapter 6 Continuous Data Protection 117Components of Continuous Data Protection 118Data Encryption 118Access Control 118Network Policies 119Time Travel 119Fail-safe 119Time Travel and Fail-safe 119Time Travel in Action 122Undrop Using Time Travel 127Time Travel and Fail-Safe Storage Costs 128Temporary Tables 129Transient Tables 129Summary 129Exam Essentials 130Review Questions 131Chapter 7 Cloning and Data Sharing 133Zero-Copy Cloning 134How Zero-Copy Cloning Works 135Database, Schema, and Table Cloning 138Cloning with Time Travel 142Secure Data Sharing in Snowflake 143Direct Sharing 144Snowflake Data Marketplace 150Data Exchange 151Summary 152Exam Essentials 152Review Questions 154Chapter 8 Performance 157Snowflake Performance Considerations 158Virtual Warehouse Configuration 160Virtual Warehouse Scaling Up and Down 160Scaling Out Using Multi-cluster Virtual Warehouses 161Caching in Snowflake 164Metadata Cache 165Query Result Cache 166Virtual Warehouse Cache 167Micro-partition Pruning 168Micro-partitions and Data Clustering 169What Happens Behind the Scenes 169Clustering Keys 170Search Optimization 171Materialized Views 171Summary 172Exam Essentials 173Review Questions 175Chapter 9 Security 179Data Encryption at Rest 180Key Rotation and Rekeying 181Tri-Secret Secure 181Authentication 181Multifactor Authentication (MFA) 181Key Pair Authentication 183SSO via SAML 2.0 Federated Authentication 183Password Policies 183User Provisioning Through SCIM 184Authorization 184Access Control in Snowflake 185Column-Level Security 188Row-Level Security 188Secure Views and Secure UDFs 189User Access Audit Log 189Network 189Network Policies 189Support for Private Connectivity 190Encryption for All Communication 190Compliance with Standards 190Summary 191Exam Essentials 192Review Questions 194Chapter 10 Account and Resource Management 197Resource Monitors 198System Usage and Billing 205The ACCOUNT_USAGE Schema 206The INFORMATION_SCHEMA Schema 209Snowflake Releases 214Phased Release Process for Full Releases 214Summary 214Exam Essentials 215Review Questions 216Appendix Answers to the Review Questions 219Chapter 1: Introduction and Overview 220Chapter 2: Snowflake Architecture 220Chapter 3: Interfaces and Connectivity 222Chapter 4: Loading Data 224Chapter 5: Data Pipelines 226Chapter 6: Continuous Data Protection 226Chapter 7: Cloning and Data Sharing 227Chapter 8: Performance 229Chapter 9: Security 231Chapter 10: Account and Resource Management 232Index 235
Hamid Mahmood Qureshi is a Senior Cloud and Datawarehouse Professional with two decades of experience architecting, designing, and leading the deployment of many data warehouses and business intelligence solutions. He has substantial experience and qualifications in various data analytics systems, including Teradata, Oracle, Hadoop, and modern cloud-based tools like Snowflake. Having worked extensively with traditional technologies combined with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics on Snowflake, which he has captured in his publications.