Snowflake16 min read

How Snowflake execute query internally

Learn how queries work internally in Snowflake with query execution flow, parsing, optimization, caching, virtual warehouse execution, micro-partition pruning, and real-world performance tuning concepts.

2026-05-21

Part of Series

Snowflake Fundamentals

Progress

5/7

How Queries Work Internally in Snowflake

Whenever we execute a SQL query in Snowflake, many intelligent processes happen behind the scenes before we receive the final result.

Snowflake does not simply scan the entire table and return output.

Instead, it follows a highly optimized:

Query Execution Flow

Understanding this flow is extremely important for:

  • Snowflake interviews
  • Snowflake certifications
  • Performance optimization
  • Cost optimization
  • Real-world Data Engineering projects

Let’s understand the complete query execution lifecycle step-by-step in the simplest way possible.


What Happens When You Run a Query?

Suppose we execute:

SELECT * FROM SALES WHERE COUNTRY = 'India';

Most beginners think:

Query → Scan Table → Return Result

But internally, Snowflake performs multiple intelligent operations before execution even starts.


High-Level Query Execution Flow

Snowflake query execution mainly involves these stages:

  1. Query Submission
  2. Authentication & Parsing
  3. Result Cache Validation
  4. Query Optimization & Planning
  5. Virtual Warehouse Execution
  6. Result Generation & Cache Storage

Each step plays an important role in performance optimization.


Step 1: Query Submission

A query can be submitted from multiple sources:

  • Snowflake Web UI
  • Python Connector
  • JDBC/ODBC Drivers
  • Airflow Pipelines
  • BI Tools
  • ETL Applications
  • dbt
  • Spark Connectors

In enterprise systems, queries usually come from automated pipelines and dashboard applications.


Real-World Example

Imagine ordering food in a restaurant.

You place the order through a waiter.

Similarly:

  • User/Application = Customer
  • Query = Food Order
  • Snowflake = Restaurant System

The process starts after query submission.


Step 2: Authentication & SQL Parsing

After query submission, Snowflake’s:

Cloud Services Layer

starts processing the request.

This layer handles:

  • Authentication
  • Authorization
  • SQL Parsing
  • Metadata Management
  • Query Coordination

What Happens Here?

Snowflake validates:

  • Is the user authenticated?
  • Does the role have table access?
  • Is the SQL syntax correct?
  • Are referenced objects available?

If anything fails:

  • Query execution stops immediately
  • Error is returned

No compute resources are used yet.


Example

Suppose query contains:

SELECT * FROM unknown_table;

Snowflake immediately throws:

Table does not exist

without starting the warehouse.


Real-Life Analogy

Restaurant checks:

  • Whether item exists in menu
  • Whether order is valid

before cooking starts.


Step 3: Result Cache Validation (Very Important)

Before starting the Virtual Warehouse, Snowflake first checks:

Result Cache

This is one of Snowflake’s most powerful optimization techniques.


What is Result Cache?

Snowflake stores final query outputs for previously executed queries.

If the same query runs again:

  • Snowflake returns cached result instantly
  • No warehouse compute needed
  • No storage scan needed
  • No additional credits consumed

Conditions for Result Cache Usage

Result Cache works only if:

  • Query text is exactly same
  • Underlying data has not changed
  • Same permissions exist
  • Compatible session settings exist

If all conditions match:

Query execution stops here itself.

Result is returned instantly.


Why This is Important

This dramatically improves:

  • Dashboard performance
  • Repeated BI queries
  • Frequent analytics reports
  • Cost optimization

Real-World Example

Suppose a Power BI dashboard refreshes every 5 minutes using same query.

Without Result Cache:

  • Warehouse executes repeatedly
  • Credits increase

With Result Cache:

  • Instant response
  • Zero compute usage

Huge cost savings.


Real-Life Analogy

You order the same food again.

Restaurant already prepared it earlier.

Food is served instantly without cooking again.


Step 4: Query Optimization & Planning

If Result Cache cannot be used:

Snowflake now creates the best execution strategy.

This stage is called:

Query Optimization


What Snowflake Optimizer Does

Snowflake decides:

  • Which micro-partitions to scan
  • Join execution order
  • Parallel execution strategy
  • Data pruning opportunities
  • Cache usage possibilities
  • Best execution path

This optimization is fully automatic.

Unlike traditional databases:

  • No manual indexing needed
  • No query hints required in most cases

Role of Micro-Partition Pruning

Snowflake uses:

Metadata Cache

to identify unnecessary partitions.

Instead of scanning entire table:

Snowflake scans only required partitions.

This significantly reduces:

  • Data scanned
  • Query runtime
  • Compute cost

Example

SELECT * FROM sales WHERE year = 2025;

Snowflake scans only partitions containing:

year = 2025

instead of entire dataset.


Real-Life Analogy

Instead of searching all books in a library:

You first identify the exact shelf using catalog metadata.


Step 5: Virtual Warehouse Execution

Now the:

Virtual Warehouse

starts executing the query.

This is the actual compute layer of Snowflake.


What Happens During Execution?

Warehouse performs:

  • Data scanning
  • Parallel execution
  • Aggregations
  • Joins
  • Sorting
  • Filtering
  • Transformations

Execution happens using distributed compute nodes.


Warehouse Auto Resume

If warehouse was suspended:

Snowflake automatically resumes it.

This is called:

Auto Resume

Very useful for cost optimization.


How Snowflake Speeds Up Execution

During execution Snowflake uses:

1. Local Disk Cache

Previously scanned files may already exist in warehouse SSD cache.

This reduces storage access.


2. Metadata Cache

Helps skip unnecessary micro-partitions.


3. Massive Parallel Processing

Queries run across multiple compute nodes simultaneously.

This dramatically improves performance.


Real-World Example

Suppose:

  • 5TB sales table
  • Query needs only India records

Snowflake:

  • Uses metadata pruning
  • Scans only relevant partitions
  • Executes in parallel
  • Uses local cache if available

Result:

  • Faster execution
  • Lower cost
  • Better performance

Step 6: Result Generation & Cache Storage

After query execution completes:

Snowflake stores final result inside:

Result Cache

Now future identical queries can return instantly.

This creates a continuous optimization cycle.


Why Snowflake Query Execution is Fast

Snowflake achieves high performance because of:

  • Result Cache
  • Metadata Cache
  • Local Warehouse Cache
  • Partition Pruning
  • Massive Parallel Processing
  • Separate Compute & Storage
  • Automatic Query Optimization

All these work together automatically.


Complete Query Execution Architecture Flow

Query Submitted Authentication & Parsing Result Cache Check Query Optimization Micro-Partition Pruning Warehouse Execution Parallel Processing Result Returned Result Stored in Cache

This is the complete lifecycle of a Snowflake query.


Real-World Enterprise Scenario

Suppose a company has:

  • ETL pipelines
  • BI dashboards
  • Data Science workloads
  • Thousands of analysts

Snowflake intelligently handles:

  • Query concurrency
  • Caching
  • Optimization
  • Parallel execution
  • Cost reduction

This is one major reason why Snowflake became highly popular in modern cloud data platforms.


Common Snowflake Interview Questions

Beginner Level

  1. What happens when a query is submitted in Snowflake?
  2. What is the role of Cloud Services Layer?
  3. When is Result Cache checked?
  4. What is Virtual Warehouse?
  5. What is Auto Resume?

Intermediate Level

  1. How does Snowflake optimize queries?
  2. What is partition pruning?
  3. How does Metadata Cache work?
  4. Why are repeated queries faster?
  5. What happens if warehouse is suspended?

Scenario-Based Questions

  1. Why does same dashboard query become instant after first execution?
  2. Why does query performance reduce after warehouse suspension?
  3. How does Snowflake reduce storage scanning?
  4. How does Snowflake avoid full table scans?
  5. How would you optimize slow Snowflake queries?

These are extremely common interview topics for Snowflake and Data Engineering roles.


Key Takeaways

  • Snowflake follows a highly optimized query execution flow
  • Result Cache is checked before compute execution
  • Metadata Cache enables partition pruning
  • Virtual Warehouse performs distributed parallel execution
  • Snowflake automatically optimizes execution plans
  • Query optimization directly impacts performance and cost

Understanding query execution flow is extremely important for becoming strong in Snowflake performance tuning.


What’s Next?

In the next article, we will learn:

Snowflake Pricing and Credit System Explained

We will cover:

  • Compute credits
  • Storage pricing
  • Cloud services cost
  • Warehouse billing
  • Cost optimization techniques
  • Real-world enterprise pricing strategies

because understanding Snowflake pricing is extremely important in production Data Engineering projects.


If this article helped you, continue following the Snowflake Fundamentals learning path to master Snowflake step-by-step with practical examples, architecture concepts, interview-focused explanations, and real-world Data Engineering scenarios.

Soumya Ranjan Bisoyi

Written By

Soumya Ranjan Bisoyi

Data Engineer • Mentor • Educator

Helping aspiring Data Engineers learn SQL, Spark, Snowflake, Azure, and real-world Data Engineering concepts through practical, beginner-friendly content.