Snowflake: Architecture
May 5, 2024
Snowflake is provided as Software-as-a-Service (SaaS) that runs completely on cloud infrastructure. It uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse.
Database Storage
- When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format.
- All data is encrypted using AES-256 strong encryption.
- Snowflake stores this optimized data in cloud storage.
- Snowflake manages all aspects of how this data is stored – the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake.
- The data objects stored by Snowflake are neither directly visible nor directly accessible to customers; they can only be reached through SQL query operations run in Snowflake.
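To make "optimized, compressed, columnar format" a little more concrete, here is a toy sketch in Python. It is not Snowflake's internal format, which is not exposed to customers; it only illustrates the general idea that pivoting rows into columns puts similar values next to each other, which makes them easy to compress. The helper names (`to_columnar`, `run_length_encode`) and the sample rows are made up for this illustration.

```python
from itertools import groupby


def to_columnar(rows):
    """Pivot a list of row dicts into one list of values per column.

    Assumes every row has the same keys (a simplification for this sketch).
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical sample data, stored row by row as it might arrive at load time.
rows = [
    {"region": "EU", "status": "open", "amount": 10},
    {"region": "EU", "status": "open", "amount": 25},
    {"region": "EU", "status": "closed", "amount": 7},
    {"region": "US", "status": "closed", "amount": 12},
]

columns = to_columnar(rows)
encoded = {name: run_length_encode(values) for name, values in columns.items()}
```

In this sketch the `region` column shrinks from four values to two (value, count) pairs, while the `amount` column, which has no repeats, does not shrink at all. Real columnar engines pick an encoding per column for that reason.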
Query Processing
- Snowflake lets you create multiple, independent compute clusters for query processing; these are called virtual warehouses.
- All virtual warehouses read the same stored data, but each has its own compute resources, so they do not contend with one another and more can be added as load grows.
- When a virtual warehouse is resized, all subsequent queries take advantage of the new resources.
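As a rough sketch of what creating and resizing virtual warehouses looks like in practice, the Python below only builds the SQL statements as strings; it does not connect to Snowflake. The warehouse names (`etl_wh`, `reporting_wh`), the chosen sizes, and the `AUTO_SUSPEND` value are assumptions for illustration, and `VALID_SIZES` lists only a subset of the sizes Snowflake offers.

```python
# Subset of warehouse sizes, used here only to catch typos in the sketch.
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def _check(name, size):
    """Reject names that are not simple identifiers and sizes outside the subset."""
    if not name.isidentifier():
        raise ValueError(f"unexpected warehouse name: {name!r}")
    if size not in VALID_SIZES:
        raise ValueError(f"unexpected warehouse size: {size!r}")


def create_warehouse_sql(name, size="XSMALL"):
    """Build a CREATE WAREHOUSE statement for an independent compute cluster."""
    _check(name, size)
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WAREHOUSE_SIZE = '{size}' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE"
    )


def resize_warehouse_sql(name, size):
    """Build an ALTER WAREHOUSE statement that changes the warehouse size."""
    _check(name, size)
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}'"


# Two hypothetical warehouses over the same data: one for loading, one for reports.
statements = [
    create_warehouse_sql("etl_wh", "LARGE"),
    create_warehouse_sql("reporting_wh"),
    resize_warehouse_sql("reporting_wh", "MEDIUM"),
]
```

Each string in `statements` could then be run through whichever Snowflake client you use. Queries submitted to `reporting_wh` after the resize statement completes would run on the larger warehouse.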
Cloud Services
- The services layer is fully maintained by Snowflake and is distributed across multiple availability zones to ensure high availability.
- The cloud services layer is a collection of services that coordinate activities across Snowflake.
- These services tie together all of the different components of Snowflake in order to process user requests, from login to query dispatch
- The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider.
- The key component of the services layer is the metadata store, which powers a number of Snowflake's distinctive features, such as zero-copy cloning, Time Travel, and data sharing.
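To see why a metadata store makes features like zero-copy cloning and Time Travel cheap, consider the toy model below. It is only a sketch under simplifying assumptions, not how Snowflake implements its metadata store: tables are modelled as lists of versions, each version is a tuple of immutable file ids, and the class name `MetadataStore`, its methods, and the file ids are all hypothetical.

```python
class MetadataStore:
    """Toy catalog: table name -> list of versions, each a tuple of file ids."""

    def __init__(self):
        self.tables = {}

    def commit(self, table, file_ids):
        """Record a new table version that points at the given immutable files."""
        self.tables.setdefault(table, []).append(tuple(file_ids))

    def current(self, table):
        """Return the file pointers of the latest version of a table."""
        return self.tables[table][-1]

    def clone(self, source, target):
        """Clone by copying pointers only; no data file is duplicated."""
        self.tables[target] = [self.current(source)]

    def as_of(self, table, version):
        """Time-travel-style read: the file pointers recorded at an earlier version."""
        return self.tables[table][version]


store = MetadataStore()
store.commit("orders", ["f1", "f2"])            # version 0
store.commit("orders", ["f1", "f2", "f3"])      # version 1 adds a file
store.clone("orders", "orders_dev")             # clone shares f1..f3
store.commit("orders_dev", ["f1", "f2", "f3", "f4"])  # clone diverges
```

In this model the clone costs one metadata entry, the original `orders` table is unaffected when `orders_dev` later adds `f4`, and `store.as_of("orders", 0)` still returns the older file list because earlier versions are never overwritten.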