Snowflake: Architecture

Sai Prabhanj Turaga
May 5, 2024

Snowflake is provided as Software-as-a-Service (SaaS) that runs completely on cloud infrastructure. It uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse.

Database Storage

  • When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format.
  • All data is encrypted with AES-256 strong encryption.
  • Snowflake stores this optimized data in cloud storage.
  • Snowflake manages all aspects of how this data is stored – the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake.
  • The data objects stored by Snowflake are neither directly visible to nor directly accessible by customers; they can be reached only through SQL query operations run in Snowflake.
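
To make the idea of a compressed, columnar layout concrete, here is a minimal Python sketch. It is a toy illustration only and not Snowflake's internal file format: the function names, the sample rows, and the choice of JSON plus zlib are all assumptions made for this example.

```python
import json
import zlib


def to_columnar(rows):
    """Pivot a list of row dicts (same keys in every row) into {column: [values]}."""
    names = list(rows[0].keys()) if rows else []
    return {name: [row[name] for row in rows] for name in names}


def compress_columns(columns):
    """Compress each column independently (zlib is a stand-in codec here)."""
    return {
        name: zlib.compress(json.dumps(values).encode("utf-8"))
        for name, values in columns.items()
    }


def read_column(compressed, name):
    """Decompress and decode only the requested column."""
    return json.loads(zlib.decompress(compressed[name]).decode("utf-8"))


# Hypothetical sample data, loaded row by row.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": "EU", "amount": 35.5},
    {"order_id": 3, "region": "US", "amount": 12.0},
]

columns = to_columnar(rows)
compressed = compress_columns(columns)

# A query that needs only "amount" never touches the other columns.
total_amount = sum(read_column(compressed, "amount"))
regions = read_column(compressed, "region")
```

The point of the sketch is that values from one column sit together, so they compress well and a query can read just the columns it needs.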

Query Processing

  • Snowflake lets you create multiple independent compute clusters for query processing; these are called virtual warehouses.
  • All virtual warehouses access the same stored data without contending with one another, so query capacity can grow by adding warehouses.
  • When a virtual warehouse is resized, all subsequent queries take advantage of the new resources.
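
As a rough sketch of how this looks in practice, the Python helpers below build the SQL statements for creating and resizing virtual warehouses. The helper functions and the warehouse names (etl_wh, bi_wh) are hypothetical, and the list of sizes is only a subset; the statements follow Snowflake's CREATE WAREHOUSE and ALTER WAREHOUSE commands as I understand them, so check the current documentation before relying on the exact options.

```python
import re

# Subset of warehouse sizes, used here only for validation in this sketch.
WAREHOUSE_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name, size):
    """Reject names that are not plain identifiers and sizes outside the subset."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"invalid warehouse name: {name!r}")
    if size not in WAREHOUSE_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")


def create_warehouse_sql(name, size="XSMALL", auto_suspend_seconds=60):
    """Build a CREATE WAREHOUSE statement for an independent compute cluster."""
    _check(name, size)
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {int(auto_suspend_seconds)} AUTO_RESUME = TRUE;"
    )


def resize_warehouse_sql(name, new_size):
    """Build an ALTER WAREHOUSE statement that changes the warehouse size."""
    _check(name, new_size)
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{new_size}';"


# Two separate warehouses for two workloads, then a resize of one of them.
statements = [
    create_warehouse_sql("etl_wh", "LARGE"),
    create_warehouse_sql("bi_wh", "SMALL"),
    resize_warehouse_sql("bi_wh", "MEDIUM"),
]
```

Both warehouses would query the same stored data; only the compute is separate, which is why a loading job on one does not slow down dashboards on the other.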

Cloud Services

  • The services layer is fully maintained by Snowflake and distributed across multiple availability zones to ensure high availability.
  • The cloud services layer is a collection of services that coordinate activities across Snowflake.
  • These services tie together all of the different components of Snowflake in order to process user requests, from login to query dispatch.
  • The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider.
  • The key component of the services layer is the metadata store, which powers a number of Snowflake's distinctive features, such as zero-copy cloning, Time Travel, and data sharing.
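
To give an intuition for why a metadata store can power features like zero-copy cloning and Time Travel, here is a toy Python model. It is not Snowflake's implementation: the MetadataStore class, its methods, and the file ids are hypothetical, and the sketch simply assumes that data files are immutable and that a table version is just a list of file references.

```python
class MetadataStore:
    """Toy catalog: table name -> list of versions, each a tuple of file ids."""

    def __init__(self):
        self._versions = {}

    def commit(self, table, file_ids):
        """Record a new table version; earlier versions are kept untouched."""
        self._versions.setdefault(table, []).append(tuple(file_ids))

    def current(self, table):
        """Return the file ids that make up the latest version of a table."""
        return self._versions[table][-1]

    def as_of(self, table, version):
        """Time-travel-style read: the file ids of an earlier version."""
        return self._versions[table][version]

    def clone(self, source, target):
        """Zero-copy-style clone: copy the file references, not the files."""
        self._versions[target] = [self.current(source)]


store = MetadataStore()
store.commit("orders", ["f1", "f2"])           # initial load
store.commit("orders", ["f1", "f2", "f3"])     # an insert adds file f3

store.clone("orders", "orders_dev")            # no data files are copied
store.commit("orders_dev", ["f1", "f3", "f4"])  # a change in the clone writes f4

latest_orders = store.current("orders")
first_orders = store.as_of("orders", 0)
latest_clone = store.current("orders_dev")
```

In this model, the clone and the original share files f1 and f3, the original is unaffected by changes to the clone, and reading an older version only requires looking up an older list of references.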
