Firebolt also uses a decoupled storage and compute architecture, and adds storage and query optimizations for 10x better performance and increased efficiency. It offers isolated tenancy like Snowflake, but it currently runs only on AWS. It allows SQL to be run against external data formats to support ingestion, and it lets you choose any node type and any number of nodes for each engine (cluster).
Snowflake was one of the first decoupled storage and compute architectures, which made it among the first to offer nearly unlimited compute scale, workload isolation and horizontal user scalability. It runs on AWS, Azure and GCP. By default it is multi-tenant for both compute and data, but it can run in a Snowflake VPC. However, it only provides clusters of 1, 2, 4, … 128 nodes, with no choice of node size. To get the biggest nodes you need to choose the biggest cluster.
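To make the fixed-size limitation concrete, here is a minimal sketch of how a size ladder forces over-provisioning. It assumes the cluster sizes double at each step from 1 to 128 nodes; the function names are hypothetical and are not part of any Snowflake API.

```python
# Assumption: available cluster sizes double at each step, from 1 to 128 nodes.
FIXED_SIZES = [2 ** i for i in range(8)]  # 1, 2, 4, 8, 16, 32, 64, 128


def smallest_fixed_cluster(required_nodes: int) -> int:
    """Return the smallest fixed cluster size that covers the requirement."""
    if required_nodes < 1:
        raise ValueError("required_nodes must be at least 1")
    for size in FIXED_SIZES:
        if size >= required_nodes:
            return size
    raise ValueError("requirement exceeds the largest fixed cluster size")


def overprovisioned_nodes(required_nodes: int) -> int:
    """Nodes paid for but not needed when sizes come from a fixed ladder."""
    return smallest_fixed_cluster(required_nodes) - required_nodes


# A workload that needs 20 nodes has to take the 32-node cluster.
chosen = smallest_fixed_cluster(20)
wasted = overprovisioned_nodes(20)
```

With a free choice of node count, the same workload could run on exactly 20 nodes.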
Firebolt provides the same scalability benefits of a decoupled storage and compute architecture. It improves compute efficiency through its optimizations and by letting you choose any node size and any number of nodes for each cluster. It also improves write scalability and supports continuous ingestion without requiring partition rewrites for every individual write. Firebolt also improves network efficiency by accessing only the data ranges needed, not entire partitions.
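The idea of reading only the data ranges needed can be illustrated with a generic sparse index over sorted data. This is a sketch of the general technique, not Firebolt's actual storage format or index implementation. The function names, the range size and the sample keys are all assumptions made for illustration.

```python
from bisect import bisect_right


def build_sparse_index(sorted_keys, range_size):
    """Keep only the first key of each fixed-size range of sorted data."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), range_size)]


def ranges_to_read(index, lo, hi):
    """Return the range numbers that can contain keys in [lo, hi].

    Assumes keys are unique and sorted, so a key belongs to the last
    range whose first key is less than or equal to it.
    """
    first = max(bisect_right(index, lo) - 1, 0)
    last = max(bisect_right(index, hi) - 1, 0)
    return list(range(first, last + 1))


# Hypothetical data: 1,000 sorted keys split into 10 ranges of 100 keys.
keys = list(range(1000))
index = build_sparse_index(keys, range_size=100)

# A query for keys 250..320 touches 2 of the 10 ranges.
needed = ranges_to_read(index, 250, 320)
```

Only the ranges listed in `needed` have to cross the network; the rest of the data stays in remote storage.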
Snowflake delivers strong scalability with its decoupled storage and compute architecture. However, it scales inefficiently for queries that require larger nodes, because the only way to get larger nodes is to choose a larger cluster. It is also better suited to batch writes and upserts than to frequent small writes, because each write requires entire micro-partitions to be rewritten. Snowflake also transfers entire micro-partitions over the network, which creates a bottleneck at scale.
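A rough back-of-the-envelope sketch shows why rewriting whole micro-partitions favors batching. It assumes every write touches exactly one 50 MB micro-partition (the low end of the size range cited in this comparison) and that a batch of 1,000 rows lands in a single micro-partition. Both are simplifying assumptions, and the function name is hypothetical.

```python
def rewritten_mb(writes: int, partitions_per_write: int, partition_mb: float) -> float:
    """Total data rewritten if each write rewrites whole micro-partitions."""
    return writes * partitions_per_write * partition_mb


# 1,000 single-row writes, each rewriting one 50 MB micro-partition.
single_row_mb = rewritten_mb(writes=1000, partitions_per_write=1, partition_mb=50)

# One batch carrying the same 1,000 rows into one micro-partition.
batched_mb = rewritten_mb(writes=1, partitions_per_write=1, partition_mb=50)

# How many times more data the single-row pattern rewrites.
amplification = single_row_mb / batched_mb
```

Under these assumptions the row-at-a-time pattern rewrites 1,000 times more data than the batch for the same rows.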
Firebolt has clearly demonstrated that storage and compute optimization, along with indexing, makes a big difference in performance. Benchmarks by Firebolt, customers and prospects have demonstrated performance gains of 4x to 6,000x across a wide range of queries compared to any of the alternatives. This comes in part from more efficient storage access, where its F3 format and remote data access fetch only the data needed, not entire partitions. Query optimization combined with extensive indexing also makes a big difference, as demonstrated by specific query examples showing the impact of primary, aggregating and join indexes. The choice of any size and number of nodes for each engine helps as well. Firebolt has also added native semi-structured data support and continuous, low-latency ingestion.
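As a conceptual illustration of why an aggregating index helps, the sketch below precomputes per-group aggregates once so that later queries read a small summary instead of scanning every row. It shows the general idea only and is not Firebolt's implementation. The function name, column names and sample rows are hypothetical.

```python
from collections import defaultdict


def build_aggregating_index(rows, group_key, value_key):
    """Precompute SUM and COUNT per group in a single pass over the rows."""
    index = defaultdict(lambda: {"sum": 0, "count": 0})
    for row in rows:
        entry = index[row[group_key]]
        entry["sum"] += row[value_key]
        entry["count"] += 1
    return dict(index)


# Hypothetical fact rows.
orders = [
    {"country": "US", "amount": 10},
    {"country": "US", "amount": 5},
    {"country": "DE", "amount": 7},
]

agg_index = build_aggregating_index(orders, "country", "amount")

# A query such as "total and average amount for US" becomes a lookup.
us_total = agg_index["US"]["sum"]
us_average = agg_index["US"]["sum"] / agg_index["US"]["count"]
```

The trade-off is extra work at ingestion time in exchange for queries that no longer scale with the number of raw rows.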
Snowflake is a perfect example of a modern storage and compute architecture that is not optimized for performance. Its data access is not optimized, and it has no indexing for fetching the exact data it needs. It only keeps track of data ranges within each micro-partition. Micro-partitions range in size from 50 MB to 150 MB uncompressed, and their data ranges can overlap. Whenever Snowflake does not have the data cached locally in the virtual warehouse, it has to fetch all of the micro-partitions that might hold the data, which can take seconds or longer. While Snowflake does some query plan optimization, it does not show up in query performance: queries are 4x to 6,000x slower than on Firebolt in customer benchmarks. The two biggest explanations are the lack of indexing and limited query plan optimization. However, Snowflake does provide result set caching across virtual warehouses in addition to SSD caching within each virtual warehouse. This delivers solid performance for repetitive query workloads after the first query.
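The effect of tracking only per-partition data ranges can be sketched as follows. The partition boundaries and sizes are invented for illustration, with sizes kept inside the 50 MB to 150 MB range mentioned above. The class and function names are hypothetical and do not correspond to any Snowflake API.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    min_key: int  # smallest key stored in the partition
    max_key: int  # largest key stored in the partition
    size_mb: int  # uncompressed size


def candidate_partitions(partitions, key):
    """Every partition whose [min_key, max_key] range could hold the key."""
    return [p for p in partitions if p.min_key <= key <= p.max_key]


# Hypothetical partitions with overlapping key ranges.
partitions = [
    MicroPartition(0, 500, 100),
    MicroPartition(300, 900, 150),
    MicroPartition(850, 1200, 50),
    MicroPartition(1100, 2000, 120),
]

# Looking up a single key still requires fetching every candidate in full.
hits = candidate_partitions(partitions, 400)
mb_to_fetch = sum(p.size_mb for p in hits)
```

Because the ranges overlap, a lookup for one key pulls two whole partitions here, even though the matching rows are a tiny fraction of that data.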
Firebolt offers many of the same benefits as Snowflake with its decoupled storage and compute, particularly workload isolation and support for high user concurrency. It is the only cloud data warehouse that has optimized compute and storage together for faster ingestion, network and query performance. Its F3 format enables sub-second network access, and its indexing and query optimization enable sub-second query performance. It also uniquely enables continuous ingestion at scale. This makes Firebolt well suited not only to reporting and dashboards, but also to interactive and ad hoc use cases, as well as operational and customer-facing use cases.
Snowflake has broader support for use cases beyond traditional reporting and dashboards. Its decoupled storage and compute architecture enables you to isolate different workloads to meet SLAs, and it also supports high user concurrency. However, Snowflake does not provide interactive or ad hoc query performance, because of inefficient data access along with a lack of extensive indexing and query optimization. Snowflake also cannot support streaming or low-latency ingestion at intervals below one minute. Together, these limitations exclude Snowflake from many operational use cases and most customer-facing applications that require second-level performance.
Firebolt is the only data warehouse with decoupled storage and compute that supports ad hoc and semi-structured data analytics with sub-second performance at scale. It also combines simplified administration with choice and control over instance types, and delivers 10x or greater efficiency for the best price-performance. This makes it the best choice for ad hoc, high-performance, operational and customer-facing analytics.
Snowflake, as a more modern cloud data warehouse with decoupled storage and compute, is easier to manage for reporting and dashboards, and it delivers strong user scalability. It also runs on more clouds than just AWS. But like the others, Snowflake does not deliver sub-second performance for ad hoc, interactive analytics at any reasonable scale, and it does not support continuous ingestion well. It is also often very expensive to scale, especially for large data sets, complex queries and semi-structured data.