Performance at Scale

When it comes to performance, we do not compromise. Deliver sub-second analytics consistently as you scale into the Terabytes.

Time to insight: 0.01 sec.

Unlocking more value from data through performance and efficiency

When we talk about performance at Firebolt, it’s not about marginally improving query response times. We engineered Firebolt to deliver an order-of-magnitude performance leap that unlocks the true potential of your data for your business.

An order-of-magnitude leap in performance allows businesses to analyze much more data, at higher granularity, while delivering great user experiences fueled by lightning-fast queries.

It enables companies to build new business models around data and become data-first companies. And thanks to the efficiency of our technology, it is also more affordable than ever before.

How the performance leap is achieved

Our product is purpose-built to tackle the two main bottlenecks in big-data analytics:

Storage Layer bottleneck

Data lakes are built for infinite storage. But they are terribly slow when large amounts of data need to be scanned and moved to the compute layer for querying.

Compute Layer bottleneck

Highly efficient data processing techniques are required to deal with the volume, variety and velocity of data. Queries that cannot be accelerated through scale-out degrade the end-user experience.

The rapid journey from lake to user insight

Your S3 Data Lake

S3 Storage: Parquet, JSON, Avro, CSV

Ingestion

This is where the data’s journey into Firebolt starts. Ingestion is supported from typical raw data formats (e.g., Parquet, Avro, JSON).

F3: Firebolt File Format

Firebolt’s performance story starts at the storage level, with Firebolt’s proprietary file format, F3 (pronounced “Triple F”). As data is ingested into Firebolt, it is sorted, compressed, and indexed while being materialized into the columnar F3 format. Because the data is sorted, it supports a unique type of indexing: sparse indexing.
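To illustrate why sorted data makes a sparse index possible, here is a minimal Python sketch. It is not Firebolt's F3 implementation: the block size, the function names (`build_sparse_index`, `blocks_for_range`, `scan`), and the in-memory list standing in for stored data are all assumptions made for illustration. The idea shown is only that, when rows are sorted, keeping the first key of each block is enough to rule out most blocks before reading them.

```python
from bisect import bisect_left, bisect_right

BLOCK_SIZE = 4  # hypothetical; chosen small so the example is easy to follow


def build_sparse_index(sorted_keys, block_size=BLOCK_SIZE):
    """Keep only the first key of every block of sorted rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), block_size)]


def blocks_for_range(sparse_index, low, high):
    """Return the block numbers that may contain keys in [low, high]."""
    # Start one block before the first block whose first key is >= low,
    # because that earlier block may still end with keys equal to `low`.
    first = max(bisect_left(sparse_index, low) - 1, 0)
    # Blocks whose first key is > high cannot contain matching rows.
    last = bisect_right(sparse_index, high) - 1
    return list(range(first, last + 1))


def scan(sorted_keys, sparse_index, low, high, block_size=BLOCK_SIZE):
    """Read only the candidate blocks and filter the rows inside them."""
    hits = []
    for block in blocks_for_range(sparse_index, low, high):
        start = block * block_size
        block_rows = sorted_keys[start:start + block_size]
        hits.extend(k for k in block_rows if low <= k <= high)
    return hits


if __name__ == "__main__":
    keys = list(range(1, 24, 2))          # 12 sorted keys: 1, 3, ..., 23
    index = build_sparse_index(keys)      # one entry per block of 4 rows
    print(index)                          # [1, 9, 17]
    print(blocks_for_range(index, 10, 12))  # [1]
    print(scan(keys, index, 10, 12))      # [11]
```

In this sketch the index holds 3 entries for 12 rows, and the range query reads one block out of three. The same ratio is what lets an index of this kind stay small enough to cache even when the underlying table is very large.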

Sparse indexing & reduced data scans

Every table in Firebolt has an index attached to it over one or multiple fields. These are “sparse indexes”. Because they can span huge datasets while remaining small, they can be fully cached in the compute layer and be readily available.

Sparse indexes deliver aggressive data pruning, allowing queries to pick up highly granular data ranges without having to scan entire partitions. As a result, Firebolt queries scan much less data than traditional technologies, eliminating one of the biggest performance bottlenecks in cloud query engines.

Cost-based optimizer

Human-written queries may be correct, but not necessarily optimized for performance. This is why we built our own cost-based optimizer, which turns every SQL query into its most performant version, tailored for our query execution engine.

Vectorized Processing

Firebolt relies on the most modern academic research on CPU efficiency for analytic workloads so that you can get faster performance on cheaper compute resources. Firebolt’s query engine implements vectorization and SIMD (single instruction, multiple data) concepts. These deliver blazing query execution speeds by applying query instructions to batches of column data instead of row by row, and through optimized use of the CPU cache.
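As a rough illustration of the vectorized execution model described earlier (applying an operation to batches of column data rather than row by row), the Python sketch below contrasts the two control flows. It is an assumption-laden toy, not Firebolt's engine: the function names, the `price` column, and the batch size are hypothetical, and plain Python does not issue SIMD instructions. A real engine would run each per-batch step as a tight native loop over contiguous memory, which is where the CPU cache and SIMD benefits come from.

```python
from array import array

BATCH_SIZE = 1024  # hypothetical batch size


def sum_row_at_a_time(rows, threshold):
    """Row-by-row: evaluate the filter and the aggregate once per row."""
    total = 0.0
    for row in rows:
        if row["price"] > threshold:
            total += row["price"]
    return total


def sum_batch_at_a_time(price_column, threshold, batch_size=BATCH_SIZE):
    """Columnar: apply each step to a whole batch of one column's values."""
    total = 0.0
    for start in range(0, len(price_column), batch_size):
        # Contiguous slice of a single column.
        batch = price_column[start:start + batch_size]
        # The filter step runs over the whole batch...
        selected = [p for p in batch if p > threshold]
        # ...and then the aggregate step runs over the whole batch.
        total += sum(selected)
    return total


if __name__ == "__main__":
    prices = array("d", [5.0, 12.5, 20.0, 7.5, 30.0])
    rows = [{"price": p} for p in prices]
    # Both strategies compute the same answer; only the execution shape differs.
    print(sum_row_at_a_time(rows, 10.0))
    print(sum_batch_at_a_time(prices, 10.0, batch_size=2))
```

The row-oriented version touches every field of every row and interleaves filtering with aggregation, while the columnar version reads only the one column it needs and finishes each step for a batch before moving to the next.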