Performance at Scale

The most performant and efficient technology for big-data analytics, by far.

Time to insight: 0.01 sec.

Unlocking more value from data through performance and efficiency

When we talk about performance at Firebolt, it’s not about marginally improving query response times. We engineered Firebolt from the ground up to deliver an order-of-magnitude performance leap, which will unlock the true potential of your data for your business. 

An order-of-magnitude leap in performance allows businesses to analyze much more data, at higher granularity, while delivering great user experiences fueled by lightning-fast queries.

It enables companies to build new business models around data, and turn into data-first companies. And thanks to the efficiency of our technology, this also means it’s actually more affordable than ever before.

Stop making data compromises

How is the performance leap achieved?

Firebolt’s founding team includes a few of the world’s leading high-performance database experts. Our product is purpose-built to tackle the two main bottlenecks in big-data analytics:

Storage Layer bottleneck

Data lakes are built for infinite storage. But they are terribly slow when large amounts of data need to be scanned and moved to the compute layer for querying.

Compute Layer bottleneck

Decades-old techniques for processing data are not efficient enough for today’s data sets. Queries that cannot be accelerated through scale-out severely degrade the end-user experience.
Your S3 Data Lake
S3 Storage

Lake to insight engine

Parquet
JSON
Avro
CSV

This is where the data’s journey into Firebolt starts. Ingesting data into Firebolt is also dramatically faster compared to other warehouses. Ingestion is supported from typical raw data formats (e.g., Parquet, Avro, JSON), and also from streaming sources like Kafka.

Firebolt Pipeline

Through the pipeline, the ingested data is sorted, compressed, indexed, and materialized into F3, Firebolt’s proprietary file format. This storage format enables a variety of benefits, from highly efficient, low-footprint indexes to merging data for quick updates.

F3: Firebolt File Format

Behind the scenes, Firebolt automatically creates and manages a special type of index called a “sparse index”. Because sparse indexes stay small even when they span huge datasets, they can be fully loaded into memory. They let each query pull only highly granular parts of the data from F3, avoiding time-consuming reads of data that ends up playing no part in the query.
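To make the idea concrete, here is a minimal sketch of a sparse index in Python. It is an illustration under stated assumptions, not Firebolt’s implementation: the data is assumed to be sorted by key and split into fixed-size blocks, and the names `build_sparse_index` and `blocks_for_range` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_sparse_index(sorted_keys, block_size):
    """Keep only the first key of every block of `block_size` rows.

    Assumption for this sketch: `sorted_keys` is already sorted ascending.
    The index has one entry per block, so it stays small even for huge data.
    """
    return [sorted_keys[i] for i in range(0, len(sorted_keys), block_size)]


def blocks_for_range(index, lo, hi):
    """Return the block numbers that may contain keys in [lo, hi]."""
    # A block that starts before `lo` can still contain `lo`, so step back
    # one block. bisect_left keeps this conservative when keys repeat.
    first = max(bisect_left(index, lo) - 1, 0)
    # Blocks whose first key is greater than `hi` cannot contain a match.
    last = bisect_right(index, hi)
    return range(first, last)


# Usage: 1,000 sorted keys in blocks of 100 rows give a 10-entry index.
keys = list(range(1000))
index = build_sparse_index(keys, block_size=100)
candidate_blocks = list(blocks_for_range(index, 250, 420))
print(candidate_blocks)  # only these blocks need to be read from storage
```

In this example a range query touches 3 of the 10 blocks; the other 7 are never fetched, which is the effect the paragraph above describes.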

Human-written queries may be correct but are not necessarily optimized for performance. This is why we built our own cost-based optimizer, which turns every SQL query into its most performant version, tailored for our query execution engine.

Sparse Indexing

Cost-based optimizer

Just-In-Time (JIT) compilation means that every SQL query is compiled into a program optimized for that particular query, so the CPU can run it as fast as possible. After the execution plan is built and turned into source code, that code is compiled by a compiler-backend library into a CPU-optimized program.
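The plan-to-source-to-compiled-program flow can be sketched in a few lines of Python. This is only an analogy: it uses Python’s built-in `compile()`, which produces bytecode rather than native machine code, and the function `compile_filter_sum` and the query shape it handles are hypothetical.

```python
def compile_filter_sum(column, op, constant):
    """Generate and compile a function specialized for one query shape:
    SELECT SUM(column) FROM rows WHERE column <op> constant.
    """
    # Validate inputs before they are pasted into generated source code.
    if op not in {"<", "<=", ">", ">=", "==", "!="}:
        raise ValueError(f"unsupported operator: {op}")
    if not isinstance(constant, (int, float)):
        raise TypeError("constant must be a number")

    # Step 1: turn the "plan" into source code with the column name,
    # operator, and constant baked in, so nothing is looked up per row.
    source = (
        "def query(rows):\n"
        "    total = 0\n"
        "    for row in rows:\n"
        f"        value = row[{column!r}]\n"
        f"        if value {op} {constant!r}:\n"
        "            total += value\n"
        "    return total\n"
    )

    # Step 2: compile the generated source and extract the function.
    namespace = {}
    exec(compile(source, "<generated query>", "exec"), namespace)
    return namespace["query"]


# Usage: compile once per query, then run the specialized program.
rows = [{"amount": 5}, {"amount": 20}, {"amount": 30}]
query = compile_filter_sum("amount", ">", 10)
result = query(rows)
print(result)
```

A real engine would hand the generated code to a native compiler backend; the point here is only that the compiled program contains no generic interpretation logic for this query.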

Firebolt draws on recent academic research on CPU efficiency for analytic workloads so that you can get faster performance on cheaper compute resources. Firebolt’s query engine implements vectorization and SIMD (single instruction, multiple data) concepts. These speed up query execution by applying query instructions to batches of column data instead of row by row, and by making better use of the CPU cache.
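The batch-at-a-time model can be sketched as follows. Pure Python does not issue SIMD instructions, so this shows only the shape of vectorized execution (columns stored as contiguous arrays, each operator applied to a whole batch), not its speed. The batch size and the names `batches` and `sum_where_greater` are assumptions for illustration.

```python
from array import array

BATCH_SIZE = 1024  # assumed batch size; real engines tune this to the CPU cache


def batches(column, size=BATCH_SIZE):
    """Yield consecutive slices of a column, one batch at a time."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def sum_where_greater(prices, quantities, threshold):
    """SUM(price * quantity) WHERE quantity > threshold, batch by batch."""
    total = 0.0
    for price_batch, qty_batch in zip(batches(prices), batches(quantities)):
        # Operator 1: evaluate the filter over the whole batch at once,
        # producing a selection mask instead of branching per row.
        mask = [qty > threshold for qty in qty_batch]
        # Operator 2: multiply and aggregate only the selected positions.
        total += sum(
            price * qty
            for price, qty, keep in zip(price_batch, qty_batch, mask)
            if keep
        )
    return total


# Usage: columns are stored separately as typed, contiguous arrays.
prices = array("d", [1.0, 2.0, 3.0, 4.0])
quantities = array("q", [1, 5, 10, 2])
revenue = sum_where_greater(prices, quantities, threshold=2)
print(revenue)
```

In a native engine, each of the two operators would be a tight loop over a typed array, which is the kind of loop a compiler can map onto SIMD instructions.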

JIT compilation

Vectorized Processing

Granular data pruning

Queries in Firebolt leverage sparse indexes to pull only the rows from S3 that are relevant to the query, at the most granular level. This is crucial for performance in data lake environments, as fetching unnecessary data from the slow storage layer carries a huge performance penalty. Other technologies typically load unpruned, larger-than-necessary chunks of data, resulting in very slow response times that are not suitable for interactive big-data analytic workloads.

Firebolt POC:
Proving order-of-magnitude performance lift over your data

Curious to learn more?

Discover jaw-dropping performance over your most challenging data and queries
