Technical Deep Dive: Automated Column Statistics · Product
Firebolt’s automated column statistics keep optimizer insights up to date, improving query plans and performance automatically—no query changes required.
Explore technical tips and topics from Firebolt experts and the community.
Lyft's Ritesh Varyani on building a unified data platform (Spark, Trino, ClickHouse) balancing OSS & AI reliability.
Maddie Daianu (Head of Data & AI) shares Credit Karma's multi-cloud strategy using BigQuery and an agentic layer.
Technical Deep Dive: Efficient and ACID Compliant Vector Search Indexes in Firebolt
Learn how Late Materialization speeds up top-K queries by delaying column scans.
Ashok Singamaneni built Spark Expectations at Nike to enforce data quality and reduce recomputes in production data pipelines.
Firebolt FuzzBerg accelerates security testing of Iceberg and other file-based readers.
Learn how Instacart's team moved from Elasticsearch to a custom Postgres setup, improving search efficiency and control.
Firebolt ARM Rollout; optimising price performance
Firebolt supports explicit, multi-statement transactions using BEGIN, COMMIT, and ROLLBACK syntax while maintaining ACID compliance and stateless architecture.
Technical deep dive on the powerful MERGE SQL command, enabling simultaneous operations on a single table.
Fabi.ai's Lei Tang on bridging the gap between data teams and business users with intelligent AI systems
Faster queries from the start with smart cache loading on engine boot, upgrade, and scale
Firebolt built Auror to securely validate container images with low latency in Kubernetes clusters.
Firebolt removes redundant joins to boost SQL performance and optimize complex subqueries.
Uber's AI Genie resolves on-call pains using internal data. Spark powers its infrastructure; LLMs revolutionize databases.
This feature lets users avoid spending precious resources on merely maintaining a connection while their client is idle.
Learn how Notion's Lead BI Engineer, Sumit Gupta, uses AI to revolutionize data workflows and generate customer insights.
Firebolt's CTO and VP of Engineering discuss the launch of Firebolt's self-managed version, Firebolt Core.
Hear about the different methods deployed in Firebolt for reducing the number of scanned rows (aka pruning).
Discover how Firebolt delivers seamless, no-downtime upgrades using shadow clusters and real-time performance verification to ensure peak reliability.
Firebolt's new READ_ICEBERG capability does a lot of heavy lifting to provide low-latency access to your Iceberg tables.
Explore cutting-edge PostgreSQL innovations, distributed database architecture, and cloud-native data processing solutions with Yingjun Wu of RisingWave.
Discover the new LOCATION object, a foundational improvement to Firebolt’s data access model.
In this blog post you will learn how GROUPING SETS work and how Firebolt’s implementation uses smart query planning to execute them efficiently.
Discover how Firebolt implements SQL functions for data exploration.
Connect Firebolt to AI tools like Claude and Copilot using the new MCP Server to streamline workflows, run smart queries, and boost data engineering efficiency.
Explore how Firebolt's transaction system maps to the four essential steps—Execute, Validate, Order, Persist. Learn how Firebolt uses MVCC, OCC, and Foundation
Explore the innovative evolution of DuckDB and AI-driven database tech solutions with DuckDB Labs CEO Hannes Mühleisen.
Discover DataStrato’s unified open-source approach to data governance and simplifying data management with Lisa Cao.
We will explore in more detail how Firebolt implements robust operations on geospatial data.
Explore the world of data engineering and marketing, real time data integration, AI and data movement with Daniel Pálma.
Explore how Firebolt processes & optimizes GEOGRAPHY data using S2 cells, shape indexes, and query pruning for peak performance.
Implementing fast geospatial queries in Firebolt using the S2 Geometry Library.
Discover Firebolt’s Zero-Copy Clone feature: a cost-efficient way to clone massive tables instantly without duplicating data.
In this episode of The Data Engineering Show, Chad Sanderson explores the world of data change management.
Wouter Trappers shares his slightly unconventional path from philosopher to data consultant and engineer.
Dive into key highlights from Firebolt's Data Rewind conversation series.
Build trust with a CISO's perspective on Firebolt's security and privacy commitments.
In this episode of The Data Engineering Show, Ryanne Dolan from LinkedIn joins the Bros to discuss LinkedIn's Hoptimator project.
Learn how Firebolt identifies zero-day vulnerabilities as efficiently as its query processor.
Gain insights into how Firebolt was built to redefine cloud data performance and scalability.
Enhance query performance with Firebolt's caching and subresult reuse features.
Learn about making a query engine Postgres-compliant in part one of this in-depth series.
Firebolt engines provide multi-dimensional elasticity, allowing customers to achieve their desired price-performance without downtime.
Andy Pavlo, Associate Professor at Carnegie Mellon University, delves into database internals and optimization.
Too often, expensive resources and man-hours are spent on dashboards no one uses, resulting in zero ROI.
Principles essential for data quality, cost optimization, and data modeling, as adopted by the world's leading companies
Data engineering should be less about the stack and more about best practices. While tools may change, foundational principles will remain constant.
Joe Hellerstein and Joseph Gonzalez inspired generations of database enthusiasts and are now on the show
Megan Lieu on her approach to data advocacy and the power of notebooks, especially when they enable collaboration
This time on The Data Engineering Show: Xiaoxu Gao, an inspiring Python and data engineering expert with 10.6K followers on Medium.
Vin Vashishta, the guy we all love to follow, has never seen a dashboard with positive ROI. He met the bros to talk about replacing BI dashboards with analytics.
Joe Reis and Matt Housley joined the bros for some much-needed ranting, priceless data advice, and good laughs.
As people in the data industry go, Bill Inmon is among the top, often seen as the godfather of the data warehouse.
Meenal Iyer, VP Data at Momentive.ai, talks about enforcing collaboration in large organizations
When it comes to data management, have we come a long way since the early 2000s?
After years of data engineering experience at Airbnb, Netflix, and Facebook, Zach Wilson is now focused on spreading the knowledge in EcZachly
How ZipRecruiter and Yotpo build resilient self-service products that keep customers happy and engineers calm
Barr Moses explains how to make sure your data is accurate in a world where so many different teams are accessing it
Amplitude's cutting-edge data stack and how it processes 5 Trillion real-time events while dealing with mutable data
80% of the code that you write doesn’t work on the first try. But knowing which 80% is not working is the real challenge
Sudeep Kumar, Principal Engineer at Salesforce considers the shift to ClickHouse as one of his biggest accomplishments.
Maxime Beauchemin, the CEO & Founder at Preset and Creator of Apache Superset and Airflow, told the Data Bros about his recipe for a smart data-driven company.
According to Yoav Shmaria, VP R&D Platform at Similarweb, the best way to manage data warehouse costs is tagging
While many corporations are “stuck” on-prem, Klarna made the move and today is a cloud-only company. Gunnar Tangring explains how.
An episode about Eventbrite’s data stack modernization process, and how you get engineers to adopt new technologies
How the data platform evolved as Slack grew from a startup to an IPOed and then acquired company.
Should data engineering AND BI be handled by the same people?
Why would you create ugly data? According to Jens Larsson, don’t even go near raw data.
Ananth Packkildurai is Principal Software Engineer at Zendesk and runs one of the strongest newsletters in data
Gong manages hundreds of thousands of videoconferences and millions of emails PER DAY, which add up to hundreds of TBs.
Bolt's ride-hailing app serves 2B users globally and handles 500K queries daily. Erik Heintare shares how the company plans to solve its biggest data challenges.
Scaling a data platform to support 1.5T events per day requires complicated technical migrations and alignment between hundreds of engineers.
It’s the mother of all development projects. You use it daily. And so do 65M developers around the world.
What does a tech stack that always needs to be at the forefront of technology look like?
How Vimeo handles Data Ops to deal with massive scale
How does Substack's data platform support 500K paying subscribers?
Steven Moy, Software Engineer at Yelp, has joined the Data Bros to discuss Yelp's Data Infrastructure on the Data Engineering Show Podcast.
Canva is one of the hottest, if not the hottest, graphic design platforms out there. How are they handling growth? Krishna Naidu told the Data Bros.
Alexandra Sudilovski, Senior BI Expert at AppsFlyer, told The Data Engineering Show how AppsFlyer manages scale without sacrificing performance.