Data engineering from the early 2000s till today - BlackRock
When it comes to data management, have we come a long way since the early 2000s?
A primer on analyzing semi-structured data (Part 2)
With the data ingested, let’s delve right into two popular frameworks for visualizing the data.
A primer on analyzing semi-structured data (Part 1)
This guide will provide you with the fundamental knowledge necessary to handle semi-structured data effectively.
Distributed Query Execution in Firebolt
In this blog, we focus on distributed query execution as an integral part of Firebolt.
Zach Wilson on what makes a great data engineer
How good you are at Spark or Flink ≠ how good you are at data engineering. Zach Wilson explains.
Data quality with ‘dbt’ and Firebolt
dbt data quality - Implementing data quality tests and using dbt extensions for enhanced data quality checks.
How ZipRecruiter and Yotpo power self-service data platforms that work
How ZipRecruiter and Yotpo build resilient self-service products that keep customers happy and engineers calm
25 Ad Tech Data Pros: Workshop Summary
In a recent workshop, 25 data pros working in the Ad Tech industry discussed querying large data sets efficiently
How we mastered dbt: A true story
At Firebolt, we found out that a duet of dbt and Paradime works for our needs.
Data Observability with Millions of Users - Barr Moses
Barr Moses explains how to make sure your data is accurate in a world where so many different teams are accessing it
Analyzing the GitHub Events Dataset using Firebolt - Writing a Data App using Java
Writing a small data app using the Firebolt JDBC driver.
Analyzing the GitHub Events Dataset using Firebolt - Incremental Updates with Apache Airflow
Looking at GithubArchive dataset of public events - leveraging Apache Airflow workflows for keeping our data up-to-date.
Analyzing the GitHub Events Dataset using Firebolt - Using Jupyter for data exploration
In this blog we will discover the data using Streamlit and Jupyter and the Firebolt Python SDK.
Analyzing the GitHub Events Dataset using Firebolt - Querying with Streamlit
Writing a data app using Streamlit, Jupyter and the Firebolt Python SDK. A multi-part blog series.
Event streams in Firebolt
Event streams have always been problematic to analyze in SQL. This is how we do it.
How Amplitude Engineers Process 5 Trillion Real-time Events
Amplitude's cutting-edge data stack and how it processes 5 Trillion real-time events while dealing with mutable data
AWS re:Invent Keynote Recap for Data Professionals
AWS re:Invent 2022 was all about building anticipation and delivering on the expectations of us technologists.
Making Observability a Key Business Driver
80% of the code that you write doesn’t work on the first try. But knowing which 80% is not working is the real challenge
Semi-structured data modeling
How to ingest, store and query JSON data, for example, is a consistent question on the minds of customers.
PostgreSQL Swiss army knife and The analytics workload
Is Postgres truly the right engine for analytics?
Firebolt and Data Mesh
Data Mesh is hot stuff. But from a technology perspective it’s still not very well defined.
Big Data Analytics for Gaming
In our recent ‘Big Data Analytics for Gaming Workshop’ we let the audience do the talking. Here’s a summary of the talk.
A ClickHouse Review from a Practitioner’s Point of View
Sudeep Kumar, Principal Engineer at Salesforce, considers the shift to ClickHouse one of his biggest accomplishments
Hey David and Tristan, this is where Firebolt is at
"When I see David Jayatillake and Tristan Handy comment on Firebolt's approach it is clear that Firebolt is on track."
The Creator of Airflow About His Recipe for Smart Data-Driven Companies
Max walks the Bros through his recipe for a smart data-driven company, and the genesis of Airflow, Superset & Presto.
Druid Architecture Compared to Firebolt - A Practitioner’s View
Firebolt provides an alternative to Druid, delivering fast response times, high concurrency and the convenience of a Saa
Cloud data warehouse costs: Look before you leap
In this post, we look at factors to consider when building a data warehouse.
How Similarweb Delivers Customer Facing Analytics Over 100s of TBs
According to Yoav Shmaria, VP R&D Platform at Similarweb, the best way to manage data warehouse costs is tagging
Faster Data Replication from Kafka Using Hevo and Firebolt
How to Set Up Your Data Analytics Stack with Kafka, Hevo, and Firebolt.
A new level of efficiency in analytics
Are you spending more than you planned on your Data Warehouse? Analyze more. Use less compute resources.
Loading Snowplow data into Firebolt with dbt
How to enable sub-second analysis across billions of rows of customer behavior data: Part I - Setting up the load
How Klarna Designed a New Data Platform in the Cloud
Klarna is one of the leading fintech companies in the world, valued at $45B.
How Eventbrite is Modernizing its Data Stack
An episode about Eventbrite’s data stack modernization process, and how you get engineers to adopt new technologies
Simplicity and Power of Agg Indexes at Scale
One of the ways Firebolt is able to support data-driven applications is by leveraging aggregating indexes on the tables.
A Deep Dive into Slack's Data Architecture
How the data platform evolved as Slack grew from a startup to an IPOed and then acquired company.
Transitioning Scopely’s 5.5 PB Data Platform to the Modern Data Stack
Should data engineering AND BI be handled by the same people?
Getting Rid of Raw Data with Jens Larsson
Why would you create ugly data? According to Jens Larsson, don’t even go near raw data.
How Zendesk engineers manage customer-facing data applications
Ananth Packkildurai is Principal Software Engineer at Zendesk and runs one of the strongest newsletters in data
Future of Performance is Not About Performance
The data warehousing market has gone absolutely mad over performance. Why is this the case?
SQL: Thinking in Lambdas
Many programming languages are imperative – tell the compiler how to operate by providing the instructions in order.
Firebolt Announces Series C Round at $1.4 Billion Valuation to Build the World's Fastest and Most Versatile Cloud Data Warehouse
Demand from engineering teams has skyrocketed since Firebolt emerged from stealth last year
How are those data intensive customer facing apps engineered at Gong?
Gong manages hundreds of thousands of videoconferences and millions of emails PER DAY, which add up to hundreds of TBs.
How Bolt Engineers Are Designing Its Next-Gen Data Platform
Bolt engineers are in the midst of designing a new next-gen data platform
Firebolt Indexes in Action (clustered and non-clustered)
Indexes are the primary way for users to accelerate query performance in Firebolt. Learn about them here.
How did Agoda scale its data platform to support 1.5T events per day?
Scaling a data platform to support 1.5T events per day requires complicated technical migrations
Cloud Data Warehouse: The Hitchhikers Guide
Everything you needed to know about cloud data warehouses but were afraid to ask...
Postgres and MySQL for Analytics - Meeting the 1 second SLA
Learn when to use Postgres, MySQL, in-memory databases, HTAP, or data warehouses to meet the 1 sec SLA in analytics.
Diving Into GitHub's Data Stack
It’s the mother of all development projects. You use it daily. And so do 65M developers around the world.
Top 10 Ways to Improve Cloud Data Warehouse Performance (And how it’s done in Firebolt)
Learn the top 10 tips for improving your cloud data warehouse performance.
Building Data Products For Data Engineers
What does a tech stack that always needs to be at the forefront of technology look like?
Snowflake vs Databricks vs Firebolt
More and more, people are asking me “how do you compare Snowflake and Databricks?” We did our best to answer.
How Vimeo Keeps Data Intact with 85B Events Per Month
How does Vimeo handle Data Ops to deal with massive scale?
How Substack's Data Platform Supports 500K Paying Subscribers
How does Substack's data platform support 500K paying subscribers?
A Technical Deep Dive to Yelp's Data Infrastructure — with Steven Moy
Steven Moy thoroughly explains Yelp’s data architecture under the hood and how it evolved over the past ten years.
How do Canva's engineers and analysts scale data platforms to keep up with growth? — with Krishna Naidu
Canva is one of the hottest, if not the hottest, graphic design platforms out there.
How AppsFlyer manages scale without sacrificing performance
AppsFlyer not only deals with 120 billion events per day, but does so while growing quickly as a company
Firebolt Ignites Growth with a $127M Series B Funding Round
Upstart cloud data warehouse sees rapid growth in 2021, plans to double its workforce
Amazon Athena 2.0 (Presto 0.217) - What’s New
Amazon Athena engine version 2 - what’s new and big enough to call this a 2.0 release?
The real meaning of a data lake
Making sense of data lakes, delta lake, lakehouse, data warehouse and more.
JSON the SQL: Choosing the best data warehouse for semi-structured data
Working with semi-structured data can be more like a Jason (horror movie) Sequel than JSON SQL.
ETL Vs. ELT - Know The Differences
Explore the significant differences between ELT and ETL data integration processes and find the best option for you.
How to accelerate Looker performance on Redshift, Snowflake and BigQuery
How to accelerate Looker performance on Redshift, Snowflake and BigQuery? Short-term fixes and the long-term solutions.
The Double Redshift: About Redshift Alternatives and Limits
When do you need to shift from Redshift, and what are the alternatives? Learn here.
How to upgrade from Tableau extracts to a fast Tableau live connection
Learn how to upgrade from Tableau extracts to a Tableau live connection to deliver sub-second performance every time.
AWS Athena Error: Query exhausted resources at this scale factor
If you’re using Amazon Athena, you may have seen these errors. About AWS Athena errors and how to deal with them.
Snowflake vs. Redshift (vs. Firebolt)
A detailed comparison of Snowflake vs. Redshift, by architecture, scalability, performance, use cases and cost.
Athena vs. Redshift Spectrum vs. Presto
Learn some simple rules of thumb you can use to choose the best federated query engine for your company's needs.
Choosing Between The Best Federated Query Engine And a Data Warehouse
How companies should avoid creating a slow, many-headed federated Gorgon out of Athena.
Why even simple queries can be slow (but not with Firebolt)
Why even simple queries can be slow in cloud data warehouses, and how Firebolt uses indexing to prune data and stay fast.
How To Support Ad Hoc Analysis - Part 2
How to support ad hoc analysis - Part 2: The right ad hoc analytics architecture
How To Support Ad Hoc Analysis - Part 1
How to support ad hoc analysis: Part 1 - The 4 requirements for an ad hoc analytics architecture