AWS re:Invent 2022 Keynote Recap for Data Professionals

December 5, 2022

Multiple contributors

AWS re:Invent has always been about building anticipation and then delivering on the expectations of us technologists. Year after year, AWS’ consistency in pushing innovation forward is truly exciting. The question is, what goodies did we get this year as data practitioners? As an AWS partner, we at Firebolt leverage the best of AWS services to deliver our Cloud Data Warehouse for Data Apps offering. Firebolt’s price-performance-efficiency mindset is very much in tune with AWS offerings. As fundamental offerings evolve within AWS, we pass on the price-performance improvements to our customers. So here is how we see the AWS re:Invent announcements.

For starters, the ‘Monday Night Live’ keynote by Peter DeSantis was a great way to warm up for the week ahead. It set the stage by introducing various innovations from AWS and explaining ‘Why AWS does what it does’ from an infrastructure perspective, which makes it a must-watch for everyone, even if you are not into CPUs, memory, or networking protocols. This session was focused on performance. To quote Peter DeSantis verbatim: ‘Great performance is the result of innovating from the ground up and then continually investing over time and being committed to performance’. Love it! Great performance does not happen overnight. You need the right foundations and it takes time. The drive at AWS to improve performance at the hypervisor, CPU, and networking levels is truly worth watching. The big news here is that Nitro and Graviton enhancements continue to improve price-performance. Price-performance efficiency is something that we all should care about, not just for the wallet but for the planet. AWS is definitely on the right track in how it approaches infrastructure improvements holistically. The implications are widespread across data, analytics, AI/ML, and much more. This was the perfect session to tee up Adam Selipsky’s kickoff keynote.

Adam Selipsky’s keynote started off with a focus on sustainability and environmental commitments. Paying attention to the health of the planet is top of mind for the cloud giant, and it is definitely something we all need to work on in our daily behavior as humans. From a data practitioner perspective, there was a spate of new announcements, not atypical of AWS. Key announcements, several of them in preview, included Amazon OpenSearch Serverless, zero-ETL integration between Aurora and Redshift, Apache Spark integration with Redshift, Amazon Athena for Apache Spark, and a brand new governance solution in the form of Amazon DataZone. The common theme across these announcements is an acknowledgement of the complexities that data practitioners have to deal with to glue solutions together. There are different worlds to bridge. Zero-ETL integration promises a flow of data from Aurora into Redshift, simplifying the process of building data pipelines. It sounds promising and it will be great to see the details. There is also a move towards in-database functionality with tools like dbt. If you are a data engineer, you may wonder where zero-ETL stops and where dbt picks up. Amazon DataZone is something many AWS customers had been asking for. From a data governance perspective, managing a comprehensive data catalog with clear separation of duties across data producers and consumers, combined with business perspectives, is not a simple task. AWS had previously relied primarily on partner offerings, but with other hyperscalers offering competitive services in this area, it only makes sense that AWS has an offering of its own. There were additional enhancements to QuickSight, newer and faster instance types, and others.

If you are a data analytics or data science professional, check out Swami Sivasubramanian’s keynote from day 3. It took us one level deeper into the services that were announced the previous day. Additional announcements included services in preview, such as AWS Glue Data Quality, which introduces data quality checks for your data lake or data pipelines, and AWS Data Exchange for Amazon S3 and Lake Formation, among others. With these announcements, AWS is branching out from its core offerings to additional capabilities of the Modern Data Stack, essentially spelling out a data platform strategy with integration across various services. Clearly, AWS is pushing the boundaries when it comes to having the most comprehensive set of cloud services: 200+ and counting.

Overall, price-performance improvements are always welcome, and better integration across the data stack is something that we all should strive for. But we can’t stop there. Building the right foundations for speed, scale, and efficiency does not happen overnight. Every little progression counts. At Firebolt, we are on this journey to deliver insights swiftly and efficiently, together with AWS.
