November 28, 2024

Tech Stacks and Tradeoffs: Xudo's Founder on Picking the Right Tools for BI Success

Wouter Trappers is the founder of Xudo and shares his slightly unconventional path from philosopher to data consultant and engineer with the Bros in this latest episode of The Data Engineering Show. Wouter’s grounding in philosophy has proved to be a shaping influence on his approach to business intelligence. For Wouter, BI is much more than just a software solution: it is all about change management and aligning leadership with data projects.

Listen on Spotify or Apple Podcasts

Intro/Outro - 00:00:04:

The Data Engineering Show is brought to you by Firebolt, the cloud data warehouse for low-latency analytics. Get $200 credits and start your free trial at firebolt.io.

Benjamin - 00:00:15: Hi everyone, and welcome back to The Data Engineering Show. Today we're super happy to have Wouter joining, who leads the data analytics consultancy called Xudo. Wouter, do you want to say a few intro words and introduce yourself to the audience?

Wouter - 00:00:31: Yeah, so my name is Wouter Trappers. I'm working from Ghent, Belgium, and I'm indeed the founder of Xudo. With Xudo, the intention is to guide people through their business intelligence journey. But myself, I'm a philosopher by background. So I really am an autodidact and I rolled into this profession from job to job, learning a little bit more every time. So yeah, that's a bit my background.

Benjamin - 00:00:57: Background. So from philosopher to basically a BI and data analytics consultant, that's a super, super interesting journey. Maybe tell us a bit more about that and your background.

Wouter - 00:01:09: Yeah, so like you may expect, there's not really a job that fits a philosophy background one-to-one. So after you study a broad subject like philosophy, you still need to orient yourself a bit in the job market. So I first started working as a teacher. But then I found out this was not really my thing. And I started working in the private sector with basic admin jobs. And then I discovered I had a knack for analysis and for Excel. And basically that started my data journey, as it did for maybe a lot of other people as well.

Eldad - 00:01:44: Benjamin, once upon a time, Excel was that thing that ignited all the passion on data and crunching and analyzing data. And it still is. So thanks for the intro.

Wouter - 00:01:55: I think there are still a lot of people who start with Excel, mainly in other jobs than IT. People in marketing or in finance or supply chain who discovered the power of data through Excel. So I still think it's still a gateway into data for a lot of people.

Benjamin - 00:02:13: Definitely. So then after you got hooked on data analytics through Excel, basically, tell us a bit about your journey that ultimately got you to then start your own company now.

Wouter - 00:02:23: Yes, I started indeed in Excel. And I started in the financial controlling department of a large retail store here in Belgium, a European retailer that is also active in the US. There I used Excel and then also Access to automate all the processes for controlling and reporting, also using VBA. And then when the data sets became too large for Excel, getting into Access and learning about databases, how you write SQL code, and then going from there. Once you know the front-end functions of Excel and the back-end SQL that's behind it, you can basically learn any BI tool on the market. So after that, I switched to a company that builds software for medical professions like pharmacies, general practitioners, and so on, everything except for hospitals. It's called Corilus. And there I was introduced to QlikView, now Qlik Sense, a product of the Qlik company, where I started applying the principles that I learned during my time working with Excel and Access, but then in a more professional business intelligence environment. There I automated all the BI flows for the different departments, like after-sales, customer care, sales, finance, and HR. And then the people finally knew what was going on in their company.

Eldad - 00:03:48: MS Access, VBA.

Wouter - 00:03:51: Yes.

Eldad - 00:03:52: Those are terms people mostly don't remember anymore. But let me tell you this. This is how data engineering started. Before VBA, we had to get out of our environment to write code. And VBA was that first programming language that was embedded within our business apps. So Excel, MS Access. Microsoft Word for some reason as well. So people took that skill set, Visual Basic, an amazing language, and started to apply it within their business environment. Therefore, the first generation of data engineering is born. And nobody called that data engineering back then. But I remember that period and I love that period. And yes, we've made great progress since then, but the foundation stayed the same.

Benjamin - 00:04:41: I always love this recurrent segment of the show, where you explain to me, who maybe doesn't know all of these things, what people did back in the day.

Eldad - 00:04:50: Ah, and one last thing, QlikView, which turned into Qlik Sense. I think it originated in Sweden, one of the first BI companies that actually built a full stack. So you had the engine, an in-memory engine, which was revolutionary from a slice-and-dice perspective. And you had the UI in it. And you had their own kind of VBA language, which was a bit different, but same concept. And it was a huge hit back then. The reason I know it so well is because in my previous startup, early on, we competed with QlikView. So QlikView was our nemesis and everything we did was compared to how the QlikView guys did it. So thank you for mentioning that. Great history.

Benjamin - 00:05:36: Also definitely a first on The Data Engineering Show. I never heard QlikView mentioned before. Jumping a couple of years into the present, right? Tell us more about basically what types of challenges you're facing today and then helping companies solve.

Wouter - 00:05:49: Yeah. So I'm a sole consultant at Xudo. The idea first was to help people and companies come up with a data strategy. In my definition, that is finding out how data can help you improve your company by making more impact on revenue, by reducing costs, and by identifying redundancies. Or, the most important use case for business intelligence in my view, the peace of mind that you can trust the data you're looking at to make decisions. So I thought I would build data strategy roadmaps, put building blocks on the roadmap, and then execute the roadmap. But unfortunately, I didn't find any customers for this offering. So I pivoted to consulting, hands-on, working in the companies. I usually have two types of companies I work for at the same time: one larger company that makes me money, and a smaller company where I can really help build their data foundations from the ground up. In the larger companies, I take a role in the data teams of the companies themselves. And in the smaller companies, I work with the tools that the company has at hand or I suggest my own tools that they can work with. I prefer the smaller customers. But of course, the bigger customers pay the bills.

Benjamin - 00:07:07: Right. This is actually super interesting and something we haven't explored that much on the show before. In many cases, we have data engineers coming on who work at companies that have very established data stacks, right? Ten different query engines, Iceberg, five BI tools, and so on. When you're maybe smaller, a company from a more traditional sector kind of bootstrapping that data engineering or data warehousing stack, tell us more about the types of challenges that are common there and what maybe the most common struggles are.

Wouter - 00:07:40: Yeah, I think if people are looking to start their data journey, they usually wind up in their research on the websites of the technology vendors, and they are selling business intelligence projects as software that will solve their problems. But in fact, business intelligence is always a change journey. So you have to approach it like this. If you do BI well, then you will use insights from the dashboards to improve your company, to improve the processes, and then also to maybe adapt the systems to those new processes. And then you go into a business intelligence cycle. So I think it's very important to make sure that the people who want to start using data are aware that if it's done well, it's always a transformational change for the organization. And that's something else than just implementing software. So I put a lot of emphasis on change management as well. And also starting from the top, leadership alignment on which are the KPIs of your business, which KPIs haven't you thought of yet, which are the obvious ones you want to tackle first. And then building kind of a roadmap starting there. But then also, of course, looking at the systems and the data that is in place. Where is the data? Can we access the data? And then building the vision from the top and starting from the data at the bottom and meeting in the middle with nice dashboards to help the business move forward.

Benjamin - 00:09:08: At what point are you usually engaging with companies then, right? Like say they reach out to you if they're interested in your services. I guess at that point they already got into wanting to become data-driven and wanting some data solutions. Or is it in many cases also you who is actually the one pitching the company: hey, you're not using any BI tools or anything right now? Tell us more about that very early part of that process.

Wouter - 00:09:34: Most of those clients come inbound. So they find me and they already have an idea that they want to start using data. And usually they already have some kind of tool that they want to use as a source. And sometimes they also have some kind of like, one of these clients is called on a fine clever. It's a nonprofit organization that helps people with disabilities, and they have a lot of data. They had done a large CRM project with a custom-built application on top to help their coaches help these people. And then they knew they could do more with their data. The tech stack they used to build the CRM and application was Zoho. It's a very modular program. It's an Indian company. And Zoho also has an analytics module. So what they then did is they activated the analytics module. I studied it. I learned the new module and I started building there. But of course, starting from the basics: building a data model, building an architecture, making sure that I could communicate what I was doing with it. Because the intention was to leave the company after a couple of months so that they could run it themselves and also extend it. So a little bit hard, of course, because if you've never worked with tools like this, it can be daunting. So I basically taught them the basics of SQL. Then they use ChatGPT to write extensions to the code. And then once a year they call me in because they don't find the answer in ChatGPT. And then I solve their problems, for instance.

Benjamin - 00:11:08: That's actually super interesting how GenAI is already influencing workloads or especially for maybe less technical users who aren't as exposed to that technology.

Wouter - 00:11:18: Yeah. What I use GenAI for myself is to write simple standalone PowerShell scripts, for instance. It's very powerful for that. I haven't used it in larger deployments because it has to be integrated. It has to be persistent. I haven't used it myself that way. But for simple standalone scripts, it's very useful.

Benjamin - 00:11:38: Yeah. I feel the same way. I've done some benchmarking of Firebolt over the past couple of days and just bootstrapped some Python scripts. And it's really total magic at this point, how easy it becomes, especially if it's closely integrated with your IDE. My day-to-day job is mostly low-level C++ development. There it's a bit less useful still in many cases, but especially for bootstrapping these standalone scripts, it's really crazy the amount of utility you can get from it.

Eldad - 00:12:05: It's also crazy how many tools, apps, and systems you can apply with GenAI.

Wouter - 00:12:12: Yeah.

Eldad - 00:12:12: Like you mentioned Zoho. I would never have imagined that ChatGPT would even be able to remember that. And that's so widely used in much less technical environments. So now having GenAI means we can start using a lot of what we already have. That's especially important for people and companies that have challenges catching up and making sure they're properly aligned on the right stack. So it's not that needed anymore. You can just use GenAI to cover the gaps, and it's fascinating to see, actually to hear, it happen in real life.

Wouter - 00:12:48: I think it's important in this case to understand that in Zoho you have different options to configure your setup. And I chose the option where you can write your own SQL so that you don't get locked into some type of a proprietary syntax of Zoho where there are a lot less experts who know how to work with this.

Benjamin - 00:13:08: One thing I'm super curious about is how do you think that actually changes these types of consulting projects, right? Because in the past you basically, at the end, would have done a handover to their engineering team so they can solve their own problems. Now, in many cases, you're handing over to the AI, which maintains things and does minor fixes itself. Do you think that, for example, changes the way you would create a knowledge base of the project before you give it into other hands? Like, give us your perspective on that. I'm super curious.

Wouter - 00:13:37: I think you still need a knowledge base to document things like architecture and high-level approach. I think the GenAI in this case solves some minor types of issues they may have, or the questions they may want to try to answer using their data and extending the queries a little bit. But I think in this case, the power of GenAI is to put the data environment in the hands of non-technical people who can think logically and who can sort of describe the prompts they need to get the code they can use. So I think you mentioned handing it over to the engineering team, but this nonprofit doesn't have an engineering team. So I'm their engineering team. And I'm not interested, to be fair, in taking up these less interesting support tasks, so then they can work themselves. And then once a year I can help them do a review or put them on track a little bit.

Benjamin - 00:14:31: Right. Super interesting. So as you bootstrap then these data stacks, right? I think the company you talked about before already had like an existing solution where they had data in it. And then when you bootstrap the data warehousing stack you need to adjust to data. Is it, like, you have a blueprint for a data and BI stack that you can apply in many cases. Or is it really completely different, the actual solution you converge to from company to company?

Wouter - 00:15:01: The companies that engage me usually already have a tech stack. So I don't really have the opportunity to recommend the stack at that time. And I learned the environment of the customers.

Eldad - 00:15:12: Would you fire a customer or a prospect for having the wrong stack or having a stack that you're absolutely not willing to work with?

Wouter - 00:15:19: Yeah, I think if I'm not comfortable to work with, I will honestly admit it and probably the customer will not hire me. So that's the way it works.

Eldad - 00:15:27: So there's still the tooling fit between the consultant and the client and there needs to be a match. Not every tool will work with every consultant. And as you say, you will not start something with a stack that you don't trust, for example.

Wouter - 00:15:44: No, because I'm a philosopher in the end. I'm not that technical and I don't want to pretend that I'm something I'm not.

Benjamin - 00:15:51: I think after like a decade plus in the industry, at some point you are also a philosopher, but also a data engineer, a BI engineer. Super interesting. Cool. So when you then work with these larger clients, like how much of the problem basically translates? Like, do you feel like there's specific problems companies have that haven't done BI or data warehousing before? And then at the larger company, it's completely different? Or does it actually translate across company sizes?

Wouter - 00:16:25: I think the first point that I mentioned, that BI projects have to be approached as change projects, also translates to larger customers, where the issue usually is not that they don't have enough technology, but that they have too much. So then they have to start pruning and merging overlapping dashboards together, or making sure that the message and the analysis that's possible in each dashboard is very clear. And then you go more down the governance track. But I think behind this, what I see at larger companies is still the same issue, namely that they are still approaching BI projects as a software project and not as a change project.

Benjamin - 00:17:09: One thing I'm curious about here that you mentioned is that data lineage and data quality is also something that really matters in these transformation processes, because obviously you want to make sure that the dashboard you look at to make business decisions, and maybe change your strategy, has the right data. One thing that I find curious is that if you think about tools for this, like Monte Carlo, which does data quality and these types of things, to me it always felt like something that actually happens quite far down your data journey. So you have your data stack and then at some point you have your first incidents, right? And you realize, oh damn, this dashboard I looked at actually represents something that doesn't reflect reality. It's interesting to me that this is something you already push for that early on in the process. It makes intuitively a lot of sense, but I would actually expect that for many companies, this is something that comes much later and not by the time they bootstrap their BI or data warehousing stack.

Wouter - 00:18:10: You mean governance?

Benjamin - 00:18:11: I mean like these kinds of governance and lineage aspects, data quality aspects, and so on. I always felt like that might be more of a reaction to issues you run into once you establish that the first time. And then for many companies, this is actually not something that they think about on day one.

Wouter - 00:18:29: Yeah, I think if you talk about data lineage, you can do it in two ways. Either you start building your data model where you draw your lines between your source, your intermediate layers, and then your visualization layers. And then you have some kind of plan upfront, and that's the data lineage that you then have to maintain when you start building it, and when you start expanding this environment. And in an ideal world, everyone documents everything and then you can find exactly what data is where and how it is used. Of course, in practice, it's usually not that clear. The people start building and then they have something that works and the business relies on it until there's a question: where is this data coming from? And nobody knows. And then you have to start documenting, reverse-engineering where the data is coming from and what different steps are between the source and the visualization, and that can be extremely complex. So those are the two ways I look at data lineage, one upfront and one after the fact. The second one is data lineage that I've done a couple of times for larger clients who have a lot of data and a lot of complex transformations, just to communicate the complexity and to also make them understand how difficult it is to change something and to build upon the existing data. And then, of course, you have tools to automate this kind of lineage, but in my experience, they're never really able to capture all the complexity that's going on. For instance, if you have a select star somewhere, I don't know any tool that can go back in the code and find the exact fields that are used in the star.
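
To make the lineage idea above concrete, here is a minimal sketch in Python. It assumes lineage is recorded by hand as a list of (upstream, downstream) pairs running from source through intermediate layers to a dashboard. Every table, dashboard, and function name below is hypothetical and only for illustration; nothing here comes from a specific lineage tool or from the projects discussed in the episode. The second function illustrates the select star problem: from the query text alone, the column list behind a star is not visible.

```python
from collections import defaultdict

# Hypothetical lineage edges (upstream, downstream); all names are made up.
EDGES = [
    ("crm.contacts", "staging.contacts"),
    ("crm.sessions", "staging.sessions"),
    ("staging.contacts", "mart.coaching_summary"),
    ("staging.sessions", "mart.coaching_summary"),
    ("mart.coaching_summary", "dashboard.coaching_overview"),
]


def build_upstream_index(edges):
    """Map every node to the set of nodes that feed it directly."""
    index = defaultdict(set)
    for upstream, downstream in edges:
        index[downstream].add(upstream)
    return index


def trace_sources(node, index):
    """Return every node that feeds `node`, directly or indirectly."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in index.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def root_sources(node, index):
    """Answer 'where is this data coming from?': upstream nodes with no parents."""
    return sorted(n for n in trace_sources(node, index) if not index.get(n))


def column_lineage(select_list, upstream_table):
    """Map output columns to upstream columns, or None when a star hides them."""
    if "*" in select_list:
        # The query text alone does not list the columns; a schema is needed.
        return None
    return {col: f"{upstream_table}.{col}" for col in select_list}


if __name__ == "__main__":
    index = build_upstream_index(EDGES)
    print(root_sources("dashboard.coaching_overview", index))
    print(column_lineage(["id", "name"], "staging.contacts"))
    print(column_lineage(["*"], "staging.contacts"))
```

Keeping the edge list next to the data model is the "upfront" variant; rebuilding it after the fact means reading every transformation to recover the same pairs, which is where a star in a query forces a lookup against the actual table schema.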

Eldad - 00:20:14: It also depends on the source, right? Like, usually people think logs, data points that engineers generate while building apps. But most of the world still depends on a CRM, an ERP, complex models, very business-driven vertical data sources, federated sources. So a lot of stuff that many times we tend to forget when we go with the kind of cloud-native logs data. So I think modern lineage and quality is focusing on the latter, on log-generated data, where as we build the apps and generate the logs, we make mistakes as engineers, and that needs to be fixed and that needs to be cataloged. But if you just go a few years back and look at existing data models, this is a whole different game. And that requires the human factor. And I connect to that a lot because people in companies spend fortunes of time and effort in building those things. The business runs on those things. And it's not easy to just go in and say, oh, let's just replace it. So it's like surgery, one little step at a time, trying to figure out how to make the most impact with as little harm as possible. And again, it's also an educational and technological aspect, right? If you're born into data, into tech, from day zero, and you've gone through the usual cycles and education, it's one thing, but it's a very different thing if you already have a huge business and you've operated for many years on previous generations of IT systems. So there's a lot of work ahead. And most companies need consultants, not just to consult, but to really guide them from one era to another era. There's a lot of philosophy in it. So a lot of human factor, a lot of convincing and reasoning, and it has nothing to do with technology.

Wouter - 00:22:13: Sometimes I try to make the connection between philosophy and data. And then the case that you described is, for instance, a way of looking through a lens of philosophy at this kind of data project. It's like you have to talk the same language in order to understand each other, and often that's already an issue within large companies. You have different teams with different cultures. Sometimes the people who built the legacy systems are still there. They're talking a completely different language than the younger hires who are cloud native. And then already recognizing that this can be a problem, that there is a different culture and a different way of using words, can help to unlock some difficulties that those companies have.

Benjamin - 00:23:00: So talking about cloud native hires, in how many cases are the companies you're working with actually on the cloud already? Like is most of the work you're doing on-prem? Or are these companies usually in the cloud already? And if they're not, is this like maybe also part of the journey?

Wouter - 00:23:16: Yeah, like the nonprofit I was talking about, they started in the cloud. Zoho is a cloud platform. So they already started in the cloud. The larger companies usually started on-prem and then migrated some of their tools already to the cloud. Or maybe they are in the process of doing so. Those are the two cases I see at this time.

Benjamin - 00:23:37: Cool. Awesome, Wouter. Anything else at the moment you see popping up a lot in the industry that you're passionate about, that you wanted to chat about today on the podcast?

Wouter - 00:23:48: I think that what I'm most excited about are the people who are on social media talking about going back to basics. A couple of years ago, everyone wanted to do new things. But I think we have to understand that the old things were there for a reason. And I think people are rediscovering some of the basic concepts now. And I'm glad to see that.

Benjamin - 00:24:07: Nice. That's, by the way, a very recurring theme and I think kind of sentiment that a lot of people in the industry have.

Eldad - 00:24:14: Data is like fashion. It goes in cycles. We always get to the same starting point. In a good way, of course.

Benjamin - 00:24:20: Yeah. Very cool. Wouter, thank you so much for being on the show today. Seriously, this was a totally amazing perspective. We really never had a guest who works that much with companies in the early stages of their data journey. So it was amazing to have you on. All of the best with the clients you're working with. And thank you so much for joining.

Eldad - 00:24:40: Thank you.

Wouter - 00:24:41: My pleasure. Thanks for having me.

Intro/Outro - 00:24:44:

The Data Engineering Show is brought to you by Firebolt, the cloud data warehouse for low-latency analytics. Get $200 credits and start your free trial at firebolt.io.
