# Large scale data engineering at Momentive.ai - Meenal Iyer



As companies scale, data can get messy. The data team says one thing, the business team says something else. Meenal Iyer, VP Data at Momentive.ai, met the Data Bros to talk about enforcing collaboration in large organizations to ensure what she considers the three most important factors in data: Adoption, Trust, and Value.

<Video provider="youtube" id="RPDxGu_P04w" title="Large scale data engineering at Momentive.ai - Meenal Iyer" />

Listen on [Spotify](https://open.spotify.com/episode/15CdXgmYIWYsV21xkcJ1hw) or [Apple Podcast](https://podcasts.apple.com/us/podcast/large-scale-data-engineering-at-momentive-ai-meenal-iyer/id1561927688?i=1000620875451)

Benjamin: Hi everyone. Welcome back to the Data Engineering Show. Welcome Meenal our guest today. Welcome Eldad, kind of back from vacation after missing out on the last two episodes.

Eldad: Glad to be back.

Benjamin: Good to have you. So we have a great episode planned today. We have Meenal Iyer joining us from Momentive.ai. So, for anyone who hasn't heard about that before, maybe you've used SurveyMonkey; basically, Momentive is the parent company. And I'm sure Meenal will tell us in just a minute what other products they're working on, what the company is doing, and so on. Meenal is the VP of Data there. So, we have a great conversation planned, talking about the data challenges there, data leadership, those types of things.

Meenal, do you just want to quickly introduce yourself, tell our listeners about Momentive, and then we can jump right in?

Meenal: Awesome. Thank you again for the opportunity. Hi, Benjamin and Eldad. Hi everyone. I'm Meenal, I head the data team here at Momentive and now going forward, it's going to be called SurveyMonkey again. We are still in the process of renaming our company. So just...

Benjamin: So it's like a flip flop, basically going from SurveyMonkey to Momentive and then back to SurveyMonkey.

Eldad: Because everyone knows SurveyMonkey.

Meenal: Exactly, yeah. Everyone knows SurveyMonkey.

Eldad: Yeah, it works. Sorry, go on.

Meenal: Yeah, I've had an exciting 11 months over here, looking to build a data platform that allows the organization to make data-driven decisions and then produce value for the organization itself. We were just acquired by a private equity firm and, as you know, in such scenarios it becomes more imminent that the data team starts producing more value than it typically does in an organization. It's going to be an interesting journey, starting now and going forward. So, I'm super excited, and yeah, very excited to be on the show.

Benjamin: Awesome. Super cool. Do you want to kind of give us a quick recap basically of what got you into kind of big data, data engineering, those types of things? Because you have a bunch of experience across many different companies, so I'm sure our listeners would love to learn a bit more about that.

Meenal: Oh, absolutely. So, no big story there. I have a long story, but I got into data by chance. Realized I really enjoyed working in it and figured that I want to make my career here. So, I kind of dabbled in different industries. The reason being that I wanted to learn new businesses, wherever I went, and then looked to see if I can solve different kinds of data problems everywhere. So, it's now gotten to a point where I have an understanding of the industry, so all I have to learn is the business and then I have a playbook and I essentially use and employ that playbook and make organizations successful in their data journey itself.

Benjamin: Gotcha. Sounds cool. So this kind of leads us right into, I think, the first interesting thing to talk about. So, tell us a bit more about that playbook, basically, and also as times change, like especially in data engineering, things are moving so rapidly, how much do you have to adjust the playbook, basically, or is it actually quite constant?

Meenal: Well, I would say my playbook is now about six years old. Before then, it was kind of evolving, just for the reason that the space in itself was evolving. Till that point, we were very heavily into data warehousing, having warehouses. But now, the nature of data has changed, the usage of data has changed, how organizations perceive data has changed. The value that these teams provide has become very, very different. The value generation is now coming out of data teams, the ideation, the monetization. And so for that reason, the playbook had to evolve in a way that we start looking at data democratization in a much broader sense. Data privacy and governance became large components within the whole model itself.

So, I would say there was a shift or a change over the years as to from where we started. From there it was very simple and not simple in terms of the build out itself, but simple in terms of what the requirements of the data team were. It was that you have or produce the data, you have a model essentially that services it to different teams within the organization. And then organizations had their own decentralized analysts. And people who were really good with data itself on their teams, who would like to pull the data and do work with it. But then over time it changed to where the data team itself needed to be that center of excellence and that value generator rather than just being the holder of data or just having governance over the data itself.

For that to happen, the education and training that the team undergoes had to change, and so did the way the team works: it has to partner with the business rather than just being someone who takes orders from the business.

We become partners because, since we hold the data, we have a full understanding of all the data that exists, how the data all ties in together, and the value that can come out of that data that the business may or may not be able to see. And so, how we can make that our motto is what my playbook has evolved into. So, of course, we still believe fully in self-serve analytics, where we push the data out and make it available so that the business can make the decisions. But we also hold that center of excellence hub where we can start ideating on what we can produce out of this data and what value we can bring out of it. Because that is the real ROI of what the team can actually provide.

So, yes, the playbook kind of has all of these things, but I would say, data democratization is like the word that would kind of encompass the end-to-end of what I have within my playbook itself.

Benjamin: Gotcha.

Eldad: Okay, so Benjamin, let me explain to you in a nutshell, in 40 seconds kind of the evolution of data. So, at first we served engineering teams and they were using data to build products. It was amazing. But then we got kind of used to it. So, the business actually got on and they started to use data to run the business instead of just building products. So, we switched to serve the business instead, which was also amazing. But then again, we reached the point where we just democratize data and we open it for everyone. So, they can serve it themselves and we just serve the data. So, I think it's kind of getting back to the roots, we only serve the data and if we model it right, we open it right. You've mentioned compliance and we've mentioned all of those things, gatekeeping, an excellence center. I think we're entering an era where it's all about controlling metadata and opening the data, so people can use it themselves. Tell us how it evolved? Actually, tell us when was the first time in your career that you considered data to be a strategic part of your team's ability to serve?

Meenal: I would say we were naturally doing it throughout my career. But I worked in companies where I was truly able to see how we could push the value for the organization itself. Let me give you an example. When we started off, we again started off as a regular organization. We had our warehouse. We were just kind of pushing data out. Then, as we began conversations with the business, I got to learn a lot more about what the business teams were doing and the challenges that they were facing, and I realized it would be so much simpler if I helped them build this out rather than them doing it themselves. So, it started slowly with automation. There was an exercise that used to take one person on their team 76% of their week to do, and that is all they did. And even then, the value that it provided was not complete. And I said, I can very easily build this out and automate it for you, create a whole simulation for you, and you just have to press a button, input values, and you get an output. And not only are you able to project this for three years, but if you want to project it across five years and see what that looks like, you should be able to do that. And that was kind of the first foray, and then I was like, wow. And, I know it's like a moment, but, it was...

Eldad: Wow, they're all lazy here. I just managed to kind of replace them over a weekend. I need to, I'm leaving.

Meenal: No, but sometimes it's like, there are folks on different teams who are just there for that one specific task, and that very repeatable task, which provides them some level of security, I guess. And I said, we can very easily convert. So we just did. We did go and deliver it to them, and they were so happy. And then they kind of championed us, going forward and soon, we had a couple of teams coming in and saying, oh, can you do this for us? This is what we are struggling with. So, we built fraud models, we built like market basket analysis models, and that's how it kind of started. And I was like, huh. So, that kind of essentially became my thing to do. So, as I started moving into other organizations, conversations with the business became a regular thing, communication with stakeholders, understanding what their challenges are, became a very, very regular thing. And I realized that they all are very willing and ready to share what their concerns or their challenges are. And then you just have to find a way in which you can help and/or assist them. And you would be very surprised but there is always a way for data to assist and people don't just say data is an asset, just like that.

It is truly an asset and if used well within organizations, we have an ability to assist with every business function that exists over there. So, that kind of became my motto. I started identifying who my champion team would be because those champion teams became the sellers of my team, and they became the sellers of our abilities and capabilities as well. And so, going through this champion methodology essentially assisted there as well. So, it's something that I have put in my playbook and I have employed.

Eldad: Awesome.

Benjamin: This playbook, as you move between different organizations to make them more data driven, to kind of champion data teams and so on, like how much does it generalize? So like, let's be maybe very specific, like now that you're at SurveyMonkey how much tuning does it need? How much time do you have to spend just learning a lot about your organization before you can kind of start implementing that?

Meenal: So, the business changes. I have moved across multiple industries, so the business changes. Every business operates a little differently than the other ones. So, of course, there are those nuances. The metrics or the KPIs that the organization measures are different. The business functions are different. Sales looks a little different here versus how it looked as retail sales. So, there are a lot of those nuances that you have to take into account. But then if you look at the overall strategy, you still have a data team. We still have a data science, BI, and analytics team. We have business functions. We have teams that need to be upskilled. We still have a maturity model, so that you understand what level of maturity you are at and where you have to progress.

So, there are certain things that are common, and you just have to realize, okay, where in the journey is this organization? And then you kind of take that journey across. So, for example, as I get into organizations, I look to see how self-service an organization is.

In some cases, you know, you enter an organization and it's pretty mature. Self-serve analytics has already been built out. So then you think, okay, what next? How do I take them to that next level? In some cases, no, they haven't even got self-serve. The data team is functioning almost as an IT team. And how do you shift that from an IT mindset to a data team mindset? So, there is that evolution that happens, but from a playbook standpoint, you just work out where in that journey they are, and then you start from that point. The playbook is already laid out in such a way that I know at which point I need to start and then move ahead. And yes, there are some small tweaks here and there that we have to make for the organization itself, based on certain changes that may happen or the way that you may have to operate. There are some changes, but for the most part that playbook has been good.

Benjamin: Gotcha. So, how do you kind of along this journey measure the success because you're coming in as a data leader and if you have this specific vision of where the data team moves or how a kind of high functioning data team operates? How do you get buy-in into that and then kind of show, hey, look we're operating better now than a year ago or two years ago?

Meenal: So I think that is, how do I put it? Okay. So say you're at the beginning of your journey. The beginning of the journey is where you have to make a case: this is where you are, and this is where I have to drive the organization towards. So the first stage is where you go and speak to individuals within the organization and get a feel from them. Because you typically hear a very different story from the data team. I have been a developer and I know how I was. I used to always say my code is the best. I can never make mistakes and I produce the best thing. So, it's exactly that way.

Eldad: No, it's not your fault. You're using different Excel versions of the same schema definition.

Meenal: Exactly.

Eldad: So, they end up building multiple versions of the truth on the centralized data warehouse.

Eldad: Tell us a bit about, do you use products to enforce collaboration on shared metadata, consistent metadata? How do you do it?

Meenal: So again, the centralized team. Let's talk about these large organizations. You have a centralized team. The function of that centralized team is, one, to produce that golden state of the data and to produce a semantic layer. A semantic layer is basically a business view of the data, where the business is able to come in and question the data, do anything with the data. That may be using it to build their own data science or to do their own analytics; there are a whole bunch of functions that they can do. In some cases, they have their own sandboxes where they play with the data and see what could be, and then they push it to the central team to productionalize.

So you have the semantic layer ready. Your semantic layer has all of your key enterprise metrics and KPIs already predefined. So, irrespective of where it is being used going forward, it is always going to stay consistent. Tomorrow, it is not going to be that one team took it out and has a very different value of what financial sales needs to look like, and then marketing says, oh, this is the value of financial sales. No, it all comes from the single layer.

So, the very key and important part is to first get this groundwork set in. Then you have a business glossary, of course, which has the definitions of the metrics and who the key owner for each metric definition is. So, if there are changes to that, it's a communicated and published document. Everyone has it, and they know exactly what it means. If there needs to be a change, we need to follow protocol to make that change within the system, put it on the semantic layer, and then push it out to the other teams. Yes, it's a little bit of a process, but it can be optimized.
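
The single-definition idea in this answer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Momentive's implementation: the `Metric` and `SemanticLayer` classes, the `financial_sales` formula, and the `owner` field are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]  # hypothetical shape of one record in the governed data


@dataclass(frozen=True)
class Metric:
    name: str
    definition: str  # business glossary text
    owner: str       # assumed: the team that approves changes to the definition
    compute: Callable[[List[Row]], float]


class SemanticLayer:
    """Holds exactly one definition per metric name."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        # A second definition of the same metric is rejected outright.
        if metric.name in self._metrics:
            raise ValueError(f"{metric.name} is already defined; go through its owner")
        self._metrics[metric.name] = metric

    def value(self, name: str, rows: List[Row]) -> float:
        # Every consumer computes the metric through the same definition.
        return self._metrics[name].compute(rows)

    def glossary(self) -> Dict[str, str]:
        # Published view: metric name -> definition and owner.
        return {m.name: f"{m.definition} (owner: {m.owner})" for m in self._metrics.values()}


layer = SemanticLayer()
layer.register(Metric(
    name="financial_sales",
    definition="Gross sales minus refunds",  # illustrative formula only
    owner="finance",
    compute=lambda rows: sum(r["gross"] - r["refunds"] for r in rows),
))

orders = [{"gross": 100.0, "refunds": 10.0}, {"gross": 50.0, "refunds": 0.0}]
finance_view = layer.value("financial_sales", orders)
marketing_view = layer.value("financial_sales", orders)
```

Because both teams call the same registered definition, their numbers cannot drift apart; changing the formula means going through the metric's owner rather than redefining it locally.
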

Once you have that, you have your data dictionary, you have your business glossary published, you have your semantic layer. Then, the third thing you do is define the tools that can be used within the organization. Again, that is something that should be managed through the central team. Say, for example, you say Tableau is the only tool that we have from a reporting, analytics, and dashboarding standpoint, and Tableau should be the only one utilized within the organization. Now, if a team comes up with a specific use case and says, oh, you know what, Tableau doesn't work for our needs and we are going to need to use Power BI or we need to use Looker because...

Eldad: Sisense, of course, only Sisense.

Meenal: Yes, Sisense. And if that's the one that is going to provide for our function, then, as a central data team, again, it's your responsibility to ensure that the tool is really required, because sometimes it is just a matter of preference and not need. So, it's your responsibility to ensure that it is really the tool that you need to go forward with. But my statement here is mostly around the fact that it's your responsibility to also standardize the tools in use across the organization. And in cases where these functions are going to be decentralized to the other teams, one thing they're going to require is access to your data warehouse or your data. And they're going to need a playground for themselves where they are going to start building their artifacts.

Eldad: They can actually change everything they want.

Meenal: Exactly. So they can change their stuff and push that stuff. And then, again, tools. You have to provide them with the tools that they use so that they don't purchase their own tools, and so the total cost of ownership stays constant. Once you have that, you build out templates as much as possible so that they can work within the same standards and guidelines. And then, of course, continuous education. Data literacy is a big part of what I do: continually educating them on what's right, what's wrong, what data exists, and how to use the data in a governed sense and in a private sense. Because in some cases the data that they may be taking in may be sensitive data, and education on the usage of sensitive data is very, very key and important. So, some of that becomes a little repetitive, but it's very, very essential for the organization itself. And then, of course, a lot of training on the tools so that they are using the tools in the right way and using them optimally for their needs. Now, this I talk about in very, very large organizations.

Now, you come to like smaller and mid-size organizations. In that case, you should minimize the amount of decentralization that actually happens. So from a data standpoint, you don't expect them to go and be running their own ETLs and doing anything beyond.

Eldad: It's the same stack, but just the free edition.

Meenal: Exactly. I think that's a perfect analogy. Yes.

Eldad: Remove all the enterprise features, no auditing, no compliance, no security.

Meenal: No. So, you still have all of that, but it's not a central team. You still have education because they still have self-serve, but your self-serve is now limited to where they're more dashboarding and then doing stuff from that point onwards. They are not responsible for bringing the data in or for ETLs, and you want to minimize that as much as possible. Because it's a smaller organization, yours is a smaller team as well, so you want to keep the management of it much, much more centralized. Education still exists here because they still need to understand the importance of the data, what data exists, and how sensitive and private data should actually be utilized. So, that still exists from a literacy standpoint.

Eldad: You pick the right compute so they don't burn the monthly budget.

Meenal: Exactly. So, you know, that part of it still continues. But I would say that's how I typically prefer that we organize data, because again, in large organizations there is just too much to manage. And your data team doesn't have to be a 40-person team to serve a larger organization. You can actually manage with a smaller team. It's just that you...

Eldad: We're going to do a sister show, that's called Data Politics unlike Data Engineering, very similar to Data Engineering.

Benjamin: The Data Politics show, nice.

Eldad: The Data Politics Show and it is interesting though to see how data gravity affects data politics. And, it does.

Meenal: Yeah. I'm sure. I'm sure it does. Again, with the importance that data has across the organization, I'm sure there are politics associated with it. But, yeah, that's kind of how I look at self-serve, and that's how I prefer to manage it within organizations where I go and lead such efforts.

Eldad: Thank you.

Benjamin: In terms of, I just lost track of my train of thought. So, Tamar, when you listen to this, please cut it out.

Eldad: No, please. Tamar, please keep it. We never cut out. We've never cut anything out. We're not going to start now. Now I will ask, while Benjamin is getting his threads in order, what's kind of the most exciting thing coming to your team this year? What are you working on that's big and risky and supposed to make a big impact?

Meenal: Well, there are a couple of initiatives, I obviously can't go into details of them, due to privacy, but there are a couple of very interesting things the team has been working on. So, one really awesome thing that I was able to do for my team earlier this year was doing a data hackathon. So, you typically have software engineering hackathons, but data hackathons are very cool. They're much cooler than software engineers.

Benjamin: As a software engineer, I feel offended, but tell us how to host an amazing data hackathon?

Meenal: So, what we did, sorry, Benjamin. I was…

Benjamin: It is okay. I can handle it.

Eldad: They're doing three hackathons a week in Munich there. His team is doing like, oh, let's do a hackathon. Hackathons are unique, Benjamin, you do it once in a while and you eat pizza. So, but yeah. Sorry, go ahead.

Meenal: No, no. So, we did a hackathon. What we did though, Benjamin, is we set the topics beforehand, and the topics were longstanding problems within the organization and challenges that the organization was facing. We were looking to come out with either a solve or a prototype for each. So, the outcome would either be a plan for how we would tackle the problem, or a full solve with a prototype. I don't have a large team, but I'm happy to say three out of the four projects that we did are going live already. And those, Eldad, to your question, are the super exciting stuff that we actually did. All of them revenue-generating ones, again, and things that we were super excited about. So, we are taking some of that learning a step ahead and looking for other similar revenue-generating opportunities. We have found some similar ideas and we are moving forward with those as well. So, again, that is the fun and exciting stuff that's coming up. Again, I can't go into much detail, but yes, so far.

Benjamin: For these open business problems initially, you worked on specific things, given how many different kinds of business functions there are that you guys are helping with as a data team? How did you decide basically which problems were worth tackling?

Meenal: So, as SaaS companies do, we have a freemium, smaller version of the platform, and then the much more enterprise version of the platform. Our focus was to see how we can increase the acquisition of customers on the freemium side, retain them, and then get them to move to paid. We had 12 problems that came to us and we had to choose four of them, and the four we selected were all related to that. Essentially, that's how we focused, because that was very important for the organization and we wanted to make sure that we were helping with that, specifically given the time. So, that's kind of how we prioritized it. That's not to say that the other ones have been ignored. We have taken the other ones on as well, and we intend to have a virtual hackathon. This one we were fortunate to be able to do in person.

Eldad: But it's virtual, I'm just saying.

Meenal: Yeah. So, the intent is to kind of tackle those in the subsequent virtual hackathon. And then, of course, I'll let you all know how that goes because I have never done a virtual hackathon before.

Benjamin: Awesome. That sounds like a kind of huge challenge to get the energy. So, gotcha. So, in a data hackathon you start out with a specific business objective. You pick projects around that as you close out the hackathon. So this is the final part, like, how did demo day look or how did every team close out their data hackathon project?

Meenal: So, this is where our coolness factor ends a little bit in comparison to a software engineering hackathon because you all can actually show, like…

Benjamin: I knew it.

Meenal: I made him happy again. Of course, our demo essentially was data; our demo was the solution. So, it was more PowerPoint slides than anything else. But the outputs themselves were super, super exciting. We had data science models built out for all of them. And so, we could showcase those models and show the output of those models. So, for our judges, it was so exciting to see that data live and see the answers to these questions. That was the fun part of it.

Benjamin: But that sounds super cool. Like, I don't see what the missing coolness factor is. We build the query engine, so our demos usually are: you click something, it's slow; then after the hackathon, you click something, and now it's much faster, because we improved some algorithm in our query engine or something. So, okay. Maybe it's level. We can agree the data hackathon is just as cool as the software hackathon.

Meenal: I agree. I changed my statement.

Benjamin: Perfect. So, on this kind of lovely note, let's wrap up today's episode. Meenal, any kind of closing words to our audience that you wanted to talk about in terms of data teams, etc?

Meenal: One thing I want to close out with, and I'm sorry we didn't get a chance to talk much about it, is the ROI of data teams. If you've heard my Monte Carlo post as well, I talk about ATV, which is Adoption, Trust, and Value. If you follow these three as you build out a strategy for your data team, you will realize that the value your team provides far outweighs the cost of the investment that you have made in building the team out. So, that's one thing I would love to leave you all with.

So, adoption essentially talks about data democratization. As you build your semantic layer out and have the organization adopt your data platform, it's essential that the adoption piece occurs for the other pieces to happen. Now, the second part is trust. Of course, if there is no quality of data, or if there's no trust and transparency within your data, the adoption is not going to be complete. So that is the other pillar you have to take care of. And then the value generation starts happening: once you have adoption and trust in, you start producing value out of your platform itself.

The second thing I would like to leave you all with is the fact that nothing is possible without the team itself. So it's very essential that your team has the ability to cross-train and upskill, and that continuous learning is provided so that they keep growing. Give them the opportunity to become as productive as possible in whatever it is that they do, because the job that the data team does is always very underappreciated. And so, in order for the teams to be appreciated and to be more productive, you have to provide them the environment to actually be productive and really provide a satisfying outcome for everything.

So, two very, very key and very important things to take care of. That's the message that I would leave you with.

Benjamin: Awesome. I think those were great closing words. Thank you so much for joining in. So, this was an awesome kind of learning about how you think about building high performing data organizations. We look forward to hearing how the virtual hackathon, data hackathon, goes in the end. Awesome! Thanks for joining in, Meenal.

Eldad: Thank you for joining.

Meenal: Thank you so much.

Eldad: Bye-bye. Take care.

Meenal: Take care. Bye.
