Dr. Peter Fishman is a Co-Founder & serves as Chief Executive Officer at Mozart Data. He served as the Vice President of Growth & Analytics at Zenefits. Previously, he was Director of Analytics at Yammer and Principal Data Science Manager of Microsoft Office Analytics. Prior to joining Yammer, he worked at Playdom/Disney and as a statistician for the Philadelphia Eagles. He holds a B.S. from Duke University and a Ph.D. in economics from UC Berkeley.
Adel is a Data Science educator, speaker, and Evangelist at DataCamp where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
The role of a “data scientist” used to look completely different from company to company because it was a catch-all title that encompassed multiple skill sets. Now, the field has gone more granular, where novel, hybrid data roles are emerging such as marketing ops, biz-ops, and more.
Deep thinking about the domain a data analyst is working in is a core skill data analysts must continually cultivate. They must understand the causal relationships underlying their datasets and be able to identify how data might be used correctly and incorrectly to infer them.
For early-stage startups, nothing will replace the need for data analysts to deeply understand their customers, which is best done by observing them, talking with them, and sitting down with them.
Being able to make inferences with small data sets is a critical skill. It's a little confusing because, typically, you can't make inferences with small data. If you only see one or two observations, you can't make a valid statistical inference, but when you think deeply about mechanisms (how you would set the data up to actually learn the answers in a space where you're pretty constrained by the database), what you find is that even when your data size is infinite, you always want to cut it into a smaller cohort to make a more precise inference, and you always run out of data without fail.
When it comes to data skills and roles today, there is a much greater granularity than in the past. However, even with specific expertise distinguished by different titles, a lot of the core skillset stays the same. What makes for a great data analyst or scientist also makes for a great marketing ops analyst, and while they may do different things on a day-in, day-out basis, the core is still about data thinking and data capabilities, rather than specific technical expertise.
Adel Nehme: Hello everyone. This is Adel, data science educator and evangelist at DataCamp. The past few years have seen an incredible addition of new tools and frameworks that empower even the smallest data teams to do more. These tools are often what is referred to as the modern data stack. One aspect of the modern data stack is that it empowers practitioners, like data analysts, to deliver insights and improve value at a much faster scale. This is why I'm excited to be speaking with Peter Fishman, CEO of Mozart Data. Mozart Data empowers data analysts by providing them with out-of-the-box data warehouses that allow anyone to connect their disparate data sources easily, apply simple transformations, and start analyzing data, all without any data engineers. Throughout our conversation, we speak about his experience launching Mozart Data, the trials and tribulations most data teams face when trying to hit the ground running, the skills modern data analysts need to have, the importance of developing subject matter expertise in analytics roles, and more.
Adel Nehme: If you enjoyed this podcast, make sure to subscribe and rate the show, but only if you enjoyed it. Also, if you're interested in the modern data stack and want to transition your local notebook environment to a cloud-based collaborative environment, I highly recommend checking out DataCamp Workspace, where you'll be able to code in Python and R and use a bunch of templates and data sets to get you started in data science right in the browser. Now, let's dive right in. Peter, it's great to have you on the sh...
Peter Fishman: Great to be here. I'm Pete Fishman, I'm the co-founder and CEO of Mozart Data. Like many people in the data space, I'm something of a failed academic that transitioned into the world of sort of applying statistical experience and putting that into technology. So I've been working at startups for the last decade plus, mostly in data functions and ultimately decided... Myself and my friend, Dan decided to build basically ourselves as a service. And then we built Mozart Data, which we called the easiest way to spin up a modern data stack.
Adel Nehme: That's great. So can you walk us through how these experiences that you've had across industry and academia led you to launch Mozart Data? And can you walk us through the challenges Mozart Data tries to solve?
Peter Fishman: There's a long thread there because it does sadly capture many, many, many years, but there is a lot of consistency in the theme. So what has basically happened is that data has become essentially like bigger over time. Not just the buzzword of big data, but basically the computing power ultimately has a lot of downstream effects. People can collect more data because they can get more value out of that data. My sort of arc looks like I was really doing very early empirical work in grad school when obviously statistics have been around for a very, very long time. But the first time where you could really use hundreds of thousands or millions of observations.
Peter Fishman: Today doing analysis with millions of observations is not just like trivial, people would eye roll that. But for me, that was kind of the size of the data sets that I was working with during my PhD program, which at the time was almost unthinkably large, exceeding whatever Excel could do.
Peter Fishman: But what ultimately has happened is that you find insight in the data and then companies figure out ways to take advantage of it. And then you have to go find that next insight in the data. So I started my career in the Facebook game space where a lot of these companies competed over using data in novel ways. Facebook had billions of users. So as a result, the data sizes and volumes were gigantic and you could make really novel insights. And we really, really paid very close attention to CACs and LTVs. And the game was to build a virtuous cycle of buying people's eyeballs very efficiently, and then feeding that into monetizing and getting more people onto your platform and getting a virtuous cycle going.
Peter Fishman: I then saw the opportunity to deploy that into the B2B world. So the defining part of my career was this company Yammer. At Yammer, we took a lot of the B2C approach to software development and then applied it in the B2B world and the bottom up SaaS world, which didn't really exist at the time. But it calls for a lot of understanding what your users are doing and understanding the attractiveness of the prospects as a function of who's actually using your product and that required data folks. And not only that, data infrastructure. So I built a tool at Yammer called Avocado along with a really great team. Avocado today is really Mozart Data plus Mode Analytics. And from there, have had a lot of different opportunities to have similar data infrastructure at different companies before ultimately deciding to build it myself.
Adel Nehme: That's very exciting. And I'm excited to unpack this further. Before though, I want to set the stage for the landscape data teams are working in today and the dynamics that really led to the launch of Mozart Data. As you said, and I completely agree with this notion that data science has become table stakes and no longer a nice to have. So I wanted to start off our chat by first asking, how would you define a data-driven organization and how can an organization integrate data science as a table stakes practice today?
Peter Fishman: Sure. So I think most people have a mental image of a data-driven organization as one with lots of TVs all over the office. Now, there are nonexistent offices. And those TVs have time series of KPIs and people just walking around the building understand what's going on with the company by observing the time series of the KPI. I set up a strawman, but I very deeply disagree with that.
Peter Fishman: So the first thing I'd say is very few canned ways of looking at the data often provide the necessary insights that you're talking about. A data-driven organization is one where data has a very important part at the key decision making tables. That can mean a very senior executive that's a data person. That can mean that data starts every meeting. That can mean that data analysts have access to all sorts of key decision makers or ultimately data becomes the key decision maker more so than the word strategic.
Peter Fishman: I often find that non-data-driven organizations often talk about strategic investments, ones that almost can't be justified in the data. When you start a company, when there's zero idea, zero data, zero people, or one of each of those things, you end up really needing to actually be strategic. You need to imagine a world that doesn't exist, that cannot be justified by backwards looking and you need to essentially apply your own direction and thought and beliefs. Now, data can inform that. I mean, one of my favorite examples. When I, again, worked at this Facebook Games company called [inaudible 00:07:49], we used to sometimes run advertisements on games that were basically half finished. And though you couldn't make like a statistically highly confident conclusion about how effective or successful the game might be, you could get a flavor for how difficult it would be to maybe acquire users.
Peter Fishman: So your belief about that could be tested even before the game existed. So that's not to say at a super early company, you have to only go on gut and strategy. But what I think of as a data-driven organization is one where data is a first-class citizen. Not just that they collect data and they have dashboards and that they look at time series and they can go to bed at night because they know that their company is going up and to the right, but rather that key decisions are informed by data and cuts and dives and summaries and models of the data.
Adel Nehme: So to double down on your point here, a data-driven organization is where data becomes a habit across the decision making life cycle, rather than something to look at.
Peter Fishman: Absolutely.
Adel Nehme: So what are the main challenges affecting organizations today who truly want to make the most of their data?
Peter Fishman: The way that an organization gets to a place where it can be data-driven is by not being data-driven. So the success that brought you here is not going to be a data-driven success. It's going to be a success that's driven by often the founders, but typically beliefs about the world that couldn't necessarily be justified at the time that end up actually proving out to be true. But you typically have this headwind of the thing that brought you here was you weren't data-driven.
Peter Fishman: How organizations become data-driven tends to be an underlying belief that our organization must be data-driven. And not because a venture capitalist has told you to be data-driven, not because the world and the podcasts you listen to tell you to be data-driven, but rather because you ultimately truly believe that the signals that the world is giving you are going to be more informative when aggregated and summarized the right way.
Peter Fishman: I teach a class sometimes at Berkeley, where I did my PhD and I go back, and I put up a bunch of different ads from games that we ran on Facebook. And I said, "Which of these is the most effective, which is going to get the best clicks?" People raised their hands, not indiscriminately, but they have some that they like and the ones that they like actually tend to be sort of the better ones. But when you show the ad to 100 million people, their opinion is correct more so than any true expert. And I think what you need to do is maybe develop that muscle over time. Now, that is not to say that if you haven't done that, if you haven't had it beaten into you, that you really need to be thinking about the data, thinking about the data the right way and using the data.
Peter Fishman: You can still very quickly adopt that. If I go back to my time at Yammer, we had two very strongly opinionated leaders, two co-founders, David Sacks and Adam Pisoni, who have a ton of intuition. They're famously very talented at product and technology, and they would have a ton of intuition. And it was in fact that intuition that made Yammer an attractive company for me to join. But early in my career, actually in the first three months of my career, we ran an A/B test on the new user flow, which went counter to both of their intuitions. And we did it like almost haphazardly by accident, but it really set my career up for success because the results were very clear and slightly counterintuitive. And you very, very rarely see that in technology. I think even data people like to hype, oh, you run these experiments and you get these counterintuitive results and then your company becomes better.
Peter Fishman: That happens rarely. Much more often you get null results off of things that you think are almost certainly going to work rather than you get counterintuitive statistically significant results. That's happened to me, not a handful of times in my career, but very, very, very few times in my career. And it just happened that it was in an early part of my time at Yammer, which basically changed their whole perspective on how important it was to run A/B tests when releasing products. And it became an essential part of the release criteria. And I think ultimately that was a little bit of chance, but a lot of open-mindedness of those two folks, both of whom are now investors in Mozart Data, but on top of it, I think it takes like either it's at your core or you get a very clear lesson and that's how you become a data-driven organization.
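To make the contrast between a null result and a clear result concrete, here is a minimal sketch of a two-proportion z-test, one common way to read an A/B test on a conversion rate. The function name and all of the counts are invented for illustration; they are not Yammer's data, and the transcript does not say which test was actually used.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: a change that "should obviously work" but barely moves the rate.
z_null, p_null = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=515, n_b=10_000)
# Hypothetical counts: a clear, statistically significant difference.
z_clear, p_clear = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=600, n_b=10_000)

print(f"small lift: z={z_null:.2f}, p={p_null:.3f}")
print(f"large lift: z={z_clear:.2f}, p={p_clear:.4f}")
```

With these made-up numbers, the first comparison cannot be distinguished from noise while the second can, which is the asymmetry described above: the first kind of outcome is the common one.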
Modern Data Stack
Adel Nehme: That's awesome. And I wish we could dedicate an entire episode just to unpacking your experience at Yammer and working with people like David Sacks. Now, of course, a key component of becoming data-driven as an organization is the set of tools and supporting infrastructure that enables faster time to insight. This is what often is referred to as the modern data stack. I'd love it if you can break down what you think is meant by the modern data stack and what are the characteristics that differentiate it from the previous set of tools data teams are used to?
Peter Fishman: The modern data stack is not really all that modern. The modern data stack is a modernization of existing data tools and of data pipeline tools that have been around for a very long amount of time. The branding on it is great because I hear the words all of the time, and it is well deserved on some level, which is to say cloud data warehousing has become like ubiquitous in the users of data space. So the first thing I'd say is that there are these powerful [inaudible 00:13:17] that are sort of able to crunch again, giant amounts of data. Not the data sizes that I was working with 20 years ago, but like real joins on huge data sets. So what that does is that enables you to use data from multiple places.
Peter Fishman: So a modern data stack is effectively not too different from what a VLOOKUP in Excel would be, which is to say, it's joining data from multiple places. The stack that gets you there is an EL tool, a powerful data warehouse, and a T, a transform layer. So a layer to essentially clean up your data. So you have to extract and load data from many different sources, and then you have to clean and transform it. So ELT the data. So when people talk about the modern data stack, they are talking about ELT, but T now has a big meaning.
Peter Fishman: Everybody's always known that cleaning is a huge part of what a data person does. My old boss at Microsoft was Ronnie [inaudible 00:14:21], who has a joke. And I don't know if it's his joke, but I know he loves to use it, which is to say, "95% of data science is cleaning data, and only 5% of data science is complaining about cleaning data." He lands the punchline a little bit better than I do, but what he's saying is that actually he will think, oh, it's all building these incredible models off of these beautiful data sets that you compete on or are given. And in practice, actually, so much of the work is cleaning and making sure the data is right or consistent.
Peter Fishman: And so little of the work that a data person does is real data analysis. And it's certainly not 0%, but the joke lands better when you expect the answer to be 5%, but only it turns out it's actually just about complaining is the rest of the time. If I think about what the modern data stack is, is now all of these tools that represent the cleaning layer. And it's not just essentially scheduled tables, it's a variety of different parts of making sure that the data that you are looking at downstream, whether that's in your BI tool most likely, is actually raw, has essentially traveled without any problems.
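The extract, load, then transform pattern described in this answer can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for a cloud warehouse, and the two sources, the table names, and the rows are all hypothetical. It is not how Mozart Data or any particular vendor implements ELT.

```python
import sqlite3

# Hypothetical raw extracts from two separate sources (a billing system and a CRM).
billing_rows = [("a@example.com ", 120.0), ("B@Example.com", 80.0), ("c@example.com", None)]
crm_rows = [("a@example.com", "enterprise"), ("b@example.com", "startup")]

# An in-memory SQLite database stands in for the cloud data warehouse.
conn = sqlite3.connect(":memory:")

# E + L: load the data as-is, without cleaning it first.
conn.execute("CREATE TABLE raw_billing (email TEXT, amount REAL)")
conn.execute("CREATE TABLE raw_crm (email TEXT, segment TEXT)")
conn.executemany("INSERT INTO raw_billing VALUES (?, ?)", billing_rows)
conn.executemany("INSERT INTO raw_crm VALUES (?, ?)", crm_rows)

# T: clean inside the warehouse (normalize emails, drop null amounts),
# then join the two sources, much like a VLOOKUP across sheets.
conn.execute("""
    CREATE VIEW revenue_by_segment AS
    SELECT c.segment AS segment, SUM(b.amount) AS revenue
    FROM (
        SELECT LOWER(TRIM(email)) AS email, amount
        FROM raw_billing
        WHERE amount IS NOT NULL
    ) AS b
    JOIN raw_crm AS c ON c.email = b.email
    GROUP BY c.segment
""")

result = conn.execute(
    "SELECT segment, revenue FROM revenue_by_segment ORDER BY segment"
).fetchall()
print(result)
```

Without the cleaning step, the stray whitespace and mixed case in the billing emails would make the join silently drop rows, which is the kind of problem the transform layer exists to catch.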
Adel Nehme: An exciting part of the modern data stack for me is really the emergence of new categories within the data stack. For example, last year we interviewed Barr Moses, CEO of Monte Carlo, about how they're trailblazing the data observability category. What are some of the categories and tools you've seen emerge over the past few years that you've been excited about?
Peter Fishman: Sure. Of course, I'm going to say managed data pipelines, I think, is the coolest category. And I happen to love one particular company in that. However, beyond that, there are a variety of tools that fit a little bit more, what I call like upstream and up market into larger companies that have larger data teams that are using their data in a variety of ways. But ultimately once you have loaded your data into your warehouse, there's a variety of things. There's data observability, there's data cataloging. I remember way back in the day, we used to have columns. Revenue underscore final, underscore the one to use, underscore you really want this one V6. And what I think is obviously important is the ability for larger data teams to come in and understand the world quickly, because what you find is once you actually have a mature data organization, it might take someone weeks to come in or months even to come in and understand the stack.
Peter Fishman: And DJ Patil has a line about his time at LinkedIn, which was, "So much about being successful as a data scientist at LinkedIn was about getting a win in your first 90 days." And if it takes you 90 days to get up on the stack or 89 days, you better like be amazing. You better be able to find something incredible in one day. Whereas if it takes you a week or a day or an hour to get up on the stack, well, now you have a real chance to be successful at that company.
Peter Fishman: So there's a proliferation of tools that really savvy companies like a LinkedIn, like a Yammer, all built and used. Obviously, Airbnb has built a number of the famous ones and what those tools were about were making data people effective. And now, a lot of companies have sprouted up in terms of building that tooling that these companies spent countless... Airbnb probably spent hundreds of millions of dollars on. Not that it mattered, but they spent countless millions of dollars developing, now making that accessible to companies that don't have the budgets of Airbnb or Facebook or whomever. So I see a lot of development in that space. Obviously, other categories that are popping up that are... Reverse ETL is a really great example of a downstream one that we had built in bottom-up SaaS worlds and at like subscale, right? So now having services that will do this, or having services that do extract and load, I think are really, really important for companies.
Adel Nehme: Where does Mozart Data fit within the modern data stack and how does it solve some of the challenges we've discussed thus far? And can you walk us through some examples of Mozart Data in action?
Peter Fishman: Mozart Data basically is an all-in-one data platform. So what that means is in under an hour, you can start connecting multiple data sources and we spin you up a Snowflake data warehouse, and you can start writing your transforms and connecting a BI tool or reverse ETL tool and start to get insights.
Peter Fishman: So the real magic is that this used to take months and a number of data engineer hires, or you do a lot of vendor assessment and then pick your potpourri of vendors to do it, or you hire a consultant to do it. Today, this can all be done in effectively no time. And by the time you're done with a demo, you could be up and running and querying your data in your favorite BI tool. Really, there is this challenge of this speed to insight, and Mozart wants to empower not just like very savvy data engineers, but rather everyone in the data landscape to be up and running with this modern data stack all very quickly, all without being gated by engineering.
Adel Nehme: What I love about Mozart Data is how much it empowers data analysts and citizen data analysts to get started quickly with data and provide value quickly without depending on data engineering or infrastructure work. You're someone who's led data teams, worked with a lot of data analysts while developing Mozart Data and more. I'd love it if you can break down how you think the data analyst role has evolved over the past few years, and where do you see it heading in the future.
Peter Fishman: Right at the time where the term data science, again, like Jeff [inaudible 00:20:06] and DJ Patil kicked off this term data scientist. And then, the incredibly rapid growth in that profession happened. The title data scientist was being applied everywhere in the data space. And the reason was because working as a data scientist basically meant that you got paid a lot more than working as a data analyst. So everybody started co-opting the term. And then you saw it to represent, you had folks that were doing ML engineering all the way to folks that were maybe just out of college working with data for a first time, all holding this title data scientist. And it represented a vastly different set of skills, all encompassed by the same title and different... It meant a different thing at different companies.
Peter Fishman: Today, you see much greater granularity of that. You see people that hold rev ops or BI ops titles. You see folks where their specific expertise is distinguished. So an analytics engineer is someone that's very different from a data engineer and a data scientist today has a specific role within a company. A data analyst tends to have a specific role. Now, we still see a lot of, if we had a Venn diagram of the skill sets, a lot of that would overlap. And I think actually the best... I don't think that one title is... There's no greater than sign. I think a lot of the core skillset ends up being the same. What makes for like a really great data scientist actually makes for a really great marketing ops analyst, which is to say a deep understanding of causal relationships, of inference, and like it's a different set of technical skills. Obviously it's a different role within the company and the organization. You do different things on a day in day out basis, but the core is still about data thinking and data capabilities rather than specific technical expertise.
Adel Nehme: I completely agree here, especially since there's a layer of skills that is to a certain extent invariant as the role evolves over time. What do you think are the defining skills data analysts should cultivate to become successful in a modern data team today?
Peter Fishman: I'm a little bit biased because I spent a big part of my 20s thinking about really causality. So I did a PhD in economics. I studied behavioral economics. And what was typically true was you would get great data sets that were not generated by experiment. So data sets where you measured things over time and you had an understanding of an individual with an ID over time, but you didn't necessarily have what you really wanted, which is to run a scientific experiment. Put people in condition A and condition B and then have a hypothesis and see which one wins out.
Peter Fishman: When you don't have that, you have to basically do almost statistical tricks. You have to think about, okay, what is something like an experiment? And I think often that this is one of the most like underrated skills in data to really think about what you're trying to do with your data is essentially assign a causal relationship based on the past that you think applies in the future for a number of reasons, right?
Peter Fishman: You think that there was a mechanism that brought it that still exists today. So I think people that have really that deep thinking about like understanding causal relationships and understanding what typically is wrong with data. So the classic example is you say, okay, well, drowning deaths are always up in the months where ice cream consumption is up. And it's like, obviously all novices say, "Well, that's because in the warm months, people are eating ice cream and they're going to the beach or they're going to the pool." And of course, and they realize that's actually not the causal mechanism, but then you divorce it from that specific joking context. And then you bring it into a world where many things are going on and your job depends, in some sense, the value you bring to the company depends on identifying a relationship that you think moves maybe the company's... Whether it's their marketing, their business, their product, their users forward.
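The ice cream and drowning example can be simulated to show why a strong correlation need not reflect a causal mechanism. The sketch below assumes a made-up world in which temperature drives both series and ice cream has no effect on drownings at all; every coefficient and noise level is hypothetical and chosen only to make the pattern visible.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical world: temperature drives both series; ice cream never affects drownings.
temps = [rng.uniform(0, 35) for _ in range(20_000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temps]
drownings = [0.5 * t + rng.gauss(0, 5) for t in temps]

overall = pearson(ice_cream, drownings)

# Hold the confounder roughly fixed: look only at days in a narrow temperature band.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 20 <= t < 22]
within_band = pearson([i for i, _ in band], [d for _, d in band])

print(f"all days:      r = {overall:.2f}")
print(f"20-22 degrees: r = {within_band:.2f} (n = {len(band)})")
```

Across all days the two series move together, but once temperature is held roughly constant the relationship nearly disappears. In a real business setting the confounder is rarely this obvious, which is why the skeptical habit matters.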
Peter Fishman: And then, you start abandoning that critical perspective. So in general, what I like is a set of almost dismantling of good work, thinking about all the ways in which a good insight or good work could be flawed. Maybe somebody did a robustness check that proved that it wasn't flawed, but at the very least when you read it, can you be... Or look at the work that was done, can you be skeptical and say, okay, well, maybe it's mostly driven by something that won't necessarily repeat itself because a lot of these, they do replication studies. And when I worked at Microsoft, I worked at Bing. And Bing, you had the huge luxury of not just millions, not just billions, trillions of observations.
Peter Fishman: And you could keep tests running and get inference from there. So I think like inference is the big skill, but then also inference with small data is also a real skill. It's a little confusing because typically you can't make inference with small data. So if you see one observation, or N of 1 or 2, literally you can't make a valid statistical inference from that, but really having a deep thought about mechanism and how you would set it up to actually learn that answer in a space where you're pretty constrained by database, what you find is, and we found this at Bing, that even when your data size is infinity, you always want to cut it and cut it and cut it and cut it and cut it to a smaller and smaller cohort to make a more and more precise inference.
Peter Fishman: And without fail you run out of data, even when the data seems like the size is infinity. I think those two skills to me often are the most underrated. They're the ones that I think people should develop and work on. And they're also the ones that we interview for not just at Mozart, but at a lot of the places that I've worked.
Adel Nehme: And being able to make these inferences and spot these causal relationships within the data set requires a lot of subject matter expertise. Oftentimes, what's missed in the discourse around upskilling and breaking into tech is subject matter expertise and domain knowledge, especially to be able to succeed in analytics roles and data roles. Can you comment or expand on the importance of subject matter expertise in a data role and how it has helped you in your career?
Peter Fishman: Well, this picks up great, like you mentioned, off of the last question, which is if your key insight is thinking about the right mechanism that is driving the causal relationship you're ascribing to your data, then actually understanding what your users are doing and what motivates your users is critical. So again, I worked at Yammer. We were the biggest per capita consumers of our product as a company. So it's not surprising. Dan and I, my co-founder of Mozart Data. He and I, 13 years ago, started a hot sauce company. We were also the number one consumers of that hot sauce. So subject matter expertise is 100% like table stakes thinking that you have to bring in order to understand those relationships.
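A rough back-of-the-envelope calculation shows how quickly repeated cuts exhaust even a very large data set. The sketch below uses the normal-approximation margin of error for a proportion; the starting size, the cuts, and the 5% rate are all hypothetical, and for simplicity the rate is assumed to be the same in every cohort.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for an observed rate p on n observations."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical starting point: one billion observations and a 5% conversion rate.
n = 1_000_000_000
rate = 0.05

# Hypothetical cuts: each keeps 1 in `divisor` of the previous cohort.
cuts = [
    ("one country", 20),
    ("mobile only", 2),
    ("signed up this week", 100),
    ("from one campaign", 50),
    ("saw the new feature", 10),
    ("in the test variant", 2),
]

rows = [("everyone", n, margin_of_error(rate, n))]
for label, divisor in cuts:
    n //= divisor
    rows.append((label, n, margin_of_error(rate, n)))

for label, size, moe in rows:
    print(f"{label:<22} n={size:>13,}  rate = 5% +/- {moe * 100:.3f} pts")
```

Six fairly ordinary cuts take a billion observations down to a few hundred, at which point the uncertainty is more than half the size of the rate being measured.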
Peter Fishman: Now, the flip is sometimes that deeply works against you. So it's not linear up. It's not necessarily just concave as in, as you get more and more subject matter expertise, this first derivative remains positive. You can find that sometimes you are so deep in your world, you are missing what the typical user is doing. And actually a lot of times in past jobs, we've had that problem where we are the right tail of usage and expect everybody to understand some of the subtle things that are going on within the tool. And what you find is that people have a both surprisingly surface willingness to pay attention. You're the most important thing to you. And a lot of times you can build software. And to you, it's incredible, but to the typical user that isn't willing to make that same investment into learning all of your nuances, it might not be the case.
Peter Fishman: So subject matter expertise, first of all, is the table stakes to get started. You can't reasonably understand the mechanisms that are driving your user base without understanding your users in the first place. That's why you often see companies like Airbnb and Uber, consumer companies, people that work there are just nuts about using those products. Brian Chesky famously stayed in Airbnb for one whole year. Didn't have an apartment. And that was a critical part of essentially developing domain expert... Yes, it's about empathy for the customer, but also it's also about tuning that domain expertise. And everybody I knew that worked in ride sharing was taking ride shares everywhere. They had to go across the street, they'd take a ride share. I think it's developing not just that subject matter expertise, but also the real getting to your user's mindset.
Adel Nehme: So given your experience in startup and working in smaller organizations, how do you instill that subject matter expertise in early stage startups when they don't necessarily have these massive user bases when they make hires for example?
Peter Fishman: So I worked at a company called Opendoor, and Opendoor when I was there was largely buying and selling homes in Phoenix. And I had no desire to buy or sell. I didn't own a home in Phoenix, and I didn't have a desire to buy a home in Phoenix. Obviously, now they're in many, many, many more markets and I didn't have the ability to gain expertise in essentially that buyer's journey because I never went through it. You're not always gifted the situation that I discussed with the consumer companies, where you're a data scientist at, say, Facebook, and you're dogfooding it all the time. I think the key is one, obviously, if you can do that, it is a huge advantage. And if you can't, I think you really want to disproportionately invest in talking with... Sort of YC tropes, which are talk to customers, talk to customers, talk to customers.
Peter Fishman: So I do think that sitting down, observing customers, talking to customers, talking to prospects that rejected you, all of those things are trying to up your knowledge. Now, the flip is I'm now selling a product that de facto I've worked on for 20 years. So your subject matter expertise is not necessarily one that happens the second that you sign your offer letter, and your subject matter expertise, hopefully you're leveraging in my case over 40 years of subject matter expertise. But beyond that, you want to be able to really understand your customer, whether that customer is you, whether that's exhaustive research. You shouldn't think of your title as, "Well, my title says data in it, so I've got to be in a back corner, doing data." The term that I like to use is use your feet, which is talk to the product or customer-facing folks in your organization. Or if you can, talk to customers.
Developing Subject Matter Expertise
Adel Nehme: That's great. And flipping the question slightly, if I am a data analyst breaking into a new vertical, whether at a startup or an enterprise, what is the fastest way for me to develop subject matter expertise?
Peter Fishman: So I do think adjacent problems can be helpful. I mean, I've loved reading Nate Silver for the longest time. And I do think reading folks that think about data the right way helps. I started my career in the NFL as a statistician, not as a player. I had been into sports statistics for my whole life. And I think that there were a lot of parallels to the thinking. Baseball famously had solved a lot of these sort of real problems of figuring out what had tight relationships with performance and what had predictability, et cetera. But that thing then very deeply motivated me, and I was excited about it, I just had a passion for it. I read up a lot about it. And I think that helps if there's analytics in a space that you love. Now, for me, that was baseball and football, and there is now tons of material. At the time, there was just limited amounts of material.
Peter Fishman: But if you could find those people that love writing really savvy things in the spaces that you love, I think that you're going to find good analytical thought, questioning, breakdown, and that's going to apply to whatever discipline that you're going to do. I mean, reading Moneyball, my favorite book from Michael Lewis, really is the same type of thinking that I would give early-stage startups. It's de facto the same advice YC gives, which is write down the equation of success and then break it into its component parts and then measure those parts and then dive into which one of these pieces is not working, cohort and cut and summarize. And that's how you start analytics anywhere. But it certainly was how the A's did it some odd years ago when they were trying to compete with bigger market teams.
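Editor's illustration: the advice above (write down the equation of success, break it into component parts, measure them, then find which part is not working by cohort) can be sketched in a few lines of Python. This is only a minimal sketch under assumptions that are not from the conversation: the funnel equation, the cohort labels, and every number below are hypothetical.

```python
# Hypothetical funnel data per signup cohort; names and numbers are illustrative only.
COHORTS = {
    "2023-01": {"visitors": 1000, "signups": 100, "customers": 20, "revenue": 2000.0},
    "2023-02": {"visitors": 1200, "signups": 90, "customers": 27, "revenue": 2430.0},
}


def decompose(cohort):
    """Split revenue into multiplicative parts (an assumed 'equation of success'):
    revenue = visitors * signup_rate * paid_rate * revenue_per_customer
    """
    return {
        "visitors": cohort["visitors"],
        "signup_rate": cohort["signups"] / cohort["visitors"],
        "paid_rate": cohort["customers"] / cohort["signups"],
        "revenue_per_customer": cohort["revenue"] / cohort["customers"],
    }


def recompose(parts):
    """Multiply the parts back together as a sanity check on the decomposition."""
    product = 1.0
    for value in parts.values():
        product *= value
    return product


def biggest_drop(before, after):
    """Return the component whose after/before ratio is smallest, with that ratio."""
    ratios = {name: after[name] / before[name] for name in before}
    worst = min(ratios, key=ratios.get)
    return worst, ratios[worst]


jan = decompose(COHORTS["2023-01"])
feb = decompose(COHORTS["2023-02"])
component, ratio = biggest_drop(jan, feb)
print(f"Weakest component between cohorts: {component} (ratio {ratio:.2f})")
```

In this made-up data, total revenue rises from one cohort to the next, yet the decomposition still flags the component that weakened most, which is the piece to "dive into".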
Adel Nehme: That's awesome. And as we close out our conversation, I'd love if we can think about the future for a bit, and what you think are some of the trends that are really going to shape how individuals and organizations work with data. I'd love it if you can list some of the trends that you're particularly excited about when it comes to the modern data stack and how it will affect data driven organizations.
Peter Fishman: I think we touched on one of them, which is the real rise of the citizen data scientist. First of all, you see a bunch of savvy people that write SQL that don't have data titles. Biz ops, marketing ops, all of these writing SQL or R or Python or something like that is just not uncommon in roles that were almost exclusively non-tech. I think that is a really exciting moment for anyone in the data space, because the data is now opening up to many, many, many more roles at companies and many, many, many more people have the chops to do something, to be a little dangerous with their data. I think this is a great trend for companies trying to solve the data problem for SMBs.
Peter Fishman: And obviously I'm very excited about one of those companies, Mozart Data. The other part of the trend that I'm excited about, that sort of also relates to Mozart Data, is that it used to cost you a lot: you'd hire a couple of data engineers and buy a bunch of expensive infrastructure, and you might be out $2, $3, $4, $5 million just to get started on your data journey.
Peter Fishman: Today, it's a $6 swipe of a credit card, and you're off to the races. Now, it's metered and your bills become significant. Your investment in data ultimately becomes significant. But the fact that you can get started for close to nothing is incredible. It is a huge difference. So if you think about the types of companies that could really afford a multimillion dollar investment in data so that they could have that advantage, those were the largest companies.
Peter Fishman: You could only get jobs at the biggest companies, because those are the companies that had the data teams. Those are the companies that were leveraging the data and could effectively take advantage of their scale in applying those data insights. Today, this is becoming table stakes earlier and earlier. So more and more companies like ours, not just ours, but like ours, are really empowering and enabling the SMB to use data infra, the types of data tooling that I see more upmarket. In fact, generally I find data stacks to actually be stronger downstream before there's a dozen sources of truth. It's actually kind of a little bit of a paradox, which is that almost the more constrained your budget, the more likely you are to end up with effectively a tighter data stack.
Call to Action
Adel Nehme: That's great. And I love that first trend, especially, and this is something that we've definitely seen at DataCamp with the hybridization of jobs and the emergence of data skills in traditional roles like finance, marketing, and a lot more. Now, finally, Peter, I had an awesome chat with you today. Do you have any final call to action before we wrap up?
Peter Fishman: Yeah. Obviously, I'm rooting for so many people in their data journeys and we love helping small companies at the start of their data journey get up and running on their data infrastructure all in under an hour without really needing any data engineering support. So if you're interested in that, we'd love to talk to you at Mozart Data. So I'm [email protected]
Adel Nehme: That's awesome. Thank you so much, Peter, for coming on the podcast.