[AI and the Modern Data Stack] Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake

Richie and Sridhar explore Snowflake and its uses, how generative AI is changing the attitudes of leaders towards data, the challenges of enterprise search, management and the role of semantic layers in the effective use of AI, a look into Snowflake's products including Snowpilot and Cortex, advice for organizations looking to improve their data management, and much more.

Feb 2024

Guest

Sridhar Ramaswamy

Host

Richie Cotton

Key Quotes

I have probably written code with the Python date functions for the past 20 years, but I can never remember exactly what they do. To be able to simply type into ChatGPT, I'm trying to extract a date from a string that looks like this, and have it instantly give me that answer so that I'm not fishing through Stack Overflow posts about doing that, I think that kind of productivity is super real. The value that companies can get out of AI basically comes from understanding language, understanding knowledge, making it easier for us to talk to it and query it. I think that is a breakthrough. And I would definitely encourage every CIO, every CDO to think about how they can make existing things that they do that are tedious or difficult more efficient.

Data for enterprises has been an ongoing priority. But to me, what really excites everybody including me, including your mom, and the CEOs about generative AI is, they all go like, you mean I can talk to a computer in like plain language? And it's actually going to understand what I'm saying? I think that’s what people are excited by. CEOs know for example that they have needed a bunch of analysts, a bunch of different tools like dashboard and visualization tools and BI tools in order to look at the data. I think people are super excited by the prospect of just better human-to-data communication.

Key Takeaways

Modern data cloud platforms like Snowflake enable not just data storage but also transformation, machine learning, and application development on top of your data, offering a more integrated approach to data management and utilization.

Ensuring high data quality and effective metadata management is crucial for leveraging AI and machine learning capabilities effectively, enabling better data understanding and usage across your organization.

AI is not just about automating existing tasks but also about enabling the creation of new applications and improving communication between humans and computers, offering a broad spectrum of opportunities for innovation.

Links From The Show

Snowflake

Snowflake acquires Neeva to accelerate search in the Data Cloud through generative AI

Use AI in Seconds with Snowflake Cortex

[Course] Introduction to Snowflake

Transcript

Richie Cotton: Welcome to DataFramed. This is Richie. Snowflake has been a big deal in the data space for years. In the mid-2010s, the platform was a major driver of moving data to the cloud. More recently, it's become apparent that it's very useful to have data and AI in the same place. So Snowflake has been rapidly adding AI features.

Today, our guest is Sridhar Ramaswamy, the senior vice president of AI at Snowflake. He has an engineering background and ran the ads and commerce department at Google before founding the AI search company, Neeva. Neeva was acquired by Snowflake last year, leading him to his current role. I'm keen to hear about how AI can be used to solve business problems like poor data quality and data being stuck in silos, and how AI can be used to increase productivity for analyzing data.

In addition to his work at Snowflake, Sridhar is also a venture partner at Greylock. So I'm also interested in his views on what's happening in the wider AI space. Let's hear what he has to say.

Hi, Sridhar. Great to have you on the show.

Sridhar Ramaswamy: Excited to be on the show with you, Richie.

Richie Cotton: I'd love to have a bit of context on Snowflake. So the flagship product is a data cloud. What makes this different to a data warehouse?

Sridhar Ramaswamy: Well, data is at the center of most enterprises. That's what they run on day in and day out. It is one thing to have a warehouse to store the data, but of course, you want to do stuff with it, whether it is transforming it, or being able to run machine learning on top of it, or build applications on top.

Or have your partners bring their data to you so you have that context in one place. Or others bring applications also. You begin to get the picture. Snowflake started as the place for all data 10-plus years ago. But over time, this data has so much gravity that things like collaboration, applications, very different kinds of things which you can do with data, all begin to be part of the core offering from us and from our partners.

That's what we mean when we talk about Snowflake being a data cloud.

Richie Cotton: Okay, so you can't have just the data warehouse where things are stored; you need the application layer and all these other bits on top of it. I'd love to discuss all these things in more detail. Before we get to that, can you just tell me a bit about what sort of organizations are using Snowflake?

Sridhar Ramaswamy: Well, much of the Fortune 1000, the Enterprise 2000, they are all using Snowflake. These include very large companies like Fidelity, across the board, across different industries. I would say that finance, healthcare, and media are some of our strong suits, but it spans the spectrum.

Anybody that basically wants an authoritative view of data gravitates towards Snowflake, because they realize that things like our unique architecture, which allows for very flexible and separated compute and storage, and our business model, which is consumption based (you only pay for what you consume), make for a great addition to the IT space that pretty much every organization has.

Also, we have broad adoption by a lot of people, and they love us because we just work out of the box, require very, very low maintenance, and are very cost efficient for the value that we bring to these enterprises.

Richie Cotton: So yeah, low maintenance and cost efficiency are good things. I'd like to talk a bit about how generative AI has changed things. Obviously, that's been the big story this last year. So, do you think the rise of generative AI has changed executive attitudes to data?

Sridhar Ramaswamy: I think smart executives have always known that having their data story in a good place just made their job easier. If you look at some of our customers, Fidelity, for example, they're open about the fact that we are like the data layer for how they run their business. This is because they have a number of operational systems for doing things. If you trade stocks on Fidelity, it goes to an operational system.

But those systems are not really meant for visibility, not really meant for insight. And so they collect all of that data into Snowflake. They also have their partners bring their data onto the same platform so that they have a 360-degree view of it. I would say data, to a certain extent, has been an ongoing priority for enterprises.

But to me, what really, really excites everybody, including me, including your mom and the CEOs, about generative AI is they all go like, wait, you mean I can talk to a computer in plain language and it's actually going to understand what I'm saying? I think it's that that people are excited by.

CEOs know, for example, that they have needed a bunch of analysts and a bunch of different tools like dashboards and visualization tools and BI tools in order to look at the data. I think people are super excited by the prospect of just better human-to-data communication. And that's the same attitude that we have at Snowflake.

We think, oh, wait you mean we can create a chatbot for a specific dataset and you can just ask questions in English and it'll do a good job of giving you answers. And if it can't give you an answer, it'll just say no, I can't do that. We are very excited by being able to provide things like that.

But I would say the core thing that all of us are and should be excited about is this idea that strange buttons and text boxes you have to enter data into, and magical incantations from software engineers and analysts, are getting replaced by ordinary language. I think that's the real power of language models.

Of course, they can do a lot more, but to me, just that, if we can realize the value of that, is going to be a big, big deal for enterprises.

Richie Cotton: Yeah, certainly the idea of having a natural language interface is much more intuitive for many people, I think. Can you tell me how this idea translates to a competitive advantage for businesses?

Sridhar Ramaswamy: Well, for Snowflake, for example the idea of language models and AI is a great add on. But the core advantage that we have is that thousands of enterprises trust us with their data. They bring all of the data about their businesses to their Snowflake instance. They set up different kinds of extraction pipelines, different kinds of visualization pipelines.

And all of that is there. And what AI now does is it creates this additional value on top where this data is more easily accessible where insights are easier to get at. And so, in that sense, I see AI as a major accelerant for traditional enterprise software. Now, there are lots of new applications.

There is also going to be disruption. There are lots of new applications that are going to come up that were basically unimagined and unimaginable. Meaning, in 2001 and 2002, by that time, there were cell phones. You and I probably used them. And I remember this brick of a phone that I had back when I was working at Bell Labs.

It would be a pound. But we could never really imagine Uber because a whole bunch of other things needed to come together. Similarly, I think AI is going to throw up a whole new class of applications, everything from image generation to video generation. I don't know about you, but I don't really use memes anymore.

I go to ChatGPT and write up a little description (I'll send you one after our podcast) of, hey, I'm talking with Richie about data, make me a little cartoon saying something funny about it, and out comes this cool cartoon. I think those kinds of applications will also be there. I think it will cause disruption in the media sphere.

But for core enterprises, I think nimble ones especially will adopt AI and use that as an accelerant for their core offering. Absolutely. That's what we're doing at Snowflake.

Richie Cotton: That's really interesting. I hadn't really thought of the meme industry being the thing that's being disrupted. It's like all the millennials: no, that's it, no AI for me. All right, cool. Do you have any more examples of your early adopter customers who have started creating some of these new applications you've been talking about?

Sridhar Ramaswamy: Oh, totally. So the kinds of applications that people are super excited by are just more fluid interaction with existing stuff. So, for example, the first project that my team launched. A little side story: I started a search engine called Neeva, among the first AI-powered search engines on the planet.

And so Snowflake acquired us in May last year. So we are experts in search and in AI, hot ingredients right now. But the first application that we launched was really just conversational marketplace search. Snowflake has a marketplace where you can buy datasets, where you can buy applications.

And we were like, ah, you should be able to type anything into that, not just a three-word query. You can type in a couple of sentences, and we'll generate the right answers for you. Kernel of an idea. A lot of the work that we do day to day is search over specialized corpora. Meaning people search for help when they're using a particular product.

We'll search in Drive for specific documents, and on and on. And I would say the prototypical application for AI is to take the data that is relevant to a particular context and put it into some sort of search index. You can use a vector index. You can use what's called an IR, an information retrieval, index, or combine the two, as we're doing at Snowflake.

So you search for that information, you take the output of the search, feed it into a language model, and ask it to generate a fluid interactive chatbot. Now we are chatting with a data corpus. That's like the earliest application that our customers are developing. And Snowflake makes it easy.
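The flow described here (search a corpus, then hand the search output to a language model) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the documents are made up, the retriever is a plain word-overlap ranking rather than a real vector or IR index, and the names `retrieve` and `build_prompt` are hypothetical. The call to an actual language model is left out; the sketch stops at the prompt that would be sent.

```python
import re

# Toy corpus standing in for a company's support documents (made-up text).
DOCS = {
    "returns": "Customers can return a product within 30 days of delivery.",
    "shipping": "Standard shipping takes 5 business days.",
    "warranty": "Every product carries a one year warranty.",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, k=2):
    """Return the ids of the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        docs,
        key=lambda doc_id: len(query_words & tokenize(docs[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Put the retrieved passages in front of the question for a language model."""
    context = "\n".join(
        f"[{doc_id}] {docs[doc_id]}" for doc_id in retrieve(query, docs, k)
    )
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you cannot answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


question = "How many days do I have to return a product?"
prompt = build_prompt(question, DOCS)
print(prompt)
```

In a real deployment the word-overlap ranking would be replaced by a proper search index, and the prompt would be passed to whichever model the application uses.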

Just yesterday, I was like, ah, I want to build an end-to-end application using Streamlit, which is our rapid prototyping environment. And in like an hour, I took a CNN news dataset, stuck it into Snowflake, put up a vector index on it, and then used Streamlit and a language model to be like, you can search over this.

You can interact with this corpus. Now, I'm not the best programmer in the world. Those days are long gone. But I was able to do this, as I said, in less than an hour. That's the power that we bring. And there are also other applications, like something that we call Snowpilot, which is a copilot that helps you write SQL.

Lots of people are trying it out. We have another project that uses language models in order to extract structured information, say from things like contracts. Companies sign lots of contracts, and there are all kinds of magical numbers in these contracts. What's the rev share? What's the penalty if something is out of SLA?

They forget about these contracts and don't really know what goes into them, but people want to extract the structured information. So we have a project called Doc AI that helps people extract structured information from unstructured documents and puts it into a table so that you can run classical analysis using SQL on top of that.

We now have, I think, over a hundred customers that are using it, and it's in private preview and soon headed to public preview so they can deploy it in production. Hopefully, this gives you a flavor of the kinds of things. But I would say table stakes application number one is: think customer support, think document search.

How can we do this much better? How can we do it a lot more interactively? And then going all the way up to, ooh, let's create a multi modal model that can look through PDFs and extract structured information. And there's a whole lot in between, 100%.

Richie Cotton: It's amazing that there are so many different applications there. And you mentioned that even with some fairly basic programming skills, you could build something that actually added value in an hour. Yeah, that's the dream. All right. I'd like to get into some of these applications in a bit more detail.

So, maybe we'll start with search since that's your forte. I speak to an awful lot of chief data officers, and the one thing every single one of them complains about is that the data across their organization is stuck in silos. They have all this data. No one really knows where it is. They can't access it.

It feels like AI and search is going to help with this. Can you talk me through how it's going to help?

Sridhar Ramaswamy: So, some of our larger deployments of Snowflake have a hundred thousand tables. That's nutty. If you're like, oh, I want information about this specific topic, where should I look? It's really, really hard. Usually, all of these devolve into a giant Slack channel in which you're like, hey, I'd like information about this project.

Does somebody know something? It comes down to that. And tools like Google don't really help because they don't have the kind of deep context into specific data sets, what the semantics of them are, and so on. At Snowflake, we have an ambitious effort called Horizon, which basically makes creating shares and sharing data within an enterprise just a whole lot easier.

We help you figure out semantics. We help you figure out, for example, is this column email addresses? Is this column another kind of PII? Of course, you can also put in information about tables, about schemas, and we have this effort to make it really easy for you to search through the data sets, again in natural language, and get to the data.

Of course, access control is a big, big deal, and no company is going to say, in the name of making data easily visible, I'm going to make everything visible. That is also a disaster. But what we're doing are clever techniques by which you can search over the metadata and figure out, ooh, there is this data set, but I actually don't have access to it.

How do I request the owner of the data to provide me with access because my request is legitimate? So we think about the life cycle of data discovery and then how that subsequently drives data sharing. And I think this is the kind of stuff that is going to help a whole lot. And then other aspects of AI that we will get into will make it easy for people to be able to quickly query that data.

Part of the objective of Snowpilot, the copilot effort within Snowflake, is that it should be able to use things like the previous queries, the context from the experts on any particular schema, to help future people write SQL in an easier fashion. But it all starts with having the data in the right place, having metadata attached to it, and making it super easy to discover and share data in a controlled way.

And that's what we're doing with Horizon.

Richie Cotton: That's really interesting. The idea that even if a data set needs to be kept private, you can still make the metadata public or at least slightly more visible across your organization.

Sridhar Ramaswamy: The metadata is searchable within the enterprise. In other words, you can separate out the two: the privilege to search over metadata is different from the privilege to actually be able to run stuff on it. And again, we are in the business of providing data owners, enterprises, our customers with the right tools.

We think this is an interesting differential. By the way, we obsess about these details. We even have something called a future grant, where we basically say, I want to give access to this particular schema, let's say revenue data from Europe, to Richie, but I also want to give the same access for all future tables that I'm going to create in the schema, because these things are living, breathing things and, as new things come on, you want to keep that access.

Again, that's a choice that business owners can make.
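To make the distinction concrete, here is a minimal Python sketch of a catalog in which anyone can search table metadata, reading a table needs an explicit grant, and a future grant automatically covers tables created later in a schema. The `Catalog` class and all of its method names are hypothetical and are not Snowflake's API; in this toy, a future grant covers only tables created after it is issued.

```python
class Catalog:
    """Toy data catalog: metadata is searchable, table reads need a grant."""

    def __init__(self):
        self.tables = {}            # (schema, table) -> description
        self.select_grants = set()  # (user, schema, table)
        self.future_grants = set()  # (user, schema)

    def grant_select(self, user, schema, table):
        """Let a user read one existing table."""
        self.select_grants.add((user, schema, table))

    def grant_future_select(self, user, schema):
        """Let a user read every table created in the schema from now on."""
        self.future_grants.add((user, schema))

    def create_table(self, schema, table, description):
        """Register a table and apply any future grants on its schema."""
        self.tables[(schema, table)] = description
        for user, granted_schema in self.future_grants:
            if granted_schema == schema:
                self.select_grants.add((user, schema, table))

    def search_metadata(self, term):
        """Find tables by name or description, regardless of read access."""
        term = term.lower()
        return sorted(
            f"{schema}.{table}"
            for (schema, table), description in self.tables.items()
            if term in table.lower() or term in description.lower()
        )

    def can_select(self, user, schema, table):
        """Check whether the user may actually query the table."""
        return (user, schema, table) in self.select_grants


catalog = Catalog()
catalog.create_table("europe", "revenue_2023", "Revenue data from Europe")
catalog.grant_future_select("richie", "europe")
catalog.create_table("europe", "revenue_2024", "Revenue data from Europe, 2024")

print(catalog.search_metadata("revenue"))
print(catalog.can_select("richie", "europe", "revenue_2024"))
```

A user without any grant can still discover both tables through `search_metadata`, which is the point at which they would ask the owner for access.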

Richie Cotton: So data access management is one of those things where it feels like it's no one's idea of a fun time. So maybe you want all this stuff automated, so people aren't having to mess about with some of the technical details.

Sridhar Ramaswamy: Yeah, I think it's a matter of providing the right level of abstraction. Just saying everything is open is clearly not going to work. On the other hand, people are realizing that saying everything is closed doesn't really work either. So it becomes a question: what's the right level of abstraction that you offer to the administrators, to the business data owners, so that they can responsibly manage how data is shared?

I always think of process as, you know, it should be just enough. Not too much friction, but not too little friction, where everything is open, either. That's the magical Goldilocks situation that we try to get our customers into.

Richie Cotton: So you mentioned before about using semantic search and these natural language interfaces to make all this sort of work. So this has been hyped up as a technology that's going to be, like, much better than keyword search and is going to solve a lot of our enterprise search problems. I'm just wondering how realistic is that?

Are we at a point where all of our enterprise search problems are solved now, or is there still work to be done?

Sridhar Ramaswamy: Oh, I mean, look, the basic problem is that there are a ton of applications used in the enterprise, like hundreds. We didn't realize it at Neeva, a 50-person company, around for only four years. And then we got bought by Snowflake. We had to make a list of all of the software that we used and what data there is.

And that list kept going and going and going. And all of these are little silos. So I think that getting all the data together in a queryable form is very much an open project. I don't think that is done. And things like access control, every application, remember, not only has data, but has rules for who can access data.

And the rules are typically also disjointed. So we have a number of connectors for bringing in data from different kinds of applications, like Salesforce, into Snowflake. So it really becomes more of a second brain for the enterprise where all of this data sits.

And only after you have the data does semantic search come into play and can get you the right data. People are big fans of what's called vector indexing. It's an evolution of the same language model AI technology, really. What it does is it takes your English query and creates an embedding out of it.

And then it looks for documents that are roughly in the same space. The problem with vector search is that sometimes it lacks precision. It turns out even if you type in 20 words, there are two or three of those words that really matter a lot. And so you need to make sure that the documents that you return have those words.

So I would say this is a rapidly evolving field. There is excitement about vector indexing because it can do some pretty magical stuff. But you also need to combine that with more traditional IR, information retrieval, techniques of the kind that were pioneered by Google.

It's getting better. But it's not a press-this-button or sign-this-agreement-and-everything-is-done kind of situation. There's work to do.
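One simple way to picture the combination of vector search and keyword matching is a blended score, sketched below in Python. Everything here is an assumption for illustration: the three-dimensional vectors are hand-written stand-ins for embeddings from a real model, the linear blend with weight `alpha` is just one possible way to combine the two signals, and `hybrid_rank` is a hypothetical helper, not how any particular product ranks results.

```python
import math

# Made-up documents with hand-written "embeddings" (assumed, not from a model).
REPORTS = [
    {"id": "europe", "text": "quarterly revenue report for europe", "vec": [0.9, 0.1, 0.0]},
    {"id": "asia", "text": "quarterly revenue report for asia", "vec": [0.95, 0.05, 0.0]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(required_terms, text):
    """Fraction of the must-have query words that appear in the document."""
    if not required_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term.lower() in words for term in required_terms) / len(required_terms)


def hybrid_rank(query_vec, required_terms, docs, alpha=0.5):
    """Rank docs by alpha * vector similarity + (1 - alpha) * keyword match."""
    scored = [
        (
            alpha * cosine(query_vec, doc["vec"])
            + (1 - alpha) * keyword_score(required_terms, doc["text"]),
            doc["id"],
        )
        for doc in docs
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


query_vec = [1.0, 0.0, 0.0]  # stand-in embedding of "revenue report for europe"
vector_only = hybrid_rank(query_vec, ["europe"], REPORTS, alpha=1.0)
blended = hybrid_rank(query_vec, ["europe"], REPORTS, alpha=0.5)
print(vector_only, blended)
```

With `alpha=1.0` the two nearly identical reports are ordered by embedding similarity alone, so the Asia report can edge ahead; once the keyword term is weighted in, the document that actually contains the word that matters moves to the top.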

Richie Cotton: Okay, it does sound like a lot of the success then is really based on the quality of the data, how well you're managing it, and how well you're working with metadata.

Sridhar Ramaswamy: That's right. That's right. And how you bring it in. Let's face it, if a CIO is using 300 applications, they're not going to say, I need a copy of each of the 300 applications somewhere else, or I need to figure out how to provide API access. These applications often have terrible APIs for accessing the data in them because it's not really in their interest to provide you with the API.

They're like, yeah, yeah, yeah, come to our application using our website. And so it's work. It's tricky. There's, it's not, again, you know,

Richie Cotton: Okay, and it seems like even beyond the tooling there, just for data quality, you need to worry about processes and your organizational culture. I don't know whether you have any advice on how you might improve your culture to improve the data quality and management.

Sridhar Ramaswamy: I think a culture of thoughtful inquiry, where you're like, let's use data wherever it is feasible, let's look at the biggest needs that we have and make sure that we have the data to support them, will drive a set of priorities for the organization.

Every organization has, and you know this, all of us have more things to do than we can realistically get done. So prioritization of what are the most important sources, how do we make sure that we have a handle on those, and then how do we provide visibility? I would say prioritization, and a mentality of really having data in a good place for the things that matter.

How much revenue are you making? You better have good data on that. How much are you spending? You better have good data on that. Start taking a top-down approach like that, and then prioritizing the biggest places where you need to invest in getting data and invest in insights on that data is what I think is important.

Too many teams, too many companies will start these mega digital transformation projects: we are going to be a digital-only, everything-in-one-place sort of company. Those projects usually don't really succeed. They basically try to do... So I think prioritization, thoughtfully using tools that a company like Snowflake provides (we not only provide the data platform, but we also provide things like connectors), and prioritizing the right data sources so that they can be queried and insight can be built on top of them.

I think there's no real substitute for that. There's not a silver bullet that is going to solve data visibility problems in any complex enterprise. Life is just too complicated.

Richie Cotton: I do think you made a very good point there that, yeah, most businesses should probably know how much money they're making and how much money they're spending, and just starting with that really high value data is going to be useful. So, I'd like to go back to applications. One thing that you were talking about earlier was SQL generation.

So, this is one of the big promises of generative AI. Instead of writing SQL, you can just write a natural language query. Can you tell me a bit about how that works in Snowflake?

Sridhar Ramaswamy: First of all, let me start out by saying that SQL generation on complicated schemas with poor metadata is not a solved problem. Don't let anyone convince you that a language model is going to look at a horribly designed schema and magically help you be proficient with it. That is just not where the tech is.

However, if columns have good names, if there is additional metadata available on tables, if there are things like views, for example, that capture the essence of the data that is sitting in a schema, then language models indeed can help a whole lot. They can take all of this context: the metadata about tables, the metadata about columns, the metadata about value distribution in the columns. Say they have access to previous queries that have been run against a schema, and people have written comments on those queries. They can take all of that context and use it as an aid in generating SQL.

That's what we do with Snowflake. Because at Snowflake we have access to everything that I just said, we can bring all of those smarts, present them in a clever way to the language model, and tell the language model: these are the tables you're dealing with. This is how they're normally joined.

And this is the question that the user has. Can you think through the process of writing a piece of SQL for that? In situations like that, the models do much, much, much better and are able to generate SQL for some pretty difficult problems. And that's what we're doing at Snowflake.
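As a rough picture of what presenting that context to a model can look like, the Python sketch below assembles table descriptions, column descriptions, usual joins, and commented previous queries into a single prompt. The schema, the queries, the prompt wording, and the `build_sql_prompt` helper are all made up for illustration and say nothing about how Snowflake actually constructs its prompts; no model is called.

```python
from dataclasses import dataclass


@dataclass
class TableInfo:
    """Metadata for one table: its name, a description, and column descriptions."""
    name: str
    description: str
    columns: dict  # column name -> description


def build_sql_prompt(question, tables, join_hints, example_queries):
    """Assemble schema metadata, join hints, and past queries around a question."""
    lines = ["You write SQL. Use only the tables described below.", "", "Tables:"]
    for table in tables:
        lines.append(f"- {table.name}: {table.description}")
        for column, description in table.columns.items():
            lines.append(f"    {column}: {description}")
    lines += ["", "Usual joins:"] + [f"- {hint}" for hint in join_hints]
    lines += ["", "Previous queries:"]
    for comment, sql in example_queries:
        lines.append(f"-- {comment}")
        lines.append(sql)
    lines += [
        "",
        f"Question: {question}",
        "Think through the steps, then give one SQL query.",
    ]
    return "\n".join(lines)


# Hypothetical schema and query history, used only to exercise the helper.
tables = [
    TableInfo("orders", "One row per customer order",
              {"order_id": "primary key", "customer_id": "buyer", "amount": "order total in USD"}),
    TableInfo("customers", "One row per customer",
              {"customer_id": "primary key", "region": "sales region"}),
]
join_hints = ["orders.customer_id = customers.customer_id"]
example_queries = [
    ("Total revenue per region",
     "SELECT c.region, SUM(o.amount) FROM orders o "
     "JOIN customers c ON o.customer_id = c.customer_id GROUP BY c.region;"),
]

sql_prompt = build_sql_prompt(
    "Which region had the highest revenue?", tables, join_hints, example_queries
)
print(sql_prompt)
```

The better the names and descriptions fed into a helper like this, the more the model has to work with, which is the point being made about metadata quality.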

We take state-of-the-art models, whether it's Llama 2 or a..., and we have a pretty large team, several hundred engineers, that are working on things like fine-tuning these models to do better SQL. So we do a lot of work in the data prep. And we're also looking into things like, can we fine-tune models with customer-specific information, but give them a copy so that their data is not mixed in with anyone else's? And can these models be much better at generating SQL for those customers because they have this additional content? So we have this team that's basically working on things like fine-tuning models for SQL generation.

As I said, we have an effort that looks into understanding the metadata behind schemas. And we combine both of these into the copilot experience on Snowflake, which unsurprisingly is this pane on the right. There you type in a query in English, and it's going to generate a piece of SQL for you.

You look at it, make sure that it is fine. And then you can hit run. And the query runs in the worksheet. The next thing that we are working on is basically an API, a programmatic version of Copilot, so that our customers can now build applications. The idea is that you point this API to a schema and embed the API into a tool where a user now is able to ask questions.

And underneath, the model generates the SQL, runs the SQL, and returns the result back to the user. That's like the next thing that we are working on. But hopefully this gives you an idea of what are the ingredients of Snowpilot and how is it being deployed in practice, and where is it going to go relatively soon.

The thing that I'll stress here is that this is very much a software engineering problem. GitHub Copilot uses models to help you write code, but really there's a lot of clever software engineering that goes into presenting the right context for the model so that it can do a great job.

There's no magic answer of, hey, I have a couple of million lines of code, language model, help me do my thing. That's fiction.

Richie Cotton: It does sound very interesting that a lot of what you're doing seems to be prompt engineering. So you're basically providing all that extra context in order to write good SQL in the background. So the user says, well, this is my business problem. And then you provide all that sort of background data.

Sridhar Ramaswamy: It's actually multiple things. It is fine-tuning, which is where you take a model that is capable of doing a lot and teach it a bunch of additional context. So, for example, out of the box, these models aren't great at Snowflake SQL, our variant dialect of SQL.

And so that's what you fine-tune. On the other hand, you also want to present very specific context. And when you say prompt engineering, I think of that as like a kid hacking around in ChatGPT. These are more software engineering systems that carefully construct what goes into a model, how many calls get made, and so on, in order to solve a business problem.

It's the difference between somebody writing one line of Python and getting something done, versus a team that is going to write a Python package that is going to be used by thousands and thousands of people. That's what I mean by it's real software engineering.

Richie Cotton: Yeah, I like that, focusing on the engineering side of things. Okay. All right. So one thing you mentioned was that once you get to these big enterprise schemas, and if there's poorly labeled data, then that's where problems start to occur. I'm wondering, do people need to start designing databases differently in order to make this work?

Do you need to have smaller schemas or do you need to focus more on column names and metadata and things like that in order to get good SQL generation?

Sridhar Ramaswamy: It's a great question. The good news is that tools have already been teaching enterprises to create these semantic layers. What language models struggle to do today, BI tools have been struggling with for the past 20 years. Unfortunately, it's not really standardized.

dbt has its semantic layer, so does Power BI, or ThoughtSpot, or Looker. And so there are variants of these things. So the work that is needed to get a schema to a place where a language model is able to do a better job with it is similar to the work that enterprises do in order to get their data ready for BI tools.

And we are actively looking at, if an enterprise already has this data, can we just use it? But overall, to answer your question, I would say data cleanliness, and making sure that there are not like eight date columns where you need an archaeologist to figure out which one to use for a particular join.

That's just good software engineering practice. When I write code, for example, I write it from the perspective of, A, this code is going to live forever. B, I'm going to come back to it three months from now and not remember a thing about what I did, because it's not the kind of stuff that stays in your brain forever.

And so having that mentality and really saying, things should be named appropriately is very important.
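To illustrate the "eight date columns" problem, here is a minimal sketch of a schema check that flags tables with more than one date-like column where some of those columns have no description. The `Column` class, the table and column names, and the set of date-like types are all hypothetical; this is not a Snowflake feature, just one way such a hygiene check could look.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical column metadata; a real catalog would supply this.
@dataclass
class Column:
    name: str
    data_type: str
    description: str = ""

# Assumption: these type names count as "date-like" for the check.
DATE_TYPES = {"DATE", "TIMESTAMP"}

def ambiguous_date_columns(schema: Dict[str, List[Column]]) -> Dict[str, List[str]]:
    """Return, per table, the undocumented date columns when a table has several."""
    findings: Dict[str, List[str]] = {}
    for table, columns in schema.items():
        dates = [c for c in columns if c.data_type.upper() in DATE_TYPES]
        if len(dates) > 1:
            undocumented = [c.name for c in dates if not c.description.strip()]
            if undocumented:
                findings[table] = undocumented
    return findings

if __name__ == "__main__":
    example = {
        "orders": [
            Column("dt1", "DATE"),
            Column("created_at", "TIMESTAMP", "When the order was placed"),
            Column("id", "NUMBER"),
        ],
        "users": [Column("signup_date", "DATE")],
    }
    print(ambiguous_date_columns(example))
```

A table with a single date column is left alone, since there is nothing to confuse it with; the check only complains when a person or a model would have to guess between several candidates.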

Richie Cotton: Absolutely. I've certainly been.

Sridhar Ramaswamy: I love the name NextGen, by the way. Whenever people start a new software module, they'll call it NextGen FUBAR. And six months will go by and you're like, wait, what, what, what, what?

This has been in production for six months. What do you mean this is next gen? Because the new person that shows up, that's the only thing that they know.

Richie Cotton: Yeah, at some point it's last gen, but yeah, badly named. All right. So, yeah, I can certainly see how being able to maintain code is going to be incredibly important. So naming things is very useful. I'm curious as to how people use these AI features in practice. Is it people using natural language to do all their queries then, or are there people who are just like, I don't want this.

I just want to write SQL. Is there a mix? What do your users do?

Sridhar Ramaswamy: So I would say it spans a spectrum. We wanted the power of language models to be available to all of our users, including our analysts. And we don't think of say, Copilot as like a replacement for a business analyst. It's going to make them more efficient. It's going to make some of their more mundane tasks a little bit easier.

But there's more. So when we designed Snowflake Cortex, which is the AI layer that ships with every Snowflake deployment, we, first of all, wanted to make it super easy to use. So we, in fact, exposed Cortex as a set of SQL functions. So, for example, if you wanted to do a summarization on a text column that is in a Snowflake table, that is as simple as making a SQL call to the summarize function and you pick the model that you want to send this text to and it'll take your instructions and generate a summary for you.

And the list goes on and on. We also expose what we call Complete, which is basically, you can think of it as assembling a prompt and sending it to a language model. But you can do this in SQL. This prototype that I was telling you about, the new search prototype, basically does that.

It assembles a prompt in SQL and sends it to the language model. So that's stop number one, which is existing analysts are now able to use the power of language models to do sentiment detection, to do translation, to do summarization, to do structured data extraction from text. All of those things just come out of the box in Snowflake, courtesy of Snowflake Cortex.
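As a rough illustration of "language models as SQL functions", the sketch below builds the text of two such queries in Python. It only produces SQL strings and does not connect to Snowflake. `SNOWFLAKE.CORTEX.SUMMARIZE` and `SNOWFLAKE.CORTEX.COMPLETE` are the function names as I understand them from Snowflake's documentation, so check the current docs for exact signatures; the table `support_tickets`, the column `body`, and the model name `example-model` are hypothetical placeholders, and the table and column names are assumed to be trusted identifiers.

```python
def summarize_query(table: str, text_column: str) -> str:
    """Build a SQL statement that summarizes the text column of every row."""
    return (
        f"SELECT {text_column}, "
        f"SNOWFLAKE.CORTEX.SUMMARIZE({text_column}) AS summary "
        f"FROM {table}"
    )

def complete_query(table: str, text_column: str, model: str, instruction: str) -> str:
    """Build a SQL statement that assembles a prompt per row and sends it to a model."""
    # Escape single quotes so the values are safe inside SQL string literals.
    safe_instruction = instruction.replace("'", "''")
    safe_model = model.replace("'", "''")
    return (
        f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{safe_model}', "
        f"CONCAT('{safe_instruction}', ' ', {text_column})) AS answer "
        f"FROM {table}"
    )

if __name__ == "__main__":
    # Hypothetical table, column, and model names, for illustration only.
    print(summarize_query("support_tickets", "body"))
    print(complete_query("support_tickets", "body", "example-model",
                         "Classify the customer's sentiment:"))
```

The second function is the "assemble a prompt in SQL" idea in miniature: the instruction and the row's text are concatenated inside the query itself, so the prompt is built where the data lives.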

But we also designed these in a way that you can now begin to build applications like chatbots on top of it. By combining semantic search with a language model, as I said, that's the ingredient for a chatbot. You can put all the documents for a particular topic into a Snowflake table, and then you can create a chatbot by which you can just have a conversation about those documents.
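The "search plus language model" recipe for a document chatbot can be sketched in a few lines. This is a toy under stated assumptions, not how Snowflake implements it: word-overlap cosine similarity stands in for real semantic search, the documents are a plain Python list rather than a table, and the language model is whatever callable `llm` you pass in. All function names here are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List

def tokenize(text: str) -> Counter:
    """Lowercase bag of words; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context_docs: List[str]) -> str:
    """Assemble the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def answer(question: str, documents: List[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieve relevant documents, build a prompt, and hand it to the model."""
    return llm(build_prompt(question, retrieve(question, documents, k)))

if __name__ == "__main__":
    docs = [
        "Refunds are processed within five business days.",
        "Our office is closed on public holidays.",
        "Shipping is free for orders over fifty dollars.",
    ]
    # An "echo" model that returns the prompt, just to show what would be sent.
    print(answer("How long do refunds take to be processed?", docs, llm=lambda p: p, k=1))
```

Keeping the model behind a plain callable makes the retrieval and prompt-assembly steps easy to test on their own before any real model is involved.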

But we also expose our most complex model, like the SQL generation model, to our customers, so that if they say, no, no, no, no, no, I do not want to call SQL functions. I don't want to use your easy to build chatbot. I want to build something amazing myself. I have the people that can do the software engineering.

We make that possible as well using something called Container Services, which is our extensibility framework. Our customers can fine-tune models themselves, deploy them in Container Services, and then write applications that talk to these deployed models. But you get the idea, which is language models at every layer.

We don't try to lock our customers into one way of doing things. We expose these at every layer so that they can mix and match what it is that they want to do. My mantra is simple things should be simple. Complex things should be possible. And that's really what Snowflake Cortex and Container Services are about.

Richie Cotton: I really like that there's a sort of gradual evolution. You mentioned there's a spectrum, so you can start with just doing simple things and like everything is just generated from natural language. And then if you've got more technical skills, you can build up to doing things in Snowflake SQL.

Sridhar Ramaswamy: That's right. That's right. That's right.

Richie Cotton: Excellent.

And so you mentioned like often this is going to be like whole data teams being involved or maybe even software developers. So it does seem like working with data is very much a team sport these days. So can you talk about how you see your customers doing collaboration on some of these tasks?

Sridhar Ramaswamy: You mean within their teams or with us?

Richie Cotton: Yeah. So within teams, and I guess even like in organizations often working with data and AI tends to be several teams involved in things. So do you have any advice on effective collaboration techniques?

Sridhar Ramaswamy: I mean, I think the first thing that I talked about when it came to collaborating was how do you make sure that silos are eliminated and data duplication is a thing of the past. This is where Horizon and all of the efforts that we're doing around collaboration come into play, which is don't reinvent the wheel again.

Most big enterprises have three implementations for every project, or like things repeated over and over again, simply because communication is not all that effective at a basic level. We want to make sure that we make it super easy for people to leverage the work of their colleagues and use the data.

And then when you go up one step from there to let's build applications, that is very much a collaborative team sport. Then you need one person that knows the data, probably more of an analyst type person. But if you're building a chatbot, for example, you do want somebody that knows a little bit about prompt engineering and what language models do, and so on.

And then you also need governance on these things. So you do have to get the right people together in order to do these projects. But part of our design with Cortex was simple things like an analyst being able to use a language model to do interesting things within like the ambit of what they do day to day, that should be much easier.

Similarly, an analyst that sort of knows data that is looking at a new schema, making them productive quickly with Snowpilot, is something that sort of happens naturally. I think this does not take away from the need to bring people together with different skills in order to make a project succeed.

In Snowpilot, for example, we have people that are search infrastructure engineers. We have people that know data really, really well. We have language model experts that are doing things like fine-tuning. We also have UX engineers because you need to actually create a product that people love, that is easy to interact with.

And so, and then obviously you need like product managers and designers in order to be able to get something like that done. Snowpilot is a complicated project, but like you get the idea. You do need to bring people together and they need to have the right skills in order to get meaningful things done.

Richie Cotton: Okay. Yeah. So it really is about making sure that everyone's got access to the right information and they're able to work at the sort of appropriate level of technical skills for them.

Sridhar Ramaswamy: That's right. And this is also where like leads and managers are really important in this process. They have to have a mental idea of this is what it takes to be successful with a project like this. These are the skills that need to be there. Sometimes it's small teams of two people, sometimes it's four people.

But just like thinking through that and assembling these teams is an essential ingredient for success.

Richie Cotton: All right, so I'd like to step away from the Snowflake talk now. So you also have another job as a partner at Greylock.

Sridhar Ramaswamy: I'm a venture partner. It's a part time role. Yeah.

Richie Cotton: Excellent, because you got all that free time for an extra job. Okay, so I'd like to know what data and AI companies you are most excited about right now.

Sridhar Ramaswamy: Yeah. So, at Greylock, we have invested in several, what we call foundation model companies. These are, we think, the infrastructure companies that will power the future. This is Mustafa's Inflection. Mustafa Suleyman is, of course, one of the world's renowned AI experts, and used to work at DeepMind.

They also invested in a company called Adept. So those are at the foundational layer. But we do think that there are a ton of enterprise application companies that are going to be interesting. We don't think like AI is going to be quite like a magic sauce, as I was saying.

You still need to acquire customers. You still need to have that moat, because pretty much everybody knows how to use like GPT-3 or GPT-4 through an API call. The bar for excellence is quite a bit higher. And we are also excited by some of the newer generative applications, everything from image generation.

So I think video generation is pretty exciting. I don't know about you, but I used to edit videos when my son used to play tennis. It's a mind bogglingly tedious job to do anything. I used to try and make three minute summaries out of matches that would last three hours. It's so hard to do.

And I think whether it is video generation or video editing, I think there's just a lot of value that they're going to create. I think advertising and marketing are going to be changed in a pretty big way. My son, who's also a software engineer, recently showed me this LLM-powered application for making experimentation on websites just a whole lot easier.

And he is like, here, drop this little piece of JavaScript. We'll run the experiment, we'll generate potential variations on experiments for you. So I think there's like a set of these kinds of products that are gonna be AI native, that are going to have a big impact on both enterprises and consumers.

Richie Cotton: I'm definitely very excited for all these sort of generative AI video applications. It just seemed like they've been kind of okay and coming soon for a while now. So yeah, I hope they get their spotlight. So the other thing is that it seems like there has been a lot of money thrown at AI companies just over the last year.

Is this something you think is going to continue or do you think it's a bubble that's about to burst?

Sridhar Ramaswamy: I think people are definitely going to be asking questions about revenue returns and what is actually going on over there. A 5 percent interest rate environment has a profound influence on startups. I think it's hard for people to realize that like the difference between zero and 5 percent interest rates, it's basically infinity.

And I mean, one of the things that happened last year, when we talk about large amounts of money being thrown at companies, is that quite a bit of it was investments by the CSPs, by the very large platforms, and these investments basically turned around into cloud spend. So I don't think of that as like real investment.

It's a little bit of, as my colleague Vivek puts it, taking your balance sheet and converting it into revenue. That's what that is. I do think that VCs are cautious about not throwing too much money into unknown kind of companies.

And I think the time of 100x revenue valuations is definitely a thing of the past. I think AI has perhaps another 6 to 9 months before similar kinds of questions will be asked about revenue and revenue growth and things like that. So, when it comes to our customers, for example, already, they're asking us about, Hey, how much should I invest?

How should I be looking for ROI? Am I really creating value? These are all good, hard questions for people to ask. And honestly, I think avoiding some of the hype will keep us all in a better place. Because bubbles do not do anyone any favors.

Richie Cotton: Absolutely. So, uh, it sounds like it's good that there are some hard questions being asked before money's being thrown at it. Excellent. All right.

Sridhar Ramaswamy: Money being thrown itself, of course, is a funny phrase, but yeah.

Richie Cotton: True. All right. So, before we wrap up, do you have any final advice for organizations wanting to improve their data management or AI capabilities?

Sridhar Ramaswamy: Yeah, I think that there are pretty solid breakthroughs when it comes to AI and language models. It is not... I think thoughtfully using the power of language models can make things more efficient. This whole thing, like writing code: there is a before and an after, if you have an assistant.

Even having access to ChatGPT. I tell people, I have probably written code with the Python date functions for the past 20 years, but I can never remember exactly what they do. To be able to simply type into ChatGPT, hey, I'm trying to extract a date from a string that looks like this, and it instantly gives me that answer so that I'm not fishing through eight Stack Overflow posts about doing that.
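For a sense of the kind of answer being described, here is a minimal sketch of pulling a date out of free text with Python's standard library. The assumed input format (an ISO-style `YYYY-MM-DD` date somewhere in the string) and the helper name `extract_date` are illustrative choices, since the conversation does not say what the string looks like.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumption: the date appears as YYYY-MM-DD somewhere in the text.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_date(text: str) -> Optional[date]:
    """Return the first valid YYYY-MM-DD date found in text, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    try:
        # strptime validates the calendar date (rejects month 13, day 40, ...).
        return datetime.strptime(match.group(0), "%Y-%m-%d").date()
    except ValueError:
        return None

if __name__ == "__main__":
    print(extract_date("Order shipped on 2024-02-15 via courier"))
```

The regular expression only locates a candidate; `datetime.strptime` does the actual validation, which is the part that is easy to forget the format codes for.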

I think that kind of productivity is super real. I would say embrace what these models enable, and use a platform like Snowflake to build on top of. We take enormous pride in the fact that our AI infrastructure is seamlessly integrated with everything else that's going on in Snowflake.

The investments that you have in access control just carry over naturally to everything that we provide with AI. I think having partners like that that are about providing real value and not just hyping up the latest thing that customers can put money into. I think that's an important thing to keep in mind.

But I think the value that companies can get out of AI, which basically comes from understanding language, understanding knowledge, making it easier for us to talk and query it. I think that is a breakthrough. And I would definitely encourage every CIO, every CDO to think about how can they make existing things that they do that are tedious or difficult, more efficient.

There's a whole bunch of those to go after.

Richie Cotton: Wonderful. All right. Lots of opportunities there then. Excellent. So, thank you very much for your time Sridhar.

Sridhar Ramaswamy: Thank you, Richie. This was a fun, fun, fun conversation.

Topics

Data Engineering

Artificial Intelligence (AI)
