[DataFramed AI Series #3] GPT and Generative AI for Data Teams

Sarah Schlobohm, the Head of AI at Kubrick Group, discusses the intricacies of incorporating AI within data teams.

May 2023

Guest

Dr Sarah Schlobohm

Dr. Schlobohm leads the training of the next generation of machine learning engineers. With a background in finance and consulting, Sarah has a deep understanding of the intersection between business strategy, data science, and AI. Prior to her work in finance, Sarah became a chartered accountant, where she honed her skills in financial analysis and strategy. Sarah worked for one of the world's largest banks, where she used data science to fight financial crime, making significant contributions to the industry's efforts to combat money laundering and other illicit activities. She started her career in data science with a PhD in particle physics, where she trained her first neural network to identify b-quarks.

Host

Richie Cotton

Key Quotes

I think the only thing that we can guarantee about this industry is that it's going to keep changing and it's going to keep adapting. And we're going to have, we're going to keep building on what we're doing. It's going to be more interesting, more powerful, and that's going to use different skills, right? And are we going to be using Python in 20 years? Probably not. I mean, some people are still using Fortran, but hey, for the most part, no, there are trends, there are shifts in what we do. This is a big one, this is a big step change.

Prompt engineering is another way of saying asking the right question. I think there are tricks to getting it to do what you want. So giving it context, but a lot of these things are just good advice for how to tell a human to do stuff as well. So give it context, tell it why. So in this situation, I want you to be a data scientist using Python. So give it that prompt. Ask specifically what you want. If you're looking for a format of an answer, you know, give it an example, say, hey, give me an output like this. I think it's going to be a super important skill that quite a lot of jobs have. I think in very limited circumstances, that is going to be a specific job. But I think that's going to be part of like a mega pipeline in, in certain areas. Search engine optimization, I think is a good comparison. Is that the job title in a lot of cases? Probably not, right? But if you have a marketing role, that's probably an essential skill. And in a big enough, specialized enough organization, you might have someone where their only job is that. But I think it's more and more a skill we all need.
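
The advice in this quote (give context, state the role, ask specifically, show an example of the output you want) can be sketched as a small prompt template. This is only an illustration: the function name `build_prompt`, its parameters, and the sample text are hypothetical and do not come from the episode or from any particular tool.

```python
def build_prompt(role: str, context: str, task: str, example_output: str = "") -> str:
    """Assemble a prompt from a role, background context, a specific task,
    and an optional example of the desired output format."""
    parts = [
        f"You are {role}.",      # tell it who to be
        f"Context: {context}",   # tell it why you are asking
        f"Task: {task}",         # ask specifically for what you want
    ]
    if example_output:
        # Show the format you expect rather than describing it
        parts.append(f"Format your answer like this example:\n{example_output}")
    return "\n\n".join(parts)


# Hypothetical usage: the wording below is made up for illustration
prompt = build_prompt(
    role="a data scientist using Python",
    context="I am preparing a monthly sales report for non-technical stakeholders.",
    task="Suggest three charts that summarize revenue by region.",
    example_output="1. <chart type>: <what it shows>",
)
print(prompt)
```

The resulting string can be pasted into whichever chat tool you use; the point is the structure, not the exact wording.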

Key Takeaways

When considering adopting AI within your team, prioritize safety and security when using any data. Do not put personally identifiable information into AI products that aren’t secure. When using datasets with AI products, ensure just the schema is visible, rather than the constituent data itself.
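
One way to share only the schema, as suggested above, is to extract column names and coarse types locally and send just that description. The sketch below uses only the Python standard library; the helper names `infer_type` and `schema_only` and the sample rows are hypothetical, and the type inference is deliberately crude.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse column type from string values without exposing them."""
    if not values:
        return "unknown"

    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except (ValueError, TypeError):
            return False

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    return "text"


def schema_only(csv_text: str) -> dict:
    """Return {column name: inferred type}; no cell values are included."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    return {col: infer_type([row[col] for row in rows]) for col in columns}


# Made-up sample data: only the schema below would be shared with an AI tool
sample = "customer_name,age,balance\nAda Example,36,1020.50\nBob Example,41,87.00\n"
schema = schema_only(sample)
print(schema)
```

Note that column names themselves can be sensitive, so it is still worth reviewing the schema before pasting it anywhere.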

Use AI to test your knowledge: rather than asking a chatbot for information, ask it to generate questions for you. For example, if you’re looking for a new job, or practicing for a presentation, ask the AI to act as an interviewer or as someone who might have questions on the topics you present. Build your confidence for Q&A sessions through an AI trial run.

When using responses from chatbots such as ChatGPT, trust them, but verify them. AI products will hallucinate citations that seem plausible. If asking for code outputs, test and verify that the code works before putting anything into a production environment.

Transcript

Richie Cotton: Welcome to part three of the DataFramed AI series. This is Richie. Generative AI is already having a big impact on data teams, so it's important to understand how to use it well and to understand how it's changing the careers of data analysts and data scientists. Helping us keep up to date with what to do is Sarah Schlobohm, the Head of AI at Kubrick Group.

Dr. Schlobohm leads the training for the next generation of machine learning engineers. With a background in finance and consulting, Sarah has a deep understanding of the intersection between business strategy, data science, and AI. In addition to her data science and AI work, Sarah is also a chartered accountant and has a PhD in particle physics.

In today's episode, we'll cover the impact of generative AI on data teams and on data upskilling and careers. Let's hear what Sarah has to say.

Hi, Sarah, thank you for joining us on the show. Great to have you here.

Sarah Schlobohm: Hi, Richie. Happy to be here.

Richie Cotton: And I'd like to just dive straight in, talking about what the use cases are for generative AI, particularly for people who are working with data.

Sarah Schlobohm: Yeah, absolutely. I mean, I think the thing I'm actually really excited about is how many use cases there are that aren't about data. I think that may be one that we come back to, but yeah, it's really democratizing fields that needed a lot of specialized knowledge before. The thing I'm really impressed with in the latest version of generative AI is how well it can generate code.

I mean, that was the big change for me; that was so impressive. I'm absolutely not a purist when it comes to coding. If you gave me a blank Jupyter notebook to code in, I'd have a panic attack. The way I learned, and the way I still function, is to take somebody else's code and hack it until it works for me.

So this is an absolutely perfect thing for the way I code. It's never right to begin with, but it gets you so far along that first path that it's incredible. So yeah, the ability to manipulate code, the ability to give it code and say, make this faster, make this more efficient, make it more Pythonic.

I think that's incredibly exciting. The fact that you can just feed it a data set and say, tell me what the key features of this are, I think that's incredibly exciting. But there are all the non-data use cases as well: please help me write a presentation about this, please help me present this back to stakeholders.

Please help me summarize this. Suggest some good data visualizations for this data. There are just so many opportunities.

Richie Cotton: That's brilliant. And I do find it interesting that it's actually rare to be writing code from scratch. Quite often there's an existing code base that you want to work on, so just editing code written by GPT or whatever is very similar to editing code written by one of your colleagues.

Sarah Schlobohm: But that's not the way we teach it, right? Like, we teach people, hey, here's a blank notebook, you need to structure your code from scratch. And I can't remember every import statement. I have to go and steal it from somebody else, usually myself in an older version. Yeah. Yeah. It's

Richie Cotton: Absolutely. I'd love to talk about the non-data use cases later, but just for now, if we continue on the theme of data use cases, is there a difference in how different data roles are gonna use generative AI?

Sarah Schlobohm: Yeah, I mean, obviously if you're a data analyst, or if you're in a more code-heavy role, you're probably gonna be generating a lot of code with it. If you're doing summary analysis, there are so many first drafts you can get out of that. I think there's so much in terms of all of the bits that support being a data scientist that aren't data and aren't coding.

These are the things that you really have to teach, that I wish somebody had told me as a young data scientist: maybe half the job is actually the technical bit. The rest of it is working with stakeholders, getting it into production, writing the business case to support it, writing the presentation to say, hey, this is why you need to approve it in the go/no-go.

All of that kind of stuff. And generative AI can help out with all of those aspects as well. In terms of non-technical use cases? Yeah: reporting, project management, ways to reword things, lesson plans if you're teaching at all, for example. It's just got so many opportunities.

Richie Cotton: It's really exciting that you mentioned all these different bits of a more holistic workflow that aren't necessarily just about crunching numbers. So have you got any examples of how you've used GPT or other AI for these non-technical bits, like creating reports or project management or things like that?

Sarah Schlobohm: Yeah, well, it wrote my bio for this podcast, so that's helpful. I fed it the information and it constructed something sensible, and I had to reword it at the end, because of course you do. It's great for a first draft. It's never good as a final draft. But yeah, that's a good example.

Another great example: we literally needed to develop a lesson plan within Kubrick, where I work, where we train people. And yeah, it gave a really good structure. You can tell it over how many days you want to learn something, and in what depth, and it will give you a structured guide.

So actually one of my consultants, so I train machine learning engineers, one of my consultants that I've trained recently is using it to run a marathon. He's never run a marathon before, but he's setting up a project to follow ChatGPT's plan for training, for recovery, for food, for absolutely everything.

And he's doing it all for charity. So, if I can give a shameless plug and link for that, I'd love tbt. He's calling it the Mary.

Richie Cotton: Okay. That's pretty cool. It sounds like you could end up trained really well, or it could be like, well, yeah.

Sarah Schlobohm: Well, he is not gonna do anything that'll

Richie Cotton: Did he put in the prompt something like, be sure to spend plenty of time on the couch watching TV as well?

Sarah Schlobohm: Seems to be. Yeah. But I think that it's a really interesting project. I think we're just scratching the surface of what people can do. But yeah, you can use it to structure presentations. It gave me some great examples. You can use it for paraphrasing things. It's really helpful with that.

Internally we had a strategy day, and we had a little exercise where we had to draw a shield. It sounds cheesy, it's one of these icebreakers, but it's actually pretty good. And you had to put different things in the quarters of the shield to, like, represent your life.

So I had DALL-E draw mine and it was really good. Far better than I could do with my design skills.

Richie Cotton: Nice. I am curious as to what went into your coat of arms then.

Sarah Schlobohm: It was: what inspires you, which was love of learning. What are you proud of, which was speaking at a big industry event recently. What else was it? What challenges are you facing, where I think I gave some sort of women-in-data answer. And something in your life right now; I can't remember what I said now. But I gave it the most basic prompts.

I was like, a red-haired woman speaking in front of a crowd, and it sure did draw a cartoon of me. That was all I gave it. Red-haired woman with glasses. She looks sufficiently stern.

Richie Cotton: It's nice that you fit the image of a generic red-haired woman with glasses. Brilliant. All right, so, getting back to AI and data. One of the big sticking points is when data people have to work with people who don't have a data background. So can you just talk me through how you use AI for this?

Sarah Schlobohm: I mean, the barriers to entry have never been lower on this, right? You can just show someone this and explain how it works. Like, my mom is a retired English teacher and she's using it. In show business they used to say, will it play in Poughkeepsie?

That meant, will it reach the mass market? So if it will reach you in Portage, Indiana, as a retired English teacher, then I think it's really past the hype cycle and into reality. It can do a lot of things, like summarize advice for you. It can help suggest visual aids. I actually asked it this question in advance.

It had some very good advice, which consisted mainly of: know your audience, simplify your language, use visuals, focus on the big picture, be prepared to answer questions, and practice. Which is all very good advice. I think if somebody came to me and asked me for that advice, that's probably what I'd give.

Richie Cotton: Yeah, that is great advice for preparing a presentation. So, GPT seems to have stolen the limelight in terms of what generative AI is, but there are tons of other models around for different applications and different purposes. Are there any other models that you think data practitioners ought to be looking into at the moment?

Sarah Schlobohm: I think for language models, ChatGPT is really the runaway success story. And I think it's still my favorite. I've played with a few of the other ones. It's really good, but it's still just text-based, which it will remind you of every time you ask it to do something that's not text-based.

So there are ones that can draw images: DALL-E, also by OpenAI. Midjourney's quite cool. There are ones that do video; I haven't played as much with those, but you know, there are a number of emerging ones out there. I've just been playing around with an app called RDO that will take pictures of you and generate different images of you in various art styles and, more importantly, of your pets as well.

So I can have my cat as a princess in space if I want. And really that's what we've used the internet for. So it makes sense that's what we use generative AI for.

Richie Cotton: I suppose, yeah, if you're training on stuff that people have created on the internet, then that's what you're gonna get back out of it as well. So, given that you're Head of AI and you're managing a team, I'm curious as to whether the use cases for managers are different from those for individual contributors.

Sarah Schlobohm: I think that's a big one. I think generally speaking, the more senior you get, the more responsibility you have to think about the big picture, and the more you have to think about AI ethics. To some extent, yes, that's everyone's job. We all need to think about AI ethics.

But really, it's the manager's job to be thinking about: are we using this safely? Are we using this for the appropriate purposes? And those are important questions to ask. So for example, when we do training, we have assessments that go along with that, and ChatGPT can pass our assessments. So as Head of AI, I had to be the one to identify that and raise that and say, hey guys, what are we gonna do about that?

How are we gonna adapt? I should point out we're in good company. It's passing things like medical exams, the bar, the Wharton MBA. So we are in good company there. But I think it's that: what are the implications? What's the bigger picture? How do we do this safely? How do we respect privacy?

How do we respect intellectual property? Those are the sort of big questions where, the more senior you get, you have to answer the strategic-level ones. I think also there's just more reporting, there's more admin, so there's more opportunity to use it for those sort of non-technical tasks as well.

I think it's the putting it in context that's the most important thing.

Richie Cotton: So that point you made about assessments is really interesting. It's something we've been looking hard into at DataCamp as well. Someone on our assessment team was saying one of the big problems is that writing a good assessment question has all the same key features as writing a good prompt.

Because you want to be really precise about what you're asking for. And so that's why GPT is really good at passing assessments, because it's a similar sort of optimization process. Going back to the point about managers, do you have any advice for managers who are considering adopting generative AI on their teams?

Sarah Schlobohm: I mean, safety and security and privacy are the first thing. Obviously, don't put state secrets in it. Don't put PII, personally identifiable information, in it. You don't necessarily know where it's going. Well, you do know that OpenAI has access to anything you put into it.

And similarly for anything else. So just be careful. Use it for general things. Use it for things that aren't gonna break the world if they get out. But you can't pretend it doesn't exist. When it first came out, I saw two really odd knee-jerk reactions to it.

One was to try to pretend it didn't exist. Like, ooh, if we don't tell them about it, maybe they won't use it. Well, come on, it's all over the internet. ChatGPT is the most successful product launch in history by a lot of metrics. So you can't do that. And the other option was to just consider straight-up banning it.

Just block it from the servers. Don't use it. But that's just shooting yourself in the foot in the long run, right? I'm old enough to remember the early days of the internet, when Google first came out, that kind of thing. It feels very similar to that, that sort of almost Wild West nature of it at the time. And okay, we had a dot-com bust and all of those things.

I expect all of that to happen too. But can you imagine these days asking people not to use a search engine? That'd be insane. And to suddenly just stop people from using any kind of generative AI, blocking it, banning it, whatever, except in very specific use cases, would similarly, I think, hinder productivity.

Richie Cotton: Yeah, I certainly remember like back in the early two thousands being at some jobs where the internet was like very heavily locked down and it became almost a sport to see like which sites could you actually access?

Sarah Schlobohm: Yeah. But even then I basically couldn't program without Stack Overflow. So now I've got ChatGPT every day. It's all good.

Richie Cotton: Brilliant. And do you have any tips for how you go about sharing best practices for using AI, or sharing prompts, or anything like that? Like, what do you do to make your team use AI more efficiently?

Sarah Schlobohm: I think we're all still figuring that out. That's one of the interesting things about it. Just like we all figured out how to use a search engine well back in the day, we're all gonna have to figure out how to do prompts well. But at the same time, and I think this is moving into another question, think about the best qualities of the best data scientists, right?

You can teach specific skills, but they're gonna change all the time, right? What you really need are the ability to ask good questions and the ability to critically evaluate information. And with generative AI, that's still the most important thing. Asking good questions is another way of saying prompt engineering, basically.

And again, like you were saying about assessments, the more specific and detailed you can be in asking that question, the more specific and detailed your answer is gonna be. And then ChatGPT, and all the generative AI, I don't mean to pick on it specifically, will very confidently give you its answers, right or wrong.

And being able to say, ah, hang on, that makes sense, or That makes 80% sense, but I need to tweak this. Or that's how it should work. But that's not how it works in practice. That's always been an important skill and that will continue to be incredibly important here now that we use these tools more.

Richie Cotton: Absolutely agreed. And just to push on this a little bit further do you have any ways of making sure that people are critically thinking about the responses that they get?

Sarah Schlobohm: I like it when you can tell that people have obviously used it and not edited at all. I do often just reply, thanks, ChatGPT, to that, if you see it in an online discussion. Any tips? I think it's just as you would with a friend who's sitting next to you, who's coding with you, who you know generally to be a good programmer, a good data scientist: sometimes they're gonna be wrong.

And don't just take everything at face value. If it's code, test the code. If it's critical information, then still consider looking up sources on that. Don't believe its citations. It will hallucinate citations still. So it will assemble plausible-looking citations to research papers with names and sensible titles.

But when you try and look them up on arXiv, they don't exist. So trust but verify.

Richie Cotton: That seems like good advice, I have to say. For data science use cases, I feel like I can usually tell whether something's been written by a human or by AI, but in some cases, like for sales stuff or marketing stuff, even if it's written by a human, it often sounds like it's written by AI anyway, so it's much more difficult to tell.

Sarah Schlobohm: But I mean, we've always used templates for these purposes, right? Like, I genuinely believe if you're gonna do something twice, you should probably automate it. And when it comes to telling people stuff, writing it down is how we automate that. So I've always been a big fan of using templates for routine communication.

Plug in the bits that you need and go. This is just templates plus attempting to fill in the blanks for you.

Richie Cotton: That's a nice way of putting it. So, I'd like to talk a little bit about jobs, because I think there's a lot of fear that some jobs are gonna be just completely replaced by AI. I guess first of all, are there any tasks or roles that you think are gonna be automated completely with generative AI?

Sarah Schlobohm: I think bits of jobs are going to be automated away. I don't know that entire jobs are gonna be automated away. Again, just as we saw in the early days of the internet, there were scares about, ooh, will all brick-and-mortar shops close? Well, some did. But not all of them did.

I still go shopping on, on the high street. I still like to look at stuff. There are still cases where, something physical is still better than what's online. So I think we're gonna see a similar transition. Some stuff is gonna disappear. We don't program using punch cards anymore. We don't do math using slide rules anymore, but we still program, we still do math.

We're just using different tools to do it. So I think those jobs are going to shift rather than completely disappear, and hopefully it's the boring stuff that we get to automate away, and we get to focus on the interesting stuff, the fun stuff, the stuff that's gonna make a difference in the world.

Richie Cotton: So, in that case I'd like to talk about cases where you have a human and an AI working together. Do you see any particular cases where that's gonna grow in popularity?

Sarah Schlobohm: Well, I mean, even generative AI itself is that: it's human-in-the-loop reinforcement learning. And that's a pretty productive approach. There have definitely been some studies in medicine that show doctors using AI do so much better than either alone.

I think that's the way forward.

Richie Cotton: Are there any cases where you think you should just never use AI, that it should be blocked?

Sarah Schlobohm: I don't actually think you should never use AI. Again, I'm gonna go back to the, like, should you never look stuff up analogy. Of course not. If you're gonna do anything, you're gonna do research on it. And this is one of these tools to do research. Again, don't put the state secrets in: how can I best protect my nuclear bunker?

Well, no, that would be insane. But there are lots of cases where it shouldn't do everything. I think of it as part of your toolkit. You wouldn't say there are parts of math where you can't use a calculator. Well, okay, there are parts where, like, a calculator's not gonna help you that much.

And similarly, there are gonna be parts where generative AI is not gonna help you that much. But are you really gonna tell people, no, for very good reasons, you need to go back and use the slide rule now? I don't think so.

Richie Cotton: There's probably like three slide rule enthusiasts to listen to us right

Sarah Schlobohm: Yeah, I mean, really,

Richie Cotton: it's my favorite way of doing.

Sarah Schlobohm: No, I think they're really cool. I have a lovely one, and it's a lovely tactile thing, and it's a great reminder of what we used to do. And I never used punch cards, but my physics department used old punch cards as, like, our scratch paper.

So I have a fondness for it, right? And there probably is somebody running some fantastic hole-punch machine. And you can still use it for machine knitting, interestingly enough, or Jacquard weaving. They still very much use hole-punch technologies, which is where a lot of that came from. But yeah, I'm super into hand knitting, but for the most part I still buy my socks commercially.

Richie Cotton: The recurring theme so far seems to be that if you've got privacy or security problems, that's gonna be the biggest blocker to using AI. Do you have any tips on how you might reason about this or how you might get round these blockers?

Sarah Schlobohm: I mean, for the most part, if it involves anything that can identify a human, think twice before you do it, right? Personally identifiable information is the obvious one: things like names, ID numbers, things like that. But especially when combined, a lot of things are identifiable. So if you say, that ginger American, especially in a certain tone of voice, I probably know you're speaking about me.

It is something really striking: like, 95% of a dataset can be de-anonymized with, like, four data points. It's really shocking. So be really careful about putting information about people into it. I think that's the biggest thing to say. And then anything you wouldn't want on the front page of a newspaper.

I think that's always a good test for AI ethics. If it comes out that Sarah writes a model that, I don't know, says rude things about ginger Americans, then just don't. If it got out, and you wouldn't want literally everybody and their next-door neighbor to read it, don't put it in a public-facing tool.
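
The point about combined attributes being identifying can be checked on your own data before sharing it. The sketch below counts how many records become unique once a few seemingly harmless columns are combined. It is a toy illustration only: the function name `unique_fraction`, the column names, and the records are all made up, and it does not reproduce the 95% figure mentioned above.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose combination of the given column values
    appears exactly once, i.e. records that could be singled out."""
    if not records:
        return 0.0
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Made-up records with no names or ID numbers in them
records = [
    {"hair": "red", "nationality": "American", "city": "London"},
    {"hair": "brown", "nationality": "British", "city": "London"},
    {"hair": "brown", "nationality": "British", "city": "London"},
    {"hair": "red", "nationality": "British", "city": "Leeds"},
]

one_column = unique_fraction(records, ["city"])
two_columns = unique_fraction(records, ["hair", "nationality"])
print(one_column, two_columns)
```

In this toy data, adding a second column doubles the share of records that can be singled out, which is the effect to watch for before putting a dataset into a public-facing tool.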

Richie Cotton: Okay. So it really is about just thinking about what are the impacts gonna be if this becomes public or if you get a wrong answer.

Sarah Schlobohm: Yeah, I, and obviously, well, you should know if you're dealing with anything that has a security clearance or intellectual property implications, et cetera. I think those are pretty obvious. Don't use it cases.

Richie Cotton: And I'm hoping most companies and organizations have some kind of guidelines on what are the important or basic data that ought to be kept secure. So, maybe those are the ones you keep away from ai.

Sarah Schlobohm: Yeah, but I mean, I would be shocked if very many companies have updated their security policies in light of this revolution. I mean, this has only been out since November 30th, 2022. This is when it all really took off. Obviously bits of it existed before then, but I think that's the sort of key date in all of this.

So I'd be shocked if companies have kept up with it. I know certainly regulation has never kept up with it. That's always been a problem when thinking about AI ethics and data privacy and concerns like that. So it is up to us to behave ethically when using it and make sure that we're not doing anything for evil and to make sure that we're protecting ourselves and our company's rights when we're using it.

Richie Cotton: Absolutely. So just moving on from the impact on jobs and tasks to upskilling, I think AI has some pretty big implications for education.

Sarah Schlobohm: It does.

Richie Cotton: Yeah. So, to begin with how do you think it helps people learn new skills?

Sarah Schlobohm: I mean, you can just chat with it. You can just ask it. It's incredible. Explain time series to me, and you can have a chat with it, and you can say, oh, tell me more about that second paragraph, and it will. And is it gonna be right about absolutely every single detail? Well, probably not.

So, so do be careful about that and do verify that. But it can basically be like having a private tutor now, which is incredible. I'm a big fan of Duolingo. For language learning. I'm on it all the time. I'm learning far too many languages and what they've done with it I think is incredibly impressive.

You can do really tailored learning. I think this is a thing we know about education in general, right? Really focused practice on the areas that you're getting wrong is what helps you upskill rapidly. And this is where AI-powered learning has such great potential because, as I'm doing my French lessons, it can say, oh, she always gets the subjunctive wrong, and it can give me loads more review about that.

And help me with that. And similarly, I assume if I'm learning a programming language, for example on DataCamp, I'm going to be able to get a, she always forgets, I dunno, to define the arguments there or something. I don't know what common mistakes I make, but usually just typos.

It can help me catch that too.

Richie Cotton: So the idea of personalization, being able to figure out what this particular learner is doing wrong, seems really powerful.

Sarah Schlobohm: It really is.

Richie Cotton: Absolutely. Are there any other areas where you think, okay, this is a game changer, this is gonna really improve people's learning performance?

Sarah Schlobohm: Yeah, I hope so. I mean, if you've ever sat through really standardized, boring corporate learning, you really hope that it gives you something other than that PowerPoint presentation with a bad quiz at the end. We've all done that. There are things that we have to do regularly for very good reasons.

Things like anti-bribery and corruption training. But that could certainly be more engaging. That could be stickier, in terms of it sticking in your brain better. I think that level of personalization, interactivity, adaptability, that's something that in the past, for the most part, you would have had to work with someone really directly in a small group to get.

And there's so much opportunity to get the generative AI to quiz you. So it's great for practicing job interviews. You can get it to role-play a job interviewer and ask you all those sorts of questions. And then it can even evaluate the response, if you type that back in.

So I think there's so much opportunity to quiz you. When I was a kid, when I was studying for my exams, I'd always get my mom to quiz me. And that was the best practice when you go in and ace it based on that. So, yeah. Now you've got your own private tutor potentially.

Richie Cotton: Yeah, AI tutors, that's a feature coming soon to DataCamp. So I do like that example. That's very nice. The idea of being able to converse with something and practice your conversations does seem a very powerful way of learning. So, on a related note, does the fact that you can use AI change the skill set that you need, particularly for data analysts and data scientists?

Sarah Schlobohm: I mean it does, but again, I still think those most important skills are always asking the right questions and critically evaluating information. And so that just makes those even more important because asking good questions is another way to say prompt engineering.

Richie Cotton: So in terms of technical skills if AI can generate your code, does that change your relationship with how you learn about coding?

Sarah Schlobohm: I think there's just less rote memorization that has to happen. But again, in my career, I have seen this progression that has happened, right? It used to be back in the day that if you were writing some complicated thing in C or C, c plus, whatever, like. Like during my PhD you had to code in those equations by yourself, right?

You had to have that math, you had to code that in specifically, and okay, that made the code run way faster, potentially. There were good reasons to do that. Nowadays, we don't do that anymore. We use Python packages if we're programming in Python, right? Somebody's done that function for me.

I don't need to write a neural net from scratch. I'll go off and use TensorFlow or PyTorch, whatever. Right. So it's just the next level of abstraction. We were already getting there. I thought the next step up from packages was gonna be pre-trained models, and to some extent this is that, because it's trained on the entire internet.

But I think it's just that next level of abstraction up, and why wouldn't we, when appropriate, use the more powerful tool to do that?

Richie Cotton: I definitely agree with that. Because this is all built on deep learning, and the whole area seems really important, you need a few people to understand what's going on. Do you think these sorts of natural language processing skills or deep learning skills are going to become more important, or is it just, well, this is all done for you, we don't need that?

Sarah Schlobohm: It depends on what you wanna do. If you just want to use the outputs of these things, then no. I don't need to know how a search engine works to trawl the internet. But if you wanna work with it, if you want to be on the cutting edge of technology, if you want to be adapting it, if you want to be making your own tools with it, then I think, yeah, absolutely, natural language processing and deep learning are some of the most important skills.

That's especially so when thinking about just text-based models. If you're thinking about image generation or video generation, then you obviously need computer vision skills on top of that.

But you do still need some NLP because it has to be able to interpret prompts. So yeah, it will be there. But again, how many of us are really writing our own Python packages these days, rather than using that output to write the code? I think there's a parallel there as well.

Richie Cotton: That seems like sensible advice. And so I'm wondering, in general, how does the existence of generative AI change your decisions when you are trying to hire data professionals?

Sarah Schlobohm: Oh, that's interesting. I mean, if they don't know what it is, they've clearly been living under a rock. I think that's genuinely true though, because you do want people to be aware of trends in the industry and have ideas and opinions about where things are going.

So I think people having an awareness of it and people having interesting insights on how to use it. I think those are probably great interview questions. But again, I'm still always hiring for, can you ask good questions? Can you critically evaluate responses? Are you creative? Can you learn quickly?

Can you adapt quickly? All of that's in practice now. You need to do all of that. You need to do it just to successfully use the tool, full stop, but also to be able to adapt to a changing workplace that's going to integrate tools like this more and more.

Richie Cotton: Excellent. Has it changed the profile you might look for or are there any new skills that you think are more important now?

Sarah Schlobohm: I do think NLP and deep learning are gonna become more and more important. I think that's absolutely the right shout on that. But I think individual skills and technologies change a lot, and you can learn them, and you need to be constantly learning. I think the only thing that we can guarantee about this industry is that it's gonna keep changing and it's gonna keep adapting.

And we're gonna keep building on what we're doing. It's gonna be more interesting, more powerful, and that's gonna use different skills, right? And are we gonna be using Python in 20 years? Probably not. I mean, some people are still using Fortran, but hey, for the most part, no.

There are trends. There are shifts in what we do. This is a big one. This is a big step change. But the core skills: are you logical? Are you creative? Are you pragmatic? Can you make things make sense? Those are the sort of more fundamental skills rather than specific technologies.

Cuz I can go do a DataCamp course tomorrow and refresh myself on time series if I need to. That's not the point. It's new. I know how to find that information. Do I even know how to ask questions about this, like, hey, what kind of a problem am I talking about? That's a more important question to ask than what specific piece of technology, or how do I write these five lines of code about it?

Framing the problem, thinking about how to solve it, how to communicate the results at the end of it, that's always gonna be more important than a specific skill.

Richie Cotton: One thing you've mentioned a few times is the idea of prompt engineering. And it seems that this might be a real job. I'm never quite sure if it's like, if it's just a task or whether it's like a whole job in itself. Can you just tell me a bit about what it involves?

Sarah Schlobohm: Yeah, I mean, prompt engineering is another way of saying, asking the right question. I think there are tricks to getting it to do what you want. So giving it context again. But a lot of these things are just good advice for how to tell a human to do stuff as well. So give it context.

Tell it why. So in this situation, I want you to be a data scientist using Python. So give it that prompt. Ask specifically what you want. If you're looking for a format of an answer, give it an example. Say, hey, give me an output like this. Again, really great advice for dealing with humans as well.

So yeah, I think it's gonna be a super important skill that quite a lot of jobs have. I think in very limited circumstances that is gonna be a specific job, but I think that's gonna be part of like a mega pipeline in certain areas. Search engine optimization, I think, is again a good comparison.

Is that the job title in a lot of cases? Probably not, right? But if you have a marketing role, that's probably an essential skill, and in a big enough, specialized enough organization, you might have someone where their only job is that. But I think it's more and more a skill we all need, and it's a skill we can still all use, because again, for dealing with humans: giving context, saying explicitly and clearly what you need, giving an example of what good looks like.

Still all really great advice.
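
The advice above (give context, tell it why, ask specifically for what you want, and show an example of the output) can be sketched as a small prompt template. This is only an illustration: `build_prompt` and all of the example text are hypothetical, not something described in the conversation, and the sketch is not tied to any particular model or API.

```python
def build_prompt(role: str, context: str, task: str, example_output: str) -> str:
    """Assemble a prompt from a role, background context, a specific
    request, and an example of the desired output format."""
    sections = [
        f"You are {role}.",                       # who the model should act as
        f"Context: {context}",                    # background and the "why"
        f"Task: {task}",                          # the specific request
        "Format your answer like this example:",  # show what good looks like
        example_output,
    ]
    return "\n\n".join(sections)


# Hypothetical example values, made up for illustration.
prompt = build_prompt(
    role="a data scientist using Python",
    context="I am exploring monthly sales data to prepare a report for managers.",
    task="Suggest three checks I should run before plotting the data.",
    example_output="1. <check name>: <one-sentence reason>",
)
print(prompt)
```

The resulting string could be pasted into a chat tool as-is; the point is only that each piece of the advice gets its own explicit slot.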

Richie Cotton: Excellent. So maybe a job in big organizations. But otherwise it's just a skill everyone needs.

Sarah Schlobohm: How to search stuff on the internet right now is a skill that everyone needs, right? How to phrase that question properly. How to sort through all the crap, how to ignore the ads, how to do all of that. We don't even think of those as skills anymore. But 20 odd years ago, we were all figuring this out for the first time.

And it's what we're doing again, I think with.

Richie Cotton: Internet search. It does sound like a job. What do you do? Well, yeah, I just Google stuff all day.

Sarah Schlobohm: My job's like that.

Richie Cotton: Let's talk about your work at Kubrick. So can you tell me what you are working on at the moment?

Sarah Schlobohm: Sure. Yeah. So we're training the next generation of, well, lots of data professionals. So as head of AI, I look after our machine learning engineering track. So the people who are going to be potentially using this stuff on a full-time basis. We have other streams as well.

But yeah, training them up in the basics of all of the business stuff around this as well, because again, half the job is actually getting it implemented and communicating it. Training them up with the basic skills that we need and then sending them out to client sites and supporting them out there using machine learning skills.

So we're spending a lot of time, obviously, talking about the new revolution in AI and how interesting it is. Yeah, lots of good conversations, lots, obviously, around the assessments. We had some pretty serious assessments that were a couple of weeks after ChatGPT dropped, and we ran the exam through it. In those days, it was passing with good marks rather than incredibly perfect marks, but it was about having an honest conversation with them: look, you can do this, it can answer the questions.

We're not gonna lie to you. We're not gonna hide that from you. But you're not gonna get the benefit of really being able to challenge yourself and really being able to test yourself if you use it to effectively collaborate, right?

Because rather than copying from someone, the equivalent of using ChatGPT on an exam is, I think, closer to collaborating with a human. It's closer to plagiarism than "I have automated something." So setting expectations around that has been a really interesting discussion. We always talk about AI ethics, and I think the AI ethics of this is a big one as well.

All those same questions about privacy and security, but also, where is it going? They're all worried about, is this still the right thing to train in? Are we gonna have jobs at the end of this? And yeah, I actually think it's more important than ever.

Richie Cotton: Yeah, it's a thorny issue, the idea of using AI for assessment. Cuz it's cheating, but if it's something you're gonna use in your job, then maybe it's something you want to be able to use when you're

Sarah Schlobohm: Well, exactly. Cause I was a hiring manager at a bank. I worked at one of the world's largest banks. And it drove me crazy when we would get young data professionals who would join us and would insist on doing absolutely everything from scratch with no collaboration. Because that was how you're supposed to do it in school and university, but that's not what I want you to do on the job site.

Like, if you can ask Bob next to you and Bob can say, yep, this is how you do it, in five minutes, or if you can find it on Stack Overflow, I would much rather you do that than spend two days hashing it out for yourself less efficiently. Like, this stuff exists. It's been tested, it's been tried.

I think this is always the example we used for using pre-made Python functions, right? Don't write a linear regression from scratch. What are you doing? Use scikit-learn. I think that's where we're headed with this as well.
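
As a minimal sketch of that point, here is a linear regression fitted with scikit-learn's `LinearRegression` rather than hand-written code. The toy data is made up for illustration (it lies exactly on y = 2x + 1) and is not from the conversation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data, made up for illustration: the points lie exactly on y = 2x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature, four samples
y = np.array([3.0, 5.0, 7.0, 9.0])

# The tried-and-tested library estimator does the fitting for us.
model = LinearRegression()
model.fit(X, y)

slope = float(model.coef_[0])
intercept = float(model.intercept_)
prediction = float(model.predict(np.array([[5.0]]))[0])

print(f"slope={slope:.3f}, intercept={intercept:.3f}, prediction at x=5: {prediction:.3f}")
```

A few lines of well-tested library code replace the matrix algebra you would otherwise have to write and debug yourself.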

Richie Cotton: All right, nice. And do you have any success stories around AI that you can share from Kubrick?

Sarah Schlobohm: Well, I have to be a bit careful about client confidentiality. I think there's some really interesting stuff that's happening in life sciences with large language models. I went to an event on that in industry. I think that's incredibly cool. There's lots of options for developing new sort of computational chemistry with it, for looking for new drug reactions, for doing sort of meta-analysis on past research papers.

I've seen some really interesting examples of that. I think that's an incredibly interesting area that's gonna be relevant, well, to everyone, because we all have to deal with healthcare at some point in our lives. I think that's incredibly cool. We've definitely seen people switch to things that are a lot more efficient: different versions, automating things, being able to automatically classify documents, for example, looking at supply chains.

There's just opportunities absolutely everywhere. If you're like, hey, that could be done more efficiently, there's probably a way to use technology to do that.

Richie Cotton: Excellent. So lots of opportunities out there, then, for people wanting to adopt this stuff. Do you have any final advice for data managers or data teams who are wanting to try their hand at AI?

Sarah Schlobohm: Yeah, I mean, just try it. Like, the barrier to entry has never been lower. It's just narrow. It may not stay that way forever, right? It may not always be so freely accessible or so cheaply accessible. But it's a big step change. It's exciting. So on the one hand, it's cool.

Be excited about it, right? Absolutely. There's so much potential. On the other hand, like, calm down. It's also probably not the end of the world. It's probably not Skynet. I think there's so much opportunity, but also keep some perspective on it as well. I have personally been through a few revolutions of the technological variety, right?

We got the internet, we got advances in search engines like Google. Some stuff phases out, but some new opportunities are made every time. We wouldn't be talking right now without a bunch of them. I'm in Chicago, you're in New York. Look at what we've managed to do. There's some kind of automated transcript going on in the background.

I couldn't have imagined that 20 years ago, but here we are.

Richie Cotton: Technology is amazing, and I do like the phrasing that it's probably not Skynet. I think that's optimistic.

Sarah Schlobohm: I may come to regret that. Maybe, like, at the end of the world, you'll see it on a newspaper: "Probably not Skynet."

Richie Cotton: Nice. All right. Well, thank you for joining me on the show, Sarah. That was really informative. Great stuff.

Sarah Schlobohm: Thanks so much.
