
Learning & Memory, For Brains & AI, with Kim Stachenfeld, Senior Research Scientist at Google DeepMind

Richie and Kim explore her work on Google Gemini, the importance of customizability in AI models, the intersection of AI, neuroscience, and memory, and much more.
Jun 2024

Guest
Kim Stachenfeld

Kim Stachenfeld is a Senior Research Scientist at Google DeepMind in NYC and Affiliate Faculty at the Center for Theoretical Neuroscience at Columbia University. Her research covers topics in neuroscience and AI. On the neuroscience side, she studies how animals build and use models of their world that support memory and prediction. On the machine learning side, she works on implementing these cognitive functions in deep learning models. Kim's work has been featured in The Atlantic, Quanta Magazine, Nautilus, and MIT Technology Review. In 2019, she was named one of MIT Tech Review's Innovators Under 35 for her work on predictive representations in the hippocampus.


Host
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

The way that we think about the role of episodic memory, of rapid memory contributing to learning in the brain, is actually quite similar to how we think about retrieval-augmented generation, RAG, in these language models: if you give them access to a memory buffer into which you can put entire new experiences that they can then draw on and add to their context, that's in a lot of ways very similar to how we think about learning in the brain. And I think both neuroscience and AI could find that an interesting and fruitful analogy.

Historically the fields of neuroscience and AI have had a lot of overlap. Thinking about the problem of how to learn from reward, or how to make sense of what you are looking at, the computer vision problem, or how to implement memory well: these are just problems that show up in both AI and neuroscience. And historically there's been a lot of back and forth. Conferences like RLDM, Reinforcement Learning and Decision Making, are very interdisciplinary and involve researchers from neuroscience and computer science thinking about the same things. There's always debate about whether a given idea came from neuroscience or whether neuroscience got it from AI. I don't know; I haven't been alive for the entire history of neuroscience and AI. The way I think about it now is that it's an efficient use of intellectual resources for some people to be thinking about the same problem in both the neuroscience and AI contexts. Because these problems have a lot of overlap, ideas people think about in neuroscience have relevance to AI, and ideas people think about in AI constitute nice hypotheses for how the brain works. It's quite possible that a random idea we have for an algorithm may be a really bad hypothesis for how learning in the brain works. There are very different constraints on a system that runs on GPUs versus a system that has to form itself out of a single cell, learn, be subject to biological constraints, and be made out of food. But they have some computational problems in common, and thinking about what's in common between them is just a good use of researcher time.

Key Takeaways

1

Incorporate retrieval databases into your AI workflows to improve response accuracy and tap into a vast repository of factual information from original sources.

2

To get the most out of your AI models, prioritize customizability and adaptability to ensure they can be tailored to specific tasks or domains.

3

To ensure accurate decision-making, balance predictive models with human insights and expertise to provide a more complete picture of complex systems and phenomena.


Transcript

Richie Cotton: Hi, Kim. Welcome to the show.

Kim Stachenfeld: Hi, Richie. Thanks for having me.

Richie Cotton: So I'd like to start off talking about learning. So how do brains learn?

Kim Stachenfeld: It's funny, my first reaction to that question is: oh gosh, that's a big question. But when I was thinking about this in preparation, two things really stuck out as special about how the brain learns, especially in comparison to modern AI.

And that's the way it's structured. One way it's structured is developmentally: cognition emerges gradually. Babies are very different from adult humans, to start with the obvious. Humans acquire skills at different stages in their lives; they gradually start to experience more and more of the world as their brains become prepared to deal with it.

They first learn how to make sense of the world with vision and audition. They start to form memories, speech, motor control, social behaviors. You've probably heard that the brain is not fully developed until you're 28 years old, often in the context of an argument with a parent when you're a teenager, disagreeing about what you should be allowed to do. And it's true: it takes a while for cognition to develop, and we have experience that's age-appropriate as we go through life.

This is really different from how language models are trained, where right away they start seeing random samples from the internet. The entire dataset is presented to them at once, and they very gradually start to make sense of it, rather than building from simple things to more complex things, or from earlier skills to later skills.

Another thing that's really special about learning in the brain is that there seems to be some regional specialization, some modularity, to how learning is structured. We have theories about how different brain circuits and brain areas contribute to different types of learning, which all mingle together in a complicated and, in some ways, still hard-to-describe way.

I focus a lot on the hippocampus, for instance. The hippocampus seems to be associated with rapid memory for experience. If you summon to mind the vivid experience of parking your car this morning, which parking lot you decided to drive into, and can almost re-experience it, that's episodic memory.

That's very different from the more gradual process of learning from reinforcement: gradually learning which actions are rewarded and which are not, and modifying your behaviors over time. There's evidence that there's a lot of specialization here, that these processes happen somewhat separately and then all join together.

There are also tons of other kinds of learning. Those are two I've thought about a lot, so I mentioned them, but there's cerebellar learning, cortical learning, all sorts. It's very specialized, as opposed to the kind of monolith that large neural networks are.

Richie Cotton: That's absolutely fascinating, that there are lots of different types of learning, and fairly reassuring, I suppose, that there are lots of different ways you can learn things. It was also fascinating that you mentioned reinforcement learning, since that's also a technique for training AI. It's good that brains and AI have some overlap.

Kim Stachenfeld: Oh yeah. Lots of overlap.

Richie Cotton: Super. I also found it fascinating that you said your brain's not fully formed, that you continue learning things, up until you're 28, which is much longer than I expected. I'd thought once you're grown, you're basically done. So it's interesting that it carries on a bit longer.

Since you mentioned there are lots of different types of learning, and some of them are a bit more specialized, can you talk me through how the brain works differently when learning different kinds of things?

Kim Stachenfeld: A classic neuroscience example of this is the difference between episodic memory, which I'll define, and procedural memory.

There's a very classic study in neuroscience. My brother-in-law is a medical student, and when he was taking his neurology class and got to his lecture on the hippocampus, I said: I bet you ten dollars your very first slide is going to be this guy Henry Molaison, who had both of his hippocampi removed and lost the ability to form new episodic memories.

A bet which he did not take, but I would have won. Basically, this man, Henry Molaison, was unfortunately experiencing seizures that originated in his hippocampus. Surgeons removed both of his hippocampi, and he entirely lost the ability to form new memories. He would experience things, and then as soon as something happened that distracted him from his current experience, he would forget what had previously been going on.

So he was very limited in his ability to learn new things. However, what he could eventually learn, with repetition, was complicated motor behaviors. In particular, they used a mirror star-tracing task: you watch yourself tracing the outline of a star in a mirror.

It's something that's kind of tricky to do. He would practice doing this, have no recollection of doing it, but gradually get better, surprising himself with how good he was at it every time. So that's one of the really foundational results in neuroscience showing that there are different learning systems.

You can severely affect one while leaving others somewhat intact, although they do seem to relate to each other. Presumably, if he had had a hippocampus, he might have been able to learn that task a little faster; these systems interact. But the fact that there is as much modularity as there is, is often still surprising to people.

Richie Cotton: Well, first of all, it sounds like an absolutely horrible experience for that man, having bits of his brain not working.

Kim Stachenfeld: Yeah. In the early days of neurosurgery, there were a lot of kind of reckless surgeries, I'll say. We don't do that anymore.

Richie Cotton: Okay, that's very reassuring.

But yeah, it's very fascinating that there are these independent learning systems, so even if you lose one ability, you can still learn things in other ways. Speaking of learning things another way: a lot of our audience are very interested in learning about data, and they want to be able to learn faster.

So do you have any tips for how to learn more effectively that are actually backed up by science?

Kim Stachenfeld: Yeah. I'm not a medical doctor, so, you know, grain of salt. But it seems generally that the boring things people recommend (sleep, exercise, social interaction, repetition) are pretty useful for learning.

One thing that's particularly interesting, from the perspective of somebody who studies the hippocampus and reinforcement learning, is that some of these have been associated with changes in neural circuitry that seem a little evocative. It's hard to say for sure that observations made at the clinical level really relate to what people see in the lab.

It's a complicated system, but exercise in particular is associated with neurogenesis in the hippocampus, the creation of new neurons. Social interaction also seems to have this kind of effect. And with sleep, there are a lot of theories about how it's useful for converting experiences you've had recently into knowledge that's more general, more robust, and will stay in your brain for a longer period of time.

There's a theory of complementary learning systems: the idea that the hippocampus helps you learn things really rapidly, and eventually patterns from these experiences are extracted and distilled into your cortical circuits, with sleep hypothesized to be a really important part of that process. As somebody who doesn't sleep that much and is loath to exercise, I find that a little depressing. But it is nice that the things that lead to a happier lifestyle can also be justified in terms of promoting brain health.

What I also find really interesting is that novelty seems like a really big driver of the formation of new memories. There was a lot of concern during COVID that people just weren't getting enough novel sights; they were staying in their houses too much and not seeing new things.

Having a persistent experience of novelty seems to be important for laying down new memories. That's also in the realm of things that make you feel happier; they seem to put your brain in a good state to learn.

Richie Cotton: That idea about sleep is particularly interesting, because quite often you see people on LinkedIn talking about how they get up at 4 a.m. and are super productive immediately. I'm like: well, actually, maybe you should have had a lie-in; you'd be smarter.

Kim Stachenfeld: I mean, if they're going to bed at eight or so, that's probably a pretty solid plan. Having a good schedule is a good way to get more sleep.

Having a schedule that deprives you of sleep, though: to say it in careful terms, I don't know of any science that supports that. If anything, there has been lots of research recently showing that sleep matters a lot more than we thought, and there are also some really interesting cognitive theories of how it works.

So to your listeners, I would say: sleep more than you want to.

Richie Cotton: All right, I'm going to try and see if I can get away with having a nap at work, just to say: it's important for my brain power!

Kim Stachenfeld: Yeah!

Richie Cotton: All right. So we talked about how to learn more effectively, and related to this: you learn stuff, then you forget it. How can you improve your memory?

Kim Stachenfeld: I don't know if you've heard of the idea of memory palaces, but when I heard about them before I'd learned any neuroscience, I thought: that seems a little bit silly.

The idea is basically that you imagine a spatially familiar place. Sometimes people like to construct one, but I think the classic example is that you use your childhood home or something, and you memorize random sequences of things by binding them to different locations in your memory of your house.

There's actually been a decent amount of research into this, and it seems to work quite well. Other things in this family don't necessarily boost the skill of memory; they're more like a strategy for remembering particular things. You can bind them to spatial arrangements, as in the memory palace, or you can bind them to songs. I learned a song in Spanish class in high school to remember all the irregular verbs; I think that's a pretty common trick.

Another general technique in that family is to put things into a story. This is sometimes what people who do those memory games do, where they try to remember random sequences of cards: they construct a story about them. Why these things work, and the extent to which they really work in systematic ways, is a bit of an open question, but there's a modeling framework that makes some interesting hypotheses about it.

I think the basic idea is that we often have the experience of applying a familiar narrative structure to a new situation. If I watch a rom-com, I know generally how rom-coms go, and I would be really surprised if something happened besides the characters ending up together.

Things tend to go in a particular way. I know how walking through my childhood home is going to go; I would be really surprised if suddenly the garage was in the wrong place. I know how songs are going to go; if all of a sudden I started hearing white noise instead of a song, that would be very surprising to me.

These familiar sequences can form a template to which we can bind new things; that's at least the hypothesis. For the aficionados, there's some really nice computational work from James Whittington implementing this idea mathematically. Ila Fiete's group also has some nice work implementing it specifically in the spatial context, although that's thinking more about how animals navigate than about memory processes in general; but it's a really general framework and can be thought of in that way.

So I think an emerging idea in computational neuroscience is that these templates can be useful for binding arbitrary concepts. And in general, this is just me speculating, but I think this is one of the reasons curiosity is a really good driver of learning and memory: if you're actively trying to put new knowledge into a story, to stitch new things you've learned into other things and make relations, you are naturally leveraging this process.

You're trying to apply the templates you know about how things logically fit together, and using that to scaffold your learning. Then you'll have better memory for it, both because it's more interesting and more salient and wakes up more of your brain, and because it's leveraging structure that's already in there and binding it to new things.

Richie Cotton: That's brilliant. We spend quite a lot of time on this podcast talking about storytelling for improving the impact of your data analysis, so it's good to know that it's really backed up by science. The other idea, about songs, is also terrifying: I'm going to have to start singing songs about data just to make people remember them.

Yeah, it's a skill I don't have, but maybe for someone else. So, on the flip side: there are always people coming up with advice on how you can learn or remember things better. Are there any things in popular culture that don't actually work?

Kim Stachenfeld: I'm sure there are. There are probably some I'm not even thinking of because I've pushed them out of my own brain and forgotten they're out there. Two that come to mind show up in my social media ads a lot: games that claim to improve cognition, and supplements.

I mean, I'm sure games challenge your brain in different ways, and I'm sure there are ways they can help. But we don't really understand it fully. So for a lot of those things, the "play this app and organize luggage on a phone screen into different shapes" kind of game, I don't think there's a ton of evidence that it makes you better at anything besides playing that particular game. And it definitely will make you better at playing that particular game.

It's hard for me to imagine that you're better off playing a game on your phone to really get better at learning and memory than, say, having a conversation with another human being, which requires you to recall things from your memory and improvise and adjust in a very adaptive way. Lots of games do require a lot of thinking and recall. I love doing crossword puzzles, and I like to think there's some benefit to them, although besides making me more likely to use words with a lot of vowels in my day-to-day speech, it's hard to know for sure what that benefit is.

Playing games like chess and such, it's hard for me to imagine that doesn't build up cognition. But a lot of these games overpromise, and it's good to know that whatever the claims are, they're probably in the realm of uncertainty relative to the neuroscience.

With supplements, also, just be careful. That's almost more dangerous, because there are definitely pills you can take that will change your neurochemistry. If you have ADHD or depression or something, you could talk to a psychiatrist and be prescribed drugs that will definitely have an effect on your focus, learning, and memory. They have side effects, and one shouldn't take them lightly, but there is a real effect from medicines you can take. It also seems like there's a lot of other stuff that's advertised, and I have no idea what it does.

A heuristic I often use for evaluating those things is to look at the about pages, see who is associated with the product, and check whether any of them are doctors or professors. It's a hacky method, and not a surefire way to know that something is or is not a scam. But it's at least one source of evidence you can draw on to see how aligned it is with what medical and scientific institutions generally support.

Richie Cotton: So, maybe don't just buy random pills off the internet.

Kim Stachenfeld: Don't just take those off the internet. It could go poorly.

Richie Cotton: Yes, that's good life advice in general, I think. All right. I'd like to talk a bit about the overlap between brain research and AI research. Can you talk me through how your neuroscience research relates to AI?

Kim Stachenfeld: Yeah. Historically, the fields have had a lot of overlap. Thinking about the problem of how to learn from reward, or how to make sense of what you're looking at (the computer vision problem), or how to implement memory well: these are just problems that show up in both AI and neuroscience.

And historically there's been a lot of back and forth. Conferences like RLDM, Reinforcement Learning and Decision Making, are very interdisciplinary and involve researchers from neuroscience and computer science thinking about the same things. There's always debate about whether a given idea came from neuroscience or whether neuroscience got it from AI.

I don't know; I haven't been alive for the entire history of neuroscience and AI. The way I think about it now is that it's an efficient use of intellectual resources for some people to be thinking about the same problem in both the neuroscience and AI contexts.

Because these problems have a lot of overlap, ideas people think about in neuroscience have relevance to AI, and ideas people think about in AI constitute nice hypotheses for how the brain works. It's quite possible that a random idea we have for an algorithm may be a really bad hypothesis for how learning in the brain works.

There are very different constraints on a system that runs on GPUs versus a system that has to form itself out of a single cell, learn, be subject to biological constraints, and be made out of food. They're very different constraints. But these systems have some computational problems in common, and thinking about what's in common between them is just a good use of researcher time.

So that's how I think about it: the problems have a lot in common, and it makes sense to think about both in the same space, so you can double up on your efforts.

Richie Cotton: So it's cool there's a lot of overlap. I'm just wondering, have you got any examples of an idea you've taken from AI, or of how you've used AI as part of your research?

Kim Stachenfeld: Yeah. I've done a lot of work in particular on prediction in the brain: how memories are used to make predictions about what's going to happen, and how those predictions are used by the brain. This was really building on ideas from machine learning, and testing the extent to which those ideas explain activity that we see in the brain, in particular in the hippocampus.

There's a lot of work in AI on how making predictions can be useful for different things. One thing is that the very act of planning involves making a prediction about what's going to happen and how it depends on your actions.

If I want to figure out what time I'm going to get to work, I might have some table of experience that I've remembered: if I leave at nine, I'll get there at nine thirty; if I leave at eight forty-five, I'll get there at nine fifteen. Or I could play it out in my head: all right, I'll go through this sequence of steps, and that'll sum to about thirty minutes.

So selecting actions is often informed by our predictions. Also, the act of trying to make predictions often causes you to learn about things you might otherwise not bother learning. If I'm trying to predict what will happen, I have to pay attention to all the features of my environment that carry information about what's going to happen: for instance, how much traffic there is today, or whether it's raining, or other factors that could affect the commute. This has been a big topic in machine learning: how models can learn to make predictions, and how making predictions can cause models to learn better representations of their environment.
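To make the commute example concrete, here is a toy Python sketch contrasting the two strategies described above: recalling a cached outcome from a table of experience versus rolling out a step-by-step model of the trip. All names and numbers are invented for illustration.

```python
# Toy contrast between "table of experience" planning and playing a plan
# out step by step. Every value here is invented for illustration.

# Model-free: a cached lookup of remembered departure -> arrival pairs.
cached_arrivals = {"9:00": "9:30", "8:45": "9:15"}
print(cached_arrivals["9:00"])           # just recall the stored outcome

# Model-based: sum the predicted duration of each leg of the trip.
legs_minutes = {"walk to car": 5, "drive": 20, "park and walk in": 5}
total = sum(legs_minutes.values())
print(f"model-based estimate: {total} minutes")   # -> 30 minutes
```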

One area of research I've worked in a lot is how we can think of the hippocampus as contributing to predictions about what's going to happen next. The hippocampus forms memories of what has happened and uses them to support predictions about what's going to happen. You can relate the past to your present in the same way that your present relates to your future.

So memories can be used to forecast what's going to happen. We've used a couple of different ideas from machine learning. In particular, we used an idea called the successor representation: a way to represent your observations that incorporates information about what's going to happen next, and which captures some ways that cells in the hippocampus behave, how they change their firing patterns and activity patterns over time.
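For readers who want the math behind this: given a one-step transition matrix T and a discount factor gamma, the successor representation M is the expected discounted future occupancy of every state from every state, M = (I - gamma*T)^(-1). Here is a minimal NumPy sketch; the corridor environment is invented for illustration.

```python
import numpy as np

n_states = 5
T = np.zeros((n_states, n_states))   # one-step transition probabilities
for s in range(n_states - 1):
    T[s, s + 1] = 1.0                # a simple corridor: state s leads to s+1
T[-1, -1] = 1.0                      # the final state loops on itself

gamma = 0.9                          # discount factor
# Successor representation: M = I + gamma*T + gamma^2*T^2 + ... = (I - gamma*T)^-1
M = np.linalg.inv(np.eye(n_states) - gamma * T)

print(np.round(M, 2))                # row s: discounted future occupancy from s
```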

Richie Cotton: So I never really considered how hard it is just to figure out how long something is going to take. Essentially there are different approaches: you can either go from past experience, or you can try to reason about it from scratch.

This is actually kind of interesting, because particularly these large language models are really good at predicting what the next word should be, but they're rubbish at reasoning about things. Do you have a sense of why prediction is easier than reasoning in this case?

Kim Stachenfeld: I mean, one thing is that it's hard to say objectively whether they are rubbish at reasoning, or how their reasoning compares to their prediction ability, because we don't necessarily have a frame of reference. I think we're broadly impressed by their prediction abilities and maybe less impressed by their reasoning abilities.

But it's hard to evaluate this in a quantitative way. I have some colleagues who did some nice work on this. Andrew Lampinen had a lovely paper comparing reasoning failures in large language models to reasoning failures in humans, because humans make a lot of mistakes in reasoning too.

They basically showed that for reasoning problems that were more familiar, that were congruent with experience, language models tend to be better, and humans show the same effect. One example they used is a classic cognitive science kind of task.

You have decks of cards with pictures on them, and based on the picture, you have to make some judgment about what's on the other side of the card, which is a number. For instance, for one picture you have to decide whether the number is greater or less than 21, and for another image whether it's greater or less than 17. That's something people can learn and language models can figure out.

But they make a lot of mistakes; it's hard to keep all that information in memory. If you make the pictures a beer glass to represent the drinking age and a car to represent the driving age, people are much faster and make fewer mistakes. It's a much easier task, because we're used to making that kind of judgment.

They refer to this as amortized: you've already done some of the work of learning about that computational problem; you have practice making that judgment, and it's familiar to you. I think there's probably something similar in language models. They're trained statistically, to predict the most probable next word, and there are reasoning problems implicit in the corpus they're trained on, so they have some ability to reason. But because they're trained statistically, reasoning problems that are really far from their experience, which are the things we think of as canonical reasoning problems, are hard for them.

Problems that are only reasoning, not just memory, not just relating to prior experience: those are unfamiliar to them. This really gets at the challenge a lot of deep learning models have of generalizing out of distribution, of handling really unfamiliar data.

One thing about this that you may have talked about on the podcast is chain-of-thought reasoning, or chain-of-thought prompting. The famous example is that if you prompt a language model with "let's think this through step by step", it'll give you a longer and also more accurate answer.

There's some work from Noah Goodman's group hypothesizing that the reason for this is that it forces the model to break an unfamiliar problem down into familiar parts. Experience is local: you have experience with individual steps of reasoning, but not with this particular reasoning problem.

So when you convert a new problem into a set of relations between familiar parts, you can take something that's unfamiliar and recast it as something that's familiar and that you know how to deal with. I think the way humans are good at this is probably that we have experience with this reformulation process and can do it somewhat automatically.

And the extent to which we are good at it is also something psychologists are constantly trying to assess, and a good reference point when thinking about how machines do stuff.
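As a rough illustration of the prompting trick discussed above, here is a minimal Python sketch. The `generate` function is a hypothetical placeholder for whatever language-model call you use; only the prompt structure is the point.

```python
def generate(prompt: str) -> str:
    """Hypothetical placeholder for a language-model call."""
    return f"<model output for: {prompt!r}>"

question = (
    "A deck has 52 cards. If I deal 7 cards to each of 4 players, "
    "how many cards remain?"
)

# Direct prompt: the model must answer in one shot.
direct = generate(question)

# Chain-of-thought prompt: nudge the model to decompose the unfamiliar
# problem into familiar steps (7 * 4 = 28 dealt, then 52 - 28 = 24 left).
cot = generate(question + "\nLet's think this through step by step.")

print(direct)
print(cot)
```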

Richie Cotton: So this actually seems like good advice for people in general: if you don't know how to solve a problem, break it down into smaller problems that you can solve.

Okay, so you said that familiarity makes things easier to reason about. Are there any other ideas from neuroscience that you think might be useful to apply to AI, to help AI reason better?

Kim Stachenfeld: Familiarity makes things easier to reason about especially if you are a neural network, because neural networks are so expressive. They're capable of learning such a broad array of functions that what they're going to do on data they've never seen before is kind of unconstrained: they could learn to do something different on that data if you wanted them to. Their expressiveness is deeply tied up with their ability to learn complicated things, and it's also why we don't know what they're going to do on unfamiliar data.

So recasting things as familiar is particularly important if you want a statistical learning system like a neural network to do a good job on reasoning problems. And a thing I think about a lot, from the perspective of augmenting the current state-of-the-art large models, is the role of memory and how it's different in humans.

The way we think about the role of episodic memory, of rapid memory contributing to learning in the brain, is actually quite similar to how we think about retrieval-augmented generation, RAG, in these language models: if you give them access to a memory buffer into which you can put entire new experiences that they can then draw on and add to their context, that's in a lot of ways very similar to how we think about learning in the brain. And I think both neuroscience and AI could find that an interesting and fruitful analogy.
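As a rough sketch of that memory-buffer analogy, here is a minimal retrieval-augmented generation loop in Python. The `embed` and `generate` functions are toy stand-ins (a real system would call an embedding model and a language model, and use a vector database rather than brute-force similarity), but the structure is the same: retrieve the most relevant stored experiences and add them to the context.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy embedding: normalized counts of each lowercase letter."""
    v = np.zeros(26)
    for ch in text.lower():
        if "a" <= ch <= "z":
            v[ord(ch) - ord("a")] += 1
    return v / (np.linalg.norm(v) + 1e-9)

def generate(prompt: str) -> str:
    """Hypothetical placeholder for a language-model call."""
    return f"<model output for: {prompt!r}>"

# The "memory buffer": whole passages the model can draw on at answer time.
documents = [
    "The hippocampus supports rapid episodic memory.",
    "Reinforcement learning adjusts behavior gradually from reward.",
]
doc_vecs = np.stack([embed(d) for d in documents])

def answer(query: str, k: int = 1) -> str:
    q = embed(query)
    sims = doc_vecs @ q                   # cosine similarity (unit vectors)
    top = np.argsort(sims)[-k:][::-1]     # indices of the k best passages
    context = "\n".join(documents[i] for i in top)
    # Retrieved passages are prepended to the model's context, analogous
    # to episodic recall feeding into ongoing processing.
    return generate(f"Context:\n{context}\n\nQuestion: {query}")

print(answer("How does the brain form quick memories?"))
```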

A big thing, I think, is the process itself. One thing I really like about Andrew and Ishita's paper comparing language-model reasoning to human reasoning is that it demonstrates a lesson for thinking about the relationship between AI and neuroscience, or AI and psychology: it's really hard to evaluate how good a model, or a brain, is at doing stuff.

The entire field of psychology has spent a lot of time trying to tease apart how the brain solves different problems. It's genuinely difficult to quantify, and to say systematic things about, how an intelligent system is working. So using the brain as a reference point is a really useful way of calibrating expectations.

Should we say this is good? Should we say this is bad? Using humans as a reference point is really helpful for calibrating our expectations. And the processes by which we tease apart how the brain works, which is the job of neuroscientists, or how behavior works, which is the job of psychologists, have a lot of complexity to them. I think there will eventually be a lot of interesting reuse of those methods.

Richie Cotton: You mentioned that it's difficult to quantify how a brain does all this learning and memory. Should we be using different metrics for AI, then? Should AI have a completely different evaluation system? How do you think about that?

Kim Stachenfeld: A lot of what goes on is that people play around with language models and then post something on Twitter, which is a single example; then people read into it and pass it around, and then somebody tries prompting in a different way and the effect disappears.

This approach has pros and cons, honestly. There's something clearly anecdotal and therefore not super scientific about it. However, it's almost like field research: lots of people playing around, trying different things, and finding things, and those findings constitute good hypotheses for what kinds of things we should look for.

The mistake, perhaps, is confusing single examples with conclusions rather than using them to inspire hypotheses. You hear a lot about the scientific method, which starts with a hypothesis and then constructs an experiment to evaluate it.

Where those hypotheses come from is a bit of a fuzzy story. People trying different stuff, people having experiences in their lives, people just thinking random thoughts: historically that's a big part of it, I think. And some hypotheses just strike people as interesting questions and some don't.

I think this process is really good for generating hypotheses. Actually evaluating them is tricky. A lot of work goes into this, and I don't want to say people are doing it wrong. People are trying really hard to come up with evaluation systems to quantify what these models are good at: whether they're good at reasoning, at math, at critical reading comprehension. They're using tests that we use on humans, which is very much building with a psychological approach. And then when the models pass those tests, people start to ask where the tests fall short and try to make new ones as a result.

So in some sense I don't think anybody is doing anything fundamentally wrong; it's just a hard process. I think that's one of the lessons: it's really hard, and it requires a lot of investment, to come up with good evaluations of how these systems work, and to really calibrate expectations and test hard hypotheses about how models are working.

Richie Cotton: I like that one of the places hypotheses come from is just having random thoughts. So maybe if an idea just popped into your head, write it down; it could be something interesting.

Kim Stachenfeld: Yeah. One thing I found really vexing in graduate school, but now find kind of fascinating, is that the scientific method starts with a hypothesis, and no one tells you where to get a good hypothesis. It's almost an aesthetic question: what is a good research question?

Sometimes there are obvious examples of things that are interesting. But it's really hard. The kind of question that motivates me is something like: how does memory work in the brain? That's not a hypothesis; that's a topic, an area of interest. A specific hypothesis is something like: oh, it's using, I don't know, k-nearest-neighbors lookup, some specific algorithm. Why is that the right one to start with? Is there evidence that it's even in the vicinity?

What if it's just something entirely different, and this question doesn't really get you leverage? It's very concerning. And one of the things I like about this AI-inspired neuroscience is that it gives you a nice batch of hypotheses: algorithms that work well can comprise a library of hypotheses that we can evaluate against data from the brain.

At least aesthetically, that's an approach that appeals to me, although maybe it doesn't come down to anything fundamentally easier to justify than that.

Richie Cotton: So systematically coming up with good hypotheses is maybe an unsolved problem at the moment; a future area of research.

Speaking of research, I know one of the things you work on is associative memory. Can you tell me a bit more about what that is?

Kim Stachenfeld: A topic I have worked on a lot is how relationships are learned and represented. This is a framework that has been around in psychology and neuroscience for a while.

The idea is that humans and animals learn relationships between things, and we can gain a lot of foothold on how learning and reasoning work in the brain by understanding how the brain implements this process of learning about relationships. "Relationship" is a pretty broad term.

You could think about relationships between people; I guess that's the most colloquial usage. You can think about spatial relationships, or semantic relationships, like: da Vinci's painting the Mona Lisa is in the Louvre. Each of those represents relations between different entities that live somewhere in your mind.

Knowledge graphs are a data structure generally used to represent these kinds of associations. And there's been work hypothesizing and testing the idea that humans and animals learn associations, and that this kind of associative learning is a very flexible and powerful basis of memory.

We can think of the hippocampus forming new memories as building associations between the entities that were experienced together. Part of the power of this idea is that it gives you a way to go from specific experiences to more general concepts. For instance, if I learn that da Vinci painted the Mona Lisa, and that the Mona Lisa is in the Louvre, I have implicitly learned that da Vinci has a painting in the Louvre. The ability to learn these little things and chain them together into broader inferences is a really important concept in how we learn more complex structures than we started out with, how we bootstrap what we've learned to make broader inferences.
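Here is a toy sketch of that chaining, treating the Mona Lisa example as a tiny knowledge graph of (subject, relation, object) triples. The triples and helper function are invented for illustration.

```python
# Two learned associations, stored as (subject, relation, object) triples.
triples = [
    ("da Vinci", "painted", "Mona Lisa"),
    ("Mona Lisa", "is in", "Louvre"),
]

def objects_of(subject: str, relation: str) -> list[str]:
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]

# Chain the two specific associations into an implicit, more general fact.
for painting in objects_of("da Vinci", "painted"):
    for museum in objects_of(painting, "is in"):
        print(f"da Vinci has a painting ({painting}) in the {museum}")
# -> da Vinci has a painting (Mona Lisa) in the Louvre
```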

A particular example I've worked on a lot is spatial relations. There's been a lot of work on how space, maps, and navigation are represented in the brain, and on how other kinds of associations, ones that more generally relate two things you can experience in time, fit in. I think some of the work I've done can be summarized as treating spatial navigation as part of the more general problem of how we learn statistical structure about how events are joined in time.

Space is one thing that constrains that: if I enter my building, I can now access all the other states in the building; they're all states I can predict as coming up in the future. But other things scaffold this too. If I start a movie that's a rom-com, I can predict that at the end the characters are going to end up together.

Being able to think about all of these in the same framework lets us ask: how can we take computational models that have been used to understand how a rat solves a maze, and use them to think about how episodic recall works more generally, and how memory systems can support planning and reasoning more generally?

Richie Cotton: So this seems to come back to that idea of the memory palace you mentioned earlier, where you're associating a specific memory with a physical location. Okay, so you mentioned that spatial associations and temporal associations seem to be similar things. Are there differences between them, or do we reason about them in the same way in general?

Kim Stachenfeld: There are different hypotheses about it, I should say. The work I've done hypothesizes that these are really part of the same process. You can think of spatial environments as constraining temporal relationships: you can't teleport, so things nearby in space are going to be related in time as well.

You can also think of it the other way around, and that's really more like the modeling framework we've used: if you have a system that's trying to learn temporal relationships between things, one thing it will automatically learn is that nearby places in space are related, because they tend to appear nearby in time.

There have been some experiments that try to decouple this. For instance, there was an experiment from Christian Doeller's group where they put people in VR, had them navigate around, and then added teleportation, which decouples space and time. They found that this had an effect on representations in the hippocampus in humans.

So that's one of the things you can capture by modeling it this way: if you're learning about temporal relationships generally, you get spatial relationships for free, because space constrains how different states will be related in time. But you can also get other structure too. And because the hippocampus is thought of as a general memory system that has a special relationship with space, but is also responsible for learning experiences that have nothing to do with space, adopting this more general sequence formulation seemed compelling.

I think this is a big field now, thinking about the relationship between statistical learning and spatial structure. We weren't the first to do work in that space, but we had a particular computational model that really leveraged this idea. And I think the counter-hypothesis is that these areas are more specialized for spatial navigation, and you get some temporal structure for free because space constrains navigation; that it's really more specialized, rather than all coming out of the same general-purpose algorithm.

Richie Cotton: You mentioned the idea of combining statistical learning with associative memories. This seems a little like the idea in AI where you combine a large language model, which does prediction very well statistically, with a graph database or some kind of knowledge graph to handle the reasoning. Is this a similar sort of idea?

Kim Stachenfeld: It gets really similar, yeah. I think that's a really good example. I alluded to retrieval-augmented generation at one point, where you have a predictive system that's augmented by specific passages or pieces of evidence you recall from a database and use to inform your prediction.

I think this is a really similar idea. The particular structure you're retrieving from might have graph structure, information about how things relate to each other, that can inform the retrieval process. So yes, it's quite a similar concept, in that it combines ideas about understanding how entities are related to each other with using those relationships to make predictions about what's going to happen next.

Richie Cotton: All right, brilliant. I love that all the components of AI technology are kind of mimicking bits of how humans reason and learn. So we're gradually getting all the components together.

Kim Stachenfeld: Yeah, although I'm probably biased by having a very neurocentric view; that's certainly my dominant scaffold for thinking about things. I'm sure there are differences too, but there's certainly some basis for interesting and compelling metaphors.

Richie Cotton: In that case, are there any bits that we're missing? What do we have in human learning that we don't have on the AI side yet?

Kim Stachenfeld: I guess these aren't things that are fundamentally off people's radar, but they're still pretty active research areas with a lot of open questions. One is the problem of generalizing outside the data you've seen before. This is related to reasoning.

It's also related to innovation and creativity. Can you take knowledge you've learned and generate something new? Can you make a novel insight? Can you propose an experiment, or tie things together that haven't been tied together before? Can you generalize, even just to tell a longer story than any you've seen before? This ability to go beyond what you've already seen, to take the parts you've learned and compose them into something new, whether a novel conclusion or just a novel hypothesis about how the world works, is certainly still an open problem. There's been a lot of work on generalizing to longer contexts, which is sort of the metaphorical equivalent of telling a story longer than the ones you've seen before, or recalling something from longer ago than anything in your training data. These kinds of problems are really starting to touch on these questions.

I've done a lot of work with graph neural networks, which are neural network systems that implement something like relational reasoning. They're neural networks that learn relations between things, and they're able to generalize to novel or larger relations than they've seen before, exactly because they're forced to break problems into parts; that's hard-coded into them.
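For a sense of what "breaking problems into parts" looks like mechanically, here is a minimal message-passing sketch in plain NumPy: each round, every node updates its state from an aggregate of its neighbors' states, using one shared update function. This is the generic structure of message passing, not any specific library's API; the graph, sizes, and weights are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 4, 8
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # adjacency for a 4-node chain
H = rng.normal(size=(n_nodes, dim))          # initial node features
W = rng.normal(size=(dim, dim))              # shared update weights

for _ in range(3):                           # three rounds of message passing
    messages = A @ H                         # each node sums its neighbors
    H = np.tanh(messages @ W)                # shared per-node update

print(H.shape)                               # node embeddings after 3 rounds
```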

So I think of that as a key ability to have in our most exciting, state-of-the-art AI systems. Another thing I think about a lot, partly because my research focuses on the hippocampus, is the role of retrieval, and retrieval-augmented generation.

I'm super excited about that area in general. I think it has a big relationship to extending context. If you extend the context of a language model, you can put more information into it. Similarly, if you augment the model with a retrieval database, the language model can refer to things in that database that it hasn't necessarily seen before.

This seems to have enormous potential to expand the computational abilities of these models. There's been tons of work in the last year on making models that are really, really big and trained on tons of data. The ability to update these in flexible ways matters: if you can feed information into them in a rapid, immediate way, use them for customized applications, update them when the news changes, and still have them use that information flexibly, that would make really good use of the big, expensive models we've made, and also enable tons of different applications. It's also a huge parallel to how we think memory in the brain works.

So from the perspective of a nice batch of models to constitute hypotheses about memory, I'm really excited about that line of work.

Richie Cotton: So the idea of customizability seems incredibly important. No one wants to have to build Gemini from scratch; you just want to be able to customize it for your own use case.

Kim Stachenfeld: Yeah, I think it's really big. It's also just really important for factuality. Oftentimes a model will tell you something that's statistically likely: if you ask it a question it doesn't really know the answer to, it gives you something statistically plausible, and that thing is maybe true or maybe not. The models are getting better at being truthful, but they're not always going to be.

A method for doing this better is to go into a retrieval database that has original sources. You can really control what's in there, control what you're conditioning on, and retrieve specific factual information that you have some ability to verify, or whose authenticity you can think about, and use that to inform responses. So another really important utility here is that you're not just confabulating; you can base responses on something you can evaluate.

Richie Cotton: So much exciting stuff going on. You've mentioned some things you're excited about already, like retrieval. What's the number one thing you're most excited about for the future of AI and neuroscience?

Kim Stachenfeld: Yeah, I really like the memory stuff. But I will say one thing: one of my colleagues, Kevin Miller, has used the phrase, just in conversation, "humanist AI-driven science": the idea that AI could be used to give us new insights and help us learn qualitatively new things about the scientific world.

I think this is a very exciting challenge for me. It's not something I'm excited about in the sense that it already exists and we know how to do it; I'm excited about it as a challenge. A lot of the advances in AI for science have been about contributing tools, like making a model that's really good at predicting how proteins will fold.

That doesn't necessarily tell you why they fold that way. It's a tool to help scientists do something else with biology: now that they know this, they can use it to solve other problems, or try to unpack the reasons why a protein folds a particular way.

But I really like the challenge of: let's not just have a model that's good at predicting. Let's have a model that's good at predicting and can also tell us something we would consider an insight. That's challenging for many reasons, not least because we don't have a very formal operationalization of what constitutes an insight.

But we kind of know it when we see it. I really like Kevin's formulation of this; I think it's a nice way of thinking about what the next step of using AI for science will be.

Richie Cotton: Okay. Certainly in the world of data, that last mile, going from predictions to having some kind of insight that you can actually act on and make a decision about, is always the hardest bit. So having AI help with that, especially in the world of science, seems incredibly important.

Kim Stachenfeld: Yeah, it relates to this question of what constitutes a good hypothesis versus just any hypothesis that you can feed into the scientific method. A hypothesis is anything that makes a testable prediction.

You could have the wonkiest, weirdest model. You could say: this random Rube Goldberg machine, I hypothesize, is going to do the same thing as the brain. That's an easy hypothesis to reject, and most people would probably consider it a boring hypothesis to make.

But making predictive models that make testable predictions, yet don't decompose the model space into interpretable, understandable, or semantic divisions, just isn't usually what we think of as producing insight. It doesn't satisfy what folks are often looking for. So I think being able to do that extra step will be something very cool.

Richie Cotton: Absolutely. Certainly exciting times. Thank you for coming on the show.

Kim Stachenfeld: Yeah, thank you so much for having me. It was really lovely chatting.
