Michael Littman: Reinforcement Learning and the Future of AI | Lex Fridman Podcast #144
The following is a conversation with Michael Littman, a computer science professor at Brown
University doing research on and teaching machine learning, reinforcement learning,
and artificial intelligence. He enjoys being silly and lighthearted in conversation,
so this was definitely a fun one. Quick mention of each sponsor,
followed by some thoughts related to the episode. Thank you to SimpliSafe, a home security company
I use to monitor and protect my apartment, ExpressVPN, the VPN I've used for many years
to protect my privacy on the internet, MasterClass, online courses that I enjoy from
some of the most amazing humans in history, and BetterHelp, online therapy with a licensed
professional. Please check out these sponsors in the description to get a discount and to support
this podcast. As a side note, let me say that I may experiment with doing some solo episodes
in the coming month or two. The three ideas I have floating in my head currently are to use, one,
a particular moment in history, two, a particular movie, or three, a book to drive a conversation
about a set of related concepts. For example, I could use 2001: A Space Odyssey or Ex Machina
to talk about AGI for one, two, three hours. Or I could do an episode on the, yes, rise and fall of
Hitler and Stalin, each in a separate episode, using relevant books and historical moments
for reference. I find the format of a solo episode very uncomfortable and challenging,
but that just tells me that it's something I definitely need to do and learn from the experience.
Of course, I hope you come along for the ride. Also, since we have all this momentum built up
on announcements, I'm giving a few lectures on machine learning at MIT this January.
In general, if you have ideas for the episodes, for the lectures, or for just short videos on
YouTube, let me know in the comments that I still definitely read, despite my better judgment,
and the wise sage advice of the great Joe Rogan. If you enjoy this thing, subscribe on YouTube,
review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect
with me on Twitter at Lex Fridman. And now, here's my conversation with Michael Littman.
I saw a video of you talking to Charles Isbell about Westworld, the TV series. You guys were
doing the kind of thing where you're watching new things together, but let's rewind back.
Is there a sci-fi movie, book, or show that was profound, that had an impact on you philosophically,
or just specifically something you enjoyed nerding out about?
Yeah, interesting. I think a lot of us have been inspired by robots in movies. One that I really
like is, there's a movie called Robot and Frank, which I think is really interesting because it's
very near term future, where robots are being deployed as helpers in people's homes. And we
don't know how to make robots like that at this point, but it seemed very plausible. It seemed
very realistic or imaginable. And I thought that was really cool because they're awkward,
they do funny things that raise some interesting issues, but it seemed like something that would
ultimately be helpful and good if we could do it right.
Yeah, he was an older cranky gentleman, right?
He was an older cranky jewel thief, yeah.
It's kind of a funny little thing, which is, you know, he's a jewel thief and so he pulls the
robot into his life, which is something you could imagine: taking a home robotics
thing and pulling into whatever quirky thing that's involved in your existence.
It's meaningful to you. Exactly so. Yeah. And I think from that perspective, I mean,
not all of us are jewel thieves. And so when we bring our robots into our lives, it explains a
lot about this apartment, actually. But no, the idea that people should have the ability to make
this technology their own, that it becomes part of their lives. And I think it's hard for us
as technologists to make that kind of technology. It's easier to mold people into what we need them
to be. And just that opposite vision, I think, is really inspiring. And then there's an
anthropomorphization where we project certain things on them, because I think the robot was
kind of dumb. But I have a bunch of Roombas I play with and you immediately project stuff onto
them. Much greater level of intelligence. We'll probably do that with each other too. Much greater
degree of compassion. That's right. One of the things we're learning from AI is where we are
smart and where we are not smart. Yeah. You also enjoy, as people can see, and I enjoyed
myself watching you sing and even dance a little bit, a little bit, a little bit of dancing.
A little bit of dancing. That's not quite my thing. As a method of education or just in life,
you know, in general. So easy question. What's the definitive, objectively speaking,
top three songs of all time? Maybe something that, you know, to walk that back a little bit,
maybe something that others might be surprised by the three songs that you kind of enjoy.
That is a great question that I cannot answer. But instead, let me tell you a story.
So pick a question you do want to answer. That's right. I've been watching the
presidential debates and vice presidential debates. And it turns out, yeah, it's really,
you can just answer any question you want. So it's a related question. Well said.
I really like pop music. I've enjoyed pop music ever since I was very young. So 60s music,
70s music, 80s music. This is all awesome. And then I had kids and I think I stopped listening
to music and I was starting to realize that my musical taste had sort of frozen out.
And so I decided in 2011, I think, to start listening to the top 10 Billboard songs each week.
So I'd be on the treadmill and I would listen to that week's top 10 songs
so I could find out what was popular now. And what I discovered is that I have no musical
taste whatsoever. I like what I'm familiar with. And so the first time I'd hear a song
is the first week that was on the charts, I'd be like, and then the second week,
I was into it a little bit. And the third week, I was loving it. And by the fourth week, it's like,
just part of me. And so I'm afraid that I can't tell you my favorite song of all time,
because it's whatever I heard most recently. Yeah, that's interesting. People have told me that
there's an art to listening to music as well. And you can start to, if you listen to a song,
just carefully, like explicitly, just force yourself to really listen. You start to,
I did this when I was part of jazz band and fusion band in college. You start to hear the layers
of the instruments. You start to hear the individual instruments and you start to,
you can listen to classical music or to orchestra this way. You can listen to jazz this way.
I mean, it's funny to imagine you now walking that forward to listening to pop hits now as like
a scholar, listening to like Cardi B or something like that, or Justin Timberlake. Is he? No,
not Timberlake, Bieber. They've both been in the top 10 since I've been listening.
They're still up there. Oh my God, I'm so cool.
If you haven't heard Justin Timberlake's top 10 in the last few years, there was one
song that he did where the music video was set at essentially NeurIPS.
Oh, wow. Oh, the one with the robotics. Yeah, yeah, yeah, yeah, yeah.
Yeah, yeah. It's like at an academic conference and he's doing a demo.
He was presenting, right?
It was sort of a cross between the Apple, like Steve Jobs kind of talk and NeurIPS.
So, you know, it's always fun when AI shows up in pop culture.
I wonder if he consulted somebody for that. That's really interesting. So maybe on that topic,
I've seen your celebrity in multiple dimensions, but one of them is you've done cameos in different
places. I've seen you in a TurboTax commercial as like, I guess, the brilliant Einstein character.
And the point is that TurboTax doesn't need somebody like you. It doesn't need a brilliant
Very few things need someone like me. But yes, they were specifically emphasizing the
idea that you don't need to be like a computer expert to be able to use their software.
How did you end up in that world?
I think it's an interesting story. So I was teaching my class. It was an intro computer
science class for non-concentrators, non-majors. And sometimes when people would visit campus,
they would check in to say, hey, we want to see what a class is like. Can we sit in on your class?
So a person came to my class who was the daughter of the brother of the husband of the best friend
of my wife. Anyway, basically a family friend came to campus to check out Brown and asked to
come to my class and came with her dad. Her dad is someone who I've known from various
kinds of family events and so forth, but he also does advertising. And he said that he was
recruiting scientists for this ad, this TurboTax set of ads. And he said, we wrote the ad with the
idea that we get like the most brilliant researchers, but they all said no. So can you
help us find like B-level scientists? And I'm like, sure, that's who I hang out with.
So that should be fine. So I put together a list and I did what some people call the Dick Cheney.
So I included myself on the list of possible candidates, with a little blurb about each one
and why I thought that would make sense for them to do it. And they reached out to a handful of
them, but then they ultimately, they YouTube stalked me a little bit and they thought,
oh, I think he could do this. And they said, okay, we're going to offer you the commercial.
I'm like, what? So it was such an interesting experience because they have another world, the
people who do like nationwide kind of ad campaigns and television shows and movies and so forth.
It's quite a remarkable system that they have going because they have a set. Yeah. So I went to,
it was just somebody's house that they rented in New Jersey. But in the commercial, it's just me
and this other woman. In reality, there were 50 people in that room and another, I don't know,
half a dozen kind of spread out around the house in various ways. There were people whose job it
was to control the sun. They were in the backyard on ladders, putting filters up to try to make sure
that the sun didn't glare off the window in a way that would wreck the shot. So there was like
six people out there doing that. There was three people out there giving snacks, the craft table.
There was another three people giving healthy snacks because that was a separate craft table.
There was one person whose job it was to keep me from getting lost. And I think the reason for all
this is because so many people are in one place at one time. They have to be time efficient. They
have to get it done. In the morning, they were going to do my commercial. In the afternoon, they were
going to do a commercial of a mathematics professor from Princeton. They had to get it done. No wasted
time or energy. And so there's just a fleet of people all working as an organism. And it was
fascinating. I was just the whole time just looking around like, this is so neat. Like one person
whose job it was to take the camera off of the cameraman so that someone else whose job it was
to remove the film canister. Because every couple of takes, they had to replace the film because film
gets used up. It was just, I don't know. I was geeking out the whole time. It was so fun.
How many takes did it take? It looked the opposite. There was more than two people there. It was very
relaxed. Right. Yeah. The person who I was in the scene with is a professional. She's an improv
comedian from New York City. And when I got there, they had given me a script, such as it was. And
then I got there and they said, we're going to do this as improv. I'm like, I don't know how to
improv. I don't know what you're telling me to do here. Don't worry. She knows. I'm like, okay.
I'll go see how this goes. I guess I got pulled into the story because like, where the heck did
you come from? I guess in the scene. Like, how did you show up in this random person's house?
Yeah. Well, I mean, the reality of it is I stood outside in the blazing sun. There was someone
whose job it was to keep an umbrella over me because I started to sweat. And so I would wreck
the shot because my face was all shiny with sweat. So there was one person who would dab me off,
had an umbrella. But yeah, like the reality of it, like, why is this strange stalkery person hanging
around outside somebody's house? We're not sure when you have to look in,
what the ways for the book, but are you, so you make, you make, like you said, YouTube,
you make videos yourself, you make awesome parody, sort of parody songs that kind of focus on a
particular aspect of computer science. How much those seem really interesting to you?
How much those seem really natural? How much production value goes into that?
Do you also have a team of 50 people? The videos, almost all the videos,
except for the ones that people would have actually seen, are just me. I write the lyrics,
I sing the song. I generally find, like, a backing track online because, like you, I
can't really play an instrument. And then I do, in some cases I'll do visuals using just like
PowerPoint. Lots and lots of PowerPoint to make it sort of like an animation.
The most produced one is the one that people might have seen, which is the overfitting video
that I did with Charles Isbell. And that was produced by the Georgia Tech and Udacity people
because we were doing a class together. It was kind of, I usually do parody songs kind of to
cap off a class at the end of a class. So that one you're wearing, so it was just
Thriller. You're wearing the Michael Jackson, the red leather jacket. The interesting thing
with podcasting, which you're also into, that I really enjoy is that there's not a team of people.
It's kind of more, because you know, there's something that happens when there's more people
involved than just one person that just the way you start acting, I don't know. There's a censorship.
You're not given, especially for like slow thinkers like me, you're not. And I think most of us are,
if we're trying to actually think we're a little bit slow and careful, it kind of large teams get
in the way of that. And I don't know what to do with that. Like that's the, to me, like if,
yeah, it's very popular to criticize quote unquote mainstream media.
But there is legitimacy to criticizing them the same. I love listening to NPR, for example,
but every, it's clear that there's a team behind it. There's a commercial,
there's constant commercial breaks. There's this kind of like rush of like,
okay, I have to interrupt you now because we have to go to commercial. Just this whole,
it creates, it destroys the possibility of nuanced conversation. Yeah, exactly. Evian,
which Charles Isbell, who I talked to yesterday told me that Evian is naive backwards, which
the fact that his mind thinks this way is quite brilliant. Anyway, there's a freedom to this
podcast. He's Dr. Awkward, which by the way, is a palindrome. That's a palindrome that I happen to
know from other parts of my life. And I just, well, you know, use it against Charles. Dr. Awkward.
So what was the most challenging parody song to make? Was it the Thriller one?
No, that one was really fun. I wrote the lyrics really quickly and then I gave it over to the
production team. They recruited an a cappella group to sing. That went really smoothly. It's great
having a team because then you can just focus on the part that you really love, which in my case
is writing the lyrics. For me, the most challenging one, not challenging in a bad way, but challenging
in a really fun way, was one of the parody songs I did, which is about the halting problem in
computer science. The fact that you can't create a program that can tell for any other arbitrary
program whether it's actually going to get stuck in an infinite loop or whether it's going to eventually
stop. And so I did it to an 80s song because I hadn't started my new thing of learning current
songs. And it was Billy Joel's The Piano Man. Nice. Which is a great song. Sing me a song.
You're the piano man. Yeah. So the lyrics are great because first of all, it rhymes. Not all
songs rhyme. I've done Rolling Stones songs which turn out to have no rhyme scheme whatsoever. They're
just sort of yelling and having a good time, which makes it not fun from a parody perspective because
like you can say anything. But the lines rhymed and there was a lot of internal rhymes as well.
And so figuring out how to sing with internal rhymes, a proof of the halting problem was really
challenging. And I really enjoyed that process. What about, last question on this topic, what
about the dancing in the Thriller video? How many takes did that take? So I wasn't planning to dance.
They had me in the studio and they gave me the jacket and it's like, well, you can't,
if you have the jacket and the glove, like there's not much you can do. Yeah. So I think I just
danced around and then they said, why don't you dance a little bit? There was a scene with me
and Charles dancing together. They did not use it in the video, but we recorded it. Yeah. Yeah. No,
it was pretty funny. And Charles, who has this beautiful, wonderful voice doesn't really sing.
He's not really a singer. And so that was why I designed the song with him doing a spoken section
and me doing the singing. It's very like Barry White. Yeah. Smooth baritone. Yeah. Yeah. It's
great. That was awesome. So one of the other things Charles said is that, you know, everyone
knows you as like a super nice guy, super passionate about teaching and so on. What he said,
don't know if it's true, that despite the fact that you're, you are. Okay. I will admit this
finally for the first time. That was, that was me. It's the Johnny Cash song. Killed a man in Reno just
to watch him die. That you actually do have some strong opinions on some topics. So if this in fact
is true, what strong opinions would you say you have? Are there ideas you think maybe in artificial
intelligence and machine learning, maybe in life that you believe are true that others might,
you know, some number of people might disagree with you on? So I try very hard to see things
from multiple perspectives. There's this great Calvin and Hobbes cartoon where, do you know?
Yeah. Okay. So Calvin's dad is always kind of a bit of a foil and he talked Calvin into,
Calvin had done something wrong. The dad talks him into like seeing it from another perspective
and Calvin, like this breaks Calvin because he's like, oh my gosh, now I can see the opposite sides
of things. And so it becomes like a Cubist cartoon where there is no front and back.
Everything's just exposed and it really freaks him out. And finally he settles back down. It's
like, oh good. No, I can make that go away. But like, I'm that, I live in that world where
I'm trying to see everything from every perspective all the time. So there are some things that I've
formed opinions about that it would be harder, I think, to disabuse me of. One is the super
intelligence argument and the existential threat of AI is one where I feel pretty confident in my
feeling about that one. Like I'm willing to hear other arguments, but like, I am not particularly
moved by the idea that if we're not careful, we will accidentally create a super intelligence
that will destroy human life. Let's talk about that. Let's get you in trouble and record your
video. It's like Bill Gates, I think he said like some quote about the internet that that's just
going to be a small thing. It's not going to really go anywhere. And then I think Steve
Ballmer said, I don't know why I'm sticking on Microsoft. That's something that like smartphones
are useless. There's no reason why Microsoft should get into smartphones, that kind of.
So let's get, let's talk about AGI. As AGI is destroying the world, we'll look back at this
video and see. No, I think it's really interesting to actually talk about because nobody really
knows the future. So you have to use your best intuition. It's very difficult to predict it,
but you have spoken about AGI and the existential risks around it and sort of basing your intuition
that we're quite far away from that being a serious concern relative to the other concerns
we have. Can you maybe unpack that a little bit? Yeah, sure, sure, sure. So as I understand it,
that for example, I read Bostrom's book and a bunch of other reading material about this sort
of general way of thinking about the world. And I think the story goes something like this, that we
will at some point create computers that are smart enough that they can help design the next version
of themselves, which itself will be smarter than the previous version of themselves and eventually
bootstrap up to being smarter than us. At which point we are essentially at the mercy of this sort
of more powerful intellect, which in principle we don't have any control over what its goals are.
And so if its goals are at all out of sync with our goals, for example, the continued existence
of humanity, we won't be able to stop it. It'll be way more powerful than us and we will be toast.
So there's some, I don't know, very smart people who have signed on to that story. And it's a
compelling story. Now I can really get myself in trouble. I once wrote an op-ed about this,
specifically responding to some quotes from Elon Musk, who has been on this very podcast
more than once. AI summoning the demon. But then he came to Providence, Rhode Island,
which is where I live, and said to the governors of all the states, you know, you're worried about
entirely the wrong thing. You need to be worried about AI. You need to be very, very worried about
AI. And journalists kind of reacted to that and they wanted to get people's take. And I was like,
OK, my belief is that one of the things that makes Elon Musk so successful and so remarkable
as an individual is that he believes in the power of ideas. He believes that,
you know, if you have a really good idea for getting into space, you can get into space.
If you have a really good idea for a company or for how to change the way that people drive,
you just have to do it and it can happen. It's really natural to apply that same idea to AI.
You see these systems that are doing some pretty remarkable computational tricks, demonstrations,
and then to take that idea and just push it all the way to the limit and think, OK, where does
this go? Where is this going to take us next? And if you're a deep believer in the power of ideas,
then it's really natural to believe that those ideas could be taken to the extreme and kill us.
So I think, you know, his strength is also his undoing, because that doesn't mean it's true.
Like, it doesn't mean that that has to happen, but it's natural for him to think that.
So another way to phrase the way he thinks, and I find it very difficult to argue with that line
of thinking. So Sam Harris is another person from a neuroscience perspective that thinks like that
is saying, well, is there something fundamental in the physics of the universe that prevents this
from eventually happening? And Nick Bostrom thinks in the same way, that kind of zooming out, yeah,
OK, we humans now are existing in this like time scale of minutes and days. And so our intuition
is in this time scale of minutes, hours and days. But if you look at the span of human history,
is there any reason you can't see this in 100 years? And like, is there something fundamental
about the laws of physics that prevents this? And if there isn't, then it eventually will happen
or we will destroy ourselves in some other way. And it's very difficult, I find,
to actually argue against that. Yeah, me too.
And not sound like. Not sound like you're just like rolling your eyes like I have like science
fiction, we don't have to think about it, but even worse than that, which is like, I don't have
kids, but like I got to pick up my kids now like this. OK, I see there's more pressing short. Yeah,
there's more pressing short term things that like stop over the next national crisis. We have much,
much shorter things like now, especially this year, there's COVID. So like any kind of discussion
like that is like there's this, you know, this pressing things today is. And then so the Sam
Harris argument, well, like any day the exponential singularity can occur is very difficult to
argue against. I mean, I don't know. But part of his story is also he's not going to put a date on
it. It could be in a thousand years, it could be in a hundred years, it could be in two years. It's
just that as long as we keep making this kind of progress, it ultimately has to become a concern.
I kind of am on board with that. But the thing that the piece that I feel like is missing from
that way of extrapolating from the moment that we're in, is that I believe that in the
process of actually developing technology that can really get around in the world and really process
and do things in the world in a sophisticated way, we're going to learn a lot about what that means,
which we don't know now because we don't know how to do this right now.
If you believe that you can just turn on a deep learning network and eventually give it enough
compute and eventually get there. Well, sure, that seems really scary because we won't be
in the loop at all. We won't be helping to design or target these kinds of systems.
But I don't see that. That feels like it is against the laws of physics,
because these systems need help. Right. They need to surpass the difficulty,
the wall of complexity that happens in arranging something in the form that will happen.
Yeah, like I believe in evolution, like I believe that there's an argument. Right. So
there's another argument, just to look at it from a different perspective, that people say,
well, I don't believe in evolution. How could evolution? It's sort of like a random set of
parts assembling themselves into a 747. And that could just never happen. So it's like,
OK, that's maybe hard to argue against. But clearly, 747s do get assembled. They get assembled
by us. Basically, the idea being that there's a process by which we will get to the point of
making technology that has that kind of awareness. And in that process, we're going to learn a lot
about that process and we'll have more ability to control it or to shape it or to build it in our
own image. It's not something that is going to spring into existence like that 747. And we're
just going to have to contend with it completely unprepared. That's very possible that in the
context of the long arc of human history, it will, in fact, spring into existence.
But that springing might take like if you look at nuclear weapons, like even 20 years is a springing
in the context of human history. And it's very possible, just like with nuclear weapons,
that we could have I don't know what percentage you want to put at it, but the possibility could
have knocked ourselves out. Yeah. The possibility of human beings destroying themselves in the 20th
century with nuclear weapons. I don't know. You can if you really think through it, you could
really put it close to, like, I don't know, 30, 40 percent, given like the certain moments of
crisis that happen. So, like, I think one, like, fear in the shadows that's not being acknowledged
is it's not so much that the A.I. will run away, it's that as it's running away,
we won't have enough time to think through how to stop it. Right. Fast takeoff or FOOM. Yeah.
I mean, my much bigger concern, I wonder what you think about it, which is
we won't know it's happening. So I kind of think that there's an A.G.I. situation already happening
with social media that our minds, our collective intelligence of human civilization is already
being controlled by an algorithm. And like we're already super like the level of a collective
intelligence, thanks to Wikipedia, people should donate to Wikipedia to feed the A.G.I.
Man, if we had a super intelligence that was in line with Wikipedia's values,
that would be a lot better than a lot of other things I could imagine. I trust Wikipedia more than I
trust Facebook or YouTube as far as trying to do the right thing from a rational perspective.
Yeah. Now, that's not where you were going. I understand that. But it does strike me that
there's sort of smarter and less smart ways of exposing ourselves to each other on the Internet.
Yeah. The interesting thing is that Wikipedia and social media have very different forces.
You're right. I mean, Wikipedia, if A.G.I. was Wikipedia, it'd be just like this cranky, overly
competent editor of articles. You know, there's something to that. But the social
media aspect is not. So the vision of A.G.I. is as a separate system that's super intelligent.
That's super intelligent. That's one key little thing. I mean, there's the paperclip argument
that's super dumb, but super powerful systems. But with social media, you have a relatively like
algorithms we may talk about today, very simple algorithms that when something Charles talks a
lot about, which is interactive A.I., when they start like having at scale, like tiny little
interactions with human beings, they can start controlling these human beings. So a single
algorithm can control the minds of human beings slowly to what we might not realize. It could
start wars. It could start. It could change the way we think about things. It feels like
link |
in the long arc of history, if I were to sort of zoom out from all the outrage and all the tension
link |
on social media, that it's progressing us towards better and better things. It feels like chaos and
link |
toxic and all that kind of stuff. It's chaos and toxic. Yeah. But it feels like actually
link |
the chaos and toxic is similar to the kind of debates we had from the founding of this country.
link |
You know, there was a civil war that happened over that period. And ultimately it was all about
link |
this tension of like something doesn't feel right about our implementation of the core values we
link |
hold as human beings. And they're constantly struggling with this. And that results in people
link |
calling each other, just being shady to each other on Twitter. But ultimately the algorithm is
link |
managing all that. And it feels like there's a possible future in which that algorithm
link |
controls us into the direction of self destruction and whatever that looks like.
link |
Yeah. So, all right. I do believe in the power of social media to screw us up royally. I do believe
link |
in the power of social media to benefit us too. I do think that we're in a, yeah, it's sort of
link |
almost got dropped on top of us. And now we're trying to, as a culture, figure out how to cope
link |
with it. There's a sense in which, I don't know, there's some arguments that say that, for example,
link |
I guess college age students now, late college age students now, people who were in middle school
link |
when social media started to really take off, may be really damaged. Like this may have really hurt
link |
their development in a way that we don't have all the implications of quite yet. That's the generation
link |
who, and I hate to make it somebody else's responsibility, but like they're the ones who
link |
can fix it. They're the ones who can figure out how do we keep the good of this kind of technology
link |
without letting it eat us alive. And if they're successful, we move on to the next phase, the next
link |
level of the game. If they're not successful, then yeah, then we're going to wreck each other. We're
link |
going to destroy society. So you're going to, in your old age, sit on a porch and watch the world
link |
burn because of the TikTok generation that... I believe, well, so this is my kid's age,
link |
right? And that's certainly my daughter's age. And she's very tapped in to social stuff, but she's
link |
also, she's trying to find that balance, right? Of participating in it and in getting the positives
link |
of it, but without letting it eat her alive. And I think sometimes she ventures, I hope she doesn't
link |
watch this. Sometimes I think she ventures a little too far and is consumed by it. And other
link |
times she gets a little distance. And if there's enough people like her out there, they're going to
link |
navigate this choppy waters. That's an interesting skill actually to develop. I talked to my dad
link |
about it. I've now, somehow this podcast in particular, but other reasons has received a
link |
little bit of attention. And with that, apparently in this world, even though I don't shut up about
link |
love and I'm just all about kindness, I have now a little mini army of trolls. It's kind of hilarious
link |
actually, but it also doesn't feel good, but it's a skill to learn to not look at that, like to
link |
moderate actually how much you look at that. The discussion I have with my dad, it's similar to,
link |
it doesn't have to be about trolls. It could be about checking email, which is like, if you're
link |
anticipating, you know, there's a, my dad runs a large Institute at Drexel University and there
link |
could be stressful like emails you're waiting, like there's drama of some kinds. And so like,
link |
there's a temptation to check the email. If you send an email and you kind of,
link |
and that pulls you in into, it doesn't feel good. And it's a skill that he actually complains that
link |
he hasn't learned. I mean, he grew up without it. So he hasn't learned the skill of how to
link |
shut off the internet and walk away. And I think young people, while they're also being
link |
quote unquote damaged by like, you know, being bullied online, all of those stories, which are
link |
very like horrific, you basically can't escape your bullies these days when you're growing up.
link |
But at the same time, they're also learning that skill of how to be able to shut off the,
link |
like disconnect with it, be able to laugh at it, not take it too seriously. It's fascinating. Like
link |
we're all trying to figure this out. Just like you said, it's been dropped on us and we're trying to
link |
figure it out. Yeah. I think that's really interesting. And I guess I've become a believer
link |
in the human design, which I feel like I don't completely understand. Like how do you make
link |
something as robust as us? Like we're so flawed in so many ways. And yet, and yet, you know,
link |
we dominate the planet and we do seem to manage to get ourselves out of scrapes eventually,
link |
not necessarily the most elegant possible way, but somehow we get, we get to the next step.
link |
And I don't know how I'd make a machine do that. Generally speaking, like if I train one of my
link |
reinforcement learning agents to play a video game and it works really hard on that first stage
link |
over and over and over again, and it makes it through, it succeeds on that first level.
link |
And then the new level comes and it's just like, okay, I'm back to the drawing board. And somehow
link |
humanity, we keep leveling up and then somehow managing to put together the skills necessary to
link |
achieve success, some semblance of success in that next level too. And, you know,
link |
I hope we can keep doing that.
link |
You mentioned reinforcement learning. So you've had a couple of years in the field. No, quite,
link |
you know, quite a few, quite a long career in artificial intelligence broadly, but reinforcement
link |
learning specifically, can you maybe give a hint about your sense of the history of the field?
link |
And in some ways it's changed with the advent of deep learning, but it has long roots. Like, how has it
link |
weaved in and out of your own life? How have you seen the community change or maybe the ideas that
link |
it's playing with change? I've had the privilege, the pleasure of being, of having almost a front
link |
row seat to a lot of this stuff. And it's been really, really fun and interesting. So when I was
link |
in college in the eighties, early eighties, the neural net thing was starting to happen.
link |
And I was taking a lot of psychology classes and a lot of computer science classes as a college
link |
student. And I thought, you know, something that can play tic tac toe and just like learn to get
link |
better at it. That ought to be a really easy thing. So I spent almost, almost all of my, what would
link |
have been vacations during college, like hacking on my home computer, trying to teach it how to
link |
play tic tac toe. In what programming language? BASIC. Oh yeah. That's my first
link |
language. That's my native language. Is that when you first fell in love with computer science,
link |
just like programming BASIC on that? Uh, what was the computer? Do you remember? I had,
link |
I had a TRS-80 Model One before they were called Model Ones. Cause there was nothing else. Uh,
link |
I got my computer in 1979, uh, instead. So I was, I was, I would have been bar mitzvahed,
link |
but instead of having a big party that my parents threw on my behalf, they just got me a computer.
link |
Cause that's what I really, really, really wanted. I saw them in the mall at
link |
Radio Shack. And I thought, what, how are they doing that? I would try to stump them. I would
link |
give them math problems like one plus and then in parentheses, two plus one. And it would always get
link |
it right. I'm like, how do you know so much? Like I've had to go to algebra class for the last few
link |
years to learn this stuff and you just seem to know. So I was, I was, I was smitten and, uh,
link |
got a computer and I think ages 13 to 15. I have no memory of those years. I think I just was in
link |
my room with the computer, listening to Billy Joel, communing, possibly listening to the radio,
link |
listening to Billy Joel. That was the one album I had, uh, on vinyl at that time. And, um, and then
link |
I got it on cassette tape and that was really helpful because then I could play it. I didn't
link |
have to go down to my parents' Wi-Fi, or hi-fi, sorry. Uh, and at age 15, I remember kind of
link |
walking out and like, okay, I'm ready to talk to people again. Like I've learned what I need to
link |
learn here. And, um, so yeah, so, so that was, that was my home computer. And so I went to college
link |
and I was like, oh, I'm totally going to study computer science. And I opted the college I chose
link |
specifically had a computer science major. The one that I really wanted the college I really wanted
link |
to go to didn't so bye bye to them. So I went to Yale, uh, Princeton would have been way more
link |
convenient and it was just beautiful campus and it was close enough to home. And I was really
link |
excited about Princeton. And I visited, I said, so computer science majors like, well, we have
link |
computer engineering. I'm like, Oh, I don't like that word engineering. I like computer science.
link |
I really, I want to do like, you're saying hardware and software. They're like, yeah.
link |
I'm like, I just want to do software. I couldn't care less about hardware. And you grew up in
link |
Philadelphia. I grew up outside Philly. Yeah. Yeah. Uh, so the, you know, local schools were
link |
like Penn and Drexel and, uh, Temple. Like everyone in my family went to Temple at least at
link |
one point in their lives, except for me. So yeah, Philly, Philly family, Yale had a computer science
link |
department. And that's when you, it's kind of interesting. You said eighties and neural
link |
networks. That's when the neural networks was a hot new thing or a hot thing period. Uh, so what
link |
is that in college when you first learned about neural networks? Or when you learned, like how did
link |
it... It was in a psychology class, not in a CS class. Yeah. Was it psychology or cognitive science or like,
link |
do you remember like what context it was? Yeah. Yeah. Yeah. So, so I was a, I've always been a
link |
bit of a cognitive psychology groupie. So like I'm, I studied computer science, but I like,
link |
I like to hang around where the cognitive scientists are. Cause I don't know brains, man.
link |
They're like, they're wacky. Cool. And they have a bigger picture view of things. They're a little
link |
less engineering. I would say they're more, they're more interested in the nature of cognition and
link |
intelligence and perception and how like the vision system work. Like they're asking always
link |
bigger questions. Now with the deep learning community there, I think more, there's a lot of
link |
intersections, but I do find that the neuroscience folks actually in cognitive psychology, cognitive
link |
science folks are starting to learn how to program, how to use neural, artificial neural networks.
link |
And they are actually approaching problems in like totally new, interesting ways. It's fun to
link |
watch that grad students from those departments, like approach a problem of machine learning.
link |
Right. They come in with a different perspective. Yeah. They don't care about like your
link |
ImageNet dataset or whatever. They want, like, to understand the, like the basic
link |
mechanisms at the, at the neuronal level and the functional level of intelligence. It's kind of,
link |
it's kind of cool to see them work, but yeah. Okay. So you always love, you're always a groupie
link |
of cognitive psychology. Yeah. Yeah. And so, so it was in a class by Richard Gerrig. He was kind of
link |
like my favorite psych professor in college. And I took like three different classes with him
link |
and yeah. So they were talking specifically the class, I think was kind of a,
link |
there was a big paper that was written by Steven Pinker and Prince. I don't, I'm blanking on
link |
Prince's first name, but Prince and Pinker and Prince, they wrote kind of a, they were at that
link |
time kind of like, ah, I'm blanking on the names of the current people. The cognitive scientists
link |
who are complaining a lot about deep networks. Oh, Gary, Gary Marcus, Marcus and who else? I mean,
link |
there's a few, but Gary, Gary's the most feisty. Sure. Gary's very feisty. And with this, with his
link |
coauthor, they, they, you know, they're kind of doing these kinds of take downs where they say,
link |
okay, well, yeah, it does all these amazing, amazing things, but here's a shortcoming. Here's
link |
a shortcoming. Here's a shortcoming. And so the Pinker Prince paper is kind of like the,
link |
that generation's version of Marcus and Davis, right? Where they're, they're trained as cognitive
link |
scientists, but they're looking skeptically at the results in the, in the artificial intelligence,
link |
neural net kind of world and saying, yeah, it can do this and this and this, but lo,
link |
it can't do that. And it can't do that. And it can't do that maybe in principle or maybe just
link |
in practice at this point. But, but the fact of the matter is you're, you've narrowed your focus
link |
too far to be impressed. You know, you're impressed with the things within that circle,
link |
but you need to broaden that circle a little bit. You need to look at a wider set of problems.
link |
And so, so we had, so I was in this seminar in college that was basically a close reading of
link |
the Pinker Prince paper, which was like really thick. There was a lot going on in there. And,
link |
and it, you know, and it talked about the reinforcement learning idea a little bit.
link |
I'm like, oh, that sounds really cool because behavior is what is really interesting to me
link |
about psychology anyway. So making programs that, I mean, programs are things that behave.
link |
People are things that behave. Like I want to make learning that learns to behave.
link |
And which way was reinforcement learning presented? Is this talking about human and
link |
animal behavior or are we talking about actual mathematical construct?
link |
Ah, that's right. So that's a good question. Right. So this is, I think it wasn't actually
link |
talked about as behavior in the paper that I was reading. I think that it just talked about
link |
learning. And to me, learning is about learning to behave, but really neural nets at that point
link |
were about learning like supervised learning. So learning to produce outputs from inputs.
link |
So I kind of tried to invent reinforcement learning. When I graduated, I joined a research
link |
group at Bellcore, which had spun out of Bell Labs recently at that time because of the divestiture
link |
of the long distance and local phone service in the 1980s, 1984. And I was in a group with
link |
Dave Ackley, who was the first author of the Boltzmann machine paper. So the very first neural
link |
net paper that could handle XOR, right? So XOR sort of killed neural nets. The very first,
link |
the zeroth, or the first, winter. Yeah. Um, the Perceptrons paper. And Hinton, along with his
link |
student, Dave Ackley, and I think there were other authors as well, showed that no, no, no,
link |
with Boltzmann machines, we can actually learn nonlinear concepts. And so everything's back on
link |
the table again. And that kind of started that second wave of neural networks. So Dave Ackley
link |
was, he became my mentor at, at Bellcore and we talked a lot about learning and life and
link |
computation and how all these things fit together. Now Dave and I have a podcast together. So,
link |
so I get to kind of enjoy that sort of his, his perspective once again, even, even all these years
link |
later. And so I said, so I said, I was really interested in learning, but in the concept of
link |
behavior and he's like, oh, well that's reinforcement learning here. And he gave me
link |
Rich Sutton's 1984 TD paper. So I read that paper. I honestly didn't get all of it,
link |
but I got the idea. I got that they were using, that he was using ideas that I was familiar with
link |
in the context of neural nets and, and like sort of back prop. But with this idea of making
link |
predictions over time, I'm like, this is so interesting, but I don't really get all the
link |
details I said to Dave. And Dave said, oh, well, why don't we have him come and give a talk?
link |
And I was like, wait, what, you can do that? Like, these are real people. I thought they
link |
were just words. I thought it was just like ideas that somehow magically seeped into paper. He's
link |
like, no, I, I, I know Rich like, we'll just have him come down and he'll give a talk. And so I was,
link |
you know, my mind was blown. And so Rich came and he gave a talk at Bellcore and he talked about
link |
what he was super excited about, which was that they had just figured out, at the time, Q-learning. So Watkins
link |
had visited Rich Sutton's lab at UMass, or Andy Barto's lab that Rich was a part of.
link |
And, um, he was really excited about this because it resolved a whole bunch of problems that he
link |
didn't know how to resolve in the, in the earlier paper. And so, um,
link |
For people who don't know TD, temporal difference, these are all just algorithms
link |
for reinforcement learning.
link |
Right. And TD, temporal difference in particular is about making predictions over time. And you can
link |
try to use it for making decisions, right? Cause if you can predict how good an
link |
action's outcomes will be in the future, you can choose the one that has the better outcome. But the thing
link |
that's really cool about Q-learning is it was off policy, which meant that you could actually be
link |
learning about the environment and what the value of different actions would be while actually
link |
figuring out how to behave optimally. So that was a revelation.
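[Editor's sketch, not part of the conversation: a minimal tabular Q-learning loop, to make the "off policy" point concrete. The corridor environment, the function name `q_learning`, and every parameter value below are assumptions made up for illustration; they do not come from Watkins's or Sutton's papers. The agent behaves epsilon-greedily, so it sometimes acts at random, yet the update target uses the best action in the next state, which is why it can learn the values of optimal behavior while not behaving optimally itself.]

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Behavior policy: epsilon-greedy, so some moves are random.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Off-policy target: use the best next action (the max),
            # not whatever the behavior policy will actually do next.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            # The core update: nudge Q(s, a) toward r + gamma * max Q(s', .).
            q[s][a] += alpha * (r + future - q[s][a])
            s = s_next
    return q
```

[With these assumed settings, the learned value of moving right from the start state should approach gamma cubed, since the reward is three discounted steps beyond the first move.]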
link |
Yeah. And the proof of that is kind of interesting. I mean, that was really surprising
link |
to me when I first read that paper, and then in Richard,
link |
Rich Sutton's book on the matter, it's kind of beautiful that a single equation can
link |
capture it all, one line of code, and, like, you can learn anything. Yeah. Like, given enough time.
link |
So equation and code, you're right. Like, the code that you can arguably, at least
link |
if you squint your eyes, say
link |
is all of intelligence, you can implement
link |
that in a single line.
link |
I think I started with Lisp, which is a shout out to Lisp
link |
with like a single line of code, key piece of code,
link |
maybe a couple that you could do that.
link |
It's kind of magical.
link |
It feels too good to be true.
link |
Well, and it sort of is.
link |
It seems to require an awful lot
link |
of extra stuff supporting it.
link |
But nonetheless, the idea is really good.
link |
And as far as we know, it is a very reasonable way
link |
of trying to create adaptive behavior,
link |
behavior that gets better at something over time.
link |
Did you find the idea of optimal at all compelling
link |
that you could prove that it's optimal?
link |
So like one part of computer science
link |
that makes people feel warm and fuzzy inside
link |
is when you can prove something like
link |
that a sorting algorithm worst case runs
link |
in N log N, and it makes everybody feel so good.
link |
Even though in reality, it doesn't really matter
link |
what the worst case is, what matters is like,
link |
does this thing actually work in practice
link |
on this particular actual set of data that I enjoy?
link |
So here's a place where I have maybe a strong opinion,
link |
which is like, you're right, of course, but no, no.
link |
Like, so what makes worst case so great, right?
link |
What makes having a worst case analysis so great
link |
is that you get modularity.
link |
You can take that thing and plug it into another thing
link |
and still have some understanding of what's gonna happen
link |
when you click them together, right?
link |
If it just works well in practice, in other words,
link |
with respect to some distribution that you care about,
link |
when you go plug it into another thing,
link |
that distribution can shift, it can change,
link |
and your thing may not work well anymore.
link |
And you want it to, and you wish it does,
link |
and you hope that it will, but it might not.
link |
So you're saying you don't like machine learning.
link |
But we have some positive theoretical results.
link |
You can come back at me with,
link |
yeah, but they're really weak,
link |
and yeah, they're really weak.
link |
And you can even say that sorting algorithms,
link |
like if you do the optimal sorting algorithm,
link |
it's not really the one that you want,
link |
and that might be true as well.
link |
But it is, the modularity is a really powerful statement.
link |
I really like that.
link |
If you're an engineer, you can then assemble
link |
different things, you can count on them to be,
link |
I mean, it's interesting.
link |
It's a balance, like with everything else in life,
link |
you don't want to get too obsessed.
link |
I mean, this is what computer scientists do,
link |
which they tend to get obsessed,
link |
and they overoptimize things,
link |
or they start by optimizing, and then they overoptimize.
link |
So it's easy to get really granular about this thing,
link |
but like the step from an n squared to an n log n
link |
sorting algorithm is a big leap for most real world systems.
link |
No matter what the actual behavior of the system is,
link |
that's a big leap.
link |
And the same can probably be said
link |
for other kind of first leaps
link |
that you would take on a particular problem.
link |
Like it's picking the low hanging fruit,
link |
or whatever the equivalent of doing the,
link |
not the dumbest thing, but the next to the dumbest thing.
link |
Picking the most delicious reachable fruit.
link |
Yeah, most delicious reachable fruit.
link |
I don't know why that's not a saying.
link |
Okay, so then this is the 80s,
link |
and this kind of idea starts to percolate of learning.
link |
At that point, I got to meet Rich Sutton,
link |
so everything was sort of downhill from there,
link |
and that was really the pinnacle of everything.
link |
But then I felt like I was kind of on the inside.
link |
So then as interesting results were happening,
link |
I could like check in with Rich or with Gerry Tesauro,
link |
who had a huge impact on kind of early thinking
link |
in temporal difference learning and reinforcement learning
link |
and showed that you could do,
link |
you could solve problems
link |
that we didn't know how to solve any other way.
link |
And so that was really cool.
link |
So as good things were happening,
link |
I would hear about it from either the people
link |
who were doing it,
link |
or the people who were talking to the people
link |
who were doing it.
link |
And so I was able to track things pretty well.
link |
So wasn't most of the excitement
link |
on reinforcement learning in the 90s era
link |
with, what is it, TD-Gammon?
link |
Like what's the role of these kind of little
link |
like fun game playing things and breakthroughs
link |
about exciting the community?
link |
Was that, like what were your,
link |
because you've also built a crossword,
link |
or were part of building a crossword puzzle
link |
solving program called Proverb.
link |
So you were interested in this as a problem,
link |
like informing, using games to understand
link |
how to build intelligent systems.
link |
So like, what did you think about TD-Gammon?
link |
Like what did you think about that whole thing in the 90s?
link |
Yeah, I mean, I found the TD-Gammon result
link |
really just remarkable.
link |
So I had known about some of Gerry's stuff
link |
before he did TD-Gammon, and he did a system,
link |
just more vanilla, well, not entirely vanilla,
link |
but a more classical back proppy kind of network
link |
for playing backgammon,
link |
where he was training it on expert moves.
link |
So it was kind of supervised,
link |
but the way that it worked was not to mimic the actions,
link |
but to learn internally an evaluation function.
link |
So to learn, well, if the expert chose this over this,
link |
that must mean that the expert values this more than this.
link |
And so let me adjust my weights to make it
link |
so that the network evaluates this
link |
as being better than this.
link |
So it could learn from human preferences,
link |
it could learn its own preferences.
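[Editor's sketch, not part of the conversation: one way to picture "the expert chose this over this, so adjust the weights to evaluate this as better than this." This is not Tesauro's actual network or training procedure. It is a hypothetical linear evaluation function trained with a pairwise logistic loss, and the names `train_preference` and `score`, the feature vectors, and the learning rate are all assumptions for illustration.]

```python
import math

def score(w, features):
    """Linear evaluation function: higher means the position is judged better."""
    return sum(wi * fi for wi, fi in zip(w, features))

def train_preference(pairs, n_features, lr=0.1, epochs=200):
    """Fit weights so each expert-preferred position outscores the rejected one.

    `pairs` is a list of (preferred_features, rejected_features) tuples.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            diff = [p - r for p, r in zip(preferred, rejected)]
            margin = score(w, diff)
            # Gradient step on -log(sigmoid(margin)): the push is large when
            # the model disagrees with the expert and fades as it agrees.
            step = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w
```

[The model is never told which move to play. It only learns an evaluation under which the expert's choice scores higher, and moves can then be picked by comparing scores.]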
link |
And then when he took the step from that
link |
to actually doing it
link |
as a full on reinforcement learning problem,
link |
where you didn't need a trainer,
link |
you could just let it play, that was remarkable, right?
link |
And so I think as humans often do,
link |
as we've done in the recent past as well,
link |
people extrapolate.
link |
It's like, oh, well, if you can do that,
link |
which is obviously very hard,
link |
then obviously you could do all these other problems
link |
that we wanna solve that we know are also really hard.
link |
And it turned out very few of them ended up being practical,
link |
partly because I think neural nets,
link |
certainly at the time,
link |
were struggling to be consistent and reliable.
link |
And so training them in a reinforcement learning setting
link |
was a bit of a mess.
link |
I had, I don't know, generation after generation
link |
of like master's students
link |
who wanted to do value function approximation,
link |
basically reinforcement learning with neural nets.
link |
And over and over and over again, we were failing.
link |
We couldn't get the good results that Gerry Tesauro got.
link |
I now believe that Gerry is a neural net whisperer.
link |
He has a particular ability to get neural networks
link |
to do things that other people would find impossible.
link |
And it's not the technology,
link |
it's the technology and Gerry together.
link |
Which I think speaks to the role of the human expert
link |
in the process of machine learning.
link |
Right, it's so easy.
link |
We're so drawn to the idea that it's the technology
link |
that is where the power is coming from
link |
that I think we lose sight of the fact
link |
that sometimes you need a really good,
link |
just like, I mean, no one would think,
link |
hey, here's this great piece of software.
link |
Here's like, I don't know, GNU Emacs or whatever.
link |
And doesn't that prove that computers are super powerful
link |
and basically gonna take over the world?
link |
It's like, no, Stallman is a hell of a hacker, right?
link |
So he was able to make the code do these amazing things.
link |
He couldn't have done it without the computer,
link |
but the computer couldn't have done it without him.
link |
And so I think people discount the role of people
link |
like Gerry who have just a particular set of skills.
link |
On that topic, by the way, as a small side note,
link |
I tweeted Emacs is greater than Vim yesterday
link |
and deleted the tweet 10 minutes later
link |
when I realized it started a war.
link |
I was like, oh, I was just kidding.
link |
I was just being, and I'm gonna walk back and forth.
link |
So people still feel passionately
link |
about that particular piece of good stuff.
link |
Yeah, I don't get that
link |
because Emacs is clearly so much better, I don't understand.
link |
But why do I say that?
link |
Because I spent a block of time in the 80s
link |
making my fingers know the Emacs keys
link |
and now that's part of the thought process for me.
link |
Like I need to express, and if you take that,
link |
if you take my Emacs key bindings away, I become...
link |
I can't express myself.
link |
I'm the same way with the,
link |
I don't know if you know what it is,
link |
but it's a Kinesis keyboard, which is this butt shaped keyboard.
link |
Yes, I've seen them.
link |
They're very, I don't know, sexy, elegant?
link |
They're just beautiful.
link |
Yeah, they're gorgeous, way too expensive.
link |
But the problem with them, similar with Emacs,
link |
is once you learn to use it.
link |
It's harder to use other things.
link |
It's hard to use other things.
link |
There's this absurd thing where I have like small, elegant,
link |
lightweight, beautiful little laptops
link |
and I'm sitting there in a coffee shop
link |
with a giant Kinesis keyboard and a sexy little laptop.
link |
It's absurd, but I used to feel bad about it,
link |
but at the same time, you just kind of have to,
link |
sometimes it's back to the Billy Joel thing.
link |
You just have to throw that Billy Joel record
link |
and throw Taylor Swift and Justin Bieber to the wind.
link |
See, but I like them now because again,
link |
I have no musical taste.
link |
Like now that I've heard Justin Bieber enough,
link |
I'm like, I really like his songs.
link |
And Taylor Swift, not only do I like her songs,
link |
but my daughter's convinced that she's a genius.
link |
And so now I basically have signed onto that.
link |
So yeah, that speaks to the,
link |
back to the robustness of the human brain.
link |
That speaks to the neuroplasticity
link |
that you can just like a mouse teach yourself to,
link |
or probably a dog teach yourself to enjoy Taylor Swift.
link |
I try, you know what?
link |
It has to do with just like acclimation, right?
link |
Just like you said, a couple of weeks.
link |
That's an interesting experiment.
link |
I'll actually try that.
link |
Like I'll listen to it.
link |
That wasn't the intent of the experiment?
link |
Just like social media,
link |
it wasn't intended as an experiment
link |
to see what we can take as a society,
link |
but it turned out that way.
link |
I don't think I'll be the same person
link |
on the other side of the week listening to Taylor Swift.
link |
No, it's more compartmentalized.
link |
Don't be so worried.
link |
Like it's, like I get that you can be worried,
link |
but don't be so worried
link |
because we compartmentalize really well.
link |
And so it won't bleed into other parts of your life.
link |
You won't start, I don't know,
link |
wearing red lipstick or whatever.
link |
It changed fashion and everything.
link |
But you know what?
link |
The thing you have to watch out for
link |
is you'll walk into a coffee shop
link |
once we can do that again.
link |
And recognize the song?
link |
And you'll be, no,
link |
you won't know that you're singing along
link |
until everybody in the coffee shop is looking at you.
link |
And then you're like, that wasn't me.
link |
Yeah, that's the, you know,
link |
people are afraid of AGI.
link |
I'm afraid of the Taylor Swift.
link |
The Taylor Swift takeover.
link |
Yeah, and I mean, people should know that TD-Gammon was,
link |
I get, would you call it,
link |
do you like the terminology of self play by any chance?
link |
So like systems that learn by playing themselves.
link |
Just, I don't know if it's the best word, but.
link |
So what's the problem with that term?
link |
So it's like the big bang,
link |
like it's like talking to a serious physicist.
link |
Do you like the term big bang?
link |
And when it was early,
link |
I feel like it's the early days of self play.
link |
I don't know, maybe it was used previously,
link |
but I think it's been used by only a small group of people.
link |
And so like, I think we're still deciding
link |
is this ridiculously silly name a good name
link |
for potentially one of the most important concepts
link |
in artificial intelligence?
link |
Okay, it depends how broadly you apply the term.
link |
So I used the term in my 1996 PhD dissertation.
link |
Wow, the actual terms of self play.
link |
Yeah, because Tesauro's paper was something like
link |
training up an expert backgammon player through self play.
link |
So I think it was in the title of his paper.
link |
If not in the title, it was definitely a term that he used.
link |
There's another term that we got from that work is rollout.
link |
So I don't know if you, do you ever hear the term rollout?
link |
That's a backgammon term that has now applied
link |
generally in computers, well, at least in AI
link |
because of TD Gammon.
link |
That's fascinating.
link |
So how is self play being used now?
link |
And like, why is it,
link |
does it feel like a more general powerful concept
link |
is sort of the idea of,
link |
well, the machine's just gonna teach itself to be smart.
link |
Yeah, so that's where maybe you can correct me,
link |
but that's where the continuation of the spirit
link |
and actually like literally the exact algorithms
link |
of TD Gammon are applied by DeepMind and OpenAI
link |
to learn games that are a little bit more complex
link |
that when I was learning artificial intelligence,
link |
Go was presented to me
link |
with Artificial Intelligence: A Modern Approach.
link |
I don't know if they explicitly pointed to Go
link |
in those books as like unsolvable kind of thing,
link |
like implying that these approaches hit their limit
link |
in this, with these particular kind of games.
link |
So something, I don't remember if the book said it or not,
link |
but something in my head,
link |
or if it was the professors instilled in me the idea
link |
like this is the limits of artificial intelligence
link |
Like it instilled in me the idea
link |
that if we can create a system that can solve the game of Go
link |
we've achieved AGI.
link |
That was kind of, I didn't explicitly like say this,
link |
but that was the feeling.
link |
And so from, I was one of the people that it seemed magical
link |
when a learning system was able to beat
link |
a human world champion at the game of Go
link |
and even more so from that, that was AlphaGo,
link |
even more so with AlphaGo Zero
link |
then kind of renamed and advanced into AlphaZero
link |
beating a world champion or world class player
link |
without any supervised learning on expert games.
link |
Doing it only by playing itself.
link |
So that is, I don't know what to make of it.
link |
I think it would be interesting to hear
link |
what your opinions are on just how exciting,
link |
surprising, profound, interesting, or boring
link |
the breakthrough performance of AlphaZero was.
link |
Okay, so AlphaGo knocked my socks off.
link |
That was so remarkable.
link |
Which aspect of it?
link |
That they got it to work,
link |
that they actually were able to leverage
link |
a whole bunch of different ideas,
link |
integrate them into one giant system.
link |
Just the software engineering aspect of it is mind blowing.
link |
I don't, I've never been a part of a program
link |
as complicated as the program that they built for that.
link |
And just the, like Gerry Tesauro is a neural net whisperer,
link |
like David Silver is a kind of neural net whisperer too.
link |
He was able to coax these networks
link |
and these new way out there architectures
link |
to do these, solve these problems that,
link |
as you said, when we were learning AI,
link |
no one had an idea how to make it work.
link |
It was remarkable that these techniques
link |
that were so good at playing chess
link |
and that could beat the world champion in chess
link |
couldn't beat your typical Go playing teenager in Go.
link |
So the fact that in a very short number of years,
link |
we kind of ramped up to trouncing people in Go
link |
just blew me away.
link |
So you're kind of focusing on the engineering aspect,
link |
which is also very surprising.
link |
I mean, there's something different
link |
about large, well funded companies.
link |
I mean, there's a compute aspect to it too.
link |
Like that, of course, I mean, that's similar
link |
to Deep Blue, right, with IBM.
link |
Like there's something important to be learned
link |
and remembered about a large company
link |
taking the ideas that are already out there
link |
and investing a few million dollars into it or more.
link |
And so you're kind of saying the engineering
link |
is kind of fascinating, both on the,
link |
with AlphaGo is probably just gathering all the data,
link |
right, of the expert games, like organizing everything,
link |
actually doing distributed supervised learning.
link |
And to me, see the engineering I kind of took for granted,
link |
to me philosophically being able to persist
link |
in the face of like long odds,
link |
because it feels like for me,
link |
I would be one of the skeptical people in the room
link |
thinking that you can learn your way to beat Go.
link |
Like it sounded like, especially with David Silver,
link |
it sounded like David was not confident at all.
link |
So like it was, like not,
link |
it's funny how confidence works.
link |
It's like, you're not like cocky about it, like, but.
link |
Right, because if you're cocky about it,
link |
you kind of stop and stall and don't get anywhere.
link |
But there's like a hope that's unbreakable.
link |
Maybe that's better than confidence.
link |
It's a kind of wishful hope and a little dream.
link |
And you almost don't want to do anything else.
link |
You kind of keep doing it.
link |
That's, that seems to be the story and.
link |
But with enough skepticism that you're looking
link |
for where the problems are and fighting through them.
link |
Cause you know, there's gotta be a way out of this thing.
link |
And for him, it was probably,
link |
there's a bunch of little factors that come into play.
link |
It's funny how these stories just all come together.
link |
Like everything he did in his life came into play,
link |
which is like a love for video games
link |
and also a connection to,
link |
so the nineties had to happen with TD Gammon and so on.
link |
In some ways it's surprising,
link |
maybe you can provide some intuition to it
link |
that not much more than TD Gammon was done
link |
for quite a long time on the reinforcement learning front.
link |
Is that weird to you?
link |
I mean, like I said, the students who I worked with,
link |
we tried to get, basically apply that architecture
link |
to other problems and we consistently failed.
link |
There were a couple of really nice demonstrations
link |
that ended up being in the literature.
link |
There was a paper about controlling elevators, right?
link |
Where it's like, okay, can we modify the heuristic
link |
that elevators use for deciding,
link |
like a bank of elevators for deciding which floors
link |
we should be stopping on to maximize throughput essentially.
link |
And you can set that up as a reinforcement learning problem
link |
and you can have a neural net represent the value function
link |
so that it's taking where all the elevators,
link |
where the button pushes, you know, this high dimensional,
link |
well, at the time high dimensional input,
link |
you know, a couple of dozen dimensions
link |
and turn that into a prediction as to,
link |
oh, is it gonna be better if I stop at this floor or not?
link |
And ultimately it appeared as though
link |
for the standard simulation distribution
link |
for people trying to leave the building
link |
at the end of the day,
link |
that the neural net learned a better strategy
link |
than the standard one that's implemented
link |
in elevator controllers.
link |
There was some work that Satinder Singh et al.
link |
did on handoffs with cell phones,
link |
you know, deciding when should you hand off
link |
from this cell tower to this cell tower.
link |
Oh, okay, communication networks, yeah.
link |
Yeah, and so a couple of things
link |
seemed like they were really promising.
link |
None of them made it into production that I'm aware of.
link |
And neural nets as a whole started
link |
to kind of implode around then.
link |
And so there just wasn't a lot of air in the room
link |
for people to try to figure out,
link |
okay, how do we get this to work in the RL setting?
link |
And then they found their way back in 10 plus years.
link |
So you said AlphaGo was impressive,
link |
like it's a big spectacle.
link |
Is there, is that?
link |
Right, so then AlphaZero.
link |
So I think I may have a slightly different opinion
link |
on this than some people.
link |
So I talked to Satinder Singh in particular about this.
link |
So Satinder was like Rich Sutton,
link |
a student of Andy Barto.
link |
So they came out of the same lab,
link |
very influential machine learning,
link |
reinforcement learning researcher.
link |
Now at DeepMind, as is Rich.
link |
Though different sites, the two of them.
link |
Rich is in Alberta and Satinder would be in England,
link |
but I think he's in England from Michigan at the moment.
link |
But the, but he was, yes,
link |
he was much more impressed with AlphaGo Zero,
link |
which didn't get a kind of a bootstrap
link |
in the beginning with human trained games.
link |
It just was purely self play.
link |
Though the first one AlphaGo
link |
was also a tremendous amount of self play, right?
link |
They started off, they kickstarted the action network
link |
that was making decisions,
link |
but then they trained it for a really long time
link |
using more traditional temporal difference methods.
link |
So as a result, I didn't,
link |
it didn't seem that different to me.
link |
Like, it seems like, yeah, why wouldn't that work?
link |
Like once you, once it works, it works.
link |
So what, but he found that removal
link |
of that extra information to be breathtaking.
link |
Like that's a game changer.
link |
To me, the first thing was more of a game changer.
link |
But the open question, I mean,
link |
I guess that's the assumption is the expert games
link |
might contain within them a humongous amount of information.
link |
But we know that it went beyond that, right?
link |
We know that it somehow got away from that information
link |
because it was learning strategies.
link |
I don't think AlphaGo is just better
link |
at implementing human strategies.
link |
I think it actually developed its own strategies
link |
that were more effective.
link |
And so from that perspective, okay, well,
link |
so it made at least one quantum leap
link |
in terms of strategic knowledge.
link |
Okay, so now maybe it makes three, like, okay.
link |
But that first one is the doozy, right?
link |
Getting it to work reliably and for the networks
link |
to hold onto the value well enough.
link |
Like that was a big step.
link |
Well, maybe you could speak to this
link |
on the reinforcement learning front.
link |
So starting from scratch and learning to do something,
link |
like the first like random behavior
link |
to like crappy behavior to like somewhat okay behavior.
link |
It's not obvious to me that that's not like impossible
link |
to take those steps.
link |
Like if you just think about the intuition,
link |
like how the heck does random behavior
link |
become somewhat basic intelligent behavior?
link |
Not human level, not superhuman level, but just basic.
link |
But you're saying to you kind of the intuition is like,
link |
if you can go from human to superhuman level intelligence
link |
on this particular task of game playing,
link |
then so you're good at taking leaps.
link |
So you can take many of them.
link |
That the system, I believe that the system
link |
can take that kind of leap.
link |
Yeah, and also I think that beginner knowledge in Go,
link |
like you can start to get a feel really quickly
link |
for the idea that being in certain parts of the board
link |
seems to be more associated with winning, right?
link |
Cause it's not stumbling upon the concept of winning.
link |
It's told that it wins or that it loses.
link |
Well, it's self play.
link |
So it both wins and loses.
link |
It's told which side won.
link |
And the information is kind of there
link |
to start percolating around to make a difference as to,
link |
well, these things have a better chance of helping you win.
link |
And these things have a worse chance of helping you win.
link |
And so it can get to basic play, I think pretty quickly.
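As a rough illustration of how a bare win/loss signal can percolate back through self play, here is a minimal sketch. It is not TD Gammon or AlphaGo Zero: it assumes a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins), a lookup table instead of a neural network, and a one-step temporal difference update. The function name and all parameter values are hypothetical choices made for the example.

```python
import random


def train_self_play(start=10, episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular one-step TD learning by self play on a toy take-away game."""
    rng = random.Random(seed)
    # value[s]: estimated chance that the player to move with s stones wins.
    value = [0.5] * (start + 1)
    value[0] = 0.0  # no stones left: the player to move has already lost

    for _ in range(episodes):
        stones = start
        while stones > 0:
            moves = [t for t in (1, 2, 3) if t <= stones]
            # Greedy move: leave the opponent in the position that looks worst for them.
            greedy = min(moves, key=lambda t: value[stones - t])
            explore = rng.random() < epsilon
            take = rng.choice(moves) if explore else greedy
            if not explore:
                # The opponent's chance of winning from the next position is
                # value[next], so ours is one minus that (simplification:
                # exploratory moves are not used for updates).
                target = 1.0 - value[stones - take]
                value[stones] += alpha * (target - value[stones])
            stones -= take  # the same table now plays the other side
    return value


if __name__ == "__main__":
    learned = train_self_play()
    for stones, v in enumerate(learned):
        print(stones, round(v, 2))
```

In this toy game the table only ever hears "the side that took the last stone won", yet positions with a multiple of four stones drift toward being rated as losses for the player to move, which is the known correct answer for this game.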
link |
Then once it has basic play,
link |
well now it's kind of forced to do some search
link |
to actually experiment with, okay,
link |
well what gets me that next increment of improvement?
link |
How far do you think, okay, this is where you kind of
link |
bring up the Elon Musk and the Sam Harris, right?
link |
How far is your intuition about these kinds
link |
of self play mechanisms being able to take us?
link |
Cause it feels, one of the ominous but stated calmly things
link |
that when I talked to David Silver, he said,
link |
is that they have not yet discovered a ceiling
link |
for AlphaZero, for example, in the game of Go or chess.
link |
Like it keeps, no matter how much compute
link |
they throw at it, it keeps improving.
link |
So it's possible, it's very possible that if you throw,
link |
you know, some like 10 X compute that it will improve
link |
by five X or something like that.
link |
And when stated calmly, it's so like, oh yeah, I guess so.
link |
But like, and then you think like,
link |
well, can we potentially have like continuations
link |
of Moore's law in a totally different way,
link |
like broadly defined Moore's law,
link |
not the exponential improvement, like,
link |
are we going to have an AlphaZero that swallows the world?
link |
But notice it's not getting better at other things.
link |
It's getting better at Go.
link |
And I think that's a big leap to say,
link |
okay, well, therefore it's better at other things.
link |
Well, I mean, the question is how much of the game of life
link |
can be turned into.
link |
Right, so that I think is a really good question.
link |
And I think that we don't, I don't think we as a,
link |
I don't know, community really know the answer to this,
link |
but so, okay, so I went to a talk
link |
by some experts on computer chess.
link |
So in particular, computer chess is really interesting
link |
because for, of course, for a thousand years,
link |
humans were the best chess playing things on the planet.
link |
And then computers like edged ahead of the best person.
link |
And they've been ahead ever since.
link |
It's not like people have overtaken computers.
link |
But computers and people together
link |
have overtaken computers.
link |
So at least last time I checked,
link |
I don't know what the very latest is,
link |
but last time I checked that there were teams of people
link |
who could work with computer programs
link |
to defeat the best computer programs.
link |
In the game of Go?
link |
In the game of chess.
link |
In the game of chess.
link |
Right, and so using the information about how,
link |
these things called Elo scores,
link |
this sort of notion of how strong a player you are.
link |
There's kind of a range of possible scores.
link |
And you increment in score,
link |
basically if you can beat another player
link |
of that lower score 62% of the time or something like that.
link |
Like there's some threshold
link |
of if you can somewhat consistently beat someone,
link |
then you are of a higher score than that person.
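To make the "beat them some fraction of the time" idea concrete, here is a small sketch of the conventional Elo expected-score formula. It assumes the standard logistic curve with a 400-point scale; the function names are illustrative, and the 62% figure is just the number mentioned in conversation, not an official threshold.

```python
import math


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B (a draw counts as half)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_gap_for_expected_score(p: float) -> float:
    """Rating difference at which the stronger player's expected score is p."""
    return 400.0 * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    gap = rating_gap_for_expected_score(0.62)
    print(f"A 62% expected score corresponds to a gap of about {gap:.0f} points")
    print(f"Expected score with a 100-point edge: {elo_expected_score(1600, 1500):.2f}")
```

Under these assumptions, each "step" of beating the level below about 62% of the time is worth roughly 85 rating points, so counting how many such steps fit between a beginner and the best known play gives one rough measure of a game's strategic depth.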
link |
And there's a question as to how many times
link |
can you do that in chess, right?
link |
And so we know that there's a range of human ability levels
link |
that cap out with the best playing humans.
link |
And the computers went a step beyond that.
link |
And computers and people together have not gone,
link |
I think a full step beyond that.
link |
It feels, the estimates that they have
link |
is that it's starting to asymptote.
link |
That we've reached kind of the maximum,
link |
the best possible chess playing.
link |
And so that means that there's kind of
link |
a finite strategic depth, right?
link |
At some point you just can't get any better at this game.
link |
Yeah, I mean, I don't, so I'll actually check that.
link |
I think it's interesting because if you have somebody
link |
like Magnus Carlsen, who's using these chess programs
link |
to train his mind, like to learn about chess.
link |
To become a better chess player, yeah.
link |
And so like, that's a very interesting thing
link |
because we're not static creatures.
link |
We're learning together.
link |
I mean, just like we're talking about social networks,
link |
those algorithms are teaching us
link |
just like we're teaching those algorithms.
link |
So that's a fascinating thing.
link |
But I think the best chess playing programs
link |
are now better than the pairs.
link |
Like they have competition between pairs,
link |
but it's still, even if they weren't,
link |
it's an interesting question, where's the ceiling?
link |
So the David, the ominous David Silver kind of statement
link |
is like, we have not found the ceiling.
link |
Right, so the question is, okay,
link |
so I don't know his analysis on that.
link |
My, from talking to Go experts,
link |
the depth, the strategic depth of Go
link |
seems to be substantially greater than that of chess.
link |
That there's more kind of steps of improvement
link |
that you can make, getting better and better
link |
and better and better.
link |
But there's no reason to think that it's infinite.
link |
And so it could be that what David is seeing
link |
is a kind of asymptoting that you can keep getting better,
link |
but with diminishing returns.
link |
And at some point you hit optimal play.
link |
Like in theory, all these finite games, they're finite.
link |
They have an optimal strategy.
link |
There's a strategy that is the minimax optimal strategy.
link |
And so at that point, you can't get any better.
link |
You can't beat that strategy.
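For a game small enough to search exhaustively, the minimax optimal strategy can be computed directly. The sketch below assumes a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins) chosen only for illustration; Go and chess have the same kind of optimal strategy in principle, but their game trees are far too large for this approach.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def minimax_value(stones: int) -> int:
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        return -1  # the previous player took the last stone, so we have lost
    # Our value is the best of the negated values of the opponent's positions.
    return max(-minimax_value(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones: int) -> int:
    """Pick a move that achieves the minimax value from this position."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: -minimax_value(stones - take))


if __name__ == "__main__":
    for stones in range(1, 13):
        print(stones, minimax_value(stones), best_move(stones))
```

Once this table is known there is nothing left to improve: no opponent can do better against it than the minimax value allows, which is the sense in which a finite game caps the number of levels of improvement.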
link |
Now that strategy may be,
link |
from an information processing perspective, intractable.
link |
Right, you need, all the situations
link |
are sufficiently different that you can't compress it at all.
link |
It's this giant mess of hardcoded rules.
link |
And we can never achieve that.
link |
But that still puts a cap on how many levels of improvement
link |
that we can actually make.
link |
But the thing about self play is if you put it,
link |
although I don't like doing that,
link |
in the broader category of self supervised learning,
link |
is that it doesn't require too much or any human input.
link |
Human labeling, yeah.
link |
Yeah, human label or just human effort.
link |
The human involvement past a certain point.
link |
And the same thing you could argue is true
link |
for the recent breakthroughs in natural language processing
link |
with language models.
link |
Oh, this is how you get to GPT-3.
link |
Yeah, see how that did the.
link |
That was a good transition.
link |
Yeah, I practiced that for days leading up to this now.
link |
But like that's one of the questions is,
link |
can we find ways to formulate problems in this world
link |
that are important to us humans,
link |
like more important than the game of chess,
link |
that to which self supervised kinds of approaches
link |
Whether it's self play, for example,
link |
for like maybe you could think of like autonomous vehicles
link |
in simulation, that kind of stuff,
link |
or just robotics applications and simulation,
link |
or in the self supervised learning,
link |
where unannotated data,
link |
or data that's generated by humans naturally
link |
without extra costs, like Wikipedia,
link |
or like all of the internet can be used
link |
to learn something about,
link |
to create intelligent systems that do something
link |
really powerful, that pass the Turing test,
link |
or that do some kind of superhuman level performance.
link |
So what's your intuition,
link |
like trying to stitch all of it together
link |
about our discussion of AGI,
link |
the limits of self play,
link |
and your thoughts about maybe the limits of neural networks
link |
in the context of language models.
link |
Is there some intuition in there
link |
that might be useful to think about?
link |
So first of all, the whole Transformer network
link |
family of things is really cool.
link |
It's really, really cool.
link |
I mean, if you've ever,
link |
back in the day you played with,
link |
I don't know, Markov models for generating texts,
link |
and you've seen the kind of texts that they spit out,
link |
and you compare it to what's happening now,
link |
it's amazing, it's so amazing.
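For readers who never played with one, a word-level Markov model for text generation can be sketched in a few lines. This is a generic first-order (bigram) version written for illustration; the function names and the tiny corpus are made up for the example.

```python
import random
from collections import defaultdict


def build_bigram_model(text: str) -> dict:
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model


def generate(model: dict, start: str, length: int, seed: int = 0) -> str:
    """Sample up to `length` words, each chosen only from the previous word's followers."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        followers = model.get(output[-1])
        if not followers:
            break  # dead end: this word never had a successor in the corpus
        output.append(rng.choice(followers))
    return " ".join(output)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the dog sat on the cat"
    print(generate(build_bigram_model(corpus), "the", 10))
```

Because each word depends only on the one before it, the output is locally plausible and globally aimless, which is the contrast being drawn with Transformer-based models.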
link |
Now, it doesn't take very long interacting
link |
with one of these systems before you find the holes, right?
link |
It's not smart in any kind of general way.
link |
It's really good at a bunch of things.
link |
And it does seem to understand
link |
a lot of the statistics of language extremely well.
link |
And that turns out to be very powerful.
link |
You can answer many questions with that.
link |
But it doesn't make it a good conversationalist, right?
link |
And it doesn't make it a good storyteller.
link |
It just makes it good at imitating
link |
the things that it's seen in the past.
link |
The exact same thing could be said
link |
by people who are voting for Donald Trump
link |
about Joe Biden supporters,
link |
and people voting for Joe Biden
link |
about Donald Trump supporters is, you know.
link |
That they're not intelligent, they're just following the.
link |
Yeah, they're following things they've seen in the past.
link |
And it doesn't take long to find the flaws
link |
in their natural language generation abilities.
link |
So we're being very.
link |
That's interesting.
link |
Critical of AI systems.
link |
Right, so I've had a similar thought,
link |
which was that the stories that GPT-3 spits out
link |
are amazing and very humanlike.
link |
And it doesn't mean that computers are smarter
link |
than we realize necessarily.
link |
It partly means that people are dumber than we realize.
link |
Or that much of what we do day to day is not that deep.
link |
Like we're just kind of going with the flow.
link |
We're saying whatever feels like the natural thing
link |
Not a lot of it is creative or meaningful or intentional.
link |
But enough is that we actually get by, right?
link |
We do come up with new ideas sometimes,
link |
and we do manage to talk each other into things sometimes.
link |
And we do sometimes vote for reasonable people sometimes.
link |
But it's really hard to see in the statistics
link |
because so much of what we're saying is kind of rote.
link |
And so our metrics that we use to measure
link |
how these systems are doing don't reveal that
link |
because it's in the interstices that is very hard to detect.
link |
But is your, do you have an intuition
link |
that with these language models, if they grow in size,
link |
it's already surprising when you go from GPT-2 to GPT-3
link |
that there is a noticeable improvement.
link |
So the question now goes back to the ominous David Silver
link |
Right, so maybe there's just no ceiling.
link |
We just need more compute.
link |
Now, I mean, okay, so now I'm speculating.
link |
As opposed to before when I was completely on firm ground.
link |
All right, I don't believe that you can get something
link |
that really can do language and use language as a thing
link |
that doesn't interact with people.
link |
Like I think that it's not enough
link |
to just take everything that we've said written down
link |
and just say, that's enough.
link |
You can just learn from that and you can be intelligent.
link |
I think you really need to be pushed back at.
link |
I think that conversations,
link |
even people who are pretty smart,
link |
maybe the smartest thing that we know,
link |
maybe not the smartest thing we can imagine,
link |
but we get so much benefit
link |
out of talking to each other and interacting.
link |
That's presumably why you have conversations live with guests
link |
is that there's something in that interaction
link |
that would not be exposed by,
link |
oh, I'll just write you a story
link |
and then you can read it later.
link |
And I think because these systems
link |
are just learning from our stories,
link |
they're not learning from being pushed back at by us,
link |
that they're fundamentally limited
link |
in what they can actually become on this route.
link |
They have to get shut down.
link |
Like we have to have an argument,
link |
they have to have an argument with us
link |
and lose a couple of times
link |
before they start to realize, oh, okay, wait,
link |
there's some nuance here that actually matters.
link |
Yeah, that's actually subtle sounding,
link |
but quite profound that the interaction with humans
link |
is essential and the limitation within that
link |
is profound as well because the timescale,
link |
like the bandwidth at which you can really interact
link |
with humans is very low.
link |
So you can't, one of the underlying things about self play is,
link |
it has to do a very large number of interactions.
link |
And so you can't really deploy reinforcement learning systems
link |
into the real world to interact.
link |
Like you couldn't deploy a language model
link |
into the real world to interact with humans
link |
because it was just not getting enough data
link |
relative to the cost it takes to interact.
link |
Like the time of humans is expensive,
link |
which is really interesting.
link |
That takes us back to reinforcement learning
link |
and trying to figure out if there's ways
link |
to make algorithms that are more efficient at learning,
link |
keep the spirit in reinforcement learning
link |
and become more efficient.
link |
In some sense, that seems to be the goal.
link |
I'd love to hear what your thoughts are.
link |
I don't know if you got a chance to see
link |
the blog post called The Bitter Lesson.
link |
By Rich Sutton that makes an argument,
link |
hopefully I can summarize it.
link |
Yeah, but do you want?
link |
So I mean, I could try and you can correct me,
link |
which is he makes an argument that it seems
link |
if we look at the long arc of the history
link |
of the artificial intelligence field,
link |
he calls 70 years that the algorithms
link |
from which we've seen the biggest improvements in practice
link |
are the very simple, like dumb algorithms
link |
that are able to leverage computation.
link |
And you just wait for the computation to improve.
link |
Like all of the academics and so on have fun
link |
by finding little tricks
link |
and congratulate themselves on those tricks.
link |
And sometimes those tricks can be like big,
link |
that feel in the moment like big spikes and breakthroughs,
link |
but in reality over the decades,
link |
it's still the same dumb algorithm
link |
that just waits for the compute to get faster and faster.
link |
Do you find that to be an interesting argument
link |
against the entirety of the field of machine learning
link |
as an academic discipline?
link |
That we're really just a subfield of computer architecture.
link |
We're just kind of waiting around
link |
for them to do their next thing.
link |
Who really don't want to do hardware work.
link |
I really don't want to think about it.
link |
We're procrastinating.
link |
Yes, that's right, just waiting for them to do their jobs
link |
so that we can pretend to have done ours.
link |
So yeah, I mean, the argument reminds me a lot of,
link |
I think it was a Fred Jelinek quote,
link |
early computational linguist who said,
link |
we're building these computational linguistic systems
link |
and every time we fire a linguist performance goes up
link |
by 10%, something like that.
link |
And so the idea of us building the knowledge in,
link |
in that case was much less,
link |
he was finding it to be much less successful
link |
than get rid of the people who know about language as a,
link |
from a kind of scholastic academic kind of perspective
link |
and replace them with more compute.
link |
And so I think this is kind of a modern version
link |
of that story, which is, okay,
link |
we want to do better on machine vision.
link |
You could build in all these,
link |
motivated part based models that,
link |
that just feel like obviously the right thing
link |
that you have to have,
link |
or we can throw a lot of data at it
link |
and guess what we're doing better with a lot of data.
link |
So I hadn't thought about it until this moment in this way,
link |
but what I believe, well, I've thought about what I believe.
link |
What I believe is that, you know, compositionality
link |
and what's the right way to say it,
link |
the complexity grows rapidly
link |
as you consider more and more possibilities,
link |
And so far Moore's law has also been growing explosively
link |
And so it really does seem like, well,
link |
we don't have to think really hard about the algorithm
link |
design or the way that we build the systems,
link |
because the best benefit we could get is exponential.
link |
And the best benefit that we can get from waiting
link |
So we can just wait.
link |
It's got, that's gotta end, right?
link |
And there's hints now that,
link |
that Moore's law is starting to feel some friction,
link |
starting to, the world is pushing back a little bit.
link |
One thing that I don't know, do lots of people know this?
link |
I didn't know this, I was trying to write an essay
link |
and yeah, Moore's law has been amazing
link |
and it's enabled all sorts of things,
link |
but there's also a kind of counter Moore's law,
link |
which is that the development cost
link |
for each successive generation of chips also is doubling.
link |
So it's costing twice as much money.
link |
So the amount of development money per cycle or whatever
link |
is actually sort of constant.
link |
And at some point we run out of money.
link |
So, or we have to come up with an entirely different way
link |
of doing the development process.
link |
So like, I guess I'm always a bit skeptical of the, look,
link |
it's an exponential curve, therefore it has no end.
link |
Soon the number of people going to NeurIPS
link |
will be greater than the population of the earth.
link |
That means we're gonna discover life on other planets.
link |
It means that we're in a sigmoid curve on the front half,
link |
which looks a lot like an exponential.
link |
The second half is gonna look a lot like diminishing returns.
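The point that the front half of a sigmoid looks like an exponential can be checked numerically. The sketch below assumes a standard logistic curve and compares it with the pure exponential it resembles well before the midpoint; the parameter names and default values are illustrative only.

```python
import math


def logistic(t: float, cap: float = 1.0, rate: float = 1.0, midpoint: float = 0.0) -> float:
    """S-shaped growth that saturates at `cap`."""
    return cap / (1.0 + math.exp(-rate * (t - midpoint)))


def early_exponential(t: float, cap: float = 1.0, rate: float = 1.0, midpoint: float = 0.0) -> float:
    """The exponential that the logistic curve approaches far before the midpoint."""
    return cap * math.exp(rate * (t - midpoint))


if __name__ == "__main__":
    for t in (-8, -5, -2, 0, 2, 5):
        print(t, round(logistic(t), 4), round(early_exponential(t), 4))
```

Early on the two columns are nearly identical, so data from that stretch alone cannot distinguish them; past the midpoint the exponential keeps exploding while the logistic flattens toward its cap.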
link |
Yeah, I mean, but the interesting thing about Moore's law,
link |
if you actually like look at the technologies involved,
link |
it's hundreds, if not thousands of S curves
link |
stacked on top of each other.
link |
It's not actually an exponential curve,
link |
it's constant breakthroughs.
link |
And then what becomes useful to think about,
link |
which is exactly what you're saying,
link |
the cost of development, like the size of teams,
link |
the amount of resources that are invested
link |
in continuing to find new S curves, new breakthroughs.
link |
And yeah, it's an interesting idea.
link |
If we live in the moment, if we sit here today,
link |
it seems to be the reasonable thing
link |
to say that exponentials end.
link |
And yet in the software realm,
link |
they just keep appearing to be happening.
link |
And it's so, I mean, it's so hard to disagree
link |
with Elon Musk on this.
link |
Because it like, I've, you know,
link |
I used to be one of those folks,
link |
I'm still one of those folks that studied
link |
autonomous vehicles, that's what I worked on.
link |
And it's like, you look at what Elon Musk is saying
link |
about autonomous vehicles, well, obviously,
link |
in a couple of years, or in a year, or next month,
link |
we'll have fully autonomous vehicles.
link |
Like there's no reason why we can't.
link |
Driving is pretty simple, like it's just a learning problem
link |
and you just need to convert all the driving
link |
that we're doing into data and just have a neural network
link |
that trains on that data.
link |
And like, we use only our eyes, so you can use cameras
link |
and you can train on it.
link |
And it's like, yeah, that should work.
link |
And then you put that hat on, like the philosophical hat,
link |
and but then you put the pragmatic hat and it's like,
link |
this is what the flaws of computer vision are.
link |
Like, this is what it means to train at scale.
link |
And then you put the human factors, the psychology hat on,
link |
which is like, it's actually, driving is a lot,
link |
the cognitive science or cognitive,
link |
whatever the heck you call it, it's really hard,
link |
it's much harder to drive than we realize,
link |
there's a much larger number of edge cases.
link |
So building up an intuition around this is,
link |
around exponentials is really difficult.
link |
And on top of that, the pandemic is making us think
link |
about exponentials, making us realize that like,
link |
we don't understand anything about it,
link |
we're not able to intuit exponentials,
link |
we're either ultra terrified, some part of the population
link |
and some part is like the opposite of whatever
link |
the different carefree and we're not managing it very well.
link |
Blasé, well, wow, is that French?
link |
I assume so, it's got an accent.
link |
So it's fascinating to think what the limits
link |
of this exponential growth of technology,
link |
not just Moore's law, it's technology,
link |
how that rubs up against the bitter lesson
link |
and GPT-3 and self-play mechanisms.
link |
Like it's not obvious, I used to be much more skeptical
link |
about neural networks.
link |
Now I at least give a sliver of possibility
link |
that we'll be very much surprised
link |
and also caught in a way that like,
link |
we are not prepared for.
link |
Like in applications of social networks, for example,
link |
cause it feels like really good transformer models
link |
that are able to do some kind of like very good
link |
natural language generation are the same kind of models
link |
that can be used to learn human behavior
link |
and then manipulate that human behavior
link |
to gain advertisers dollars and all those kinds of things
link |
through the capitalist system.
link |
And they arguably already are manipulating human behavior.
link |
But not for self preservation, which I think is a big,
link |
that would be a big step.
link |
Like if they were trying to manipulate us
link |
to convince us not to shut them off,
link |
I would be very freaked out.
link |
But I don't see a path to that from where we are now.
link |
They don't have any of those abilities.
link |
That's not what they're trying to do.
link |
They're trying to keep people on the site.
link |
But see the thing is, this is the thing about life on earth
link |
is they might be borrowing our consciousness
link |
and sentience like, so like in a sense they do
link |
because the creators of the algorithms have,
link |
like they're not, if you look at our body,
link |
we're not a single organism.
link |
We're a huge number of organisms
link |
with like tiny little motivations
link |
were built on top of each other.
link |
In the same sense, the AI algorithms that are,
link |
It's a system that includes companies and corporations,
link |
because corporations are funny organisms
link |
in and of themselves that really do seem
link |
to have self preservation built in.
link |
And I think that's at the design level.
link |
I think they're designed to have self preservation
link |
In that broader system that we're also a part of
link |
and can have some influence on,
link |
it is much more complicated, much more powerful.
link |
Yeah, I agree with that.
link |
So people really love it when I ask,
link |
what three books, technical, philosophical, fiction
link |
had a big impact on your life?
link |
Maybe you can recommend.
link |
We went with movies, we went with Billy Joel
link |
and I forgot what music you recommended, but.
link |
I didn't, I just said I have no taste in music.
link |
I just like pop music.
link |
That was actually really skillful
link |
the way you avoided that question.
link |
Thank you, thanks.
link |
I'm gonna try to do the same with the books.
link |
So do you have a skillful way to avoid answering
link |
the question about three books you would recommend?
link |
I'd like to tell you a story.
link |
So my first job out of college was at Bellcore.
link |
I mentioned that before, where I worked with Dave Ackley.
link |
The head of the group was a guy named Tom Landauer.
link |
And I don't know how well known he is now,
link |
but arguably he's the inventor
link |
and the first proselytizer of word embeddings.
link |
So they developed a system shortly before I got to the group
link |
that was called latent semantic analysis
link |
that would take words of English
link |
and embed them in multi hundred dimensional space
link |
and then use that as a way of assessing
link |
similarity and basically doing reinforcement learning,
link |
I'm sorry, not reinforcement, information retrieval,
link |
sort of pre Google information retrieval.
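As a rough illustration of the idea behind latent semantic analysis, the sketch below builds a tiny term-document count matrix, reduces it with a truncated SVD, and compares words by cosine similarity. The toy documents, the choice of two dimensions, and the use of raw counts are all assumptions; the Bellcore system worked with far larger corpora and hundreds of dimensions, and its details are not described in the conversation.

```python
import numpy as np

# Toy corpus (hypothetical). Note that "cat" and "dog" never share a document.
DOCS = [
    "cat pet fur",
    "dog pet fur",
    "cat pet purr",
    "dog pet bark",
    "stock market price",
    "bond market price",
    "stock bond trade",
]


def build_term_document_matrix(docs):
    """Return (vocab, counts) where counts[i, j] is how often word i occurs in doc j."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            counts[index[word], j] += 1.0
    return vocab, counts


def lsa_word_vectors(counts, k):
    """Embed each word in k dimensions using the top-k singular vectors."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, counts = build_term_document_matrix(DOCS)
row = {word: i for i, word in enumerate(vocab)}
embedded = lsa_word_vectors(counts, k=2)


def raw_similarity(w1, w2):
    """Similarity using raw document-count rows (no dimensionality reduction)."""
    return cosine(counts[row[w1]], counts[row[w2]])


def lsa_similarity(w1, w2):
    """Similarity in the reduced LSA space."""
    return cosine(embedded[row[w1]], embedded[row[w2]])


if __name__ == "__main__":
    print("raw cat/dog:", raw_similarity("cat", "dog"))
    print("lsa cat/dog:", lsa_similarity("cat", "dog"))
    print("lsa cat/stock:", lsa_similarity("cat", "stock"))
```

On raw counts, "cat" and "dog" look unrelated because they never appear together. In the reduced space they land close to each other because they keep the same company, which is the property that made this kind of embedding useful for retrieval.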
link |
And he was trained as an anthropologist,
link |
but then became a cognitive scientist.
link |
So I was in the cognitive science research group.
link |
Like I said, I'm a cognitive science groupie.
link |
At the time I thought I'd become a cognitive scientist,
link |
but then I realized in that group,
link |
no, I'm a computer scientist,
link |
but I'm a computer scientist who really loves
link |
to hang out with cognitive scientists.
link |
And he said, he studied language acquisition in particular.
link |
He said, you know, humans have about this number of words
link |
of vocabulary and most of that is learned from reading.
link |
And I said, that can't be true
link |
because I have a really big vocabulary and I don't read.
link |
He's like, you must.
link |
I'm like, I don't think I do.
link |
I mean like stop signs, I definitely read stop signs,
link |
but like reading books is not a thing that I do a lot of.
link |
Do you really though?
link |
It might be just visual, maybe the red color.
link |
Do I read stop signs?
link |
No, it's just pattern recognition at this point.
link |
I don't sound it out.
link |
I wonder what that, oh yeah, stop the guns.
link |
That's fascinating.
link |
So I don't read very, I mean, obviously I read
link |
and I've read plenty of books,
link |
but like some people like Charles,
link |
my friend Charles and others,
link |
like a lot of people in my field, a lot of academics,
link |
like reading was really a central topic to them
link |
in development and I'm not that guy.
link |
In fact, I used to joke that when I got into college,
link |
that it was on kind of a help out the illiterate
link |
kind of program because I got to,
link |
like in my house, I wasn't a particularly bad
link |
or good reader, but when I got to college,
link |
I was surrounded by these people that were just voracious
link |
in their reading appetite.
link |
And they would like, have you read this?
link |
Have you read this?
link |
Have you read this?
link |
And I'm like, no, I'm clearly not qualified
link |
to be at this school.
link |
Like there's no way I should be here.
link |
Now I've discovered books on tape, like audio books.
link |
And so I'm much better.
link |
I'm more caught up.
link |
I read a lot of books.
link |
The small tangent on that,
link |
it is a fascinating open question to me
link |
on the topic of driving.
link |
Whether, you know, supervised learning people,
link |
machine learning people think you have to like drive
link |
to learn how to drive.
link |
To me, it's very possible that just by us humans,
link |
by first of all, walking,
link |
but also by watching other people drive,
link |
not even being inside cars as a passenger,
link |
but let's say being inside the car as a passenger,
link |
but even just like being a pedestrian and crossing the road,
link |
you learn so much about driving from that.
link |
It's very possible that you can,
link |
without ever being inside of a car,
link |
be okay at driving once you get in it.
link |
Or like watching a movie, for example.
link |
I don't know, something like that.
link |
Have you taught anyone to drive?
link |
No, except myself.
link |
I have two children.
link |
And I learned a lot about car driving
link |
because my wife doesn't want to be the one in the car
link |
while they're learning.
link |
So I sit in the passenger seat and it's really scary.
link |
You know, I have wishes to live
link |
and they're figuring things out.
link |
Now, they start off very much better
link |
than I imagine like a neural network would, right?
link |
They get that they're seeing the world.
link |
They get that there's a road that they're trying to be on.
link |
They get that there's a relationship
link |
between the angle of the steering,
link |
but it takes a while to not be very jerky.
link |
And so that happens pretty quickly.
link |
Like the ability to stay in lane at speed,
link |
that happens relatively fast.
link |
It's not zero shot learning, but it's pretty fast.
link |
The thing that's remarkably hard,
link |
and this is I think partly why self driving cars are hard,
link |
is the degree to which driving
link |
is a social interaction activity.
link |
And that blew me away.
link |
I was completely unaware of it
link |
until I watched my son learning to drive.
link |
And I was realizing that he was sending signals
link |
to all the cars around him.
link |
And those in his case,
link |
he's always had social communication challenges.
link |
He was sending very mixed confusing signals
link |
to the other cars.
link |
And that was causing the other cars
link |
to drive weirdly and erratically.
link |
And there was no question in my mind
link |
that he would have an accident
link |
because they didn't know how to read him.
link |
There's things you do with the speed that you drive,
link |
the positioning of your car,
link |
that you're constantly like in the head
link |
of the other drivers.
link |
And seeing him not knowing how to do that
link |
and having to be taught explicitly,
link |
okay, you have to be thinking
link |
about what the other driver is thinking,
link |
was a revelation to me.
link |
So creating kind of theories of mind of the other.
link |
Theories of mind of the other cars.
link |
Which I just hadn't heard discussed
link |
in the self driving car talks that I've been to.
link |
Since then, there's some people who do consider
link |
those kinds of issues,
link |
but it's way more subtle than I think
link |
there's a little bit of work involved with that
link |
when you realize like when you especially focus
link |
not on other cars, but on pedestrians, for example,
link |
it's literally staring you in the face.
link |
So then when you're just like,
link |
how do I interact with pedestrians?
link |
Pedestrians, you're practically talking
link |
to an octopus at that point.
link |
They've got all these weird degrees of freedom.
link |
You don't know what they're gonna do.
link |
They can turn around any second.
link |
But the point is, we humans know what they're gonna do.
link |
Like we have a good theory of mind.
link |
We have a good mental model of what they're doing.
link |
And we have a good model of the model they have of you
link |
and the model of the model of the model.
link |
Like we're able to kind of reason about this kind of,
link |
the social like game of it all.
link |
The hope is that it's quite simple actually,
link |
that it could be learned.
link |
That's why I just talked to the Waymo.
link |
I don't know if you know that company.
link |
It's Google's self-driving car.
link |
They, I talked to their CTO about this podcast
link |
and they like, I rode in their car
link |
and it's quite aggressive and it's quite fast
link |
and it's good and it feels great.
link |
It also, just like Tesla,
link |
Waymo made me change my mind about like,
link |
maybe driving is easier than I thought.
link |
Maybe I'm just being speciesist, human centric, maybe.
link |
It's a speciesist argument.
link |
Yeah, so I don't know.
link |
But it's fascinating to think about like the same
link |
as with reading, which I think you just said.
link |
You avoided the question,
link |
though I still hope you answered it somewhat.
link |
You avoided it brilliantly.
link |
It is, there's blind spots as artificial intelligence,
link |
that artificial intelligence researchers have
link |
about what it actually takes to learn to solve a problem.
link |
That's fascinating.
link |
Have you had Anca Dragan on?
link |
She's one of my favorites.
link |
And in particular, she thinks a lot about this kind of,
link |
I know that you know that I know kind of planning.
link |
And the last time I spoke with her,
link |
she was very articulate about the ways
link |
in which self driving cars are not solved.
link |
Like what's still really, really hard.
link |
But even her intuition is limited.
link |
Like we're all like new to this.
link |
So in some sense, the Elon Musk approach
link |
of being ultra confident and just like plowing.
link |
Putting it out there.
link |
Like some people say it's reckless and dangerous and so on.
link |
But like, partly it's like, it seems to be one
link |
of the only ways to make progress
link |
in artificial intelligence.
link |
So it's, you know, these are difficult things.
link |
You know, democracy is messy.
link |
Implementation of artificial intelligence systems
link |
in the real world is messy.
link |
So many years ago, before self driving cars
link |
were an actual thing you could have a discussion about,
link |
somebody asked me, like, what if we could use
link |
that robotic technology and use it to drive cars around?
link |
Like, isn't that, aren't people gonna be killed?
link |
And then it's not, you know, blah, blah, blah.
link |
I'm like, that's not what's gonna happen.
link |
I said with confidence, incorrectly, obviously.
link |
What I think is gonna happen is we're gonna have a lot more,
link |
like a very gradual kind of rollout
link |
where people have these cars in like closed communities,
link |
right, where it's somewhat realistic,
link |
but it's still in a box, right?
link |
So that we can really get a sense of what,
link |
what are the weird things that can happen?
link |
How do we, how do we have to change the way we behave
link |
around these vehicles?
link |
Like, it's obviously requires a kind of co evolution
link |
that you can't just plop them in and see what happens.
link |
But of course, we're basically plopping them in
link |
and seeing what happens.
link |
So I was wrong, but I do think that would have been
link |
So that's, but your intuition, that's funny,
link |
just zooming out and looking at the forces of capitalism.
link |
And it seems that capitalism rewards risk takers
link |
and rewards and punishes risk takers, like,
link |
and like, try it out.
link |
The academic approach to let's try a small thing
link |
and try to understand slowly the fundamentals
link |
And let's start with one, then do two, and then see that.
link |
And then do the three, you know, the capitalist
link |
like startup entrepreneurial dream is let's build a thousand
link |
Right, and 500 of them fail, but whatever,
link |
the other 500, we learned from them.
link |
But if you're good enough, I mean, one thing is like,
link |
your intuition would say like, that's gonna be
link |
hugely destructive to everything.
link |
But actually, it's kind of the forces of capitalism,
link |
like people are quite, it's easy to be critical,
link |
but if you actually look at the data at the way
link |
our world has progressed in terms of the quality of life,
link |
it seems like the competent good people rise to the top.
link |
This is coming from me from the Soviet Union and so on.
link |
It's like, it's interesting that somebody like Elon Musk
link |
is the way you push progress in artificial intelligence.
link |
Like it's forcing Waymo to step their stuff up
link |
and Waymo is forcing Elon Musk to step up.
link |
It's fascinating, because I have this tension in my heart
link |
and just being upset by the lack of progress
link |
in autonomous vehicles within academia.
link |
So there's a huge progress in the early days
link |
of the DARPA challenges.
link |
And then it just kind of stopped like at MIT,
link |
but it's true everywhere else with an exception
link |
of a few sponsors here and there is like,
link |
it's not seen as a sexy problem, Thomas.
link |
Like the moment artificial intelligence starts approaching
link |
the problems of the real world,
link |
like academics kind of like, all right, let the...
link |
They get really hard in a different way.
link |
In a different way, that's right.
link |
I think, yeah, right, some of us are not excited
link |
about that other way.
link |
But I still think there's fundamental problems
link |
to be solved in those difficult things.
link |
It's not, it's still publishable, I think.
link |
Like we just need to, it's the same criticism
link |
you could have of all these conferences, NeurIPS, CVPR,
link |
where application papers are often as powerful
link |
and as important as like a theory paper.
link |
Even like theory just seems much more respectable and so on.
link |
I mean, machine learning community is changing
link |
that a little bit.
link |
I mean, at least in statements,
link |
but it's still not seen as the sexiest of pursuits,
link |
which is like, how do I actually make this thing
link |
work in practice as opposed to on this toy data set?
link |
All that to say, are you still avoiding
link |
the three books question?
link |
Is there something on audio book that you can recommend?
link |
Oh, yeah, I mean, yeah, I've read a lot of really fun stuff.
link |
In terms of books that I find myself thinking back on
link |
that I read a while ago,
link |
like that stood the test of time to some degree.
link |
I find myself thinking of Program or Be Programmed a lot
link |
by Douglas Rushkoff, which was,
link |
it basically put out the premise
link |
that we all need to become programmers
link |
in one form or another.
link |
And it was an analogy to once upon a time
link |
we all had to become readers.
link |
We had to become literate.
link |
And there was a time before that
link |
when not everybody was literate,
link |
but once literacy was possible,
link |
the people who were literate had more of a say in society
link |
than the people who weren't.
link |
And so we made a big effort to get everybody up to speed.
link |
And now it's not 100% universal, but it's quite widespread.
link |
Like the assumption is generally that people can read.
link |
The analogy that he makes is that programming
link |
is a similar kind of thing,
link |
that we need to have a say in, right?
link |
So being a reader, being literate, being a reader means
link |
you can receive all this information,
link |
but you don't get to put it out there.
link |
And programming is the way that we get to put it out there.
link |
And that was the argument that he made.
link |
I think he specifically has now backed away from this idea.
link |
He doesn't think it's happening quite this way.
link |
And that might be true that it didn't,
link |
society didn't sort of play forward quite that way.
link |
I still believe in the premise.
link |
I still believe that at some point,
link |
the relationship that we have to these machines
link |
and these networks has to be one of each individual
link |
can, has the wherewithal to make the machines help them.
link |
Do the things that that person wants done.
link |
And as software people, we know how to do that.
link |
And when we have a problem, we're like, okay,
link |
I'll just, I'll hack up a Perl script or something
link |
If we lived in a world where everybody could do that,
link |
that would be a better world.
link |
And computers would be, have, I think less sway over us.
link |
And other people's software would have less sway over us
link |
In some sense, software engineering, programming is power.
link |
Programming is power, right?
link |
Yeah, it's like magic.
link |
It's like magic spells.
link |
And it's not out of reach of everyone.
link |
But at the moment, it's just a sliver of the population
link |
who can commune with machines in this way.
link |
So I don't know, so that book had a big impact on me.
link |
Currently, I'm reading The Alignment Problem,
link |
actually by Brian Christian.
link |
So I don't know if you've seen this out there yet.
link |
Is this similar to Stuart Russell's work
link |
with the control problem?
link |
It's in that same general neighborhood.
link |
I mean, they have different emphases
link |
that they're concentrating on.
link |
I think Stuart's book did a remarkably good job,
link |
like just a celebratory good job
link |
at describing AI technology and sort of how it works.
link |
I thought that was great.
link |
It was really cool to see that in a book.
link |
I think he has some experience writing some books.
link |
You know, that's probably a possible thing.
link |
He's maybe thought a thing or two
link |
about how to explain AI to people.
link |
Yeah, that's a really good point.
link |
This book so far has been remarkably good
link |
at telling the story of sort of the history,
link |
the recent history of some of the things
link |
that have happened.
link |
I'm in the first third.
link |
He said this book is in three thirds.
link |
The first third is essentially AI fairness
link |
and implications of AI on society
link |
that we're seeing right now.
link |
And that's been great.
link |
I mean, he's telling the stories really well.
link |
He went out and talked to the frontline people
link |
whose names were associated with some of these ideas
link |
and it's been terrific.
link |
He says the second half of the book
link |
is on reinforcement learning.
link |
So maybe that'll be fun.
link |
And then the third half, third third,
link |
is on the super intelligence alignment problem.
link |
And I suspect that that part will be less fun
link |
Yeah, it's an interesting problem to talk about.
link |
I find it to be the most interesting,
link |
just like thinking about whether we live
link |
in a simulation or not,
link |
as a thought experiment to think about our own existence.
link |
So in the same way,
link |
talking about alignment problem with AGI
link |
is a good way to think similar
link |
to like the trolley problem with autonomous vehicles.
link |
It's a useless thing for engineering,
link |
but it's a nice little thought experiment
link |
for actually thinking about what are like
link |
our own human ethical systems, our moral systems
link |
to by thinking how we engineer these things,
link |
you start to understand yourself.
link |
So sci fi can be good at that too.
link |
So one sci fi book to recommend
link |
is Exhalation by Ted Chiang,
link |
bunch of short stories.
link |
This Ted Chiang is the guy who wrote the short story
link |
that became the movie Arrival.
link |
And all of his stories just from a,
link |
he was a computer scientist,
link |
actually he studied at Brown.
link |
And they all have this sort of really insightful bit
link |
of science or computer science that drives them.
link |
And so it's just a romp, right?
link |
To just like, he creates these artificial worlds
link |
with these by extrapolating on these ideas
link |
that we know about,
link |
but hadn't really thought through
link |
to this kind of conclusion.
link |
And so his stuff is, it's really fun to read,
link |
it's mind warping.
link |
So I'm not sure if you're familiar,
link |
I seem to mention this every other word
link |
is I'm from the Soviet Union and I'm Russian.
link |
Way too much to see us.
link |
My roots are Russian too,
link |
but a couple generations back.
link |
Well, it's probably in there somewhere.
link |
So maybe we can pull at that thread a little bit
link |
of the existential dread that we all feel.
link |
You mentioned that you,
link |
I think somewhere in the conversation you mentioned
link |
that you don't really pretty much like dying.
link |
I forget in which context,
link |
it might've been a reinforcement learning perspective.
link |
No, you know what it was?
link |
It was in teaching my kids to drive.
link |
That's how you face your mortality, yes.
link |
From a human being's perspective
link |
or from a reinforcement learning researcher's perspective,
link |
let me ask you the most absurd question.
link |
What do you think is the meaning of this whole thing?
link |
The meaning of life on this spinning rock.
link |
I mean, I think reinforcement learning researchers
link |
maybe think about this from a science perspective
link |
more often than a lot of other people, right?
link |
As a supervised learning person,
link |
you're probably not thinking about the sweep of a lifetime,
link |
but reinforcement learning agents
link |
are having little lifetimes, little weird little lifetimes.
link |
And it's hard not to project yourself
link |
into their world sometimes.
link |
But as far as the meaning of life,
link |
so when I turned 42, you may know from,
link |
that is a book I read,
link |
The Hitchhiker's Guide to the Galaxy,
link |
that that is the meaning of life.
link |
So when I turned 42, I had a meaning of life party
link |
where I invited people over
link |
and everyone shared their meaning of life.
link |
We had slides made up.
link |
And so we all sat down and did a slide presentation
link |
to each other about the meaning of life.
link |
And mine was balance.
link |
I think that life is balance.
link |
And so the activity at the party,
link |
for a 42 year old, maybe this is a little bit nonstandard,
link |
but I found all the little toys and devices that I had
link |
where you had to balance on them.
link |
You had to like stand on it and balance,
link |
or a pogo stick I brought,
link |
a RipStik, which is like a weird two wheeled skateboard.
link |
I got a unicycle, but I didn't know how to do it.
link |
I would love watching you try.
link |
Yeah, I'll send you a video.
link |
I'm not great, but I managed.
link |
And so balance, yeah.
link |
So my wife has a really good one that she sticks to
link |
and is probably pretty accurate.
link |
And it has to do with healthy relationships
link |
with people that you love and working hard for good causes.
link |
But to me, yeah, balance, balance in a word.
link |
That works for me.
link |
Not too much of anything,
link |
because too much of anything is iffy.
link |
That feels like a Rolling Stones song.
link |
I feel like they must be.
link |
You can't always get what you want,
link |
but if you try sometimes, you can strike a balance.
link |
Yeah, I think that's how it goes, Michael.
link |
I'll write you a parody.
link |
It's a huge honor to talk to you.
link |
This is really fun.
link |
Oh, no, the honor's mine.
link |
I've been a big fan of yours,
link |
so can't wait to see what you do next
link |
in the world of education, in the world of parody,
link |
in the world of reinforcement learning.
link |
Thanks for talking to me.
link |
Thank you for listening to this conversation
link |
with Michael Littman, and thank you to our sponsors,
link |
SimpliSafe, a home security company I use
link |
to monitor and protect my apartment, ExpressVPN,
link |
the VPN I've used for many years
link |
to protect my privacy on the internet,
link |
MasterClass, online courses that I enjoy
link |
from some of the most amazing humans in history,
link |
and BetterHelp, online therapy with a licensed professional.
link |
Please check out these sponsors in the description
link |
to get a discount and to support this podcast.
link |
If you enjoy this thing, subscribe on YouTube,
link |
review it with five stars on Apple Podcasts,
link |
follow on Spotify, support it on Patreon,
link |
or connect with me on Twitter at Lex Fridman.
link |
And now, let me leave you with some words
link |
from Groucho Marx.
link |
If you're not having fun, you're doing something wrong.
link |
Thank you for listening, and hope to see you next time.