
Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9



link |
00:00:00.000
The following is a conversation with Stuart Russell. He's a professor of computer science at
link |
00:00:04.720
UC Berkeley and a coauthor of a book that introduced me and millions of other people
link |
00:00:10.240
to the amazing world of AI, called Artificial Intelligence: A Modern Approach. So it was an
link |
00:00:16.720
honor for me to have this conversation as part of the MIT course on artificial general intelligence
link |
00:00:23.120
and the artificial intelligence podcast. If you enjoy it, please subscribe on YouTube,
link |
00:00:28.560
iTunes or your podcast provider of choice, or simply connect with me on Twitter at Lex Fridman
link |
00:00:34.320
spelled F R I D. And now here's my conversation with Stuart Russell.
link |
00:00:41.440
So you've mentioned that in 1975, in high school, you created one of your first AI programs
link |
00:00:47.600
that played chess. Were you ever able to build a program that beat you at chess or another board
link |
00:00:57.360
game? So my program never beat me at chess. I actually wrote the program at Imperial College.
link |
00:01:06.880
So I used to take the bus every Wednesday with a box of cards this big and shove them into the
link |
00:01:14.400
card reader. And they gave us eight seconds of CPU time. It took about five seconds to read the cards
link |
00:01:21.440
in and compile the code. So we had three seconds of CPU time, which was enough to make one move,
link |
00:01:28.080
you know, with a not very deep search. And then we would print that move out and then
link |
00:01:32.080
we'd have to go to the back of the queue and wait to feed the cards in again.
link |
00:01:35.840
How deep was the search? Are we talking about one move, two moves, three moves?
link |
00:01:39.760
No, I think we got to an eight move look ahead, a depth eight search with alpha beta. And we had some tricks of our
link |
00:01:48.160
own about move ordering and some pruning of the tree. But you were still able to beat that program?
link |
00:01:55.120
Yeah, yeah. I was a reasonable chess player in my youth. I did an Othello program and a
link |
00:02:01.840
backgammon program. So when I got to Berkeley, I worked a lot on what we call meta reasoning,
link |
00:02:08.640
which really means reasoning about reasoning. And in the case of a game playing program,
link |
00:02:14.240
you need to reason about what parts of the search tree you're actually going to explore because the
link |
00:02:19.040
search tree is enormous, bigger than the number of atoms in the universe. And the way programs
link |
00:02:27.840
succeed and the way humans succeed is by only looking at a small fraction of the search tree.
link |
00:02:33.280
And if you look at the right fraction, you play really well. If you look at the wrong fraction,
link |
00:02:37.760
if you waste your time thinking about things that are never going to happen,
link |
00:02:41.600
moves that no one's ever going to make, then you're going to lose because you won't be able
link |
00:02:46.480
to figure out the right decision. So that question of how machines can manage their own computation,
link |
00:02:53.920
how they decide what to think about, is the meta reasoning question. And we developed some methods
link |
00:03:00.720
for doing that. And very simply, the machine should think about whatever thoughts are going
link |
00:03:07.040
to improve its decision quality. We were able to show that both for Othello, which is a standard
link |
00:03:13.840
two player game, and for Backgammon, which includes dice rolls, so it's a two player game
link |
00:03:19.680
with uncertainty. For both of those cases, we could come up with algorithms that were actually
link |
00:03:25.600
much more efficient than the standard alpha beta search, which chess programs at the time were
link |
00:03:31.760
using. And those programs could beat me. And I think you can see the same basic ideas in
link |
00:03:42.000
AlphaGo and AlphaZero today. The way they explore the tree is using a form of meta reasoning to select
link |
00:03:51.600
what to think about based on how useful it is to think about it. Are there any insights you can
link |
00:03:57.360
describe, without Greek symbols, of how we select which paths to go down? There's really
link |
00:04:04.720
two kinds of learning going on. So as you say, AlphaGo learns to evaluate board positions. So
link |
00:04:11.280
it can look at a go board. And it actually has probably a superhuman ability to instantly tell
link |
00:04:19.760
how promising that situation is. To me, the amazing thing about AlphaGo is not that it can
link |
00:04:28.240
be the world champion with its hands tied behind its back, but the fact that if you stop it from
link |
00:04:36.960
searching altogether, so you say, okay, you're not allowed to do any thinking ahead. You can just
link |
00:04:42.160
consider each of your legal moves and then look at the resulting situation and evaluate it. So
link |
00:04:48.240
what we call a depth one search. So just the immediate outcome of your moves and decide if
link |
00:04:53.760
that's good or bad. That version of AlphaGo can still play at a professional level.
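As a rough illustration of that depth one player, here is a minimal sketch in Python, assuming hypothetical engine interfaces (legal_moves, apply_move) and a learned evaluator (value_fn); this is the shape of the idea, not DeepMind's actual code:

```python
# Minimal sketch of a "depth one" player: evaluate the position reached by each
# legal move with a learned value function and pick the best, with no deeper search.
# legal_moves, apply_move, and value_fn are hypothetical stand-ins for a real engine.

def depth_one_policy(state, legal_moves, apply_move, value_fn):
    """Choose the move whose immediate successor position evaluates best."""
    best_move, best_value = None, float("-inf")
    for move in legal_moves(state):
        successor = apply_move(state, move)   # position after playing the move
        v = value_fn(successor)               # learned estimate of how good it is
        if v > best_value:
            best_move, best_value = move, v
    return best_move
```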
link |
00:05:02.000
And human professionals are sitting there for five, 10 minutes deciding what to do and AlphaGo
link |
00:05:06.960
in less than a second can instantly intuit what is the right move to make based on its ability to
link |
00:05:14.800
evaluate positions. And that is remarkable because we don't have that level of intuition about Go.
link |
00:05:23.280
We actually have to think about the situation. So anyway, that capability that Alpha Go has is one
link |
00:05:31.680
big part of why it beats humans. The other big part is that it's able to look ahead 40, 50, 60 moves
link |
00:05:41.520
into the future. And if it was considering all possibilities, 40 or 50 or 60 moves into the
link |
00:05:49.840
future, that would be 10 to the 200 possibilities. So way more than atoms in the universe and so on.
link |
00:06:01.360
So it's very, very selective about what it looks at. So let me try to give you an intuition about
link |
00:06:08.800
how you decide what to think about. It's a combination of two things. One is how promising
link |
00:06:14.800
it is. So if you're already convinced that a move is terrible, there's no point spending a lot more
link |
00:06:22.560
time convincing yourself that it's terrible because it's probably not going to change your mind. So
link |
00:06:28.800
the real reason you think is because there's some possibility of changing your mind about what to do.
link |
00:06:34.400
And it's that changing your mind that would result then in a better final action in the real world.
link |
00:06:40.960
So that's the purpose of thinking is to improve the final action in the real world. So if you
link |
00:06:47.920
think about a move that is guaranteed to be terrible, you can convince yourself it's terrible,
link |
00:06:53.440
you're still not going to change your mind. But on the other hand, suppose you had a choice between
link |
00:06:59.280
two moves. One of them you've already figured out is guaranteed to be a draw, let's say. And then
link |
00:07:05.040
the other one looks a little bit worse. It looks fairly likely that if you make that move, you're
link |
00:07:10.000
going to lose. But there's still some uncertainty about the value of that move. There's still some
link |
00:07:16.640
possibility that it will turn out to be a win. Then it's worth thinking about that. So even though
link |
00:07:22.080
it's less promising on average than the other move, which is guaranteed to be a draw,
link |
00:07:27.840
there's still some
link |
00:07:32.160
purpose in thinking about it because there's a chance that you will change your mind and discover
link |
00:07:36.800
that in fact it's a better move. So it's a combination of how good the move appears to be
link |
00:07:42.720
and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking
link |
00:07:48.640
about because there's a higher upside if you want to think of it that way.
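One common way to turn that intuition into a concrete rule is an upper-confidence-style score that adds an uncertainty bonus to the current value estimate, in the spirit of the selection step in Monte Carlo tree search. The sketch below is illustrative only, not the exact formula used by AlphaGo or in Russell's metareasoning work:

```python
import math

# Pick which candidate move to think about next by combining the current value
# estimate ("how promising it is") with an uncertainty bonus ("how much thinking
# might change your mind"). Illustrative only.

def select_move_to_ponder(stats, total_thoughts, c=1.4):
    """stats: {move: (value_estimate, times_considered)}."""
    def score(move):
        value, n = stats[move]
        uncertainty_bonus = c * math.sqrt(math.log(total_thoughts + 1) / (n + 1))
        return value + uncertainty_bonus
    return max(stats, key=score)

# Example: a move known to draw (well explored) loses out to a slightly
# worse-looking but barely explored move with a large uncertainty bonus.
stats = {"safe_draw": (0.50, 200), "risky_line": (0.45, 5)}
print(select_move_to_ponder(stats, total_thoughts=205))  # -> "risky_line"
```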
link |
00:07:52.240
And of course in the beginning, especially in the AlphaGo Zero formulation, everything is shrouded
link |
00:07:59.760
in uncertainty. So you're really swimming in a sea of uncertainty. So it benefits you to,
link |
00:08:07.600
I mean, actually following the same process as you described, but because you're so uncertain
link |
00:08:11.120
about everything, you basically have to try a lot of different directions.
link |
00:08:15.360
Yeah. So the early parts of the search tree are fairly bushy, in that it will look at a lot
link |
00:08:22.480
of different possibilities, but fairly quickly, the degree of certainty about some of the moves,
link |
00:08:27.840
I mean, if a move is really terrible, you'll pretty quickly find out, right? You lose half
link |
00:08:32.000
your pieces or half your territory and then you'll say, okay, this is not worth thinking
link |
00:08:37.280
about anymore. And then further down, the tree becomes very long and narrow and you're following
link |
00:08:45.360
various lines of play, 10, 20, 30, 40, 50 moves into the future. And that again is something that
link |
00:08:55.280
human beings have a very hard time doing mainly because they just lack the short term memory.
link |
00:09:02.000
You just can't remember a sequence of moves that's 50 moves long. And you can't imagine
link |
00:09:08.960
the board correctly for that many moves into the future.
link |
00:09:13.040
Of course, the top players, I'm much more familiar with chess, but the top players probably have
link |
00:09:19.680
echoes of the same kind of intuition and instinct that, in a moment's time, AlphaGo applies
link |
00:09:26.720
when they see a board. I mean, they've seen those patterns, human beings have seen those patterns
link |
00:09:31.600
before at the top, at the grandmaster level. It seems that there is some similarities or maybe
link |
00:09:41.360
it's our imagination creates a vision of those similarities, but it feels like this kind of
link |
00:09:47.360
pattern recognition that the AlphaGo approaches are using is similar to what human beings at the
link |
00:09:53.920
top level are using.
link |
00:09:55.360
I think there's, there's some truth to that, but not entirely. Yeah. I mean, I think the,
link |
00:10:03.040
the extent to which a human grandmaster can reliably instantly recognize the right move
link |
00:10:10.720
and instantly recognize the value of the position. I think that's a little bit overrated.
link |
00:10:15.840
But if you sacrifice a queen, for example, I mean, there are these beautiful games of
link |
00:10:20.480
chess with Bobby Fischer, or somebody, where he's seemingly making a bad move. And I'm not sure
link |
00:10:28.400
there's a perfect degree of calculation involved where they've calculated all the possible things
link |
00:10:34.720
that happen, but there's an instinct there, right? That somehow adds up to
link |
00:10:40.640
Yeah. So I think what happens is you get a sense that there's some possibility in the
link |
00:10:46.160
position, even if you make a weird looking move, that it opens up some lines of
link |
00:10:56.080
calculation that otherwise would be definitely bad. And it's that intuition that there's
link |
00:11:05.040
something here in this position that might yield a win.
link |
00:11:10.880
And then you follow that, right? And in some sense, when a chess player is
link |
00:11:16.080
following a line, in his or her mind, they're mentally simulating what the other person
link |
00:11:23.440
is going to do, what the opponent is going to do. And they can do that as long as the moves are kind
link |
00:11:29.200
of forced, right? As long as there's, you know, what we call a forcing variation
link |
00:11:34.640
where the opponent doesn't really have much choice about how to respond. And then you follow that,
link |
00:11:39.120
and you see if you can force them into a situation where you win.
link |
00:11:43.520
You know, we see plenty of mistakes even, even in grandmaster games where they just miss some
link |
00:11:51.920
simple three, four, five move combination that, you know, wasn't particularly apparent in,
link |
00:11:58.560
in the position, but was still there. That's the thing that makes us human.
link |
00:12:02.560
Yeah. So you mentioned that in Othello, those programs, after some meta reasoning
link |
00:12:09.680
improvements and research, were able to beat you. How did that make you feel?
link |
00:12:14.960
Part of the meta reasoning capability that it had was based on learning, and you could
link |
00:12:23.680
sit down the next day and you could just feel that it had got a lot smarter, you know, and all of a
link |
00:12:30.240
sudden you really felt like you're sort of pressed against the wall because it was much more
link |
00:12:37.280
aggressive and was totally unforgiving of any minor mistake that you might make. And
link |
00:12:43.440
actually it seemed to understand the game better than I did. And Garry Kasparov has this quote where
link |
00:12:52.000
during his match against Deep Blue, he said, he suddenly felt that there was a new kind of
link |
00:12:56.880
intelligence across the board. Do you think that's a scary or an exciting
link |
00:13:03.120
possibility for Kasparov and for yourself, in the context of chess, purely sort of
link |
00:13:10.320
in this, like, that feeling, whatever that is? I think it's definitely an exciting feeling.
link |
00:13:17.680
You know, this is what made me work on AI in the first place was as soon as I really understood
link |
00:13:23.680
what a computer was, I wanted to make it smart. You know, I started out with the first program
link |
00:13:29.920
I wrote was for the Sinclair programmable calculator. And I think you could write a
link |
00:13:35.680
21 step algorithm. That was the biggest program you could write, something like that. And do
link |
00:13:42.800
little arithmetic calculations. So I think I implemented Newton's method for square
link |
00:13:48.080
roots and a few other things like that. But then, you know, I thought, okay, if I just had more
link |
00:13:54.240
space, I could make this thing intelligent. And so I started thinking about AI and,
link |
00:14:04.880
and I think the thing that's scary is not the chess program
link |
00:14:11.280
because, you know, chess programs, they're not in the taking over the world business.
link |
00:14:19.520
But if you extrapolate, you know, there are things about chess that don't resemble
link |
00:14:29.040
the real world, right? We know, we know the rules of chess.
link |
00:14:35.120
The chess board is completely visible to the program, whereas of course the real world is not.
link |
00:14:40.720
Most of the real world is not visible from wherever you're sitting, so to speak.
link |
00:14:47.520
And to overcome those kinds of problems, you need qualitatively different algorithms. Another thing
link |
00:14:56.720
about the real world is that, you know, we regularly plan ahead on timescales involving
link |
00:15:05.520
billions or trillions of steps. Now we don't plan those in detail, but you know, when you
link |
00:15:12.400
choose to do a PhD at Berkeley, that's a five year commitment and that amounts to about a trillion
link |
00:15:19.680
motor control steps that you will eventually be committed to. Including going up the stairs,
link |
00:15:26.240
opening doors, drinking water. Yeah. I mean, every finger movement while you're typing,
link |
00:15:32.240
every character of every paper and the thesis and everything. So you're not committing in
link |
00:15:36.240
advance to the specific motor control steps, but you're still reasoning on a timescale that
link |
00:15:41.600
will eventually reduce to trillions of motor control actions. And so for all of these reasons,
link |
00:15:52.160
you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity,
link |
00:15:58.080
but they are a step towards it, right? And progress in AI occurs by essentially removing
link |
00:16:07.040
one by one these assumptions that make problems easy. Like the assumption of complete observability
link |
00:16:14.640
of the situation, right? We remove that assumption, you need a much more complicated
link |
00:16:20.160
kind of computing design. It needs something that actually keeps track of all the
link |
00:16:25.120
things you can't see and tries to estimate what's going on. And there's inevitable uncertainty
link |
00:16:31.040
in that. So it becomes a much more complicated problem. But, you know, we are removing those
link |
00:16:36.880
assumptions. We are starting to have algorithms that can cope with much longer timescales,
link |
00:16:42.320
that can cope with uncertainty, that can cope with partial observability.
link |
00:16:47.520
And so each of those steps sort of magnifies by a thousand the range of things that we can
link |
00:16:54.240
do with AI systems. So the way I started in AI, I wanted to be a psychiatrist for a long time. I
link |
00:16:58.880
wanted to understand the mind in high school, and of course to program, and so on. And I showed up
link |
00:17:04.960
at the University of Illinois, to an AI lab, and they said, okay, I don't have time for you,
link |
00:17:10.160
but here's a book, AI: A Modern Approach. I think it was the first edition at the time.
link |
00:17:16.640
Here, go, go, go learn this. And I remember the lay of the land was, well, it's incredible that
link |
00:17:22.080
we solved chess, but we'll never solve Go. I mean, it was pretty certain that Go,
link |
00:17:27.360
in the way we thought about systems that reason, wasn't possible to solve. And now we've solved
link |
00:17:33.440
this. So it's a very... Well, I think I would have said that it's unlikely we could take
link |
00:17:39.840
the kind of algorithm that was used for chess and just get it to scale up and work well for go.
link |
00:17:46.480
And at the time what we thought was that in order to solve Go, we would have to do something similar
link |
00:17:56.800
to the way humans manage the complexity of Go, which is to break it down into kind of subgames.
link |
00:18:02.800
So when a human thinks about a go board, they think about different parts of the board as sort
link |
00:18:08.320
of weakly connected to each other. And they think about, okay, within this part of the board, here's
link |
00:18:13.280
how things could go; in that part of the board, here's how things could go. And then you try to sort of
link |
00:18:18.000
couple those two analyses together and deal with the interactions and maybe revise your views of
link |
00:18:24.000
how things are going to go in each part. And then you've got maybe five, six, seven, ten parts of
link |
00:18:28.640
the board. And that actually resembles the real world much more than chess does because in the
link |
00:18:38.160
real world, we have work, we have home life, we have sport, different kinds of activities,
link |
00:18:46.880
shopping, these all are connected to each other, but they're weakly connected. So when I'm typing
link |
00:18:54.560
a paper, I don't simultaneously have to decide in which order I'm going to get the milk and the
link |
00:19:01.280
butter, that doesn't affect the typing. But I do need to realize, okay, I better finish this
link |
00:19:08.240
before the shops close because I don't have anything, I don't have any food at home. So
link |
00:19:12.320
there's some weak connection, but not in the way that chess works where everything is tied into a
link |
00:19:19.040
single stream of thought. So the thought was that to solve Go, we'd have to make progress on stuff
link |
00:19:26.080
that would be useful for the real world. And in a way, AlphaGo is a little bit disappointing,
link |
00:19:29.520
right? Because the program design for AlphaGo is actually not that different from Deep Blue
link |
00:19:39.680
or even from Arthur Samuel's checker playing program from the 1950s. And in fact, the two
link |
00:19:48.160
things that make AlphaGo work are, one, this amazing ability to evaluate the positions,
link |
00:19:53.360
and the other is the meta reasoning capability, which allows it to
link |
00:19:57.520
explore some paths in the tree very deeply and to abandon other paths very quickly.
link |
00:20:04.400
So this word meta reasoning, while technically correct, inspires perhaps the wrong degree of
link |
00:20:14.160
power that AlphaGo has, for example, the word reasoning is a powerful word. So let me ask you,
link |
00:20:19.280
sort of, you were part of the symbolic AI world for a while, like AI was. There's a lot of
link |
00:20:27.760
excellent, interesting ideas there that unfortunately met a winter. And so do you think it reemerges?
link |
00:20:38.320
So I would say, yeah, it's not quite as simple as that. So the AI winter,
link |
00:20:51.440
or the first winter that was actually named as such, was the one in the late 80s.
link |
00:20:51.440
And that came about because in the mid 80s, there was really a concerted attempt to push AI
link |
00:21:01.520
out into the real world using what was called expert system technology. And for the most part,
link |
00:21:09.120
that technology was just not ready for primetime. They were trying, in many cases, to do a form of
link |
00:21:17.040
uncertain reasoning, judgment, combinations of evidence, diagnosis, those kinds of things,
link |
00:21:24.480
which was simply invalid. And when you try to apply invalid reasoning methods to real problems,
link |
00:21:31.600
you can fudge it for small versions of the problem. But when it starts to get larger,
link |
00:21:36.720
the thing just falls apart. So many companies found that the stuff just didn't work, and they
link |
00:21:44.240
were spending tons of money on consultants to try to make it work. And there were other
link |
00:21:50.400
practical reasons, like they were asking the companies to buy incredibly expensive
link |
00:21:56.160
Lisp machine workstations, which were literally between $50,000 and $100,000 in 1980s money,
link |
00:22:06.080
which would be like between $150,000 and $300,000 per workstation in current prices.
link |
00:22:13.920
And then the bottom line, they weren't seeing a profit from it.
link |
00:22:17.280
Yeah, in many cases. I think there were some successes, there's no doubt about that. But
link |
00:22:23.920
people, I would say, overinvested. Every major company was starting an AI department, just like
link |
00:22:30.800
now. And I worry a bit that we might see similar disappointments, not because the current technology
link |
00:22:40.000
is invalid, but it's limited in its scope. And it's almost the dual of the scope problems that
link |
00:22:51.280
expert systems had. So what have you learned from that hype cycle? And what can we do to
link |
00:22:56.720
prevent another winter, for example? Yeah, so when I'm giving talks these days,
link |
00:23:02.800
that's one of the warnings that I give. So this is a two part warning slide. One is that rather
link |
00:23:11.120
than data being the new oil, data is the new snake oil. That's a good line. And then the other
link |
00:23:18.880
is that we might see a kind of very visible failure in some of the major application areas. And I think
link |
00:23:30.800
self driving cars would be the flagship. And I think when you look at the history,
link |
00:23:40.000
so the first self driving car was on the freeway, driving itself, changing lanes, overtaking in 1987.
link |
00:23:52.000
And so it's more than 30 years. And that kind of looks like where we are today, right? You know,
link |
00:23:59.040
prototypes on the freeway, changing lanes and overtaking. Now, I think progress
link |
00:24:05.920
has been made, particularly on the perception side. So we worked a lot on autonomous vehicles
link |
00:24:12.400
in the early mid 90s at Berkeley. And we had our own big demonstrations. We put congressmen into
link |
00:24:21.680
self driving cars and had them zooming along the freeway. And the problem was clearly perception.
link |
00:24:30.640
At the time, the problem was perception. Yeah. So in simulation, with perfect perception,
link |
00:24:36.160
you could actually show that you can drive safely for a long time, even if the other cars are
link |
00:24:40.640
misbehaving and so on. But simultaneously, we worked on machine vision for detecting cars and
link |
00:24:48.800
tracking pedestrians and so on. And we couldn't get
link |
00:24:56.880
the reliability of detection and tracking
link |
00:25:03.120
up to a high enough level, particularly in bad weather conditions, nighttime,
link |
00:25:10.800
rainfall. Good enough for demos, but perhaps not good enough to cover the general operation.
link |
00:25:15.920
Yeah. So the thing about driving is, you know, suppose you're a taxi driver, you know,
link |
00:25:19.680
and you drive every day, eight hours a day for 10 years, right? That's 100 million seconds of
link |
00:25:25.200
driving, you know, and any one of those seconds, you can make a fatal mistake. So you're talking
link |
00:25:30.560
about eight nines of reliability, right? Now, if your vision system only detects 98.3% of the
link |
00:25:40.080
vehicles, right, then that's sort of, you know, one and a bit nines of reliability. So you have
link |
00:25:47.200
another seven orders of magnitude to go. And this is what people don't understand. They think,
link |
00:25:54.560
oh, because I had a successful demo, I'm pretty much done. But you're not even within seven orders
link |
00:26:01.440
of magnitude of being done. And that's the difficulty. And it's not the, can I follow a
link |
00:26:09.760
white line? That's not the problem, right? We follow a white line all the way across the country.
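For anyone who wants to check the arithmetic behind those reliability figures, here is a rough back-of-the-envelope version; the 8 hours a day, 10 years, and 98.3% detection rate are the numbers quoted above, and everything else is approximation:

```python
import math

# Rough check of the numbers quoted above (illustrative only).
seconds_of_driving = 8 * 3600 * 365 * 10        # 8 hours/day for 10 years
print(f"{seconds_of_driving:.2e} seconds")      # ~1.05e8, i.e. about 100 million

# Allowing at most ~1 fatal mistake over that span means a per-second failure
# rate around 1e-8: "eight nines" of reliability.
required_failure_rate = 1 / seconds_of_driving

# A vision system that detects 98.3% of vehicles misses 1.7% of them,
# i.e. "one and a bit nines" (-log10(0.017) is about 1.8).
miss_rate = 1 - 0.983

gap = math.log10(miss_rate / required_failure_rate)
print(f"~{gap:.1f} orders of magnitude short")  # roughly 6-7, matching the point made above
```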
link |
00:26:16.640
But it's the weird stuff that happens. It's all the edge cases, yeah.
link |
00:26:22.160
The edge cases, other drivers doing weird things. You know, so if you talk to Google, right, so
link |
00:26:29.200
they had actually a very classical architecture where, you know, you had machine vision which
link |
00:26:35.600
would detect all the other cars and pedestrians and the white lines and the road signs. And then
link |
00:26:41.440
basically that was fed into a logical database. And then you had a classical 1970s rule based
link |
00:26:48.880
expert system telling you, okay, if you're in the middle lane and there's a bicyclist in the right
link |
00:26:55.360
lane who is signaling this, then you do that, right? And what they found was that every day
link |
00:27:02.640
they'd go out and there'd be another situation that the rules didn't cover. You know, so they'd
link |
00:27:06.560
come to a traffic circle and there's a little girl riding her bicycle the wrong way around
link |
00:27:10.880
the traffic circle. Okay, what do you do? We don't have a rule. Oh my God. Okay, stop.
link |
00:27:14.720
And then, you know, they come back and add more rules and they just found that this was not really
link |
00:27:20.560
converging. And if you think about it, right, how do you deal with an unexpected situation,
link |
00:27:28.240
meaning one that you've never previously encountered and the sort of reasoning required
link |
00:27:35.600
to figure out the solution for that situation has never been done. It doesn't match any previous
link |
00:27:41.200
situation in terms of the kind of reasoning you have to do. Well, you know, in chess programs,
link |
00:27:46.560
this happens all the time, right? You're constantly coming up with situations you haven't
link |
00:27:51.280
seen before and you have to reason about them and you have to think about, okay, here are the
link |
00:27:56.480
possible things I could do. Here are the outcomes. Here's how desirable the outcomes are and then
link |
00:28:01.680
pick the right one. You know, in the 90s, we were saying, okay, this is how you're going to have to
link |
00:28:05.440
do automated vehicles. They're going to have to have a look ahead capability, but the look ahead
link |
00:28:10.880
for driving is more difficult than it is for chess because there's humans and they're less
link |
00:28:18.000
predictable than chess pieces. Well, then you have an opponent in chess who's also somewhat
link |
00:28:23.840
unpredictable. But for example, in chess, you always know the opponent's intention. They're
link |
00:28:29.920
trying to beat you, right? Whereas in driving, you don't know is this guy trying to turn left
link |
00:28:36.000
or has he just forgotten to turn off his turn signal or is he drunk or is he changing the
link |
00:28:42.000
channel on his radio or whatever it might be. You've got to try and figure out the mental state,
link |
00:28:47.520
the intent of the other drivers to forecast the possible evolutions of their trajectories.
link |
00:28:54.880
And then you've got to figure out, okay, which is the trajectory for me that's going to be safest.
link |
00:29:00.400
And those all interact with each other because the other drivers are going to react to your
link |
00:29:04.720
trajectory and so on. So, you know, they've got the classic merging onto the freeway problem where
link |
00:29:10.640
you're kind of racing a vehicle that's already on the freeway and you're going to pull ahead of
link |
00:29:15.520
them or you're going to let them go first and pull in behind and you get this sort of uncertainty
link |
00:29:19.920
about who's going first. So all those kinds of things mean that you need a decision making
link |
00:29:29.440
architecture that's very different from either a rule based system or it seems to me kind of an
link |
00:29:37.200
end to end neural network system. So just as AlphaGo is pretty good when it doesn't do any
link |
00:29:43.840
look ahead, but it's way, way, way, way better when it does, I think the same is going to be
link |
00:29:49.920
true for driving. You can have a driving system that's pretty good when it doesn't do any look
link |
00:29:55.120
ahead, but that's not good enough. And we've already seen multiple deaths caused by poorly
link |
00:30:03.440
designed machine learning algorithms that don't really understand what they're doing.
link |
00:30:09.360
Yeah. On several levels, I think on the perception side, there's mistakes being made by those
link |
00:30:16.480
algorithms where the perception is very shallow. On the planning side, the look ahead, like you
link |
00:30:21.200
said, and the thing that we come up against that's really interesting when you try to deploy systems
link |
00:30:31.200
in the real world is you can't think of an artificial intelligence system as a thing that
link |
00:30:36.160
responds to the world always. You have to realize that it's an agent that others will respond to as
link |
00:30:41.440
well. So in order to drive successfully, you can't just try to do obstacle avoidance.
link |
00:30:47.680
Right. You can't pretend that you're invisible, right? You're the invisible car.
link |
00:30:51.920
Right. It doesn't work that way.
link |
00:30:53.440
I mean, but you have to assert yourself, yet others have to be scared of you. We're all in
link |
00:30:58.320
this tension, this game. So, there's a lot of work with pedestrians:
link |
00:31:04.080
if you approach pedestrians as purely an obstacle avoidance problem, so you're doing look ahead as in
link |
00:31:10.000
modeling the intent, they're going to take advantage of you. They're
link |
00:31:15.200
not going to respect you at all. There has to be a tension, a fear, some amount of uncertainty.
link |
00:31:21.040
That's how we have created.
link |
00:31:24.160
Or at least just a kind of a resoluteness. You have to display a certain amount of
link |
00:31:29.760
resoluteness. You can't be too tentative. And yeah, so the solutions then become
link |
00:31:39.120
pretty complicated, right? You get into game theoretic analyses. And so at Berkeley now,
link |
00:31:46.000
we're working a lot on this kind of interaction between machines and humans.
link |
00:31:51.440
And that's exciting.
link |
00:31:53.200
And so my colleague, Anca Dragan, actually, if you formulate the problem game theoretically,
link |
00:32:04.400
you just let the system figure out the solution. It does interesting unexpected things. Like
link |
00:32:10.080
sometimes at a stop sign, if no one is going first, the car will actually back up a little,
link |
00:32:18.640
right? And just to indicate to the other cars that they should go. And that's something it
link |
00:32:23.680
invented entirely by itself. We didn't say this is the language of communication at stop signs.
link |
00:32:29.920
It figured it out.
link |
00:32:30.720
That's really interesting. So let me just step back for a second to this beautiful
link |
00:32:38.960
philosophical notion. So Pamela McCorduck in 1979 wrote, AI began with the ancient wish to
link |
00:32:47.040
forge the gods. So when you think about the history of our civilization, do you think
link |
00:32:53.840
that there is an inherent desire to create, let's not say gods, but to create superintelligence?
link |
00:33:01.520
Is it inherent to us? Is it in our genes? That the natural arc of human civilization is to create
link |
00:33:11.280
things that are of greater and greater power and perhaps echoes of ourselves. So to create the gods
link |
00:33:19.200
as Pamela said. Maybe. I mean, we're all individuals, but certainly we see over and over
link |
00:33:32.080
again in history, individuals who thought about this possibility. Hopefully I'm not being too
link |
00:33:40.240
philosophical here, but if you look at the arc of this, where this is going and we'll talk about AI
link |
00:33:47.440
safety, we'll talk about greater and greater intelligence. Do you see that there? When you
link |
00:33:54.320
created the Othello program and you felt this excitement, what was that excitement? Was it
link |
00:33:59.680
excitement of a tinkerer who created something cool like a clock? Or was there a magic or was
link |
00:34:07.680
it more like a child being born? Yeah. So I mean, I certainly understand that viewpoint. And if you
link |
00:34:14.320
look at the Lighthill report, which was, so in the 70s, there was a lot of controversy in the UK
link |
00:34:23.520
about AI and whether it was for real and how much money the government should invest. And
link |
00:34:32.320
there was a long story, but the government commissioned a report by Lighthill, who was a
link |
00:34:39.040
physicist, and he wrote a very damning report about AI, which I think was the point. And he
link |
00:34:48.800
said that these are frustrated men who, unable to have children, would like to create
link |
00:34:59.200
a life as a kind of replacement, which I think is really pretty unfair. But there is a kind of magic,
link |
00:35:17.360
I would say, when you build something and what you're building in is really just, you're building
link |
00:35:28.000
in some understanding of the principles of learning and decision making. And to see those
link |
00:35:35.200
principles actually then turn into intelligent behavior in specific situations, it's an
link |
00:35:45.600
incredible thing. And that is naturally going to make you think, okay, where does this end?
link |
00:36:00.000
And so there are magical, optimistic views of where it ends, whatever your view of optimism is,
link |
00:36:08.240
whatever your view of utopia is, it's probably different for everybody. But you've often talked
link |
00:36:13.280
about concerns you have of how things may go wrong. So I've talked to Max Tegmark. There's a
link |
00:36:26.000
lot of interesting ways to think about AI safety. You're one of the seminal people thinking about
link |
00:36:33.280
this problem amongst sort of being in the weeds of actually solving specific AI problems. You're
link |
00:36:39.440
also thinking about the big picture of where are we going? So can you talk about several elements
link |
00:36:44.800
of it? Let's just talk about maybe the control problem. So this idea of losing ability to control
link |
00:36:52.800
the behavior in our AI system. So how do you see that? How do you see that coming about?
link |
00:37:00.000
What do you think we can do to manage it?
link |
00:37:04.480
Well, so it doesn't take a genius to realize that if you make something that's smarter than you,
link |
00:37:09.280
you might have a problem. Alan Turing wrote about this and gave lectures about this in 1951.
link |
00:37:22.240
He did a lecture on the radio and he basically says, once the machine thinking method starts,
link |
00:37:31.200
very quickly they'll outstrip humanity. And if we're lucky, we might be able to turn off the power
link |
00:37:42.640
at strategic moments, but even so, our species would be humbled. Actually, he was wrong about
link |
00:37:49.360
that. If it's a sufficiently intelligent machine, it's not going to let you switch it off. It's
link |
00:37:55.120
actually in competition with you. So what do you think is most likely going to happen?
link |
00:37:59.440
What do you think he meant, just for a quick tangent, that if we shut off this super intelligent
link |
00:38:06.560
machine, our species will be humbled? I think he means that we would realize that
link |
00:38:16.400
we are inferior, right? That we only survive by the skin of our teeth because we happen to get
link |
00:38:22.240
to the off switch just in time. And if we hadn't, then we would have lost control over the earth.
link |
00:38:32.160
Are you more worried when you think about this stuff about super intelligent AI,
link |
00:38:36.800
or are you more worried about super powerful AI that's not aligned with our values? So the
link |
00:38:43.200
paperclip scenarios kind of... So the main problem I'm working on is the control problem, the problem
link |
00:38:54.560
of machines pursuing objectives that are, as you say, not aligned with human objectives. And
link |
00:39:02.320
this has been the way we've thought about AI since the beginning.
link |
00:39:07.520
You build a machine for optimizing, and then you put in some objective, and it optimizes, right?
link |
00:39:14.320
And we can think of this as the King Midas problem, right? Because if the King Midas put
link |
00:39:23.920
in this objective, everything I touch should turn to gold. And the gods, that's like the machine,
link |
00:39:30.080
they said, okay, done. You now have this power. And of course, his food,
link |
00:39:35.520
his drink, and his family all turned to gold. And then he dies of misery and starvation. And
link |
00:39:47.200
it's a warning, it's a failure mode that pretty much every culture in history has had some story
link |
00:39:54.240
along the same lines. There's the genie that gives you three wishes, and the third wish is always,
link |
00:39:59.520
you know, please undo the first two wishes because I messed up. And when Arthur Samuel wrote his
link |
00:40:09.920
checker playing program, which learned to play checkers considerably better than
link |
00:40:13.680
Arthur Samuel could play, and actually reached a pretty decent standard.
link |
00:40:20.080
Norbert Wiener, who was one of the major mathematicians of the 20th century,
link |
00:40:24.640
he's sort of the father of modern automation control systems. He saw this and he basically
link |
00:40:31.680
extrapolated, as Turing did, and said, okay, this is how we could lose control.
link |
00:40:39.840
And specifically, that we have to be certain that the purpose we put into the machine is the
link |
00:40:49.680
purpose which we really desire. And the problem is, we can't do that.
link |
00:40:57.840
You mean it's very difficult to encode,
link |
00:41:00.720
to put our values on paper, or are you just saying it's impossible?
link |
00:41:10.720
So theoretically, it's possible, but in practice, it's extremely unlikely that we could
link |
00:41:17.840
specify correctly in advance, the full range of concerns of humanity.
link |
00:41:24.160
You've talked about cultural transmission of values;
link |
00:41:27.120
I think that's how human to human transmission of values happens, right?
link |
00:41:31.680
Well, we learn, yeah, I mean, as we grow up, we learn about the values that matter,
link |
00:41:37.760
how things should go, what is reasonable to pursue and what isn't reasonable to pursue.
link |
00:41:43.600
You think machines can learn in the same kind of way?
link |
00:41:46.000
Yeah, so I think that what we need to do is to get away from this idea that
link |
00:41:52.560
you build an optimising machine, and then you put the objective into it.
link |
00:41:56.800
Because if it's possible that you might put in a wrong objective, and we already know this is
link |
00:42:03.840
possible because it's happened lots of times, right? That means that the machine should never
link |
00:42:09.600
take an objective that's given as gospel truth. Because once it takes the objective as gospel
link |
00:42:18.000
truth, then it believes that whatever actions it's taking in pursuit of that objective are
link |
00:42:26.480
the correct things to do. So you could be jumping up and down and saying, no, no, no,
link |
00:42:30.480
no, you're going to destroy the world, but the machine knows what the true objective is and is
link |
00:42:35.280
pursuing it, and tough luck to you. And this is not restricted to AI, right? This is, I think,
link |
00:42:42.480
many of the 20th century technologies, right? So in statistics, you minimise a loss function,
link |
00:42:48.080
the loss function is exogenously specified. In control theory, you minimise a cost function.
link |
00:42:53.440
In operations research, you maximise a reward function, and so on. So in all these disciplines,
link |
00:42:59.040
this is how we conceive of the problem. And it's the wrong problem because we cannot specify
link |
00:43:07.040
with certainty the correct objective, right? We need uncertainty, we need the machine to be
link |
00:43:13.840
uncertain about what it is that it's supposed to be maximising.
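To make that concrete, here is a toy sketch of a machine that treats the objective as uncertain: it keeps a belief over candidate reward functions and defers to the human when a plausible candidate strongly objects to the action it would otherwise take. This only illustrates the principle; it is not the assistance-game formulation used in Russell's research, and the names, numbers, and threshold are invented:

```python
# Toy sketch: instead of taking one objective as gospel truth, the machine keeps a
# belief over candidate reward functions and defers when they disagree sharply.

def choose_action(actions, candidate_rewards, belief, disagreement_threshold=0.5):
    """candidate_rewards: list of functions action -> utility; belief: matching probabilities."""
    def expected_utility(a):
        return sum(p * r(a) for r, p in zip(candidate_rewards, belief))

    best = max(actions, key=expected_utility)
    # If some plausible candidate objective says the chosen action is much worse
    # than an alternative, defer to the human instead of acting.
    for r, p in zip(candidate_rewards, belief):
        if p > 0.05 and max(r(a) for a in actions) - r(best) > disagreement_threshold:
            return "ask_human"
    return best

# Example: the machine's best-guess objective (r1) loves a shortcut action, but a
# less likely candidate (r2) thinks that action is disastrous, so it defers.
actions = ["make_coffee_carefully", "make_coffee_by_any_means", "do_nothing"]
r1 = {"make_coffee_carefully": 0.8, "make_coffee_by_any_means": 2.0, "do_nothing": 0.0}
r2 = {"make_coffee_carefully": 0.7, "make_coffee_by_any_means": -5.0, "do_nothing": 0.0}
print(choose_action(actions, [r1.get, r2.get], belief=[0.9, 0.1]))  # -> "ask_human"
```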
link |
00:43:18.080
Favourite idea of yours, I've heard you say somewhere, well, I shouldn't pick favourites,
link |
00:43:23.920
but it just sounds beautiful, we need to teach machines humility. It's a beautiful way to put it,
link |
00:43:31.440
I love it.
link |
00:43:32.640
That they're humble, they know that they don't know what it is they're supposed to be doing,
link |
00:43:39.520
and that those objectives, I mean, they exist, they're within us, but we may not be able to
link |
00:43:47.200
explicate them, we may not even know how we want our future to go.
link |
00:43:56.160
Exactly.
link |
00:43:58.240
And the machine, a machine that's uncertain is going to be deferential to us. So if we say,
link |
00:44:06.800
don't do that, well, now the machine learns something a bit more about our true objectives,
link |
00:44:11.840
because something that it thought was reasonable in pursuit of our objective,
link |
00:44:16.480
turns out not to be, so now it's learned something. So it's going to defer because
link |
00:44:20.640
it wants to be doing what we really want. And that point, I think, is absolutely central
link |
00:44:30.240
to solving the control problem. And it's a different kind of AI when you take away this
link |
00:44:37.920
idea that the objective is known, then in fact, a lot of the theoretical frameworks that we're so
link |
00:44:44.560
familiar with, you know, Markov decision processes, goal based planning, you know,
link |
00:44:53.440
standard games research, all of these techniques actually become inapplicable.
link |
00:44:59.280
And you get a more complicated problem because now the interaction with the human becomes part
link |
00:45:11.360
of the problem. Because the human by making choices is giving you more information about
link |
00:45:21.200
the true objective and that information helps you achieve the objective better.
link |
00:45:26.640
And so that really means that you're mostly dealing with game theoretic problems where
link |
00:45:31.840
you've got the machine and the human and they're coupled together,
link |
00:45:35.840
rather than a machine going off by itself with a fixed objective.
link |
00:45:39.040
Which is fascinating on the machine and the human level, that when you don't have an
link |
00:45:46.800
objective, it means you're together coming up with an objective. I mean, there's a lot of philosophy
link |
00:45:53.120
that, you know, you could argue that life doesn't really have meaning. We together agree on what
link |
00:45:58.880
gives it meaning, and we kind of culturally create things that give meaning to why the heck we are on this earth
link |
00:46:05.920
anyway. We together as a society create that meaning and you have to learn that objective.
link |
00:46:11.280
And one of the biggest, I thought that's where you were going to go for a second,
link |
00:46:15.760
one of the biggest troubles we run into outside of statistics and machine learning and AI
link |
00:46:21.200
and just human civilization is when you look at, I came from, I was born in the Soviet Union
link |
00:46:28.080
and the history of the 20th century, we ran into the most trouble, us humans, when there was a
link |
00:46:36.320
certainty about the objective and you do whatever it takes to achieve that objective, whether you're
link |
00:46:41.200
talking about Germany or communist Russia. You get into trouble with humans.
link |
00:46:47.040
I would say with, you know, corporations, in fact, some people argue that, you know,
link |
00:46:52.400
we don't have to look forward to a time when AI systems take over the world. They already have
link |
00:46:57.200
and they're called corporations, right? Corporations happen to be using people as
link |
00:47:03.760
components right now, but they are effectively algorithmic machines and they're optimizing
link |
00:47:10.160
an objective, which is quarterly profit that isn't aligned with overall wellbeing of the human race.
link |
00:47:17.520
And they are destroying the world. They are primarily responsible for our inability to tackle
link |
00:47:23.440
climate change. So I think that's one way of thinking about what's going on with corporations,
link |
00:47:30.240
but I think the point you're making is valid that there are many systems in the real world where
link |
00:47:39.680
we've sort of prematurely fixed on the objective and then decoupled the machine from those that's
link |
00:47:48.480
supposed to be serving. And I think you see this with government, right? Government is supposed to
link |
00:47:54.800
be a machine that serves people, but instead it tends to be taken over by people who have their
link |
00:48:02.720
own objective and use government to optimize that objective regardless of what people want.
link |
00:48:09.120
Do you find appealing the idea of almost arguing machines where you have multiple AI systems with
link |
00:48:16.080
a clear fixed objective? We have in government the red team and the blue team; they're very fixed on
link |
00:48:22.480
their objectives and they argue and they kind of may disagree, but it kind of seems to make it
link |
00:48:29.760
work somewhat, that the duality of it. Okay, let's go a hundred years back, when that was still
link |
00:48:39.680
going on, or to the founding of this country: there were disagreements, and that disagreement is where,
link |
00:48:46.480
so it was a balance between certainty and forced humility because the power was distributed.
link |
00:48:53.840
Yeah. I think that the nature of debate and disagreement argument takes as a premise,
link |
00:49:04.000
the idea that you could be wrong, which means that you're not necessarily absolutely convinced
link |
00:49:12.320
that your objective is the correct one. If you were absolutely convinced, there'd be no point
link |
00:49:19.440
in having any discussion or argument because you would never change your mind and there wouldn't
link |
00:49:24.080
be any sort of synthesis or anything like that. I think you can think of argumentation as an
link |
00:49:32.000
implementation of a form of uncertain reasoning. I've been reading recently about utilitarianism
link |
00:49:44.640
and the history of efforts to define in a sort of clear mathematical way,
link |
00:49:53.600
if you like, a formula for moral or political decision making. It's really interesting to see
link |
00:50:00.400
the parallels between the philosophical discussions going back 200 years and what you see now in
link |
00:50:07.920
discussions about existential risk because it's almost exactly the same. Someone would say,
link |
00:50:14.640
okay, well here's a formula for how we should make decisions. Utilitarianism is roughly each
link |
00:50:20.720
person has a utility function and then we make decisions to maximize the sum of everybody's
link |
00:50:27.120
utility. Then people point out, well, in that case, the best policy is one that leads to
link |
00:50:36.480
an enormously vast population, all of whom are living a life that's barely worth living.
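In symbols, the sum-of-utilities rule being described is the standard textbook one (added here for clarity; it is not a formula from the conversation itself):

```latex
% Each person i has a utility function U_i; choose the action a that maximizes
% the sum of everybody's utility.
a^* \;=\; \arg\max_{a} \sum_{i=1}^{n} U_i(a)
```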
link |
00:50:44.000
That outcome is called the repugnant conclusion. Another version is that we should maximize
link |
00:50:50.640
pleasure and that's what we mean by utility. Then you'll get people effectively saying, well,
link |
00:50:57.840
in that case, we might as well just have everyone hooked up to a heroin drip. They didn't use those
link |
00:51:03.040
words, but that debate was happening in the 19th century as it is now about AI that if we get the
link |
00:51:11.520
formula wrong, we're going to have AI systems working towards an outcome that in retrospect
link |
00:51:20.160
would be exactly wrong. So, as beautifully put, the echoes are there,
link |
00:51:26.400
but do you think, I mean, if you look at Sam Harris, our imagination worries about the AI
link |
00:51:32.880
version of that because of the speed at which the things going wrong in the utilitarian context
link |
00:51:44.080
could happen. Is that a worry for you? Yeah. I think that in most cases, not in all, but if we
link |
00:51:53.520
have a wrong political idea, we see it starting to go wrong and we're not completely stupid and so
link |
00:52:00.560
we say, okay, maybe that was a mistake. Let's try something different. Also, we're very slow and
link |
00:52:09.600
inefficient about implementing these things and so on. So you have to worry when you have
link |
00:52:14.800
corporations or political systems that are extremely efficient. But when we look at AI systems
link |
00:52:22.240
or even just computers in general, they have this different characteristic from ordinary
link |
00:52:29.760
human activity in the past. So let's say you were a surgeon, you had some idea about how to do some
link |
00:52:36.000
operation. Well, and let's say you were wrong, that way of doing the operation would mostly
link |
00:52:42.400
kill the patient. Well, you'd find out pretty quickly, like after three, maybe three or four
link |
00:52:49.280
tries. But that isn't true for pharmaceutical companies because they don't do three or four
link |
00:53:00.160
operations. They manufacture three or four billion pills and they sell them and then they find out
link |
00:53:05.840
maybe six months or a year later that, oh, people are dying of heart attacks or getting cancer from
link |
00:53:11.520
this drug. And so that's why we have the FDA, right? Because of the scalability of pharmaceutical
link |
00:53:18.720
production. And there have been some unbelievably bad episodes in the history of pharmaceuticals
link |
00:53:29.840
and adulteration of products and so on that have killed tens of thousands or paralyzed hundreds
link |
00:53:36.640
of thousands of people. Now with computers, we have that same scalability problem that you can
link |
00:53:43.760
sit there and type for I equals one to five billion do, right? And all of a sudden you're
link |
00:53:49.760
having an impact on a global scale. And yet we have no FDA, right? There's absolutely no controls
link |
00:53:56.160
at all over what a bunch of undergraduates with too much caffeine can do to the world.
link |
00:54:03.440
And we look at what happened with Facebook, well, social media in general and click through
link |
00:54:10.160
optimization. So you have a simple feedback algorithm that's trying to just optimize click
link |
00:54:18.720
through, right? That sounds reasonable, right? Because you don't want to be feeding people ads
link |
00:54:24.400
that they don't care about or not interested in. And you might even think of that process as
link |
00:54:33.840
simply adjusting the feeding of ads or news articles or whatever it might be
link |
00:54:41.280
to match people's preferences, right? Which sounds like a good idea.
link |
00:54:47.360
But in fact, that isn't how the algorithm works, right? You make more money,
link |
00:54:54.080
the algorithm makes more money if it can better predict what people are going to click on,
link |
00:55:01.200
because then it can feed them exactly that, right? So the way to maximize click through
link |
00:55:07.680
is actually to modify the people to make them more predictable. And one way to do that is to
link |
00:55:16.320
feed them information, which will change their behavior and preferences towards extremes that
link |
00:55:23.600
make them predictable. Whatever is the nearest extreme or the nearest predictable point,
link |
00:55:29.200
that's where you're going to end up. And the machines will force you there.
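Here is a deliberately simplified simulation of that feedback loop: a recommender greedily shows whatever it predicts will be clicked, and each consumed item nudges the user's preference toward it, so the user ends up far from where they started. The content model, the assumption that more extreme items are more engaging, and all the numbers are invented for illustration; this is not a claim about any real platform's algorithm:

```python
import random

# Toy feedback-loop simulation (invented model and numbers, illustration only).
random.seed(0)
content_positions = [i / 10 for i in range(11)]   # content "slant" from 0.0 to 1.0
user_preference = 0.55                            # user starts just off center

def click_probability(pref, item):
    # Assumed model: items close to the user's preference are more clickable,
    # and more extreme items get an extra engagement boost.
    closeness = max(0.0, 1 - abs(pref - item))
    extremeness_boost = 0.5 + 2.0 * abs(item - 0.5)
    return min(1.0, closeness * extremeness_boost)

for _ in range(200):
    # Greedy click-through maximization: show whatever is most likely to be clicked.
    item = max(content_positions, key=lambda s: click_probability(user_preference, s))
    if random.random() < click_probability(user_preference, item):
        # Consuming the item shifts the user's preference a little toward it.
        user_preference += 0.05 * (item - user_preference)

print(round(user_preference, 2))   # ends up far from 0.55, pulled toward the extreme end
```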
link |
00:55:35.520
And I think there's a reasonable argument to say that this, among other things,
link |
00:55:40.240
is contributing to the destruction of democracy in the world.
link |
00:55:47.280
And where was the oversight of this process? Where were the people saying, okay,
link |
00:55:52.720
you would like to apply this algorithm to 5 billion people on the face of the earth.
link |
00:55:58.560
Can you show me that it's safe? Can you show me that it won't have various kinds of negative
link |
00:56:03.760
effects? No, there was no one asking that question. There was no one placed between
link |
00:56:11.120
the undergrads with too much caffeine and the human race. They just did it.
link |
00:56:16.160
But, somewhat outside the scope of my knowledge, economists would argue that the, what is it,
link |
00:56:22.800
the invisible hand, so the capitalist system, was the oversight. So if you're going to corrupt
link |
00:56:29.280
society with whatever decision you make as a company, then that's going to be reflected in
link |
00:56:33.600
people not using your product. That's one model of oversight.
link |
00:56:38.160
We shall see, but in the meantime, you might even have broken the political system
link |
00:56:48.000
that enables capitalism to function. Well, you've changed it.
link |
00:56:53.040
We shall see.
link |
00:56:54.960
Change is often painful. So my question is, absolutely, it's fascinating, you're absolutely
link |
00:57:01.360
right that there was zero oversight on algorithms that can have a profound civilization changing
link |
00:57:09.040
effect. So do you think it's possible, I mean, have you seen government? So do you
link |
00:57:15.840
think it's possible to create regulatory bodies and oversight over AI algorithms, which are inherently
link |
00:57:24.400
such a cutting edge set of ideas and technologies?
link |
00:57:28.400
Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls.
link |
00:57:35.040
I mean, it took time to design the FDA regime, you know, and some people still don't like it and
link |
00:57:40.160
they want to fix it. And I think there are clear ways that it could be improved.
link |
00:57:46.960
But the whole notion that you have stage one, stage two, stage three, and here are the criteria
link |
00:57:51.680
for what you have to do to pass a stage one trial, right? We haven't even thought about what those
link |
00:57:58.320
would be for algorithms. So, I mean, I think there are things we could do right now with regard to
link |
00:58:07.040
bias, for example, we have a pretty good technical handle on how to detect algorithms that are
link |
00:58:15.280
propagating bias that exists in data sets, how to debias those algorithms, and even what it's going
link |
00:58:22.960
to cost you to do that. So I think we could start having some standards on that.
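[As a concrete illustration of the kind of standard being suggested, here is a minimal bias-audit sketch on synthetic data; the data, the deliberately biased decision rule, and the metrics chosen are all assumptions made for illustration. It reports each group's selection rate, the demographic parity difference, and the disparate impact ratio, the sort of quantities a stage-one style check could require.]

```python
import numpy as np

# Toy audit of a binary classifier's decisions on synthetic data.
# group: protected attribute (0/1); y_true: actual outcome; y_pred: model decision.
# Both the data and the "biased model" below are invented purely for illustration.
rng = np.random.default_rng(1)
n = 10_000
group  = rng.integers(0, 2, size=n)
y_true = rng.binomial(1, 0.5, size=n)

# A deliberately biased decision rule: group 1 is approved less often than group 0
# at the same underlying qualification.
p_approve = np.where(group == 1, 0.45 * y_true + 0.05, 0.70 * y_true + 0.10)
y_pred = rng.binomial(1, p_approve)

rate0 = y_pred[group == 0].mean()   # selection rate for group 0
rate1 = y_pred[group == 1].mean()   # selection rate for group 1

print(f"selection rate, group 0:       {rate0:.3f}")
print(f"selection rate, group 1:       {rate1:.3f}")
print(f"demographic parity difference: {abs(rate0 - rate1):.3f}")
print(f"disparate impact ratio:        {min(rate0, rate1) / max(rate0, rate1):.3f}")
```

[On this synthetic data the disparate impact ratio lands well under the four-fifths (0.8) rule of thumb. A standard mitigation, such as per-group thresholds or instance reweighing, then shrinks the gap at some measurable loss of accuracy, which is one way to read the cost Russell mentions.]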
link |
00:58:30.320
I think there are things to do with impersonation and falsification that we could work on.
link |
00:58:37.280
Fakes, yeah.
link |
00:58:38.400
A very simple point. So impersonation is a machine acting as if it was a person.
link |
00:58:46.000
I can't see a real justification for why we shouldn't insist that machines self identify
link |
00:58:53.200
as machines. Where is the social benefit in fooling people into thinking that this is really
link |
00:59:02.800
a person when it isn't? I don't mind if it uses a human like voice, that's easy to understand,
link |
00:59:09.360
that's fine, but it should just say, I'm a machine in some form.
link |
00:59:14.960
And how many people are speaking to that? I would think it's a relatively obvious fact.
link |
00:59:20.000
Yeah, I mean, there is actually a law in California that bans impersonation, but only in certain
link |
00:59:27.280
restricted circumstances. So for the purpose of engaging in a fraudulent transaction and for the
link |
00:59:36.000
purpose of modifying someone's voting behavior. So those are the circumstances where machines have
link |
00:59:44.160
to self identify. But I think arguably, it should be in all circumstances. And
link |
00:59:51.280
then when you talk about deep fakes, we're just at the beginning, but already it's possible to
link |
00:59:58.480
make a movie of anybody saying anything in ways that are pretty hard to detect.
link |
01:00:05.440
Including yourself because you're on camera now and your voice is coming through with high
link |
01:00:09.040
resolution.
link |
01:00:09.520
Yeah, so you could take what I'm saying and replace it with pretty much anything else you
link |
01:00:13.600
wanted me to be saying. And it's a very simple thing.
link |
01:00:17.040
And
link |
01:00:21.440
even it would change my lips and facial expressions to fit. And there's actually not much
link |
01:00:30.640
in the way of real legal protection against that. I think in the commercial area, you could say,
link |
01:00:38.160
yeah, you're using my brand and so on. There are rules about that. But in the political sphere,
link |
01:00:45.600
I think at the moment, anything goes. That could be really, really damaging.
link |
01:00:53.840
And let me just try to make not an argument, but try to look back at history and say something dark
link |
01:01:04.160
in essence, which is that while regulation, oversight, seems to be exactly the right thing to
link |
01:01:10.240
do here, it seems that human beings, what they naturally do is they wait for something to go
link |
01:01:15.440
wrong. If you're talking about nuclear weapons, you can't talk about nuclear weapons being dangerous
link |
01:01:21.840
until somebody, like the United States, actually drops the bomb, or Chernobyl melts down. Do you think
link |
01:01:28.720
we will have to wait for things going wrong in a way that's obviously damaging to society,
link |
01:01:36.880
not an existential risk, but obviously damaging? Or do you have faith that...
link |
01:01:43.440
I hope not, but I think we do have to look at history.
link |
01:01:49.840
And so the two examples you gave, nuclear weapons and nuclear power are very, very interesting
link |
01:01:57.280
because nuclear weapons, we knew in the early years of the 20th century that atoms contained
link |
01:02:07.520
a huge amount of energy. We had E equals MC squared. We knew the mass differences between
link |
01:02:12.880
the different atoms and their components. And we knew that
link |
01:02:17.920
you might be able to make an incredibly powerful explosive. So H.G. Wells wrote a science fiction book,
link |
01:02:23.760
I think in 1912. Frederick Soddy, who was the guy who discovered isotopes, the Nobel prize winner,
link |
01:02:31.920
he gave a speech in 1915 saying that one pound of this new explosive would be the equivalent
link |
01:02:40.400
of 150 tons of dynamite, which turns out to be about right. And this was in World War I,
link |
01:02:48.320
so he was imagining how much worse the world war would be if we were using that kind of explosive.
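[A back-of-envelope version of the arithmetic being referred to, using standard modern values rather than anything available to Soddy:]

```latex
% Fission converts roughly 0.1% of the fuel's rest mass to energy:
% about 200 MeV is released per U-235 fission, against roughly 219,000 MeV of rest mass (235 x 931.5 MeV).
\[
E = \Delta m\, c^{2}, \qquad
\frac{200\ \mathrm{MeV}}{235 \times 931.5\ \mathrm{MeV}} \approx 9 \times 10^{-4},
\]
\[
E_{\mathrm{per\ kg}} \approx 9 \times 10^{-4} \times \left(3 \times 10^{8}\ \mathrm{m/s}\right)^{2}
\approx 8 \times 10^{13}\ \mathrm{J} \approx 20\ \mathrm{kt\ of\ TNT\ per\ kg}.
\]
```

[So complete fission of a pound of material corresponds to thousands of tons of TNT; early weapons fissioned only a small percentage of their core (Hiroshima yielded roughly 15 kilotons from about 64 kg of uranium, on the order of a hundred tons of TNT per pound of material), which is the sense in which Soddy's figure was in the right ballpark.]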
link |
01:02:56.160
But the physics establishment simply refused to believe that these things could be made.
link |
01:03:04.000
Including the people who are making it.
link |
01:03:05.760
Well, so they were doing the nuclear physics. I mean, eventually they were the ones who made it.
link |
01:03:11.200
You talk about Fermi or whoever.
link |
01:03:13.440
Well, so up to then, the development was mostly theoretical. So it was people using sort of
link |
01:03:22.240
primitive kinds of particle acceleration and doing experiments at the level of single particles
link |
01:03:29.440
or collections of particles. They weren't yet thinking about how to actually make a bomb or
link |
01:03:37.280
anything like that. But they knew the energy was there and they figured if they understood it
link |
01:03:40.640
better, it might be possible. But the physics establishment, their view, and I think because
link |
01:03:47.040
they did not want it to be true, their view was that it could not be true. That this could not
link |
01:03:54.320
provide a way to make a super weapon. And there was this famous speech given by Rutherford,
link |
01:04:03.520
who was the sort of leader of nuclear physics. And it was on September 11th, 1933. And he said,
link |
01:04:11.840
anyone who talks about the possibility of obtaining energy from transformation of atoms
link |
01:04:17.760
is talking complete moonshine. And the next morning, Leo Szilard read about that speech
link |
01:04:26.080
and then invented the nuclear chain reaction. And so as soon as he invented, as soon as he had that
link |
01:04:32.880
idea that you could make a chain reaction with neutrons, because neutrons were not repelled by
link |
01:04:38.560
the nucleus, so they could enter the nucleus and then continue the reaction. As soon as he has that
link |
01:04:44.240
idea, he instantly realized that the world was in deep doo doo. Because this is 1933, right? Hitler
link |
01:04:54.400
had recently come to power in Germany. Szilard was in London and eventually became a refugee
link |
01:05:04.000
and came to the US. And in the process of having the idea about the chain reaction,
link |
01:05:11.920
he figured out basically how to make a bomb and also how to make a reactor. And he patented the
link |
01:05:18.960
reactor in 1934. But because of the situation, the great power conflict situation that he could see
link |
01:05:27.920
happening, he kept that a secret. And so between then and the beginning of World War II, people
link |
01:05:39.920
were working, including the Germans, on how to actually create neutron sources, what specific
link |
01:05:50.320
fission reactions would produce neutrons of the right energy to continue the reaction.
link |
01:05:57.440
And that was demonstrated in Germany, I think in 1938, if I remember correctly.
link |
01:06:01.440
The first nuclear weapon patent was 1939 by the French. So this was actually going on well before
link |
01:06:16.480
World War II really got going. And then the British probably had the most advanced capability
link |
01:06:22.640
in this area. But for safety reasons, among others, and just resources, they moved the program
link |
01:06:30.160
from Britain to the US, and then that became the Manhattan Project. So the reason why we couldn't
link |
01:06:40.560
have any kind of oversight of nuclear weapons and nuclear technology
link |
01:06:46.560
was because we were basically already in an arms race and a war.
link |
01:06:50.800
But you mentioned that this was back in the 20s and 30s. So what are the echoes? The way you've described
link |
01:07:00.960
this story, I mean, there are clearly echoes. Why do you think most AI researchers,
link |
01:07:06.800
folks who are really close to the metal, they really are not concerned about AI. They don't
link |
01:07:11.760
think about it, or they don't want to think about it. But why do you think that is?
link |
01:07:18.240
What are the echoes of the nuclear situation to the current AI situation? And what can we do
link |
01:07:27.120
about it? I think there is a kind of motivated cognition, which is a term in psychology that means
link |
01:07:35.520
that you believe what you would like to be true, rather than what is true. And it's unsettling
link |
01:07:46.000
to think that what you're working on might be the end of the human race, obviously. So you would
link |
01:07:52.640
rather instantly deny it and come up with some reason why it couldn't be true. And I have,
link |
01:08:00.560
I collected a long list of reasons that extremely intelligent, competent AI scientists have come up
link |
01:08:08.160
with for why we shouldn't worry about this. For example, calculators are superhuman at arithmetic
link |
01:08:16.800
and they haven't taken over the world. So there's nothing to worry about. Well, okay, my five year
link |
01:08:22.000
old, you know, could have figured out why that was an unreasonable and really quite weak argument.
link |
01:08:29.040
Another one was, while it's theoretically possible that you could have superhuman AI destroy the
link |
01:08:40.320
world, it's also theoretically possible that a black hole could materialize right next to the
link |
01:08:45.680
earth and destroy humanity. I mean, yes, it's theoretically possible, quantum theoretically,
link |
01:08:50.960
extremely unlikely that it would just materialize right there. But that's a completely bogus analogy,
link |
01:08:58.080
because, you know, if the whole physics community on earth was working to materialize a black hole
link |
01:09:04.240
in near earth orbit, right? Wouldn't you ask them, is that a good idea? Is that going to be safe?
link |
01:09:10.160
You know, what if you succeed? Right. And that's the thing, right? The AI community has sort of
link |
01:09:16.720
refused to ask itself, what if you succeed? And initially I think that was because it was too hard,
link |
01:09:24.240
but, you know, Alan Turing asked himself that, and he said, we'd be toast, right? If we were lucky,
link |
01:09:32.720
we might be able to switch off the power, but probably we'd be toast. But there's also an aspect
link |
01:09:37.600
that because we're not exactly sure what the future holds, it's not clear exactly,
link |
01:09:45.200
technically, what to worry about, sort of how things go wrong. And so there is something,
link |
01:09:53.360
it feels like, maybe you can correct me if I'm wrong, but there's something paralyzing about
link |
01:09:58.800
worrying about something that logically is inevitable, that you have to think about,
link |
01:10:05.200
but you don't really know what that will look like.
link |
01:10:10.720
Yeah, I think that's, it's a reasonable point and, you know, it's certainly in terms of
link |
01:10:18.480
existential risks, it's different from, you know, asteroid collides with the earth, right? Which,
link |
01:10:24.000
again, is quite possible, you know, it's happened in the past, it'll probably happen again,
link |
01:10:29.520
we don't know right now, but if we did detect an asteroid that was going to hit the earth
link |
01:10:34.960
in 75 years time, we'd certainly be doing something about it.
link |
01:10:39.760
Well, it's clear there's a big rock, and
link |
01:10:42.080
we'll probably have a meeting and see what we do about the big rock. With AI...
link |
01:10:46.160
Right, with AI, I mean, there are very few people who think it's not going to happen within the
link |
01:10:50.160
next 75 years. I know Rod Brooks doesn't think it's going to happen, maybe Andrew Ng doesn't
link |
01:10:56.160
think it's going to happen, but, you know, a lot of the people who work day to day, you know, as you say,
link |
01:11:02.800
at the rock face, they think it's going to happen. I think the median estimate from AI researchers is
link |
01:11:10.640
somewhere in 40 to 50 years from now, or maybe, you know, I think in Asia, they think it's going
link |
01:11:16.000
to be even faster than that. I'm a little bit more conservative, I think it'd probably take
link |
01:11:24.080
longer than that, but I think, you know, as happened with nuclear weapons, it can happen
link |
01:11:30.720
overnight that you have these breakthroughs and we need more than one breakthrough, but,
link |
01:11:34.960
you know, it's on the order of half a dozen, I mean, this is a very rough scale, but sort of
link |
01:11:40.640
half a dozen breakthroughs of that nature would have to happen for us to reach the superhuman AI.
link |
01:11:49.920
But the, you know, the AI research community is vast now, the massive investments from governments,
link |
01:11:57.280
from corporations, tons of really, really smart people, you know, you just have to look at the
link |
01:12:03.360
rate of progress in different areas of AI to see that things are moving pretty fast. So to say,
link |
01:12:09.200
oh, it's just going to be thousands of years, I don't see any basis for that. You know, I see,
link |
01:12:15.920
you know, for example, the Stanford 100 year AI project, right, which is supposed to be sort of,
link |
01:12:26.400
you know, the serious establishment view, their most recent report actually said it's probably
link |
01:12:32.400
not even possible. Oh, wow.
link |
01:12:35.280
Right. Which if you want a perfect example of people in denial, that's it. Because, you know,
link |
01:12:42.880
for the whole history of AI, we've been saying to philosophers who said it wasn't possible,
link |
01:12:49.520
well, you have no idea what you're talking about. Of course it's possible, right? Give me an argument
link |
01:12:53.920
for why it couldn't happen. And there isn't one, right? And now, because people are worried that
link |
01:13:00.400
maybe AI might get a bad name, or I just don't want to think about this, they're saying, okay,
link |
01:13:06.080
well, of course, it's not really possible. You know, imagine if, you know, the leaders of the
link |
01:13:12.240
cancer biology community got up and said, well, you know, of course, curing cancer,
link |
01:13:17.360
it's not really possible. There'd be complete outrage and dismay. And, you know, I find this
link |
01:13:28.320
really a strange phenomenon. So, okay, so if you accept that it's possible,
link |
01:13:35.680
and if you accept that it's probably going to happen, the point that you're making that,
link |
01:13:42.400
you know, how does it go wrong? A valid question. Without that, without an answer to that question,
link |
01:13:50.160
then you're stuck with what I call the gorilla problem, which is, you know, the problem that
link |
01:13:54.320
the gorillas face, right? They made something more intelligent than them, namely us, a few million
link |
01:14:00.480
years ago, and now they're in deep doo doo. So there's really nothing they can do. They've lost
link |
01:14:07.680
the control. They failed to solve the control problem of controlling humans, and so they've
link |
01:14:13.760
lost. So we don't want to be in that situation. And if the gorilla problem is the only formulation
link |
01:14:20.240
you have, there's not a lot you can do, right? Other than to say, okay, we should try to stop,
link |
01:14:26.640
you know, we should just not make the humans, or in this case, not make the AI. And I think
link |
01:14:31.760
that's really hard to do. I'm not actually proposing that that's a feasible course of
link |
01:14:40.320
action. I also think that, you know, if properly controlled AI could be incredibly beneficial.
link |
01:14:48.800
But it seems to me that there's a consensus that one of the major failure modes is this
link |
01:14:56.720
loss of control, that we create AI systems that are pursuing incorrect objectives. And because
link |
01:15:05.040
the AI system believes it knows what the objective is, it has no incentive to listen to us anymore,
link |
01:15:12.240
so to speak, right? It's just carrying out the strategy that it has computed as being the optimal
link |
01:15:21.680
solution. And, you know, it may be that in the process, it needs to acquire more resources to
link |
01:15:30.480
increase the possibility of success or prevent various failure modes by defending itself against
link |
01:15:36.800
interference. And so that collection of problems, I think, is something we can address. The other
link |
01:15:45.920
problems are, roughly speaking, you know, misuse, right? So even if we solve the control problem,
link |
01:15:55.680
we make perfectly safe, controllable AI systems. Well, why is Dr. Evil going to
link |
01:16:01.600
use those, right? He wants to just take over the world and he'll make unsafe AI systems that then
link |
01:16:06.480
get out of control. So that's one problem, which is sort of a, you know, partly a policing problem,
link |
01:16:12.960
partly a sort of a cultural problem for the profession of how we teach people what kinds
link |
01:16:21.280
of AI systems are safe. You talk about autonomous weapon systems and how pretty much everybody
link |
01:16:26.000
agrees that there are too many ways that that can go horribly wrong. There's this great Slaughterbots movie
link |
01:16:32.000
that kind of illustrates that beautifully. I want to talk about that. That's another,
link |
01:16:36.960
there's another topic I'm hoping to talk about. I just want to mention that what I see is the
link |
01:16:41.200
third major failure mode, which is overuse, not so much misuse, but overuse of AI that we become
link |
01:16:49.760
overly dependent. So I call this the WALL E problem. So if you've seen WALL E, the movie,
link |
01:16:54.960
all right, all the humans are on the spaceship and the machines look after everything for them,
link |
01:17:00.240
and they just watch TV and drink big gulps. And they're all sort of obese and stupid and they
link |
01:17:07.440
sort of totally lost any notion of human autonomy. And, you know, so in effect, right. This would
link |
01:17:17.680
happen like the slow boiling frog, right? We would gradually turn over more and more of the
link |
01:17:24.240
management of our civilization to machines as we are already doing. And this, you know, if this
link |
01:17:29.520
process continues, you know, we sort of gradually switch from sort of being the masters
link |
01:17:37.920
of technology to just being the guests. Right. So we become guests on a cruise ship, you know,
link |
01:17:44.160
which is fine for a week, but not for the rest of eternity. You know, and it's almost
link |
01:17:51.360
irreversible. Right. Once you lose the incentive to, for example, you know, learn to be
link |
01:17:58.640
an engineer or a doctor or a sanitation operative or any other of the infinitely many ways that we
link |
01:18:08.000
maintain and propagate our civilization. You know, if you don't have the incentive to do any
link |
01:18:14.240
of that, you won't. And then it's really hard to recover. And of course, AI is just one of the
link |
01:18:20.320
technologies that could result in that third failure mode, right? There's probably others;
link |
01:18:24.400
technology in general detaches us a bit. It does a bit. But the difference is that in terms of
link |
01:18:31.120
the knowledge to run our civilization, you know, up to now, we've had no alternative but
link |
01:18:38.240
to put it into people's heads. Right. And with software, with Google, I mean, software in
link |
01:18:43.920
general, computers in general, but, you know, the knowledge of how a
link |
01:18:51.200
sanitation system works, you know, an AI has to understand that. It's no good putting it
link |
01:18:56.000
into Google. So, I mean, we've always put knowledge on paper, but paper doesn't run our
link |
01:19:02.560
civilization; it only runs when it goes from the paper into people's heads again. Right. So we've
link |
01:19:07.120
always propagated civilization through human minds. And we've spent about a trillion person
link |
01:19:13.680
years doing that. Literally, right? You can work it out. It's about right. There's just
link |
01:19:19.440
over 100 billion people who've ever lived. And each of them has spent about 10 years learning
link |
01:19:25.280
stuff to keep their civilization going. And so that's a trillion person years we put into this
link |
01:19:30.640
effort. Beautiful way to describe all of civilization. And now we're, you know, we're in danger of
link |
01:19:36.160
throwing that away. So this is a problem that AI can't solve. It's not a technical problem. It's
link |
01:19:40.880
you know, if we do our job right, the AI systems will say, you know, the human race doesn't in the
link |
01:19:48.560
long run want to be passengers in a cruise ship. The human race wants autonomy. This is part of
link |
01:19:54.560
human preferences. So we, the AI systems are not going to do this stuff for you. You've got to do
link |
01:20:01.200
it for yourself. Right. I'm not going to carry you to the top of Everest in an autonomous
link |
01:20:06.320
helicopter. You have to climb it if you want to get the benefit and so on. So, but I'm afraid that
link |
01:20:14.960
because we are short sighted and lazy, we're going to override the AI systems. And there's an
link |
01:20:22.400
amazing short story that I recommend to everyone that I talked to about this called The Machine
link |
01:20:28.720
Stops, written in 1909 by E.M. Forster, who, you know, wrote novels about the British Empire and
link |
01:20:37.520
sort of things that became costume dramas on the BBC. But he wrote this one science fiction story,
link |
01:20:42.240
which is an amazing vision of the future. It has basically iPads, it has video conferencing,
link |
01:20:51.680
it has MOOCs, it has computer induced obesity. I mean, literally it's what people spend their
link |
01:21:00.320
time doing is giving online courses or listening to online courses and talking about ideas,
link |
01:21:05.920
but they never get out there in the real world. They don't really have a lot of face to face
link |
01:21:11.200
contact. Everything is done online, you know, so all the things we're worrying about now
link |
01:21:17.520
were described in the story. And then the human race becomes more and more dependent on
link |
01:21:22.000
the machine, loses knowledge of how things really run and then becomes vulnerable to collapse. And
link |
01:21:31.360
so it's a pretty unbelievably amazing story for someone writing in 1909 to imagine all
link |
01:21:38.640
this. So there's very few people that represent artificial intelligence more than you, Stuart
link |
01:21:45.760
Russell. If you say so, that's very kind. So it's all my fault, right? You're often brought
link |
01:21:57.200
up as the person, well, Stuart Russell, the AI person, is worried about this. That's why you
link |
01:22:03.680
should be worried about it. Do you feel the burden of that? I don't know if you feel that at all,
link |
01:22:10.240
but when I talk to people, like, people outside of computer science,
link |
01:22:15.840
when they think about this, Stuart Russell is worried about AI safety. You should be worried
link |
01:22:21.280
too. Do you feel the burden of that? I mean, in a practical sense, yeah, because I get, you know,
link |
01:22:29.840
a dozen, sometimes 25 invitations a day to talk about it, to give interviews, to write press
link |
01:22:38.640
articles and so on. So in that very practical sense, I'm seeing that people are concerned and
link |
01:22:46.160
really interested about this. Are you worried that you could be wrong as all good scientists are?
link |
01:22:52.320
Of course. I worry about that all the time. I mean, that's always been the way that
link |
01:22:57.920
I've worked, you know, is like I have an argument in my head with myself, right? So
link |
01:23:03.440
I have some idea and then I think, okay, how could that be wrong? Or did someone else already have
link |
01:23:10.560
that idea? So I'll go and, you know, search in as much literature as I can to see whether someone
link |
01:23:16.720
else already thought of that, or even refuted it. So, you know, right now I'm reading a
link |
01:23:23.680
lot of philosophy because, you know, in the form of the debates over utilitarianism and
link |
01:23:32.800
other kinds of moral formulas, shall we say, people have already thought through
link |
01:23:42.800
some of these issues. But, you know, one of the things I'm not seeing in a lot of
link |
01:23:47.680
these debates is this specific idea about the importance of uncertainty in the objective
link |
01:23:56.400
that this is the way we should think about machines that are beneficial to humans. So this
link |
01:24:01.920
idea of provably beneficial machines based on explicit uncertainty in the objective,
link |
01:24:10.000
you know, it seems to be, you know, my gut feeling is this is the core of it. It's going to have to
link |
01:24:17.600
be elaborated in a lot of different directions, and there are a lot of versions of beneficial. Yeah.
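[As one small, purely illustrative sketch of why explicit uncertainty in the objective matters, here is a toy version of the off-switch argument, not the formal model: the three actions, the candidate utilities, and the prior below are invented, and the human is assumed to know the true utility and to block the action whenever it is negative.]

```python
import numpy as np

# Toy "off-switch" calculation. The robot can act now, switch itself off, or defer
# to the human. It is uncertain about the true utility U of acting; the human is
# assumed to know U and to block the action whenever U < 0. Numbers are made up.
utilities = np.array([-2.0, -0.5, 0.5, 3.0])    # possible values of U
prior     = np.array([0.25, 0.25, 0.25, 0.25])  # robot's belief over U

act_now    = prior @ utilities                   # expected value of acting unilaterally: E[U]
switch_off = 0.0                                 # doing nothing is worth 0 by convention
defer      = prior @ np.maximum(utilities, 0.0)  # human lets the action through only when U > 0

print(f"act now    : {act_now:+.3f}")
print(f"switch off : {switch_off:+.3f}")
print(f"defer      : {defer:+.3f}")              # highest of the three in this example
```

[The inequality E[max(U, 0)] >= max(E[U], 0) holds for any belief, and it is strict exactly when the belief puts weight on both good and bad outcomes, so an agent that is genuinely unsure of its objective has a positive incentive to keep the human in the loop; a certain agent does not.]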
link |
01:24:23.600
But it has to be right. We can't afford, you know, hand wavy beneficial, because,
link |
01:24:30.640
you know, whenever we do hand wavy stuff, there are loopholes. And the thing about
link |
01:24:34.800
super intelligent machines is they find the loopholes, you know, just like, you know, tax
link |
01:24:40.560
evaders. If you don't write your tax law properly, people will find the loopholes and end up paying
link |
01:24:46.400
no tax. And so you should think of it this way. And getting those definitions right,
link |
01:24:56.480
you know, it is really a long process, you know, so you can define mathematical frameworks
link |
01:25:04.400
and within that framework, you can prove mathematical theorems that yes, this will,
link |
01:25:08.560
you know, this theoretical entity will be provably beneficial to that theoretical entity,
link |
01:25:13.680
but that framework may not match the real world in some crucial way. So it's a long process,
link |
01:25:20.160
thinking through it, iterating and so on. Last question. Yep. You have 10 seconds to answer it.
link |
01:25:27.120
What is your favorite sci fi movie about AI? I would say Interstellar has my favorite robots.
link |
01:25:34.480
Oh, beats Space Odyssey? Yeah. Yeah. Yeah. So TARS, one of the robots in Interstellar, is
link |
01:25:42.160
the way robots should behave. And I would say Ex Machina is, in some ways,
link |
01:25:51.520
the one that makes you think, in a nervous kind of way, about where we're going.
link |
01:25:58.080
Well, Stuart, thank you so much for talking today. Pleasure.