
Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9



link |
00:00:00.000
The following is a conversation with Stuart Russell. He's a professor of computer science at UC
link |
00:00:05.040
Berkeley and a coauthor of a book that introduced me and millions of other people to the amazing world
link |
00:00:11.360
of AI called Artificial Intelligence: A Modern Approach. So it was an honor for me to have this
link |
00:00:18.320
conversation as part of the MIT course on artificial general intelligence and the Artificial Intelligence
link |
00:00:24.480
Podcast. If you enjoy it, please subscribe on YouTube, iTunes or your podcast provider of choice
link |
00:00:31.360
or simply connect with me on Twitter at Lex Fridman, spelled F R I D. And now here's my
link |
00:00:37.600
conversation with Stuart Russell. So you mentioned that in 1975, in high school, you created
link |
00:00:46.160
one of your first AI programs that played chess. Were you ever able to build a program that
link |
00:00:54.160
beat you at chess or another board game? So my program never beat me at chess.
link |
00:01:03.520
I actually wrote the program at Imperial College. So I used to take the bus every Wednesday with a
link |
00:01:10.480
box of cards this big and shove them into the card reader and they gave us eight seconds of CPU time.
link |
00:01:17.200
It took about five seconds to read the cards in and compile the code. So we had three seconds of
link |
00:01:24.720
CPU time, which was enough to make one move, you know, with a not very deep search. And then we
link |
00:01:30.960
would print that move out and then we'd have to go to the back of the queue and wait to feed the
link |
00:01:34.960
cards in again. How deep was the search? Well, are we talking about two moves? So no, I think we
link |
00:01:40.480
got to depth eight, you know, depth eight with alpha beta. And we had some tricks of
link |
00:01:48.000
our own about move ordering and some pruning of the tree. And we were still able to beat that
link |
00:01:54.480
program. Yeah, yeah, I was a reasonable chess player in my youth. I did an Othello program
link |
00:02:01.680
and a backgammon program. So when I got to Berkeley, I worked a lot on
link |
00:02:05.920
what we call meta reasoning, which really means reasoning about reasoning. And in the case of
link |
00:02:13.200
a game playing program, you need to reason about what parts of the search tree you're actually
link |
00:02:18.320
going to explore, because the search tree is enormous, you know, bigger than the number of
link |
00:02:23.440
atoms in the universe. And the way programs succeed and the way humans succeed is by only
link |
00:02:30.960
looking at a small fraction of the search tree. And if you look at the right fraction, you play
link |
00:02:36.160
really well. If you look at the wrong fraction, if you waste your time thinking about things that
link |
00:02:41.360
are never going to happen, the moves that no one's ever going to make, then you're going to lose,
link |
00:02:45.840
because you won't be able to figure out the right decision. So that question of how machines can
link |
00:02:53.760
manage their own computation, how they decide what to think about is the meta reasoning question.
link |
00:02:59.760
We developed some methods for doing that. And very simply, a machine should think about
link |
00:03:06.640
whatever thoughts are going to improve its decision quality. We were able to show that
link |
00:03:12.640
both for Othello, which is a standard two player game, and for backgammon, which includes
link |
00:03:19.040
dice rolls, so it's a two player game with uncertainty. For both of those cases, we could
link |
00:03:24.000
come up with algorithms that were actually much more efficient than the standard alpha beta search,
link |
00:03:31.120
which chess programs at the time were using. And those programs could beat me.
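For readers who want a concrete picture of the "standard alpha beta search" being referred to, here is a minimal illustrative sketch in Python. The game interface (legal_moves, apply, is_terminal, evaluate) is a hypothetical placeholder, not code from any of the programs discussed here.

import math

def alphabeta(state, depth, alpha, beta, maximizing, game):
    # "game" stands in for a hypothetical interface with legal_moves(state),
    # apply(state, move), is_terminal(state) and evaluate(state).
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)
    moves = game.legal_moves(state)
    # Simple move ordering: look first at the moves the static evaluator likes,
    # so the alpha-beta cutoffs fire earlier.
    moves.sort(key=lambda m: game.evaluate(game.apply(state, m)), reverse=maximizing)
    if maximizing:
        value = -math.inf
        for m in moves:
            value = max(value, alphabeta(game.apply(state, m), depth - 1, alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the opponent will never allow this line
        return value
    else:
        value = math.inf
        for m in moves:
            value = min(value, alphabeta(game.apply(state, m), depth - 1, alpha, beta, True, game))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff
        return value

A fixed search of depth eight, as described above, would simply call this recursion with depth set to eight from the current position.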
link |
00:03:38.080
And I think you can see the same basic ideas in AlphaGo and AlphaZero today.
link |
00:03:44.720
The way they explore the tree is using a form of meta reasoning to select what to think about
link |
00:03:52.560
based on how useful it is to think about it. Are there any insights you can describe
link |
00:03:57.840
without Greek symbols of how we select which paths to go down?
link |
00:04:04.240
There's really two kinds of learning going on. So as you say, AlphaGo learns
link |
00:04:10.560
to evaluate board positions. So it can look at a go board. And it actually has probably a super
link |
00:04:17.680
human ability to instantly tell how promising that situation is. To me, the amazing thing about
link |
00:04:25.760
AlphaGo is not that it can beat the world champion with its hands tied behind its back. But the fact that
link |
00:04:34.560
if you stop it from searching altogether, so you say, okay, you're not allowed to do
link |
00:04:41.360
any thinking ahead. You can just consider each of your legal moves and then look at the
link |
00:04:47.120
resulting situation and evaluate it. So what we call a depth one search. So just the immediate
link |
00:04:53.280
outcome of your moves and decide if that's good or bad. That version of AlphaGo
link |
00:04:57.920
can still play at a professional level. And human professionals are sitting there for
link |
00:05:05.200
five, 10 minutes deciding what to do. And AlphaGo in less than a second can instantly intuit what
link |
00:05:13.440
is the right move to make based on its ability to evaluate positions. And that is remarkable
link |
00:05:19.760
because we don't have that level of intuition about go. We actually have to think about the
link |
00:05:26.080
situation. So anyway, that capability that AlphaGo has is one big part of why it beats humans.
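To make the "depth one search" idea concrete, here is a hedged sketch: score the position that results from each legal move with a learned value function and pick the best, with no look-ahead beyond that. The value_net and game interface are hypothetical placeholders, not AlphaGo's actual code.

def depth_one_move(state, game, value_net):
    # No search at all: just evaluate the immediate result of each legal move
    # with a learned evaluator and play the best-looking one.
    best_move, best_value = None, float("-inf")
    for move in game.legal_moves(state):
        v = value_net(game.apply(state, move))
        if v > best_value:
            best_move, best_value = move, v
    return best_move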
link |
00:05:35.840
The other big part is that it's able to look ahead 40, 50, 60 moves into the future. And
link |
00:05:46.880
if it was considering all possibilities, 40 or 50 or 60 moves into the future,
link |
00:05:51.200
that would be 10 to the 200 possibilities. So way more than atoms in the universe and so on.
link |
00:06:02.240
So it's very, very selective about what it looks at. So let me try to give you an intuition about
link |
00:06:10.880
how you decide what to think about. It's a combination of two things. One is
link |
00:06:15.360
how promising it is. So if you're already convinced that a move is terrible,
link |
00:06:22.560
there's no point spending a lot more time convincing yourself that it's terrible.
link |
00:06:27.520
Because it's probably not going to change your mind. So the real reason you think is because
link |
00:06:33.600
there's some possibility of changing your mind about what to do. And it's that changing of your mind
link |
00:06:39.920
that would result in a better final action in the real world. So the purpose of thinking
link |
00:06:46.800
is to improve the final action in the real world. And so if you think about a move that is guaranteed
link |
00:06:53.520
to be terrible, you can convince yourself it's terrible, you're still not going to change your
link |
00:06:58.000
mind. But on the other hand, suppose you had a choice between two moves, one of them you've
link |
00:07:04.320
already figured out is guaranteed to be a draw, let's say. And then the other one looks a little
link |
00:07:10.400
bit worse. Like it looks fairly likely that if you make that move, you're going to lose.
link |
00:07:14.640
But there's still some uncertainty about the value of that move. There's still some possibility
link |
00:07:20.720
that it will turn out to be a win. Then it's worth thinking about that. So even though it's
link |
00:07:25.920
less promising on average than the other move, which is guaranteed to be a draw, there's still
link |
00:07:31.280
some purpose in thinking about it because there's a chance that you'll change your mind and discover
link |
00:07:36.160
that in fact it's a better move. So it's a combination of how good the move appears to be
link |
00:07:42.080
and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking
link |
00:07:48.000
about because there's a higher upside if you want to think of it that way. And of course in the
link |
00:07:52.800
beginning, especially in the AlphaGo Zero formulation, everything is shrouded in
link |
00:07:59.920
uncertainty. So you're really swimming in a sea of uncertainty. So it benefits you to
link |
00:08:07.520
I mean, actually following the same process as you described, but because you're so uncertain
link |
00:08:11.120
about everything, you basically have to try a lot of different directions.
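One way to make "promise plus uncertainty" concrete is an upper-confidence style rule, where a move's running average value gets a bonus that shrinks as it is examined more. This is offered as an illustrative analogy to what is described above, not the exact formula used in AlphaGo or in the meta reasoning work.

import math

def select_move_to_think_about(stats, exploration=1.4):
    # stats maps each candidate move to (mean_value, times_examined), counts > 0.
    total = sum(n for _, n in stats.values())
    def optimistic(item):
        mean, n = item[1]
        return mean + exploration * math.sqrt(math.log(total) / n)
    return max(stats.items(), key=optimistic)[0]

print(select_move_to_think_about({"safe_draw": (0.50, 10), "unclear": (0.45, 2)}))
# Prints "unclear": it looks slightly worse on average, but it has been examined
# far less, so there is more chance that further thinking will change our mind.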
link |
00:08:15.280
Yeah. So the early parts of the search tree are fairly bushy, in that it will look at a lot
link |
00:08:22.400
of different possibilities, but fairly quickly the degree of certainty about some of the moves increases.
link |
00:08:27.840
I mean, if a move is really terrible, you'll pretty quickly find out, right? You'll lose
link |
00:08:31.760
half your pieces or half your territory. And then you'll say, okay, this is not worth thinking
link |
00:08:37.200
about anymore. And then further down, the tree becomes very long and narrow. And you're following
link |
00:08:45.280
various lines of play, 10, 20, 30, 40, 50 moves into the future. And that again is something
link |
00:08:54.800
that human beings have a very hard time doing mainly because they just lack the short term memory.
link |
00:09:02.480
You just can't remember a sequence of moves that's 50 moves long. And you can't imagine
link |
00:09:09.440
the board correctly for that many moves into the future. Of course, the top players,
link |
00:09:16.400
I'm much more familiar with chess, but the top players probably have,
link |
00:09:19.280
they have echoes of the same kind of intuition instinct that in a moment's time, AlphaGo applies
link |
00:09:27.280
when they see a board. I mean, they've seen those patterns, human beings have seen those
link |
00:09:31.760
patterns before at the top, at the grandmaster level. It seems that there is some
link |
00:09:40.000
similarities, or maybe it's our imagination creates a vision of those similarities, but it
link |
00:09:45.920
feels like this kind of pattern recognition that the AlphaGo approaches are using is similar to
link |
00:09:53.120
what human beings at the top level are using. I think there's some truth to that.
link |
00:10:01.520
But not entirely. Yeah, I mean, I think the extent to which a human grandmaster can reliably
link |
00:10:10.080
instantly recognize the right move and instantly recognize the value of a position.
link |
00:10:13.680
I think that's a little bit overrated. But if you sacrifice a queen, for example,
link |
00:10:19.120
I mean, there's these beautiful games of chess with Bobby Fischer or somebody, where
link |
00:10:24.640
they're seemingly making a bad move. And I'm not sure there's a perfect degree of calculation
link |
00:10:32.720
involved where they've calculated all the possible things that happen. But there's an
link |
00:10:37.440
instinct there, right, that somehow adds up to... Yeah, so I think what happens is you get a sense
link |
00:10:45.440
that there's some possibility in the position, even if you make a weird looking move, that it
link |
00:10:51.680
opens up some lines of calculation that otherwise would be definitely bad. And it's that intuition
link |
00:11:05.120
that there's something here in this position that might yield a win down the line. And then
link |
00:11:13.920
you follow that. Right. And in some sense, when a chess player is following a line in his or her
link |
00:11:20.640
mind, they're mentally simulating what the other person is going to do, what the opponent
link |
00:11:27.120
is going to do. And they can do that as long as the moves are kind of forced, right, as long as
link |
00:11:33.680
there's what we call a forcing variation, where the opponent doesn't really have much choice in how to
link |
00:11:39.440
respond. And then you see if you can force them into a situation where you win. We see plenty
link |
00:11:45.200
of mistakes, even in grandmaster games, where they just miss some simple three, four, five move
link |
00:11:54.560
combination that wasn't particularly apparent in the position, but was still there.
link |
00:12:00.400
That's the thing that makes us human. Yeah. So you mentioned that in Othello, the program,
link |
00:12:07.360
after some meta reasoning improvements in the research, was able to beat you. How did that make
link |
00:12:13.760
you feel? Part of the meta reasoning capability that it had was based on learning.
link |
00:12:23.280
and you could sit down the next day and you could just feel that it had got a lot smarter.
link |
00:12:28.160
You know, and all of a sudden, you really felt like you're sort of pressed against
link |
00:12:34.480
the wall because it was much more aggressive and was totally unforgiving of any
link |
00:12:40.800
minor mistake that you might make. And actually, it seemed to understand the game better than I did.
link |
00:12:47.760
And Garry Kasparov has this quote where during his match against Deep Blue, he said he suddenly
link |
00:12:55.520
felt that there was a new kind of intelligence across the board. Do you think that's a scary or
link |
00:13:01.680
an exciting possibility for Kasparov and for yourself in the context of chess purely sort of
link |
00:13:10.240
in that feeling, whatever that is? I think it's definitely an exciting feeling.
link |
00:13:17.600
You know, this is what made me work on AI in the first place was as soon as I really understood
link |
00:13:23.680
what a computer was, I wanted to make it smart. You know, I started out with the first program I
link |
00:13:30.080
wrote was for the Sinclair Programmable Calculator. And I think you could write a 21 step algorithm.
link |
00:13:38.640
That was the biggest program you could write something like that and do little arithmetic
link |
00:13:44.160
calculations. So I think I implemented Newton's method for square roots and a few other things
link |
00:13:49.440
like that. But then, you know, I thought, okay, if I just had more space, I could make this thing
link |
00:13:56.640
intelligent. And so I started thinking about AI. And I think the thing that's scary is not the
link |
00:14:10.560
chess program, because you know, chess programs, they're not in the taking over the world business.
link |
00:14:19.520
But if you extrapolate, you know, there are things about chess that don't resemble the real
link |
00:14:29.440
world, right? We know the rules of chess. The chess board is completely visible
link |
00:14:37.600
to the program, whereas, of course, the real world is not. Most of the real world is not visible from
link |
00:14:43.280
wherever you're sitting, so to speak. And to overcome those kinds of problems, you need
link |
00:14:52.400
qualitatively different algorithms. Another thing about the real world is that, you know,
link |
00:14:58.240
we regularly plan ahead on timescales involving billions or trillions of steps. Now,
link |
00:15:07.520
we don't plan those in detail. But, you know, when you choose to do a PhD at Berkeley,
link |
00:15:14.800
that's a five year commitment that amounts to about a trillion motor control steps that you
link |
00:15:20.480
will eventually be committed to. Including going up the stairs, opening doors,
link |
00:15:26.160
drinking water, typing. Yeah, I mean, every finger movement while you're typing every character
link |
00:15:32.880
of every paper and the thesis and everything. So you're not committing in advance to the specific
link |
00:15:37.280
motor control steps, but you're still reasoning on a timescale that will eventually reduce to
link |
00:15:44.400
trillions of motor control actions. And so for all these reasons,
link |
00:15:50.000
you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity. But
link |
00:15:58.160
they are a step towards it, right? And progress in AI occurs by essentially removing one by one
link |
00:16:08.320
these assumptions that make problems easy, like the assumption of complete observability
link |
00:16:14.640
of the situation, right? If we remove that assumption, you need a much more complicated kind of computing
link |
00:16:22.160
design and you need something that actually keeps track of all the things you can't see
link |
00:16:26.000
and tries to estimate what's going on. And there's inevitable uncertainty in that. So it becomes a
link |
00:16:31.920
much more complicated problem. But, you know, we are removing those assumptions, we are starting to
link |
00:16:38.160
have algorithms that can cope with much longer timescales, that can cope with uncertainty, that can cope
link |
00:16:44.400
with partial observability. And so each of those steps sort of magnifies by a thousand the range
link |
00:16:53.360
of things that we can do with AI systems. So the way I started in AI, I wanted to be a psychiatrist
link |
00:16:58.400
for a long time and understand the mind in high school, and of course program and so on. And I
link |
00:17:03.840
showed up at the University of Illinois to an AI lab and they said, okay, I don't have time for you, but here
link |
00:17:10.640
is a book, AI: A Modern Approach, I think it was the first edition at the time. Here, go learn this.
link |
00:17:18.480
And I remember the lay of the land was, well, it's incredible that we solved chess, but we'll
link |
00:17:23.120
never solve go. I mean, it was pretty certain that go in the way we thought about systems that reason
link |
00:17:31.520
wasn't possible to solve. And now we've solved it. So it's a very... Well, I think I would have said
link |
00:17:36.080
that it's unlikely we could take the kind of algorithm that was used for chess and just get
link |
00:17:44.080
it to scale up and work well for go. And at the time, what we thought was that in order to solve
link |
00:17:55.680
go, we would have to do something similar to the way humans manage the complexity of go,
link |
00:18:01.600
which is to break it down into kind of sub games. So when a human thinks about a go board,
link |
00:18:06.960
they think about different parts of the board as sort of weakly connected to each other.
link |
00:18:12.480
And they think about, okay, within this part of the board, here's how things could go.
link |
00:18:16.880
In that part of board, here's how things could go. And then you try to sort of couple those
link |
00:18:20.560
two analyses together and deal with the interactions and maybe revise your views of how things are
link |
00:18:26.400
going to go in each part. And then you've got maybe five, six, seven, 10 parts of the board. And
link |
00:18:33.440
that actually resembles the real world much more than chess does. Because in the real world,
link |
00:18:41.440
we have work, we have home life, we have sport, whatever different kinds of activities, shopping,
link |
00:18:49.200
these all are connected to each other, but they're weakly connected. So when I'm typing a paper,
link |
00:18:58.480
I don't simultaneously have to decide which order I'm going to get the milk and the butter.
link |
00:19:04.400
That doesn't affect the typing. But I do need to realize, okay, better finish this
link |
00:19:10.320
before the shops close because I don't have anything, I don't have any food at home.
link |
00:19:14.080
So there's some weak connection, but not in the way that chess works, where everything is tied
link |
00:19:20.560
into a single stream of thought. So the thought was that to solve go, you would have to make progress
link |
00:19:27.600
on stuff that would be useful for the real world. And in a way, AlphaGo is a little bit disappointing
link |
00:19:32.480
because the program designed for AlphaGo is actually not that different from
link |
00:19:38.160
Deep Blue or even from Arthur Samuel's checker playing program from the 1950s.
link |
00:19:48.160
And in fact, the two things that make AlphaGo work: one is this amazing ability to evaluate
link |
00:19:54.560
the positions. And the other is the meta reasoning capability, which allows it to
link |
00:19:59.200
explore some paths in the tree very deeply and to abandon other paths very quickly.
link |
00:20:06.960
So this word meta reasoning, while technically correct, inspires perhaps the wrong
link |
00:20:16.000
degree of power that AlphaGo has. For example, the word reasoning is a powerful word. So let me
link |
00:20:21.360
ask you sort of... you were part of the symbolic AI world for a while, like where AI was, there's
link |
00:20:29.840
a lot of excellent interesting ideas there that unfortunately met a winter. And so do you think
link |
00:20:38.960
it reemerges? Oh, so I would say, yeah, it's not quite as simple as that. So the AI winter,
link |
00:20:46.800
the first winter that was actually named as such was the one in the late 80s.
link |
00:20:56.400
And that came about because in the mid 80s, there was
link |
00:21:03.280
really a concerted attempt to push AI out into the real world using what was called
link |
00:21:10.480
expert system technology. And for the most part, that technology was just not ready for prime
link |
00:21:17.280
time. They were trying in many cases to do a form of uncertain reasoning, judgment, combinations of
link |
00:21:27.200
evidence, diagnosis, those kinds of things, which was simply invalid. And when you try to apply
link |
00:21:34.640
invalid reasoning methods to real problems, you can fudge it for small versions of the problem.
link |
00:21:40.960
But when it starts to get larger, the thing just falls apart. So many companies found that
link |
00:21:49.040
the stuff just didn't work. And they were spending tons of money on consultants to
link |
00:21:53.440
try to make it work. And there were other practical reasons, like they were asking
link |
00:21:59.600
the companies to buy incredibly expensive Lisp machine workstations, which were literally
link |
00:22:07.760
between $50,000 and $100,000 in 1980s money, which would be between $150,000 and $300,000 per
link |
00:22:17.680
workstation in current prices. And then the bottom line was, they weren't seeing a profit from it.
link |
00:22:24.000
Yeah. In many cases, I think there were some successes. There's no doubt about that. But
link |
00:22:30.880
people, I would say, over invested. Every major company was starting an AI department just like
link |
00:22:37.760
now. And I worry a bit that we might see similar disappointments, not because the
link |
00:22:45.840
current technology is invalid, but it's limited in its scope. And it's almost the dual of the
link |
00:22:57.600
scope problems that expert systems had. What have you learned from that hype cycle? And
link |
00:23:03.360
what can we do to prevent another winter, for example? Yeah. So when I'm giving talks these
link |
00:23:09.760
days, that's one of the warnings that I give. So there's a two part warning slide. One is that
link |
00:23:18.480
rather than data being the new oil, data is the new snake oil. That's a good line. And then
link |
00:23:26.000
the other is that we might see a very visible failure in some of the major application areas.
link |
00:23:35.440
And I think self driving cars would be the flagship. And I think
link |
00:23:43.600
when you look at the history, so the first self driving car was on the freeway,
link |
00:23:51.200
driving itself, changing lanes, overtaking in 1987. And so it's more than 30 years.
link |
00:24:00.400
And that kind of looks like where we are today, right? Prototypes on the freeway,
link |
00:24:06.720
changing lanes and overtaking. Now, I think significant progress has been made, particularly
link |
00:24:13.760
on the perception side. So we worked a lot on autonomous vehicles in the early, mid 90s at
link |
00:24:20.560
Berkeley. And we had our own big demonstrations. We put congressmen into self driving cars and
link |
00:24:29.040
had them zooming along the freeway. And the problem was clearly perception.
link |
00:24:37.520
At the time, the problem was perception. Yeah. So in simulation, with perfect perception,
link |
00:24:42.880
you could actually show that you can drive safely for a long time, even if the other cars
link |
00:24:47.200
are misbehaving and so on. But simultaneously, we worked on machine vision for detecting cars and
link |
00:24:55.360
tracking pedestrians and so on. And we couldn't get the reliability of detection and tracking
link |
00:25:03.040
up to a high enough level, particularly in bad weather conditions, nighttime rainfall.
link |
00:25:11.440
Good enough for demos, but perhaps not good enough to cover the general operation.
link |
00:25:16.000
Yeah. So the thing about driving is, so suppose you're a taxi driver and you drive every day,
link |
00:25:20.800
eight hours a day for 10 years, that's 100 million seconds of driving. And any one of those
link |
00:25:27.360
seconds, you can make a fatal mistake. So you're talking about eight nines of reliability.
link |
00:25:34.960
Now, if your vision system only detects 98.3% of the vehicles, that's sort of one
link |
00:25:43.840
and a bit nines of reliability. So you have another seven orders of magnitude to go. And this is
link |
00:25:52.720
what people don't understand. They think, oh, because I had a successful demo, I'm pretty much
link |
00:25:57.920
done. But you're not even within seven orders of magnitude of being done. And that's the difficulty.
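A quick back-of-the-envelope version of the arithmetic above, assuming one opportunity for a fatal mistake per second of driving:

import math

seconds_of_driving = 8 * 3600 * 365 * 10     # ten years of eight-hour days
print(f"{seconds_of_driving:.1e}")           # ~1.1e8, i.e. about 100 million seconds

def nines(success_rate):
    # "Nines" of reliability: 0.999 is three nines, 0.99999999 is eight nines.
    return -math.log10(1 - success_rate)

print(round(nines(1 - 1e-8), 1))   # ~8.0 nines needed if any second can be fatal
print(round(nines(0.983), 1))      # ~1.8 nines for a 98.3% detection rate
# The gap between the two is roughly six to seven orders of magnitude.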
link |
00:26:07.440
And it's not, can I follow a white line? That's not the problem. We can follow a white line all the
link |
00:26:14.320
way across the country. But it's the weird stuff that happens. It's all the edge cases. Yeah.
link |
00:26:22.160
The edge cases, other drivers doing weird things. So if you talk to Google, so they had actually
link |
00:26:30.640
a very classical architecture where you had machine vision, which would detect all the
link |
00:26:36.560
other cars and pedestrians and the white lines and the road signs. And then basically,
link |
00:26:42.480
that was fed into a logical database. And then you had a classical 1970s rule based expert system
link |
00:26:52.000
telling you, okay, if you're in the middle lane, and there's a bicyclist in the right lane,
link |
00:26:55.680
who is signaling this, then do that, right? And what they found was that every day they'd go
link |
00:27:03.040
out and there'd be another situation that the rules didn't cover. So they come to a traffic
link |
00:27:07.760
circle and there's a little girl riding her bicycle the wrong way around the traffic circle.
link |
00:27:11.680
Okay, what do you do? We don't have a rule. Oh my God. Okay, stop. And then they come back
link |
00:27:17.520
and add more rules. And they just found that this was not really converging. And if you think about
link |
00:27:24.400
it, right, how do you deal with an unexpected situation, meaning one that you've never previously
link |
00:27:31.280
encountered, and the sort of reasoning required to figure out the solution for that
link |
00:27:37.200
situation has never been done. It doesn't match any previous situation in terms of the kind of
link |
00:27:42.800
reasoning you have to do. Well, in chess programs, this happens all the time. You're constantly
link |
00:27:49.520
coming up with situations you haven't seen before. And you have to reason about them and you have
link |
00:27:54.560
to think about, okay, here are the possible things I could do. Here are the outcomes. Here's how
link |
00:27:59.840
desirable the outcomes are and then pick the right one. In the 90s, we were saying, okay,
link |
00:28:04.560
this is how you're going to have to do automated vehicles. They're going to have to have a look
link |
00:28:08.160
ahead capability. But the look ahead for driving is more difficult than it is for chess. Because
link |
00:28:14.400
of humans. Right, there's humans and they're less predictable than chess pieces. Well,
link |
00:28:20.720
then you have an opponent in chess who's also somewhat unpredictable. But for example, in chess,
link |
00:28:28.240
you always know the opponent's intention. They're trying to beat you. Whereas in driving, you don't
link |
00:28:33.600
know, is this guy trying to turn left or has he just forgotten to turn off his turn signal? Or is
link |
00:28:39.040
he drunk? Or is he changing the channel on his radio or whatever it might be, you got to try and
link |
00:28:45.680
figure out the mental state, the intent of the other drivers to forecast the possible evolutions
link |
00:28:52.560
of their trajectories. And then you got to figure out, okay, which is the trajectory for me that's
link |
00:28:58.160
going to be safest. And those all interact with each other because the other drivers are going
link |
00:29:04.000
to react to your trajectory and so on. So, you know, they've got the classic merging onto the
link |
00:29:09.120
freeway problem where you're kind of racing a vehicle that's already on the freeway and you're
link |
00:29:14.640
are you going to pull ahead of them or are you going to let them go first and pull in behind
link |
00:29:17.680
and you get this sort of uncertainty about who's going first. So all those kinds of things
link |
00:29:23.680
mean that you need a decision making architecture that's very different from either a rule based
link |
00:29:34.720
system or, it seems to me, a kind of end to end neural network system. You know, so just as
link |
00:29:41.360
AlphaGo is pretty good when it doesn't do any look ahead, but it's way, way better when it does.
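As a toy illustration of the intent-inference step described above (is the other driver turning, or did they just forget their signal?), here is a tiny Bayesian update. All of the probabilities are invented for illustration; a real system would learn them from data.

priors = {"turning_left": 0.2, "forgot_signal": 0.1, "going_straight": 0.7}

# P(left signal observed on | intent) -- purely illustrative numbers.
likelihood_signal_on = {"turning_left": 0.9, "forgot_signal": 0.8, "going_straight": 0.05}

def posterior(prior, likelihood):
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())
    return {h: round(p / z, 2) for h, p in unnormalized.items()}

print(posterior(priors, likelihood_signal_on))
# {'turning_left': 0.61, 'forgot_signal': 0.27, 'going_straight': 0.12}
# The signal alone doesn't settle the question, which is why the planner has to
# reason over several possible futures rather than a single predicted one.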
link |
00:29:47.360
I think the same is going to be true for driving. You can have a driving system that's pretty good
link |
00:29:54.080
when it doesn't do any look ahead, but that's not good enough. You know, and we've already seen
link |
00:29:59.920
multiple deaths caused by poorly designed machine learning algorithms that don't really
link |
00:30:07.440
understand what they're doing. Yeah, and on several levels. I think on the perception side,
link |
00:30:13.600
there's mistakes being made by those algorithms where the perception is very shallow. On the
link |
00:30:19.520
planning side, the look ahead, like you said. And the thing that we come up against that's
link |
00:30:28.560
really interesting when you try to deploy systems in the real world is
link |
00:30:33.280
you can't think of an artificial intelligence system as a thing that responds to the world always.
link |
00:30:38.320
You have to realize that it's an agent that others will respond to as well.
link |
00:30:41.600
Well, so in order to drive successfully, you can't just try to do obstacle avoidance.
link |
00:30:47.840
You can't pretend that you're invisible, right? You're the invisible car.
link |
00:30:52.400
It doesn't work that way. I mean, you have to assert yourself, yet others have to be scared of you,
link |
00:30:57.280
there's this tension, there's this game. So we do a lot of work with pedestrians.
link |
00:31:04.160
If you approach pedestrians as purely an obstacle avoidance problem, so you're not doing look
link |
00:31:09.360
ahead as in modeling the intent, they're going to take advantage of you.
link |
00:31:15.040
They're not going to respect you at all. There has to be a tension, a fear, some amount of
link |
00:31:20.080
uncertainty. That's how we have created. Or at least just a kind of a resoluteness.
link |
00:31:28.000
You have to display a certain amount of resoluteness. You can't be too tentative.
link |
00:31:32.000
Yeah. So the solutions then become pretty complicated. You get into game theoretic
link |
00:31:42.480
analyses. So at Berkeley now, we're working a lot on this kind of interaction between machines
link |
00:31:50.960
and humans. And that's exciting. So my colleague, Anca Dragan, actually, if you formulate the problem
link |
00:32:03.600
game theoretically and you just let the system figure out the solution, it does interesting,
link |
00:32:08.800
unexpected things. Like sometimes at a stop sign, if no one is going first, the car will
link |
00:32:16.640
actually back up a little. It's just to indicate to the other cars that they should go. And that's
link |
00:32:23.200
something it invented entirely by itself. That's interesting. We didn't say this is the language
link |
00:32:28.480
of communication at stop signs. It figured it out. That's really interesting. So let me just
link |
00:32:36.240
step back for a second. Just this beautiful philosophical notion. So Pamela McCorduck in
link |
00:32:42.960
1979 wrote AI began with the ancient wish to forge the gods. So when you think about the
link |
00:32:50.320
history of our civilization, do you think that there is an inherent desire to create,
link |
00:32:58.960
let's not say gods, but to create superintelligence? Is it inherent to us? Is it in our genes,
link |
00:33:05.680
that the natural arc of human civilization is to create things that are of greater and greater
link |
00:33:13.680
power and perhaps echoes of ourselves? So to create the gods, as Pamela said.
link |
00:33:21.680
It may be. I mean, we're all individuals, but certainly we see over and over again in history
link |
00:33:35.760
individuals who thought about this possibility. Hopefully, I'm not being too philosophical here.
link |
00:33:42.320
But if you look at the arc of this, where this is going and we'll talk about AI safety,
link |
00:33:48.560
we'll talk about greater and greater intelligence, do you see that when you created the Othello
link |
00:33:55.840
program and you felt this excitement, what was that excitement? Was it the excitement of a tinkerer
link |
00:34:01.680
who created something cool, like a clock? Or was there a magic, or was it more like a child being
link |
00:34:10.240
born? Yeah. So I mean, I certainly understand that viewpoint. And if you look at the Lighthill
link |
00:34:17.520
report, so in the 70s, there was a lot of controversy in the UK about AI and whether it
link |
00:34:26.640
was for real and how much money the government should invest. So it's a long story, but the
link |
00:34:34.720
government commissioned a report by Lighthill, who was a physicist, and he wrote a very damning
link |
00:34:43.280
report about AI, which I think was the point. And he said that these are frustrated men who
link |
00:34:54.480
unable to have children would like to create life as a kind of replacement, which I think is
link |
00:35:05.760
really pretty unfair. But there is a kind of magic, I would say, when you build something
link |
00:35:25.680
and what you're building in is really just you're building in some understanding of the
link |
00:35:29.760
principles of learning and decision making. And to see those principles actually then
link |
00:35:37.840
turn into intelligent behavior in specific situations, it's an incredible thing. And
link |
00:35:47.920
that is naturally going to make you think, okay, where does this end?
link |
00:36:00.080
And so there's magical, optimistic views of where this goes, and whatever your view of optimism is,
link |
00:36:08.240
whatever your view of utopia is, it's probably different for everybody. But you've often talked
link |
00:36:13.360
about concerns you have of how things might go wrong. So I've talked to Max Tegmark. There's a
link |
00:36:26.080
lot of interesting ways to think about AI safety. You're one of the seminal people thinking about
link |
00:36:33.360
this problem amongst sort of being in the weeds of actually solving specific AI problems,
link |
00:36:39.360
you're also thinking about the big picture of where we're going. So can you talk about
link |
00:36:44.080
several elements of it? Let's just talk about maybe the control problem. So this idea of
link |
00:36:50.800
losing ability to control the behavior of our AI system. So how do you see that? How do you see
link |
00:36:58.720
that coming about? What do you think we can do to manage it?
link |
00:37:04.480
Well, so it doesn't take a genius to realize that if you make something that's smarter than you,
link |
00:37:11.520
you might have a problem. Alan Turing wrote about this and gave lectures about this,
link |
00:37:21.600
in 1951. He did a lecture on the radio. And he basically says, once the machine thinking method
link |
00:37:32.480
starts, very quickly, they'll outstrip humanity. And if we're lucky, we might be able to turn off
link |
00:37:45.600
the power at strategic moments, but even so, our species would be humbled. And actually,
link |
00:37:52.160
I think he was wrong about that. If it's a sufficiently intelligent machine, it's not
link |
00:37:56.240
going to let you switch it off. It's actually in competition with you.
link |
00:38:00.160
So what do you think is meant just for a quick tangent if we shut off this
link |
00:38:05.840
super intelligent machine that our species would be humbled?
link |
00:38:11.840
I think he means that we would realize that we are inferior, that we only survive by the skin
link |
00:38:20.560
of our teeth because we happen to get to the off switch just in time. And if we hadn't,
link |
00:38:27.440
then we would have lost control over the earth. So are you more worried when you think about
link |
00:38:34.400
this stuff about super intelligent AI or are you more worried about super powerful AI that's not
link |
00:38:41.600
aligned with our values? So the paperclip scenarios kind of... I think so the main problem I'm
link |
00:38:49.760
working on is the control problem, the problem of machines pursuing objectives that are, as you
link |
00:38:58.960
say, not aligned with human objectives. And this has been the way we've thought about AI
link |
00:39:06.720
since the beginning. You build a machine for optimizing and then you put in some objective
link |
00:39:15.120
and it optimizes. And we can think of this as the King Midas problem. Because if King Midas
link |
00:39:26.480
put in this objective, everything I touch should turn to gold and the gods, that's like the machine,
link |
00:39:32.640
they said, okay, done. You now have this power and of course his food and his drink and his family
link |
00:39:39.360
all turned to gold and then he dies of misery and starvation. It's a warning, it's a failure mode that
link |
00:39:50.080
pretty much every culture in history has had some story along the same lines. There's the
link |
00:39:56.160
genie that gives you three wishes, and the third wish is always, please undo the first two wishes because
link |
00:40:01.920
I messed up. And when Arthur Samuel wrote his checker playing program, which learned to play
link |
00:40:11.920
checkers considerably better than Arthur Samuel could play and actually reached a pretty decent
link |
00:40:16.800
standard, Norbert Wiener, who was one of the major mathematicians of the 20th century, he's sort of
link |
00:40:25.040
the father of modern automation control systems. He saw this and he basically extrapolated
link |
00:40:33.360
as Turing did and said, okay, this is how we could lose control. And specifically that
link |
00:40:45.520
we have to be certain that the purpose we put into the machine is the purpose which we really
link |
00:40:50.960
desire. And the problem is, we can't do that. Right. You mean it's very difficult
link |
00:40:59.760
to encode, to put our values on paper is really difficult? Or are you just saying it's impossible?
link |
00:41:09.120
The line is gray between the two. So theoretically, it's possible, but in practice,
link |
00:41:15.360
it's extremely unlikely that we could specify correctly in advance the full range of concerns
link |
00:41:23.520
of humanity. You talked about cultural transmission of values, I think that's how human to human
link |
00:41:29.360
transmission of values happens, right? Well, we learn, yeah, I mean, as we grow up, we learn about
link |
00:41:36.320
the values that matter, how things should go, what is reasonable to pursue and what isn't
link |
00:41:42.640
reasonable to pursue. I think machines can learn in the same kind of way. Yeah. So I think that
link |
00:41:49.120
what we need to do is to get away from this idea that you build an optimizing machine and then you
link |
00:41:54.480
put the objective into it. Because if it's possible that you might put in a wrong objective, and we
link |
00:42:03.200
already know this is possible because it's happened lots of times, right? That means that the machine
link |
00:42:08.880
should never take an objective that's given as gospel truth. Because once it takes the objective
link |
00:42:17.760
as gospel truth, then it believes that whatever actions it's taking in pursuit of that objective
link |
00:42:26.800
are the correct things to do. So you could be jumping up and down and saying, no, no, no, no,
link |
00:42:31.200
you're going to destroy the world, but the machine knows what the true objective is and is pursuing
link |
00:42:36.480
it and tough luck to you. And this is not restricted to AI, right? This is, I think,
link |
00:42:43.360
many of the 20th century technologies, right? So in statistics, you minimize a loss function,
link |
00:42:48.880
the loss function is exogenously specified in control theory, you minimize a cost function,
link |
00:42:54.320
in operations research, you maximize a reward function, and so on. So in all these disciplines,
link |
00:42:59.840
this is how we conceive of the problem. And it's the wrong problem. Because we cannot specify
link |
00:43:08.560
with certainty the correct objective, right? We need uncertainty, we need the machine to be
link |
00:43:15.360
uncertain about what it is that it's supposed to be maximizing.
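A minimal sketch of what "uncertain about the objective" can mean in code: the agent keeps a probability distribution over candidate objectives, acts on expected utility, defers to the human when the candidates disagree strongly about its preferred action, and re-weights the candidates when the human objects. The candidate objectives, thresholds, and update rule below are all invented for illustration; this is not the formal assistance-game formulation.

def choose_action(actions, beliefs, defer_threshold=0.2):
    # beliefs: {objective_function: probability}, summing to 1.
    def expected_utility(a):
        return sum(p * obj(a) for obj, p in beliefs.items())
    best = max(actions, key=expected_utility)
    # Crude humility check: if the candidate objectives disagree a lot about
    # the chosen action, defer to the human instead of just acting.
    disagreement = max(obj(best) for obj in beliefs) - min(obj(best) for obj in beliefs)
    return "ask_human" if disagreement > defer_threshold else best

def update_on_correction(beliefs, rejected_action):
    # The human said "don't do that": down-weight objectives that endorsed it.
    reweighted = {obj: p * (0.1 if obj(rejected_action) > 0 else 1.0) for obj, p in beliefs.items()}
    z = sum(reweighted.values())
    return {obj: p / z for obj, p in reweighted.items()}

# Two made-up candidate objectives the agent is unsure between.
beliefs = {(lambda a: 1.0 if a == "maximize_clicks" else 0.0): 0.5,
           (lambda a: 1.0 if a == "respect_user" else 0.0): 0.5}
print(choose_action(["maximize_clicks", "respect_user"], beliefs))   # -> ask_human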
link |
00:43:19.440
It's my favorite idea of yours. I've heard you say somewhere, well, I shouldn't pick favorites,
link |
00:43:25.200
but it just sounds beautiful. We need to teach machines humility. It's a beautiful way to put
link |
00:43:32.640
it. I love it. That they're humble. They know that they don't know what it is they're supposed
link |
00:43:40.320
to be doing. And that those objectives, I mean, they exist, they're within us, but we may not
link |
00:43:48.240
be able to explicate them. We may not even know how we want our future to go.
link |
00:43:57.040
So exactly. And a machine that's uncertain is going to be deferential to us. So if we say,
link |
00:44:06.800
don't do that, well, now the machines learn something a bit more about our true objectives,
link |
00:44:11.840
because something that it thought was reasonable in pursuit of our objective,
link |
00:44:16.480
it turns out not to be so now it's learned something. So it's going to defer because it
link |
00:44:20.800
wants to be doing what we really want. And that point, I think, is absolutely central
link |
00:44:30.240
to solving the control problem. And it's a different kind of AI when you take away this
link |
00:44:37.920
idea that the objective is known, then in fact, a lot of the theoretical frameworks that we're so
link |
00:44:44.560
familiar with, you know, Markov decision processes, goal based planning, you know,
link |
00:44:53.520
standard game tree search, all of these techniques actually become inapplicable.
link |
00:45:01.040
And you get a more complicated problem because now the interaction with the human becomes
link |
00:45:11.120
part of the problem. Because the human by making choices is giving you more information about
link |
00:45:21.280
the true objective and that information helps you achieve the objective better.
link |
00:45:26.640
And so that really means that you're mostly dealing with game theoretic problems where you've
link |
00:45:32.000
got the machine and the human and they're coupled together, rather than a machine going off by itself
link |
00:45:38.000
with a fixed objective. Which is fascinating on the machine and the human level, that
link |
00:45:44.400
when you don't have an objective, it means you're together coming up with an objective. I mean,
link |
00:45:51.920
there's a lot of philosophy that, you know, you could argue that life doesn't really have meaning.
link |
00:45:56.160
We together agree on what gives it meaning and we kind of culturally create
link |
00:46:01.680
things that give meaning to why the heck we are on this earth anyway. We together as a society create
link |
00:46:08.560
that meaning and you have to learn that objective. And one of the biggest, I thought that's where
link |
00:46:13.680
you were going to go for a second. One of the biggest troubles we run into outside of statistics
link |
00:46:19.200
and machine learning and AI in just human civilization is when you look at, I came from,
link |
00:46:26.240
I was born in the Soviet Union. And the history of the 20th century, we ran into the most trouble,
link |
00:46:32.160
us humans, when there was a certainty about the objective. And you do whatever it takes to achieve
link |
00:46:40.160
that objective, whether you're talking about Germany or communist Russia, you get into trouble
link |
00:46:46.480
with humans. And I would say with corporations, in fact, some people argue that we don't have
link |
00:46:52.960
to look forward to a time when AI systems take over the world, they already have. And they're called
link |
00:46:57.840
corporations, right? That corporations happen to be using people as components right now.
link |
00:47:05.920
But they are effectively algorithmic machines, and they're optimizing an objective, which is
link |
00:47:11.680
quarterly profit that isn't aligned with overall well being of the human race. And they are
link |
00:47:18.080
destroying the world. They are primarily responsible for our inability to tackle climate change.
link |
00:47:24.960
So I think that's one way of thinking about what's going on with corporations. But
link |
00:47:31.840
I think the point you're making is valid, that there are many systems in the real world where
link |
00:47:39.680
we've sort of prematurely fixed on the objective and then decoupled the machine from those that
link |
00:47:48.480
are supposed to be serving. And I think you see this with government, right? Government is supposed
link |
00:47:54.720
to be a machine that serves people. But instead, it tends to be taken over by people who have their
link |
00:48:02.720
own objective and use government to optimize that objective, regardless of what people want.
link |
00:48:08.160
Do you find appealing the idea of almost arguing machines where you have multiple AI systems with
link |
00:48:16.080
a clear fixed objective? We have in government the red team and the blue team that are very fixed
link |
00:48:22.400
on their objectives. And they argue, and maybe they disagree, but it kind of seems to
link |
00:48:28.240
make it work somewhat that the duality of it, okay, let's go 100 years back when there was still
link |
00:48:39.440
was going on, or at the founding of this country, there were disagreements, and that disagreement is
link |
00:48:44.480
where so it was a balance between certainty and forced humility because the power was distributed.
link |
00:48:52.160
Yeah, I think that the nature of debate and disagreement argument takes as a premise the idea
link |
00:49:05.280
that you could be wrong, which means that you're not necessarily absolutely convinced that your
link |
00:49:12.800
objective is the correct one. If you were absolutely convinced, there'd be no point
link |
00:49:19.520
in having any discussion or argument because you would never change your mind. And there wouldn't
link |
00:49:24.160
be any sort of synthesis or anything like that. So I think you can think of argumentation as an
link |
00:49:32.080
implementation of a form of uncertain reasoning. I've been reading recently about utilitarianism
link |
00:49:44.640
and the history of efforts to define, in a sort of clear mathematical way, if you like, a formula for
link |
00:49:54.960
moral or political decision making. And it's really interesting that the parallels between
link |
00:50:02.320
the philosophical discussions going back 200 years and what you see now in discussions about
link |
00:50:08.720
existential risk because it's almost exactly the same. So someone would say, okay, well,
link |
00:50:15.040
here's a formula for how we should make decisions. So utilitarianism is roughly each person has a
link |
00:50:21.600
utility function and then we make decisions to maximize the sum of everybody's utility.
link |
00:50:28.720
And then people point out, well, in that case, the best policy is one that leads to
link |
00:50:36.480
an enormously vast population, all of whom are living a life that's barely worth living.
link |
00:50:43.520
And this is called the repugnant conclusion. And another version is that we should maximize
link |
00:50:51.200
pleasure and that's what we mean by utility. And then you'll get people effectively saying,
link |
00:50:57.680
well, in that case, we might as well just have everyone hooked up to a heroin drip. And they
link |
00:51:02.480
didn't use those words. But that debate was happening in the 19th century, as it is now
link |
00:51:09.920
about AI, that if we get the formula wrong, we're going to have AI systems working towards
link |
00:51:17.600
an outcome that in retrospect, would be exactly wrong.
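The arithmetic behind the repugnant conclusion he describes is worth seeing once. With made-up numbers, summing utilities makes a vast population of lives barely worth living come out "better" than a small, very happy one:

happy_world   = 10_000_000 * 100.0          # 10 million people with very good lives
crowded_world = 1_000_000_000_000 * 0.01    # a trillion people, each barely above zero

print(happy_world, crowded_world, crowded_world > happy_world)
# 1000000000.0 10000000000.0 True -- the total-utility formula prefers the crowded world.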
link |
00:51:22.080
Do you think... that's beautifully put, so the echoes are there. But do you think,
link |
00:51:26.960
I mean, if you look at Sam Harris, our imagination worries about the AI version of that, because
link |
00:51:34.640
of the speed at which the things going wrong in the utilitarian context could happen.
link |
00:51:45.840
Is that a worry for you?
link |
00:51:47.280
Yeah, I think that in most cases, not in all, but if we have a wrong political idea,
link |
00:51:55.360
we see it starting to go wrong. And we're not completely stupid. And so we said, okay,
link |
00:52:02.000
maybe that was a mistake. Let's try something different. And also, we're very slow and inefficient
link |
00:52:10.160
about implementing these things and so on. So you have to worry when you have corporations
link |
00:52:15.520
or political systems that are extremely efficient. But when we look at AI systems,
link |
00:52:20.800
or even just computers in general, right, they have this different characteristic
link |
00:52:28.400
from ordinary human activity in the past. So let's say you were a surgeon. You had some idea
link |
00:52:35.040
about how to do some operation, right? Well, and let's say you were wrong, right, that that way
link |
00:52:40.480
of doing the operation would mostly kill the patient. Well, you'd find out pretty quickly,
link |
00:52:45.840
like after three, maybe three or four tries, right? But that isn't true for pharmaceutical
link |
00:52:56.000
companies, because they don't do three or four operations. They manufacture three or four billion
link |
00:53:03.040
pills and they sell them. And then they find out maybe six months or a year later that, oh,
link |
00:53:08.800
people are dying of heart attacks or getting cancer from this drug. And so that's why we have the FDA,
link |
00:53:14.880
right? Because of the scalability of pharmaceutical production. And there have been some unbelievably
link |
00:53:22.960
bad episodes in the history of pharmaceuticals and adulteration of products and so on that have
link |
00:53:34.320
killed tens of thousands or paralyzed hundreds of thousands of people.
link |
00:53:39.360
Now, with computers, we have that same scalability problem that you can
link |
00:53:43.280
sit there and type, for i equals one to five billion, do, right? And all of a sudden,
link |
00:53:49.520
you're having an impact on a global scale. And yet we have no FDA, right? There's absolutely no
link |
00:53:55.360
controls at all over what a bunch of undergraduates with too much caffeine can do to the world.
link |
00:54:03.440
And, you know, we look at what happened with Facebook, well, social media in general, and
link |
00:54:09.600
click through optimization. So you have a simple feedback algorithm that's trying to just optimize
link |
00:54:18.480
click through, right? That sounds reasonable, right? Because you don't want to be feeding people
link |
00:54:24.080
ads that they don't care about or not interested in. And you might even think of that process as
link |
00:54:33.200
simply adjusting the feeding of ads or news articles or whatever it might be to match people's
link |
00:54:42.160
preferences, right? Which sounds like a good idea. But in fact, that isn't how the algorithm works,
link |
00:54:50.880
right? You make more money. The algorithm makes more money. If it can better predict what people
link |
00:54:59.760
are going to click on, because then it can feed them exactly that, right? So the way to maximize
link |
00:55:06.400
click through is actually to modify the people, to make them more predictable. And one way to do
link |
00:55:13.360
that is to feed them information which will change their behavior and preferences towards
link |
00:55:21.920
extremes that make them predictable. Whatever is the nearest extreme or the nearest predictable
link |
00:55:27.600
point, that's where you're going to end up. And the machines will force you there.
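Here is a toy simulation of the feedback loop being described, not any platform's actual algorithm. A simulated user's preference drifts toward whatever content they click on; a recommender that always pushes content slightly more extreme than the user's current preference ends up with a higher click-through rate than one that simply mirrors the preference, because the user it has created is pinned at the extreme and therefore easy to predict. All numbers are invented.

import random

def simulate(policy, steps=5000, seed=1):
    rng = random.Random(seed)
    preference, clicks = 0.1, 0
    for _ in range(steps):
        content = policy(preference)
        # The user's reaction each day is their preference plus noise, clipped to
        # the ends of the scale; at the extremes behaviour is more predictable.
        mood = max(-1.0, min(1.0, preference + rng.uniform(-0.3, 0.3)))
        if abs(mood - content) < 0.2:                    # a click
            clicks += 1
            preference += 0.05 * (content - preference)  # drift toward what was shown
            preference = max(-1.0, min(1.0, preference))
    return round(clicks / steps, 2), round(preference, 2)

mirror    = lambda p: p                                                   # show what they already like
extremize = lambda p: max(-1.0, min(1.0, p + (0.3 if p >= 0 else -0.3)))  # push a bit further out

print("mirror:   ", simulate(mirror))      # roughly (0.67, 0.1): preference unchanged
print("extremize:", simulate(extremize))   # higher click rate, preference driven to 1.0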
link |
00:55:34.480
Now, and I think there's a reasonable argument to say that this, among other things, is
link |
00:55:40.400
contributing to the destruction of democracy in the world. And where was the oversight
link |
00:55:50.160
of this process? Where were the people saying, okay, you would like to apply this algorithm to
link |
00:55:55.600
five billion people on the face of the earth? Can you show me that it's safe? Can you show
link |
00:56:01.120
me that it won't have various kinds of negative effects? No, there was no one asking that question.
link |
00:56:07.040
There was no one placed between the undergrads with too much caffeine and the human race.
link |
00:56:15.520
It's just they just did it. And some way outside the scope of my knowledge,
link |
00:56:20.480
so economists would argue that the invisible hand, so the capitalist system, it was the
link |
00:56:27.120
oversight. So if you're going to corrupt society with whatever decision you make as a company,
link |
00:56:32.480
then that's going to be reflected in people not using your product. That's one model of oversight.
link |
00:56:39.280
We shall see. But in the meantime, you might even have broken the political system
link |
00:56:48.000
that enables capitalism to function. Well, you've changed it. So we should see. Yeah.
link |
00:56:54.960
Change is often painful. So my question is... absolutely, it's fascinating. You're absolutely
link |
00:57:01.360
right that there was zero oversight on algorithms that can have a profound civilization changing
link |
00:57:09.120
effect. So do you think it's possible? I mean, have you seen government?
link |
00:57:15.440
So do you think it's possible to create regulatory bodies oversight over AI algorithms,
link |
00:57:22.800
which are inherently such cutting edge set of ideas and technologies?
link |
00:57:30.960
Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls.
link |
00:57:37.520
I mean, it took time to design the FDA regime. Some people still don't like it and they want
link |
00:57:42.960
to fix it. And I think there are clear ways that it could be improved. But the whole notion that
link |
00:57:50.400
you have stage one, stage two, stage three, and here are the criteria for what you have to do
link |
00:57:56.400
to pass a stage one trial, right? We haven't even thought about what those would be for algorithms.
link |
00:58:02.000
So I mean, I think there are things we could do right now with regard to bias, for
link |
00:58:10.320
example, we have a pretty good technical handle on how to detect algorithms that are propagating
link |
00:58:19.040
bias that exists in data sets, how to debias those algorithms, and even what it's going to cost you
link |
00:58:26.720
to do that. So I think we could start having some standards on that. I think there are things to do
link |
00:58:34.480
with impersonation and falsification that we could work on. So I think, yeah.
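On the bias point above, one of the simplest checks that could already be standardized is the demographic parity gap: compare the rate of positive decisions a model gives to different groups. The data below is invented, and real audits use richer metrics as well.

def demographic_parity_gap(decisions, groups):
    # decisions: 0/1 model outputs; groups: parallel list of group labels.
    by_group = {}
    for d, g in zip(decisions, groups):
        by_group.setdefault(g, []).append(d)
    rates = {g: sum(v) / len(v) for g, v in by_group.items()}
    return rates, max(rates.values()) - min(rates.values())

rates, gap = demographic_parity_gap([1, 1, 0, 1, 0, 0, 0, 1],
                                    ["a", "a", "a", "a", "b", "b", "b", "b"])
print(rates, gap)   # {'a': 0.75, 'b': 0.25} 0.5 -- a large parity gap worth flagging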
link |
00:58:43.360
Or the very simple point. So impersonation is a machine acting as if it was a person.
link |
00:58:51.440
I can't see a real justification for why we shouldn't insist that machines self identify as
link |
00:58:58.800
machines. Where is the social benefit in fooling people into thinking that this is really a person
link |
00:59:08.480
when it isn't? I don't mind if it uses a human like voice that's easy to understand. That's fine.
link |
00:59:15.120
But it should just say, I'm a machine, in some form. And now many people are speaking to that,
link |
00:59:22.800
I would think, relatively obvious fact. So I think most people... Yeah. I mean,
link |
00:59:26.400
there is actually a law in California that bans impersonation, but only in certain
link |
00:59:33.280
restricted circumstances. So for the purpose of engaging in a fraudulent transaction and for
link |
00:59:41.280
the purpose of modifying someone's voting behavior. So those are the circumstances where
link |
00:59:48.160
machines have to self identify. But I think, arguably, it should be in all circumstances.
link |
00:59:56.400
And then when you talk about deep fakes, we're just at the beginning. But already,
link |
01:00:03.280
it's possible to make a movie of anybody saying anything in ways that are pretty hard to detect.
link |
01:00:11.520
Including yourself because you're on camera now and your voice is coming through with high
link |
01:00:15.040
resolution. Yeah. So you could take what I'm saying and replace it with pretty much anything
link |
01:00:19.360
else you wanted me to be saying. And it would even change my lips and facial expressions to fit.
link |
01:00:26.960
And there's actually not much in the way of real legal protection against that.
link |
01:00:35.920
I think in the commercial area, you could say, yeah, that's...
link |
01:00:38.640
You're using my brand and so on. There are rules about that. But in the political sphere, I think,
link |
01:00:47.600
at the moment, it's anything goes. So that could be really, really damaging.
link |
01:00:53.920
And let me just try to make not an argument, but try to look back at history and say something
link |
01:01:02.400
dark, in essence, which is: while regulation seems to be... oversight seems to be exactly the
link |
01:01:09.680
right thing to do here. It seems that human beings, what they naturally do is they wait
link |
01:01:14.480
for something to go wrong. If you're talking about nuclear weapons, you can't talk about
link |
01:01:20.080
nuclear weapons being dangerous until somebody actually, like the United States drops the bomb,
link |
01:01:26.000
or Chernobyl melting. Do you think we will have to wait for things going wrong in a way that's
link |
01:01:34.960
obviously damaging to society, not an existential risk, but obviously damaging?
link |
01:01:42.320
Or do you have faith that... I hope not. But I mean, I think we do have to look at history.
link |
01:01:48.000
So the two examples you gave, nuclear weapons and nuclear power, are very, very interesting because
link |
01:01:59.520
nuclear weapons, we knew in the early years of the 20th century that atoms contained a huge
link |
01:02:07.840
amount of energy. We had E equals MC squared. We knew the mass differences between the different
link |
01:02:13.280
atoms and their components, and we knew that you might be able to make an incredibly powerful
link |
01:02:20.640
explosive. So H.G. Wells wrote a science fiction book, I think, in 1912. Frederick Soddy, who was the
link |
01:02:28.000
guy who discovered isotopes, a Nobel Prize winner, he gave a speech in 1915 saying that
link |
01:02:37.840
one pound of this new explosive would be the equivalent of 150 tons of dynamite,
link |
01:02:42.160
which turns out to be about right. And this was in World War I, so he was imagining how much worse
link |
01:02:51.360
the world war would be if we were using that kind of explosive. But the physics establishment
link |
01:02:57.600
simply refused to believe that these things could be made. Including the people who were making it.
link |
01:03:06.400
Well, so they were doing the nuclear physics. I mean, eventually they were the ones who made it.
link |
01:03:11.280
You talk about Fermi or whoever. Well, so up to then, the development was mostly theoretical. So it was
link |
01:03:21.280
people using sort of primitive kinds of particle acceleration and doing experiments at the level
link |
01:03:28.320
of single particles or collections of particles. They weren't yet thinking about how to actually
link |
01:03:36.480
make a bomb or anything like that. But they knew the energy was there and they figured if they
link |
01:03:40.160
understood it better, it might be possible. But the physics establishment, their view, and I think
link |
01:03:46.720
because they did not want it to be true, their view was that it could not be true.
link |
01:03:53.360
That this could not provide a way to make a super weapon. And there was this famous
link |
01:04:01.120
speech given by Rutherford, who was the sort of leader of nuclear physics. And it was on
link |
01:04:08.240
September 11, 1933. And he said, you know, anyone who talks about the possibility of
link |
01:04:14.800
obtaining energy from transformation of atoms is talking complete moonshine. And the next
link |
01:04:22.800
morning, Leo Szilard read about that speech and then invented the nuclear chain reaction.
link |
01:04:28.560
And so as soon as he had that idea, that you could make a chain reaction
link |
01:04:35.920
with neutrons because neutrons were not repelled by the nucleus so they could enter the nucleus
link |
01:04:40.640
and then continue the reaction. As soon as he had that idea, he instantly realized that the world
link |
01:04:48.480
was in deep doo doo. Because this is 1933, right? Hitler had recently come to power in Germany.
link |
01:04:58.160
Szilard was in London. He eventually became a refugee and he came to the US. And in the
link |
01:05:09.280
process of having the idea about the chain reaction, he figured out basically how to make
link |
01:05:14.880
a bomb and also how to make a reactor. And he patented the reactor in 1934. But because
link |
01:05:22.800
of the situation, the great power conflict situation that he could see happening,
link |
01:05:29.200
he kept that a secret. And so between then and the beginning of World War II,
link |
01:05:39.600
people were working, including the Germans, on how to actually create neutron sources,
link |
01:05:48.000
what specific fission reactions would produce neutrons of the right energy to continue the
link |
01:05:54.720
reaction. And that was demonstrated in Germany, I think in 1938, if I remember correctly. The first
link |
01:06:03.760
nuclear weapon patent was 1939 by the French. So this was actually going on well before World War
link |
01:06:17.280
II really got going. And then the British probably had the most advanced capability
link |
01:06:22.720
in this area. But for safety reasons, among others, and just, sort of, resources,
link |
01:06:28.720
they moved the program from Britain to the US. And then that became the Manhattan Project.
link |
01:06:34.480
So the reason why we couldn't
link |
01:06:37.920
have any kind of oversight of nuclear weapons and nuclear technology was because we were basically
link |
01:06:48.320
already in an arms race in a war. But you mentioned in the 20s and 30s, so what are the echoes?
link |
01:07:00.000
The way you've described the story, I mean, there's clearly echoes. What do you think most AI
link |
01:07:04.320
researchers, folks who are really close to the metal, they really are not concerned about AI,
link |
01:07:11.440
they don't think about it, or maybe they don't want to think about it. But what are the,
link |
01:07:16.960
yeah, why do you think that is? What are the echoes of the nuclear situation to the current AI
link |
01:07:23.760
situation? And what can we do about it? I think there is a kind of motivated cognition, which is
link |
01:07:33.120
a term in psychology that means you believe what you would like to be true, rather than what is
link |
01:07:40.640
true. And it's unsettling to think that what you're working on might be the end of the human race,
link |
01:07:50.800
obviously. So you would rather instantly deny it and come up with some reason why it couldn't be
link |
01:07:58.400
true. And I collected a long list of reasons that extremely intelligent, competent AI scientists
link |
01:08:08.080
have come up with for why we shouldn't worry about this. For example, calculators are superhuman at
link |
01:08:16.560
arithmetic and they haven't taken over the world, so there's nothing to worry about. Well, okay,
link |
01:08:21.680
my five year old could have figured out why that was an unreasonable and really quite weak argument.
link |
01:08:31.520
Another one was, while it's theoretically possible that you could have superhuman AI
link |
01:08:41.760
destroy the world, it's also theoretically possible that a black hole could materialize
link |
01:08:46.480
right next to the earth and destroy humanity. I mean, yes, it's theoretically possible,
link |
01:08:51.760
quantum theoretically, extremely unlikely that it would just materialize right there.
link |
01:08:58.400
But that's a completely bogus analogy because if the whole physics community on earth was working
link |
01:09:04.720
to materialize a black hole in near earth orbit, wouldn't you ask them, is that a good idea? Is
link |
01:09:11.680
that going to be safe? What if you succeed? And that's the thing. The AI community is sort of
link |
01:09:19.040
refused to ask itself, what if you succeed? And initially, I think that was because it was too
link |
01:09:26.240
hard, but Alan Turing asked himself that and he said, we'd be toast. If we were lucky, we might
link |
01:09:35.520
be able to switch off the power but probably we'd be toast. But there's also an aspect
link |
01:09:40.000
that because we're not exactly sure what the future holds, it's not clear exactly, technically,
link |
01:09:49.680
what to worry about, sort of how things go wrong. And so there is something it feels like, maybe
link |
01:09:58.480
you can correct me if I'm wrong, but there's something paralyzing about worrying about something
link |
01:10:04.400
that logically is inevitable. But you don't really know what that will look like.
link |
01:10:10.720
Yeah, I think that's a reasonable point. And it's certainly in terms of existential risks,
link |
01:10:19.440
it's different from an asteroid colliding with the earth, which again is quite possible. It's
link |
01:10:26.800
happened in the past. It'll probably happen again. We don't know right now. But if we did detect an
link |
01:10:33.040
asteroid that was going to hit the earth in 75 years time, we'd certainly be doing something
link |
01:10:39.200
about it. Well, it's clear, there's a big rock, and we'll probably have a meeting and see what
link |
01:10:43.600
do we do about the big rock. With AI? Right, with AI. I mean, there are very few people who think it's
link |
01:10:49.040
not going to happen within the next 75 years. I know Rod Brooks doesn't think it's going to happen.
link |
01:10:55.200
Maybe Andrew Ng doesn't think it's going to happen. But a lot of the people who work day to day,
link |
01:11:00.880
you know, as you say, at the rock face, they think it's going to happen. I think the median
link |
01:11:08.640
estimate from AI researchers is somewhere in 40 to 50 years from now. Or maybe, you know,
link |
01:11:14.320
I think in Asia, they think it's going to be even faster than that. I'm a little bit
link |
01:11:21.280
more conservative. I think it'll probably take longer than that. But I think, you know, as
link |
01:11:25.840
happened with nuclear weapons, it can happen overnight that you have these breakthroughs.
link |
01:11:32.400
And we need more than one breakthrough. But, you know, it's on the order of half a dozen.
link |
01:11:38.800
This is a very rough scale. But so half a dozen breakthroughs of that nature
link |
01:11:45.840
would have to happen for us to reach superhuman AI. But the AI research community is
link |
01:11:53.600
vast now, with massive investments from governments, from corporations, tons of really,
link |
01:12:00.640
really smart people. You just have to look at the rate of progress in different areas of AI
link |
01:12:05.760
to see that things are moving pretty fast. So to say, oh, it's just going to be thousands of years.
link |
01:12:11.920
I don't see any basis for that. You know, I see, you know, for example, the
link |
01:12:18.160
Stanford 100 year AI project, which is supposed to be sort of, you know, the serious establishment view,
link |
01:12:29.520
their most recent report actually said it's probably not even possible.
link |
01:12:34.160
Oh, wow.
link |
01:12:35.280
Right. Which if you want a perfect example of people in denial, that's it. Because, you know,
link |
01:12:42.960
for the whole history of AI, we've been saying to philosophers who said it wasn't possible. Well,
link |
01:12:49.760
you have no idea what you're talking about. Of course, it's possible. Right. Give me an
link |
01:12:53.520
argument for why it couldn't happen. And there isn't one. Right. And now, because people are
link |
01:12:59.760
worried that maybe AI might get a bad name, or I just don't want to think about this,
link |
01:13:05.200
they're saying, okay, well, of course, it's not really possible. You know, imagine, right? Imagine
link |
01:13:09.520
if, you know, the leaders of the cancer biology community got up and said, well, you know, of
link |
01:13:16.000
course, curing cancer, it's not really possible. There'd be a complete outrage and dismay. And,
link |
01:13:25.040
you know, I find this really a strange phenomenon. So,
link |
01:13:30.800
okay, so if you accept it as possible, and if you accept that it's probably going to happen,
link |
01:13:40.560
the point that you're making that, you know, how does it go wrong?
link |
01:13:46.320
A valid question. Without an answer to that question, you're stuck with what I
link |
01:13:51.600
call the gorilla problem, which is, you know, the problem that the gorillas face, right? They
link |
01:13:56.160
made something more intelligent than them, namely us a few million years ago, and now they're in
link |
01:14:02.800
deep doo doo. So there's really nothing they can do. They've lost control. They failed to solve
link |
01:14:09.360
the control problem of controlling humans. And so they've lost. So we don't want to be in that
link |
01:14:16.240
situation. And if the gorilla problem is the only formulation you have, there's not a lot you can do.
link |
01:14:22.320
Right. Other than to say, okay, we should try to stop. You know, we should just not make the humans
link |
01:14:28.240
or, in this case, not make the AI. And I think that's really hard to do.
link |
01:14:35.120
I'm not actually proposing that that's a feasible course of action. And I also think that,
link |
01:14:43.040
you know, if properly controlled AI could be incredibly beneficial.
link |
01:14:46.080
So, but it seems to me that there's a consensus that one of the major
link |
01:14:54.960
failure modes is this loss of control that we create AI systems that are pursuing incorrect
link |
01:15:02.320
objectives. And because the AI system believes it knows what the objective is, it has no incentive
link |
01:15:11.040
to listen to us anymore, so to speak, right? It's just carrying out the,
link |
01:15:17.920
strategy that it has computed as being the optimal solution.
link |
01:15:24.320
And, you know, it may be that in the process, it needs to acquire more resources to increase the
link |
01:15:31.600
possibility of success or prevent various failure modes by defending itself against interference.
link |
01:15:37.360
And so that collection of problems, I think, is something we can address.
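As a toy illustration of the incentive being described here (made-up numbers, and only a sketch of the idea rather than any formal model from the conversation): an agent that is certain of its objective gains nothing by listening to a human, while an agent that is explicitly uncertain about the objective expects to do better by deferring, because the human will block actions whose true payoff is negative.

    import random

    def expected_values(belief_over_u, samples=100_000):
        # Compare two policies under the agent's belief about the human payoff u:
        # "act now" collects u regardless; "defer" lets the human veto, so the
        # action only happens when u > 0.
        act_now = defer = 0.0
        for _ in range(samples):
            u = belief_over_u()
            act_now += u
            defer += max(u, 0.0)
        return act_now / samples, defer / samples

    uncertain = lambda: random.uniform(-1.0, 1.0)  # agent unsure whether we'd approve
    certain = lambda: 0.5                          # agent convinced its objective is right

    print(expected_values(uncertain))  # roughly (0.00, 0.25): deferring is strictly better
    print(expected_values(certain))    # roughly (0.50, 0.50): no incentive to listen to us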
link |
01:15:45.280
The other problems are roughly speaking, you know, misuse, right? So even if we solve the control
link |
01:15:55.360
problem, we make perfectly safe, controllable AI systems. Well, why, you know, why is Dr.
link |
01:16:00.960
Evil going to use those, right? He wants to just take over the world and he'll make unsafe AI systems
link |
01:16:05.600
that then get out of control. So that's one problem, which is sort of a, you know, partly a
link |
01:16:12.000
policing problem, partly a sort of a cultural problem for the profession of how we teach people
link |
01:16:21.760
what kinds of AI systems are safe. You talk about autonomous weapon systems and how pretty much
link |
01:16:26.560
everybody agrees that there are too many ways that that can go horribly wrong. You have this great
link |
01:16:31.920
Slaughterbots movie that kind of illustrates that beautifully. Well, I want to talk about that.
link |
01:16:36.560
That's another topic I'm happy to talk about. I just want to mention that
link |
01:16:41.440
what I see is the third major failure mode, which is overuse, not so much misuse, but overuse of AI,
link |
01:16:49.680
that we become overly dependent. So I call this the WALL-E problem. If you've seen WALL-E,
link |
01:16:55.280
the movie, all right, all the humans are on the spaceship and the machines look after everything
link |
01:17:00.800
for them. And they just watch TV and drink Big Gulps. And they're all sort of obese and stupid.
link |
01:17:07.680
And they sort of totally lost any notion of human autonomy. And, you know, so in effect, right,
link |
01:17:18.240
this would happen like the slow boiling frog, right, we would gradually turn over
link |
01:17:24.320
more and more of the management of our civilization to machines as we are already doing.
link |
01:17:28.480
And this, you know, if this process continues, we sort of gradually
link |
01:17:34.560
switch from sort of being the masters of technology to just being the guests, right?
link |
01:17:41.440
So we become guests on a cruise ship, you know, which is fine for a week, but not,
link |
01:17:46.480
not for the rest of eternity, right? You know, and it's almost irreversible, right? Once you,
link |
01:17:53.520
once you lose the incentive to, for example, you know, learn to be an engineer or a doctor
link |
01:18:00.800
or a sanitation operative or any other of the infinitely many ways that we
link |
01:18:08.000
maintain and propagate our civilization. You know, if you, if you don't have the
link |
01:18:13.200
incentive to do any of that, you won't. And then it's really hard to recover.
link |
01:18:18.400
And of course AI is just one of the technologies that could, in that third failure mode, result in that.
link |
01:18:23.440
There are probably others; technology in general detaches us from it.
link |
01:18:28.400
It does a bit, but the difference is that, in terms of the knowledge
link |
01:18:34.080
to run our civilization, you know, up to now we've had no alternative but to put it into
link |
01:18:39.360
people's heads, right? And if you think of software, with Google, I mean, software in
link |
01:18:44.400
general, computers in general, but, you know, the knowledge of how, you know, how
link |
01:18:51.600
a sanitation system works, you know, an AI has to understand that; it's no good putting it
link |
01:18:56.560
into Google. So, I mean, we've always put knowledge on paper, but paper doesn't run
link |
01:19:02.960
our civilization. It only runs when it goes from the paper into people's heads again, right? So
link |
01:19:07.520
we've always propagated civilization through human minds and we've spent about a trillion
link |
01:19:13.920
person years doing that, literally, right? You can work it out, it's about right. There's
link |
01:19:19.440
about just over a hundred billion people who've ever lived and each of them has spent about 10
link |
01:19:25.120
years learning stuff to keep their civilization going. And so that's a trillion person years we
link |
01:19:30.640
put into this effort. Beautiful way to describe all of civilization. And now we're, you know,
link |
01:19:35.760
we're in danger of throwing that away. So this is a problem that AI can't solve. It's not a
link |
01:19:39.840
technical problem. It's a, you know, and if we do our job right, the AI systems will say, you know,
link |
01:19:47.120
the human race doesn't in the long run want to be passengers in a cruise ship. The human race
link |
01:19:52.800
wants autonomy. This is part of human preferences. So we, the AI systems, are not going to do this
link |
01:19:59.840
stuff for you. You've got to do it for yourself, right? I'm not going to carry you to the top of
link |
01:20:05.440
Everest in an autonomous helicopter. You have to climb it if you want to get the benefit and so on. So
link |
01:20:14.160
but I'm afraid that because we are short sighted and lazy, we're going to override the AI systems.
link |
01:20:20.880
And there's an amazing short story that I recommend to everyone that I talk to about this
link |
01:20:27.520
called The Machine Stops, written in 1909 by E. M. Forster, who, you know, wrote novels about the
link |
01:20:36.080
British Empire and sort of things that became costume dramas on the BBC. But he wrote this one
link |
01:20:41.200
science fiction story, which is an amazing vision of the future. It has basically iPads.
link |
01:20:49.280
It has video conferencing. It has MOOCs. It has computers and computer-induced obesity. I mean,
link |
01:20:57.680
literally, the whole thing is what people spend their time doing is giving online courses or
link |
01:21:02.960
listening to online courses and talking about ideas. But they never get out there in the real
link |
01:21:07.920
world. They don't really have a lot of face to face contact. Everything is done online.
link |
01:21:13.680
You know, so all the things we're worrying about now were described in the story. And then the
link |
01:21:19.680
human race becomes more and more dependent on the machine, loses knowledge of how things really run,
link |
01:21:27.600
and then becomes vulnerable to collapse. And so it's a pretty unbelievably amazing
link |
01:21:35.200
story for someone writing in 1909 to imagine all this. Yeah. So there are very few people
link |
01:21:41.520
that represent artificial intelligence more than you, Stuart Russell.
link |
01:21:46.960
If you say so, that's very kind. So it's all my fault.
link |
01:21:50.880
It's all your fault. No, right. You're often brought up as the person. Well, Stuart Russell,
link |
01:22:00.560
like, the AI person, is worried about this. That's why you should be worried about it.
link |
01:22:06.080
Do you feel the burden of that? I don't know if you feel that at all. But when I talk to people,
link |
01:22:11.280
like, you know, people outside of computer science, when they think about this,
link |
01:22:16.800
Stuart Russell is worried about AI safety, you should be worried too. Do you feel the burden
link |
01:22:22.560
of that? I mean, in a practical sense, yeah, because I get, you know, a dozen, sometimes
link |
01:22:31.520
25 invitations a day to talk about it, to give interviews, to write press articles and so on.
link |
01:22:39.600
So in that very practical sense, I'm seeing that people are concerned and really interested about
link |
01:22:47.120
this. Are you worried that you could be wrong, as all good scientists are? Of course. I worry about
link |
01:22:53.680
that all the time. I mean, that's always been the way that I've worked, you know, is like I have an
link |
01:23:00.400
argument in my head with myself, right? So I have some idea. And then I think, okay,
link |
01:23:06.320
okay, how could that be wrong? Or did someone else already have that idea? So I'll go and
link |
01:23:12.800
search in as much literature as I can to see whether someone else already thought of that
link |
01:23:18.240
or even refuted it. So, you know, right now, I'm reading a lot of philosophy because,
link |
01:23:25.600
you know, in the form of the debates over utilitarianism and other kinds of moral formulas,
link |
01:23:37.920
shall we say, people have already thought through some of these issues. But, you know,
link |
01:23:44.320
one of the things I'm not seeing in a lot of these debates is this specific idea about
link |
01:23:51.280
the importance of uncertainty in the objective, that this is the way we should think about machines
link |
01:23:58.560
that are beneficial to humans. So this idea of provably beneficial machines based on explicit
link |
01:24:06.800
uncertainty in the objective, you know, it seems to be, you know, my gut feeling is this is the core
link |
01:24:15.200
of it. It's going to have to be elaborated in a lot of different directions. And there are a lot
link |
01:24:21.200
of beneficial... Yeah, but, I mean, it has to be, right? We can't afford, you know,
link |
01:24:27.440
hand-wavy beneficial. Because, you know, whenever we do hand-wavy stuff, there are
link |
01:24:33.040
loopholes. And the thing about super intelligent machines is they find the loopholes. You know,
link |
01:24:38.080
just like, you know, tax evaders, if you don't write your tax law properly, people will find
link |
01:24:44.320
the loopholes and end up paying no tax. And so you should think of it this way. And getting those
link |
01:24:53.440
definitions right, you know, it is really a long process, you know, so you can define
link |
01:25:03.440
mathematical frameworks. And within that framework, you can prove mathematical theorems that, yes,
link |
01:25:07.760
this will, you know, this theoretical entity will be provably beneficial to that theoretical
link |
01:25:12.800
entity. But that framework may not match the real world in some crucial way. So the long process
link |
01:25:20.160
of thinking it through, iterating, and so on. Last question. Yep. You have 10 seconds to answer it.
link |
01:25:27.120
What is your favorite sci-fi movie about AI? I would say Interstellar has my favorite robots.
link |
01:25:34.480
Oh, beats space. Yeah, yeah, yeah. So TARS, one of the robots in Interstellar, is
link |
01:25:42.160
the way robots should behave. And I would say Ex Machina is in some ways the one that
link |
01:25:52.080
makes you think in a nervous kind of way about where we're going.
link |
01:25:58.000
Well, Stuart, thank you so much for talking today. Pleasure.