Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9

The following is a conversation with Stuart Russell. He's a professor of computer science at UC Berkeley and a coauthor of the book that introduced me and millions of other people to the amazing world of AI, Artificial Intelligence: A Modern Approach. So it was an honor for me to have this conversation as part of the MIT course on artificial general intelligence and the Artificial Intelligence Podcast. If you enjoy it, please subscribe on YouTube, iTunes, or your podcast provider of choice, or simply connect with me on Twitter at Lex Fridman, spelled F R I D. And now, here's my conversation with Stuart Russell.

So you've mentioned that in 1975, in high school, you created one of your first AI programs, one that played chess. Were you ever able to build a program that beat you at chess or another board game?

So my program never beat me at chess.
link |
I actually wrote the program at Imperial College. So I used to take the bus every Wednesday with a box of cards this big and shove them into the card reader, and they gave us eight seconds of CPU time. It took about five seconds to read the cards in and compile the code, so we had three seconds of CPU time, which was enough to make one move, you know, with a not very deep search. And then we would print that move out, and then we'd have to go to the back of the queue and wait to feed the cards in again.

How deep was the search? Are we talking about two moves?

No, I think we got depth eight, you know, with alpha-beta. And we had some tricks of our own about move ordering and some pruning of the tree. And we were still able to beat that program. Yeah, yeah, I was a reasonable chess player in my youth. I did an Othello program and a backgammon program. So when I got to Berkeley, I worked a lot on what we call meta-reasoning, which really means reasoning about reasoning. And in the case of a game-playing program, you need to reason about what parts of the search tree you're actually going to explore, because the search tree is enormous, you know, bigger than the number of atoms in the universe. And the way programs succeed, and the way humans succeed, is by only looking at a small fraction of the search tree. And if you look at the right fraction, you play really well. If you look at the wrong fraction, if you waste your time thinking about things that are never going to happen, the moves that no one's ever going to make, then you're going to lose, because you won't be able to figure out the right decision. So that question of how machines can manage their own computation, how they decide what to think about, is the meta-reasoning question.
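The depth-eight alpha-beta search with move ordering described above can be sketched as follows. This is a minimal illustration, not Russell's actual program; the game interface (legal_moves, apply, evaluate, is_terminal) is a hypothetical placeholder that any two-player game could implement.

```python
def alphabeta(state, depth, alpha, beta, maximizing, game):
    """Minimax search with alpha-beta pruning and simple move ordering."""
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)  # static evaluation at the horizon

    # Move ordering: try the most promising moves first, so alpha-beta
    # can prune the remaining branches sooner.
    moves = sorted(game.legal_moves(state),
                   key=lambda m: game.evaluate(game.apply(state, m)),
                   reverse=maximizing)

    if maximizing:
        value = float("-inf")
        for move in moves:
            value = max(value, alphabeta(game.apply(state, move),
                                         depth - 1, alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # cutoff: the opponent will avoid this line anyway
        return value
    else:
        value = float("inf")
        for move in moves:
            value = min(value, alphabeta(game.apply(state, move),
                                         depth - 1, alpha, beta, True, game))
            beta = min(beta, value)
            if alpha >= beta:
                break  # cutoff
        return value
```

The pruning is exactly the "tricks" Russell mentions in spirit: once a branch is provably no better than one already examined, the search abandons it without exploring further.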
link |
We developed some methods for doing that. And very simply, a machine should think about whatever thoughts are going to improve its decision quality. We were able to show that both for Othello, which is a standard two-player game, and for backgammon, which includes dice rolls, so it's a two-player game with uncertainty, for both of those cases we could come up with algorithms that were actually much more efficient than the standard alpha-beta search, which chess programs at the time were using. And those programs could beat me. And I think you can see the same basic ideas in AlphaGo and AlphaZero today. The way they explore the tree is using a form of meta-reasoning to select what to think about, based on how useful it is to think about it.

Are there any insights you can describe, without Greek symbols, of how we select which paths to go down?
link |
There are really two kinds of learning going on. So as you say, AlphaGo learns to evaluate board positions. So it can look at a Go board, and it actually has probably a superhuman ability to instantly tell how promising that situation is. To me, the amazing thing about AlphaGo is not that it can beat the world champion with one hand tied behind its back, but the fact that if you stop it from searching altogether, so you say, okay, you're not allowed to do any thinking ahead, you can just consider each of your legal moves and then look at the resulting situation and evaluate it, what we call a depth-one search, so just the immediate outcome of your moves, and decide if that's good or bad, that version of AlphaGo can still play at a professional level. And human professionals are sitting there for five, ten minutes deciding what to do, and AlphaGo, in less than a second, can instantly intuit what is the right move to make based on its ability to evaluate positions. And that is remarkable, because we don't have that level of intuition about Go; we actually have to think about the situation. So anyway, that capability that AlphaGo has is one big part of why it beats humans. The other big part is that it's able to look ahead 40, 50, 60 moves into the future. And if it was considering all possibilities 40 or 50 or 60 moves into the future, that would be 10 to the 200 possibilities, so way more than atoms in the universe and so on.
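The depth-one search Russell describes, evaluating each legal move's immediate outcome with a learned value function and picking the best, can be sketched in a few lines. The function names here are hypothetical placeholders, not AlphaGo's actual API:

```python
def depth_one_move(state, legal_moves, apply_move, value_fn):
    """Pick the move whose immediate successor position scores highest
    under a (learned) evaluation function -- no look-ahead at all."""
    return max(legal_moves(state),
               key=lambda m: value_fn(apply_move(state, m)))
```

The remarkable claim in the conversation is that with a sufficiently good `value_fn`, even this trivial one-ply policy plays Go at a professional level.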
link |
So it's very, very selective about what it looks at. So let me try to give you an intuition about how you decide what to think about. It's a combination of two things. One is how promising it is. So if you're already convinced that a move is terrible, there's no point spending a lot more time convincing yourself that it's terrible, because it's probably not going to change your mind. So the real reason you think is because there's some possibility of changing your mind about what to do, and that changing your mind would then result in a better final action in the real world. So that's the purpose of thinking: to improve the final action in the real world. And so if you think about a move that is guaranteed to be terrible, you can convince yourself it's terrible, but you're still not going to change your mind. But on the other hand, suppose you had a choice between two moves. One of them you've already figured out is guaranteed to be a draw, let's say, and the other one looks a little bit worse: it looks fairly likely that if you make that move, you're going to lose. But there's still some uncertainty about the value of that move. There's still some possibility that it will turn out to be a win. Then it's worth thinking about that move. So even though it's less promising on average than the other move, which is guaranteed to be a draw, there's still some purpose in thinking about it, because there's a chance that you'll change your mind and discover that in fact it's a better move. So it's a combination of how good the move appears to be and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking about, because there's a higher upside, if you want to think of it that way.

And of course, in the beginning, especially in the AlphaGo Zero formulation, everything is shrouded in uncertainty, so you're really swimming in a sea of uncertainty. So it benefits you to... I mean, it's actually following the same process as you described, but because you're so uncertain about everything, you basically have to try a lot of different directions.
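The rule Russell describes, spend computation where promise plus uncertainty is highest, is essentially the upper-confidence-bound selection used in Monte Carlo tree search. A minimal sketch, assuming a hypothetical `stats` map from each move to its current mean value and visit count:

```python
import math

def select_move_to_think_about(stats, exploration=1.4):
    """Choose which move to expand next: combine how promising the move
    looks (its mean value) with how uncertain we still are about it
    (few visits => large bonus), as in the UCB1 rule.
    `stats` maps move -> (mean_value, visits)."""
    total = sum(visits for _, visits in stats.values())
    def priority(move):
        mean, visits = stats[move]
        uncertainty = exploration * math.sqrt(math.log(total) / visits)
        return mean + uncertainty  # higher upside => worth more thought
    return max(stats, key=priority)
```

Note how this reproduces the draw-versus-risky-move example from the conversation: a well-explored guaranteed draw can lose out to a slightly worse-looking but barely explored move, because the uncertain move still has upside.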
link |
Yeah. So the early parts of the search tree are fairly bushy, in that it will look at a lot of different possibilities. But fairly quickly, the degree of certainty about some of the moves increases. I mean, if a move is really terrible, you'll pretty quickly find out, right? You'll lose half your pieces or half your territory, and then you'll say, okay, this is not worth thinking about anymore. And then, further down, the tree becomes very long and narrow, and you're following various lines of play 10, 20, 30, 40, 50 moves into the future. And that, again, is something that human beings have a very hard time doing, mainly because they just lack the short-term memory. You just can't remember a sequence of moves that's 50 moves long, and you can't imagine the board correctly for that many moves into the future.

Of course, the top players (I'm much more familiar with chess) probably have echoes of the same kind of intuition, instinct, that in a moment's time AlphaGo applies when they see a board. I mean, human beings have seen those patterns before, at the top, at the grandmaster level. It seems that there are some similarities, or maybe it's our imagination that creates a vision of those similarities, but it feels like the kind of pattern recognition that the AlphaGo approaches are using is similar to what human beings at the top level are using.

I think there's some truth to that.
link |
But not entirely. Yeah, I mean, I think the extent to which a human grandmaster can reliably, instantly recognize the right move, and instantly recognize the value of a position, I think that's a little bit overrated.

But if you sacrifice a queen, for example. I mean, there are these beautiful games of chess with Bobby Fischer or somebody, where it's seemingly a bad move, and I'm not sure there's a perfect degree of calculation involved, where they've calculated all the possible things that happen. But there's an instinct there, right, that somehow adds up to...

Yeah, so I think what happens is you get a sense that there's some possibility in the position, even if you make a weird-looking move, that it opens up some lines of calculation that otherwise would be definitely bad. And it's that intuition that there's something here in this position that might yield a win down the road, and then you follow that. Right? And in some sense, when a chess player is following a line in his or her mind, they're mentally simulating what the other person is going to do, what the opponent is going to do. And they can do that as long as the moves are kind of forced, right, as long as there's what we call a forcing variation, where the opponent doesn't really have much choice how to respond. And then you see if you can force them into a situation where you win. We see plenty of mistakes, even in grandmaster games, where they just miss some simple three, four, five-move combination that wasn't particularly apparent in the position but was still there.
link |
That's the thing that makes us human. Yeah.

So you mentioned that in Othello, after some meta-reasoning improvements and research, those games were able to beat you. How did that make you feel?

Part of the meta-reasoning capability that it had was based on learning. And you could sit down the next day and you could just feel that it had got a lot smarter. You know, all of a sudden, you really felt like you were sort of pressed against the wall, because it was much more aggressive and was totally unforgiving of any minor mistake that you might make. And actually, it seemed to understand the game better than I did.

And Garry Kasparov has this quote where, during his match against Deep Blue, he said he suddenly felt that there was a new kind of intelligence across the board. Do you think that's a scary or an exciting possibility, for Kasparov and for yourself, in the context of chess, purely in that feeling, whatever that is?

I think it's definitely an exciting feeling.
link |
You know, this is what made me work on AI in the first place: as soon as I really understood what a computer was, I wanted to make it smart. You know, the first program I wrote was for the Sinclair Programmable Calculator, and I think you could write a 21-step algorithm; that was the biggest program you could write, something like that, and do little arithmetic calculations. So I think I implemented Newton's method for square roots and a few other things like that. But then, you know, I thought, okay, if I just had more space, I could make this thing intelligent. And so I started thinking about AI. And I think the thing that's scary is not the chess program, because, you know, chess programs are not in the taking-over-the-world business.
link |
But if you extrapolate, you know, there are things about chess that don't resemble the real world, right? We know the rules of chess. The chessboard is completely visible to the program, where of course the real world is not. Most of the real world is not visible from wherever you're sitting, so to speak. And to overcome those kinds of problems, you need qualitatively different algorithms. Another thing about the real world is that, you know, we regularly plan ahead on timescales involving billions or trillions of steps. Now, we don't plan those in detail. But, you know, when you choose to do a PhD at Berkeley, that's a five-year commitment that amounts to about a trillion motor control steps that you will eventually be committed to.

Including going up the stairs, opening doors, drinking water, typing.

Yeah, I mean, every finger movement while you're typing every character of every paper and the thesis and everything. So you're not committing in advance to the specific motor control steps, but you're still reasoning on a timescale that will eventually reduce to trillions of motor control actions. And so, for all these reasons, you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity. But they are a step towards it, right? And progress in AI occurs by essentially removing, one by one, these assumptions that make problems easy, like the assumption of complete observability of the situation, right? When we remove that assumption, you need a much more complicated kind of computing design, and you need something that actually keeps track of all the things you can't see and tries to estimate what's going on. And there's inevitable uncertainty in that, so it becomes a much more complicated problem. But, you know, we are removing those assumptions. We are starting to have algorithms that can cope with much longer timescales, that can cope with uncertainty, that can cope with partial observability. And so each of those steps sort of magnifies by a thousand the range of things that we can do with AI systems.

So the way I started in AI: I wanted to be a psychiatrist for a long time, to understand the mind, in high school, and of course program and so on. And I showed up at the University of Illinois to an AI lab, and they said, okay, I don't have time for you, but here is a book, AI: A Modern Approach, I think it was the first edition at the time. Here, go learn this.
link |
And I remember the lay of the land was: well, it's incredible that we solved chess, but we'll never solve Go. I mean, it was pretty certain that Go, in the way we thought about systems that reason, wasn't possible to solve. And now we've solved it.

Well, I think I would have said that it's unlikely we could take the kind of algorithm that was used for chess and just get it to scale up and work well for Go. And at the time, what we thought was that in order to solve Go, we would have to do something similar to the way humans manage the complexity of Go, which is to break it down into kind of subgames. So when a human thinks about a Go board, they think about different parts of the board as sort of weakly connected to each other. And they think about, okay, within this part of the board, here's how things could go; in that part of the board, here's how things could go. And then you try to sort of couple those two analyses together, and deal with the interactions, and maybe revise your views of how things are going to go in each part. And then you've got maybe five, six, seven, ten parts of the board. And that actually resembles the real world much more than chess does, because in the real world we have work, we have home life, we have sport, whatever different kinds of activities, shopping; these are all connected to each other, but they're weakly connected. So when I'm typing a paper, I don't simultaneously have to decide in which order I'm going to get the milk and the butter. That doesn't affect the typing. But I do need to realize, okay, better finish this before the shops close, because I don't have any food at home.
link |
So there's some weak connection, but not in the way that chess works, where everything is tied into a single stream of thought. So the thought was that to solve Go, we would have to make progress on stuff that would be useful for the real world. And in a way, AlphaGo is a little bit disappointing, because the program design for AlphaGo is actually not that different from Deep Blue, or even from Arthur Samuel's checker-playing program from the 1950s. And in fact, the two things that make AlphaGo work: one is this amazing ability to evaluate the positions, and the other is the meta-reasoning capability, which allows it to explore some paths in the tree very deeply and to abandon other paths very quickly.
link |
So this word meta-reasoning, while technically correct, inspires perhaps the wrong degree of power that AlphaGo has. For example, the word reasoning is a powerful word. So let me ask you: you were part of the symbolic AI world for a while, where there were a lot of excellent, interesting ideas that unfortunately met a winter. Do you think it reemerges?

So I would say, yeah, it's not quite as simple as that. So the AI winter, the first winter that was actually named as such, was the one in the late 80s. And that came about because in the mid-80s there was really a concerted attempt to push AI out into the real world using what was called expert system technology. And for the most part, that technology was just not ready for prime time. They were trying, in many cases, to do a form of uncertain reasoning, judgment, combinations of evidence, diagnosis, those kinds of things, which was simply invalid. And when you try to apply invalid reasoning methods to real problems, you can fudge it for small versions of the problem, but when it starts to get larger, the thing just falls apart. So many companies found that the stuff just didn't work, and they were spending tons of money on consultants to try to make it work. And there were other practical reasons, like they were asking the companies to buy incredibly expensive Lisp machine workstations, which were literally between $50,000 and $100,000 in 1980s money, which would be between $150,000 and $300,000 per workstation in current prices. And the bottom line: they weren't seeing a profit from it.
link |
Yeah. In many cases, I think there were some successes, there's no doubt about that. But people, I would say, over-invested. Every major company was starting an AI department, just like now. And I worry a bit that we might see similar disappointments, not because the current technology is invalid, but because it's limited in its scope. And it's almost the dual of the scope problems that expert systems had.

What have you learned from that hype cycle, and what can we do to prevent another winter, for example?

Yeah. So when I'm giving talks these days, that's one of the warnings that I give. It's a two-part warning slide. One is that rather than data being the new oil, data is the new snake oil.

That's a good line.

And then the other is that we might see a very visible failure in some of the major application areas.
link |
And I think self-driving cars would be the flagship. And I think, when you look at the history, the first self-driving car was on the freeway, driving itself, changing lanes, overtaking, in 1987. So it's been more than 30 years. And that kind of looks like where we are today, right? Prototypes on the freeway, changing lanes and overtaking. Now, I think significant progress has been made, particularly on the perception side. So we worked a lot on autonomous vehicles in the early to mid 90s at Berkeley, and we had our own big demonstrations. We put congressmen into self-driving cars and had them zooming along the freeway. And the problem was clearly perception.

At the time, the problem was perception?

Yeah. So in simulation, with perfect perception, you could actually show that you can drive safely for a long time, even if the other cars are misbehaving and so on. But simultaneously, we worked on machine vision for detecting cars and tracking pedestrians and so on, and we couldn't get the reliability of detection and tracking up to a high enough level, particularly in bad weather conditions, nighttime, rainfall.

Good enough for demos, but perhaps not good enough to cover the general operation.
link |
Yeah. So the thing about driving is, suppose you're a taxi driver and you drive every day, eight hours a day, for ten years. That's 100 million seconds of driving. And in any one of those seconds, you can make a fatal mistake. So you're talking about eight nines of reliability. Now, if your vision system only detects 98.3% of the vehicles, that's sort of one and a bit nines of reliability. So you have another seven orders of magnitude to go. And this is what people don't understand. They think, oh, because I had a successful demo, I'm pretty much done. But you're not even within seven orders of magnitude of being done. And that's the difficulty. And it's not, can I follow a white line? That's not the problem. We can follow a white line all the way across the country. But it's the weird stuff that happens.

It's all the edge cases. Yeah.
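The reliability arithmetic above can be checked directly. This is a sketch with round figures ("nines" here means the negative base-10 logarithm of the failure rate); the exact gap comes out at roughly six to seven orders of magnitude, which matches Russell's "seven orders" in round numbers:

```python
import math

# 8 hours/day for 10 years, in seconds: about 1.05e8, the "100 million
# seconds of driving" in the conversation.
seconds_driving = 8 * 3600 * 365 * 10

# At most one fatal mistake allowed in that time means a failure rate
# of roughly 1e-8 per second: about eight nines of reliability.
required_nines = math.log10(seconds_driving)      # ~8.0

# A vision system that detects 98.3% of vehicles misses 1.7% of them,
# which is only "one and a bit nines":
detection_nines = -math.log10(1 - 0.983)          # ~1.8

# Remaining gap, in orders of magnitude:
gap = required_nines - detection_nines            # ~6.3
```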
link |
The edge cases, other drivers doing weird things. So if you talk to Google: they had actually a very classical architecture, where you had machine vision, which would detect all the other cars and pedestrians and the white lines and the road signs, and then basically that was fed into a logical database. And then you had a classical 1970s rule-based expert system telling you, okay, if you're in the middle lane and there's a bicyclist in the right lane who is signaling this, then do that, right? And what they found was that every day they'd go out and there'd be another situation that the rules didn't cover. So they'd come to a traffic circle, and there's a little girl riding her bicycle the wrong way around the traffic circle.
link |
Okay, what do you do? We don't have a rule. Oh my God. Okay, stop. And then they'd come back and add more rules, and they just found that this was not really converging. And if you think about it, right, how do you deal with an unexpected situation, meaning one that you've never previously encountered, where the sort of reasoning required to figure out the solution for that situation has never been done? It doesn't match any previous situation in terms of the kind of reasoning you have to do. Well, in chess programs, this happens all the time. You're constantly coming up with situations you haven't seen before, and you have to reason about them. You have to think about, okay, here are the possible things I could do, here are the outcomes, here's how desirable the outcomes are, and then pick the right one. In the 90s, we were saying, okay, this is how you're going to have to do automated vehicles. They're going to have to have a look-ahead capability. But the look-ahead for driving is more difficult than it is for chess.

Because of humans?

Right, there are humans, and they're less predictable than chess pieces.

Well, you have an opponent in chess who's also somewhat unpredictable.

But, for example, in chess, you always know the opponent's intention: they're trying to beat you. Whereas in driving, you don't know: is this guy trying to turn left, or has he just forgotten to turn off his turn signal? Or is he drunk? Or is he changing the channel on his radio, or whatever it might be? You've got to try and figure out the mental state, the intent, of the other drivers to forecast the possible evolutions of their trajectories. And then you've got to figure out, okay, which is the trajectory for me that's going to be safest? And those all interact with each other, because the other drivers are going to react to your trajectory, and so on. So, you know, you've got the classic merging-onto-the-freeway problem, where you're kind of racing a vehicle that's already on the freeway: are you going to pull ahead of them, or are you going to let them go first and pull in behind? And you get this sort of uncertainty about who's going first. So all those kinds of things mean that you need a decision-making architecture that's very different from either a rule-based system or, it seems to me, a kind of end-to-end neural network system. You know, just as AlphaGo is pretty good when it doesn't do any look-ahead, but it's way, way, way better when it does.
link |
I think the same is going to be true for driving. You can have a driving system that's pretty good when it doesn't do any look-ahead, but that's not good enough. You know, and we've already seen multiple deaths caused by poorly designed machine learning algorithms that don't really understand what they're doing.

Yeah, and on several levels. I think on the perception side there are mistakes being made by those algorithms, where the perception is very shallow; on the planning side, the look-ahead, like you said. And the thing that we come up against, that's really interesting when you try to deploy systems in the real world, is that you can't think of an artificial intelligence system as a thing that only responds to the world. You have to realize that it's an agent that others will respond to as well.
link |
Well, so in order to drive successfully, you can't just try to do obstacle avoidance. You can't pretend that you're invisible, right? You're not the invisible car. It doesn't work that way.

I mean, you have to assert yourself, yet others have to be scared of you. There's this tension, there's this game. So we've done a lot of work with pedestrians: if you approach pedestrians as purely an obstacle avoidance problem, so you're not doing look-ahead as in modeling their intent, they're going to take advantage of you. They're not going to respect you at all. There has to be a tension, a fear, some amount of uncertainty. That's how we've created...

Or at least just a kind of resoluteness. You have to display a certain amount of resoluteness. You can't be too tentative.
link |
Yeah. So the solutions then become pretty complicated. You get into game-theoretic analyses. So at Berkeley now, we're working a lot on this kind of interaction between machines and humans. And that's exciting. So my colleague Anca Dragan found that, actually, if you formulate the problem game-theoretically and you just let the system figure out the solution, it does interesting, unexpected things. Like sometimes at a stop sign, if no one is going first, the car will actually back up a little, just to indicate to the other cars that they should go. And that's something it invented entirely by itself.

That's interesting.

We didn't say, this is the language of communication at stop signs. It figured it out.

That's really interesting. So let me just step back for a second, to this beautiful philosophical notion. Pamela McCorduck in 1979 wrote, AI began with the ancient wish to forge the gods. So when you think about the history of our civilization, do you think there is an inherent desire to create, let's not say gods, but to create superintelligence? Is it inherent to us? Is it in our genes, that the natural arc of human civilization is to create things that are of greater and greater power, and perhaps echoes of ourselves? So to create the gods, as Pamela said?
link |
It may be. I mean, we're all individuals, but certainly we see, over and over again in history, individuals who thought about this possibility.

Hopefully I'm not being too philosophical here. But if you look at the arc of this, where this is going, and we'll talk about AI safety, we'll talk about greater and greater intelligence: when you created the Othello program and you felt this excitement, what was that excitement? Was it the excitement of a tinkerer who created something cool, like a clock? Or was there a magic, or was it more like a child being born?

Yeah. So I mean, I certainly understand that viewpoint. And if you look at the Lighthill report: so in the 70s, there was a lot of controversy in the UK about AI, and whether it was for real, and how much money the government should invest. So it's a long story, but the government commissioned a report by Lighthill, who was a physicist, and he wrote a very damning report about AI, which I think was the point. And he said that these are frustrated men who, unable to have children, would like to create life as a kind of replacement, which I think is really pretty unfair. But there is a kind of magic, I would say, when you build something, and what you're building in is really just some understanding of the principles of learning and decision making. And to see those principles actually then turn into intelligent behavior in specific situations, it's an incredible thing. And that is naturally going to make you think, okay, where does this end?
link |
And so there are magical, optimistic views of where this ends. Whatever your view of optimism is, whatever your view of utopia is, it's probably different for everybody. But you've often talked about concerns you have for how things might go wrong. I've talked to Max Tegmark; there are a lot of interesting ways to think about AI safety. You're one of the seminal people thinking about this problem. Amongst being in the weeds of actually solving specific AI problems, you're also thinking about the big picture of where we're going. So can you talk about several elements of it? Let's just talk about maybe the control problem, this idea of losing the ability to control the behavior of our AI systems. How do you see that? How do you see that coming about? What do you think we can do to manage it?
link |
Well, so it doesn't take a genius to realize that if you make something that's smarter than you,
link |
you might have a problem. Alan Turing wrote about this and gave lectures about this.
link |
In 1951, he did a lecture on the radio. And he basically says, once the machine thinking method
link |
starts, very quickly, they'll outstrip humanity. And if we're lucky, we might be able to turn off
link |
the power at strategic moments, but even so, our species would be humbled. And actually,
link |
I think he was wrong about that. If it's a sufficiently intelligent machine, it's not
link |
going to let you switch it off. It's actually in competition with you.
link |
So what do you think he meant, just for a quick tangent, that if we shut off this
link |
super intelligent machine, our species would be humbled?
link |
I think he means that we would realize that we are inferior, that we only survive by the skin
link |
of our teeth because we happen to get to the off switch just in time. And if we hadn't,
link |
then we would have lost control over the earth. So are you more worried when you think about
link |
this stuff about super intelligent AI or are you more worried about super powerful AI that's not
link |
aligned with our values? So the paperclip scenarios kind of... I think so the main problem I'm
link |
working on is the control problem, the problem of machines pursuing objectives that are, as you
link |
say, not aligned with human objectives. And this has been the way we've thought about AI
link |
since the beginning. You build a machine for optimizing and then you put in some objective
link |
and it optimizes. And we can think of this as the King Midas problem. Because when King Midas
link |
put in this objective, everything I touch should turn to gold, the gods, that's like the machine,
link |
they said, okay, done. You now have this power. And of course his food and his drink and his family
link |
all turned to gold and then he dies of misery and starvation. It's a warning; it's a failure mode, and
link |
pretty much every culture in history has had some story along the same lines. There's the
link |
genie that gives you three wishes, and the third wish is always, please undo the first two wishes because
link |
I messed up. And when Arthur Samuel wrote his checker playing program, which learned to play
link |
checkers considerably better than Arthur Samuel could play and actually reached a pretty decent
link |
standard, Norbert Wiener, who was one of the major mathematicians of the 20th century, he's sort of
link |
the father of modern automation control systems. He saw this and he basically extrapolated
link |
as Turing did and said, okay, this is how we could lose control. And specifically that
link |
we have to be certain that the purpose we put into the machine is the purpose which we really
link |
desire. And the problem is, we can't do that. Right. You mean it's very difficult
link |
to encode, that putting our values on paper is really difficult, or are you just saying it's impossible?
link |
The line is gray between the two. So theoretically, it's possible, but in practice,
link |
it's extremely unlikely that we could specify correctly in advance the full range of concerns
link |
of humanity. You've talked about cultural transmission of values; I think that's how human-to-human
link |
transmission of values happens, right? Well, we learn, yeah, I mean, as we grow up, we learn about
link |
the values that matter, how things should go, what is reasonable to pursue and what isn't
link |
reasonable to pursue. I think machines can learn in the same kind of way. Yeah. So I think that
link |
what we need to do is to get away from this idea that you build an optimizing machine and then you
link |
put the objective into it. Because if it's possible that you might put in a wrong objective, and we
link |
already know this is possible because it's happened lots of times, right? That means that the machine
link |
should never take an objective that's given as gospel truth. Because once it takes the objective
link |
as gospel truth, then it believes that whatever actions it's taking in pursuit of that objective
link |
are the correct things to do. So you could be jumping up and down and saying, no, no, no, no,
link |
you're going to destroy the world, but the machine knows what the true objective is and is pursuing
link |
it and tough luck to you. And this is not restricted to AI, right? This is, I think,
link |
many of the 20th century technologies, right? So in statistics, you minimize a loss function,
link |
the loss function is exogenously specified in control theory, you minimize a cost function,
link |
in operations research, you maximize a reward function, and so on. So in all these disciplines,
link |
this is how we conceive of the problem. And it's the wrong problem. Because we cannot specify
link |
with certainty the correct objective, right? We need uncertainty, we need the machine to be
link |
uncertain about what it is that it's supposed to be maximizing.
link |
It's my favorite idea of yours. I've heard you say somewhere, well, I shouldn't pick favorites,
link |
but it just sounds beautiful. We need to teach machines humility. It's a beautiful way to put
link |
it. I love it. That they're humble. They know that they don't know what it is they're supposed
link |
to be doing. And that those objectives, I mean, they exist, they're within us, but we may not
link |
be able to explicate them. We may not even know how we want our future to go.
link |
So exactly. And a machine that's uncertain is going to be deferential to us. So if we say,
link |
don't do that, well, now the machines learn something a bit more about our true objectives,
link |
because something that it thought was reasonable in pursuit of our objective,
link |
it turns out not to be so now it's learned something. So it's going to defer because it
link |
wants to be doing what we really want. And that point, I think, is absolutely central
link |
to solving the control problem. And it's a different kind of AI when you take away this
link |
idea that the objective is known, then in fact, a lot of the theoretical frameworks that we're so
link |
familiar with, you know, Markov decision processes, goal-based planning, you know,
link |
standard game research, all of these techniques actually become inapplicable.
link |
And you get a more complicated problem because because now the interaction with the human becomes
link |
part of the problem. Because the human by making choices is giving you more information about
link |
the true objective and that information helps you achieve the objective better.
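The deference dynamic Russell describes can be sketched as a toy decision problem, loosely inspired by the off-switch idea: a robot uncertain about the human's utility for a proposed action compares acting immediately with deferring to a human who will veto bad actions. The Gaussian belief and all payoffs here are illustrative assumptions, not Russell's formal model.

```python
import random
import statistics

# Toy sketch (illustrative assumptions throughout): the robot holds a belief,
# here samples from a standard Gaussian, over the human's utility u for a
# proposed action. It can act now (payoff u) or defer: propose the action and
# let the human veto it whenever u < 0 (payoff 0 in that case).

random.seed(0)
belief = [random.gauss(0.0, 1.0) for _ in range(100_000)]

act_now = statistics.mean(belief)                     # E[u]
defer = statistics.mean(max(u, 0.0) for u in belief)  # E[max(u, 0)]

# Deferring is at least as good: the human filters out the bad cases,
# so an uncertain machine has an incentive to listen.
assert defer >= act_now

# If instead the robot is *certain* of a positive objective, the human's
# veto never changes anything, so there is no incentive to defer.
certain_u = 0.7
assert max(certain_u, 0.0) == certain_u
print(f"act now: {act_now:+.3f}, defer: {defer:+.3f}")
```

The gap between the two expectations is exactly the value of keeping the human in the loop, and it vanishes as the machine's belief about the objective becomes certain.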
link |
And so that really means that you're mostly dealing with game theoretic problems where you've
link |
got the machine and the human and they're coupled together, rather than a machine going off by itself
link |
with a fixed objective. Which is fascinating on the machine and the human level that we,
link |
when you don't have an objective means you're together coming up with an objective. I mean,
link |
there's a lot of philosophy that, you know, you could argue that life doesn't really have meaning.
link |
We together agree on what gives it meaning and we kind of culturally create
link |
things that give it meaning, why the heck we are on this earth anyway. We together as a society create
link |
that meaning and you have to learn that objective. And one of the biggest, I thought that's where
link |
you were going to go for a second. One of the biggest troubles we run into outside of statistics
link |
and machine learning and AI in just human civilization is when you look at, I came from,
link |
I was born in the Soviet Union. And the history of the 20th century, we ran into the most trouble,
link |
us humans, when there was a certainty about the objective. And you do whatever it takes to achieve
link |
that objective, whether you're talking about Germany or communist Russia, you get into trouble
link |
with humans. And I would say with corporations, in fact, some people argue that we don't have
link |
to look forward to a time when AI systems take over the world, they already have. And they're called
link |
corporations, right? Corporations happen to be using people as components right now.
link |
But they are effectively algorithmic machines, and they're optimizing an objective, which is
link |
quarterly profit that isn't aligned with overall well being of the human race. And they are
link |
destroying the world. They are primarily responsible for our inability to tackle climate change.
link |
So I think that's one way of thinking about what's going on with corporations. But
link |
I think the point you're making is valid, that there are many systems in the real world where
link |
we've sort of prematurely fixed on the objective and then decoupled the machine from those that
link |
are supposed to be serving. And I think you see this with government, right? Government is supposed
link |
to be a machine that serves people. But instead, it tends to be taken over by people who have their
link |
own objective and use government to optimize that objective, regardless of what people want.
link |
Do you find appealing the idea of almost arguing machines where you have multiple AI systems with
link |
a clear fixed objective? We have in government the red team and the blue team that are very fixed
link |
on their objectives. And they argue, and maybe you would disagree, but it kind of seems to
link |
make it work somewhat, the duality of it. Okay, let's go 100 years back, when that was still
link |
going on, or at the founding of this country, there were disagreements, and that disagreement is
link |
where, so, it was a balance between certainty and forced humility, because the power was distributed.
link |
Yeah, I think that the nature of debate and disagreement argument takes as a premise the idea
link |
that you could be wrong, which means that you're not necessarily absolutely convinced that your
link |
objective is the correct one. If you were absolutely convinced, there'd be no point
link |
in having any discussion or argument because you would never change your mind. And there wouldn't
link |
be any sort of synthesis or anything like that. So I think you can think of argumentation as an
link |
implementation of a form of uncertain reasoning. I've been reading recently about utilitarianism
link |
and the history of efforts to define, in a sort of clear mathematical way, if you like, a formula for
link |
moral or political decision making. And it's really interesting that the parallels between
link |
the philosophical discussions going back 200 years and what you see now in discussions about
link |
existential risk because it's almost exactly the same. So someone would say, okay, well,
link |
here's a formula for how we should make decisions. So utilitarianism is roughly each person has a
link |
utility function and then we make decisions to maximize the sum of everybody's utility.
link |
And then people point out, well, in that case, the best policy is one that leads to
link |
an enormously vast population, all of whom are living a life that's barely worth living.
link |
And this is called the repugnant conclusion. And another version is that we should maximize
link |
pleasure and that's what we mean by utility. And then you'll get people effectively saying,
link |
well, in that case, we might as well just have everyone hooked up to a heroin drip. And they
link |
didn't use those words. But that debate was happening in the 19th century, as it is now
link |
about AI, that if we get the formula wrong, we're going to have AI systems working towards
link |
an outcome that in retrospect, would be exactly wrong.
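The total-utility arithmetic behind the repugnant conclusion is easy to make concrete. The population sizes and per-person utilities below are invented purely for illustration.

```python
# Illustrative arithmetic only: under simple total utilitarianism, welfare
# is the sum of everybody's utility, so a vast population of lives barely
# worth living can outrank a small, flourishing one. All numbers here are
# made up for this example.

def total_utility(population: int, utility_per_person: float) -> float:
    """Total utilitarian welfare: the sum of everybody's utility."""
    return population * utility_per_person

flourishing = total_utility(10_000_000, 90.0)        # small, very happy world
repugnant = total_utility(1_000_000_000_000, 0.01)   # vast, barely positive lives

# The formula prefers the second world: the "repugnant conclusion".
assert repugnant > flourishing
```

The same failure shape recurs with machines: a formula that looks reasonable can be maximized by an outcome nobody wanted.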
link |
That's beautifully put, so the echoes are there. But do you think,
link |
I mean, if you look at Sam Harris, his imagination worries about the AI version of that, because
link |
of the speed at which the things going wrong in the utilitarian context could happen.
link |
Is that a worry for you?
link |
Yeah, I think that in most cases, not in all, but if we have a wrong political idea,
link |
we see it starting to go wrong. And we're not completely stupid. And so we said, okay,
link |
maybe that was a mistake. Let's try something different. And also, we're very slow and inefficient
link |
about implementing these things and so on. So you have to worry when you have corporations
link |
or political systems that are extremely efficient. But when we look at AI systems,
link |
or even just computers in general, right, they have this different characteristic
link |
from ordinary human activity in the past. So let's say you were a surgeon. You had some idea
link |
about how to do some operation, right? Well, and let's say you were wrong, right, that that way
link |
of doing the operation would mostly kill the patient. Well, you'd find out pretty quickly,
link |
like after three, maybe three or four tries, right? But that isn't true for pharmaceutical
link |
companies, because they don't do three or four operations. They manufacture three or four billion
link |
pills and they sell them. And then they find out maybe six months or a year later that, oh,
link |
people are dying of heart attacks or getting cancer from this drug. And so that's why we have the FDA,
link |
right? Because of the scalability of pharmaceutical production. And there have been some unbelievably
link |
bad episodes in the history of pharmaceuticals and adulteration of products and so on that have
link |
killed tens of thousands or paralyzed hundreds of thousands of people.
link |
Now, with computers, we have that same scalability problem that you can
link |
sit there and type, for i equals one to five billion, do, right? And all of a sudden,
link |
you're having an impact on a global scale. And yet we have no FDA, right? There's absolutely no
link |
controls at all over what a bunch of undergraduates with too much caffeine can do to the world.
link |
And, you know, we look at what happened with Facebook, well, social media in general, and
link |
click through optimization. So you have a simple feedback algorithm that's trying to just optimize
link |
click through, right? That sounds reasonable, right? Because you don't want to be feeding people
link |
ads that they don't care about or not interested in. And you might even think of that process as
link |
simply adjusting the feeding of ads or news articles or whatever it might be to match people's
link |
preferences, right? Which sounds like a good idea. But in fact, that isn't how the algorithm works,
link |
right? You make more money. The algorithm makes more money. If it can better predict what people
link |
are going to click on, because then it can feed them exactly that, right? So the way to maximize
link |
click through is actually to modify the people, to make them more predictable. And one way to do
link |
that is to feed them information which will change their behavior and preferences towards
link |
extremes that make them predictable. Whatever is the nearest extreme or the nearest predictable
link |
point, that's where you're going to end up. And the machines will force you there.
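The feedback loop Russell describes, maximizing predicted clicks by nudging users toward a predictable extreme, can be simulated in a few lines. The preference model, the update rule, and every constant below are illustrative assumptions, not the actual algorithm of any platform.

```python
# Toy simulation (assumed dynamics, not any real recommender): a user's
# preference p sits in [-1, 1]. Serving content slightly more extreme than
# the user, while exposure slowly shifts the preference toward the content,
# drags p to the nearest extreme, where clicks become most predictable.

def step(p: float, nudge: float = 0.1) -> float:
    """Serve content a bit more extreme than p; preference drifts toward it."""
    target = 1.0 if p >= 0 else -1.0       # nearest extreme point
    content = p + nudge * (target - p)     # slightly more extreme content
    return 0.9 * p + 0.1 * content         # exposure shifts the preference

p = 0.05              # a mildly opinionated user
for _ in range(500):
    p = step(p)

# The user ends up pinned near the nearest extreme (+1), i.e. predictable.
assert p > 0.95
print(f"final preference: {p:.3f}")
```

Each step moves the preference only one percent of the way, which is the slow-boiling-frog quality of the process: no single update looks alarming, but the fixed point is the extreme.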
link |
Now, and I think there's a reasonable argument to say that this, among other things, is
link |
contributing to the destruction of democracy in the world. And where was the oversight
link |
of this process? Where were the people saying, okay, you would like to apply this algorithm to
link |
five billion people on the face of the earth? Can you show me that it's safe? Can you show
link |
me that it won't have various kinds of negative effects? No, there was no one asking that question.
link |
There was no one placed between the undergrads with too much caffeine and the human race.
link |
It's just they just did it. And some way outside the scope of my knowledge,
link |
so economists would argue that the invisible hand, so the capitalist system, it was the
link |
oversight. So if you're going to corrupt society with whatever decision you make as a company,
link |
then that's going to be reflected in people not using your product. That's one model of oversight.
link |
We shall see. But in the meantime, you might even have broken the political system
link |
that enables capitalism to function. Well, you've changed it. So we should see. Yeah.
link |
Change is often painful. So my question is absolutely, it's fascinating. You're absolutely
link |
right that there was zero oversight on algorithms that can have a profound civilization changing
link |
effect. So do you think it's possible? I mean, I haven't, have you seen government?
link |
So do you think it's possible to create regulatory bodies oversight over AI algorithms,
link |
which are inherently such cutting edge set of ideas and technologies?
link |
Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls.
link |
I mean, it took time to design the FDA regime. Some people still don't like it and they want
link |
to fix it. And I think there are clear ways that it could be improved. But the whole notion that
link |
you have stage one, stage two, stage three, and here are the criteria for what you have to do
link |
to pass a stage one trial, right? We haven't even thought about what those would be for algorithms.
link |
So I mean, I think there are, there are things we could do right now with regard to bias, for
link |
example, we have a pretty good technical handle on how to detect algorithms that are propagating
link |
bias that exists in data sets, how to debias those algorithms, and even what it's going to cost you
link |
to do that. So I think we could start having some standards on that. I think there are things to do
link |
with impersonation and falsification that we could, we could work on. So I think, yeah.
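The "pretty good technical handle" on detecting bias mentioned above can be illustrated with the simplest such audit, a demographic-parity check. The data and the 0.8 threshold (the so-called four-fifths rule) are used here purely as an illustrative sketch, not as a full auditing method.

```python
# Minimal sketch of one common bias audit, demographic parity: compare an
# algorithm's positive-decision rates across groups. The decisions and the
# 0.8 threshold are illustrative assumptions.

def positive_rate(decisions: list) -> float:
    """Fraction of positive (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)

# Hypothetical loan approvals (1 = approved) for two groups.
group_a = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]   # 80% approved
group_b = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]   # 30% approved

ratio = positive_rate(group_b) / positive_rate(group_a)
print(f"parity ratio: {ratio:.2f}")

# Flag the model if one group's rate falls below 80% of the other's.
flagged = ratio < 0.8
assert flagged
```

Real audits use many metrics beyond this one, but even this check is the kind of standard that could be written into a stage-one trial for algorithms.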
link |
Or the very simple point. So impersonation is a machine acting as if it was a person.
link |
I can't see a real justification for why we shouldn't insist that machine self identify as
link |
machines. Where is the social benefit in fooling people into thinking that this is really a person
link |
when it isn't? I don't mind if it uses a human like voice that's easy to understand. That's fine.
link |
But it should just say, I'm a machine, in some form. And now many people are speaking to that,
link |
I would think, relatively obvious fact. So I think most people... Yeah. I mean,
link |
there is actually a law in California that bans impersonation, but only in certain
link |
restricted circumstances. So for the purpose of engaging in a fraudulent transaction and for
link |
the purpose of modifying someone's voting behavior. So those are the circumstances where
link |
machines have to self identify. But I think, arguably, it should be in all circumstances.
link |
And then when you talk about deep fakes, we're just at the beginning. But already,
link |
it's possible to make a movie of anybody saying anything in ways that are pretty hard to detect.
link |
Including yourself because you're on camera now and your voice is coming through with high
link |
resolution. Yeah. So you could take what I'm saying and replace it with pretty much anything
link |
else you wanted me to be saying. And even it would change my lips and facial expressions to fit.
link |
And there's actually not much in the way of real legal protection against that.
link |
I think in the commercial area, you could say, yeah, that's...
link |
You're using my brand and so on. There are rules about that. But in the political sphere, I think,
link |
at the moment, it's anything goes. So that could be really, really damaging.
link |
And let me just try to make not an argument, but try to look back at history and say something
link |
dark, in essence, which is: while regulation seems to be... Oversight seems to be exactly the
link |
right thing to do here. It seems that human beings, what they naturally do is they wait
link |
for something to go wrong. If you're talking about nuclear weapons, you can't talk about
link |
nuclear weapons being dangerous until somebody actually, like the United States drops the bomb,
link |
or Chernobyl melting. Do you think we will have to wait for things going wrong in a way that's
link |
obviously damaging to society, not an existential risk, but obviously damaging?
link |
Or do you have faith that... I hope not. But I mean, I think we do have to look at history.
link |
So the two examples you gave, nuclear weapons and nuclear power, are very, very interesting because
link |
nuclear weapons, we knew in the early years of the 20th century that atoms contained a huge
link |
amount of energy. We had E equals MC squared. We knew the mass differences between the different
link |
atoms and their components, and we knew that you might be able to make an incredibly powerful
link |
explosive. So H.G. Wells wrote a science fiction book, I think, in 1912. Frederick Soddy, who was the
link |
guy who discovered isotopes, a Nobel Prize winner, gave a speech in 1915 saying that
link |
one pound of this new explosive would be the equivalent of 150 tons of dynamite,
link |
which turns out to be about right. And this was in World War I, so he was imagining how much worse
link |
the world war would be if we were using that kind of explosive. But the physics establishment
link |
simply refused to believe that these things could be made. Including the people who were making it.
link |
Well, so they were doing the nuclear physics. I mean, eventually they were the ones who made it.
link |
You talk about Fermi or whoever. Well, so up to then, the development was mostly theoretical. So it was
link |
people using sort of primitive kinds of particle acceleration and doing experiments at the level
link |
of single particles or collections of particles. They weren't yet thinking about how to actually
link |
make a bomb or anything like that. But they knew the energy was there and they figured if they
link |
understood it better, it might be possible. But the physics establishment, their view, and I think
link |
because they did not want it to be true, their view was that it could not be true.
link |
That this could not provide a way to make a super weapon. And there was this famous
link |
speech given by Rutherford, who was the sort of leader of nuclear physics. And it was on
link |
September 11, 1933. And he said, you know, anyone who talks about the possibility of
link |
obtaining energy from transformation of atoms is talking complete moonshine. And the next
link |
morning, Leo Szilard read about that speech and then invented the nuclear chain reaction.
link |
And so as soon as he invented, as soon as he had that idea, that you could make a chain reaction
link |
with neutrons because neutrons were not repelled by the nucleus so they could enter the nucleus
link |
and then continue the reaction. As soon as he has that idea, he instantly realized that the world
link |
was in deep doo doo. Because this is 1933, right? Hitler had recently come to power in Germany.
link |
Szilard was in London. He eventually became a refugee and he came to the US. And in the
link |
process of having the idea about the chain reaction, he figured out basically how to make
link |
a bomb and also how to make a reactor. And he patented the reactor in 1934. But because
link |
of the situation, the great power conflict situation that he could see happening,
link |
he kept that a secret. And so between then and the beginning of World War II,
link |
people were working, including the Germans, on how to actually create neutron sources,
link |
what specific fission reactions would produce neutrons of the right energy to continue the
link |
reaction. And that was demonstrated in Germany, I think in 1938, if I remember correctly. The first
link |
nuclear weapon patent was 1939 by the French. So this was actually going on well before World War
link |
II really got going. And then the British probably had the most advanced capability
link |
in this area. But for safety reasons, among others, and, I guess, just sort of resources,
link |
they moved the program from Britain to the US. And then that became Manhattan Project.
link |
So the reason why we couldn't
link |
have any kind of oversight of nuclear weapons and nuclear technology was because we were basically
link |
already in an arms race in a war. But you mentioned in the 20s and 30s, so what are the echoes?
link |
The way you've described the story, I mean, there's clearly echoes. What do you think most AI
link |
researchers, folks who are really close to the metal, they really are not concerned about AI safety,
link |
they don't think about it, or maybe they don't want to think about it. But what are the,
link |
yeah, why do you think that is? What are the echoes of the nuclear situation to the current AI
link |
situation? And what can we do about it? I think there is a kind of motivated cognition, which is
link |
a term in psychology that means you believe what you would like to be true, rather than what is
link |
true. And it's unsettling to think that what you're working on might be the end of the human race,
link |
obviously. So you would rather instantly deny it and come up with some reason why it couldn't be
link |
true. And I collected a long list of reasons that extremely intelligent, competent AI scientists
link |
have come up with for why we shouldn't worry about this. For example, calculators are superhuman at
link |
arithmetic and they haven't taken over the world, so there's nothing to worry about. Well, okay,
link |
my five year old could have figured out why that was an unreasonable and really quite weak argument.
link |
Another one was, while it's theoretically possible that you could have superhuman AI
link |
destroy the world, it's also theoretically possible that a black hole could materialize
link |
right next to the earth and destroy humanity. I mean, yes, it's theoretically possible,
link |
quantum theoretically, extremely unlikely that it would just materialize right there.
link |
But that's a completely bogus analogy because if the whole physics community on earth was working
link |
to materialize a black hole in near earth orbit, wouldn't you ask them, is that a good idea? Is
link |
that going to be safe? What if you succeed? And that's the thing. The AI community is sort of
link |
refused to ask itself, what if you succeed? And initially, I think that was because it was too
link |
hard, but Alan Turing asked himself that and he said, we'd be toast. If we were lucky, we might
link |
be able to switch off the power but probably we'd be toast. But there's also an aspect
link |
that because we're not exactly sure what the future holds, it's not clear exactly so technically
link |
what to worry about, sort of how things go wrong. And so there is something it feels like, maybe
link |
you can correct me if I'm wrong, but there's something paralyzing about worrying about something
link |
that logically is inevitable. But you don't really know what that will look like.
link |
Yeah, I think that's a reasonable point. And it's certainly in terms of existential risks,
link |
it's different from an asteroid colliding with the earth, which again is quite possible. It's
link |
happened in the past. It'll probably happen again. We don't know right now. But if we did detect an
link |
asteroid that was going to hit the earth in 75 years time, we'd certainly be doing something
link |
about it. Well, it's clear, there's a big rock, and we'll probably have a meeting and see what
link |
we do about the big rock. With AI? Right, with AI. I mean, there are very few people who think it's
link |
not going to happen within the next 75 years. I know Rod Brooks doesn't think it's going to happen.
link |
Maybe Andrew Ng doesn't think it's going to happen. But a lot of the people who work day to day,
link |
you know, as you say, at the rock face, they think it's going to happen. I think the median
link |
estimate from AI researchers is somewhere in 40 to 50 years from now. Or maybe, you know,
link |
I think in Asia, they think it's going to be even faster than that. I'm a little bit
link |
more conservative. I think it probably take longer than that. But I think, you know, as
link |
happened with nuclear weapons, it can happen overnight that you have these breakthroughs.
link |
And we need more than one breakthrough. But, you know, it's on the order of half a dozen.
link |
This is a very rough scale. But so half a dozen breakthroughs of that nature
link |
would have to happen for us to reach superhuman AI. But the AI research community is
link |
vast now, the massive investments from governments, from corporations, tons of really,
link |
really smart people. You just have to look at the rate of progress in different areas of AI
link |
to see that things are moving pretty fast. So to say, oh, it's just going to be thousands of years.
link |
I don't see any basis for that. You know, I see, you know, for example, the
link |
Stanford 100 year AI project, which is supposed to be sort of, you know, the serious establishment view,
link |
their most recent report actually said it's probably not even possible.
link |
Right. Which if you want a perfect example of people in denial, that's it. Because, you know,
link |
for the whole history of AI, we've been saying to philosophers who said it wasn't possible. Well,
link |
you have no idea what you're talking about. Of course, it's possible. Right. Give me an
link |
argument for why it couldn't happen. And there isn't one. Right. And now, because people are
link |
worried that maybe AI might get a bad name, or I just don't want to think about this,
link |
they're saying, okay, well, of course, it's not really possible. You know, imagine, right? Imagine
link |
if, you know, the leaders of the cancer biology community got up and said, well, you know, of
link |
course, curing cancer, it's not really possible. There'd be a complete outrage and dismay. And,
link |
you know, I find this really a strange phenomenon. So,
link |
okay, so if you accept it as possible, and if you accept that it's probably going to happen,
link |
the point that you're making that, you know, how does it go wrong?
link |
A valid question without that, without an answer to that question, then you're stuck with what I
link |
call the gorilla problem, which is, you know, the problem that the gorillas face, right? They
link |
made something more intelligent than them, namely us a few million years ago, and now they're in
link |
deep doo doo. So there's really nothing they can do. They've lost the control. They failed to solve
link |
the control problem of controlling humans. And so they've lost. So we don't want to be in that
link |
situation. And if the gorilla problem is the only formulation you have, there's not a lot you can do.
link |
Right. Other than to say, okay, we should try to stop. You know, we should just not make the humans
link |
or, in this case, not make the AI. And I think that's really hard to do.
link |
So I'm not actually proposing that that's a feasible course of action. And I also think that,
link |
you know, if properly controlled AI could be incredibly beneficial.
link |
So the, but it seems to me that there's a, there's a consensus that one of the major
link |
failure modes is this loss of control that we create AI systems that are pursuing incorrect
link |
objectives. And because the AI system believes it knows what the objective is, it has no incentive
link |
to listen to us anymore, so to speak, right? It's just carrying out the,
link |
the strategy that it, it has computed as being the optimal solution.
link |
And, you know, it may be that in the process, it needs to acquire more resources to increase the
link |
possibility of success or prevent various failure modes by defending itself against interference.
link |
And so that collection of problems, I think, is something we can address.
link |
The other problems are, roughly speaking, you know, misuse, right? So even if we solve the control
link |
problem, we make perfectly safe, controllable AI systems. Well, you know, why would Dr.
link |
Evil use those, right? He just wants to take over the world, and he'll make unsafe AI systems
link |
that then get out of control. So that's one problem, which is, you know, partly a
link |
policing problem, partly a cultural problem for the profession of how we teach people
link |
what kinds of AI systems are safe. You talk about autonomous weapon systems and how pretty much
link |
everybody agrees that there are too many ways that that can go horribly wrong. You have this great
link |
Slaughterbots movie that kind of illustrates that beautifully. Well, I want to talk about that.
link |
That's another topic I'm happy to talk about. I just want to mention
link |
what I see as the third major failure mode, which is overuse, not so much misuse, but overuse of AI,
link |
that we become overly dependent. So I call this the WALL-E problem. If you've seen WALL-E,
link |
the movie, all right, all the humans are on the spaceship and the machines look after everything
link |
for them. And they just watch TV and drink Big Gulps. And they're all sort of obese and stupid.
link |
And they sort of totally lost any notion of human autonomy. And, you know, so in effect, right,
link |
this would happen like the slow boiling frog, right, we would gradually turn over
link |
more and more of the management of our civilization to machines as we are already doing.
link |
And this, you know, this, if this process continues, you know, we sort of gradually
link |
switch from sort of being the masters of technology to just being the guests, right?
link |
So we become guests on a cruise ship, you know, which is fine for a week, but
link |
not for the rest of eternity, right? You know, and it's almost irreversible, right? Once
link |
you lose the incentive to, for example, you know, learn to be an engineer or a doctor
link |
or a sanitation operative or any other of the infinitely many ways that we
link |
maintain and propagate our civilization. You know, if you don't have the
link |
incentive to do any of that, you won't. And then it's really hard to recover.
link |
And of course AI is just one of the technologies that could result in that third failure mode.
link |
There are probably others; technology in general detaches us from it.
link |
It does a bit, but the difference is that, in terms of the knowledge to
link |
run our civilization, you know, up to now we've had no alternative but to put it into
link |
people's heads, right? And if you put it into software, with Google, I mean, so software in
link |
general, computers in general, you know, the knowledge of how, you know, how
link |
a sanitation system works, an AI has to understand that; it's no good putting it
link |
into Google. So, I mean, we've always put knowledge on paper, but paper doesn't run
link |
our civilization. It only runs when it goes from the paper into people's heads again, right? So
link |
we've always propagated civilization through human minds, and we've spent about a trillion
link |
person years doing that, literally, right? You can work it out. There's
link |
about just over a hundred billion people who've ever lived and each of them has spent about 10
link |
years learning stuff to keep their civilization going. And so that's a trillion person years we
link |
put into this effort. Beautiful way to describe all of civilization. And now we're, you know,
link |
we're in danger of throwing that away. So this is a problem that AI can't solve. It's not a
link |
technical problem. And, you know, if we do our job right, the AI systems will say, you know,
link |
the human race doesn't in the long run want to be passengers in a cruise ship. The human race
link |
wants autonomy. This is part of human preferences. So the AI systems are not going to do this
link |
stuff for you. You've got to do it for yourself, right? I'm not going to carry you to the top of
link |
Everest in an autonomous helicopter. You have to climb it if you want to get the benefit, and so on.
link |
But I'm afraid that because we are short sighted and lazy, we're going to override the AI systems.
link |
And there's an amazing short story that I recommend to everyone that I talk to about this
link |
called The Machine Stops, written in 1909 by E. M. Forster, who, you know, wrote novels about the
link |
British Empire and sort of things that became costume dramas on the BBC. But he wrote this one
link |
science fiction story, which is an amazing vision of the future. It has basically iPads.
link |
It has video conferencing. It has MOOCs. It has computer-induced obesity. I mean,
link |
literally, the whole thing is what people spend their time doing is giving online courses or
link |
listening to online courses and talking about ideas. But they never get out there in the real
link |
world. They don't really have a lot of face to face contact. Everything is done online.
link |
You know, so all the things we're worrying about now were described in the story, and then the
link |
human race becomes more and more dependent on the machine, loses knowledge of how things really run,
link |
and then becomes vulnerable to collapse. And so it's a pretty unbelievably amazing
link |
story for someone writing in 1909 to imagine all this. Yeah. So there are very few people
link |
that represent artificial intelligence more than you, Stuart Russell.
link |
If you say so, that's very kind. So it's all my fault.
link |
It's all your fault. No, right. You're often brought up as the person, well, Stuart Russell,
link |
the AI person, is worried about this. That's why you should be worried about it.
link |
Do you feel the burden of that? I don't know if you feel that at all. But when I talk to people,
link |
like you talk about, people outside of computer science, when they think about this,
link |
Stuart Russell is worried about AI safety, you should be worried too. Do you feel the burden
link |
of that? I mean, in a practical sense, yeah, because I get, you know, a dozen, sometimes
link |
25 invitations a day to talk about it, to give interviews, to write press articles and so on.
link |
So in that very practical sense, I'm seeing that people are concerned and really interested in
link |
this. Are you worried that you could be wrong, as all good scientists are? Of course. I worry about
link |
that all the time. I mean, that's always been the way that I've worked, you know, is like I have an
link |
argument in my head with myself, right? So I have some idea. And then I think, okay,
link |
how could that be wrong? Or did someone else already have that idea? So I'll go and
link |
search in as much literature as I can to see whether someone else already thought of that
link |
or even refuted it. So, you know, right now, I'm reading a lot of philosophy because,
link |
you know, in the form of the debates over utilitarianism and other kinds of moral formulas,
link |
shall we say, people have already thought through some of these issues. But, you know,
link |
one of the things I'm not seeing in a lot of these debates is this specific idea about
link |
the importance of uncertainty in the objective, that this is the way we should think about machines
link |
that are beneficial to humans. So this idea of provably beneficial machines based on explicit
link |
uncertainty in the objective, you know, it seems to be, you know, my gut feeling is this is the core
link |
of it. It's going to have to be elaborated in a lot of different directions. And there are a lot
link |
of beneficial. Yeah, but, I mean, it has to be, right? We can't afford, you know,
link |
hand-wavy beneficial. Because, you know, whenever we do hand-wavy stuff, there are
link |
loopholes. And the thing about super intelligent machines is they find the loopholes. You know,
link |
just like, you know, tax evaders, if you don't write your tax law properly, people will find
link |
the loopholes and end up paying no tax. And so you should think of it this way. And getting those
link |
definitions right, you know, is really a long process, you know. So you can define
link |
mathematical frameworks. And within that framework, you can prove mathematical theorems that, yes,
link |
this, you know, this theoretical entity will be provably beneficial to that theoretical
link |
entity. But that framework may not match the real world in some crucial way. So it's a long process
link |
of thinking it through, iterating, and so on. Last question. Yep. You have 10 seconds to answer it.
link |
What is your favorite sci-fi movie about AI? I would say Interstellar has my favorite robots.
link |
Oh, it beats Space Odyssey? Yeah, yeah, yeah. So TARS, one of the robots in Interstellar, is
link |
the way robots should behave. And I would say Ex Machina is in some ways the one that
link |
makes you think, in a nervous kind of way, about where we're going.
link |
Well, Stuart, thank you so much for talking today. Pleasure.