
Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI | Lex Fridman Podcast #43



link |
00:00:00.000
The following is a conversation with Gary Marcus.
link |
00:00:02.740
He's a professor emeritus at NYU,
link |
00:00:04.980
founder of Robust AI and Geometric Intelligence.
link |
00:00:08.200
The latter is a machine learning company
link |
00:00:10.300
that was acquired by Uber in 2016.
link |
00:00:13.500
He's the author of several books,
link |
00:00:15.740
on natural and artificial intelligence,
link |
00:00:18.180
including his new book, Rebooting AI,
link |
00:00:20.840
Building Artificial Intelligence We Can Trust.
link |
00:00:23.340
Gary has been a critical voice,
link |
00:00:25.480
highlighting the limits of deep learning and AI in general
link |
00:00:28.780
and discussing the challenges before our AI community
link |
00:00:33.700
that must be solved in order to achieve
link |
00:00:35.740
artificial general intelligence.
link |
00:00:38.300
As I'm having these conversations,
link |
00:00:40.100
I try to find paths toward insight, towards new ideas.
link |
00:00:43.600
I try to have no ego in the process.
link |
00:00:45.940
It gets in the way.
link |
00:00:47.640
I'll often continuously try on several hats, several roles.
link |
00:00:52.300
One, for example, is the role of a three year old
link |
00:00:54.740
who understands very little about anything
link |
00:00:57.140
and asks big what and why questions.
link |
00:01:00.340
The other might be a role of a devil's advocate
link |
00:01:02.940
who presents counter ideas with the goal of arriving
link |
00:01:05.600
at greater understanding through debate.
link |
00:01:08.240
Hopefully, both are useful, interesting,
link |
00:01:11.240
and even entertaining at times.
link |
00:01:13.400
I ask for your patience as I learn
link |
00:01:15.400
to have better conversations.
link |
00:01:17.760
This is the Artificial Intelligence Podcast.
link |
00:01:20.800
If you enjoy it, subscribe on YouTube,
link |
00:01:23.140
give it five stars on iTunes, support it on Patreon,
link |
00:01:26.340
or simply connect with me on Twitter
link |
00:01:28.560
at Lex Fridman, spelled F R I D M A N.
link |
00:01:32.540
And now, here's my conversation with Gary Marcus.
link |
00:01:37.220
Do you think human civilization will one day have
link |
00:01:40.400
to face an AI driven technological singularity
link |
00:01:42.960
that will, in a societal way,
link |
00:01:45.620
modify our place in the food chain
link |
00:01:47.260
of intelligent living beings on this planet?
link |
00:01:50.140
I think our place in the food chain has already changed.
link |
00:01:54.860
So there are lots of things people used to do by hand
link |
00:01:57.340
that they now do with machines.
link |
00:01:59.180
If you think of a singularity as like one single moment,
link |
00:02:01.800
which is, I guess, what it suggests,
link |
00:02:03.220
I don't know if it'll be like that,
link |
00:02:04.580
but I think that there's a lot of gradual change
link |
00:02:07.340
and AI is getting better and better.
link |
00:02:09.220
I mean, I'm here to tell you why I think it's not nearly
link |
00:02:11.420
as good as people think, but the overall trend is clear.
link |
00:02:14.380
Maybe Ray Kurzweil thinks it's an exponential
link |
00:02:17.380
and I think it's linear.
link |
00:02:18.440
In some cases, it's close to zero right now,
link |
00:02:20.800
but it's all gonna happen.
link |
00:02:21.820
I mean, we are gonna get to human level intelligence
link |
00:02:24.780
or whatever you want, artificial general intelligence
link |
00:02:28.660
at some point, and that's certainly gonna change
link |
00:02:31.380
our place in the food chain,
link |
00:02:32.500
because a lot of the tedious things that we do now,
link |
00:02:35.200
we're gonna have machines do,
link |
00:02:36.040
and a lot of the dangerous things that we do now,
link |
00:02:38.540
we're gonna have machines do.
link |
00:02:39.900
I think our whole lives are gonna change
link |
00:02:41.660
from people finding their meaning through their work
link |
00:02:45.020
through people finding their meaning
link |
00:02:46.700
through creative expression.
link |
00:02:48.660
So the singularity will be a very gradual,
link |
00:02:53.660
in fact, removing the meaning of the word singularity.
link |
00:02:56.620
It'll be a very gradual transformation in your view.
link |
00:03:00.540
I think that it'll be somewhere in between,
link |
00:03:03.460
and I guess it depends what you mean by gradual and sudden.
link |
00:03:05.700
I don't think it's gonna be one day.
link |
00:03:07.340
I think it's important to realize
link |
00:03:08.860
that intelligence is a multidimensional variable.
link |
00:03:11.820
So people sort of write this stuff
link |
00:03:14.420
as if IQ was one number, and the day that you hit 262
link |
00:03:20.620
or whatever, you displace the human beings.
link |
00:03:22.700
And really, there's lots of facets to intelligence.
link |
00:03:25.300
So there's verbal intelligence,
link |
00:03:26.740
and there's motor intelligence,
link |
00:03:28.580
and there's mathematical intelligence and so forth.
link |
00:03:32.060
Machines, in their mathematical intelligence,
link |
00:03:34.620
far exceed most people already.
link |
00:03:36.900
In their ability to play games,
link |
00:03:38.140
they far exceed most people already.
link |
00:03:40.080
In their ability to understand language,
link |
00:03:41.760
they lag behind my five year old,
link |
00:03:43.140
far behind my five year old.
link |
00:03:44.740
So there are some facets of intelligence
link |
00:03:46.860
that machines have grasped, and some that they haven't,
link |
00:03:49.460
and we have a lot of work left to do
link |
00:03:51.780
to get them to, say, understand natural language,
link |
00:03:54.300
or to understand how to flexibly approach
link |
00:03:57.780
some kind of novel MacGyver problem solving
link |
00:04:01.340
kind of situation.
link |
00:04:03.020
And I don't know that all of these things will come at once.
link |
00:04:05.620
I think there are certain vital prerequisites
link |
00:04:07.940
that we're missing now.
link |
00:04:09.320
So for example, machines don't really have common sense now.
link |
00:04:12.500
So they don't understand that bottles contain water,
link |
00:04:15.540
and that people drink water to quench their thirst,
link |
00:04:18.160
and that they don't wanna dehydrate.
link |
00:04:19.500
They don't know these basic facts about human beings,
link |
00:04:22.100
and I think that that's a rate limiting step
link |
00:04:24.440
for many things.
link |
00:04:25.300
It's a rate limiting step for reading, for example,
link |
00:04:27.680
because stories depend on things like,
link |
00:04:29.740
oh my God, that person's running out of water.
link |
00:04:31.540
That's why they did this thing.
link |
00:04:33.040
Or if they only had water, they could put out the fire.
link |
00:04:37.100
So you watch a movie, and your knowledge
link |
00:04:39.380
about how things work matters.
link |
00:04:41.220
And so a computer can't understand that movie
link |
00:04:44.320
if it doesn't have that background knowledge.
link |
00:04:45.780
Same thing if you read a book.
link |
00:04:47.900
And so there are lots of places where,
link |
00:04:49.660
if we had a good machine interpretable set of common sense,
link |
00:04:53.740
many things would accelerate relatively quickly,
link |
00:04:56.580
but I don't think even that is a single point.
link |
00:04:59.940
There's many different aspects of knowledge.
link |
00:05:02.540
And we might, for example, find that we make a lot
link |
00:05:05.260
of progress on physical reasoning,
link |
00:05:06.660
getting machines to understand, for example,
link |
00:05:09.140
how keys fit into locks, or that kind of stuff,
link |
00:05:11.980
or how this gadget here works, and so forth and so on.
link |
00:05:16.980
And so machines might do that long before they do
link |
00:05:19.500
really good psychological reasoning,
link |
00:05:21.780
because it's easier to get kind of labeled data
link |
00:05:24.380
or to do direct experimentation on a microphone stand
link |
00:05:28.680
than it is to do direct experimentation on human beings
link |
00:05:31.780
to understand the levers that guide them.
link |
00:05:34.860
That's a really interesting point, actually,
link |
00:05:36.860
whether it's easier to gain common sense knowledge
link |
00:05:39.740
or psychological knowledge.
link |
00:05:41.740
I would say the common sense knowledge
link |
00:05:43.300
includes both physical knowledge and psychological knowledge.
link |
00:05:46.860
And the argument I was making.
link |
00:05:47.700
Well, you said physical versus psychological.
link |
00:05:49.660
Yeah, physical versus psychological.
link |
00:05:51.100
And the argument I was making is physical knowledge
link |
00:05:53.260
might be more accessible, because you could have a robot,
link |
00:05:55.300
for example, lift a bottle, try putting a bottle cap on it,
link |
00:05:58.420
see that it falls off if it does this,
link |
00:06:00.420
and see that it could turn it upside down,
link |
00:06:02.020
and so the robot could do some experimentation.
link |
00:06:04.700
We do some of our psychological reasoning
link |
00:06:07.220
by looking at our own minds.
link |
00:06:09.240
So I can sort of guess how you might react to something
link |
00:06:11.940
based on how I think I would react to it.
link |
00:06:13.660
And robots don't have that intuition,
link |
00:06:15.980
and they also can't do experiments on people
link |
00:06:18.460
in the same way or we'll probably shut them down.
link |
00:06:20.500
So if we wanted to have robots figure out
link |
00:06:24.260
how I respond to pain by pinching me in different ways,
link |
00:06:27.800
like that's probably, it's not gonna make it
link |
00:06:29.660
past the human subjects board
link |
00:06:31.020
and companies are gonna get sued or whatever.
link |
00:06:32.900
So there's certain kinds of practical experience
link |
00:06:35.860
that are limited or off limits to robots.
link |
00:06:39.660
That's a really interesting point.
link |
00:06:41.060
What is more difficult to gain a grounding in?
link |
00:06:47.540
Because to play devil's advocate,
link |
00:06:49.940
I would say that human behavior is more easily expressed
link |
00:06:54.980
in data and digital form.
link |
00:06:56.940
And so when you look at Facebook algorithms,
link |
00:06:59.100
they get to observe human behavior.
link |
00:07:01.100
So you get to study and even manipulate human behavior
link |
00:07:04.620
in a way that you perhaps cannot study
link |
00:07:07.740
or manipulate the physical world.
link |
00:07:09.540
So it's true, but the pain you mentioned is physical pain,
link |
00:07:14.400
but that's again, the physical world.
link |
00:07:16.020
Emotional pain might be much easier to experiment with,
link |
00:07:20.080
perhaps unethical, but nevertheless,
link |
00:07:22.740
some would argue it's already going on.
link |
00:07:25.380
I think that you're right, for example,
link |
00:07:27.340
that Facebook does a lot of experimentation
link |
00:07:30.980
in psychological reasoning.
link |
00:07:32.900
In fact, Zuckerberg talked about AI
link |
00:07:36.040
at a talk that he gave at NIPS.
link |
00:07:38.400
I wasn't there, but the conference
link |
00:07:40.300
has been renamed NeurIPS,
link |
00:07:41.300
but it used to be called NIPS when he gave the talk.
link |
00:07:43.740
And he talked about Facebook basically
link |
00:07:45.300
having a gigantic theory of mind.
link |
00:07:47.100
So I think it is certainly possible.
link |
00:07:49.540
I mean, Facebook does some of that.
link |
00:07:51.220
I think they have a really good idea
link |
00:07:52.620
of how to addict people to things.
link |
00:07:53.900
They understand what draws people back to things.
link |
00:07:56.420
I think they exploit it in ways
link |
00:07:57.580
that I'm not very comfortable with.
link |
00:07:59.220
But even so, I think that there are only some slices
link |
00:08:03.300
of human experience that they can access
link |
00:08:05.620
through the kind of interface they have.
link |
00:08:07.220
And of course, they're doing all kinds of VR stuff,
link |
00:08:08.980
and maybe that'll change and they'll expand their data.
link |
00:08:11.940
And I'm sure that that's part of their goal.
link |
00:08:14.940
So it is an interesting question.
link |
00:08:16.860
I think love, fear, insecurity,
link |
00:08:21.700
all of the things that,
link |
00:08:24.300
I would say some of the deepest things
link |
00:08:26.620
about human nature and the human mind
link |
00:08:28.620
could be explored through digital form.
link |
00:08:30.500
You're actually the first person
link |
00:08:32.220
just now to bring it up,
link |
00:08:33.680
I wonder what is more difficult.
link |
00:08:35.860
Because I think the folks who are,
link |
00:08:40.220
and we'll talk a lot about deep learning,
link |
00:08:41.820
but the people who are thinking beyond deep learning
link |
00:08:44.860
are thinking about the physical world.
link |
00:08:46.420
You're starting to think about robotics
link |
00:08:48.060
in home robotics.
link |
00:08:49.180
How do we make robots manipulate objects,
link |
00:08:52.300
which requires an understanding of the physical world
link |
00:08:55.020
and then requires common sense reasoning.
link |
00:08:57.300
And that has felt like the next step
link |
00:08:59.440
for common sense reasoning,
link |
00:09:00.420
but you've now brought up the idea
link |
00:09:02.100
that there's also the emotional part.
link |
00:09:03.620
And it's interesting whether that's hard or easy.
link |
00:09:06.840
I think some parts of it are and some aren't.
link |
00:09:08.540
So my company that I recently founded with Rod Brooks,
link |
00:09:12.660
who was at MIT for many years and so forth,
link |
00:09:15.940
we're interested in both.
link |
00:09:17.240
We're interested in physical reasoning
link |
00:09:18.580
and psychological reasoning, among many other things.
link |
00:09:21.500
And there are pieces of each of these that are accessible.
link |
00:09:26.140
So if you want a robot to figure out
link |
00:09:28.020
whether it can fit under a table,
link |
00:09:29.720
that's a relatively accessible piece of physical reasoning.
link |
00:09:33.660
If you know the height of the table
link |
00:09:34.760
and you know the height of the robot, it's not that hard.
link |
00:09:36.980
If you wanted to do physical reasoning about Jenga,
link |
00:09:39.900
it gets a little bit more complicated
link |
00:09:41.500
and you have to have higher resolution data
link |
00:09:43.820
in order to do it.
link |
00:09:45.260
With psychological reasoning,
link |
00:09:46.900
it's not that hard to know, for example,
link |
00:09:49.320
that people have goals and they like to act on those goals,
link |
00:09:51.700
but it's really hard to know exactly what those goals are.
link |
00:09:54.900
But what about ideas like frustration?
link |
00:09:56.780
I mean, you could argue it's extremely difficult
link |
00:09:58.780
to understand the sources of human frustration
link |
00:10:01.460
as they're playing Jenga with you, or not.
link |
00:10:05.740
You could argue that it's very accessible.
link |
00:10:08.020
There's some things that are gonna be obvious
link |
00:10:09.740
and some not.
link |
00:10:10.580
So I don't think anybody really can do this well yet,
link |
00:10:14.220
but I think it's not inconceivable
link |
00:10:16.620
to imagine machines in the not so distant future
link |
00:10:20.120
being able to understand that if people lose in a game,
link |
00:10:24.220
that they don't like that.
link |
00:10:26.260
That's not such a hard thing to program
link |
00:10:27.940
and it's pretty consistent across people.
link |
00:10:29.980
Most people don't enjoy losing
link |
00:10:31.540
and so that makes it relatively easy to code.
link |
00:10:34.620
On the other hand, if you wanted to capture everything
link |
00:10:36.860
about frustration, well, people can get frustrated
link |
00:10:39.180
for a lot of different reasons.
link |
00:10:40.320
They might get sexually frustrated,
link |
00:10:42.340
they might get frustrated,
link |
00:10:43.180
because they can't get their promotion at work,
link |
00:10:45.140
all kinds of different things.
link |
00:10:46.900
And the more you expand the scope,
link |
00:10:48.580
the harder it is for anything like the existing techniques
link |
00:10:51.540
to really do that.
link |
00:10:53.000
So I'm talking to Garry Kasparov next week
link |
00:10:55.660
and he seemed pretty frustrated
link |
00:10:57.220
with his game against Deep Blue, so.
link |
00:10:58.940
Yeah, well, I'm frustrated with my game
link |
00:11:00.300
against him last year,
link |
00:11:01.340
because I played him, I had two excuses,
link |
00:11:03.620
I'll give you my excuses up front,
link |
00:11:04.900
but it won't mitigate the outcome.
link |
00:11:07.060
I was jet lagged and I hadn't played in 25 or 30 years,
link |
00:11:11.100
but the outcome is he completely destroyed me
link |
00:11:13.020
and it wasn't even close.
link |
00:11:14.420
Have you ever been beaten in any board game by a machine?
link |
00:11:19.740
I have, I actually played the predecessor to Deep Blue.
link |
00:11:24.740
Deep Thought, I believe it was called,
link |
00:11:27.940
and that too crushed me.
link |
00:11:30.000
And that was, and after that you realize it's over for us.
link |
00:11:35.340
Well, there's no point in my playing Deep Blue.
link |
00:11:36.820
I mean, it's a waste of Deep Blue's computation.
link |
00:11:40.260
I mean, I played Kasparov
link |
00:11:41.540
because we both gave lectures at the same event
link |
00:11:44.820
and he was playing 30 people.
link |
00:11:46.020
I forgot to mention that.
link |
00:11:46.900
Not only did he crush me,
link |
00:11:47.980
but he crushed 29 other people at the same time.
link |
00:11:50.660
I mean, but the actual philosophical and emotional experience
link |
00:11:55.460
of being beaten by a machine, I imagine is a,
link |
00:11:59.100
I mean, to you who thinks about these things
link |
00:12:01.380
may be a profound experience.
link |
00:12:03.580
Or no, it was a simple mathematical experience.
link |
00:12:07.780
Yeah, I think a game like chess particularly
link |
00:12:10.300
where you have perfect information,
link |
00:12:12.740
it's two player, closed ended,
link |
00:12:14.780
and there's more computation for the computer,
link |
00:12:16.940
it's no surprise the machine wins.
link |
00:12:18.860
I mean, I'm not sad when a computer,
link |
00:12:22.020
I'm not sad when a computer calculates
link |
00:12:23.940
a cube root faster than me.
link |
00:12:25.220
Like, I know I can't win that game.
link |
00:12:27.860
I'm not gonna try.
link |
00:12:28.900
Well, with a system like AlphaGo or AlphaZero,
link |
00:12:32.080
do you see a little bit more magic in a system like that
link |
00:12:35.060
even though it's simply playing a board game?
link |
00:12:37.260
But because there's a strong learning component?
link |
00:12:39.940
You know, it's funny you should mention that
link |
00:12:41.300
in the context of this conversation
link |
00:12:42.580
because Kasparov and I are working on an article
link |
00:12:45.300
that's gonna be called AI is not magic.
link |
00:12:47.300
And, you know, neither one of us thinks that it's magic.
link |
00:12:50.500
And part of the point of this article
link |
00:12:51.980
is that AI is actually a grab bag of different techniques
link |
00:12:55.140
and some of them have,
link |
00:12:56.060
or they each have their own unique strengths and weaknesses.
link |
00:13:00.060
So, you know, you read media accounts
link |
00:13:02.820
and it's like, ooh, AI, it must be magical
link |
00:13:05.200
or it can solve any problem.
link |
00:13:06.580
Well, no, some problems are really accessible
link |
00:13:09.500
like chess and go and other problems like reading
link |
00:13:11.980
are completely outside the current technology.
link |
00:13:14.940
And it's not like you can take the technology,
link |
00:13:17.100
that drives AlphaGo and apply it to reading
link |
00:13:20.100
and get anywhere.
link |
00:13:21.340
You know, DeepMind has tried that a bit.
link |
00:13:23.180
They have all kinds of resources.
link |
00:13:24.500
You know, they built AlphaGo and they have,
link |
00:13:26.180
you know, I wrote a piece recently about how they lost money,
link |
00:13:29.460
and you can argue about the word lost,
link |
00:13:30.540
but they spent $530 million more than they made last year.
link |
00:13:34.900
So, you know, they're making huge investments.
link |
00:13:36.620
They have a large budget
link |
00:13:37.860
and they have applied the same kinds of techniques
link |
00:13:40.900
to reading or to language.
link |
00:13:43.220
It's just much less productive there
link |
00:13:45.540
because it's a fundamentally different kind of problem.
link |
00:13:47.900
Chess and Go and so forth are closed ended problems.
link |
00:13:50.660
The rules haven't changed in 2,500 years.
link |
00:13:52.980
There's only so many moves you can make.
link |
00:13:54.700
You can talk about the exponential
link |
00:13:56.460
as you look at the combinations of moves,
link |
00:13:58.180
but fundamentally, you know, the Go board has 361 intersections.
link |
00:14:01.240
That's it.
link |
00:14:02.080
That's the only, you know, those intersections
link |
00:14:04.100
are the only places that you can place your stone.
link |
00:14:07.300
Whereas when you're reading,
link |
00:14:09.140
the next sentence could be anything.
link |
00:14:11.460
You know, it's completely up to the writer
link |
00:14:13.300
what they're gonna do next.
link |
00:14:14.460
That's fascinating that you think this way.
link |
00:14:16.260
You're clearly a brilliant mind
link |
00:14:17.980
who points out the emperor has no clothes,
link |
00:14:19.700
but so I'll play the role of a person who says.
link |
00:14:22.300
You're gonna put clothes on the emperor?
link |
00:14:23.300
Good luck with it.
link |
00:14:24.140
Who romanticizes the notion of the emperor, period,
link |
00:14:27.980
suggesting that clothes don't even matter.
link |
00:14:30.140
Okay, so that's really interesting
link |
00:14:33.580
that you're talking about language.
link |
00:14:36.260
So there's the physical world
link |
00:14:37.780
of being able to move about the world,
link |
00:14:39.680
making an omelet and coffee and so on.
link |
00:14:41.940
There's language where you first understand
link |
00:14:46.020
what's being written and then maybe even more complicated
link |
00:14:48.860
than that, having a natural dialogue.
link |
00:14:51.260
And then there's the game of go and chess.
link |
00:14:53.620
I would argue that language is much closer to go
link |
00:14:57.540
than it is to the physical world.
link |
00:14:59.700
Like it is still very constrained.
link |
00:15:01.460
When you say the possibility of the number of sentences
link |
00:15:04.740
that could come, it is huge,
link |
00:15:06.500
but it nevertheless is much more constrained.
link |
00:15:09.260
It feels, maybe I'm wrong, more constrained than the possibilities
link |
00:15:12.740
that the physical world brings us.
link |
00:15:14.540
There's something to what you say
link |
00:15:15.860
and some ways in which I disagree.
link |
00:15:17.700
So one interesting thing about language
link |
00:15:20.620
is that it abstracts away.
link |
00:15:23.340
This bottle, I don't know if it would be in the field of view
link |
00:15:26.140
is on this table and I use the word on here
link |
00:15:28.900
and I can use the word on here, maybe not here,
link |
00:15:32.980
but that one word encompasses in analog space
link |
00:15:36.980
sort of infinite number of possibilities.
link |
00:15:39.340
So there is a way in which language filters down
link |
00:15:43.060
the variation of the world and there's other ways.
link |
00:15:46.660
So we have a grammar and more or less
link |
00:15:49.900
you have to follow the rules of that grammar.
link |
00:15:51.700
You can break them a little bit,
link |
00:15:52.700
but by and large we follow the rules of grammar
link |
00:15:55.420
and so that's a constraint on language.
link |
00:15:57.020
So there are ways in which language is a constrained system.
link |
00:15:59.460
On the other hand, there are many arguments
link |
00:16:02.300
that say there's an infinite number of possible sentences
link |
00:16:04.740
and you can establish that by just stacking them up.
link |
00:16:07.660
So I think there's water on the table,
link |
00:16:09.500
you think that I think there's water on the table,
link |
00:16:11.740
your mother thinks that you think that I think
link |
00:16:13.340
that water's on the table, your brother thinks
link |
00:16:15.620
that maybe your mom is wrong to think
link |
00:16:17.300
that you think that I think, right?
link |
00:16:18.660
So we can make sentences of infinite length
link |
00:16:21.980
or we can stack up adjectives.
link |
00:16:23.580
This is a very silly example, a very, very silly example,
link |
00:16:26.420
a very, very, very, very, very, very silly example
link |
00:16:28.780
and so forth.
link |
00:16:29.620
So there are good arguments
link |
00:16:30.980
that there's an infinite range of sentences.
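A toy sketch of that stacking argument, just to make the recursion concrete; the miniature grammar below is an illustrative assumption, not a serious model of English:

```python
# Toy illustration of the "stacking" argument for an unbounded number of sentences:
# every extra level of embedding yields a new, longer grammatical sentence.

def embedded_sentence(depth: int) -> str:
    base = "there's water on the table"
    speakers = ["I think", "you think", "your mother thinks", "your brother thinks"]
    sentence = base
    for level in range(depth):
        sentence = f"{speakers[level % len(speakers)]} that {sentence}"
    return sentence

for d in range(4):
    print(embedded_sentence(d))
# Nothing bounds the depth, so the set of possible sentences has no finite limit.
```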
link |
00:16:32.420
In any case, it's vast by any reasonable measure
link |
00:16:35.780
and for example, almost anything in the physical world
link |
00:16:37.980
we can talk about in the language world
link |
00:16:40.460
and interestingly, many of the sentences that we understand,
link |
00:16:43.820
we can only understand if we have a very rich model
link |
00:16:46.820
of the physical world.
link |
00:16:47.820
So I don't ultimately want to adjudicate the debate
link |
00:16:50.620
that I think you just set up, but I find it interesting.
link |
00:16:54.420
Maybe the physical world is even more complicated
link |
00:16:57.180
than language, I think that's fair, but.
link |
00:16:59.580
Language is really, really complicated.
link |
00:17:03.100
It's really, really hard.
link |
00:17:04.100
Well, it's really, really hard for machines,
link |
00:17:06.100
for linguists, people trying to understand it.
link |
00:17:08.500
It's not that hard for children
link |
00:17:09.660
and that's part of what's driven my whole career.
link |
00:17:12.100
I was a student of Steven Pinker's
link |
00:17:14.340
and we were trying to figure out
link |
00:17:15.340
why kids could learn language when machines couldn't.
link |
00:17:18.700
I think we're gonna get into language,
link |
00:17:20.540
we're gonna get into communication intelligence
link |
00:17:22.460
and neural networks and so on,
link |
00:17:24.220
but let me return to the high level,
link |
00:17:28.860
the futuristic for a brief moment.
link |
00:17:32.540
So you've written in your book, in your new book,
link |
00:17:37.300
it would be arrogant to suppose that we could forecast
link |
00:17:39.940
where AI will be or the impact it will have
link |
00:17:42.500
in a thousand years or even 500 years.
link |
00:17:45.180
So let me ask you to be arrogant.
link |
00:17:48.340
What do AI systems with or without physical bodies
link |
00:17:51.500
look like 100 years from now?
link |
00:17:53.500
If you would just, you can't predict,
link |
00:17:56.820
but if you were to philosophize and imagine, do.
link |
00:18:00.540
Can I first justify the arrogance
link |
00:18:02.020
before you try to push me beyond it?
link |
00:18:04.100
Sure.
link |
00:18:05.940
I mean, there are examples like,
link |
00:18:07.700
people figured out how electricity worked,
link |
00:18:09.720
they had no idea that that was gonna lead to cell phones.
link |
00:18:13.060
I mean, things can move awfully fast
link |
00:18:15.600
once new technologies are perfected.
link |
00:18:17.940
Even when they made transistors,
link |
00:18:19.460
they weren't really thinking that cell phones
link |
00:18:21.100
would lead to social networking.
link |
00:18:23.340
There are nevertheless predictions of the future,
link |
00:18:25.740
which are statistically unlikely to come to be,
link |
00:18:28.820
but are nevertheless the best we have.
link |
00:18:29.660
You're asking me to be wrong.
link |
00:18:31.380
Asking you to be statistically.
link |
00:18:32.220
In which way would I like to be wrong?
link |
00:18:34.020
Pick the least unlikely to be wrong thing,
link |
00:18:37.500
even though it's still very likely to be wrong.
link |
00:18:39.760
I mean, here's some things
link |
00:18:40.600
that we can safely predict, I suppose.
link |
00:18:42.740
We can predict that AI will be faster than it is now.
link |
00:18:47.300
It will be cheaper than it is now.
link |
00:18:49.520
It will be better in the sense of being more general
link |
00:18:52.880
and applicable in more places.
link |
00:18:56.980
It will be pervasive.
link |
00:18:59.300
I mean, these are easy predictions.
link |
00:19:01.620
I'm sort of modeling them in my head
link |
00:19:03.320
on Jeff Bezos's famous predictions.
link |
00:19:05.820
He says, I can't predict the future,
link |
00:19:07.340
not in every way, I'm paraphrasing.
link |
00:19:09.820
But I can predict that people
link |
00:19:11.060
will never wanna pay more money for their stuff.
link |
00:19:13.220
They're never gonna want it to take longer to get there.
link |
00:19:15.580
So you can't predict everything,
link |
00:19:17.800
but you can predict something.
link |
00:19:18.880
Sure, of course it's gonna be faster and better.
link |
00:19:21.220
But what we can't really predict
link |
00:19:24.500
is the full scope of where AI will be in a certain period.
link |
00:19:28.700
I mean, I think it's safe to say that,
link |
00:19:31.900
although I'm very skeptical about current AI,
link |
00:19:35.660
that it's possible to do much better.
link |
00:19:37.700
You know, there's no in principle argument
link |
00:19:39.700
that says AI is an insolvable problem,
link |
00:19:42.100
that there's magic inside our brains
link |
00:19:43.620
that will never be captured.
link |
00:19:44.980
I mean, I've heard people make those kind of arguments.
link |
00:19:46.780
I don't think they're very good.
link |
00:19:48.980
So AI's gonna come, and probably 500 years
link |
00:19:54.100
is plenty to get there.
link |
00:19:55.540
And then once it's here, it really will change everything.
link |
00:19:59.260
So when you say AI's gonna come,
link |
00:20:01.060
are you talking about human level intelligence?
link |
00:20:03.660
So maybe I...
link |
00:20:04.980
I like the term general intelligence.
link |
00:20:06.660
So I don't think that the ultimate AI,
link |
00:20:09.500
if there is such a thing, is gonna look just like humans.
link |
00:20:11.980
I think it's gonna do some things
link |
00:20:13.600
that humans do better than current machines,
link |
00:20:16.580
like reason flexibly.
link |
00:20:18.580
And understand language and so forth.
link |
00:20:21.180
But it doesn't mean they have to be identical to humans.
link |
00:20:23.460
So for example, humans have terrible memory,
link |
00:20:25.980
and they suffer from what some people
link |
00:20:28.780
call motivated reasoning.
link |
00:20:29.920
So they like arguments that seem to support them,
link |
00:20:32.460
and they dismiss arguments that they don't like.
link |
00:20:35.460
There's no reason that a machine should ever do that.
link |
00:20:38.660
So you see that those limitations of memory
link |
00:20:42.280
as a bug, not a feature.
link |
00:20:43.940
Absolutely.
link |
00:20:44.820
I'll say two things about that.
link |
00:20:46.620
One is I was on a panel with Danny Kahneman,
link |
00:20:48.660
the Nobel Prize winner, last night,
link |
00:20:50.300
and we were talking about this stuff.
link |
00:20:51.760
And I think what we converged on
link |
00:20:53.480
is that humans are a low bar to exceed.
link |
00:20:56.120
They may be outside of our skill right now,
link |
00:20:58.940
as AI programmers, but eventually AI will exceed it.
link |
00:21:04.300
So we're not talking about human level AI.
link |
00:21:06.060
We're talking about general intelligence
link |
00:21:07.900
that can do all kinds of different things
link |
00:21:09.420
and do it without some of the flaws that human beings have.
link |
00:21:12.220
The other thing I'll say is I wrote a whole book,
link |
00:21:13.700
actually, about the flaws of humans.
link |
00:21:15.280
It's actually a nice bookend to the,
link |
00:21:17.980
or counterpoint to the current book.
link |
00:21:19.180
So I wrote a book called Kluge,
link |
00:21:21.380
which was about the limits of the human mind.
link |
00:21:24.020
The current book is kind of about those few things
link |
00:21:26.380
that humans do a lot better than machines.
link |
00:21:28.760
Do you think it's possible that the flaws
link |
00:21:30.820
of the human mind, the limits of memory,
link |
00:21:33.260
our mortality, our bias,
link |
00:21:38.460
is a strength, not a weakness,
link |
00:21:40.300
that that is the thing that enables,
link |
00:21:43.500
from which motivation springs and meaning springs or not?
link |
00:21:47.940
I've heard a lot of arguments like this.
link |
00:21:49.460
I've never found them that convincing.
link |
00:21:50.860
I think that there's a lot of making lemonade out of lemons.
link |
00:21:55.120
So we, for example, do a lot of free association
link |
00:21:58.260
where one idea just leads to the next
link |
00:22:00.780
and they're not really that well connected.
link |
00:22:02.540
And we enjoy that and we make poetry out of it
link |
00:22:04.500
and we make kind of movies with free associations
link |
00:22:07.100
and it's fun and whatever.
link |
00:22:08.140
I don't think that's really a virtue of the system.
link |
00:22:12.300
I think that the limitations in human reasoning
link |
00:22:15.340
actually get us in a lot of trouble.
link |
00:22:16.580
Like, for example, politically we can't see eye to eye
link |
00:22:19.300
because we have the motivated reasoning I was talking
link |
00:22:21.780
about and something related called confirmation bias.
link |
00:22:25.080
So we have all of these problems that actually make
link |
00:22:27.460
for a rougher society because we can't get along
link |
00:22:29.920
because we can't interpret the data in shared ways.
link |
00:22:34.320
And then we do some nice stuff with that.
link |
00:22:36.460
So my free associations are different from yours
link |
00:22:38.900
and you're kind of amused by them and that's great.
link |
00:22:41.600
And hence poetry.
link |
00:22:42.620
So there are lots of ways in which we take
link |
00:22:45.060
a lousy situation and make it good.
link |
00:22:47.540
Another example would be our memories are terrible.
link |
00:22:50.580
So we play games like Concentration where you flip over
link |
00:22:53.300
two cards, try to find a pair.
link |
00:22:54.980
Can you imagine a computer playing that?
link |
00:22:56.480
Computer's like, this is the dullest game in the world.
link |
00:22:58.300
I know where all the cards are, I see it once,
link |
00:22:59.940
I know where it is, what are you even talking about?
link |
00:23:02.580
So we make a fun game out of having this terrible memory.
link |
00:23:07.040
So we are imperfect in discovering and optimizing
link |
00:23:12.220
some kind of utility function.
link |
00:23:13.540
But you think in general, there is a utility function.
link |
00:23:16.300
There's an objective function that's better than others.
link |
00:23:18.860
I didn't say that.
link |
00:23:20.340
But see, the presumption, when you say...
link |
00:23:24.420
I think you could design a better memory system.
link |
00:23:27.220
You could argue about utility functions
link |
00:23:29.900
and how you wanna think about that.
link |
00:23:32.100
But objectively, it would be really nice
link |
00:23:34.180
to do some of the following things.
link |
00:23:36.500
To get rid of memories that are no longer useful.
link |
00:23:41.140
Objectively, that would just be good.
link |
00:23:42.700
And we're not that good at it.
link |
00:23:43.580
So when you park in the same lot every day,
link |
00:23:46.540
you confuse where you parked today
link |
00:23:47.900
with where you parked yesterday
link |
00:23:48.860
with where you parked the day before and so forth.
link |
00:23:50.700
So you blur together a series of memories.
link |
00:23:52.620
There's just no way that that's optimal.
link |
00:23:55.380
I mean, I've heard all kinds of wacky arguments
link |
00:23:56.940
of people trying to defend that.
link |
00:23:58.140
But at the end of the day,
link |
00:23:58.980
I don't think any of them hold water.
link |
00:24:00.420
It's just a bug.
link |
00:24:01.260
Or memories of traumatic events would be possibly
link |
00:24:04.420
a very nice feature to have to get rid of those.
link |
00:24:06.780
It'd be great if you could just be like,
link |
00:24:08.300
I'm gonna wipe this sector.
link |
00:24:10.580
I'm done with that.
link |
00:24:12.020
I didn't have fun last night.
link |
00:24:13.260
I don't wanna think about it anymore.
link |
00:24:14.780
Whoop, bye bye.
link |
00:24:15.820
I'm gone.
link |
00:24:16.660
But we can't.
link |
00:24:17.740
Do you think it's possible to build a system...
link |
00:24:20.380
So you said human level intelligence is a weird concept, but...
link |
00:24:23.780
Well, I'm saying I prefer general intelligence.
link |
00:24:25.420
General intelligence.
link |
00:24:26.260
I mean, human level intelligence is a real thing.
link |
00:24:28.140
And you could try to make a machine
link |
00:24:29.820
that matches people or something like that.
link |
00:24:31.940
I'm saying that per se shouldn't be the objective,
link |
00:24:34.220
but rather that we should learn from humans
link |
00:24:37.220
the things they do well and incorporate that into our AI,
link |
00:24:39.660
just as we incorporate the things that machines do well
link |
00:24:42.100
that people do terribly.
link |
00:24:43.260
So, I mean, it's great that AI systems
link |
00:24:45.780
can do all this brute force computation that people can't.
link |
00:24:48.340
And one of the reasons I work on this stuff
link |
00:24:50.820
is because I would like to see machines solve problems
link |
00:24:53.300
that people can't, that combine the strength,
link |
00:24:56.020
or that in order to be solved would combine
link |
00:24:59.460
the strengths of machines to do all this computation
link |
00:25:02.220
with the ability, let's say, of people to read.
link |
00:25:04.220
So I'd like machines that can read
link |
00:25:06.180
the entire medical literature in a day.
link |
00:25:08.660
7,000 new papers, or whatever the number is,
link |
00:25:10.780
comes out every day.
link |
00:25:11.740
There's no way for any doctor or whatever to read them all.
link |
00:25:15.740
A machine that could read would be a brilliant thing.
link |
00:25:17.980
And that would be strengths of brute force computation
link |
00:25:21.060
combined with kind of subtlety and understanding medicine
link |
00:25:24.300
that a good doctor or scientist has.
link |
00:25:26.900
So if we can linger a little bit
link |
00:25:28.020
on the idea of general intelligence.
link |
00:25:29.660
So Yann LeCun believes that human intelligence
link |
00:25:32.860
isn't general at all, it's very narrow.
link |
00:25:35.580
What do you think?
link |
00:25:36.700
I don't think that makes sense.
link |
00:25:38.140
We have lots of narrow intelligences for specific problems.
link |
00:25:42.140
But the fact is, like, anybody can walk into,
link |
00:25:45.940
let's say, a Hollywood movie,
link |
00:25:47.620
and reason about the content
link |
00:25:49.140
of almost anything that goes on there.
link |
00:25:51.700
So you can reason about what happens in a bank robbery,
link |
00:25:55.180
or what happens when someone is infertile
link |
00:25:58.620
and wants to go to IVF to try to have a child,
link |
00:26:02.780
or you can, the list is essentially endless.
link |
00:26:05.940
And not everybody understands every scene in the movie,
link |
00:26:09.580
but there's a huge range of things
link |
00:26:11.740
that pretty much any ordinary adult can understand.
link |
00:26:15.060
His argument is, is that actually,
link |
00:26:18.220
the set of things seems large for us humans
link |
00:26:20.700
because we're very limited in considering
link |
00:26:24.380
the kind of possibilities of experiences that are possible.
link |
00:26:27.340
But in fact, the amount of experience that are possible
link |
00:26:30.180
is infinitely larger.
link |
00:26:32.500
Well, I mean, if you wanna make an argument
link |
00:26:35.140
that humans are constrained in what they can understand,
link |
00:26:38.780
I have no issue with that.
link |
00:26:40.940
I think that's right.
link |
00:26:41.780
But it's still not the same thing at all
link |
00:26:44.460
as saying, here's a system that can play Go.
link |
00:26:47.460
It's been trained on five million games.
link |
00:26:49.700
And then I say, can it play on a rectangular board
link |
00:26:52.580
rather than a square board?
link |
00:26:53.700
And you say, well, if I retrain it from scratch
link |
00:26:56.580
on another five million games, it can.
link |
00:26:58.340
That's really, really narrow, and that's where we are.
link |
00:27:01.140
We don't have even a system that could play Go
link |
00:27:05.140
and then without further retraining,
link |
00:27:07.100
play on a rectangular board,
link |
00:27:08.700
which any human could do with very little problem.
link |
00:27:12.600
So that's what I mean by narrow.
link |
00:27:14.860
And so it's just wordplay to say.
link |
00:27:16.900
That is semantics, yeah.
link |
00:27:18.060
Then it's just words.
link |
00:27:19.300
Then yeah, you mean general in a sense
link |
00:27:21.180
that you can do all kinds of Go board shapes flexibly.
link |
00:27:25.780
Well, that would be like a first step
link |
00:27:28.100
in the right direction,
link |
00:27:29.020
but obviously that's not what I really mean.
link |
00:27:30.540
You're kidding.
link |
00:27:32.380
What I mean by general is that you could transfer
link |
00:27:36.140
the knowledge you learn in one domain to another.
link |
00:27:38.940
So if you learn about bank robberies in movies
link |
00:27:43.320
and there's chase scenes,
link |
00:27:44.780
then you can understand that amazing scene in Breaking Bad
link |
00:27:47.740
when Walter White has a car chase scene
link |
00:27:50.580
with only one person.
link |
00:27:51.500
He's the only one in it.
link |
00:27:52.620
And you can reflect on how that car chase scene
link |
00:27:55.540
is like all the other car chase scenes you've ever seen
link |
00:27:58.060
and totally different and why that's cool.
link |
00:28:01.140
And the fact that the number of domains
link |
00:28:03.100
you can do that with is finite
link |
00:28:04.540
doesn't make it less general.
link |
00:28:05.760
So the idea of general is you could just do it
link |
00:28:07.340
on a lot of domains, transfer it across a lot of domains.
link |
00:28:09.380
Yeah, I mean, I'm not saying humans are infinitely general
link |
00:28:11.740
or that humans are perfect.
link |
00:28:12.960
I just said a minute ago, it's a low bar,
link |
00:28:15.340
but it's just, it's a low bar.
link |
00:28:17.420
But right now, like the bar is here and we're there
link |
00:28:20.460
and eventually we'll get way past it.
link |
00:28:22.660
So speaking of low bars,
link |
00:28:25.600
you've highlighted in your new book as well,
link |
00:28:27.420
but a couple of years ago wrote a paper
link |
00:28:29.340
titled Deep Learning: A Critical Appraisal
link |
00:28:31.300
that lists 10 challenges faced
link |
00:28:33.340
by current deep learning systems.
link |
00:28:36.020
So let me summarize them as data efficiency,
link |
00:28:40.140
transfer learning, hierarchical knowledge,
link |
00:28:42.900
open ended inference, explainability,
link |
00:28:46.300
integrating prior knowledge, causal reasoning,
link |
00:28:49.660
modeling of a stable world, robustness to adversarial examples,
link |
00:28:53.220
and so on.
link |
00:28:54.140
And then my favorite probably is reliability
link |
00:28:56.900
in the engineering of real world systems.
link |
00:28:59.140
So whatever, people can read the paper,
link |
00:29:01.600
they should definitely read the paper,
link |
00:29:02.940
should definitely read your book.
link |
00:29:04.320
But which of these challenges, if solved, in your view
link |
00:29:08.140
would have the biggest impact on the AI community?
link |
00:29:11.060
It's a very good question.
link |
00:29:13.940
And I'm gonna be evasive because I think that
link |
00:29:16.300
they go together a lot.
link |
00:29:17.980
So some of them might be solved independently of others,
link |
00:29:21.420
but I think a good solution to AI
link |
00:29:23.700
starts by having real,
link |
00:29:25.460
what I would call cognitive models of what's going on.
link |
00:29:28.420
So right now we have an approach that's dominant
link |
00:29:31.340
where you take statistical approximations of things,
link |
00:29:33.920
but you don't really understand them.
link |
00:29:35.740
So you know that bottles are correlated in your data
link |
00:29:39.100
with bottle caps,
link |
00:29:40.300
but you don't understand that there's a thread
link |
00:29:42.220
on the bottle cap that fits with the thread on the bottle
link |
00:29:45.300
and then that's what tightens it.
link |
00:29:46.620
If I tighten it enough, there's a seal
link |
00:29:48.540
and the water won't come out.
link |
00:29:49.660
Like there's no machine that understands that.
link |
00:29:51.980
And having a good cognitive model
link |
00:29:53.820
of that kind of everyday phenomena
link |
00:29:55.480
is what we call common sense.
link |
00:29:56.620
And if you had that,
link |
00:29:57.820
then a lot of these other things start to fall
link |
00:30:00.700
into at least a little bit better place.
link |
00:30:02.860
Right now you're like learning correlations between pixels
link |
00:30:05.640
when you play a video game or something like that.
link |
00:30:07.660
And it doesn't work very well.
link |
00:30:08.940
It works when the video game is just the way
link |
00:30:10.720
that you studied it and then you alter the video game
link |
00:30:12.940
in small ways,
link |
00:30:13.760
like you move the paddle in Breakout a few pixels
link |
00:30:15.780
and the system falls apart.
link |
00:30:17.460
Because it doesn't understand,
link |
00:30:19.020
it doesn't have a representation of a paddle,
link |
00:30:20.900
a ball, a wall, a set of bricks and so forth.
link |
00:30:23.340
And so it's reasoning at the wrong level.
link |
00:30:26.440
So the idea of common sense,
link |
00:30:29.220
it's full of mystery,
link |
00:30:30.220
you've worked on it,
link |
00:30:31.060
but it's nevertheless full of mystery,
link |
00:30:33.560
full of promise.
link |
00:30:34.720
What does common sense mean?
link |
00:30:36.540
What does knowledge mean?
link |
00:30:38.020
So the way you've been discussing it now
link |
00:30:40.020
is very intuitive.
link |
00:30:40.940
It makes a lot of sense that that is something
link |
00:30:42.580
we should have and that's something
link |
00:30:43.700
deep learning systems don't have.
link |
00:30:45.600
But the argument could be that we're oversimplifying it
link |
00:30:49.740
because we're oversimplifying the notion of common sense
link |
00:30:53.180
because that's how it feels like we as humans
link |
00:30:57.140
at the cognitive level approach problems.
link |
00:30:59.320
So maybe.
link |
00:31:00.160
A lot of people aren't actually gonna read my book.
link |
00:31:03.320
But if they did read the book,
link |
00:31:05.220
one of the things that might come as a surprise to them
link |
00:31:07.140
is that we actually say common sense is really hard
link |
00:31:10.660
and really complicated.
link |
00:31:11.640
So they would probably,
link |
00:31:13.020
my critics know that I like common sense,
link |
00:31:15.140
but that chapter actually starts by us beating up
link |
00:31:18.600
not on deep learning,
link |
00:31:19.900
but kind of on our own home team, as it were.
link |
00:31:21.960
So Ernie and I are first and foremost
link |
00:31:25.180
people that believe in at least some
link |
00:31:26.780
of what good old fashioned AI tried to do.
link |
00:31:28.700
So we believe in symbols and logic and programming.
link |
00:31:32.500
Things like that are important.
link |
00:31:33.740
And we go through why even those tools
link |
00:31:37.020
that we hold fairly dear aren't really enough.
link |
00:31:39.560
So we talk about why common sense is actually many things.
link |
00:31:42.660
And some of them fit really well with those
link |
00:31:45.300
classical sets of tools.
link |
00:31:46.540
So things like taxonomy.
link |
00:31:48.240
So I know that a bottle is an object
link |
00:31:51.460
or it's a vessel, let's say.
link |
00:31:52.860
And I know a vessel is an object
link |
00:31:54.480
and objects are material things in the physical world.
link |
00:31:57.580
So I can make some inferences.
link |
00:32:00.500
If I know that vessels need to not have holes in them,
link |
00:32:07.020
in order to carry their contents,
link |
00:32:09.540
then I can infer that a bottle
link |
00:32:10.920
shouldn't have a hole in it in order to carry its contents.
link |
00:32:12.860
So you can do hierarchical inference and so forth.
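A minimal sketch of that kind of taxonomic inference; the is-a links and the single vessel-level property below are illustrative assumptions:

```python
# Minimal sketch of taxonomic (hierarchical) inference: a bottle is a vessel,
# a vessel is an object, and properties asserted of vessels are inherited by bottles.
# Category names and the single rule are illustrative only.

taxonomy = {"bottle": "vessel", "vessel": "object", "object": None}
properties = {"vessel": {"must_not_have_holes_to_carry_contents": True}}

def inherited_properties(category: str) -> dict:
    """Walk up the is-a hierarchy, collecting properties from every ancestor."""
    collected = {}
    while category is not None:
        collected.update(properties.get(category, {}))
        category = taxonomy.get(category)
    return collected

# A bottle inherits the vessel-level rule, so we can infer it shouldn't have a hole.
print(inherited_properties("bottle"))
# {'must_not_have_holes_to_carry_contents': True}
```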
link |
00:32:15.620
And we say that's great,
link |
00:32:17.260
but it's only a tiny piece of what you need for common sense.
link |
00:32:21.100
We give lots of examples that don't fit into that.
link |
00:32:23.460
So another one that we talk about is a cheese grater.
link |
00:32:26.500
You've got holes in a cheese grater.
link |
00:32:28.040
You've got a handle on top.
link |
00:32:29.500
You can build a model in the game engine sense of a model
link |
00:32:33.380
so that you could have a little cartoon character
link |
00:32:35.820
flying around through the holes of the grater.
link |
00:32:37.980
But we don't have a system yet.
link |
00:32:39.980
Taxonomy doesn't help us that much
link |
00:32:41.620
that really understands why the handle is on top
link |
00:32:43.780
and what you do with the handle,
link |
00:32:45.240
or why all of those circles are sharp,
link |
00:32:47.620
or how you'd hold the cheese with respect to the grater
link |
00:32:50.500
in order to make it actually work.
link |
00:32:52.120
Do you think these ideas are just abstractions
link |
00:32:55.020
that could emerge on a system
link |
00:32:57.140
like a very large deep neural network?
link |
00:32:59.920
I'm a skeptic that that kind of emergence per se can work.
link |
00:33:03.140
So I think that deep learning might play a role
link |
00:33:05.840
in the systems that do what I want systems to do,
link |
00:33:08.760
but it won't do it by itself.
link |
00:33:09.900
I've never seen a deep learning system
link |
00:33:13.140
really extract an abstract concept.
link |
00:33:15.900
There are principled reasons for that,
link |
00:33:18.820
stemming from how back propagation works,
link |
00:33:20.540
how the architectures are set up.
link |
00:33:22.920
One example is deep learning people
link |
00:33:25.120
actually all build in something called convolution,
link |
00:33:29.620
which Yann LeCun is famous for, which is an abstraction.
link |
00:33:33.180
They don't have their systems learn this.
link |
00:33:34.960
So the abstraction is an object that looks the same
link |
00:33:37.740
if it appears in different places.
link |
00:33:39.220
And what LeCun figured out, and why,
link |
00:33:41.940
essentially why he was a co winner of the Turing Award
link |
00:33:44.300
was that if you programmed this in innately,
link |
00:33:47.620
then your system would be a whole lot more efficient.
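A toy sketch of the invariance that convolution builds in; the one-dimensional signal and edge-detecting kernel are illustrative assumptions:

```python
# One shared set of weights (the kernel) responds to the same local pattern
# wherever it appears, so the network doesn't have to relearn "edge at position 7"
# separately from "edge at position 42". Pure Python, illustrative only.

def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

edge_kernel = [-1.0, 1.0]                # responds to a step up in the signal
signal_a = [0, 0, 0, 1, 1, 1, 0, 0]      # step near the start
signal_b = [0, 0, 0, 0, 0, 0, 1, 1]      # same step, shifted right

print(conv1d(signal_a, edge_kernel))     # peak where the step occurs
print(conv1d(signal_b, edge_kernel))     # same peak value, different position
# A fully connected layer has separate weights per position and would have to
# learn this equivalence from data instead of getting it for free.
```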
link |
00:33:50.680
In principle, this should be learnable,
link |
00:33:53.220
but people don't have systems that kind of reify things
link |
00:33:56.220
and make them more abstract.
link |
00:33:58.000
And so what you'd really wind up with
link |
00:34:00.420
if you don't program that in advance is a system
link |
00:34:02.700
that kind of realizes that this is the same thing as this,
link |
00:34:05.460
but then I take your little clock there
link |
00:34:06.980
and I move it over and it doesn't realize
link |
00:34:08.380
that the same thing applies to the clock.
link |
00:34:10.460
So the really nice thing, you're right,
link |
00:34:12.680
that convolution is just one of the things
link |
00:34:14.760
that's like, it's an innate feature
link |
00:34:17.160
that's programmed by the human expert.
link |
00:34:19.460
We need more of those, not less.
link |
00:34:21.260
Yes, but the nice feature is it feels like
link |
00:34:24.420
that requires coming up with that brilliant idea,
link |
00:34:28.200
can get you a Turing Award,
link |
00:34:29.780
but it requires less effort than encoding
link |
00:34:34.780
in something we'll talk about, the expert system approach.
link |
00:34:36.620
So encoding a lot of knowledge by hand.
link |
00:34:40.020
So it feels like there's a huge amount of limitations
link |
00:34:43.500
which you clearly outline with deep learning,
link |
00:34:46.500
but the nice feature of deep learning,
link |
00:34:47.820
whatever it is able to accomplish,
link |
00:34:49.600
it does a lot of stuff automatically
link |
00:34:53.500
without human intervention.
link |
00:34:54.900
Well, and that's part of why people love it, right?
link |
00:34:57.100
But I always think of this quote from Bertrand Russell,
link |
00:34:59.820
which is it has all the advantages
link |
00:35:02.740
of theft over honest toil.
link |
00:35:04.420
It's really hard to program into a machine
link |
00:35:08.140
a notion of causality or even how a bottle works
link |
00:35:11.300
or what containers are.
link |
00:35:12.640
Ernie Davis and I wrote a, I don't know,
link |
00:35:14.260
45 page academic paper trying just to understand
link |
00:35:17.980
what a container is,
link |
00:35:18.980
which I don't think anybody ever read the paper,
link |
00:35:21.100
but it's a very detailed analysis of all the things,
link |
00:35:25.260
well, not even all of it,
link |
00:35:26.100
some of the things you need to do
link |
00:35:27.140
in order to understand a container.
link |
00:35:28.580
To be fair,
link |
00:35:30.060
and I'm a coauthor on the paper,
link |
00:35:32.200
I made it a little bit better,
link |
00:35:33.180
but Ernie did the hard work for that particular paper.
link |
00:35:36.620
And it took him like three months
link |
00:35:38.060
to get the logical statements correct.
link |
00:35:40.660
And maybe that's not the right way to do it,
link |
00:35:42.860
it's a way to do it.
link |
00:35:44.100
But on that way of doing it,
link |
00:35:46.140
it's really hard work to do something
link |
00:35:48.440
as simple as understanding containers.
link |
00:35:50.220
And nobody wants to do that hard work,
link |
00:35:52.820
even Ernie didn't want to do that hard work.
link |
00:35:55.600
Everybody would rather just, like, feed their system
link |
00:35:58.380
with a bunch of videos with a bunch of containers
link |
00:36:00.340
and have the systems infer how containers work.
link |
00:36:03.820
It would be like so much less effort,
link |
00:36:05.420
let the machine do the work.
link |
00:36:06.820
And so I understand the impulse,
link |
00:36:08.220
I understand why people want to do that.
link |
00:36:10.220
I just don't think that it works.
link |
00:36:11.860
I've never seen anybody build a system
link |
00:36:14.580
that in a robust way can actually watch videos
link |
00:36:18.700
and predict exactly which containers would leak
link |
00:36:21.300
and which ones wouldn't, or something like that,
link |
00:36:23.540
and I know someone's gonna go out and do that
link |
00:36:25.060
since I said it, and I look forward to seeing it.
link |
00:36:28.100
But getting these things to work robustly
link |
00:36:30.540
is really, really hard.
link |
00:36:32.900
So Yann LeCun, who was my colleague at NYU for many years,
link |
00:36:37.740
thinks that the hard work should go into defining
link |
00:36:40.760
an unsupervised learning algorithm
link |
00:36:43.180
that will watch videos, use the next frame basically
link |
00:36:46.680
in order to tell it what's going on.
link |
00:36:48.540
And he thinks that's the royal road
link |
00:36:49.940
and he's willing to put in the work
link |
00:36:51.260
in devising that algorithm.
link |
00:36:53.300
Then he wants the machine to do the rest.
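A minimal sketch of the kind of objective being described: use the next frame of a video as the training signal, with no human labels. The toy model, frame size, and training details below are illustrative assumptions in PyTorch, not a description of any published system.

import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    def __init__(self, frame_dim=64 * 64):
        super().__init__()
        # Toy encoder/decoder over flattened grayscale frames.
        self.net = nn.Sequential(
            nn.Linear(frame_dim, 512), nn.ReLU(),
            nn.Linear(512, frame_dim),
        )
    def forward(self, frame):
        return self.net(frame)

model = NextFramePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

video = torch.rand(100, 64 * 64)   # random data standing in for real video frames
for step in range(200):
    t = int(torch.randint(0, len(video) - 1, (1,)))
    prediction = model(video[t])              # predict the next frame...
    loss = loss_fn(prediction, video[t + 1])  # ...supervised by the video itself
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()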
link |
00:36:55.580
And again, I understand the impulse.
link |
00:36:57.820
My intuition, based on years of watching this stuff
link |
00:37:01.700
and making predictions 20 years ago that still hold
link |
00:37:03.940
even though there's a lot more computation and so forth,
link |
00:37:06.500
is that we actually have to do
link |
00:37:07.460
a different kind of hard work,
link |
00:37:08.520
which is more like building a design specification
link |
00:37:11.320
for what we want the system to do,
link |
00:37:13.100
doing hard engineering work to figure out
link |
00:37:15.060
how we do things like what Yann did for convolution
link |
00:37:18.420
in order to figure out how to encode complex knowledge
link |
00:37:21.660
into the systems.
link |
00:37:22.620
The current systems don't have that much knowledge
link |
00:37:25.340
other than convolution, which is again,
link |
00:37:27.580
this idea of objects being in different places
link |
00:37:30.500
and having the same perception, I guess I'll say.
link |
00:37:34.460
Same appearance.
link |
00:37:36.740
People don't want to do that work.
link |
00:37:38.260
They don't see how to naturally fit one with the other.
link |
00:37:41.580
I think that's, yes, absolutely.
link |
00:37:43.300
But also on the expert system side,
link |
00:37:45.540
there's a temptation to go too far the other way.
link |
00:37:47.620
So we're just having an expert sort of sit down
link |
00:37:49.860
and encode the description,
link |
00:37:51.940
the framework for what a container is,
link |
00:37:54.060
and then having the system reason the rest.
link |
00:37:56.540
From my view, one really exciting possibility
link |
00:37:59.260
is of active learning where it's continuous interaction
link |
00:38:02.180
between a human and machine.
link |
00:38:04.080
As the machine learns, there's kind of deep learning type
link |
00:38:07.060
extraction of information from data patterns and so on,
link |
00:38:10.120
but with humans also guiding the learning procedures,
link |
00:38:14.660
guiding both the process and the framework
link |
00:38:19.940
of how the machine learns, whatever the task is.
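As a rough sketch of that loop, under toy assumptions in Python with scikit-learn: the model learns from whatever has been labeled so far, and the human (an oracle function here) is repeatedly asked about the examples the model is least certain of. All names and numbers are made up for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(1000, 5))              # unlabeled pool of examples
true_w = rng.normal(size=5)
def human_oracle(X):                             # stands in for the human teacher
    return (X @ true_w > 0).astype(int)

labeled = list(range(20))                        # a small seed set of labels
y = human_oracle(X_pool[labeled])

model = LogisticRegression()
for round_ in range(20):
    model.fit(X_pool[labeled], y)
    probs = model.predict_proba(X_pool)[:, 1]
    uncertainty = np.abs(probs - 0.5)            # 0 means maximally uncertain
    uncertainty[labeled] = np.inf                # skip what the human already labeled
    query = int(np.argmin(uncertainty))          # ask about the hardest case
    labeled.append(query)
    y = np.append(y, human_oracle(X_pool[[query]]))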
link |
00:38:22.100
I was with you with almost everything you said
link |
00:38:24.100
except the phrase deep learning.
link |
00:38:26.460
What I think you really want there
link |
00:38:28.180
is a new form of machine learning.
link |
00:38:30.500
So let's remember, deep learning is a particular way
link |
00:38:32.980
of doing machine learning.
link |
00:38:33.980
Most often it's done with supervised data
link |
00:38:36.980
for perceptual categories.
link |
00:38:38.820
There are other things you can do with deep learning,
link |
00:38:41.780
some of them quite technical,
link |
00:38:42.740
but the standard use of deep learning
link |
00:38:44.600
is I have a lot of examples and I have labels for them.
link |
00:38:47.600
So here are pictures.
link |
00:38:48.820
This one's the Eiffel Tower.
link |
00:38:50.380
This one's the Sears Tower.
link |
00:38:51.660
This one's the Empire State Building.
link |
00:38:53.320
This one's a cat.
link |
00:38:54.160
This one's a pig and so forth.
link |
00:38:55.180
You just get millions of examples, millions of labels,
link |
00:38:58.900
and deep learning is extremely good at that.
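That recipe, reduced to a few lines with random tensors standing in for the millions of labeled photos, looks roughly like this (an illustrative PyTorch sketch, not any particular system):

import torch
import torch.nn as nn

num_classes = 5                                  # e.g. tower / cat / pig / ...
images = torch.rand(1000, 3 * 32 * 32)           # stand-ins for labeled pictures
labels = torch.randint(0, num_classes, (1000,))  # the human-supplied labels

classifier = nn.Sequential(
    nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
    nn.Linear(256, num_classes),
)
opt = torch.optim.SGD(classifier.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    logits = classifier(images)
    loss = loss_fn(logits, labels)   # all it is asked to do: fit the labels
    opt.zero_grad()
    loss.backward()
    opt.step()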
link |
00:39:01.220
It's better than any other solution that anybody has devised,
link |
00:39:04.460
but it is not good at representing abstract knowledge.
link |
00:39:07.380
It's not good at representing things
link |
00:39:09.380
like bottles contain liquid and have tops to them
link |
00:39:13.980
and so forth.
link |
00:39:14.820
It's not very good at learning
link |
00:39:15.860
or representing that kind of knowledge.
link |
00:39:17.860
It is an example of having a machine learn something,
link |
00:39:21.300
but it's a machine that learns a particular kind of thing,
link |
00:39:23.900
which is object classification.
link |
00:39:25.540
It's not a particularly good algorithm for learning
link |
00:39:28.580
about the abstractions that govern our world.
link |
00:39:30.780
There may be such a thing.
link |
00:39:33.080
Part of what we counsel in the book
link |
00:39:34.300
is maybe people should be working on devising such things.
link |
00:39:36.980
So one possibility, just I wonder what you think about it,
link |
00:39:40.580
is that deep neural networks do form abstractions,
link |
00:39:45.180
but they're not accessible to us humans
link |
00:39:48.500
in terms of we can't.
link |
00:39:49.340
There's some truth in that.
link |
00:39:50.780
So is it possible that either current or future
link |
00:39:54.180
neural networks form very high level abstractions,
link |
00:39:56.520
which are as powerful as our human abstractions
link |
00:40:01.820
of common sense.
link |
00:40:02.660
We just can't get a hold of them.
link |
00:40:04.900
And so the problem is essentially
link |
00:40:06.620
we need to make them explainable.
link |
00:40:09.220
This is an astute question,
link |
00:40:10.640
but I think the answer is at least partly no.
link |
00:40:13.080
One of the kinds of classical neural network architecture
link |
00:40:16.060
is what we call an auto associator.
link |
00:40:17.620
It just tries to take an input,
link |
00:40:20.140
goes through a set of hidden layers,
link |
00:40:21.500
and comes out with an output.
link |
00:40:23.040
And it's supposed to learn essentially
link |
00:40:24.420
the identity function,
link |
00:40:25.460
that your input is the same as your output.
link |
00:40:27.260
So you think of it as binary numbers.
link |
00:40:28.460
You've got the one, the two, the four, the eight,
link |
00:40:30.660
the 16, and so forth.
link |
00:40:32.180
And so if you want to input 24,
link |
00:40:33.940
you turn on the 16, you turn on the eight.
link |
00:40:35.860
It's like binary one, one, and a bunch of zeros.
link |
00:40:38.940
So I did some experiments in 1998
link |
00:40:41.620
with the precursors of contemporary deep learning.
link |
00:40:46.620
And what I showed was you could train these networks
link |
00:40:50.460
on all the even numbers,
link |
00:40:52.060
and they would never generalize to the odd number.
link |
00:40:54.620
A lot of people thought that I was, I don't know,
link |
00:40:56.700
an idiot or faking the experiment,
link |
00:40:58.460
or it wasn't true or whatever.
link |
00:41:00.100
But it is true that with this class of networks
link |
00:41:03.260
that we had in that day,
link |
00:41:04.860
that they would never ever make this generalization.
link |
00:41:07.140
And it's not that the networks were stupid,
link |
00:41:09.660
it's that they see the world in a different way than we do.
link |
00:41:13.380
They were basically concerned,
link |
00:41:14.720
what is the probability that the rightmost output node
link |
00:41:18.580
is going to be one?
link |
00:41:19.980
And as far as they were concerned,
link |
00:41:21.220
in everything they'd ever been trained on, it was a zero.
link |
00:41:24.420
That node had never been turned on,
link |
00:41:27.020
and so they figured, why turn it on now?
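A small reconstruction of that experiment under assumed details (layer sizes, training length, and so on are guesses; only the setup follows the description: an autoassociator trained to reproduce its input, shown only even numbers in binary):

import torch
import torch.nn as nn

def to_bits(n, width=8):
    # index 0 is the lowest-order (rightmost) bit
    return torch.tensor([(n >> i) & 1 for i in range(width)], dtype=torch.float)

train = torch.stack([to_bits(n) for n in range(0, 256, 2)])   # even numbers only
net = nn.Sequential(nn.Linear(8, 16), nn.Sigmoid(),
                    nn.Linear(16, 8), nn.Sigmoid())
opt = torch.optim.Adam(net.parameters(), lr=0.01)

for step in range(2000):
    loss = nn.functional.mse_loss(net(train), train)          # learn the identity map
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(to_bits(37)).round())   # an odd number: the lowest-order output unit
                                  # tends to stay near 0, since training never
                                  # required it to turn on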
link |
00:41:28.960
Whereas a person would look at the same problem and say,
link |
00:41:30.940
well, it's obvious,
link |
00:41:31.780
we're just doing the thing that corresponds.
link |
00:41:33.780
The Latin for it is mutatis mutandis,
link |
00:41:35.500
we'll change what needs to be changed.
link |
00:41:38.220
And we do this, this is what algebra is.
link |
00:41:40.500
So I can do f of x equals y plus two,
link |
00:41:43.840
and I can do it for a couple of values,
link |
00:41:45.380
I can tell you if y is three,
link |
00:41:46.500
then x is five, and if y is four, x is six.
link |
00:41:49.140
And now I can do it with some totally different number,
link |
00:41:50.980
like a million, then you can say,
link |
00:41:51.980
well, obviously it's a million and two,
link |
00:41:53.140
because you have an algebraic operation
link |
00:41:55.620
that you're applying to a variable.
link |
00:41:57.460
And deep learning systems kind of emulate that,
link |
00:42:00.620
but they don't actually do it.
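Stated as code, the contrast is roughly this (the nearest-neighbor lookup is only a crude stand-in for a learner that interpolates over what it has seen):

def f(y):
    return y + 2                   # the rule itself carries the generalization

print(f(3), f(4), f(1_000_000))    # 5, 6, 1000002 -- works for any value

train = {3: 5, 4: 6}               # a purely example-driven "learner"
def lookup(y):
    nearest = min(train, key=lambda k: abs(k - y))
    return train[nearest]

print(lookup(1_000_000))           # 6 -- it can only echo what it was shown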
link |
00:42:02.500
The particular example,
link |
00:42:04.140
you could fudge a solution to that particular problem.
link |
00:42:08.140
The general form of that problem remains,
link |
00:42:10.500
that what they learn is really correlations
link |
00:42:12.400
between different input and output nodes.
link |
00:42:14.180
And they're complex correlations
link |
00:42:16.140
with multiple nodes involved and so forth.
link |
00:42:18.780
Ultimately, they're correlative,
link |
00:42:20.260
they're not structured over these operations over variables.
link |
00:42:23.060
Now, someday, people may do a new form of deep learning
link |
00:42:25.960
that incorporates that stuff,
link |
00:42:27.300
and I think it will help a lot.
link |
00:42:28.460
And there's some tentative work on things
link |
00:42:30.260
like differentiable programming right now
link |
00:42:32.180
that fall into that category.
link |
00:42:34.240
But the sort of classic stuff
link |
00:42:35.500
like people use for ImageNet doesn't have it.
link |
00:42:38.780
And you have people like Hinton going around saying,
link |
00:42:41.060
symbol manipulation, like what Marcus,
link |
00:42:42.860
what I advocate is like the gasoline engine.
link |
00:42:45.680
It's obsolete.
link |
00:42:46.520
We should just use this cool electric power
link |
00:42:48.820
that we've got with the deep learning.
link |
00:42:50.320
And that's really destructive,
link |
00:42:51.980
because we really do need to have the gasoline engine stuff
link |
00:42:55.900
that represents, I mean, I don't think it's a good analogy,
link |
00:42:59.580
but we really do need to have the stuff
link |
00:43:02.180
that represents symbols.
link |
00:43:03.660
Yeah, and Hinton as well would say
link |
00:43:06.200
that we do need to throw out everything and start over.
link |
00:43:08.960
Hinton said that to Axios,
link |
00:43:12.820
and I had a friend who interviewed him
link |
00:43:15.540
and tried to pin him down
link |
00:43:16.460
on what exactly we need to throw out,
link |
00:43:17.820
and he was very evasive.
link |
00:43:19.900
Well, of course, because we can't, if he knew.
link |
00:43:22.700
Then he'd throw it out himself.
link |
00:43:23.940
But I mean, you can't have it both ways.
link |
00:43:25.400
You can't be like, I don't know what to throw out,
link |
00:43:27.520
but I am gonna throw out the symbols.
link |
00:43:29.980
I mean, and not just the symbols,
link |
00:43:32.140
but the variables and the operations over variables.
link |
00:43:34.100
Don't forget, the operations over variables,
link |
00:43:36.140
the stuff that I'm endorsing
link |
00:43:37.740
and which John McCarthy did when he founded AI,
link |
00:43:41.500
that stuff is the stuff
link |
00:43:42.660
that we build most computers out of.
link |
00:43:44.180
There are people now who say,
link |
00:43:45.460
we don't need computer programmers anymore.
link |
00:43:48.780
They're not quite looking at the statistics
link |
00:43:50.240
of how much computer programmers
link |
00:43:51.180
actually get paid right now.
link |
00:43:52.980
We need lots of computer programmers,
link |
00:43:54.380
and most of them, they do a little bit of machine learning,
link |
00:43:57.780
but they still do a lot of code, right?
link |
00:43:59.900
Code where it's like, if the value of X
link |
00:44:02.220
is greater than the value of Y,
link |
00:44:03.580
then do this kind of thing,
link |
00:44:04.500
like conditionals and comparing operations over variables.
link |
00:44:08.100
Like, there's this fantasy you can machine learn anything.
link |
00:44:10.220
There's some things you would never wanna machine learn.
link |
00:44:12.580
I would not use a phone operating system
link |
00:44:14.980
that was machine learned.
link |
00:44:16.100
Like, you made a bunch of phone calls
link |
00:44:17.820
and you recorded which packets were transmitted
link |
00:44:19.740
and you just machine learned it, it'd be insane.
link |
00:44:22.500
Or to build a web browser by taking logs of keystrokes
link |
00:44:27.500
and images, screenshots,
link |
00:44:29.420
and then trying to learn the relation between them.
link |
00:44:31.500
Nobody would ever,
link |
00:44:32.860
no rational person would ever try to build a browser
link |
00:44:35.100
that way. They would use symbol manipulation,
link |
00:44:37.460
the stuff that I think AI needs to avail itself of
link |
00:44:40.140
in addition to deep learning.
link |
00:44:42.140
Can you describe your view of symbol manipulation
link |
00:44:46.540
in its early days?
link |
00:44:47.920
Can you describe expert systems
link |
00:44:49.540
and where do you think they hit a wall
link |
00:44:52.540
or a set of challenges?
link |
00:44:53.940
Sure, so I mean, first I just wanna clarify,
link |
00:44:56.580
I'm not endorsing expert systems per se.
link |
00:44:58.940
You've been kind of contrasting them.
link |
00:45:00.760
There is a contrast,
link |
00:45:01.600
but that's not the thing that I'm endorsing.
link |
00:45:04.220
So expert systems tried to capture things
link |
00:45:06.500
like medical knowledge with a large set of rules.
link |
00:45:09.460
So if the patient has this symptom and this other symptom,
link |
00:45:12.860
then it is likely that they have this disease.
link |
00:45:15.700
So there are logical rules
link |
00:45:16.860
and they were symbol manipulating rules
link |
00:45:18.340
of just the sort that I'm talking about.
link |
00:45:20.980
And the problem.
link |
00:45:21.820
They encode a set of knowledge that the experts then put in.
link |
00:45:24.980
And very explicitly so.
link |
00:45:26.260
So you'd have somebody interview an expert
link |
00:45:28.780
and then try to turn that stuff into rules.
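For flavor, a toy version of that kind of hand-coded, rule-based system (the rules here are invented for illustration, not medical knowledge):

RULES = [
    ({"fever", "cough", "shortness of breath"}, "possible pneumonia"),
    ({"fever", "rash"},                         "possible measles"),
    ({"headache", "stiff neck", "fever"},       "possible meningitis"),
]

def diagnose(symptoms):
    symptoms = set(symptoms)
    # fire every rule whose conditions are all present
    return [conclusion for conditions, conclusion in RULES
            if conditions <= symptoms]

print(diagnose(["fever", "cough", "shortness of breath"]))
# ['possible pneumonia']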
link |
00:45:31.940
And at some level I'm arguing for rules.
link |
00:45:33.980
But the difference is those guys did in the 80s
link |
00:45:37.700
was almost entirely rules,
link |
00:45:40.040
almost entirely handwritten with no machine learning.
link |
00:45:42.980
What a lot of people are doing now
link |
00:45:44.340
is almost entirely one species of machine learning
link |
00:45:47.340
with no rules.
link |
00:45:48.260
And what I'm counseling is actually a hybrid.
link |
00:45:50.380
I'm saying that both of these things have their advantage.
link |
00:45:52.900
So if you're talking about perceptual classification,
link |
00:45:55.300
how do I recognize a bottle?
link |
00:45:57.140
Deep learning is the best tool we've got right now.
link |
00:45:59.540
If you're talking about making inferences
link |
00:46:00.940
about what a bottle does,
link |
00:46:02.420
something closer to the expert systems
link |
00:46:04.140
is probably still the best available alternative.
link |
00:46:07.340
And probably we want something that is better able
link |
00:46:09.860
to handle quantitative and statistical information
link |
00:46:12.620
than those classical systems typically were.
link |
00:46:14.940
So we need new technologies
link |
00:46:16.980
that are gonna draw some of the strengths
link |
00:46:18.620
of both the expert systems and the deep learning,
link |
00:46:21.060
but are gonna find new ways to synthesize them.
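One way to picture the hybrid being described, as a sketch with made-up names: a learned perceptual module decides what the object is, and a small symbolic knowledge base reasons about what it does (the classifier below is a stub standing in for a trained deep net):

KNOWLEDGE = {
    "bottle": {"can_contain": "liquid", "has_cap": True},
    "cup":    {"can_contain": "liquid", "has_cap": False},
}

def perceive(image):
    # in a real system, this would be a deep learning classifier
    return "bottle"

def what_if_tipped(image):
    label = perceive(image)          # perception: learned
    facts = KNOWLEDGE[label]         # inference: symbolic
    if facts["has_cap"]:
        return f"the {label} may not spill if its cap is on"
    return f"the {label}'s contents spill"

print(what_if_tipped(image=None))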
link |
00:46:23.260
How hard do you think it is to add knowledge at the low level?
link |
00:46:27.740
So, to mine human intellects to add extra information
link |
00:46:32.140
to symbol manipulating systems?
link |
00:46:36.540
In some domains it's not that hard,
link |
00:46:37.840
but it's often really hard.
link |
00:46:40.100
Partly because a lot of the things that are important,
link |
00:46:44.120
people wouldn't bother to tell you.
link |
00:46:46.060
So if you pay someone on Amazon Mechanical Turk
link |
00:46:49.680
to tell you stuff about bottles,
link |
00:46:52.060
they probably won't even bother to tell you
link |
00:46:55.060
some of the basic level stuff
link |
00:46:57.020
that's just so obvious to a human being
link |
00:46:59.180
and yet so hard to capture in machines.
link |
00:47:04.580
They're gonna tell you more exotic things,
link |
00:47:06.540
and they're all well and good,
link |
00:47:08.940
but they're not getting to the root of the problem.
link |
00:47:12.460
So untutored humans aren't very good at knowing,
link |
00:47:16.540
and why should they be,
link |
00:47:18.340
what kind of knowledge the computer system developers
link |
00:47:22.260
actually need?
link |
00:47:23.460
I don't think that that's an irremediable problem.
link |
00:47:26.620
I think it's historically been a problem.
link |
00:47:28.620
People have had crowdsourcing efforts,
link |
00:47:31.080
and they don't work that well.
link |
00:47:32.060
There's one at MIT, we're recording this at MIT,
link |
00:47:35.300
called Virtual Home, where,
link |
00:47:37.500
and we talk about this in the book,
link |
00:47:39.540
you can find the exact example there,
link |
00:47:40.740
but people were asked to do things
link |
00:47:42.800
like describe an exercise routine.
link |
00:47:44.880
And the things that the people describe
link |
00:47:47.460
are at a very low level
link |
00:47:48.580
and don't really capture what's going on.
link |
00:47:50.100
So they're like, go to the room
link |
00:47:52.340
with the television and the weights,
link |
00:47:54.700
turn on the television,
link |
00:47:56.100
press the remote to turn on the television,
link |
00:47:59.020
lift weight, put weight down, whatever.
link |
00:48:01.440
It's like very micro level,
link |
00:48:03.620
and it's not telling you
link |
00:48:04.900
what an exercise routine is really about,
link |
00:48:06.860
which is like, I wanna fit a certain number of exercises
link |
00:48:09.860
in a certain time period,
link |
00:48:10.940
I wanna emphasize these muscles.
link |
00:48:12.700
You want some kind of abstract description.
link |
00:48:15.060
The fact that you happen to press the remote control
link |
00:48:17.260
in this room when you watch this television
link |
00:48:20.020
isn't really the essence of the exercise routine.
link |
00:48:23.060
But if you just ask people like, what did they do?
link |
00:48:24.780
Then they give you this fine grain.
link |
00:48:26.980
And so it takes a level of expertise
link |
00:48:29.780
about how the AI works
link |
00:48:31.900
in order to craft the right kind of knowledge.
link |
00:48:34.540
So there's this ocean of knowledge that we all operate on.
link |
00:48:37.580
Some of them may not even be conscious,
link |
00:48:39.340
or at least we're not able to communicate it effectively.
link |
00:48:43.300
Yeah, most of it we would recognize if somebody said it,
link |
00:48:45.700
if it was true or not,
link |
00:48:47.420
but we wouldn't think to say that it's true or not.
link |
00:48:49.660
That's a really interesting mathematical property.
link |
00:48:53.060
This ocean has the property
link |
00:48:54.720
that every piece of knowledge in it,
link |
00:48:56.720
we will recognize it as true if we're told,
link |
00:48:59.940
but we're unlikely to retrieve it in the reverse.
link |
00:49:04.140
So that interesting property,
link |
00:49:07.180
I would say there's a huge ocean of that knowledge.
link |
00:49:10.580
What's your intuition?
link |
00:49:11.580
Is it accessible to AI systems somehow?
link |
00:49:14.700
Can we?
link |
00:49:15.940
So you said this.
link |
00:49:16.780
I mean, most of it is not,
link |
00:49:18.780
well, I'll give you an asterisk on this in a second,
link |
00:49:20.540
but most of it has not ever been encoded
link |
00:49:23.260
in machine interpretable form.
link |
00:49:25.660
And so, I mean, if you say accessible,
link |
00:49:27.300
there's two meanings of that.
link |
00:49:28.640
One is like, could you build it into a machine?
link |
00:49:31.540
Yes.
link |
00:49:32.380
The other is like, is there some database
link |
00:49:34.460
that we could go download and stick into our machine?
link |
00:49:38.380
But the first thing, could we?
link |
00:49:40.660
What's your intuition? I think we could.
link |
00:49:42.020
I think it hasn't been done right.
link |
00:49:45.020
You know, the closest, and this is the asterisk,
link |
00:49:47.300
is the CYC system, which tried to do this.
link |
00:49:51.140
A lot of logicians worked for Doug Lenat
link |
00:49:53.020
for 30 years on this project.
link |
00:49:55.460
I think they stuck too closely to logic,
link |
00:49:57.900
didn't represent enough about probabilities,
link |
00:50:00.220
tried to hand code it.
link |
00:50:01.180
There are various issues,
link |
00:50:02.180
and it hasn't been that successful.
link |
00:50:04.480
That is the closest existing system
link |
00:50:08.500
to trying to encode this.
link |
00:50:10.620
Why do you think there's not more excitement
link |
00:50:13.460
slash money behind this idea currently?
link |
00:50:16.420
There was.
link |
00:50:17.260
People view that project as a failure.
link |
00:50:19.180
I think that they confuse the failure
link |
00:50:22.060
of a specific instance that was conceived 30 years ago
link |
00:50:25.100
for the failure of an approach,
link |
00:50:26.180
which they don't do for deep learning.
link |
00:50:28.160
So in 2010, people had the same attitude
link |
00:50:31.940
towards deep learning.
link |
00:50:32.780
They're like, this stuff doesn't really work.
link |
00:50:35.500
And all these other algorithms work better and so forth.
link |
00:50:39.140
And then certain key technical advances were made,
link |
00:50:41.900
but mostly it was the advent
link |
00:50:43.780
of graphics processing units that changed that.
link |
00:50:46.400
It wasn't even anything foundational in the techniques.
link |
00:50:50.060
And there was some new tricks,
link |
00:50:51.220
but mostly it was just more compute and more data,
link |
00:50:55.300
things like ImageNet that didn't exist before
link |
00:50:57.900
that allowed deep learning to work.
link |
00:50:59.020
And it could be
link |
00:51:00.860
that CYC just needs a few more things
link |
00:51:03.780
or something like CYC,
link |
00:51:05.500
but the widespread view is that that just doesn't work.
link |
00:51:08.820
And people are reasoning from a single example.
link |
00:51:11.820
They don't do that with deep learning.
link |
00:51:13.260
They don't say that nothing that existed in 2010,
link |
00:51:16.580
and there were many, many efforts in deep learning
link |
00:51:18.860
was really worth anything.
link |
00:51:20.580
I mean, really, there's no model from 2010
link |
00:51:23.820
in deep learning or the predecessors of deep learning
link |
00:51:26.620
that has any commercial value whatsoever at this point.
link |
00:51:29.660
They're all failures.
link |
00:51:31.540
But that doesn't mean that there wasn't anything there.
link |
00:51:33.500
I have a friend, I was getting to know him,
link |
00:51:35.940
and he said, I had a company too,
link |
00:51:38.820
I was talking about I had a new company.
link |
00:51:40.580
He said, I had a company too, and it failed.
link |
00:51:42.900
And I said, well, what did you do?
link |
00:51:44.260
And he said, deep learning.
link |
00:51:45.660
And the problem was he did it in 1986
link |
00:51:47.940
or something like that.
link |
00:51:48.780
And we didn't have the tools then, or 1990,
link |
00:51:51.060
we didn't have the tools then, not the algorithms.
link |
00:51:53.980
His algorithms weren't that different from modern algorithms,
link |
00:51:56.540
but he didn't have the GPUs to run it fast enough.
link |
00:51:58.420
He didn't have the data.
link |
00:51:59.620
And so it failed.
link |
00:52:01.340
It could be that symbol manipulation per se
link |
00:52:06.820
with modern amounts of data and compute
link |
00:52:09.580
and maybe some advance in compute
link |
00:52:11.940
for that kind of computation might be great.
link |
00:52:14.900
My perspective on it is not that we want to resuscitate
link |
00:52:19.340
that stuff per se, but we want to borrow lessons from it,
link |
00:52:21.540
bring together with other things that we've learned.
link |
00:52:23.380
And it might have an ImageNet moment
link |
00:52:25.900
where it would spark the world's imagination
link |
00:52:28.220
and there'll be an explosion of symbol manipulation efforts.
link |
00:52:31.460
Yeah, I think that people at AI2,
link |
00:52:33.660
Paul Allen's AI Institute, are trying to build data sets.
link |
00:52:39.060
Well, they're not doing it
link |
00:52:39.900
for quite the reason that you say,
link |
00:52:41.100
but they're trying to build data sets
link |
00:52:43.220
that at least spark interest in common sense reasoning.
link |
00:52:45.380
To create benchmarks.
link |
00:52:46.780
Benchmarks for common sense.
link |
00:52:48.220
That's a large part of what the AI2.org
link |
00:52:50.860
is working on right now.
link |
00:52:51.980
So speaking of compute,
link |
00:52:53.260
Rich Sutton wrote a blog post titled Bitter Lesson.
link |
00:52:56.380
I don't know if you've read it,
link |
00:52:57.220
but he said that the biggest lesson that can be read
link |
00:52:59.900
from so many years of AI research
link |
00:53:01.580
is that general methods that leverage computation
link |
00:53:04.180
are ultimately the most effective.
link |
00:53:06.300
Do you think that?
link |
00:53:07.140
The most effective at what?
link |
00:53:08.620
Right, so they have been most effective
link |
00:53:11.820
for perceptual classification problems
link |
00:53:14.500
and for some reinforcement learning problems.
link |
00:53:18.060
And he works on reinforcement learning.
link |
00:53:19.380
Well, no, let me push back on that.
link |
00:53:20.700
You're actually absolutely right.
link |
00:53:22.820
But I would also say they have been most effective generally
link |
00:53:28.060
because everything we've done up to...
link |
00:53:31.500
Would you argue against that?
link |
00:53:32.900
To me, deep learning is the first thing
link |
00:53:36.220
that has been successful at anything in AI.
link |
00:53:41.860
And you're pointing out that this success
link |
00:53:45.300
is very limited in its focus,
link |
00:53:47.100
but has there been something truly successful
link |
00:53:50.260
before deep learning?
link |
00:53:51.660
Sure, I mean, I want to make a larger point,
link |
00:53:54.860
but on the narrower point, classical AI is used,
link |
00:54:00.020
for example, in doing navigation instructions.
link |
00:54:03.660
It's very successful.
link |
00:54:06.020
Everybody on the planet uses it now,
link |
00:54:07.780
like multiple times a day.
link |
00:54:09.420
That's a measure of success, right?
link |
00:54:12.220
So I don't think classical AI was wildly successful,
link |
00:54:16.060
but there are cases like that.
link |
00:54:17.580
They're just used all the time.
link |
00:54:19.140
Nobody even notices them because they're so pervasive.
link |
00:54:23.740
So there are some successes for classical AI.
link |
00:54:26.580
I think deep learning has been more successful,
link |
00:54:28.700
but my usual line about this, and I didn't invent it,
link |
00:54:32.020
but I like it a lot,
link |
00:54:33.060
is just because you can build a better ladder
link |
00:54:34.780
doesn't mean you can build a ladder to the moon.
link |
00:54:37.140
So the bitter lesson is if you have
link |
00:54:39.660
a perceptual classification problem,
link |
00:54:42.220
throwing a lot of data at it is better than anything else.
link |
00:54:45.740
But that has not given us any material progress
link |
00:54:49.980
in natural language understanding,
link |
00:54:51.860
common sense reasoning,
link |
00:54:53.060
like a robot would need to navigate a home.
link |
00:54:56.220
Problems like that, there's no actual progress there.
link |
00:54:59.420
So flip side of that, if we remove data from the picture,
link |
00:55:02.220
another bitter lesson is that you just have
link |
00:55:05.780
a very simple algorithm,
link |
00:55:10.100
and you wait for compute to scale.
link |
00:55:12.500
It doesn't have to be learning.
link |
00:55:13.540
It doesn't have to be deep learning.
link |
00:55:14.580
It doesn't have to be data driven,
link |
00:55:16.420
but just wait for the compute.
link |
00:55:18.220
So my question for you,
link |
00:55:19.060
do you think compute can unlock some of the things
link |
00:55:21.660
with either deep learning or symbol manipulation that?
link |
00:55:25.460
Sure, but I'll put a proviso on that.
link |
00:55:29.780
I think more compute's always better.
link |
00:55:31.940
Nobody's gonna argue with more compute.
link |
00:55:33.660
It's like having more money.
link |
00:55:34.700
I mean, there's the data.
link |
00:55:36.020
There's diminishing returns on more money.
link |
00:55:37.460
Exactly, there's diminishing returns on more money,
link |
00:55:39.740
but nobody's gonna argue
link |
00:55:40.980
if you wanna give them more money, right?
link |
00:55:42.620
Except maybe the people who signed the giving pledge,
link |
00:55:44.620
and some of them have a problem.
link |
00:55:46.300
They've promised to give away more money
link |
00:55:47.980
than they're able to.
link |
00:55:49.660
But the rest of us, if you wanna give me more money, fine.
link |
00:55:52.500
I'm saying more money, more problems, but okay.
link |
00:55:54.580
That's true too.
link |
00:55:55.500
What I would say to you is your brain uses like 20 watts,
link |
00:56:00.100
and it does a lot of things that deep learning doesn't do,
link |
00:56:02.660
or that symbol manipulation doesn't do,
link |
00:56:05.140
that AI just hasn't figured out how to do.
link |
00:56:07.020
So it's an existence proof
link |
00:56:09.100
that you don't need server resources
link |
00:56:12.140
that are Google scale in order to have an intelligence.
link |
00:56:16.460
I built, with a lot of help from my wife,
link |
00:56:18.900
two intelligences that are 20 watts each,
link |
00:56:21.660
and far exceed anything that anybody else
link |
00:56:25.060
has built out of silicon.
link |
00:56:26.780
Speaking of those two robots,
link |
00:56:30.020
what have you learned about AI from having?
link |
00:56:33.260
Well, they're not robots, but.
link |
00:56:35.300
Sorry, intelligent agents.
link |
00:56:36.740
Those two intelligent agents.
link |
00:56:38.140
I've learned a lot by watching my two intelligent agents.
link |
00:56:42.780
I think that what's fundamentally interesting,
link |
00:56:45.820
well, one of the many things
link |
00:56:46.980
that's fundamentally interesting about them
link |
00:56:48.660
is the way that they set their own problems to solve.
link |
00:56:51.940
So my two kids are a year and a half apart.
link |
00:56:54.540
They're both five and six and a half.
link |
00:56:56.420
They play together all the time,
link |
00:56:58.180
and they're constantly creating new challenges.
link |
00:57:00.940
That's what they do, is they make up games,
link |
00:57:03.780
and they're like, well, what if this, or what if that,
link |
00:57:05.940
or what if I had this superpower,
link |
00:57:07.860
or what if you could walk through this wall?
link |
00:57:10.340
So they're doing these what if scenarios all the time,
link |
00:57:14.020
and that's how they learn something about the world
link |
00:57:17.540
and grow their minds, and machines don't really do that.
link |
00:57:22.580
So that's interesting, and you've talked about this,
link |
00:57:24.460
you've written about it, you've thought about it,
link |
00:57:26.100
nature versus nurture.
link |
00:57:29.260
So what innate knowledge do you think we're born with,
link |
00:57:33.580
and what do we learn along the way
link |
00:57:35.540
in those early months and years?
link |
00:57:38.260
Can I just say how much I like that question?
link |
00:57:41.540
You phrased it just right, and almost nobody ever does,
link |
00:57:45.780
which is what is the innate knowledge
link |
00:57:47.220
and what's learned along the way?
link |
00:57:49.180
So many people dichotomize it,
link |
00:57:51.180
and they think it's nature versus nurture,
link |
00:57:53.380
when it is obviously has to be nature and nurture.
link |
00:57:56.740
They have to work together.
link |
00:57:58.540
You can't learn this stuff along the way
link |
00:58:00.500
unless you have some innate stuff,
link |
00:58:02.340
but just because you have the innate stuff
link |
00:58:03.860
doesn't mean you don't learn anything.
link |
00:58:05.820
And so many people get that wrong, including in the field.
link |
00:58:09.340
People think if I work in machine learning,
link |
00:58:12.220
the learning side, I must not be allowed to work
link |
00:58:15.260
on the innate side, or that will be cheating.
link |
00:58:17.300
Exactly, people have said that to me,
link |
00:58:19.620
and it's just absurd, so thank you.
link |
00:58:23.380
But you could break that apart more.
link |
00:58:25.140
I've talked to folks who studied
link |
00:58:26.540
the development of the brain,
link |
00:58:28.260
and the growth of the brain in the first few days
link |
00:58:32.940
in the first few months in the womb,
link |
00:58:35.660
all of that, is that innate?
link |
00:58:39.500
So that process of development from a stem cell
link |
00:58:42.300
to the growth of the central nervous system and so on,
link |
00:58:46.020
to the information that's encoded
link |
00:58:49.300
through the long arc of evolution.
link |
00:58:52.300
So all of that comes into play, and it's unclear.
link |
00:58:55.300
It's not just whether it's a dichotomy or not.
link |
00:58:57.340
It's where most, or where the knowledge is encoded.
link |
00:59:02.060
So what's your intuition about the innate knowledge,
link |
00:59:07.780
the power of it, what's contained in it,
link |
00:59:09.700
what can we learn from it?
link |
00:59:11.340
One of my earlier books was actually trying
link |
00:59:12.740
to understand the biology of this.
link |
00:59:14.020
The book was called The Birth of the Mind.
link |
00:59:15.860
Like how is it the genes even build innate knowledge?
link |
00:59:18.900
And from the perspective of the conversation
link |
00:59:21.460
we're having today, there's actually two questions.
link |
00:59:23.580
One is what innate knowledge or mechanisms,
link |
00:59:26.460
or what have you, people or other animals
link |
00:59:29.660
might be endowed with.
link |
00:59:30.900
I always like showing this video
link |
00:59:32.260
of a baby ibex climbing down a mountain.
link |
00:59:34.620
That baby ibex, a few hours after its birth,
link |
00:59:37.380
knows how to climb down a mountain.
link |
00:59:38.420
That means that it knows, not consciously,
link |
00:59:40.940
something about its own body and physics
link |
00:59:43.020
and 3D geometry and all of this kind of stuff.
link |
00:59:47.500
So there's one question about what does biology
link |
00:59:49.660
give its creatures and what has evolved in our brains?
link |
00:59:53.220
How is that represented in our brains?
link |
00:59:54.940
The question I thought about in the book
link |
00:59:56.180
The Birth of the Mind.
link |
00:59:57.340
And then there's a question of what AI should have.
link |
00:59:59.300
And they don't have to be the same.
link |
01:00:01.540
But I would say that it's a pretty interesting
link |
01:00:06.940
set of things that we are equipped with
link |
01:00:08.660
that allows us to do a lot of interesting things.
link |
01:00:10.500
So I would argue or guess, based on my reading
link |
01:00:13.740
of the developmental psychology literature,
link |
01:00:15.220
which I've also participated in,
link |
01:00:17.980
that children are born with a notion of space,
link |
01:00:21.740
time, other agents, places,
link |
01:00:25.740
and also this kind of mental algebra
link |
01:00:27.620
that I was describing before.
link |
01:00:30.220
And certainly causation, if I didn't just say that.
link |
01:00:33.060
So at least those kinds of things.
link |
01:00:35.220
They're like frameworks for learning the other things.
link |
01:00:38.940
Are they disjoint in your view
link |
01:00:40.340
or is it just somehow all connected?
link |
01:00:42.860
You've talked a lot about language.
link |
01:00:44.340
Is it all kind of connected in some mesh
link |
01:00:47.940
that's language like?
link |
01:00:50.260
If understanding concepts all together or?
link |
01:00:52.740
I don't think we know for people how they're represented
link |
01:00:55.740
and machines just don't really do this yet.
link |
01:00:58.180
So I think it's an interesting open question
link |
01:01:00.540
both for science and for engineering.
link |
01:01:03.540
Some of it has to be at least interrelated
link |
01:01:06.340
in the way that the interfaces of a software package
link |
01:01:10.180
have to be able to talk to one another.
link |
01:01:12.140
So the systems that represent space and time
link |
01:01:16.620
can't be totally disjoint because a lot of the things
link |
01:01:19.820
that we reason about are the relations
link |
01:01:21.500
between space and time and cause.
link |
01:01:22.980
So I put this on and I have expectations
link |
01:01:26.460
about what's gonna happen with the bottle cap
link |
01:01:28.180
on top of the bottle and those span space and time.
link |
01:01:32.540
If the cap is over here, I get a different outcome.
link |
01:01:35.740
If the timing is different, if I put this here,
link |
01:01:38.540
after I move that, then I get a different outcome.
link |
01:01:41.900
That relates to causality.
link |
01:01:43.060
So obviously these mechanisms, whatever they are,
link |
01:01:47.840
can certainly communicate with each other.
link |
01:01:50.100
So I think evolution had a significant role
link |
01:01:53.180
to play in the development of this whole kluge, right?
link |
01:01:57.100
How efficient do you think is evolution?
link |
01:01:59.220
Oh, it's terribly inefficient except that.
link |
01:02:01.620
Okay, well, can we do better?
link |
01:02:03.980
Well, I'll come to that in a sec.
link |
01:02:05.740
It's inefficient except that.
link |
01:02:08.100
Once it gets a good idea, it runs with it.
link |
01:02:10.900
So it took, I guess,
link |
01:02:15.660
roughly a billion years to evolve
link |
01:02:20.420
to a vertebrate brain plan.
link |
01:02:24.040
And once that vertebrate brain plan evolved,
link |
01:02:26.920
it spread everywhere.
link |
01:02:28.480
So fish have it and dogs have it and we have it.
link |
01:02:31.700
We have adaptations of it and specializations of it,
link |
01:02:34.140
but, and the same thing with a primate brain plan.
link |
01:02:37.160
So monkeys have it and apes have it and we have it.
link |
01:02:41.100
So there are additional innovations like color vision
link |
01:02:43.780
and those spread really rapidly.
link |
01:02:45.860
So it takes evolution a long time to get a good idea,
link |
01:02:48.820
but, and I'm being anthropomorphic and not literal here,
link |
01:02:53.300
but once it has that idea, so to speak,
link |
01:02:55.580
which cashes out into one set of genes or in the genome,
link |
01:02:58.540
those genes spread very rapidly
link |
01:03:00.420
and they're like subroutines or libraries,
link |
01:03:02.660
I guess the word people might use nowadays
link |
01:03:04.540
or be more familiar with.
link |
01:03:05.620
They're libraries that get used over and over again.
link |
01:03:08.780
So once you have the library for building something
link |
01:03:11.740
with multiple digits, you can use it for a hand,
link |
01:03:13.840
but you can also use it for a foot.
link |
01:03:15.540
You just kind of reuse the library
link |
01:03:17.420
with slightly different parameters.
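The subroutine metaphor, written out literally (purely an illustration of the analogy, nothing biological about it):

def build_appendage(num_digits, digit_length_cm, opposable_first_digit):
    # one "library" routine, reused with different parameters
    return [{"length_cm": digit_length_cm,
             "opposable": opposable_first_digit and i == 0}
            for i in range(num_digits)]

hand = build_appendage(5, 7.0, opposable_first_digit=True)
foot = build_appendage(5, 4.0, opposable_first_digit=False)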
link |
01:03:19.080
Evolution does a lot of that,
link |
01:03:20.660
which means that the speed over time picks up.
link |
01:03:23.500
So evolution can happen faster
link |
01:03:25.560
because you have bigger and bigger libraries.
link |
01:03:28.380
And what I think has happened in attempts
link |
01:03:32.220
at evolutionary computation is that people start
link |
01:03:35.740
with libraries that are very, very minimal,
link |
01:03:40.340
like almost nothing, and then progress is slow
link |
01:03:44.260
and it's hard for someone to get a good PhD thesis
link |
01:03:46.620
out of it and they give up.
link |
01:03:48.260
If we had richer libraries to begin with,
link |
01:03:50.260
if you were evolving from systems
link |
01:03:52.580
that had a rich innate structure to begin with,
link |
01:03:55.320
then things might speed up.
link |
01:03:56.780
Or more PhD students, if the evolutionary process
link |
01:03:59.900
is indeed in a meta way runs away with good ideas,
link |
01:04:04.260
you need to have a lot of ideas,
link |
01:04:06.740
pool of ideas in order for it to discover one
link |
01:04:08.820
that you can run away with.
link |
01:04:10.260
And PhD students representing individual ideas as well.
link |
01:04:13.220
Yeah, I mean, you could throw
link |
01:04:14.340
a billion PhD students at it.
link |
01:04:16.220
Yeah, the monkeys at typewriters writing Shakespeare, yep.
link |
01:04:20.180
Well, I mean, those aren't cumulative, right?
link |
01:04:22.060
That's just random.
link |
01:04:23.420
And part of the point that I'm making
link |
01:04:24.940
is that evolution is cumulative.
link |
01:04:26.780
So if you have a billion monkeys independently,
link |
01:04:31.140
you don't really get anywhere.
link |
01:04:32.420
But if you have a billion monkeys,
link |
01:04:33.820
and I think Dawkins made this point originally,
link |
01:04:35.700
or probably other people, Dawkins made it very nice
link |
01:04:37.580
and either a selfish gene or blind watchmaker.
link |
01:04:40.420
If there is some sort of fitness function
link |
01:04:44.060
that can drive you towards something,
link |
01:04:45.860
I guess that's Dawkins point.
link |
01:04:47.060
And my point, which is a variation on that,
link |
01:04:49.420
is that if the evolution is cumulative,
link |
01:04:51.940
I mean, the related points,
link |
01:04:53.820
then you can start going faster.
link |
01:04:55.600
Do you think something like the process of evolution
link |
01:04:57.760
is required to build intelligent systems?
link |
01:05:00.180
So if we... Not logically.
link |
01:05:01.560
So all the stuff that evolution did,
link |
01:05:04.040
a good engineer might be able to do.
link |
01:05:07.040
So for example, evolution made quadrupeds,
link |
01:05:10.540
which distribute the load across a horizontal surface.
link |
01:05:14.180
A good engineer could come up with that idea.
link |
01:05:16.980
I mean, sometimes good engineers come up with ideas
link |
01:05:18.740
by looking at biology.
link |
01:05:19.760
There's lots of ways to get your ideas.
link |
01:05:22.500
Part of what I'm suggesting
link |
01:05:23.660
is we should look at biology a lot more.
link |
01:05:25.980
We should look at the biology of thought and understanding
link |
01:05:30.180
and the biology by which creatures intuitively reason
link |
01:05:33.480
about physics or other agents,
link |
01:05:35.960
or like how do dogs reason about people?
link |
01:05:37.900
Like they're actually pretty good at it.
link |
01:05:39.620
If we could understand, my colleagues and I joked, dognition,
link |
01:05:44.000
if we could understand dognition well,
link |
01:05:46.280
and how it was implemented, that might help us with our AI.
link |
01:05:49.780
So do you think it's possible
link |
01:05:53.780
that the kind of timescale that evolution took
link |
01:05:57.180
is the kind of timescale that will be needed
link |
01:05:58.940
to build intelligent systems?
link |
01:06:00.500
Or can we significantly accelerate that process
link |
01:06:02.980
inside a computer?
link |
01:06:04.020
I mean, I think the way that we accelerate that process
link |
01:06:07.580
is we borrow from biology, not slavishly,
link |
01:06:12.100
but I think we look at how biology has solved problems
link |
01:06:15.260
and we say, does that inspire
link |
01:06:16.780
any engineering solutions here?
link |
01:06:18.940
Try to mimic biological systems
link |
01:06:20.700
and then therefore have a shortcut.
link |
01:06:22.380
Yeah, I mean, there's a field called biomimicry
link |
01:06:25.020
and people do that for like material science all the time.
link |
01:06:28.980
We should be doing the analog of that for AI
link |
01:06:32.940
and the analog for that for AI
link |
01:06:34.460
is to look at cognitive science or the cognitive sciences,
link |
01:06:37.020
which is psychology, maybe neuroscience, linguistics,
link |
01:06:40.380
and so forth, look to those for insight.
link |
01:06:43.460
What do you think is a good test of intelligence
link |
01:06:45.340
in your view?
link |
01:06:46.180
So I don't think there's one good test.
link |
01:06:48.500
In fact, I tried to organize a movement
link |
01:06:51.780
towards something called a Turing Olympics
link |
01:06:53.380
and my hope is that Francois is actually gonna take,
link |
01:06:56.140
Francois Chollet is gonna take over this.
link |
01:06:58.260
I think he's interested and I don't,
link |
01:06:59.940
I just don't have place in my busy life at this moment,
link |
01:07:03.500
but the notion is that there'd be many tests
link |
01:07:06.460
and not just one because intelligence is multifaceted.
link |
01:07:09.500
There can't really be a single measure of it
link |
01:07:12.900
because it isn't a single thing.
link |
01:07:15.620
Like just the crudest level,
link |
01:07:17.340
the SAT has a verbal component and a math component
link |
01:07:19.860
because they're not identical.
link |
01:07:21.340
And Howard Gardner has talked about multiple intelligences
link |
01:07:23.660
like kinesthetic intelligence
link |
01:07:25.420
and verbal intelligence and so forth.
link |
01:07:27.740
There are a lot of things that go into intelligence
link |
01:07:29.940
and people can get good at one or the other.
link |
01:07:32.580
I mean, in some sense, like every expert has developed
link |
01:07:35.260
a very specific kind of intelligence
link |
01:07:37.260
and then there are people that are generalists
link |
01:07:39.300
and I think of myself as a generalist
link |
01:07:41.740
with respect to cognitive science,
link |
01:07:43.380
which doesn't mean I know anything about quantum mechanics,
link |
01:07:45.620
but I know a lot about the different facets of the mind.
link |
01:07:49.260
And there's a kind of intelligence
link |
01:07:51.380
to thinking about intelligence.
link |
01:07:52.660
I like to think that I have some of that,
link |
01:07:54.740
but social intelligence, I'm just okay.
link |
01:07:57.500
There are people that are much better at that than I am.
link |
01:08:00.140
Sure, but what would be really impressive to you?
link |
01:08:04.140
I think the idea of a Turing Olympics is really interesting,
link |
01:08:07.060
especially if somebody like Francois is running it,
link |
01:08:09.660
but to you in general, not as a benchmark,
link |
01:08:14.380
but if you saw an AI system being able to accomplish
link |
01:08:17.300
something that would impress the heck out of you,
link |
01:08:21.740
what would that thing be?
link |
01:08:22.740
Would it be natural language conversation?
link |
01:08:24.700
For me personally, I would like to see
link |
01:08:28.580
a kind of comprehension that relates to what you just said.
link |
01:08:30.660
So I wrote a piece in the New Yorker in I think 2015
link |
01:08:34.980
right after Eugene Goostman, which was a software package,
link |
01:08:39.940
won a version of the Turing test.
link |
01:08:42.940
And the way that it did this is,
link |
01:08:45.060
well, the way you win the Turing test,
link |
01:08:46.900
so-called win it, the Turing test being that you fool a person
link |
01:08:50.700
into thinking that a machine is a person,
link |
01:08:54.420
is you're evasive, you pretend to have limitations
link |
01:08:57.940
so you don't have to answer certain questions and so forth.
link |
01:09:00.540
So this particular system pretended to be a 13 year old boy
link |
01:09:04.300
from Odessa who didn't understand English
link |
01:09:06.980
and was kind of sarcastic
link |
01:09:08.060
and wouldn't answer your questions and so forth.
link |
01:09:09.660
And so judges got fooled into thinking briefly
link |
01:09:12.460
with very little exposure, that it was a 13 year old boy,
link |
01:09:14.660
and it ducked all the questions
link |
01:09:16.340
Turing was actually interested in,
link |
01:09:17.540
which is like how do you make the machine
link |
01:09:18.780
actually intelligent?
link |
01:09:20.420
So that test itself is not that good.
link |
01:09:22.100
And so in New Yorker, I proposed an alternative, I guess,
link |
01:09:26.100
and the one that I proposed there
link |
01:09:27.260
was a comprehension test.
link |
01:09:30.020
And I must like Breaking Bad
link |
01:09:31.060
because I've already given you one Breaking Bad example
link |
01:09:32.900
and in that article, I have one as well,
link |
01:09:35.660
which was something like this:
link |
01:09:37.660
you should be able to watch an episode of Breaking Bad
link |
01:09:40.340
or maybe you have to watch the whole series
link |
01:09:41.700
to be able to answer the question and say,
link |
01:09:43.500
if Walter White took a hit out on Jesse,
link |
01:09:45.580
why did he do that?
link |
01:09:47.180
So if you could answer kind of arbitrary questions
link |
01:09:49.380
about characters motivations, I would be really impressed
link |
01:09:52.700
with that, if somebody built software to do that,
link |
01:09:55.380
something that could watch a film, or there are different versions.
link |
01:09:58.500
And so ultimately, I wrote this up with Praveen Paritosh
link |
01:10:01.940
in a special issue of AI Magazine
link |
01:10:04.060
that basically was about the Turing Olympics.
link |
01:10:05.780
There were like 14 tests proposed.
link |
01:10:07.700
The one that I was pushing was a comprehension challenge
link |
01:10:10.100
and Praveen who's at Google was trying to figure out
link |
01:10:12.380
like how we would actually run it
link |
01:10:13.460
and so we wrote a paper together.
link |
01:10:15.340
And you could have a text version too
link |
01:10:17.300
or you could have an auditory podcast version,
link |
01:10:19.780
you could have a written version.
link |
01:10:20.620
But the point is that you win at this test
link |
01:10:23.820
if you can do, let's say human level or better than humans
link |
01:10:27.060
at answering kind of arbitrary questions.
link |
01:10:29.780
Why did this person pick up the stone?
link |
01:10:31.660
What were they thinking when they picked up the stone?
link |
01:10:34.180
Were they trying to knock down glass?
link |
01:10:36.260
And I mean, ideally these wouldn't be multiple choice either
link |
01:10:38.700
because multiple choice is pretty easily gamed.
link |
01:10:41.140
So if you could have relatively open ended questions
link |
01:10:44.180
and you can answer why people are doing this stuff,
link |
01:10:47.380
I would be very impressed.
link |
01:10:48.220
And of course, humans can do this, right?
link |
01:10:50.060
If you watch a well constructed movie
link |
01:10:52.820
and somebody picks up a rock,
link |
01:10:55.540
everybody watching the movie
link |
01:10:56.940
knows why they picked up the rock, right?
link |
01:10:59.420
They all know, oh my gosh,
link |
01:11:01.140
he's gonna hit this character or whatever.
link |
01:11:03.620
We have an example in the book about
link |
01:11:06.220
when a whole bunch of people say, I am Spartacus,
link |
01:11:08.700
you know, this famous scene.
link |
01:11:11.780
The viewers understand,
link |
01:11:13.540
first of all, that everybody or everybody minus one
link |
01:11:18.220
has to be lying.
link |
01:11:19.060
They can't all be Spartacus.
link |
01:11:20.340
We have enough common sense knowledge
link |
01:11:21.780
to know they couldn't all have the same name.
link |
01:11:24.100
We know that they're lying
link |
01:11:25.340
and we can infer why they're lying, right?
link |
01:11:27.100
They're lying to protect someone
link |
01:11:28.460
and to protect things they believe in.
link |
01:11:30.340
You get a machine that can do that.
link |
01:11:32.340
They can say, this is why these guys all got up
link |
01:11:35.100
and said, I am Spartacus.
link |
01:11:36.940
I will sit down and say, AI has really achieved a lot.
link |
01:11:40.540
Thank you.
link |
01:11:41.380
Without cheating any part of the system.
link |
01:11:43.860
Yeah, I mean, if you do it,
link |
01:11:45.620
there are lots of ways you could cheat.
link |
01:11:46.700
You could build a Spartacus machine
link |
01:11:48.820
that works on that film.
link |
01:11:50.260
That's not what I'm talking about.
link |
01:11:51.100
I'm talking about, you can do this
link |
01:11:52.860
with essentially arbitrary films
link |
01:11:54.860
or from a large set. Even beyond films
link |
01:11:56.580
because it's possible such a system would discover
link |
01:11:58.980
that the number of narrative arcs in film
link |
01:12:02.580
is limited to some small number. Well, there's a famous thing
link |
01:12:04.740
about the classic seven plots or whatever.
link |
01:12:07.060
I don't care.
link |
01:12:07.900
If you wanna build in the system,
link |
01:12:09.140
boy meets girl, boy loses girl, boy finds girl.
link |
01:12:11.660
That's fine.
link |
01:12:12.500
I don't mind having some head start on it.
link |
01:12:13.980
And they acknowledge.
link |
01:12:14.820
Okay, good.
link |
01:12:16.340
I mean, you could build it in innately
link |
01:12:17.980
or you could have your system watch a lot of films again.
link |
01:12:20.460
If you can do this at all,
link |
01:12:22.380
but with a wide range of films,
link |
01:12:23.740
not just one film in one genre.
link |
01:12:27.340
But even if you could do it for all Westerns,
link |
01:12:28.860
I'd be reasonably impressed.
link |
01:12:30.300
Yeah.
link |
01:12:31.940
So in terms of being impressed,
link |
01:12:34.100
just for the fun of it,
link |
01:12:35.820
because you've put so many interesting ideas out there
link |
01:12:38.420
in your book,
link |
01:12:40.420
challenging the community for further steps.
link |
01:12:43.700
Is it possible on the deep learning front
link |
01:12:46.740
that you're wrong about its limitations?
link |
01:12:50.260
That deep learning will unlock,
link |
01:12:52.260
Yann LeCun next year will publish a paper
link |
01:12:54.500
that achieves this comprehension.
link |
01:12:56.940
So do you think that way often as a scientist?
link |
01:13:00.300
Do you consider that your intuition
link |
01:13:03.060
that deep learning could actually run away with it?
link |
01:13:06.740
I'm more worried about rebranding
link |
01:13:09.780
as a kind of political thing.
link |
01:13:11.380
So, I mean, what's gonna happen, I think,
link |
01:13:14.100
is the deep learning is gonna start
link |
01:13:15.660
to encompass symbol manipulation.
link |
01:13:17.380
So I think Hinton's just wrong.
link |
01:13:19.260
Hinton says we don't want hybrids.
link |
01:13:20.860
I think people will work towards hybrids
link |
01:13:22.380
and they will relabel their hybrids as deep learning.
link |
01:13:24.740
We've already seen some of that.
link |
01:13:25.860
So AlphaGo is often described as a deep learning system,
link |
01:13:29.620
but it's more correctly described as a system
link |
01:13:31.740
that has deep learning, but also Monte Carlo tree search,
link |
01:13:33.940
which is a classical AI technique.
link |
01:13:35.580
And people will start to blur the lines
link |
01:13:37.540
in the way that IBM blurred Watson.
link |
01:13:39.820
First, Watson meant this particular system,
link |
01:13:41.580
and then it was just anything that IBM built
link |
01:13:43.140
in their cognitive division.
link |
01:13:44.140
But purely, let me ask, for sure,
link |
01:13:45.740
that's a branding question and that's like a giant mess.
link |
01:13:49.500
I mean, purely, a single neural network
link |
01:13:51.940
being able to accomplish reasonable comprehension.
link |
01:13:54.060
I don't stay up at night worrying
link |
01:13:55.780
that that's gonna happen.
link |
01:13:57.780
And I'll just give you two examples.
link |
01:13:59.220
One is a guy at DeepMind thought he had finally outfoxed me.
link |
01:14:03.540
At Zergilord, I think is his Twitter handle.
link |
01:14:06.980
And he said, he specifically made an example.
link |
01:14:10.580
Marcus said that such and such.
link |
01:14:12.620
He fed it into GPT-2, which is the AI system
link |
01:14:16.420
that is so smart that OpenAI couldn't release it
link |
01:14:19.060
because it would destroy the world, right?
link |
01:14:21.180
You remember that a few months ago.
link |
01:14:22.940
So he feeds it into GPT-2, and my example
link |
01:14:27.220
was something like a rose is a rose,
link |
01:14:28.740
a tulip is a tulip, a lily is a blank.
link |
01:14:31.340
And he got it to actually do that,
link |
01:14:32.860
which was a little bit impressive.
link |
01:14:34.020
And I wrote back and I said, that's impressive,
link |
01:14:35.340
but can I ask you a few questions?
link |
01:14:37.740
I said, was that just one example?
link |
01:14:40.060
Can it do it generally?
link |
01:14:41.620
And can it do it with novel words,
link |
01:14:43.220
which was part of what I was talking about in 1998
link |
01:14:45.300
when I first raised the example.
link |
01:14:46.740
So a dax is a dax, right?
link |
01:14:50.340
And he sheepishly wrote back about 20 minutes later.
link |
01:14:53.020
And the answer was, well, it had some problems with those.
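For readers who want to rerun a version of this probe, here is a minimal sketch using the openly released GPT-2 model through the Hugging Face `transformers` library. The prompts are reconstructions of the pattern described above, and the made-up words are illustrative; this is not the exact exchange from Twitter.

```python
# Minimal sketch of the identity-completion probe: familiar words vs. novel
# (made-up) words. Requires `pip install transformers torch`.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompts = [
    # Familiar words: the case GPT-2 reportedly handled.
    "A rose is a rose. A tulip is a tulip. A lily is a",
    # Novel words: the harder case raised in the 1998 argument.
    "A dax is a dax. A blicket is a blicket. A wug is a",
]

for prompt in prompts:
    outputs = generator(prompt, max_new_tokens=5, num_return_sequences=3,
                        do_sample=True, temperature=0.7)
    print(prompt)
    for out in outputs:
        # Show only the continuation the model appended to the prompt.
        print("  ->", out["generated_text"][len(prompt):].strip())
```

Sampling is stochastic, so individual runs vary; the point of the probe is whether the abstract identity pattern transfers to words the model has rarely or never seen.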
link |
01:14:55.340
So I made some predictions 21 years ago that still hold.
link |
01:15:00.500
In the world of computer science, that's amazing, right?
link |
01:15:02.660
Because there's a thousand or a million times more memory
link |
01:15:06.500
and computation is a million times faster,
link |
01:15:10.020
doing a million times more operations per second
link |
01:15:13.140
spread across a cluster.
link |
01:15:15.340
And there's been advances in replacing sigmoids
link |
01:15:20.780
with other functions and so forth.
link |
01:15:23.380
There's all kinds of advances,
link |
01:15:25.380
but the fundamental architecture hasn't changed
link |
01:15:27.100
and the fundamental limit hasn't changed.
link |
01:15:28.580
And what I said then is kind of still true.
link |
01:15:30.860
Then here's a second example.
link |
01:15:32.220
I recently had a piece in Wired
link |
01:15:34.020
that's adapted from the book.
link |
01:15:35.260
And the book went to press before GPT-2 came out,
link |
01:15:40.140
but we described this children's story
link |
01:15:42.300
and all the inferences that you make in this story
link |
01:15:45.580
about a boy finding a lost wallet.
link |
01:15:48.260
And for fun, in the Wired piece, we ran it through GPT-2,
link |
01:15:52.860
via something called talktotransformer.com,
link |
01:15:55.460
and your viewers can try this experiment themselves.
link |
01:15:58.180
Go to the Wired piece that has the link
link |
01:15:59.700
and it has the story.
link |
01:16:01.100
And the system made perfectly fluent text
link |
01:16:04.300
that was totally inconsistent
link |
01:16:06.420
with the conceptual underpinnings of the story, right?
link |
01:16:10.260
This is what, again, I predicted in 1998.
link |
01:16:13.220
And for that matter, Chomsky and Miller
link |
01:16:14.700
made the same prediction in 1963.
link |
01:16:16.660
I was just updating their claim for a slightly new text.
link |
01:16:19.420
So those particular architectures
link |
01:16:22.580
that don't have any built in knowledge,
link |
01:16:24.820
they're basically just a bunch of layers
link |
01:16:27.020
doing correlational stuff.
link |
01:16:28.940
They're not gonna solve these problems.
link |
01:16:31.220
So 20 years ago, you said the emperor has no clothes.
link |
01:16:34.500
Today, the emperor still has no clothes.
link |
01:16:36.860
The lighting's better though.
link |
01:16:38.020
The lighting is better.
link |
01:16:39.020
And I think you yourself are also, I mean.
link |
01:16:42.260
And we found out some things to do with naked emperors.
link |
01:16:44.340
I mean, it's not like stuff is worthless.
link |
01:16:46.420
I mean, they're not really naked.
link |
01:16:48.260
It's more like they're in their briefs
link |
01:16:49.580
than everybody thinks they are.
link |
01:16:50.820
And so like, I mean, they are great at speech recognition,
link |
01:16:54.340
but the problems that I said were hard are still hard.
link |
01:16:56.460
I didn't literally say the emperor has no clothes.
link |
01:16:58.220
I said, this is a set of problems
link |
01:17:00.140
that humans are really good at.
link |
01:17:01.780
And it wasn't couched as AI.
link |
01:17:03.140
It was couched as cognitive science.
link |
01:17:04.300
But I said, if you wanna build a neural model
link |
01:17:07.700
of how humans do certain class of things,
link |
01:17:10.340
you're gonna have to change the architecture.
link |
01:17:11.940
And I stand by those claims.
link |
01:17:13.620
So, and I think people should understand
link |
01:17:16.740
you're quite entertaining in your cynicism,
link |
01:17:19.020
but you're also very optimistic and a dreamer
link |
01:17:22.220
about the future of AI too.
link |
01:17:23.900
So you're both, it's just.
link |
01:17:25.340
There's a famous saying about
link |
01:17:27.820
people overselling technology in the short run
link |
01:17:30.700
and underselling it in the long run.
link |
01:17:34.100
And so I actually end the book,
link |
01:17:37.180
Ernie Davis and I end our book with an optimistic chapter,
link |
01:17:40.500
which kind of killed Ernie
link |
01:17:41.700
because he's even more pessimistic than I am.
link |
01:17:44.380
He describes me as a contrarian and himself as a pessimist.
link |
01:17:47.580
But I persuaded him that we should end the book
link |
01:17:49.820
with a look at what would happen
link |
01:17:52.620
if AI really did incorporate, for example,
link |
01:17:55.340
the common sense reasoning and the nativism
link |
01:17:57.300
and so forth, the things that we counseled for.
link |
01:17:59.660
And we wrote it and it's an optimistic chapter
link |
01:18:02.140
that AI suitably reconstructed so that we could trust it,
link |
01:18:05.900
which we can't now, could really be world changing.
link |
01:18:09.500
So on that point, if you look at the future trajectories
link |
01:18:13.100
of AI, people have worries about negative effects of AI,
link |
01:18:17.140
whether it's at the large existential scale
link |
01:18:21.020
or smaller short term scale of negative impact on society.
link |
01:18:25.220
So you write about trustworthy AI,
link |
01:18:27.140
how can we build AI systems that align with our values,
link |
01:18:31.500
that make for a better world,
link |
01:18:32.780
that we can interact with, that we can trust?
link |
01:18:34.980
The first thing we have to do
link |
01:18:35.820
is to replace deep learning with deep understanding.
link |
01:18:38.260
So you can't have alignment with a system
link |
01:18:42.460
that traffics only in correlations
link |
01:18:44.620
and doesn't understand concepts like bottles or harm.
link |
01:18:47.900
So Asimov talked about these famous laws
link |
01:18:51.340
and the first one was first do no harm.
link |
01:18:54.060
And you can quibble about the details of Asimov's laws,
link |
01:18:56.860
but we have to, if we're gonna build real robots
link |
01:18:58.780
in the real world, have something like that.
link |
01:19:00.540
That means we have to program in a notion
link |
01:19:02.500
that's at least something like harm.
link |
01:19:04.240
That means we have to have these more abstract ideas
link |
01:19:06.620
that deep learning is not particularly good at.
link |
01:19:08.460
They have to be in the mix somewhere.
link |
01:19:10.620
And you could do statistical analysis
link |
01:19:12.380
about probabilities of given harms or whatever,
link |
01:19:14.380
but you have to know what a harm is
link |
01:19:15.820
in the same way that you have to understand
link |
01:19:17.420
that a bottle isn't just a collection of pixels.
link |
01:19:20.660
And also be able to, you're implying
link |
01:19:24.020
that you need to also be able to communicate
link |
01:19:25.940
that to humans so the AI systems would be able
link |
01:19:29.700
to prove to humans that they understand
link |
01:19:33.780
that they know what harm means.
link |
01:19:35.460
I might run it in the reverse direction,
link |
01:19:37.380
but roughly speaking, I agree with you.
link |
01:19:38.620
So we probably need to have committees
link |
01:19:42.500
of wise people, ethicists and so forth,
link |
01:19:45.660
to think about what these rules ought to be,
link |
01:19:47.500
and we shouldn't just leave it to software engineers.
link |
01:19:49.780
It shouldn't just be software engineers
link |
01:19:51.620
and it shouldn't just be people
link |
01:19:53.900
who own large mega corporations
link |
01:19:56.580
that are good at technology. Ethicists
link |
01:19:58.860
and so forth should be involved.
link |
01:20:00.260
But there should be some assembly of wise people
link |
01:20:04.660
as I was putting it that tries to figure out
link |
01:20:07.220
what the rules ought to be.
link |
01:20:08.700
And those have to get translated into code.
link |
01:20:12.460
You can argue whether it's code or neural networks or something.
link |
01:20:15.460
They have to be translated into something
link |
01:20:18.660
that machines can work with.
link |
01:20:19.980
And that means there has to be a way
link |
01:20:21.940
of working the translation.
link |
01:20:23.380
And right now we don't.
link |
01:20:24.460
We don't have a way.
link |
01:20:25.340
So let's say you and I were the committee
link |
01:20:27.060
and we decide that Asimov's first law is actually right.
link |
01:20:29.820
And let's say it's not just two white guys,
link |
01:20:31.580
which would be kind of unfortunate, but that we have a broad,
link |
01:20:34.020
representative sample of the world,
link |
01:20:36.260
or however we wanna do this.
link |
01:20:37.500
And the committee decides eventually,
link |
01:20:40.460
okay, Asimov's first law is actually pretty good.
link |
01:20:42.820
There are these exceptions to it.
link |
01:20:44.060
We wanna program in these exceptions.
link |
01:20:46.060
But let's start with just the first one
link |
01:20:47.460
and then we'll get to the exceptions.
link |
01:20:48.860
First one is first do no harm.
link |
01:20:50.620
Well, somebody has to now actually turn that into
link |
01:20:53.620
a computer program or a neural network or something.
link |
01:20:56.220
And one way of taking the whole book,
link |
01:20:58.740
the whole argument that I'm making
link |
01:21:00.260
is that we just don't know how to do that yet.
link |
01:21:02.500
And we're fooling ourselves
link |
01:21:03.540
if we think that we can build trustworthy AI
link |
01:21:05.860
if we can't even specify it in any kind of formal way:
link |
01:21:09.500
we can't do it in Python and we can't do it in TensorFlow.
link |
01:21:13.140
We're fooling ourselves in thinking
link |
01:21:14.380
that we can make trustworthy AI
link |
01:21:15.820
if we can't translate harm into something
link |
01:21:18.780
that we can execute.
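As a purely illustrative sketch of that gap (an editorial example, not anything from the book): the scaffolding of "first, do no harm" is trivial to write in Python, but the predicate that decides what actually counts as harm is the part no one knows how to implement, which is exactly the argument being made.

```python
# Hypothetical sketch: Asimov's first law as a filter over candidate actions.
# The control flow is easy; `causes_harm` is the unsolved part.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Action:
    description: str

def causes_harm(action: Action, world_state: Dict) -> bool:
    """The hard part: requires machine-usable concepts of people, objects,
    causation, and injury -- deep understanding, not just correlations."""
    raise NotImplementedError("No one currently knows how to write this.")

def choose_action(candidates: List[Action], world_state: Dict) -> Action:
    # Keep only actions judged harmless, then pick among them.
    safe = [a for a in candidates if not causes_harm(a, world_state)]
    if not safe:
        raise RuntimeError("No provably harmless action available.")
    return safe[0]
```

Everything outside the `NotImplementedError` is straightforward; translating the concept of harm into something executable is the open problem.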
link |
01:21:19.940
And if we can't, then we should be thinking really hard
link |
01:21:22.820
how could we ever do such a thing?
link |
01:21:24.620
Because if we're gonna use AI
link |
01:21:26.500
in the ways that we wanna use it,
link |
01:21:27.940
to make job interviews or to do surveillance,
link |
01:21:31.060
not that I personally wanna do that or whatever.
link |
01:21:32.460
I mean, if we're gonna use AI
link |
01:21:33.780
in ways that have practical impact on people's lives
link |
01:21:36.180
or medicine, it's gotta be able
link |
01:21:38.980
to understand stuff like that.
link |
01:21:41.180
So one of the things your book highlights
link |
01:21:42.820
is that a lot of people in the deep learning community,
link |
01:21:47.380
but also the general public, politicians,
link |
01:21:50.220
just people across all groups and walks of life
link |
01:21:53.220
have different levels of misunderstanding of AI.
link |
01:21:57.340
So when you talk about committees,
link |
01:22:00.940
what's your advice to our society?
link |
01:22:05.620
How do we grow, how do we learn about AI
link |
01:22:08.140
such that such committees could emerge
link |
01:22:10.820
where large groups of people could have
link |
01:22:13.500
a productive discourse about
link |
01:22:15.180
how to build successful AI systems?
link |
01:22:17.820
Part of the reason we wrote the book
link |
01:22:19.660
was to try to inform those committees.
link |
01:22:22.060
So part of the reason we wrote the book
link |
01:22:23.540
was to inspire a future generation of students
link |
01:22:25.660
to solve what we think are the important problems.
link |
01:22:27.860
So a lot of the book is trying to pinpoint
link |
01:22:29.860
what we think are the hard problems
link |
01:22:31.220
where we think effort would most be rewarded.
link |
01:22:34.020
And part of it is to try to train people
link |
01:22:37.780
who talk about AI, but aren't experts in the field
link |
01:22:41.020
to understand what's realistic and what's not.
link |
01:22:43.500
One of my favorite parts in the book
link |
01:22:44.660
is the six questions you should ask
link |
01:22:46.940
anytime you read a media account.
link |
01:22:48.380
So like number one is if somebody talks about something,
link |
01:22:51.060
look for the demo.
link |
01:22:51.900
If there's no demo, don't believe it.
link |
01:22:54.100
Like the demo that you can try.
link |
01:22:55.300
If you can't try it at home,
link |
01:22:56.460
maybe it doesn't really work that well yet.
link |
01:22:58.380
So if, we don't have this example in the book,
link |
01:23:00.620
but if Sundar Pichai says we have this thing
link |
01:23:04.140
that allows it to sound like human beings in conversation,
link |
01:23:08.380
you should ask, can I try it?
link |
01:23:10.380
And you should ask how general it is.
link |
01:23:11.860
And it turns out at that time,
link |
01:23:13.060
I'm alluding to Google Duplex when it was announced,
link |
01:23:15.460
it only worked on calling hairdressers,
link |
01:23:18.220
restaurants and finding opening hours.
link |
01:23:20.020
That's not very general, that's narrow AI.
link |
01:23:22.260
And I'm not gonna ask your thoughts about Sophia,
link |
01:23:24.580
but yeah, I understand that's a really good question
link |
01:23:27.740
to ask of any kind of hyped-up idea.
link |
01:23:30.220
Sophia has very good material written for her,
link |
01:23:32.580
but she doesn't understand the things that she's saying.
link |
01:23:35.380
So a while ago you wrote a book
link |
01:23:38.220
on the science of learning, which I think is fascinating,
link |
01:23:40.540
with the case study of learning to play guitar.
link |
01:23:43.500
That's called Guitar Zero.
link |
01:23:45.100
I love guitar myself, I've been playing my whole life.
link |
01:23:47.340
So let me ask a very important question.
link |
01:23:50.260
What is your favorite song, rock song,
link |
01:23:53.500
to listen to or try to play?
link |
01:23:56.300
Well, those would be different,
link |
01:23:57.140
but I'll say that my favorite rock song to listen to
link |
01:23:59.660
is probably All Along the Watchtower,
link |
01:24:01.060
the Jimi Hendrix version.
link |
01:24:01.980
The Jimi Hendrix version.
link |
01:24:02.980
It feels magic to me.
link |
01:24:04.860
I've actually recently learned it, I love that song.
link |
01:24:07.040
I've been trying to put it on YouTube, myself singing.
link |
01:24:09.380
Singing is the scary part.
link |
01:24:11.300
If you could party with a rock star for a weekend,
link |
01:24:13.380
living or dead, who would you choose?
link |
01:24:17.780
And pick their mind, it's not necessarily about the partying.
link |
01:24:21.140
Thanks for the clarification.
link |
01:24:24.700
I guess John Lennon's such an intriguing person,
link |
01:24:26.980
and I think a troubled person, but an intriguing one.
link |
01:24:31.660
Beautiful.
link |
01:24:32.500
Well, Imagine is one of my favorite songs.
link |
01:24:35.460
Also one of my favorite songs.
link |
01:24:37.100
That's a beautiful way to end it.
link |
01:24:38.300
Gary, thank you so much for talking to me.
link |
01:24:39.780
Thanks so much for having me.