Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI | Lex Fridman Podcast #43

The following is a conversation with Gary Marcus. He's a professor emeritus at NYU, founder of Robust AI and Geometric Intelligence. The latter is a machine learning company that was acquired by Uber in 2016. He's the author of several books on natural and artificial intelligence, including his new book, Rebooting AI: Building Machines We Can Trust. Gary has been a critical voice highlighting the limits of deep learning and AI in general, and discussing the challenges before our AI community that must be solved in order to achieve artificial general intelligence.

As I'm having these conversations, I try to find paths toward insight, toward new ideas. I try to have no ego in the process; it gets in the way. I'll often continuously try on several hats, several roles. One, for example, is the role of a three-year-old who understands very little about anything and asks big what and why questions. The other might be the role of a devil's advocate who presents counter ideas with the goal of arriving at greater understanding through debate. Hopefully both are useful, interesting, and even entertaining at times. I ask for your patience as I learn to have better conversations.

This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it five stars on iTunes, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N. And now, here's my conversation with Gary Marcus.
Do you think human civilization will one day have to face an AI-driven technological singularity that will, in a societal way, modify our place in the food chain of intelligent living beings?

I think our place in the food chain has already changed. So there are lots of things people used to do by hand that they do with machines. If you think of a singularity as like one single moment, which is I guess what it suggests, I don't know if it'll be like that, but I think that there's a lot of gradual change, and AI is getting better and better. I mean, I'm here to tell you why I think it's not nearly as good as people think, but the overall trend is clear. Maybe Ray Kurzweil thinks it's exponential and I think it's linear; in some cases it's close to zero right now, but it's all gonna happen. We are gonna get to human-level intelligence, or whatever you want to call it, artificial general intelligence, at some point. And that's certainly gonna change our place in the food chain, because a lot of the tedious things that we do now, we're gonna have machines do, and a lot of the dangerous things that we do now, we're gonna have machines do. I think our whole lives are gonna change, from people finding their meaning through their work to people finding their meaning through creative expression.

So the singularity will be a very gradual, in fact removing the meaning of the word singularity, it'll be a very gradual transformation, in your view.
I think that it'll be somewhere in between, and I guess it depends what you mean by gradual and sudden. I don't think it's gonna be one day. I think it's important to realize that intelligence is a multidimensional variable. So people sort of write about this stuff as if IQ were one number, and the day that you hit 262 or whatever, you displace the human beings. And really there's lots of facets to intelligence. So there's verbal intelligence, and there's motor intelligence, and there's mathematical intelligence, and so forth.

Machines, in their mathematical intelligence, far exceed most people already. In their ability to play games, they far exceed most people already. In their ability to understand language, they lag behind my five-year-old, far behind my five-year-old. So there are some facets of intelligence that machines have grasped and some that they haven't, and we have a lot of work left to do to get them to, say, understand natural language, or to understand how to flexibly approach some kind of novel MacGyver problem-solving kind of situation. And I don't know that all of these things will come at once. I think there are certain vital prerequisites that we're missing now. So for example, machines don't really have common sense now. They don't understand that bottles contain water, and that people drink water to quench their thirst, and that they don't want to dehydrate. They don't know these basic facts about human beings, and I think that's a rate-limiting step for many things.

It's a rate-limiting step for reading, for example, because stories depend on things like, oh my God, that person's running out of water, that's why they did this thing; or, if they only had water, they could put out the fire. So you watch a movie, and your knowledge about how things work matters. And so a computer can't understand that movie if it doesn't have that background knowledge. Same thing if you read a book. And so there are lots of places where, if we had a good machine-interpretable set of common sense, many things would accelerate relatively quickly, but I don't think even that is like a single point. There's many different aspects of knowledge. And we might, for example, find that we make a lot of progress on physical reasoning, getting machines to understand, for example, how keys fit into locks, or that kind of stuff, or how this gadget here works, and so forth and so on. Machines might do that long before they do really good psychological reasoning, because it's easier to get kind of labeled data, or to do direct experimentation, on a microphone stand than it is to do direct experimentation on human beings to understand the levers that guide them.
That's a really interesting point, actually, whether it's easier to gain common sense knowledge or psychological knowledge.

I would say the common sense knowledge includes both physical knowledge and psychological knowledge. And the argument I was making...

It's physical versus psychological.

Yeah, physical versus psychological. The argument I was making is, physical knowledge might be more accessible, because you could have a robot, for example, lift a bottle, try putting a bottle cap on it, see that it falls off if it does this, and see that it could turn it upside down, and so the robot could do some experimentation. We do some of our psychological reasoning by looking at our own minds. So I can sort of guess how you might react to something based on how I think I would react to it. And robots don't have that intuition, and they also can't do experiments on people in the same way, or we'll probably shut them down. So if we wanted to have robots figure out how I respond to pain by pinching me in different ways, like, that's probably not gonna make it past the human subjects board, and companies are gonna get sued or whatever. So there are certain kinds of practical experience that are limited, or off limits, to robots.
That's a really interesting point. What is more difficult to gain a grounding in? Because, to play devil's advocate, I would say that human behavior is more easily expressed in data, in digital form. And so when you look at Facebook algorithms, they get to observe human behavior, so you get to study and even manipulate human behavior in a way that you perhaps cannot study or manipulate the physical world. So it's true, what you said about pain, like physical pain, but that's again the physical world. Emotional pain might be much easier to experiment with. Perhaps unethical, but nevertheless, some would argue it's already going on.
I think that you're right, for example, that Facebook does a lot of experimentation in psychological reasoning. In fact, Zuckerberg talked about AI at a talk that he gave at NIPS. I wasn't there, at the conference that's since been renamed NeurIPS, but it was still called NIPS when he gave the talk. And he talked about Facebook basically having a gigantic theory of mind. So I think it is certainly possible. I mean, Facebook does some of that. I think they have a really good idea of how to addict people to things. They understand what draws people back to things, and I think they exploit it in ways that I'm not very comfortable with. But even so, I think that there are only some slices of human experience that they can access through the kind of interface they have. And of course, they're doing all kinds of VR stuff, and maybe that'll change and they'll expand their data, and I'm sure that's part of their goal. So it is an interesting question.

I think love, fear, insecurity, all of the things that I would say are some of the deepest things about human nature and the human mind, could be explored through digital form.
And you're actually the first person, just now, that brought that up. I wonder what is more difficult, because I think the folks who are, and we'll talk a lot about deep learning, but the people who are thinking beyond deep learning, are thinking about the physical world. They're starting to think about robotics, in-home robotics. How do we make robots manipulate objects, which requires an understanding of the physical world, and then requires common sense reasoning? And that has felt like the next step for common sense reasoning. But you've now brought up the idea that there's also the emotional part, and it's interesting whether that's hard or easy.
I think some parts of it are and some aren't. So my company, that I recently founded with Rod Brooks, who was at MIT for many years, and so forth, we're interested in both. We're interested in physical reasoning and psychological reasoning, among many other things. And there are pieces of each of these that are accessible. So if you want a robot to figure out whether it can fit under a table, that's a relatively accessible piece of physical reasoning. If you know the height of the table and you know the height of the robot, it's not that hard. If you wanted to do physical reasoning about Jenga, it gets a little bit more complicated, and you have to have higher-resolution data in order to do it. With psychological reasoning, it's not that hard to know, for example, that people have goals and they like to act on those goals, but it's really hard to know exactly what those goals are.

Like the idea of frustration. I mean, you could argue it's extremely difficult to understand the sources of human frustration as they're playing Jenga with you, or not. You could argue that it's very accessible.

There are some things that are gonna be obvious and some not. So I don't think anybody really can do this well yet, but I think it's not inconceivable to imagine machines in the not-so-distant future being able to understand that if people lose in a game, they don't like that. That's not such a hard thing to program, and it's pretty consistent across people. Most people don't enjoy losing, and so that makes it relatively easy to code. On the other hand, if you wanted to capture everything about frustration, well, people get frustrated for a lot of different reasons. They might get sexually frustrated, they might get frustrated because they can't get their promotion at work, all kinds of different things. And the more you expand the scope, the harder it is for anything like the existing techniques to really do that.
So I'm talking to Garry Kasparov next week, and he seemed pretty frustrated with this game.

So yeah, well, I'm frustrated with my game against him last year, because I played him. I had two excuses. I'll give you my excuses up front, though they won't mitigate the outcome: I was jet-lagged, and I hadn't played in 25 or 30 years. But the outcome is, he completely destroyed me, and it wasn't even close.

Have you ever been beaten in any board game by a machine?

I have. I actually played the predecessor to Deep Blue. Deep Thought, I believe it was called. And that, too, crushed me.

And after that, you realized it's over for us.

Well, there's no point in my playing Deep Blue. I mean, it's a waste of Deep Blue's computation. I mean, I played Kasparov because we both gave lectures at the same event, and he was playing 30 people. I forgot to mention that. Not only did he crush me, but he crushed 29 other people at the same time.

I mean, but the actual philosophical and emotional experience of being beaten by a machine, I imagine, is, I mean, to you who thinks about these things, maybe a profound experience? Or no, it was a simple...

No, I mean, I think...

Mathematical experience.
Yeah, I think a game like chess, particularly, where, you know, you have perfect information, it's two-player, closed-ended, and there's more computation for the computer, it's no surprise the machine wins. I mean, I'm not sad when a computer calculates a cube root faster than me. Like, I know I can't win that game. I'm not going to try.

Well, with a system like AlphaGo or AlphaZero, do you see a little bit more magic in a system like that, even though it's simply playing a board game, but because there's a strong learning component?

You know, it's funny you should mention that in the context of this conversation, because Kasparov and I are working on an article that's going to be called 'AI Is Not Magic.' And, you know, neither one of us thinks that it's magic. And part of the point of this article is that AI is actually a grab bag of different techniques, and they each have their own unique strengths and weaknesses. So, you know, you read media accounts, and it's like, ooh, AI, it must be magical, or it can solve any problem. Well, no, some problems are really accessible, like chess and Go, and other problems, like reading, are completely outside the current technology. And it's not like you can take the technology that drives AlphaGo and apply it to reading and get anywhere.

You know, DeepMind has tried that a bit. They have all kinds of resources. You know, they built AlphaGo, and, you know, I wrote a piece recently about how much money they lost, and you can argue about the word 'lost,' but they spent $530 million more than they made last year. So, you know, they're making huge investments, they have a large budget, and they have applied the same kinds of techniques to reading, or to language, and it's just much less productive there, because it's a fundamentally different kind of problem. Chess and Go and so forth are closed-ended problems. The rules haven't changed in 2,500 years. There's only so many moves you can make. You can talk about the exponential as you look at the combinations of moves, but fundamentally, you know, the Go board has 361 intersections; those are the only places that you can place your stone. Whereas when you're reading, the next sentence could be anything. You know, it's completely up to the writer what they're gonna do next.
That's fascinating, that you think this way. You're clearly a brilliant mind who points out that the emperor has no clothes, but so I'll play the role of a person who says...

You're gonna put clothes on the emperor? Good luck with it.

...who romanticizes the notion of the emperor, period, suggesting that clothes don't even matter. Okay, so that's really interesting, that you're talking about language. So there's the physical world, of being able to move about the world, making an omelet and coffee and so on. There's language, where you first understand what's being written, and then maybe even more complicated than that, having a natural dialogue. And then there's the game of Go and chess. I would argue that language is much closer to Go than it is to the physical world. Like, it is still very constrained. When you say the possibility of the number of sentences that could come, it is huge, but it nevertheless is much more constrained, it feels, maybe I'm wrong, than the possibilities that the physical world brings us.
There's something to what you say, and some ways in which I disagree. So one interesting thing about language is that it abstracts away. This bottle, I don't know if it's gonna be in the field of view, is on this table, and I use the word 'on' here, and I can use the word 'on' here, maybe not here, but that one word encompasses, in analog space, a sort of infinite number of possibilities. So there is a way in which language filters down the variation of the world, and there's other ways. So we have a grammar, and, more or less, you have to follow the rules of that grammar. You can break them a little bit, but by and large we follow the rules of grammar, and so that's a constraint on language. So there are ways in which language is a constrained system.

On the other hand, there are many arguments, let's say, that there's an infinite number of possible sentences, and you can establish that by just stacking them up. So: I think there's water on the table. You think that I think there's water on the table. Your mother thinks that you think that I think the water is on the table. Your brother thinks that maybe your mom is wrong to think that you think that I think... So we can make sentences of infinite length, or we can stack up adjectives: this is a very silly example, a very, very silly example, a very, very, very, very, very, very, very silly example. So there are good arguments that there's an infinite range of sentences. In any case, it's vast by any reasonable measure. And, for example, almost anything in the physical world we can talk about in the language world. And interestingly, many of the sentences that we understand, we can only understand if we have a very rich model of the physical world. So I don't ultimately want to adjudicate the debate that I think you just set up, but I find it interesting.
Maybe the physical world is even more complicated?

I think that's fair.

But you think that language is really, really complicated. It's really, really hard.

Well, it's really, really hard for machines, for linguists, for people trying to understand it. It's not that hard for children, and that's part of what's driven my whole career. I was a student of Steven Pinker's, and we were trying to figure out why kids could learn language when machines couldn't.

I think we're gonna get into language. We're gonna get into communication, intelligence, and neural networks, and so on.
But let me return to the high-level futuristic for a brief moment. So you've written in your book, in your new book, that it would be arrogant to suppose that we could forecast where AI will be, the impact it will have, in a thousand years, or even 500 years. So let me ask you to be arrogant. What do AI systems, with or without physical bodies, look like 100 years from now? If you would, just... you can't predict, but if you were to philosophize and imagine, do.

Can I first justify the arrogance before you try to push me beyond it? I mean, there are examples, like, you know, people figured out how electricity worked. They had no idea that that was gonna lead to cell phones. I mean, things can move awfully fast once new technologies are perfected. Even when they made transistors, they weren't really thinking that cell phones would lead to social networking.

There are nevertheless predictions of the future which are statistically unlikely to come to be, but which nevertheless are the best we have.

You're asking me to be wrong.

I'm asking you to be.

Which way would I like to be wrong?

Pick the thing least likely to be wrong, even though it's still very likely to be wrong.
I mean, here are some things that we can safely predict, I suppose. We can predict that AI will be faster than it is now. It will be cheaper than it is now. It will be better, in the sense of being more general and applicable in more places. It will be pervasive. You know, I mean, these are easy predictions. I'm sort of modeling them in my head on Jeff Bezos's famous predictions. He says, I can't predict the future, I'm paraphrasing, but I can predict that people will never wanna pay more money for their stuff, and they're never gonna want it to take longer to get there. And, you know, so, like, you can't predict everything, but you can predict some things.

Sure, of course it's gonna be faster and better. And what we can't really predict is the full scope of where AI will be in a certain period. I mean, I think it's safe to say that, although I'm very skeptical about current AI, it's possible to do much better. You know, there's no in-principle argument that says AI is an unsolvable problem, that there's magic inside our brains that will never be captured. I mean, I've heard people make those kinds of arguments. I don't think they're very good. So AI is gonna come, and probably 500 years is plenty. And then, once it's here, it really will change everything.
So when you say AI is gonna come, are you talking about human-level intelligence?

I like the term general intelligence. So I don't think that the ultimate AI, if there is such a thing, is gonna look just like humans. I think it's gonna do some things that humans do better than current machines, like reason flexibly and understand language and so forth. But that doesn't mean it has to be identical to humans. So, for example, humans have terrible memory, and they suffer from what some people call motivated reasoning. So they like arguments that seem to support them, and they dismiss arguments that they don't like. There's no reason that a machine should ever do that.

So you see those limitations of memory as a bug, not a feature?
I'll say two things about that. One is, I was on a panel with Danny Kahneman, the Nobel Prize winner, last night, and we were talking about this stuff. And I think what we converged on is that humans are a low bar to exceed. It may be outside of our skill right now as AI programmers, but eventually AI will exceed it. So we're not talking about human-level AI; we're talking about general intelligence that can do all kinds of different things, and do it without some of the flaws that human beings have.

The other thing I'll say is, I wrote a whole book, actually, about the flaws of humans. It's actually a nice counterpoint to the current book. So I wrote a book called Kluge, which was about the limits of the human mind. The current book is kind of about those few things that humans do a lot better than machines.
Do you think it's possible that the flaws of the human mind, the limits of memory, our mortality, our biases, are a strength, not a weakness? That they are the thing from which motivation springs and meaning springs?

I've heard a lot of arguments like this. I've never found them that convincing. I think that there's a lot of making lemonade out of lemons. So we, for example, do a lot of free association, where one idea just leads to the next and they're not really that well connected. And we enjoy that, and we make poetry out of it, and we make kind of movies with free associations, and it's fun, and whatever. I don't think that's really a virtue of the system. I think that the limitations in human reasoning actually get us in a lot of trouble. Like, for example, politically, we can't see eye to eye, because we have the motivated reasoning I was talking about, and something related called confirmation bias. So we have all of these problems that actually make for a rougher society, because we can't get along, because we can't interpret the data in shared ways. And then we do some nice stuff with that. So my free associations are different from yours, and you're kind of amused by them, and that's great. So there are lots of ways in which we take a lousy situation and make the best of it.

Another example would be, our memories are terrible. So we play games like Concentration, where you flip over two cards, trying to find a pair. Can you imagine a computer playing that? The computer's like, this is the dullest game in the world. I know where all the cards are. I know where it is. What are you even talking about? So we make a fun game out of having this terrible memory.

So we are imperfect in discovering and optimizing some kind of utility function.
But you think, in general, there is a utility function, there's an objective function that's better than others?

I didn't say that.

The presumption, when you say...

I think you could design a better memory system. You could argue about utility functions and how you wanna think about that, but objectively, it would be really nice to do some of the following things: to get rid of memories that are no longer useful. Like, objectively, that would just be good, and we're not that good at it. So when you park in the same lot every day, you confuse where you parked today with where you parked yesterday, with where you parked the day before, and so forth. So you blur together a series of memories. There's just no way that that's optimal. I mean, I've heard all kinds of wacky arguments of people trying to defend that, but at the end of the day, I don't think any of them hold water.

Or memories of traumatic events. It would possibly be a very nice feature to have, to be able to get rid of those.

It'd be great if you could just be like, I'm gonna wipe this sector, I'm done with that. I didn't have fun last night, I don't wanna think about it anymore, it's gone. But we can't.
Do you think it's possible to build a system... So you said human-level intelligence is a weird concept.

Well, I'm saying I prefer general intelligence.

General intelligence.

I mean, human-level intelligence is a real thing, and you could try to make a machine that matches people, or something like that. I'm saying that per se shouldn't be the objective, but rather that we should learn from humans the things they do well and incorporate that into our AI, just as we incorporate the things that machines do well that people do terribly. So, I mean, it's great that AI systems can do all this brute-force computation that people can't. And one of the reasons I work on this stuff is because I would like to see machines solve problems that people can't, that, in order to be solved, would combine the strengths of machines to do all this computation with the ability, let's say, of people to read. So I'd like machines that can read the entire medical literature in a day. 7,000 new papers, or whatever the number is, come out every day. There's no way for any doctor or whatever to read them all. A machine that could read would be a brilliant thing, and that would combine the strengths of brute-force computation with the kind of subtlety in understanding medicine that a good doctor or scientist has.
So, if we can linger a little bit on the idea of general intelligence. Yann LeCun believes that human intelligence isn't general at all, it's very narrow. What do you think?

I don't think that makes sense. We have lots of narrow intelligences for specific problems. But the fact is, like, anybody can walk into, let's say, a Hollywood movie and reason about the content of almost anything that goes on there. So you can reason about what happens in a bank robbery, or what happens when someone is infertile and wants to go to IVF to try to have a child. The list is essentially endless. And not everybody understands every scene in a movie, but there's a huge range of things that pretty much any ordinary adult can understand.

His argument is that actually the set of things seems large to us humans because we're very limited in considering the kinds of experiences that are possible. But in fact, the space of possible experiences is infinitely larger.

Well, I mean, if you wanna make an argument that humans are constrained in what they can understand, I have no issue with that, I think that's right. But it's still not the same thing at all as saying, here's a system that can play Go, it's been trained on five million games, and then I say, can it play on a rectangular board rather than a square board? And you say, well, if I retrain it from scratch on another five million games, it can. That's really, really narrow, and that's where we are. We don't even have a system that could play Go and then, without further retraining, play on a rectangular board, which any good human could do with very little problem. So that's what I mean by narrow.
And so it's just wordplay to say...

Then it's semantics, then it's just words. Then, yeah, you mean general in the sense that you can do all kinds of Go board shapes flexibly?

Well, I mean, that would be like a first step in the right direction, but obviously that's not what I really mean. What I mean by general is that you could transfer the knowledge you learn in one domain to another. So if you learn about bank robberies in movies, and there's chase scenes, then you can understand that amazing scene in Breaking Bad where Walter White has a car chase scene with only one person, he's the only one in it, and you can reflect on how that car chase scene is like all the other car chase scenes you've ever seen, and totally different, and why that's cool. And the fact that the number of domains you can do that with is finite doesn't make it less general.

So the idea of general is that you can just do a lot of transfer across a lot of domains.

Yeah. I mean, I'm not saying humans are infinitely general, or that humans are perfect. I just said a minute ago, it's a low bar. But right now, like, the bar is here and we're there, and eventually we'll get way past it.
So, speaking of low bars: you've highlighted, in your new book as well, but a couple of years ago you wrote a paper titled 'Deep Learning: A Critical Appraisal' that lists ten challenges faced by current deep learning systems. Let me summarize them: data efficiency, transfer learning, hierarchical knowledge, open-ended inference, explainability, integrating prior knowledge, causal reasoning, modeling of an unstable world, robustness to adversarial examples, and then, my favorite probably, reliability and engineering of real-world systems. Whatever, people can read the paper, they should definitely read the paper, and should definitely read your book. But which of these challenges, if solved, in your view, would have the biggest impact on the AI community?
It's a very good question, and I'm gonna be evasive, because I think that they go together a lot. So some of them might be solved independently of others, but I think a good solution to AI starts by having real, what I would call, cognitive models of what's going on. So right now we have a dominant approach where you take statistical approximations of things, but you don't really understand them. So you know that bottles are correlated in your data with bottle caps, but you don't understand that there's a thread on the bottle cap that fits with the thread on the bottle, and that it tightens, and if I tighten it enough, there's a seal and the water can't come out. Like, there's no machine that understands that. And having a good cognitive model of that kind of everyday phenomena is what we call common sense. And if you had that, then a lot of these other things start to fall into at least a little bit better place.

So right now you're, like, learning correlations between pixels when you play a video game or something like that, and it doesn't work very well. It works when the video game is just the way that you studied it, and then you alter the video game in small ways, like you move the paddle in Breakout a few pixels, and the system falls apart. Because it doesn't understand; it doesn't have a representation of a paddle, a ball, a wall, a set of bricks, and so forth. And so it's reasoning at the wrong level.
So, the idea of common sense. You've worked on it, but it's nevertheless full of mystery. What does common sense mean? What does knowledge mean? The way you've been discussing it now is very intuitive. It makes a lot of sense that that is something we should have, and something deep learning systems don't have. But the argument could be that we're oversimplifying the notion of common sense, because that's just how it feels like we as humans, at the cognitive level, approach problems.
A lot of people aren't actually gonna read my book, but if they did read the book, one of the things that might come as a surprise to them is that we actually say common sense is really hard and really complicated. So my critics know that I like common sense, but that chapter actually starts by us beating up, not on deep learning, but kind of on our own home team, as it were. So Ernie and I are, first and foremost, people who believe in at least some of what good old-fashioned AI tried to do. So we believe in symbols and logic and programming; things like that are important. And we go through why even those tools that we hold fairly dear aren't really enough.

So we talk about why common sense is actually many things, and some of them fit really well with those classical sets of tools. So, things like taxonomy. So I know that a bottle is an object, or it's a vessel, let's say, and I know a vessel is an object, and objects are material things in the physical world. So I can make some inferences: if I know that vessels need to not have holes in them in order to carry their contents, then I can infer that a bottle shouldn't have a hole in it in order to carry its contents. So you can do hierarchical inference and so forth. And we say that's great, but it's only a tiny piece of what you need for common sense. And we give lots of examples that don't fit into that.
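To make the kind of taxonomic, hierarchical inference described above concrete, here is a minimal sketch in the spirit of classical symbolic AI. The is-a hierarchy and the attached properties are illustrative assumptions, not drawn from any real knowledge base:

```python
# Minimal sketch of taxonomy-based (is-a) inference in the classical symbolic style.
# The hierarchy and the property rules are illustrative assumptions, not a real KB.

ISA = {
    "bottle": "vessel",
    "vessel": "object",
    "object": "material thing",
}

PROPERTIES = {
    "vessel": ["must not have holes, in order to carry its contents"],
    "object": ["occupies space", "is subject to gravity"],
}

def ancestors(concept):
    """Walk up the is-a chain from a concept toward the root."""
    while concept in ISA:
        concept = ISA[concept]
        yield concept

def inherited_properties(concept):
    """A concept inherits the properties of everything above it in the taxonomy."""
    props = list(PROPERTIES.get(concept, []))
    for parent in ancestors(concept):
        props.extend(PROPERTIES.get(parent, []))
    return props

# One hand-coded fact about vessels propagates to bottles automatically.
print(inherited_properties("bottle"))
```

One hand-coded fact about vessels propagates to every kind of vessel for free, which is exactly the strength of the classical toolkit, and, as the cheese grater example below shows, also its limit.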
So another one that we talk about is a cheese grater. You've got holes in a cheese grater, you've got a handle on top. You can build a model, in the game-engine sense of a model, so that you could have a little cartoon character flying around through the holes of the grater. But we don't have a system yet, and taxonomy doesn't help us that much, that really understands why the handle is on top, and what you do with the handle, or why all of those circles are sharp, or how you'd hold the cheese with respect to the grater in order to make it actually work.
Do you think these ideas are just abstractions that could emerge on a system like a very large deep neural network?

I'm a skeptic that that kind of emergence per se can work. So I think that deep learning might play a role in the systems that do what I want systems to do, but it won't do it by itself. I've never seen a deep learning system really extract an abstract concept. There are principled reasons for that, stemming from how back propagation works and how the architectures are set up.

One example is, deep learning people actually all build in something called convolution, which Yann LeCun is famous for, which is an abstraction; they don't have their systems learn it. So the abstraction is: an object looks the same if it appears in different places. And what LeCun figured out, and essentially why he was a co-winner of the Turing Award, was that if you program this in innately, then your system is a whole lot more efficient. In principle, this should be learnable, but people don't have systems that kind of reify things and make them more abstract. And so what you'd really wind up with, if you don't program that in advance, is a system that kind of realizes that this is the same thing as this, but then I take your little clock there and I move it over, and it doesn't realize that the same thing applies to the clock. So you're right that convolution is just one of those things; it's an innate feature that's programmed in by the human expert. But we need more of those, not less.
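The abstraction in question, that a detector should respond identically wherever a pattern appears, is easy to see in code. Below is a toy one-dimensional convolution, a sketch for illustration rather than anyone's production implementation:

```python
# Toy 1-D convolution: one shared kernel is slid across every position, so the
# detector responds the same way wherever the pattern appears (weight sharing).
import numpy as np

kernel = np.array([1.0, -1.0])  # a single shared "edge detector"

def conv1d(signal, kernel):
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel for i in range(len(signal) - k + 1)])

a = np.array([0, 0, 1, 0, 0, 0, 0], dtype=float)  # a blip near the left
b = np.array([0, 0, 0, 0, 1, 0, 0], dtype=float)  # the same blip, shifted right

print(conv1d(a, kernel))  # [ 0. -1.  1.  0.  0.  0.]
print(conv1d(b, kernel))  # [ 0.  0.  0. -1.  1.  0.] -- same response, shifted
```

Translation invariance is built in by the weight sharing, not learned from data, which is the sense in which convolution is an innate prior rather than an emergent abstraction.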
But the nice feature is, it feels like that requires coming up with a brilliant idea that can get you a Turing Award, but it requires less effort than encoding, and we'll talk about expert systems, encoding a lot of knowledge by hand. So it feels like, one, there's a huge amount of limitations, which you clearly outline, with deep learning, but the nice feature of deep learning is that, whatever it is able to accomplish, it does a lot of stuff automatically, without human intervention.
Well, and that's part of why people love it, right? But I always think of this quote from Bertrand Russell, which is that it has all the advantages of theft over honest toil. It's really hard to program into a machine a notion of causality, or, you know, even how a bottle works, or what containers are. Ernie Davis and I wrote a, I don't know, 45-page academic paper trying just to understand what a container is, which I don't think anybody ever read. But it's a very detailed analysis of all the things, not even all, some of the things you need to do in order to understand a container. And, you know, I'm a co-author on the paper, I made it a little bit better, but Ernie did the hard work for that particular paper, and it took him like three months to get the logical statements correct.

And maybe that's not the right way to do it. It's a way to do it. But on that way of doing it, it's really hard work to do something as simple as understanding containers, and nobody wants to do that hard work. Even Ernie didn't want to do that hard work. Everybody would rather just, like, feed their system with a bunch of videos of a bunch of containers and have the system infer how containers work. It would be, like, so much less effort: let the machine do the work. And so I understand the impulse, I understand why people want to do that. I just don't think that it works. I've never seen anybody build a system that, in a robust way, can actually watch videos and predict exactly, you know, which containers would leak and which ones wouldn't, or something like that. And I know someone's gonna go out and do that since I said it, and I look forward to seeing it, but getting these things to work robustly is really, really hard.
So Yann LeCun, who was my colleague at NYU for many years, thinks that the hard work should go into defining an unsupervised learning algorithm that will watch videos, using the next frame, basically, in order to tell it what's going on. And he thinks that's the royal road, and he's willing to put in the work in devising that algorithm; then he wants the machine to do the rest. And again, I understand the impulse. My intuition, based on years of watching this stuff, and making predictions 20 years ago that still hold even though there's a lot more computation and so forth, is that we actually have to do a different kind of hard work, which is more like building a design specification for what we want the system to do, doing hard engineering work to figure out how we do things like what Yann did for convolution, in order to figure out how to encode complex knowledge. The current systems don't have that much knowledge, other than convolution, which is, again, this, you know, object appearing in different places and having the same perception, I guess I'll say. People don't wanna do that work. They don't see how to naturally fit one with the other.
Yes, absolutely. But also, on the expert system side, there's a temptation to go too far the other way, of just having an expert sort of sit down and encode the description, the framework, for what a container is, and then having the system reason for the rest. In my view, like, one really exciting possibility is active learning, where there's continuous interaction between a human and machine: the machine does a kind of deep learning type extraction of information from data, patterns and so on, but humans also guide the learning procedure, guiding both the process and the framework of how the machine learns, whatever the task is.
I was with you with almost everything you said, except the phrase 'deep learning.' What I think you really want there is a new form of machine learning. So let's remember, deep learning is a particular way of doing machine learning. Most often it's done with supervised data for perceptual categories. There are other things you can do with deep learning, some of them quite technical, but the standard use of deep learning is: I have a lot of examples, and I have labels for them. So, here are pictures: this one's the Eiffel Tower, this one's the Sears Tower, this one's the Empire State Building, this one's a pig, and so forth. You just get millions of examples, millions of labels, and deep learning is extremely good at that. It's better than any other solution that anybody has devised. But it is not good at representing abstract knowledge. It's not good at representing things like 'bottles contain liquid and have tops to them,' and so forth. It's not very good at learning or representing that kind of knowledge. It is an example of having a machine learn something, but it's a machine that learns a particular kind of thing, which is object classification. It's not a particularly good algorithm for learning about the abstractions that govern our world. There may be such a thing; part of what we counsel in the book is, maybe people should be working on devising such things.
So one possibility, just, I wonder what you think about it: deep neural networks do form abstractions, but they're not accessible to us humans, in terms of, we can't...

There's some truth in that.

So is it possible that either current or future neural networks form very high-level abstractions which are as powerful as our human abstractions of common sense, and we just can't get a hold of them? And so the problem is essentially that we need to make them explainable?

This is an astute question, but I think the answer is at least partly no.
One of the kinds of classical neural network architectures is what we call an autoassociator. It just tries to take an input, goes through a set of hidden layers, and comes out with an output, and it's supposed to learn essentially the identity function: your input is the same as your output. So think of it as binary numbers: you've got, like, the one, the two, the four, the eight, the sixteen, and so forth, and so if you want to input 24, you turn on the 16, you turn on the 8; it's like binary one, one, and a bunch of zeros.

So I did some experiments in 1998 with the precursors of contemporary deep learning, and what I showed was, you could train these networks on all the even numbers, and they would never generalize to the odd numbers. A lot of people thought that I was, I don't know, an idiot, or faking the experiment, or it wasn't true, or whatever, but it is true that, with this class of networks that we had in that day, they would never, ever make this generalization. And it's not that the networks were stupid; it's that they see the world in a different way than we do. They were basically concerned with: what is the probability that the rightmost output node is going to be a one? And as far as they were concerned, in everything that they'd ever been trained on, it was a zero; that node had never been turned on. And so they figured, why turn it on now?
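Here is a toy reconstruction of that failure mode, a sketch under stated assumptions rather than Marcus's original 1998 setup: a single sigmoid layer trained to reproduce only even binary numbers never learns to turn on the ones bit.

```python
# Toy version of the identity-learning experiment: train only on even numbers,
# so the rightmost (ones) bit is 0 in every input and every target.
import numpy as np

rng = np.random.default_rng(0)

def to_bits(n, width=5):
    """Most significant bit first, so the last index is the ones bit."""
    return np.array([(n >> i) & 1 for i in range(width - 1, -1, -1)], float)

X = np.array([to_bits(n) for n in range(0, 32, 2)])  # even numbers only
Y = X.copy()                                         # identity target

W = rng.normal(0, 0.1, (5, 5))                       # one sigmoid layer, for brevity
for _ in range(2000):                                # gradient descent on squared error
    pred = 1 / (1 + np.exp(-(X @ W)))
    W -= 0.5 * X.T @ ((pred - Y) * pred * (1 - pred)) / len(X)

out = 1 / (1 + np.exp(-(to_bits(21) @ W)))           # 21 = 10101, an odd number
print(out.round(2))                                  # the ones-bit output stays near 0
```

Because the ones bit is zero in every training input, the weights from that input never receive any gradient, and because its target is zero in every training example, the network learns to keep that output off; nothing in the training signal says 'apply the identity function to the new bit too.'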
Whereas a person would look at the same problem and say, well, it's obvious, we're just doing the thing that corresponds. The Latin for it is mutatis mutandis: we change what needs to be changed. And we do this; this is what algebra is. So I can do x equals y plus two, and I can do it for a couple of values. I can tell you, if y is three, then x is five, and if y is four, x is six. And now I can do it with some totally different number, like a million, and you can say, well, obviously it's a million and two, because you have an algebraic operation that you're applying to a variable. And deep learning systems kind of emulate that, but they don't actually do it.
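The contrast is almost trivially small when written out, a sketch with illustrative names only:

```python
# An operation over a variable generalizes to any value by definition;
# a purely correlative store of seen input-output pairs does not.

def f(y):                      # the algebraic rule: x = y + 2
    return y + 2

trained = {3: 5, 4: 6}         # "training data": the pairs actually observed

print(f(1_000_000))            # 1000002 -- the rule extrapolates for free
print(trained.get(1_000_000))  # None -- nothing was stored for this input
```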
In the particular example, you could fudge a solution to that particular problem. The general form of the problem remains: what they learn is really correlations between different input and output nodes. And they're complex correlations, with multiple nodes involved and so forth, but ultimately they're correlative. They're not structured as these operations over variables.

Now, someday people may do a new form of deep learning that incorporates that stuff, and I think it will help a lot. And there's some tentative work on things like differentiable programming right now that falls into that category. But the sort of classic stuff, like people use for ImageNet, doesn't have it. And you have people like Hinton going around saying, symbol manipulation, like what Marcus, what I, advocate, is like the gasoline engine, and we should just use this cool electric power that we've got with deep learning. And that's really destructive, because we really do need to have the gasoline engine stuff that represents, I mean, I don't think it's a good analogy, but we really do need to have the stuff that represents symbols.
Yeah, and Hinton as well would say that we need to throw out everything and start over. So, I mean, there is a question...

Yeah, Hinton said that to Axios, and I had a friend who interviewed him and tried to pin him down on what exactly we need to throw out, and he was very evasive.

Well, of course, because if he knew, he'd throw it out himself.

But I mean, he can't have it both ways. He can't be like, I don't know what to throw out, but I am gonna throw out the symbols. And not just the symbols, but the variables and the operations over variables. Don't forget the operations over variables, the stuff that I'm endorsing, and which John McCarthy did when he founded AI. That stuff is the stuff that we build most computers out of.
There are people now who say, we don't need computer programmers anymore. They're not quite looking at the statistics of how much computer programmers actually get paid right now. We need lots of computer programmers, and most of them do a little bit of machine learning, but they still write a lot of code, right? Code where it's like, if the value of X is greater than the value of Y, then do this kind of thing: conditionals and comparison operations over variables. Like, there's this fantasy that you can machine-learn anything. There are some things you would never wanna machine-learn. I would not use a phone operating system that was machine-learned. Like, you made a bunch of phone calls, and you recorded which packets were transmitted, and you just machine-learned it? It'd be insane. Or to build a web browser by taking logs of keystrokes and images, screenshots, and then trying to learn the relation between them. Nobody would ever, no rational person would ever, try to build a browser that way. They would use symbol manipulation, the stuff that I think AI needs to avail itself of, in addition to deep learning.
Can you describe your view of symbol manipulation in its early days? Can you describe expert systems, and where do you think they hit a wall, or a set of challenges?

Sure. So, I mean, first I just wanna clarify: I'm not endorsing expert systems per se.

You've been kind of contrasting them.

There is a contrast, but that's not the thing that I'm endorsing. So expert systems tried to capture things like medical knowledge with a large set of rules. So, if the patient has this symptom and this other symptom, then it is likely that they have this disease. So there are logical rules, and they were symbol-manipulating rules, of just the sort that I'm talking about.
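A minimal sketch of what such a rule base looks like, with placeholder rules invented for illustration rather than real medical knowledge:

```python
# 1980s-style expert system in miniature: hand-written if-then rules over
# symptoms, fired by simple forward chaining. The rules are illustrative only.

RULES = [
    ({"fever", "cough", "shortness of breath"}, "pneumonia is likely"),
    ({"fever", "stiff neck", "headache"}, "meningitis is likely"),
]

def diagnose(symptoms):
    """Fire every rule whose conditions are all present in the symptom set."""
    return [conclusion for conditions, conclusion in RULES
            if conditions <= symptoms]

print(diagnose({"fever", "cough", "shortness of breath", "fatigue"}))
# -> ['pneumonia is likely']
```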
And the problem...

They encode a set of knowledge that the experts then put in.

And very explicitly so. So you'd have somebody interview an expert and then try to turn that stuff into rules. And at some level I'm arguing for rules, but the difference is, what those guys did in the 80s was almost entirely rules, almost entirely handwritten, with no machine learning. What a lot of people are doing now is almost entirely one species of machine learning. And what I'm counseling is actually a hybrid. I'm saying that both of these things have their advantages. So if you're talking about perceptual classification, how do I recognize a bottle, deep learning is the best tool we've got right now. If you're talking about making inferences about what a bottle does, something closer to the expert systems is probably still the best available alternative. And probably we want something that is better able to handle quantitative and statistical information than those classical systems typically were. So we need new technologies that are gonna draw on some of the strengths of both the expert systems and the deep learning, but are gonna find new ways to synthesize them.
How hard do you think it is to add knowledge at the low level?
link |
So mine human intellects to add extra information
link |
to symbol manipulating systems.
link |
In some domains, it's not that hard,
link |
but it's often really hard.
link |
Partly because a lot of the things that are important,
link |
people wouldn't bother to tell you.
link |
So if you pay someone on Amazon Mechanical Turk
link |
to tell you stuff about bottles,
link |
they probably won't even bother to tell you
link |
some of the basic level stuff
link |
that's just so obvious to a human being
link |
and yet so hard to capture in machines.
link |
You know, they're gonna tell you more exotic things
link |
and like they're all well and good,
link |
but they're not getting to the root of the problem.
link |
So untutored humans aren't very good at knowing
link |
and why should they be,
link |
what kind of knowledge the computer system developers need.
link |
I don't think that that's an irremediable problem.
link |
I think it's historically been a problem.
link |
People have had crowdsourcing efforts
link |
and they don't work that well.
link |
There's one at MIT,
link |
where we're recording this, called Virtual Home,
link |
and we talk about this in the book.
link |
You can find the exact example there,
link |
but people were asked to do things
link |
like describe an exercise routine.
link |
And the things that the people describe
link |
are very low level and don't really capture what's going on.
link |
So they're like, go to the room with the television
link |
and the weights, turn on the television,
link |
press the remote to turn on the television,
link |
lift weight, put weight down,
link |
it's like very micro level.
link |
And it's not telling you what an exercise routine
link |
is really about, which is like,
link |
I wanna fit a certain number of exercises
link |
in a certain time period,
link |
I wanna emphasize these muscles.
link |
You want some kind of abstract description.
link |
The fact that you happen to press the remote control
link |
in this room when you watch this television
link |
isn't really the essence of the exercise routine,
link |
but if you just ask people, what did you do?
link |
Then they give you this fine grain detail.
link |
And so it takes a certain level of expertise
link |
about how the AI works in order to craft
link |
the right kind of knowledge.
link |
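The gap between what people report and what a system needs can be made concrete with a small sketch; both representations below are invented examples, not actual Virtual Home data.

    # What an untutored crowd worker tends to give you:
    micro_level = [
        "go to the room with the television and the weights",
        "press the remote to turn on the television",
        "lift weight", "put weight down",
    ]

    # The abstract description a reasoning system actually needs:
    exercise_routine = {
        "goal": "fitness",
        "constraints": {"duration_minutes": 30},
        "emphasize": ["arms", "back"],
        # The remote and the specific room are incidental details,
        # not part of what an exercise routine is really about.
    }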
So there's this ocean of knowledge
link |
that we all operate on.
link |
Some of it may not even be conscious,
link |
or at least we're not able to communicate it effectively.
link |
Yeah, most of it we would recognize if somebody said it,
link |
if it was true or not,
link |
but we wouldn't think to say that it's true or not.
link |
It's a really interesting mathematical property.
link |
This ocean has the property that every piece
link |
of knowledge in it,
link |
we will recognize it as true if we're told,
link |
but we're unlikely to retrieve it unprompted.
link |
So that interesting property,
link |
I would say there's a huge ocean of that knowledge.
link |
What's your intuition?
link |
Is it accessible to AI systems somehow?
link |
Can we, so you said,
link |
I mean, most of it is not,
link |
well, I'll give you an asterisk on this in a second,
link |
but most of it has never been encoded
link |
in machine interpretable form.
link |
And so, I mean, if you say accessible,
link |
there's two meanings of that.
link |
One is like, could you build it into a machine?
link |
The other is like, is there some database
link |
that we could go download and stick into our machine?
link |
But the first thing, no.
link |
So what's your intuition?
link |
I think it hasn't been done right.
link |
The closest, and this is the asterisk,
link |
is the CYC system, which tried to do this.
link |
A lot of logicians worked for Doug Lenat
link |
for 30 years on this project.
link |
I think they stuck too closely to logic,
link |
didn't represent enough about probabilities,
link |
tried to hand code it, there are various issues,
link |
and it hasn't been that successful.
link |
That is the closest existing system
link |
to trying to encode this.
link |
Why do you think there's not more excitement
link |
slash money behind this idea currently?
link |
There was, people view that project as a failure.
link |
I think that they confused the failure of a specific instance
link |
that was conceived 30 years ago for the failure of an approach,
link |
which they don't do for deep learning.
link |
So in 2010, people had the same attitude towards deep learning.
link |
They're like, this stuff doesn't really work.
link |
And all these other algorithms work better and so forth.
link |
And then certain key technical advances were made.
link |
But mostly, it was the advent of graphics processing units
link |
that changed that.
link |
It wasn't even anything foundational in the techniques.
link |
And there were some new tricks.
link |
But mostly, it was just more compute and more data,
link |
things like ImageNet that didn't exist before,
link |
that allowed deep learning to work.
link |
It could be that CYC just needs a few more things,
link |
or something like CYC.
link |
But the widespread view is that that just doesn't work.
link |
And people are reasoning from a single example.
link |
They don't do that with deep learning.
link |
They don't say that nothing that existed in 2010,
link |
and there were many, many efforts in deep learning then,
link |
was really worth anything.
link |
I mean, really, there's no model from 2010
link |
in deep learning that has any commercial value whatsoever.
link |
They're all failures.
link |
But that doesn't mean that there wasn't anything there.
link |
I have a friend who I was getting to know,
link |
and I was talking about how I had a new company.
link |
And he said, I had a company, too, and it failed.
link |
And I said, well, what did you do?
link |
And he said, deep learning.
link |
And the problem was he did it in 1986
link |
or 1990, something like that.
link |
We didn't have the tools then, not the algorithms.
link |
His algorithms weren't that different from other algorithms.
link |
But he didn't have the GPUs to run it fast enough.
link |
He didn't have the data.
link |
It could be that symbol manipulation, per se,
link |
with modern amounts of data and compute,
link |
and maybe some advances suited to that kind of computation, could work.
link |
My perspective on it is not that we
link |
want to resuscitate that stuff, per se,
link |
but we want to borrow lessons from it, bring together
link |
with other things that we've learned.
link |
And it might have an ImageNet moment where it will spark
link |
the world's imagination.
link |
And there will be an explosion of symbol manipulation efforts.
link |
Yeah, I think that people at AI2, the Paul Allen AI Institute,
link |
are trying to build data sets that, well,
link |
they're not doing it for quite the reason that you say,
link |
but they're trying to build data sets that at least
link |
spark interest in common sense reasoning.
link |
To create benchmarks that get people thinking.
link |
Benchmarks for common sense, that's
link |
a large part of what AI2.org is working on right now.
link |
So speaking of compute, Rich Sutton
link |
wrote a blog post titled The Bitter Lesson.
link |
I don't know if you've read it, but he said that the biggest
link |
lesson that can be read from 70 years of AI research
link |
is that general methods that leverage computation
link |
are ultimately the most effective.
link |
Do you think that?
link |
The most effective of what?
link |
So they have been most effective for perceptual classification
link |
problems and for some reinforcement learning problems.
link |
He works on reinforcement learning.
link |
Well, no, let me push back on that.
link |
You're actually absolutely right.
link |
But I would also say they've been most effective generally
link |
because everything we've done up to this point.
link |
Would you argue against that?
link |
To me, deep learning is the first thing
link |
that has been successful at anything in AI.
link |
And you're pointing out that this success is very limited,
link |
But has there been something truly successful
link |
before deep learning?
link |
I want to make a larger point.
link |
But on the narrower point, classical AI
link |
is used, for example, in doing navigation instructions.
link |
It's very successful.
link |
Everybody on the planet uses it now, like multiple times a day.
link |
That's a measure of success, right?
link |
So I don't think classical AI was wildly successful.
link |
But there are cases like that that is used all the time.
link |
Nobody even notices them because they're so pervasive.
link |
So there are some successes for classical AI.
link |
I think deep learning has been more successful.
link |
But my usual line about this, and I didn't invent it,
link |
but I like it a lot, is just because you
link |
can build a better ladder doesn't mean
link |
you can build a ladder to the moon.
link |
So the bitter lesson is if you have a perceptual classification
link |
problem, throwing a lot of data at it
link |
is better than anything else.
link |
But that has not given us any material progress
link |
in natural language understanding,
link |
common sense reasoning like a robot would
link |
need to navigate a home.
link |
Problems like that, there is no actual progress there.
link |
So flip side of that, if we remove data from the picture,
link |
another bitter lesson is that you just have a very simple
link |
algorithm and you wait for compute to scale.
link |
This doesn't have to be learning.
link |
It doesn't have to be deep learning.
link |
It doesn't have to be data driven,
link |
but just wait for the compute.
link |
So my question for you, do you think
link |
compute can unlock some of the things
link |
with either deep learning or symbol manipulation?
link |
Sure, but I'll put a proviso on that.
link |
More compute's always better, like nobody's
link |
going to argue with more compute.
link |
It's like having more money.
link |
I mean, there are diminishing returns on more money,
link |
but nobody's going to argue if you
link |
want to give them more money, right?
link |
Except maybe the people who signed the giving pledge,
link |
and some of them have a problem.
link |
They have more money to give away
link |
than they're able to.
link |
But the rest of us, if you want to give me more money, fine.
link |
They say more money, more problems, but OK.
link |
What I would say to you is your brain uses like 20 watts,
link |
and it does a lot of things that deep learning doesn't do,
link |
or that symbol manipulation doesn't do,
link |
that AI just hasn't figured out how to do.
link |
So it's an existence proof that you
link |
don't need server resources that are Google scale in order
link |
to have an intelligence.
link |
I built, with a lot of help from my wife,
link |
two intelligences that are 20 watts each
link |
and far exceed anything that anybody else has built out of silicon.
link |
Speaking of those two robots, what
link |
have you learned about AI from having?
link |
Well, they're not robots, but.
link |
Sorry, intelligent agents.
link |
There's two intelligent agents.
link |
I've learned a lot by watching my two intelligent agents.
link |
I think that what's fundamentally interesting,
link |
well, one of the many things that's fundamentally interesting
link |
about them is the way that they set their own problems.
link |
So my two kids are a year and a half apart.
link |
They're five and six and a half.
link |
They play together all the time, and they're constantly
link |
creating new challenges.
link |
Like that's what they do, is they make up games,
link |
and they're like, well, what if this, or what if that,
link |
or what if I had this superpower,
link |
or what if you could walk through this wall.
link |
So they're doing these what if scenarios all the time.
link |
And that's how they learn something about the world
link |
and grow their minds, and machines don't really do that.
link |
So that's interesting.
link |
And you've talked about this, you've written about it,
link |
you thought about it, nature versus nurture.
link |
So what innate knowledge do you think we're born with?
link |
And what do we learn along the way
link |
in those early months and years?
link |
Can I just say how much I like that question?
link |
You phrased it just right, and almost nobody ever does.
link |
Which is what is the innate knowledge
link |
and what's learned along the way.
link |
So many people dichotomize it,
link |
and they think it's nature versus nurture.
link |
When it obviously has to be nature and nurture,
link |
they have to work together.
link |
You can't learn the stuff along the way
link |
unless you have some innate stuff.
link |
But just because you have the innate stuff
link |
doesn't mean you don't learn anything.
link |
And so many people get that wrong, including in the field.
link |
Like people think, if I work in machine learning,
link |
the learning side, I must not be allowed to work
link |
on the innate side, because that would be cheating.
link |
Exactly, people have said that to me.
link |
And it's just absurd.
link |
But you could break that apart more.
link |
I've talked to folks who studied
link |
the development of the brain.
link |
And I mean, the growth of the brain
link |
in the first few days, in the first few months,
link |
in the womb, all of that, is that innate?
link |
So that process of development from a stem cell
link |
to the growth, the central nervous system and so on,
link |
to the information that's encoded
link |
through the long arc of evolution.
link |
So all of that comes into play and it's unclear.
link |
It's not just whether it's a dichotomy or not.
link |
It's where most of the knowledge is encoded.
link |
So what's your intuition about the innate knowledge,
link |
the power of it, what's contained in it?
link |
What can we learn from it?
link |
One of my earlier books was actually
link |
trying to understand the biology of this.
link |
The book was called The Birth of the Mind.
link |
Like how is it that genes even build innate knowledge?
link |
And from the perspective of the conversation
link |
we're having today, there's actually two questions.
link |
One is what innate knowledge or mechanisms
link |
people or other animals might be endowed with.
link |
I always like showing this video
link |
of a baby ibex climbing down a mountain.
link |
That baby ibex, a few hours after its birth,
link |
knows how to climb down a mountain.
link |
That means that it knows, not consciously,
link |
something about its own body and physics
link |
and 3D geometry and all of this kind of stuff.
link |
So there's one question about like what does biology
link |
give its creatures and what has evolved in our brains?
link |
How is that represented in our brains?
link |
That's the question I thought about in the book,
link |
The Birth of the Mind.
link |
And then there's a question of what AI should have.
link |
And they don't have to be the same.
link |
But I would say that it's a pretty interesting set
link |
of things that we are equipped with
link |
that allows us to do a lot of interesting things.
link |
So I would argue or guess based on my reading
link |
of the developmental psychology literature,
link |
which I've also participated in,
link |
that children are born with a notion of space, time,
link |
other agents, places,
link |
and also this kind of mental algebra
link |
that I was describing before.
link |
And notions of causation, if I didn't just say that.
link |
So at least those kinds of things.
link |
They're like frameworks for learning the other things.
link |
So are they disjoint in your view?
link |
Or is it just somehow all connected?
link |
You've talked a lot about language.
link |
Is it all kind of connected in some mesh
link |
that's language like, of understanding concepts altogether?
link |
Or I don't think we know for people
link |
how they're represented, and machines
link |
just don't really do this yet.
link |
So I think it's an interesting open question
link |
both for science and for engineering.
link |
Some of it has to be at least interrelated
link |
in the way that the interfaces of a software package
link |
have to be able to talk to one another.
link |
So the systems that represent space and time
link |
can't be totally disjoint
link |
because a lot of the things that we reason about
link |
are the relations between space and time and cause.
link |
So I put this on and I have expectations
link |
about what's gonna happen with the bottle cap
link |
on top of the bottle.
link |
And those span space and time.
link |
If the cap is over here, I get a different outcome.
link |
If the timing is different, if I put this here
link |
after I move that, then I get a different outcome
link |
that relates to causality.
link |
So obviously these mechanisms, whatever they are,
link |
can certainly communicate with each other.
link |
So I think evolution had a significant role
link |
to play in the development of this whole collage, right?
link |
How efficient do you think is evolution?
link |
Oh, it's terribly inefficient, except that.
link |
Well, can we do better?
link |
Well, let's come to that in a second.
link |
It's inefficient except that once it gets a good idea, it runs with it.
link |
So it took, I guess a billion years,
link |
roughly a billion years to evolve to a vertebrate brain plan.
link |
And once that vertebrate plan evolved,
link |
it spread everywhere.
link |
So fish have it and dogs have it and we have it.
link |
We have adaptations of it and specializations of it.
link |
And the same thing with a primate brain plan.
link |
So monkeys have it and apes have it and we have it.
link |
So there are additional innovations like color vision
link |
and those spread really rapidly.
link |
So it takes evolution a long time to get a good idea,
link |
but being anthropomorphic and not literal here,
link |
but once it has that idea, so to speak,
link |
which cashes out into a set of genes in the genome,
link |
those genes spread very rapidly
link |
and they're like subroutines or libraries,
link |
I guess the word people might use nowadays
link |
or be more familiar with,
link |
they're libraries that can get used over and over again.
link |
So once you have the library for building something
link |
with multiple digits, you can use it for a hand,
link |
but you can also use it for a foot.
link |
You just kind of reuse the library
link |
with slightly different parameters.
link |
Evolution does a lot of that,
link |
which means that the speed over time picks up.
link |
So evolution can happen faster
link |
because you have bigger and bigger libraries.
link |
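The subroutine analogy translates directly into code; a toy sketch, with purely illustrative parameters standing in for a conserved developmental program.

    # Once evolution "has the library," it reuses it with new parameters.
    def build_appendage(num_digits, size):
        # Stand-in for a conserved developmental program.
        return {"digits": num_digits, "size": size}

    hand = build_appendage(num_digits=5, size="small")  # one call site
    foot = build_appendage(num_digits=5, size="large")  # reused, re-parameterized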
And what I think has happened in attempts
link |
at evolutionary computation is that people start
link |
with libraries that are very, very minimal,
link |
like almost nothing and then progress is slow
link |
and it's hard for someone to get a good PhD thesis
link |
out of it and they give up.
link |
If we had richer libraries to begin with,
link |
if you were evolving from systems
link |
that had innate structure to begin with,
link |
then things might speed up.
link |
Or more PhD students.
link |
If the evolutionary process indeed,
link |
in a meta way, runs away with good ideas,
link |
you need to have a large pool of ideas
link |
in order for it to discover one that it can run away with.
link |
And PhD students representing individual ideas as well.
link |
Yeah, I mean, you could throw a billion PhD students at it.
link |
Yeah, the monkeys at typewriters with Shakespeare, yep.
link |
Well, I mean, those aren't cumulative, right?
link |
That's just random.
link |
And part of the point that I'm making
link |
is that evolution is cumulative.
link |
So if you have a billion monkeys independently,
link |
you don't really get anywhere.
link |
But if you have a billion monkeys,
link |
and I think Dawkins made this point originally,
link |
or probably other people,
link |
Dawkins made it very nicely,
link |
in either The Selfish Gene or The Blind Watchmaker.
link |
If there is some sort of fitness function
link |
that can drive you towards something,
link |
I guess that's Dawkins' point.
link |
And my point, which is a little variation on that,
link |
is that if the evolution is cumulative,
link |
and that's the related point, then you can start going faster.
link |
Do you think something like the process of evolution
link |
is required to build intelligent systems?
link |
So all the stuff that evolution did,
link |
a good engineer might be able to do.
link |
So for example, evolution made quadrupeds,
link |
which distribute the load across a horizontal surface.
link |
A good engineer could come up with that idea.
link |
I mean, sometimes good engineers come up with ideas
link |
by looking at biology.
link |
There's lots of ways to get your ideas.
link |
And part of what I'm suggesting
link |
is we should look at biology a lot more.
link |
We should look at the biology of thought
link |
and understanding the biology
link |
by which creatures intuitively reason about physics
link |
or other agents or like,
link |
how do dogs reason about people?
link |
Like they're actually pretty good at it.
link |
If we could understand...
link |
At my college, we joked, dognition.
link |
If we could understand dognition well,
link |
then how it was implemented,
link |
that might help us with our AI.
link |
So do you think it's possible
link |
that the kind of timescale that evolution took
link |
is the kind of timescale that will be needed
link |
to build intelligent systems?
link |
Or can we significantly accelerate that process
link |
inside a computer?
link |
I mean, I think the way that we accelerate that process
link |
is we borrow from biology.
link |
Not slavishly, but I think we look at how biology
link |
has solved problems and we say,
link |
does that inspire any engineering solutions here?
link |
And try to mimic biological systems
link |
and then therefore have a shortcut?
link |
Yeah, I mean, there's a field called biomimicry
link |
and people do that for like material science all the time.
link |
We should be doing the analog of that for AI.
link |
And the analog for that for AI
link |
is to look at cognitive science
link |
or the cognitive sciences,
link |
which is psychology, maybe neuroscience, linguistics
link |
and so forth, look to those for insight.
link |
What do you think is a good test of intelligence?
link |
I don't think there's one good test.
link |
In fact, I tried to organize a movement
link |
towards something called a Turing Olympics.
link |
And my hope is that Francois is actually gonna take,
link |
Francois Chollet is gonna take over this.
link |
I think he's interested in that.
link |
I just don't have a place in my busy life at this moment.
link |
But the notion is that there'd be many tests
link |
and not just one because intelligence is multifaceted.
link |
There can't really be a single measure of it
link |
because it isn't a single thing.
link |
Like just the crudest level,
link |
the SAT has a verbal component and a math component
link |
because they're not identical.
link |
And Howard Gardner has talked about multiple intelligences,
link |
like kinesthetic intelligence
link |
and verbal intelligence and so forth.
link |
There are a lot of things that go into intelligence
link |
and people can get good at one or the other.
link |
I mean, in some sense, like every expert
link |
has developed a very specific kind of intelligence.
link |
And then there are people that are generalists.
link |
And I think of myself as a generalist
link |
with respect to cognitive science,
link |
which doesn't mean I know anything about quantum mechanics,
link |
but I know a lot about the different facets of the mind.
link |
And there's a kind of intelligence
link |
to thinking about intelligence.
link |
I like to think that I have some of that,
link |
but social intelligence, I'm just okay.
link |
There are people that are much better at that than I am.
link |
Sure, but what would be really impressive to you?
link |
I think the idea of a Turing Olympics is really interesting,
link |
especially if somebody like Francois is running it.
link |
But to you in general, not as a benchmark,
link |
but if you saw an AI system being able to accomplish something
link |
that would impress the heck out of you,
link |
what would that thing be?
link |
Would it be natural language conversation?
link |
For me personally, I would like to see a kind of comprehension
link |
that relates to what you just said.
link |
So I wrote a piece in the New Yorker in I think 2015,
link |
right after Eugene Goostman, which was a software package,
link |
won a version of the Turing test.
link |
And the way that it did this is,
link |
well, the way you win the Turing test,
link |
so called win it, is you fool a person
link |
into thinking that a machine is a person.
link |
You're evasive, you pretend to have limitations
link |
so you don't have to answer certain questions and so forth.
link |
So this particular system
link |
pretended to be a 13 year old boy from Odessa
link |
who didn't understand English well and was kind of sarcastic
link |
and wouldn't answer your questions and so forth.
link |
And so judges got fooled, thinking briefly,
link |
with very little exposure, that it was a 13 year old boy.
link |
And it ducked all the questions Turing was actually
link |
interested in, which is like,
link |
how do you make the machine actually intelligent?
link |
So that test itself is not that good.
link |
And so in the New Yorker, I proposed an alternative, I guess.
link |
And the one that I proposed there was a comprehension test.
link |
And I must like Breaking Bad,
link |
because I've already given you one Breaking Bad example
link |
and in that article I have one as well,
link |
which was something like,
link |
you should be able to watch an episode of Breaking Bad
link |
or maybe you have to watch the whole series
link |
to be able to answer the question and say,
link |
if Walter White took a hit out on Jesse,
link |
why did he do that?
link |
So if you could answer kind of arbitrary questions
link |
about characters motivations,
link |
I would be really impressed with that.
link |
I mean, if we built software to do that,
link |
it could watch a film, or there are different versions.
link |
And so ultimately I wrote this up with Praveen Paritosh
link |
in a special issue of AI magazine
link |
that basically was about the Turing Olympics.
link |
There were like 14 tests proposed.
link |
The one that I was pushing was a comprehension challenge
link |
and Praveen who's at Google was trying to figure out
link |
like how we would actually run it.
link |
And so we wrote a paper together.
link |
And you could have a text version too,
link |
or you could have an auditory podcast version,
link |
you could have a written version.
link |
But the point is that you win at this test
link |
if you can do let's say human level or better than humans
link |
at answering kind of arbitrary questions.
link |
You know, why did this person pick up the stone?
link |
What were they thinking when they picked up the stone?
link |
Were they trying to knock down glass?
link |
And I mean, ideally these wouldn't be multiple choice either
link |
because multiple choice is pretty easily gamed.
link |
So if you could have relatively open ended questions
link |
and you can answer why people are doing this stuff,
link |
I would be very impressed.
link |
And of course humans can do this, right?
link |
If you watch a well constructed movie
link |
and somebody picks up a rock,
link |
everybody watching the movie knows
link |
why they picked up the rock, right?
link |
They all know, oh my gosh, he's gonna hit this character
link |
We have an example in the book about
link |
when a whole bunch of people say, I am Spartacus,
link |
you know this famous scene?
link |
The viewers understand, first of all,
link |
that everybody or everybody minus one has to be lying.
link |
They can't all be Spartacus.
link |
We have enough common sense knowledge
link |
to know they couldn't all have the same name.
link |
We know that they're lying
link |
and we can infer why they're lying, right?
link |
They're lying to protect someone
link |
and to protect things they believe in.
link |
You get a machine that can do that.
link |
It can say, this is why these guys all got up
link |
and said, I am Spartacus.
link |
I will sit down and say AI has really achieved a lot.
link |
Without cheating any part of the system.
link |
Yeah, I mean, if you do it,
link |
there are lots of ways you can cheat.
link |
Like you could build a Spartacus machine
link |
that works on that film.
link |
Like that's not what I'm talking about.
link |
I'm talking about, you can do this
link |
with essentially arbitrary films from a large set.
link |
Even beyond films because it's possible
link |
such a system would discover
link |
that the number of narrative arcs in film
link |
is limited to like 19 or 30.
link |
There's a famous thing about the classic seven plots.
link |
I don't care if you want to build that into the system,
link |
boy meets girl, boy loses girl, boy finds girl.
link |
I don't mind having some head start knowledge.
link |
I mean, you could build it in natively,
link |
or you could have your system watch a lot of films again.
link |
If you can do this at all,
link |
but with a wide range of films,
link |
not just one film and one genre.
link |
But even if you could do it for all Westerns,
link |
I'd be reasonably impressed.
link |
So in terms of being impressed,
link |
just for the fun of it,
link |
because you've put so many interesting ideas out there
link |
as a challenge to the community for further steps,
link |
is it possible on the deep learning front
link |
that you're wrong about its limitations,
link |
that deep learning will unlock,
link |
that Yann LeCun next year will publish a paper
link |
that achieves this comprehension.
link |
So do you think that way often as a scientist,
link |
do you consider that your intuition
link |
that deep learning could actually run away with it?
link |
I'm more worried about rebranding
link |
as a kind of political thing.
link |
So I mean, what's gonna happen, I think,
link |
is that deep learning is gonna start to encompass
link |
symbol manipulation.
link |
So I think Hinton's just wrong.
link |
Hinton says we don't want hybrids.
link |
I think people will work towards hybrids
link |
and they will relabel their hybrids as deep learning.
link |
We've already seen some of that.
link |
So AlphaGo is often described as a deep learning system,
link |
but it's more correctly described as a system
link |
that has deep learning, but also Monte Carlo Tree Search,
link |
which is a classical AI technique.
link |
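Since AlphaGo is the canonical example of such a hybrid, a toy sketch of the combination may help: Monte Carlo Tree Search on an invented mini-game, guided by a "learned" move prior that is stubbed out as uniform here. This is a minimal illustration of the idea, not AlphaGo's actual algorithm, which uses trained policy and value networks.

    import math, random

    # Toy single-player game: start at 0, add 1 or 2 each move,
    # win (reward 1) by landing exactly on 10, lose by overshooting.
    GOAL, ACTIONS = 10, (1, 2)

    def is_terminal(s): return s >= GOAL
    def reward(s): return 1.0 if s == GOAL else 0.0

    def policy_prior(state, action):
        # Stand-in for a deep network's move prior; uniform here.
        return 1.0 / len(ACTIONS)

    class Node:
        def __init__(self, state):
            self.state, self.children = state, {}
            self.visits, self.value = 0, 0.0

    def select_action(node, c=1.4):
        # PUCT-style selection: value estimate plus prior-weighted bonus.
        def score(a):
            child = node.children[a]
            q = child.value / child.visits if child.visits else 0.0
            u = c * policy_prior(node.state, a) \
                * math.sqrt(node.visits + 1) / (1 + child.visits)
            return q + u
        return max(node.children, key=score)

    def rollout(s):
        # Random simulation to a terminal state: the Monte Carlo part.
        while not is_terminal(s):
            s += random.choice(ACTIONS)
        return reward(s)

    def mcts(root_state, iterations=2000):
        root = Node(root_state)
        for _ in range(iterations):
            node, path = root, [root]
            # Selection and expansion down the tree.
            while not is_terminal(node.state):
                if not node.children:
                    node.children = {a: Node(node.state + a) for a in ACTIONS}
                node = node.children[select_action(node)]
                path.append(node)
                if node.visits == 0:
                    break
            value = rollout(node.state)
            # Backpropagate the simulated outcome.
            for n in path:
                n.visits += 1
                n.value += value
        return max(root.children, key=lambda a: root.children[a].visits)

    print(mcts(0))  # most-visited first move toward reaching exactly 10

The search (classical AI) does the lookahead; the prior (where deep learning plugs in) biases which branches get explored.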
And people will start to blur the lines
link |
in the way that IBM blurred Watson.
link |
First Watson meant this particular system
link |
and then it was just anything that IBM built
link |
in their cognitive division.
link |
But purely, let me ask.
link |
That's a branding question and that's a giant mess.
link |
I mean purely a single neural network
link |
being able to accomplish reasoning and comprehension.
link |
I don't stay up at night
link |
worrying that that's gonna happen.
link |
And I'll just give you two examples.
link |
One is a guy at DeepMind
link |
thought he had finally outfoxed me, Zergylord,
link |
I think is his Twitter handle.
link |
And he specifically made an example.
link |
Marcus said that such and such, he fed it into GPT2,
link |
which is the AI system that is so smart
link |
that OpenAI couldn't release it
link |
because it would destroy the world, right?
link |
You remember that a few months ago.
link |
So he feeds it into GPT2 and my example was something
link |
like a rose is a rose, a tulip is a tulip,
link |
a lily is a blank.
link |
And he got it to actually do that,
link |
which was a little bit impressive.
link |
And I wrote back and I said, that's impressive,
link |
but can I ask you a few questions?
link |
I said, was that just one example?
link |
Can it do it generally?
link |
And can it do it with novel words?
link |
Which is part of what I was talking about in 1998
link |
when I first raised the example.
link |
So a DAX is a DAX, right?
link |
And he sheepishly wrote back about 20 minutes later
link |
and the answer was, well, it had some problems with those.
link |
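The point about novel words can be made concrete with a minimal sketch: a symbolic rule with a variable, roughly "an X is an X", applies to any binding of X, including a made-up word like dax, while a mapping memorized from a fixed training set (invented here) does not.

    # The mental algebra: a rule over a variable X.
    def identity_rule(word):
        # "A(n) X is a(n) X" holds for any X, even one never seen before.
        return f"a {word} is a {word}"

    # A purely memorized mapping, standing in for pattern learning
    # over a fixed training set (invented examples).
    memorized = {"rose": "a rose is a rose",
                 "tulip": "a tulip is a tulip"}

    print(identity_rule("dax"))   # generalizes: "a dax is a dax"
    print(memorized.get("dax"))   # fails on the novel word: None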
So I made some predictions 21 years ago
link |
that still hold in the world of computer science.
link |
That's amazing, right?
link |
Because there's a thousand or a million times more memory
link |
and computation is a million times faster,
link |
you can do a million times more operations per second,
link |
spread across a cluster and there's been advances
link |
in replacing sigmoids with other functions and so forth.
link |
There's all kinds of advances,
link |
but the fundamental architecture hasn't changed
link |
and the fundamental limit hasn't changed.
link |
And what I said then is kind of still true.
link |
And then here's a second example.
link |
I recently had a piece in Wired that's adapted from the book
link |
and the book, it went to press before GPT2 came out.
link |
But we describe this children's story
link |
and all the inferences that you make in this story
link |
about a boy finding a lost wallet.
link |
And for fun in the Wired piece, we ran it through GPT2,
link |
GPT2, at something called TalkToTransformer.com,
link |
and your viewers can try this experiment themselves,
link |
go to the Wired piece that has the link and it has the story.
link |
And the system made perfectly fluent text
link |
that was totally inconsistent
link |
with the conceptual underpinnings of the story, right?
link |
And this is what, again, I predicted in 1998
link |
and for that matter, Chomsky and Miller
link |
made the same prediction in 1963.
link |
I was just updating their claim for slightly newer technology.
link |
So those particular architectures
link |
that don't have any built in knowledge,
link |
they're basically just a bunch of layers
link |
doing correlational stuff,
link |
they're not gonna solve these problems.
link |
So 20 years ago, you said the emperor has no clothes.
link |
Today, the emperor still has no clothes.
link |
The lighting's better though.
link |
The lighting is better.
link |
And I think you yourself are also, I mean.
link |
And we found out some things to do with naked emperors.
link |
I mean, it's not like the stuff is worthless.
link |
I mean, they're not really naked.
link |
It's more like they're in their briefs
link |
and everybody thinks that.
link |
And so like, I mean, they are great at speech recognition.
link |
But the problems that I said were hard are still hard.
link |
I didn't literally say the emperor has no clothes.
link |
I said, this is a set of problems
link |
that humans are really good at.
link |
And it wasn't couched as AI,
link |
it was couched as cognitive science.
link |
But I said, if you wanna build a neural model
link |
of how humans do certain class of things,
link |
you're gonna have to change the architecture.
link |
And I stand by those claims.
link |
So, and I think people should understand
link |
you're quite entertaining in your cynicism,
link |
but you're also very optimistic and a dreamer
link |
about the future of AI too.
link |
So you're both, it's just.
link |
There's a famous saying about
link |
people overselling technology in the short run
link |
and underselling it in the long run.
link |
And so I actually end the book,
link |
Ernie Davis and I end our book with an optimistic chapter,
link |
which kind of killed Ernie
link |
because he's even more pessimistic than I am.
link |
He describes me as a contrarian and him as a pessimist.
link |
But I persuaded him that we should end the book
link |
with a look at what would happen
link |
if AI really did incorporate, for example,
link |
the common sense reasoning and the nativism
link |
and so forth, the things that we counseled for.
link |
And we wrote it and it's an optimistic chapter
link |
that AI suitably reconstructed so that we could trust it,
link |
which we can't now, could really be world changing.
link |
So on that point, if you look at the future
link |
trajectories of AI, people have worries
link |
about negative effects of AI,
link |
whether it's at the large existential scale
link |
or smaller short term scale of negative impact on society.
link |
So you write about trustworthy AI,
link |
how can we build AI systems that align with our values
link |
that make for a better world
link |
that we can interact with that we can trust?
link |
The first thing we have to do
link |
is to replace deep learning with deep understanding.
link |
So you can't have alignment with a system
link |
that traffics only in correlations
link |
and doesn't understand concepts like bottles or harm.
link |
So, you know, Asimov talked about these famous laws
link |
and the first one was first do no harm.
link |
And you can quibble about the details of Asimov's laws,
link |
but we have to, if we're gonna build real robots
link |
in the real world, have something like that.
link |
That means we have to program in a notion
link |
that's at least something like harm.
link |
That means we have to have these more abstract ideas
link |
that deep learning is not particularly good at.
link |
They have to be in the mix somewhere.
link |
And you could do statistical analysis
link |
about probabilities of given harms or whatever,
link |
but you have to know what a harm is
link |
in the same way that you have to understand
link |
that a bottle isn't just a collection of pixels.
link |
And also be able to, you're implying
link |
that you need to also be able to communicate that to humans.
link |
So the AI systems would be able to prove to humans
link |
that they understand that they know what harm means.
link |
I might run it in the reverse direction,
link |
but roughly speaking, I agree with you.
link |
So we probably need to have committees of wise people,
link |
ethicists and so forth, think about what these rules
link |
ought to be, and we shouldn't just leave it
link |
to software engineers.
link |
It shouldn't just be software engineers
link |
and it shouldn't just be people
link |
who own large mega corporations that are good at technology.
link |
Ethicists and so forth should be involved,
link |
but there should be some assembly of wise people
link |
as I was putting it that tries to figure out
link |
what the rules ought to be.
link |
And those have to get translated into code.
link |
You can argue whether it's code or neural networks or something.
link |
They have to be translated into something
link |
that machines can work with.
link |
And that means there has to be a way
link |
of working the translation.
link |
And right now we don't.
link |
We don't have a way.
link |
So let's say you and I were the committee
link |
and we decide that Asimov's first law is actually right.
link |
And let's say it's not just two white guys,
link |
which would be kind of unfortunate
link |
and that we have a broad,
link |
representative sample of the world,
link |
or however we want to do this.
link |
And the committee decides eventually,
link |
okay, Asimov's first law is actually pretty good.
link |
There are these exceptions to it.
link |
We want to program in these exceptions,
link |
but let's start with just the first one
link |
and then we'll get to the exceptions.
link |
First one is first do no harm.
link |
Well, somebody has to now actually turn that
link |
into a computer program or a neural network or something.
link |
And one way of taking the whole book,
link |
the whole argument that I'm making
link |
is that we just don't know how to do that yet
link |
and we're fooling ourselves if we think
link |
that we can build trustworthy AI.
link |
If we can't even specify it in any kind of way,
link |
we can't do it in Python
link |
and we can't do it in TensorFlow,
link |
we're fooling ourselves in thinking
link |
that we can make trustworthy AI
link |
if we can't translate harm into something
link |
that we can execute.
link |
And if we can't, then we should be thinking really hard,
link |
how could we ever do such a thing?
link |
Because if we're going to use AI
link |
in the ways that we want to use it, for job interviews
link |
or to do surveillance,
link |
not that I personally want to do that or whatever,
link |
I mean, if we're going to use AI
link |
in ways that have practical impact on people's lives
link |
or medicine, it's got to be able to understand stuff like that.
link |
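The specification problem being described shows up as soon as you try to write the first line. A hypothetical first attempt in Python might look like this; every name is invented, and the point is that the hard part, the harm predicate itself, is exactly what nobody currently knows how to implement.

    def causes_harm(action, world_state):
        # This is the part we do not know how to write.
        # Deep learning gives correlations over pixels and tokens,
        # not an abstract concept of harm to reason with.
        raise NotImplementedError("no one can write this function yet")

    def first_law_filter(proposed_actions, world_state):
        # Asimov-style constraint: only permit actions that do no harm.
        return [a for a in proposed_actions
                if not causes_harm(a, world_state)]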
So one of the things your book highlights
link |
is that a lot of people in the deep learning community,
link |
but also the general public, politicians,
link |
just people in all general groups and walks of life
link |
have different levels of misunderstanding of AI.
link |
So when you talk about committees,
link |
what's your advice to our society?
link |
How do we learn about AI such that
link |
such committees could emerge
link |
where large groups of people could have a productive discourse
link |
about how to build successful AI systems?
link |
Part of the reason we wrote the book
link |
was to try to inform those committees.
link |
So part of the reason we wrote the book
link |
was to inspire a future generation of students
link |
to solve what we think are the important problems.
link |
So a lot of the book is trying to pinpoint
link |
what we think are the hard problems
link |
where we think effort would most be rewarded.
link |
And part of it is to try to train people
link |
who talk about AI, but aren't experts in the field
link |
to understand what's realistic and what's not.
link |
One of my favorite parts in the book
link |
is the six questions you should ask
link |
anytime you read a media account.
link |
So number one is, if somebody talks about something,
link |
look for the demo.
link |
If there's no demo, don't believe it.
link |
Like the demo that you can try.
link |
If you can't try it at home,
link |
maybe it doesn't really work that well yet.
link |
So we don't have this example in the book,
link |
but if Sundar Pichai says we have this thing
link |
that allows it to sound like human beings in conversation,
link |
you should ask, can I try it?
link |
And you should ask how general it is.
link |
And it turns out at that time,
link |
I'm alluding to Google Duplex when it was announced,
link |
it only worked on calling hairdressers,
link |
restaurants, and finding opening hours.
link |
That's not very general.
link |
And I'm not gonna ask your thoughts about Sophia, but yeah.
link |
I understand that's a really good question to ask
link |
of any kind of hyped up idea.
link |
So Sophia has very good material written for her,
link |
but she doesn't understand the things that she's saying.
link |
So a while ago, you've written a book
link |
on the science of learning, which I think is fascinating,
link |
with the learning case study of playing guitar.
link |
I love guitar myself, I've been playing my whole life.
link |
So let me ask a very important question.
link |
What is your favorite song, rock song to listen to, or to play?
link |
Well, those would be different,
link |
but I'll say that my favorite rock song to listen to
link |
is probably All Along the Watchtower,
link |
the Jimi Hendrix version.
link |
The Jimi Hendrix version.
link |
It just feels magic to me.
link |
I've actually recently learned it, I love that song.
link |
I've been trying to put it on YouTube, myself singing.
link |
Singing is the scary part.
link |
If you could party with a rock star for a weekend,
link |
living or dead, who would you choose?
link |
And pick their mind,
link |
it's not necessarily about the party.
link |
Thanks for the clarification. I guess John Lennon
link |
is such an intriguing person and I think a troubled person,
link |
but an intriguing one.
link |
Well, Imagine is one of my favorite songs.
link |
Also one of my favorite songs.
link |
That's a beautiful way to end it.
link |
Gary, thank you so much for talking to me.
link |
Thanks so much for having me.