Tomaso Poggio: Brains, Minds, and Machines | Lex Fridman Podcast #13


link |
00:00:00.000
The following is a conversation with Tomaso Poggio.
link |
00:00:02.960
He's a professor at MIT and is a director of the Center
link |
00:00:06.200
for Brains, Minds, and Machines.
link |
00:00:08.360
Cited over 100,000 times, his work
link |
00:00:11.640
has had a profound impact on our understanding
link |
00:00:14.560
of the nature of intelligence in both biological
link |
00:00:17.680
and artificial neural networks.
link |
00:00:19.880
He has been an advisor to many highly impactful researchers
link |
00:00:23.840
and entrepreneurs in AI, including
link |
00:00:26.120
Demis Hassabis of DeepMind,
link |
00:00:28.000
Amnon Shashua of Mobileye, and Christof Koch
link |
00:00:31.200
of the Allen Institute for Brain Science.
link |
00:00:34.120
This conversation is part of the MIT course
link |
00:00:36.400
on artificial general intelligence
link |
00:00:38.120
and the artificial intelligence podcast.
link |
00:00:40.240
If you enjoy it, subscribe on YouTube, iTunes,
link |
00:00:42.760
or simply connect with me on Twitter
link |
00:00:44.600
at Lex Fridman, spelled F R I D.
link |
00:00:47.960
And now, here's my conversation with Tomaso Poggio.
link |
00:00:52.480
You've mentioned that in your childhood,
link |
00:00:54.520
you've developed a fascination with physics,
link |
00:00:56.960
especially the theory of relativity,
link |
00:00:59.720
and that Einstein was also a childhood hero to you.
link |
00:01:04.520
What aspect of Einstein's genius, the nature of his genius,
link |
00:01:09.040
do you think was essential
link |
00:01:10.200
for discovering the theory of relativity?
link |
00:01:12.960
You know, Einstein was a hero to me,
link |
00:01:15.960
and I'm sure to many people,
link |
00:01:17.200
because he was able to make, of course,
link |
00:01:21.680
a major, major contribution to physics
link |
00:01:25.200
with simplifying a bit,
link |
00:01:28.520
just a gedanken experiment, a thought experiment.
link |
00:01:35.200
You know, imagining communication with lights
link |
00:01:38.880
between a stationary observer and somebody on a train.
link |
00:01:43.240
And I thought, you know, the fact that just
link |
00:01:48.560
with the force of his thought, of his thinking, of his mind,
link |
00:01:52.720
it could get to something so deep
link |
00:01:55.640
in terms of physical reality,
link |
00:01:57.520
how time depends on space and speed.
link |
00:02:01.320
It was something absolutely fascinating.
link |
00:02:04.120
It was the power of intelligence,
link |
00:02:06.720
the power of the mind.
link |
00:02:08.440
Do you think the ability to imagine,
link |
00:02:11.120
to visualize as he did, as a lot of great physicists do,
link |
00:02:15.200
do you think that's in all of us human beings,
link |
00:02:18.640
or is there something special
link |
00:02:20.600
to that one particular human being?
link |
00:02:22.880
I think, you know, all of us can learn
link |
00:02:27.160
and have, in principle, similar breakthroughs.
link |
00:02:33.240
There is a lesson to be learned from Einstein.
link |
00:02:37.200
He was one of five PhD students at ETH,
link |
00:02:42.600
the Eidgenössische Technische Hochschule in Zurich, in physics.
link |
00:02:47.600
And he was the worst of the five.
link |
00:02:49.840
The only one who did not get an academic position
link |
00:02:53.600
when he graduated, when he finished his PhD,
link |
00:02:57.040
and he went to work, as everybody knows,
link |
00:03:00.000
for the patent office.
link |
00:03:01.720
So it's not so much that he worked for the patent office,
link |
00:03:05.000
but the fact that obviously he was smart,
link |
00:03:07.880
but he was not the top student,
link |
00:03:10.240
obviously he was the anti-conformist.
link |
00:03:12.640
He was not thinking in the traditional way
link |
00:03:15.720
that probably teachers and the other students were doing.
link |
00:03:18.760
So there is a lot to be said about trying to do the opposite
link |
00:03:25.960
or something quite different from what other people are doing.
link |
00:03:29.800
That's certainly true for the stock market.
link |
00:03:31.840
Never buy if everybody's buying it.
link |
00:03:35.800
And also true for science.
link |
00:03:37.440
Yes.
link |
00:03:38.440
So you've also mentioned staying on the theme of physics
link |
00:03:42.440
that you were excited at a young age
link |
00:03:46.440
by the mysteries of the universe that physics could uncover.
link |
00:03:50.440
Such as, I saw mentioned, the possibility of time travel.
link |
00:03:56.440
So out of the box question I think I'll get to ask today,
link |
00:03:59.440
do you think time travel is possible?
link |
00:04:02.440
Well, it would be nice if it were possible right now.
link |
00:04:05.440
In science you never say no.
link |
00:04:11.440
But your understanding of the nature of time.
link |
00:04:14.440
Yeah.
link |
00:04:15.440
It's very likely that it's not possible to travel in time.
link |
00:04:20.440
We may be able to travel forward in time.
link |
00:04:24.440
If we can, for instance, freeze ourselves
link |
00:04:28.440
or go on some spacecraft traveling close to the speed of light,
link |
00:04:34.440
but in terms of actively traveling, for instance, back in time,
link |
00:04:39.440
I find probably very unlikely.
link |
00:04:43.440
So do you still hold the underlying dream of engineering intelligence
link |
00:04:49.440
that will build systems that are able to do such huge leaps
link |
00:04:54.440
like discovering the kind of mechanism
link |
00:04:58.440
that would be required to travel through time?
link |
00:05:00.440
Do you still hold that dream?
link |
00:05:02.440
Or echoes of it from your childhood?
link |
00:05:05.440
Yeah.
link |
00:05:06.440
I think there are certain problems
link |
00:05:10.440
that probably cannot be solved,
link |
00:05:13.440
depending on what you believe about the physical reality.
link |
00:05:17.440
Maybe it's totally impossible to create energy from nothing
link |
00:05:23.440
or to travel back in time.
link |
00:05:26.440
But about making machines that can think as well as we do or better,
link |
00:05:35.440
or more likely, especially in the short and mid term,
link |
00:05:39.440
help us think better,
link |
00:05:41.440
which in a sense is happening already with the computers we have,
link |
00:05:45.440
and it will happen more and more.
link |
00:05:47.440
But that I certainly believe,
link |
00:05:49.440
and I don't see in principle why computers at some point
link |
00:05:53.440
could not become more intelligent than we are,
link |
00:05:59.440
although the word intelligence is a tricky one,
link |
00:06:03.440
and one should discuss what I mean by that.
link |
00:06:07.440
Intelligence, consciousness, words like love,
link |
00:06:12.440
all these need to be disentangled.
link |
00:06:16.440
So you've mentioned also that you believe the problem of intelligence
link |
00:06:20.440
is the greatest problem in science,
link |
00:06:23.440
greater than the origin of life and the origin of the universe.
link |
00:06:26.440
You've also, in a talk,
link |
00:06:29.440
said that you're open to arguments against you.
link |
00:06:34.440
So what do you think is the most captivating aspect
link |
00:06:40.440
of this problem of understanding the nature of intelligence?
link |
00:06:43.440
Why does it captivate you as it does?
link |
00:06:46.440
Well, originally, I think one of the motivations that I had as a teenager,
link |
00:06:54.440
when I was infatuated with the theory of relativity,
link |
00:06:58.440
was really that I found that there was the problem of time and space
link |
00:07:05.440
and general relativity,
link |
00:07:07.440
but there were so many other problems of the same level of difficulty
link |
00:07:12.440
and importance that I could, even if I were Einstein,
link |
00:07:16.440
it was difficult to hope to solve all of them.
link |
00:07:19.440
So what about solving a problem whose solution allowed me to solve all the problems?
link |
00:07:26.440
And this was what if we could find the key to an intelligence
link |
00:07:32.440
ten times better or faster than Einstein?
link |
00:07:36.440
So that's sort of seeing artificial intelligence
link |
00:07:39.440
as a tool to expand our capabilities.
link |
00:07:42.440
But is there just an inherent curiosity in you
link |
00:07:47.440
and just understanding what it is in here that makes it all work?
link |
00:07:53.440
Yes, absolutely. You're right.
link |
00:07:55.440
So I started saying this was the motivation when I was a teenager,
link |
00:08:00.440
but soon after, I think the problem of human intelligence
link |
00:08:06.440
became a real focus of my science and my research,
link |
00:08:14.440
because I think for me the most interesting problem is really asking who we are.
link |
00:08:27.440
It is asking not only a question about science,
link |
00:08:31.440
but even about the very tool we are using to do science, which is our brain.
link |
00:08:37.440
How does our brain work?
link |
00:08:39.440
Where does it come from?
link |
00:08:41.440
What are its limitations?
link |
00:08:43.440
Can we make it better?
link |
00:08:45.440
And that in many ways is the ultimate question
link |
00:08:49.440
that underlies this whole effort of science.
link |
00:08:53.440
So you've made significant contributions in both the science of intelligence
link |
00:08:58.440
and the engineering of intelligence.
link |
00:09:01.440
In a hypothetical way, let me ask,
link |
00:09:04.440
how far do you think we can get in creating intelligent systems
link |
00:09:08.440
without understanding the biological,
link |
00:09:11.440
the understanding how the human brain creates intelligence?
link |
00:09:15.440
Put another way, do you think we can build a strong AI system
link |
00:09:18.440
without really getting at the core, understanding the functional nature of the brain?
link |
00:09:24.440
Well, this is a real difficult question.
link |
00:09:28.440
We did solve problems like flying
link |
00:09:34.440
without really using too much our knowledge about how birds fly.
link |
00:09:43.440
It was important, I guess, to know that you could have things heavier than air
link |
00:09:51.440
being able to fly like birds.
link |
00:09:55.440
But beyond that, probably we did not learn very much.
link |
00:10:00.440
The Wright brothers did learn a lot from observing birds
link |
00:10:08.440
in designing their aircraft,
link |
00:10:12.440
but you can argue we did not use much of biology in that particular case.
link |
00:10:17.440
Now, in the case of intelligence, I think that it's a bit of a bet right now.
link |
00:10:28.440
If you ask, okay, we all agree we'll get at some point, maybe soon,
link |
00:10:36.440
maybe later, to a machine that is indistinguishable from my secretary
link |
00:10:42.440
in terms of what I can ask the machine to do.
link |
00:10:47.440
I think we'll get there and now the question is,
link |
00:10:50.440
you can ask people, do you think we'll get there without any knowledge about the human brain
link |
00:10:56.440
or the best way to get there is to understand better the human brain?
link |
00:11:02.440
This is, I think, an educated bet that different people with different backgrounds
link |
00:11:08.440
will decide in different ways.
link |
00:11:11.440
The recent history of the progress in AI in the last, I would say, five years
link |
00:11:17.440
or ten years has been that the main breakthroughs, the main recent breakthroughs,
link |
00:11:26.440
really start from neuroscience.
link |
00:11:31.440
I can mention reinforcement learning as one,
link |
00:11:35.440
which is one of the algorithms at the core of AlphaGo,
link |
00:11:41.440
which is the system that beat the kind of unofficial world champion of Go,
link |
00:11:46.440
Lee Sedol, two, three years ago in Seoul.
link |
00:11:52.440
That's one, and that started really with the work of Pavlov in 1900,
link |
00:12:00.440
Marvin Minsky in the 60s and many other neuroscientists later on.
link |
00:12:07.440
And deep learning started, which is the core again of AlphaGo
link |
00:12:13.440
and systems like autonomous driving systems for cars,
link |
00:12:19.440
like the systems that Mobileye, which is a company started by one of my ex-postdocs,
link |
00:12:25.440
Amnon Shashua, so that is the core of those things.
link |
00:12:30.440
And deep learning, really the initial ideas in terms of the architecture
link |
00:12:35.440
of these layered hierarchical networks started with work of Torsten Wiesel
link |
00:12:42.440
and David Hubel at Harvard up the river in the 60s.
link |
00:12:47.440
So recent history suggests that neuroscience played a big role in these breakthroughs.
link |
00:12:54.440
My personal bet is that there is a good chance they continue to play a big role,
link |
00:12:59.440
maybe not in all the future breakthroughs, but in some of them.
link |
00:13:03.440
At least in inspiration.
link |
00:13:05.440
At least in inspiration, absolutely, yes.
link |
00:13:07.440
So you studied both artificial and biological neural networks,
link |
00:13:12.440
you said these mechanisms that underlie deep learning and reinforcement learning,
link |
00:13:19.440
but there are nevertheless significant differences between biological and artificial neural networks
link |
00:13:25.440
as they stand now.
link |
00:13:27.440
So between the two, what do you find is the most interesting, mysterious,
link |
00:13:32.440
maybe even beautiful difference as it currently stands in our understanding?
link |
00:13:37.440
I must confess that until recently I found that the artificial networks
link |
00:13:44.440
were too simplistic relative to real neural networks.
link |
00:13:49.440
But, you know, recently I've started to think that, yes,
link |
00:13:54.440
they are a very big simplification of what you find in the brain.
link |
00:13:59.440
But on the other hand, they are much closer in terms of the architecture to the brain
link |
00:14:07.440
than other models that we had, that computer science used as models of thinking,
link |
00:14:13.440
or mathematical logic, you know, LISP, Prolog, and those kinds of things.
link |
00:14:19.440
So in comparison to those, they're much closer to the brain.
link |
00:14:23.440
You have networks of neurons, which is what the brain is about.
link |
00:14:28.440
The artificial neurons in the models are, as I said, caricatures of the biological neurons,
link |
00:14:35.440
but they're still neurons, single units communicating with other units,
link |
00:14:39.440
something that is absent in the traditional computer type models of mathematics, reasoning, and so on.
link |
00:14:50.440
So what aspect would you like to see in artificial neural networks added over time
link |
00:14:56.440
as we try to figure out ways to improve them?
link |
00:14:59.440
So one of the main differences, and, you know, problems, between deep learning today,
link |
00:15:10.440
and it's not only deep learning, and the brain is the need for deep learning techniques
link |
00:15:17.440
to have a lot of labeled examples.
link |
00:15:22.440
For instance, for ImageNet, you have a training set which is one million images, each one labeled by some human
link |
00:15:31.440
in terms of which object is there.
link |
00:15:34.440
And it's clear that in biology, a baby may be able to see a million images in the first years of life,
link |
00:15:46.440
but will not have a million labels given to him or her by parents or caretakers.
link |
00:15:56.440
So how do you solve that?
link |
00:15:59.440
You know, I think there is this interesting challenge that today, deep learning and related techniques
link |
00:16:07.440
are all about big data, big data meaning a lot of examples labeled by humans,
link |
00:16:18.440
whereas in nature you have...
link |
00:16:22.440
So this big data is n going to infinity, that's the best, you know, n meaning labeled data.
link |
00:16:29.440
But I think the biological world is more n going to 1.
link |
00:16:34.440
A child can learn from a very small number of labeled examples.
link |
00:16:42.440
Like you tell a child, this is a car, you don't need to say like in ImageNet, you know, this is a car, this is a car,
link |
00:16:49.440
this is not a car, this is not a car, one million times.
link |
00:16:53.440
And of course with AlphaGo and AlphaZero variants, because the world of Go is so simplistic that you can actually learn by yourself
link |
00:17:05.440
through self play, you can play against each other.
link |
00:17:08.440
And the real world, the visual system that you've studied extensively is a lot more complicated than the game of Go.
link |
00:17:15.440
On the comment about children, which are fascinatingly good at learning new stuff,
link |
00:17:22.440
how much of it do you think is hardware and how much of it is software?
link |
00:17:26.440
Yeah, that's a good and deep question. In a sense it is the old question of nurture and nature,
link |
00:17:32.440
how much is in the gene and how much is in the experience of an individual.
link |
00:17:40.440
Obviously, both play a role, and I believe that the way evolution puts prior information, so to speak, hardwired,
link |
00:17:55.440
it's not really hardwired, but that's essentially a hypothesis.
link |
00:18:02.440
I think what's going on is that evolution is almost necessarily, if you believe in Darwin, it's very opportunistic.
link |
00:18:14.440
And think about our DNA and the DNA of Drosophila.
link |
00:18:23.440
Our DNA does not have many more genes than Drosophila.
link |
00:18:28.440
The fly, the fruit fly.
link |
00:18:32.440
Now, we know that the fruit fly does not learn very much during its individual existence.
link |
00:18:39.440
It looks like one of those machines that is really mostly, not 100%, but 95% hardcoded by the genes.
link |
00:18:51.440
But since we don't have many more genes than Drosophila, evolution could encode in us a kind of general learning machinery
link |
00:19:02.440
and then had to give very weak priors.
link |
00:19:09.440
Like, for instance, let me give a specific example, which is recent work by a member of our Center for Brains, Minds and Machines.
link |
00:19:20.440
We know because of work of other people in our group and other groups that there are cells in a part of our brain, neurons, that are tuned to faces.
link |
00:19:30.440
They seem to be involved in face recognition.
link |
00:19:33.440
Now, this face area seems to be present in young children and adults.
link |
00:19:43.440
And one question is whether it is there from the beginning, hardwired by evolution, or is somehow learned very quickly.
link |
00:19:54.440
So what's your, by the way, a lot of the questions I'm asking, the answer is we don't really know,
link |
00:20:00.440
but as a person who has contributed some profound ideas in these fields, you're a good person to guess at some of these.
link |
00:20:08.440
So, of course, there's a caveat before a lot of the stuff we talk about, but what is your hunch?
link |
00:20:14.440
Is the face, the part of the brain that seems to be concentrated on face recognition, are you born with that?
link |
00:20:21.440
Or are you just designed to learn that quickly, like the face of the mother and so on?
link |
00:20:26.440
My hunch, my bias was the second one, learned very quickly and turns out that Marge Livingstone at Harvard has done some amazing experiments in which she raised baby monkeys,
link |
00:20:42.440
depriving them of faces during the first weeks of life.
link |
00:20:47.440
So they see technicians, but the technicians have a mask.
link |
00:20:52.440
Yes.
link |
00:20:54.440
And so when they looked at the area in the brain of these monkeys where usually you find faces, they found no face preference.
link |
00:21:10.440
So my guess is that what evolution does in this case is there is an area which is plastic, which is kind of predetermined to be imprinted very easily.
link |
00:21:26.440
But the command from the gene is not a detailed circuitry for a face template.
link |
00:21:31.440
Could be.
link |
00:21:33.440
But this will require probably a lot of bits.
link |
00:21:35.440
You would have to specify a lot of connections of a lot of neurons.
link |
00:21:39.440
Instead, the command from the gene is something like imprint, memorize what you see most often in the first two weeks of life, especially in connection with food and maybe nipples.
link |
00:21:53.440
I don't know.
link |
00:21:54.440
Right.
link |
00:21:55.440
Well, source of food.
link |
00:21:56.440
And so that area is very plastic at first and then it solidifies.
link |
00:22:00.440
It'd be interesting if a variant of that experiment would show a different kind of pattern associated with food than a face pattern, whether that could stick.
link |
00:22:10.440
There are indications that during that experiment, what the monkeys saw quite often were the blue gloves of the technicians that were giving the milk to the baby monkeys.
link |
00:22:25.440
And some of the cells instead of being face sensitive in that area are hand sensitive.
link |
00:22:33.440
That's fascinating.
link |
00:22:35.440
Can you talk about what are the different parts of the brain and in your view sort of loosely and how do they contribute to intelligence?
link |
00:22:45.440
Do you see the brain as a bunch of different modules and they together come in the human brain to create intelligence or is it all one mush of the same kind of fundamental architecture?
link |
00:23:04.440
Yeah, that's an important question and there was a phase in neuroscience back in the 1950s or so in which it was believed for a while that the brain was equipotential.
link |
00:23:21.440
This was the term.
link |
00:23:22.440
You could cut out a piece and nothing special happened, apart from a little bit less performance.
link |
00:23:31.440
There was a surgeon, Lashley, who did a lot of experiments of this type with mice and rats and concluded that every part of the brain was essentially equivalent to any other one.
link |
00:23:50.440
It turns out that that's really not true. There are very specific modules in the brain, as you said, and people may lose the ability to speak if they have a stroke in a certain region, or may lose control of their legs if it is in another region.
link |
00:24:12.440
So they're very specific. The brain is also quite flexible and redundant so often it can correct things and take over functions from one part of the brain to the other, but really there are specific modules.
link |
00:24:33.440
So the answer that we know from this old work, which was basically based on lesions, either on animals or very often there was a mine of very interesting data coming from the war, from different types of injuries that soldiers had in the brain.
link |
00:25:02.440
And more recently, functional MRI, which allows you to check which parts of the brain are active when you're doing different tasks, can replace some of this.
link |
00:25:23.440
You can see that certain parts of the brain are involved, are active in certain tasks.
link |
00:25:32.440
But sort of taking a step back to that part of the brain that specializes in the face and how that might be learned, what's your intuition? You know, is it possible, from a physicist's perspective, that when you get lower and lower it's all the same stuff, and when you're born it's plastic and it quickly figures out that this part is going to be about vision, this is going to be about language, this is about common sense reasoning?
link |
00:26:01.440
Do you have an intuition that that kind of learning is going on really quickly or is it really kind of solidified in hardware?
link |
00:26:09.440
That's a great question.
link |
00:26:10.440
So there are parts of the brain like the cerebellum or the hippocampus that are quite different from each other.
link |
00:26:21.440
They clearly have different anatomy, different connectivity.
link |
00:26:25.440
Then there is the cortex, which is the most developed part of the brain in humans.
link |
00:26:35.440
And in the cortex, you have different regions of the cortex that are responsible for vision, for audition, for motor control, for language.
link |
00:26:47.440
Now, one of the big puzzles of this is that in the cortex, it looks like it is the same in terms of hardware, in terms of type of neurons and connectivity across these different modalities.
link |
00:27:07.440
So for the cortex, setting aside these other parts of the brain like spinal cord, hippocampus, cerebellum and so on.
link |
00:27:17.440
For the cortex, I think your question about hardware and software and learning and so on, I think is rather open.
link |
00:27:28.440
And I find it very interesting for us to think about an architecture, computer architecture that is good for vision and at the same time is good for language.
link |
00:27:40.440
They seem to be such different problem areas that you have to solve.
link |
00:27:48.440
But the underlying mechanism might be the same and that's really instructive for artificial neural networks.
link |
00:27:54.440
So you've done a lot of great work in vision and human vision, computer vision.
link |
00:28:00.440
And you mentioned the problem of human vision is really as difficult as the problem of general intelligence.
link |
00:28:07.440
And maybe that connects to the cortex discussion.
link |
00:28:10.440
Can you describe the human visual cortex and how the humans begin to understand the world through the raw sensory information?
link |
00:28:21.440
What's for folks who are not familiar, especially on the computer vision side, we don't often actually take a step back except saying with a sentence or two that one is inspired by the other.
link |
00:28:36.440
What is it that we know about the human visual cortex?
link |
00:28:39.440
That's interesting.
link |
00:28:40.440
So we know quite a bit at the same time, we don't know a lot, but the bit we know, in a sense, we know a lot of the details and many we don't know.
link |
00:28:53.440
And we know a lot of the top level, the answer to the top level question, but we don't know some basic ones, even in terms of general neuroscience forgetting vision.
link |
00:29:05.440
You know, why do we sleep? It's such a basic question.
link |
00:29:11.440
And we really don't have an answer to that.
link |
00:29:14.440
So taking a step back on that. So sleep, for example, is fascinating.
link |
00:29:18.440
Do you think that's a neuroscience question?
link |
00:29:21.440
Or if we talk about abstractions, what do you think is an interesting way to study intelligence or most effective on the levels of abstraction?
link |
00:29:30.440
Is it chemical, is it biological, is it electrophysical, mathematical as you've done a lot of excellent work on that side?
link |
00:29:37.440
Which psychology, sort of like at which level of abstraction do you think?
link |
00:29:42.440
Well, in terms of levels of abstraction, I think we need all of them.
link |
00:29:48.440
It's one, you know, it's like if you ask me, what does it mean to understand a computer?
link |
00:29:56.440
That's much simpler. But in a computer, I could say, well, understand how to use PowerPoint.
link |
00:30:04.440
That's my level of understanding a computer. It's, it has reasonable, you know, it gives me some power to produce slides and beautiful slides.
link |
00:30:13.440
And now somebody else says, well, I know how the transistors work that are inside the computer. I can write the equations for, you know, transistors and diodes and circuits, logical circuits.
link |
00:30:28.440
And I can ask this guy, do you know how to operate PowerPoint? No idea.
link |
00:30:33.440
So do you think if we discovered computers walking amongst us full of these transistors that are also operating under Windows and have PowerPoint, do you think it's digging in a little bit more?
link |
00:30:49.440
How useful is it to understand the transistor in order to be able to understand PowerPoint in these higher level intelligence processes?
link |
00:31:00.440
So I think in the case of computers, because they were made by engineers, by us, these different levels of understanding are rather separate on purpose.
link |
00:31:12.440
You know, they are separate modules so that the engineer that designed the circuit for the chips does not need to know what is inside PowerPoint.
link |
00:31:23.440
And somebody can write the software translating from one to the other.
link |
00:31:30.440
So in that case, I don't think understanding the transistor helps you understand PowerPoint, or very little.
link |
00:31:40.440
If you want to understand the computer, this question, you know, I would say you have to understand it at different levels if you really want to build one.
link |
00:31:51.440
But for the brain, I think these levels of understanding, so the algorithms, which kind of computation, you know, the equivalent of PowerPoint, and the circuits, you know, the transistors, I think they are much more intertwined with each other.
link |
00:32:09.440
There is not, you know, a neat level of the software separate from the hardware.
link |
00:32:15.440
And so that's why I think in the case of the brain, the problem is more difficult and more than for computers requires the interaction, the collaboration between different types of expertise.
link |
00:32:29.440
So the brain is a big hierarchical mess that you can't just disentangle levels.
link |
00:32:35.440
I think you can, but it's much more difficult and it's not completely obvious.
link |
00:32:41.440
And, as I said, I think it is the greatest problem in science.
link |
00:32:47.440
So, you know, I think it's fair that it's difficult.
link |
00:32:52.440
That's a difficult one.
link |
00:32:53.440
That said, you do talk about compositionality and why it might be useful.
link |
00:32:58.440
And when you discuss why these neural networks in artificial or biological sense learn anything, you talk about compositionality.
link |
00:33:07.440
See, there's a sense that nature can be disentangled or well, all aspects of our cognition could be disentangled a little to some degree.
link |
00:33:22.440
So why do you think what, first of all, how do you see compositionality and why do you think it exists at all in nature?
link |
00:33:31.440
I spoke about, I use the term compositionality.
link |
00:33:39.440
When we looked at deep neural networks, multilayer networks, and tried to understand when and why they are more powerful than more classical one layer networks,
link |
00:33:54.440
like linear classifiers, kernel machines, so called.
link |
00:34:01.440
And what we found is that in terms of approximating or learning or representing a function, a mapping from an input to an output,
link |
00:34:12.440
like from an image to the label in the image, if this function has a particular structure,
link |
00:34:20.440
then deep networks are much more powerful than shallow networks to approximate the underlying function.
link |
00:34:28.440
And the particular structure is a structure of compositionality.
link |
00:34:33.440
If the function is made up of functions of functions, so that when you are interpreting an image,
link |
00:34:45.440
classifying an image, you don't need to look at all pixels at once, but you can compute something from small groups of pixels,
link |
00:34:56.440
and then you can compute something on the output of this local computation and so on.
link |
00:35:04.440
It is similar to what you do when you read a sentence, you don't need to read the first and the last letter,
link |
00:35:10.440
but you can read syllables, combine them in words, combine the words in sentences.
link |
00:35:17.440
So this is this kind of structure.
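As a rough illustration of the structure described here, the following is a minimal Python sketch of a compositional function: each unit looks only at two neighbouring values, and the outputs of one level become the inputs of the next, much like letters, syllables, words, and sentences. The function name `compose_hierarchically`, the pairwise grouping, and the toy "pixel" values are assumptions made for this example and do not come from the conversation.

```python
from typing import Callable, List, Sequence, Tuple


def compose_hierarchically(
    inputs: Sequence[float],
    local_fn: Callable[[float, float], float],
) -> Tuple[float, int]:
    """Combine inputs two at a time, level by level, like a binary tree.

    Returns the final value and the number of levels (the depth).
    """
    if not inputs:
        raise ValueError("need at least one input")
    level: List[float] = list(inputs)
    depth = 0
    while len(level) > 1:
        # Each unit sees only two neighbouring outputs from the level below.
        next_level = [
            local_fn(level[i], level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            # An unpaired last element is passed up unchanged.
            next_level.append(level[-1])
        level = next_level
        depth += 1
    return level[0], depth


# Hypothetical toy input: eight "pixel" values.
pixels = [3, 1, 4, 1, 5, 9, 2, 6]
brightest, depth = compose_hierarchically(pixels, max)
total, _ = compose_hierarchically(pixels, lambda a, b: a + b)
```

With eight inputs the computation needs three levels, and no single unit ever looks at all eight values at once, which is the property the passage is pointing at.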
link |
00:35:20.440
So that's part of a discussion of why deep neural networks may be more effective than shallow methods.
link |
00:35:27.440
And is it your sense that for most things we can use neural networks for,
link |
00:35:35.440
those problems are going to be compositional in nature, like language, like vision.
link |
00:35:43.440
How far can we get in this kind of way?
link |
00:35:47.440
So here it is almost philosophy.
link |
00:35:51.440
Well, let's go there.
link |
00:35:53.440
Yeah, let's go there.
link |
00:35:55.440
So a friend of mine, Max Tegmark, who is a physicist at MIT.
link |
00:36:00.440
I've talked to him on this thing.
link |
00:36:02.440
Yeah, and he disagrees with you, right?
link |
00:36:04.440
We agree on most, but the conclusion is a bit different.
link |
00:36:09.440
His conclusion is that for images, for instance,
link |
00:36:14.440
the compositional structure of the functions that we have to learn to solve these problems
link |
00:36:23.440
comes from physics, comes from the fact that you have local interactions in physics between atoms and other atoms,
link |
00:36:35.440
between particles of matter and other particles, between planets and other planets,
link |
00:36:42.440
between stars and others.
link |
00:36:44.440
It's all local.
link |
00:36:48.440
And that's true, but you could push this argument a bit further.
link |
00:36:55.440
Not this argument, actually.
link |
00:36:57.440
You could argue that, you know, maybe that's part of the truth,
link |
00:37:02.440
but maybe what happens is kind of the opposite,
link |
00:37:06.440
is that our brain is wired up as a deep network.
link |
00:37:11.440
So it can learn, understand, solve problems that have this compositional structure.
link |
00:37:22.440
And it cannot solve problems that don't have this compositional structure.
link |
00:37:29.440
So the problems we are accustomed to, we think about, we test our algorithms on,
link |
00:37:37.440
have this compositional structure because of the way our brain is made up.
link |
00:37:42.440
And that's, in a sense, an evolutionary perspective that we've...
link |
00:37:46.440
So the ones that weren't dealing with the compositional nature of reality died off?
link |
00:37:54.440
Yes, but it could also be that the reason why we have this local connectivity in the brain,
link |
00:38:05.440
like simple cells in cortex looking only at a small part of the image,
link |
00:38:10.440
each one of them, and then other cells looking at a small number of the simple cells and so on.
link |
00:38:16.440
The reason for this may be purely that it was difficult to grow long range connectivity.
link |
00:38:24.440
So suppose it's, you know, for biology, it's possible to grow short range connectivity,
link |
00:38:33.440
but not long range, also because there is a limited number of long range connections.
link |
00:38:39.440
And so you have this limitation from the biology.
link |
00:38:44.440
And this means you build a deep convolutional network.
link |
00:38:49.440
This would be something like a deep convolutional network.
link |
00:38:53.440
And this is great for solving a certain class of problems.
link |
00:38:57.440
These are the ones we find easy and important for our life.
link |
00:39:02.440
And yes, they were enough for us to survive.
link |
00:39:06.440
And you can start a successful business on solving those problems, as with Mobileye.
link |
00:39:13.440
Driving is a compositional problem.
link |
00:39:16.440
So on the learning task, we don't know much about how the brain learns in terms of optimization.
link |
00:39:25.440
So stochastic gradient descent is what artificial neural networks
link |
00:39:31.440
use for the most part to adjust the parameters in such a way that,
link |
00:39:38.440
based on the labeled data, it's able to solve the problem.
link |
00:39:42.440
So what's your intuition about why it works at all?
link |
00:39:49.440
How hard of a problem is it to optimize a neural network, an artificial neural network?
link |
00:39:55.440
Are there other alternatives?
link |
00:39:57.440
Just in general, what's your intuition behind this very simplistic algorithm
link |
00:40:03.440
that seems to do surprisingly well?
link |
00:40:05.440
Yes, yes.
link |
00:40:07.440
So I find that, in neuroscience, the architecture of cortex is really similar to the architecture of deep networks.
link |
00:40:16.440
So there is a nice correspondence there between the biology and this kind of local connectivity hierarchical
link |
00:40:26.440
architecture.
link |
00:40:28.440
The stochastic gradient descent, as you said, is a very simple technique.
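To make "very simple" concrete, here is a minimal sketch of stochastic gradient descent fitting a straight line to labeled examples, one example at a time. It is an illustration under stated assumptions, not the training procedure of any particular network: the function name `sgd_fit_line`, the learning rate, the number of epochs, and the toy labeling rule y = 2x + 1 are all chosen for this example.

```python
import random


def sgd_fit_line(xs, ys, lr=0.1, epochs=200, seed=0):
    """Fit y = w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # the "stochastic" part: visit examples in random order
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of error**2 with respect to w and b, for this one example.
            w -= lr * 2.0 * error * xs[i]
            b -= lr * 2.0 * error
    return w, b


# Hypothetical labeled data generated from the rule y = 2x + 1.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_fit_line(xs, ys)
```

On this noise-free toy data the recovered `w` and `b` end up very close to 2 and 1. The whole algorithm is the two update lines inside the inner loop.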
link |
00:40:35.440
It seems pretty unlikely that biology could do that from what we know right now about cortex and neurons and synapses.
link |
00:40:49.440
So it's a big open question whether there are other optimization learning algorithms
link |
00:40:58.440
that can replace stochastic gradient descent.
link |
00:41:02.440
And my guess is yes, but nobody has yet found a real answer.
link |
00:41:11.440
I mean, people are trying, still trying, and there are some interesting ideas.
link |
00:41:17.440
The fact that stochastic gradient descent is so successful, this has clearly become not so mysterious.
link |
00:41:27.440
And the reason, and it's an interesting fact, you know, is a change, in a sense, in how people think about statistics.
link |
00:41:39.440
And it is the following: typically, when you had data and you had, say, a model with parameters,
link |
00:41:51.440
you were trying to fit the model to the data, you know, to fit the parameters.
link |
00:41:55.440
And typically the kind of crowd wisdom idea was that you should have at least, you know, twice as many data points as parameters.
link |
00:42:12.440
Maybe 10 times is better.
link |
00:42:15.440
Now, the way you train neural networks these days is that they have 10 or 100 times more parameters than data.
link |
00:42:24.440
Exactly the opposite.
link |
00:42:26.440
Which, you know, has been one of the puzzles about neural networks.
link |
00:42:34.440
How can you get something that really works when you have so much freedom?
link |
00:42:40.440
From that little data you can generalize somehow.
link |
00:42:43.440
Right, exactly.
link |
00:42:44.440
Do you think the stochastic nature of it is essential, the randomness?
link |
00:42:48.440
I think we have some initial understanding why this happens, but one nice side effect of having this over parameterization, more parameters than data,
link |
00:43:00.440
is that when you look for the minima of a loss function, like stochastic gradient descent is doing,
link |
00:43:07.440
you find... I made some calculations based on an old basic theorem of algebra called Bézout's theorem.
link |
00:43:19.440
And that gives you an estimate of the number of solutions of a system of polynomial equations.
link |
00:43:25.440
Anyway, the bottom line is that there are probably more minima for a typical deep network than atoms in the universe.
link |
00:43:38.440
Just to say there are a lot because of the over parameterization.
link |
00:43:43.440
Yes.
link |
00:43:44.440
More global minima, zero minima, good minima?
link |
00:43:48.440
More global minima.
link |
00:43:51.440
Yes, a lot of them, so you have a lot of solutions, so it's not so surprising that you can find them relatively easily.
link |
00:44:00.440
This is because of the over parameterization.
link |
00:44:04.440
The over parameterization sprinkles that entire space with solutions that are pretty good.
link |
00:44:09.440
It's not so surprising, right?
link |
00:44:11.440
It's like if you have a system of linear equations and you have more unknowns than equations,
link |
00:44:17.440
then we know you have an infinite number of solutions and the question is to pick one.
link |
00:44:24.440
That's another story, but you have an infinite number of solutions,
link |
00:44:27.440
so there are a lot of values of your unknowns that satisfy the equations.
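The linear analogy can be checked on a tiny example. The sketch below uses a made-up system with two equations and three unknowns (x + y + z = 6 and x - y = 0), shows that a whole family of values satisfies it exactly, and then picks one by a criterion. The system, the helper names, and the choice of "smallest squared norm" as the selection rule are assumptions for illustration; the conversation only says that picking one is "another story".

```python
# Two equations, three unknowns: x + y + z = 6 and x - y = 0.
A = [[1, 1, 1],
     [1, -1, 0]]
b = [6, 0]


def residuals(x):
    """Left-hand side minus right-hand side for each equation."""
    return [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]


def solution(t):
    """One-parameter family of exact solutions: x = y = t, z = 6 - 2t."""
    return [t, t, 6 - 2 * t]


def squared_norm(x):
    return sum(xi * xi for xi in x)


# Every member of the family solves the system exactly.
candidates = [solution(t) for t in range(-5, 6)]
all_exact = all(residuals(x) == [0, 0] for x in candidates)

# One possible way to "pick one": the candidate with the smallest norm.
smallest = min(candidates, key=squared_norm)
```

Only eleven members of the family are enumerated here, but `t` can be any real number, so the set of exact solutions is infinite. Among the enumerated candidates the smallest-norm one is `[2, 2, 2]`.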
link |
00:44:32.440
But it's possible that there's a lot of those solutions that aren't very good.
link |
00:44:37.440
What's surprising is that they're pretty good.
link |
00:44:38.440
So that's a separate question.
link |
00:44:39.440
Why can you pick one that generalizes well?
link |
00:44:43.440
That's a separate question with separate answers.
link |
00:44:46.440
One theorem that people like to talk about that inspires imagination of the power of neural networks
link |
00:44:53.440
is the universal approximation theorem that you can approximate any computable function
link |
00:45:00.440
with just a finite number of neurons and a single hidden layer.
link |
00:45:04.440
Do you find this theorem, one, surprising?
link |
00:45:07.440
Do you find it useful, interesting, inspiring?
link |
00:45:12.440
No, this one, I never found it very surprising.
link |
00:45:16.440
It was known since the 80s, since I entered the field,
link |
00:45:22.440
because it's basically the same as the Weierstrass theorem,
link |
00:45:27.440
which says that I can approximate any continuous function with a polynomial of sufficiently,
link |
00:45:34.440
with a sufficient number of terms, monomials.
link |
00:45:37.440
It's basically the same, and the proofs are very similar.
link |
00:45:41.440
So your intuition was there was never any doubt that neural networks in theory could be very strong approximations.
link |
00:45:48.440
The interesting question is this: the theorem says you can approximate, fine,
link |
00:45:58.440
but when you ask how many neurons, for instance, or in the case of polynomials, how many monomials,
link |
00:46:06.440
I need to get a good approximation.
link |
00:46:11.440
Then it turns out that that depends on the dimensionality of your function, how many variables you have.
link |
00:46:20.440
But it depends on the dimensionality of your function in a bad way.
link |
00:46:25.440
For instance, suppose you want an error which is no worse than 10% in your approximation.
link |
00:46:35.440
If you want to approximate your function within 10%,
link |
00:46:40.440
then it turns out that the number of units you need is on the order of 10 to the dimensionality, d.
link |
00:46:48.440
How many variables?
link |
00:46:50.440
So if you have two variables, d is 2 and you need 100 units, and OK.
link |
00:46:57.440
But if you have, say, 200 by 200 pixel images,
link |
00:47:02.440
now this is 40,000, whatever.
link |
00:47:06.440
We again go to the size of the universe pretty quickly.
link |
00:47:09.440
Exactly, 10 to the 40,000 or something.
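The arithmetic in this exchange is easy to reproduce. The sketch below takes the rule of thumb stated above, roughly 10 units per dimension for a 10% error, so 10 to the d units in total, and evaluates it for 2 variables and for a 200 by 200 pixel image. The function names are made up for this example, and the figure of about 10 to the 80 atoms in the observable universe is a commonly quoted rough estimate used here as an assumption.

```python
import math

# Assumption: commonly quoted rough order of magnitude, as a power of ten.
ATOMS_IN_UNIVERSE_LOG10 = 80


def shallow_units(d, units_per_dimension=10):
    """Units needed by the rule of thumb: units_per_dimension ** d."""
    return units_per_dimension ** d


def shallow_units_log10(d, units_per_dimension=10):
    """The same count expressed as a power of ten, to avoid huge numbers."""
    return d * math.log10(units_per_dimension)


two_variables = shallow_units(2)                        # d = 2
image_dimension = 200 * 200                             # one variable per pixel
image_exponent = shallow_units_log10(image_dimension)   # exponent of 10
exceeds_atoms = image_exponent > ATOMS_IN_UNIVERSE_LOG10
```

For two variables the count is 100, as in the conversation. For the image the exponent alone is 40,000, far beyond the assumed exponent of 80 for atoms in the universe.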
link |
00:47:13.440
And so this is called the curse of dimensionality, not quite appropriately.
link |
00:47:21.440
And the hope is with the extra layers you can remove the curse.
link |
00:47:27.440
What we proved is that if you have deep layers or hierarchical architecture
link |
00:47:34.440
with the local connectivity of the type of convolutional deep learning,
link |
00:47:39.440
and if you're dealing with a function that has this kind of hierarchical architecture,
link |
00:47:46.440
then you completely avoid the curse.
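A back-of-the-envelope comparison may help show what avoiding the curse means in numbers. The sketch below assumes, purely for illustration, that a compositional function of d variables is arranged as a binary tree with d - 1 local nodes, and that each node is a two-variable problem costing about 10 squared units at 10% error. This loosely mirrors the shape of the result described above, but the constants, the binary-tree layout, and the function names are assumptions and not the statement of the theorem.

```python
import math


def shallow_units_log10(d, units_per_dimension=10):
    """Shallow rule of thumb, as a power of ten: units_per_dimension ** d."""
    return d * math.log10(units_per_dimension)


def deep_units(d, units_per_dimension=10):
    """Assumed deep count: d - 1 local nodes, each a two-variable problem."""
    return (d - 1) * units_per_dimension ** 2


d = 200 * 200                      # one variable per pixel of a 200 by 200 image
deep = deep_units(d)               # grows linearly in d
deep_exponent = math.log10(deep)   # roughly 6.6
shallow_exponent = shallow_units_log10(d)
```

Under these assumptions the deep, locally connected architecture needs a few million units, while the shallow one needs a number with about 40,000 digits. The gain comes entirely from the function having the hierarchical structure that the architecture matches.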
link |
00:47:50.440
You've spoken a lot about supervised deep learning.
link |
00:47:53.440
What are your thoughts, hopes, views on the challenges of unsupervised learning
link |
00:47:58.440
with GANs, with generative adversarial networks?
link |
00:48:04.440
Do you see those as distinct, the power of GANs,
link |
00:48:08.440
do you see those as distinct from supervised methods in neural networks,
link |
00:48:12.440
or are they really all in the same representation ballpark?
link |
00:48:16.440
GANs are one way to get an estimate of probability densities,
link |
00:48:24.440
which is a somewhat new way that people have not done before.
link |
00:48:29.440
I don't know whether this will really play an important role in intelligence,
link |
00:48:38.440
It's interesting, but I'm less enthusiastic about it than many people in the field.
link |
00:48:47.440
I have the feeling that many people in the field are really impressed by the ability
link |
00:48:53.440
of producing realistic looking images in this generative way.
link |
00:49:00.440
Which explains the popularity of the methods,
link |
00:49:02.440
but you're saying that while that's exciting and cool to look at, it may not be the tool that's useful for it.
link |
00:49:10.440
So you describe it kind of beautifully.
link |
00:49:12.440
Current supervised methods go N to infinity in terms of the number of labeled points,
link |
00:49:17.440
and we really have to figure out how to go to N to 1.
link |
00:49:20.440
And you're thinking GANs might help, but they might not be the right...
link |
00:49:24.440
I don't think for that problem, which I really think is important.
link |
00:49:28.440
I think they certainly have applications, for instance, in computer graphics.
link |
00:49:35.440
I did work long ago, which was a little bit similar in terms of,
link |
00:49:43.440
say I have a network and I present images,
link |
00:49:49.440
so the input is images and the output is, for instance, the pose of the image, of a face, how much it is smiling,
link |
00:49:59.440
whether it is rotated 45 degrees or not.
link |
00:50:02.440
What about having a network that I train with the same data set,
link |
00:50:08.440
but now I invert input and output.
link |
00:50:10.440
Now the input is the pose or the expression, a number, certain numbers,
link |
00:50:16.440
and the output is the image and I train it.
link |
00:50:19.440
And we got pretty good, interesting results in terms of producing very realistic looking images.
link |
00:50:27.440
It was a less sophisticated mechanism than GANs,
link |
00:50:35.440
but the output was pretty much of the same quality.
link |
00:50:38.440
So I think for computer graphics type application,
link |
00:50:43.440
definitely GANs can be quite useful and not only for that,
link |
00:50:48.440
but for helping, for instance, on this unsupervised problem of reducing the number of labelled examples,
link |
00:51:01.440
I think people, it's like they think they can get out more than they put in.
link |
00:51:10.440
There's no free lunch, as you said.
link |
00:51:13.440
What's your intuition?
link |
00:51:16.440
How can we slow the growth of N to infinity in supervised learning?
link |
00:51:24.440
So, for example, Mobileye has very successfully,
link |
00:51:29.440
I mean essentially annotated large amounts of data to be able to drive a car.
link |
00:51:34.440
Now, one thought is, so we're trying to teach machines, the school of AI,
link |
00:51:40.440
and we're trying to, so how can we become better teachers, maybe?
link |
00:51:45.440
That's one way.
link |
00:51:47.440
I like that because, again, one caricature of the history of computer science,
link |
00:51:58.440
it begins with programmers, expensive; continues with labellers, cheap;
link |
00:52:09.440
and the future would be schools, like we have for kids.
link |
00:52:16.440
Currently, with the labelling methods, we're not selective about which examples we teach networks with.
link |
00:52:26.440
I think the focus of making networks that learn much faster is often on the architecture side,
link |
00:52:33.440
but how can we pick better examples with which to learn?
link |
00:52:37.440
Do you have intuitions about that?
link |
00:52:39.440
Well, that's part of the problem, but the other one is, if we look at biology,
link |
00:52:50.440
the reasonable assumption, I think, is in the same spirit as I said,
link |
00:52:58.440
evolution is opportunistic and has weak priors.
link |
00:53:03.440
The way I think the intelligence of a child, a baby may develop,
link |
00:53:10.440
is by bootstrapping weak priors from evolution.
link |
00:53:17.440
For instance, you can assume that you have most organisms,
link |
00:53:26.440
including human babies, with some basic machinery built in to detect motion and relative motion.
link |
00:53:37.440
In fact, we know that all insects, from fruit flies to other animals, have this.
link |
00:53:46.440
Even in the retinas, in the very peripheral part, it's very conserved across species,
link |
00:53:55.440
something that evolution discovered early.
link |
00:53:58.440
It may be the reason why babies tend to look, in the first few days, at moving objects,
link |
00:54:05.440
and not at non-moving objects.
link |
00:54:07.440
Now, moving objects means, okay, they're attracted by motion,
link |
00:54:11.440
but motion also means that motion gives automatic segmentation from the background.
link |
00:54:19.440
So because of motion boundaries, either the object is moving,
link |
00:54:26.440
or the eye of the baby is tracking the moving object, and the background is moving.
link |
00:54:32.440
Yeah, so just purely on the visual characteristics of the scene, that seems to be the most useful.
link |
00:54:37.440
Right, so it's like looking at an object without background.
link |
00:54:43.440
It's ideal for learning the object, otherwise it's really difficult, because you have so much stuff.
link |
00:54:49.440
So suppose you do this at the beginning, first weeks,
link |
00:54:54.440
then after that you can recognize the object, now they are imprinted, the number one,
link |
00:55:01.440
even in the background, even without motion.
link |
00:55:05.440
So that's the, by the way, I just want to ask on the object recognition problem,
link |
00:55:10.440
so there is this being responsive to movement and doing edge detection, essentially.
link |
00:55:16.440
What's the gap between
link |
00:55:20.440
effectively visually recognizing stuff, detecting where it is, and understanding the scene?
link |
00:55:27.440
Is this a huge gap in many layers, or is it close?
link |
00:55:32.440
No, I think that's a huge gap.
link |
00:55:35.440
I think present algorithms, with all the success that we have, and the fact that a lot of them are very useful,
link |
00:55:44.440
I think we are in a golden age for applications of low level vision,
link |
00:55:51.440
and low level speech recognition, and so on, you know, Alexa, and so on.
link |
00:55:56.440
There are many more things of similar level to be done, including medical diagnosis and so on,
link |
00:56:01.440
but we are far from what we call understanding of a scene, of language, of actions, of people.
link |
00:56:11.440
That is, despite the claims, that's, I think, very far.
link |
00:56:17.440
We're a little bit off.
link |
00:56:19.440
So in popular culture, and among many researchers, some of which I've spoken with,
link |
00:56:24.440
Stuart Russell and Elon Musk, in and out of the AI field, there's a concern about the existential threat of AI.
link |
00:56:34.440
And how do you think about this concern, and is it valuable to think about large scale,
link |
00:56:44.440
long term, unintended consequences of intelligent systems we try to build?
link |
00:56:51.440
I always think it's better to worry first, you know, early rather than late.
link |
00:56:58.440
So worry is good.
link |
00:56:59.440
Yeah, I'm not against worrying at all.
link |
00:57:02.440
Personally, I think that, you know, it will take a long time before there is real reason to be worried.
link |
00:57:15.440
But as I said, I think it's good to put in place and think about possible safety measures.
link |
00:57:23.440
What I find a bit misleading are things that have been said by people I know, like Elon Musk and, what is it, Bostrom in particular,
link |
00:57:35.440
and what is his first name, Nick Bostrom, right?
link |
00:57:39.440
And, you know, and a couple of other people that, for instance, AI is more dangerous than nuclear weapons.
link |
00:57:46.440
I think that's really wrong.
link |
00:57:50.440
That can be misleading, because in terms of priority, we should still be more worried about nuclear weapons
link |
00:57:59.440
and what people are doing about it and so on than AI.
link |
00:58:05.440
And you've spoken about Demis Hassabis and yourself, saying that you think we'll be about 100 years out
link |
00:58:15.440
before we have a general intelligence system that's on par with the human being.
link |
00:58:20.440
Do you have any updates for those predictions?
link |
00:58:22.440
Well, I think he said...
link |
00:58:23.440
He said 20, I think.
link |
00:58:25.440
He said 20, right.
link |
00:58:26.440
This was a couple of years ago.
link |
00:58:27.440
I have not asked him again, so I should have.
link |
00:58:31.440
Your own prediction, what's your prediction about when you'll be truly surprised
link |
00:58:38.440
and what's the confidence interval on that?
link |
00:58:42.440
You know, it's so difficult to predict the future and even the present.
link |
00:58:46.440
It's pretty hard to predict.
link |
00:58:48.440
Right, but I would be...
link |
00:58:50.440
As I said, this is completely...
link |
00:58:52.440
I would be more like Rod Brooks.
link |
00:58:56.440
I think he's about 200 years.
link |
00:58:59.440
200 years.
link |
00:59:01.440
When we have this kind of AGI system, artificial intelligence system,
link |
00:59:06.440
you're sitting in a room with her, him, it,
link |
00:59:12.440
do you think the underlying design of such a system
link |
00:59:17.440
is something we'll be able to understand?
link |
00:59:19.440
It will be simple?
link |
00:59:20.440
Do you think it will be explainable?
link |
00:59:25.440
Understandable by us?
link |
00:59:27.440
Your intuition, again, we're in the realm of philosophy a little bit.
link |
00:59:31.440
Well, probably no.
link |
00:59:35.440
But again, it depends what you really mean by understanding.
link |
00:59:42.440
I think we don't understand how deep networks work.
link |
00:59:53.440
I think we're beginning to have a theory now.
link |
00:59:56.440
But in the case of deep networks,
link |
00:59:59.440
or even in the case of the simpler kernel machines or linear classifiers,
link |
01:00:06.440
we really don't understand the individual units or so.
link |
01:00:12.440
But we understand what the computation and the limitations and the properties of it are.
link |
01:00:20.440
It's similar to many things.
link |
01:00:24.440
What does it mean to understand how a fusion bomb works?
link |
01:00:29.440
How many of us, you know, many of us understand the basic principle
link |
01:00:35.440
and some of us may understand deeper details.
link |
01:00:40.440
In that sense, understanding is, as a community, as a civilization,
link |
01:00:44.440
can we build another copy of it?
link |
01:00:46.440
Okay.
link |
01:00:47.440
And in that sense, do you think there'll be,
link |
01:00:50.440
there'll need to be some evolutionary component where it runs away from our understanding?
link |
01:00:56.440
Or do you think it could be engineered from the ground up?
link |
01:00:59.440
The same way you go from the transistor to PowerPoint?
link |
01:01:02.440
Right.
link |
01:01:03.440
So many years ago, this was actually 40, 41 years ago,
link |
01:01:09.440
I wrote a paper with David Marr,
link |
01:01:13.440
who was one of the founding fathers of computer vision, computational vision.
link |
01:01:19.440
I wrote a paper about levels of understanding,
link |
01:01:23.440
which is related to the question we discussed earlier about understanding PowerPoint,
link |
01:01:28.440
understanding transistors and so on.
link |
01:01:31.440
And, you know, in that kind of framework, we had a level of the hardware
link |
01:01:38.440
and the top level of the algorithms.
link |
01:01:41.440
We did not have learning.
link |
01:01:44.440
Recently, I updated it, adding levels, and one level I added to those three was learning.
link |
01:01:54.440
So, and you can imagine, you could have a good understanding
link |
01:01:59.440
of how you construct a learning machine, like we do,
link |
01:02:04.440
but be unable to describe in detail what the learning machines will discover, right?
link |
01:02:13.440
Now, that would be still a powerful understanding if I can build a learning machine,
link |
01:02:19.440
even if I don't understand in detail every time it learns something.
link |
01:02:25.440
Just like our children, if they start listening to a certain type of music,
link |
01:02:31.440
I don't know, Miley Cyrus or something,
link |
01:02:33.440
you don't understand why they came to that particular preference,
link |
01:02:37.440
but you understand the learning process.
link |
01:02:39.440
That's very interesting.
link |
01:02:41.440
So, on learning for systems to be part of our world,
link |
01:02:50.440
one of the challenging things that you've spoken about is learning ethics,
link |
01:02:56.440
learning morals.
link |
01:02:59.440
And how hard do you think is the problem of, first of all, humans understanding our ethics?
link |
01:03:06.440
What is the origin of ethics on the neural, low level?
link |
01:03:10.440
What is it at the higher level?
link |
01:03:12.440
Is it something that's learnable by machines, in your intuition?
link |
01:03:17.440
I think, yeah, ethics is learnable, very likely.
link |
01:03:23.440
I think it's one of these problems where I think understanding the neuroscience of ethics,
link |
01:03:36.440
people discuss there is an ethics of neuroscience.
link |
01:03:42.440
How a neuroscientist should or should not behave,
link |
01:03:46.440
you can think of a neurosurgeon and the ethics that he or she has to follow.
link |
01:03:53.440
But I'm more interested in the neuroscience of ethics.
link |
01:03:57.440
You're blowing my mind right now, the neuroscience of ethics, it's very meta.
link |
01:04:01.440
And I think that would be important to understand also for being able to design machines
link |
01:04:09.440
that are ethical machines in our sense of ethics.
link |
01:04:14.440
And you think there is something in neuroscience, there's patterns,
link |
01:04:20.440
tools in neuroscience that could help us shed some light on ethics
link |
01:04:25.440
or is it more on the psychology or sociology side, at a much higher level?
link |
01:04:29.440
No, there is psychology, but there is also, in the meantime,
link |
01:04:33.440
there is evidence, fMRI, of specific areas of the brain
link |
01:04:41.440
that are involved in certain ethical judgments.
link |
01:04:44.440
And not only this, you can stimulate those areas with magnetic fields
link |
01:04:49.440
and change the ethical decisions.
link |
01:04:54.440
So that's work by a colleague of mine, Rebecca Saxe,
link |
01:05:00.440
and there are other researchers doing similar work.
link |
01:05:04.440
And I think this is the beginning, but ideally at some point
link |
01:05:11.440
we'll have an understanding of how this works and why it evolved, right?
link |
01:05:17.440
The big why question, yeah, it must have some purpose.
link |
01:05:21.440
Yeah, obviously it has some social purposes, probably.
link |
01:05:29.440
If neuroscience holds the key to at least illuminate some aspects of ethics,
link |
01:05:34.440
that means it could be a learnable problem.
link |
01:05:36.440
Yeah, exactly.
link |
01:05:38.440
And as we're getting into harder and harder questions,
link |
01:05:41.440
let's go to the hard problem of consciousness.
link |
01:05:44.440
Is this an important problem for us to think about and solve on the engineering
link |
01:05:51.440
of intelligence side of your work, of our dream?
link |
01:05:55.440
You know, it's unclear.
link |
01:05:57.440
So, again, this is a deep problem, partly because it's very difficult
link |
01:06:04.440
to define consciousness and there is a debate among neuroscientists
link |
01:06:16.440
and philosophers, of course, about
link |
01:06:22.440
whether consciousness is something that requires flesh and blood, so to speak,
link |
01:06:30.440
or could be, you know, that we could have silicon devices that are conscious,
link |
01:06:40.440
or up to a statement like everything has some degree of consciousness
link |
01:06:45.440
and some more than others.
link |
01:06:48.440
This is like Giulio Tononi and phi.
link |
01:06:53.440
We just recently talked to Christof Koch.
link |
01:06:56.440
Christof was my first graduate student.
link |
01:07:00.440
Do you think it's important to illuminate aspects of consciousness
link |
01:07:06.440
in order to engineer intelligent systems?
link |
01:07:10.440
Do you think an intelligent system would ultimately have consciousness?
link |
01:07:14.440
Are they interlinked?
link |
01:07:18.440
You know, most of the people working in artificial intelligence, I think,
link |
01:07:23.440
they answer, we don't strictly need consciousness to have an intelligent system.
link |
01:07:29.440
That's sort of the easier question, because it's a very engineering answer to the question.
link |
01:07:35.440
It passes a Turing test, we don't need consciousness.
link |
01:07:38.440
But if you were to go, do you think it's possible that we need to have that kind of self awareness?
link |
01:07:47.440
We may, yes.
link |
01:07:49.440
So, for instance, I personally think that when we test a machine or a person in a Turing test,
link |
01:08:00.440
in an extended Turing test, I think consciousness is part of what we require in that test,
link |
01:08:10.440
you know, implicitly to say that this is intelligent.
link |
01:08:14.440
Christof disagrees.
link |
01:08:17.440
Yes, he does.
link |
01:08:19.440
Despite many other romantic notions he holds, he disagrees with that one.
link |
01:08:24.440
Yes, that's right.
link |
01:08:26.440
So, you know, we'll see.
link |
01:08:29.440
Do you think, as a quick question, Ernest Becker's fear of death,
link |
01:08:37.440
do you think mortality and those kinds of things are important for consciousness and for intelligence,
link |
01:08:48.440
the finiteness of life, finiteness of existence,
link |
01:08:53.440
or is that just an evolutionary side effect that's useful for natural selection?
link |
01:09:00.440
Do you think this kind of thing, that this interview is going to run out of time soon,
link |
01:09:05.440
our life will run out of time soon?
link |
01:09:08.440
Do you think that's needed to make this conversation good and life good?
link |
01:09:12.440
You know, I never thought about it.
link |
01:09:14.440
It's a very interesting question.
link |
01:09:16.440
I think Steve Jobs in his commencement speech at Stanford argued that, you know,
link |
01:09:25.440
having a finite life was important for stimulating achievements.
link |
01:09:30.440
It was a different.
link |
01:09:32.440
You live every day like it's your last, right?
link |
01:09:34.440
Yeah.
link |
01:09:35.440
So, rationally, I don't think strictly you need mortality for consciousness, but...
link |
01:09:45.440
Who knows?
link |
01:09:46.440
They seem to go together in our biological system, right?
link |
01:09:49.440
Yeah.
link |
01:09:51.440
You've mentioned before, and your students are associated with,
link |
01:09:57.440
AlphaGo and Mobileye, the big recent success stories in AI.
link |
01:10:01.440
I think it's captivated the entire world of what AI can do.
link |
01:10:05.440
So, what do you think will be the next breakthrough?
link |
01:10:10.440
What's your intuition about the next breakthrough?
link |
01:10:13.440
Of course, I don't know where the next breakthrough is.
link |
01:10:16.440
I think that there is a good chance, as I said before, that the next breakthrough
link |
01:10:22.440
would also be inspired by, you know, neuroscience.
link |
01:10:27.440
But which one?
link |
01:10:31.440
I don't know.
link |
01:10:32.440
And there's...
link |
01:10:33.440
So, MIT has this quest for intelligence.
link |
01:10:35.440
Yeah.
link |
01:10:36.440
And there's a few moonshots which, in that spirit, which ones are you excited about?
link |
01:10:41.440
What...
link |
01:10:42.440
Which projects kind of...
link |
01:10:44.440
Well, of course, I'm excited about one of the moonshots with...
link |
01:10:48.440
Which is our Center for Brains, Minds, and Machines.
link |
01:10:52.440
The one which is fully funded by NSF.
link |
01:10:57.440
And it's a...
link |
01:10:59.440
It is about visual intelligence.
link |
01:11:02.440
And that one is particularly about understanding.
link |
01:11:05.440
Visual intelligence.
link |
01:11:07.440
Visual cortex and visual intelligence in the sense of how we look around ourselves
link |
01:11:16.440
and understand the world around ourselves, you know, meaning what is going on,
link |
01:11:25.440
how we could go from here to there without hitting obstacles.
link |
01:11:31.440
You know, whether there are other agents, people in the environment.
link |
01:11:36.440
These are all things that we perceive very quickly.
link |
01:11:41.440
And it's something actually quite close to being conscious, not quite.
link |
01:11:47.440
But there is this interesting experiment that was run at Google X,
link |
01:11:53.440
which, in a sense, is just a virtual reality experiment,
link |
01:11:58.440
but in which they had subjects sitting, say, in a chair with goggles, like Oculus and so on.
link |
01:12:09.440
Earphones.
link |
01:12:11.440
And they were seeing through the eyes of a robot nearby: two cameras, microphones for receiving.
link |
01:12:20.440
So their sensory system was there.
link |
01:12:23.440
And the impression of all the subjects, very strong, they could not shake it off,
link |
01:12:30.440
was that they were where the robot was.
link |
01:12:35.440
They could look at themselves from the robot and still feel they were where the robot is.
link |
01:12:42.440
They were looking at their body.
link |
01:12:45.440
Their self had moved.
link |
01:12:48.440
So some aspect of scene understanding has to have the ability to place yourself,
link |
01:12:54.440
have a self-awareness about your position in the world and what the world is.
link |
01:12:59.440
So we may have to solve the hard problem of consciousness to solve it.
link |
01:13:04.440
On the way, yes.
link |
01:13:05.440
It's quite a moonshot.
link |
01:13:07.440
So you've been an advisor to some incredible minds, including Demis Hassabis, Christof Koch,
link |
01:13:14.440
Amnon Shashua, like you said, who all went on to become seminal figures in their respective fields.
link |
01:13:21.440
From your own success as a researcher and from your perspective as a mentor of these researchers,
link |
01:13:28.440
having guided them in the way of advice,
link |
01:13:33.440
what does it take to be successful in science and engineering careers?
link |
01:13:39.440
Whether you're talking to somebody in their teens, 20s and 30s, what does that path look like?
link |
01:13:47.440
It's curiosity and having fun.
link |
01:13:52.440
And I think it's important also having fun with other curious minds.
link |
01:14:01.440
It's the people you surround yourself with to have fun and curiosity.
link |
01:14:06.440
You mentioned Steve Jobs.
link |
01:14:09.440
Is there also an underlying ambition that's unique that you saw,
link |
01:14:14.440
or does it really boil down to insatiable curiosity and fun?
link |
01:14:18.440
Well, of course.
link |
01:14:20.440
It's being curious in an active and ambitious way, yes, definitely.
link |
01:14:29.440
But I think sometimes in science, there are friends of mine who are like this.
link |
01:14:38.440
You know, there are some of the scientists who like to work by themselves
link |
01:14:44.440
and kind of communicate only when they complete their work or discover something.
link |
01:14:54.440
I think I always found the actual process of discovering something
link |
01:15:02.440
is more fun if it's together with other intelligent and curious and fun people.
link |
01:15:09.440
So if you see the fun in that process, the side effect of that process
link |
01:15:13.440
would be that you'll actually end up discovering something.
link |
01:15:16.440
So as you've led many incredible efforts here, what's the secret to being a good advisor,
link |
01:15:25.440
mentor, leader in a research setting?
link |
01:15:28.440
Is it a similar spirit or what advice could you give to people, young faculty and so on?
link |
01:15:35.440
It's partly repeating what I said about an environment that should be friendly and fun
link |
01:15:42.440
and ambitious and, you know, I think I learned a lot from some of my advisors and friends
link |
01:15:52.440
and some were physicists and there was, for instance, this behavior that was encouraged
link |
01:16:02.440
of when somebody comes with a new idea in the group, unless it's really stupid
link |
01:16:08.440
you are always enthusiastic.
link |
01:16:11.440
And then you're enthusiastic for a few minutes, for a few hours.
link |
01:16:14.440
Then you start, you know, asking critically a few questions, testing this.
link |
01:16:22.440
But, you know, this is a process that is, I think it's very good.
link |
01:16:28.440
You have to be enthusiastic.
link |
01:16:30.440
Sometimes people are very critical from the beginning.
link |
01:16:33.440
That's not...
link |
01:16:35.440
Yes, you have to give it a chance.
link |
01:16:37.440
Yes.
link |
01:16:38.440
That seed to grow.
link |
01:16:39.440
That said, with some of your ideas, which are quite revolutionary, so there's a witness,
link |
01:16:44.440
especially in the human vision side and neuroscience side, there could be some pretty heated arguments.
link |
01:16:49.440
Do you enjoy these?
link |
01:16:51.440
Is that a part of science and academic pursuits that you enjoy?
link |
01:16:55.440
Yeah.
link |
01:16:56.440
Is that something that happens in your group as well?
link |
01:17:00.440
Yeah, absolutely.
link |
01:17:02.440
I also spent some time in Germany. Again, there is this tradition in which people are more forthright, less kind than here.
link |
01:17:14.440
So, you know, in the US, when you write a bad letter, you still say, this guy is nice, you know.
link |
01:17:23.440
Yes, yes.
link |
01:17:25.440
So...
link |
01:17:26.440
Yeah, here in America it's degrees of nice.
link |
01:17:28.440
Yes.
link |
01:17:29.440
It's all just degrees of nice, yeah.
link |
01:17:31.440
Right, so as long as this does not become personal and it's really like, you know, a football game with its rules, that's great.
link |
01:17:44.440
It's fun.
link |
01:17:46.440
So, if you somehow find yourself in a position to ask one question of an oracle, like a genie, maybe a god, and you're guaranteed to get a clear answer,
link |
01:17:58.440
what kind of question would you ask?
link |
01:18:00.440
What would be the question you would ask?
link |
01:18:03.440
In the spirit of our discussion, it could be, how could I become ten times more intelligent?
link |
01:18:09.440
And so, but see, you only get a clear short answer.
link |
01:18:15.440
So, do you think there's a clear short answer to that?
link |
01:18:18.440
No.
link |
01:18:19.440
And that's the answer you'll get.
link |
01:18:22.440
Okay.
link |
01:18:23.440
So, you've mentioned Flowers for Algernon.
link |
01:18:26.440
Oh, yeah.
link |
01:18:27.440
There's a story that inspired you in your childhood.
link |
01:18:32.440
It's this story of a mouse and a human achieving genius-level intelligence, and then understanding what was happening while slowly becoming not intelligent again, this tragedy of gaining intelligence and losing intelligence.
link |
01:18:48.440
Do you think in that spirit, in that story, do you think intelligence is a gift or a curse from the perspective of happiness and meaning of life?
link |
01:18:59.440
You try to create an intelligent system that understands the universe, but on an individual level, the meaning of life, do you think intelligence is a gift?
link |
01:19:10.440
It's a good question.
link |
01:19:16.440
I don't know.
link |
01:19:22.440
As one of the people considered the smartest in the world, in some dimension at the very least, what do you think?
link |
01:19:34.440
I don't know.
link |
01:19:35.440
It may be invariant to intelligence, that degree of happiness.
link |
01:19:39.440
It would be nice if it were.
link |
01:19:43.440
That's the hope.
link |
01:19:44.440
Yeah.
link |
01:19:45.440
You could be smart and happy and clueless and happy.
link |
01:19:49.440
Yeah.
link |
01:19:51.440
As always, a discussion of the meaning of life is probably a good place to end.
link |
01:19:56.440
Tomaso, thank you so much for talking today.
link |
01:19:58.440
Thank you.
link |
01:19:59.440
This was great.