
Daphne Koller: Biomedicine and Machine Learning | Lex Fridman Podcast #93



00:00:00.000
The following is a conversation with Daphne Koller, a professor of computer science at Stanford University, a co-founder of Coursera with Andrew Ng, and founder and CEO of insitro, a company at the intersection of machine learning and biomedicine.

00:00:15.940
We're now in the exciting early days of using the data-driven methods of machine learning to help discover and develop new drugs and treatments at scale. Daphne and insitro are leading the way on this with breakthroughs that may ripple through all fields of medicine, including ones most critical for helping with the current coronavirus pandemic.

00:00:37.220
This conversation was recorded before the COVID-19 outbreak. For everyone feeling the medical, psychological, and financial burden of this crisis, I'm sending love your way. Stay strong, we're in this together, we'll beat this thing.

00:00:51.740
This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, review it with five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N.

00:01:05.940
As usual, I'll do a few minutes of ads now and never any ads in the middle that can break the flow of this conversation. I hope that works for you and doesn't hurt the listening experience.
link |
00:01:15.940
This show is presented by Cash App,
link |
00:01:17.940
the number one finance app in the app store.
link |
00:01:20.280
When you get it, use code LEXPODCAST.
link |
00:01:23.420
Cash App lets you send money to friends,
link |
00:01:25.620
buy Bitcoin, and invest in the stock market
link |
00:01:27.900
with as little as one dollar.
link |
00:01:30.220
Since Cash App allows you to send
link |
00:01:31.700
and receive money digitally,
link |
00:01:33.420
peer to peer, and security in all digital transactions
link |
00:01:36.900
is very important,
link |
00:01:38.120
let me mention the PCI data security standard
link |
00:01:41.380
that Cash App is compliant with.
link |
00:01:43.900
I'm a big fan of standards for safety and security.
link |
00:01:46.780
PCI DSS is a good example of that,
link |
00:01:49.520
where a bunch of competitors got together
link |
00:01:51.140
and agreed that there needs to be a global standard
link |
00:01:53.860
around the security of transactions.
link |
00:01:56.020
Now we just need to do the same for autonomous vehicles
link |
00:01:58.420
and AI systems in general.
link |
00:02:00.620
So again, if you get Cash App from the App Store
link |
00:02:03.260
or Google Play and use the code LEXPODCAST,
link |
00:02:07.060
you get $10 and Cash App will also donate $10 to FIRST,
link |
00:02:11.220
an organization that is helping to advance robotics
link |
00:02:14.100
and STEM education for young people around the world.
link |
00:02:17.700
And now here's my conversation with Daphne Koller.
00:02:22.420
So you co-founded Coursera and made a huge impact in the global education of AI, and after five years, in August 2016, wrote a blog post saying that you're stepping away, and wrote, quote: it is time for me to turn to another critical challenge, the development of machine learning and its applications to improving human health. So let me ask two far out philosophical questions. One, do you think we'll one day find cures for all major diseases known today? And two, do you think we'll one day figure out a way to extend the human lifespan, perhaps to the point of immortality?
00:02:59.460
So one day is a very long time, and I don't like to make predictions of the type "we will never be able to do X," because I think that smacks of hubris. It seems hubristic to claim that never, in the entire eternity of human existence, will we be able to solve a problem. That being said, curing disease is very hard, because oftentimes by the time you discover the disease, a lot of damage has already been done. And so to assume that we would be able to cure disease at that stage assumes that we would come up with ways of basically regenerating entire parts of the human body in a way that actually returns it to its original state. And that's a very challenging problem. We have cured very few diseases. We've been able to provide treatment for an increasingly large number, but the number of things that you could actually define to be cures is actually not that large. So I think that there's a lot of work that would need to happen before one could legitimately say that we have cured even a reasonable number, far less all, diseases.
00:04:10.460
On a scale of zero to 100, where are we in understanding the fundamental mechanisms of all major diseases? What's your sense? From the computer science perspective that you've brought into the world of health, how far along are we?
00:04:26.740
I think it depends on which disease. I mean, there are ones where I would say we're maybe not quite at a hundred, because biology is really complicated and there are always new things that we uncover that people didn't even realize existed. But I would say there are diseases where we might be in the 70s or 80s, and then there are diseases, probably the majority, where we're really close to zero.
00:04:55.220
Would Alzheimer's and schizophrenia and type 2 diabetes fall closer to zero or closer to 80?
00:05:04.340
I think Alzheimer's is probably closer to zero than to 80. There are hypotheses, but I don't think those hypotheses have as of yet been sufficiently validated that we believe them to be true. And there is an increasing number of people who believe that the traditional hypotheses might not really explain what's going on. I would also say that Alzheimer's and schizophrenia, and even type 2 diabetes, are not really one disease. They're almost certainly a heterogeneous collection of mechanisms that manifest in clinically similar ways. In the same way that we now understand that breast cancer is really not one disease, it is a multitude of cellular mechanisms, all of which ultimately translate to uncontrolled proliferation, but it's not one disease. The same is almost undoubtedly true for those other diseases as well, and it's that understanding that needs to precede any understanding of the specific mechanisms of any of those other diseases. Now, in schizophrenia, I would say we're almost certainly closer to zero than to anything else. Type 2 diabetes is a bit of a mix. There are clear mechanisms that are implicated, that I think have been validated, that have to do with insulin resistance and such, but there are almost certainly many mechanisms there as well that we have not yet understood.
00:06:31.300
You've also thought and worked a little bit on the longevity side. Do you see disease and longevity as efforts that overlap completely, partially, or not at all?
00:06:45.260
Those mechanisms are certainly overlapping. There's a well known phenomenon that says that for most diseases, other than childhood diseases, the risk of contracting that disease increases exponentially year on year, every year, from the time you're about 40. So obviously there's a connection between those two things. That's not to say that they're identical. There's clearly aging that happens that is not really associated with any specific disease, and there are also diseases and mechanisms of disease that are not specifically related to aging. So I think overlap is where we're at.
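The exponential year-on-year growth of disease risk described here is often modeled with a Gompertz-style law. A minimal sketch (the onset age and doubling time below are illustrative assumptions, not figures from the conversation):

```python
def relative_risk(age, onset=40.0, doubling_years=8.0):
    """Toy Gompertz-style model: risk doubles every `doubling_years`
    after `onset`; before onset, risk stays at the baseline of 1."""
    if age <= onset:
        return 1.0
    return 2.0 ** ((age - onset) / doubling_years)

# Risk relative to a 40-year-old, under these assumed parameters.
for age in (40, 50, 60, 70, 80):
    print(age, round(relative_risk(age), 1))
```

With an assumed eight-year doubling time, risk at 80 is 32 times the baseline, which is the kind of steep age dependence that ties aging and disease together.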
00:07:30.420
Okay. It is a little unfortunate that we get older, and that there seems to be some correlation between getting older and the occurrence of disease. Both are quite sad.
00:07:43.100
I mean, there are processes that happen as cells age that I think are contributing to disease. Some of those have to do with DNA damage that accumulates as cells divide, where the repair mechanisms don't fully correct for it. There are accumulations of proteins that are misfolded and potentially aggregate, and those too contribute to disease and to inflammation. There's a multitude of mechanisms that have been uncovered that are sort of wear and tear at the cellular level, that contribute to disease processes, and I'm sure there are many that we don't yet understand.
00:08:24.860
On a small tangent, and perhaps a philosophical one: the fact that things get older and the fact that things die is a very powerful feature for the growth of new things. It's a kind of learning mechanism. So it's both tragic and beautiful. So in trying to fight disease and trying to fight aging, do you think about the useful aspect of our mortality? Or, if you could be immortal, would you choose to be immortal?
00:09:07.140
Again, I think immortal is a very long time, and I don't know that that would necessarily be something that I would want to aspire to. But I think all of us aspire to an increased health span, I would say, which is an increased amount of time where you're healthy and active and feel as you did when you were 20. And we're nowhere close to that. People deteriorate physically and mentally over time, and that is a very sad phenomenon. So I think a wonderful aspiration would be if we could all live to the biblical 120, maybe, in perfect health.

00:09:53.740
With a high quality of life.

00:09:54.820
A high quality of life. I think that would be an amazing goal for us to achieve as a society. Now, is the right age 120, or 100, or 150? I think that's up for debate, but I think an increased health span is a really worthy goal.

00:10:10.100
And anyway, on the grand timescale of the age of the universe, it's all pretty short.
00:10:16.580
So, from that perspective, you've done obviously a lot of incredible work in machine learning. What role do you think data and machine learning play in this goal of trying to understand diseases and trying to eradicate diseases?
00:10:32.940
Up until now, I don't think it's played a very significant role, because largely the data sets that one really needed to enable powerful machine learning methods haven't really existed. There have been dribs and drabs, and some interesting machine learning that has been applied, I would say machine learning slash data science, but the last few years are starting to change that. So we now see an increase in some large data sets, but equally importantly, an increase in technologies that are able to produce data at scale. It's not typically been the case that people have deliberately, proactively used those tools for the purpose of generating data for machine learning. To the extent that those techniques have been used for data production, they've been used to drive scientific discovery, and the machine learning came as a sort of byproduct, a second stage of: oh, now we have a data set, let's do machine learning on that, rather than a more simplistic data analysis method.
00:11:41.820
But what we are doing at insitro is actually flipping that around and saying: here's this incredible repertoire of methods that bioengineers and cell biologists have come up with; let's see if we can put them together in brand new ways with the goal of creating data sets that machine learning can really be applied to productively, to create powerful predictive models that can help us address fundamental problems in human health.
00:12:09.420
So the focus is really to make data the primary goal, and to use the mechanisms of biology and chemistry to create the kinds of data sets that would allow machine learning to benefit the most.
00:12:25.700
I wouldn't put it in those terms, because that says that data is the end goal. Data is the means. For us, the end goal is helping address challenges in human health, and the method that we've elected to do that is to apply machine learning to build predictive models. And machine learning, in my opinion, can only be really successfully applied, especially the more powerful models, if you give it data that is of sufficient scale and sufficient quality. So how do you create those data sets so as to drive the ability to generate predictive models, which subsequently help improve human health?
00:13:05.740
So before we dive into the details of that, let me take a step back and ask: when and where was your interest in human health born? Are there moments, events, perhaps, if I may ask, tragedies in your own life that catalyzed this passion? Or was it the broader desire to help humankind?
00:13:26.580
So I would say it's a bit of both. My interest in human health actually dates back to the early 2000s, when a lot of my peers in machine learning and I were using data sets that frankly were not very inspiring. Some of us old timers still remember the quote unquote 20 newsgroups data set, where this was literally a bunch of texts from 20 newsgroups, a concept that doesn't really even exist anymore, and the question was: can you classify which newsgroup a particular bag of words came from? And it wasn't very interesting. The data sets at the time on the biology side were much more interesting, both from a technical and also from an aspirational perspective. They were still pretty small, but they were better than 20 newsgroups. And so I started out, I think, just by wanting to do something that was more, I don't know, societally useful and technically interesting. And then over time I became more and more interested in the biology and the human health aspects for themselves, and began to work, even, sometimes on papers that were just in biology, without having a significant machine learning component.
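The bag-of-words classification task mentioned here can be sketched in a few lines with a naive Bayes classifier. A minimal illustration over a tiny invented corpus (the documents and labels below are made up; the real 20 newsgroups data set is far larger):

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (label, text). Returns per-class word counts,
    per-class total word counts, and per-class document counts."""
    counts, totals, labels = defaultdict(Counter), Counter(), Counter()
    for label, text in docs:
        words = text.lower().split()
        counts[label].update(words)
        totals[label] += len(words)
        labels[label] += 1
    return counts, totals, labels

def classify(text, counts, totals, labels):
    """Pick the class maximizing log prior + summed log word likelihoods,
    with add-one smoothing over a shared vocabulary."""
    vocab = {w for c in counts.values() for w in c}
    n_docs = sum(labels.values())
    best, best_score = None, float("-inf")
    for label in labels:
        score = math.log(labels[label] / n_docs)
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

# Invented toy corpus standing in for two "newsgroups".
docs = [
    ("sci.space", "the rocket launch reached orbit"),
    ("sci.space", "nasa delayed the shuttle launch"),
    ("rec.autos", "the engine and brakes need repair"),
    ("rec.autos", "new tires improved the car handling"),
]
model = train(docs)
print(classify("orbit launch window", *model))  # → sci.space
```

The bag-of-words representation discards word order entirely, which is exactly why the task felt uninspiring: it reduces a document to a histogram.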
00:14:48.460
I think my interest in drug discovery is partly due to an incident when my father sadly passed away, about 12 years ago. He had an autoimmune disease that settled in his lungs, and the doctors basically said: well, there's only one thing that we can do, which is give him prednisone. At some point, I remember, a doctor even came and said: hey, let's do a lung biopsy to figure out which autoimmune disease he has. And I said: would that be helpful, would that change treatment? He said: no, there's only prednisone, that's the only thing we can give him. And I had friends who were rheumatologists who said the FDA would never approve prednisone today, because the ratio of side effects to benefit is probably not large enough. Today we're in a state where there are probably four or five, maybe even more, well, it depends on which autoimmune disease, but there are multiple drugs that can help people with autoimmune disease, many of which didn't exist 12 years ago. And I think we're at a golden time, in some ways, in drug discovery, where there's the ability to create drugs that are much safer and much more effective than we've ever been able to before. And what's lacking is enough understanding of biology and mechanism to know where to aim that engine. And I think that's where machine learning can help.
00:16:25.380
So in 2018, you started, and now lead, a company, insitro, whose focus is, like you mentioned, drug discovery and the utilization of machine learning for drug discovery. You mentioned that, quote, we're really interested in creating what you might call disease in a dish models: places where diseases are complex, where we really haven't had a good model system, where typical animal models that have been used for years, including testing on mice, just aren't very effective. So can you try to describe what is an animal model and what is a disease in a dish model?
00:17:05.340
Sure. An animal model for disease is, well, what it sounds like. It's oftentimes a mouse where we have introduced some external perturbation that creates the disease, and then we cure that disease, and the hope is that by doing that, we will cure a similar disease in the human. The problem is that oftentimes the way in which we generate the disease in the animal has nothing to do with how that disease actually comes about in a human. It's what you might think of as a copy of the phenotype, a copy of the clinical outcome, but the mechanisms are quite different. And so curing the disease in the animal, which in most cases doesn't happen naturally, mice don't get Alzheimer's, they don't get diabetes, they don't get atherosclerosis, they don't get autism or schizophrenia, those cures don't translate over to what happens in the human. And that's where most drugs fail: the findings that we had in the mouse just don't translate to a human.
00:18:16.660
The disease in a dish models are a fairly new approach. They've been enabled by technologies that have not existed for more than five to ten years. So, for instance, the ability for us to take a cell from any one of us, you or me, and revert that, say, skin cell to what's called stem cell status, a pluripotent cell that can then be differentiated into different types of cells. So from that pluripotent cell, one can create a Lex neuron or a Lex cardiomyocyte or a Lex hepatocyte that has your genetics, but is the right cell type. And so if there's a genetic burden of disease that would manifest in that particular cell type, you might be able to see it by looking at those cells and saying: oh, that's what potentially sick cells look like versus healthy cells, and then explore what kind of interventions might revert the unhealthy looking cell to a healthy cell. Now, of course, curing cells is not the same as curing people, and so there's still potentially a translatability gap. But at least for diseases that are driven, say, by human genetics, and where the human genetics is what drives the cellular phenotype, there is some reason to hope that if we revert those cells in which the disease begins, and where the disease is driven by genetics, back to a healthy state, maybe that will help also revert the more global clinical phenotype. So that's really what we're hoping to do.
00:20:02.740
That step, that backward step, I was reading about it, the Yamanaka factors.

00:20:08.300
Yes.

00:20:09.700
So it's like that reverse step back to stem cells.

00:20:12.280
Yes.

00:20:13.120
Seems like magic.

00:20:14.180
It is. Honestly, before that happened, I think very few people would have predicted that to be possible.

00:20:21.700
It's amazing.
00:20:22.540
Can you maybe elaborate, is it actually possible? Like, how stable is it? This result was first demonstrated maybe, I don't know how many years ago, maybe 10 years ago, something like that. How hard is this? Like, how noisy is this backward step? It seems quite incredible and cool.
00:20:39.460
It is, it is incredible and cool. It was much more, I think, finicky and bespoke at the early stages, when the discovery was first made. But at this point it's become almost industrialized. There are what are called contract research organizations, vendors that will take a sample from a human and revert it back to stem cell status, and it works a very good fraction of the time. Now, there are people who will ask, I think, good questions: is this really, truly a stem cell, or does it remember certain aspects of changes that were made in the human beyond the genetics?

00:21:22.500
Its past as a skin cell, yeah.

00:21:24.660
Its past as a skin cell, or its past in terms of exposures to different environmental factors, and so on. So I think the consensus right now is that these are not always perfect, and there are little bits and pieces of memory sometimes, but by and large, these are actually pretty good.
00:21:44.780
So one of the key things, well, maybe you can correct me, but one of the useful things for machine learning is size, the scale of data. How easy is it to do these kinds of reversals to stem cells, and then disease in a dish models, at scale? Is that a huge challenge or not?
00:22:06.180
So the reversal is not, as of this point, something that can be done at the scale of tens of thousands or hundreds of thousands. I think the total number of stem cells, or iPS cells, what are called induced pluripotent stem cells, in the world is somewhere between five and ten thousand, last I looked. Now, again, that might not count things that exist in this or that academic center, and they may add up to a bit more, but that's about the range. So it's not something where you could, at this point, generate iPS cells from a million people. But maybe you don't need to, because maybe that background is enough, because it can also be perturbed in different ways. And some people have done really interesting experiments in, for instance, taking cells from a healthy human and then introducing a mutation into them, using one of the other miracle technologies that's emerged in the last decade, which is CRISPR gene editing, and introducing a mutation that is known to be pathogenic. And so you can now look at the healthy cells and the unhealthy cells, the ones with the mutation, and do a one to one comparison where everything else is held constant. And so you can really start to understand specifically what the mutation does at the cellular level. So the iPS cells are a great starting point, and obviously more diversity is better, because you also want to capture ethnic background and how that affects things. But maybe you don't need one from every single patient with every single type of disease, because we have other tools at our disposal.
00:23:50.260
How much difference is there between people, you mentioned ethnic background, in terms of iPS cells? These seem like magical cells that can create anything. Between different populations, different people, is there a lot of variability between stem cells?
00:24:07.020
Well, first of all, there's the variability that's driven simply by the fact that genetically we're different. A stem cell that's derived from my genotype is going to be different from a stem cell that's derived from your genotype. There are also some differences that have more to do with, for whatever reason, some people's stem cells differentiating better than other people's stem cells; we don't entirely understand why. So there are certainly some differences there as well. But the fundamental difference, and the one that we really care about, and that is a positive, is the fact that the genetics are different, and therefore recapitulate my disease burden versus your disease burden.
link |
00:24:47.780
What's a disease burden?
link |
00:24:49.260
Well, a disease burden is just if you think,
link |
00:24:52.300
I mean, it's not a well defined mathematical term,
link |
00:24:55.060
although there are mathematical formulations of it.
link |
00:24:58.260
If you think about the fact that some of us are more likely
link |
00:25:01.500
to get a certain disease than others
link |
00:25:03.460
because we have more variations in our genome
link |
00:25:07.300
that are causative of the disease,
link |
00:25:09.500
maybe fewer that are protective of the disease.
link |
00:25:12.620
People have quantified that
link |
00:25:14.860
using what are called polygenic risk scores,
link |
00:25:17.860
which look at all of the variations
link |
00:25:20.820
in an individual person's genome
link |
00:25:23.620
and add them all up in terms of how much risk they confer
link |
00:25:26.620
for a particular disease.
link |
00:25:27.820
And then they've put people on a spectrum
link |
00:25:30.540
of their disease risk.
link |
00:25:32.540
And for certain diseases where we've been sufficiently
link |
00:25:36.500
powered to really understand the connection
link |
00:25:38.740
between the many, many small variations
link |
00:25:41.580
that give rise to an increased disease risk,
link |
00:25:44.940
there's some pretty significant differences
link |
00:25:47.060
in terms of the risk between the people,
link |
00:25:49.300
say at the highest decile of this polygenic risk score
link |
00:25:52.060
and the people at the lowest decile.
link |
00:25:53.500
Sometimes those differences are a factor of 10 or 12 higher.
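The polygenic risk score described here is, at its core, a weighted sum over genetic variants. A minimal sketch, with invented variant IDs and effect weights purely for illustration:

```python
# Minimal polygenic-risk-score sketch: sum each variant's effect size
# (e.g. a log odds ratio from a GWAS) weighted by how many risk alleles
# (0, 1, or 2) the person carries. Variants and weights are invented.
gwas_weights = {"rs001": 0.12, "rs002": -0.05, "rs003": 0.30}

def polygenic_risk_score(genotype):
    """genotype maps variant id -> risk-allele count (0, 1, or 2)."""
    return sum(w * genotype.get(v, 0) for v, w in gwas_weights.items())

person_a = {"rs001": 2, "rs002": 0, "rs003": 1}
person_b = {"rs001": 0, "rs002": 2, "rs003": 0}
print(polygenic_risk_score(person_a))  # higher burden
print(polygenic_risk_score(person_b))  # lower burden
```

Ranking people by such a score is what puts them on the spectrum of disease risk she describes.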
link |
00:25:58.940
So there's definitely a lot that our genetics
link |
00:26:03.940
contributes to disease risk, even if it's not
link |
00:26:07.100
by any stretch the full explanation.
link |
00:26:09.100
And from a machine learning perspective,
link |
00:26:10.500
there's signal there.
link |
00:26:12.020
There is definitely signal in the genetics
link |
00:26:14.780
and there's even more signal, we believe,
link |
00:26:19.100
in looking at the cells that are derived
link |
00:26:21.540
from those different genetics because in principle,
link |
00:26:25.540
you could say all the signal is there at the genetics level.
link |
00:26:28.660
So we don't need to look at the cells,
link |
00:26:30.180
but our understanding of the biology is so limited at this
link |
00:26:34.100
point that seeing what actually happens at the cellular
link |
00:26:37.100
level is a heck of a lot closer to the human clinical outcome
link |
00:26:41.780
than looking at the genetics directly.
link |
00:26:44.620
And so we can learn a lot more from it
link |
00:26:47.180
than we could by looking at genetics alone.
link |
00:26:49.420
So just to get a sense, I don't know if it's easy to do,
link |
00:26:51.660
but what kind of data is useful
link |
00:26:54.220
in this disease in a dish model?
link |
00:26:56.220
Like what's the source of raw data information?
link |
00:26:59.940
And also from my outsider's perspective,
link |
00:27:03.900
so biology and cells are squishy things.
link |
00:27:08.620
And then how do you connect the computer to that?
link |
00:27:15.620
Which sensory mechanisms, I guess.
link |
00:27:17.780
So that's another one of those revolutions
link |
00:27:20.660
that have happened in the last 10 years
link |
00:27:22.540
in that our ability to measure cells very quantitatively
link |
00:27:27.540
has also dramatically increased.
link |
00:27:30.020
So back when I started doing biology in the late 90s,
link |
00:27:35.260
early 2000s, that was the initial era
link |
00:27:40.820
where we started to measure biology
link |
00:27:42.500
in really quantitative ways using things like microarrays,
link |
00:27:46.420
where you would measure in a single experiment
link |
00:27:50.580
the activity level, what's called expression level
link |
00:27:53.820
of every gene in the genome in that sample.
link |
00:27:56.980
And that ability is what actually allowed us
link |
00:28:00.340
to even understand that there are molecular subtypes
link |
00:28:04.180
of diseases like cancer, where up until that point,
link |
00:28:06.820
it's like, oh, you have breast cancer.
link |
00:28:09.220
But then when we looked at the molecular data,
link |
00:28:13.180
it was clear that there's different subtypes
link |
00:28:14.940
of breast cancer that at the level of gene activity
link |
00:28:17.460
look completely different to each other.
link |
00:28:20.660
So that was the beginning of this process.
link |
00:28:23.100
Now we have the ability to measure individual cells
link |
00:28:26.900
in terms of their gene activity
link |
00:28:28.860
using what's called single cell RNA sequencing,
link |
00:28:31.340
which basically sequences the RNA,
link |
00:28:35.020
which is that activity level of different genes
link |
00:28:39.300
for every gene in the genome.
link |
00:28:40.980
And you could do that at single cell level.
link |
00:28:42.700
So that's an incredibly powerful way of measuring cells.
link |
00:28:45.380
I mean, you literally count the number of transcripts.
link |
00:28:47.860
So it really turns that squishy thing
link |
00:28:50.020
into something that's digital.
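The "digital" nature of single cell RNA sequencing can be made concrete: the output is literally a matrix of transcript counts, cells by genes. A toy sketch (gene names and counts are invented; real matrices are far larger and much sparser):

```python
import numpy as np

# Toy single-cell RNA-seq count matrix: rows are cells, columns are genes,
# entries are transcript counts. Real matrices are ~10^4 cells x ~2x10^4
# genes; everything here is invented for illustration.
genes = ["GENE_A", "GENE_B", "GENE_C"]
counts = np.array([
    [120,   0,  5],   # cell 1
    [  3,  88,  0],   # cell 2
    [115,   2,  7],   # cell 3
])

# A common first step: normalize each cell to the same total sequencing
# depth, then log-transform so highly expressed genes don't dominate.
depth = counts.sum(axis=1, keepdims=True)
normalized = np.log1p(counts / depth * 1e4)
print(normalized.round(2))
```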
link |
00:28:51.820
Another tremendous data source that's emerged
link |
00:28:55.100
in the last few years is microscopy
link |
00:28:57.860
and specifically even super resolution microscopy,
link |
00:29:00.580
where you could use digital reconstruction
link |
00:29:03.460
to look at subcellular structures,
link |
00:29:06.460
sometimes even things that are below
link |
00:29:08.380
the diffraction limit of light
link |
00:29:10.540
by doing a sophisticated reconstruction.
link |
00:29:13.340
And again, that gives you a tremendous amount of information
link |
00:29:16.500
at the subcellular level.
link |
00:29:18.420
There's now more and more ways that amazing scientists
link |
00:29:22.860
out there are developing for getting new types
link |
00:29:27.540
of information from even single cells.
link |
00:29:30.820
And so that is a way of turning those squishy things
link |
00:29:35.500
into digital data.
link |
00:29:37.260
Into beautiful data sets.
link |
00:29:38.660
But so that data set then with machine learning tools
link |
00:29:42.540
allows you to maybe understand the developmental,
link |
00:29:45.820
like the mechanism of a particular disease.
link |
00:29:49.900
And if it's possible to sort of at a high level describe,
link |
00:29:54.300
how does that help lead to a drug discovery
link |
00:30:01.180
that can help prevent, reverse that mechanism?
link |
00:30:05.380
So I think there's different ways in which this data
link |
00:30:08.180
could potentially be used.
link |
00:30:10.420
Some people use it for scientific discovery
link |
00:30:13.820
and say, oh, look, we see this phenotype
link |
00:30:17.060
at the cellular level.
link |
00:30:20.060
So let's try and work our way backwards
link |
00:30:22.940
and think which genes might be involved in pathways
link |
00:30:26.100
that give rise to that.
link |
00:30:27.060
So that's a very sort of analytical method
link |
00:30:32.380
to sort of work our way backwards
link |
00:30:35.140
using our understanding of known biology.
link |
00:30:38.500
Some people use it in a somewhat more,
link |
00:30:44.100
sort of forward, if that was a backward,
link |
00:30:46.580
this would be forward, which is to say,
link |
00:30:48.140
okay, if I can perturb this gene,
link |
00:30:51.140
does it show a phenotype that is similar
link |
00:30:54.060
to what I see in disease patients?
link |
00:30:56.020
And so maybe that gene is actually causal of the disease.
link |
00:30:58.980
So that's a different way.
link |
00:31:00.180
And then there's what we do,
link |
00:31:01.580
which is basically to take that very large collection
link |
00:31:06.260
of data and use machine learning to uncover the patterns
link |
00:31:10.660
that emerge from it.
link |
00:31:12.340
So for instance, what are those subtypes
link |
00:31:14.900
that might be similar at the human clinical outcome,
link |
00:31:18.620
but quite distinct when you look at the molecular data?
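One hedged sketch of that pattern-uncovering step: samples that share a single clinical label can still separate into distinct clusters in molecular-feature space. The data and the simple 2-means loop below are purely illustrative, not insitro's actual pipeline:

```python
import numpy as np

# Toy subtype discovery: 40 "patients" share one clinical label but split
# into two clusters in gene-expression space. All values are synthetic.
rng = np.random.default_rng(0)
subtype_1 = rng.normal(loc=[5.0, 1.0], scale=0.3, size=(20, 2))
subtype_2 = rng.normal(loc=[1.0, 5.0], scale=0.3, size=(20, 2))
X = np.vstack([subtype_1, subtype_2])  # 40 patients x 2 genes

# Plain 2-means: assign each sample to its nearest centroid, recompute.
centroids = X[[0, -1]].copy()
for _ in range(10):
    d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
    labels = d.argmin(axis=1)
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

print("cluster sizes:", np.bincount(labels))
```

The two recovered clusters stand in for molecular subtypes that a single clinical diagnosis would lump together.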
link |
00:31:21.740
And then if we can identify such a subtype,
link |
00:31:25.140
are there interventions such that if you take
link |
00:31:27.980
to cells that come from this subtype of the disease
link |
00:31:32.060
and you apply that intervention,
link |
00:31:34.140
it could be a drug or it could be a CRISPR gene intervention,
link |
00:31:38.820
does it revert the disease state
link |
00:31:41.340
to something that looks more like normal,
link |
00:31:42.980
happy, healthy cells?
link |
00:31:44.100
And so hopefully if you see that,
link |
00:31:46.900
that gives you a certain hope
link |
00:31:50.380
that that intervention will also have
link |
00:31:53.100
a meaningful clinical benefit to people.
link |
00:31:55.100
And there's obviously a bunch of things
link |
00:31:56.580
that you would wanna do after that to validate that,
link |
00:31:58.740
but it's a very different and much less hypothesis driven way
link |
00:32:03.900
of uncovering new potential interventions
link |
00:32:06.100
and might give rise to things that are not the same things
link |
00:32:10.100
that everyone else is already looking at.
link |
00:32:12.460
That's, I don't know, I just like to psychoanalyze
link |
00:32:16.780
my own feeling about our discussion currently.
link |
00:32:18.700
It's so exciting to talk about sort of a machine,
link |
00:32:21.500
fundamentally, well, something that's been turned
link |
00:32:23.780
into a machine learning problem
link |
00:32:25.900
and that can have so much real world impact.
link |
00:32:29.140
That's how I feel too.
link |
00:32:30.340
That's kind of exciting because I'm so,
link |
00:32:32.260
most of my day is spent with data sets
link |
00:32:35.740
that are, I guess, closer to the newsgroups.
link |
00:32:39.060
So this is a kind of, it just feels good to talk about.
link |
00:32:41.980
In fact, I almost don't wanna talk about machine learning.
link |
00:32:45.340
I wanna talk about the fundamentals of the data set,
link |
00:32:47.460
which is an exciting place to be.
link |
00:32:50.420
I agree with you.
link |
00:32:51.740
It's what gets me up in the morning.
link |
00:32:53.740
It's also what attracts a lot of the people
link |
00:32:57.140
who work at Incitro to Incitro
link |
00:32:59.140
because I think all of the,
link |
00:33:01.660
certainly all of our machine learning people
link |
00:33:03.220
are outstanding and could go get a job selling ads online
link |
00:33:08.220
or doing eCommerce or even self driving cars.
link |
00:33:12.500
But I think they would want, they come to us
link |
00:33:17.860
because they want to work on something
link |
00:33:20.020
that has more of an aspirational nature
link |
00:33:22.380
and can really benefit humanity.
link |
00:33:24.740
What, with these approaches, what do you hope,
link |
00:33:28.300
what kind of diseases can be helped?
link |
00:33:31.140
We mentioned Alzheimer's, schizophrenia, type 2 diabetes.
link |
00:33:33.940
Can you just describe the various kinds of diseases
link |
00:33:36.540
that this approach can help?
link |
00:33:38.580
Well, we don't know.
link |
00:33:39.620
And I try and be very cautious about making promises
link |
00:33:43.900
about some things that, oh, we will cure X.
link |
00:33:46.620
People make that promise.
link |
00:33:48.060
And I think it's, I tried to first deliver and then promise
link |
00:33:52.700
as opposed to the other way around.
link |
00:33:54.460
There are characteristics of a disease
link |
00:33:57.340
that make it more likely that this type of approach
link |
00:34:00.580
can potentially be helpful.
link |
00:34:02.700
So for instance, diseases that have
link |
00:34:04.580
a very strong genetic basis are ones
link |
00:34:08.820
that are more likely to manifest
link |
00:34:10.940
in a stem cell derived model.
link |
00:34:13.860
We would want the cellular models
link |
00:34:16.300
to be relatively reproducible and robust
link |
00:34:19.940
so that you could actually get enough of those cells
link |
00:34:25.380
and in a way that isn't very highly variable and noisy.
link |
00:34:30.740
You would want the disease to be relatively contained
link |
00:34:34.140
in one or a small number of cell types
link |
00:34:36.700
that you could actually create in an in vitro,
link |
00:34:40.020
in a dish setting.
link |
00:34:40.980
Whereas if it's something that's really broad and systemic
link |
00:34:43.460
and involves multiple cells
link |
00:34:45.540
that are in very distal parts of your body,
link |
00:34:48.460
putting that all in the dish is really challenging.
link |
00:34:50.980
So we want to focus on the ones
link |
00:34:53.740
that are most likely to be successful today
link |
00:34:56.980
with the hope, I think, that really smart bioengineers
link |
00:35:01.980
out there are developing better and better systems
link |
00:35:04.900
all the time so that diseases that might not be tractable
link |
00:35:07.900
today might be tractable in three years.
link |
00:35:11.220
So for instance, five years ago,
link |
00:35:14.340
these stem cell derived models didn't really exist.
link |
00:35:16.140
People were doing most of the work in cancer cells
link |
00:35:18.540
and cancer cells are very, very poor models
link |
00:35:21.660
of most human biology because they're,
link |
00:35:24.300
A, they were cancer to begin with
link |
00:35:25.820
and B, as you passage them and they proliferate in a dish,
link |
00:35:30.140
they become, because of the genomic instability,
link |
00:35:32.660
even less similar to human biology.
link |
00:35:35.700
Now we have these stem cell derived models.
link |
00:35:39.340
We have the capability to reasonably robustly,
link |
00:35:42.620
not quite at the right scale yet, but close,
link |
00:35:45.820
to derive what's called organoids,
link |
00:35:47.940
which are these teeny little sort of multicellular organ,
link |
00:35:54.820
sort of models of an organ system.
link |
00:35:56.660
So there's cerebral organoids and liver organoids
link |
00:35:59.300
and kidney organoids and.
link |
00:36:01.620
Yeah, brain organoids.
link |
00:36:03.460
That's organoids.
link |
00:36:04.300
It's possibly the coolest thing I've ever seen.
link |
00:36:05.500
Is that not like the coolest thing?
link |
00:36:07.500
Yeah.
link |
00:36:08.380
And then I think on the horizon,
link |
00:36:09.940
we're starting to see things like connecting
link |
00:36:11.780
these organoids to each other
link |
00:36:13.900
so that you could actually start,
link |
00:36:15.140
and there's some really cool papers that start to do that
link |
00:36:17.620
where you can actually start to say,
link |
00:36:19.020
okay, can we do multi organ system stuff?
link |
00:36:22.180
There's many challenges to that.
link |
00:36:23.500
It's not easy by any stretch, but it might,
link |
00:36:27.780
I'm sure people will figure it out.
link |
00:36:29.460
And in three years or five years,
link |
00:36:31.580
there will be disease models that we could make
link |
00:36:34.020
for things that we can't make today.
link |
00:36:35.420
Yeah, and this conversation would seem almost outdated
link |
00:36:38.700
with the kind of scale that could be achieved
link |
00:36:40.460
in like three years.
link |
00:36:41.300
I hope so.
link |
00:36:42.140
That's the hope.
link |
00:36:42.980
That would be so cool.
link |
00:36:43.820
So you've cofounded Coursera with Andrew Ng
link |
00:36:48.060
and were part of the whole MOOC revolution.
link |
00:36:51.380
So to jump topics a little bit,
link |
00:36:53.900
can you maybe tell the origin story of the history,
link |
00:36:57.900
the origin story of MOOCs, of Coursera,
link |
00:37:00.900
and in general, your teaching to huge audiences
link |
00:37:07.100
on a very sort of impactful topic of AI in general?
link |
00:37:12.100
So I think the origin story of MOOCs
link |
00:37:15.860
emanates from a number of efforts
link |
00:37:17.940
that occurred at Stanford University
link |
00:37:20.580
around the late 2000s
link |
00:37:25.420
where different individuals within Stanford,
link |
00:37:28.580
myself included, were getting really excited
link |
00:37:31.500
about the opportunities of using online technologies
link |
00:37:35.220
as a way of achieving both improved quality of teaching
link |
00:37:38.980
and also improved scale.
link |
00:37:40.940
And so Andrew, for instance,
link |
00:37:44.420
led the Stanford Engineering Everywhere,
link |
00:37:48.820
which was sort of an attempt to take 10 Stanford courses
link |
00:37:51.660
and put them online just as video lectures.
link |
00:37:55.980
I led an effort within Stanford to take some of the courses
link |
00:38:00.620
and really create a very different teaching model
link |
00:38:04.380
that broke those up into smaller units
link |
00:38:07.340
and had some of those embedded interactions and so on,
link |
00:38:11.060
which got a lot of support from university leaders
link |
00:38:14.620
because they felt like it was potentially a way
link |
00:38:17.380
of improving the quality of instruction at Stanford
link |
00:38:19.580
by moving to what's now called the flipped classroom model.
link |
00:38:22.980
And so those efforts eventually sort of started
link |
00:38:26.620
to interplay with each other
link |
00:38:28.020
and created a tremendous sense of excitement and energy
link |
00:38:30.940
within the Stanford community
link |
00:38:32.780
about the potential of online teaching
link |
00:38:36.380
and led in the fall of 2011
link |
00:38:39.260
to the launch of the first Stanford MOOCs.
link |
00:38:43.740
By the way, MOOCs, it's probably impossible
link |
00:38:46.420
that people don't know, but it's, I guess, massive.
link |
00:38:49.020
Open online courses. Open online courses.
link |
00:38:51.900
We did not come up with the acronym.
link |
00:38:54.300
I'm not particularly fond of the acronym,
link |
00:38:57.020
but it is what it is. It is what it is.
link |
00:38:58.460
Big bang is not a great term for the start of the universe,
link |
00:39:01.380
but it is what it is. Probably so.
link |
00:39:05.220
So anyway, so those courses launched in the fall of 2011,
link |
00:39:10.900
and there were, within a matter of weeks,
link |
00:39:13.780
with no real publicity campaign, just a New York Times article
link |
00:39:17.940
that went viral, about 100,000 students or more
link |
00:39:22.660
in each of those courses.
link |
00:39:24.580
And I remember this conversation that Andrew and I had.
link |
00:39:29.180
We were just like, wow, there's this real need here.
link |
00:39:33.420
And I think we both felt like, sure,
link |
00:39:36.220
we were accomplished academics and we could go back
link |
00:39:39.820
and go back to our labs, write more papers.
link |
00:39:42.620
But if we did that, then this wouldn't happen.
link |
00:39:45.860
And it seemed too important not to happen.
link |
00:39:48.700
And so we spent a fair bit of time debating,
link |
00:39:51.620
do we wanna do this as a Stanford effort,
link |
00:39:55.300
kind of building on what we'd started?
link |
00:39:56.860
Do we wanna do this as a for profit company?
link |
00:39:59.340
Do we wanna do this as a nonprofit?
link |
00:40:00.780
And we decided ultimately to do it as we did with Coursera.
link |
00:40:04.900
And so, you know, we started really operating
link |
00:40:09.900
as a company at the beginning of 2012.
link |
00:40:13.380
And the rest is history.
link |
00:40:15.340
But how did you, was that really surprising to you?
link |
00:40:19.580
How did you at that time and at this time
link |
00:40:23.300
make sense of this need for sort of global education
link |
00:40:27.580
you mentioned that you felt that, wow,
link |
00:40:29.380
the popularity indicates that there's a hunger
link |
00:40:33.260
for sort of globalization of learning.
link |
00:40:37.620
I think there is a hunger for learning that,
link |
00:40:43.620
you know, globalization is part of it,
link |
00:40:45.100
but I think it's just a hunger for learning.
link |
00:40:47.140
The world has changed in the last 50 years.
link |
00:40:50.420
It used to be that you finished college, you got a job,
link |
00:40:54.820
by and large, the skills that you learned in college
link |
00:40:57.020
were pretty much what got you through
link |
00:40:59.700
the rest of your job history.
link |
00:41:01.380
And yeah, you learn some stuff,
link |
00:41:02.940
but it wasn't a dramatic change.
link |
00:41:05.500
Today, we're in a world where the skills that you need
link |
00:41:09.420
for a lot of jobs, they didn't even exist
link |
00:41:11.260
when you went to college.
link |
00:41:12.500
And the jobs, and many of the jobs that existed
link |
00:41:14.540
when you went to college don't even exist today or are dying.
link |
00:41:18.620
So part of that is due to AI, but not only.
link |
00:41:22.580
And we need to find a way of keeping people,
link |
00:41:27.300
giving people access to the skills that they need today.
link |
00:41:29.900
And I think that's really what's driving
link |
00:41:32.020
a lot of this hunger.
link |
00:41:33.900
So I think if we even take a step back,
link |
00:41:37.020
for you, all of this started in trying to think
link |
00:41:39.940
of new ways to teach or to,
link |
00:41:43.140
new ways to sort of organize the material
link |
00:41:47.100
and present the material in a way
link |
00:41:48.380
that would help the education process, the pedagogy, yeah.
link |
00:41:51.380
So what have you learned about effective education
link |
00:41:56.380
from this process of playing,
link |
00:41:57.540
of experimenting with different ideas?
link |
00:42:00.580
So we learned a number of things.
link |
00:42:03.940
Some of which I think could translate back
link |
00:42:06.620
and have translated back effectively
link |
00:42:08.380
to how people teach on campus.
link |
00:42:09.900
And some of which I think are more specific
link |
00:42:11.700
to people who learn online,
link |
00:42:13.820
more sort of people who learn as part of their daily life.
link |
00:42:18.900
So we learned, for instance, very quickly
link |
00:42:20.980
that short is better.
link |
00:42:23.180
So people who are especially in the workforce
link |
00:42:26.820
can't do a 15 week semester long course.
link |
00:42:30.020
They just can't fit that into their lives.
link |
00:42:32.500
Sure, can you describe the shortness of what?
link |
00:42:35.540
The entirety, so every aspect,
link |
00:42:39.060
so the little lecture, the lecture's short,
link |
00:42:41.980
the course is short.
link |
00:42:43.020
Both.
link |
00:42:43.860
We started out, the first online education efforts
link |
00:42:47.820
were actually MIT's OpenCourseWare initiatives.
link |
00:42:50.620
And that was recording of classroom lectures and,
link |
00:42:55.860
Hour and a half or something like that, yeah.
link |
00:42:57.620
And that didn't really work very well.
link |
00:43:00.380
I mean, some people benefit.
link |
00:43:01.540
I mean, of course they did,
link |
00:43:03.140
but it's not really a very palatable experience
link |
00:43:06.700
for someone who has a job and three kids
link |
00:43:11.220
and they need to run errands and such.
link |
00:43:13.980
They can't fit 15 weeks into their life
link |
00:43:17.900
and the hour and a half is really hard.
link |
00:43:20.700
So we learned very quickly.
link |
00:43:22.940
I mean, we started out with short video modules
link |
00:43:26.540
and over time we made them shorter
link |
00:43:28.180
because we realized that 15 minutes was still too long.
link |
00:43:31.660
If you wanna fit in when you're waiting in line
link |
00:43:33.860
for your kid's doctor's appointment,
link |
00:43:35.500
it's better if it's five to seven.
link |
00:43:38.620
We learned that 15 week courses don't work
link |
00:43:42.540
and you really wanna break this up into shorter units
link |
00:43:44.820
so that there is a natural completion point,
link |
00:43:46.820
gives people a sense of they're really close
link |
00:43:48.660
to finishing something meaningful.
link |
00:43:50.420
They can always come back and take part two and part three.
link |
00:43:53.580
We also learned that compressing the content works
link |
00:43:56.500
really well because for some people that pace works well
link |
00:44:00.340
and for others, they can always rewind and watch again.
link |
00:44:03.260
And so people have the ability
link |
00:44:05.340
to then learn at their own pace.
link |
00:44:06.980
And so that flexibility, the brevity and the flexibility
link |
00:44:11.740
are both things that we found to be very important.
link |
00:44:15.420
We learned that engagement during the content is important
link |
00:44:18.780
and the quicker you give people feedback,
link |
00:44:20.620
the more likely they are to be engaged.
link |
00:44:22.540
Hence the introduction of these,
link |
00:44:24.540
which actually was an intuition that I had going in
link |
00:44:27.740
and was then validated using data
link |
00:44:30.900
that introducing some of these sort of little micro quizzes
link |
00:44:34.300
into the lectures really helps.
link |
00:44:36.500
Self graded, as in automatically graded, assessments
link |
00:44:39.420
really helped too because it gives people feedback.
link |
00:44:41.900
See, there you are.
link |
00:44:43.180
So all of these are valuable.
link |
00:44:45.620
And then we learned a bunch of other things too.
link |
00:44:47.260
We did some really interesting experiments, for instance,
link |
00:44:49.420
on gender bias and how having a female role model
link |
00:44:54.180
as an instructor can change the balance of men to women
link |
00:44:59.340
in terms of, especially in STEM courses.
link |
00:45:02.020
And you could do that online by doing AB testing
link |
00:45:04.820
in ways that would be really difficult to do on campus.
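An online A/B test of the kind she mentions ultimately compares an outcome rate between two randomized groups. A minimal two-proportion z-test sketch (all counts are invented for illustration):

```python
from math import sqrt, erf

# Two-proportion z-test sketch for an online A/B experiment, e.g. comparing
# completion rates between two instructor conditions. Counts are invented.
def two_proportion_ztest(success_a, n_a, success_b, n_b):
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = two_proportion_ztest(430, 1000, 380, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With tens of thousands of learners per condition, even small effects of a role model become detectable, which is what makes online experimentation so much easier than on campus.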
link |
00:45:07.740
Oh, that's exciting.
link |
00:45:09.140
But so the shortness, the compression,
link |
00:45:11.540
I mean, that's actually, so that probably is true
link |
00:45:15.700
for all good editing is always just compressing the content,
link |
00:45:20.980
making it shorter.
link |
00:45:21.940
So that puts a lot of burden on the creator of the,
link |
00:45:24.860
the instructor and the creator of the educational content.
link |
00:45:28.660
Probably most lectures at MIT or Stanford
link |
00:45:31.260
could be five times shorter
link |
00:45:34.340
if enough preparation was put in.
link |
00:45:37.580
So maybe people might disagree with that,
link |
00:45:41.660
but like the crispness, the clarity that a lot of the,
link |
00:45:45.340
like Coursera delivers is, how much effort does that take?
link |
00:45:50.140
So first of all, let me say that it's not clear
link |
00:45:54.100
that that crispness would work as effectively
link |
00:45:57.380
in a face to face setting
link |
00:45:58.900
because people need time to absorb the material.
link |
00:46:02.420
And so you need to at least pause
link |
00:46:04.740
and give people a chance to reflect and maybe practice.
link |
00:46:07.300
And that's what MOOCs do is that they give you
link |
00:46:09.500
these chunks of content and then ask you
link |
00:46:11.780
to practice with it.
link |
00:46:13.420
And that's where I think some of the newer pedagogy
link |
00:46:16.300
that people are adopting in face to face teaching
link |
00:46:19.180
that have to do with interactive learning and such
link |
00:46:21.580
can be really helpful.
link |
00:46:23.460
But both those approaches,
link |
00:46:26.620
whether you're doing that type of methodology
link |
00:46:29.380
in online teaching or in that flipped classroom,
link |
00:46:32.820
interactive teaching.
link |
00:46:34.500
What's that, sorry to pause, what's flipped classroom?
link |
00:46:37.180
Flipped classroom is a way in which online content
link |
00:46:41.540
is used to supplement face to face teaching
link |
00:46:45.060
where people watch the videos perhaps
link |
00:46:47.220
and do some of the exercises before coming to class.
link |
00:46:49.860
And then when they come to class,
link |
00:46:51.180
it's actually to do much deeper problem solving
link |
00:46:53.580
oftentimes in a group.
link |
00:46:56.100
But any one of those different pedagogies
link |
00:47:00.460
that are beyond just standing there and droning on
link |
00:47:03.500
in front of the classroom for an hour and 15 minutes
link |
00:47:06.300
require a heck of a lot more preparation.
link |
00:47:09.260
And so it's one of the challenges I think that people have
link |
00:47:13.660
that we had when trying to convince instructors
link |
00:47:15.740
to teach on Coursera.
link |
00:47:16.700
And it's part of the challenges that pedagogy experts
link |
00:47:20.380
on campus have in trying to get faculty
link |
00:47:22.060
to teach differently is that it's actually harder
link |
00:47:23.740
to teach that way than it is to stand there and drone.
link |
00:47:27.860
Do you think MOOCs will replace in person education
link |
00:47:32.420
or become the majority of education,
link |
00:47:37.420
of the way people learn in the future?
link |
00:47:41.380
Again, the future could be very far away,
link |
00:47:43.260
but where's the trend going do you think?
link |
00:47:46.020
So I think it's a nuanced and complicated answer.
link |
00:47:50.140
I don't think MOOCs will replace face to face teaching.
link |
00:47:55.780
I think learning is in many cases a social experience.
link |
00:48:00.300
And even at Coursera, we had people who naturally formed
link |
00:48:05.300
study groups, even when they didn't have to,
link |
00:48:07.780
to just come and talk to each other.
link |
00:48:10.300
And we found that that actually benefited their learning
link |
00:48:14.420
in very important ways.
link |
00:48:15.780
So there was more success among learners
link |
00:48:19.660
who had those study groups than among ones who didn't.
link |
00:48:22.620
So I don't think it's just gonna,
link |
00:48:23.860
oh, we're all gonna just suddenly learn online
link |
00:48:26.060
with a computer and no one else in the same way
link |
00:48:28.940
that recorded music has not replaced live concerts.
link |
00:48:33.180
But I do think that especially when you are thinking
link |
00:48:38.940
about continuing education, the stuff that people get
link |
00:48:42.740
when they're traditional,
link |
00:48:44.700
whatever high school, college education is done,
link |
00:48:47.780
and they yet have to maintain their level of expertise
link |
00:48:52.500
and skills in a rapidly changing world,
link |
00:48:54.620
I think people will consume more and more educational content
link |
00:48:58.180
in this online format because going back to school
link |
00:49:01.380
for formal education is not an option for most people.
link |
00:49:04.860
Briefly, it might be a difficult question to ask,
link |
00:49:07.380
but there's a lot of people fascinated
link |
00:49:09.940
by artificial intelligence, by machine learning,
link |
00:49:12.820
by deep learning.
link |
00:49:13.940
Is there a recommendation for the next year
link |
00:49:18.140
or for a lifelong journey of somebody interested in this?
link |
00:49:21.340
How do they begin?
link |
00:49:23.700
How do they enter that learning journey?
link |
00:49:27.220
I think the important thing is first to just get started.
link |
00:49:30.900
And there's plenty of online content that one can get
link |
00:49:36.580
for both the core foundations of mathematics
link |
00:49:40.460
and statistics and programming.
link |
00:49:42.260
And then from there to machine learning,
link |
00:49:44.580
I would encourage people not to skip
link |
00:49:47.100
too quickly past the foundations
link |
00:49:48.700
because I find that there's a lot of people
link |
00:49:51.060
who learn machine learning, whether it's online
link |
00:49:53.740
or on campus without getting those foundations.
link |
00:49:56.180
And they basically just turn the crank on existing models
link |
00:50:00.020
in ways that A, don't allow for a lot of innovation
link |
00:50:03.540
and an adjustment to the problem at hand,
link |
00:50:07.700
but also B, are sometimes just wrong
link |
00:50:09.660
and they don't even realize that their application is wrong
link |
00:50:12.900
because there's artifacts that they haven't fully understood.
link |
00:50:15.940
So I think the foundations,
link |
00:50:17.860
machine learning is an important step.
link |
00:50:19.860
And then actually start solving problems,
link |
00:50:24.860
try and find someone to solve them with
link |
00:50:27.620
because especially at the beginning,
link |
00:50:28.980
it's useful to have someone to bounce ideas off
link |
00:50:31.580
and fix mistakes that you make
link |
00:50:33.220
and you can fix mistakes that they make,
link |
00:50:35.980
but then just find practical problems,
link |
00:50:40.540
whether it's in your workplace or if you don't have that,
link |
00:50:43.300
Kaggle competitions or such are a really great place
link |
00:50:46.100
to find interesting problems and just practice.
link |
00:50:50.860
Practice.
link |
00:50:52.340
Perhaps a bit of a romanticized question,
link |
00:50:54.540
but what idea in deep learning do you find,
link |
00:50:59.340
have you found in your journey the most beautiful
link |
00:51:02.220
or surprising or interesting?
link |
00:51:07.660
Perhaps not just deep learning,
link |
00:51:09.420
but AI in general, statistics.
link |
00:51:14.940
I'm gonna answer with two things.
link |
00:51:19.100
One would be the foundational concept of end to end training,
link |
00:51:23.100
which is that you start from the raw data
link |
00:51:26.940
and you train something that is not like a single piece,
link |
00:51:32.980
but rather towards the actual goal that you're looking to achieve.
link |
00:51:38.980
From the raw data to the outcome,
link |
00:51:40.820
like no details in between.
link |
00:51:43.580
Well, not no details, but the fact that you,
link |
00:51:45.460
I mean, you could certainly introduce building blocks
link |
00:51:47.540
that were trained towards other tasks.
link |
00:51:50.260
I'm actually coming to that in my second half of the answer,
link |
00:51:53.060
but it doesn't have to be like a single monolithic blob
link |
00:51:57.740
in the middle.
link |
00:51:58.580
Actually, I think that's not ideal,
link |
00:52:00.220
but rather the fact that at the end of the day,
link |
00:52:02.620
you can actually train something that goes all the way
link |
00:52:04.780
from the beginning to the end.
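The end-to-end idea can be sketched in a few lines of code. This is an illustrative toy, not anything from the conversation: a tiny two-layer network where every parameter, including the intermediate representation, is updated against the final task loss, and all sizes, data, and the learning rate are made-up values.

```python
import numpy as np

# Illustrative sketch: a tiny two-layer network trained "end to end",
# meaning every parameter, including the intermediate representation,
# is updated against the final task loss. All values are toy choices.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                    # "raw" inputs
y = (X[:, 0] * X[:, 1] > 0).astype(float)        # synthetic end goal

W1 = rng.normal(scale=0.5, size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)                     # learned, not hand-designed
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, p.ravel()

losses, lr = [], 0.5
for _ in range(300):
    h, p = forward(X)
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    dlogit = (p - y)[:, None] / len(X)           # gradient of the final loss
    dW2, db2 = h.T @ dlogit, dlogit.sum(0)
    dh = (dlogit @ W2.T) * (1.0 - h ** 2)        # backpropagated through tanh
    dW1, db1 = X.T @ dh, dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

print(round(losses[0], 3), round(losses[-1], 3))
```

The point is simply that one differentiable chain runs from raw input to outcome, so the loss at the end shapes every stage in between.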
link |
00:52:06.900
And the other one that I find really compelling
link |
00:52:09.140
is the notion of learning a representation
link |
00:52:13.180
that in its turn, even if it was trained to another task,
link |
00:52:18.180
can potentially be used as a much more rapid starting point
link |
00:52:24.260
to solving a different task.
link |
00:52:26.700
And that's, I think, reminiscent
link |
00:52:29.500
of what makes people successful learners.
link |
00:52:32.300
It's something that is relatively new
link |
00:52:35.460
in the machine learning space.
link |
00:52:36.540
I think it's underutilized even relative
link |
00:52:38.700
to today's capabilities, but more and more
link |
00:52:41.460
of how do we learn sort of reusable representation?
link |
00:52:45.220
And so end to end and transfer learning.
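The representation-reuse idea can be sketched similarly. In this illustrative toy (again, not from the conversation), a random matrix stands in for a feature extractor already trained on a large source task, and only a small head is fit on the new task's limited data; the toy target is deliberately constructed so the "pretrained" features really are relevant.

```python
import numpy as np

# Illustrative sketch of reusing a learned representation: a random
# matrix stands in for a "pretrained" feature extractor, and only a
# small linear head is fit on the new task's limited data. The toy
# target is built so the reused features are genuinely relevant.
rng = np.random.default_rng(1)

W_pre = rng.normal(size=(8, 16))                 # stand-in pretrained weights
def features(X):
    return np.tanh(X @ W_pre)                    # frozen representation, reused as-is

X_new = rng.normal(size=(40, 8))                 # small dataset for the new task
y_new = features(X_new) @ rng.normal(size=16) + 0.1 * rng.normal(size=40)

# Fit just a linear head on top of the frozen features (ridge regression).
H = features(X_new)
head = np.linalg.solve(H.T @ H + 0.1 * np.eye(16), H.T @ y_new)
mse = float(np.mean((H @ head - y_new) ** 2))
print(round(mse, 4))
```

Because the representation is kept frozen, the new task needs only as much data as it takes to fit the small head, which is the "much more rapid starting point" described above.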
link |
00:52:49.700
Yeah.
link |
00:52:51.140
Is it surprising to you that neural networks
link |
00:52:53.660
are able to, in many cases, do these things?
link |
00:52:56.980
Is it maybe taken back to when you first would dive deep
link |
00:53:02.260
into neural networks or in general, even today,
link |
00:53:05.460
is it surprising that neural networks work at all
link |
00:53:07.860
and work wonderfully to do this kind of raw end to end
link |
00:53:12.860
and end to end learning and even transfer learning?
link |
00:53:16.380
I think I was surprised by how well
link |
00:53:22.540
when you have large enough amounts of data,
link |
00:53:26.820
it's possible to find a meaningful representation
link |
00:53:32.940
in what is an exceedingly high dimensional space.
link |
00:53:36.060
And so I find that to be really exciting
link |
00:53:39.300
and people are still working on the math for that.
link |
00:53:41.620
There's more papers on that every year.
link |
00:53:43.580
And I think it would be really cool
link |
00:53:46.220
if we figured that out, but that to me was a surprise
link |
00:53:52.220
because in the early days when I was starting my way
link |
00:53:55.420
in machine learning and the data sets were rather small,
link |
00:53:58.700
I think we believed, I believed that you needed
link |
00:54:02.780
to have a much more constrained
link |
00:54:05.500
and knowledge rich search space
link |
00:54:08.620
to really make, to really get to a meaningful answer.
link |
00:54:11.860
And I think it was true at the time.
link |
00:54:13.860
What I think is still a question
link |
00:54:18.220
is will a completely knowledge free approach
link |
00:54:23.180
where there's no prior knowledge going
link |
00:54:26.020
into the construction of the model,
link |
00:54:28.980
is that gonna be the solution or not?
link |
00:54:31.620
It's not actually the solution today
link |
00:54:34.180
in the sense that the architecture of a convolutional
link |
00:54:38.940
neural network that's used for images
link |
00:54:41.500
is actually quite different
link |
00:54:43.260
to the type of network that's used for language
link |
00:54:46.580
and yet different from the one that's used for speech
link |
00:54:50.220
or biology or any other application.
link |
00:54:52.500
There's still some insight that goes
link |
00:54:55.860
into the structure of the network
link |
00:54:58.180
to get the right performance.
link |
00:55:00.820
Will you be able to come up
link |
00:55:01.660
with a universal learning machine?
link |
00:55:03.220
I don't know.
link |
00:55:05.100
I wonder if there always has to be some insight
link |
00:55:07.300
injected somewhere or whether it can converge.
link |
00:55:10.300
So you've done a lot of interesting work
link |
00:55:13.580
with probabilistic graphical models in general,
link |
00:55:16.340
Bayesian deep learning and so on.
link |
00:55:18.420
Can you maybe speak high level,
link |
00:55:21.060
how can learning systems deal with uncertainty?
link |
00:55:25.500
One of the limitations I think of a lot
link |
00:55:28.940
of machine learning models is that
link |
00:55:33.780
they come up with an answer
link |
00:55:35.780
and you don't know how much you can believe that answer.
link |
00:55:40.860
And oftentimes the answer is actually
link |
00:55:47.740
quite poorly calibrated relative to its uncertainties.
link |
00:55:50.580
Even if you look at where the confidence
link |
00:55:55.500
that comes out of say the neural network at the end,
link |
00:55:58.980
and you ask how much more likely
link |
00:56:01.820
is an answer of 0.8 versus 0.9,
link |
00:56:04.820
it's not really in any way calibrated
link |
00:56:07.700
to the actual reliability of that network
link |
00:56:12.340
and how true it is.
link |
00:56:13.180
And the further away you move from the training data,
link |
00:56:16.780
the more, not only the more wrong the network is,
link |
00:56:20.700
often it's more wrong and more confident
link |
00:56:22.580
in its wrong answer.
link |
00:56:24.380
And that is a serious issue in a lot of application areas.
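The calibration gap described here can be measured with a standard metric such as expected calibration error: bin predictions by stated confidence and compare each bin's average confidence with its actual accuracy. The predictors below are simulated stand-ins and every number is made up for illustration.

```python
import numpy as np

# Illustrative sketch of measuring miscalibration with expected
# calibration error (ECE): bin predictions by stated confidence and
# compare each bin's average confidence with its actual accuracy.
def expected_calibration_error(conf, correct, n_bins=10):
    bins = np.clip((conf * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return ece

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=5000)
# Calibrated: correct exactly as often as it claims.
# Overconfident: correct 20 points less often than it claims.
calibrated = (rng.uniform(size=5000) < conf).astype(float)
overconfident = (rng.uniform(size=5000) < conf - 0.2).astype(float)
ece_good = expected_calibration_error(conf, calibrated)
ece_over = expected_calibration_error(conf, overconfident)
print(round(ece_good, 3), round(ece_over, 3))
```

A network whose 0.8s are right 80% of the time scores near zero on this metric; the overconfident simulation scores around the size of its systematic gap.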
link |
00:56:29.340
So when you think for instance,
link |
00:56:30.380
about medical diagnosis as being maybe an epitome
link |
00:56:33.340
of how problematic this can be,
link |
00:56:35.700
if you were training your network
link |
00:56:37.700
on a certain set of patients
link |
00:56:40.180
and a certain patient population,
link |
00:56:41.540
and I have a patient that is an outlier
link |
00:56:44.620
and there's no human that looks at this,
link |
00:56:46.780
and that patient is put into a neural network
link |
00:56:49.100
and your network not only gives
link |
00:56:50.340
a completely incorrect diagnosis,
link |
00:56:51.940
but is supremely confident
link |
00:56:53.980
in its wrong answer, you could kill people.
link |
00:56:56.340
So I think creating more of an understanding
link |
00:57:01.940
of how do you produce networks
link |
00:57:05.540
that are calibrated in their uncertainty
link |
00:57:09.060
and can also say, you know what, I give up.
link |
00:57:10.940
I don't know what to say about this particular data instance
link |
00:57:14.580
because I've never seen something
link |
00:57:16.340
that's sufficiently like it before.
link |
00:57:18.140
I think it's going to be really important
link |
00:57:20.540
in mission critical applications,
link |
00:57:23.060
especially ones where human life is at stake
link |
00:57:25.380
and that includes medical applications,
link |
00:57:28.300
but it also includes automated driving
link |
00:57:31.180
because you'd want the network to be able to say,
link |
00:57:33.300
you know what, I have no idea what this blob is
link |
00:57:36.020
that I'm seeing in the middle of the road.
link |
00:57:37.140
So I'm just going to stop
link |
00:57:38.380
because I don't want to potentially run over a pedestrian
link |
00:57:41.540
that I don't recognize.
link |
00:57:42.820
Are there good mechanisms, ideas of how to allow
link |
00:57:47.540
learning systems to provide that uncertainty
link |
00:57:52.260
along with their predictions?
link |
00:57:54.060
Certainly people have come up with mechanisms
link |
00:57:57.180
that involve Bayesian deep learning,
link |
00:58:00.700
deep learning that involves Gaussian processes.
link |
00:58:04.460
I mean, there's a slew of different approaches
link |
00:58:07.660
that people have come up with.
link |
00:58:09.180
There's methods that use ensembles of networks
link |
00:58:13.660
trained with different subsets of data
link |
00:58:15.260
or different random starting points.
link |
00:58:17.620
Those are actually sometimes surprisingly good
link |
00:58:20.260
at creating a sort of set of how confident
link |
00:58:24.020
or not you are in your answer.
link |
00:58:26.580
It's very much an area of open research.
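The ensemble approach just described can be sketched as follows, with simple logistic regressions standing in for neural networks (an illustrative simplification): members are trained on different bootstrap subsets from different random starting points, and their disagreement serves as the uncertainty signal.

```python
import numpy as np

# Illustrative sketch of ensemble-based uncertainty: members trained
# on different bootstrap subsets with different random starting points;
# the spread of their predictions is the confidence signal. Simple
# logistic regressions stand in for neural networks here.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def train_member(X, y, steps=200, lr=0.5, seed=0):
    r = np.random.default_rng(seed)
    w, b = r.normal(scale=0.1, size=X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = (p - y) / len(X)
        w -= lr * (X.T @ g); b -= lr * g.sum()
    return w, b

members = []
for k in range(5):
    idx = rng.integers(0, len(X), len(X))        # bootstrap subset
    members.append(train_member(X[idx], y[idx], seed=k))

def predict_with_uncertainty(x):
    preds = np.array([1.0 / (1.0 + np.exp(-(x @ w + b))) for w, b in members])
    return preds.mean(), preds.std()             # spread = disagreement

mean_near, spread_near = predict_with_uncertainty(np.array([0.5, 0.5]))
mean_far, spread_far = predict_with_uncertainty(np.array([30.0, -31.0]))
print(round(spread_near, 3), round(spread_far, 3))
```

A downstream system can then abstain, or ask for human supervision, whenever the spread crosses a threshold, which is the "I give up" behavior described above.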
link |
00:58:30.020
Let's cautiously venture back into the land of philosophy
link |
00:58:33.660
and speaking of AI systems providing uncertainty,
link |
00:58:37.660
somebody like Stuart Russell believes
link |
00:58:41.140
that as we create more and more intelligence systems,
link |
00:58:43.420
it's really important for them to be full of self doubt
link |
00:58:46.820
because if they're given more and more power,
link |
00:58:51.940
we want the way to maintain human control
link |
00:58:54.820
over AI systems or human supervision, which is true.
link |
00:58:57.900
Like you just mentioned with autonomous vehicles,
link |
00:58:59.500
it's really important to get human supervision
link |
00:59:02.420
when the car is not sure because if it's really confident
link |
00:59:05.940
in cases when it can get in trouble,
link |
00:59:07.860
it's gonna be really problematic.
link |
00:59:09.380
So let me ask about sort of the questions of AGI
link |
00:59:12.980
and human level intelligence.
link |
00:59:14.860
I mean, we've talked about curing diseases,
link |
00:59:18.780
which is sort of fundamental thing
link |
00:59:20.180
we can have an impact today,
link |
00:59:21.780
but AI people also dream of both understanding
link |
00:59:26.180
and creating intelligence.
link |
00:59:29.220
Is that something you think about?
link |
00:59:30.420
Is that something you dream about?
link |
00:59:32.780
Is that something you think is within our reach
link |
00:59:36.980
to be thinking about as computer scientists?
link |
00:59:39.660
Well, boy, let me tease apart different parts
link |
00:59:43.500
of that question.
link |
00:59:45.180
The worst question.
link |
00:59:46.420
Yeah, it's a multi part question.
link |
00:59:50.940
So let me start with the feasibility of AGI.
link |
00:59:57.500
Then I'll talk about the timelines a little bit
link |
01:00:01.500
and then talk about, well, what controls does one need
link |
01:00:05.980
when thinking about protections in the AI space?
link |
01:00:10.540
So, I think AGI obviously is a longstanding dream
link |
01:00:17.180
that even our early pioneers in the space had,
link |
01:00:21.300
the Turing test and so on
link |
01:00:23.460
are the earliest discussions of that.
link |
01:00:27.580
We're obviously closer than we were 70 or so years ago,
link |
01:00:32.580
but I think it's still very far away.
link |
01:00:37.420
I think machine learning algorithms today
link |
01:00:40.900
are really exquisitely good pattern recognizers
link |
01:00:46.180
in very specific problem domains
link |
01:00:49.420
where they have seen enough training data
link |
01:00:51.540
to make good predictions.
link |
01:00:53.740
You take a machine learning algorithm
link |
01:00:57.860
and you move it to a slightly different version
link |
01:01:00.660
of even that same problem, far less one that's different
link |
01:01:03.780
and it will just completely choke.
link |
01:01:06.980
So I think we're nowhere close to the versatility
link |
01:01:11.620
and flexibility of even a human toddler
link |
01:01:15.620
in terms of their ability to context switch
link |
01:01:19.740
and solve different problems
link |
01:01:20.740
using a single knowledge base, single brain.
link |
01:01:24.340
So am I desperately worried about
link |
01:01:28.820
the machines taking over the universe
link |
01:01:33.540
and starting to kill people
link |
01:01:35.500
because they want to have more power?
link |
01:01:37.380
I don't think so.
link |
01:01:38.460
Well, so to pause on that,
link |
01:01:40.460
so you kind of intuited that super intelligence
link |
01:01:43.620
is a very difficult thing to achieve.
link |
01:01:46.300
Even intelligence.
link |
01:01:47.140
Intelligence, intelligence.
link |
01:01:48.180
Super intelligence, we're not even close to intelligence.
link |
01:01:50.500
Even just the greater abilities of generalization
link |
01:01:53.380
of our current systems.
link |
01:01:55.180
But we haven't answered all the parts
link |
01:01:59.180
and we'll take another.
link |
01:02:00.020
I'm getting to the second part.
link |
01:02:00.860
Okay, but maybe another tangent you can also pick up
link |
01:02:04.340
is can we get in trouble with much dumber systems?
link |
01:02:08.140
Yes, and that is exactly where I was going.
link |
01:02:11.300
So just to wrap up on the threats of AGI,
link |
01:02:16.140
I think that it seems to me a little early today
link |
01:02:21.140
to figure out protections against a human level
link |
01:02:26.220
or superhuman level intelligence
link |
01:02:28.620
where we don't even see the skeleton
link |
01:02:31.580
of what that would look like.
link |
01:02:33.140
So it seems that it's very speculative
link |
01:02:35.740
on how to protect against that.
link |
01:02:39.820
But we can definitely and have gotten into trouble
link |
01:02:43.940
on much dumber systems.
link |
01:02:45.980
And a lot of that has to do with the fact
link |
01:02:48.340
that the systems that we're building are increasingly
link |
01:02:52.300
complex, increasingly poorly understood.
link |
01:02:57.380
And there's ripple effects that are unpredictable
link |
01:03:01.420
in changing little things that can have dramatic consequences
link |
01:03:06.420
on the outcome.
link |
01:03:08.460
And by the way, that's not unique to artificial intelligence.
link |
01:03:11.620
I think artificial intelligence exacerbates that,
link |
01:03:13.820
brings it to a new level.
link |
01:03:15.100
But heck, our electric grid is really complicated.
link |
01:03:18.420
The software that runs our financial markets
link |
01:03:20.820
is really complicated.
link |
01:03:22.540
And we've seen those ripple effects translate
link |
01:03:25.820
to dramatic negative consequences,
link |
01:03:28.540
like for instance, financial crashes that have to do
link |
01:03:32.820
with feedback loops that we didn't anticipate.
link |
01:03:35.020
So I think that's an issue that we need to be thoughtful
link |
01:03:38.460
about in many places,
link |
01:03:41.940
artificial intelligence being one of them.
link |
01:03:44.300
And I think it's really important that people are thinking
link |
01:03:49.660
about ways in which we can have better interpretability
link |
01:03:54.380
of systems, better tests for, for instance,
link |
01:03:59.140
measuring the extent to which a machine learning system
link |
01:04:01.900
that was trained in one set of circumstances,
link |
01:04:04.860
how well does it actually work
link |
01:04:07.340
in a very different set of circumstances
link |
01:04:09.540
where you might say, for instance,
link |
01:04:12.340
well, I'm not gonna be able to test my automated vehicle
link |
01:04:14.740
in every possible city, village,
link |
01:04:18.980
weather condition and so on.
link |
01:04:20.780
But if you trained it on this set of conditions
link |
01:04:23.740
and then tested it on 50 or a hundred others
link |
01:04:27.340
that were quite different from the ones
link |
01:04:29.140
that you trained it on and it worked,
link |
01:04:31.980
then that gives you confidence that the next 50
link |
01:04:34.100
that you didn't test it on might also work.
link |
01:04:36.100
So effectively it's testing for generalizability.
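The testing strategy described here can be sketched as a loop over held-out condition groups: train under one condition, then score the model across many unseen conditions to see where performance holds and where it degrades. The conditions, data, and classifier below are all toy stand-ins invented for illustration.

```python
import numpy as np

# Illustrative sketch of robustness testing: train under one
# "condition", then score the model across many held-out conditions
# to see where performance holds up and where it degrades.
rng = np.random.default_rng(0)

def make_condition(shift, n=200):
    X = rng.normal(loc=shift, size=(n, 3))
    y = (X.sum(axis=1) > 3 * shift).astype(float)   # same task, shifted inputs
    return X, y

Xtr, ytr = make_condition(shift=0.0)                # training condition
mu0, mu1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)

def predict(X):                                     # nearest-centroid classifier
    return (np.linalg.norm(X - mu1, axis=1) <
            np.linalg.norm(X - mu0, axis=1)).astype(float)

scores = []
for shift in np.linspace(-1.0, 1.0, 21):            # 21 held-out conditions
    Xte, yte = make_condition(shift)
    scores.append(float((predict(Xte) == yte).mean()))
print(round(min(scores), 3), round(max(scores), 3))
```

A tight spread across many unseen conditions is what builds confidence that the next untested ones will also work; a wide spread, as this toy model shows away from its training condition, is the warning sign.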
link |
01:04:39.020
So I think there's ways that we should be
link |
01:04:41.300
constantly thinking about to validate the robustness
link |
01:04:45.900
of our systems.
link |
01:04:47.500
I think it's very different from the let's make sure
link |
01:04:50.980
robots don't take over the world.
link |
01:04:53.260
And then the other place where I think we have a threat,
link |
01:04:57.020
which is also important for us to think about
link |
01:04:59.420
is the extent to which technology can be abused.
link |
01:05:03.180
So like any really powerful technology,
link |
01:05:06.540
machine learning can be very much used badly
link |
01:05:10.900
as well as to good.
link |
01:05:12.700
And that goes back to many other technologies
link |
01:05:15.580
that we've come up with, when people invented
link |
01:05:19.140
projectile missiles and it turned into guns
link |
01:05:22.140
and people invented nuclear power
link |
01:05:24.660
and it turned into nuclear bombs.
link |
01:05:26.420
And I think honestly, I would say that to me,
link |
01:05:30.340
gene editing and CRISPR is at least as dangerous
link |
01:05:33.500
a technology, if used badly, as machine learning.
link |
01:05:39.780
You could create really nasty viruses and such
link |
01:05:43.860
using gene editing that you would be really careful about.
link |
01:05:51.900
So anyway, that's something that we need
link |
01:05:56.700
to be really thoughtful about whenever we have
link |
01:05:59.620
any really powerful new technology.
link |
01:06:02.500
Yeah, and in the case of machine learning
link |
01:06:04.140
is adversarial machine learning.
link |
01:06:06.820
So all the kinds of attacks like security almost threats
link |
01:06:09.140
and there's a social engineering
link |
01:06:10.540
with machine learning algorithms.
link |
01:06:12.100
And there's face recognition and big brother is watching you
link |
01:06:15.900
and there's the killer drones that can potentially go
link |
01:06:20.980
and carry out targeted execution of people in a different country.
link |
01:06:27.180
One can argue that bombs are not necessarily
link |
01:06:29.620
that much better, but people wanna kill someone,
link |
01:06:34.020
they'll find a way to do it.
link |
01:06:35.740
So in general, if you look at trends in the data,
link |
01:06:39.060
there's less wars, there's less violence,
link |
01:06:41.100
there's more human rights.
link |
01:06:42.940
So we've been doing overall quite good as a human species.
link |
01:06:48.340
Are you optimistic?
link |
01:06:49.180
Surprisingly sometimes.
link |
01:06:50.620
Are you optimistic?
link |
01:06:52.740
Maybe another way to ask is do you think most people
link |
01:06:55.540
are good and fundamentally we tend towards a better world,
link |
01:07:03.140
which is underlying the question,
link |
01:07:05.460
will machine learning with gene editing
link |
01:07:09.180
ultimately land us somewhere good?
link |
01:07:12.140
Are you optimistic?
link |
01:07:15.860
I think by and large, I'm optimistic.
link |
01:07:19.140
I think that most people mean well,
link |
01:07:24.140
that doesn't mean that most people are altruistic do-gooders,
link |
01:07:28.140
but I think most people mean well,
link |
01:07:31.020
but I think it's also really important for us as a society
link |
01:07:34.980
to create social norms where doing good
link |
01:07:40.820
and being perceived well by our peers
link |
01:07:47.140
are positively correlated.
link |
01:07:49.780
I mean, it's very easy to create dysfunctional norms
link |
01:07:54.060
in emotional societies.
link |
01:07:55.620
There's certainly multiple psychological experiments
link |
01:07:58.540
as well as sadly real world events
link |
01:08:02.420
where people have devolved to a world
link |
01:08:05.300
where being perceived well by your peers
link |
01:08:09.340
is correlated with really atrocious,
link |
01:08:14.100
often genocidal behaviors.
link |
01:08:17.820
So we really want to make sure
link |
01:08:19.500
that we maintain a set of social norms
link |
01:08:21.740
where people know that to be a successful member of society,
link |
01:08:25.660
you want to be doing good.
link |
01:08:27.500
And one of the things that I sometimes worry about
link |
01:08:31.420
is that some societies don't seem to necessarily
link |
01:08:35.420
be moving in the forward direction in that regard
link |
01:08:38.340
where it's not necessarily the case
link |
01:08:43.620
that being a good person
link |
01:08:45.100
is what makes you be perceived well by your peers.
link |
01:08:47.980
And I think that's a really important thing
link |
01:08:49.700
for us as a society to remember.
link |
01:08:51.300
It's really easy to degenerate back into a universe
link |
01:08:55.940
where it's okay to do really bad stuff
link |
01:09:00.540
and still have your peers think you're amazing.
link |
01:09:04.980
It's fun to ask a world class computer scientist
link |
01:09:08.180
and engineer a ridiculously philosophical question
link |
01:09:11.380
like what is the meaning of life?
link |
01:09:13.460
Let me ask, what gives your life meaning?
link |
01:09:17.500
Or what is the source of fulfillment, happiness,
link |
01:09:22.180
joy, purpose?
link |
01:09:26.540
When we were starting Coursera in the fall of 2011,
link |
01:09:32.980
that was right around the time that Steve Jobs passed away.
link |
01:09:37.740
And so the media was full of various famous quotes
link |
01:09:41.020
that he uttered and one of them that really stuck with me
link |
01:09:45.500
because it resonated with stuff that I'd been feeling
link |
01:09:48.780
for even years before that is that our goal in life
link |
01:09:52.380
should be to make a dent in the universe.
link |
01:09:55.100
So I think that to me, what gives my life meaning
link |
01:10:00.620
is that I would hope that when I am lying there
link |
01:10:05.900
on my deathbed and looking at what I'd done in my life
link |
01:10:09.660
that I can point to ways in which I have left the world
link |
01:10:15.860
a better place than it was when I entered it.
link |
01:10:20.460
This is something I tell my kids all the time
link |
01:10:23.620
because I also think that the burden of that
link |
01:10:27.260
is much greater for those of us who were born to privilege.
link |
01:10:31.420
And in some ways I was, I mean, I wasn't born super wealthy
link |
01:10:34.380
or anything like that, but I grew up in an educated family
link |
01:10:37.900
with parents who loved me and took care of me
link |
01:10:40.860
and I had a chance at a great education
link |
01:10:43.060
and I always had enough to eat.
link |
01:10:46.620
So I was in many ways born to privilege
link |
01:10:48.900
more than the vast majority of humanity.
link |
01:10:51.940
And my kids I think are even more so born to privilege
link |
01:10:55.940
than I was fortunate enough to be.
link |
01:10:57.940
And I think it's really important that especially
link |
01:11:01.020
for those of us who have that opportunity
link |
01:11:03.900
that we use our lives to make the world a better place.
link |
01:11:07.420
I don't think there's a better way to end it.
link |
01:11:09.620
Daphne, it was an honor to talk to you.
link |
01:11:11.620
Thank you so much for talking today.
link |
01:11:12.620
Thank you.
link |
01:11:14.420
Thanks for listening to this conversation
link |
01:11:15.900
with Daphne Koller and thank you
link |
01:11:17.780
to our presenting sponsor, Cash App.
link |
01:11:19.900
Please consider supporting the podcast
link |
01:11:21.660
by downloading Cash App and using code LEXPodcast.
link |
01:11:26.180
If you enjoy this podcast, subscribe on YouTube,
link |
01:11:28.620
review it with five stars on Apple Podcast,
link |
01:11:31.060
support it on Patreon, or simply connect with me
link |
01:11:33.340
on Twitter at Lex Fridman.
link |
01:11:36.260
And now let me leave you with some words from Hippocrates,
link |
01:11:39.820
a physician from ancient Greece
link |
01:11:41.900
who's considered to be the father of medicine.
link |
01:11:45.340
Wherever the art of medicine is loved,
link |
01:11:48.340
there's also a love of humanity.
link |
01:11:50.780
Thank you for listening and hope to see you next time.