
Andrew Ng: Deep Learning, Education, and Real-World AI | Lex Fridman Podcast #73


The following is a conversation with Andrew Ng, one of the most impactful educators, researchers, innovators, and leaders in artificial intelligence, and in the technology space in general. He cofounded Coursera and Google Brain, launched Deep Learning AI, Landing AI, and the AI Fund, and was the chief scientist at Baidu. As a Stanford professor, and with Coursera and Deep Learning AI, he has helped educate and inspire millions of students, including me.

This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N. As usual, I'll do one or two minutes of ads now and never any ads in the middle that can break the flow of the conversation. I hope that works for you and doesn't hurt the listening experience.
This show is presented by Cash App, the number one finance app in the App Store. When you get it, use code LEXPODCAST. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Brokerage services are provided by Cash App Investing, a subsidiary of Square and member SIPC.

Since Cash App allows you to buy Bitcoin, let me mention that cryptocurrency, in the context of the history of money, is fascinating. I recommend The Ascent of Money as a great book on this history. Debits and credits on ledgers started over 30,000 years ago. The US dollar was created over 200 years ago, and Bitcoin, the first decentralized cryptocurrency, was released just over 10 years ago. So given that history, cryptocurrency is still very much in its early days of development, but it's still aiming to, and just might, redefine the nature of money.

So again, if you get Cash App from the App Store or Google Play and use the code LEXPODCAST, you'll get $10, and Cash App will also donate $10 to FIRST, one of my favorite organizations, which is helping to advance robotics and STEM education for young people around the world.
And now, here's my conversation with Andrew Ng.

The courses you taught on machine learning at Stanford, and later on Coursera, which you cofounded, have educated and inspired millions of people. So let me ask you: what people or ideas inspired you to get into computer science and machine learning when you were young? When did you first fall in love with the field, to put it another way?
Growing up in Hong Kong and Singapore, I started learning to code when I was five or six years old. At that time, I was learning the BASIC programming language, and I would take these books that tell you, type this program into your computer. So I'd type the programs into my computer, and as a result of all that typing, I would get to play these very simple shoot-'em-up games that I had implemented on my little computer. So I thought it was fascinating as a young kid that I could write this code, even though I was really just copying code from a book into my computer, to then play these cool little video games.

Another moment for me was when I was a teenager, and my father, who's a doctor, was reading about expert systems and about neural networks. He got me to read some of these books, and I thought it was really cool that you could write a computer program that started to exhibit intelligence.

Then I remember doing an internship while I was in high school, this was in Singapore, where I worked as an office assistant and did a lot of photocopying. The highlight of my job was when I got to use the shredder. And I remember teenage me thinking: boy, this is a lot of photocopying. If only we could write software, build a robot, something to automate this, maybe I could do something else.

So I think a lot of my work since then has centered on the theme of automation. It's even in the way I think about machine learning today: we're very good at writing learning algorithms that can automate things that people can do. Or even launching the first MOOCs, Massive Open Online Courses, which later led to Coursera: I was trying to automate what could be automatable in how I was teaching on campus.
The process of education, trying to automate parts of that to have more impact from a single teacher, a single educator.

Yeah. Teaching at Stanford, I was teaching machine learning to about 400 students a year at the time, and I found myself filming the exact same video every year, telling the same jokes in the same room. And I thought, why am I doing this? Why don't we just take last year's video? Then I can spend my time building a deeper relationship with students. So that process of thinking through how to do that led to the first MOOCs that we launched.
And then you have more time to write new jokes. Are there favorite memories from your early days at Stanford, teaching thousands of people in person, and then millions of people online?
You know, teaching online, what not many people know is that a lot of those videos were shot between the hours of 10 p.m. and 3 a.m. At the time we were launching the first MOOCs at Stanford, we had already announced the course, and about 100,000 people had signed up. We had just started to write the code and had not yet actually filmed the videos. So it was a lot of pressure: 100,000 people waiting for us to produce the content.

So many Fridays and Saturdays, I would go out, have dinner with my friends, and then I would think, OK, do you want to go home now, or do you want to go to the office to film videos? And the thought of being able to help 100,000 people potentially learn machine learning, that made me think, OK, I want to go to my office, go to my tiny little recording studio. I would adjust my Logitech webcam, adjust my Wacom tablet, make sure my lapel mic was on, and then I would start recording, often until 2 or 3 a.m. I don't think it shows that it was recorded that late at night, but it was really inspiring, the thought that we could create content to help so many people learn about machine learning.
How did that feel? The fact that you're probably somewhat alone, maybe with a couple of friends, recording with a Logitech webcam, then going home alone at 1 or 2 a.m., knowing that it's going to reach thousands of people, eventually millions of people. What's that feeling like? I mean, is there a feeling of just satisfaction in pushing through?
I think it's humbling, and I wasn't thinking about what I was feeling. One thing that I'm proud to say we got right from the early days was that I told my whole team back then that the number one priority is to do what's best for learners, do what's best for students. So when I went to the recording studio, the only thing on my mind was: what can I say, how can I design my slides, what do I need to draw, to make these concepts as clear as possible for learners?

I've seen that for some instructors it's tempting to say, hey, let's talk about my work; maybe if I teach you about my research, someone will cite my papers a couple more times. And I think one of the things we got right, launching the first few MOOCs and later building Coursera, was putting in place that bedrock principle of: let's just do what's best for learners, and forget about everything else. And I think that guiding principle turned out to be really important to the rise of the MOOC movement.
And the kind of learner you imagined in your mind was as broad as possible, as global as possible. So you really tried to reach as many people interested in machine learning and AI as possible.
I really wanted to help anyone who had an interest in machine learning to break into the field. People have actually asked me, hey, why are you spending so much time explaining gradient descent? And my answer was: if I look at what I think the learner needs and would benefit from, I felt that a good understanding of the foundations, coming back to the basics, would put them in better stead to build a long-term career. So I tried to consistently make decisions on that principle.
So one of the things you actually revealed to the narrow AI community at the time, and to the world, is that the number of people who are actually interested in AI is much larger than we imagined. By your teaching the class, and by how popular it became, it showed that, wow, this isn't just a small community of people who go to NeurIPS; it's much bigger. It's developers, it's people from all over the world. I mean, I'm Russian, so everybody in Russia is really interested. There's a huge number of programmers who are interested in machine learning: India, China, South America, everywhere. There are just millions of people who are interested in machine learning. So how big do you sense the number of people who are interested is, from your perspective?
I think the number has grown over time. It's one of those things that maybe feels like it came out of nowhere, but building it took years. It's one of those overnight successes that took years to get there. My first foray into this type of online education was when we were filming my Stanford class and sticking the videos on YouTube, among other things. We uploaded the whole lectures, but it was basically the one-hour, fifteen-minute videos that we put on YouTube. And then there were four or five other versions of websites that I had built, most of which you would never have heard of because they reached small audiences, but they allowed me to iterate, allowed my team and me to iterate, to learn which ideas work and which don't.
For example, one of the features I was really excited about and really proud of was building a website where multiple people could be logged in at the same time. Today, if you go to a website and you're logged in and then I want to log in, you need to log out, because it's the same browser, the same computer. But I thought, well, what if two people, say you and me, were watching a video together in front of a computer? What if the website could have you type your name and password, have me type my name and password, and then the computer knows both of us are watching together, and it gives both of us credit for anything we do as a group?

I implemented this feature, rolled it out in a high school in San Francisco. We had about twenty-something users.

Who's the teacher there?

Sacred Heart Cathedral Prep; the teacher is great. And guess what? Zero people used this feature. It turns out people studying online want to watch the videos by themselves, so they can play back and pause at their own speed, rather than in groups. So that was one example of a tiny lesson learned, out of many, that allowed us to hone in on the right set of features.
It sounds like a brilliant feature. So I guess the lesson to take from that is that something can look amazing on paper, and then nobody uses it; it doesn't actually have the impact you think it might have. And so, yeah, I saw that you really went through a lot of different features and a lot of ideas to arrive at Coursera, the final, powerful thing that showed the world that MOOCs can educate millions.
And I think the same is true of the whole machine learning movement; it didn't come out of nowhere. Instead, what happened was that as more people learned about machine learning, they would tell their friends, and their friends would see how it's applicable to their work, and the community kept on growing. And I think we're still growing. I don't know what percentage of all developers will be AI developers in the future. I could easily see it being north of 50 percent, right? Because the set of AI developers broadly construed, not just the people doing the machine learning modeling, but the people building infrastructure, data pipelines, all the software surrounding the core machine learning model, is maybe even bigger.

I feel like today almost every software engineer has some understanding of the cloud. Not all; maybe a microcontroller developer doesn't need to deal with the cloud. But the vast majority of software engineers today have some appreciation of the cloud. I think in the future, maybe we'll approach nearly 100 percent of all developers being, in some way, an AI developer, or at least having an appreciation of machine learning.
And my hope is that there's this kind of effect where people who are not really interested in being programmers, or in software engineering, like biologists, chemists, and physicists, even mechanical engineers, all these disciplines that are now more and more sitting on large data sets, they didn't think they were interested in programming until they have this data set and realize there's this set of machine learning tools that lets them use it. So they learn to program, and they become new programmers. So not only, as you mentioned, does a larger percentage of developers become machine learning people; it seems like the range of kinds of people who are becoming developers is also growing significantly.
Yeah, I think once upon a time, only a small part of humanity was literate, could read and write. And maybe you would have thought, maybe not everyone needs to learn to read and write. You could just go listen to a few monks read to you, and maybe that was enough. Or maybe we just needed a handful of authors to write the bestsellers, and no one else needed to write. But what we found was that by giving as many people as possible, in some countries almost everyone, basic literacy, it dramatically enhanced human-to-human communication. We can now write for an audience of one, such as when I send you an email or you send me an email.

I think in computing, we're still in that phase where so few people know how to code that the coders mostly have to code for relatively large audiences. But if everyone, or most people, became developers at some level, similar to how most people in developed economies are somewhat literate, I would love to see the owners of a mom-and-pop store be able to write a little bit of code to customize the TV display for their special this week. And I think it would enhance human-to-computer communication, which is becoming more and more important today as well.
So you think it's possible that machine learning becomes kind of similar to literacy, where, like you said, the owners of a mom-and-pop shop, basically everybody in all walks of life, would have some degree of programming capability?
I could see society getting there. There's one other interesting thing. If I go talk to the mom-and-pop store, if I talk to a lot of people in their daily professions, I previously didn't have a good story for why they should learn to code. We could give them some reasons. But what I found with the rise of machine learning and data science is that the number of people with a concrete use for data science in their daily lives, in their jobs, may be even larger than the number of people with a concrete use for software engineering. For example, if you run a small mom-and-pop store, I think if you can analyze the data about your sales and your customers, there's actually real value there, maybe even more than in traditional software engineering. So I find that for a lot of my friends in various professions, be it recruiters or accountants or people who work in factories, which I deal with more and more these days, if they were data scientists at some level, they could immediately use that in their work. So I think that data science and machine learning may be an even easier entree into the developer world for a lot of people than software engineering.
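(Editor's aside: the kind of "analyze your sales data" Ng describes can be very small. A hypothetical sketch, with made-up sales records, that totals revenue per product to find the item worth featuring on the store's display this week:)

```python
# Hypothetical mom-and-pop sales records: (product, quantity, unit_price).
sales = [
    ("coffee", 30, 2.50),
    ("bagel", 12, 1.75),
    ("coffee", 18, 2.50),
    ("muffin", 9, 2.25),
]

# Total revenue per product.
revenue = {}
for product, qty, price in sales:
    revenue[product] = revenue.get(product, 0.0) + qty * price

# Bestseller by revenue.
best = max(revenue, key=revenue.get)
print(best, round(revenue[best], 2))  # coffee 120.0
```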
That's interesting, and I agree with that. Beautifully put. But we live in a world where most courses and talks have slides, PowerPoint or Keynote, and yet you famously often still use a marker and a whiteboard. The simplicity of that is compelling, and for me at least, fun to watch. So let me ask: why do you like using a marker and whiteboard, even on the biggest of stages?
I think it depends on the concepts you want to explain. For mathematical concepts, it's nice to build up the equation one piece at a time, and the whiteboard marker, or the pen and stylus, is a very easy way to build up an equation, to build up a complex concept one piece at a time, while you're talking about it. And sometimes that enhances understandability. The downside of writing is that it's slow, so if you want a long sentence, it's very hard to write it out. So I think there are pros and cons, and sometimes I use slides, and sometimes I use a whiteboard or a stylus.
The slowness of a whiteboard is also its upside, because it forces you to reduce everything to the basics. Some of your talks involve the whiteboard; you go very slowly, and you really focus on the most simple principles. And that's beautiful. It enforces a kind of minimalism of ideas that, at least for me, is surprisingly great for education. A great talk, I think, is not one that has a lot of content. A great talk is one that just clearly says a few simple ideas, and I think the whiteboard somehow enforces that.
Pieter Abbeel, who's now one of the top roboticists and reinforcement learning experts in the world, was your first PhD student. I bring him up because I imagine this must have been an interesting time in your life. Do you have any favorite memories of working with Pieter, since he was your first student, in those uncertain times, especially before deep learning really blew up?
Yeah, I was really fortunate to have had Pieter Abbeel as my first PhD student, and I think even my long-term professional success builds on the early foundations, the early work, that Pieter was so critical to. So I was really grateful to him for working with me.

What not a lot of people know is just how hard research was, and still is. Pieter's PhD thesis was using reinforcement learning to fly helicopters. Even today, the website heli.stanford.edu is still up; you can watch videos of us using reinforcement learning to make a helicopter fly upside down, fly loops and rolls. So it's cool.

It's one of the most incredible robotics videos ever, so people should watch it.

Oh yeah, thank you. It's inspiring.

That's from like 2008, or seven, or six, that range.

Yeah, something like that, so it's over 10 years old. It was really inspiring to a lot of people, yeah.
What not many people see is how hard it was. Pieter and Adam Coates and Morgan Quigley and I were working on various versions of the helicopter, and a lot of things did not work. For example, it turns out one of the hardest problems we had was this: when the helicopter is flying around upside down, doing stunts, how do you figure out its position? How do you localize the helicopter?

We tried all sorts of things. Having one GPS unit doesn't work, because when you're flying upside down, the GPS unit faces down and can't see the satellites. So we experimented with having two GPS units, one facing up, one facing down, for when you flip over. That didn't work, because the downward-facing one couldn't synchronize if you were flipping quickly. Morgan Quigley explored this crazy, complicated configuration of specialized hardware to interpret GPS signals, involving FPGAs, completely insane. We spent about a year working on that; it didn't work.
So I remember Pieter, great guy, him and me sitting down in my office, looking at some of the latest things we had tried that didn't work, and saying, darn it, what now? Because we had tried so many things, and they just didn't work. In the end, what we did, and Adam Coates was crucial to this, was put cameras on the ground and use those cameras to localize the helicopter. And that solved the localization problem, so that we could then focus on the reinforcement learning and inverse reinforcement learning techniques to actually make the helicopter fly.
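(Editor's aside: the ground-camera idea reduces to classic triangulation: each camera sees the helicopter along a known ray, and the 3D position is the point closest to both rays. A minimal sketch with made-up camera positions, not the actual system:)

```python
import numpy as np

def triangulate(o1, d1, o2, d2):
    """Closest point to two rays p = o + t*d.

    Solves the 2x2 normal equations for the parameters t1, t2 that
    minimize |(o1 + t1*d1) - (o2 + t2*d2)|, then returns the midpoint
    of the two closest points.
    """
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([d1 @ (o2 - o1), d2 @ (o2 - o1)])
    t1, t2 = np.linalg.solve(A, b)
    return ((o1 + t1 * d1) + (o2 + t2 * d2)) / 2

# Made-up setup: helicopter at (2, 3, 10), two cameras on the ground.
target = np.array([2.0, 3.0, 10.0])
o1, o2 = np.array([0.0, 0.0, 0.0]), np.array([10.0, 0.0, 0.0])
d1, d2 = target - o1, target - o2   # viewing rays from each camera

print(triangulate(o1, d1, o2, d2))  # recovers approximately (2, 3, 10)
```

In the real system each camera's ray would come from calibrated intrinsics and the helicopter's pixel position; with noisy rays, the midpoint of the two closest points is the usual least-squares estimate.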
And I'm reminded that when I was doing this work at Stanford around that time, there were a lot of theoretical reinforcement learning papers, but not a lot of practical applications. So the autonomous helicopter work, flying helicopters, was one of the few practical applications of reinforcement learning at the time, which caused it to become pretty well known. I feel like we might have almost come full circle today. There's so much buzz, so much hype, so much excitement about reinforcement learning, but again, we're hunting for more applications of all of these great ideas that the community has come up with.
What was the drive, in the face of the fact that most people were doing theoretical work? What motivated you, amid the uncertainty and the challenges, to do the applied work, to get the helicopter, the actual system, to work? In the face of the fear, the uncertainty, and the setbacks that you mentioned with localization.
I like stuff that works.

In the physical world.

So, you know, it's back to the shredder. I like theory, but when I work on theory myself, and this is personal taste, I'm not saying anyone else should do what I do. But when I work on theory, I personally enjoy it more if I feel that the work I do will influence people, have positive impact, or help someone.

I remember many years ago, I was speaking with a mathematics professor, and I kind of just asked, hey, why do you do what you do? And this mathematician, not from Stanford, a different university, said: I do what I do because it helps me to discover truth and beauty in the universe. He had stars in his eyes when he said that. And I thought, that's great. I don't want to do that. I think it's great that someone does that; I fully support the people who do it, and I have a lot of respect for them. But I am more motivated when I can see a line to how the work that my teams and I are doing helps people.

The world needs all sorts of people; I'm just one type. I don't think everyone should do things the same way I do. But when I delve into either theory or practice, if I personally have conviction that here's a pathway to help people, I find it more satisfying to have that conviction.
That's your path.

You were a proponent of deep learning before it gained widespread acceptance. What did you see in this field that gave you confidence? What was your thinking process like in that first decade of the, I don't know what it's called, the 2000s, the aughts?
link |
00:23:33.720
Yeah, I can tell you the thing we got wrong
link |
00:23:35.480
and the thing we got right.
link |
00:23:36.760
The thing we really got wrong was the importance of,
link |
00:23:40.520
the early importance of unsupervised learning.
link |
00:23:42.840
So early days of Google Brain,
link |
00:23:46.040
we put a lot of effort into unsupervised learning
link |
00:23:48.040
rather than supervised learning.
link |
00:23:49.560
And there was this argument,
link |
00:23:50.840
I think it was around 2005 after NeurIPS,
link |
00:23:55.400
at that time called NIPS, now called NeurIPS, had ended.
link |
00:23:58.280
And Jeff Hinton and I were sitting in the cafeteria
link |
00:24:01.320
outside the conference.
link |
00:24:02.680
We had lunch, we were just chatting.
link |
00:24:04.200
And Jeff pulled up this napkin.
link |
00:24:05.480
He started sketching this argument on a napkin.
link |
00:24:07.960
It was very compelling, and I'll repeat it.
link |
00:24:10.120
The human brain has about a hundred trillion synaptic connections.
link |
00:24:12.520
So there's 10 to the 14 synaptic connections.
link |
00:24:16.200
You will live for about 10 to the nine seconds.
link |
00:24:19.480
That's 30 years.
link |
00:24:20.360
You actually live for two times 10 to the nine,
link |
00:24:22.760
maybe three times 10 to the nine seconds.
link |
00:24:24.200
So just let's say 10 to the nine.
link |
00:24:26.440
So if each synaptic connection,
link |
00:24:29.000
each weight in your brain's neural network
link |
00:24:31.000
has just a one bit parameter,
link |
00:24:33.240
that's 10 to the 14 bits you need to learn
link |
00:24:36.440
in up to 10 to the nine seconds.
link |
00:24:38.920
10 to the nine seconds of your life.
link |
00:24:41.800
So via this simple argument,
link |
00:24:43.560
which has a lot of problems, it's very simplified.
link |
00:24:45.960
That's 10 to the five bits per second
link |
00:24:47.480
you need to learn in your life.
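For readers who want to check the napkin math, the numbers recounted above work out as follows (a sketch of the back-of-the-envelope argument, not a claim about neuroscience):

```python
# Hinton's napkin argument as recounted above: if every synapse
# stores roughly one bit, how fast would you have to learn?
synapses = 1e14          # ~10^14 synaptic connections in the brain
lifetime_seconds = 1e9   # ~10^9 seconds, roughly 30 years

bits_per_second = synapses / lifetime_seconds
print(bits_per_second)   # 100000.0, i.e. ~10^5 bits per second
```

No parent supplies anything close to 10^5 labeled bits per second, which is the crux of the argument for unsupervised learning.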
link |
00:24:49.720
And I have a one year old daughter.
link |
00:24:52.440
I am not pointing out 10 to the five bits per second
link |
00:24:56.440
of labels to her.
link |
00:24:59.640
And I think I'm a very loving parent,
link |
00:25:01.960
but I'm just not gonna do that.
link |
00:25:04.840
So from this very crude, definitely problematic argument,
link |
00:25:08.680
there's just no way that most of what we know
link |
00:25:11.240
is through supervised learning.
link |
00:25:13.320
But where you get so many bits of information
link |
00:25:15.160
is from sucking in images, audio,
link |
00:25:16.840
those experiences in the world.
link |
00:25:19.480
And so that argument,
link |
00:25:21.320
and there are a lot of known flaws with that argument
link |
00:25:23.080
we won't go into,
link |
00:25:24.680
really convinced me that there's a lot of power
link |
00:25:26.840
to unsupervised learning.
link |
00:25:29.400
So that was the part that we actually maybe got wrong.
link |
00:25:32.360
I still think unsupervised learning is really important,
link |
00:25:34.680
but in the early days, 10, 15 years ago,
link |
00:25:38.760
a lot of us thought that was the path forward.
link |
00:25:41.080
Oh, so you're saying that that perhaps
link |
00:25:43.400
was the wrong intuition for the time.
link |
00:25:45.560
For the time, that was the part we got wrong.
link |
00:25:48.440
The part we got right was the importance of scale.
link |
00:25:51.560
So Adam Coates, another wonderful person,
link |
00:25:55.960
fortunate to have worked with him,
link |
00:25:57.960
he was in my group at Stanford at the time
link |
00:25:59.880
and Adam had run these experiments at Stanford
link |
00:26:02.280
showing that the bigger we train a learning algorithm,
link |
00:26:05.960
the better its performance.
link |
00:26:07.800
And it was based on that.
link |
00:26:09.960
There was a graph that Adam generated
link |
00:26:12.760
with a line going up and to the right.
link |
00:26:15.640
So the bigger you make this thing,
link |
00:26:17.320
the better its performance, with accuracy on the vertical axis.
link |
00:26:20.200
So it's really based on that chart that Adam generated
link |
00:26:22.600
that gave me the conviction
link |
00:26:23.800
that you could scale these models way bigger
link |
00:26:26.120
than what we could on a few CPUs,
link |
00:26:27.720
which is what we had at Stanford, and
link |
00:26:29.240
that we could get even better results.
link |
00:26:31.400
And it was really based on that one figure
link |
00:26:33.240
that Adam generated
link |
00:26:34.920
that gave me the conviction to go with Sebastian Thrun
link |
00:26:38.600
to pitch starting a project at Google,
link |
00:26:42.600
which became the Google Brain project.
link |
00:26:43.960
Google Brain, which you cofounded.
link |
00:26:45.640
And there the intuition was scale
link |
00:26:48.920
will bring performance for the system.
link |
00:26:52.120
So we should chase a larger and larger scale.
link |
00:26:55.320
And I think people don't realize how groundbreaking that was.
link |
00:27:00.040
It's simple, but it's a groundbreaking idea
link |
00:27:02.200
that bigger data sets will result in better performance.
link |
00:27:05.960
It was controversial at the time.
link |
00:27:08.600
Some of my well meaning friends,
link |
00:27:10.120
senior people in the machine learning community,
link |
00:27:11.480
I won't name, but some of whom we know,
link |
00:27:16.040
my well meaning friends came
link |
00:27:17.800
and were trying to give me friendly advice,
link |
00:27:19.160
like, hey, Andrew, why are you doing this?
link |
00:27:20.840
This is crazy.
link |
00:27:21.720
It's in the neural network architecture.
link |
00:27:23.160
Look at these architectures you're building.
link |
00:27:24.760
You just want to go for scale?
link |
00:27:25.960
Like this is a bad career move.
link |
00:27:27.320
So my well meaning friends,
link |
00:27:29.000
some of them were trying to talk me out of it.
link |
00:27:33.960
But I find that if you want to make a breakthrough,
link |
00:27:36.760
you sometimes have to have conviction
link |
00:27:38.920
and do something before it's popular,
link |
00:27:40.920
since that lets you have a bigger impact.
link |
00:27:43.000
Let me ask you just a small tangent on that topic.
link |
00:27:45.960
I find myself arguing with people saying that greater scale,
link |
00:27:51.320
especially in the context of active learning,
link |
00:27:53.400
so very carefully selecting the data set,
link |
00:27:56.840
but growing the scale of the data set
link |
00:27:59.160
is going to lead to even further breakthroughs
link |
00:28:01.560
in deep learning.
link |
00:28:02.680
And there's currently pushback against that idea,
link |
00:28:05.800
that larger data sets are no longer the answer,
link |
00:28:09.000
so you want to increase the efficiency of learning.
link |
00:28:11.800
You want to make better learning mechanisms.
link |
00:28:13.960
And I personally believe that bigger data sets will still,
link |
00:28:17.640
with the same learning methods we have now,
link |
00:28:19.880
will result in better performance.
link |
00:28:21.720
What's your intuition at this time
link |
00:28:23.400
on these two sides?
link |
00:28:27.480
Do we need to come up with better architectures for learning
link |
00:28:31.240
or can we just get bigger, better data sets
link |
00:28:35.080
that will improve performance?
link |
00:28:37.160
I think both are important and it's also problem dependent.
link |
00:28:40.360
So for a few data sets,
link |
00:28:41.800
we may be approaching a Bayes error rate
link |
00:28:45.960
or approaching or surpassing human level performance
link |
00:28:48.600
and then there's that theoretical ceiling
link |
00:28:50.680
that we will never surpass,
link |
00:28:51.880
the Bayes error rate.
link |
00:28:54.520
But then I think there are plenty of problems
link |
00:28:56.120
where we're still quite far
link |
00:28:57.960
from either human level performance
link |
00:28:59.640
or from Bayes error rate
link |
00:29:00.840
and bigger data sets with neural networks
link |
00:29:05.240
without further algorithmic innovation
link |
00:29:07.000
will be sufficient to take us further.
link |
00:29:10.280
But on the flip side,
link |
00:29:11.240
if we look at the recent breakthroughs
link |
00:29:12.760
using transformer networks or language models,
link |
00:29:15.480
it was a combination of novel architecture
link |
00:29:18.200
but also scale had a lot to do with it.
link |
00:29:20.440
If we look at what happened with GPT-2 and BERT,
link |
00:29:22.920
I think scale was a large part of the story.
link |
00:29:26.200
Yeah, what's not often talked about
link |
00:29:28.200
is the scale of the data set it was trained on
link |
00:29:30.920
and the quality of the data set
link |
00:29:32.360
because there's some,
link |
00:29:35.000
so it was like Reddit threads that had
link |
00:29:38.280
been upvoted highly.
link |
00:29:39.880
So there's already some weak supervision
link |
00:29:42.920
on a very large data set
link |
00:29:44.680
that people don't often talk about, right?
link |
00:29:47.160
I find that today we have maturing processes
link |
00:29:50.360
for managing code,
link |
00:29:52.360
things like Git, right?
link |
00:29:53.400
Version control.
link |
00:29:54.520
It took us a long time to evolve the good processes.
link |
00:29:58.360
I remember when my friends and I
link |
00:29:59.560
were emailing each other C++ files in email,
link |
00:30:02.200
but then we had,
link |
00:30:03.080
was it CVS, Subversion, Git?
link |
00:30:05.080
Maybe something else in the future.
link |
00:30:07.400
We're still immature in terms of tools for managing data
link |
00:30:10.600
and thinking about clean data
link |
00:30:11.960
and how to sort out very messy data problems.
link |
00:30:15.320
I think there's a lot of innovation there
link |
00:30:17.160
to be had still.
link |
00:30:17.960
I love the idea that you were versioning through email.
link |
00:30:21.960
I'll give you one example.
link |
00:30:23.880
When we work with manufacturing companies,
link |
00:30:29.160
it's not at all uncommon
link |
00:30:31.160
for there to be multiple labels
link |
00:30:34.200
that disagree with each other, right?
link |
00:30:36.280
And so we would do the work in visual inspection.
link |
00:30:40.440
We will take, say, a plastic part
link |
00:30:42.920
and show it to one inspector
link |
00:30:44.680
and the inspector, sometimes very opinionated,
link |
00:30:47.160
they'll go, clearly, that's a defect.
link |
00:30:48.520
This scratch, unacceptable.
link |
00:30:49.640
Gotta reject this part.
link |
00:30:51.160
Take the same part to different inspector,
link |
00:30:53.320
different, very opinionated.
link |
00:30:54.920
Clearly, the scratch is small.
link |
00:30:56.200
It's fine.
link |
00:30:56.760
Don't throw it away.
link |
00:30:57.480
You're gonna make us, you know.
link |
00:30:59.240
And then sometimes you take the same plastic part,
link |
00:31:01.800
show it to the same inspector
link |
00:31:03.400
in the afternoon versus in the morning,
link |
00:31:05.400
and very opinionated, in the morning,
link |
00:31:07.480
they say, clearly, it's okay.
link |
00:31:08.680
In the afternoon, equally confident.
link |
00:31:10.600
Clearly, this is a defect.
link |
00:31:12.280
And so what is an AI team supposed to do
link |
00:31:14.760
if sometimes even one person doesn't agree
link |
00:31:17.400
with himself or herself in the span of a day?
link |
00:31:20.280
So I think these are the types of very practical,
link |
00:31:23.640
very messy data problems that my teams wrestle with.
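One way teams quantify the kind of inspector disagreement described here is an inter-annotator agreement score. A minimal sketch with hypothetical labels (1 = defect, 0 = OK), computing raw agreement and Cohen's kappa:

```python
# Hypothetical labels from two inspectors on the same ten parts
# (1 = defect, 0 = OK); these numbers are made up for illustration.
inspector_a = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
inspector_b = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Raw agreement: fraction of parts where both inspectors agree.
agree = sum(a == b for a, b in zip(inspector_a, inspector_b)) / len(inspector_a)

# Cohen's kappa corrects for the agreement expected by chance
# given each inspector's rate of calling "defect".
p_a1 = sum(inspector_a) / len(inspector_a)
p_b1 = sum(inspector_b) / len(inspector_b)
p_chance = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
kappa = (agree - p_chance) / (1 - p_chance)
print(agree, round(kappa, 2))  # 0.7 0.4
```

A low kappa is a signal to fix the labeling instructions before fixing the model.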
link |
00:31:30.200
In the case of large consumer internet companies
link |
00:31:32.840
where you have a billion users,
link |
00:31:34.200
you have a lot of data.
link |
00:31:35.480
You don't worry about it.
link |
00:31:36.360
Just take the average.
link |
00:31:37.160
It kind of works.
link |
00:31:38.280
But in a case of other industry settings,
link |
00:31:40.760
we don't have big data.
link |
00:31:42.360
It's just small data, very small data sets,
link |
00:31:44.520
maybe around 100 defective parts
link |
00:31:47.720
or 100 examples of a defect.
link |
00:31:49.720
If you have only 100 examples,
link |
00:31:51.320
these little labeling errors,
link |
00:31:53.240
if 10 of your 100 labels are wrong,
link |
00:31:55.800
that's 10% of your data set, and it has a big impact.
link |
00:31:58.520
So how do you clean this up?
link |
00:31:59.640
What are you supposed to do?
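One common way to clean up such labeling errors on a tiny data set, sketched here with hypothetical votes, is to have each part labeled several times and take a majority vote:

```python
from collections import Counter

# Hypothetical data: each part was independently labeled by
# three inspectors; majority vote smooths out individual noise.
labels_per_part = [
    ["defect", "defect", "ok"],   # two of three say defect
    ["ok", "ok", "ok"],
    ["ok", "defect", "ok"],
]

cleaned = [Counter(votes).most_common(1)[0][0] for votes in labels_per_part]
print(cleaned)  # ['defect', 'ok', 'ok']
```

This trades more labeling effort per example for fewer wrong labels, which matters most exactly when you only have around 100 examples.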
link |
00:32:01.160
This is an example of the types of things
link |
00:32:03.400
that my teams, this is a Landing AI example,
link |
00:32:06.680
are wrestling with to deal with small data,
link |
00:32:09.000
which comes up all the time
link |
00:32:10.040
once you're outside consumer internet.
link |
00:32:12.120
Yeah, that's fascinating.
link |
00:32:13.000
So then you invest more effort and time
link |
00:32:15.240
in thinking about the actual labeling process.
link |
00:32:18.040
What are the labels?
link |
00:32:19.560
How are disagreements resolved
link |
00:32:22.440
and all those kinds of like pragmatic real world problems.
link |
00:32:25.640
That's a fascinating space.
link |
00:32:27.240
Yeah, I find that actually when I'm teaching at Stanford,
link |
00:32:29.560
I increasingly encourage students at Stanford
link |
00:32:32.680
to try to find their own project
link |
00:32:37.080
for the end of term project,
link |
00:32:38.280
rather than just downloading someone else's
link |
00:32:40.360
nicely clean data set.
link |
00:32:41.880
It's actually much harder if you need to go
link |
00:32:43.320
and define your own problem and find your own data set,
link |
00:32:45.480
rather than you go to one of the several good websites,
link |
00:32:48.680
very good websites with clean scoped data sets
link |
00:32:52.760
that you could just work on.
link |
00:32:55.240
You're now running three efforts,
link |
00:32:56.920
the AI Fund, Landing AI, and deeplearning.ai.
link |
00:33:02.280
As you've said, the AI Fund is involved
link |
00:33:04.520
in creating new companies from scratch.
link |
00:33:06.600
Landing AI is involved in helping
link |
00:33:08.520
already established companies do AI
link |
00:33:10.440
and deeplearning.ai is for education of everyone else
link |
00:33:14.600
or of individuals interested in getting into the field
link |
00:33:18.040
and excelling in it.
link |
00:33:19.320
So let's perhaps talk about each of these areas.
link |
00:33:22.280
First, deeplearning.ai.
link |
00:33:25.560
How, the basic question,
link |
00:33:27.640
how does a person interested in deep learning
link |
00:33:30.040
get started in the field?
link |
00:33:32.280
Deeplearning.ai is working to create courses
link |
00:33:35.640
to help people break into AI.
link |
00:33:37.480
So my machine learning course that I taught through Stanford
link |
00:33:42.120
is one of the most popular courses on Coursera.
link |
00:33:45.400
To this day, it's probably one of the courses,
link |
00:33:48.440
sort of, if I asked somebody,
link |
00:33:49.720
how did you get into machine learning
link |
00:33:52.200
or how did you fall in love with machine learning
link |
00:33:54.040
or would get you interested,
link |
00:33:55.800
it always goes back to Andrew Ng at some point.
link |
00:33:58.920
I see, yeah, I'm sure.
link |
00:34:00.040
You've influenced, the amount of people
link |
00:34:01.880
you've influenced is ridiculous.
link |
00:34:03.160
So for that, I'm sure I speak for a lot of people
link |
00:34:05.720
say big thank you.
link |
00:34:07.080
No, yeah, thank you.
link |
00:34:09.080
I was once reading a news article,
link |
00:34:13.320
I think it was tech review
link |
00:34:15.080
and I'm gonna mess up the statistic,
link |
00:34:17.480
but I remember reading an article that said
link |
00:34:20.120
something like one third of all programmers are self taught.
link |
00:34:23.640
I may have the number wrong, maybe one third,
link |
00:34:24.760
maybe it was two thirds,
link |
00:34:25.640
but when I read that article,
link |
00:34:26.600
I thought this doesn't make sense.
link |
00:34:28.120
Everyone is self taught.
link |
00:34:29.400
So, cause you teach yourself.
link |
00:34:31.160
I don't teach people.
link |
00:34:32.920
That's well put.
link |
00:34:33.480
Yeah, so how does one get started in deep learning
link |
00:34:37.960
and where does deeplearning.ai fit into that?
link |
00:34:40.520
So the deep learning specialization offered by deeplearning.ai
link |
00:34:43.640
is I think it was Coursera's top specialization.
link |
00:34:49.880
It might still be.
link |
00:34:50.680
So it's a very popular way for people
link |
00:34:52.840
to take that specialization
link |
00:34:54.360
to learn about everything from neural networks
link |
00:34:57.720
to how to tune a neural network
link |
00:34:59.960
to what is a ConvNet to what is an RNN
link |
00:35:02.920
or a sequence model or what is an attention model.
link |
00:35:05.800
And so the deep learning specialization
link |
00:35:09.080
steps everyone through those algorithms
link |
00:35:10.840
so you deeply understand it
link |
00:35:12.200
and can implement it and use it for whatever application.
link |
00:35:15.160
From the very beginning.
link |
00:35:16.440
So what would you say are the prerequisites
link |
00:35:19.480
for somebody to take the deep learning specialization
link |
00:35:22.040
in terms of maybe math or programming background?
link |
00:35:25.560
Yeah, need to understand basic programming
link |
00:35:27.960
since there are programming exercises in Python
link |
00:35:30.120
and the math prereq is quite basic.
link |
00:35:34.360
So no calculus is needed.
link |
00:35:35.880
If you know calculus is great, you get better intuitions
link |
00:35:38.600
but deliberately try to teach that specialization
link |
00:35:41.160
without requiring calculus.
link |
00:35:42.680
So I think high school math would be sufficient.
link |
00:35:47.160
If you know how to multiply two matrices,
link |
00:35:49.000
I think that's great.
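That prerequisite, multiplying two matrices, can be done in plain Python with no libraries; a minimal example:

```python
# Multiply two 2x2 matrices by hand: C[i][j] is the dot product
# of row i of A with column j of B.
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

C = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
print(C)  # [[19, 22], [43, 50]]
```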
link |
00:35:52.120
So a little basic linear algebra is great.
link |
00:35:54.680
Basic linear algebra,
link |
00:35:55.960
even very, very basic linear algebra in some programming.
link |
00:36:00.040
I think that people that have done the machine learning course
link |
00:36:02.120
will find a deep learning specialization a bit easier
link |
00:36:05.000
but it's also possible to jump
link |
00:36:06.360
into the deep learning specialization directly
link |
00:36:08.280
but it will be a little bit harder
link |
00:36:09.960
since we tend to go faster over concepts
link |
00:36:14.440
like how does gradient descent work
link |
00:36:16.120
and what is the objective function
link |
00:36:17.640
which is covered more slowly in the machine learning course.
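As a tiny illustration of the gradient descent concept mentioned here, this sketch minimizes f(w) = (w - 3)^2 by repeatedly stepping against the derivative:

```python
# Gradient descent on a one-parameter objective f(w) = (w - 3)^2,
# whose minimum is at w = 3.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 3)      # derivative of (w - 3)^2
    w -= learning_rate * gradient
print(round(w, 4))  # converges to 3.0
```

The same update rule, applied to millions of weights at once, is what trains a neural network.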
link |
00:36:20.120
Could you briefly mention some of the key concepts
link |
00:36:22.840
in deep learning that students should learn
link |
00:36:25.000
that you envision them learning in the first few months
link |
00:36:27.640
in the first year or so?
link |
00:36:29.320
So if you take the deep learning specialization,
link |
00:36:31.880
you learn the foundations of what is a neural network.
link |
00:36:34.840
How do you build up a neural network
link |
00:36:36.840
from a single logistic unit to a stack of layers
link |
00:36:40.600
to different activation functions.
link |
00:36:43.000
You learn how to train the neural networks.
link |
00:36:44.920
One thing I'm very proud of in that specialization
link |
00:36:47.720
is we go through a lot of practical knowhow
link |
00:36:50.200
of how to actually make these things work.
link |
00:36:52.200
So what are the differences between different optimization algorithms?
link |
00:36:55.640
What do you do if the algorithm overfits
link |
00:36:57.240
or how do you tell if the algorithm is overfitting?
link |
00:36:59.000
When do you collect more data?
link |
00:37:00.120
When should you not bother to collect more data?
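The decision process described here can be sketched as a simple rule of thumb, following the bias/variance framing from the specialization (error rates hypothetical; a real diagnosis is more nuanced):

```python
# Rule-of-thumb diagnostic: compare training error, dev error, and
# the target (e.g. human-level) error before collecting more data.
def advise(train_error, dev_error, target_error):
    if train_error > target_error:
        # High bias: the model can't even fit the training set,
        # so collecting more data alone won't cut it.
        return "bigger model / train longer / new architecture"
    if dev_error > train_error:
        # High variance: the train/dev gap suggests more data
        # or more regularization will help.
        return "collect more data / add regularization"
    return "close to target; further gains will be hard"

print(advise(train_error=0.15, dev_error=0.16, target_error=0.02))
# -> bigger model / train longer / new architecture
```

Running a quick check like this is exactly how you avoid the six months of data collection described above.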
link |
00:37:03.160
I find that even today, unfortunately,
link |
00:37:06.200
there are engineers that will spend six months
link |
00:37:09.960
trying to pursue a particular direction
link |
00:37:12.520
such as collect more data
link |
00:37:13.880
because we heard more data is valuable
link |
00:37:15.800
but sometimes you could run some tests
link |
00:37:18.280
and could have figured out six months earlier
link |
00:37:20.360
that for this particular problem, collecting more data isn't going to cut it.
link |
00:37:23.880
So just don't spend six months collecting more data.
link |
00:37:26.280
Spend your time modifying the architecture or trying something else.
link |
00:37:30.280
So go through a lot of the practical knowhow
link |
00:37:32.600
so that when someone, when you take the deep learning specialization,
link |
00:37:37.240
you have those skills to be very efficient
link |
00:37:39.720
in how you build these networks.
link |
00:37:41.960
So dive right in to play with the network, to train it,
link |
00:37:45.160
to do the inference on a particular data set,
link |
00:37:47.160
to build intuition about it without building it up too big
link |
00:37:52.120
to where you spend, like you said, six months
link |
00:37:54.760
learning, building up your big project
link |
00:37:57.320
without building any intuition of a small aspect of the data
link |
00:38:02.200
that could already tell you everything you need to know about that data.
link |
00:38:05.640
Yes, and also the systematic frameworks of thinking
link |
00:38:09.240
for how to go about building practical machine learning.
link |
00:38:12.280
Maybe to make an analogy, when we learn to code,
link |
00:38:15.320
we have to learn the syntax of some programming language, right?
link |
00:38:17.960
Be it Python or C++ or Octave or whatever.
link |
00:38:21.480
But the equally important or maybe even more important part of coding
link |
00:38:24.920
is to understand how to string together these lines of code
link |
00:38:27.640
into coherent things.
link |
00:38:28.760
So when should you put something into a function?
link |
00:38:31.880
When should you not?
link |
00:38:32.840
How do you think about abstraction?
link |
00:38:34.600
So those frameworks are what makes a programmer efficient
link |
00:38:39.000
even more than understanding the syntax.
link |
00:38:41.560
I remember when I was an undergrad at Carnegie Mellon,
link |
00:38:44.120
one of my friends would debug their code
link |
00:38:47.480
by first trying to compile it; this was C++ code.
link |
00:38:50.840
And then for every line with a syntax error,
link |
00:38:53.240
they wanted to get rid of the syntax errors as quickly as possible.
link |
00:38:55.640
So how do you do that?
link |
00:38:56.520
Well, they would delete every single line of code with a syntax error.
link |
00:38:59.640
So really efficient for getting rid of syntax errors
link |
00:39:01.640
but a horrible way to debug.
link |
00:39:02.920
So I think we learn how to debug.
link |
00:39:05.320
And I think in machine learning,
link |
00:39:06.920
the way you debug a machine learning program
link |
00:39:09.320
is very different than the way you do binary search or whatever,
link |
00:39:13.000
or use a debugger, trace through the code
link |
00:39:15.080
in traditional software engineering.
link |
00:39:17.000
So it's an evolving discipline,
link |
00:39:18.920
but I find that the people that are really good
link |
00:39:20.760
at debugging machine learning algorithms
link |
00:39:22.840
are easily 10x, maybe 100x faster at getting something to work.
link |
00:39:28.120
And the basic process of debugging is,
link |
00:39:30.760
so the bug in this case,
link |
00:39:32.600
why isn't this thing learning, improving,
link |
00:39:36.360
sort of going into the questions of overfitting
link |
00:39:39.240
and all those kinds of things?
link |
00:39:40.760
That's the logical space that the debugging is happening in
link |
00:39:45.240
with neural networks.
link |
00:39:46.440
Yeah, often the question is, why doesn't it work yet?
link |
00:39:50.280
Or can I expect it to eventually work?
link |
00:39:52.920
And what are the things I could try?
link |
00:39:54.760
Change the architecture, more data, more regularization,
link |
00:39:57.400
different optimization algorithm,
link |
00:40:00.600
different types of data.
link |
00:40:01.880
So to answer those questions systematically,
link |
00:40:04.200
so that you don't spend six months heading down a blind alley
link |
00:40:08.040
before someone comes and says,
link |
00:40:09.720
why did you spend six months doing this?
link |
00:40:12.120
What concepts in deep learning
link |
00:40:13.960
do you think students struggle the most with?
link |
00:40:16.440
Or sort of, what is the biggest challenge for them,
link |
00:40:19.000
where once they get over that hill,
link |
00:40:23.160
it hooks them and it inspires them and they really get it.
link |
00:40:28.040
Similar to learning mathematics,
link |
00:40:30.200
I think one of the challenges of deep learning
link |
00:40:32.440
is that there are a lot of concepts
link |
00:40:33.960
that build on top of each other.
link |
00:40:36.760
If you ask me what's hard about mathematics,
link |
00:40:38.760
I have a hard time pinpointing one thing.
link |
00:40:40.920
Is it addition, subtraction?
link |
00:40:42.280
Is it a carry?
link |
00:40:43.080
Is it multiplication?
link |
00:40:44.360
There's just a lot of stuff.
link |
00:40:45.720
I think one of the challenges of learning math
link |
00:40:48.040
and of learning certain technical fields
link |
00:40:49.800
is that there are a lot of concepts
link |
00:40:51.480
and if you miss a concept,
link |
00:40:53.080
then you're kind of missing the prerequisite
link |
00:40:55.400
for something that comes later.
link |
00:40:58.040
So in the deep learning specialization,
link |
00:41:01.880
we try to break down the concepts
link |
00:41:03.480
to maximize the odds of each component being understandable.
link |
00:41:06.920
So when you move on to the more advanced thing,
link |
00:41:09.240
we learn ConvNets,
link |
00:41:10.760
hopefully you have enough intuitions
link |
00:41:12.280
from the earlier sections
link |
00:41:13.880
to then understand why we structure ConvNets
link |
00:41:16.760
in a certain way
link |
00:41:18.520
and then eventually why we built RNNs and LSTMs
link |
00:41:23.000
or attention models in a certain way
link |
00:41:24.760
building on top of the earlier concepts.
link |
00:41:27.560
Actually, I'm curious,
link |
00:41:28.600
you do a lot of teaching as well.
link |
00:41:30.920
Do you have a favorite,
link |
00:41:33.080
this is the hard concept moment in your teaching?
link |
00:41:39.480
Well, I don't think anyone's ever turned the interview on me.
link |
00:41:43.320
I'm glad you're the first.
link |
00:41:46.600
I think that's a really good question.
link |
00:41:48.920
Yeah, it's really hard to capture the moment
link |
00:41:51.160
when they struggle.
link |
00:41:51.800
I think you put it really eloquently.
link |
00:41:53.320
I do think there's moments
link |
00:41:55.080
that are like aha moments
link |
00:41:57.240
that really inspire people.
link |
00:41:59.400
I think for some reason,
link |
00:42:01.400
reinforcement learning,
link |
00:42:03.240
especially deep reinforcement learning
link |
00:42:05.560
is a really great way
link |
00:42:07.400
to really inspire people
link |
00:42:09.560
and get across what neural networks can do.
link |
00:42:13.480
Even though neural networks
link |
00:42:15.160
really are just a part of the deep RL framework,
link |
00:42:18.440
but it's a really nice way
link |
00:42:19.640
to paint the entirety of the picture
link |
00:42:22.360
of a neural network
link |
00:42:23.960
being able to learn from scratch,
link |
00:42:25.880
knowing nothing and explore the world
link |
00:42:27.720
and pick up lessons.
link |
00:42:29.080
I find that a lot of the aha moments
link |
00:42:31.240
happen when you use deep RL
link |
00:42:33.640
to teach people about neural networks,
link |
00:42:36.200
which is counterintuitive.
link |
00:42:37.720
I find like a lot of the inspired sort of fire
link |
00:42:40.680
in people's passion,
link |
00:42:41.560
people's eyes,
link |
00:42:42.200
it comes from the RL world.
link |
00:42:44.680
Do you find reinforcement learning
link |
00:42:46.920
to be a useful part
link |
00:42:48.520
of the teaching process or no?
link |
00:42:51.800
I still teach reinforcement learning
link |
00:42:53.400
in one of my Stanford classes
link |
00:42:55.480
and my PhD thesis was on reinforcement learning.
link |
00:42:57.320
So I clearly love the field.
link |
00:42:59.240
I find that if I'm trying to teach
link |
00:43:00.840
students the most useful techniques
link |
00:43:03.000
for them to use today,
link |
00:43:04.520
I end up shrinking the amount of time
link |
00:43:07.000
I talk about reinforcement learning.
link |
00:43:08.840
It's not what's working today.
link |
00:43:10.760
Now, our world changes so fast.
link |
00:43:12.280
Maybe this will be totally different
link |
00:43:13.480
in a couple of years.
link |
00:43:15.800
But I think we need a couple more things
link |
00:43:17.640
for reinforcement learning to get there.
link |
00:43:20.600
One of my teams is looking
link |
00:43:21.720
at reinforcement learning
link |
00:43:22.600
for some robotic control tasks.
link |
00:43:23.800
So I see the applications,
link |
00:43:25.160
but if you look at it as a percentage
link |
00:43:27.560
of all of the impact
link |
00:43:28.520
of the types of things we do,
link |
00:43:30.040
it's, at least today, outside of
link |
00:43:33.720
playing video games, right,
link |
00:43:35.320
and a few other areas, relatively small in scope.
link |
00:43:38.440
Actually, at NeurIPS,
link |
00:43:39.560
a bunch of us were standing around
link |
00:43:40.840
saying, hey, what's your best example
link |
00:43:42.760
of an actually deployed reinforcement
link |
00:43:44.200
learning application?
link |
00:43:45.240
And among like
link |
00:43:47.160
senior machine learning researchers, right?
link |
00:43:49.000
And again, there are some emerging ones,
link |
00:43:51.400
but there are not that many great examples.
link |
00:43:55.240
I think you're absolutely right.
link |
00:43:58.040
The sad thing is there hasn't been
link |
00:43:59.880
a big impactful real world application
link |
00:44:03.480
of reinforcement learning.
link |
00:44:04.840
I think its biggest impact to me
link |
00:44:07.560
has been in the toy domain,
link |
00:44:09.320
in the game domain,
link |
00:44:10.200
in the small example.
link |
00:44:11.240
That's what I mean for educational purpose.
link |
00:44:13.560
It seems to be a fun thing to explore
link |
00:44:15.640
neural networks with.
link |
00:44:16.760
But I think from your perspective,
link |
00:44:19.000
and I think that might be
link |
00:44:20.440
the best perspective is
link |
00:44:22.280
if you're trying to educate
link |
00:44:23.560
with a simple example
link |
00:44:24.680
in order to illustrate
link |
00:44:25.800
how this can actually be grown
link |
00:44:27.640
to scale and have a real world impact,
link |
00:44:31.560
then perhaps focusing on the fundamentals
link |
00:44:33.640
of supervised learning
link |
00:44:35.400
in the context of a simple data set,
link |
00:44:38.920
even like an MNIST data set
link |
00:44:40.440
is the right way,
link |
00:44:42.040
is the right path to take.
link |
00:44:45.080
The amount of fun I've seen people
link |
00:44:46.520
have with reinforcement learning
link |
00:44:47.880
has been great,
link |
00:44:48.440
but not in the applied impact
link |
00:44:51.320
in the real world setting.
link |
00:44:52.760
So it's a trade off,
link |
00:44:54.040
how much impact you want to have
link |
00:44:55.320
versus how much fun you want to have.
link |
00:44:56.680
Yeah, that's really cool.
link |
00:44:58.200
And I feel like the world
link |
00:44:59.960
actually needs all sorts.
link |
00:45:01.240
Even within machine learning,
link |
00:45:02.520
I feel like deep learning
link |
00:45:04.360
is so exciting,
link |
00:45:05.800
but the AI team
link |
00:45:07.080
shouldn't just use deep learning.
link |
00:45:08.360
I find that my teams
link |
00:45:09.320
use a portfolio of tools.
link |
00:45:11.640
And maybe that's not the exciting thing
link |
00:45:13.080
to say, but some days
link |
00:45:14.680
we use a neural net,
link |
00:45:15.720
some days we use PCA.
link |
00:45:19.960
Actually, the other day,
link |
00:45:20.600
I was sitting down with my team
link |
00:45:21.480
looking at PCA residuals,
link |
00:45:22.760
trying to figure out what's going on
link |
00:45:23.800
with PCA applied
link |
00:45:24.600
to a manufacturing problem.
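The kind of PCA-residual check described here can be sketched in a few lines. This is a toy illustration with synthetic data (the function name and the line-shaped data are invented for this example, not the actual manufacturing setup): rows that the top principal components explain well get small residuals, and an off-subspace point stands out.

```python
import numpy as np

def pca_residuals(X, k=2):
    """Anomaly scores from PCA reconstruction error.

    Fit PCA (via SVD) on the rows of X, project onto the top-k
    principal components, and return per-row residual norms:
    points the low-rank model explains poorly score high.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # Right singular vectors are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # (d, k) top-k components
    proj = Xc @ V @ V.T               # reconstruction in input space
    return np.linalg.norm(Xc - proj, axis=1)

# Toy example: points near a line, plus one off-line outlier.
rng = np.random.default_rng(0)
t = rng.normal(size=(50, 1))
X = np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(50, 2))
X = np.vstack([X, [[0.0, 5.0]]])      # outlier far from the line
scores = pca_residuals(X, k=1)
# The outlier has by far the largest residual.
```

Ranking rows by this residual is a simple anomaly screen; in practice you would pick k from the explained-variance curve rather than fixing it.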
link |
00:45:25.640
And some days we use
link |
00:45:26.920
a probabilistic graphical model,
link |
00:45:28.200
some days we use a knowledge graph,
link |
00:45:29.720
which is one of the things
link |
00:45:30.520
that has tremendous industry impact.
link |
00:45:33.000
But the amount of chatter
link |
00:45:34.680
about knowledge graphs in academia
link |
00:45:36.360
is really thin compared
link |
00:45:37.640
to the actual real world impact.
link |
00:45:39.640
So I think reinforcement learning
link |
00:45:41.400
should be in that portfolio.
link |
00:45:42.520
And then it's about balancing
link |
00:45:43.640
how much we teach all of these things.
link |
00:45:45.240
And the world should have
link |
00:45:47.000
diverse skills.
link |
00:45:47.800
It'd be sad if everyone
link |
00:45:49.240
just learned one narrow thing.
link |
00:45:51.400
Yeah, diverse skills
link |
00:45:52.360
help you discover the right tool
link |
00:45:53.720
for the job.
link |
00:45:54.280
What is the most beautiful,
link |
00:45:56.680
surprising or inspiring idea
link |
00:45:59.160
in deep learning to you?
link |
00:46:00.760
Something that captivated
link |
00:46:03.400
your imagination.
link |
00:46:04.600
Is it the scale,
link |
00:46:07.080
the performance that could be
link |
00:46:07.960
achieved with scale?
link |
00:46:08.920
Or are there other ideas?
link |
00:46:11.560
I think that if my only job
link |
00:46:14.360
was being an academic researcher,
link |
00:46:16.520
if I had an unlimited budget
link |
00:46:18.120
and didn't have to worry
link |
00:46:19.960
about short term impact
link |
00:46:21.800
and only focus on long term impact,
link |
00:46:23.800
I'd probably spend all my time
link |
00:46:24.760
doing research on unsupervised learning.
link |
00:46:27.400
I still think unsupervised learning
link |
00:46:28.840
is a beautiful idea.
link |
00:46:31.400
At both this past NeurIPS and ICML,
link |
00:46:34.600
I was attending workshops
link |
00:46:35.960
or listening to various talks
link |
00:46:37.480
about self supervised learning,
link |
00:46:39.160
which is one vertical segment
link |
00:46:41.480
maybe of unsupervised learning
link |
00:46:43.160
that I'm excited about.
link |
00:46:45.160
Maybe just to summarize the idea,
link |
00:46:46.360
I guess you know the idea
link |
00:46:47.400
but would you mind describing it briefly?
link |
00:46:48.520
No, please.
link |
00:46:49.080
So here's the example
link |
00:46:49.960
of self supervised learning.
link |
00:46:52.040
Let's say we grab a lot
link |
00:46:53.480
of unlabeled images off the internet.
link |
00:46:55.560
So with infinite amounts
link |
00:46:56.680
of this type of data,
link |
00:46:58.040
I'm going to take each image
link |
00:46:59.320
and rotate it by a random
link |
00:47:01.160
multiple of 90 degrees.
link |
00:47:03.000
And then I'm going to train
link |
00:47:04.760
a supervised neural network
link |
00:47:06.200
to predict what was
link |
00:47:07.400
the original orientation.
link |
00:47:08.920
So was it rotated 90 degrees,
link |
00:47:10.760
180 degrees, 270 degrees,
link |
00:47:12.440
or zero degrees.
link |
00:47:14.360
So you can generate
link |
00:47:15.640
infinite amounts of labeled data
link |
00:47:17.560
because you rotated the image
link |
00:47:18.920
so you know what's the
link |
00:47:19.880
ground truth label.
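As a minimal sketch of the rotation task just described (the helper name and the tiny 4x4 "image" are invented for illustration; real pipelines such as RotNet do this on image batches inside a data loader):

```python
import numpy as np

def make_rotation_example(img, rng):
    """Self-supervised rotation pretext task: rotate an image by a
    random multiple of 90 degrees and keep that multiple as the label."""
    k = int(rng.integers(0, 4))        # label: 0, 1, 2, or 3 quarter-turns
    return np.rot90(img, k), k

rng = np.random.default_rng(42)
img = np.arange(16).reshape(4, 4)      # stand-in for a real image
rotated, label = make_rotation_example(img, rng)

# Rotating back by (4 - k) quarter-turns recovers the original,
# which is exactly why the ground-truth label comes for free.
restored = np.rot90(rotated, (4 - label) % 4)
```

A classifier trained to predict `label` from `rotated` has to learn something about object orientation, and its hidden layers can then be transferred.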
link |
00:47:20.760
And so various researchers
link |
00:47:23.320
have found that by taking
link |
00:47:24.680
unlabeled data and making
link |
00:47:26.600
up labeled data sets
link |
00:47:27.880
and training a large neural network
link |
00:47:29.720
on these tasks,
link |
00:47:30.920
you can then take the hidden
link |
00:47:32.040
layer representation and transfer
link |
00:47:34.120
it to a different task
link |
00:47:35.400
very powerfully.
link |
00:47:37.640
Learning word embeddings
link |
00:47:39.000
where we take a sentence,
link |
00:47:40.040
delete a word,
link |
00:47:40.760
predict the missing word,
link |
00:47:42.120
which is how we learn.
link |
00:47:43.480
One of the ways we learn
link |
00:47:44.440
word embeddings
link |
00:47:45.480
is another example.
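That delete-a-word setup can be sketched as follows (toy code, not any particular library's API; the `<MASK>` token and helper name are made up for illustration): the masked sentence is the input and the deleted word is the label you get for free.

```python
import random

def make_masked_example(sentence, rng):
    """Delete one word from a sentence; the context is the input and
    the deleted word is the free supervised label (a toy version of
    the masked-word objective behind word embeddings)."""
    words = sentence.split()
    i = rng.randrange(len(words))
    target = words[i]
    context = words[:i] + ["<MASK>"] + words[i + 1:]
    return " ".join(context), target

rng = random.Random(0)
context, target = make_masked_example("the cat sat on the mat", rng)
```

Running this over a large unlabeled corpus yields unlimited (context, target) training pairs with no human labeling.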
link |
00:47:47.160
And I think there's now
link |
00:47:48.680
this portfolio of techniques
link |
00:47:50.440
for generating these made up tasks.
link |
00:47:53.320
Another one called jigsaw
link |
00:47:54.760
would be if you take an image,
link |
00:47:56.760
cut it up into a three by three grid,
link |
00:47:59.240
so like nine
link |
00:48:00.040
three by three puzzle pieces,
link |
00:48:01.560
jumble up the nine pieces
link |
00:48:02.840
and have a neural network predict
link |
00:48:04.520
which of the nine factorial
link |
00:48:06.360
possible permutations
link |
00:48:07.880
it came from.
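A rough sketch of that jigsaw setup, assuming a square grayscale image whose sides divide by three (published versions predict an index into a fixed subset of the 9! permutations rather than the raw permutation; that simplification here is mine):

```python
import numpy as np

def make_jigsaw_example(img, rng):
    """Jigsaw pretext task sketch: cut an image into a 3x3 grid of
    tiles, shuffle them with a random permutation, and keep the
    permutation as the free ground-truth label."""
    h, w = img.shape[0] // 3, img.shape[1] // 3
    tiles = [img[r*h:(r+1)*h, c*w:(c+1)*w]
             for r in range(3) for c in range(3)]
    perm = rng.permutation(9)                 # the label, for free
    shuffled = [tiles[i] for i in perm]
    return shuffled, perm

rng = np.random.default_rng(7)
img = np.arange(81).reshape(9, 9)
shuffled, perm = make_jigsaw_example(img, rng)

# Undoing the shuffle with the inverse permutation recovers the tiles,
# confirming the label is consistent with the scrambled input.
inv = np.argsort(perm)
restored = [shuffled[i] for i in inv]
```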
link |
00:48:09.320
So many groups,
link |
00:48:11.480
including OpenAI,
link |
00:48:13.080
Pieter Abbeel has been doing
link |
00:48:14.520
some work on this too,
link |
00:48:16.280
Facebook, Google Brain,
link |
00:48:18.440
I think DeepMind,
link |
00:48:19.560
oh actually,
link |
00:48:21.240
Aaron van den Oord
link |
00:48:22.200
has great work on the CPC objective.
link |
00:48:24.360
So many teams are doing exciting work
link |
00:48:26.120
and I think this is a way
link |
00:48:27.640
to generate infinite labeled data
link |
00:48:30.440
and I find this a very exciting
link |
00:48:32.920
piece of unsupervised learning.
link |
00:48:34.040
So long term you think
link |
00:48:35.080
that's going to unlock
link |
00:48:37.160
a lot of power
link |
00:48:38.280
in machine learning systems
link |
00:48:39.960
is this kind of unsupervised learning?
link |
00:48:42.200
I don't think it's
link |
00:48:43.080
the whole enchilada,
link |
00:48:43.880
I think it's just a piece of it
link |
00:48:45.080
and I think this one piece
link |
00:48:46.440
unsupervised,
link |
00:48:47.320
self supervised learning
link |
00:48:48.840
is starting to get traction.
link |
00:48:50.200
We're very close
link |
00:48:51.320
to it being useful.
link |
00:48:53.160
Well, word embeddings
link |
00:48:54.040
are really useful.
link |
00:48:55.480
I think we're getting
link |
00:48:56.200
closer and closer
link |
00:48:57.080
to just having a significant
link |
00:48:59.240
real world impact
link |
00:49:00.440
maybe in computer vision and video
link |
00:49:03.080
but I think this concept
link |
00:49:05.000
and I think there'll be
link |
00:49:05.880
other concepts around it.
link |
00:49:07.000
You know, other unsupervised
link |
00:49:08.760
learning things that I worked on
link |
00:49:10.520
that I've been excited about.
link |
00:49:12.040
I was really excited
link |
00:49:12.840
about sparse coding
link |
00:49:14.600
and ICA,
link |
00:49:16.040
slow feature analysis.
link |
00:49:17.480
I think all of these are ideas
link |
00:49:18.760
that various of us
link |
00:49:20.040
were working on
link |
00:49:20.680
about a decade ago
link |
00:49:21.720
before we all got distracted
link |
00:49:23.160
by how well supervised
link |
00:49:24.680
learning was doing.
link |
00:49:26.200
So
link |
00:49:27.880
we would return to the fundamentals
link |
00:49:29.400
of representation learning
link |
00:49:30.760
that really started
link |
00:49:32.200
this movement of deep learning.
link |
00:49:33.720
I think there's a lot more work
link |
00:49:34.840
that one could explore around
link |
00:49:36.120
this theme of ideas
link |
00:49:37.080
and other ideas
link |
00:49:38.200
to come up with better algorithms.
link |
00:49:40.200
So if we could return
link |
00:49:42.040
to maybe talk quickly
link |
00:49:43.880
about the specifics
link |
00:49:45.080
of deep learning.ai
link |
00:49:46.600
the deep learning specialization
link |
00:49:48.120
perhaps how long does it take
link |
00:49:50.360
to complete the course
link |
00:49:51.240
would you say?
link |
00:49:52.680
The official length
link |
00:49:53.800
of the deep learning specialization
link |
00:49:55.320
is I think 16 weeks
link |
00:49:57.080
so about four months
link |
00:49:58.920
but it's go at your own pace.
link |
00:50:00.760
So if you subscribe
link |
00:50:01.960
to the deep learning specialization
link |
00:50:03.560
there are people that finished it
link |
00:50:04.760
in less than a month
link |
00:50:05.720
by working more intensely
link |
00:50:07.000
and studying more intensely
link |
00:50:07.960
so it really depends on
link |
00:50:09.240
the individual.
link |
00:50:10.920
When we created
link |
00:50:11.480
the deep learning specialization
link |
00:50:13.480
we wanted to make it
link |
00:50:15.400
very accessible
link |
00:50:16.360
and very affordable.
link |
00:50:18.440
And with you know
link |
00:50:19.480
Coursera and deep learning.ai's
link |
00:50:20.840
education mission
link |
00:50:21.720
one of the things
link |
00:50:22.120
that's really important to me
link |
00:50:23.480
is that if there's someone
link |
00:50:25.560
for whom paying anything
link |
00:50:27.160
is a financial hardship
link |
00:50:29.320
then just apply for financial aid
link |
00:50:30.920
and get it for free.
link |
00:50:34.280
If you were to recommend
link |
00:50:35.880
a daily schedule for people
link |
00:50:38.040
in learning whether it's
link |
00:50:39.240
through the deep learning.ai
link |
00:50:40.600
specialization or just learning
link |
00:50:42.680
in the world of deep learning
link |
00:50:43.960
what would you recommend?
link |
00:50:45.480
How do they go about day to day
link |
00:50:47.160
sort of specific advice
link |
00:50:48.760
about learning
link |
00:50:49.800
about their journey in the world
link |
00:50:51.720
of deep learning machine learning?
link |
00:50:53.400
I think getting the habit of learning
link |
00:50:56.760
is key and that means regularity.
link |
00:51:00.920
So for example
link |
00:51:02.840
we send out a weekly newsletter
link |
00:51:05.080
the batch every Wednesday
link |
00:51:06.680
so people know it's coming Wednesday
link |
00:51:08.200
you can spend a little bit of time
link |
00:51:09.160
on Wednesday
link |
00:51:10.200
catching up on the latest news
link |
00:51:13.640
through the batch on Wednesday
link |
00:51:17.400
and for myself
link |
00:51:18.600
I've picked up a habit of spending
link |
00:51:21.160
some time every Saturday
link |
00:51:22.520
and every Sunday reading or studying
link |
00:51:24.600
and so I don't wake up on the Saturday
link |
00:51:26.600
and have to make a decision
link |
00:51:27.640
do I feel like reading
link |
00:51:28.840
or studying today or not
link |
00:51:30.280
it's just what I do
link |
00:51:31.640
and the fact is a habit
link |
00:51:33.160
makes it easier.
link |
00:51:34.200
So I think if someone can get into that habit
link |
00:51:37.640
it's like you know
link |
00:51:38.760
just like we brush our teeth every morning
link |
00:51:41.080
I don't think about it
link |
00:51:42.040
if I thought about it
link |
00:51:42.760
it's a little bit annoying
link |
00:51:43.480
to have to spend two minutes doing that
link |
00:51:45.960
but it's a habit that it takes
link |
00:51:47.720
no cognitive load
link |
00:51:49.080
but this would be so much harder
link |
00:51:50.360
if we have to make a decision every morning
link |
00:51:53.640
and actually that's the reason
link |
00:51:54.680
why I wear the same thing every day as well
link |
00:51:56.040
it's just one less decision
link |
00:51:57.160
I just get up and wear my blue shirt
link |
00:51:59.560
so but I think if you can get that habit
link |
00:52:01.160
that consistency of studying
link |
00:52:02.840
then it actually feels easier.
link |
00:52:05.720
So yeah it's kind of amazing
link |
00:52:08.600
in my own life
link |
00:52:09.320
like I play guitar every day for
link |
00:52:12.840
I force myself to at least for five minutes
link |
00:52:14.920
play guitar
link |
00:52:15.560
it's just it's a ridiculously short period of time
link |
00:52:18.040
but because I've gotten into that habit
link |
00:52:20.120
it's incredible what you can accomplish
link |
00:52:21.720
in a period of a year or two years
link |
00:52:24.440
you can become
link |
00:52:26.280
you know exceptionally good
link |
00:52:28.280
at certain aspects of a thing
link |
00:52:29.720
by just doing it every day
link |
00:52:30.920
for a very short period of time
link |
00:52:32.040
it's kind of a miracle
link |
00:52:33.000
that that's how it works
link |
00:52:34.600
it adds up over time.
link |
00:52:36.200
Yeah and I think this is often
link |
00:52:38.360
not about the bursts of effort
link |
00:52:40.760
and the all nighters
link |
00:52:41.880
because you could only do that
link |
00:52:43.080
a limited number of times
link |
00:52:44.200
it's the sustained effort over a long time
link |
00:52:47.240
I think you know reading two research papers
link |
00:52:50.360
is a nice thing to do
link |
00:52:51.880
but the power is not reading two research papers
link |
00:52:54.200
it's reading two research papers a week
link |
00:52:56.760
for a year
link |
00:52:57.480
then you read a hundred papers
link |
00:52:58.920
and you actually learn a lot
link |
00:53:00.200
when you read a hundred papers.
link |
00:53:02.040
So regularity and making learning a habit
link |
00:53:05.720
do you have general other study tips
link |
00:53:09.720
for particularly deep learning
link |
00:53:11.880
that people should
link |
00:53:13.400
in their process of learning
link |
00:53:15.000
is there some kind of recommendations
link |
00:53:16.600
or tips you have as they learn?
link |
00:53:19.720
One thing I still do
link |
00:53:21.560
when I'm trying to study something really deeply
link |
00:53:23.320
is take handwritten notes
link |
00:53:25.800
it varies
link |
00:53:26.360
I know there are a lot of people
link |
00:53:27.640
that take the deep learning courses
link |
00:53:29.320
during a commute or something
link |
00:53:31.960
where it may be more awkward to take notes
link |
00:53:33.800
so I know it may not work for everyone
link |
00:53:36.680
but when I'm taking courses on Coursera
link |
00:53:39.640
and I still take some every now and then
link |
00:53:41.640
the most recent one I took
link |
00:53:42.520
was a course on clinical trials
link |
00:53:44.360
because I was interested about that
link |
00:53:45.640
I got out my little Moleskine notebook
link |
00:53:47.880
and while I was sitting at my desk
link |
00:53:48.840
was just taking down notes
link |
00:53:50.280
of what the instructor was saying
link |
00:53:51.480
and we know that
link |
00:53:53.000
that act of taking notes
link |
00:53:54.760
preferably handwritten notes
link |
00:53:57.240
increases retention.
link |
00:53:59.560
So as you're sort of watching the video
link |
00:54:01.720
just kind of pausing maybe
link |
00:54:03.800
and then taking the basic insights down on paper.
link |
00:54:07.800
Yeah so there have been a few studies
link |
00:54:09.960
if you search online
link |
00:54:11.080
you find some of these studies
link |
00:54:12.680
that taking handwritten notes
link |
00:54:15.080
because handwriting is slower
link |
00:54:16.920
as we're saying just now
link |
00:54:18.920
it causes you to recode the knowledge
link |
00:54:21.240
in your own words more
link |
00:54:23.080
and that process of recoding
link |
00:54:24.840
promotes long term retention
link |
00:54:26.600
this is as opposed to typing
link |
00:54:28.200
which is fine
link |
00:54:28.920
again typing is better than nothing
link |
00:54:30.680
or taking a class
link |
00:54:31.800
and not taking notes is better
link |
00:54:32.760
than not taking any class at all
link |
00:54:34.360
but comparing handwritten notes
link |
00:54:36.440
and typing
link |
00:54:37.960
you can usually type faster
link |
00:54:39.480
for a lot of people
link |
00:54:40.280
than you can handwrite notes
link |
00:54:41.480
and so when people type
link |
00:54:42.920
they're more likely to just transcribe
link |
00:54:44.920
verbatim what they heard
link |
00:54:46.280
and that reduces the amount of recoding
link |
00:54:49.080
and that actually results
link |
00:54:50.360
in less long term retention.
link |
00:54:52.360
I don't know what the psychological effect
link |
00:54:53.960
there is but so true
link |
00:54:55.320
there's something fundamentally different
link |
00:54:56.840
about handwriting
link |
00:54:59.400
I wonder what that is
link |
00:55:00.200
I wonder if it is as simple
link |
00:55:01.640
as just the time it takes, writing is slower
link |
00:55:04.360
yeah and because you can't write
link |
00:55:07.400
as many words
link |
00:55:08.120
you have to take whatever they said
link |
00:55:10.200
and summarize it into fewer words
link |
00:55:11.960
and that summarization process
link |
00:55:13.400
requires deeper processing of the meaning
link |
00:55:15.880
which then results in better retention
link |
00:55:17.880
that's fascinating
link |
00:55:20.040
oh and I think because of Coursera
link |
00:55:22.440
I spent so much time studying pedagogy
link |
00:55:24.120
this is actually one of my passions
link |
00:55:25.400
I really love learning
link |
00:55:27.000
how to more efficiently
link |
00:55:28.040
help others learn
link |
00:55:28.920
you know one of the things I do
link |
00:55:30.600
both when creating videos
link |
00:55:32.280
or when we write the batch is
link |
00:55:34.760
I try to think, is one minute spent with us
link |
00:55:37.800
going to be a more efficient learning experience
link |
00:55:40.600
than one minute spent anywhere else
link |
00:55:42.520
and we really try to you know
link |
00:55:45.080
make it time efficient for the learners
link |
00:55:46.920
because you know everyone's busy
link |
00:55:48.680
so when we're editing
link |
00:55:50.280
I often tell my teams
link |
00:55:51.960
every word needs to fight for its life
link |
00:55:53.800
and if you can delete a word
link |
00:55:54.680
let's just delete it
link |
00:55:56.360
let's not waste the learning time
link |
00:55:59.960
oh that's so it's so amazing
link |
00:56:01.400
that you think that way
link |
00:56:02.200
because there is millions of people
link |
00:56:03.560
that are impacted by your teaching
link |
00:56:04.840
and sort of that one minute spent
link |
00:56:06.680
has a ripple effect right
link |
00:56:08.360
through years of time
link |
00:56:09.560
which is it's just fascinating to think about
link |
00:56:12.600
how does one make a career
link |
00:56:14.280
out of an interest in deep learning
link |
00:56:15.960
do you have advice for people
link |
00:56:18.680
we just talked about
link |
00:56:19.480
sort of the beginning early steps
link |
00:56:21.400
but if you want to make it
link |
00:56:22.600
an entire life's journey
link |
00:56:24.280
or at least a journey of a decade or two
link |
00:56:26.360
how do you do it?
link |
00:56:28.200
so most important thing is to get started
link |
00:56:30.120
right, and I think in the early parts
link |
00:56:34.280
of a career coursework
link |
00:56:35.800
like the deep learning specialization
link |
00:56:38.040
is a very efficient way
link |
00:56:41.080
to master this material
link |
00:56:43.320
so because you know instructors
link |
00:56:46.600
be it me or someone else
link |
00:56:48.280
or you know Lawrence Maroney
link |
00:56:49.640
teaches our TensorFlow specialization
link |
00:56:51.240
or other things we're working on
link |
00:56:52.280
spend effort to try to make it time efficient
link |
00:56:55.640
for you to learn a new concept
link |
00:56:57.640
so coursework is actually a very efficient way
link |
00:57:00.600
for people to learn concepts
link |
00:57:02.280
in the beginning parts of breaking
link |
00:57:04.120
into a new field
link |
00:57:05.960
in fact one thing I see at Stanford
link |
00:57:08.520
some of my PhD students want to jump
link |
00:57:10.280
in the research right away
link |
00:57:11.400
and I actually tend to say look
link |
00:57:13.160
in your first couple years of PhD
link |
00:57:14.440
spend time taking courses
link |
00:57:16.680
because it lays a foundation
link |
00:57:17.960
it's fine if you're less productive
link |
00:57:19.640
in your first couple years
link |
00:57:20.680
you'll be better off in the long term
link |
00:57:23.400
beyond a certain point
link |
00:57:24.520
there's material that doesn't exist in courses
link |
00:57:27.640
because it's too cutting edge
link |
00:57:28.840
the course hasn't been created yet
link |
00:57:30.040
there's some practical experience
link |
00:57:31.320
that we're not yet that good
link |
00:57:32.760
at teaching in a course
link |
00:57:34.440
and I think after exhausting
link |
00:57:36.040
the efficient coursework
link |
00:57:37.720
then most people need to go on
link |
00:57:40.360
to either ideally work on projects
link |
00:57:44.520
and then maybe also continue their learning
link |
00:57:47.080
by reading blog posts and research papers
link |
00:57:49.560
and things like that
link |
00:57:50.920
doing projects is really important
link |
00:57:52.280
and again I think it's important
link |
00:57:55.080
to start small and just do something
link |
00:57:57.560
today you read about deep learning
link |
00:57:58.920
feels like oh all these people
link |
00:57:59.800
doing such exciting things
link |
00:58:01.080
what if I'm not building a neural network
link |
00:58:02.920
that changes the world
link |
00:58:03.720
then what's the point?
link |
00:58:04.440
Well the point is sometimes building
link |
00:58:06.360
that tiny neural network
link |
00:58:07.720
you know, be it MNIST or upgrading
link |
00:58:10.120
to fashion MNIST or whatever
link |
00:58:12.280
so doing your own fun hobby project
link |
00:58:14.680
that's how you gain the skills
link |
00:58:15.960
to let you do bigger and bigger projects
link |
00:58:18.200
I find this to be true at the individual level
link |
00:58:20.520
and also at the organizational level
link |
00:58:23.080
for a company to become good at machine learning
link |
00:58:24.920
sometimes the right thing to do
link |
00:58:26.200
is not to tackle the giant project
link |
00:58:29.240
but instead to do the small project
link |
00:58:31.240
that lets the organization learn
link |
00:58:33.320
and then build out from there
link |
00:58:34.600
but this is true both for individuals
link |
00:58:35.960
and for companies
link |
00:58:38.200
taking the first step
link |
00:58:40.680
and then taking small steps is the key
link |
00:58:44.520
should students pursue a PhD
link |
00:58:46.280
do you think? You can do so much
link |
00:58:48.520
that's one of the fascinating things
link |
00:58:50.200
in machine learning
link |
00:58:51.160
you can have so much impact
link |
00:58:52.280
without ever getting a PhD
link |
00:58:54.440
so what are your thoughts
link |
00:58:56.040
should people go to grad school
link |
00:58:57.400
should people get a PhD?
link |
00:58:59.400
I think that there are multiple good options
link |
00:59:01.720
of which doing a PhD could be one of them
link |
00:59:05.000
I think that if someone's admitted
link |
00:59:06.920
to a top PhD program
link |
00:59:08.520
you know at MIT, Stanford, top schools
link |
00:59:11.880
I think that's a very good experience
link |
00:59:15.320
or if someone gets a job
link |
00:59:17.000
at a top organization
link |
00:59:18.760
at the top AI team
link |
00:59:20.440
I think that's also a very good experience
link |
00:59:23.880
there are some things you still need a PhD to do
link |
00:59:25.880
if someone's aspiration is to be a professor
link |
00:59:27.640
you know at the top academic university
link |
00:59:29.080
you just need a PhD to do that
link |
00:59:30.920
but if the goal is to, you know,
link |
00:59:32.520
start a company, build a company
link |
00:59:34.120
do great technical work
link |
00:59:35.320
I think a PhD is a good experience
link |
00:59:37.640
but I would look at the different options
link |
00:59:40.200
available to someone
link |
00:59:41.160
you know where are the places
link |
00:59:42.120
where you can get a job
link |
00:59:42.920
where are the places to get a PhD program
link |
00:59:44.920
and kind of weigh the pros and cons of those
link |
00:59:46.840
So just to linger on that for a little bit longer
link |
00:59:50.040
what final dreams and goals
link |
00:59:51.720
do you think people should have
link |
00:59:53.000
so what options should they explore
link |
00:59:57.320
so you can work in industry
link |
00:59:59.720
so for a large company
link |
01:00:01.960
like Google, Facebook, Baidu
link |
01:00:03.560
all these large sort of companies
link |
01:00:06.040
that already have huge teams
link |
01:00:07.720
of machine learning engineers
link |
01:00:09.160
you can also work within industry
link |
01:00:10.920
in sort of more research groups
link |
01:00:12.200
kind of like Google Research, Google Brain
link |
01:00:14.440
then you can also do
link |
01:00:16.600
like we said a professor in academia
link |
01:00:20.360
and what else
link |
01:00:21.800
oh you can build your own company
link |
01:00:23.880
you can do a startup
link |
01:00:25.080
is there anything that stands out
link |
01:00:27.240
between those options
link |
01:00:28.440
or are they all beautiful different journeys
link |
01:00:30.680
that people should consider
link |
01:00:32.520
I think the thing that affects your experience more
link |
01:00:34.760
is less are you in this company
link |
01:00:36.920
versus that company
link |
01:00:38.040
or academia versus industry
link |
01:00:40.040
I think the thing that affects your experience most
link |
01:00:41.480
is who are the people you're interacting with
link |
01:00:43.640
on a daily basis
link |
01:00:45.480
so even if you look at some of the large companies
link |
01:00:49.400
the experience of individuals
link |
01:00:50.920
in different teams is very different
link |
01:00:52.920
and what matters most is not the logo above the door
link |
01:00:56.120
when you walk into the giant building every day
link |
01:00:58.280
what matters the most is who are the 10 people
link |
01:01:00.440
who are the 30 people you interact with every day
link |
01:01:03.080
so I actually tend to advise people
link |
01:01:04.840
if you get a job from a company
link |
01:01:07.480
ask who is your manager
link |
01:01:09.320
who are your peers
link |
01:01:10.120
who are you actually going to talk to
link |
01:01:11.320
we're all social creatures
link |
01:01:12.440
we tend to become more like the people around us
link |
01:01:15.400
and if you're working with great people
link |
01:01:17.480
you will learn faster
link |
01:01:19.240
or if you get admitted
link |
01:01:20.520
if you get a job at a great company
link |
01:01:23.000
or a great university
link |
01:01:24.120
maybe the logo you walk in under is great
link |
01:01:26.680
but you're actually stuck on some team
link |
01:01:28.200
doing work that doesn't really excite you
link |
01:01:31.160
and then that's actually a really bad experience
link |
01:01:33.640
so this is true both for universities
link |
01:01:36.280
and for large companies
link |
01:01:37.960
for small companies you can kind of figure out
link |
01:01:39.640
who you'll be working with quite quickly
link |
01:01:41.240
and I tend to advise people
link |
01:01:43.240
if a company refuses to tell you
link |
01:01:45.080
who you will work with
link |
01:01:46.120
someone says, oh, join us
link |
01:01:47.160
the rotation system will figure it out
link |
01:01:48.920
I think that's a worrying answer
link |
01:01:51.000
because it means
link |
01:01:54.680
you may not actually get to a team
link |
01:01:57.640
with great peers and great people to work with
link |
01:02:00.120
it's actually really profound advice
link |
01:02:01.960
that we kind of sometimes sweep
link |
01:02:04.440
we don't consider too rigorously or carefully
link |
01:02:07.880
the people around you are really often
link |
01:02:10.280
especially when you accomplish great things
link |
01:02:13.000
it seems the great things are accomplished
link |
01:02:14.600
because of the people around you
link |
01:02:16.680
so it's not about
link |
01:02:20.360
whether you learn this thing
link |
01:02:21.880
or that thing or like you said
link |
01:02:23.320
the logo that hangs up top
link |
01:02:25.000
it's the people. That's fascinating
link |
01:02:27.320
and it's such a hard search process
link |
01:02:30.520
of finding just like finding the right friends
link |
01:02:34.120
and somebody to get married with
link |
01:02:36.360
and that kind of thing
link |
01:02:37.400
it's a very hard search
link |
01:02:38.680
it's a people search problem
link |
01:02:40.840
yeah but I think when someone interviews
link |
01:02:43.320
you know at a university
link |
01:02:44.440
or the research lab or the large corporation
link |
01:02:47.320
it's good to insist on just asking
link |
01:02:49.400
who are the people
link |
01:02:50.200
who is my manager
link |
01:02:51.320
and if you refuse to tell me
link |
01:02:52.520
I'm gonna think well maybe that's
link |
01:02:54.440
because you don't have a good answer
link |
01:02:55.560
it may not be someone I like
link |
01:02:57.240
and if you don't particularly connect
link |
01:02:59.320
if something feels off with the people
link |
01:03:02.360
then don't stick to it
link |
01:03:05.880
you know that's a really important signal to consider
link |
01:03:08.520
yeah, and actually
link |
01:03:11.160
in my Stanford class CS230
link |
01:03:13.240
as well as an ACM talk
link |
01:03:14.520
I think I gave like an hour long talk
link |
01:03:16.920
on career advice
link |
01:03:18.200
including on the job search process
link |
01:03:20.200
and then some of these
link |
01:03:20.920
so you can find those videos online
link |
01:03:23.160
awesome and I'll point them
link |
01:03:25.000
I'll point people to them
link |
01:03:26.440
beautiful
link |
01:03:28.360
so the AI fund helps AI startups
link |
01:03:32.120
get off the ground
link |
01:03:33.400
or perhaps you can elaborate
link |
01:03:34.680
on all the fun things it's involved with
link |
01:03:36.920
what's your advice
link |
01:03:37.800
and how does one build a successful AI startup
link |
01:03:41.880
you know in Silicon Valley
link |
01:03:43.320
a lot of startup failures
link |
01:03:44.920
come from building products
link |
01:03:46.680
that no one wanted
link |
01:03:48.520
so, you know, cool technology
link |
01:03:51.800
but who's going to use it
link |
01:03:53.400
so I think I tend to be very outcome driven
link |
01:03:57.640
and customer obsessed
link |
01:04:00.280
ultimately we don't get to vote
link |
01:04:02.360
on whether we succeed or fail
link |
01:04:04.120
it's only the customer
link |
01:04:05.560
that they're the only one
link |
01:04:06.920
that gets a thumbs up or thumbs down vote
link |
01:04:08.840
in the long term
link |
01:04:09.560
in the short term
link |
01:04:10.600
you know there are various people
link |
01:04:12.040
that get various votes
link |
01:04:13.000
but in the long term
link |
01:04:14.440
that's what really matters
link |
01:04:16.280
so as you build the startup
link |
01:04:17.400
you have to constantly ask the question
link |
01:04:20.760
will the customer give a thumbs up on this
link |
01:04:24.120
I think so
link |
01:04:24.760
I think startups that are very customer focused
link |
01:04:27.320
customer obsessed
link |
01:04:28.200
deeply understand the customer
link |
01:04:30.360
and are oriented to serve the customer
link |
01:04:34.200
are more likely to succeed
link |
01:04:36.360
with the proviso
link |
01:04:37.240
I think all of us should only do things
link |
01:04:38.920
that we think create social good
link |
01:04:40.760
and move the world forward
link |
01:04:41.880
so I personally don't want to build
link |
01:04:44.360
addictive digital products
link |
01:04:45.880
just to sell a lot of ads
link |
01:04:47.160
or you know there are things
link |
01:04:48.200
that could be lucrative
link |
01:04:49.400
that I won't do
link |
01:04:51.720
but if we can find ways to serve people
link |
01:04:53.640
in meaningful ways
link |
01:04:55.160
I think those can be
link |
01:04:57.240
great things to do
link |
01:04:58.920
either in the academic setting
link |
01:05:00.360
or in a corporate setting
link |
01:05:01.320
or a startup setting
link |
01:05:02.920
so can you give me the idea
link |
01:05:04.440
of why you started the AI fund
link |
01:05:08.520
I remember when I was leading
link |
01:05:10.120
the AI group at Baidu
link |
01:05:13.160
I had two jobs
link |
01:05:14.920
two parts of my job
link |
01:05:15.800
one was to build an AI engine
link |
01:05:17.240
to support the existing businesses
link |
01:05:19.000
and that was running
link |
01:05:20.520
just ran
link |
01:05:21.320
just performed by itself
link |
01:05:23.080
there was a second part of my job at the time
link |
01:05:24.600
which was to try to systematically initiate
link |
01:05:27.240
new lines of businesses
link |
01:05:28.920
using the company's AI capabilities
link |
01:05:31.080
so you know the self driving car team
link |
01:05:33.240
came out of my group
link |
01:05:34.360
the smart speaker team
link |
01:05:37.080
similar to what Amazon Echo / Alexa is in the US
link |
01:05:40.840
but we actually announced it
link |
01:05:41.720
before Amazon did
link |
01:05:42.760
so Baidu wasn't following Amazon
link |
01:05:47.320
that came out of my group
link |
01:05:48.600
and I found that to be
link |
01:05:50.680
actually the most fun part of my job
link |
01:05:53.400
so what I wanted to do was
link |
01:05:55.080
to build AI fund as a startup studio
link |
01:05:58.200
to systematically create new startups
link |
01:06:01.000
from scratch
link |
01:06:02.600
with all the things we can now do with AI
link |
01:06:04.840
I think the ability to build new teams
link |
01:06:07.240
to go after this rich space of opportunities
link |
01:06:09.960
is a very important way
link |
01:06:11.720
a very important mechanism
link |
01:06:13.480
to get these projects done
link |
01:06:14.760
that I think will move the world forward
link |
01:06:16.520
so I've been fortunate to build a few teams
link |
01:06:19.160
that had a meaningful positive impact
link |
01:06:21.560
and I felt that we might be able to do this
link |
01:06:25.000
in a more systematic repeatable way
link |
01:06:27.880
so a startup studio is a relatively new concept
link |
01:06:31.400
there are maybe dozens of startup studios
link |
01:06:34.120
you know right now
link |
01:06:35.640
but I feel like all of us
link |
01:06:38.680
many teams are still trying to figure out
link |
01:06:40.840
how do you systematically build companies
link |
01:06:43.640
with a high success rate
link |
01:06:45.320
so I think even a lot of my you know
link |
01:06:47.960
venture capital friends
link |
01:06:49.560
seem to be more and more building companies
link |
01:06:51.640
rather than investing in companies
link |
01:06:53.000
but I find it a fascinating thing to do
link |
01:06:55.240
to figure out the mechanisms
link |
01:06:56.520
by which we could systematically build
link |
01:06:58.680
successful teams, successful businesses
link |
01:07:01.400
in areas that we find meaningful
link |
01:07:03.320
so a startup studio
link |
01:07:05.720
is a place and a mechanism
link |
01:07:08.440
for startups to go from zero to success
link |
01:07:11.000
to try to develop a blueprint
link |
01:07:13.720
it's actually a place for us
link |
01:07:14.680
to build startups from scratch
link |
01:07:16.520
so we often bring in founders
link |
01:07:19.320
and work with them
link |
01:07:21.160
or maybe even have existing ideas
link |
01:07:23.720
that we match founders with
link |
01:07:26.440
and then this launches
link |
01:07:27.880
you know hopefully into successful companies
link |
01:07:30.920
so how close are you to figuring out
link |
01:07:34.040
a way to automate the process
link |
01:07:36.920
of starting from scratch
link |
01:07:38.280
and building a successful AI startup
link |
01:07:40.440
yeah I think we've been constantly
link |
01:07:43.720
improving and iterating on our processes
link |
01:07:46.680
how we do that
link |
01:07:47.560
so things like you know
link |
01:07:48.920
how many customer calls do we need to make
link |
01:07:50.600
in order to get customer validation
link |
01:07:52.840
how do we make sure this technology
link |
01:07:54.040
can be built
link |
01:07:54.520
quite a lot of our businesses
link |
01:07:56.200
need cutting edge machine learning algorithms
link |
01:07:58.440
the kind of algorithms
link |
01:07:59.480
that were developed in the last one or two years
link |
01:08:01.880
and even if it works in a research paper
link |
01:08:04.280
it turns out taking them to production
link |
01:08:05.640
is really hard
link |
01:08:06.200
there are a lot of issues
link |
01:08:07.160
for making these things work in real life
link |
01:08:10.840
that are not widely addressed in academia
link |
01:08:13.400
so how do we validate
link |
01:08:14.520
that this is actually doable
link |
01:08:15.720
how do you build a team
link |
01:08:17.080
get the specialized domain knowledge
link |
01:08:18.600
be it in education or health care
link |
01:08:20.200
whatever sector we're focusing on
link |
01:08:21.800
so I think
link |
01:08:23.240
we've been getting much better
link |
01:08:24.680
at giving the entrepreneurs
link |
01:08:27.880
a high success rate
link |
01:08:29.400
but I think we're still
link |
01:08:31.080
I think the whole world is still
link |
01:08:32.520
in the early phases of figuring this out
link |
01:08:34.120
but do you think there is some aspects
link |
01:08:36.840
of that process that are transferable
link |
01:08:38.760
from one startup to another
link |
01:08:40.280
to another to another
link |
01:08:41.640
yeah very much so
link |
01:08:43.000
you know starting from scratch
link |
01:08:45.080
you know starting a company
link |
01:08:46.520
to most entrepreneurs
link |
01:08:47.640
is a really lonely thing
link |
01:08:50.680
and I've seen so many entrepreneurs
link |
01:08:53.720
not know how to make certain decisions
link |
01:08:56.200
like when do you need to
link |
01:08:58.440
how do you do B2B sales right
link |
01:09:00.040
if you don't know that
link |
01:09:00.920
it's really hard
link |
01:09:02.280
or how do you market this efficiently
link |
01:09:05.400
other than you know buying ads
link |
01:09:06.920
which is really expensive
link |
01:09:08.360
are there more efficient tactics for that
link |
01:09:10.040
or for a machine learning project
link |
01:09:12.360
you know basic decisions
link |
01:09:14.200
can change the course of
link |
01:09:15.320
whether a machine learning product works or not
link |
01:09:18.360
and so there are so many hundreds of decisions
link |
01:09:20.920
that entrepreneurs need to make
link |
01:09:22.600
and making a mistake
link |
01:09:24.440
and a couple key decisions
link |
01:09:25.640
can have a huge impact
link |
01:09:28.520
on the fate of the company
link |
01:09:30.120
so I think a startup studio
link |
01:09:31.400
provides a support structure
link |
01:09:32.920
that makes starting a company
link |
01:09:34.280
much less of a lonely experience
link |
01:09:36.200
and also when faced with these key decisions
link |
01:09:39.960
like trying to hire your first
link |
01:09:42.280
VP of engineering
link |
01:09:44.840
what's a good selection criteria
link |
01:09:46.280
how do you decide
link |
01:09:46.920
should I hire this person or not
link |
01:09:48.600
by having an ecosystem
link |
01:09:51.400
around the entrepreneurs
link |
01:09:52.920
the founders to help
link |
01:09:54.520
I think we help them at the key moments
link |
01:09:57.320
and hopefully significantly
link |
01:09:59.720
make it more enjoyable
link |
01:10:00.840
and have a higher success rate
link |
01:10:02.280
so there's somebody to brainstorm with
link |
01:10:04.520
in these very difficult decision points
link |
01:10:07.880
and also to help them recognize
link |
01:10:10.920
what they may not even realize
link |
01:10:12.840
is a key decision point
link |
01:10:14.760
that's the first
link |
01:10:15.800
and probably the most important part
link |
01:10:17.240
yeah actually I can say one other thing
link |
01:10:19.720
um you know I think
link |
01:10:22.200
building companies is one thing
link |
01:10:23.800
but I feel like it's really important
link |
01:10:26.360
that we build companies
link |
01:10:28.040
that move the world forward
link |
01:10:29.960
for example within the AI Fund team
link |
01:10:32.360
there was once an idea
link |
01:10:33.640
for a new company
link |
01:10:35.480
that if it had succeeded
link |
01:10:37.240
would have resulted in people
link |
01:10:38.680
watching a lot more videos
link |
01:10:40.040
in a certain narrow vertical type of video
link |
01:10:42.760
um I looked at it
link |
01:10:43.880
the business case was fine
link |
01:10:45.480
the revenue case was fine
link |
01:10:46.600
but I looked and just said
link |
01:10:48.200
I don't want to do this
link |
01:10:49.240
like you know I don't actually
link |
01:10:50.600
just want to have a lot more people
link |
01:10:52.360
watch this type of video
link |
01:10:53.720
wasn't educational
link |
01:10:54.600
it's educational, maybe
link |
01:10:56.200
and so I killed the idea
link |
01:10:59.000
on the basis that I didn't think
link |
01:11:00.520
it would actually help people
link |
01:11:01.880
so um whether building companies
link |
01:11:04.040
or working enterprises
link |
01:11:05.240
or doing personal projects
link |
01:11:06.600
I think um it's up to each of us
link |
01:11:10.200
to figure out what's the difference
link |
01:11:11.480
we want to make in the world
link |
01:11:13.960
With landing AI
link |
01:11:15.240
you help already established companies
link |
01:11:17.000
grow their AI and machine learning efforts
link |
01:11:20.040
how does a large company
link |
01:11:21.720
integrate machine learning
link |
01:11:22.840
into their efforts?
link |
01:11:25.240
AI is a general purpose technology
link |
01:11:27.560
and I think it will transform every industry
link |
01:11:30.360
our community has already transformed
link |
01:11:32.920
to a large extent
link |
01:11:33.640
the software internet sector
link |
01:11:35.320
most software internet companies
link |
01:11:36.840
outside the top
link |
01:11:38.040
five or six, or three or four
link |
01:11:39.960
already have reasonable
link |
01:11:41.880
machine learning capabilities
link |
01:11:43.160
or are getting there
link |
01:11:44.040
there's still room for improvement
link |
01:11:46.200
but when I look outside
link |
01:11:47.320
the software internet sector
link |
01:11:49.080
everything from manufacturing
link |
01:11:50.600
agriculture, healthcare
link |
01:11:52.040
logistics transportation
link |
01:11:53.720
there's so many opportunities
link |
01:11:55.480
that very few people are working on
link |
01:11:57.800
so I think the next wave of AI
link |
01:11:59.640
is for us to also transform
link |
01:12:01.080
all of those other industries
link |
01:12:03.240
there was a McKinsey study
link |
01:12:04.440
estimating 13 trillion dollars
link |
01:12:06.840
of global economic growth
link |
01:12:09.560
US GDP is 19 trillion dollars
link |
01:12:11.560
so 13 trillion is a big number
link |
01:12:13.160
or PwC estimates 16 trillion dollars
link |
01:12:16.040
so whichever number it is, it's large
link |
01:12:18.200
but the interesting thing to me
link |
01:12:19.400
was a lot of that impact
link |
01:12:20.600
will be outside
link |
01:12:21.640
the software internet sector
link |
01:12:23.640
so we need more teams
link |
01:12:25.880
to work with these companies
link |
01:12:27.880
to help them adopt AI
link |
01:12:29.640
and I think this is one thing
link |
01:12:30.680
that will, you know
link |
01:12:31.800
help drive global economic growth
link |
01:12:33.560
and make humanity more powerful
link |
01:12:35.800
and like you said the impact is there
link |
01:12:37.720
so what are the best industries
link |
01:12:39.400
the biggest industries
link |
01:12:40.360
where AI can help
link |
01:12:41.560
perhaps outside the software tech sector
link |
01:12:44.360
frankly I think it's all of them
link |
01:12:47.880
some of the ones I'm spending a lot of time on
link |
01:12:49.800
are manufacturing agriculture
link |
01:12:52.360
looking into healthcare
link |
01:12:54.440
for example in manufacturing
link |
01:12:56.360
we do a lot of work in visual inspection
link |
01:12:58.600
where today there are people standing around
link |
01:13:01.320
using the human eye
link |
01:13:02.840
to check if you know
link |
01:13:03.880
this plastic part or the smartphone
link |
01:13:05.720
or this thing has a scratch
link |
01:13:07.320
or a dent or something in it
link |
01:13:09.320
we can use a camera to take a picture
link |
01:13:12.440
use an algorithm
link |
01:13:14.040
deep learning and other things
link |
01:13:15.400
to check if it's defective or not
link |
01:13:17.800
and thus help factories improve yield
link |
01:13:20.440
and improve quality
link |
01:13:21.560
and improve throughput
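The camera-plus-algorithm inspection step described above (image in, "defective" or "ok" out) can be sketched as a toy loop. A real system would run a trained deep learning classifier; the threshold rule below is a hypothetical stand-in just to make the control flow concrete.

```python
# A toy sketch of visual inspection: image in, "defective" or "ok" out.
# A real system would run a trained deep learning classifier; this
# hypothetical threshold rule only stands in to show the flow.

def inspect(image, deviation=0.2, defect_fraction=0.05):
    """Label a grayscale image (pixels in 0.0-1.0) as 'defective' or 'ok'."""
    n = len(image) * len(image[0])
    mean = sum(p for row in image for p in row) / n
    # A scratch or dent shows up as pixels far from the part's typical shade.
    outliers = sum(1 for row in image for p in row if abs(p - mean) > deviation)
    return "defective" if outliers / n > defect_fraction else "ok"

clean = [[0.5] * 8 for _ in range(8)]
scratched = [row[:] for row in clean]
scratched[3] = [0.9] * 8  # a bright scratch across one row

print(inspect(clean))      # ok
print(inspect(scratched))  # defective
```

In a deployed line the same decision would feed the downstream workflow, routing flagged parts to rework or a human double check.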
link |
01:13:23.480
it turns out the practical problems
link |
01:13:25.000
we run into are very different
link |
01:13:26.520
than the ones you might read about
link |
01:13:28.040
in most research papers
link |
01:13:29.400
the data sets are really small
link |
01:13:30.680
so we face small data problems
link |
01:13:33.160
you know the factories
link |
01:13:34.200
keep on changing the environment
link |
01:13:35.800
so it works well on your test set
link |
01:13:38.200
but guess what
link |
01:13:40.680
something changes in the factory
link |
01:13:41.960
the lights go on or off
link |
01:13:43.480
recently there was a factory
link |
01:13:45.080
in which a bird flew through the factory
link |
01:13:47.800
and pooped on something
link |
01:13:48.840
and so that changed stuff
link |
01:13:50.760
and so increasing our algorithm's
link |
01:13:53.080
robustness
link |
01:13:54.200
to all the changes that happen in the factory
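One common way to chase the robustness mentioned here (a generic illustration, not necessarily Landing AI's actual pipeline) is to augment a small training set with synthetic lighting variations, so the model sees brighter and darker conditions than the original photos captured:

```python
# Generate brightness-shifted copies of each training image so a model
# trained on them tolerates lighting changes (lights on or off, a sunnier
# window). The offsets are arbitrary illustrative values.

def augment_brightness(image, offsets=(-0.2, 0.0, 0.2)):
    """Yield one copy of a grayscale image per offset, clamped to [0, 1]."""
    for off in offsets:
        yield [[min(1.0, max(0.0, p + off)) for p in row] for row in image]

base = [[0.5, 0.9], [0.1, 0.4]]
variants = list(augment_brightness(base))
print(len(variants))  # 3 training examples from 1 original
```

Augmentation like this helps with the small-data problem too, since each real photo yields several training examples.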
link |
01:13:56.920
I find that we run into a lot of practical problems
link |
01:13:59.160
that are not as widely discussed
link |
01:14:01.480
in academia
link |
01:14:02.600
and it's really fun
link |
01:14:03.960
kind of being on the cutting edge
link |
01:14:05.080
solving these problems before
link |
01:14:07.560
maybe before many people are even aware
link |
01:14:09.240
that there is a problem there
link |
01:14:10.360
and that's such a fascinating space
link |
01:14:12.280
you're absolutely right
link |
01:14:13.160
but what is the first step
link |
01:14:15.400
that a company should take
link |
01:14:16.520
it's just a scary leap
link |
01:14:18.200
into this new world of
link |
01:14:20.120
going from the human eye
link |
01:14:21.720
inspecting to digitizing that process
link |
01:14:24.680
having a camera
link |
01:14:25.640
having an algorithm
link |
01:14:27.240
what's the first step
link |
01:14:28.200
like what's the early journey
link |
01:14:30.040
that you recommend
link |
01:14:31.080
that you see these companies taking
link |
01:14:33.400
I published a document
link |
01:14:34.520
called the AI Transformation Playbook
link |
01:14:37.000
that's online
link |
01:14:37.720
and taught briefly in the AI for Everyone
link |
01:14:39.800
course on Coursera
link |
01:14:41.000
about the long term journey
link |
01:14:42.760
that companies should take
link |
01:14:44.120
but the first step
link |
01:14:45.000
is actually to start small
link |
01:14:46.920
I've seen a lot more companies fail
link |
01:14:48.840
by starting too big
link |
01:14:50.280
than by starting too small
link |
01:14:52.680
take even Google
link |
01:14:54.120
you know most people don't realize
link |
01:14:55.640
how hard it was
link |
01:14:56.920
and how controversial it was
link |
01:14:58.440
in the early days
link |
01:14:59.960
so when I started Google Brain
link |
01:15:02.360
it was controversial
link |
01:15:03.560
you know people thought
link |
01:15:04.680
deep learning, neural nets
link |
01:15:06.280
tried it didn't work
link |
01:15:07.320
why would you want to do deep learning
link |
01:15:09.240
so my first internal customer
link |
01:15:11.560
within Google
link |
01:15:12.360
was the Google speech team
link |
01:15:13.960
which is not the most lucrative
link |
01:15:15.560
project in Google
link |
01:15:17.160
not the most important
link |
01:15:18.280
it's not web search or advertising
link |
01:15:20.600
but by starting small
link |
01:15:22.840
my team helped the speech team
link |
01:15:25.800
build a more accurate speech recognition system
link |
01:15:28.280
and this caused their peers
link |
01:15:30.120
other teams to start
link |
01:15:31.080
to have more faith in deep learning
link |
01:15:32.920
my second internal customer
link |
01:15:34.360
was the Google Maps team
link |
01:15:36.360
where we used computer vision
link |
01:15:37.880
to read house numbers
link |
01:15:39.560
from basically Street View images
link |
01:15:41.000
to more accurately locate houses
link |
01:15:42.600
within Google Maps
link |
01:15:43.560
to improve the quality of geodata
link |
01:15:45.800
and it was only after those two successes
link |
01:15:48.200
that I then started
link |
01:15:49.240
a more serious conversation
link |
01:15:50.440
with the Google Ads team
link |
01:15:52.600
and so there's a ripple effect
link |
01:15:54.120
that you showed that it works
link |
01:15:55.480
in these cases
link |
01:15:56.680
and then it just propagates
link |
01:15:58.120
through the entire company
link |
01:15:59.080
that this thing has a lot of value
link |
01:16:01.400
and use for us
link |
01:16:02.760
I think the early small scale projects
link |
01:16:05.160
it helps the teams gain faith
link |
01:16:07.160
but also helps the teams learn
link |
01:16:09.160
what these technologies do
link |
01:16:11.480
I still remember our first GPU server
link |
01:16:14.360
it was a server under some guy's desk
link |
01:16:16.840
and you know and then that taught us
link |
01:16:19.240
early important lessons about
link |
01:16:21.080
how do you have multiple users
link |
01:16:23.480
share a set of GPUs
link |
01:16:25.000
which was really not obvious at the time
link |
01:16:26.920
but those early lessons were important
link |
01:16:29.240
we learned a lot from that first GPU server
link |
01:16:31.880
that later helped the teams think through
link |
01:16:33.880
how to scale it up
link |
01:16:34.840
to much larger deployments
link |
01:16:37.320
Are there concrete challenges
link |
01:16:38.840
that companies face
link |
01:16:40.120
that you see as important for them to solve?
link |
01:16:43.800
I think building and deploying
link |
01:16:45.080
machine learning systems is hard
link |
01:16:47.080
there's a huge gulf between
link |
01:16:48.760
something that works
link |
01:16:49.560
in a jupyter notebook on your laptop
link |
01:16:51.560
versus something that runs
link |
01:16:52.840
in a production deployment setting
link |
01:16:54.440
in a factory or agriculture plant or whatever
link |
01:16:58.200
so I see a lot of people
link |
01:16:59.720
get something to work on your laptop
link |
01:17:01.000
and say wow look what I've done
link |
01:17:02.120
and that's great that's hard
link |
01:17:03.800
that's a very important first step
link |
01:17:05.640
but a lot of teams underestimate
link |
01:17:07.160
the rest of the steps needed
link |
01:17:09.480
so for example
link |
01:17:10.280
I've heard this exact same conversation
link |
01:17:12.360
between a lot of machine learning people
link |
01:17:13.880
and business people
link |
01:17:15.000
the machine learning person says
link |
01:17:16.920
look my algorithm does well on the test set
link |
01:17:20.760
and it's a clean test set, I didn't peek
link |
01:17:22.440
and the business person says
link |
01:17:24.360
thank you very much
link |
01:17:25.560
but your algorithm sucks it doesn't work
link |
01:17:28.440
and the machine learning person says
link |
01:17:29.960
no wait I did well on the test set
link |
01:17:33.720
and I think there is a gulf between
link |
01:17:36.680
what it takes to do well on the test set
link |
01:17:38.760
on your hard drive
link |
01:17:39.720
versus what it takes to work well
link |
01:17:41.560
in a deployment setting
link |
01:17:43.240
some common problems
link |
01:17:45.560
robustness and generalization
link |
01:17:47.240
you deploy something in the factory
link |
01:17:49.640
maybe they chop down a tree outside the factory
link |
01:17:51.640
so the tree no longer covers the window
link |
01:17:54.280
and the lighting is different
link |
01:17:55.320
so the test set changes
link |
01:17:56.760
and in machine learning
link |
01:17:58.360
and especially in academia
link |
01:18:00.360
we don't know how to deal with test set distributions
link |
01:18:02.840
that are dramatically different
link |
01:18:04.360
than the training set distribution
link |
01:18:06.280
you know there's research
link |
01:18:07.560
stuff like domain adaptation
link |
01:18:10.200
transfer learning
link |
01:18:11.000
you know there are people working on it
link |
01:18:12.680
but we're really not good at this
link |
01:18:14.440
so how do you actually get this to work
link |
01:18:17.000
because your test set distribution
link |
01:18:18.360
is going to change
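The test-set-versus-deployment gap described here can be made concrete with a deliberately tiny, hypothetical example: a one-dimensional "defect score" classifier fit under one lighting condition starts misfiring once a fixed brightness offset, standing in for the felled tree, shifts every deployed reading.

```python
# Train/deployment distribution shift in one dimension: a threshold learned
# on brightness readings under the original lighting misclassifies good
# parts after the lighting changes. All values are made up for illustration.

def fit_threshold(samples):
    """Learn a decision threshold: the midpoint between the class means."""
    ok = [x for x, label in samples if label == "ok"]
    bad = [x for x, label in samples if label == "defective"]
    return (sum(ok) / len(ok) + sum(bad) / len(bad)) / 2

def predict(x, threshold):
    return "defective" if x > threshold else "ok"

# Training data: brightness readings under the original factory lighting.
train = [(0.2, "ok"), (0.25, "ok"), (0.7, "defective"), (0.75, "defective")]
t = fit_threshold(train)
print(predict(0.2, t))  # ok: matches the training distribution

# Deployment: the tree is gone, so every reading shifts brighter by 0.3.
deployed_good_parts = [0.2 + 0.3, 0.25 + 0.3]
print([predict(x, t) for x in deployed_good_parts])  # both wrongly flagged
```

The model's test-set accuracy was perfect; the world it was deployed into simply stopped matching the world it was trained on.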
link |
01:18:19.320
and I think also if you look at the number of lines of code
link |
01:18:23.240
in the software system
link |
01:18:24.760
the machine learning model is maybe five percent
link |
01:18:27.960
or even fewer
link |
01:18:29.800
relative to the entire software system
link |
01:18:31.960
you need to build
link |
01:18:33.000
so how do you get all that work done
link |
01:18:34.760
and make it reliable and systematic
link |
01:18:36.520
so good software engineering work
link |
01:18:38.360
is fundamental here
link |
01:18:40.200
to building a successful machine learning system
link |
01:18:44.120
yes and the software system
link |
01:18:46.040
needs to interface with the machine learning system
link |
01:18:48.360
needs to interface with people's workflows
link |
01:18:50.600
so machine learning is automation on steroids
link |
01:18:53.960
if we take one task out of many tasks
link |
01:18:56.280
that are done in the factory
link |
01:18:57.080
so the factory does lots of things
link |
01:18:58.760
one task is vision inspection
link |
01:19:00.680
if we automate that one task
link |
01:19:02.360
it can be really valuable
link |
01:19:03.800
but you may need to redesign a lot of other tasks
link |
01:19:06.040
around that one task
link |
01:19:07.240
for example say the machine learning algorithm
link |
01:19:09.720
says this is defective
link |
01:19:10.920
what are you supposed to do
link |
01:19:11.720
do you throw it away
link |
01:19:12.520
do you get a human to double check
link |
01:19:14.040
do you want to rework it or fix it
link |
01:19:16.120
so you need to redesign a lot of tasks
link |
01:19:17.960
around that thing you've now automated
link |
01:19:20.040
so planning for the change management
link |
01:19:22.600
and making sure that the software you write
link |
01:19:24.840
is consistent with the new workflow
link |
01:19:26.680
and you take the time to explain to people
link |
01:19:28.200
what needs to happen
link |
01:19:29.000
so I think what landing AI has become good at
link |
01:19:34.360
and I think we learned by making missteps
link |
01:19:36.520
and you know painful experiences
link |
01:19:38.280
what we've become good at is
link |
01:19:41.720
working with our partners to think through
link |
01:19:43.800
all the things beyond just the machine learning model
link |
01:19:46.440
or running the jupyter notebook
link |
01:19:47.560
but to build the entire system
link |
01:19:50.200
manage the change process
link |
01:19:51.720
and figure out how to deploy this in a way
link |
01:19:53.160
that has an actual impact
link |
01:19:55.480
the processes that the large software tech companies
link |
01:19:58.120
use for deploying don't work
link |
01:19:59.880
for a lot of other scenarios
link |
01:20:01.480
for example when I was leading large speech teams
link |
01:20:05.720
if the speech recognition system goes down
link |
01:20:07.800
what happens? well, alarms go off
link |
01:20:09.560
and then someone like me would say hey
link |
01:20:11.400
you 20 engineers
link |
01:20:12.840
please fix this
link |
01:20:16.600
but if you have a system go down in the factory
link |
01:20:19.240
there are not 20 machine learning engineers
link |
01:20:21.320
sitting around that you can page
link |
01:20:22.920
and have them fix it
link |
01:20:23.800
so how do you deal with the maintenance
link |
01:20:26.200
or the DevOps or the MLOps
link |
01:20:28.280
or the other aspects of this
link |
01:20:30.280
so these are concepts that I think landing AI
link |
01:20:33.960
and a few other teams are on the cutting edge of
link |
01:20:36.360
but we don't even have systematic terminology yet
link |
01:20:39.480
to describe some of the stuff we do
link |
01:20:40.920
because I think we're inventing it on the fly.
link |
01:20:44.680
So you mentioned some people are interested
link |
01:20:46.600
in discovering mathematical beauty
link |
01:20:48.360
and truth in the universe
link |
01:20:49.560
and you're interested in having
link |
01:20:51.640
a big positive impact in the world
link |
01:20:54.920
so let me ask the two are not inconsistent
link |
01:20:57.240
no they're all together
link |
01:20:58.760
I'm only half joking
link |
01:21:00.840
because you're probably interested a little bit in both
link |
01:21:03.480
but let me ask a romanticized question
link |
01:21:06.040
so much of the work
link |
01:21:08.040
your work and our discussion today
link |
01:21:09.480
has been on applied AI
link |
01:21:11.960
maybe you can even call narrow AI
link |
01:21:14.440
where the goal is to create systems
link |
01:21:15.720
that automate some specific process
link |
01:21:17.400
that adds a lot of value to the world
link |
01:21:19.640
but there's another branch of AI
link |
01:21:21.240
starting with Alan Turing
link |
01:21:22.760
that kind of dreams of creating human level
link |
01:21:25.560
or superhuman level intelligence
link |
01:21:28.360
is this something you dream of as well
link |
01:21:30.360
do you think we human beings
link |
01:21:32.120
will ever build a human level intelligence
link |
01:21:34.440
or superhuman level intelligence system?
link |
01:21:37.160
I would love to get to AGI
link |
01:21:38.680
and I think humanity will
link |
01:21:40.840
but whether it takes 100 years
link |
01:21:42.600
or 500 or 5000
link |
01:21:45.000
I find hard to estimate
link |
01:21:47.960
do you have
link |
01:21:49.880
some folks have worries
link |
01:21:51.640
about the different trajectories
link |
01:21:53.160
that path would take
link |
01:21:54.360
even existential threats of an AGI system
link |
01:21:57.400
do you have such concerns
link |
01:21:59.560
whether in the short term or the long term?
link |
01:22:02.200
I do worry about the long term fate of humanity
link |
01:22:05.880
I do wonder as well
link |
01:22:08.280
I do worry about overpopulation on the planet Mars
link |
01:22:12.280
just not today
link |
01:22:13.400
I think there will be a day
link |
01:22:15.160
when maybe someday in the future
link |
01:22:17.640
Mars will be polluted
link |
01:22:19.160
there are all these children dying
link |
01:22:20.680
and someone will look back at this video
link |
01:22:22.040
and say Andrew how is Andrew so heartless?
link |
01:22:24.040
He didn't care about all these children
link |
01:22:25.640
dying on the planet Mars
link |
01:22:27.080
and I apologize to the future viewer
link |
01:22:29.400
I do care about the children
link |
01:22:31.000
but I just don't know how to
link |
01:22:32.200
productively work on that today
link |
01:22:33.720
your picture will be in the dictionary
link |
01:22:35.960
for the people who are ignorant
link |
01:22:37.240
about the overpopulation on Mars
link |
01:22:39.800
yes so it's a long term problem
link |
01:22:42.440
is there something in the short term
link |
01:22:43.960
we should be thinking about
link |
01:22:45.720
in terms of aligning the values of our AI systems
link |
01:22:48.520
with the values of us humans
link |
01:22:52.440
sort of something that Stuart Russell
link |
01:22:54.520
and other folks are thinking about
link |
01:22:56.200
as this system develops more and more
link |
01:22:58.600
we want to make sure that it represents
link |
01:23:01.400
the better angels of our nature
link |
01:23:03.720
the ethics the values of our society
link |
01:23:07.800
you know if you take self driving cars
link |
01:23:11.080
the biggest problem with self driving cars
link |
01:23:12.600
is not that there's some trolley dilemma
link |
01:23:16.040
and you teach this so you know
link |
01:23:17.640
how many times when you are driving your car
link |
01:23:20.040
did you face this moral dilemma
link |
01:23:21.800
who do I crash into?
link |
01:23:24.120
so I think self driving cars
link |
01:23:25.320
will run into that problem roughly as often
link |
01:23:27.640
as we do when we drive our cars
link |
01:23:29.240
the biggest problem with self driving cars
link |
01:23:30.920
is when there's a big white truck across the road
link |
01:23:33.160
and what you should do is break
link |
01:23:34.360
and not crash into it
link |
01:23:35.560
and the self driving car fails
link |
01:23:37.560
and it crashes into it
link |
01:23:38.520
so I think we need to solve that problem first
link |
01:23:40.600
I think the problem with some of these discussions
link |
01:23:42.920
about AGI you know alignments
link |
01:23:47.080
the paperclip problem
link |
01:23:49.480
is that it's a huge distraction
link |
01:23:51.720
from the much harder problems
link |
01:23:53.560
that we actually need to address today
link |
01:23:56.120
it's not the hardest problems
link |
01:23:57.640
we need to address today
link |
01:24:01.000
I think bias is a huge issue
link |
01:24:04.120
I worry about wealth inequality
link |
01:24:06.120
the AI and internet are causing
link |
01:24:09.000
an acceleration of concentration of power
link |
01:24:11.240
because we can now centralize data
link |
01:24:13.640
use AI to process it
link |
01:24:14.760
and so industry after industry
link |
01:24:16.280
we've affected every industry
link |
01:24:18.040
so the internet industry has a lot of
link |
01:24:20.040
winner take most
link |
01:24:20.760
or winner take all dynamics
link |
01:24:22.520
but we've infected all these other industries
link |
01:24:24.760
so we're also giving these other industries
link |
01:24:26.600
winner take most or winner take all flavors
link |
01:24:28.600
so look at what Uber and Lyft
link |
01:24:30.920
did to the taxi industry
link |
01:24:32.440
so we're doing this type of thing
link |
01:24:33.560
to a lot of industries and so
link |
01:24:34.920
so we're creating tremendous wealth
link |
01:24:36.360
but how do we make sure that the wealth
link |
01:24:37.720
is fairly shared
link |
01:24:39.800
I think that and then how do we help
link |
01:24:43.080
people whose jobs are displaced
link |
01:24:44.760
you know I think education is part of it
link |
01:24:46.920
there may be even more
link |
01:24:48.360
that we need to do than education
link |
01:24:52.040
I think bias is a serious issue
link |
01:24:54.200
there are adverse uses of AI
link |
01:24:56.520
like deepfakes being used
link |
01:24:57.960
for various nefarious purposes
link |
01:24:59.880
so I worry about some teams
link |
01:25:04.440
maybe accidentally
link |
01:25:05.480
and I hope not deliberately
link |
01:25:07.240
making a lot of noise about things
link |
01:25:09.960
that are problems in the distant future
link |
01:25:12.520
rather than focusing on
link |
01:25:13.880
some of the much harder problems
link |
01:25:15.160
yeah they overshadow the problems
link |
01:25:17.000
that we have already today
link |
01:25:18.120
they're exceptionally challenging
link |
01:25:19.560
like those you said
link |
01:25:20.520
and even the seemingly silly ones
link |
01:25:21.960
but the ones that have a huge impact
link |
01:25:23.560
huge impact
link |
01:25:24.520
which is the lighting variation
link |
01:25:25.960
outside of your factory window
link |
01:25:27.960
that that ultimately is
link |
01:25:30.120
what makes the difference
link |
01:25:31.320
between like you said
link |
01:25:32.120
the Jupyter notebook
link |
01:25:33.160
and something that actually transforms
link |
01:25:35.400
an entire industry potentially
link |
01:25:37.400
yeah and I think
link |
01:25:38.200
and then for some companies
link |
01:25:40.600
when a regulator comes to you
link |
01:25:42.600
and says look your product
link |
01:25:44.200
is messing things up
link |
01:25:45.880
fixing it may have a revenue impact
link |
01:25:47.720
well it's much more fun to talk to them
link |
01:25:49.400
about how you promise
link |
01:25:50.440
not to wipe out humanity
link |
01:25:51.960
than to face the actually really hard problems we face
link |
01:25:55.720
so your life has been a great journey
link |
01:25:57.480
from teaching to research
link |
01:25:58.840
to entrepreneurship
link |
01:26:00.680
two questions
link |
01:26:01.880
one are there regrets
link |
01:26:04.040
moments that if you went back
link |
01:26:05.560
you would do differently
link |
01:26:07.000
and two are there moments
link |
01:26:08.920
you're especially proud of
link |
01:26:10.680
moments that made you truly happy
link |
01:26:13.160
you know I've made so many mistakes
link |
01:26:17.080
it feels like every time
link |
01:26:18.440
I discover something
link |
01:26:19.720
I go why didn't I think of this
link |
01:26:23.080
you know five years earlier
link |
01:26:24.520
or even 10 years earlier
link |
01:26:27.240
and even recently
link |
01:26:29.480
and then sometimes I read a book
link |
01:26:30.920
and I go I wish I read this book 10 years ago
link |
01:26:33.800
my life would have been so different
link |
01:26:35.480
although that happened recently
link |
01:26:36.600
and then I was thinking
link |
01:26:37.800
if only I read this book
link |
01:26:39.240
when we were starting up Coursera
link |
01:26:40.520
I could have been so much better
link |
01:26:42.760
but I discovered the book
link |
01:26:43.640
had not yet been written
link |
01:26:44.600
when we were starting Coursera
link |
01:26:45.560
so that made me feel better
link |
01:26:49.400
but I find that the process of discovery
link |
01:26:53.080
we keep on finding out things
link |
01:26:54.440
that seem so obvious in hindsight
link |
01:26:57.480
but it always takes us so much longer
link |
01:26:59.320
than I wish to figure it out
link |
01:27:03.400
so on the second question
link |
01:27:06.280
are there moments in your life
link |
01:27:08.040
that if you look back
link |
01:27:09.960
that you're especially proud of
link |
01:27:12.440
or you're especially happy
link |
01:27:13.800
what would be the moments that filled you with happiness
link |
01:27:17.480
and fulfillment
link |
01:27:18.440
well two answers
link |
01:27:20.280
one is my daughter Nova
link |
01:27:21.800
yes of course
link |
01:27:22.680
no matter how much time I spend with her
link |
01:27:24.280
I just can't spend enough time with her
link |
01:27:25.720
congratulations by the way
link |
01:27:26.840
thank you
link |
01:27:27.800
and then second is helping other people
link |
01:27:29.880
I think to me
link |
01:27:30.920
I think the meaning of life
link |
01:27:32.520
is helping others achieve
link |
01:27:35.160
whatever are their dreams
link |
01:27:37.160
and then also to try to move the world forward
link |
01:27:40.440
making humanity more powerful as a whole
link |
01:27:43.880
so the times that I felt most happy
link |
01:27:46.040
most proud was when I felt
link |
01:27:49.000
someone else allowed me the good fortune
link |
01:27:52.600
of helping them a little bit
link |
01:27:54.440
on the path to their dreams
link |
01:27:57.160
I think there's no better way to end it
link |
01:27:58.840
than talking about happiness
link |
01:28:00.120
and the meaning of life
link |
01:28:01.080
so Andrew it's a huge honor
link |
01:28:03.240
to me and millions of people
link |
01:28:04.360
thank you for all the work you've done
link |
01:28:05.960
thank you for talking today
link |
01:28:07.160
thank you so much thanks
link |
01:28:07.960
thanks for listening to this conversation with Andrew Ng
link |
01:28:10.760
and thank you to our presenting sponsor Cash App
link |
01:28:13.720
download it use code LEXPODCAST
link |
01:28:16.440
you'll get ten dollars
link |
01:28:17.720
and ten dollars will go to FIRST
link |
01:28:19.320
an organization that inspires and educates young minds
link |
01:28:22.360
to become science and technology innovators of tomorrow
link |
01:28:25.720
if you enjoy this podcast
link |
01:28:27.160
subscribe on YouTube
link |
01:28:28.600
give it five stars on Apple podcast
link |
01:28:30.680
support it on Patreon
link |
01:28:32.040
or simply connect with me on Twitter
link |
01:28:34.040
at Lex Fridman
link |
01:28:35.240
and now let me leave you with some words of wisdom from Andrew Ng
link |
01:28:39.320
ask yourself
link |
01:28:40.840
if what you're working on succeeds beyond your wildest dreams
link |
01:28:44.360
would you have significantly helped other people?
link |
01:28:47.880
if not then keep searching for something else to work on
link |
01:28:51.160
otherwise you're not living up to your full potential
link |
01:28:54.440
thank you for listening and hope to see you next time