
Rajat Monga: TensorFlow | Lex Fridman Podcast #22



link |
00:00:00.000
The following is a conversation with Rajat Monga.
link |
00:00:03.080
He's an engineer and director of Google,
link |
00:00:04.920
leading the TensorFlow team.
link |
00:00:06.960
TensorFlow is an open source library
link |
00:00:09.160
at the center of much of the work going on in the world
link |
00:00:11.540
in deep learning, both the cutting edge research
link |
00:00:14.040
and the large scale application of learning based approaches.
link |
00:00:17.720
But it's quickly becoming much more than a software library.
link |
00:00:20.940
It's now an ecosystem of tools for the deployment of machine
link |
00:00:24.120
learning in the cloud, on the phone, in the browser,
link |
00:00:26.800
on both generic and specialized hardware.
link |
00:00:29.840
TPU, GPU, and so on.
link |
00:00:31.960
Plus, there's a big emphasis on growing a passionate community
link |
00:00:35.220
of developers.
link |
00:00:36.640
Rajat, Jeff Dean, and a large team of engineers at Google
link |
00:00:39.820
Brain are working to define the future of machine
link |
00:00:42.200
learning with TensorFlow 2.0, which is now in alpha.
link |
00:00:46.240
I think the decision to open source TensorFlow
link |
00:00:49.160
is a definitive moment in the tech industry.
link |
00:00:51.760
It showed that open innovation can be successful
link |
00:00:54.400
and inspire many companies to open source their code,
link |
00:00:56.920
to publish, and in general engage
link |
00:00:58.880
in the open exchange of ideas.
link |
00:01:01.240
This conversation is part of the Artificial Intelligence
link |
00:01:03.940
podcast.
link |
00:01:05.080
If you enjoy it, subscribe on YouTube, iTunes,
link |
00:01:07.860
or simply connect with me on Twitter at Lex Fridman,
link |
00:01:10.880
spelled F R I D.
link |
00:01:12.720
And now, here's my conversation with Rajat Monga.
link |
00:01:17.960
You were involved with Google Brain since its start in 2011
link |
00:01:22.520
with Jeff Dean.
link |
00:01:24.880
It started with DistBelief, the proprietary machine learning
link |
00:01:29.220
library, and turned into TensorFlow in 2014,
link |
00:01:32.800
the open source library.
link |
00:01:35.720
So what were the early days of Google Brain like?
link |
00:01:39.120
What were the goals, the missions?
link |
00:01:41.840
How do you even proceed forward once there's
link |
00:01:45.120
so many possibilities before you?
link |
00:01:47.760
It was interesting back then when I started,
link |
00:01:50.560
or even when we were just talking about it,
link |
00:01:55.400
the idea of deep learning was interesting and intriguing
link |
00:01:59.520
in some ways.
link |
00:02:00.480
It hadn't yet taken off, but it held some promise.
link |
00:02:04.920
It had shown some very promising and early results.
link |
00:02:08.740
I think the idea where Andrew and Jeff had started
link |
00:02:11.400
was, what if we can take this work people are doing
link |
00:02:15.440
in research and scale it to what Google has
link |
00:02:18.800
in terms of the compute power, and also
link |
00:02:23.000
put that kind of data together?
link |
00:02:24.320
What does it mean?
link |
00:02:25.320
And so far, the results had been, if you scale the compute,
link |
00:02:28.300
scale the data, it does better.
link |
00:02:30.200
And would that work?
link |
00:02:31.520
And so that was the first year or two, can we prove that out?
link |
00:02:35.140
And with DistBelief, when we started the first year,
link |
00:02:37.480
we got some early wins, which is always great.
link |
00:02:40.800
What were the wins like?
link |
00:02:41.960
What were the wins where you were like,
link |
00:02:44.160
there's something to this, this is going to be good?
link |
00:02:46.640
I think there are two early wins where one was speech,
link |
00:02:49.680
that we collaborated very closely with the speech research
link |
00:02:52.280
team, who was also getting interested in this.
link |
00:02:54.820
And the other one was on images, where the cat paper,
link |
00:02:58.800
as we call it, that was covered by a lot of folks.
link |
00:03:03.160
And the birth of Google Brain was around neural networks.
link |
00:03:07.440
So it was deep learning from the very beginning.
link |
00:03:09.320
That was the whole mission.
link |
00:03:10.800
So what would, in terms of scale,
link |
00:03:15.040
what was the sort of dream of what this could become?
link |
00:03:21.080
Were there echoes of this open source TensorFlow community
link |
00:03:24.280
that might be brought in?
link |
00:03:26.240
Was there a sense of TPUs?
link |
00:03:28.640
Was there a sense of machine learning is now going to be
link |
00:03:31.760
at the core of the entire company,
link |
00:03:33.720
is going to grow into that direction?
link |
00:03:36.040
Yeah, I think, so that was interesting.
link |
00:03:38.320
And if I think back to 2012 or 2011,
link |
00:03:41.380
and first was, can we scale it? Within a year or so,
link |
00:03:45.240
we had started scaling it to hundreds and thousands
link |
00:03:47.520
of machines.
link |
00:03:48.360
In fact, we had some runs even going to 10,000 machines.
link |
00:03:51.080
And all of those showed great promise.
link |
00:03:53.880
In terms of machine learning at Google,
link |
00:03:56.800
the good thing was Google's been doing machine learning
link |
00:03:58.780
for a long time.
link |
00:04:00.240
Deep learning was new, but as we scaled this up,
link |
00:04:03.760
we showed that, yes, that was possible.
link |
00:04:05.600
And it was going to impact lots of things.
link |
00:04:07.840
Like we started seeing real products wanting to use this.
link |
00:04:11.200
Again, speech was the first, there were image things
link |
00:04:13.800
that photos came out of and then many other products as well.
link |
00:04:17.400
So that was exciting.
link |
00:04:20.180
As we went into that a couple of years,
link |
00:04:23.160
externally also academia started to,
link |
00:04:25.800
there was lots of push on, okay,
link |
00:04:27.200
deep learning is interesting,
link |
00:04:28.320
we should be doing more and so on.
link |
00:04:30.600
And so by 2014, we were looking at, okay,
link |
00:04:34.580
this is a big thing, it's going to grow.
link |
00:04:36.780
And not just internally, externally as well.
link |
00:04:39.440
Yes, maybe Google's ahead of where everybody is,
link |
00:04:42.280
but there's a lot to do.
link |
00:04:43.640
So a lot of this started to make sense and come together.
link |
00:04:46.720
So the decision to open source,
link |
00:04:49.560
I was just chatting with Chris Lattner about this.
link |
00:04:52.200
The decision to go open source with TensorFlow,
link |
00:04:54.640
I would say sort of for me personally,
link |
00:04:57.080
seems to be one of the big seminal moments
link |
00:04:59.640
in all of software engineering ever.
link |
00:05:01.720
I think that's when a large company like Google
link |
00:05:04.620
decides to take a large project that many lawyers
link |
00:05:07.520
might argue has a lot of IP,
link |
00:05:10.800
just decide to go open source with it,
link |
00:05:12.900
and in so doing lead the entire world
link |
00:05:14.880
in saying, you know what, open innovation
link |
00:05:16.520
is a pretty powerful thing, and it's okay to do.
link |
00:05:22.360
That was, I mean, that's an incredible moment in time.
link |
00:05:26.320
So do you remember those discussions happening?
link |
00:05:29.320
Whether open source should be happening?
link |
00:05:31.400
What was that like?
link |
00:05:32.680
I would say, I think, so the initial idea came from Jeff,
link |
00:05:36.880
who was a big proponent of this.
link |
00:05:39.440
I think it came off of two big things.
link |
00:05:42.480
One was research wise, we were a research group.
link |
00:05:46.320
We were putting all our research out there.
link |
00:05:49.640
We were building on others' research
link |
00:05:51.720
and we wanted to push the state of the art forward.
link |
00:05:55.000
And part of that was to share the research.
link |
00:05:56.840
That's how I think deep learning and machine learning
link |
00:05:58.960
has really grown so fast.
link |
00:06:01.380
So the next step was, okay, now,
link |
00:06:03.360
would software help with that?
link |
00:06:05.360
And it seemed like there were existing
link |
00:06:08.440
a few libraries out there, Theano being one,
link |
00:06:11.280
Torch being another, and a few others,
link |
00:06:14.000
but they were all done by academia
link |
00:06:15.480
and so the level was significantly different.
link |
00:06:18.960
The other one was from a software perspective,
link |
00:06:22.000
Google had done lots of software
link |
00:06:23.880
that we used internally, you know,
link |
00:06:27.080
and we published papers.
link |
00:06:29.080
Often there was an open source project
link |
00:06:31.680
that came out of that, where somebody else
link |
00:06:33.600
picked up that paper and implemented it,
link |
00:06:35.400
and they were very successful.
link |
00:06:38.240
Back then it was like, okay, there's Hadoop,
link |
00:06:41.440
which has come off of tech that we've built.
link |
00:06:44.140
We know the tech we've built is way better
link |
00:06:46.200
for a number of different reasons.
link |
00:06:47.880
We've invested a lot of effort in that.
link |
00:06:51.660
And turns out we have Google Cloud
link |
00:06:54.320
and we are now not really providing our tech,
link |
00:06:57.520
but we are saying, okay, we have Bigtable,
link |
00:07:00.360
which is the original thing.
link |
00:07:02.040
We are going to now provide HBase APIs
link |
00:07:03.880
on top of that, which isn't as good,
link |
00:07:06.040
but that's what everybody's used to.
link |
00:07:07.480
So there's like, can we make something
link |
00:07:10.040
that is better and really just provide something that
link |
00:07:12.320
helps the community in lots of ways,
link |
00:07:14.320
but also helps push a good standard forward.
link |
00:07:18.320
So how does Cloud fit into that?
link |
00:07:19.940
There's a TensorFlow open source library
link |
00:07:22.680
and how does the fact that you can
link |
00:07:25.800
use so many of the resources that Google provides
link |
00:07:28.240
and the Cloud fit into that strategy?
link |
00:07:31.100
So TensorFlow itself is open
link |
00:07:33.600
and you can use it anywhere, right?
link |
00:07:34.920
And we want to make sure that continues to be the case.
link |
00:07:38.360
On Google Cloud, we do make sure
link |
00:07:41.040
that there's lots of integrations with everything else
link |
00:07:43.840
and we want to make sure
link |
00:07:44.880
that it works really, really well there.
link |
00:07:47.320
You're leading the TensorFlow effort.
link |
00:07:50.400
Can you tell me the history
link |
00:07:51.280
and the timeline of TensorFlow project
link |
00:07:53.600
in terms of major design decisions,
link |
00:07:55.880
so like the open source decision,
link |
00:07:58.160
but really what to include and not?
link |
00:08:01.600
There's this incredible ecosystem
link |
00:08:03.200
that I'd like to talk about.
link |
00:08:04.760
There's all these parts,
link |
00:08:05.720
but what are just some sample moments
link |
00:08:11.240
that defined what TensorFlow eventually became
link |
00:08:15.040
through its, I don't know if you're allowed to say history
link |
00:08:17.640
when it's so young, but in deep learning,
link |
00:08:20.240
everything moves so fast
link |
00:08:21.280
and just a few years is already history.
link |
00:08:23.460
Yes, yes, so looking back, we were building TensorFlow.
link |
00:08:29.780
I guess we open sourced it in 2015, November 2015.
link |
00:08:34.240
We started on it in summer of 2014, I guess.
link |
00:08:39.780
And somewhere three to six months in, late 2014,
link |
00:08:42.960
by then we had decided that, okay,
link |
00:08:45.120
there's a high likelihood we'll open source it.
link |
00:08:47.080
So we started thinking about that
link |
00:08:48.880
and making sure we're heading down that path.
link |
00:08:53.960
At that point, by that point,
link |
00:08:56.080
we had seen a few, lots of different use cases at Google.
link |
00:08:59.320
So there were things like, okay,
link |
00:09:01.000
yes, you wanna run it at large scale in the data center.
link |
00:09:04.200
Yes, we need to support different kind of hardware.
link |
00:09:07.560
We had GPUs at that point.
link |
00:09:09.440
We had our first TPU at that point,
link |
00:09:11.880
or it was about to come out roughly around that time.
link |
00:09:15.700
So the design sort of included those.
link |
00:09:18.700
We had started to push on mobile.
link |
00:09:21.800
So we were running models on mobile.
link |
00:09:24.920
At that point, people were customizing code.
link |
00:09:28.160
So we wanted to make sure TensorFlow
link |
00:09:29.560
could support that as well.
link |
00:09:30.700
So that sort of became part of that overall design.
link |
00:09:35.260
When you say mobile,
link |
00:09:36.560
you mean like pretty complicated algorithms
link |
00:09:38.680
running on the phone?
link |
00:09:40.040
That's correct.
link |
00:09:40.880
So when you have a model that you deploy on the phone
link |
00:09:44.320
and run it there, right?
link |
00:09:45.160
So already at that time,
link |
00:09:46.420
there was ideas of running machine learning on the phone.
link |
00:09:48.800
That's correct.
link |
00:09:49.640
We already had a couple of products
link |
00:09:51.400
that were doing that by then.
link |
00:09:53.260
And in those cases,
link |
00:09:54.500
we had basically customized handcrafted code
link |
00:09:57.540
or some internal libraries that we were using.
link |
00:10:00.160
So I was actually at Google during this time
link |
00:10:02.600
in a parallel, I guess, universe,
link |
00:10:04.560
but we were using Theano and Caffe.
link |
00:10:09.240
Was there some degree to which you were bouncing,
link |
00:10:11.600
like trying to see what Caffe was offering people,
link |
00:10:15.520
trying to see what Theano was offering
link |
00:10:17.960
that you want to make sure you're delivering
link |
00:10:19.960
on whatever that is?
link |
00:10:21.640
Perhaps the Python part of thing,
link |
00:10:23.720
maybe did that influence any design decisions?
link |
00:10:27.520
Totally.
link |
00:10:28.360
So when we built DistBelief
link |
00:10:29.600
and some of that was in parallel
link |
00:10:31.600
with some of these libraries coming up,
link |
00:10:33.400
I mean, Theano itself is older,
link |
00:10:36.680
but we were building DistBelief
link |
00:10:39.880
focused on our internal thing
link |
00:10:41.160
because our systems were very different.
link |
00:10:42.960
By the time we got to this,
link |
00:10:44.080
we looked at a number of libraries that were out there.
link |
00:10:47.120
Theano, there were folks in the group
link |
00:10:49.280
who had experience with Torch, with Lua.
link |
00:10:52.140
There were folks here who had seen Caffe.
link |
00:10:54.800
I mean, actually, Yangqing Jia was here as well.
link |
00:10:58.840
What other libraries were there?
link |
00:11:02.980
I think we looked at a number of things.
link |
00:11:04.920
Might even have looked at Chainer back then.
link |
00:11:06.840
I'm trying to remember if it was there.
link |
00:11:09.400
In fact, yeah, we did discuss ideas around,
link |
00:11:12.040
okay, should we have a graph or not?
link |
00:11:17.840
So putting all these together was definitely,
link |
00:11:20.480
there were key decisions that we wanted to make.
link |
00:11:22.800
We had seen limitations in our prior DistBelief system.
link |
00:11:28.800
A few of them were just in terms of research
link |
00:11:31.360
was moving so fast, we wanted the flexibility.
link |
00:11:35.040
The hardware was changing fast.
link |
00:11:36.360
We expected that to change,
link |
00:11:37.760
so those probably were two things.
link |
00:11:39.900
And yeah, I think the flexibility
link |
00:11:43.140
in terms of being able to express
link |
00:11:44.380
all kinds of crazy things was definitely a big one then.
link |
00:11:46.980
So, the graph decisions though,
link |
00:11:49.020
moving towards TensorFlow 2.0,
link |
00:11:52.460
there's more, by default, there'll be eager execution.
link |
00:11:56.800
So sort of hiding the graph a little bit
link |
00:11:59.260
because it's less intuitive
link |
00:12:00.660
in terms of the way people develop and so on.
link |
00:12:03.660
What was that discussion like in terms of using graphs?
link |
00:12:06.800
It seemed, it's kind of the Theano way.
link |
00:12:09.420
Did it seem the obvious choice?
link |
00:12:11.660
So I think where it came from was that DistBelief
link |
00:12:15.780
had a graph-like thing as well.
link |
00:12:17.700
Much simpler, it wasn't a general graph,
link |
00:12:19.780
it was more like a straight line thing.
link |
00:12:23.220
More like what you might think of Caffe,
link |
00:12:25.060
I guess in that sense.
link |
00:12:26.440
But the graph was,
link |
00:12:28.900
and we always cared about the production stuff.
link |
00:12:31.180
Like even with DistBelief,
link |
00:12:32.020
we were deploying a whole bunch of stuff in production.
link |
00:12:34.500
So graph did come from that when we thought of,
link |
00:12:37.460
okay, should we do that in Python?
link |
00:12:39.420
And we experimented with some ideas
link |
00:12:40.900
where it looked a lot simpler to use,
link |
00:12:44.740
but not having a graph meant,
link |
00:12:46.780
okay, how do you deploy now?
link |
00:12:47.980
So that was probably what tilted the balance for us
link |
00:12:51.180
and eventually we ended up with a graph.
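
For illustration, a minimal sketch of the graph-then-run style being discussed, as it looked in TensorFlow 1.x; the shapes and values are placeholders, not anything from the conversation:

```python
import tensorflow as tf  # assumes a TensorFlow 1.x installation

# Building the graph: nothing executes yet, each call only adds nodes.
x = tf.placeholder(tf.float32, shape=[None, 3], name="x")
w = tf.Variable(tf.random_normal([3, 1]), name="w")
y = tf.matmul(x, w)

# Running the graph: execution happens only inside a Session, and the
# same serialized graph is what made deployment to production tractable.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))
```
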
link |
00:12:52.940
And I guess the question there is, did you,
link |
00:12:55.400
I mean, so production seems to be
link |
00:12:57.420
the really good thing to focus on,
link |
00:12:59.900
but did you even anticipate the other side of it
link |
00:13:02.500
where there could be, what is it?
link |
00:13:04.620
What are the numbers?
link |
00:13:05.460
It's been crazy, 41 million downloads.
link |
00:13:08.980
Yep.
link |
00:13:12.780
I mean, was that even like a possibility in your mind
link |
00:13:16.300
that it would be as popular as it became?
link |
00:13:19.220
So I think we did see a need for this
link |
00:13:24.480
a lot from the research perspective
link |
00:13:27.600
and like early days of deep learning in some ways.
link |
00:13:32.340
41 million, no, I don't think I imagined this number.
link |
00:13:35.140
Then it seemed like there's a potential future
link |
00:13:41.700
where lots more people would be doing this
link |
00:13:43.780
and how do we enable that?
link |
00:13:45.700
I would say this kind of growth,
link |
00:13:49.100
I probably started seeing somewhat after the open sourcing
link |
00:13:52.660
where it was like, okay,
link |
00:13:55.300
deep learning is actually growing way faster
link |
00:13:57.880
for a lot of different reasons.
link |
00:13:59.240
And we are in just the right place to push on that
link |
00:14:02.740
and leverage that and deliver on lots of things
link |
00:14:06.100
that people want.
link |
00:14:07.500
So what changed once you open sourced?
link |
00:14:09.780
Like how this incredible amount of attention
link |
00:14:13.380
from a global population of developers,
link |
00:14:16.540
how did the project start changing?
link |
00:14:18.260
I don't even actually remember during those times.
link |
00:14:22.220
I know looking now, there's really good documentation,
link |
00:14:24.620
there's an ecosystem of tools,
link |
00:14:26.620
there's a community, there's a blog,
link |
00:14:27.980
there's a YouTube channel now, right?
link |
00:14:29.820
Yeah.
link |
00:14:31.180
It's very community driven.
link |
00:14:33.860
Back then, I guess 0.1 version,
link |
00:14:38.700
is that the version?
link |
00:14:39.860
I think we called it 0.6 or 0.5,
link |
00:14:42.180
something like that, I forget.
link |
00:14:43.740
What changed leading into 1.0?
link |
00:14:47.180
It's interesting.
link |
00:14:48.500
I think we've gone through a few things there.
link |
00:14:51.660
When we started out, when we first came out,
link |
00:14:53.720
people loved the documentation we had
link |
00:14:56.100
because it was just a huge step up from everything else
link |
00:14:58.860
because all of those were academic projects,
link |
00:15:00.440
people who don't think about documentation.
link |
00:15:04.580
I think what that changed was,
link |
00:15:06.960
instead of deep learning being a research thing,
link |
00:15:10.380
some people who were just developers
link |
00:15:12.580
could now suddenly take this out
link |
00:15:14.660
and do some interesting things with it, right?
link |
00:15:16.940
Who had no clue what machine learning was before then.
link |
00:15:20.300
And that I think really changed
link |
00:15:22.580
how things started to scale up in some ways
link |
00:15:24.760
and pushed on it.
link |
00:15:27.900
Over the next few months as we looked at
link |
00:15:30.420
how do we stabilize things,
link |
00:15:31.980
as we look at not just researchers,
link |
00:15:33.900
now we want stability, people want to deploy things.
link |
00:15:36.520
That's how we started planning for 1.0
link |
00:15:38.980
and there are certain needs from that perspective.
link |
00:15:42.180
And so again, documentation comes up,
link |
00:15:45.380
designs, more kinds of things to put that together.
link |
00:15:49.380
And so that was exciting to get that to a stage
link |
00:15:52.240
where more and more enterprises wanted to buy in
link |
00:15:55.420
and really get behind that.
link |
00:15:57.740
And I think post 1.0 and over the next few releases,
link |
00:16:01.800
that enterprise adoption also started to take off.
link |
00:16:04.400
I would say between the initial release and 1.0,
link |
00:16:07.160
it was, okay, researchers of course,
link |
00:16:10.240
then a lot of hobbyists and early interest,
link |
00:16:12.960
people excited about this who started to get on board
link |
00:16:15.160
and then over the 1.x thing, lots of enterprises.
link |
00:16:18.200
I imagine anything that's below 1.0
link |
00:16:23.200
gives some pause, because
link |
00:16:25.160
the enterprise probably wants something that's stable.
link |
00:16:28.040
Exactly.
link |
00:16:28.880
And do you have a sense now that TensorFlow is stable?
link |
00:16:33.320
Like it feels like deep learning in general
link |
00:16:35.560
is extremely dynamic field, so much is changing.
link |
00:16:40.440
And TensorFlow has been growing incredibly.
link |
00:16:43.420
Do you have a sense of stability at the helm of it?
link |
00:16:46.760
I mean, I know you're in the midst of it, but.
link |
00:16:48.400
Yeah, I think in the midst of it,
link |
00:16:51.680
it's often easy to forget what an enterprise wants
link |
00:16:55.120
and what some of the people on that side want.
link |
00:16:58.800
There are still people running models
link |
00:17:00.420
that are three years old, four years old.
link |
00:17:02.680
So Inception is still used by tons of people.
link |
00:17:06.040
Even ResNet 50 is what, couple of years old now or more,
link |
00:17:08.960
but there are tons of people who use that and they're fine.
link |
00:17:12.240
They don't need the last couple of bits of performance
link |
00:17:15.320
or quality, they want some stability
link |
00:17:17.720
in things that just work.
link |
00:17:19.640
And so there is value in providing that
link |
00:17:22.240
with that kind of stability and making it really simpler
link |
00:17:25.200
because that allows a lot more people to access it.
link |
00:17:27.800
And then there's the research crowd which wants,
link |
00:17:31.200
okay, they wanna do these crazy things
link |
00:17:33.080
exactly like you're saying, right?
link |
00:17:34.280
Not just deep learning in the straight up models
link |
00:17:37.080
that used to be there, they want RNNs
link |
00:17:40.640
and even RNNs are maybe old, there are transformers now.
link |
00:17:43.480
And now it needs to combine with RL and GANs and so on.
link |
00:17:48.440
So there's definitely that area that like the boundary
link |
00:17:52.000
that's shifting and pushing the state of the art.
link |
00:17:55.200
But I think there's more and more of the past
link |
00:17:57.200
that's much more stable and even stuff
link |
00:18:01.440
that was two, three years old is very, very usable
link |
00:18:03.880
by lots of people.
link |
00:18:04.960
So that part makes it a lot easier.
link |
00:18:07.440
So I imagine, maybe you can correct me if I'm wrong,
link |
00:18:09.840
one of the biggest use cases is essentially
link |
00:18:12.440
taking something like ResNet 50
link |
00:18:14.440
and doing some kind of transfer learning
link |
00:18:17.280
on a very particular problem that you have.
link |
00:18:19.600
It's basically probably what the majority of the world does.
link |
00:18:24.520
And you wanna make that as easy as possible.
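
For illustration, a minimal sketch of that transfer-learning workflow using the Keras API in TensorFlow 2.x; the five-class head and the training-data names are hypothetical placeholders:

```python
import tensorflow as tf

# Load ResNet 50 pretrained on ImageNet, without its classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg")
base.trainable = False  # freeze the pretrained features

# Add a small head for your particular problem (5 classes is a placeholder).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels, epochs=3)  # your own data here
```
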
link |
00:18:27.360
So I would say for the hobbyist perspective,
link |
00:18:30.440
that's the most common case, right?
link |
00:18:32.800
In fact, the apps and phones and stuff that you'll see,
link |
00:18:35.400
the early ones, that's the most common case.
link |
00:18:37.720
I would say there are a couple of reasons for that.
link |
00:18:40.360
One is that everybody talks about that.
link |
00:18:44.440
It looks great on slides.
link |
00:18:46.160
That's a presentation, yeah, exactly.
link |
00:18:49.960
What enterprises want, that is part of it,
link |
00:18:53.080
but that's not the big thing.
link |
00:18:54.360
Enterprises really have data
link |
00:18:56.080
that they wanna make predictions on.
link |
00:18:58.000
Often what they used to do
link |
00:19:00.320
with the people who were doing ML
link |
00:19:01.760
was just regression models,
link |
00:19:03.560
linear regression, logistic regression, linear models,
link |
00:19:06.360
or maybe gradient boosted trees and so on.
link |
00:19:09.760
Some of them still benefit from deep learning,
link |
00:19:11.680
but that's the bread and butter,
link |
00:19:14.400
or like the structured data and so on.
link |
00:19:16.360
So depending on the audience you look at,
link |
00:19:18.280
they're a little bit different.
link |
00:19:19.600
And they just have, I mean, the best case enterprise
link |
00:19:23.440
probably just has a very large data set,
link |
00:19:26.520
where deep learning can probably shine.
link |
00:19:28.720
That's correct, that's right.
link |
00:19:30.320
And then I think the other pieces that they wanted,
link |
00:19:33.320
again, with 2.0 and the developer summit we put together,
link |
00:19:36.480
is the whole TensorFlow Extended piece,
link |
00:19:39.080
which is the entire pipeline.
link |
00:19:40.680
They care about stability across doing their entire thing.
link |
00:19:43.640
They want simplicity across the entire thing.
link |
00:19:46.320
I don't need to just train a model.
link |
00:19:47.760
I need to do that every day again, over and over again.
link |
00:19:51.360
I wonder to which degree you have a role in,
link |
00:19:54.360
I don't know, so I teach a course on deep learning.
link |
00:19:56.720
I have people like lawyers come up to me and say,
link |
00:20:01.400
when is machine learning gonna enter legal,
link |
00:20:04.240
the legal realm?
link |
00:20:05.640
The same thing in all kinds of disciplines,
link |
00:20:09.520
immigration, insurance, often when I see
link |
00:20:14.720
what it boils down to is these companies
link |
00:20:17.440
are often a little bit old school
link |
00:20:19.480
in the way they organize the data.
link |
00:20:20.880
So the data is just not ready yet, it's not digitized.
link |
00:20:24.040
Do you also find yourself being in the role
link |
00:20:26.000
of an evangelist for like, let's get,
link |
00:20:31.520
organize your data, folks, and then you'll get
link |
00:20:33.760
the big benefit of TensorFlow.
link |
00:20:35.480
Do you get those, have those conversations?
link |
00:20:38.040
Yeah, yeah, you know, I get all kinds of questions there
link |
00:20:41.480
from, okay, what do I need to make this work, right?
link |
00:20:49.080
Do we really need deep learning?
link |
00:20:50.840
I mean, there are all these things,
link |
00:20:52.120
I already use this linear model, why would this help?
link |
00:20:55.200
I don't have enough data, let's say,
link |
00:20:57.200
or I wanna use machine learning,
link |
00:21:00.000
but I have no clue where to start.
link |
00:21:01.800
So it varies, from that all the way to the experts
link |
00:21:04.960
asking why we support very specific things, it's interesting.
link |
00:21:08.600
Is there a good answer?
link |
00:21:09.920
It boils down to oftentimes digitizing data.
link |
00:21:12.520
So whatever you want automated,
link |
00:21:14.480
whatever data you want to make prediction based on,
link |
00:21:17.560
you have to make sure that it's in an organized form.
link |
00:21:21.280
Like within the TensorFlow ecosystem,
link |
00:21:24.000
there's now, you're providing more and more data sets
link |
00:21:26.560
and more and more pre-trained models.
link |
00:21:28.960
Are you finding yourself also the organizer of data sets?
link |
00:21:32.440
Yes, I think the TensorFlow data sets
link |
00:21:34.520
that we just released, that's definitely come up
link |
00:21:37.560
where people want these data sets,
link |
00:21:39.240
can we organize them and can we make that easier?
link |
00:21:41.760
So that's definitely one important thing.
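
For illustration, a minimal sketch of loading data through the tensorflow_datasets package being described; "mnist" is just one example name from its catalog:

```python
import tensorflow_datasets as tfds

# Download, prepare, and load a ready-made dataset as a tf.data pipeline.
ds = tfds.load("mnist", split="train", as_supervised=True)
for image, label in ds.take(1):
    print(image.shape, label.numpy())
```
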
link |
00:21:45.320
The other related thing I would say is I often tell people,
link |
00:21:47.680
you know what, don't think of the fanciest thing
link |
00:21:51.000
or the newest model that you see,
link |
00:21:53.320
make something very basic work and then you can improve it.
link |
00:21:56.400
There's just lots of things you can do with it.
link |
00:21:58.920
Yeah, start with the basics, true.
link |
00:22:00.640
One of the big things that makes TensorFlow
link |
00:22:03.280
even more accessible was the appearance
link |
00:22:06.120
whenever that happened of Keras,
link |
00:22:08.360
the Keras standard sort of outside of TensorFlow.
link |
00:22:12.400
I think it was Keras on top of Theano at first only
link |
00:22:18.240
and then Keras became on top of TensorFlow.
link |
00:22:22.520
Do you know when Keras chose to also add TensorFlow
link |
00:22:28.760
as a backend, who was the,
link |
00:22:31.200
was it just the community that drove that initially?
link |
00:22:34.000
Do you know if there was discussions, conversations?
link |
00:22:37.040
Yeah, so Francois started the Keras project
link |
00:22:41.000
before he was at Google and the first thing was Theano.
link |
00:22:44.600
I don't remember if that was
link |
00:22:46.560
after TensorFlow was created or way before.
link |
00:22:49.680
And then at some point,
link |
00:22:51.440
when TensorFlow started becoming popular,
link |
00:22:53.040
there were enough similarities
link |
00:22:54.200
that he decided to create this interface
link |
00:22:56.360
and put TensorFlow as a backend.
link |
00:22:58.200
I believe that might still have been
link |
00:23:00.760
before he joined Google.
link |
00:23:03.320
So we weren't really talking about that.
link |
00:23:06.720
He decided on his own and thought that was interesting
link |
00:23:09.720
and relevant to the community.
link |
00:23:12.800
In fact, I didn't find out about him being at Google
link |
00:23:17.120
until a few months after he was here.
link |
00:23:19.680
He was working on some research ideas
link |
00:23:21.880
and doing Keras on his nights and weekends project.
link |
00:23:24.480
Oh, interesting.
link |
00:23:25.320
He wasn't like part of the TensorFlow.
link |
00:23:28.520
He didn't join initially.
link |
00:23:29.720
He joined research and he was doing some amazing research.
link |
00:23:32.280
He has some papers on that and research,
link |
00:23:34.360
so he's a great researcher as well.
link |
00:23:38.400
And at some point we realized,
link |
00:23:40.400
oh, he's doing this good stuff.
link |
00:23:42.440
People seem to like the API and he's right here.
link |
00:23:45.400
So we talked to him and he said,
link |
00:23:47.760
okay, why don't I come over to your team
link |
00:23:50.600
and work with you for a quarter
link |
00:23:52.840
and let's make that integration happen.
link |
00:23:55.520
And we talked to his manager and he said,
link |
00:23:56.840
sure, quarter's fine.
link |
00:23:59.800
And that quarter's been something like two years now.
link |
00:24:02.400
And so he's fully on this.
link |
00:24:05.080
So Keras got integrated into TensorFlow in a deep way.
link |
00:24:12.000
And now with 2.0, TensorFlow 2.0,
link |
00:24:15.240
sort of Keras is kind of the recommended way
link |
00:24:18.800
for a beginner to interact with TensorFlow.
link |
00:24:21.720
Which makes that initial sort of transfer learning
link |
00:24:24.640
or the basic use cases, even for an enterprise,
link |
00:24:28.040
super simple, right?
link |
00:24:29.320
That's correct, that's right.
link |
00:24:30.440
So what was that decision like?
link |
00:24:32.040
That seems like it's kind of a bold decision as well.
link |
00:24:38.680
We did spend a lot of time thinking about that one.
link |
00:24:41.240
We had a bunch of APIs, some built by us.
link |
00:24:46.000
There was a parallel layers API that we were building.
link |
00:24:48.760
And when we decided to do Keras in parallel,
link |
00:24:51.560
so there were like, okay, two things that we were looking at.
link |
00:24:54.400
And the first thing we were trying to do
link |
00:24:55.960
was just have them look similar,
link |
00:24:58.240
like be as integrated as possible,
link |
00:25:00.120
share all of that stuff.
link |
00:25:02.200
There were also like three other APIs
link |
00:25:04.000
that others had built over time
link |
00:25:05.840
because we didn't have a standard one.
link |
00:25:09.040
But one of the messages that we kept hearing
link |
00:25:11.480
from the community was, okay, which one do we use?
link |
00:25:13.240
And they kept seeing like, okay,
link |
00:25:14.480
here's a model in this one and here's a model in this one,
link |
00:25:16.760
which should I pick?
link |
00:25:18.880
So that's sort of like, okay,
link |
00:25:20.960
we had to address that straight on with 2.0.
link |
00:25:24.080
The whole idea was we need to simplify.
link |
00:25:26.360
We had to pick one.
link |
00:25:28.640
Based on where we were, we were like,
link |
00:25:30.520
okay, let's see, what do people like?
link |
00:25:35.680
And Keras was clearly one that lots of people loved.
link |
00:25:39.320
There were lots of great things about it.
link |
00:25:41.640
So we settled on that.
link |
00:25:43.920
Organically, that's kind of the best way to do it.
link |
00:25:46.440
It was great.
link |
00:25:47.520
It was surprising, nevertheless,
link |
00:25:48.760
to sort of bring in an outside project.
link |
00:25:51.120
I mean, there was a feeling like Keras
link |
00:25:52.560
might be almost like a competitor
link |
00:25:55.440
in a certain kind of way, to TensorFlow.
link |
00:25:58.040
And in a sense, it became an empowering element
link |
00:26:01.320
of TensorFlow.
link |
00:26:02.240
That's right.
link |
00:26:03.280
Yeah, it's interesting how you can put two things together,
link |
00:26:06.440
which can align.
link |
00:26:08.800
In this case, I think Francois, the team,
link |
00:26:11.800
and a bunch of us have chatted,
link |
00:26:14.280
and I think we all want to see the same kind of things.
link |
00:26:17.360
We all care about making it easier
link |
00:26:18.800
for the huge set of developers out there,
link |
00:26:21.440
and that makes a difference.
link |
00:26:23.480
So Python has Guido van Rossum,
link |
00:26:26.880
who until recently held the position
link |
00:26:28.920
of benevolent dictator for life.
link |
00:26:31.920
All right, so does a huge successful open source project
link |
00:26:36.480
like TensorFlow need one person who makes a final decision?
link |
00:26:40.680
So you did a pretty successful TensorFlow Dev Summit
link |
00:26:45.480
just now, last couple of days.
link |
00:26:47.520
There's clearly a lot of different new features
link |
00:26:51.080
being incorporated, an amazing ecosystem, so on.
link |
00:26:54.160
Who's, how are those design decisions made?
link |
00:26:57.320
Is there a BDFL in TensorFlow,
link |
00:27:02.800
or is it more distributed and organic?
link |
00:27:05.800
I think it's somewhat different, I would say.
link |
00:27:08.760
I've always been involved in the key design directions,
link |
00:27:14.560
but there are lots of things that are distributed
link |
00:27:17.080
where there are a number of people, Martin Wicke being one,
link |
00:27:20.560
who has really driven a lot of our open source stuff,
link |
00:27:23.880
a lot of the APIs,
link |
00:27:26.080
and there are a number of other people who've been,
link |
00:27:29.080
you know, pushed and been responsible
link |
00:27:31.360
for different parts of it.
link |
00:27:34.080
We do have regular design reviews.
link |
00:27:36.480
Over the last year,
link |
00:27:38.480
we've really spent a lot of time opening up to the community
link |
00:27:41.480
and adding transparency.
link |
00:27:44.160
We're setting more processes in place,
link |
00:27:45.880
so RFCs, special interest groups,
link |
00:27:49.080
to really grow that community and scale that.
link |
00:27:53.600
I think the kind of scale that ecosystem is in,
link |
00:27:57.720
I don't think we could scale with having me
link |
00:27:59.520
as the lone point of decision maker.
link |
00:28:02.280
I got it. So, yeah, the growth of that ecosystem,
link |
00:28:05.920
maybe you can talk about it a little bit.
link |
00:28:08.040
First of all, it started with Andrej Karpathy
link |
00:28:10.720
when he first did ConvNetJS.
link |
00:28:13.120
The fact that you could train a neural network
link |
00:28:15.360
in the browser, in JavaScript, was incredible.
link |
00:28:18.480
So now TensorFlow.js is really making that
link |
00:28:22.160
a serious, like a legit thing,
link |
00:28:26.400
a way to operate, whether it's in the backend
link |
00:28:28.520
or the front end.
link |
00:28:29.520
Then there's the TensorFlow Extended, like you mentioned.
link |
00:28:32.680
There's TensorFlow Lite for mobile.
link |
00:28:35.320
And all of it, as far as I can tell,
link |
00:28:37.440
it's really converging towards being able to
link |
00:28:41.680
save models in the same kind of way.
link |
00:28:43.440
You can move around, you can train on the desktop
link |
00:28:46.680
and then move it to mobile and so on.
link |
00:28:48.880
That's right.
link |
00:28:49.720
So there's that cohesiveness.
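
For illustration, a minimal sketch of that desktop-to-mobile path: convert a trained Keras model to TensorFlow Lite for the phone. The tiny model and the file name are placeholders:

```python
import tensorflow as tf

# A trained model would normally go here; this tiny one is a stand-in.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Convert to the TensorFlow Lite flatbuffer format for mobile deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)  # this file ships inside the mobile app
```
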
link |
00:28:52.280
So can you maybe give me, whatever I missed,
link |
00:28:56.120
a bigger overview of the mission of the ecosystem
link |
00:28:58.840
that's trying to be built and where is it moving forward?
link |
00:29:02.080
Yeah. So in short, the way I like to think of this is
link |
00:29:06.720
our goal is to enable machine learning.
link |
00:29:09.680
And in a couple of ways, you know, one is
link |
00:29:13.120
we have lots of exciting things going on in ML today.
link |
00:29:16.520
We started with deep learning,
link |
00:29:17.520
but we now support a bunch of other algorithms too.
link |
00:29:21.360
So one is to, on the research side,
link |
00:29:23.760
keep pushing on the state of the art.
link |
00:29:25.280
Can we, you know, how do we enable researchers
link |
00:29:27.200
to build the next amazing thing?
link |
00:29:28.920
So BERT came out recently, you know,
link |
00:29:31.720
it's great that people are able to do new kinds of research.
link |
00:29:33.920
And there are lots of amazing research
link |
00:29:35.360
that happens across the world.
link |
00:29:37.480
So that's one direction.
link |
00:29:38.800
The other is how do you take that across
link |
00:29:42.440
all the people outside who want to take that research
link |
00:29:45.200
and do some great things with it
link |
00:29:46.600
and integrate it to build real products,
link |
00:29:48.600
to have a real impact on people.
link |
00:29:51.720
And so that's the other axis in some ways,
link |
00:29:56.320
you know, at a high level, one way I think about it is
link |
00:29:59.600
there are a crazy number of compute devices
link |
00:30:02.440
across the world.
link |
00:30:04.160
And we often used to think of ML and training
link |
00:30:07.840
and all of this as, okay, something you do
link |
00:30:09.400
either in the workstation or the data center or cloud.
link |
00:30:13.560
But we see things running on the phones.
link |
00:30:15.640
We see things running on really tiny chips.
link |
00:30:17.600
I mean, we had some demos at the developer summit.
link |
00:30:20.680
And so the way I think about this ecosystem is
link |
00:30:25.760
how do we help get machine learning on every device
link |
00:30:29.880
that has a compute capability?
link |
00:30:32.480
And that continues to grow and so in some ways
link |
00:30:36.440
this ecosystem has looked at, you know,
link |
00:30:38.680
various aspects of that and grown over time
link |
00:30:41.120
to cover more of those.
link |
00:30:42.440
And we continue to push the boundaries.
link |
00:30:44.640
In some areas we've built more tooling
link |
00:30:48.160
and things around that to help you.
link |
00:30:50.000
I mean, the first tool we started was TensorBoard.
link |
00:30:52.760
You wanted to learn just the training piece,
link |
00:30:56.240
then TFX, or TensorFlow Extended,
link |
00:30:58.080
to really do your entire ML pipelines.
link |
00:31:00.400
If you, you know, care about all that production stuff,
link |
00:31:04.760
but then going to the edge,
link |
00:31:06.600
going to different kinds of things.
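
For illustration, a minimal sketch of hooking training up to TensorBoard, the first tool mentioned above; the log directory and data are placeholders:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# Write training logs to a directory the TensorBoard UI can read.
tb = tf.keras.callbacks.TensorBoard(log_dir="/tmp/tf_logs")
x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")
model.fit(x, y, epochs=2, callbacks=[tb], verbose=0)
# Then inspect with: tensorboard --logdir /tmp/tf_logs
```
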
link |
00:31:09.480
And it's not just us now.
link |
00:31:11.760
We are a place where there are lots of libraries
link |
00:31:14.440
being built on top.
link |
00:31:15.800
So there are some for research,
link |
00:31:17.760
maybe things like TensorFlow agents
link |
00:31:20.040
or TensorFlow Probability that started as research things
link |
00:31:22.440
or for researchers focusing
link |
00:31:24.200
on certain kinds of algorithms,
link |
00:31:26.120
but they're also being deployed
link |
00:31:27.280
or used by, you know, production folks.
link |
00:31:30.240
And some have come from within Google,
link |
00:31:33.320
just teams across Google
link |
00:31:34.720
who wanted to build these things.
link |
00:31:37.000
Others have come from just the community
link |
00:31:39.680
because there are different pieces
link |
00:31:41.840
that different parts of the community care about.
link |
00:31:44.600
And I see our goal as enabling even that, right?
link |
00:31:49.480
It's not, we cannot and won't build every single thing.
link |
00:31:53.240
That just doesn't make sense.
link |
00:31:54.840
But if we can enable others to build the things
link |
00:31:57.360
that they care about, and there's a broader community
link |
00:32:00.400
that cares about that, and we can help encourage that,
link |
00:32:02.880
then that's great.
link |
00:32:05.280
That really helps the entire ecosystem, not just those.
link |
00:32:08.600
One of the big things about 2.0 that we're pushing on is,
link |
00:32:11.840
okay, we have these so many different pieces, right?
link |
00:32:14.640
How do we help make all of them work well together?
link |
00:32:18.320
So there are a few key pieces there that we're pushing on,
link |
00:32:21.960
one being the core format in there
link |
00:32:23.880
and how we share the models themselves
link |
00:32:26.600
through SavedModel and TensorFlow Hub and so on.
link |
00:32:30.480
And a few other pieces that really put this together.
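
For illustration, a minimal sketch of the SavedModel interchange format referenced here; the path and the tiny model are placeholders:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Saving to a directory writes the SavedModel format; the same artifact
# can be reloaded in Python, served, or converted for other runtimes.
model.save("/tmp/my_saved_model")
reloaded = tf.keras.models.load_model("/tmp/my_saved_model")
```
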
link |
00:32:34.000
I was very skeptical that that's,
link |
00:32:35.600
you know, when TensorFlow.js came out,
link |
00:32:37.280
it didn't seem, or deeplearn.js as it was called earlier.
link |
00:32:40.160
Yeah, that was the first.
link |
00:32:41.680
It seems like technically very difficult project.
link |
00:32:45.080
As a standalone, it's not as difficult,
link |
00:32:47.000
but as a thing that integrates into the ecosystem,
link |
00:32:49.960
it seems very difficult.
link |
00:32:51.240
So, I mean, there's a lot of aspects of this
link |
00:32:53.240
you're making look easy, but,
link |
00:32:54.840
and the technical side,
link |
00:32:57.160
how many challenges have to be overcome here?
link |
00:33:00.520
A lot.
link |
00:33:01.480
And still have to be overcome.
link |
00:33:03.040
That's the question here too.
link |
00:33:04.680
There are lots of steps to it, right?
link |
00:33:06.320
And we've iterated over the last few years,
link |
00:33:07.960
so there's a lot we've learned.
link |
00:33:10.680
I, yeah, and often when things come together well,
link |
00:33:14.200
things look easy and that's exactly the point.
link |
00:33:16.360
It should be easy for the end user,
link |
00:33:18.320
but there are lots of things that go behind that.
link |
00:33:21.320
If I think about still challenges ahead,
link |
00:33:25.320
there are,
link |
00:33:29.400
you know, we have a lot more devices coming on board,
link |
00:33:32.880
for example, from the hardware perspective.
link |
00:33:35.280
How do we make it really easy for these vendors
link |
00:33:37.600
to integrate with something like TensorFlow, right?
link |
00:33:42.040
So there's a lot of compiler stuff
link |
00:33:43.600
that others are working on.
link |
00:33:45.280
There are things we can do in terms of our APIs
link |
00:33:48.280
and so on that we can do.
link |
00:33:50.440
As we, you know,
link |
00:33:52.960
TensorFlow started as a very monolithic system
link |
00:33:55.760
and to some extent it still is.
link |
00:33:57.600
There are lots of tools around it,
link |
00:33:59.360
but the core is still pretty large and monolithic.
link |
00:34:02.880
One of the key challenges for us to scale that out
link |
00:34:05.680
is how do we break that apart with clearer interfaces?
link |
00:34:10.320
It's, you know, in some ways it's software engineering 101,
link |
00:34:14.520
but for a system that's now four years old, I guess,
link |
00:34:18.480
or more, and that's still rapidly evolving
link |
00:34:21.560
and that we're not slowing down with,
link |
00:34:23.960
it's hard to change and modify and really break apart.
link |
00:34:28.200
It's sort of like, as people say, right,
link |
00:34:29.880
it's like changing the engine while the car is running
link |
00:34:32.560
or trying to fix that.
link |
00:34:33.600
That's exactly what we're trying to do.
link |
00:34:35.040
So there's a challenge here
link |
00:34:37.520
because the downside of so many people
link |
00:34:41.560
being excited about TensorFlow
link |
00:34:43.800
and coming to rely on it in many of their applications
link |
00:34:48.520
is that you're kind of responsible,
link |
00:34:52.000
like it's the technical debt.
link |
00:34:53.480
You're responsible for previous versions
link |
00:34:55.600
to some degree still working.
link |
00:34:57.560
So when you're trying to innovate,
link |
00:34:59.840
I mean, it's probably easier
link |
00:35:02.360
to just start from scratch every few months.
link |
00:35:04.760
Absolutely.
link |
00:35:07.160
So do you feel the pain of that?
link |
00:35:09.240
2.0 does break some back compatibility,
link |
00:35:14.320
but not too much.
link |
00:35:15.400
It seems like the conversion is pretty straightforward.
link |
00:35:18.160
Do you think that's still important
link |
00:35:20.280
given how quickly deep learning is changing?
link |
00:35:22.920
Can you just, the things that you've learned,
link |
00:35:26.440
can you just start over or is there pressure to not?
link |
00:35:29.320
It's a tricky balance.
link |
00:35:31.600
So if it was just a researcher writing a paper
link |
00:35:36.360
who a year later will not look at that code again,
link |
00:35:39.400
sure, it doesn't matter.
link |
00:35:41.600
There are a lot of production systems
link |
00:35:43.440
that rely on TensorFlow,
link |
00:35:44.680
both at Google and across the world.
link |
00:35:47.240
And people worry about this.
link |
00:35:49.760
I mean, these systems run for a long time.
link |
00:35:53.440
So it is important to keep that compatibility and so on.
link |
00:35:57.280
And yes, it does come with a huge cost.
link |
00:35:59.720
There's, we have to think about a lot of things
link |
00:36:02.960
as we do new things and make new changes.
link |
00:36:06.920
I think it's a trade off, right?
link |
00:36:09.080
You can, you might slow certain kinds of things down,
link |
00:36:12.960
but the overall value you're bringing
link |
00:36:14.560
because of that is much bigger
link |
00:36:16.920
because it's not just about breaking the person yesterday.
link |
00:36:20.520
It's also about telling the person tomorrow
link |
00:36:23.640
that, you know what, this is how we do things.
link |
00:36:26.240
We're not gonna break you when you come on board
link |
00:36:28.480
because there are lots of new people
link |
00:36:29.800
who are also gonna come on board.
link |
00:36:31.400
And, you know, one way I like to think about this,
link |
00:36:34.680
and I always push the team to think about it as well,
link |
00:36:37.960
when you wanna do new things,
link |
00:36:39.560
you wanna start with a clean slate.
link |
00:36:42.040
Design with a clean slate in mind,
link |
00:36:44.880
and then we'll figure out
link |
00:36:46.160
how to make sure all the other things work.
link |
00:36:48.640
And yes, we do make compromises occasionally,
link |
00:36:52.160
but unless you design with the clean slate
link |
00:36:55.200
and not worry about that,
link |
00:36:56.520
you'll never get to a good place.
link |
00:36:58.360
Oh, that's brilliant, so even if you are responsible
link |
00:37:02.560
when you're in the idea stage,
link |
00:37:04.080
when you're thinking of new,
link |
00:37:05.760
just put all that behind you.
link |
00:37:07.720
Okay, that's really, really well put.
link |
00:37:09.600
So I have to ask this
link |
00:37:11.080
because a lot of students, developers ask me
link |
00:37:13.240
how I feel about PyTorch versus TensorFlow.
link |
00:37:16.320
So I've recently completely switched
link |
00:37:18.280
my research group to TensorFlow.
link |
00:37:20.920
I wish everybody would just use the same thing,
link |
00:37:23.280
and TensorFlow is as close to that, I believe, as we have.
link |
00:37:26.960
But do you enjoy competition?
link |
00:37:32.040
So TensorFlow is leading in many ways,
link |
00:37:34.320
on many dimensions in terms of ecosystem,
link |
00:37:36.760
in terms of number of users,
link |
00:37:39.040
momentum, power, production levels, so on,
link |
00:37:41.200
but a lot of researchers are now also using PyTorch.
link |
00:37:46.000
Do you enjoy that kind of competition
link |
00:37:47.520
or do you just ignore it
link |
00:37:48.840
and focus on making TensorFlow the best that it can be?
link |
00:37:52.320
So just like research or anything people are doing,
link |
00:37:55.480
it's great to get different kinds of ideas.
link |
00:37:58.120
And when we started with TensorFlow,
link |
00:38:01.480
like I was saying earlier,
link |
00:38:03.280
one, it was very important
link |
00:38:05.240
for us to also have production in mind.
link |
00:38:07.440
We didn't want just research, right?
link |
00:38:09.000
And that's why we chose certain things.
link |
00:38:11.280
Now PyTorch came along and said,
link |
00:38:12.720
you know what, I only care about research.
link |
00:38:14.880
This is what I'm trying to do.
link |
00:38:16.280
What's the best thing I can do for this?
link |
00:38:18.400
And it started iterating and said,
link |
00:38:20.880
okay, I don't need to worry about graphs.
link |
00:38:22.560
Let me just run things.
link |
00:38:24.080
And I don't care if it's not as fast as it can be,
link |
00:38:27.440
but let me just make this part easy.
link |
00:38:30.480
And there are things you can learn from that, right?
link |
00:38:32.560
They, again, had the benefit of seeing what had come before,
link |
00:38:36.760
but also exploring certain different kinds of spaces.
link |
00:38:40.520
And they had some good things there,
link |
00:38:43.560
building on, say, things like Chainer and so on before that.
link |
00:38:46.680
So competition is definitely interesting.
link |
00:38:49.320
It made us, you know,
link |
00:38:50.240
this is an area that we had thought about,
link |
00:38:51.880
like I said, way early on.
link |
00:38:53.720
Over time we had revisited this a couple of times,
link |
00:38:56.600
should we add this again?
link |
00:38:59.000
At some point we said, you know what,
link |
00:39:01.040
it seems like this can be done well,
link |
00:39:02.880
so let's try it again.
link |
00:39:04.320
And that's how we started pushing on eager execution.
link |
00:39:07.680
How do we combine those two together?
link |
00:39:09.880
Which has finally come very well together in 2.0,
link |
00:39:13.120
but it took us a while to get all the things together
link |
00:39:15.760
and so on.
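
For illustration, a minimal sketch of how the two modes come together in TensorFlow 2.x: code runs eagerly by default, and tf.function traces the same code into a graph. The values are placeholders:

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0], [4.0]])
print(tf.matmul(a, b))  # eager: executes immediately, no Session needed

@tf.function  # the same code, traced into a graph on first call
def matmul_fn(x, y):
    return tf.matmul(x, y)

print(matmul_fn(a, b))  # runs as a graph, keeping the deployment benefits
```
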
link |
00:39:16.600
So let me ask, put another way,
link |
00:39:19.320
I think eager execution is a really powerful thing
link |
00:39:21.800
that was added.
link |
00:39:22.640
Do you think it wouldn't have been,
link |
00:39:25.800
you know, Muhammad Ali versus Frazier, right?
link |
00:39:28.360
Do you think it wouldn't have been added as quickly
link |
00:39:31.160
if PyTorch wasn't there?
link |
00:39:33.740
It might have taken longer.
link |
00:39:35.400
No longer?
link |
00:39:36.240
Yeah, it was, I mean,
link |
00:39:37.080
we had tried some variants of that before,
link |
00:39:38.900
so I'm sure it would have happened,
link |
00:39:40.900
but it might have taken longer.
link |
00:39:42.220
I'm grateful that TensorFlow responded
link |
00:39:44.080
in the way they did.
link |
00:39:44.920
It's been doing some incredible work the last couple of years.
link |
00:39:47.740
What other things that we didn't talk about
link |
00:39:49.600
are you looking forward to in 2.0
link |
00:39:51.480
that come to mind?
link |
00:39:54.040
So we talked about some of the ecosystem stuff,
link |
00:39:56.520
making it easily accessible through Keras,
link |
00:40:00.000
eager execution.
link |
00:40:01.440
Is there other things that we missed?
link |
00:40:03.000
Yeah, so I would say one is just where 2.0 is,
link |
00:40:07.500
and you know, with all the things that we've talked about,
link |
00:40:10.740
I think as we think beyond that,
link |
00:40:13.760
there are lots of other things that it enables us to do
link |
00:40:16.600
and that we're excited about.
link |
00:40:18.760
So what it's setting us up for,
link |
00:40:20.720
okay, here are these really clean APIs.
link |
00:40:22.520
We've cleaned up the surface for what the users want.
link |
00:40:25.640
It also allows us to do a whole bunch of stuff
link |
00:40:28.320
behind the scenes once we are ready with 2.0.
link |
00:40:31.600
So for example, in TensorFlow with graphs
link |
00:40:36.740
and all the things you could do,
link |
00:40:37.720
you could always get a lot of good performance
link |
00:40:40.600
if you spent the time to tune it, right?
link |
00:40:43.280
And we've clearly shown that, lots of people do that.
link |
00:40:47.720
With 2.0, with these APIs, where we are,
link |
00:40:53.040
we can give you a lot of performance
link |
00:40:55.140
just with whatever you do.
link |
00:40:57.040
You know, because we see these, it's much cleaner.
link |
00:41:01.400
We know most people are gonna do things this way.
link |
00:41:03.740
We can really optimize for that
link |
00:41:05.520
and get a lot of those things out of the box.
link |
00:41:09.040
And it really allows us, you know,
link |
00:41:10.360
both for single machine and distributed and so on,
link |
00:41:13.880
to really explore other spaces behind the scenes
link |
00:41:17.200
after 2.0 in the future versions as well.
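
For illustration, a minimal sketch of that out-of-the-box scaling direction using tf.distribute in TensorFlow 2.x; the model is a placeholder:

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all local GPUs
# (falling back to CPU), with no other changes to the Keras code.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(8,))])
    model.compile(optimizer="adam", loss="mse")
# model.fit(...) now trains with synchronized replicas behind the scenes
```
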
link |
00:41:19.720
So right now the team's really excited about that,
link |
00:41:23.040
that over time I think we'll see that.
link |
00:41:25.840
The other piece that I was talking about
link |
00:41:27.760
in terms of just restructuring the monolithic thing
link |
00:41:31.640
into more pieces and making it more modular,
link |
00:41:34.360
I think that's gonna be really important
link |
00:41:36.840
for a lot of the other people in the ecosystem,
link |
00:41:41.800
other organizations and so on that wanted to build things.
link |
00:41:44.840
Can you elaborate a little bit what you mean
link |
00:41:46.400
by making TensorFlow ecosystem more modular?
link |
00:41:50.720
So the way it's organized today,
link |
00:41:55.040
there are lots of repositories
link |
00:41:56.320
in the TensorFlow organization on GitHub.
link |
00:41:58.360
The core one where we have TensorFlow,
link |
00:42:01.120
it has the execution engine,
link |
00:42:04.120
it has the key backends for CPUs and GPUs,
link |
00:42:08.320
it has the work to do distributed stuff.
link |
00:42:12.580
And all of these just work together
link |
00:42:14.420
in a single library or binary.
link |
00:42:17.280
There's no way to split them apart easily.
link |
00:42:18.840
I mean, there are some interfaces,
link |
00:42:20.000
but they're not very clean.
link |
00:42:21.640
In a perfect world, you would have clean interfaces where,
link |
00:42:24.860
okay, I wanna run it on my fancy cluster
link |
00:42:27.760
with some custom networking,
link |
00:42:29.400
just implement this and do that.
link |
00:42:31.000
I mean, we kind of support that,
link |
00:42:32.680
but it's hard for people today.
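One place where a cleaner interface already exists is distribution: as a hedged sketch, assuming TensorFlow 2.x, tf.distribute strategies let the same model code run on one GPU, several GPUs, or a cluster by swapping the strategy object rather than rewriting the model.

    import tensorflow as tf

    # MirroredStrategy replicates the model across local GPUs;
    # other strategies target clusters or TPUs with the same model code.
    strategy = tf.distribute.MirroredStrategy()

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")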
link |
00:42:35.520
I think as we are starting to see more interesting things
link |
00:42:38.180
in some of these spaces,
link |
00:42:39.440
having that clean separation will really start to help.
link |
00:42:42.360
And again, going to the large size of the ecosystem
link |
00:42:47.360
and the different groups involved there,
link |
00:42:50.140
enabling people to evolve
link |
00:42:52.580
and push on things more independently
link |
00:42:54.360
just allows it to scale better.
link |
00:42:56.040
And by people, you mean individual developers and?
link |
00:42:59.080
And organizations.
link |
00:42:59.960
And organizations.
link |
00:43:00.960
That's right.
link |
00:43:01.800
So the hope is that major corporations,
link |
00:43:04.240
I don't know, Pepsi or something,
link |
00:43:06.900
go to TensorFlow for this kind of thing.
link |
00:43:11.040
Yeah, if you look at enterprises like Pepsi or these,
link |
00:43:13.640
I mean, a lot of them are already using TensorFlow.
link |
00:43:15.800
They are not the ones that do the development
link |
00:43:18.920
or changes in the core.
link |
00:43:20.360
Some of them do, but a lot of them don't.
link |
00:43:21.960
I mean, they touch small pieces.
link |
00:43:23.720
There are lots of these,
link |
00:43:25.660
some of them being, let's say, hardware vendors
link |
00:43:27.660
who are building their custom hardware
link |
00:43:28.960
and they want their own pieces.
link |
00:43:30.840
Or some of them being bigger companies, say, IBM.
link |
00:43:34.160
I mean, they're involved in some of our
link |
00:43:36.480
special interest groups,
link |
00:43:38.100
and they see a lot of users
link |
00:43:39.960
who want certain things and they want to optimize for that.
link |
00:43:42.620
So folks like that often.
link |
00:43:44.440
Autonomous vehicle companies, perhaps.
link |
00:43:46.360
Exactly, yes.
link |
00:43:48.160
So, yeah, like I mentioned,
link |
00:43:50.000
TensorFlow has been downloaded 41 million times,
link |
00:43:52.760
50,000 commits, almost 10,000 pull requests,
link |
00:43:56.480
and 1,800 contributors.
link |
00:43:58.320
So I'm not sure if you can explain it,
link |
00:44:02.120
but what does it take to build a community like that?
link |
00:44:06.000
In retrospect, what do you think,
link |
00:44:09.160
what is the critical thing that allowed
link |
00:44:11.180
for this growth to happen,
link |
00:44:12.640
and how does that growth continue?
link |
00:44:14.600
Yeah, yeah, that's an interesting question.
link |
00:44:17.920
I wish I had all the answers there, I guess,
link |
00:44:20.240
so you could replicate it.
link |
00:44:22.520
I think there are a number of things
link |
00:44:25.560
that need to come together, right?
link |
00:44:27.880
One, just like any new thing,
link |
00:44:32.480
it's about a sweet spot of timing,
link |
00:44:35.920
of what's needed, and does it grow with
link |
00:44:38.880
what's needed. So in this case, for example,
link |
00:44:41.640
TensorFlow hasn't grown just because it was a good tool;
link |
00:44:43.680
it's also grown with the growth of deep learning itself.
link |
00:44:46.720
So those factors come into play.
link |
00:44:49.040
Other than that, though,
link |
00:44:52.080
I think just hearing, listening to the community,
link |
00:44:55.240
what they do, what they need,
link |
00:44:57.040
being open to, like in terms of external contributions,
link |
00:45:01.120
we've spent a lot of time in making sure
link |
00:45:04.560
we can accept those contributions well,
link |
00:45:06.880
we can help the contributors in adding those,
link |
00:45:09.480
putting the right process in place,
link |
00:45:11.320
getting the right kind of community,
link |
00:45:13.360
welcoming them and so on.
link |
00:45:16.160
Like over the last year, we've really pushed on transparency,
link |
00:45:19.320
that's important for an open source project.
link |
00:45:22.280
People wanna know where things are going,
link |
00:45:23.800
and we're like, okay, here's a process
link |
00:45:26.200
where you can do that, here are our RFCs and so on.
link |
00:45:29.360
So thinking through, there are lots of community aspects
link |
00:45:32.920
that come into it that you can really work on.
link |
00:45:35.460
As a small project, it's maybe easy to do
link |
00:45:38.740
because there's like two developers and you can do those.
link |
00:45:42.180
As you grow, putting more of these processes in place,
link |
00:45:46.980
thinking about the documentation,
link |
00:45:49.140
thinking about what developers care about,
link |
00:45:51.940
what kind of tools would they want to use,
link |
00:45:55.180
all of these come into play, I think.
link |
00:45:56.900
So one of the big things I think
link |
00:45:58.420
that feeds the TensorFlow fire
link |
00:46:00.700
is people building something on TensorFlow,
link |
00:46:03.980
and implementing a particular architecture
link |
00:46:07.700
that does something cool and useful,
link |
00:46:09.500
and they put that on GitHub.
link |
00:46:11.100
And so it just feeds this growth.
link |
00:46:15.580
Do you have a sense that with 2.0 and 1.0
link |
00:46:19.580
that there may be a little bit of a partitioning
link |
00:46:21.580
like there is with Python 2 and 3,
link |
00:46:24.100
that there'll be code bases
link |
00:46:26.040
in the older versions of TensorFlow
link |
00:46:28.340
that will not be easily compatible?
link |
00:46:31.140
Or are you pretty confident that this kind of conversion
link |
00:46:35.620
is pretty natural and easy to do?
link |
00:46:37.980
So we're definitely working hard
link |
00:46:39.980
to make that very easy to do.
link |
00:46:41.500
There's lots of tooling that we talked about
link |
00:46:43.500
at the developer summit this week,
link |
00:46:45.820
and we'll continue to invest in that tooling.
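Part of that tooling is the tf_upgrade_v2 conversion script that ships with TensorFlow 2.0, which rewrites 1.x code to the 2.x API where it can. As a rough, hedged sketch of the kind of change involved:

    import tensorflow as tf

    # TensorFlow 1.x style (placeholders, a graph, and a Session):
    #   x = tf.placeholder(tf.float32, [None, 10])
    #   y = tf.layers.dense(x, 1)
    #   with tf.Session() as sess:
    #       out = sess.run(y, feed_dict={x: data})

    # Roughly equivalent TensorFlow 2.x style:
    layer = tf.keras.layers.Dense(1)
    data = tf.random.normal([32, 10])
    out = layer(data)  # executes eagerly, no Session needed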
link |
00:46:48.260
It's, you know, when you think
link |
00:46:50.500
of these significant version changes,
link |
00:46:52.580
that's always a risk,
link |
00:46:53.580
and we are really pushing hard
link |
00:46:55.740
to make that transition very, very smooth.
link |
00:46:58.100
So I think, so at some level,
link |
00:47:02.700
people wanna move and they see the value in the new thing.
link |
00:47:05.620
They don't wanna move just because it's a new thing,
link |
00:47:07.740
and some people do,
link |
00:47:08.580
but most people want a really good thing.
link |
00:47:11.540
And I think over the next few months,
link |
00:47:13.820
as people start to see the value,
link |
00:47:15.460
we'll definitely see that shift happening.
link |
00:47:17.700
So I'm pretty excited and confident
link |
00:47:19.740
that we will see people moving.
link |
00:47:22.540
As you said earlier, this field is also moving rapidly,
link |
00:47:24.740
so that'll help because we can do more things
link |
00:47:26.780
and all the new things will clearly happen in 2.x,
link |
00:47:29.500
so people will have lots of good reasons to move.
link |
00:47:32.300
So what do you think TensorFlow 3.0 looks like?
link |
00:47:36.140
Is there, are things happening so crazily
link |
00:47:40.340
that even the end of this year
link |
00:47:42.540
seems impossible to plan for?
link |
00:47:45.300
Or is it possible to plan for the next five years?
link |
00:47:49.420
I think it's tricky.
link |
00:47:50.820
There are some things that we can expect
link |
00:47:54.540
in terms of, okay, change, yes, change is gonna happen.
link |
00:47:59.700
Are there some things gonna stick around
link |
00:48:01.660
and some things not gonna stick around?
link |
00:48:03.740
I would say the basics of deep learning,
link |
00:48:08.140
the, you know, say convolution models
link |
00:48:10.420
or the basic kind of things,
link |
00:48:12.700
they'll probably be around in some form still in five years.
link |
00:48:16.300
Will RL and GANs stay?
link |
00:48:18.620
Very likely, based on where they are.
link |
00:48:21.180
Will we have new things?
link |
00:48:22.860
Probably, but those are hard to predict.
link |
00:48:24.660
And directionally, some things that we can see are,
link |
00:48:30.620
you know, in things that we're starting to do, right,
link |
00:48:32.740
with some of our projects right now
link |
00:48:35.420
is just 2.0 combining eager execution and graphs
link |
00:48:39.140
where we're starting to make it more like
link |
00:48:41.460
just your natural programming language.
link |
00:48:43.140
You're not trying to program something else.
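AutoGraph is the piece that makes this feel like a natural programming language: inside a tf.function, plain Python control flow is converted into graph operations. A minimal sketch, assuming TensorFlow 2.x:

    import tensorflow as tf

    @tf.function
    def collatz_steps(x):
        steps = tf.constant(0)
        while x > 1:            # AutoGraph converts this to tf.while_loop
            if x % 2 == 0:      # ...and this branch to tf.cond
                x = x // 2
            else:
                x = 3 * x + 1
            steps += 1
        return steps

    print(collatz_steps(tf.constant(6)))  # tf.Tensor(8, ...)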
link |
00:48:45.660
Similarly, with Swift for TensorFlow,
link |
00:48:47.220
we're taking that approach.
link |
00:48:48.260
Can you do something from the ground up, right?
link |
00:48:50.020
So some of those ideas seem like, okay,
link |
00:48:52.100
that's the right direction.
link |
00:48:54.100
In five years, we expect to see more in that area.
link |
00:48:58.340
Other things we don't know is,
link |
00:49:00.060
will hardware accelerators be the same?
link |
00:49:03.180
Will we be able to train with four bits
link |
00:49:06.620
instead of 32 bits?
link |
00:49:09.020
And I think the TPU side of things is exploring that.
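Four-bit training is still research, but reduced precision is already available on the inference side. A hedged sketch, assuming TensorFlow 2.x: TensorFlow Lite can quantize a trained Keras model's weights, for example to 8 bits, after training.

    import tensorflow as tf

    # A trained model would go here; an untrained one keeps the sketch short.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
    ])

    # Post-training quantization: the converter shrinks weights
    # to lower precision for smaller, faster on-device inference.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()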
link |
00:49:11.500
I mean, TPU is already on version three.
link |
00:49:13.940
It seems that the evolution of TPU and TensorFlow
link |
00:49:17.540
are sort of, they're coevolving almost in terms of
link |
00:49:23.260
both are learning from each other and from the community
link |
00:49:25.740
and from the applications
link |
00:49:27.980
where the biggest benefit is achieved.
link |
00:49:29.740
That's right.
link |
00:49:30.580
You've been trying to sort of, with Eager, with Keras,
link |
00:49:33.340
to make TensorFlow as accessible
link |
00:49:34.940
and easy to use as possible.
link |
00:49:36.500
What do you think, for beginners,
link |
00:49:38.060
is the biggest thing they struggle with?
link |
00:49:40.020
Have you encountered that?
link |
00:49:42.100
Or is it basically what Keras and Eager are solving,
link |
00:49:46.260
like we talked about?
link |
00:49:47.420
Yeah, for some of them, like you said, right,
link |
00:49:50.620
the beginners want to just be able to take
link |
00:49:53.620
some image model,
link |
00:49:54.900
they don't care if it's Inception or ResNet
link |
00:49:57.060
or something else,
link |
00:49:58.100
and do some training or transfer learning
link |
00:50:00.820
on their own data.
link |
00:50:02.500
Being able to make that easy is important.
link |
00:50:04.460
So in some ways,
link |
00:50:07.060
if you do that by providing them simple models
link |
00:50:09.380
with, say, TensorFlow Hub or so on,
link |
00:50:11.420
they don't care about what's inside that box,
link |
00:50:13.780
but they want to be able to use it.
link |
00:50:15.180
So we're pushing on, I think, different levels.
link |
00:50:17.660
If you look at just a component that you get,
link |
00:50:20.020
which has the layers already smooshed in,
link |
00:50:22.820
the beginners probably just want that.
link |
00:50:25.260
Then the next step is, okay,
link |
00:50:26.780
look at building layers with Keras.
link |
00:50:29.100
If you go out to research,
link |
00:50:30.300
then they are probably writing custom layers themselves
link |
00:50:33.180
or doing their own loops.
link |
00:50:34.460
So there's a whole spectrum there.
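The beginner end of that spectrum looks roughly like this hedged sketch: a pre-trained image model from TensorFlow Hub is treated as a black box, and only a small classification head is trained on top. The module URL is illustrative; any image feature-vector module from tfhub.dev would work.

    import tensorflow as tf
    import tensorflow_hub as hub  # pip install tensorflow-hub

    # Frozen pre-trained feature extractor; the user never looks inside.
    extractor = hub.KerasLayer(
        "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
        trainable=False)

    model = tf.keras.Sequential([
        extractor,
        tf.keras.layers.Dense(5, activation="softmax"),  # e.g. 5 custom classes
    ])
    model.build([None, 224, 224, 3])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

At the research end, the same stack supports hand-written custom layers and training loops, so users can move along the spectrum as they need.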
link |
00:50:36.380
And then providing the pre-trained models
link |
00:50:38.660
seems to really decrease the time it takes to get started.
link |
00:50:43.660
You could basically in a Colab notebook
link |
00:50:46.860
achieve what you need.
link |
00:50:49.140
So I'm basically answering my own question
link |
00:50:51.340
because I think what TensorFlow delivered on recently
link |
00:50:54.300
makes it trivial for beginners.
link |
00:50:56.980
So I was just wondering if there were other pain points
link |
00:51:00.780
you're trying to ease,
link |
00:51:01.620
but I'm not sure there would be.
link |
00:51:02.540
No, those are probably the big ones.
link |
00:51:04.900
I see high schoolers doing a whole bunch of things now,
link |
00:51:07.420
which is pretty amazing.
link |
00:51:09.220
It's both amazing and terrifying.
link |
00:51:11.420
Yes.
link |
00:51:12.700
In the sense that when they grow up,
link |
00:51:15.940
some incredible ideas will be coming from them.
link |
00:51:19.300
So there's certainly a technical aspect to your work,
link |
00:51:21.860
but you also have a management aspect to your role
link |
00:51:25.260
with TensorFlow leading the project,
link |
00:51:27.980
a large number of developers and people.
link |
00:51:31.140
So what do you look for in a good team?
link |
00:51:34.700
What do you think?
link |
00:51:36.220
Google has been at the forefront of exploring
link |
00:51:38.420
what it takes to build a good team
link |
00:51:40.500
and TensorFlow is one of the most cutting edge technologies
link |
00:51:45.540
in the world.
link |
00:51:46.380
So in this context, what do you think makes for a good team?
link |
00:51:50.500
It's definitely something I think a fair amount about.
link |
00:51:53.180
I think in terms of the team being able
link |
00:51:59.780
to deliver something well,
link |
00:52:01.180
one of the things that's important is cohesion
link |
00:52:04.780
across the team.
link |
00:52:05.820
So being able to execute together in doing things
link |
00:52:10.420
because at this scale,
link |
00:52:13.180
an individual engineer can only do so much.
link |
00:52:15.460
There's a lot more that they can do together,
link |
00:52:18.260
even though we have some amazing superstars across Google
link |
00:52:21.780
and in the team, but there's, you know,
link |
00:52:25.140
often the way I see it is, the product
link |
00:52:27.380
of what the team generates is way larger
link |
00:52:29.140
than the sum of the individuals put together.
link |
00:52:34.460
And so how do we have all of them work together,
link |
00:52:37.380
the culture of the team itself,
link |
00:52:40.060
hiring good people is important.
link |
00:52:43.100
But part of that is it's not just that,
link |
00:52:45.340
okay, we hire a bunch of smart people
link |
00:52:47.260
and throw them together and let them do things.
link |
00:52:49.740
It's also people have to care about what they're building,
link |
00:52:52.980
people have to be motivated for the right kind of things.
link |
00:52:57.380
That's often an important factor.
link |
00:53:01.500
And, you know, finally, how do you put that together
link |
00:53:04.660
with a somewhat unified vision of where we wanna go?
link |
00:53:08.860
So are we all looking in the same direction
link |
00:53:11.220
or each of us going all over?
link |
00:53:13.620
And sometimes it's a mix.
link |
00:53:16.100
Google's a very bottom up organization in some sense,
link |
00:53:21.460
also research even more so, and that's how we started.
link |
00:53:26.420
But as we've become this larger product and ecosystem,
link |
00:53:30.900
I think it's also important to combine that well
link |
00:53:33.180
with a mix of, okay, here's the direction we wanna go in.
link |
00:53:38.020
There is exploration we'll do around that,
link |
00:53:39.860
but let's keep staying in that direction,
link |
00:53:42.820
not just all over the place.
link |
00:53:44.460
And is there a way you monitor the health of the team?
link |
00:53:46.860
Sort of like, is there a way you know you did a good job?
link |
00:53:51.980
The team is good?
link |
00:53:53.020
Like, I mean, you're sort of, you're saying nice things,
link |
00:53:56.220
but it's sometimes difficult to determine how aligned the team is.
link |
00:54:00.860
Yes.
link |
00:54:01.700
Because it's not binary.
link |
00:54:02.520
There are tensions and complexities and so on.
link |
00:54:06.740
And the other element is the notion of superstars,
link |
00:54:09.460
there's so much, even at Google,
link |
00:54:11.820
such a large percentage of work
link |
00:54:13.220
is done by individual superstars too.
link |
00:54:16.020
And sometimes those superstars
link |
00:54:19.980
can work against the dynamic of a team, and there are those tensions.
link |
00:54:25.220
I mean, I'm sure in TensorFlow it might be
link |
00:54:26.580
a little bit easier because the mission of the project
link |
00:54:28.900
is so sort of beautiful.
link |
00:54:31.740
You're at the cutting edge, so it's exciting.
link |
00:54:34.860
But have you had struggle with that?
link |
00:54:36.700
Has there been challenges?
link |
00:54:38.380
There are always people challenges
link |
00:54:39.860
in different kinds of ways.
link |
00:54:41.260
That said, I think what's been good is
link |
00:54:44.780
getting people who care and, you know,
link |
00:54:48.980
have the same kind of culture,
link |
00:54:50.420
and that's Google in general to a large extent.
link |
00:54:53.460
But also, like you said, given that the project
link |
00:54:56.140
has had so many exciting things to do,
link |
00:54:58.780
there's been room for lots of people
link |
00:55:00.760
to do different kinds of things and grow,
link |
00:55:02.460
which does make the problem a bit easier, I guess.
link |
00:55:05.380
And it allows people, depending on what they're doing,
link |
00:55:09.940
if there's room around them, then that's fine.
link |
00:55:13.140
But yes, we do care that, superstar or not,
link |
00:55:19.220
they need to work well with the team across Google.
link |
00:55:22.580
That's interesting to hear.
link |
00:55:23.680
So it's like superstar or not,
link |
00:55:26.500
the productivity broadly is about the team.
link |
00:55:30.540
Yeah, yeah.
link |
00:55:31.540
I mean, they might add a lot of value,
link |
00:55:32.980
but if they're hurting the team, then that's a problem.
link |
00:55:35.740
So in hiring engineers, it's so interesting, right,
link |
00:55:39.060
the hiring process.
link |
00:55:40.260
What do you look for?
link |
00:55:41.860
How do you determine a good developer
link |
00:55:44.300
or a good member of a team
link |
00:55:46.240
from just a few minutes or hours together?
link |
00:55:50.420
Again, no magic answers, I'm sure.
link |
00:55:52.260
Yeah, I mean, Google has a hiring process
link |
00:55:55.340
that we've refined over the last 20 years, I guess,
link |
00:55:59.660
and that you've probably heard and seen a lot about.
link |
00:56:02.220
So we do work with the same hiring process
link |
00:56:04.980
and that's really helped.
link |
00:56:08.340
For me in particular, I would say,
link |
00:56:10.900
in addition to the core technical skills,
link |
00:56:14.220
what does matter is their motivation
link |
00:56:17.580
in what they wanna do.
link |
00:56:19.600
Because if that doesn't align well
link |
00:56:21.380
with where we wanna go,
link |
00:56:22.980
that's not gonna lead to long term success
link |
00:56:25.360
for either them or the team.
link |
00:56:27.700
And I think that becomes more important
link |
00:56:30.020
the more senior the person is,
link |
00:56:31.480
but it's important at every level.
link |
00:56:33.580
Like even the junior most engineer,
link |
00:56:34.940
if they're not motivated to do well
link |
00:56:36.380
at what they're trying to do,
link |
00:56:37.700
however smart they are,
link |
00:56:38.820
it's gonna be hard for them to succeed.
link |
00:56:40.380
Does the Google hiring process touch on that passion?
link |
00:56:44.540
So like trying to determine,
link |
00:56:46.500
because I think as far as I understand,
link |
00:56:48.500
maybe you can speak to it,
link |
00:56:49.620
that the Google hiring process sort of helps
link |
00:56:53.380
in the initial stage, like determining the skill set there:
link |
00:56:56.380
is your puzzle solving ability,
link |
00:56:57.940
problem solving ability good?
link |
00:56:59.920
But like, I'm not sure,
link |
00:57:02.540
but it seems that determining
link |
00:57:05.040
whether the person has, like, a fire inside them
link |
00:57:07.580
that burns to do anything really,
link |
00:57:09.060
it doesn't really matter what.
link |
00:57:09.900
It's just some cool stuff,
link |
00:57:11.540
I'm gonna do it.
link |
00:57:15.340
Is that something that ultimately ends up
link |
00:57:17.300
when they have a conversation with you
link |
00:57:18.820
or once it gets closer to the team?
link |
00:57:22.640
So one of the things we do have as part of the process
link |
00:57:25.420
is just a culture fit,
link |
00:57:27.180
like part of the interview process itself,
link |
00:57:29.200
in addition to just the technical skills
link |
00:57:31.020
and each engineer or whoever the interviewer is,
link |
00:57:34.260
is supposed to rate the person on
link |
00:57:38.340
the culture fit with Google and so on.
link |
00:57:40.000
So that is definitely part of the process.
link |
00:57:42.180
Now, there are various kinds of projects
link |
00:57:45.860
and different kinds of things.
link |
00:57:46.940
So there might be variants
link |
00:57:48.820
of the kind of culture you want there and so on.
link |
00:57:51.380
And yes, that does vary.
link |
00:57:52.740
So for example,
link |
00:57:54.020
TensorFlow has always been a fast moving project
link |
00:57:56.980
and we want people who are comfortable with that.
link |
00:58:00.980
But at the same time now, for example,
link |
00:58:02.700
we are at a place where we are also a very full-fledged product
link |
00:58:05.260
and we wanna make sure things that work
link |
00:58:07.820
really, really work, right?
link |
00:58:09.340
You can't cut corners all the time.
link |
00:58:11.700
So balancing that out and finding the people
link |
00:58:14.340
who are the right fit for those is important.
link |
00:58:17.580
And I think those kinds of things do vary a bit
link |
00:58:19.740
across projects and teams and product areas across Google.
link |
00:58:23.220
And so you'll see some differences there
link |
00:58:25.260
in the final checklist.
link |
00:58:27.700
But a lot of the core culture,
link |
00:58:29.380
it comes along with just the engineering excellence
link |
00:58:32.220
and so on.
link |
00:58:34.740
What is the hardest part of your job?
link |
00:58:39.780
Take your pick, I guess.
link |
00:58:41.940
It's fun, I would say, right?
link |
00:58:44.460
Hard, yes.
link |
00:58:45.540
I mean, lots of things at different times.
link |
00:58:47.280
I think that does vary.
link |
00:58:49.220
So let me clarify that difficult things are fun
link |
00:58:52.680
when you solve them, right?
link |
00:58:53.980
So it's fun in that sense.
link |
00:58:57.500
I think the key to a successful thing across the board
link |
00:59:02.640
and in this case, it's a large ecosystem now,
link |
00:59:05.380
but even a small product,
link |
00:59:07.180
is striking that fine balance
link |
00:59:09.820
across different aspects of it.
link |
00:59:12.060
Sometimes it's how fast do you go
link |
00:59:13.940
versus how perfect it is.
link |
00:59:17.060
Sometimes it's how do you involve this huge community?
link |
00:59:21.460
Who do you involve or do you decide,
link |
00:59:23.640
okay, now is not a good time to involve them
link |
00:59:25.480
because it's not the right fit.
link |
00:59:30.220
Sometimes it's saying no to certain kinds of things.
link |
00:59:33.660
Those are often the hard decisions.
link |
00:59:36.860
Some of them you make quickly
link |
00:59:39.600
because you don't have the time.
link |
00:59:41.020
Some of them you get time to think about them,
link |
00:59:43.220
but they're always hard.
link |
00:59:44.500
When both choices are pretty good, those are the hard decisions.
link |
00:59:49.220
What about deadlines?
link |
00:59:50.380
Is this, do you find TensorFlow,
link |
00:59:53.580
to be driven by deadlines
link |
00:59:58.220
to a degree that a product might?
link |
01:00:00.400
Or is there still a balance to where it's less deadline?
link |
01:00:04.940
You had the Dev Summit today
link |
01:00:06.740
that came together incredibly.
link |
01:00:08.940
Looked like there's a lot of moving pieces and so on.
link |
01:00:11.460
So did that deadline make people rise to the occasion
link |
01:00:15.140
releasing TensorFlow 2.0 alpha?
link |
01:00:18.420
I'm sure that was done last minute as well.
link |
01:00:20.420
I mean, up to the last point.
link |
01:00:25.620
Again, it's one of those things
link |
01:00:26.860
that you need to strike the good balance.
link |
01:00:29.940
There's some value that deadlines bring
link |
01:00:32.100
that does bring a sense of urgency
link |
01:00:33.980
to get the right things together.
link |
01:00:35.780
Instead of getting the perfect thing out,
link |
01:00:38.340
you need something that's good and works well.
link |
01:00:41.320
And the team definitely did a great job
link |
01:00:43.260
in putting that together.
link |
01:00:44.100
So I was very amazed and excited
link |
01:00:45.920
by how everything came together.
link |
01:00:48.740
That said, across the year,
link |
01:00:49.860
we try not to put out official deadlines.
link |
01:00:52.580
We focus on key things that are important,
link |
01:00:57.020
figure out how much of it's important.
link |
01:01:00.620
And we are developing in the open,
link |
01:01:03.900
both internally and externally,
link |
01:01:05.820
everything's available to everybody.
link |
01:01:07.980
So you can pick and look at where things are.
link |
01:01:11.220
We do releases at a regular cadence.
link |
01:01:13.260
So fine, if something doesn't necessarily end up
link |
01:01:16.180
this month, it'll end up in the next release
link |
01:01:17.820
in a month or two.
link |
01:01:18.780
And that's okay, but we want to keep moving
link |
01:01:22.860
as fast as we can in these different areas.
link |
01:01:26.500
Because we can iterate and improve on things,
link |
01:01:29.660
sometimes it's okay to put things out
link |
01:01:31.960
that aren't fully ready.
link |
01:01:32.980
We'll make sure it's clear that okay,
link |
01:01:34.580
this is experimental, but it's out there
link |
01:01:36.540
if you want to try and give feedback.
link |
01:01:37.980
That's very, very useful.
link |
01:01:39.420
I think that quick cycle and quick iteration is important.
link |
01:01:43.580
That's what we often focus on rather than
link |
01:01:46.940
here's a deadline by which you get everything done.
link |
01:01:49.220
Is 2.0, is there pressure to make that stable?
link |
01:01:52.860
Or like, for example, WordPress 5.0 just came out
link |
01:01:57.780
and there was no pressure to make it perfect;
link |
01:02:00.300
it was a big update, delivered way too late,
link |
01:02:03.980
and they said, okay, well,
link |
01:02:05.980
but we're gonna release a lot of updates
link |
01:02:07.440
really quickly to improve it.
link |
01:02:09.660
Do you see TensorFlow 2.0 in that same kind of way
link |
01:02:12.220
or is there this pressure to once it hits 2.0,
link |
01:02:15.260
once you get to the release candidate
link |
01:02:16.780
and then you get to the final,
link |
01:02:18.980
that's gonna be the stable thing?
link |
01:02:22.460
So it's gonna be stable in,
link |
01:02:25.740
just like 1.x was, where every API that's there
link |
01:02:28.900
is gonna remain and work.
link |
01:02:32.100
It doesn't mean we can't change things under the covers.
link |
01:02:34.820
It doesn't mean we can't add things.
link |
01:02:36.740
So there's still a lot more for us to do
link |
01:02:39.200
and we'll continue to have more releases.
link |
01:02:41.100
So in that sense, there's still,
link |
01:02:42.640
I don't think we'll be done in like two months
link |
01:02:44.740
when we release this.
link |
01:02:46.140
I don't know if you can say, but is there,
link |
01:02:49.900
there's not external deadlines for TensorFlow 2.0,
link |
01:02:53.740
but are there internal deadlines,
link |
01:02:57.060
artificial or otherwise,
link |
01:02:58.540
that you're trying to set for yourself
link |
01:03:00.860
or is it whenever it's ready?
link |
01:03:03.100
So we want it to be a great product, right?
link |
01:03:05.660
And that's a big important piece for us.
link |
01:03:09.900
TensorFlow's already out there.
link |
01:03:11.140
We have 41 million downloads for 1.x.
link |
01:03:13.740
So it's not like we have to have this.
link |
01:03:16.420
Yeah, exactly.
link |
01:03:17.260
So it's not like that. A lot of the features
link |
01:03:19.340
that we're really polishing
link |
01:03:21.180
and putting together are there.
link |
01:03:23.580
We don't have to rush that just because.
link |
01:03:26.220
So in that sense, we wanna get it right
link |
01:03:28.020
and really focus on that.
link |
01:03:29.940
That said, we have said that we are looking
link |
01:03:31.860
to get this out in the next few months,
link |
01:03:33.500
in the next quarter.
link |
01:03:34.500
And as far as possible,
link |
01:03:37.100
we'll definitely try to make that happen.
link |
01:03:39.780
Yeah, my favorite line was, spring is a relative concept.
link |
01:03:44.340
I love it.
link |
01:03:45.180
Yes.
link |
01:03:46.020
Spoken like a true developer.
link |
01:03:47.700
So something I'm really interested in
link |
01:03:50.220
and your previous line of work is,
link |
01:03:52.980
before TensorFlow, you led a team at Google on search ads.
link |
01:03:57.740
I think this is a very interesting topic
link |
01:04:01.860
on every level, on a technical level,
link |
01:04:04.980
because at their best, ads connect people
link |
01:04:07.220
to the things they want and need.
link |
01:04:09.420
And at their worst, they're just these things
link |
01:04:12.300
that annoy the heck out of you
link |
01:04:14.940
to the point of ruining the entire user experience
link |
01:04:17.340
of whatever you're actually doing.
link |
01:04:20.260
So they have a bad rep, I guess.
link |
01:04:23.620
And on the other end, this connecting of users
link |
01:04:28.100
to the things they need and want
link |
01:04:29.660
is a beautiful opportunity for machine learning to shine.
link |
01:04:34.060
Like huge amounts of data that's personalized
link |
01:04:36.340
and you kind of map to the thing
link |
01:04:37.860
they actually want, so they won't get annoyed.
link |
01:04:40.380
So what have you learned from this,
link |
01:04:43.220
at Google, which is leading the world in this aspect;
link |
01:04:45.140
what have you learned from that experience
link |
01:04:47.540
and what do you think is the future of ads?
link |
01:04:51.540
To take you back to that.
link |
01:04:52.540
Yeah, yes, it's been a while,
link |
01:04:55.220
but I totally agree with what you said.
link |
01:04:59.700
I think the search ads, the way it was always looked at
link |
01:05:03.180
and I believe it still is,
link |
01:05:04.500
is it's an extension of what search is trying to do.
link |
01:05:08.100
And the goal is to make
link |
01:05:10.580
the world's information accessible.
link |
01:05:14.740
And it's not just information,
link |
01:05:17.140
but maybe products or other things that people care about.
link |
01:05:20.780
And so it's really important for them to align
link |
01:05:23.860
with what the users need.
link |
01:05:26.500
And in search ads, there's a minimum quality level
link |
01:05:30.940
before that ad would be shown.
link |
01:05:32.300
If you don't have an ad that hits that quality bar,
link |
01:05:34.060
it will not be shown, even if we have one,
link |
01:05:35.980
and okay, maybe we lose some money there, that's fine.
link |
01:05:39.620
That is really, really important.
link |
01:05:41.300
And I think that that is something I really liked
link |
01:05:43.420
about being there.
link |
01:05:45.060
Advertising is a key part.
link |
01:05:48.180
I mean, as a model, it's been around for ages, right?
link |
01:05:51.740
It's not a new model, it's been adapted to the web
link |
01:05:54.900
and became a core part of search
link |
01:05:57.500
and many other search engines across the world.
link |
01:06:00.780
And I do hope, like you said,
link |
01:06:04.420
there are aspects of ads that are annoying
link |
01:06:06.700
and I go to a website and if it just keeps popping
link |
01:06:10.260
an ad in my face, not letting me read,
link |
01:06:12.540
that's gonna be annoying clearly.
link |
01:06:13.860
So I hope we can strike that balance
link |
01:06:18.780
between showing a good ad where it's valuable to the user
link |
01:06:23.780
and provides the monetization to the service.
link |
01:06:29.740
And this might be search, this might be a website,
link |
01:06:32.460
all of these, they do need the monetization
link |
01:06:35.660
for them to provide that service.
link |
01:06:38.540
But if it's done in a good balance between
link |
01:06:43.660
showing just some random stuff that's distracting
link |
01:06:46.820
versus showing something that's actually valuable.
link |
01:06:49.660
So do you see it, moving forward, continuing to
link |
01:06:54.660
be a model that funds businesses like Google,
link |
01:07:00.340
as a significant revenue stream?
link |
01:07:04.380
Because that's one of the most exciting things
link |
01:07:07.420
but also limiting things about the internet:
link |
01:07:09.020
nobody wants to pay for anything.
link |
01:07:11.500
And advertisements, again, at their best,
link |
01:07:14.660
are actually really useful and not annoying.
link |
01:07:16.660
Do you see that continuing and growing and improving
link |
01:07:21.660
or is there, do you see sort of more Netflix type models
link |
01:07:26.140
where you have to start to pay for content?
link |
01:07:28.420
I think it's a mix.
link |
01:07:29.780
I think it's gonna take a long while for everything
link |
01:07:32.260
to be paid on the internet, if at all, probably not.
link |
01:07:35.580
I mean, I think there's always gonna be things
link |
01:07:37.220
that are sort of monetized with things like ads.
link |
01:07:40.180
But over the last few years, I would say
link |
01:07:42.220
we've definitely seen that transition towards
link |
01:07:45.340
more paid services across the web
link |
01:07:48.660
and people are willing to pay for them
link |
01:07:50.420
because they do see the value.
link |
01:07:51.740
I mean, Netflix is a great example.
link |
01:07:53.660
I mean, we have YouTube doing things.
link |
01:07:56.580
People pay for the apps they buy.
link |
01:07:58.780
More people I find are willing to pay for newspaper content
link |
01:08:03.140
for the good news websites across the web.
link |
01:08:07.260
That wasn't the case
link |
01:08:08.900
even a few years ago, I would say.
link |
01:08:11.060
And I just see that change in myself as well
link |
01:08:13.340
and just lots of people around me.
link |
01:08:14.860
So definitely hopeful that we'll transition
link |
01:08:17.220
to that mixed model where maybe you get
link |
01:08:20.900
to try something out for free, maybe with ads,
link |
01:08:24.180
but then there's a more clear revenue model
link |
01:08:27.420
that sort of helps go beyond that.
link |
01:08:30.660
So speaking of revenue, how is it that a person
link |
01:08:35.940
can use a TPU in a Google Colab for free?
link |
01:08:39.460
So what's the, I guess the question is,
link |
01:08:43.980
what's the future of TensorFlow in terms of empowering,
link |
01:08:48.940
say, a class of 300 students?
link |
01:08:51.940
And I'm asking for MIT, what is going to be the future
link |
01:08:56.940
of them being able to do their homework in TensorFlow?
link |
01:09:00.020
Like, where are they going to train these networks, right?
link |
01:09:02.860
What's that future look like with TPUs,
link |
01:09:06.460
with cloud services, and so on?
link |
01:09:08.980
I think a number of things there.
link |
01:09:10.300
I mean, TensorFlow is open source,
link |
01:09:12.660
you can run it wherever, you can run it on your desktop
link |
01:09:15.020
and your desktops always keep getting more powerful,
link |
01:09:17.500
so maybe you can do more.
link |
01:09:19.540
My phone is like, I don't know how many times
link |
01:09:21.420
more powerful than my first desktop.
link |
01:09:23.740
You'll probably train it on your phone though,
link |
01:09:25.220
yeah, that's true.
link |
01:09:26.260
Right, so in that sense, the power you have
link |
01:09:28.460
in your hands is a lot more.
link |
01:09:31.500
Clouds are actually very interesting from, say,
link |
01:09:34.420
students or courses perspective,
link |
01:09:36.940
because they make it very easy to get started.
link |
01:09:40.060
I mean, Colab, the great thing about it is,
link |
01:09:42.740
go to a website and it just works.
link |
01:09:45.180
No installation needed, nothing to,
link |
01:09:47.580
you're just there and things are working.
link |
01:09:50.020
That's really the power of cloud as well.
link |
01:09:52.300
And so I do expect that to grow.
link |
01:09:55.340
Again, Colab is a free service.
link |
01:09:57.940
It's great to get started, to play with things,
link |
01:10:00.900
to explore things.
link |
01:10:03.140
That said, with free, you can only get so much.
link |
01:10:06.140
You'd be, yeah.
link |
01:10:08.220
So just like we were talking about,
link |
01:10:10.140
free versus paid, yeah, there are services
link |
01:10:12.940
you can pay for and get a lot more.
link |
01:10:15.340
Great, so if I'm a complete beginner
link |
01:10:17.740
interested in machine learning and TensorFlow,
link |
01:10:19.980
what should I do?
link |
01:10:21.620
Probably start with going to our website
link |
01:10:23.540
and playing there.
link |
01:10:24.380
So just go to TensorFlow.org and start clicking on things.
link |
01:10:26.620
Yep, check out tutorials and guides.
link |
01:10:28.500
There's stuff you can just click there
link |
01:10:29.860
and go to a Colab and do things.
link |
01:10:31.340
No installation needed, you can get started right there.
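For a sense of scale, the beginner path is short: the kind of end-to-end example the tutorials walk through fits in a few lines and runs as-is in a Colab notebook. A minimal sketch, assuming TensorFlow 2.x:

    import tensorflow as tf

    # Load data, build a small Keras model, train, and evaluate.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    model.fit(x_train, y_train, epochs=5)
    model.evaluate(x_test, y_test)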
link |
01:10:34.100
Okay, awesome. Rajat, thank you so much for talking today.
link |
01:10:36.740
Thank you, Lex, it was great.