
Rajat Monga: TensorFlow | Lex Fridman Podcast #22



link |
00:00:00.000
The following is a conversation with Rajat Monga.
link |
00:00:03.080
He's an engineering director at Google,
link |
00:00:04.960
leading the TensorFlow team.
link |
00:00:06.960
TensorFlow is an open source library
link |
00:00:09.160
at the center of much of the work going on in the world
link |
00:00:11.520
in deep learning, both the cutting edge research
link |
00:00:14.040
and the large scale application of learning based approaches.
link |
00:00:17.720
But it's quickly becoming much more
link |
00:00:19.480
than a software library.
link |
00:00:20.960
It's now an ecosystem of tools for the deployment
link |
00:00:23.760
of machine learning in the cloud, on the phone,
link |
00:00:25.720
in the browser, on both generic and specialized hardware.
link |
00:00:29.840
TPU, GPU, and so on.
link |
00:00:31.920
Plus, there's a big emphasis on growing
link |
00:00:34.200
a passionate community of developers.
link |
00:00:36.600
Rajat, Jeff Dean, and a large team of engineers at Google
link |
00:00:39.760
Brain are working to define the future of machine learning
link |
00:00:42.720
with TensorFlow 2.0, which is now in alpha.
link |
00:00:46.200
I think the decision to open source TensorFlow
link |
00:00:49.120
is a definitive moment in the tech industry.
link |
00:00:51.720
It showed that open innovation can be successful
link |
00:00:54.360
and inspire many companies to open source their code,
link |
00:00:56.840
to publish, and in general engage in the open exchange
link |
00:00:59.640
of ideas.
link |
00:01:01.160
This conversation is part of the artificial intelligence
link |
00:01:03.880
podcast.
link |
00:01:05.000
If you enjoy it, subscribe on YouTube, iTunes,
link |
00:01:07.760
or simply connect with me on Twitter
link |
00:01:09.600
at Lex Fridman, spelled FRID.
link |
00:01:12.640
And now, here's my conversation with Rajat Monga.
link |
00:01:17.880
You were involved with Google Brain since its start in 2011
link |
00:01:22.440
with Jeff Dean.
link |
00:01:24.800
It started with DistBelief, the proprietary machine learning
link |
00:01:29.160
library, and turned into TensorFlow in 2014,
link |
00:01:32.760
the open source library.
link |
00:01:35.760
So what were the early days of Google Brain like?
link |
00:01:39.040
What were the goals, the missions?
link |
00:01:41.760
How do you even proceed forward once there are
link |
00:01:45.080
so many possibilities before you?
link |
00:01:47.680
It was interesting back then when I started,
link |
00:01:50.520
or when you were even just talking about it.
link |
00:01:55.320
The idea of deep learning was interesting
link |
00:01:58.800
and intriguing in some ways.
link |
00:02:00.400
It hadn't yet taken off, but it held some promise.
link |
00:02:04.840
It had shown some very promising and early results.
link |
00:02:08.680
I think the idea where Andrew and Jeff had started
link |
00:02:11.360
was what if we can take this, what people are doing in research,
link |
00:02:16.160
and scale it to what Google has in terms of the compute power,
link |
00:02:21.560
and also put that kind of data together, what does it mean?
link |
00:02:25.240
And so far, the results had been, if you scale the compute,
link |
00:02:28.240
scale the data, it does better, and would that work?
link |
00:02:31.480
And so that was the first year or two.
link |
00:02:33.360
Can we prove that outright?
link |
00:02:35.080
And with DistBelief, when we started the first year,
link |
00:02:37.440
we got some early wins, which is always great.
link |
00:02:40.760
What were the wins like?
link |
00:02:41.880
What were the wins where there was some promise to this?
link |
00:02:45.240
This is going to be good.
link |
00:02:46.560
I think the two early wins were one was speech
link |
00:02:49.640
that we collaborated very closely with the speech research
link |
00:02:52.200
team, who was also getting interested in this.
link |
00:02:54.760
And the other one was on images where
link |
00:02:57.760
the cat paper, as we call it, that was covered by a lot of folks.
link |
00:03:03.120
And the birth of Google Brain was around neural networks.
link |
00:03:07.440
So it was deep learning from the very beginning.
link |
00:03:09.280
That was the whole mission.
link |
00:03:10.760
So in terms of scale, what was the dream
link |
00:03:18.960
of what this could become?
link |
00:03:21.040
Were there echoes of this open source TensorFlow community
link |
00:03:24.280
that might be brought in?
link |
00:03:26.240
Was there a sense of TPUs?
link |
00:03:28.640
Was there a sense of machine learning
link |
00:03:31.120
is now going to be at the core of the entire company?
link |
00:03:33.680
Is it going to grow in that direction?
link |
00:03:36.040
Yeah, I think so that was interesting.
link |
00:03:38.320
And if I think back to 2012 or 2011,
link |
00:03:41.320
and the first was, can we scale it? In the year or so,
link |
00:03:45.240
we had started scaling it to hundreds and thousands
link |
00:03:47.520
of machines.
link |
00:03:48.080
In fact, we had some runs even going to 10,000 machines.
link |
00:03:51.040
And all of those showed great promise.
link |
00:03:53.840
In terms of machine learning at Google,
link |
00:03:56.760
the good thing was Google's been doing machine learning
link |
00:03:58.760
for a long time.
link |
00:04:00.200
Deep learning was new.
link |
00:04:02.120
But as we scale this up, we showed that, yes, that was
link |
00:04:05.000
possible, and it was going to impact lots of things.
link |
00:04:07.840
Like, we started seeing real products wanting to use this.
link |
00:04:11.160
Again, speech was the first.
link |
00:04:12.720
There were image things that Google Photos came out of,
link |
00:04:15.120
and many other products as well.
link |
00:04:17.360
So that was exciting.
link |
00:04:20.120
As we went on with that for a couple of years,
link |
00:04:23.120
externally also academia started to,
link |
00:04:25.760
there was lots of push on, OK, deep learning's
link |
00:04:27.760
interesting, we should be doing more, and so on.
link |
00:04:30.520
And so by 2014, we were looking at, OK, this is a big thing.
link |
00:04:35.560
It's going to grow.
link |
00:04:36.680
And not just internally, externally as well.
link |
00:04:39.400
Yes, maybe Google's ahead of where everybody is,
link |
00:04:42.240
but there's a lot to do.
link |
00:04:43.600
So a lot of this started to make sense and come together.
link |
00:04:46.640
So the decision to open source, I was just chatting with Chris
link |
00:04:51.080
Lattner about this, the decision to go open source
link |
00:04:53.360
with TensorFlow, I would say for me personally,
link |
00:04:57.040
seems to be one of the big seminal moments in all
link |
00:05:00.000
of software engineering ever.
link |
00:05:01.680
I think that when a large company like Google
link |
00:05:04.600
decides to take a large project that many lawyers might argue
link |
00:05:08.680
has a lot of IP, and just decides to go open source with it.
link |
00:05:12.840
And in so doing, leads the entire world in saying,
link |
00:05:15.200
you know what, open innovation is a pretty powerful thing.
link |
00:05:19.280
And it's OK to do.
link |
00:05:22.320
That was, I mean, that's an incredible moment in time.
link |
00:05:26.400
So do you remember those discussions happening?
link |
00:05:29.280
Whether open source should be happening?
link |
00:05:31.320
What was that like?
link |
00:05:32.600
I would say, I think, so the initial idea came from Jeff,
link |
00:05:36.840
who was a big proponent of this.
link |
00:05:39.320
I think it came off of two big things.
link |
00:05:42.400
One was research wise, we were a research group.
link |
00:05:46.280
We were putting all our research out there if you wanted to.
link |
00:05:50.240
We were building on others' research,
link |
00:05:51.680
and we wanted to push the state of the art forward.
link |
00:05:54.920
And part of that was to share the research.
link |
00:05:56.800
That's how I think deep learning and machine learning
link |
00:05:58.920
has really grown so fast.
link |
00:06:01.360
So the next step was, OK, now we need software
link |
00:06:04.280
to help with that.
link |
00:06:05.280
And it seemed like there existed a few libraries
link |
00:06:09.720
out there, Theano being one, Torch being another,
link |
00:06:12.160
and a few others.
link |
00:06:13.960
But they were all done by academia,
link |
00:06:15.400
and so the level was significantly different.
link |
00:06:19.000
The other one was, from a software perspective,
link |
00:06:22.040
Google had done lots of software that we used internally.
link |
00:06:27.120
And we published papers.
link |
00:06:29.120
Often there was an open source project
link |
00:06:31.680
that came out of that, that somebody else
link |
00:06:33.600
picked up that paper and implemented,
link |
00:06:35.440
and they were very successful.
link |
00:06:38.280
Back then, it was like, OK, there's
link |
00:06:40.920
Hadoop, which has come off of tech that we've built.
link |
00:06:44.200
We know that tech we've built is way better
link |
00:06:46.240
for a number of different reasons.
link |
00:06:47.880
We've invested a lot of effort in that.
link |
00:06:51.680
And turns out, we have Google Cloud,
link |
00:06:54.320
and we are now not really providing our tech,
link |
00:06:57.520
but we are saying, OK, we have Bigtable, which
link |
00:07:00.520
is the original thing.
link |
00:07:02.080
We are going to now provide HBase APIs on top of that, which
link |
00:07:05.280
isn't as good, but that's what everybody's used to.
link |
00:07:07.480
So there's like, can we make something that is better
link |
00:07:10.960
and really just provide?
link |
00:07:12.320
Helps the community in lots of ways,
link |
00:07:14.320
but it also helps push a good standard forward.
link |
00:07:18.320
So how does Cloud fit into that?
link |
00:07:19.960
There's a TensorFlow open source library.
link |
00:07:22.680
And how does the fact that you can
link |
00:07:25.800
use so many of the resources that Google provides
link |
00:07:28.240
and the Cloud fit into that strategy?
link |
00:07:31.480
So TensorFlow itself is open, and you can use it anywhere.
link |
00:07:34.920
And we want to make sure that continues to be the case.
link |
00:07:38.360
On Google Cloud, we do make sure that there's
link |
00:07:42.080
lots of integrations with everything else,
link |
00:07:43.800
and we want to make sure that it works really, really well there.
link |
00:07:47.280
You're leading the TensorFlow effort.
link |
00:07:50.080
Can you tell me the history and the timeline of TensorFlow
link |
00:07:52.360
project in terms of major design decisions,
link |
00:07:55.880
like the open source decision, but really, what to include
link |
00:08:01.240
and not?
link |
00:08:01.600
There's this incredible ecosystem that I'd
link |
00:08:03.600
like to talk about, there's all these parts.
link |
00:08:05.680
But if you could just give some sample moments that
link |
00:08:12.120
defined what TensorFlow eventually became through its,
link |
00:08:15.960
I don't know if you're allowed to say history when it's just,
link |
00:08:19.400
but in deep learning, everything moves so fast
link |
00:08:21.240
in just a few years, it's already history.
link |
00:08:23.400
Yes, yes.
link |
00:08:24.880
So looking back, we were building TensorFlow.
link |
00:08:29.760
I guess we open sourced it in 2015, November 2015.
link |
00:08:34.240
We started on it in summer of 2014, I guess.
link |
00:08:39.800
And somewhere like three to six months in, late 2014,
link |
00:08:42.960
by then we had decided that, OK, there's
link |
00:08:45.320
a high likelihood we'll open source it.
link |
00:08:47.080
So we started thinking about that and making sure
link |
00:08:49.560
that we're heading down that path.
link |
00:08:53.960
At that point, by that point, we'd
link |
00:08:56.280
seen lots of different use cases at Google.
link |
00:08:59.280
So there were things like, OK, yes,
link |
00:09:01.200
you want to run it at large scale in the data center.
link |
00:09:04.160
Yes, we need to support different kind of hardware.
link |
00:09:07.480
We had GPUs at that point.
link |
00:09:09.400
We had our first TPU at that point,
link |
00:09:11.880
or it was about to come out roughly around that time.
link |
00:09:15.760
So the design included those.
link |
00:09:18.640
We had started to push on mobile.
link |
00:09:21.760
So we were running models on mobile.
link |
00:09:24.880
At that point, people were customizing code.
link |
00:09:28.080
So we wanted to make sure TensorFlow could support that
link |
00:09:30.280
as well, so that became part of the overall
link |
00:09:34.120
design.
link |
00:09:35.200
When you say mobile, you mean like pretty complicated
link |
00:09:38.040
algorithms running on the phone?
link |
00:09:39.960
That's correct.
link |
00:09:40.480
So when you have a model that you
link |
00:09:42.680
deploy on the phone and run it there, right?
link |
00:09:45.200
So already at that time, there was ideas of running machine
link |
00:09:47.800
learning on the phone.
link |
00:09:48.720
That's correct.
link |
00:09:49.240
We already had a couple of products
link |
00:09:51.360
that were doing that by then.
link |
00:09:53.240
And in those cases, we had basically
link |
00:09:55.480
customized handcrafted code or some internal libraries
link |
00:09:59.280
that we were using.
link |
00:10:00.080
So I was actually at Google during this time in a parallel,
link |
00:10:03.280
I guess, universe.
link |
00:10:04.440
But we were using Theano and Caffe.
link |
00:10:09.240
Was there some degree to which you were bouncing,
link |
00:10:11.560
like trying to see what Caffe was offering people,
link |
00:10:15.440
trying to see what Theano was offering
link |
00:10:17.920
that you want to make sure you're delivering on whatever that
link |
00:10:21.320
is, perhaps the Python part of thing.
link |
00:10:23.680
Maybe did that influence any design decisions?
link |
00:10:27.440
Totally.
link |
00:10:27.880
So when we built DistBelief, and some of that
link |
00:10:30.840
was in parallel with some of these libraries
link |
00:10:32.920
coming up, I mean, Theano itself is older.
link |
00:10:36.600
But we were building DistBelief focused on our internal thing
link |
00:10:41.080
because our systems were very different.
link |
00:10:42.880
By the time we got to this, we looked
link |
00:10:44.480
at a number of libraries that were out there.
link |
00:10:47.040
Theano, there were folks in the group
link |
00:10:49.240
who had experience with Torch, with Lua.
link |
00:10:52.080
There were folks here who had seen Caffe.
link |
00:10:54.720
I mean, actually, Yangqing was here as well.
link |
00:10:58.800
What other libraries were there?
link |
00:11:02.960
I think we looked at a number of things.
link |
00:11:04.880
Might even have looked at Chainer back then.
link |
00:11:06.800
I'm trying to remember if it was there.
link |
00:11:09.320
In fact, yeah, we did discuss ideas around, OK,
link |
00:11:12.240
should we have a graph or not?
link |
00:11:15.280
And supporting all of these together
link |
00:11:19.280
was definitely, you know, there were key decisions
link |
00:11:21.880
that we wanted.
link |
00:11:22.560
We had seen limitations in our prior DistBelief things.
link |
00:11:28.680
A few of them were just in terms of research
link |
00:11:31.320
was moving so fast.
link |
00:11:32.280
We wanted the flexibility.
link |
00:11:34.520
The hardware was changing fast.
link |
00:11:36.280
We expected that to change, so those probably were two
link |
00:11:39.160
things.
link |
00:11:41.400
And yeah, I think the flexibility in terms
link |
00:11:43.320
of being able to express all kinds of crazy things
link |
00:11:45.280
was definitely a big one then.
link |
00:11:46.840
So on the graph decision, though,
link |
00:11:48.920
with moving towards TensorFlow 2.0, there's more,
link |
00:11:53.720
by default, there'll be eager execution.
link |
00:11:56.680
So sort of hiding the graph a little bit
link |
00:11:59.160
because it's less intuitive in terms of the way
link |
00:12:02.120
people develop and so on.
link |
00:12:03.520
What was that discussion like with in terms of using graphs?
link |
00:12:06.720
It seemed like it's kind of the Theano way.
link |
00:12:09.320
Did it seem the obvious choice?
link |
00:12:11.600
So I think where it came from was, our DistBelief
link |
00:12:15.720
had a graph-like thing as well.
link |
00:12:18.560
It wasn't a general graph.
link |
00:12:19.720
It was more like a straight line thing.
link |
00:12:23.160
More like what you might think of Caffe,
link |
00:12:25.000
I guess, in that sense.
link |
00:12:28.840
And we always cared about the production stuff.
link |
00:12:31.080
Even with DistBelief, we were deploying a whole bunch of stuff
link |
00:12:33.480
in production.
link |
00:12:34.440
So graph did come from that when we thought of, OK,
link |
00:12:37.960
should we do that in Python and we experimented with some ideas
link |
00:12:40.800
where it looked a lot simpler to use,
link |
00:12:44.680
but not having a graph meant, OK, how do you deploy now?
link |
00:12:47.880
So that was probably what tilted the balance for us.
link |
00:12:51.080
And eventually, we ended up with the graph.
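For readers who never used the graph-and-session style being described here, a minimal sketch of what that 1.x workflow looked like (written against the tf.compat.v1 API so it still runs under TensorFlow 2.x; the tiny matmul is purely illustrative). The graph is built symbolically first, and a Session then executes it, which is the property that made serializing and deploying models straightforward:

```python
import tensorflow as tf

# Graph mode: describe the computation first...
tf.compat.v1.disable_eager_execution()
x = tf.compat.v1.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf.Variable(tf.ones([4, 1]), name="w")
y = tf.matmul(x, w)  # just a node in the graph, nothing is computed yet

# ...then run it in a session, feeding in real data.
with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}))  # [[10.]]
```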
link |
00:12:52.880
And I guess the question there is, did you?
link |
00:12:55.320
I mean, production seems to be a really good thing to focus on.
link |
00:12:59.800
But did you even anticipate the other side of it
link |
00:13:02.400
where there could be, what is it?
link |
00:13:04.560
What are the numbers?
link |
00:13:05.240
Something crazy, 41 million downloads?
link |
00:13:08.920
Yep.
link |
00:13:12.680
I mean, was that even like a possibility in your mind
link |
00:13:16.240
that it would be as popular as it became?
link |
00:13:19.120
So I think we did see a need for this a lot
link |
00:13:24.880
from the research perspective and early days of deep learning
link |
00:13:30.000
in some ways.
link |
00:13:32.280
41 million?
link |
00:13:33.040
No, I don't think I imagined this number then.
link |
00:13:37.640
It seemed like there's a potential future where lots more people
link |
00:13:42.760
would be doing this.
link |
00:13:43.760
And how do we enable that?
link |
00:13:45.640
I would say this kind of growth, I probably
link |
00:13:49.560
started seeing somewhat after the open sourcing where it was
link |
00:13:53.680
like, OK, deep learning is actually
link |
00:13:56.240
growing way faster for a lot of different reasons.
link |
00:13:59.200
And we are in just the right place to push on that
link |
00:14:02.720
and leverage that and deliver on lots of things
link |
00:14:06.040
that people want.
link |
00:14:07.440
So what changed once it was open sourced?
link |
00:14:09.760
Like how this incredible amount of attention
link |
00:14:13.320
from a global population of developers,
link |
00:14:16.120
how did the projects start changing?
link |
00:14:18.200
I don't even actually remember it during those times.
link |
00:14:21.880
I know looking now, there's really good documentation.
link |
00:14:24.560
There's an ecosystem of tools.
link |
00:14:26.560
There's a YouTube channel now.
link |
00:14:31.080
It's very community driven.
link |
00:14:33.760
Back then, I guess 0.1 version.
link |
00:14:38.800
Is that the version?
link |
00:14:39.760
I think we called it 0.6 or 5, something like that.
link |
00:14:42.680
Something like that.
link |
00:14:43.720
What changed leading into 1.0?
link |
00:14:47.200
It's interesting.
link |
00:14:48.480
I think we've gone through a few things there.
link |
00:14:51.640
When we started out, when we first came out,
link |
00:14:53.680
people loved the documentation we have.
link |
00:14:56.080
Because it was just a huge step up from everything else.
link |
00:14:58.800
Because all of those were academic projects, people
link |
00:15:01.920
don't think about documentation.
link |
00:15:04.560
I think what that changed was instead of deep learning
link |
00:15:08.040
being a research thing, some people who were just developers
link |
00:15:12.560
could now suddenly take this out and do
link |
00:15:15.080
some interesting things with it.
link |
00:15:16.920
Who had no clue what machine learning was before then.
link |
00:15:20.720
And that, I think, really changed
link |
00:15:22.520
how things started to scale up in some ways and pushed on it.
link |
00:15:27.880
Over the next few months, as we looked at,
link |
00:15:30.400
how do we stabilize things?
link |
00:15:31.960
As we look at not just researchers,
link |
00:15:33.840
now we want stability.
link |
00:15:34.880
People want to deploy things.
link |
00:15:36.480
That's how we started planning for 1.0.
link |
00:15:38.960
And there are certain needs for that perspective.
link |
00:15:42.240
And so, again, documentation comes up,
link |
00:15:45.320
designs, more kinds of things to put that together.
link |
00:15:49.480
And so that was exciting to get that to a stage where
link |
00:15:53.120
more and more enterprises wanted to buy in and really
link |
00:15:56.400
get behind that.
link |
00:15:58.720
And I think post 1.0 and with the next few releases,
link |
00:16:02.640
their enterprise adoption also started to take off.
link |
00:16:05.240
I would say between the initial release and 1.0,
link |
00:16:08.000
it was, OK, researchers, of course.
link |
00:16:11.000
Then a lot of hobbyists and early interest,
link |
00:16:13.720
people excited about this who started to get on board.
link |
00:16:15.920
And then over the 1.x thing, lots of enterprises.
link |
00:16:19.000
I imagine anything that's below 1.0
link |
00:16:23.760
gets pressured to be enterprise ready or something
link |
00:16:27.160
that's stable.
link |
00:16:28.000
Exactly.
link |
00:16:28.800
And do you have a sense now that TensorFlow is stable?
link |
00:16:33.360
It feels like deep learning, in general,
link |
00:16:35.520
is an extremely dynamic field.
link |
00:16:37.800
So much is changing.
link |
00:16:39.680
Do you have a, and TensorFlow has been growing incredibly.
link |
00:16:43.400
Do you have a sense of stability at the helm of this?
link |
00:16:46.720
I mean, I know you're in the midst of it.
link |
00:16:48.360
Yeah.
link |
00:16:50.360
I think in the midst of it, it's often easy to forget what
link |
00:16:54.000
an enterprise wants and what some of the people on that side
link |
00:16:58.160
want.
link |
00:16:58.760
There are still people running models
link |
00:17:00.360
that are three years old, four years old.
link |
00:17:02.640
So Inception is still used by tons of people.
link |
00:17:06.000
Even ResNet-50 is what, a couple of years old now or more.
link |
00:17:08.880
But there are tons of people who use that, and they're fine.
link |
00:17:12.200
They don't need the last couple of bits of performance or quality.
link |
00:17:16.200
They want some stability in things that just work.
link |
00:17:19.600
And so there is value in providing that with that kind
link |
00:17:22.720
of stability and making it really simpler,
link |
00:17:25.160
because that allows a lot more people to access it.
link |
00:17:27.800
And then there's the research crowd, which wants, OK,
link |
00:17:31.640
they want to do these crazy things exactly like you're
link |
00:17:33.680
saying, not just deep learning in the straight up models
link |
00:17:37.000
that used to be there.
link |
00:17:38.400
They want RNNs, and even RNNs are maybe old.
link |
00:17:41.920
There are Transformers now, and now it
link |
00:17:45.520
needs to combine with RL and GANs and so on.
link |
00:17:48.720
So there's definitely that area, the boundary that's
link |
00:17:52.160
shifting and pushing the state of the art.
link |
00:17:55.120
But I think there's more and more of the past
link |
00:17:57.120
that's much more stable.
link |
00:17:59.680
And even stuff that was two, three years old
link |
00:18:02.680
is very, very usable by lots of people.
link |
00:18:04.920
So that part makes it a lot easier.
link |
00:18:07.440
So I imagine maybe you can correct me if I'm wrong.
link |
00:18:09.800
One of the biggest use cases is essentially
link |
00:18:12.440
taking something like ResNet 50 and doing
link |
00:18:15.160
some kind of transfer learning on a very particular problem
link |
00:18:18.520
that you have.
link |
00:18:19.600
It's basically probably what the majority of the world does.
link |
00:18:24.480
And you want to make that as easy as possible.
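To make that concrete, here is a minimal sketch of the transfer learning pattern being described, using tf.keras (the five-class head and the train_ds dataset are placeholders for whatever your particular problem is):

```python
import tensorflow as tf

# Load ResNet-50 pretrained on ImageNet, without its classification head.
base = tf.keras.applications.ResNet50(include_top=False,
                                      weights="imagenet",
                                      input_shape=(224, 224, 3),
                                      pooling="avg")
base.trainable = False  # freeze the pretrained features

# Add a small head for your own task (here, a made-up 5-class problem).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds is assumed to be a tf.data.Dataset of (image, label) batches
# that you build for your particular problem.
# model.fit(train_ds, epochs=5)
```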
link |
00:18:27.040
So I would say, for the hobbyist perspective,
link |
00:18:30.400
that's the most common case.
link |
00:18:32.800
In fact, the apps on phones and stuff
link |
00:18:34.800
that you'll see, the early ones, that's the most common case.
link |
00:18:37.680
I would say there are a couple of reasons for that.
link |
00:18:40.320
One is that everybody talks about that.
link |
00:18:44.400
It looks great on slides.
link |
00:18:46.120
That's a great presentation.
link |
00:18:48.120
Exactly.
link |
00:18:50.040
What enterprises want is that is part of it,
link |
00:18:53.120
but that's not the big thing.
link |
00:18:54.480
Enterprises really have data that they
link |
00:18:56.760
want to make predictions on.
link |
00:18:58.040
Often, what they used to do with the people who
link |
00:19:01.160
were doing ML was just regression models,
link |
00:19:03.600
linear regression, logistic regression, linear models,
link |
00:19:06.440
or maybe gradient boosted trees and so on.
link |
00:19:09.800
Some of them still benefit from deep learning,
link |
00:19:11.760
but that's the bread and butter,
link |
00:19:14.440
like the structured data and so on.
link |
00:19:16.280
So depending on the audience you look at,
link |
00:19:18.200
they're a little bit different.
link |
00:19:19.520
And they just have, I mean, the best of enterprise
link |
00:19:23.320
probably just has a very large data set
link |
00:19:26.480
where deep learning can probably shine.
link |
00:19:28.640
That's correct.
link |
00:19:29.360
That's right.
link |
00:19:30.320
And then I think the other pieces
link |
00:19:32.240
that they wanted, again, to point out
link |
00:19:34.560
that the developer summit we put together
link |
00:19:36.400
is that the whole TensorFlow Extended
link |
00:19:38.200
piece, which is the entire pipeline,
link |
00:19:40.600
they care about stability across doing their entire thing.
link |
00:19:43.560
They want simplicity across the entire thing.
link |
00:19:46.200
I don't need to just train a model.
link |
00:19:47.680
I need to do that every day again, over and over again.
link |
00:19:51.280
I wonder to which degree you have a role in, I don't know.
link |
00:19:54.720
So I teach a course on deep learning.
link |
00:19:57.040
I have people like lawyers come up to me and say,
link |
00:20:01.320
when is machine learning going to enter legal,
link |
00:20:04.200
the legal realm?
link |
00:20:05.560
The same thing in all kinds of disciplines, immigration,
link |
00:20:11.720
insurance.
link |
00:20:13.800
Often when I see what it boils down to is these companies
link |
00:20:17.400
are often a little bit old school in the way
link |
00:20:19.760
they organize the data.
link |
00:20:20.840
So the data is just not ready yet.
link |
00:20:22.800
It's not digitized.
link |
00:20:24.040
Do you also find yourself being in the role of an evangelist
link |
00:20:28.160
for let's organize your data, folks,
link |
00:20:33.040
and then you'll get the big benefit of TensorFlow?
link |
00:20:35.440
Do you have those conversations?
link |
00:20:38.000
Yeah, I get all kinds of questions there from, OK,
link |
00:20:45.160
what do I need to make this work, right?
link |
00:20:49.000
Do we really need deep learning?
link |
00:20:50.800
I mean, there are all these things.
link |
00:20:52.240
I already used this linear model.
link |
00:20:54.000
Why would this help?
link |
00:20:55.160
I don't have enough data, let's say.
link |
00:20:57.160
Or I want to use machine learning,
link |
00:20:59.960
but I have no clue where to start.
link |
00:21:01.760
So it's a great start to all the way to the experts
link |
00:21:04.920
who ask very specific things, so it's interesting.
link |
00:21:08.520
Is there a good answer?
link |
00:21:09.600
It boils down to oftentimes digitizing data.
link |
00:21:12.480
So whatever you want automated, whatever data
link |
00:21:15.240
you want to make prediction based on,
link |
00:21:17.480
you have to make sure that it's in an organized form.
link |
00:21:21.240
Like with the TensorFlow ecosystem,
link |
00:21:23.920
you're now providing more and more data
link |
00:21:26.080
sets and more and more pretrained models.
link |
00:21:28.960
Are you finding yourself also the organizer of data sets?
link |
00:21:32.400
Yes, I think with TensorFlow data sets
link |
00:21:34.480
that we just released, that's definitely come up where people
link |
00:21:38.360
want these data sets.
link |
00:21:39.200
Can we organize them and can we make that easier?
link |
00:21:41.560
So that's definitely one important thing.
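For a sense of what TensorFlow Datasets gives you in practice, a minimal sketch (using MNIST as a stand-in for whichever dataset you actually need):

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# TFDS handles the download, parsing, and splits, and returns tf.data.Datasets.
train_ds, test_ds = tfds.load("mnist",
                              split=["train", "test"],
                              as_supervised=True)  # yields (image, label) pairs

# Standard input-pipeline steps before feeding a model.
train_ds = (train_ds
            .map(lambda img, lbl: (tf.cast(img, tf.float32) / 255.0, lbl))
            .shuffle(10_000)
            .batch(32)
            .prefetch(tf.data.experimental.AUTOTUNE))
```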
link |
00:21:45.320
The other related thing I would say is I often tell people,
link |
00:21:47.680
you know what, don't think of the most fanciest thing
link |
00:21:50.960
that the newest model that you see.
link |
00:21:53.320
Make something very basic work, and then
link |
00:21:55.480
you can improve it.
link |
00:21:56.360
There's just lots of things you can do with it.
link |
00:21:58.840
Yeah, start with the basics.
link |
00:22:00.080
Sure.
link |
00:22:00.580
One of the big things that makes TensorFlow even more
link |
00:22:03.760
accessible was the appearance, whenever
link |
00:22:06.440
that happened, of Keras, the Keras standard outside of TensorFlow.
link |
00:22:12.400
I think it was Keras on top of Theano at first only,
link |
00:22:18.200
and then Keras became on top of TensorFlow.
link |
00:22:22.480
Do you know when Keras chose to also add TensorFlow as a back end,
link |
00:22:29.840
Was it just the community that drove that initially?
link |
00:22:33.960
Do you know if there was discussions, conversations?
link |
00:22:37.000
Yeah, so Francois started the Keras project
link |
00:22:40.920
before he was at Google, and the first thing was Theano.
link |
00:22:44.560
I don't remember if that was after TensorFlow
link |
00:22:47.120
was created or way before.
link |
00:22:49.640
And then at some point, when TensorFlow
link |
00:22:52.000
started becoming popular, there were enough similarities
link |
00:22:54.160
that he decided to create this interface
link |
00:22:56.320
and put TensorFlow as a back end.
link |
00:22:59.200
I believe that might still have been before he joined Google.
link |
00:23:03.320
So we weren't really talking about that.
link |
00:23:06.720
He decided on his own and thought that was interesting
link |
00:23:09.720
and relevant to the community.
link |
00:23:12.760
In fact, I didn't find out about him being at Google
link |
00:23:17.080
until a few months after he was here.
link |
00:23:19.680
He was working on some research ideas.
link |
00:23:21.840
And doing Keras as his nights and weekends project and stuff.
link |
00:23:24.480
I wish this thing.
link |
00:23:25.280
So he wasn't part of the TensorFlow.
link |
00:23:28.480
He didn't join initially.
link |
00:23:29.680
He joined research, and he was doing some amazing research.
link |
00:23:32.240
He has some papers on that and research.
link |
00:23:35.440
He's a great researcher as well.
link |
00:23:38.400
And at some point, we realized, oh, he's doing this good stuff.
link |
00:23:42.400
People seem to like the API, and he's right here.
link |
00:23:45.480
So we talked to him, and he said, OK,
link |
00:23:48.280
why don't I come over to your team
link |
00:23:50.600
and work with you for a quarter?
link |
00:23:52.800
And let's make that integration happen.
link |
00:23:55.440
And we talked to his manager, and he said, sure,
link |
00:23:57.200
a quarter's fine.
link |
00:23:59.720
And that quarter's been something like two years now.
link |
00:24:03.320
So he's fully on this.
link |
00:24:05.040
So Keras got integrated into TensorFlow in a deep way.
link |
00:24:12.000
And now with TensorFlow 2.0, Keras
link |
00:24:15.920
is kind of the recommended way for a beginner
link |
00:24:19.400
to interact with TensorFlow, which
link |
00:24:21.960
makes that initial sort of transfer learning
link |
00:24:24.640
or the basic use cases, even for an enterprise,
link |
00:24:28.040
super simple, right?
link |
00:24:29.320
That's correct.
link |
00:24:29.920
That's right.
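To show what that beginner path looks like in 2.0, a minimal tf.keras sketch (the MNIST-shaped model and the x_train/y_train arrays are just illustrative):

```python
import tensorflow as tf

# Define, compile, and train a model in a few lines with tf.keras.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x_train / y_train are assumed to be your own arrays of images and labels.
# model.fit(x_train, y_train, epochs=5)
# model.evaluate(x_test, y_test)
```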
link |
00:24:30.440
So what was that decision like?
link |
00:24:32.040
That seems like it's kind of a bold decision as well.
link |
00:24:38.640
We did spend a lot of time thinking about that one.
link |
00:24:41.200
We had a bunch of APIs, some built by us.
link |
00:24:46.000
There was a parallel layers API that we were building
link |
00:24:48.760
and when we decided to do Keras in parallel,
link |
00:24:51.560
so they were like, OK, two things that we are looking at.
link |
00:24:54.400
And the first thing we were trying to do
link |
00:24:55.960
is just have them look similar, be as integrated as possible,
link |
00:25:00.080
share all of that stuff.
link |
00:25:02.200
There were also three other APIs that others had built over time
link |
00:25:05.800
because we didn't have a standard one.
link |
00:25:09.000
But one of the messages that we kept hearing from the community,
link |
00:25:12.080
OK, which one do we use?
link |
00:25:13.200
And they kept saying, OK, here's a model in this one,
link |
00:25:15.560
and here's a model in this one, which should I pick?
link |
00:25:18.840
So that's sort of like, OK, we had to address that
link |
00:25:22.680
straight on with 2.0.
link |
00:25:24.000
The whole idea was we need to simplify.
link |
00:25:26.320
We had to pick one.
link |
00:25:28.600
Based on where we were, we were like, OK, let's see what
link |
00:25:34.600
the people like.
link |
00:25:35.640
And Keras was clearly one that lots of people loved.
link |
00:25:39.280
There were lots of great things about it.
link |
00:25:41.600
So we settled on that.
link |
00:25:43.880
Organically.
link |
00:25:44.680
That's kind of the best way to do it.
link |
00:25:46.560
It was great.
link |
00:25:47.160
But it was surprising, nevertheless,
link |
00:25:48.720
to sort of bring in an outsider.
link |
00:25:51.120
I mean, there was a feeling like Keras might be almost
link |
00:25:54.440
like a competitor, in a certain kind of way, to TensorFlow.
link |
00:25:58.000
And in a sense, it became an empowering element
link |
00:26:01.320
of TensorFlow.
link |
00:26:02.200
That's right.
link |
00:26:03.280
Yeah, it's interesting how you can put two things together
link |
00:26:07.200
which can align right.
link |
00:26:08.280
And in this case, I think Francois, the team,
link |
00:26:11.760
and a bunch of us have chatted and I think we all
link |
00:26:15.480
want to see the same kind of things.
link |
00:26:17.320
We all care about making it easier for the huge set
link |
00:26:20.360
of developers out there.
link |
00:26:21.440
And that makes a difference.
link |
00:26:23.440
So Python has Guido van Rossum, who
link |
00:26:27.280
until recently held the position of benevolent
link |
00:26:30.320
dictator for life.
link |
00:26:31.960
Right, so does a huge successful open source
link |
00:26:36.040
project like TensorFlow
link |
00:26:37.320
need one person who makes a final decision?
link |
00:26:40.680
So you did a pretty successful TensorFlow Dev Summit
link |
00:26:45.480
just now, last couple of days.
link |
00:26:47.520
There's clearly a lot of different new features
link |
00:26:51.080
being incorporated, an amazing ecosystem, and so on.
link |
00:26:55.480
How are those design decisions made?
link |
00:26:57.320
Is there a BDFL in TensorFlow?
link |
00:27:00.960
And or is it more distributed and organic?
link |
00:27:05.800
I think it's somewhat different, I would say.
link |
00:27:09.880
I've always been involved in the key design directions.
link |
00:27:16.160
But there are lots of things that
link |
00:27:17.560
are distributed, where there are a number of people, Martin
link |
00:27:20.960
Wicke being one who has really driven a lot of our open source
link |
00:27:24.760
stuff, a lot of the APIs.
link |
00:27:27.360
And there are a number of other people
link |
00:27:29.200
who have pushed and been responsible
link |
00:27:32.720
for different parts of it.
link |
00:27:35.240
We do have regular design reviews.
link |
00:27:37.840
Over the last year, we've really spent a lot of time opening up
link |
00:27:40.680
to the community and adding transparency.
link |
00:27:44.160
We're setting more processes in place,
link |
00:27:45.880
so RFCs, special interest groups, really
link |
00:27:49.600
grow that community and scale that.
link |
00:27:53.560
I think at the kind of scale this ecosystem is at,
link |
00:27:57.680
I don't think we could scale with having me as the lone
link |
00:28:00.240
point of decision maker.
link |
00:28:02.320
I got it.
link |
00:28:03.440
So yeah, the growth of that ecosystem,
link |
00:28:05.880
maybe you can talk about it a little bit.
link |
00:28:08.040
First of all, when I started with Andrej Karpathy,
link |
00:28:10.720
when he first did ConvNetJS, the fact
link |
00:28:13.640
that you can train your own neural network
link |
00:28:15.360
in the browser in JavaScript was incredible.
link |
00:28:18.480
So now TensorFlow.js is really making
link |
00:28:21.000
that a serious, a legit thing, a way
link |
00:28:26.920
to operate, whether it's in the back end or the front end.
link |
00:28:29.560
Then there's the TensorFlow Extended, like you mentioned.
link |
00:28:32.720
There's TensorFlow Lite for mobile.
link |
00:28:35.360
And all of it, as far as I can tell,
link |
00:28:37.480
it's really converging towards being
link |
00:28:39.640
able to save models in the same kind of way.
link |
00:28:43.440
You can move around, you can train on the desktop,
link |
00:28:46.680
and then move it to mobile, and so on.
link |
00:28:48.800
That's right.
link |
00:28:49.280
So this is that cohesiveness.
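A minimal sketch of that train-once, deploy-anywhere flow (the toy model and the /tmp paths are just placeholders):

```python
import tensorflow as tf

# Assume `model` is any trained tf.keras model.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Export it in the SavedModel format the rest of the ecosystem understands.
model.save("/tmp/my_model")  # equivalently: tf.saved_model.save(model, path)

# Convert the same SavedModel for TensorFlow Lite on a phone.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/my_model")
with open("/tmp/my_model.tflite", "wb") as f:
    f.write(converter.convert())

# TensorFlow.js has an analogous converter (the tensorflowjs_converter tool)
# that takes the same SavedModel and targets the browser.
```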
link |
00:28:52.320
So can you maybe give me whatever
link |
00:28:55.240
I missed, a bigger overview of the mission of the ecosystem
link |
00:28:58.840
that's trying to be built, and where is it moving forward?
link |
00:29:02.120
Yeah.
link |
00:29:02.800
So in short, the way I like to think of this
link |
00:29:05.720
is, our goal is to enable machine learning.
link |
00:29:09.760
And in a couple of ways, one is we
link |
00:29:13.320
have lots of exciting things going on in ML today.
link |
00:29:16.560
We started with deep learning, but we now
link |
00:29:18.160
support a bunch of other algorithms too.
link |
00:29:21.400
So one is to, on the research side,
link |
00:29:23.760
keep pushing on the state of the art.
link |
00:29:25.360
Can we, how do we enable researchers
link |
00:29:27.240
to build the next amazing thing?
link |
00:29:28.960
So BERT came out recently.
link |
00:29:31.800
It's great that people are able to do new kinds of research.
link |
00:29:34.000
There are lots of amazing research
link |
00:29:35.400
that happens across the world.
link |
00:29:37.600
So that's one direction.
link |
00:29:38.880
The other is, how do you take that
link |
00:29:41.400
across all the people outside who want to take that research
link |
00:29:45.200
and do some great things with it and integrate it
link |
00:29:47.400
to build real products, to have a real impact on people?
link |
00:29:51.800
And so that's the other axis in some ways.
link |
00:29:56.720
And a high level, one way I think about it
link |
00:29:58.520
is, there are a crazy number of compute devices
link |
00:30:02.480
across the world.
link |
00:30:04.240
And we often used to think of ML and training and all of this
link |
00:30:08.440
as, OK, something you do either in the workstation
link |
00:30:10.800
or the data center or cloud.
link |
00:30:13.600
But we see things running on the phones.
link |
00:30:15.720
We see things running on really tiny chips.
link |
00:30:17.640
And we had some demos at the developer summit.
link |
00:30:20.760
And so the way I think about this ecosystem
link |
00:30:25.160
is, how do we help get machine learning on every device that
link |
00:30:30.280
has a compute capability?
link |
00:30:32.520
And that continues to grow.
link |
00:30:33.760
And so in some ways, this ecosystem
link |
00:30:37.240
has looked at various aspects of that
link |
00:30:40.280
and grown over time to cover more of those.
link |
00:30:42.440
And we continue to push the boundaries.
link |
00:30:44.640
In some areas, we've built more tooling and things
link |
00:30:48.640
around that to help you.
link |
00:30:50.040
I mean, the first tool we started was TensorBoard,
link |
00:30:52.800
if you want to learn just the training piece; TFX,
link |
00:30:56.920
or TensorFlow Extended, to really do your entire ML
link |
00:30:59.840
pipelines if you care about all that production stuff,
link |
00:31:04.760
but then going to the edge, going to different kinds of things.
link |
00:31:09.520
And it's not just us now.
link |
00:31:11.800
We are a place where there are lots of libraries being built
link |
00:31:15.120
on top.
link |
00:31:15.840
So there are some for research, maybe things
link |
00:31:18.440
like TensorFlow Agents or TensorFlow Probability that
link |
00:31:21.240
started as research things or for researchers
link |
00:31:23.480
for focusing on certain kinds of algorithms,
link |
00:31:26.160
but they're also being deployed or used by production folks.
link |
00:31:30.280
And some have come from within Google, just teams
link |
00:31:34.000
across Google who wanted to build these things.
link |
00:31:37.040
Others have come from just the community
link |
00:31:39.680
because there are different pieces
link |
00:31:41.840
that different parts of the community care about.
link |
00:31:44.640
And I see our goal as enabling even that.
link |
00:31:49.520
We cannot and won't build every single thing.
link |
00:31:53.240
That just doesn't make sense.
link |
00:31:54.840
But if we can enable others to build the things
link |
00:31:57.320
that they care about, and there's a broader community that
link |
00:32:00.640
cares about that, and we can help encourage that,
link |
00:32:02.880
and that's great.
link |
00:32:05.240
That really helps the entire ecosystem, not just those.
link |
00:32:08.600
One of the big things about 2.0 that we're pushing on
link |
00:32:11.280
is, OK, we have these so many different pieces, right?
link |
00:32:14.680
How do we help make all of them work well together?
link |
00:32:18.440
There are a few key pieces there that we're pushing on,
link |
00:32:21.960
one being the core format in there
link |
00:32:23.840
and how we share the models themselves through SavedModel
link |
00:32:27.480
and with TensorFlow Hub and so on.
link |
00:32:30.440
And a few of the pieces that we really put this together.
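As one illustration of what sharing through that format looks like, a sketch of pulling a published module from TensorFlow Hub into a Keras model (the text-embedding handle below is one of the publicly listed modules; any compatible handle works the same way):

```python
import tensorflow as tf
import tensorflow_hub as hub

# A pretrained text embedding, loaded as an ordinary Keras layer.
embedding = hub.KerasLayer("https://tfhub.dev/google/nnlm-en-dim50/2",
                           input_shape=[], dtype=tf.string, trainable=False)

model = tf.keras.Sequential([
    embedding,                                  # strings in, 50-d vectors out
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```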
link |
00:32:34.000
I was very skeptical when TensorFlow.js came out,
link |
00:32:37.240
it didn't seem, or deeplearn.js.
link |
00:32:40.120
Yeah, that was the first.
link |
00:32:41.680
It seems like a technically very difficult project.
link |
00:32:45.040
As a standalone, it's not as difficult.
link |
00:32:47.040
But as a thing that integrates into the ecosystem,
link |
00:32:49.920
it seems very difficult.
link |
00:32:51.240
So I mean, there's a lot of aspects of this
link |
00:32:53.200
you're making look easy.
link |
00:32:54.200
But on the technical side, how many challenges
link |
00:32:58.160
have to be overcome here?
link |
00:33:00.560
A lot.
link |
00:33:01.520
And still have to be overcome.
link |
00:33:03.080
That's the question here, too.
link |
00:33:04.840
There are lots of steps to it.
link |
00:33:06.160
I think we've iterated over the last few years,
link |
00:33:08.160
so there's a lot we've learned.
link |
00:33:10.720
I, yeah, and often when things come together well,
link |
00:33:14.200
things look easy.
link |
00:33:15.080
And that's exactly the point.
link |
00:33:16.400
It should be easy for the end user.
link |
00:33:18.280
But there are lots of things that go behind that.
link |
00:33:21.320
If I think about still challenges ahead,
link |
00:33:25.320
we have a lot more devices coming on board,
link |
00:33:32.880
for example, from the hardware perspective.
link |
00:33:35.280
How do we make it really easy for these vendors
link |
00:33:37.640
to integrate with something like TensorFlow?
link |
00:33:42.040
So there's a lot of compiler stuff
link |
00:33:43.640
that others are working on.
link |
00:33:45.320
There are things we can do in terms of our APIs
link |
00:33:48.320
and so on that we can do.
link |
00:33:50.520
As we, TensorFlow started as a very monolithic system.
link |
00:33:55.840
And to some extent, it still is.
link |
00:33:57.680
There are lots of tools around it,
link |
00:33:59.400
but the core is still pretty large and monolithic.
link |
00:34:02.960
One of the key challenges for us to scale that out
link |
00:34:05.760
is how do we break that apart with clear interfaces?
link |
00:34:10.440
It's, in some ways, software engineering 101,
link |
00:34:13.720
but for a system that's now four years old, I guess,
link |
00:34:18.520
or more, and that's still rapidly evolving
link |
00:34:21.600
and that we're not slowing down with,
link |
00:34:24.000
it's hard to change and modify and really break apart.
link |
00:34:28.240
It's sort of like, as people say, right,
link |
00:34:29.880
it's like changing the engine with a car running
link |
00:34:32.560
or fixed benefits.
link |
00:34:33.560
That's exactly what we're trying to do.
link |
00:34:35.200
So there's a challenge here, because the downside
link |
00:34:39.960
of so many people being excited about TensorFlow
link |
00:34:43.840
and coming to rely on it in many other applications
link |
00:34:48.600
is that you're kind of responsible.
link |
00:34:52.200
It's the technical debt.
link |
00:34:53.520
You're responsible for previous versions
link |
00:34:55.640
to some degree still working.
link |
00:34:57.720
So when you're trying to innovate,
link |
00:34:59.920
I mean, it's probably easier to just start from scratch
link |
00:35:03.760
every few months.
link |
00:35:05.800
Absolutely.
link |
00:35:07.160
So do you feel the pain of that?
link |
00:35:10.880
2.0 does break some backward compatibility, but not too much.
link |
00:35:15.360
It seems like the conversion is pretty straightforward.
link |
00:35:18.120
Do you think that's still important,
link |
00:35:20.240
given how quickly deep learning is changing?
link |
00:35:22.880
Can you just, the things that you've learned,
link |
00:35:26.360
can you just start over?
link |
00:35:27.440
Or is there pressure to not?
link |
00:35:30.120
It's a tricky balance.
link |
00:35:31.640
So if it was just a researcher writing a paper who
link |
00:35:36.840
a year later will not look at that code again,
link |
00:35:39.400
sure, it doesn't matter.
link |
00:35:41.560
There are a lot of production systems
link |
00:35:43.440
that rely on TensorFlow, both at Google
link |
00:35:45.480
and across the world.
link |
00:35:47.240
And people worry about this.
link |
00:35:49.760
I mean, these systems run for a long time.
link |
00:35:53.400
So it is important to keep that compatibility and so on.
link |
00:35:57.240
And yes, it does come with a huge cost.
link |
00:36:00.960
We have to think about a lot of things
link |
00:36:02.920
as we do new things and make new changes.
link |
00:36:06.960
I think it's a trade off, right?
link |
00:36:09.120
You can, you might slow certain kinds of things down,
link |
00:36:12.960
but the overall value you're bringing because of that
link |
00:36:15.440
is much bigger because it's not just
link |
00:36:18.440
about not breaking the person from yesterday.
link |
00:36:20.520
It's also about telling the person tomorrow that, you know what?
link |
00:36:24.840
This is how we do things.
link |
00:36:26.320
We're not going to break you when you come on board
link |
00:36:28.520
because there are lots of new people who are also
link |
00:36:30.320
going to come on board.
link |
00:36:32.880
So one way I like to think about this,
link |
00:36:34.680
and I always push the team to think about it as well,
link |
00:36:37.960
when you want to do new things, you
link |
00:36:39.640
want to start with a clean slate,
link |
00:36:42.000
design with a clean slate in mind,
link |
00:36:44.880
and then we'll figure out how to make sure all the other things
link |
00:36:48.160
work.
link |
00:36:48.640
And yes, we do make compromises occasionally.
link |
00:36:52.160
But unless you design with the clean slate
link |
00:36:55.200
and not worry about that, you'll never get to a good place.
link |
00:36:58.400
That's brilliant.
link |
00:36:59.120
So even if you are responsible in the idea stage,
link |
00:37:04.080
when you're thinking of something new, just put all that behind you.
link |
00:37:07.680
OK, that's really well put.
link |
00:37:09.600
So I have to ask this because a lot of students, developers,
link |
00:37:12.480
ask me how I feel about PyTorch versus TensorFlow.
link |
00:37:16.280
So I've recently completely switched my research group
link |
00:37:19.720
to TensorFlow.
link |
00:37:20.920
I wish everybody would just use the same thing.
link |
00:37:23.280
And TensorFlow is as close to that, I believe, as we have.
link |
00:37:26.960
But do you enjoy competition?
link |
00:37:32.000
So TensorFlow is leading in many ways, many dimensions
link |
00:37:35.800
in terms of the ecosystem, in terms of the number of users,
link |
00:37:39.000
momentum power, production level, so on.
link |
00:37:41.200
But a lot of researchers are now also using PyTorch.
link |
00:37:46.000
Do you enjoy that kind of competition,
link |
00:37:47.520
or do you just ignore it and focus
link |
00:37:49.440
on making TensorFlow the best that it can be?
link |
00:37:52.320
So just like research or anything people are doing,
link |
00:37:55.480
it's great to get different kinds of ideas.
link |
00:37:58.120
And when we started with TensorFlow,
link |
00:38:01.440
like I was saying earlier, it was very important for us
link |
00:38:05.480
to also have production in mind.
link |
00:38:07.440
We didn't want just research, right?
link |
00:38:08.960
And that's why we chose certain things.
link |
00:38:11.280
Now PyTorch came along and said, you know what?
link |
00:38:13.480
I only care about research.
link |
00:38:14.880
This is what I'm trying to do.
link |
00:38:16.320
What's the best thing I can do for this?
link |
00:38:18.400
And it started iterating and said, OK,
link |
00:38:21.120
I don't need to worry about graphs.
link |
00:38:22.520
Let me just run things.
link |
00:38:25.200
I don't care if it's not as fast as it can be,
link |
00:38:27.440
but let me just make this part easy.
link |
00:38:30.480
And there are things you can learn from that, right?
link |
00:38:32.560
They, again, had the benefit of seeing what had come before,
link |
00:38:36.720
but also exploring certain different kinds of spaces.
link |
00:38:40.520
And they had some good things there,
link |
00:38:43.560
building on, say, things like Chainer and so on before that.
link |
00:38:46.680
So competition is definitely interesting.
link |
00:38:49.320
It made us, you know, this is an area
link |
00:38:51.040
that we had thought about, like I said, very early on.
link |
00:38:53.720
Over time, we had revisited this a couple of times.
link |
00:38:56.600
Should we add this again?
link |
00:38:59.000
At some point, we said, you know what,
link |
00:39:00.480
it seems like this can be done well.
link |
00:39:02.920
So let's try it again.
link |
00:39:04.280
And that's how we started pushing on eager execution.
link |
00:39:07.680
How do we combine those two together,
link |
00:39:09.880
which has finally come very well together in 2.0,
link |
00:39:13.080
but it took us a while to get all the things together
link |
00:39:15.720
and so on.
link |
00:39:16.320
So let me, I mean, ask, put another way.
link |
00:39:19.320
I think eager execution is a really powerful thing,
link |
00:39:21.800
that was added.
link |
00:39:22.680
Do you think it wouldn't have been,
link |
00:39:25.840
you know, Muhammad Ali versus Frazier, right?
link |
00:39:28.400
Do you think it wouldn't have been added as quickly
link |
00:39:31.200
if PyTorch wasn't there?
link |
00:39:33.760
It might have taken longer.
link |
00:39:35.440
No longer.
link |
00:39:36.280
It was, I mean, we had tried some variants of that before.
link |
00:39:38.960
So I'm sure it would have happened,
link |
00:39:40.920
but it might have taken longer.
link |
00:39:42.240
I'm grateful that TensorFlow responded the way they did.
link |
00:39:44.800
They're doing some incredible work the last couple of years.
link |
00:39:47.760
What other things that we didn't talk about
link |
00:39:49.640
are you looking forward to in 2.0
link |
00:39:51.520
that come to mind?
link |
00:39:54.040
So we talked about some of the ecosystem stuff,
link |
00:39:56.520
making it easily accessible to Keras, eager execution.
link |
00:40:01.440
Is there other things that we missed?
link |
00:40:02.880
Yeah, so I would say one is just where 2.0 is,
link |
00:40:07.480
and, you know, with all the things that we've talked about,
link |
00:40:10.760
I think as we think beyond that,
link |
00:40:13.760
there are lots of other things that it enables us to do
link |
00:40:16.640
and that we're excited about.
link |
00:40:18.760
So what it's setting us up for,
link |
00:40:20.720
okay, here are these really clean APIs.
link |
00:40:22.520
We've cleaned up the surface for what the users want.
link |
00:40:25.640
It also allows us to do a whole bunch of stuff
link |
00:40:28.320
behind the scenes once we are ready with 2.0.
link |
00:40:31.600
So for example, in TensorFlow with graphs
link |
00:40:36.760
and all the things you could do,
link |
00:40:37.720
you could always get a lot of good performance
link |
00:40:40.600
if you spent the time to tune it, right?
link |
00:40:43.280
And we've clearly shown that, lots of people do that.
link |
00:40:47.720
With 2.0, with these APIs where we are,
link |
00:40:53.040
we can give you a lot of performance
link |
00:40:55.120
just with whatever you do.
link |
00:40:57.040
You know, because we see these, it's much cleaner.
link |
00:41:01.400
We know most people are gonna do things this way.
link |
00:41:03.720
We can really optimize for that
link |
00:41:05.520
and get a lot of those things out of the box.
link |
00:41:09.040
And it really allows us, you know,
link |
00:41:10.400
both for single machine and distributed and so on,
link |
00:41:13.880
to really explore other spaces behind the scenes
link |
00:41:17.200
after 2.0 in the future versions as well.
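One concrete way that shows up in 2.0 is tf.function: you write ordinary eager-style Python, and TensorFlow traces it into a graph behind the scenes so it can optimize it for you. A minimal sketch of a custom training step (the model and optimizer are assumed to be whatever you have already defined):

```python
import tensorflow as tf

@tf.function  # traced into a graph on first call, then reused
def train_step(model, optimizer, images, labels):
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss = tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return tf.reduce_mean(loss)
```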
link |
00:41:19.680
So right now, the team's really excited about that,
link |
00:41:23.000
that over time, I think we'll see that.
link |
00:41:25.800
The other piece that I was talking about
link |
00:41:27.720
in terms of just restructuring the monolithic thing
link |
00:41:31.600
into more pieces and making it more modular,
link |
00:41:34.320
I think that's gonna be really important
link |
00:41:36.800
for a lot of the other people in the ecosystem,
link |
00:41:41.800
other organizations and so on that wanted to build things.
link |
00:41:44.760
Can you elaborate a little bit what you mean
link |
00:41:46.360
by making TensorFlow more ecosystem or modular?
link |
00:41:50.680
So the way it's organized today is there's one,
link |
00:41:55.000
there are lots of repositories
link |
00:41:56.280
in the TensorFlow organization at GitHub,
link |
00:41:58.320
the core one where we have TensorFlow,
link |
00:42:01.080
it has the execution engine,
link |
00:42:04.080
it has, you know, the key backends for CPUs and GPUs,
link |
00:42:08.280
it has the work to do distributed stuff.
link |
00:42:12.560
And all of these just work together
link |
00:42:14.360
in a single library or binary,
link |
00:42:17.240
there's no way to split them apart easily.
link |
00:42:18.800
I mean, there are some interfaces,
link |
00:42:19.960
but they're not very clean.
link |
00:42:21.600
In a perfect world, you would have clean interfaces where,
link |
00:42:24.800
okay, I wanna run it on my fancy cluster
link |
00:42:27.720
with some custom networking,
link |
00:42:29.360
just implement this and do that.
link |
00:42:30.960
I mean, we kind of support that,
link |
00:42:32.640
but it's hard for people today.
link |
00:42:35.480
I think as we are starting to see more interesting things
link |
00:42:38.160
in some of these spaces,
link |
00:42:39.400
having that clean separation will really start to help.
link |
00:42:42.280
And again, going to the large size of the ecosystem
link |
00:42:47.360
and the different groups involved there,
link |
00:42:50.120
enabling people to evolve and push on things
link |
00:42:53.440
more independently just allows it to scale better.
link |
00:42:56.040
And by people, you mean individual developers and?
link |
00:42:59.080
And organizations.
link |
00:42:59.920
And organizations.
link |
00:43:00.920
That's right.
link |
00:43:01.760
So the hope is that everybody sort of major,
link |
00:43:04.200
I don't know, Pepsi or something,
link |
00:43:06.880
like major corporations, would go to TensorFlow for this kind of thing.
link |
00:43:11.040
Yeah, if you look at enterprises like Pepsi or these,
link |
00:43:13.640
I mean, a lot of them are already using TensorFlow.
link |
00:43:15.520
They are not the ones that do the development
link |
00:43:18.960
or changes in the core.
link |
00:43:20.360
Some of them do, but a lot of them don't.
link |
00:43:21.920
I mean, they touch small pieces.
link |
00:43:23.720
There are lots of these, some of them being,
link |
00:43:26.400
let's say hardware vendors who are building
link |
00:43:28.200
their custom hardware and they want their own pieces.
link |
00:43:30.840
Or some of them being bigger companies, say IBM.
link |
00:43:34.160
I mean, they're involved in some of our special interest
link |
00:43:37.320
groups and they see a lot of users
link |
00:43:39.960
who want certain things and they want to optimize for that.
link |
00:43:42.640
So folks like that often.
link |
00:43:44.480
Autonomous vehicle companies, perhaps.
link |
00:43:46.400
Exactly, yes.
link |
00:43:48.200
So yeah, like I mentioned, TensorFlow
link |
00:43:50.520
has been downloaded 41 million times, 50,000 commits,
link |
00:43:54.120
almost 10,000 pull requests, 1,800 contributors.
link |
00:43:58.360
So I'm not sure if you can explain it,
link |
00:44:02.160
but what does it take to build a community like that?
link |
00:44:06.840
In retrospect, what do you think?
link |
00:44:09.200
What is the critical thing that allowed for this growth
link |
00:44:12.080
to happen and how does that growth continue?
link |
00:44:14.600
Yeah, that's an interesting question.
link |
00:44:17.920
I wish I had all the answers there, I guess,
link |
00:44:20.240
so you could replicate it.
link |
00:44:22.520
I think there are a number of things
link |
00:44:25.520
that need to come together, right?
link |
00:44:27.880
One, just like any new thing, there's
link |
00:44:33.720
a sweet spot of timing, what's needed,
link |
00:44:37.960
does it grow with what's needed.
link |
00:44:39.520
So in this case, for example, TensorFlow
link |
00:44:41.960
has not just grown because it's a good tool,
link |
00:44:43.640
it's also grown with the growth of deep learning itself.
link |
00:44:46.640
So those factors come into play.
link |
00:44:49.000
Other than that, though, I think just
link |
00:44:53.120
hearing, listening to the community, what they're
link |
00:44:55.560
doing, what they need, being open to,
link |
00:44:58.400
like in terms of external contributions,
link |
00:45:01.080
we've spent a lot of time in making sure
link |
00:45:04.520
we can accept those contributions well,
link |
00:45:06.840
we can help the contributors in adding those,
link |
00:45:09.400
putting the right process in place,
link |
00:45:11.240
getting the right kind of community,
link |
00:45:13.320
welcoming them, and so on.
link |
00:45:16.120
Like over the last year, we've really pushed on transparency.
link |
00:45:19.000
That's important for an open source project.
link |
00:45:22.200
People want to know where things are going,
link |
00:45:23.760
and we're like, OK, here's a process for you.
link |
00:45:26.400
You can do that, here are our RFCs, and so on.
link |
00:45:29.320
So thinking through, there are lots of community aspects
link |
00:45:32.880
that come into that you can really work on.
link |
00:45:36.400
As a small project, it's maybe easy to do,
link |
00:45:38.720
because there's two developers, and you can do those.
link |
00:45:42.240
As you grow, putting more of these processes in place,
link |
00:45:46.960
thinking about the documentation,
link |
00:45:49.080
thinking about what developers
link |
00:45:51.400
care about, what kind of tools they would want to use,
link |
00:45:55.080
all of these come into play, I think.
link |
00:45:56.840
So one of the big things, I think,
link |
00:45:58.400
that feeds the TensorFlow fire is people building something
link |
00:46:02.560
on TensorFlow, implementing a particular architecture
link |
00:46:07.680
that does something cool and useful,
link |
00:46:09.480
and they put that on GitHub.
link |
00:46:11.080
And so it just feeds this growth.
link |
00:46:15.640
Do you have a sense that with 2.0 and 1.0,
link |
00:46:19.560
that there may be a little bit of a partitioning like there
link |
00:46:21.880
is with Python 2 and 3, that there'll be a code base
link |
00:46:26.040
in the older versions of TensorFlow
link |
00:46:28.320
that will not be as compatible easily,
link |
00:46:31.120
or are you pretty confident that this kind of conversion
link |
00:46:35.600
is pretty natural and easy to do?
link |
00:46:37.960
So we're definitely working hard to make that very easy to do.
link |
00:46:41.480
There's lots of tooling that we talked about at the developer
link |
00:46:44.040
summit this week, and we'll continue
link |
00:46:46.480
to invest in that tooling.
link |
00:46:48.280
It's when you think of these significant version changes,
link |
00:46:52.560
that's always a risk, and we are really pushing hard
link |
00:46:55.720
to make that transition very, very smooth.
link |
00:46:59.160
I think, so at some level, people
link |
00:47:03.000
want to move when they see the value in the new thing.
link |
00:47:05.520
They don't want to move just because it's a new thing.
link |
00:47:07.640
And some people do, but most people want a really good thing.
link |
00:47:11.400
And I think over the next few months,
link |
00:47:13.760
as people start to see the value,
link |
00:47:15.400
we'll definitely see that shift happening.
link |
00:47:17.640
So I'm pretty excited and confident that we
link |
00:47:20.080
will see people moving.
link |
00:47:22.440
As you said earlier, this field is also moving rapidly,
link |
00:47:24.680
so that'll help because we can do more things.
link |
00:47:26.720
And all the new things will clearly
link |
00:47:28.520
happen in 2.x, so people will have lots of good reasons to move.
link |
00:47:32.280
So what do you think TensorFlow 3.0 looks like?
link |
00:47:36.160
Is there things happening so crazily
link |
00:47:40.320
that even at the end of this year,
link |
00:47:42.520
seems impossible to plan for?
link |
00:47:45.320
Or is it possible to plan for the next five years?
link |
00:47:49.440
I think it's tricky.
link |
00:47:50.800
There are some things that we can expect in terms of, OK,
link |
00:47:55.760
change, yes, change is going to happen.
link |
00:47:59.720
Are there some things going to stick around
link |
00:48:01.680
and some things not going to stick around?
link |
00:48:03.720
I would say the basics of deep learning,
link |
00:48:08.160
the convolutional models or the basic kind of things,
link |
00:48:12.680
they'll probably be around in some form still in five years.
link |
00:48:16.280
Will RL and GANs stay? Very likely, based on where they are.
link |
00:48:21.160
Will we have new things?
link |
00:48:22.840
Probably, but those are hard to predict.
link |
00:48:24.680
And directionally, some things that we can see
link |
00:48:29.080
in what we're starting to do
link |
00:48:32.800
with some of our projects right now
link |
00:48:36.560
point to combining eager execution and graphs,
link |
00:48:39.120
where we're starting to make it more like just your natural
link |
00:48:42.240
programming language.
link |
00:48:43.160
You're not trying to program something else.
link |
00:48:45.640
Similarly, with Swift for TensorFlow,
link |
00:48:47.240
we're taking that approach.
link |
00:48:48.280
Can you do something ground up?
link |
00:48:50.040
So some of those ideas seem like, OK,
link |
00:48:52.080
that's the right direction in five years
link |
00:48:55.000
we expect to see more in that area.
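A small sketch of what "just your natural programming language" looks like in practice: with AutoGraph, ordinary Python control flow inside a tf.function is converted to graph operations, so the same code reads like normal Python but still gets a graph when traced. The function below is an invented example, not something from the interview.

    import tensorflow as tf

    @tf.function
    def clip_and_scale(x, threshold):
        # AutoGraph turns this Python `if` on a tensor into a graph conditional.
        if tf.reduce_max(x) > threshold:
            x = x / tf.reduce_max(x)
        return x * threshold

    print(clip_and_scale(tf.constant([1.0, 4.0, 9.0]), tf.constant(3.0)))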
link |
00:48:58.360
Other things we don't know is, will hardware accelerators
link |
00:49:01.760
be the same?
link |
00:49:03.200
Will we be able to train with four bits instead of 32 bits?
link |
00:49:09.000
And I think the TPU side of things is exploring that.
link |
00:49:11.440
I mean, TPU is already on version three.
link |
00:49:13.960
It seems that the evolution of TPU and TensorFlow
link |
00:49:17.520
are coevolving in terms of both their learning
link |
00:49:24.080
from each other and from the community
link |
00:49:25.720
and from the applications where the biggest benefit is achieved.
link |
00:49:29.720
That's right.
link |
00:49:30.560
You've been trying with eager with Keras
link |
00:49:33.320
to make TensorFlow as accessible and easy to use as possible.
link |
00:49:36.480
What do you think for beginners is the biggest thing
link |
00:49:39.040
they struggle with?
link |
00:49:40.000
Have you encountered that?
link |
00:49:42.080
Or is what Keras is solving basically
link |
00:49:44.280
that, and eager, like we talked about with TensorFlow?
link |
00:49:48.680
For some of them, like you said, the beginners
link |
00:49:51.480
want to just be able to take some image model.
link |
00:49:54.840
They don't care if it's Inception or ResNet or something else
link |
00:49:58.040
and do some training or transfer learning
link |
00:50:00.760
on their kind of model.
link |
00:50:02.440
Being able to make that easy is important.
link |
00:50:04.400
So in some ways, if you do that by providing them
link |
00:50:08.560
simple models with, say, TF Hub or so on,
link |
00:50:11.360
they don't care about what's inside that box,
link |
00:50:13.680
but they want to be able to use it.
link |
00:50:15.120
So we're pushing on, I think, different levels.
link |
00:50:17.600
If you look at just a component that you get, which
link |
00:50:20.120
has the layers already smushed in,
link |
00:50:22.800
the beginners probably just want that.
link |
00:50:25.200
Then the next step is, OK, look at building
link |
00:50:27.360
layers with Keras.
link |
00:50:29.000
If you go out to research, then they
link |
00:50:30.600
are probably writing custom layers themselves
link |
00:50:33.120
or doing their own loops.
link |
00:50:34.360
So there's a whole spectrum there.
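A sketch of the research end of that spectrum, where people write their own layers and their own loops instead of calling model.fit; the toy layer, data, and learning rate here are invented for illustration.

    import tensorflow as tf

    # A custom layer with a single trainable scalar.
    class Scale(tf.keras.layers.Layer):
        def __init__(self):
            super().__init__()
            self.alpha = tf.Variable(1.0)

        def call(self, x):
            return self.alpha * x

    layer = Scale()
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

    x = tf.constant([[1.0], [2.0]])
    y = tf.constant([[2.0], [4.0]])

    # A hand-written training loop using GradientTape.
    for _ in range(10):
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(layer(x) - y))
        grads = tape.gradient(loss, layer.trainable_variables)
        optimizer.apply_gradients(zip(grads, layer.trainable_variables))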
link |
00:50:36.320
And then providing the pretrained models
link |
00:50:38.600
seems to really decrease the time it takes to get started.
link |
00:50:44.760
So you could basically, in a Colab notebook,
link |
00:50:46.800
achieve what you need.
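For the beginner end being described, a hedged sketch of that transfer-learning workflow: treat a pretrained model from TF Hub as a black box and add a small classifier on top. The specific module handle and the five-class head are placeholders, not anything recommended in the conversation.

    import tensorflow as tf
    import tensorflow_hub as hub

    # Any image feature-vector module from TF Hub would work here;
    # this particular handle is only an example.
    feature_extractor = hub.KerasLayer(
        "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
        input_shape=(224, 224, 3),
        trainable=False)  # keep the pretrained weights frozen

    model = tf.keras.Sequential([
        feature_extractor,
        tf.keras.layers.Dense(5, activation="softmax"),  # your own classes
    ])

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=3)  # with your own data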
link |
00:50:49.080
So I'm basically answering my own question,
link |
00:50:51.280
because I think what TensorFlow delivered on recently
link |
00:50:54.240
makes it trivial for beginners.
link |
00:50:57.000
So I was just wondering if there was other pain points
link |
00:51:00.760
you're trying to ease, but I'm not sure there would be.
link |
00:51:02.480
No, those are probably the big ones.
link |
00:51:04.240
I mean, I see high schoolers doing a whole bunch of things
link |
00:51:07.080
now, which is pretty amazing.
link |
00:51:08.840
It's both amazing and terrifying.
link |
00:51:11.360
Yes.
link |
00:51:12.640
In a sense that when they grow up,
link |
00:51:16.920
some incredible ideas will be coming from them.
link |
00:51:19.280
So there's certainly a technical aspect to your work,
link |
00:51:21.800
but you also have a management aspect
link |
00:51:24.600
to your role with TensorFlow, leading the project,
link |
00:51:28.000
a large number of developers and people.
link |
00:51:31.080
So what do you look for in a good team?
link |
00:51:34.680
I think Google has been at the forefront
link |
00:51:37.400
of exploring what it takes to build a good team.
link |
00:51:40.440
And TensorFlow is one of the most cutting edge technologies
link |
00:51:45.520
in the world.
link |
00:51:46.120
So in this context, what do you think
link |
00:51:48.080
makes for a good team?
link |
00:51:50.480
It's definitely something I think a fair bit about.
link |
00:51:53.200
I think in terms of the team being
link |
00:51:59.560
able to deliver something well, one of the things that's
link |
00:52:02.120
important is a cohesion across the team.
link |
00:52:05.800
So being able to execute together and doing things,
link |
00:52:10.400
it's not an end.
link |
00:52:11.440
Like at this scale, an individual engineer
link |
00:52:14.120
can only do so much.
link |
00:52:15.400
There's a lot more that they can do together,
link |
00:52:18.200
even though we have some amazing superstars across Google
link |
00:52:21.640
and in the team.
link |
00:52:22.600
But there's often the way I see it
link |
00:52:26.200
is the product of what the team generates
link |
00:52:28.360
is way larger than all of the individuals put together.
link |
00:52:34.440
And so how do we have all of them work together,
link |
00:52:37.320
the culture of the team itself?
link |
00:52:40.000
Hiring good people is important.
link |
00:52:43.000
But part of that is it's not just that, OK,
link |
00:52:45.600
we hire a bunch of smart people and throw them together
link |
00:52:48.120
and let them do things.
link |
00:52:49.720
It's also people have to care about what they're building.
link |
00:52:52.920
People have to be motivated for the right kind of things.
link |
00:52:57.320
That's often an important factor.
link |
00:53:01.400
And finally, how do you put that together
link |
00:53:04.600
with a somewhat unified vision of where we want to go?
link |
00:53:08.840
So are we all looking in the same direction
link |
00:53:11.200
or just going all over?
link |
00:53:13.520
And sometimes it's a mix.
link |
00:53:16.040
Google's a very bottom up organization in some sense.
link |
00:53:21.400
Also research even more so.
link |
00:53:24.680
And that's how we started.
link |
00:53:26.320
But as we've become this larger product and ecosystem,
link |
00:53:30.840
I think it's also important to combine that well with a mix
link |
00:53:35.040
of, OK, here's the direction we want to go in.
link |
00:53:37.920
There is exploration we'll do around that.
link |
00:53:39.880
But let's keep staying in that direction, not just
link |
00:53:43.320
all over the place.
link |
00:53:44.360
And is there a way you monitor the health of the team?
link |
00:53:46.880
Sort of like, is there a way you know you did a good job?
link |
00:53:51.920
The team is good.
link |
00:53:53.000
I mean, you're saying nice things, but it's sometimes
link |
00:53:56.960
difficult to determine how aligned.
link |
00:54:01.120
Because it's not binary, there's tensions
link |
00:54:04.480
and complexities and so on.
link |
00:54:06.680
And the other element of this is the mesh of superstars.
link |
00:54:09.400
There's so much, even at Google, such a large percentage
link |
00:54:12.880
of work is done by individual superstars too.
link |
00:54:16.000
So sometimes those superstars
link |
00:54:19.920
could go against the dynamic of a team, and there are those tensions.
link |
00:54:25.120
I mean, I'm sure TensorFlow might be a little bit easier
link |
00:54:27.320
because the mission of the project is so beautiful.
link |
00:54:31.720
You're at the cutting edge, so it's exciting.
link |
00:54:34.760
But have you had struggle with that?
link |
00:54:36.640
Has there been challenges?
link |
00:54:38.360
There are always people challenges
link |
00:54:39.800
in different kinds of ways.
link |
00:54:41.240
That said, I think what we've been
link |
00:54:44.520
good about is getting people who care and have
link |
00:54:49.320
the same kind of culture, and that's Google in general
link |
00:54:51.440
to a large extent.
link |
00:54:53.480
But also, like you said, given that the project has had
link |
00:54:56.760
so many exciting things to do, there's
link |
00:54:59.160
been room for lots of people to do different kinds of things
link |
00:55:02.080
and grow, which does make the problem a bit easier, I guess.
link |
00:55:06.440
And it allows people, depending on what they're doing,
link |
00:55:09.920
if there's room around them, then that's fine.
link |
00:55:13.120
But yes, we do care that, whether a superstar or not,
link |
00:55:19.160
they need to work well with the team across Google.
link |
00:55:22.560
That's interesting to hear.
link |
00:55:23.760
So it's like superstar or not, the productivity broadly
link |
00:55:27.960
is about the team.
link |
00:55:30.520
Yeah.
link |
00:55:31.520
I mean, they might add a lot of value,
link |
00:55:32.960
but if they're hurting the team, then that's a problem.
link |
00:55:35.720
So in hiring engineers, it's so interesting, right?
link |
00:55:38.720
The hiring process, what do you look for?
link |
00:55:41.840
How do you determine a good developer
link |
00:55:44.240
or a good member of a team from just a few minutes
link |
00:55:47.280
or hours together?
link |
00:55:50.320
Again, no magic answers, I'm sure.
link |
00:55:51.920
Yeah.
link |
00:55:52.760
And Google has a hiring process that we've refined
link |
00:55:56.240
over the last 20 years, I guess, and that you've probably
link |
00:56:00.880
heard and seen a lot about.
link |
00:56:02.200
So we do work with the same hiring process in that.
link |
00:56:05.280
That's really helped.
link |
00:56:08.280
For me in particular, I would say,
link |
00:56:10.880
in addition to the core technical skills,
link |
00:56:14.200
what does matter is their motivation
link |
00:56:17.560
in what they want to do.
link |
00:56:19.560
Because if that doesn't align well with where we want to go,
link |
00:56:22.960
that's not going to lead to long term success
link |
00:56:25.320
for either them or the team.
link |
00:56:27.640
And I think that becomes more important the more senior
link |
00:56:30.640
the person is, but it's important at every level.
link |
00:56:33.520
Like even the junior most engineer,
link |
00:56:34.920
if they're not motivated to do well at what they're trying to do,
link |
00:56:37.680
however smart they are, it's going
link |
00:56:39.080
to be hard for them to succeed.
link |
00:56:40.320
Does the Google hiring process touch on that passion?
link |
00:56:44.520
So like trying to determine.
link |
00:56:46.440
Because I think as far as I understand,
link |
00:56:48.440
maybe you can speak to it, that the Google hiring process sort
link |
00:56:52.000
of helps with the initial part, like determining the skill set there,
link |
00:56:56.360
is your puzzle solving ability, problem solving ability good.
link |
00:56:59.840
But I'm not sure, but it seems that determining
link |
00:57:05.000
whether the person has, like, a fire inside them
link |
00:57:07.560
that burns to do anything, it doesn't really matter what,
link |
00:57:09.840
it's just, some cool stuff,
link |
00:57:11.520
I'm going to do it, that I don't know.
link |
00:57:15.320
Is that something that ultimately comes up
link |
00:57:17.000
when they have a conversation with you
link |
00:57:18.840
or once it gets closer to the team?
link |
00:57:22.600
So one of the things we do have as part of the process
link |
00:57:25.400
is just a culture fit, like part of the interview process
link |
00:57:28.600
itself, in addition to just the technical skills.
link |
00:57:31.040
And each engineer or whoever the interviewer is,
link |
00:57:34.240
is supposed to rate the person on the culture
link |
00:57:38.800
fit with Google and so on.
link |
00:57:39.960
So that is definitely part of the process.
link |
00:57:42.160
Now, there are various kinds of projects
link |
00:57:45.800
and different kinds of things.
link |
00:57:46.960
So there might be variants in the kind of culture
link |
00:57:50.040
you want there and so on.
link |
00:57:51.320
And yes, that does vary.
link |
00:57:52.720
So for example, TensorFlow has always
link |
00:57:54.920
been a fast moving project.
link |
00:57:56.920
And we want people who are comfortable with that.
link |
00:58:00.920
But at the same time now, for example,
link |
00:58:02.640
we are at a place where we are also a very full-fledged product.
link |
00:58:05.200
And we want to make sure things that work really, really
link |
00:58:08.440
work right.
link |
00:58:09.320
You can't cut corners all the time.
link |
00:58:11.680
So balancing that out and finding the people
link |
00:58:14.320
who are the right fit for those is important.
link |
00:58:17.560
And I think those kind of things do vary a bit
link |
00:58:19.720
across projects and teams and product areas across Google.
link |
00:58:23.200
And so you'll see some differences there
link |
00:58:25.240
in the final checklist.
link |
00:58:27.640
But a lot of the core culture, it
link |
00:58:29.600
comes along with just the engineering excellence,
link |
00:58:32.200
and so on.
link |
00:58:34.720
What is the hardest part of your job?
link |
00:58:39.680
You can take your pick, I guess.
link |
00:58:41.920
It's fun, I would say.
link |
00:58:44.440
Hard, yes.
link |
00:58:45.520
I mean, lots of things at different times.
link |
00:58:47.240
I think that does vary.
link |
00:58:49.160
So let me clarify that difficult things are fun
link |
00:58:52.640
when you solve them, right?
link |
00:58:55.720
It's fun in that sense.
link |
00:58:57.480
I think the key to a successful thing across the board,
link |
00:59:02.600
and in this case, it's a large ecosystem now,
link |
00:59:05.320
but even a small product, is striking that fine balance
link |
00:59:09.800
across different aspects of it.
link |
00:59:12.000
Sometimes it's how fast you go versus how perfect it is.
link |
00:59:17.000
Sometimes it's how do you involve this huge community?
link |
00:59:21.400
Who do you involve?
link |
00:59:22.360
Or do you decide, OK, now is not a good time to involve them
link |
00:59:25.440
because it's not the right fit?
link |
00:59:30.160
Sometimes it's saying no to certain kinds of things.
link |
00:59:33.640
Those are often the hard decisions.
link |
00:59:36.880
Some of them you make quickly because you don't have the time.
link |
00:59:41.000
Some of them you get time to think about them,
link |
00:59:43.200
but they're always hard.
link |
00:59:44.480
So both choices are pretty good in those decisions.
link |
00:59:49.200
What about deadlines?
link |
00:59:50.360
Does TensorFlow tend to be driven by deadlines
link |
00:59:58.200
to a degree that a product might be?
link |
01:00:00.360
Or is there still a balance to where it's less deadline?
link |
01:00:04.920
You had the Dev Summit, they came together incredibly.
link |
01:00:08.920
Looked like there's a lot of moving pieces and so on.
link |
01:00:11.440
So did that deadline make people rise to the occasion,
link |
01:00:15.080
releasing TensorFlow 2.0 Alpha?
link |
01:00:18.360
I'm sure that was done last minute as well.
link |
01:00:20.360
I mean, up to the last point.
link |
01:00:25.600
Again, it's one of those things that you
link |
01:00:28.600
need to strike the good balance.
link |
01:00:29.960
There's some value that deadlines bring
link |
01:00:32.040
that does bring a sense of urgency
link |
01:00:33.920
to get the right things together.
link |
01:00:35.720
Instead of getting the perfect thing out,
link |
01:00:38.280
you need something that's good and works well.
link |
01:00:41.280
And the team definitely did a great job in putting that
link |
01:00:43.720
together, so I was very amazed and excited by everything,
link |
01:00:46.560
how that came together.
link |
01:00:48.680
That said, across the year, we try not
link |
01:00:50.640
to put out official deadlines.
link |
01:00:52.520
We focus on key things that are important,
link |
01:00:56.960
figure out how much of it's important,
link |
01:01:00.600
and we are developing in the open, internally and externally,
link |
01:01:05.760
everything's available to everybody.
link |
01:01:07.920
So you can pick and look at where things are.
link |
01:01:11.120
We do releases at a regular cadence,
link |
01:01:13.160
so it's fine if something doesn't necessarily end up this
link |
01:01:16.320
month, it'll end up in the next release in a month or two.
link |
01:01:19.600
And that's OK, but we want to keep moving
link |
01:01:22.840
as fast as we can in these different areas.
link |
01:01:26.520
Because we can iterate and improve on things, sometimes
link |
01:01:30.080
it's OK to put things out that aren't fully ready.
link |
01:01:32.920
If you make sure it's clear that, OK, this is experimental,
link |
01:01:35.640
but it's out there if you want to try and give feedback.
link |
01:01:37.960
That's very, very useful.
link |
01:01:39.400
I think that quick cycle and quick iteration is important.
link |
01:01:43.560
That's what we often focus on rather than here's
link |
01:01:47.200
a deadline where you get everything else.
link |
01:01:49.200
With 2.0, is there pressure to make that stable?
link |
01:01:52.880
Or like, for example, WordPress 5.0 just came out,
link |
01:01:57.760
and there was no pressure to, it was a lot of built-up updates
link |
01:02:01.760
that were delivered way too late.
link |
01:02:04.960
And they said, OK, well, we're going
link |
01:02:06.440
to release a lot of updates really quickly to improve it.
link |
01:02:09.680
Do you see TensorFlow 2.0 in that same kind of way,
link |
01:02:12.240
or is there this pressure to once it hits 2.0,
link |
01:02:15.240
once you get to the release candidate,
link |
01:02:16.760
and then you get to the final, that's
link |
01:02:19.440
going to be the stable thing?
link |
01:02:22.480
So it's going to be stable, just like 1.x
link |
01:02:26.680
was, where every API that's there is going to remain and work.
link |
01:02:32.080
It doesn't mean we can't change things under the covers.
link |
01:02:34.800
It doesn't mean we can't add things.
link |
01:02:36.720
So there's still a lot more for us to do,
link |
01:02:39.200
and we continue to have more releases.
link |
01:02:41.080
So in that sense, there's still, I
link |
01:02:42.920
don't think we'd be done in like two months
link |
01:02:44.680
when we release this.
link |
01:02:46.160
I don't know if you can say, but is there, you know,
link |
01:02:49.880
there's not external deadlines for TensorFlow 2.0,
link |
01:02:53.680
but are there internal deadlines, artificial or otherwise,
link |
01:02:58.520
that you're trying to set for yourself,
link |
01:03:00.840
or is it whenever it's ready?
link |
01:03:03.080
So we want it to be a great product, right?
link |
01:03:05.680
And that's a big, important piece for us.
link |
01:03:09.880
TensorFlow is already out there.
link |
01:03:11.160
We have 41 million downloads for 1.x,
link |
01:03:13.720
so it's not like we have to have this.
link |
01:03:15.880
Yeah, exactly.
link |
01:03:17.280
So it's not like, a lot of the features
link |
01:03:19.320
that we're really polishing and putting together
link |
01:03:22.080
are there, we don't have to rush that just because.
link |
01:03:26.240
So in that sense, we want to get it right
link |
01:03:28.040
and really focus on that.
link |
01:03:29.920
That said, we have said that we are
link |
01:03:31.520
looking to get this out in the next few months,
link |
01:03:33.520
in the next quarter, and as far as possible,
link |
01:03:37.120
we'll definitely try to make that happen.
link |
01:03:40.000
Yeah, my favorite line was, spring is a relative concept.
link |
01:03:44.360
I love it.
link |
01:03:45.960
Spoken like a true developer.
link |
01:03:47.680
So something I'm really interested in,
link |
01:03:50.200
and your previous line of work is, before TensorFlow,
link |
01:03:53.840
you led a team at Google on search ads.
link |
01:03:57.720
I think this is a very interesting topic on every level,
link |
01:04:02.840
on a technical level, because at their best, ads connect people
link |
01:04:07.200
to the things they want and need,
link |
01:04:10.080
and at their worst, they're just these things
link |
01:04:12.280
that annoy the heck out of you to the point of ruining
link |
01:04:15.840
the entire user experience of whatever you're actually doing.
link |
01:04:20.240
So they have a bad rep, I guess.
link |
01:04:23.600
And on the other end, this connecting of users
link |
01:04:28.080
to the things they need and want is a beautiful opportunity
link |
01:04:32.120
for machine learning to shine, like huge amounts of data
link |
01:04:35.360
that's personalized, and you've got
link |
01:04:36.720
to map to the thing they actually want and won't get annoyed by.
link |
01:04:40.400
So what have you learned from this, at Google, which is
link |
01:04:43.760
leading the world in this aspect?
link |
01:04:45.160
What have you learned from that experience?
link |
01:04:47.560
And what do you think is the future of ads?
link |
01:04:51.520
You're taking me back to the end of that.
link |
01:04:54.040
Yes, it's been a while, but I totally agree with what you said.
link |
01:04:59.720
I think the search ads, the way it was always looked at,
link |
01:05:03.200
and I believe it still is, is it's
link |
01:05:05.520
an extension of what search is trying to do.
link |
01:05:08.240
The goal is to make the information
link |
01:05:10.560
and make the world's information accessible.
link |
01:05:14.680
With ads, it's not just information,
link |
01:05:17.120
but it may be products or other things
link |
01:05:19.120
that people care about.
link |
01:05:20.800
And so it's really important for them
link |
01:05:23.360
to align with what the users need.
link |
01:05:26.480
And in search ads, there's a minimum quality level
link |
01:05:30.920
before that ad would be shown.
link |
01:05:32.320
If we don't have an ad that hits that quality bar,
link |
01:05:34.200
it will not be shown, even if we have it.
link |
01:05:35.960
And OK, maybe we lose some money there.
link |
01:05:38.080
That's fine.
link |
01:05:39.560
That is really, really important,
link |
01:05:41.200
and I think that that is something I really
link |
01:05:43.000
liked about being there.
link |
01:05:45.040
Advertising is a key part.
link |
01:05:48.120
I mean, as a model, it's been around for ages, right?
link |
01:05:51.680
It's not a new model.
link |
01:05:52.920
It's been adapted to the web and became a core part of search
link |
01:05:57.440
and in many other search engines across the world.
link |
01:06:02.120
I do hope, like I said, there are aspects of ads
link |
01:06:05.920
that are annoying.
link |
01:06:06.680
And I go to a website, and if it just
link |
01:06:09.600
keeps popping an ad in my face, not letting me read,
link |
01:06:12.160
that's going to be annoying clearly.
link |
01:06:13.800
So I hope we can strike that balance between showing a good
link |
01:06:22.080
ad where it's valuable to the user
link |
01:06:25.040
and provides the monetization to the service.
link |
01:06:30.960
And this might be search.
link |
01:06:32.000
This might be a website.
link |
01:06:33.680
All of these, they do need the monetization for them
link |
01:06:37.320
to provide that service.
link |
01:06:39.640
But if it's done in a good balance between showing
link |
01:06:45.720
just some random stuff that's distracting
link |
01:06:48.040
versus showing something that's actually valuable.
link |
01:06:50.920
So do you see it, moving forward, continuing
link |
01:06:55.360
to be a model that funds businesses like Google?
link |
01:07:00.960
That's a significant revenue stream.
link |
01:07:05.160
Because that's one of the most exciting things,
link |
01:07:08.080
but also limiting things on the internet
link |
01:07:09.680
is nobody wants to pay for anything.
link |
01:07:12.200
And advertisements, again, at their best
link |
01:07:15.360
are actually really useful and not annoying.
link |
01:07:17.360
Do you see that continuing and growing and improving?
link |
01:07:22.320
Or do you see sort of more Netflix-type models
link |
01:07:26.680
where you have to start to pay for content?
link |
01:07:28.960
I think it's a mix.
link |
01:07:31.000
I think it's going to take a long while for everything
link |
01:07:32.840
to be paid on the internet, if at all.
link |
01:07:35.320
Probably not.
link |
01:07:36.160
I mean, I think there's always going
link |
01:07:37.400
to be things that are sort of monetized with things like ads.
link |
01:07:40.760
But over the last few years, I would say
link |
01:07:42.800
we've definitely seen that transition
link |
01:07:44.760
towards more paid services across the web
link |
01:07:48.560
and people are willing to pay for them
link |
01:07:50.360
because they do see the value.
link |
01:07:51.760
I mean, Netflix is a great example.
link |
01:07:53.600
I mean, we have YouTube doing things.
link |
01:07:56.520
People pay for the apps they buy, more people
link |
01:07:59.720
I find are willing to pay for newspaper content,
link |
01:08:03.120
for the good news websites across the web.
link |
01:08:07.240
That wasn't the case even a few years ago, I would say.
link |
01:08:11.040
And I just see that change in myself as well
link |
01:08:13.280
and just lots of people around me.
link |
01:08:14.840
So definitely hopeful that we'll transition to that mixed model
link |
01:08:19.240
where maybe you get to try something out for free,
link |
01:08:23.400
maybe with ads.
link |
01:08:24.120
But then there is a more clear revenue model
link |
01:08:27.080
that sort of helps go beyond that.
link |
01:08:30.600
So speaking of revenue, how is it
link |
01:08:34.760
that a person can use the TPU in a Google Colab for free?
link |
01:08:39.400
So what's the, I guess, the question is,
link |
01:08:43.920
what's the future of TensorFlow in terms of empowering,
link |
01:08:48.880
say, a class of 300 students?
link |
01:08:51.880
And I'm asked by MIT, what is going
link |
01:08:55.920
to be the future of them being able to do their homework
link |
01:08:58.640
in TensorFlow?
link |
01:09:00.200
Where are they going to train these networks, right?
link |
01:09:02.800
What's that future look like with TPUs, with cloud services,
link |
01:09:07.720
and so on?
link |
01:09:08.920
I think a number of things there.
link |
01:09:10.240
I mean, with TensorFlow open source,
link |
01:09:12.600
you can run it wherever.
link |
01:09:13.640
You can run it on your desktop, and your desktops
link |
01:09:15.880
always keep getting more powerful, so maybe you can do more.
link |
01:09:19.480
My phone is like, I don't know how many times more powerful
link |
01:09:22.040
than my first desktop.
link |
01:09:23.520
You'll probably train it on your phone, though.
link |
01:09:25.200
Yeah, that's true.
link |
01:09:26.200
Right, so in that sense, the power
link |
01:09:28.080
you have in your hand is a lot more.
link |
01:09:31.440
Clouds are actually very interesting from, say,
link |
01:09:34.400
students or courses perspective, because they
link |
01:09:37.840
make it very easy to get started.
link |
01:09:40.040
I mean, Colab, the great thing about it
link |
01:09:42.040
is go to a website, and it just works.
link |
01:09:45.120
No installation needed, nothing to, you know,
link |
01:09:47.560
you're just there, and things are working.
link |
01:09:49.960
That's really the power of cloud, as well.
link |
01:09:52.280
And so I do expect that to grow.
link |
01:09:55.320
Again, Colab is a free service.
link |
01:09:57.920
It's great to get started, to play with things,
link |
01:10:00.840
to explore things.
link |
01:10:03.080
That said, with free, you can only get so much, maybe.
link |
01:10:08.200
So just like we were talking about free versus paid,
link |
01:10:11.080
and there are services you can pay for and get a lot more.
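As a concrete sketch of the free-TPU-in-Colab point raised earlier, the general shape in TensorFlow 2.x is to connect to the TPU runtime and build the model under a distribution strategy. The exact API names have shifted across versions, so treat this as illustrative rather than definitive, and the tiny model is just a placeholder.

    import tensorflow as tf

    # Connect to the Colab TPU runtime (only works when a TPU is attached).
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.experimental.TPUStrategy(resolver)

    # Building the model under the strategy scope places it on the TPU cores.
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")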
link |
01:10:15.280
Great.
link |
01:10:16.000
So if I'm a complete beginner interested in machine
link |
01:10:18.480
learning and TensorFlow, what should I do?
link |
01:10:21.560
Probably start with going to a website and playing there.
link |
01:10:24.240
Just go to TensorFlow.org and start clicking on things.
link |
01:10:26.560
Yep, check out tutorials and guides.
link |
01:10:28.440
There's stuff you can just click there and go to Colab
link |
01:10:30.680
and do things.
link |
01:10:31.320
No installation needed.
link |
01:10:32.360
You can get started right there.
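Something along the lines of the beginner tutorials can be run in a Colab cell as-is; this is only a sketch of that kind of first model, not a prescribed starting point.

    import tensorflow as tf

    # Load a small built-in dataset and normalize it.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    # Stack a few Keras layers, train briefly, and evaluate.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=1)
    model.evaluate(x_test, y_test)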
link |
01:10:34.040
OK, awesome.
link |
01:10:34.840
Rajat, thank you so much for talking today.
link |
01:10:36.720
Thank you, Lex.
link |
01:10:37.440
Have fun this week.