
Cristos Goodrow: YouTube Algorithm | Lex Fridman Podcast #68



link |
00:00:00.000
The following is a conversation with Cristos Goodrow,
link |
00:00:03.360
vice president of engineering at Google and head of search and discovery at YouTube,
link |
00:00:08.320
also known as the YouTube algorithm. YouTube has approximately 1.9 billion users,
link |
00:00:15.120
and every day people watch over 1 billion hours of YouTube video. It is the second most popular
link |
00:00:21.680
search engine behind Google itself. For many people, it is not only a source of entertainment,
link |
00:00:27.200
but also how we learn new ideas from math and physics videos to podcasts to debates,
link |
00:00:33.120
opinions, ideas from out of the box thinkers and activists on some of the most tense,
link |
00:00:38.640
challenging and impactful topics in the world today. YouTube and other content platforms
link |
00:00:44.880
receive criticism from both viewers and creators, as they should. Because the engineering task
link |
00:00:51.200
before them is hard, and they don't always succeed, and the impact of their work is truly
link |
00:00:56.720
world changing. To me, YouTube has been an incredible wellspring of knowledge. I've watched
link |
00:01:03.040
hundreds if not thousands of lectures that changed the way I see many fundamental ideas in math,
link |
00:01:08.960
science, engineering, and philosophy. But it does put a mirror to ourselves, and keeps the
link |
00:01:15.360
responsibility of the steps we take in each of our online educational journeys into the hands of
link |
00:01:21.200
each of us. The YouTube algorithm has an important role in that journey of helping us find new exciting
link |
00:01:27.600
ideas to learn about. That's a difficult and an exciting problem for an artificial intelligence
link |
00:01:32.960
system. As I've said in lectures and other forums, recommendation systems will be one of the most
link |
00:01:38.640
impactful areas of AI in the 21st century, and YouTube is one of the biggest recommendation
link |
00:01:45.120
systems in the world. This is the Artificial Intelligence Podcast. If you enjoy it, subscribe
link |
00:01:51.760
on YouTube, give it five stars on Apple Podcasts, follow on Spotify, support it on Patreon, or
link |
00:01:57.440
simply connect with me on Twitter. Lex Fridman spelled F R I D M A N. I recently started doing
link |
00:02:04.480
ads at the end of the introduction. I'll do one or two minutes after introducing the episode and
link |
00:02:09.520
never any ads in the middle that can break the flow of the conversation. I hope that works for you
link |
00:02:14.560
and doesn't hurt the listening experience. This show is presented by Cash App, the number one
link |
00:02:20.640
finance app in the App Store. I personally use Cash App to send money to friends, but you can
link |
00:02:25.600
also use it to buy, sell, and deposit Bitcoin in just seconds. Cash App also has a new investing
link |
00:02:31.680
feature. You can buy fractions of a stock, say $1 worth, no matter what the stock price is.
link |
00:02:37.360
Broker services are provided by Cash App Investing, a subsidiary of Square and member SIPC.
link |
00:02:42.880
I'm excited to be working with Cash App to support one of my favorite organizations called First,
link |
00:02:48.640
best known for their first robotics and Lego competitions. They educate and inspire hundreds
link |
00:02:54.080
of thousands of students in over 110 countries and have a perfect rating on Charity Navigator,
link |
00:02:59.920
which means that donated money is used to maximum effectiveness. When you get Cash App from the
link |
00:03:05.360
App Store, Google Play, and use code LEX Podcast, you'll get $10 and Cash App will also donate
link |
00:03:12.400
$10 to First, which again is an organization that I've personally seen inspire girls and boys
link |
00:03:18.800
to dream of engineering a better world. And now here's my conversation with Cristos Goodrow.
link |
00:03:26.720
YouTube is the world's second most popular search engine behind Google, of course.
link |
00:03:31.280
We watch more than 1 billion hours of YouTube videos a day, more than Netflix and Facebook
link |
00:03:36.800
video combined. YouTube creators upload over 500,000 hours of video every day. Average lifespan
link |
00:03:44.720
of a human being just for comparison is about 700,000 hours. So what's uploaded every single day
link |
00:03:53.280
is just enough for a human to watch in a lifetime. So let me ask an absurd philosophical question.
link |
00:03:59.440
If from birth, when I was born, and there's many people born today with the internet,
link |
00:04:03.680
I watched YouTube videos nonstop. Do you think there are trajectories through
link |
00:04:09.760
YouTube video space that can maximize my average happiness or maybe education or
link |
00:04:17.120
my growth as a human being? I think there are some great trajectories through YouTube videos,
link |
00:04:24.000
but I wouldn't recommend that anyone spend all of their waking hours or all of their hours
link |
00:04:29.440
watching YouTube. I mean, I think about the fact that YouTube has been really great for my kids,
link |
00:04:34.560
for instance. My oldest daughter, you know, she's been watching YouTube for several years,
link |
00:04:41.920
she watches Tyler Oakley and the vlogbrothers. And I know that it's had a very profound and
link |
00:04:48.720
positive impact on her character. And my younger daughter, she's a ballerina and her teachers tell
link |
00:04:53.920
her that YouTube is a huge advantage for her because she can practice a routine and watch
link |
00:05:01.520
like professional dancers do that same routine and stop it and back it up and rewind and all
link |
00:05:07.040
that stuff, right? So it's been really good for them. And then even my son is a sophomore in
link |
00:05:12.640
college. He got through his linear algebra class because of a channel called Three Blue One Brown,
link |
00:05:19.600
which, you know, helps you understand linear algebra, but in a way that would be very hard
link |
00:05:26.240
for anyone to do on a whiteboard or a chalkboard. And so I think that those experiences, from my
link |
00:05:33.920
point of view, were very good. And so I can imagine really good trajectories through YouTube. Yes.
link |
00:05:38.720
Have you looked at, do you think of broadly about that trajectory over a period because
link |
00:05:43.680
YouTube has grown up now. So over a period of years, you just kind of gave a few anecdotal
link |
00:05:50.640
examples. But, you know, I used to watch certain shows on YouTube. I don't anymore. I've moved on
link |
00:05:56.240
to other shows. And ultimately, you want people to, from YouTube's perspective, to stay on YouTube,
link |
00:06:01.680
to grow as human beings on YouTube. So you have to think not just what makes them engage
link |
00:06:08.400
today or this month, but also over a period of years. Absolutely. That's right. I mean,
link |
00:06:14.080
if YouTube is going to continue to enrich people's lives, then, you know, then it has to grow with
link |
00:06:20.560
them. And people's interests change over time. And so I think we've been working on this problem.
link |
00:06:30.400
And I'll just say it broadly as like, how to introduce diversity and introduce people who
link |
00:06:37.040
are watching one thing to something else they might like. We've been working on that problem
link |
00:06:41.920
all the eight years I've been at YouTube. It's a hard problem because, I mean, of course,
link |
00:06:48.560
it's trivial to introduce diversity that doesn't help. Yeah, just add a random video.
link |
00:06:54.080
I could just randomly select a video from the billions that we have. It's likely not to even
link |
00:06:59.840
be in your language. So the likelihood that you would watch it and develop a new interest is
link |
00:07:06.400
very, very low. And so what you want to do when you're trying to increase diversity is find something
link |
00:07:13.840
that is not too similar to the things that you've watched, but also something that you might be
link |
00:07:21.760
likely to watch. And that balance, finding that spot between those two things is quite challenging.
link |
00:07:28.720
So the diversity of content, diversity of ideas, it's a really difficult,
link |
00:07:36.000
it's the thing like that's almost impossible to define, right? Like what's different?
link |
00:07:41.680
So how do you think about that? So two examples is I'm a huge fan of Three Blue One Brown, say,
link |
00:07:48.800
and then one diversity, I wasn't even aware of a channel called Veritasium,
link |
00:07:54.240
which is a great science, physics, whatever channel. So one version of diversity is showing me
link |
00:08:01.120
Derek's Veritasium channel, which I was really excited to discover actually and now watch a lot
link |
00:08:05.840
of his videos. Okay, so you're a person who's watching some math channels and you might be
link |
00:08:12.240
interested in some other science or math channels. So like you mentioned, the first kind of diversity
link |
00:08:17.360
is just show you some, some things from other channels that are related, but not just, you know,
link |
00:08:25.360
not all the Three Blue One Brown channel throw in a couple others. So that's the, maybe the first
link |
00:08:31.600
kind of diversity that we started with many, many years ago. Taking a bigger leap is about,
link |
00:08:40.320
I mean, the mechanisms we do, we use for that is, is we basically cluster videos and channels
link |
00:08:46.960
together, mostly videos, we do every, almost everything at the video level. And so we'll,
link |
00:08:51.520
we'll make some kind of a cluster via some embedding process. And then, and then measure,
link |
00:08:58.560
you know, what is the likelihood that a, that users who watch one cluster might also watch
link |
00:09:04.480
another cluster that's very distinct. So we may come to find that, that people who watch science
link |
00:09:11.680
videos also like jazz. This is possible, right? And so, and so because of that relationship that
link |
00:09:20.240
we've identified through the measure, through the embeddings and then the measurement of the
link |
00:09:26.720
people who watch both, we might recommend a jazz video once in a while. So there's this
link |
00:09:32.160
cluster in the embedding space of jazz videos and science videos. And so you kind of try to look at
link |
00:09:38.640
aggregate statistics where if a lot of people that jump from science cluster to the jazz
link |
00:09:45.920
cluster tend to remain as engaged or become more engaged, then that's, that means those two
link |
00:09:54.240
are, they should hop back and forth and they'll be, they'll be happy.
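To make the cluster-hopping idea above concrete, here is a minimal sketch in Python. It is purely illustrative, not YouTube's implementation: the data layout, the cluster labels, and the affinity measure are assumptions for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical inputs: video_cluster maps a video id to its embedding-derived cluster,
# histories is a list of per-user watch histories (lists of video ids).
def cluster_affinities(histories, video_cluster):
    cluster_users = defaultdict(set)              # cluster -> users who watched it
    for user_id, history in enumerate(histories):
        for video in history:
            cluster_users[video_cluster[video]].add(user_id)

    affinities = {}
    for a, b in combinations(cluster_users, 2):
        overlap = len(cluster_users[a] & cluster_users[b])
        # Crude estimate of P(user also watches b | user watches a), and vice versa.
        affinities[(a, b)] = overlap / len(cluster_users[a])
        affinities[(b, a)] = overlap / len(cluster_users[b])
    return affinities

# A "diverse" recommendation can then favor a distinct cluster with high affinity to
# what the user already watches, e.g. ("science", "jazz") scoring higher than
# ("science", "backyard_railroads").
```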
link |
00:09:57.280
Right. There's a higher likelihood that a person who's watching science would like jazz than
link |
00:10:03.840
the person watching science would like, I don't know, backyard railroads or, or something else,
link |
00:10:08.320
right? And so we can try to measure these likelihoods and use that to make the best recommendation we
link |
00:10:15.280
can. So, okay. So we'll talk about the machine learning of that, but I have to linger on things
link |
00:10:21.040
that neither you or anyone have an answer to. There's gray areas of truth, which is, for example,
link |
00:10:29.840
now I can't believe I'm going there, but politics, it, it, it happens so that certain people believe
link |
00:10:37.200
certain things and they're very certain about them. Let's move outside the red versus blue
link |
00:10:42.400
politics of today's world. But there's different ideologies. For example, in college, I read quite
link |
00:10:48.560
a lot of Ayn Rand, I studied it, and that's a particular philosophical ideology I found interesting
link |
00:10:54.160
to explore. Okay. So that was that kind of space. I've kind of moved on from that cluster,
link |
00:10:59.200
intellectually, but it nevertheless is an interesting cluster. There's, I was born in
link |
00:11:02.960
the Soviet Union, socialism, communism is a certain kind of political ideology that's
link |
00:11:08.000
really interesting to explore. Again, objectively, just there's a set of beliefs about how the economy
link |
00:11:13.440
should work and so on. And so it's hard to know what's true or not in terms of people within
link |
00:11:19.440
those communities are often advocating that this is how we achieve utopia in this world.
link |
00:11:25.120
And they're pretty certain about it. So how do you try to manage politics in this chaotic,
link |
00:11:34.080
divisive world, not positive or any kind of ideas in terms of filtering what people should
link |
00:11:39.520
watch next and in terms of also not letting certain things be on YouTube? This is an exceptionally
link |
00:11:47.520
difficult responsibility. Right. Well, the responsibility to get this right is our top
link |
00:11:53.360
priority. And the first comes down to making sure that we have good, clear rules of the road.
link |
00:12:03.360
Just because we have freedom of speech doesn't mean that you can literally say anything. Like
link |
00:12:07.600
we as a society have accepted certain restrictions on our freedom of speech. There are things like
link |
00:12:15.280
libel laws and things like that. And so where we can draw a clear line, we do and we continue to
link |
00:12:23.440
evolve that line over time. However, as you pointed out, wherever you draw the line, there's going to
link |
00:12:30.720
be a borderline. And in that borderline area, we are going to maybe not remove videos, but we will
link |
00:12:40.160
try to reduce the recommendations of them or the proliferation of them by demoting them. And then
link |
00:12:47.280
alternatively, in those situations, try to raise what we would call authoritative or credible
link |
00:12:54.000
sources of information. You mentioned Ayn Rand and communism. Those are two valid points of view
link |
00:13:05.680
that people are going to debate and discuss. And of course, people who believe in one or the other
link |
00:13:12.240
of those things are going to try to persuade other people to their point of view. And so
link |
00:13:18.240
we're not trying to settle that or choose a side or anything like that. What we're trying
link |
00:13:23.040
to do is make sure that the people who are expressing those points of view and offering
link |
00:13:30.240
those positions are authoritative and credible. So let me ask a question about people I don't like
link |
00:13:38.320
personally. You heard me. I don't care if you leave comments on this. But sometimes they're
link |
00:13:45.120
brilliantly funny, which is trolls. So people who kind of mock, I mean, the internet is full,
link |
00:13:53.520
the Reddit of mock style comedy, where people just kind of make fun of, point out that the emperor
link |
00:14:00.880
has no clothes. And there's brilliant comedy in that. But sometimes you can get cruel and mean.
link |
00:14:06.880
So on that, on the mean point, and sorry to linger on these things that have no good answers,
link |
00:14:13.920
but actually, I totally hear you that this is really important that you're trying to solve it.
link |
00:14:19.920
But how do you reduce the meanness of people on YouTube?
link |
00:14:27.120
I understand that anyone who uploads YouTube videos has to become resilient to a certain
link |
00:14:33.600
amount of meanness. I've heard that from many creators. And we are trying in various ways,
link |
00:14:43.600
comment ranking, allowing certain features to block people to reduce or make that meanness or
link |
00:14:52.320
that trolling behavior less effective on YouTube. And so, I mean, it's very important. But it's
link |
00:15:04.080
something that we're going to keep having to work on. And as we improve it, maybe we'll get
link |
00:15:09.920
to a point where people don't have to suffer this sort of meanness when they upload YouTube videos.
link |
00:15:16.800
I hope we do. But it just does seem to be something that you have to be able to deal with
link |
00:15:24.320
as a YouTube creator nowadays. Do you have a hope that, so you mentioned two things that
link |
00:15:28.400
kind of agree with this. So there's like a machine learning approach of ranking
link |
00:15:34.320
comments based on whatever, based on how much they contribute to the healthy conversation.
link |
00:15:39.760
Let's put it that way. And the other is almost an interface question of how do you,
link |
00:15:47.200
how does the creator filter, so block or, how do humans themselves, the users of
link |
00:15:54.480
YouTube manage their own conversation? Do you have hope that these two tools will
link |
00:15:59.280
create a better society without limiting freedom of speech too much? Without sort of
link |
00:16:05.440
attacking, even like saying that, people like, what do you mean limiting sort of curating speech?
link |
00:16:12.560
I mean, I think that that overall is our whole project here at YouTube.
link |
00:16:16.960
Right. Like, we fundamentally believe and I personally believe very much that YouTube can
link |
00:16:23.440
be great. It's been great for my kids. I think it can be great for society. But it's absolutely
link |
00:16:30.640
critical that we get this responsibility part right. And that's why it's our top priority.
link |
00:16:37.040
Susan Wojcicki, who's the CEO of YouTube, she says something that I personally find very inspiring,
link |
00:16:42.960
which is that we want to do our jobs today in a manner so that people 20 and 30 years from now
link |
00:16:51.520
will look back and say, you know, YouTube, they really figured this out. They really found a way
link |
00:16:56.560
to strike the right balance between the openness and the value that the openness has, and also
link |
00:17:03.200
making sure that we are meeting our responsibility to users in society.
link |
00:17:09.040
So the burden on YouTube actually is quite incredible. And the one thing that people don't
link |
00:17:15.360
give enough credit to is the seriousness and the magnitude of the problem, I think.
link |
00:17:19.360
So I personally hope that you do solve it because a lot is in your hand. A lot is riding on your
link |
00:17:27.200
success or failure. So it's besides, of course, running a successful company, you're also curating
link |
00:17:33.680
the content of the internet and the conversation on the internet. That's a powerful thing.
link |
00:17:40.160
So one thing that people wonder about is how much of it can be solved with pure machine learning?
link |
00:17:48.880
So looking at the data, studying the data and creating algorithms that curate the comments,
link |
00:17:55.280
curate the content, and how much of it needs human intervention, meaning people here at
link |
00:18:02.640
YouTube in a room sitting and thinking about what is the nature of truth? What is, what are the
link |
00:18:11.680
ideals that we should be promoting, that kind of thing? So algorithm versus human
link |
00:18:17.760
input. What's your sense? I mean, my own experience has demonstrated that you need both of those
link |
00:18:24.480
things. Algorithms, I mean, you're familiar with machine learning algorithms. And the thing they
link |
00:18:30.720
need most is data. And the data is generated by humans. And so for instance, when we're building
link |
00:18:39.600
a system to try to figure out which are the videos that are misinformation or borderline
link |
00:18:47.280
policy violations, well, the first thing we need to do is get human beings to make decisions about
link |
00:18:54.720
which of those videos are in which category. And then we use that data and basically take
link |
00:19:02.640
that information that's determined and governed by humans and extrapolate it or apply it to the
link |
00:19:10.800
entire set of billions of YouTube videos. And we couldn't get to all the videos on YouTube well
link |
00:19:19.360
without the humans. And we couldn't use the humans to get to all the videos of YouTube.
link |
00:19:24.320
So there's no world in which you have only one or the other of these things. And just as you said,
link |
00:19:32.320
a lot of it comes down to people at YouTube spending a lot of time trying to figure out what
link |
00:19:41.040
are the right policies? What are the outcomes based on those policies? Are they the kinds of
link |
00:19:47.040
things we want to see? And then once we kind of get an agreement or build some consensus around
link |
00:19:55.120
what the policies are, well, then we've got to find a way to implement those policies across all
link |
00:20:00.400
of YouTube. And that's where both the human beings, we call them evaluators or reviewers,
link |
00:20:07.280
come into play to help us with that. And then once we get a lot of training data from them,
link |
00:20:12.800
then we apply the machine learning techniques to take it even further.
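As an illustration of that extrapolation step, here is a minimal sketch: a handful of hypothetical human decisions train a simple text classifier that is then applied to videos the reviewers never saw. The features, labels, and model choice are assumptions for the example, not a description of YouTube's production system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical reviewer-labeled data: video text (title + description) and a 0/1
# "borderline" decision made by human evaluators.
labeled_text = ["miracle cure doctors do not want you to know",
                "intro to linear algebra, lecture 1"]
labeled_is_borderline = [1, 0]

# Learn from the human decisions...
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(labeled_text, labeled_is_borderline)

# ...then apply that learned decision to videos the reviewers never looked at.
unlabeled_text = ["one weird trick to cure the flu overnight",
                  "baking sourdough bread at home"]
borderline_scores = model.predict_proba(unlabeled_text)[:, 1]  # probability of "borderline"
```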
link |
00:20:16.240
Do you have a sense that these human beings have a bias in some kind of direction?
link |
00:20:22.800
I mean, that's an interesting question. We do sort of in autonomous vehicles and computer vision
link |
00:20:30.800
in general, a lot of annotation. And we rarely ask what bias do the annotators have? Even in the
link |
00:20:42.000
sense that they're better at annotating certain things than others. For example, people are much
link |
00:20:48.560
better at segmenting cars in a scene versus segmenting bushes or trees.
link |
00:20:58.880
You know, there's specific mechanical reasons for that, but also because it's semantic gray area.
link |
00:21:04.960
And just for a lot of reasons, people are just terrible at annotating trees.
link |
00:21:09.520
Okay. So in the same kind of sense, do you think of in terms of people reviewing videos or annotating
link |
00:21:15.840
the content of videos, is there some kind of bias that you're aware of or seek out in that human input?
link |
00:21:24.160
Well, we take steps to try to overcome these kinds of biases or biases that we think would be
link |
00:21:31.040
problematic. So for instance, we ask people to have a bias towards scientific consensus. That's
link |
00:21:38.560
something that we instruct them to do. We ask them to have a bias towards demonstration of
link |
00:21:46.720
expertise or credibility or authoritativeness. But there are other biases that we want to
link |
00:21:52.960
make sure to try to remove. And there's many techniques for doing this. One of them is you
link |
00:21:59.280
send the same thing to be reviewed to many people. And so that's one technique. Another is that you
link |
00:22:06.880
make sure that the people that are doing these sorts of tasks are from different backgrounds and
link |
00:22:13.520
different areas of the United States or of the world. But then even with all of that, it's possible
link |
00:22:19.440
for certain kinds of what we would call unfair biases to creep into machine learning systems,
link |
00:22:27.600
primarily as you said, because maybe the training data itself comes in in a biased way. And so
link |
00:22:33.760
we also have worked very hard on improving the machine learning systems to remove and reduce
link |
00:22:41.760
unfair biases when it goes against or involves some protected class, for instance.
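A minimal sketch of the first technique mentioned, sending the same item to several reviewers and aggregating their decisions; the labels, agreement threshold, and escalation rule are illustrative assumptions.

```python
from collections import Counter

def aggregate_reviews(reviews_by_video, min_agreement=0.75):
    """reviews_by_video: {video_id: [label, label, ...]} from independent reviewers."""
    decisions = {}
    for video_id, labels in reviews_by_video.items():
        label, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)
        # Low-agreement items get escalated rather than trusted to any single reviewer.
        decisions[video_id] = label if agreement >= min_agreement else "needs_escalation"
    return decisions

print(aggregate_reviews({"v1": ["ok", "ok", "ok", "borderline"],
                         "v2": ["ok", "borderline", "borderline", "ok"]}))
# {'v1': 'ok', 'v2': 'needs_escalation'}
```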
link |
00:22:50.000
Thank you for exploring with me some of the more challenging things. I'm sure there's a few more
link |
00:22:56.000
that we'll jump back to. But let me jump into the fun part, which is maybe the basics
link |
00:23:02.800
of the quote unquote YouTube algorithm. What does the YouTube algorithm look at to make
link |
00:23:09.760
recommendation for what to watch next from a machine learning perspective? Or when you
link |
00:23:16.080
search for a particular term, how does it know what to show you next? Because it seems to, at
link |
00:23:21.440
least for me, do an incredible job of both. Well, that's kind of you to say. It didn't
link |
00:23:27.440
used to do a very good job. But it's gotten better over the years. Even I observe that it's
link |
00:23:34.080
improved quite a bit. Those are two different situations. Like when you search for something,
link |
00:23:40.320
YouTube uses the best technology we can get from Google to make sure that the YouTube search system
link |
00:23:48.800
finds what someone's looking for. And of course, the very first things that one thinks about is,
link |
00:23:54.640
okay, well, does the word occur in the title? For instance, but there are much more sophisticated
link |
00:24:03.440
things where we're mostly trying to do some syntactic match or maybe a semantic match based on
link |
00:24:12.400
words that we can add to the document itself. For instance, maybe is this video
link |
00:24:19.920
watched a lot after this query? That's something that we can observe. And then as a result,
link |
00:24:29.120
make sure that that document would be retrieved for that query. Now, when you talk about what kind
link |
00:24:36.240
of videos would be recommended to watch next, that's something, again, we've been working on for
link |
00:24:43.760
many years. And probably the first real attempt to do that well was to use collaborative filtering.
link |
00:24:56.960
So you can describe what collaborative filtering is?
link |
00:24:59.520
Sure. It's just basically what we do is we observe which videos get watched close together
link |
00:25:07.680
by the same person. And if you observe that, and if you can imagine creating a graph where the videos
link |
00:25:16.160
that get watched close together by the most people are sort of very close to one another in this
link |
00:25:21.760
graph, and videos that don't frequently get watched close together by the same person or
link |
00:25:27.120
the same people are far apart, then you end up with this graph that we call the related
link |
00:25:34.320
graph that basically represents videos that are very similar or related in some way. And
link |
00:25:42.080
what's amazing about that is that it puts all the videos that are in the same language together,
link |
00:25:48.800
for instance. And we didn't even have to think about language. It just does it, right? And it
link |
00:25:55.360
puts all the videos that are about sports together, and it puts most of the music videos together,
link |
00:25:59.920
and it puts all of these sorts of videos together just because that's sort of the way the people
link |
00:26:06.560
using YouTube behave. So that already cleans up a lot of the problem. It takes care of the
link |
00:26:14.240
lowest hanging fruit, which happens to be a huge one of just managing these millions of videos.
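A minimal sketch of the collaborative filtering idea just described: count how often two videos are watched close together by the same person and keep the strongest pairs as edges of a related graph. The window size, threshold, and data layout are assumptions for the example.

```python
from collections import Counter

def build_related_graph(watch_histories, window=5, min_cowatch=2):
    """watch_histories: one chronological list of video ids per user."""
    cowatch = Counter()
    for history in watch_histories:
        for i, video in enumerate(history):
            # "Close together": within the next `window` videos watched by this person.
            for other in history[i + 1 : i + 1 + window]:
                if other != video:
                    cowatch[frozenset((video, other))] += 1
    # Keep the strongest pairs as edges; the weight is how often the pair was co-watched.
    return {tuple(pair): count for pair, count in cowatch.items() if count >= min_cowatch}

# Same-language and same-topic videos tend to end up connected with no explicit
# language or topic features, simply because the same people watch them near each other.
```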
link |
00:26:20.720
That's right. I remember a few years ago, I was talking to someone who was
link |
00:26:25.840
trying to propose that we do a research project concerning people who are bilingual. And this
link |
00:26:37.280
person was making this proposal based on the idea that YouTube could not possibly be good
link |
00:26:45.200
at recommending videos well to people who are bilingual. And so she was telling me
link |
00:26:53.280
about this, and I said, well, can you give me an example of what problem do you think we have on
link |
00:26:58.480
YouTube with the recommendations? And so she said, well, I'm a researcher in the U.S., and when I'm
link |
00:27:05.840
looking for academic topics, I want to see them in English. And so she searched for one,
link |
00:27:11.680
found a video, and then looked at the watch next suggestions, and they were all in English.
link |
00:27:16.560
And so she said, oh, I see. YouTube must think that I speak only English. And so she said,
link |
00:27:21.760
now I'm actually originally from Turkey, and sometimes when I'm cooking, let's say I want to
link |
00:27:26.080
make some baklava, I really like to watch videos that are in Turkish. And so she searched for a
link |
00:27:31.520
video about making the baklava, and then selected it, and it was in Turkish. And the watch next
link |
00:27:36.560
recommendations were in Turkish. And she just couldn't believe how this was possible. And how
link |
00:27:42.960
is it that you know that I speak both these two languages and put all the videos together? And
link |
00:27:47.200
it's just sort of an outcome of this related graph that's created through collaborative filtering.
link |
00:27:54.000
So for me, one of my huge interests is just human psychology, right? And that's such a powerful
link |
00:28:00.320
platform on which to utilize human psychology to discover what individual people want to watch
link |
00:28:06.560
next. But it's also be just fascinating to me. You know, I've Google search has ability to look
link |
00:28:15.120
at your own history. And I've done that before, just looked at what I've searched for over
link |
00:28:21.440
many, many years. And it's a fascinating picture of who I am actually. And I don't think anyone's
link |
00:28:28.240
ever summarized. I personally would love that a summary of who I am as a person on the internet
link |
00:28:35.200
to me, because I think it reveals, I think it puts a mirror to me or to others, you know,
link |
00:28:42.320
that's actually quite revealing and interesting. You know, just maybe, it's a joke,
link |
00:28:49.520
but not really, the number of cat videos I've watched or videos of people falling,
link |
00:28:54.800
you know, stuff that's absurd, that kind of stuff. It's really interesting. And of course,
link |
00:29:00.720
it's really good for the machine learning aspect to, to show, to figure out what to show next.
link |
00:29:06.960
But it's interesting. Have you just as a tangent played around with the idea of giving a map to
link |
00:29:14.480
people sort of as opposed to just using this information to show us next, showing them here
link |
00:29:22.240
are the clusters you've loved over the years kind of thing. Well, we do provide the history of all
link |
00:29:27.840
the videos that you've watched. Yes. So you can definitely search through that and look through
link |
00:29:32.160
it and search through it to see what it is that you've been watching on YouTube. We have actually,
link |
00:29:38.000
in various times, experimented with this sort of cluster idea, finding ways to demonstrate or show
link |
00:29:46.080
people what topics they've been interested in or what clusters they've watched from.
link |
00:29:52.480
It's interesting that you bring this up because in some sense, the way the recommendation system
link |
00:29:59.760
of YouTube sees a user is exactly as the history of all the videos they've watched on YouTube.
link |
00:30:06.800
And so you can think of yourself or any user on YouTube as kind of like a DNA strand of all
link |
00:30:18.080
your videos, right? That sort of represents you. You can also think of it as maybe a vector in
link |
00:30:24.640
the space of all the videos on YouTube. And so now, once you think of it as a vector in the
link |
00:30:31.920
space of all the videos on YouTube, then you can start to say, okay, well, which other vectors
link |
00:30:37.600
are close to me and to my vector? And that's one of the ways that we generate some diverse
link |
00:30:44.880
recommendations is because you're like, okay, well, these people seem to be close with respect
link |
00:30:50.960
to the videos they've watched on YouTube. But here's a topic or a video that one of them has
link |
00:30:56.560
watched and enjoyed, but the other one hasn't. That could be an opportunity to make a good
link |
00:31:02.000
recommendation. I gotta tell you, I'm gonna ask for things that are impossible, but I would love
link |
00:31:07.440
to cluster the human beings. I would love to know who has similar trajectories as me,
link |
00:31:13.040
because you probably would want to hang out. There's a social aspect there.
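Before going on, here is a minimal sketch of the "user as a vector" idea described above: each user is represented by an average of the embeddings of the videos they have watched, nearby users are found by cosine similarity, and their unseen videos become candidate recommendations. The embeddings, names, and parameters are hypothetical.

```python
import numpy as np

def user_vector(history, video_embeddings):
    """Represent a user as the mean embedding of the videos they have watched."""
    return np.mean([video_embeddings[v] for v in history], axis=0)

def recommend_from_neighbors(my_history, other_histories, video_embeddings, k=3):
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    me = user_vector(my_history, video_embeddings)
    others = [(user_vector(h, video_embeddings), h) for h in other_histories]
    # Nearest users in the video-embedding space.
    neighbors = sorted(others, key=lambda vh: cosine(me, vh[0]), reverse=True)[:k]
    # Candidate recommendations: what the neighbors watched that I have not.
    seen = set(my_history)
    return [v for _, h in neighbors for v in h if v not in seen]
```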
link |
00:31:17.040
Like actually finding some of the most fascinating people I find on YouTube have like no followers,
link |
00:31:22.000
and I start following them and they create incredible content. And you know, and on that topic,
link |
00:31:27.200
I just love to ask, there's some videos that just blow my mind in terms of quality and depth.
link |
00:31:34.080
And just in every regard are amazing videos and they have like 57 views. Okay. How do you get
link |
00:31:43.200
videos of quality to be seen by many eyes? So the measure of quality, is it just something?
link |
00:31:50.800
Yeah. How do you know that something is good? Well, I mean, I think it depends initially on
link |
00:31:57.040
what sort of video we're talking about. So in the realm of let's say, you mentioned politics and news,
link |
00:32:04.640
in that realm, quality news or quality journalism relies on having a journalism
link |
00:32:17.360
department, right? Like you have to have actual journalists and fact checkers and people like
link |
00:32:21.920
that. And so in that situation, and in others, maybe science or in medicine, quality has a lot
link |
00:32:30.800
to do with the authoritativeness and the credibility and the expertise of the people who make the video.
link |
00:32:37.360
Now, if you think about the other end of the spectrum,
link |
00:32:40.880
you know, what is the highest quality prank video? Or what is the highest quality
link |
00:32:46.160
Minecraft video, right? That might be the one that people enjoy watching the most and watch to the
link |
00:32:53.360
end. Or it might be the one that when we ask people the next day after they watched it,
link |
00:33:02.160
were they satisfied with it? And so we, especially in the realm of entertainment,
link |
00:33:09.120
have been trying to get at better and better measures of quality or satisfaction or enrichment
link |
00:33:17.120
since I came to YouTube. And we started with, well, you know, the first approximation is the one that
link |
00:33:22.880
gets more views. But, you know, we both know that things can get a lot of views and not really be
link |
00:33:31.520
that high quality, especially if people are clicking on something and then immediately
link |
00:33:36.160
realizing that it's not that great and abandoning it. And that's why we moved from views to thinking
link |
00:33:43.360
about the amount of time people spend watching it, with the premise that, like,
link |
00:33:48.800
you know, in some sense, the time that someone spends watching a video is related to the value
link |
00:33:55.840
that they get from that video. It may not be perfectly related, but it has something to say
link |
00:34:00.480
about how much value they get. But even that's not good enough, right? Because I myself have spent
link |
00:34:07.920
time clicking through channels on television late at night and ended up watching Under Siege 2,
link |
00:34:14.480
for some reason, I don't know. And if you were to ask me the next day, are you glad that you
link |
00:34:18.880
watched that show on TV last night? I'd say, yeah, I wish I would have gone to bed or read a book or
link |
00:34:26.000
almost anything else really. And so that's why some people got the idea a few years ago to try
link |
00:34:33.520
to survey users afterwards. And so we get feedback data from those surveys and then use that in the
link |
00:34:42.720
machine learning system to try to not just predict what you're going to click on right now,
link |
00:34:47.120
what you might watch for a while, but what when we ask you tomorrow, you'll give four or five stars
link |
00:34:52.800
to. So just to summarize, what are the signals from the machine learning perspective that the
link |
00:34:59.120
user can provide? So you mentioned just clicking on the video views, the time watch, maybe the
link |
00:35:04.560
relative time watch, the clicking like and dislike on the video, maybe commenting on the video,
link |
00:35:13.440
all those things, all those things. And then the one I wasn't actually quite aware of,
link |
00:35:18.560
even though I might have engaged in it is a survey afterwards, which is a brilliant idea.
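To illustrate how these signals might be combined, here is a minimal sketch that folds click, watch time, like/dislike, comment, and share signals into features for a model trained against the next-day survey rating. The feature set, numbers, and model are assumptions for the example, not YouTube's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Each row is one (user, video) interaction; the target is the next-day survey answer.
# Columns: watch_seconds, fraction_watched, liked, disliked, commented, shared
X = np.array([
    [620, 0.95, 1, 0, 1, 0],
    [ 45, 0.08, 0, 0, 0, 0],
    [300, 0.50, 0, 1, 0, 0],
    [900, 1.00, 1, 0, 0, 1],
])
y = np.array([5, 2, 1, 5])        # stars the user reported the following day

satisfaction_model = GradientBoostingRegressor().fit(X, y)
# Candidate videos can then be ranked by predicted satisfaction rather than
# by predicted clicks or raw watch time alone.
```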
link |
00:35:24.400
Is there other signals? I mean, that's already a really rich space of signals to learn from.
link |
00:35:30.480
Is there something else? Well, you mentioned commenting, also sharing the video. If you
link |
00:35:36.720
think it's worthy to be shared with someone else, you know, within YouTube or outside of YouTube
link |
00:35:40.800
as well. Either. Let's see, you mentioned like, dislike. Like and dislike. How important is that?
link |
00:35:46.400
It's very important, right? We want, it's predictive of satisfaction. But it's not, it's
link |
00:35:53.680
not perfectly predictive. Subscribe. If you subscribe to the channel of the person who
link |
00:36:00.640
made the video, then that also is a piece of information that signals satisfaction. Although
link |
00:36:08.800
over the years, we've learned that people have a wide range of attitudes about what it means to
link |
00:36:14.720
subscribe. We would ask some users who didn't subscribe very much, but they watched a lot from
link |
00:36:23.920
a few channels. We'd say, well, why didn't you subscribe? And they would say, well, I can't
link |
00:36:27.840
afford to pay for anything. And, you know, we tried to let them understand like, actually,
link |
00:36:33.840
it doesn't cost anything. It's free. It just helps us know that you are very interested in this
link |
00:36:39.200
creator. But then we've asked other people who subscribed to many things and don't really watch
link |
00:36:47.040
any of the videos from those channels. And we say, well, why did you subscribe to this if you
link |
00:36:53.040
weren't really interested in any more videos from that channel? And they might tell us,
link |
00:36:57.840
well, I just, you know, I thought the person did a great job and I just want to kind of
link |
00:37:00.800
give them a high five. Yeah. And so, yeah, that's where I said, I actually subscribed to channels
link |
00:37:06.960
where I just think this person is amazing. I like this person,
link |
00:37:15.040
I really want to support them. That's how I click subscribe, even though I may never actually want
link |
00:37:21.200
to click on their videos when they're releasing it. I just love what they're doing. And it's maybe
link |
00:37:25.680
outside of my interest area and so on, which is probably the wrong way to use the subscribe button.
link |
00:37:31.520
Well, I just want to say congrats, this is great work. Well, so you have to deal with all the
link |
00:37:37.040
space of people that see the subscribe button is totally different. That's right. And so, you know,
link |
00:37:41.840
we, we can't just close our eyes and say, sorry, you're using it wrong. You know, we're not going
link |
00:37:47.280
to pay attention to what you've done. We need to embrace all the ways in which all the different
link |
00:37:52.640
people in the world use the subscribe button or the like and the dislike button. So in terms of
link |
00:37:58.800
signals of machine learning, using for the search and for the recommendation, you've mentioned title,
link |
00:38:06.240
so like metadata, like text data that people provide description and title and maybe keywords.
link |
00:38:13.360
So maybe you can speak to the value of those things in search and also this incredible,
link |
00:38:19.840
fascinating area of the content itself. So the video content itself, trying to understand what's
link |
00:38:25.200
happening in the video. So YouTube released a data set that, you know, in the machine learning
link |
00:38:29.440
computer vision world, this is just an exciting space. How much is that currently? How much are
link |
00:38:35.920
you playing with that currently? How much is your hope for the future of being able to analyze the
link |
00:38:40.160
content of the video itself? Well, we have been working on that also since I came to YouTube.
link |
00:38:46.000
Analyzing the content. Analyzing the content of the video, right. And what I can tell you is that
link |
00:38:52.400
our ability to do it well is still somewhat crude. We can, we can tell if it's a music video. We can
link |
00:39:02.560
tell if it's a sports video. We can probably tell you that people are playing soccer. We probably
link |
00:39:09.920
can't tell whether it's Manchester United or my daughter's soccer team. So these things are kind
link |
00:39:16.800
of difficult and using them, we can use them in some ways. So for instance, we use that kind of
link |
00:39:22.960
information to understand and inform these clusters that I talked about. And also maybe to add some
link |
00:39:31.360
words like soccer, for instance, to the video if it doesn't occur in the title or the description,
link |
00:39:36.800
which is remarkable that often it doesn't. One of the things that I ask creators to do is please
link |
00:39:44.640
help us out with the title in the description. For instance, we were a few years ago having a
link |
00:39:52.800
live stream of some competition for World of Warcraft on YouTube. And it was a very important
link |
00:40:00.960
competition. But if you typed World of Warcraft in search, you wouldn't find it. World of Warcraft
link |
00:40:05.840
wasn't in the title? World of Warcraft wasn't in the title. It was match four, seven, eight, you know,
link |
00:40:11.280
A team versus B team. And World of Warcraft wasn't in the title. Just like, come on, give me.
link |
00:40:17.680
Being literal on the internet is actually very uncool, which is the problem.
link |
00:40:22.160
Oh, is that right?
link |
00:40:23.760
Well, I mean, in some sense, some of the greatest videos, I mean, there's a humor to just being
link |
00:40:29.200
indirect, being witty and so on. And actually, machine learning algorithms want you to be
link |
00:40:36.080
literal. You just want to say what's in the thing, be very, very simple. And in some sense,
link |
00:40:44.080
that gets away from wit and humor. So you have to play with both. But you're saying that for
link |
00:40:50.000
now, the content of the title, the content of the description, the actual text is one of the
link |
00:40:56.560
best ways for the algorithm to find your video and put them in the right cluster.
link |
00:41:02.800
That's right. And I would go further and say that if you want people, human beings, to select
link |
00:41:09.280
your video in search, then it helps to have, let's say, World of Warcraft in the title. Because
link |
00:41:16.240
why would a person's, you know, if they're looking at a bunch, they type World of Warcraft,
link |
00:41:19.840
and they have a bunch of videos, all of whom say World of Warcraft, except the one that you uploaded,
link |
00:41:24.800
well, even the person is going to think, well, maybe somehow search made a mistake.
link |
00:41:29.040
This isn't really about World of Warcraft. So it's important not just for the machine
link |
00:41:33.760
learning systems, but also for the people who might be looking for this sort of thing. They get a
link |
00:41:39.280
clue that it's what they're looking for by seeing that same thing prominently in the title of the
link |
00:41:45.360
video. Okay, let me push back on that. So I think from the algorithm perspective, yes, but if they
link |
00:41:50.560
typed in World of Warcraft and saw a video with the title simply winning, and the thumbnail
link |
00:41:59.040
has like a sad orc or something, I don't know, right? Like, I think that's much, it gets your
link |
00:42:09.840
curiosity up. And then if they could trust that the algorithm was smart enough to figure out somehow
link |
00:42:15.680
that this is indeed a World of Warcraft video, that would have created the most beautiful
link |
00:42:19.760
experience. I think in terms of just the wit and the humor and the curiosity that we human beings
link |
00:42:25.120
naturally have, but you're saying, I mean, realistically speaking, it's really hard for
link |
00:42:29.440
the algorithm to figure out that the content of that video will be a World of Warcraft video.
link |
00:42:34.480
And you have to accept that some people are going to skip it. Yeah, right. I mean, and so you're
link |
00:42:39.600
right. The people who don't skip it and select it are going to be delighted. Yeah. But other people
link |
00:42:47.040
might say, yeah, this is not what I was looking for. And making stuff discoverable, I think,
link |
00:42:52.240
is what you're really working on and hoping. So yeah, so from your perspective, put stuff in
link |
00:42:59.200
the title of the description. And remember, the collaborative filtering part of the system
link |
00:43:04.080
starts by the same user watching videos together, right? So the way that they're probably going
link |
00:43:11.520
to do that is by searching for them. That's a fascinating aspect of it. It's like ant colonies.
link |
00:43:16.320
That's how they find stuff. So I mean, what degree for collaborative filtering in general
link |
00:43:24.560
is one curious ant, one curious user essential? So just a person who is more willing to click on
link |
00:43:31.440
random videos and sort of explore these cluster spaces. In your sense, how many people are just
link |
00:43:37.840
like watching the same thing over and over and over and over? And how many are just like the
link |
00:43:41.680
explorers that just kind of like click on stuff and then help the other ant in the ants colony
link |
00:43:48.160
discover the cool stuff. Do you have a sense of that at all? I really don't think I have a sense
link |
00:43:52.720
for the relative sizes of those groups. But I would say that people come to YouTube with
link |
00:43:58.800
some certain amount of intent. And as long as they, to the extent to which they try to satisfy
link |
00:44:06.720
that intent, that certainly helps our systems, right? Because our systems rely on kind of a
link |
00:44:12.880
faithful amount of behavior, right? And there are people who try to trick us, right? There are people
link |
00:44:19.440
and machines that try to associate videos together that really don't belong together,
link |
00:44:26.080
but they're trying to get that association made because it's profitable for them. And so we have
link |
00:44:32.000
to always be resilient to that sort of attempt at gaming the systems. So speaking to that,
link |
00:44:39.040
there's a lot of people that in a positive way, perhaps, I don't know, I don't like it, but
link |
00:44:44.080
like to game, want to try to game the system to get more attention. Everybody, creators,
link |
00:44:48.720
in a positive sense, want to get attention, right? So how do you, how do you work in this space when
link |
00:44:55.040
people create more and more sort of click baity titles and thumbnails? Sort of, Veritasium's
link |
00:45:03.840
Derek has made a video where he basically describes that it seems what works is to create
link |
00:45:09.360
a high quality video, really good video where people would want to watch and wants to click on it,
link |
00:45:14.560
but have click baity titles and thumbnails to get them to click on it in the first place.
link |
00:45:19.680
And he's saying, I'm embracing this fact, I'm just going to keep doing it. And I hope
link |
00:45:24.320
you forgive me for doing it. And you will enjoy my videos once you click on them.
link |
00:45:28.880
So in what sense do you see this kind of click bait style attempt to manipulate to get people in
link |
00:45:38.080
the door to manipulate the algorithm or play with the algorithm or game the algorithm?
link |
00:45:43.280
I think that that you can look at it as an attempt to game the algorithm. But
link |
00:45:48.640
even if you were to take the algorithm out of it and just say, okay, well, all these videos
link |
00:45:53.120
happen to be lined up, which the algorithm didn't make any decision about which one to
link |
00:45:57.760
put at the top or the bottom, but they're all lined up there, which one are the people going to
link |
00:46:02.480
choose? And I'll tell you the same thing that I told Derek is, you know, I have a bookshelf
link |
00:46:08.960
and it has two kinds of books on it, science books. I have my math books from when I was a
link |
00:46:15.040
student and they all look identical except for the titles on the covers. They're all yellow,
link |
00:46:21.920
they're all from Springer, and every single one of them, the cover is totally the same.
link |
00:46:28.160
Right? On the other hand, I have other more pop science type books, and they all have very
link |
00:46:34.560
interesting covers, right? And they have provocative titles and things like that. I mean, I wouldn't
link |
00:46:40.800
say that they're click baity because they are indeed good books. And I don't think that they
link |
00:46:46.560
cross any line, but that's just a decision you have to make, right? Like the people who write
link |
00:46:54.800
Classical Recursion Theory, Piergiorgio Odifreddi, he was fine with the yellow cover and nothing
link |
00:47:01.680
more. Whereas I think other people who wrote a more popular type book understand that they need
link |
00:47:10.240
to have a compelling cover and a compelling title. And, you know, I don't think there's anything
link |
00:47:16.880
really wrong with that. We do take steps to make sure that there is a line that you don't cross.
link |
00:47:24.400
And if you go too far, maybe your thumbnail is especially racy or, you know, it's all caps with
link |
00:47:31.920
too many exclamation points. We observe that users are kind of, you know, sometimes offended
link |
00:47:40.720
by that. And so for the users who are offended by that, we will then depress or suppress those
link |
00:47:49.520
videos. Which reminds me, there's also another signal where users can say, I don't know if it was
link |
00:47:55.760
recently added, but I really enjoy it, something like I don't want to
link |
00:48:00.960
see this video anymore or something like that. Like there's certain videos that just
link |
00:48:08.000
cut me the wrong way. Like just jump out at me. It's like, I don't want this. And it feels really
link |
00:48:12.800
good to clean that up. To be like, I don't, that's not, that's not for me. I don't know. I think
link |
00:48:19.040
that might have been recently added, but that's also a really strong signal. Yes, absolutely.
link |
00:48:23.600
Right. We don't want to make a recommendation that people are unhappy with. And that makes me,
link |
00:48:29.920
that particular one makes me feel good as a user in general, and as a machine learning person,
link |
00:48:35.040
because I feel like I'm helping the algorithm. My interactions on YouTube don't always feel like
link |
00:48:39.840
I'm helping the algorithm. Like I'm not reminded of that fact. Like for example, Tesla and Autopilot
link |
00:48:46.640
and, you know, Musk create a feeling for their customers, for people that own Teslas, that
link |
00:48:51.600
they're helping the algorithm of Tesla. Like they're all like a really proud, they're helping
link |
00:48:56.000
the fleet learn. I think YouTube doesn't always remind people that you're helping the algorithm
link |
00:49:01.120
get smarter. And for me, I love that idea. Like we're all collaboratively, like Wikipedia gives
link |
00:49:07.280
that sense that we're all together creating a beautiful thing. YouTube doesn't always remind
link |
00:49:13.520
me of that. This conversation is reminding me of that, but... Well, that's a good tip. We should
link |
00:49:19.360
keep that fact in mind when we design these features. I'm not sure I really thought about it
link |
00:49:24.480
that way, but that's a very interesting perspective. It's an interesting question of personalization
link |
00:49:30.800
that I feel like when I click like on a video, I'm just improving my experience.
link |
00:49:39.200
It would be great. It would make me personally, people are different, but make me feel great
link |
00:49:43.840
if I was helping also the YouTube algorithm broadly say something. You know what I'm saying?
link |
00:49:48.000
Like I don't know if that's human nature, but the products you love, and I certainly love
link |
00:49:55.040
YouTube, you want to help it get smarter and smarter and smarter because there's some kind
link |
00:50:00.000
of coupling between our lives together being better. If YouTube was better, then my life
link |
00:50:06.400
will be better. And that's that kind of reasoning. I'm not sure what that is. And I'm not sure how
link |
00:50:09.840
many people share that feeling. That could be just a machine learning feeling. But on that point,
link |
00:50:15.360
how much personalization is there in terms of next video recommendations? So is it kind of
link |
00:50:23.600
all really boiling down to clustering? Like finding the nearest clusters to me and so on and
link |
00:50:31.680
that kind of thing, or how much is personalized to me, the individual completely?
link |
00:50:35.840
It's very, very personalized. So your experience will be quite a bit different from anybody else's
link |
00:50:44.080
who's watching that same video, at least when they're logged in. And the reason is that we found
link |
00:50:51.760
that users often want two different kinds of things when they're watching a video. Sometimes
link |
00:50:58.720
they want to keep watching more on that topic or more in that genre. And other times they just
link |
00:51:06.480
are done and they're ready to move on to something else. And so the question is,
link |
00:51:10.560
well, what is the something else? And one of the first things one can imagine is, well,
link |
00:51:16.400
maybe something else is the latest video from some channel to which you've subscribed. And
link |
00:51:22.640
that's going to be very different for you than it is for me, right? And even if it's not something
link |
00:51:28.800
that you subscribe to, it's something that you watch a lot. And again, that'll be very different
link |
00:51:32.720
on a person by person basis. And so even the watch next, as well as the homepage, of course,
link |
00:51:41.040
is quite personalized. So we mentioned some of the signals, but what does success look like?
link |
00:51:47.280
What does success look like in terms of the algorithm creating a great long term experience
link |
00:51:52.160
for a user? Or put another way, if you look at the videos I've watched this month,
link |
00:51:59.120
how do you know the algorithm succeeded for me? I think, first of all, if you come back and watch
link |
00:52:06.240
more YouTube, then that's one indication that you found some value from it. So just the number of
link |
00:52:11.040
hours is a powerful indicator? Well, I mean, not the hours themselves, but the fact that you return
link |
00:52:19.280
on another day. So that's probably the most simple indicator. People don't come back to things that
link |
00:52:26.880
they don't find value in, right? There's a lot of other things that they could do. But like I said,
link |
00:52:32.560
I mean, ideally, we would like everybody to feel that YouTube enriches their lives and that every
link |
00:52:38.400
video they watched is the best one they've ever watched since they've started watching YouTube.
link |
00:52:44.080
And so that's why we survey them and ask them, like, is this one to five stars? And so our version
link |
00:52:53.360
of success is every time someone takes that survey, they say it's five stars. And if we ask them,
link |
00:53:00.480
is this the best video you've ever seen on YouTube? They say yes, every single time. So it's hard to
link |
00:53:07.040
imagine that we would actually achieve that. Maybe asymptotically, we would get there. But
link |
00:53:12.320
that would be what we think success is. It's funny. I've recently said somewhere, I don't know,
link |
00:53:19.360
maybe tweeted, but that Ray Dalio has this video on the economic machine. I forget what it's called,
link |
00:53:27.600
but it's a 30 minute video. And I said, it's the greatest video I've ever watched on YouTube.
link |
00:53:32.560
It's like, I watched the whole thing and my mind was blown. It's a very crisp, clean description of
link |
00:53:38.560
how at least the American economic system works. It's a beautiful video. And I was just, I wanted
link |
00:53:44.240
to click on something to say, this is the best thing ever. Please let me, I can't believe I
link |
00:53:50.080
discovered it. I mean, the views and the likes reflect its quality. But I was almost upset that
link |
00:53:56.960
I haven't found it earlier and wanted to find other things like it. I don't think I've ever felt
link |
00:54:02.080
that this is the best video I've ever watched. And that was that. And to me, the ultimate Utopia,
link |
00:54:08.400
the best experience would be where I don't see any of the videos I regret and
link |
00:54:13.440
every single video I watched is one that actually helps me grow, helps me enjoy life, be happy,
link |
00:54:19.920
and so on. Well, that's one of the most beautiful
link |
00:54:29.840
and ambitious, I think, machine learning tasks. So when you look at a society as opposed to an
link |
00:54:34.560
individual user, do you think of how YouTube is changing society when you have these millions
link |
00:54:40.880
of people watching videos, growing, learning, changing, having debates? Do you have a sense
link |
00:54:47.440
of, yeah, what the big impact on society is? Because I think it's huge, but do you have a
link |
00:54:52.800
sense of what direction we're taking this world? Well, I mean, I think, you know, openness has had
link |
00:54:59.600
an impact on society already. There's a lot of... What do you mean by openness? Well, the fact that
link |
00:55:06.720
unlike other mediums, there's not someone sitting at YouTube who decides before you can upload your
link |
00:55:15.200
video, whether it's worth having you upload it, or worth anybody seeing it really, right? And so,
link |
00:55:23.360
you know, there are some creators who say, like, I wouldn't have this opportunity to, to reach an
link |
00:55:31.440
audience. Tyler Oakley often said that, you know, he wouldn't have had this opportunity to reach this
link |
00:55:37.920
audience if it weren't for YouTube. And so I think that's one way in which YouTube has changed
link |
00:55:47.840
society. I know that there are people that I work with from outside the United States, especially
link |
00:55:54.800
from places where literacy is low. And they think that YouTube can help in those places because
link |
00:56:04.720
you don't need to be able to read and write in order to learn something important for your life,
link |
00:56:09.680
maybe, you know, how to do some job or how to fix something. And so that's another way in which I
link |
00:56:16.960
think YouTube is possibly changing society. So I've worked at YouTube for eight, almost nine
link |
00:56:24.480
years now. And it's fun because I meet people and, you know, you tell them where you work,
link |
00:56:31.920
you say you work on YouTube, and they immediately say, I love YouTube. Yeah. Right. Which is great,
link |
00:56:37.280
makes me feel great. But then, of course, when I ask them, well, what is it that you love about
link |
00:56:41.920
YouTube? Not one time ever has anybody said that the search works outstanding or that the recommendations
link |
00:56:50.080
are great. What they always say when I ask them, what do you love about YouTube is they immediately
link |
00:56:57.840
start talking about some channel or some creator or some topic or some community that they found on
link |
00:57:04.320
YouTube and that they just love. Yeah. And so that has made me realize that YouTube is really about
link |
00:57:13.520
the video and connecting the people with the videos and then everything else kind of gets out of the
link |
00:57:21.440
way. So beyond the video, it's interesting, because you kind of mentioned creators. What about
link |
00:57:29.120
the connection with individual creators as opposed to just an individual video? So like,
link |
00:57:35.600
I gave the example of the Dalio video, where the video itself is incredible. But there's some people
link |
00:57:43.360
who are just creators that I love. One of the cool things about people who call
link |
00:57:49.680
themselves YouTubers or whatever is that they have a journey. Almost all of them
link |
00:57:54.800
suck horribly in the beginning and then they kind of grow, and then there's that
link |
00:57:59.760
genuineness in their growth. So YouTube clearly wants to help creators connect with their audience
link |
00:58:06.720
in this kind of way. So how do you think about that process of helping creators grow,
link |
00:58:11.280
helping them connect with their audience, develop not just individual videos, but the
link |
00:58:16.000
entirety of a creator's life on YouTube? Well, I mean, we're trying to help creators find the
link |
00:58:21.920
biggest audience that they can find. And the reason why, you brought up creator versus
link |
00:58:28.640
video, the reason why the creator or channel is so important is because if we have a hope of people
link |
00:58:37.760
coming back to YouTube, well, they have to have in their minds some sense of what they're going to
link |
00:58:45.120
find when they come back to YouTube. If YouTube were just the next viral video, and I have no
link |
00:58:53.280
concept of what the next viral video could be: one time it's a cat playing a piano and the next day
link |
00:58:58.240
it's some children interrupting a reporter and the next day it's some other thing happening,
link |
00:59:05.360
then it's hard for me, when I'm not watching YouTube, to say, gosh, I really would like to see
link |
00:59:13.520
something from someone or about something. And so that's why I think this connection between
link |
00:59:20.400
fans and creators is so important for both because it's a way of fostering a relationship
link |
00:59:30.080
that can play out into the future. Let me talk about kind of a dark and interesting question
link |
00:59:37.600
in general. And again, a topic that neither you nor anybody really has an answer to. But social media,
link |
00:59:46.800
you know, gives us highs and it gives us lows, in the sense that creators often speak
link |
00:59:53.120
about having sort of burn out and having psychological ups and downs and challenges
link |
00:59:59.520
mentally in terms of continuing the creation process. There's a momentum. There's a huge
link |
01:00:04.320
excited audience that makes creators feel great. And I think it's more
link |
01:00:10.160
than just financial. I think it's literally just they love that sense of community. It's part of
link |
01:00:16.560
the reason I upload to YouTube. I don't care about money. Never will. What I care about is the
link |
01:00:21.840
community. But some people feel this momentum even when there are times in their life when
link |
01:00:27.520
they, for some reason, don't feel like creating. So how do you think about
link |
01:00:32.800
burnout, this mental exhaustion that some YouTube creators go through? Is that something we have
link |
01:00:38.960
an answer for? How do we even think about that? Well, the first thing is we
link |
01:00:42.960
want to make sure that the YouTube systems are not contributing to this sense, right? And so
link |
01:00:49.760
we've done a fair amount of research to demonstrate that you can absolutely take a break.
link |
01:00:57.760
If you are a creator and you've been uploading a lot, we have just as many examples of people who
link |
01:01:03.680
took a break and came back more popular than they were before as we have examples of going the other
link |
01:01:09.440
way. Yeah, can we pause on that for a second? So the feeling that people have, I think, is if I take
link |
01:01:14.640
a break, everybody, the party will leave, right? So if you could just linger on that. So in your
link |
01:01:22.080
sense that taking a break is okay. Yes, taking a break is absolutely okay. And the reason I say
link |
01:01:28.560
that is because we can observe many examples of creators coming back very
link |
01:01:37.600
strong and even stronger after they have taken some sort of break. And so I just want to dispel the
link |
01:01:43.520
myth that this somehow necessarily means that your channel is going to go down or lose views.
link |
01:01:53.120
That is not the case. We know for sure that this is not a necessary outcome. And so we want to
link |
01:02:01.200
encourage people to make sure that they take care of themselves. That is job one, right? You have
link |
01:02:06.160
to look after yourself and your mental health. And you know, I think that it probably in some of
link |
01:02:14.000
these cases contributes to better videos once they come back, right? Because a lot of people,
link |
01:02:21.600
I know myself, if I burn out on something, then I'm probably not doing my best work,
link |
01:02:25.920
even though I can keep working until I pass out. And so I think that taking a break
link |
01:02:33.360
may even improve the creative ideas that someone has.
link |
01:02:39.280
Okay, I think it's a really important thing to dispel. I think it applies to all of social media.
link |
01:02:45.520
Like literally, I've taken a break for a day every once in a while. Sorry, sorry if that sounds
link |
01:02:52.720
like a short time. But even like email, just taking a break from email or only checking email once a
link |
01:02:59.520
day, especially when you're going through something psychologically in your personal life or so on,
link |
01:03:04.880
or really not sleeping much because of work deadlines, it can refresh you in a way that's
link |
01:03:10.240
profound. And so the same applies. It was there when you came back, right? It's there. And it looks
link |
01:03:15.520
different actually when you come back. You're sort of brighter eyed with some coffee, everything,
link |
01:03:20.560
the world looks better. So it's important to take a break when you need it.
link |
01:03:24.880
So you've mentioned kind of how the YouTube algorithm isn't, you know, E equals mc squared.
link |
01:03:32.480
It's not the single equation. It's potentially sort of more than a million lines of code.
link |
01:03:40.560
Is it more akin to what successful autonomous vehicles today are,
link |
01:03:46.240
which is they're just basically patches on top of patches of heuristics and human experts really
link |
01:03:53.520
tuning the algorithm, with some machine learning modules? Or is it becoming more and more a giant
link |
01:04:01.200
machine learning system with humans just doing a little bit of tweaking here and there? What's
link |
01:04:06.240
your sense? First off, do you even have a sense of what is the YouTube algorithm at this point?
link |
01:04:11.120
And however much of a sense you do have, what does it look like?
link |
01:04:15.680
Well, we don't usually think about it as the algorithm because it's a bunch of systems that
link |
01:04:22.000
work on different services. The other thing that I think people don't understand is that
link |
01:04:28.880
what you might refer to as the YouTube algorithm from outside of YouTube is actually
link |
01:04:35.280
a bunch of code and machine learning systems and heuristics, but that's married with the behavior
link |
01:04:42.400
of all the people who come to YouTube every day. So the people are part of the code, essentially.
link |
01:04:46.480
Exactly, right? Like if there were no people who came to YouTube tomorrow, then the algorithm
link |
01:04:50.960
wouldn't work anymore, right? So that's a critical part of the algorithm. And so when people talk
link |
01:04:56.320
about, well, the algorithm does this, the algorithm does that, it's sometimes hard to understand.
link |
01:05:00.960
Well, you know, it could be the viewers are doing that and the algorithm is mostly just keeping track
link |
01:05:07.360
of what the viewers do and then reacting to those things in sort of more fine grained situations.
link |
01:05:15.440
And I think that this is the way that the recommendation system and the search system and
link |
01:05:21.680
probably many machine learning systems evolve is you start trying to solve a problem and the
link |
01:05:28.400
first way to solve a problem is often with a simple heuristic, right? And you want to say,
link |
01:05:35.280
what are the videos we're going to recommend? Well, how about the most popular ones?
link |
01:05:38.880
Right? And that's where you start. And over time, you collect some data and you refine
link |
01:05:47.120
your approach so that you're relying on fewer heuristics and you're building a system that can
link |
01:05:52.320
actually learn what to do in different situations based on some observations of those situations
link |
01:05:58.240
in the past. And you keep chipping away at these heuristics over time. And so I think that
link |
01:06:04.480
that just like with diversity, you know, I think the first diversity measure we took was, okay,
link |
01:06:11.360
not more than three videos in a row from the same channel, right? It's a pretty simple heuristic
link |
01:06:16.880
to encourage diversity, but it worked, right? Who needs to see four, five, six videos in a row
link |
01:06:23.040
from the same channel? And over time, we try to chip away at that and make it more fine grained
link |
01:06:29.840
and basically have it remove the heuristics in favor of something that can react to individuals
link |
01:06:39.120
and individual situations.
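A minimal sketch of that first diversity heuristic, assuming a toy list of ranked videos with channel labels; this only illustrates the rule "no more than three in a row from the same channel", not YouTube's ranking code.

```python
# A small sketch of the simple diversity heuristic described above: when
# re-ranking recommendations, never allow more than three videos in a row from
# the same channel. The video/channel structures here are illustrative.

def enforce_channel_diversity(ranked_videos, max_in_a_row=3):
    """Reorder a ranked list so no channel appears more than max_in_a_row times consecutively."""
    result, deferred = [], []
    for video in ranked_videos:
        recent = [v["channel"] for v in result[-max_in_a_row:]]
        if len(recent) == max_in_a_row and all(c == video["channel"] for c in recent):
            deferred.append(video)  # push it further down instead of making a 4th in a row
        else:
            result.append(video)
            # See if any deferred video can now be placed without breaking the rule.
            still_deferred = []
            for d in deferred:
                tail = [v["channel"] for v in result[-max_in_a_row:]]
                if len(tail) == max_in_a_row and all(c == d["channel"] for c in tail):
                    still_deferred.append(d)
                else:
                    result.append(d)
            deferred = still_deferred
    return result + deferred  # anything left over goes at the end

ranked = [{"id": i, "channel": "lectures" if i < 5 else "music"} for i in range(8)]
print([v["channel"] for v in enforce_channel_diversity(ranked)])
# ['lectures', 'lectures', 'lectures', 'music', 'lectures', 'lectures', 'music', 'music']
```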
link |
01:06:45.600
So how do you get a sense, through that kind of A/B testing, that this idea was a good
link |
01:06:52.000
one and this one was not so good? How do you measure that? And across which time scale, across
link |
01:06:59.200
how many users, that kind of thing? Well, you mentioned the A/B experiments. And so
link |
01:07:06.000
just about every single change we make to YouTube, we do it only after we've run an A/B experiment.
link |
01:07:13.520
And so in those experiments, which run from one week to months, we measure hundreds, literally
link |
01:07:24.320
hundreds of different variables, and measure changes with confidence intervals in all of them,
link |
01:07:30.720
because we really are trying to get a sense for ultimately, does this improve the experience
link |
01:07:37.520
for viewers? That's the question we're trying to answer. And an experiment is one way because we
link |
01:07:43.200
can see certain things go up and down. So for instance, if we noticed in the experiment, people
link |
01:07:49.200
are dismissing videos less frequently, or they're saying that they're more satisfied,
link |
01:07:56.720
they're giving more videos five stars after they watch them, then those would be indications
link |
01:08:02.640
that the experiment is successful, that it's improving the situation for viewers.
link |
01:08:07.920
But we can also look at other things, like we might do user studies where we invite some
link |
01:08:13.040
people in and ask them, like, what do you think about this? What do you think about that? How do
link |
01:08:17.040
you feel about this? And other various kinds of user research. But ultimately, before we launch
link |
01:08:23.840
something, we're going to want to run an experiment. So we get a sense for what the impact is going to
link |
01:08:28.640
be, not just to the viewers, but also to the different channels and all of that.
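As a simplified illustration of how one of those hundreds of variables might be compared between control and treatment, here is a sketch using a normal-approximation confidence interval for a proportion. The numbers and the choice of metric are hypothetical, not YouTube's pipeline.

```python
# Compare a metric (say, the rate of five-star survey responses) between
# control and treatment with a simple normal-approximation confidence interval.
import math

def proportion_ci(successes, total, z=1.96):
    """Proportion with an approximate 95% confidence interval."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, (p - half_width, p + half_width)

control = proportion_ci(successes=4_100, total=10_000)    # 41.0% five-star
treatment = proportion_ci(successes=4_350, total=10_000)  # 43.5% five-star

print(f"control:   {control[0]:.3f}  CI {control[1]}")
print(f"treatment: {treatment[0]:.3f}  CI {treatment[1]}")
# If the intervals clearly separate (and the same story holds across the many
# other variables being watched), that's evidence the change helps viewers.
```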
link |
01:08:36.400
An absurd question. Nobody knows. Well, actually, it's interesting. Maybe there's an answer, but
link |
01:08:40.720
if I want to make a viral video, how do I do it? I don't know how you make a viral video. I know
link |
01:08:48.480
that we have, in the past, tried to figure out if we could detect when a video was going to go
link |
01:08:56.160
viral. And for those, you take the first and second derivatives of the view count and maybe
link |
01:09:02.800
use that to do some prediction. But I can't say we ever got very good at that. Oftentimes,
link |
01:09:10.960
we look at where the traffic was coming from. If a lot of the viewership is coming from
link |
01:09:17.040
something like Twitter, then maybe it has a higher chance of becoming viral than if it
link |
01:09:23.680
were coming from search or something. But that was just trying to detect a video that might be
link |
01:09:29.600
viral. How to make one? Like, I have no idea. I mean, you get your kids to interrupt you while
link |
01:09:35.040
you're on the news or something.
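For illustration only, here is a toy version of the detector described above: flag a video when the first and second differences of its view counts are both large. The thresholds and view-count series are made up, and this is not YouTube's actual detection logic.

```python
# Look at the first and second differences of a video's hourly view counts and
# flag it when growth is both fast and accelerating.

def looks_viral(view_counts, growth_threshold=10_000, accel_threshold=2_000):
    """view_counts: cumulative views sampled once per hour."""
    first_diff = [b - a for a, b in zip(view_counts, view_counts[1:])]   # views per hour
    second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]    # change in views per hour
    return (bool(first_diff) and first_diff[-1] > growth_threshold
            and bool(second_diff) and second_diff[-1] > accel_threshold)

steady = [0, 5_000, 10_000, 15_000, 20_000]   # fast but not accelerating
spiking = [0, 1_000, 5_000, 20_000, 60_000]   # fast and accelerating

print(looks_viral(steady))    # False
print(looks_viral(spiking))   # True
```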
link |
01:09:42.080
Absolutely. But after the fact, on one individual video, predicting ahead of time is a really hard task. But after the video went viral, in the analysis,
link |
01:09:49.920
can you sometimes understand why it went viral from the perspective of YouTube broadly?
link |
01:09:55.440
Firstly, is it even interesting for YouTube that a particular video is viral? Or does that not
link |
01:10:02.000
matter for the experience of people? Well, I think people expect that if a video is going
link |
01:10:09.120
viral and it's something they would be interested in, then I think they would expect YouTube to
link |
01:10:15.120
recommend it to them. Right. So if something's going viral, it's good to just let people ride the wave
link |
01:10:21.760
of it. It's viral. Well, I mean, we want to meet people's expectations in that way, of course.
link |
01:10:27.520
So like I mentioned, I hung out with Derek Mueller a while ago, a couple months back.
link |
01:10:33.840
He's actually the person who suggested I talk to you on this podcast.
link |
01:10:38.080
All right. Well, thank you, Derek. At that time, he just recently posted
link |
01:10:43.280
an awesome science video titled, Why Are 96 Million Black Balls on This Reservoir?
link |
01:10:49.520
And in a matter of, I don't know how long, but like a few days, it got 38 million views
link |
01:10:54.800
and it's still growing. Is this something you can analyze and understand why it happened with
link |
01:11:02.160
this video, or a particular video like it?
link |
01:11:05.360
I mean, we can surely see where it was recommended, where it was found, who watched it,
link |
01:11:12.160
and those sorts of things. Sorry to interrupt, but that's actually the video which
link |
01:11:17.760
helped me discover who Derek is. I didn't know who he was before. So I remember, you know,
link |
01:11:23.760
usually I just have all of these technical boring MIT Stanford talks in my recommendation
link |
01:11:29.520
because that's how I watch. And then all of a sudden there's this Black Balls in Reservoir
link |
01:11:34.160
video with like an excited nerd and with like just, why is this being recommended to me?
link |
01:11:40.720
So I clicked on it and watched the whole thing. It was awesome. But, and then a lot of people
link |
01:11:44.560
had that experience like, why was I recommended this? But they all, of course, watched it and
link |
01:11:49.200
enjoyed it. So what's your sense of this wave of recommendation that comes
link |
01:11:55.040
with a viral video, that ultimately people enjoy after they click on it?
link |
01:12:00.160
Well, I think it's the system, you know, basically doing what anybody who's recommending
link |
01:12:04.640
something would do, which is you show it to some people and if they like it, you say,
link |
01:12:09.040
okay, well, can I find some more people who are a little bit like them? Okay, I'm gonna
link |
01:12:13.200
try it with them. Oh, they like it too. Let me expand the circle some more, find some more
link |
01:12:17.040
people. Oh, it turns out they like it too. And you just keep going until you get some
link |
01:12:21.280
feedback that says, no, now you've gone too far. These people don't like it anymore.
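Here is a rough sketch of that expanding-circle process, assuming a hypothetical engagement signal and a toy population ordered by similarity to the seed audience; it is only meant to illustrate the widen-until-engagement-drops idea, not the real system.

```python
# Show a video to a small seed audience, then keep widening the audience as
# long as engagement stays good. Similarity ordering, the engagement model,
# and the threshold are all illustrative.
import random

def watches_and_likes(user, video):
    # Hypothetical engagement signal: users closer to the seed audience are
    # modeled as more likely to enjoy the video.
    return random.random() < user["affinity"]

def expand_audience(video, users_by_similarity, batch_size=100, like_threshold=0.6):
    """Show the video to successively less-similar batches until engagement drops."""
    shown = 0
    for start in range(0, len(users_by_similarity), batch_size):
        batch = users_by_similarity[start:start + batch_size]
        liked = sum(watches_and_likes(u, video) for u in batch) / len(batch)
        shown += len(batch)
        if liked < like_threshold:
            break  # "you've gone too far, these people don't like it anymore"
    return shown

# Toy population: affinity falls off as we move away from the seed audience.
users = [{"affinity": max(0.05, 0.95 - 0.001 * i)} for i in range(5_000)]
print(expand_audience({"id": "black_balls"}, users))
```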
link |
01:12:25.680
And so I think that's basically what happened. Now, you asked me about how to make a video go
link |
01:12:32.240
viral or make a viral video. I don't think that if you or I decided to make a video about 96
link |
01:12:40.160
million balls, that it would also go viral. It's possible that Derek made like the canonical video
link |
01:12:47.280
about those black balls in the lake. Exactly. He did actually. Right. And so I don't know whether
link |
01:12:54.000
or not just following along is the secret. Yeah, but it's fascinating. I mean, just like you said,
link |
01:13:01.120
the algorithm sort of expanding that circle and then figuring out that more and more people did
link |
01:13:05.520
enjoy it. And that sort of phase shift of just a huge number of people enjoying it, and the algorithm
link |
01:13:11.840
quickly, automatically, I assume, figuring that out. That's a, I don't know, the dynamics of
link |
01:13:18.000
psychology, that is a beautiful thing. And so what do you think about the idea of clipping,
link |
01:13:24.080
like too many people annoyed me into doing it, meaning they were requesting it. They said it
link |
01:13:30.000
would be very beneficial to add clips of, like, the coolest points and actually have separate
link |
01:13:37.200
videos, like reuploading a short clip, which is what the podcasts are doing.
link |
01:13:44.160
As opposed to, like, I also add timestamps for the topics, you know,
link |
01:13:48.080
if you want the clip. Do you see YouTube somehow helping creators with that process, or helping
link |
01:13:53.920
connect clips to the original videos? Or is that just on a long list of amazing features to work
link |
01:13:59.840
towards? Yeah, I mean, it's not something that I think we've done yet. But I can tell you that
link |
01:14:07.840
I think clipping is great. And I think it's actually great for you as a creator.
link |
01:14:12.400
And here's the reason. Let's say the NBA is uploading
link |
01:14:20.080
videos of its games. Well, people might search for Warriors versus Rockets,
link |
01:14:27.920
or they might search for Steph Curry. And so a highlight from the game in which Steph Curry
link |
01:14:33.600
makes an amazing shot is an opportunity for someone to find a portion of that video. And so I think
link |
01:14:41.120
that you never know how people are going to search for something that you've created. And so you want
link |
01:14:48.400
to, I would say, make clips and add titles and things like that so that they can
link |
01:14:54.480
find it as easily as possible. Do you have a dream of a future, perhaps a distant future,
link |
01:15:00.800
when the YouTube algorithm figures that out, sort of automatically detects the parts of the video
link |
01:15:09.360
that are really interesting, exciting, potentially exciting for people, and sort of clip them out
link |
01:15:14.480
in this incredibly rich space? Because if you take even just this conversation,
link |
01:15:19.520
we probably covered 30, 40 little topics. And there's a huge space of users that would find,
link |
01:15:27.680
you know, 30% of those topics really interesting. And that space is very different. It's something
link |
01:15:33.120
that's beyond my ability to clip out, right? But the algorithm might be able to figure all that
link |
01:15:39.760
out, sort of expand into clips. Do you have a, do you think about this kind of thing? Do you have
link |
01:15:45.680
a hope, a dream that one day the algorithm will be able to do that kind of deep content analysis?
link |
01:15:50.160
Well, we've actually had projects that attempt to achieve this. But it really does depend on
link |
01:15:58.000
understanding the video well. And our understanding of the video right now is quite crude. And so
link |
01:16:04.320
I think it would be especially hard to do it with a conversation like this. One might be able to do
link |
01:16:11.680
it with, let's say, a soccer match more easily, right? You could probably find out where the goals
link |
01:16:19.600
were scored. And then of course, you need to figure out who it was that scored the goal. And
link |
01:16:25.600
that might require a human to do some annotation. But I think that trying to identify coherent
link |
01:16:31.600
topics in a transcript, like the one of our conversation, is not something that we're going
link |
01:16:40.720
to be very good at right away. And I was speaking more to the general problem, actually, of being
link |
01:16:45.760
able to do both a soccer match and our conversation without explicit annotation, almost. My hope was that
link |
01:16:52.320
there exists an algorithm that's able to find exciting things in video. So Google now on Google
link |
01:17:03.440
search will help you find the segment of the video that you're interested in. So if you search for
link |
01:17:08.960
something like how to change the filter in my dishwasher, then if there's a long video about
link |
01:17:14.720
your dishwasher, and this is the part where the person shows you how to change the filter, then
link |
01:17:19.760
it will highlight that area and provide a link directly to it. And from your recollection,
link |
01:17:26.880
do you know if the thumbnail reflects that? What's the difference between showing the full video and
link |
01:17:31.440
the shorter clip? Do you know how it's presented in search results? I don't remember how it's
link |
01:17:35.680
presented. And the other thing I would say is that right now it's based on creator annotations.
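A small sketch of what deep-linking from creator annotations could look like: match the query against chapter titles and build a timestamped watch URL. The chapter data and the word-overlap scoring are hypothetical illustrations, not Google's ranking code.

```python
# Match a query against creator-annotated chapters and build a link with a
# start time. Data and scoring are made up for illustration.

chapters = [
    {"title": "Unboxing the dishwasher", "start_seconds": 0},
    {"title": "How to change the filter", "start_seconds": 312},
    {"title": "Cleaning the spray arms", "start_seconds": 505},
]

def best_chapter(query, chapters):
    """Pick the chapter whose title shares the most words with the query."""
    query_words = set(query.lower().split())
    return max(chapters,
               key=lambda c: len(query_words & set(c["title"].lower().split())))

def deep_link(video_id, query):
    chapter = best_chapter(query, chapters)
    return f"https://www.youtube.com/watch?v={video_id}&t={chapter['start_seconds']}s"

print(deep_link("VIDEO_ID", "how to change the filter in my dishwasher"))
# -> https://www.youtube.com/watch?v=VIDEO_ID&t=312s
```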
link |
01:17:42.000
Ah, got it. So it's not the thing we're talking about. But folks are working on the more
link |
01:17:48.080
automatic version. It's interesting, people might not imagine this, but a lot of our systems start
link |
01:17:55.520
by using almost entirely the audience behavior. And then as they get better,
link |
01:18:01.920
the refinement comes from using the content. And I wish, I know there's privacy concerns, but
link |
01:18:11.200
I wish YouTube explored the space, which is sort of putting a camera on the users if they allowed
link |
01:18:18.400
it, right? To study them. Like, I did a lot of emotion recognition work and so on, to study
link |
01:18:24.720
actual, sort of, richer signals. One of the cool things is when you upload 360, like VR, video to YouTube,
link |
01:18:31.600
and I've done this a few times. So I've uploaded myself. It's a horrible idea. Some people enjoyed
link |
01:18:38.080
it, but whatever. The video of me giving a lecture in 360, 360 camera, and it's cool because YouTube
link |
01:18:44.160
allows you to then see where people looked. There's a heat map of, you know,
link |
01:18:50.160
where the center of the VR experience was. And it's interesting because that reveals to you,
link |
01:18:55.360
like, what people looked at. And it's not always what you were expecting.
link |
01:19:00.480
In the case of the lecture it's pretty boring, it is what we were expecting, but we did a few funny
link |
01:19:05.600
videos where there's a bunch of people doing things, and everybody tracks those people.
link |
01:19:10.000
You know, in the beginning, they all look at the main person and they start spreading around and
link |
01:19:14.000
looking at the other people. It's fascinating. So that's a really strong signal
link |
01:19:18.640
of what people found exciting in the video.
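As an editorial aside, here is a toy sketch of how such a gaze heat map could be built from 360-video look directions, assuming hypothetical (time, yaw, pitch) samples; it only illustrates the binning idea, not YouTube's actual analytics.

```python
# Bin viewers' look directions (yaw, pitch) per moment in a 360 video and see
# where attention concentrates. The sample data is made up for illustration.
import numpy as np

# Each sample: (second_into_video, yaw_degrees in [-180, 180), pitch_degrees in [-90, 90])
gaze_samples = np.array([
    (10, 2.0, 5.0), (10, 3.0, 4.0), (10, 1.0, 6.0),     # most viewers on the speaker
    (10, 90.0, 0.0),                                     # one viewer looking elsewhere
    (25, 85.0, -3.0), (25, 92.0, 1.0), (25, 88.0, 0.0),  # attention shifts to the right
])

def heatmap_at(second, samples, yaw_bins=36, pitch_bins=18):
    """2D histogram of look directions for one moment in the video."""
    at_t = samples[samples[:, 0] == second]
    hist, _, _ = np.histogram2d(
        at_t[:, 1], at_t[:, 2],
        bins=[yaw_bins, pitch_bins],
        range=[[-180, 180], [-90, 90]],
    )
    return hist

print(heatmap_at(10, gaze_samples).max())  # 3.0: three viewers clustered in one bin at t=10
```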
link |
01:19:23.760
I don't know how you get that from people just watching, except that they tuned out at some point. Like, it's hard to measure that this moment was super
link |
01:19:31.440
exciting for people. I don't know how you get that signal. Maybe comments? Is there a way to get
link |
01:19:35.520
that signal where this was like, this is when their eyes opened up and they're like, like,
link |
01:19:40.720
for me with the Ray Dalio video, right? Like, at first I was like, oh, okay, this is another one
link |
01:19:44.560
of these like dumb it down for you videos. And then you like start watching, it's like, okay,
link |
01:19:50.000
there's a really crisp, clean, deep explanation of how the economy works. That's where I, like, sat up
link |
01:19:55.120
and started watching, right at that moment. Is there a way to detect that moment? The only way I can
link |
01:20:00.320
think of is by asking people to label it. You mentioned that we're quite far away in terms
link |
01:20:07.360
of doing video analysis, deep video analysis. Of course, Google, YouTube, you know, we're
link |
01:20:15.920
quite far away from solving the autonomous driving problem too. I don't know. I think we're closer
link |
01:20:21.040
to that. You never know. And the Wright brothers thought they were not going to fly
link |
01:20:28.160
for 50 years, three years before they flew. So what are the biggest challenges, would you say?
link |
01:20:34.720
Is it the broad challenge of understanding video, understanding natural language, the
link |
01:20:40.960
challenge before the entire machine learning community of just being able to understand
link |
01:20:45.040
it? Or is there something specific to video that's even more challenging than
link |
01:20:50.960
natural language understanding? What's your sense of what the biggest challenge is?
link |
01:20:54.160
I mean, video is just so much information. And so precision becomes a real problem. It's like
link |
01:21:02.720
you're trying to classify something and you've got a million classes. And the distinctions
link |
01:21:11.200
among them, at least from a machine learning perspective, are often pretty small. You need
link |
01:21:23.200
to see this person's number in order to know which player it is. And there's a lot of players.
link |
01:21:30.800
Or you need to see the logo on their chest in order to know which team they play for.
link |
01:21:38.320
And so, and that's just figuring out who's who, right? And then you go further and saying, okay,
link |
01:21:43.280
well, you know, was that a goal? Was it not a goal? Like, is that an interesting moment,
link |
01:21:48.080
as you said? Or is that not an interesting moment? These things can be pretty hard.
link |
01:21:52.800
So, okay, so Yann LeCun, I'm not sure if you're familiar with his current thinking
link |
01:21:59.040
and work. He believes that what he refers to as self supervised learning
link |
01:22:04.800
will be the solution sort of to achieving this kind of greater level of intelligence. In fact,
link |
01:22:10.160
the thing he's focusing on is watching video and predicting the next frame. So predicting
link |
01:22:15.360
the future of video, right? So for now, we're very far from that. But his thought is because
link |
01:22:22.000
it's unsupervised, or as he refers to it, self supervised. You know, if you watch enough video,
link |
01:22:28.640
essentially, if you watch YouTube, you'll be able to learn about the nature of reality,
link |
01:22:34.000
the physics, the common sense reasoning required by just teaching a system to predict
link |
01:22:39.120
the next frame. So he's confident this is the way to go. So for you, from the perspective of just
link |
01:22:45.360
working with this video, how do you think an algorithm that just watches all of YouTube
link |
01:22:52.960
stays up all day and night watching YouTube would be able to understand enough of the
link |
01:22:59.280
physics of the world about the way this world works, be able to do common sense reasoning and
link |
01:23:03.680
so on? Well, I mean, we have systems that already watch all the videos on YouTube, right?
link |
01:23:10.640
But they're just looking for very specific things, right? They're supervised learning systems that
link |
01:23:15.920
are trying to identify something or classify something. And I don't know if predicting the
link |
01:23:23.920
next frame is really going to get there because I'm not an expert on compression algorithms,
link |
01:23:30.880
but I understand that that's kind of what compression, video compression algorithms do,
link |
01:23:34.640
is they basically try to predict the next frame and then fix up the places where they got it
link |
01:23:40.560
wrong. And that leads to higher compression than if you actually put all the bits for the next frame
link |
01:23:46.400
there. So I don't know if I believe that just being able to predict the next frame is going to be
link |
01:23:53.280
enough because there's so many frames and even a tiny bit of error on a per frame basis can lead
link |
01:24:00.480
to wildly different videos.
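To make the codec analogy concrete, here is a toy sketch of predict-the-next-frame-and-store-the-residual coding on tiny synthetic "frames". It is purely illustrative and much simpler than a real video codec.

```python
# Predict the next frame from the previous one and store only the residual
# (the places where the prediction was wrong). Frames here are small numpy
# arrays, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(4, 4))]
for _ in range(30):
    # Each real frame differs only slightly from the previous one.
    frames.append(np.clip(frames[-1] + rng.integers(-2, 3, size=(4, 4)), 0, 255))

# Encoder: predict "next frame = previous frame" and keep the residuals.
residuals = [frames[i + 1] - frames[i] for i in range(len(frames) - 1)]

# Decoder: rebuild the video from the first frame plus the residuals.
rebuilt = [frames[0]]
for r in residuals:
    rebuilt.append(rebuilt[-1] + r)

print(np.array_equal(rebuilt[-1], frames[-1]))       # True: lossless with exact residuals
print(int(np.abs(np.concatenate(residuals)).max()))  # residual values are tiny vs. raw pixels
```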
link |
01:24:07.760
So one way to do compression is to describe through text what's contained in the video. That's the ultimate high level of
link |
01:24:11.520
compression. Traditionally, when you think of video or image compression, you're trying
link |
01:24:16.880
to maintain the same visual quality while reducing the size. But if you think of deep learning from
link |
01:24:24.400
a bigger perspective of what compression is, is you're trying to summarize the video. And the idea
link |
01:24:30.000
there is if you have a big enough neural network by watching the next, by trying to predict the
link |
01:24:36.080
next frame, you'll be able to form a compression of actually understanding what's going on in the
link |
01:24:41.600
scene. If there's two people talking, you can just reduce that entire video into the fact that two
link |
01:24:47.600
people are talking and maybe the content of what they're saying and so on. That's kind of the open
link |
01:24:53.840
ended dream. So I just wanted to sort of express that because it's an interesting compelling notion, but
link |
01:25:00.800
it is nevertheless true that video, our world, is a lot more complicated than we give it credit for.
link |
01:25:07.760
I mean, in terms of search and discovery, we have been working on trying to summarize videos
link |
01:25:13.280
in text or with some kind of labels for eight years, at least. And we're kind of so-so at it.
link |
01:25:24.080
So say 100% is the problem fully solved, and eight years ago it was 0% solved.
link |
01:25:34.640
Where are we on that timeline, would you say?
link |
01:25:36.960
Yeah, to summarize a video well, maybe less than a quarter of the way.
link |
01:25:44.240
So on that topic, what does YouTube look like 10, 20, 30 years from now?
link |
01:25:51.040
I mean, I think that YouTube is evolving to take the place of TV. I grew up as a kid in the 70s
link |
01:25:59.840
and I watched a tremendous amount of television. And I feel sorry for my poor mom because
link |
01:26:07.600
people told her at the time that it was going to rot my brain and that she should kill her
link |
01:26:11.840
television. But anyway, I mean, I think that YouTube is, at least for my family,
link |
01:26:19.680
a better version of television, right? It's one that is on demand. It's more tailored to the things
link |
01:26:26.240
that my kids want to watch. And also, they can find things that they would never have found on
link |
01:26:33.120
television. And so I think that at least from just observing my own family, that's where we're
link |
01:26:39.840
headed is that people watch YouTube kind of in the same way that I watched television when I was
link |
01:26:45.200
younger. So from a search and discovery perspective, what are you excited about in the 5, 10, 20,
link |
01:26:53.200
30 years? It's already really good. I think it's achieved a lot. Of course, we don't know what's
link |
01:27:00.640
possible. So there's the task of search, of typing in text, or discovering new videos by the next
link |
01:27:08.080
recommendation. I personally am really happy with the experience. Continuously, I rarely watch a video
link |
01:27:14.160
that's not awesome from my own perspective. But what else is possible? What are you excited about?
link |
01:27:21.200
Well, I think introducing people to more of what's available on YouTube is not only very important
link |
01:27:29.200
to YouTube and to creators, but I think it will help enrich people's lives. Because there's a lot
link |
01:27:35.600
that I'm still finding out is available on YouTube that I didn't even know. I've been working
link |
01:27:40.960
at YouTube for eight years, and it wasn't until last year that I learned that I could watch
link |
01:27:48.400
USC football games from the 1970s. Like, I didn't even know that was possible until last year. And
link |
01:27:54.000
I've been working here quite some time. So, you know, what was broken there, that it
link |
01:27:58.480
took me seven years to learn that this stuff was already on YouTube, even when I got here. So I
link |
01:28:04.080
think there's a big opportunity there. And then, as I said before, you know, we want to make sure
link |
01:28:10.480
that YouTube finds a way to ensure that it's acting responsibly with respect to society and
link |
01:28:21.520
enriching people's lives. So we want to take all of the great things that it does and make sure
link |
01:28:26.480
that we are eliminating the negative consequences that might happen. And then lastly, if we could
link |
01:28:33.920
get to a point where all the videos people watch are the best ones they've ever watched, that would
link |
01:28:38.960
be outstanding too. Do you see it, in many senses, becoming a window into the world for people?
link |
01:28:45.360
It's, especially with live video, you get to watch events. I mean, it's really, it's the way you
link |
01:28:51.680
experience a lot of the world that's out there, and it's better than TV in many, many ways. So do you see it
link |
01:28:57.840
becoming more than just video? Do you see creators creating visual experiences and virtual worlds?
link |
01:29:05.360
Maybe I'm talking crazy now, but sort of virtual reality, and entering that space,
link |
01:29:09.520
or is that, at least for now, totally outside what YouTube is thinking about?
link |
01:29:13.920
I mean, I think Google is thinking about virtual reality. I don't think about virtual reality too
link |
01:29:20.000
much. I know that we would want to make sure that YouTube is there when virtual reality becomes
link |
01:29:28.880
something or if virtual reality becomes something that a lot of people are interested in. But I
link |
01:29:34.720
haven't seen it really take off yet. Take off. Well, the future is wide open. Christos, I've been
link |
01:29:42.320
really looking forward to this conversation. It's been a huge honor. Thank you for answering some
link |
01:29:46.000
of the more difficult questions I've asked. I'm really excited about what YouTube has in store
link |
01:29:51.680
for us. It's one of the greatest products I've ever used, and it continues to be. So thank you so much for
link |
01:29:55.840
talking about it. It's my pleasure. Thanks for asking me. Thanks for listening to this conversation.
link |
01:30:01.200
And thank you to our presenting sponsor, Cash App. Download it, use code LEX Podcast.
link |
01:30:07.040
You'll get $10 and $10 will go to FIRST, a STEM education nonprofit that inspires hundreds of
link |
01:30:12.880
thousands of young minds to become future leaders and innovators. If you enjoy this podcast,
link |
01:30:18.640
subscribe on YouTube, give it five stars on Apple Podcasts, follow on Spotify,
link |
01:30:23.360
support on Patreon, or simply connect with me on Twitter.
link |
01:30:26.640
And now, let me leave you with some words of wisdom from Marcel Proust. The real
link |
01:30:32.800
voyage of discovery consists not in seeking new landscapes, but in having new eyes.
link |
01:30:38.560
Thank you for listening, and I hope to see you next time.