Cristos Goodrow: YouTube Algorithm | Lex Fridman Podcast #68
link |
The following is a conversation with Cristos Goodrow,
link |
vice president of engineering at Google and head of search and discovery at YouTube,
link |
also known as the YouTube algorithm. YouTube has approximately 1.9 billion users,
link |
and every day people watch over 1 billion hours of YouTube video. It is the second most popular
link |
search engine behind Google itself. For many people, it is not only a source of entertainment,
link |
but also how we learn new ideas from math and physics videos to podcasts to debates,
link |
opinions, ideas from out of the box thinkers and activists on some of the most tense,
link |
challenging and impactful topics in the world today. YouTube and other content platforms
link |
receive criticism from both viewers and creators, as they should. Because the engineering task
link |
before them is hard, and they don't always succeed, and the impact of their work is truly
link |
world changing. To me, YouTube has been an incredible wellspring of knowledge. I've watched
link |
hundreds if not thousands of lectures that changed the way I see many fundamental ideas in math,
link |
science, engineering, and philosophy. But it does put a mirror to ourselves, and keeps the
link |
responsibility of the steps we take in each of our online educational journeys into the hands of
link |
each of us. The YouTube algorithm has an important role in that journey of helping us find new exciting
link |
ideas to learn about. That's a difficult and an exciting problem for an artificial intelligence
link |
system. As I've said in lectures and other forums, recommendation systems will be one of the most
link |
impactful areas of AI in the 21st century, and YouTube is one of the biggest recommendation
link |
systems in the world. This is the Artificial Intelligence Podcast. If you enjoy it, subscribe
link |
on YouTube, give it five stars on Apple Podcasts, follow on Spotify, support it on Patreon, or
link |
simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N. I recently started doing
link |
ads at the end of the introduction. I'll do one or two minutes after introducing the episode and
link |
never any ads in the middle that can break the flow of the conversation. I hope that works for you
link |
and doesn't hurt the listening experience. This show is presented by Cash App, the number one
link |
finance app in the App Store. I personally use Cash App to send money to friends, but you can
link |
also use it to buy, sell, and deposit Bitcoin in just seconds. Cash App also has a new investing
link |
feature. You can buy fractions of a stock, say $1 worth, no matter what the stock price is.
link |
Broker services are provided by Cash App Investing, a subsidiary of Square and member SIPC.
link |
I'm excited to be working with Cash App to support one of my favorite organizations called FIRST,
link |
best known for their FIRST Robotics and LEGO competitions. They educate and inspire hundreds
link |
of thousands of students in over 110 countries and have a perfect rating on Charity Navigator,
link |
which means that donated money is used to maximum effectiveness. When you get Cash App from the
link |
App Store or Google Play and use code LexPodcast, you'll get $10 and Cash App will also donate
$10 to FIRST, which again is an organization that I've personally seen inspire girls and boys
link |
to dream of engineering a better world. And now here's my conversation with Cristos Goodrow.
link |
YouTube is the world's second most popular search engine behind Google, of course.
link |
We watch more than 1 billion hours of YouTube videos a day, more than Netflix and Facebook
link |
video combined. YouTube creators upload over 500,000 hours of video every day. Average lifespan
link |
of a human being just for comparison is about 700,000 hours. So what's uploaded every single day
link |
is just enough for a human to watch in a lifetime. So let me ask an absurd philosophical question.
link |
If from birth, when I was born, and there's many people born today with the internet,
link |
I watched YouTube videos nonstop. Do you think there are trajectories through
link |
YouTube video space that can maximize my average happiness or maybe education or
link |
my growth as a human being? I think there are some great trajectories through YouTube videos,
link |
but I wouldn't recommend that anyone spend all of their waking hours or all of their hours
link |
watching YouTube. I mean, I think about the fact that YouTube has been really great for my kids,
link |
for instance. My oldest daughter, you know, she's been watching YouTube for several years,
link |
she watches Tyler Oakley and the vlogbrothers. And I know that it's had a very profound and
link |
positive impact on her character. And my younger daughter, she's a ballerina and her teachers tell
link |
her that YouTube is a huge advantage for her because she can practice a routine and watch
link |
like professional dancers do that same routine and stop it and back it up and rewind and all
link |
that stuff, right? So it's been really good for them. And then even my son is a sophomore in
link |
college. He got through his linear algebra class because of a channel called 3Blue1Brown,
link |
which, you know, helps you understand linear algebra, but in a way that would be very hard
link |
for anyone to do on a whiteboard or a chalkboard. And so I think that those experiences, from my
link |
point of view, were very good. And so I can imagine really good trajectories through YouTube. Yes.
link |
Have you looked at, do you think broadly about that trajectory over a period? Because
link |
YouTube has grown up now. So over a period of years, you just kind of gave a few anecdotal
link |
examples. But, you know, I used to watch certain shows on YouTube. I don't anymore. I've moved on
link |
to other shows. And ultimately, you want people to, from YouTube's perspective, to stay on YouTube,
link |
to grow as human beings on YouTube. So you have to think not just what makes them engage
link |
today or this month, but also over a period of years. Absolutely. That's right. I mean,
link |
if YouTube is going to continue to enrich people's lives, then, you know, then it has to grow with
link |
them. And people's interests change over time. And so I think we've been working on this problem.
link |
And I'll just say it broadly as like, how to introduce diversity and introduce people who
link |
are watching one thing to something else they might like. We've been working on that problem
link |
all the eight years I've been at YouTube. It's a hard problem because, I mean, of course,
link |
it's trivial to introduce diversity that doesn't help. Yeah, just add a random video.
link |
I could just randomly select a video from the billions that we have. It's likely not to even
link |
be in your language. So the likelihood that you would watch it and develop a new interest is
link |
very, very low. And so what you want to do when you're trying to increase diversity is find something
link |
that is not too similar to the things that you've watched, but also something that you might be
link |
likely to watch. And that balance, finding that spot between those two things is quite challenging.
link |
So the diversity of content, diversity of ideas, it's a really difficult,
link |
it's the thing like that's almost impossible to define, right? Like what's different?
link |
So how do you think about that? So two examples: I'm a huge fan of 3Blue1Brown, say,
link |
and then one diversity, I wasn't even aware of a channel called Veritasium,
link |
which is a great science, physics, whatever channel. So one version of diversity is showing me
link |
Derek's Veritasium channel, which I was really excited to discover actually and now watch a lot
link |
of his videos. Okay, so you're a person who's watching some math channels and you might be
link |
interested in some other science or math channels. So like you mentioned, the first kind of diversity
link |
is just show you some, some things from other channels that are related, but not just, you know,
link |
not all the 3Blue1Brown channel, throw in a couple others. So that's the, maybe the first
link |
kind of diversity that we started with many, many years ago. Taking a bigger leap is about,
link |
I mean, the mechanisms we do, we use for that is, is we basically cluster videos and channels
link |
together, mostly videos, we do every, almost everything at the video level. And so we'll,
link |
we'll make some kind of a cluster via some embedding process. And then, and then measure,
link |
you know, what is the likelihood that a, that users who watch one cluster might also watch
link |
another cluster that's very distinct. So we may come to find that, that people who watch science
link |
videos also like jazz. This is possible, right? And so, and so because of that relationship that
link |
we've identified through the measure, through the embeddings and then the measurement of the
link |
people who watch both, we might recommend a jazz video once in a while. So there's this
link |
cluster in the embedding space of jazz videos and science videos. And so you kind of try to look at
link |
aggregate statistics where if a lot of people that jump from science cluster to the jazz
link |
cluster tend to remain as engaged or become more engaged, then that's, that means those two
link |
are, they should hop back and forth and they'll be, they'll be happy.
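The cluster measurement described above can be sketched roughly as follows. This is a minimal illustration, not YouTube's implementation: the watch histories, the cluster names, and the function `cross_cluster_likelihood` are all hypothetical, and the clusters are assumed to have already been produced by some embedding process.

```python
# Hypothetical watch histories: user -> set of cluster labels they have watched.
# Cluster names and data are invented for illustration only.
watch_history = {
    "u1": {"science", "jazz"},
    "u2": {"science", "jazz"},
    "u3": {"science", "backyard_railroads"},
    "u4": {"science"},
    "u5": {"jazz"},
}


def cross_cluster_likelihood(history, source, target):
    """Estimate P(user watches `target` | user watches `source`)."""
    source_watchers = [u for u, clusters in history.items() if source in clusters]
    if not source_watchers:
        return 0.0
    both = sum(1 for u in source_watchers if target in history[u])
    return both / len(source_watchers)


jazz_given_science = cross_cluster_likelihood(watch_history, "science", "jazz")
rail_given_science = cross_cluster_likelihood(
    watch_history, "science", "backyard_railroads"
)
print(jazz_given_science, rail_given_science)
```

In this toy data, science watchers are more likely to also watch jazz than backyard railroads, so jazz would be the better occasional recommendation for them.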
link |
Right. There's a higher likelihood that a person from who's watching science would like jazz than
link |
the person watching science would like, I don't know, backyard railroads or, or something else,
link |
right? And so we can try to measure these likelihoods and use that to make the best recommendation we
link |
can. So, okay. So we'll talk about the machine learning of that, but I have to linger on things
link |
that neither you or anyone have an answer to. There's gray areas of truth, which is, for example,
link |
now I can't believe I'm going there, but politics, it, it, it happens so that certain people believe
link |
certain things and they're very certain about them. Let's move outside the red versus blue
link |
politics of today's world. But there's different ideologies. For example, in college, I read quite
link |
a lot of Ayn Rand. I studied it, and that's a particular philosophical ideology I found interesting
link |
to explore. Okay. So that was that kind of space. I've kind of moved on from that cluster,
link |
intellectually, but it nevertheless is an interesting cluster. There's, I was born in
link |
the Soviet Union, socialism, communism is a certain kind of political ideology that's
link |
really interesting to explore. Again, objectively, just there's a set of beliefs about how the economy
link |
should work and so on. And so it's hard to know what's true or not in terms of people within
link |
those communities are often advocating that this is how we achieve utopia in this world.
link |
And they're pretty certain about it. So how do you try to manage politics in this chaotic,
link |
divisive world, not positive or any kind of ideas in terms of filtering what people should
link |
watch next and in terms of also not letting certain things be on YouTube? This is exceptionally
link |
difficult responsibility. Right. Well, the responsibility to get this right is our top
link |
priority. And the first comes down to making sure that we have good, clear rules of the road.
link |
Just because we have freedom of speech doesn't mean that you can literally say anything. Like
link |
we as a society have accepted certain restrictions on our freedom of speech. There are things like
link |
libel laws and things like that. And so where we can draw a clear line, we do and we continue to
link |
evolve that line over time. However, as you pointed out, wherever you draw the line, there's going to
link |
be a borderline. And in that borderline area, we are going to maybe not remove videos, but we will
link |
try to reduce the recommendations of them or the proliferation of them by demoting them. And then
link |
alternatively, in those situations, try to raise what we would call authoritative or credible
link |
sources of information. You mentioned Ayn Rand and communism. Those are two valid points of view
link |
that people are going to debate and discuss. And of course, people who believe in one or the other
link |
of those things are going to try to persuade other people to their point of view. And so
link |
we're not trying to settle that or choose a side or anything like that. What we're trying
link |
to do is make sure that the people who are expressing those points of view and offering
link |
those positions are authoritative and credible. So let me ask a question about people I don't like
link |
personally. You heard me. I don't care if you leave comments on this. But sometimes they're
link |
brilliantly funny, which is trolls. So people who kind of mock, I mean, the internet is full,
link |
the Reddit of mock style comedy, where people just kind of make fun of, point out that the emperor
link |
has no clothes. And there's brilliant comedy in that. But sometimes you can get cruel and mean.
link |
So on that, on the mean point, and sorry to linger on these things that have no good answers,
link |
but actually, I totally hear you that this is really important that you're trying to solve it.
link |
But how do you reduce the meanness of people on YouTube?
link |
I understand that anyone who uploads YouTube videos has to become resilient to a certain
link |
amount of meanness. I've heard that from many creators. And we are trying in various ways,
link |
comment ranking, allowing certain features to block people to reduce or make that meanness or
link |
that trolling behavior less effective on YouTube. And so, I mean, it's very important. But it's
link |
something that we're going to keep having to work on. And as we improve it, maybe we'll get
link |
to a point where people don't have to suffer this sort of meanness when they upload YouTube videos.
link |
I hope we do. But it just does seem to be something that you have to be able to deal with
link |
as a YouTube creator nowadays. Do you have a hope that, so you mentioned two things that
link |
kind of agree with this. So there's like a machine learning approach of ranking
link |
comments based on whatever, based on how much they contribute to the healthy conversation.
link |
Let's put it that way. And the other is almost an interface question of how do you,
link |
how does the creator filter, so block or, how do humans themselves, the users of
link |
YouTube manage their own conversation? Do you have hope that these two tools will
link |
create a better society without limiting freedom of speech too much? Without sort of
link |
attacking, even like saying that, people like, what do you mean limiting sort of curating speech?
link |
I mean, I think that that overall is our whole project here at YouTube.
link |
Right. Like, we fundamentally believe and I personally believe very much that YouTube can
link |
be great. It's been great for my kids. I think it can be great for society. But it's absolutely
link |
critical that we get this responsibility part right. And that's why it's our top priority.
link |
Susan Wojcicki, who's the CEO of YouTube, she says something that I personally find very inspiring,
link |
which is that we want to do our jobs today in a manner so that people 20 and 30 years from now
link |
will look back and say, you know, YouTube, they really figured this out. They really found a way
link |
to strike the right balance between the openness and the value that the openness has, and also
link |
making sure that we are meeting our responsibility to users and society.
link |
So the burden on YouTube actually is quite incredible. And the one thing that people don't
link |
give enough credit to the seriousness and the magnitude of the problem, I think.
link |
So I personally hope that you do solve it because a lot is in your hands. A lot is riding on your
link |
success or failure. So it's besides, of course, running a successful company, you're also curating
link |
the content of the internet and the conversation on the internet. That's a powerful thing.
link |
So one thing that people wonder about is how much of it can be solved with pure machine learning?
link |
So looking at the data, studying the data and creating algorithms that curate the comments,
link |
curate the content, and how much of it needs human intervention, meaning people here at
link |
YouTube in a room sitting and thinking about what is the nature of truth? What is, what are the
link |
ideals that we should be promoting, that kind of thing? So algorithm versus human
input. What's your sense? I mean, my own experience has demonstrated that you need both of those
link |
things. Algorithms, I mean, you're familiar with machine learning algorithms. And the thing they
link |
need most is data. And the data is generated by humans. And so for instance, when we're building
link |
a system to try to figure out which are the videos that are misinformation or borderline
link |
policy violations, well, the first thing we need to do is get human beings to make decisions about
link |
which of those videos are in which category. And then we use that data and basically take
link |
that information that's determined and governed by humans and extrapolate it or apply it to the
link |
entire set of billions of YouTube videos. And we couldn't get to all the videos on YouTube well
link |
without the humans. And we couldn't use the humans to get to all the videos of YouTube.
link |
So there's no world in which you have only one or the other of these things. And just as you said,
link |
a lot of it comes down to people at YouTube spending a lot of time trying to figure out what
link |
are the right policies? What are the outcomes based on those policies? Are they the kinds of
link |
things we want to see? And then once we kind of get an agreement or build some consensus around
link |
what the policies are, well, then we've got to find a way to implement those policies across all
link |
of YouTube. And that's where both the human beings, we call them evaluators or reviewers,
link |
come into play to help us with that. And then once we get a lot of training data from them,
link |
then we apply the machine learning techniques to take it even further.
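The loop described above, where humans label a small set and a model extends those decisions to everything else, can be sketched as below. This is only a toy under stated assumptions: the two numeric features, the labels "borderline" and "ok", and the nearest-centroid classifier are all stand-ins chosen for brevity, not anything the conversation attributes to YouTube.

```python
# Hypothetical human-reviewed examples: (feature vector, label).
# The two features are invented stand-ins, not real YouTube signals.
labeled = [
    ((0.9, 0.8), "borderline"),
    ((0.8, 0.9), "borderline"),
    ((0.1, 0.2), "ok"),
    ((0.2, 0.1), "ok"),
]


def centroids(examples):
    """Average the feature vectors of each human-assigned label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()
    }


def predict(model, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""

    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))

    return min(model, key=lambda label: dist(model[label]))


model = centroids(labeled)
# Videos no human has reviewed yet (hypothetical IDs and features).
unreviewed = {"video_a": (0.85, 0.7), "video_b": (0.15, 0.3)}
predictions = {vid: predict(model, feats) for vid, feats in unreviewed.items()}
print(predictions)
```

The point of the sketch is the division of labor: the policy judgment lives entirely in the human labels, and the model only carries it to videos the reviewers never saw.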
link |
Do you have a sense that these human beings have a bias in some kind of direction?
link |
I mean, that's an interesting question. We do sort of in autonomous vehicles and computer vision
link |
in general, a lot of annotation. And we rarely ask what bias do the annotators have? Even in the
link |
sense that they're better at annotating certain things than others. For example, people are much
link |
better at annotating segmentation, at segmenting cars in a scene versus segmenting bushes or trees.
link |
You know, there's specific mechanical reasons for that, but also because it's a semantic gray area.
link |
And just for a lot of reasons, people are just terrible at annotating trees.
link |
Okay. So in the same kind of sense, do you think of in terms of people reviewing videos or annotating
link |
the content of videos, is there some kind of bias that you're aware of or seek out in that human input?
link |
Well, we take steps to try to overcome these kinds of biases or biases that we think would be
link |
problematic. So for instance, we ask people to have a bias towards scientific consensus. That's
link |
something that we instruct them to do. We ask them to have a bias towards demonstration of
link |
expertise or credibility or authoritativeness. But there are other biases that we want to
link |
make sure to try to remove. And there's many techniques for doing this. One of them is you
link |
send the same thing to be reviewed to many people. And so that's one technique. Another is that you
link |
make sure that the people that are doing these sorts of tasks are from different backgrounds and
link |
different areas of the United States or of the world. But then even with all of that, it's possible
link |
for certain kinds of what we would call unfair biases to creep into machine learning systems,
link |
primarily as you said, because maybe the training data itself comes in in a biased way. And so
link |
we also have worked very hard on improving the machine learning systems to remove and reduce
link |
unfair biases when it goes against or involves some protected class, for instance.
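One of the techniques mentioned above, sending the same item to many reviewers, might be combined with a simple agreement rule like the one below. This is a hedged sketch: the function name `aggregate_reviews`, the label strings, and the 0.6 agreement threshold are assumptions made for illustration.

```python
from collections import Counter


def aggregate_reviews(labels, min_agreement=0.6):
    """Return the majority label, or None if reviewers do not agree enough."""
    if not labels:
        return None
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes / len(labels) >= min_agreement else None


# Three hypothetical reviewers, two of whom agree.
agreed = aggregate_reviews(["authoritative", "authoritative", "not_authoritative"])
# Two hypothetical reviewers who split evenly: no label is accepted.
split = aggregate_reviews(["authoritative", "not_authoritative"])
print(agreed, split)
```

Items that come back as `None` could be sent to more reviewers instead of being used as training data, which keeps a single person's bias from deciding the label.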
link |
Thank you for exploring with me some of the more challenging things. I'm sure there's a few more
link |
that we'll jump back to. But let me jump into the fun part, which is maybe the basics
link |
of the quote unquote YouTube algorithm. What does the YouTube algorithm look at to make
link |
recommendations for what to watch next from a machine learning perspective? Or when you
link |
search for a particular term, how does it know what to show you next? Because it seems to, at
link |
least for me, do an incredible job of both. Well, that's kind of you to say. It didn't
link |
used to do a very good job. But it's gotten better over the years. Even I observe that it's
link |
improved quite a bit. Those are two different situations. Like when you search for something,
link |
YouTube uses the best technology we can get from Google to make sure that the YouTube search system
link |
finds what someone's looking for. And of course, the very first things that one thinks about is,
link |
okay, well, does the word occur in the title? For instance, but there are much more sophisticated
link |
things where we're mostly trying to do some syntactic match or maybe a semantic match based on
link |
words that we can add to the document itself. For instance, maybe is this video
link |
watched a lot after this query? That's something that we can observe. And then as a result,
link |
make sure that that document would be retrieved for that query. Now, when you talk about what kind
link |
of videos would be recommended to watch next, that's something, again, we've been working on for
link |
many years. And probably the first real attempt to do that well was to use collaborative filtering.
link |
So you can describe what collaborative filtering is?
link |
Sure. It's just basically what we do is we observe which videos get watched close together
link |
by the same person. And if you observe that, and if you can imagine creating a graph where the videos
link |
that get watched close together by the most people are sort of very close to one another in this
link |
graph and videos that don't frequently get watched close together by the same person or
link |
the same people are far apart, then you end up with this graph that we call the related
link |
graph that basically represents videos that are very similar or related in some way. And
link |
what's amazing about that is that it puts all the videos that are in the same language together,
link |
for instance. And we didn't even have to think about language. It just does it, right? And it
link |
puts all the videos that are about sports together, and it puts most of the music videos together,
link |
and it puts all of these sorts of videos together just because that's sort of the way the people
link |
using YouTube behave. So that already cleans up a lot of the problem. It takes care of the
link |
lowest hanging fruit, which happens to be a huge one of just managing these millions of videos.
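The related graph described above can be approximated with simple co-watch counts. The sketch below is a minimal version under assumptions: the sessions, the video IDs, and the helper names `build_related_graph` and `related` are hypothetical, and a real system would weight and normalize edges rather than use raw counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: videos watched close together by the same person.
sessions = [
    ["baklava_tr", "borek_tr", "kofte_tr"],
    ["baklava_tr", "borek_tr"],
    ["linear_algebra_en", "calculus_en"],
    ["linear_algebra_en", "calculus_en", "physics_en"],
    ["baklava_tr", "kofte_tr"],
]


def build_related_graph(sessions):
    """Count, for each pair of videos, how many sessions contain both."""
    edges = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            edges[(a, b)] += 1
    return edges


def related(edges, video, top_n=3):
    """Return the videos most often co-watched with `video`."""
    neighbors = Counter()
    for (a, b), weight in edges.items():
        if a == video:
            neighbors[b] += weight
        elif b == video:
            neighbors[a] += weight
    return [v for v, _ in neighbors.most_common(top_n)]


graph = build_related_graph(sessions)
print(related(graph, "baklava_tr"))
print(related(graph, "linear_algebra_en"))
```

Nothing in the code knows about language or topic, yet the Turkish cooking videos and the English math videos end up in separate neighborhoods purely because of who watched what together.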
link |
That's right. I remember a few years ago, I was talking to someone who was
link |
trying to propose that we do a research project concerning people who are bilingual. And this
link |
person was making this proposal based on the idea that YouTube could not possibly be good
link |
at recommending videos well to people who are bilingual. And so she was telling me
link |
about this, and I said, well, can you give me an example of what problem do you think we have on
link |
YouTube with the recommendations? And so she said, well, I'm a researcher in the U.S., and when I'm
link |
looking for academic topics, I want to see them in English. And so she searched for one,
link |
found a video, and then looked at the watch next suggestions, and they were all in English.
link |
And so she said, oh, I see. YouTube must think that I speak only English. And so she said,
link |
now I'm actually originally from Turkey, and sometimes when I'm cooking, let's say I want to
link |
make some baklava, I really like to watch videos that are in Turkish. And so she searched for a
link |
video about making the baklava, and then selected it, and it was in Turkish. And the watch next
link |
recommendations were in Turkish. And she just couldn't believe how this was possible. And how
link |
is it that you know that I speak both these two languages and put all the videos together? And
link |
it's just sort of an outcome of this related graph that's created through collaborative filtering.
link |
So for me, one of my huge interests is just human psychology, right? And that's such a powerful
link |
platform on which to utilize human psychology to discover what individual people want to watch
link |
next. But it's also just fascinating to me. You know, Google Search has the ability to look
at your own history. And I've done that before. Just what I've searched for, for
many, many years. And it's a fascinating picture of who I am, actually. And I don't think anyone's
link |
ever summarized. I personally would love that a summary of who I am as a person on the internet
link |
to me, because I think it reveals, I think it puts a mirror to me or to others, you know,
link |
that's actually quite revealing and interesting. You know, just maybe the number of it's a joke,
link |
but not really, is the number of cat videos I've watched or videos of people falling,
link |
you know, stuff that's absurd, that kind of stuff. It's really interesting. And of course,
link |
it's really good for the machine learning aspect to, to show, to figure out what to show next.
link |
But it's interesting. Have you just as a tangent played around with the idea of giving a map to
link |
people sort of as opposed to just using this information to show us next, showing them here
link |
are the clusters you've loved over the years kind of thing. Well, we do provide the history of all
link |
the videos that you've watched. Yes. So you can definitely search through that and look through
link |
it and search through it to see what it is that you've been watching on YouTube. We have actually,
link |
in various times, experimented with this sort of cluster idea, finding ways to demonstrate or show
link |
people what topics they've been interested in or what clusters they've watched from.
link |
It's interesting that you bring this up because in some sense, the way the recommendation system
link |
of YouTube sees a user is exactly as the history of all the videos they've watched on YouTube.
link |
And so you can think of yourself or any user on YouTube as kind of like a DNA strand of all
link |
your videos, right? That sort of represents you. You can also think of it as maybe a vector in
link |
the space of all the videos on YouTube. And so now, once you think of it as a vector in the
link |
space of all the videos on YouTube, then you can start to say, okay, well, which other vectors
link |
are close to me and to my vector? And that's one of the ways that we generate some diverse
link |
recommendations is because you're like, okay, well, these people seem to be close with respect
link |
to the videos they've watched on YouTube. But here's a topic or a video that one of them has
link |
watched and enjoyed, but the other one hasn't. That could be an opportunity to make a good
link |
recommendation. I gotta tell you, I'm gonna ask for things that are impossible, but I would love
link |
to cluster the human beings. I would love to know who has similar trajectories as me,
link |
because you probably would want to hang out. There's a social aspect there.
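The idea of a user as a vector in the space of all videos, and of finding nearby users, can be sketched like this. It is an illustration under assumptions only: the user names, video IDs, the use of cosine similarity on binary watch vectors, and the `recommend` helper are all hypothetical choices, not a description of the production system.

```python
import math

# Hypothetical users, each represented by the set of video IDs they watched.
# A set is a compact way to store a binary vector over all videos.
histories = {
    "alice": {"v1", "v2", "v3"},
    "bob": {"v1", "v2", "v3", "v4"},
    "carol": {"v7", "v8"},
}


def cosine(a, b):
    """Cosine similarity between two binary watch vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(histories, user):
    """Suggest videos watched by the most similar other user but not by `user`."""
    mine = histories[user]
    others = [u for u in histories if u != user]
    nearest = max(others, key=lambda u: cosine(mine, histories[u]))
    return nearest, sorted(histories[nearest] - mine)


nearest_user, suggestions = recommend(histories, "alice")
print(nearest_user, suggestions)
```

Here the nearest user has watched everything the first user has plus one more video, and that extra video is exactly the kind of opportunity for a recommendation described above.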
link |
Like actually finding some of the most fascinating people I find on YouTube have like no followers,
link |
and I start following them and they create incredible content. And you know, and on that topic,
link |
I just love to ask, there's some videos that just blow my mind in terms of quality and depth.
link |
And just in every regard are amazing videos and they have like 57 views. Okay. How do you get
link |
videos of quality to be seen by many eyes? So the measure of quality, is it just something?
link |
Yeah. How do you know that something is good? Well, I mean, I think it depends initially on
link |
what sort of video we're talking about. So in the realm of let's say, you mentioned politics and news,
link |
in that realm, quality news or quality journalism relies on having a journalism
link |
department, right? Like you have to have actual journalists and fact checkers and people like
link |
that. And so in that situation, and in others, maybe science or in medicine, quality has a lot
link |
to do with the authoritativeness and the credibility and the expertise of the people who make the video.
link |
Now, if you think about the other end of the spectrum,
link |
you know, what is the highest quality prank video? Or what is the highest quality
link |
Minecraft video, right? That might be the one that people enjoy watching the most and watch to the
link |
end. Or it might be the one that when we ask people the next day after they watched it,
link |
were they satisfied with it? And so we, especially in the realm of entertainment,
link |
have been trying to get at better and better measures of quality or satisfaction or enrichment
link |
since I came to YouTube. And we started with, well, you know, the first approximation is the one that
link |
gets more views. But, you know, we both know that things can get a lot of views and not really be
link |
that high quality, especially if people are clicking on something and then immediately
link |
realizing that it's not that great and abandoning it. And that's why we moved from views to
thinking about the amount of time people spend watching it with the premise that like,
link |
you know, in some sense, the time that someone spends watching a video is related to the value
link |
that they get from that video. It may not be perfectly related, but it has something to say
link |
about how much value they get. But even that's not good enough, right? Because I myself have spent
link |
time clicking through channels on television late at night and ended up watching Under Siege 2,
link |
for some reason, I don't know. And if you were to ask me the next day, are you glad that you
link |
watched that show on TV last night? I'd say, yeah, I wish I would have gone to bed or read a book or
link |
almost anything else really. And so that's why some people got the idea a few years ago to try
link |
to survey users afterwards. And so we get feedback data from those surveys and then use that in the
link |
machine learning system to try to not just predict what you're going to click on right now,
link |
what you might watch for a while, but what when we ask you tomorrow, you'll give four or five stars
link |
to. So just to summarize, what are the signals from the machine learning perspective that the
link |
user can provide? So you mentioned just clicking on the video views, the time watched, maybe the
link |
relative time watched, the clicking like and dislike on the video, maybe commenting on the video,
link |
all those things, all those things. And then the one I wasn't actually quite aware of,
link |
even though I might have engaged in it is a survey afterwards, which is a brilliant idea.
link |
Are there other signals? I mean, that's already a really rich space of signals to learn from.
link |
Is there something else? Well, you mentioned commenting, also sharing the video. If you
link |
think it's worthy to be shared with someone else, you know, within YouTube or outside of YouTube
link |
as well. Either. Let's see, you mentioned like, dislike. Like and dislike. How important is that?
link |
It's very important, right? We want, it's predictive of satisfaction. But it's not, it's
link |
not perfectly predictive. Subscribe. If you subscribe to the channel of the person who
link |
made the video, then that also is a piece of information that signals satisfaction. Although
link |
over the years, we've learned that people have a wide range of attitudes about what it means to
link |
subscribe. We would ask some users who didn't subscribe very much, but they watched a lot from
link |
a few channels. We'd say, well, why didn't you subscribe? And they would say, well, I can't
link |
afford to pay for anything. And, you know, we tried to let them understand like, actually,
link |
it doesn't cost anything. It's free. It just helps us know that you are very interested in this
link |
creator. But then we've asked other people who subscribed to many things and don't really watch
link |
any of the videos from those channels. And we say, well, why did you subscribe to this if you
link |
weren't really interested in any more videos from that channel? And they might tell us,
link |
well, I just, you know, I thought the person did a great job and I just want to kind of
link |
give them a high five. Yeah. And so, yeah, that's where I said, I actually subscribed to channels
link |
where I just, this person is amazing. I like this person. But then I like this person,
link |
I really want to support them. That's how I click subscribe, even though I mean, never actually want
link |
to click on their videos when they're releasing it. I just love what they're doing. And it's maybe
link |
outside of my interest area and so on, which is probably the wrong way to use the subscribe button.
link |
Well, I just want to say congrats. This is a great work. Well, so you have to deal with all the
link |
space of people that see the subscribe button as totally different. That's right. And so, you know,
link |
we, we can't just close our eyes and say, sorry, you're using it wrong. You know, we're not going
link |
to pay attention to what you've done. We need to embrace all the ways in which all the different
link |
people in the world use the subscribe button or the like and the dislike button. So in terms of
link |
signals of machine learning, using for the search and for the recommendation, you've mentioned title,
link |
so like metadata, like text data that people provide description and title and maybe keywords.
link |
So maybe you can speak to the value of those things in search and also this incredible,
link |
fascinating area of the content itself. So the video content itself, trying to understand what's
link |
happening in the video. So YouTube released a data set that, you know, in the machine learning
link |
computer vision world, this is just an exciting space. How much is that currently? How much are
link |
you playing with that currently? How much is your hope for the future of being able to analyze the
link |
content of the video itself? Well, we have been working on that also since I came to YouTube.
link |
Analyzing the content. Analyzing the content of the video, right. And what I can tell you is that
link |
our ability to do it well is still somewhat crude. We can, we can tell if it's a music video. We can
link |
tell if it's a sports video. We can probably tell you that people are playing soccer. We probably
link |
can't tell whether it's Manchester United or my daughter's soccer team. So these things are kind
link |
of difficult and using them, we can use them in some ways. So for instance, we use that kind of
link |
information to understand and inform these clusters that I talked about. And also maybe to add some
link |
words like soccer, for instance, to the video if it doesn't occur in the title or the description,
link |
which is remarkable that often it doesn't. One of the things that I ask creators to do is please
link |
help us out with the title and the description. For instance, we were a few years ago having a
link |
live stream of some competition for World of Warcraft on YouTube. And it was a very important
link |
competition. But if you typed World of Warcraft in search, you wouldn't find it. World of Warcraft
link |
wasn't in the title? World of Warcraft wasn't in the title. It was match four, seven, eight, you know,
link |
A team versus B team. And World of Warcraft wasn't in the title. Just like, come on, give me.
link |
Being literal on the internet is actually very uncool, which is the problem.
link |
Oh, is that right?
link |
Well, I mean, in some sense, some of the greatest videos, I mean, there's a humor to just being
link |
indirect, being witty and so on. And actually, machine learning algorithms want you to be
link |
literal. You just want to say what's in the thing, be very, very simple. And in some sense,
link |
that gets away from wit and humor. So you have to play with both. But you're saying that for
link |
now, the content of the title, the content of the description, the actual text is one of the
link |
best ways for the algorithm to find your video and put them in the right cluster.
link |
That's right. And I would go further and say that if you want people, human beings, to select
link |
your video in search, then it helps to have, let's say, World of Warcraft in the title. Because
link |
why would a person's, you know, if they're looking at a bunch, they type World of Warcraft,
link |
and they have a bunch of videos, all of whom say World of Warcraft, except the one that you uploaded,
link |
well, even the person is going to think, well, maybe this isn't somehow search made a mistake.
link |
This isn't really about World of Warcraft. So it's important not just for the machine
link |
learning systems, but also for the people who might be looking for this sort of thing. They get a
link |
clue that it's what they're looking for by seeing that same thing prominently in the title of the
link |
video. Okay, let me push back on that. So I think from the algorithm perspective, yes, but if they
link |
typed in World of Warcraft and saw a video with the title simply Winning, and the thumbnail
link |
has like a sad orc or something, I don't know, right? Like, I think that's much, it gets your
link |
curiosity up. And then if they could trust that the algorithm was smart enough to figure out somehow
link |
that this is indeed a World of Warcraft video, that would have created the most beautiful
link |
experience. I think in terms of just the wit and the humor and the curiosity that we human beings
link |
naturally have, but you're saying, I mean, realistically speaking, it's really hard for
link |
the algorithm to figure out that the content of that video will be a World of Warcraft video.
link |
And you have to accept that some people are going to skip it. Yeah, right. I mean, and so you're
link |
right. The people who don't skip it and select it are going to be delighted. Yeah. But other people
link |
might say, yeah, this is not what I was looking for. And making stuff discoverable, I think,
link |
is what you're really working on and hoping. So yeah, so from your perspective, put stuff in
link |
the title of the description. And remember, the collaborative filtering part of the system
link |
starts by the same user watching videos together, right? So the way that they're probably going
link |
to do that is by searching for them. That's a fascinating aspect of it. It's like ant colonies.
link |
That's how they find stuff. So I mean, to what degree for collaborative filtering in general
link |
is one curious ant, one curious user essential? So just a person who is more willing to click on
link |
random videos and sort of explore these cluster spaces. In your sense, how many people are just
link |
like watching the same thing over and over and over and over? And how many are just like the
link |
explorers that just kind of like click on stuff and then help the other ants in the ant colony
link |
discover the cool stuff. Do you have a sense of that at all? I really don't think I have a sense
link |
for the relative sizes of those groups. But I would say that people come to YouTube with
link |
some certain amount of intent. And as long as they, to the extent to which they try to satisfy
link |
that intent, that certainly helps our systems, right? Because our systems rely on kind of a
link |
faithful amount of behavior, right? And there are people who try to trick us, right? There are people
link |
and machines that try to associate videos together that really don't belong together,
link |
but they're trying to get that association made because it's profitable for them. And so we have
link |
to always be resilient to that sort of attempt at gaming the systems. So speaking to that,
link |
there's a lot of people that in a positive way, perhaps, I don't know, I don't like it, but
link |
like to game, want to try to game the system to get more attention. Everybody, creators,
link |
in a positive sense, want to get attention, right? So how do you, how do you work in this space when
link |
people create more and more sort of click baity titles and thumbnails? Sort of
link |
Veritasium, Derek has made a video where he basically describes that it seems what works is to create
link |
a high quality video, really good video where people would want to watch and wants to click on it,
link |
but have click baity titles and thumbnails to get them to click on it in the first place.
link |
And he's saying, I'm embracing this fact, I'm just going to keep doing it. And I hope
link |
you forgive me for doing it. And you will enjoy my videos once you click on them.
link |
So in what sense do you see this kind of click bait style attempt to manipulate to get people in
link |
the door to manipulate the algorithm or play with the algorithm or game the algorithm?
link |
I think that you can look at it as an attempt to game the algorithm. But
link |
even if you were to take the algorithm out of it and just say, okay, well, all these videos
link |
happen to be lined up, which the algorithm didn't make any decision about which one to
link |
put at the top or the bottom, but they're all lined up there, which one are the people going to
link |
choose? And I'll tell you the same thing that I told Derek is, you know, I have a bookshelf
link |
and it has two kinds of books on it, science books. I have my math books from when I was a
link |
student and they all look identical except for the titles on the covers. They're all yellow,
link |
they're all from Springer, and every single one of them, the cover is totally the same.
link |
Right? On the other hand, I have other more pop science type books, and they all have very
link |
interesting covers, right? And they have provocative titles and things like that. I mean, I wouldn't
link |
say that they're click baity because they are indeed good books. And I don't think that they
link |
cross any line, but that's just a decision you have to make, right? Like the people who write
link |
Classical Recursion Theory by Piergiorgio Odifreddi, he was fine with the yellow title and nothing
link |
more. Whereas I think other people who wrote a more popular type book understand that they need
link |
to have a compelling cover and a compelling title. And, you know, I don't think there's anything
link |
really wrong with that. We do take steps to make sure that there is a line that you don't cross.
link |
And if you go too far, maybe your thumbnail is especially racy or, you know, it's all caps with
link |
too many exclamation points. We observe that users are kind of, you know, sometimes offended
link |
by that. And so for the users who are offended by that, we will then depress or suppress those
link |
videos. And which reminds me, there's also another signal where users can say, I don't know if it was
link |
recently added, but I really enjoy it. I'm just saying I didn't, something like I don't want to
link |
see this video anymore or something like, like this is a, like there's certain videos that just
link |
cut me the wrong way. Like just jump out at me. It's like, I don't want this. And it feels really
link |
good to clean that up. To be like, I don't, that's not, that's not for me. I don't know. I think
link |
that might have been recently added, but that's also a really strong signal. Yes, absolutely.
link |
Right. We don't want to make a recommendation that people are unhappy with. And that makes me,
link |
that particular one makes me feel good as a user in general, and as a machine learning person,
link |
because I feel like I'm helping the algorithm. My interactions on YouTube don't always feel like
link |
I'm helping the algorithm. Like I'm not reminded of that fact. Like for example, Tesla and Autopilot
link |
and, you know, Musk create a feeling for their customers, for people that own Teslas, that
link |
they're helping the algorithm of Tesla. Like they're all, like, really proud, they're helping
link |
the fleet learn. I think YouTube doesn't always remind people that you're helping the algorithm
link |
get smarter. And for me, I love that idea. Like we're all collaboratively, like Wikipedia gives
link |
that sense that we're all together creating a beautiful thing. YouTube doesn't always remind
link |
me of that. This conversation is reminding me of that, but... Well, that's a good tip. We should
link |
keep that fact in mind when we design these features. I'm not sure I really thought about it
link |
that way, but that's a very interesting perspective. It's an interesting question of personalization
link |
that I feel like when I click like on a video, I'm just improving my experience.
link |
It would be great. It would make me personally, people are different, but make me feel great
link |
if I was helping also the YouTube algorithm broadly say something. You know what I'm saying?
link |
Like I don't know if that's human nature, but the products you love, and I certainly love
link |
YouTube, you want to help it get smarter and smarter and smarter because there's some kind
link |
of coupling between our lives together being better. If YouTube was better, then my life
link |
will be better. And that's that kind of reasoning. I'm not sure what that is. And I'm not sure how
link |
many people share that feeling. That could be just a machine learning feeling. But on that point,
link |
how much personalization is there in terms of next video recommendations? So is it kind of
link |
all really boiling down to clustering? Like finding the nearest clusters to me and so on and
link |
that kind of thing, or how much is personalized to me, the individual completely?
link |
It's very, very personalized. So your experience will be quite a bit different from anybody else's
link |
who's watching that same video, at least when they're logged in. And the reason is that we found
link |
that users often want two different kinds of things when they're watching a video. Sometimes
link |
they want to keep watching more on that topic or more in that genre. And other times they just
link |
are done and they're ready to move on to something else. And so the question is,
link |
well, what is the something else? And one of the first things one can imagine is, well,
link |
maybe something else is the latest video from some channel to which you've subscribed. And
link |
that's going to be very different for you than it is for me, right? And even if it's not something
link |
that you subscribe to, it's something that you watch a lot. And again, that'll be very different
link |
on a person by person basis. And so even the watch next, as well as the homepage, of course,
link |
is quite personalized. So we mentioned some of the signals, but what does success look like?
link |
What does success look like in terms of the algorithm creating a great long term experience
link |
for a user? Or put another way, if you look at the videos I've watched this month,
link |
how do you know the algorithm succeeded for me? I think, first of all, if you come back and watch
link |
more YouTube, then that's one indication that you found some value from it. So just the number of
link |
hours is a powerful indicator? Well, I mean, not the hours themselves, but the fact that you return
link |
on another day. So that's probably the most simple indicator. People don't come back to things that
link |
they don't find value in, right? There's a lot of other things that they could do. But like I said,
link |
I mean, ideally, we would like everybody to feel that YouTube enriches their lives and that every
link |
video they watched is the best one they've ever watched since they've started watching YouTube.
link |
And so that's why we survey them and ask them, like, is this one to five stars? And so our version
link |
of success is every time someone takes that survey, they say it's five stars. And if we ask them,
link |
is this the best video you've ever seen on YouTube? They say yes, every single time. So it's hard to
link |
imagine that we would actually achieve that. Maybe asymptotically, we would get there. But
link |
that would be what we think success is. It's funny. I've recently said somewhere, I don't know,
link |
maybe tweeted, but that Ray Dalio has this video on the economic machine. I forget what it's called,
link |
but it's a 30 minute video. And I said, it's the greatest video I've ever watched on YouTube.
link |
It's like, I watched the whole thing and my mind was blown. It's a very crisp, clean description of
link |
how at least the American economic system works. It's a beautiful video. And I was just, I wanted
link |
to click on something to say, this is the best thing ever. Please let me, I can't believe I
link |
discovered it. I mean, the views and the likes reflect its quality. But I was almost upset that
link |
I haven't found it earlier and wanted to find other things like it. I don't think I've ever felt
link |
that this is the best video I've ever watched. And that was that. And to me, the ultimate Utopia,
link |
the best experience, is where I don't see any of the videos I regret and
link |
every single video I watched is one that actually helps me grow, helps me enjoy life, be happy,
link |
and so on. Well, so that's, that's, that's a heck of a, that's a, that's one of the most beautiful
link |
and ambitious, I think, machine learning tasks. So when you look at a society as opposed to an
link |
individual user, do you think of how YouTube is changing society when you have these millions
link |
of people watching videos, growing, learning, changing, having debates? Do you have a sense
link |
of, yeah, what the big impact on society is? Because I think it's huge, but do you have a
link |
sense of what direction we're taking this world? Well, I mean, I think, you know, openness has had
link |
an impact on society already. There's a lot of... What do you mean by openness? Well, the fact that
link |
unlike other mediums, there's not someone sitting at YouTube who decides before you can upload your
link |
video, whether it's worth having you upload it, or, or worth anybody seeing it really, right? And so,
link |
you know, there are some creators who say, like, I wouldn't have this opportunity to, to reach an
link |
audience. Tyler Oakley often said that, you know, he wouldn't have had this opportunity to reach this
link |
audience if it weren't for YouTube. And, and so I think that's one way in which YouTube has changed
link |
society. I know that there are people that I work with from outside the United States, especially
link |
from places where literacy is low. And they think that YouTube can help in those places because
link |
you don't need to be able to read and write in order to learn something important for your life,
link |
maybe, you know, how to do some job or how to fix something. And so that's another way in which I
link |
think YouTube is possibly changing society. So I've, I've worked at YouTube for eight, almost nine
link |
years now. And it's fun because I meet people and, you know, you tell them where they, where you work,
link |
you say you work on YouTube, and they immediately say, I love YouTube. Yeah. Right. Which is great,
link |
makes me feel great. But then, of course, when I ask them, well, what is it that you love about
link |
YouTube? Not one time ever has anybody said that the search works outstanding or that the recommendations
link |
are great. What they always say when I ask them, what do you love about YouTube is they immediately
link |
start talking about some channel or some creator or some topic or some community that they found on
link |
YouTube and that they just love. Yeah. And so that has made me realize that YouTube is really about
link |
the video and connecting the people with the videos and then everything else kind of gets out of the
link |
way. So beyond the video, it's an interesting, because you kind of mentioned creator. What about
link |
the connection with just the individual creators as opposed to just individual video? So like,
link |
I gave the example of the Ray Dalio video, that the video itself is incredible. But there's some people
link |
who are just creators that I love. One of the cool things about people who call
link |
themselves YouTubers or whatever is they have a journey. Almost all of them
link |
suck horribly in the beginning and then they kind of grow and then there's that
link |
genuineness in their growth. So YouTube clearly wants to help creators connect with their audience
link |
in this kind of way. So how do you think about that process of helping creators grow,
link |
helping them connect with their audience, develop not just individual videos, but the
link |
entirety of a creator's life on YouTube? Well, I mean, we're trying to help creators find the
link |
biggest audience that they can find. And, you brought up creator versus
link |
video, the reason why creator channel is so important is because if we have a hope of people
link |
coming back to YouTube, well, they have to have in their minds some sense of what they're going to
link |
find when they come back to YouTube. If YouTube were just the next viral video, and I have no
link |
concept of what the next viral video could be one time it's a cat playing a piano and the next day
link |
it's some children interrupting a reporter and the next day it's some other thing happening,
link |
then it's hard for me to when I'm not watching YouTube say, gosh, I really would like to see
link |
something from someone or about something. And so that's why I think this connection between
link |
fans and creators is so important for both because it's a way of fostering a relationship
link |
that can play out into the future. Let me talk about kind of a dark and interesting question
link |
in general. And again, a topic that you or nobody has an answer to, but social media has a sense of
link |
you know, it gives us highs and it gives us lows in the sense that sort of creators often speak
link |
about having sort of burnout and having psychological ups and downs and challenges
link |
mentally in terms of continuing the creation process. There's a momentum. There's a huge
link |
excited audience that makes everybody feel that makes creators feel great. And I think it's more
link |
than just financial. I think it's literally just they love that sense of community. It's part of
link |
the reason I upload to YouTube. I don't care about money. Never will. What I care about is the
link |
community, but some people feel like this momentum and even when there's times in their life when
link |
they don't feel, you know, for some reason don't feel like creating. So how do you think about
link |
burnout, this mental exhaustion that some YouTube creators go through? Is that something we have
link |
an answer for? Is it something? How do we even think about that? Well, the first thing is we
link |
want to make sure that the YouTube systems are not contributing to this sense, right? And so
link |
we've done a fair amount of research to demonstrate that you can absolutely take a break. If you are
link |
a creator and you've been uploading a lot, we have just as many examples of people who
link |
took a break and came back more popular than they were before as we have examples of going the other
link |
way. Yeah, can we pause on that for a second? So the feeling that people have, I think, is if I take
link |
a break, everybody, the party will leave, right? So if you could just linger on that. So in your
link |
sense that taking a break is okay. Yes, taking a break is absolutely okay. And the reason I say
link |
that is because we can observe many examples of creators coming back very
link |
strong and even stronger after they have taken some sort of break. And so I just want to dispel the
link |
myth that this somehow necessarily means that your channel is going to go down or lose views.
link |
That is not the case. We know for sure that this is not a necessary outcome. And so we want to
link |
encourage people to make sure that they take care of themselves. That is job one, right? You have
link |
to look after yourself and your mental health. And you know, I think that it probably in some of
link |
these cases contributes to better videos once they come back, right? Because a lot of people,
link |
I know myself, if I burn out on something, then I'm probably not doing my best work,
link |
even though I can keep working until I pass out. And so I think that the taking a break
link |
may even improve the creative ideas that someone has.
link |
Okay, I think it's a really important thing to dispel. I think it applies to all of social media.
link |
Like literally, I've taken a break for a day every once in a while. Sorry, sorry if that sounds
link |
like a short time. But even like email, just taking a break from email or only checking email once a
link |
day, especially when you're going through something psychologically in your personal life or so on,
link |
or really not sleeping much because of work deadlines, it can refresh you in a way that's
link |
profound. And so the same applies. It was there when you came back, right? It's there. And it looks
link |
different actually when you come back. You're sort of brighter eyed with some coffee, everything,
link |
the world looks better. So it's important to take a break when you need it.
link |
So you've mentioned kind of the YouTube algorithm isn't, you know, E equals mc squared.
link |
It's not the single equation. It's potentially sort of more than a million lines of code.
link |
Sort of is it more akin to what successful autonomous vehicles today are,
link |
which is they're just basically patches on top of patches of heuristics and human experts really
link |
tuning the algorithm and have some machine learning modules? Or is it becoming more and more a giant
link |
machine learning system with humans just doing a little bit of tweaking here and there? What's
link |
your sense? First off, do you even have a sense of what is the YouTube algorithm at this point?
link |
And whichever, however much you do have a sense, what does it look like?
link |
Well, we don't usually think about it as the algorithm because it's a bunch of systems that
link |
work on different services. The other thing that I think people don't understand is that
link |
what you might refer to as the YouTube algorithm from outside of YouTube is actually
link |
a bunch of code and machine learning systems and heuristics, but that's married with the behavior
link |
of all the people who come to YouTube every day. So the people part of the code essentially.
link |
Exactly, right? Like if there were no people who came to YouTube tomorrow, then the algorithm
link |
wouldn't work anymore, right? So that's a critical part of the algorithm. And so when people talk
link |
about, well, the algorithm does this, the algorithm does that, it's sometimes hard to understand.
link |
Well, you know, it could be the viewers are doing that and the algorithm is mostly just keeping track
link |
of what the viewers do and then reacting to those things in sort of more fine grained situations.
link |
And I think that this is the way that the recommendation system and the search system and
link |
probably many machine learning systems evolve is you start trying to solve a problem and the
link |
first way to solve a problem is often with a simple heuristic, right? And you want to say,
link |
what are the videos we're going to recommend? Well, how about the most popular ones?
link |
Right? And that's where you start. And over time, you collect some data and you refine
link |
your situation so that you're making fewer heuristics and you're building a system that can
link |
actually learn what to do in different situations based on some observations of those situations
link |
in the past. And you keep chipping away at these heuristics over time. And so I think that
link |
just like with diversity, you know, I think the first diversity measure we took was, okay,
link |
not more than three videos in a row from the same channel, right? It's a pretty simple heuristic
link |
to encourage diversity, but it worked, right? Who needs to see four, five, six videos in a row
link |
from the same channel? And over time, we try to chip away at that and make it more fine grained
link |
and basically have it remove the heuristics in favor of something that can react to individuals
link |
and individual situations. So how do you, you mentioned, you know, we know that something
link |
worked. How do you get a sense when decisions are the kind of A/B testing that this idea was a good
link |
one, this was not so good? What's, how do you measure that? And across which time scale, across
link |
how many users that kind of, that kind of thing? Well, you mentioned the A/B experiments. And so
link |
just about every single change we make to YouTube, we do it only after we've run an A/B experiment.
link |
And so in those experiments, which run from one week to months, we measure hundreds, literally
link |
hundreds of different variables and, and measure changes with confidence intervals in all of them,
link |
because we really are trying to get a sense for ultimately, does this improve the experience
link |
for viewers? That's the question we're trying to answer. And an experiment is one way because we
link |
can see certain things go up and down. So for instance, if we noticed in the experiment, people
link |
are dismissing videos less frequently, or they're saying that they're more satisfied,
link |
they're giving more videos five stars after they watch them, then those would be indications
link |
that the experiment is successful, that it's improving the situation for viewers.
link |
But we can also look at other things, like we might do user studies where we invite some
link |
people in and ask them, like, what do you think about this? What do you think about that? How do
link |
you feel about this? And other various kinds of user research. But ultimately, before we launch
link |
something, we're going to want to run an experiment. So we get a sense for what the impact is going to
link |
be, not just to the viewers, but also to the different channels and all of that.
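As a rough illustration of what "measure changes with confidence intervals" can mean for a single metric, here is a minimal sketch that compares average survey star ratings between a control group and a treatment group. The ratings, the function name `diff_in_means_ci`, and the normal-approximation 95% interval are all assumptions made for this example; nothing here is described in the conversation as YouTube's actual method.

```python
import math
from statistics import mean, variance


def diff_in_means_ci(control, treatment, z=1.96):
    """Return (difference, lower, upper) for mean(treatment) - mean(control).

    Uses a normal approximation with unpooled sample variances;
    z=1.96 corresponds to roughly a 95% confidence interval.
    """
    diff = mean(treatment) - mean(control)
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    return diff, diff - z * standard_error, diff + z * standard_error


# Hypothetical 1-to-5 star survey answers, invented for illustration.
control = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
treatment = [4, 5, 3, 5, 4, 4, 5, 3, 4, 4]

diff, low, high = diff_in_means_ci(control, treatment)
print(f"change in mean rating: {diff:.2f} (95% CI {low:.2f} to {high:.2f})")
```

If the whole interval sits above zero, the change in that one metric is unlikely to be noise. With hundreds of metrics measured per experiment, as described above, some intervals will exclude zero by chance alone, so a single interval like this is only one input to a launch decision.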
link |
An absurd question. Nobody knows. Well, actually, it's interesting. Maybe there's an answer, but
link |
if I want to make a viral video, how do I do it? I don't know how you make a viral video. I know
link |
that we have, in the past, tried to figure out if we could detect when a video was going to go
link |
viral. And those were, you take the first and second derivatives of the view count and maybe
link |
use that to do some prediction. But I can't say we ever got very good at that. Oftentimes,
link |
we look at where the traffic was coming from. If a lot of the viewership is coming from
link |
something like Twitter, then maybe it has a higher chance of becoming viral than if it
link |
were coming from search or something. But that was just trying to detect a video that might be
link |
viral. How to make one? Like, I have no idea. I mean, you get your kids to interrupt you while
link |
you're on the news or something. Absolutely. But after the fact, on one individual video,
link |
sort of ahead of time predicting is a really hard task. But after the video went viral in analysis,
link |
can you sometimes understand why it went viral from the perspective of YouTube broadly?
link |
Firstly, is it even interesting for YouTube that a particular video is viral? Or does that not
link |
matter for the experience of people? Well, I think people expect that if a video is going
link |
viral and it's something they would be interested in, then I think they would expect YouTube to
link |
recommend it to them. Right. So if someone's going viral, it's good to just let people ride the wave
link |
of it. It's viral. Well, I mean, we want to meet people's expectations in that way, of course.
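The view-count idea mentioned a few exchanges earlier, taking the first and second derivatives of the view count, can be sketched with finite differences. This is only an illustration: the function names, the `min_velocity` threshold, and the sample counts are hypothetical, and the conversation itself notes that this kind of prediction never worked very well.

```python
def finite_differences(values):
    """Differences between consecutive samples (a discrete derivative)."""
    return [b - a for a, b in zip(values, values[1:])]


def looks_like_takeoff(cumulative_views, min_velocity=1000):
    """Flag a video whose views are arriving fast and still speeding up."""
    velocity = finite_differences(cumulative_views)  # first derivative: views per interval
    acceleration = finite_differences(velocity)      # second derivative: change in that rate
    if not acceleration:
        return False  # fewer than three samples, nothing to say
    return velocity[-1] >= min_velocity and acceleration[-1] > 0


# Hypothetical cumulative view counts sampled once per hour.
accelerating = [100, 250, 600, 1500, 3900]
slowing = [100, 1200, 2300, 3300, 4200]

print(looks_like_takeoff(accelerating))  # True
print(looks_like_takeoff(slowing))       # False
```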
link |
So like I mentioned, I hung out with Derek Muller a while ago, a couple months back.
link |
He's actually the person who suggested I talk to you on this podcast.
link |
All right. Well, thank you, Derek. At that time, he just recently posted
link |
an awesome science video titled, Why Are 96 Million Black Balls on This Reservoir?
link |
And in a matter of, I don't know how long, but like a few days, it got 38 million views
link |
and it's still growing. Is this something you can analyze to understand why it happened
link |
with this video, or one particular video like it?
link |
I mean, we can surely see where it was recommended, where it was found, who watched it,
link |
and those sorts of things. So it's actually, sorry to interrupt, it is the video which
link |
helped me discover who Derek is. I didn't know who he is before. So I remember, you know,
link |
usually I just have all of these technical boring MIT Stanford talks in my recommendation
link |
because that's how I watch. And then all of a sudden there's this Black Balls in Reservoir
link |
video with like an excited nerd and with like just, why is this being recommended to me?
link |
So I clicked on it and watched the whole thing. It was awesome. But, and then a lot of people
link |
had that experience like, why was I recommended this? But they all, of course, watched it and
link |
enjoyed it, which is, what's your sense of this just wave of recommendation that comes
link |
with this viral video that ultimately people enjoy after they click on it?
link |
Well, I think it's the system, you know, basically doing what anybody who's recommending
link |
something would do, which is you show it to some people and if they like it, you say,
link |
okay, well, can I find some more people who are a little bit like them? Okay, I'm gonna
link |
try it with them. Oh, they like it too. Let me expand the circle some more, find some more
link |
people. Oh, it turns out they like it too. And you just keep going until you get some
link |
feedback that says, no, now you've gone too far. These people don't like it anymore.
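The expanding-circle process described here can be sketched as a simple loop: show the video to the closest audience first, widen to the next group while enough people like it, and stop at the first group that does not. The ring structure, the `liked` feedback function, and the `min_like_rate` threshold are assumptions made for illustration, not a description of how YouTube's recommender is actually built.

```python
def expand_audience(rings, liked, min_like_rate=0.5):
    """Return the users worth recommending to, widening ring by ring.

    rings: list of lists of user ids, most similar audience first.
    liked: function mapping a user id to True/False (hypothetical feedback).
    """
    reached = []
    for ring in rings:
        if not ring:
            continue
        like_rate = sum(1 for user in ring if liked(user)) / len(ring)
        if like_rate < min_like_rate:
            break  # "now you've gone too far"
        reached.extend(ring)
    return reached


# Hypothetical audience rings and the set of users who enjoy the video.
rings = [["a", "b"], ["c", "d", "e"], ["f", "g"], ["h"]]
fans = {"a", "b", "c", "d"}

print(expand_audience(rings, lambda user: user in fans))
```

In this toy run the first two rings clear the threshold and the third does not, so expansion stops there and the fourth ring is never tried.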
link |
And so I think that's basically what happened. Now, you asked me about how to make a video go
link |
viral or make a viral video. I don't think that if you or I decided to make a video about 96
link |
million balls, that it would also go viral. It's possible that Derek made like the canonical video
link |
about those black balls in the lake. Exactly. He did actually. Right. And so I don't know whether
link |
or not just following along is the secret. Yeah, but it's fascinating. I mean, just like you said,
link |
the algorithm sort of expanding that circle and then figuring out that more and more people did
link |
enjoy it. And that sort of phase shift of just a huge number of people enjoying it in the algorithm
link |
quickly, automatically, I assume, figuring that out. That's a, I don't know, the dynamics of
link |
psychology, that is a beautiful thing. And so what do you think about the idea of clipping,
link |
like too many people annoyed me into doing it, which is they were requesting it. They said it
link |
would be very beneficial to add clips in like the coolest points and actually have explicit
link |
videos. Like I'm reuploading a video, like a short clip, which is what the podcasts are doing.
link |
Do you see, as opposed to like I also add timestamps for the topics, you know,
link |
do you want the clip? Do you see YouTube somehow helping creators with that process or helping
link |
connect clips to the original videos? Or is that just on a long list of amazing features to work
link |
towards? Yeah, I mean, it's not something that I think we've done yet. But I can tell you that
link |
I think clipping is great. And I think it's actually great for you as a creator.
link |
And here's the reason. If you think about, I mean, let's, let's say the NBA is uploading
link |
videos of its games. Well, people might search for Warriors versus Rockets,
link |
or they might search for Steph Curry. And so a highlight from the game in which Steph Curry
link |
makes an amazing shot is an opportunity for someone to find a portion of that video. And so I think
link |
that you never know how people are going to search for something that you've created. And so you want
link |
to, I would say you want to make clips and, and add titles and things like that so that they can
link |
find it as easily as possible. Do you have a dream of a future, perhaps a distant future,
link |
when the YouTube algorithm figures that out, sort of automatically detects the parts of the video
link |
that are really interesting, exciting, potentially exciting for people, and sort of clip them out
link |
in this incredibly rich space? Because if you talk about, if you talk, even just this conversation,
link |
we probably covered 30, 40 little topics. And there's a huge space of users that would find,
link |
you know, 30% of those topics really interesting. And that space is very different. It's something
link |
that's beyond my ability to clip out, right? But the algorithm might be able to figure all that
link |
out, sort of expand into clips. Do you have a, do you think about this kind of thing? Do you have
link |
a hope, a dream that one day the algorithm will be able to do that kind of deep content analysis?
link |
Well, we've actually had projects that attempt to achieve this. But it really does depend on
link |
understanding the video well. And our understanding of the video right now is quite crude. And so
link |
I think it would be especially hard to do it with a conversation like this. One might be able to do
link |
it with, let's say, a soccer match more easily, right? You could probably find out where the goals
link |
were scored. And then of course, you need to figure out who it was that scored the goal. And
link |
that might require a human to do some annotation. But I think that trying to identify coherent
link |
topics in a transcript, like the one of our conversation, is not something that we're going
link |
to be very good at right away. And I was speaking more to the general problem, actually, of being
link |
able to do both a soccer match and our conversation without explicit sort of almost, my hope was that
link |
there exists an algorithm that's able to find exciting things in video. So Google now on Google
link |
search will help you find the segment of the video that you're interested in. So if you search for
link |
something like how to change the filter in my dishwasher, then if there's a long video about
link |
your dishwasher, and this is the part where the person shows you how to change the filter, then
link |
it will highlight that area and provide a link directly to it. And from your recollection,
link |
do you know if the thumbnail reflects? What's the difference between showing the full video and
link |
the shorter clip? Do you know how it's presented in search results? I don't remember how it's
link |
presented. And the other thing I would say is that right now it's based on creator annotations.
link |
Ah, got it. So it's not the thing we're talking about. But folks are working on the more
link |
automatic version. It's interesting, people might not imagine this, but a lot of our systems start
link |
by using almost entirely the audience behavior. And then as they get better,
link |
the refinement comes from using the content. And I wish, I know there's privacy concerns, but
link |
I wish YouTube explored the space, which is sort of putting a camera on the users if they allowed
link |
it, right? To study their, like I did a lot of emotion recognition work and so on, to study
link |
actual sort of richer signal. One of the cool things when you upload 360 like VR video to YouTube,
link |
and I've done this a few times. So I've uploaded myself. It's a horrible idea. Some people enjoyed
link |
it, but whatever. The video of me giving a lecture in 360, 360 camera, and it's cool because YouTube
link |
allows you to then watch, where did people look at? There's a heat map of where, you know,
link |
of where the center of the VR experience was. And it's interesting because that reveals to you,
link |
like, what people looked at. And it's, it's not always what you were expecting. It's not,
link |
in the case of the lecture is pretty boring. It is what we're expecting, but we did a few funny
link |
videos where there's a bunch of people doing things and they, everybody tracks those people.
link |
You know, in the beginning, they all look at the main person and they start spreading around and
link |
looking at the other people. It's fascinating. So that kind of, that's a really strong signal
link |
of what people found exciting in the video. I don't know how you get that from people just
link |
watching, except they tuned out at this point. Like, it's hard to measure this moment was super
link |
exciting for people. I don't know how you get that signal. Maybe comment, is there a way to get
link |
that signal where this was like, this is when their eyes opened up and they're like, like,
link |
for me with the Ray Dalio video, right? Like, at first I was like, oh, okay, this is another one
link |
of these like dumb it down for you videos. And then you like start watching, it's like, okay,
link |
there's a really crisp, clean, deep explanation of how the economy works. That's where I, like, sat up
link |
and started watching right that moment. Is there a way to detect that moment? The only way I can
link |
think of is by asking people to label it. You mentioned that we're quite far away in terms
link |
of doing video analysis, deep video analysis. Of course, Google, YouTube, you know, we're
link |
quite far away from solving the autonomous driving problem too. I don't know. I think we're closer
link |
to that. You never know. And the Wright brothers thought they were not going to fly
link |
for 50 years, three years before they flew. So what are the biggest challenges, would you say?
link |
Is it the broad challenge of understanding video, understanding natural language, understand the
link |
challenge before the entire machine learning community or just being able to understand
link |
it? Or is there something specific to video that's even more challenging than understanding
link |
natural language? What's your sense of what the biggest challenge is?
link |
I mean, video is just so much information. And so precision becomes a real problem. It's like
link |
you're trying to classify something and you've got a million classes. And the distinctions
link |
among them, at least from a machine learning perspective, are often pretty small. You need
link |
to see this person's number in order to know which player it is. And there's a lot of players.
link |
Or you need to see the logo on their chest in order to know which team they play for.
link |
And so, and that's just figuring out who's who, right? And then you go further and saying, okay,
link |
well, you know, was that a goal? Was it not a goal? Like, is that an interesting moment,
link |
as you said? Or is that not an interesting moment? These things can be pretty hard.
link |
So, okay, so Yann LeCun, I'm not sure if you're familiar with his current thinking
link |
and work. So he believes that what he's referring to as self-supervised learning
link |
will be the solution sort of to achieving this kind of greater level of intelligence. In fact,
link |
the thing he's focusing on is watching video and predicting the next frame. So predicting
link |
the future of video, right? So for now, we're very far from that. But his thought is because
link |
it's unsupervised, or as he refers to it, self-supervised. You know, if you watch enough video,
link |
essentially, if you watch YouTube, you'll be able to learn about the nature of reality,
link |
the physics, the common sense reasoning required by just teaching a system to predict
link |
the next frame. So he's confident this is the way to go. So for you, from the perspective of just
link |
working with this video, how do you think an algorithm that just watches all of YouTube
link |
stays up all day and night watching YouTube would be able to understand enough of the
link |
physics of the world about the way this world works, be able to do common sense reasoning and
link |
so on? Well, I mean, we have systems that already watch all the videos on YouTube, right?
link |
But they're just looking for very specific things, right? They're supervised learning systems that
link |
are trying to identify something or classify something. And I don't know if predicting the
link |
next frame is really going to get there because I'm not an expert on compression algorithms,
link |
but I understand that that's kind of what compression, video compression algorithms do,
link |
is they basically try to predict the next frame and then fix up the places where they got it
link |
wrong. And that leads to higher compression than if you actually put all the bits for the next frame
link |
there. So I don't know if I believe that just being able to predict the next frame is going to be
link |
enough because there's so many frames and even a tiny bit of error on a per frame basis can lead
link |
to wildly different videos. So the thing is the idea of compression is one way to do compression
link |
is to describe through text what's contained in the video. That's the ultimate high level of
link |
compression. So the idea is, traditionally, when you think of video or image compression, you're trying
link |
to maintain the same visual quality while reducing the size. But if you think of deep learning from
link |
a bigger perspective of what compression is, is you're trying to summarize the video. And the idea
link |
there is if you have a big enough neural network by watching the next, by trying to predict the
link |
next frame, you'll be able to form a compression of actually understanding what's going on in the
link |
scene. If there's two people talking, you can just reduce that entire video into the fact that two
link |
people are talking and maybe the content of what they're saying and so on. That's kind of the open
link |
ended dream. So I just wanted to sort of express that because it's an interesting compelling notion, but
link |
it is nevertheless true that video, our world, is a lot more complicated than we give it credit for.
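The "predict the next frame, then fix up the places where it got it wrong" description of video compression from earlier in this exchange can be illustrated with a toy encoder. The simplifying assumption here is that the prediction for each frame is just the previous frame, and frames are short lists of pixel values; real codecs use much richer prediction (for example motion compensation), so this is a sketch of the idea rather than of any actual codec.

```python
def encode(frames):
    """Store the first frame plus, for each later frame, only the prediction error."""
    first = frames[0]
    residuals = []
    previous = first
    for frame in frames[1:]:
        # Prediction: the next frame looks exactly like the previous one.
        # The residual records where that prediction was wrong.
        residuals.append([cur - prev for cur, prev in zip(frame, previous)])
        previous = frame
    return first, residuals


def decode(first, residuals):
    """Rebuild every frame by applying each correction to the previous frame."""
    frames = [first]
    for residual in residuals:
        frames.append([prev + r for prev, r in zip(frames[-1], residual)])
    return frames


# Hypothetical 4-pixel frames in which very little changes between frames.
frames = [[10, 10, 10, 10], [10, 10, 12, 10], [10, 10, 12, 11]]
first, residuals = encode(frames)
print(residuals)
print(decode(first, residuals) == frames)
```

When consecutive frames are similar, the residuals are mostly zeros, which is what makes them cheaper to store than the raw frames.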
link |
I mean, in terms of search and discovery, we have been working on trying to summarize videos
link |
in text or with some kind of labels for eight years, at least. And we're kind of so-so.
link |
So if you were to say the problem is 100% solved and eight years ago, was 0% solved?
link |
Where are we on that timeline, would you say?
link |
Yeah, to summarize a video well, maybe less than a quarter of the way.
link |
So on that topic, what does YouTube look like 10, 20, 30 years from now?
link |
I mean, I think that YouTube is evolving to take the place of TV. I grew up as a kid in the 70s
link |
and I watched a tremendous amount of television. And I feel sorry for my poor mom because
link |
people told her at the time that it was going to rot my brain and that she should kill her
link |
television. But anyway, I mean, I think that YouTube is, at least for my family,
link |
a better version of television, right? It's one that is on demand. It's more tailored to the things
link |
that my kids want to watch. And also, they can find things that they would never have found on
link |
television. And so I think that at least from just observing my own family, that's where we're
link |
headed is that people watch YouTube kind of in the same way that I watched television when I was
link |
younger. So from a search and discovery perspective, what are you excited about in the 5, 10, 20,
link |
30 years? It's already really good. I think it's achieved a lot of, of course, we don't know what's
link |
possible. So it's the task of search of typing in the text or discovering new videos by the next
link |
recommendation. I personally am really happy with the experience. Continuously, I rarely watch a video
link |
that's not awesome from my own perspective. But what else is possible? What are you excited about?
link |
Well, I think introducing people to more of what's available on YouTube is not only very important
link |
to YouTube and to creators, but I think it will help enrich people's lives. Because there's a lot
link |
that I'm still finding out is available on YouTube that I didn't even know. I've been working
link |
at YouTube eight years, and it wasn't until last year that I learned that I could watch
link |
USC football games from the 1970s. Like, I didn't even know that was possible until last year. And
link |
I've been working here quite some time. So, you know, what was broken about, about that, that it
link |
took me seven years to learn that this stuff was already on YouTube, even when I got here. So I
link |
think there's a big opportunity there. And then, as I said before, you know, we want to make sure
link |
that YouTube finds a way to ensure that it's acting responsibly with respect to society and
link |
enriching people's lives. So we want to take all of the great things that it does and make sure
link |
that we are eliminating the negative consequences that might happen. And then lastly, if we could
link |
get to a point where all the videos people watch are the best ones they've ever watched, that would
link |
be outstanding too. Do you see it, in many senses, becoming a window into the world for people?
link |
It's, especially with live video, you get to watch events. I mean, it's really, it's the way you
link |
experience a lot of the world that's out there is better than TV in many, many ways. So do you see
link |
it becoming more than just video? Do you see creators creating visual experiences and virtual worlds?
link |
So I'm talking crazy now, but sort of virtual reality and entering that space,
link |
or is that, at least for now, totally outside what YouTube is thinking about?
link |
I mean, I think Google is thinking about virtual reality. I don't think about virtual reality too
link |
much. I know that we would want to make sure that YouTube is there when virtual reality becomes
link |
something or if virtual reality becomes something that a lot of people are interested in. But I
link |
haven't seen it really take off yet. Take off. Well, the future is wide open. Cristos, I've been
link |
really looking forward to this conversation. It's been a huge honor. Thank you for answering some
link |
of the more difficult questions I've asked. I'm really excited about what YouTube has in store
link |
for us. It's one of the greatest products I've ever used and continues. So thank you so much for
link |
talking today. It's my pleasure. Thanks for asking me. Thanks for listening to this conversation.
link |
And thank you to our presenting sponsor, Cash App. Download it, use code LexPodcast.
link |
You'll get $10 and $10 will go to FIRST, a STEM education nonprofit that inspires hundreds of
link |
thousands of young minds to become future leaders and innovators. If you enjoy this podcast,
link |
subscribe on YouTube, give it five stars on Apple Podcasts, follow on Spotify,
link |
support on Patreon, or simply connect with me on Twitter.
link |
And now, let me leave you with some words of wisdom from Marcel Proust. The real
link |
voyage of discovery consists not in seeking new landscapes, but in having new eyes.
link |
Thank you for listening, and I hope to see you next time.