
Gustav Soderstrom: Spotify | Lex Fridman Podcast #29



link |
00:00:00.000
The following is a conversation with Gustav Söderström.
link |
00:00:03.920
He's the chief research and development officer at Spotify,
link |
00:00:07.200
leading their product design, data technology and engineering teams.
link |
00:00:11.200
As I've said before, in my research and in life in general,
link |
00:00:15.280
I love music, listening to it and creating it.
link |
00:00:18.720
And using technology, especially personalization through machine learning,
link |
00:00:23.600
to enrich the music discovery and listening experience.
link |
00:00:27.840
That is what Spotify has been doing for years, continually innovating,
link |
00:00:31.920
defining how we experience music as a society in the digital age.
link |
00:00:36.000
That's what Gustav and I talk about, among many other topics,
link |
00:00:39.200
including our shared appreciation of the movie True Romance,
link |
00:00:43.280
in my view, one of the great movies of all time.
link |
00:00:46.080
This is the Artificial Intelligence Podcast.
link |
00:00:49.280
If you enjoy it, subscribe on YouTube, give it five stars on iTunes,
link |
00:00:53.120
support on Patreon or simply connect with me on Twitter at Lex Fridman,
link |
00:00:58.000
spelled F R I D M A N.
link |
00:01:01.200
And now, here's my conversation with Gustav Söderström.
link |
00:01:06.400
Spotify has over 50 million songs in its catalog.
link |
00:01:10.240
So let me ask the all important question.
link |
00:01:14.080
I feel like you're the right person to ask.
link |
00:01:16.240
What is the definitive greatest song of all time?
link |
00:01:19.520
It varies for me, personally.
link |
00:01:22.640
So you can't speak definitively for everyone?
link |
00:01:26.160
I wouldn't believe very much in machine learning if I did, right?
link |
00:01:30.240
Because then everyone would have the same taste.
link |
00:01:32.800
So for you, what is... you have to pick. What is the song?
link |
00:01:36.960
All right, so it's pretty easy for me.
link |
00:01:39.360
There's this song called You're So Cool, by Hans Zimmer, from the soundtrack to True Romance.
link |
00:01:46.000
It was a movie that made a big impression on me.
link |
00:01:49.040
And it's kind of been following me through my life.
link |
00:01:51.840
I actually had it play at my wedding.
link |
00:01:55.360
I sat with the organist and helped him play it on an organ,
link |
00:01:58.400
which was a pretty interesting experience.
link |
00:02:01.040
That is probably my, I would say, top three movie of all time.
link |
00:02:06.000
Yeah, this is an incredible movie.
link |
00:02:07.600
Yeah, and it came out during my formative years.
link |
00:02:10.400
And as I've discovered in music, you shape your music taste during those years.
link |
00:02:15.920
So it definitely affected me quite a bit.
link |
00:02:17.840
Did it affect you in any other kind of way?
link |
00:02:20.960
Well, the movie itself affected me back then.
link |
00:02:23.440
It was a big part of culture.
link |
00:02:25.600
I didn't really adopt any characters from the movie,
link |
00:02:27.680
but it was a great story of love, fantastic actors.
link |
00:02:33.040
And really, I didn't even know who Hans Zimmer was at the time, but fantastic music.
link |
00:02:39.040
And so that song has followed me.
link |
00:02:42.160
And the movie actually has followed me throughout my life.
link |
00:02:43.920
That was Quentin Tarantino, actually, I think, director or producer.
link |
00:02:48.480
So it's not Stairway to Heaven or Bohemian Rhapsody.
link |
00:02:52.080
Those are great.
link |
00:02:53.600
They're not my personal favorites, but I've realized that people have different tastes.
link |
00:02:57.760
And that's a big part of what we do.
link |
00:03:00.400
Well, for me, I would have to stick with Stairway to Heaven.
link |
00:03:04.000
So 35,000 years ago, I looked this up on Wikipedia,
link |
00:03:09.280
flute like instruments started being used in caves as part of hunting rituals.
link |
00:03:13.120
And primitive cultural gatherings, things like that.
link |
00:03:15.760
This is the birth of music.
link |
00:03:18.000
Since then, we had a few folks, Beethoven, Elvis, Beatles, Justin Bieber, of course, Drake.
link |
00:03:25.680
So in your view, let's start like high level philosophical.
link |
00:03:29.280
What is the purpose of music on this planet of ours?
link |
00:03:35.200
I think music has many different purposes.
link |
00:03:38.240
I think there's certainly a big purpose, which is the same as much of entertainment,
link |
00:03:44.640
which is escapism and to be able to live in some sort of other mental state for a while.
link |
00:03:52.080
But I also think you have the opposite of escaping,
link |
00:03:54.320
which is to help you focus on something you are actually doing.
link |
00:03:57.280
Because I think people use music as a tool to tune the brain
link |
00:04:02.640
to the activities that they are actually doing.
link |
00:04:05.120
And it's kind of like, in one sense, maybe it's the rawest signal.
link |
00:04:10.560
If you think about the brain as neural networks,
link |
00:04:13.040
it's maybe the most efficient hack we can do to actually actively tune it
link |
00:04:16.880
into some state that you want to be.
link |
00:04:18.880
You can do it in other ways.
link |
00:04:19.760
You can tell stories to put people in a certain mood.
link |
00:04:22.240
But music is probably very effective to get you to a certain mood very fast, I think.
link |
00:04:27.120
You know, there's a social component historically to music,
link |
00:04:30.960
where people listen to music together.
link |
00:04:32.480
I was just thinking about this, that to me, and you mentioned machine learning,
link |
00:04:36.880
but to me personally, music is a really private thing.
link |
00:04:43.040
I'm speaking for myself, I listen to music,
link |
00:04:45.920
like almost nobody knows the kind of things I have in my library,
link |
00:04:50.320
except people who are really close to me and they really only know a certain percentage.
link |
00:04:54.400
There's like some weird stuff that I'm almost probably embarrassed by, right?
link |
00:04:58.560
It's called the guilty pleasures, right?
link |
00:05:00.000
Everyone has the guilty pleasures, yeah.
link |
00:05:02.560
Hopefully they're not too bad, but for me, it's personal.
link |
00:05:06.560
Do you think of music as something that's social or as something that's personal?
link |
00:05:12.880
Or does it vary?
link |
00:05:14.560
So I think it's the same answer that you use it for both.
link |
00:05:20.720
We've thought a lot about this during these 10 years at Spotify, obviously.
link |
00:05:25.360
In one sense, as you said, music is incredibly
link |
00:05:27.840
social, you go to concerts and so forth.
link |
00:05:30.480
On the other hand, it is your escape and everyone has these things that are very personal to them.
link |
00:05:38.400
So what we've found is that when it comes to, most people claim that they have a friend or two
link |
00:05:47.680
that they are heavily inspired by and that they listen to.
link |
00:05:50.880
So I actually think music is very social, but in a smaller group setting,
link |
00:05:54.560
it's an intimate form of, it's an intimate relationship.
link |
00:06:00.400
It's not something that you necessarily share broadly.
link |
00:06:03.360
Now, at concerts, you can argue you do, but then you've gathered a lot of people
link |
00:06:07.040
that you have something in common with.
link |
00:06:08.880
I think this broadcast sharing of music is something we tried on social networks and so forth.
link |
00:06:16.960
But it turns out that people aren't super interested in sharing their music.
link |
00:06:23.120
They aren't super interested in what their friends listen to.
link |
00:06:28.480
They're interested in understanding if they have something in common perhaps with a friend,
link |
00:06:32.800
but not just as information.
link |
00:06:35.680
Right, that's really interesting.
link |
00:06:38.000
I was just thinking of it this morning, listening to Spotify.
link |
00:06:41.600
I really have a pretty intimate relationship with Spotify, with my playlists, right?
link |
00:06:48.480
I've had them for many years now and they've grown with me together.
link |
00:06:53.360
There's an intimate relationship you have with a library of music that you've developed.
link |
00:06:59.520
And we'll talk about different ways we can play with that.
link |
00:07:02.480
Can you do the impossible task and try to give a history of music listening
link |
00:07:09.280
from your perspective from before the internet and after the internet
link |
00:07:14.160
and just kind of everything leading up to streaming with Spotify and so on?
link |
00:07:18.800
I'll try.
link |
00:07:19.280
It could be a 100 year podcast.
link |
00:07:22.320
I'll try to do a brief version.
link |
00:07:24.400
There are some things that I think are very interesting during the history of music,
link |
00:07:28.080
which is that before recorded music, to be able to enjoy music,
link |
00:07:33.040
you actually had to be where the music was produced
link |
00:07:35.440
because you couldn't record it and time shift it, right?
link |
00:07:38.640
Creation and consumption had to happen at the same time, basically concerts.
link |
00:07:41.520
And so you either had to get to the nearest village to listen to music.
link |
00:07:46.320
And while that was cumbersome and it severely limited the distribution of music,
link |
00:07:51.440
it also had some different qualities,
link |
00:07:53.200
which was that the creator could always interact with the audience.
link |
00:07:56.640
It was always live.
link |
00:07:58.400
And also there was no time cap on the music.
link |
00:08:00.640
So I think it's not a coincidence that these early classical works,
link |
00:08:04.960
they're much longer than the three minutes.
link |
00:08:06.640
The three minutes came in as a restriction of the first wax disc that could only contain
link |
00:08:11.600
a three minute song on one side, right?
link |
00:08:14.080
So actually the recorded music severely limited or put constraints.
link |
00:08:20.400
I won't say limit.
link |
00:08:21.040
I mean, constraints are often good,
link |
00:08:22.160
but it put very hard constraints on the music format.
link |
00:08:24.960
So you kind of said, instead of doing this opus on many tens of minutes or something,
link |
00:08:31.200
now you get three and a half minutes because then you're out of wax on this disc.
link |
00:08:34.560
But in return, you get an amazing distribution.
link |
00:08:37.680
Your reach will widen, right?
link |
00:08:39.440
Just on that point real quick.
link |
00:08:42.560
Without the mass scale distribution, there's a scarcity component
link |
00:08:47.920
where you kind of look forward to it.
link |
00:08:51.760
We had that, it's like the Netflix versus HBO Game of Thrones.
link |
00:08:56.400
You like wait for the event because you can't really listen to it.
link |
00:09:00.160
So you like look forward to it and then it's like,
link |
00:09:02.800
you derive perhaps more pleasure because it's more rare for you to listen to a particular piece.
link |
00:09:07.920
You think there's value to that scarcity?
link |
00:09:10.480
Yeah, I think that that is definitely a thing.
link |
00:09:12.720
And there's always this component of if you have something in infinite amounts,
link |
00:09:17.200
will you value it as much?
link |
00:09:20.000
Probably not.
link |
00:09:20.880
Humanity is always seeking some, it's relative.
link |
00:09:24.400
So you're always seeking something you didn't have.
link |
00:09:25.840
And when you have it, you don't appreciate it as much.
link |
00:09:27.600
So I think that's probably true.
link |
00:09:31.200
But I think that's why concerts exist.
link |
00:09:33.040
So you can actually have both.
link |
00:09:35.520
But I think net, if you couldn't listen to music in your car driving, that'd be worse.
link |
00:09:42.000
That cost will be bigger than the benefit of the anticipation I think that you would have.
link |
00:09:47.360
So, yeah, it started with live concerts.
link |
00:09:50.720
Then it's being able to, you know, the phonograph invented, right?
link |
00:09:56.720
That you start to be able to record music.
link |
00:09:59.440
Exactly.
link |
00:09:59.840
So then you got this massive distribution that made it possible to create two things.
link |
00:10:04.560
I think, first of all, cultural phenomena, they probably need distribution to be able to happen.
link |
00:10:10.560
But it also opened access to, you know, for a new kind of artist.
link |
00:10:15.520
So you started to have these phenomena like the Beatles and Elvis and so forth.
link |
00:10:18.720
That were really a function of distribution, I think, and obviously of talent and innovation.
link |
00:10:23.680
But there was also a technical component.
link |
00:10:25.760
And of course, the next big innovation to come along was radio.
link |
00:10:29.040
Broadcast radio.
link |
00:10:30.720
And I think radio is interesting because it started not as a music medium.
link |
00:10:36.240
It started as an information medium for news.
link |
00:10:39.600
And then radio needed to find something to fill the time with so that they could honestly
link |
00:10:45.280
play more ads and make more money.
link |
00:10:47.200
And music was free.
link |
00:10:48.480
So then you had this massive distribution where you could program to people.
link |
00:10:52.480
I think those things, that ecosystem, is what created the ability for hits.
link |
00:10:59.200
But it was also a very broadcast medium.
link |
00:11:01.600
So you would tend to get these massive, massive hits, but maybe not such a long tail.
link |
00:11:07.440
In terms of choice, everybody listens to the same stuff.
link |
00:11:10.480
Yeah.
link |
00:11:10.960
And as you said, I think there are some social benefits to that.
link |
00:11:14.720
I think, for example, there's a high statistical chance that if I talk about the latest episode
link |
00:11:19.760
of Game of Thrones, we have something to talk about, just statistically.
link |
00:11:23.280
In the age of individual choice, maybe some of that goes away.
link |
00:11:26.240
So I do see the value of shared cultural components, but I also obviously love personalization.
link |
00:11:36.400
And so let's catch this up to the internet.
link |
00:11:39.120
So maybe Napster, well, first of all, there's MP3s, tapes, CDs.
link |
00:11:44.640
There was a digitalization of music with a CD, really.
link |
00:11:47.440
It was physical distribution, but the music became digital.
link |
00:11:51.200
And so they were files, but basically boxed software, to use a software analogy.
link |
00:11:56.800
And then you could start downloading these files.
link |
00:11:59.920
And I think there are two interesting things that happened.
link |
00:12:02.480
Back to music used to be longer before it was constrained by the distribution medium.
link |
00:12:08.080
I don't think that was a coincidence.
link |
00:12:09.840
And then really the only music genre to have developed mostly after music was a file again
link |
00:12:15.600
on the internet is EDM.
link |
00:12:17.360
And EDM is often much longer than the traditional music.
link |
00:12:20.640
I think it's interesting to think about the fact that music is no longer constrained in
link |
00:12:26.000
minutes per song or something.
link |
00:12:27.040
It's a legacy of an old distribution technology.
link |
00:12:31.120
And you see some of this new music that breaks the format.
link |
00:12:33.680
Not so much as I would have expected actually by now, but it still happens.
link |
00:12:38.160
So first of all, I don't really know what EDM is.
link |
00:12:41.120
Electronic dance music.
link |
00:12:42.320
Yeah.
link |
00:12:42.880
You could say Avicii.
link |
00:12:44.160
Avicii was one of the biggest in this genre.
link |
00:12:46.800
So the main constraint is of time.
link |
00:12:49.680
Something like a three, four, five minute song.
link |
00:12:52.480
So you could have songs that were eight minutes, 10 minutes and so forth.
link |
00:12:56.320
Because it started as a digital product that you downloaded.
link |
00:13:01.040
So you didn't have this constraint anymore.
link |
00:13:03.920
So I think it's something really interesting that I don't think has fully happened yet.
link |
00:13:08.480
We're kind of jumping ahead a little bit to where we are, but I think there's tons of format
link |
00:13:12.880
innovation in music that should happen now, that couldn't happen when you needed to really
link |
00:13:18.880
adhere to the distribution constraints.
link |
00:13:20.880
If you didn't adhere to that, you would get no distribution.
link |
00:13:24.240
So Björk, for example, the Icelandic artist, she made a full iPad app as an album.
link |
00:13:30.720
That was very expensive.
link |
00:13:33.440
Even though the app store has great distribution, she gets nowhere near the distribution versus
link |
00:13:38.000
staying within the three minute format.
link |
00:13:39.760
So I think now that music is fully digital inside these streaming services, there is
link |
00:13:44.720
the opportunity to change the format again and allow creators to be much more creative
link |
00:13:50.080
without limiting their distribution ability.
link |
00:13:52.800
That's interesting that you're right.
link |
00:13:54.960
It's surprising that we don't see that taken advantage more often.
link |
00:13:59.280
It's almost like the constraints of the distribution from the 50s and 60s have molded the culture
link |
00:14:06.400
to where we want the three to five minute song more than anything else.
link |
00:14:12.480
So we want the song as consumers and as artists, because I write a lot of music and I never
link |
00:14:18.880
even thought about writing something longer than 10 minutes.
link |
00:14:23.600
It's really interesting that those constraints.
link |
00:14:26.640
Because all your training data has been three and a half minute songs, right?
link |
00:14:29.600
That's right.
link |
00:14:30.320
Okay, so yes, digitization of data led to then mp3s.
link |
00:14:36.480
Yeah, so I think you had this file then that was distributed physically, but then you had
link |
00:14:42.240
the components of digital distribution and then the internet happened and there was this
link |
00:14:46.800
vacuum where you had a format that could be digitally shipped, but there was no business
link |
00:14:51.120
model.
link |
00:14:51.840
And then all these pirate networks happened,
link |
00:14:58.880
Napster and, in Sweden, Pirate Bay, which was one of the biggest.
link |
00:15:02.960
And I think from a consumer point of view, which kind of leads up to the inception of
link |
00:15:10.080
Spotify, from a consumer point of view, consumers for the first time had this access model to
link |
00:15:15.840
music where they could, without kind of any marginal cost, they could try different tracks.
link |
00:15:25.680
You could use music in new ways.
link |
00:15:27.360
There was no marginal cost.
link |
00:15:28.880
And that was a fantastic consumer experience to have access to all the music ever made,
link |
00:15:32.480
I think was fantastic.
link |
00:15:34.560
But it was also horrible for artists because there was no business model around it.
link |
00:15:38.000
So they didn't make any money.
link |
00:15:39.600
So the user need almost drove the user interface before there was a business model.
link |
00:15:46.400
And then there were these download stores that allowed you to download files, which
link |
00:15:52.160
was a solution, but it didn't solve the access problem.
link |
00:15:55.040
There was still a marginal cost of 99 cents to try one more track.
link |
00:15:58.560
And I think that that heavily limits how you listen to music.
link |
00:16:01.920
The example I always give is, you know, in Spotify, a huge amount of people listen to
link |
00:16:07.600
music while they sleep, while they go to sleep and while they sleep.
link |
00:16:11.280
If that cost you 99 cents per three minutes, you probably wouldn't do that.
link |
00:16:15.520
And you would be much less adventurous if there was a real dollar cost to exploring
link |
00:16:18.640
music.
link |
00:16:19.200
So the access model is interesting in that it changes your music behavior.
link |
00:16:22.320
You can be, you can take much more risk because there's no marginal cost to it.
link |
00:16:27.680
Maybe let me linger on piracy for a second, because I find, especially coming from Russia,
link |
00:16:33.200
piracy is something that's very interesting to me.
link |
00:16:39.440
Not me, of course, ever, but I have friends who have partaken in piracy of music, software,
link |
00:16:49.040
TV shows, sporting events.
link |
00:16:52.400
And usually to me, what that shows is that they can actually pay the money
link |
00:16:58.400
and they're not trying to save money.
link |
00:17:00.480
They're choosing the best experience.
link |
00:17:03.760
So what to me, piracy shows is a business opportunity in all these domains.
link |
00:17:08.560
And that's where I think you're right.
link |
00:17:11.120
Spotify stepped in is basically piracy was an experience.
link |
00:17:15.840
You can explore and find music you like, and actually the interface of piracy is horrible
link |
00:17:23.520
because it's, I mean, it's bad metadata, long download times, all kinds of stuff.
link |
00:17:29.680
And what Spotify does is basically first rewards artists and second makes the experience of
link |
00:17:37.520
exploring music much better.
link |
00:17:38.720
I mean, the same is true, I think for movies and so on.
link |
00:17:42.560
That piracy reveals in the software space, for example, I'm a huge user and fan of Adobe
link |
00:17:48.080
products and there was much more incentive to pirate Adobe products before they went
link |
00:17:54.720
to a monthly subscription plan.
link |
00:17:57.120
And now all of the said friends that used to pirate Adobe products that I know now actually
link |
00:18:04.640
pay gladly for the monthly subscription.
link |
00:18:06.880
Yeah, I think you're right.
link |
00:18:08.000
I think it's a sign of an opportunity for product development.
link |
00:18:11.360
And that sometimes there's a product market fit before there's a business model fit in
link |
00:18:19.120
product development.
link |
00:18:19.840
I think that's a sign of it.
link |
00:18:21.760
In Sweden, I think it was a bit of both.
link |
00:18:24.320
There was a culture where we even had a political party called the Pirate Party.
link |
00:18:30.480
And this was during the time when people said that information should be free.
link |
00:18:35.120
It was somehow wrong to charge for ones and zeros.
link |
00:18:38.080
So I think people felt that artists should probably make some money somehow else and
link |
00:18:43.600
concerts or something.
link |
00:18:44.880
So at least in Sweden, it was part really social acceptance, even at the political level.
link |
00:18:49.920
But that also forced Spotify to compete with free, which I don't think actually
link |
00:18:56.800
could have happened anywhere else in the world.
link |
00:18:58.560
The music industry needed to be doing bad enough to take that risk.
link |
00:19:03.120
And Sweden was like the perfect testing ground.
link |
00:19:04.800
It had government funded high bandwidth, low latency broadband, which meant that the product
link |
00:19:10.640
would work.
link |
00:19:11.440
And also, there was no music revenue anyway.
link |
00:19:14.000
So they were kind of like, I don't think this is going to work, but why not?
link |
00:19:18.800
So this product is one that I don't think could have happened in America, the world's
link |
00:19:21.920
largest music market, for example.
link |
00:19:23.920
So how do you compete with free?
link |
00:19:25.600
Because that's an interesting world of the internet where most people don't like to
link |
00:19:30.640
pay for things.
link |
00:19:31.520
So Spotify steps in and tries to, yes, compete with free.
link |
00:19:36.080
How do you do it?
link |
00:19:37.120
So I think two things.
link |
00:19:38.240
One is people are starting to pay for things on the internet.
link |
00:19:41.680
I think one way to think about it was that advertising was the first business model because
link |
00:19:47.440
no one would put a credit card on the internet.
link |
00:19:49.200
Transactional with Amazon was the second.
link |
00:19:51.600
And maybe subscription is the third.
link |
00:19:52.960
And if you look offline, subscription is the biggest of those.
link |
00:19:56.480
So that may still happen.
link |
00:19:57.600
I think people are starting to pay for things.
link |
00:19:59.040
But definitely back then, we needed to compete with free.
link |
00:20:02.480
And the first thing you need to do is obviously to lower the price to free and then you need
link |
00:20:07.600
to be better somehow.
link |
00:20:09.440
And the way that Spotify was better was on the user experience, on the actual performance,
link |
00:20:15.040
the latency of, you know, even if you had high bandwidth broadband, it would still take
link |
00:20:24.640
you 30 seconds to a minute to download one of these tracks.
link |
00:20:30.800
So the Spotify experience of starting within the perceptual limit of immediacy, about 250
link |
00:20:35.360
milliseconds, meant that the whole trick was it felt as if you had downloaded all of Pirate
link |
00:20:41.520
Bay.
link |
00:20:41.680
It was on your hard drive.
link |
00:20:42.800
It was that fast, even though it wasn't.
link |
00:20:45.360
And it was still free.
link |
00:20:46.720
But somehow you were actually still being a legal citizen.
link |
00:20:50.400
And that was the trick that Spotify managed to pull off.
link |
00:20:54.880
So I've actually heard you say this or write this.
link |
00:20:58.240
And I was surprised that I wasn't aware of it because I just took it for granted.
link |
00:21:02.400
You know, whenever an awesome thing comes along, you're just like, of course, it has
link |
00:21:05.920
to be this way.
link |
00:21:07.360
That's exactly right.
link |
00:21:08.560
That it felt like the entire world's libraries at my fingertips because of that latency being
link |
00:21:14.720
reduced.
link |
00:21:15.440
What was the technical challenge in reducing the latency?
link |
00:21:18.640
So there was a group of really, really talented engineers, one of them called Ludvig Strigeus.
link |
00:21:25.280
He wrote the, actually from Gothenburg, he wrote the initial, the uTorrent client, which
link |
00:21:32.080
is kind of an interesting backstory to Spotify, that we have one of the top developers from
link |
00:21:38.480
BitTorrent clients as well.
link |
00:21:39.840
So he wrote uTorrent, the world's smallest BitTorrent client.
link |
00:21:42.320
And then he was acquired very early by Daniel and Martin, who founded Spotify, and they
link |
00:21:49.440
actually sold the uTorrent client to BitTorrent, but kept Ludvig.
link |
00:21:53.040
So Spotify had a lot of experience within peer to peer networking.
link |
00:21:59.040
So the original innovation was a distribution innovation, where Spotify built an end to
link |
00:22:04.560
end media distribution system up until only a few years ago, we actually hosted all the
link |
00:22:08.160
music ourselves.
link |
00:22:09.440
So we had both the service side and the client, and that meant that we could do things such
link |
00:22:13.360
as having a peer to peer solution to use local caching on the client side, because back then
link |
00:22:19.200
the world was mostly desktop.
link |
00:22:20.800
But we could also do things like hack the TCP protocols, things like Nagle's algorithm
link |
00:22:26.240
for kind of exponential back off, or ramp up and just go full throttle and optimize
link |
00:22:31.200
for latency at the cost of bandwidth.
link |
00:22:33.760
And all of this end to end control meant that we could do an experience that felt like a
link |
00:22:39.200
step change.
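[Editor's illustration, not part of the conversation: the tradeoff described above, latency at the cost of bandwidth, can be sketched with a standard socket option. Nagle's algorithm batches small writes into fewer packets, which saves bandwidth but can delay the first bytes. The conversation does not say exactly what Spotify changed in its stack, so this is only a minimal sketch assuming a plain TCP client; the helper names `low_latency_socket` and `nagle_disabled` are hypothetical.]

```python
import socket


def low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled.

    Hypothetical helper for illustration only. With TCP_NODELAY set,
    small writes are sent immediately instead of being coalesced,
    trading some bandwidth efficiency for lower latency.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    sock = low_latency_socket()
    print("Nagle disabled:", nagle_disabled(sock))
    sock.close()
```

[A client would then call `connect()` on the returned socket as usual; the option only affects how small writes are sent, not the rest of the connection setup.]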
link |
00:22:40.480
These days, we actually are on GCP, we don't host our own stuff, and everyone is really
link |
00:22:46.720
fast these days.
link |
00:22:47.360
So that was the initial competitive advantage.
link |
00:22:49.440
But then obviously, you have to move on over time.
link |
00:22:51.440
And that was over 10 years ago, right?
link |
00:22:54.480
That was in 2008.
link |
00:22:55.840
The product was launched in Sweden.
link |
00:22:57.520
It was in a beta, I think, 2007.
link |
00:22:59.440
And it was on the desktop, right?
link |
00:23:00.800
It was desktop only.
link |
00:23:01.840
There's no phone.
link |
00:23:03.840
There was no phone.
link |
00:23:04.480
The iPhone came out in 2008.
link |
00:23:07.920
But the App Store came out one year later, I think.
link |
00:23:10.480
So the writing was on the wall, but there was no phone yet.
link |
00:23:14.160
You've mentioned that people would use Spotify to discover the songs they like, and then
link |
00:23:19.680
they would torrent those songs so they could copy them to their phone.
link |
00:23:24.880
Just hilarious.
link |
00:23:25.840
Exactly.
link |
00:23:26.320
Not torrent, pirate.
link |
00:23:27.440
Seriously, piracy does seem to be like a good guide for business models.
link |
00:23:33.520
Video content.
link |
00:23:34.560
As far as I know, Spotify doesn't have video content.
link |
00:23:37.600
Well, we do have music videos, and we do have videos on the service.
link |
00:23:42.080
But the way we think about ourselves is that we're an audio service, and we think that
link |
00:23:48.320
if you look at the amount of time that people spend on audio, it's actually very similar
link |
00:23:52.800
to the amount of time that people spend on video.
link |
00:23:58.640
So the opportunity should be equally big.
link |
00:24:02.000
But today, it's not at all valued.
link |
00:24:03.520
Video is valued much higher.
link |
00:24:05.040
So we think it's basically completely undervalued.
link |
00:24:08.320
So we think of ourselves as an audio service.
link |
00:24:10.560
But within that audio service, I think video can make a lot of sense.
link |
00:24:14.000
I think when you're discovering an artist, you probably do want to see them and understand
link |
00:24:19.040
who they are, to understand their identity.
link |
00:24:21.200
You won't see that video every time.
link |
00:24:22.400
90% of the time, the phone is going to be in your pocket.
link |
00:24:25.120
For podcasters, you use video.
link |
00:24:27.280
I think that can make a ton of sense.
link |
00:24:28.560
So we do have video, but we're an audio service where, think of it as we call it internally,
link |
00:24:33.600
backgroundable video.
link |
00:24:35.120
Video that is helpful, but isn't the driver of the narrative.
link |
00:24:39.440
I think also, if we look at YouTube, there's quite a few folks who listen to music on YouTube.
link |
00:24:48.560
So in some sense, YouTube is a bit of a competitor to Spotify, which is very strange to me that
link |
00:24:55.280
people use YouTube to listen to music.
link |
00:24:57.920
They play essentially the music videos, right?
link |
00:25:00.640
But don't watch the videos and put it in their pocket.
link |
00:25:03.360
Well, I think it's similar to what, strangely, maybe it's similar to what we were for the
link |
00:25:12.240
piracy networks, where YouTube, for historical reasons, has a lot of music videos.
link |
00:25:20.640
So people use YouTube for a lot of the discovery part of the process, I think.
link |
00:25:25.040
But then it's not a really good sort of, quote unquote, MP3 player, because it doesn't even
link |
00:25:29.520
background.
link |
00:25:29.920
Then you have to keep the app in the foreground.
link |
00:25:31.600
So it's not a good consumption tool, but it's a decently good discovery.
link |
00:25:36.160
I mean, I think YouTube is a fantastic product.
link |
00:25:38.400
And I use it for all kinds of purposes.
link |
00:25:40.320
That's true.
link |
00:25:41.040
If I were to admit something, I do use YouTube a little bit to assist in the discovery process
link |
00:25:46.560
of songs.
link |
00:25:47.280
And then if I like it, I'll add it to Spotify.
link |
00:25:50.320
But that's OK.
link |
00:25:51.760
That's OK with us.
link |
00:25:53.600
OK, so sorry, we're jumping around a little bit.
link |
00:25:55.520
So it's kind of incredible.
link |
00:25:58.560
You look at Napster, you look at the early days of Spotify.
link |
00:26:03.440
One fascinating point is how do you grow a user base?
link |
00:26:06.640
So you're there in Sweden.
link |
00:26:08.960
You have an idea.
link |
00:26:10.320
I saw the initial sketches that look terrible.
link |
00:26:14.160
How do you grow a user base from a few folks to millions?
link |
00:26:19.280
I think there are a bunch of tactical answers.
link |
00:26:22.240
So first of all, I think you need a great product.
link |
00:26:24.160
I don't think you take a bad product and market it to be successful.
link |
00:26:30.080
So you need a great product.
link |
00:26:31.120
But sorry to interrupt, but it's a totally new way to listen to music, too.
link |
00:26:34.720
So it's not just did people realize immediately that Spotify is a great product?
link |
00:26:38.560
No, I think they did.
link |
00:26:40.240
So back to the point of piracy, it was a totally new way to listen to music legally.
link |
00:26:45.840
But people had been used to the access model in Sweden
link |
00:26:48.960
and the rest of the world for a long time through piracy.
link |
00:26:50.880
So one way to think about Spotify, it was just legal and fast piracy.
link |
00:26:54.720
And so people have been using it for a long time.
link |
00:26:56.960
So they weren't alien to it.
link |
00:26:59.040
They didn't really understand how it could be legal
link |
00:27:01.360
because it seemed too fast and too good to be true,
link |
00:27:03.920
which I think is a great product proposition if you can be too good to be true.
link |
00:27:06.960
But what I saw again and again was people showing each other,
link |
00:27:09.760
clicking the song, showing how fast it started and say, can you believe this?
link |
00:27:13.200
So I really think it was about speed.
link |
00:27:16.320
Then we also had an invite program that was really meant for scaling
link |
00:27:22.000
because we hosted our own service.
link |
00:27:23.280
We needed to control scaling.
link |
00:27:25.040
But that built a lot of expectation.
link |
00:27:27.600
And I don't want to say hype because hype implies that it wasn't true.
link |
00:27:32.880
Excitement around the product. And we've replicated that when we launched in the US.
link |
00:27:38.560
We also built up an invite only program first.
link |
00:27:41.200
There are lots of tactics, but I think you need a great product to solve some problem.
link |
00:27:46.160
And basically the key innovation, there was technology,
link |
00:27:51.440
but on a meta level, the innovation was really the access model versus the ownership model.
link |
00:27:55.600
And that was tricky.
link |
00:27:56.880
A lot of people said that they wanted to be able to do it.
link |
00:28:01.440
I mean, they wanted to own their music.
link |
00:28:04.480
They would never kind of rent it or borrow it.
link |
00:28:07.520
But I think the fact that we had a free tier,
link |
00:28:09.120
which meant that you get to keep this music for life as well, helped quite a lot.
link |
00:28:14.560
So this is an interesting psychological point that maybe you can speak to.
link |
00:28:18.560
It was a big shift for me.
link |
00:28:22.240
It's almost like I had to go to therapy for this.
link |
00:28:26.240
I think I would describe my early listening experience,
link |
00:28:29.360
and I think a lot of my friends do, as basically hoarding music.
link |
00:28:33.280
As you're like slowly, one song by one song,
link |
00:28:35.920
or maybe albums, gathering a collection of music that you love.
link |
00:28:40.960
And you own it.
link |
00:28:42.080
It's like often, especially with CDs or tape, you like physically had it.
link |
00:28:46.960
And what Spotify, what I had to come to grips with,
link |
00:28:50.240
it was kind of liberating actually, is to throw away all the music.
link |
00:28:55.520
I've had this therapy session with lots of people.
link |
00:28:58.480
And I think the mental trick is, so actually we've seen the user data.
link |
00:29:02.560
When Spotify started, a lot of people did the exact same thing.
link |
00:29:05.040
They started hoarding as if the music would disappear.
link |
00:29:09.280
Almost the equivalent of downloading.
link |
00:29:10.880
And so we had these playlists that had limits of like a few hundred thousand tracks.
link |
00:29:16.080
We figured no one will ever.
link |
00:29:17.360
Well, they do.
link |
00:29:18.560
Nuts. Hundreds and hundreds of thousands of tracks.
link |
00:29:20.960
And to this day, some people want to actually save, quote unquote,
link |
00:29:25.760
and then play the entire catalog.
link |
00:29:26.960
But I think the therapy session goes something like instead of throwing away your music,
link |
00:29:34.080
if you took your files and you stored them in the locker at Google,
link |
00:29:38.720
it'd be a streaming service.
link |
00:29:39.680
It's just that in that locker, you have all the world's music now for free.
link |
00:29:42.720
So instead of giving away your music, you got all the music.
link |
00:29:45.520
It's yours.
link |
00:29:46.720
You could think of it as having a copy of the world's catalog there forever.
link |
00:29:50.240
So you actually got more music instead of less.
link |
00:29:52.720
It's just that you just took that hard disk and you sent it to someone who stored it for you.
link |
00:29:58.720
And once you go through that mental journey, I'm like, it's still my files.
link |
00:30:01.440
They're just over there.
link |
00:30:02.560
And I just have 40 million or 50 million or something now.
link |
00:30:05.520
Then people are like, OK, that's good.
link |
00:30:07.600
The problem is, I think, because you paid us a subscription,
link |
00:30:11.840
if we hadn't had the free tier where you would feel like,
link |
00:30:14.000
even if I don't want to pay anymore, I still get to keep them.
link |
00:30:17.120
You keep your playlist forever.
link |
00:30:18.480
They don't disappear even though you stop paying.
link |
00:30:20.240
I think that was really important.
link |
00:30:21.760
If we would have started as, you know, you can put in all this time,
link |
00:30:25.440
but if you stop paying, you lose all your work.
link |
00:30:27.280
I think that would have been a big challenge and was the big challenge for a lot of our competitors.
link |
00:30:31.760
That's another reason why I think the free tier is really important.
link |
00:30:34.880
That people need to feel the security, that the work they put in,
link |
00:30:37.600
it will never disappear, even if they decide not to pay.
link |
00:30:40.800
I like how you put the work you put in.
link |
00:30:42.880
I actually stopped even thinking of it that way.
link |
00:30:44.480
I just actually Spotify taught me to just enjoy music as opposed to.
link |
00:30:50.080
As opposed to what I was doing before, which is like in an unhealthy way, hoarding music.
link |
00:30:58.560
Which I found that because I was doing that,
link |
00:31:01.280
I was listening to a small selection of songs way too much to where I was getting sick of them.
link |
00:31:07.520
Whereas Spotify, the more liberating kind of approach is I was just enjoying.
link |
00:31:11.680
Of course, I listened to Stairway to Heaven over and over,
link |
00:31:13.920
but because of the extra variety, I don't get as sick of them.
link |
00:31:18.240
There's an interesting statistic I saw.
link |
00:31:21.520
So Spotify has, maybe you can correct me, but over 50 million songs, tracks,
link |
00:31:27.600
and over 3 billion playlists.
link |
00:31:31.360
So 50 million songs and 3 billion playlists.
link |
00:31:35.520
60 times more playlists than songs.
link |
00:31:38.480
What do you make of that?
link |
00:31:39.920
Yeah.
link |
00:31:40.160
So the way I think about it is that from a statistician or machine learning point of view,
link |
00:31:48.320
you have all these, if you want to think about reinforcement learning,
link |
00:31:52.000
you have this state space of all the tracks.
link |
00:31:54.320
You can take different journeys through this world.
link |
00:32:00.160
I think of these as people helping themselves and each other,
link |
00:32:05.200
creating interesting vectors through this space of tracks.
link |
00:32:08.720
And then it's not so surprising that across many tens of millions of atomic units,
link |
00:32:14.080
there will be billions of paths that make sense.
link |
00:32:17.280
And we're probably quite far away from having found all of them.
link |
00:32:21.920
So kind of our job now is users, when Spotify started,
link |
00:32:26.640
it was really a search box that was for the time pretty powerful.
link |
00:32:30.000
And then I'd like to refer to it as this programming language called playlisting,
link |
00:32:34.400
where if you, as you probably were pretty good at music,
link |
00:32:36.800
you knew your new releases, you knew your back catalog,
link |
00:32:39.120
you knew your Stairway to Heaven,
link |
00:32:40.480
you could create a soundtrack for yourself using this playlisting tool,
link |
00:32:43.200
this like meta programming language for music to soundtrack your life.
link |
00:32:47.360
And people who were good at music, it's back to how do you scale the product.
link |
00:32:50.960
For people who are good at music, that was actually enough.
link |
00:32:53.760
If you had the catalog and a good search tool,
link |
00:32:55.840
and you can create your own sessions,
link |
00:32:57.120
you could create a really good soundtrack for your entire life.
link |
00:33:01.120
Probably perfectly personalized because you did it yourself.
link |
00:33:04.000
But the problem was most people, many people aren't that good at music.
link |
00:33:06.880
They just can't spend the time.
link |
00:33:08.480
Even if you're very good at music, it's going to be hard to keep up.
link |
00:33:10.800
So what we did to try to scale this was to essentially try to build,
link |
00:33:16.400
you can think of them as agents that this friend that some people had
link |
00:33:20.480
that helped them navigate this music catalog.
link |
00:33:22.800
That's what we're trying to do for you.
link |
00:33:24.800
But also there is something like 200 million active users.
link |
00:33:32.640
200 million active users on Spotify.
link |
00:33:35.040
So there it's okay.
link |
00:33:36.640
So from the machine learning perspective,
link |
00:33:39.760
you have these 200 million people plus they're creating.
link |
00:33:45.760
It's really interesting to think of a playlist as,
link |
00:33:51.760
I mean, I don't know if you meant it that way,
link |
00:33:53.200
but it's almost like a programming language.
link |
00:33:54.880
It's or at least a trace of exploration of those individual agents.
link |
00:34:01.120
The listeners, and you have all these new tracks coming in.
link |
00:34:06.000
So it's a fascinating space that is ripe for machine learning.
link |
00:34:11.680
So is there, is it possible, how can playlists be used as data
link |
00:34:18.080
in terms of machine learning and to help Spotify organize the music?
link |
00:34:24.160
So we found in our data, not surprisingly, that people who playlisted a lot
link |
00:34:29.680
they retain much better.
link |
00:34:30.720
They had a great experience.
link |
00:34:32.240
And so our first attempt was to playlist for users.
link |
00:34:35.920
And so we acquired this company called Tunigo of editors and professional playlisters
link |
00:34:41.360
and kind of leveraged the maximum of human intelligence
link |
00:34:45.600
to help build kind of these vectors through the track space for people.
link |
00:34:52.480
And that broadened the product.
link |
00:34:54.320
But then the obvious next, and we use statistical means,
link |
00:34:57.840
where they could see when they created a playlist, how did that playlist perform?
link |
00:35:02.080
They could see skips of the songs, they could see how the songs perform,
link |
00:35:04.800
and they manually iterated the playlist to maximize performance for a large group of people.
link |
00:35:10.720
But there were never enough editors to playlist for you personally.
link |
00:35:14.480
So the promise of machine learning was to go from kind of group personalization
link |
00:35:18.240
using editors and tools and statistics to individualization.
link |
00:35:22.640
And then what's so interesting about the 3 billion playlists we have is we ended,
link |
00:35:28.160
the truth is we lucked out.
link |
00:35:29.360
This was not an a priori strategy, as is often the case.
link |
00:35:32.880
It looks really smart in hindsight, but it was dumb luck.
link |
00:35:37.440
We looked at these playlists and we had some people in the company,
link |
00:35:42.160
a person named Erik Bernhardsson.
link |
00:35:43.840
He was really good at machine learning already back then in like 2007, 2008.
link |
00:35:48.560
Back then it was mostly collaborative filtering and so forth.
link |
00:35:51.600
But we realized that what this is, is people are grouping tracks for themselves
link |
00:35:57.920
that have some semantic meaning to them.
link |
00:36:00.640
And then they actually label it with a playlist name as well.
link |
00:36:04.160
So in a sense, people were grouping tracks along semantic dimensions and labeling them.
link |
00:36:09.840
And so could you use that information to find that latent embedding?
link |
00:36:15.840
And so we started playing around with collaborative filtering
link |
00:36:20.960
and we saw tremendous success with it.
link |
00:36:24.160
Basically trying to extract some of these dimensions.
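[Editor's sketch] The idea described here, that tracks grouped into the same playlists probably share some meaning, can be illustrated with a minimal item-based collaborative filtering example. The playlist names, track titles, and function names below are hypothetical, and this is not Spotify's actual method; it only shows how playlist membership alone, with no knowledge of the audio, can yield a similarity signal between tracks.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: playlist name -> tracks a user grouped together.
playlists = {
    "rock classics": ["Stairway to Heaven", "Kashmir", "Paranoid"],
    "road trip": ["Stairway to Heaven", "Kashmir", "Dancing Queen"],
    "party": ["Dancing Queen", "Waterloo"],
}

def track_similarities(playlists):
    """Cosine similarity between tracks based on playlist co-occurrence."""
    appearances = defaultdict(int)  # track -> number of playlists containing it
    together = defaultdict(int)     # (track_a, track_b) -> playlists containing both
    for tracks in playlists.values():
        unique = sorted(set(tracks))
        for track in unique:
            appearances[track] += 1
        for a, b in combinations(unique, 2):
            together[(a, b)] += 1
    # For binary membership vectors, cosine similarity reduces to
    # shared_playlists / sqrt(playlists_with_a * playlists_with_b).
    return {
        (a, b): n / sqrt(appearances[a] * appearances[b])
        for (a, b), n in together.items()
    }

def most_similar(track, sims):
    """Return (other_track, score) pairs, most similar first."""
    scored = []
    for (a, b), score in sims.items():
        if a == track:
            scored.append((b, score))
        elif b == track:
            scored.append((a, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

sims = track_similarities(playlists)
print(most_similar("Stairway to Heaven", sims))
```

In this toy data, the two tracks that always appear together come out as most similar, which is the kind of latent structure the conversation refers to. A production system would work with far larger, sparser data and would likely factorize it into embeddings rather than compare raw co-occurrence counts.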
link |
00:36:28.320
And if you think about it, it's not surprising at all.
link |
00:36:30.880
It'd be quite surprising if playlists were actually random,
link |
00:36:34.880
if they had no semantic meaning.
link |
00:36:36.880
For most people, they group these tracks for some reason.
link |
00:36:39.840
So we just happened across this incredible data set.
link |
00:36:43.120
Where people are taking these tens of millions of tracks
link |
00:36:46.800
and group them along different semantic vectors.
link |
00:36:49.280
And the semantics being outside the individual users.
link |
00:36:52.720
So it's some kind of universal.
link |
00:36:54.400
There's a universal embedding that holds across people on this earth.
link |
00:36:59.760
Yes, I do think that the embeddings you find are going to be reflective of the people who playlisted.
link |
00:37:05.440
So if you have a lot of indie lovers who playlist,
link |
00:37:09.040
your embedding is going to perform better there.
link |
00:37:14.800
But what we found was that yes, there were these latent similarities.
link |
00:37:20.560
They were very powerful.
link |
00:37:22.000
And it was interesting because I think that the people who playlisted the most initially
link |
00:37:28.720
were the so called music aficionados who were really into music.
link |
00:37:32.640
And they often had a certain...
link |
00:37:34.240
Their taste was often geared towards a certain type of music.
link |
00:37:38.800
And so what surprised us, if you look at the problem from the outside,
link |
00:37:42.160
you might expect that the algorithms would start performing best with mainstreamers first.
link |
00:37:47.840
Because it somehow feels like an easier problem to solve mainstream taste
link |
00:37:51.360
than really particular taste.
link |
00:37:53.360
It was the complete opposite for us.
link |
00:37:55.120
The recommendations performed fantastically for people who saw themselves as
link |
00:37:59.280
having very unique taste.
link |
00:38:00.960
That's probably because all of them playlisted.
link |
00:38:03.280
And they didn't perform so well for mainstreamers.
link |
00:38:05.120
They actually thought they were a bit too particular and unorthodox.
link |
00:38:09.440
So we had the complete opposite of what we expected.
link |
00:38:12.000
Success within the hardest problem first,
link |
00:38:13.920
and then had to try to scale to more mainstream recommendations.
link |
00:38:17.600
So you've also acquired Echo Nest that analyzes song data.
link |
00:38:24.160
So in your view, maybe you can talk about,
link |
00:38:28.400
so what kind of data is there from a machine learning perspective?
link |
00:38:31.680
From a machine learning perspective, there's a huge amount.
link |
00:38:35.680
We're talking about playlisting and just user data of what people are listening to,
link |
00:38:40.640
the playlist they're constructing, and so on.
link |
00:38:44.640
And then there's the actual data within a song.
link |
00:38:48.080
What makes a song, I don't know, the actual waveforms.
link |
00:38:54.160
How do you mix the two?
link |
00:38:55.680
How much value is there in each?
link |
00:38:57.200
To me, it seems like a romantic notion
link |
00:39:03.120
that the song itself would contain useful information.
link |
00:39:05.840
But if I were to guess, user data would be much more powerful,
link |
00:39:09.840
like playlists would be much more powerful.
link |
00:39:11.840
Yeah, so we use both.
link |
00:39:14.800
Our biggest success initially was with playlist data
link |
00:39:18.800
without understanding anything about the structure of the song.
link |
00:39:22.480
But when we acquired Echo Nest, they had the inverse problem.
link |
00:39:25.520
They actually didn't have any play data.
link |
00:39:27.440
They were just, they were a provider of recommendations,
link |
00:39:29.680
but they didn't actually have any play data.
link |
00:39:31.840
So they looked at the structure of songs, sonically,
link |
00:39:36.640
and they looked at Wikipedia for cultural references and so forth, right?
link |
00:39:40.400
And did a lot of NLU and so forth.
link |
00:39:41.920
So we got that skill into the company and combined kind of our user data
link |
00:39:47.600
with their kind of content based.
link |
00:39:51.600
So you can think of it as we were user based
link |
00:39:53.200
and they were content based in their recommendations.
link |
00:39:54.880
And we combined those two.
link |
00:39:56.960
And for some cases where you have a new song that has no play data,
link |
00:40:00.240
obviously you have to try to go by either who the artist is
link |
00:40:04.960
or the sonic information in the song or what it's similar to.
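[Editor's sketch] The combination described here, usage-based signals where play data exists and content-based signals for a new song with none, can be illustrated with a simple weighted blend. The function name, the `half_weight_plays` parameter, and the weighting formula are assumptions made for illustration only; the conversation does not say how the two signals are actually combined.

```python
def blended_score(usage_score, content_score, play_count, half_weight_plays=100):
    """Blend a usage-based and a content-based score for one track.

    The weight on usage data grows with play_count. With zero plays the
    content score is used alone; at half_weight_plays plays (a hypothetical
    tuning knob) the two scores count equally.
    """
    usage_weight = play_count / (play_count + half_weight_plays)
    return usage_weight * usage_score + (1 - usage_weight) * content_score

# A brand-new track with no plays relies entirely on its content score.
print(blended_score(usage_score=0.0, content_score=0.8, play_count=0))

# A well-established track relies mostly on usage data.
print(blended_score(usage_score=0.9, content_score=0.2, play_count=9900))
```

The point of the design is only that the content signal covers the cold-start case and fades as listening data accumulates.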
link |
00:40:09.760
So there's definitely a value in both and we do a lot in both,
link |
00:40:12.720
but I would say, yes, the user data captures things
link |
00:40:16.080
that have to do with culture in the greater society
link |
00:40:19.680
that you would never see in the content itself.
link |
00:40:23.440
But that said, we have seen, we have a research lab in Paris
link |
00:40:28.880
and we can talk more about that, on machine learning on the creator side,
link |
00:40:32.960
what it can do for creators, not just for the consumers,
link |
00:40:35.520
but where we looked at how does the structure of a song
link |
00:40:38.640
actually affect the listening behavior?
link |
00:40:40.800
And it turns out that there is a lot of,
link |
00:40:43.120
we can predict things like skips based on the song itself.
link |
00:40:48.480
We could say that maybe you should move that chorus a bit
link |
00:40:50.880
because your skip is going to go up here.
link |
00:40:52.720
There is a lot of latent structure in the music,
link |
00:40:54.400
which is not surprising because it is some sort of mind hack.
link |
00:40:58.640
So there should be structure. That's probably what we respond to.
link |
00:41:00.960
You just blew my mind actually from the creator perspective.
link |
00:41:05.520
So that's a really interesting topic
link |
00:41:08.000
that probably most creators aren't taking advantage of, right?
link |
00:41:11.920
So I've recently got to interact with a few folks,
link |
00:41:15.920
YouTubers who are like obsessed with this idea of what do I do
link |
00:41:24.320
to make sure people keep watching the video?
link |
00:41:27.840
And they like look at the analytics of which point do people turn it off and so on.
link |
00:41:32.720
First of all, I don't think that's healthy,
link |
00:41:35.040
but it's because you can do it a little too much.
link |
00:41:38.320
But it is a really powerful tool for helping the creative process.
link |
00:41:42.240
You just made me realize you could do the same thing for creation of music.
link |
00:41:47.280
And so is that something you've looked into?
link |
00:41:51.360
And can you speak to how much opportunity there is for that kind of thing?
link |
00:41:54.800
Yeah, so I listened to the podcast with Siraj and I thought it was fantastic
link |
00:41:59.200
and I reacted to the same thing where he said he posted something in the morning,
link |
00:42:04.160
immediately watched the feedback where the drop off was
link |
00:42:06.560
and then responded to that in the afternoon,
link |
00:42:08.400
which is quite different from how people make podcasts, for example.
link |
00:42:12.080
Yes, exactly.
link |
00:42:12.880
I mean, the feedback loop is almost non existent.
link |
00:42:15.040
So if we back out one level, I think actually both for music and podcasts,
link |
00:42:21.120
which we also do at Spotify,
link |
00:42:23.600
I think there's a tremendous opportunity just for the creation workflow.
link |
00:42:27.440
And I think it's really interesting speaking to you who,
link |
00:42:30.960
because you're a musician, a developer, and a podcaster.
link |
00:42:34.720
If you think about those three different roles,
link |
00:42:36.560
if you make the leap as a musician,
link |
00:42:38.880
if you think about it as a software tool chain, really,
link |
00:42:42.960
your DAW with the stems, that's the IDE, right?
link |
00:42:46.320
That's where you work in source code format with what you're creating.
link |
00:42:51.120
Then you sit around and you play with that.
link |
00:42:52.320
And when you're happy, you compile that thing into some sort of AAC or MP3 or something.
link |
00:42:57.520
You do that because you get distribution.
link |
00:42:59.040
There are so many runtimes for that MP3 across the world in car stereos and stuff.
link |
00:43:02.240
So you kind of compile this executable,
link |
00:43:03.920
you ship it out in kind of an old fashioned boxed software analogy.
link |
00:43:09.280
And then you hope for the best, right?
link |
00:43:11.760
But as a software developer, you would never do that.
link |
00:43:16.080
First, you go on GitHub and you collaborate with other creators.
link |
00:43:19.440
And then you think it'd be crazy to just ship one version of your software
link |
00:43:22.800
without doing an A/B test, without any feedback loop.
link |
00:43:26.800
Issue tracking.
link |
00:43:28.320
Exactly.
link |
00:43:28.880
And then you would look at the feedback loop and say,
link |
00:43:31.760
try to optimize that thing, right?
link |
00:43:34.160
So I think if you think of it as a very specific software tool chain,
link |
00:43:38.880
it looks quite arcane, the tools that a music creator has
link |
00:43:42.880
versus what a software developer has.
link |
00:43:45.360
So that's kind of how we think about it.
link |
00:43:48.400
Why wouldn't a music creator have something like GitHub
link |
00:43:52.640
where you could collaborate much more easily?
link |
00:43:54.000
So we bought this company called Soundtrap,
link |
00:43:56.560
which has a kind of Google Docs for music approach, where you can collaborate
link |
00:44:01.680
with other people on the kind of source code format with stems.
link |
00:44:05.600
And I think introducing things like AI tools there to help you
link |
00:44:09.600
as you're creating music, both in helping you put accompaniment to your music,
link |
00:44:19.280
like drums or something, help you master and mix automatically,
link |
00:44:24.400
help you understand how this track will perform.
link |
00:44:26.720
Exactly what you would expect as a software developer.
link |
00:44:29.600
I think it makes a lot of sense.
link |
00:44:30.880
And I think the same goes for a podcaster.
link |
00:44:33.520
I think podcasters will expect to have the same kind of feedback loop
link |
00:44:36.320
that Siraj has, like, why wouldn't you?
link |
00:44:39.520
Maybe it's not healthy, but...
link |
00:44:41.520
Sorry, I wanted to criticize the fact because you can overdo it
link |
00:44:45.120
because a lot of the, and we're in a new era of that.
link |
00:44:49.760
So you can become addicted to it and therefore, what people say,
link |
00:44:56.400
you become a slave to the YouTube algorithm or sort of,
link |
00:45:00.640
it's always a danger of a new technology as opposed to say,
link |
00:45:04.400
if you're creating a song, becoming too obsessed about the intro riff to the song
link |
00:45:11.600
that keeps people listening versus actually the entirety of the creation process.
link |
00:45:15.440
It's a balance.
link |
00:45:16.160
But the fact that there's zero, I mean, you're blowing my mind right now,
link |
00:45:19.680
because you're completely right that there is no signal whatsoever.
link |
00:45:24.960
There's no feedback whatsoever on the creation process and music or podcasting,
link |
00:45:30.000
almost at all.
link |
00:45:31.680
And are you saying that Spotify is hoping to help create tools to, not tools, but...
link |
00:45:39.360
No, tools actually.
link |
00:45:41.680
Actually, tools.
link |
00:45:42.640
Tools for creators.
link |
00:45:47.200
Absolutely.
link |
00:45:48.320
So we've made some acquisitions the last few years around music creation,
link |
00:45:53.520
this company called Soundtrap, which is a digital audio workstation,
link |
00:45:57.280
but that is browser based.
link |
00:45:59.040
And their focus was really the Google Docs approach.
link |
00:46:01.200
We can collaborate with people much more easily than you could in previous tools.
link |
00:46:06.080
So we have some of these tools that we're working with that we want to make accessible
link |
00:46:09.280
and then we can connect it with our consumption data.
link |
00:46:12.960
We can create this feedback loop where we could help you understand,
link |
00:46:16.800
we could help you create and help you understand how you will perform.
link |
00:46:20.960
We also acquired this other company within podcasting called Anchor,
link |
00:46:24.560
which is one of the biggest podcasting tools, mobile focused.
link |
00:46:28.400
So really focused on simple creation or easy access to creation.
link |
00:46:32.800
But that also gives us this feedback loop.
link |
00:46:34.960
And even before that, we invested in something called Spotify for Artists
link |
00:46:40.640
and Spotify for Podcasters, which is an app that you can download,
link |
00:46:43.600
you can verify that you are that creator.
link |
00:46:46.000
And then you get things that software developers have had for years.
link |
00:46:51.680
You can see where, if you look at your podcast, for example, on Spotify
link |
00:46:55.520
or a song that you released, you can see how it's performing,
link |
00:46:58.720
which cities it's performing in, who's listening to it,
link |
00:47:01.280
what's the demographic breakdown.
link |
00:47:02.800
So similar in the sense that you can understand
link |
00:47:05.840
how you're actually doing on the platform.
link |
00:47:08.880
So we definitely want to build tools.
link |
00:47:10.480
I think you also interviewed the head of research for Adobe.
link |
00:47:15.920
And I think that's an, back to Photoshop that you like,
link |
00:47:19.680
I think that's an interesting analogy as well.
link |
00:47:22.800
Photoshop, I think, has been very innovative in helping photographers and artists.
link |
00:47:28.000
And I think there should be the same kind of tools for music creators,
link |
00:47:32.320
where you could get AI assistance, for example, as you're creating music,
link |
00:47:36.640
as you can do with Adobe, where you can,
link |
00:47:38.880
I want a sky over here and you can get help creating that sky.
link |
00:47:42.000
The really fascinating thing is what Adobe doesn't have
link |
00:47:47.520
is a distribution for the content you create.
link |
00:47:50.400
So you don't have the data of if I create, if I, you know,
link |
00:47:55.840
whatever creation I make in Photoshop or Premiere,
link |
00:47:59.360
I can't get like immediate feedback like I can on YouTube,
link |
00:48:02.480
for example, about the way people are responding.
link |
00:48:05.360
And if Spotify is creating those tools, that's a really exciting actually world.
link |
00:48:11.680
But let's talk a little about podcasts.
link |
00:48:16.720
So I have trouble talking to one person.
link |
00:48:20.000
So it's a bit terrifying and kind of hard to fathom,
link |
00:48:23.120
but on average, 60 to 100,000 people will listen to this episode.
link |
00:48:30.320
Okay, so it's intimidating.
link |
00:48:32.240
Yeah, it's intimidating.
link |
00:48:34.320
So I host it on Blubrry.
link |
00:48:36.720
I don't know if I'm pronouncing that correctly, actually.
link |
00:48:39.520
It looks like most people listen to it on Apple Podcasts,
link |
00:48:42.400
Castbox and Pocket Casts, and only about a thousand listen on Spotify.
link |
00:48:48.480
It's just my podcast, right?
link |
00:48:53.840
So where do you see a time when Spotify will dominate this?
link |
00:49:00.960
So Spotify is relatively new into this podcasting site.
link |
00:49:06.000
Yeah, in podcasting.
link |
00:49:07.520
What's the deal with podcasting and Spotify?
link |
00:49:10.800
How serious is Spotify about podcasting?
link |
00:49:13.440
Do you see a time where everybody would listen to, you know,
link |
00:49:16.800
probably a huge amount of people, majority perhaps listen to music on Spotify?
link |
00:49:22.400
Do you see a time when the same is true for podcasting?
link |
00:49:26.880
Well, I certainly hope so.
link |
00:49:28.560
That is our mission.
link |
00:49:29.360
Our mission as a company is actually to enable a million creators to live off of their art,
link |
00:49:34.160
and a billion people be inspired by it.
link |
00:49:35.840
And what I think is interesting about that mission is it actually puts the creators first,
link |
00:49:40.640
even though it started as a consumer focused company,
link |
00:49:43.040
and it's just to be able to live off of their art,
link |
00:49:44.800
not just make some money off of their art as well.
link |
00:49:47.840
So it's quite an ambitious project.
link |
00:49:51.920
So we think about creators of all kinds,
link |
00:49:53.920
and we kind of expanded our mission from being music to being audio a while back.
link |
00:50:01.120
And that's not so much because we think we made that decision.
link |
00:50:08.400
We think that decision was made for us.
link |
00:50:10.800
We think the world made that decision.
link |
00:50:12.960
Whether we like it or not, when you put in your headphones,
link |
00:50:16.560
you're going to make a choice between music and a new episode of your podcast or something else.
link |
00:50:25.440
We're in that world whether we like it or not.
link |
00:50:26.960
And that's how radio works.
link |
00:50:28.960
So we decided that we think it's about audio.
link |
00:50:32.320
You can see the rise of audiobooks and so forth.
link |
00:50:34.480
We think audio is a great opportunity.
link |
00:50:36.480
So we decided to enter it.
link |
00:50:37.600
And obviously, Apple and Apple Podcasts is absolutely dominating in podcasting,
link |
00:50:45.280
and we didn't have a single podcast only like two years ago.
link |
00:50:49.440
What we did though was we looked at this and said,
link |
00:50:54.560
can we bring something to this?
link |
00:50:56.480
We want to do this, but back to the original Spotify,
link |
00:50:59.200
we have to do something that consumers actually value to be able to do this.
link |
00:51:03.840
And the reason we've gone from not existing at all to being, by quite a wide margin,
link |
00:51:09.840
the second largest in podcast consumption, still with a wide gap to iTunes, but we're growing quite fast.
link |
00:51:16.480
I think it's because when we looked at the consumer problem,
link |
00:51:20.320
people said surprisingly that they wanted their podcasts and music in the same application.
link |
00:51:26.960
So what we did was we took a little bit of a different approach where we said,
link |
00:51:29.760
instead of building a separate podcast app,
link |
00:51:31.440
we thought, is there a consumer problem to solve here?
link |
00:51:33.680
Because the others are very successful already.
link |
00:51:35.680
And we thought there was in making a more seamless experience
link |
00:51:38.960
where you can have your podcast and your music in the same application,
link |
00:51:43.680
because we think it's audio to you.
link |
00:51:45.440
And that has been successful.
link |
00:51:46.800
And that meant that we actually had 200 million people to offer this to instead of starting from zero.
link |
00:51:52.400
So I think we have a good chance because we're taking a different approach than the competition.
link |
00:51:56.880
And back to the other thing I mentioned about
link |
00:51:59.120
creators, because we're looking at the end to end flow.
link |
00:52:02.800
I think there's a tremendous amount of innovation to do around podcast as a format.
link |
00:52:07.040
When we have creation tools and consumption, I think we could start improving what podcasting is.
link |
00:52:12.640
I mean, podcast is this opaque, big, like one, two hour file that you're streaming,
link |
00:52:19.520
which really doesn't make that much sense in 2019, that it's not interactive.
link |
00:52:24.240
There's no feedback loops, nothing like that.
link |
00:52:26.000
So I think if we're going to win, it's going to have to be because we build a better product
link |
00:52:29.760
for creators and for consumers.
link |
00:52:32.480
So we'll see, but it's certainly our goal.
link |
00:52:34.640
We have a long way to go.
link |
00:52:36.240
Well, the creators part is really exciting.
link |
00:52:38.160
You already, you got me hooked there.
link |
00:52:40.160
Cause the only stats I have,
link |
00:52:42.320
Blueberry just recently added the stats of whether it's listened to the end or not.
link |
00:52:48.560
And that's like a huge improvement, but that's still
link |
00:52:52.320
nowhere to where you could possibly go in terms of statistics.
link |
00:52:54.960
You just download the Spotify for Podcasters app and verify.
link |
00:52:57.200
And then, then you'll know where people dropped out in this episode.
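
As a rough illustration of the drop-off statistic mentioned here, this is a minimal sketch of a retention curve. It assumes the only data available is the position (in seconds) at which each listener stopped; the function name `retention_curve` and its parameters are hypothetical and do not come from any Spotify tool.

```python
def retention_curve(stop_times_sec, episode_length_sec, bucket_sec=60):
    """Return the fraction of listeners still listening at the start of each bucket.

    stop_times_sec: one entry per listener, the second at which they stopped.
    All names here are illustrative assumptions, not a real API.
    """
    if not stop_times_sec:
        return []
    total = len(stop_times_sec)
    curve = []
    for bucket_start in range(0, episode_length_sec, bucket_sec):
        # A listener counts as retained if they stopped at or after this point.
        still_listening = sum(1 for t in stop_times_sec if t >= bucket_start)
        curve.append(still_listening / total)
    return curve


# Four hypothetical listeners of a three-minute episode.
curve = retention_curve([30, 90, 180, 180], episode_length_sec=180, bucket_sec=60)
```

A sharp fall between two neighbouring buckets is the kind of signal a creator could use to see where an episode loses its audience.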
link |
00:52:59.920
Oh, wow.
link |
00:53:00.400
Okay.
link |
00:53:01.600
The moment I started talking.
link |
00:53:02.800
Okay.
link |
00:53:03.360
I might be depressed by this, but okay.
link |
00:53:06.800
So one, um, one other question is the original Spotify for music.
link |
00:53:14.400
And I have a question about podcasting along this line,
link |
00:53:19.120
and that is the idea of albums.
link |
00:53:23.440
I have, uh, music aficionado, uh, friends who are really,
link |
00:53:29.440
uh, big fans of music often, uh, really enjoy albums,
link |
00:53:33.280
listening to entire albums of, of an artist.
link |
00:53:36.400
Correct me if I'm wrong, but I feel like Spotify has helped
link |
00:53:40.960
replace the idea of an album with playlists.
link |
00:53:44.240
So you create your own albums.
link |
00:53:46.000
It's, it's kind of the way, at least I've experienced music
link |
00:53:48.880
and I've really enjoyed it that way.
link |
00:53:51.040
One of the things that was missing in podcasting for me,
link |
00:53:54.880
I don't know if it's missing.
link |
00:53:56.320
I don't know.
link |
00:53:56.880
It's an open question for me, but the way I listen to podcasts is
link |
00:53:59.920
the way I would listen to albums.
link |
00:54:02.080
So I take The Joe Rogan Experience and that's an album.
link |
00:54:05.600
And I listen, you know, I, I put that on and I listen to one
link |
00:54:09.680
episode after the next, then there's a sequence and so on.
link |
00:54:12.640
Is there room for doing what you did for music or doing what
link |
00:54:17.520
Spotify did for music, but, uh, creating playlists, sort of, uh,
link |
00:54:22.880
this kind of playlisting idea of breaking apart from podcasting,
link |
00:54:27.120
uh, from individual podcasts and creating kind of, uh, this interplay
link |
00:54:31.680
or, or have you thought about that space?
link |
00:54:33.760
Uh, it's a great question.
link |
00:54:34.800
So I think in, um, in music, you're right.
link |
00:54:38.720
Basically you bought an album.
link |
00:54:39.920
So it was like, you bought a small catalog of like 10 tracks, right?
link |
00:54:42.800
It was, it was, again, it was actually a lot of, a lot of consumption.
link |
00:54:46.720
You think it's about what you like, but it's based on the business model.
link |
00:54:49.680
So you paid for this 10 track service and then you listened to that for a while.
link |
00:54:54.240
And then when, when everything was flat priced, you tended to listen differently.
link |
00:54:58.480
Now, so, so I think the, I think the album is still tremendously important.
link |
00:55:01.360
That's why we have it and you can save albums and so forth.
link |
00:55:03.360
And you have a huge amount of people who really listen according to albums.
link |
00:55:06.480
And I like that because it is a creator format, you can tell a longer story
link |
00:55:10.240
over several tracks.
link |
00:55:12.000
And so some people listen to just one track.
link |
00:55:13.840
Some people actually want to hear that whole story.
link |
00:55:17.520
Now in podcast, I think, I think it's different.
link |
00:55:21.600
You can argue that podcasts might be more like shows on Netflix.
link |
00:55:25.600
You have like a full season of Narcos and you're probably not going to do like
link |
00:55:29.200
one episode of Narcos and then one of House of Cards, like, like, you know,
link |
00:55:33.440
there's a narrative there.
link |
00:55:34.480
And you, you, you love the cast and you love these characters.
link |
00:55:37.440
So I think people will, people love shows.
link |
00:55:42.000
And I think they will, they will listen to those shows.
link |
00:55:44.880
I do think you follow a bunch of shows at the same time.
link |
00:55:46.880
So there's certainly an opportunity to bring you the latest episode of, you
link |
00:55:50.480
know, whatever the five, six, 10 things that, that you're into.
link |
00:55:54.560
But, but I think, I think people are going to listen to specific hosts and love
link |
00:56:00.000
those hosts for a long time.
link |
00:56:01.600
Because I think there's something different with podcasts where, um, this
link |
00:56:06.880
format, the experience of the audience is
link |
00:56:11.280
actually sitting here right between us.
link |
00:56:13.360
Whereas if you look at something on TV, the audio actually would come from, you
link |
00:56:16.960
would sit over there and the audio would come to you from both of us as if you
link |
00:56:20.080
were watching, not as you were part of the conversation.
link |
00:56:22.560
So my experience is having listened to podcasts like yours and Joe Rogan is, I
link |
00:56:27.280
feel like I know all of these people.
link |
00:56:28.720
They, they have a lot of experience.
link |
00:56:30.240
I know all of these people, they have no idea who I am, but I feel like I've
link |
00:56:33.600
listened to so many hours of that.
link |
00:56:35.040
It's very different from me watching a, watching like a TV show or an interview.
link |
00:56:39.440
So I think you, you kind of, um, fall in love with people and, um, experience
link |
00:56:44.560
in a, in a different way.
link |
00:56:45.760
So I think, I think shows and hosts are going to be very, uh, very important.
link |
00:56:49.280
I don't think that's going to go away into some sort of thing where, where you
link |
00:56:52.160
don't even know who you're listening to.
link |
00:56:53.360
I don't think that's going to happen.
link |
00:56:55.040
What I do think is I think there's a tremendous discovery opportunity in
link |
00:56:59.760
podcast because the catalog is growing quite quickly.
link |
00:57:03.920
And I think podcast is only a few, like five, 600,000 shows right now.
link |
00:57:11.360
If you look back to YouTube as another analogy of creators, no one really knows
link |
00:57:16.080
if you would lift the lid on YouTube, but it's probably billions of episodes.
link |
00:57:21.120
And so I think the podcast catalog would probably grow tremendously because the
link |
00:57:24.960
creation tools are getting easier.
link |
00:57:27.040
And then you're going to have this discovery opportunity that I think is
link |
00:57:30.800
really big.
link |
00:57:31.280
So, so a lot of people tell me that they love their shows, but discovering
link |
00:57:35.600
podcasts kind of sucks.
link |
00:57:36.880
It's really hard to get into a new show.
link |
00:57:38.720
They're usually quite long.
link |
00:57:39.840
It's a big time investment.
link |
00:57:40.960
So I think there's plenty of opportunity in the discovery part.
link |
00:57:45.600
Yeah, for sure.
link |
00:57:46.560
A hundred percent in, in even the dumbest, there's so many low hanging fruit too.
link |
00:57:51.200
Uh, for example, just knowing what episode to listen to first to try out a podcast.
link |
00:57:59.680
Exactly.
link |
00:58:00.400
Uh, because most podcasts don't have an order to them.
link |
00:58:03.920
Uh, they, they can be listened to out of order and sorry to say some episodes are better
link |
00:58:10.880
than others.
link |
00:58:12.560
So some episodes of Joe Rogan are better than others.
link |
00:58:15.520
And it's nice to know, uh, which you should listen to, to try it out.
link |
00:58:20.400
And there's, uh, as far as I know, almost no information, uh, in terms of like, uh,
link |
00:58:26.320
upvotes on how good an episode is.
link |
00:58:28.640
Exactly.
link |
00:58:29.280
So I think part of the problem is, uh, you, it's kind of like music.
link |
00:58:33.520
There isn't one answer.
link |
00:58:34.480
People use music for different things and there's actually many different types of music.
link |
00:58:37.440
There's workout music and there's classical piano music and focus music and,
link |
00:58:41.200
and, and, uh, so forth.
link |
00:58:42.640
I think the same with podcasts.
link |
00:58:44.080
Some podcasts are sequential.
link |
00:58:45.360
They're supposed to be listened to in, in order.
link |
00:58:48.400
It's actually, it's actually telling a narrative.
link |
00:58:51.040
Some podcasts are one topic, uh, kind of like yours, but different guests.
link |
00:58:55.840
So you could jump in anywhere.
link |
00:58:57.280
Some podcasts actually have completely different topics.
link |
00:58:59.440
And for those podcasts, it might be that I want, you know, we should recommend one episode
link |
00:59:04.560
because it's about AI from someone, but then they talk about something that you're not
link |
00:59:09.280
interested in the rest of the episodes.
link |
00:59:10.880
So I think our, what we're spending a lot of time on now is just first understanding
link |
00:59:15.040
the domain and creating kind of the knowledge graph of how do these objects relate and how
link |
00:59:21.520
do people consume.
link |
00:59:22.240
And I think we'll find that it's going to be, it's going to be different.
link |
00:59:26.000
I'm excited because you're the, uh, Spotify is the first people I'm aware of that are
link |
00:59:32.240
trying to do this for podcasting.
link |
00:59:34.800
Podcasting has been like a wild west up until now.
link |
00:59:38.240
It's been a very, we want to be very careful though, because it's been a very good wild
link |
00:59:43.120
west, I think it's this fragile ecosystem.
link |
00:59:46.320
And I, we want to make sure that you don't barge in and say like, Oh, we're going to
link |
00:59:52.080
internetize this thing.
link |
00:59:53.440
And you have to think about the creators.
link |
00:59:56.640
You have to understand how they get distribution today, who listens, how they make money
link |
01:00:01.040
today, try to, you know, make sure that their business model works, that they understand.
link |
01:00:06.080
I think it's back to doing something to improve their products, like feedback loops and
link |
01:00:10.880
distribution.
link |
01:00:11.440
So jumping back, in terms of this fascinating world of recommender systems and listening
link |
01:00:17.280
to music and using machine learning to analyze things, do you think it's better to what
link |
01:00:24.320
currently, correct me if I'm wrong, but currently Spotify lets people pick what they listen
link |
01:00:30.160
to, for the most part.
link |
01:00:31.680
There's a discovery process, but you kind of organize playlists.
link |
01:00:35.040
Is it better to let people pick what they listen to or recommend what they should listen
link |
01:00:39.840
to something like stations by Spotify that I saw that you're playing around with?
link |
01:00:44.960
Maybe you can tell me what's the status of that.
link |
01:00:47.520
This is a Pandora style app that just kind of, as opposed to you select the music you
link |
01:00:52.880
listen to, it kind of feeds you the music you listen to.
link |
01:00:58.400
What's the status of stations by Spotify?
link |
01:01:00.800
What's its future?
link |
01:01:01.920
The story of Spotify, as we have grown, has been that we made it more accessible to different
link |
01:01:07.040
audiences and stations is another one of those where the question is, some people want to
link |
01:01:14.000
be very specific.
link |
01:01:14.720
They actually want to hear Stairway to Heaven right now, and that needs to be very easy to do.
link |
01:01:19.760
And some people, or even the same person, at some point might say, I want to feel upbeat
link |
01:01:26.080
or I want to feel happy or I want songs to sing in the car.
link |
01:01:32.800
So they put in the information at a very different level and then we need to translate that into
link |
01:01:38.720
what that means musically.
link |
01:01:40.560
So stations is a test to create like a consumption input vector that is much simpler where you
link |
01:01:45.440
can just tune it a little bit and see if that increases the overall reach.
link |
01:01:49.520
But we're trying to kind of serve the entire gamut of super advanced so called music aficionados
link |
01:01:56.000
all the way to people who they love listening to music but it's not their number one priority
link |
01:02:02.560
in life.
link |
01:02:03.200
They're not going to sit and follow every new release from every new artist.
link |
01:02:06.160
They need to be able to influence music at a different level.
link |
01:02:11.120
So you can think of it as different products and I think one of the interesting things
link |
01:02:17.360
to answer your question on if it's better to let the user choose or to play, I think
link |
01:02:22.080
the answer is the challenge when machine learning kind of came along, there was a lot of thinking
link |
01:02:28.720
about what does product development mean in a machine learning context.
link |
01:02:33.920
People like Andrew Ng, for example, when he went to Baidu, he started doing a lot of practical
link |
01:02:38.880
machine learning, coming from academia, and he thought a lot about this and he had this notion
link |
01:02:43.280
that a product manager, designer and engineer, they used to work around this wireframe to
link |
01:02:47.760
kind of describe what the product should look like.
link |
01:02:49.440
It was something to talk about. But when you're doing a chatbot or a playlist, what are you
link |
01:02:54.080
going to say?
link |
01:02:54.640
It should be good.
link |
01:02:55.520
That's not a good product description.
link |
01:02:57.360
So how do you do that?
link |
01:02:58.400
And he came up with this notion that the test set is the new wireframe.
link |
01:03:03.120
The job of the product manager is to source a good test set that is representative of
link |
01:03:06.960
what, like if you say I want a playlist that is songs to sing in the car.
link |
01:03:11.520
The job of the product manager is to go and source a good test set of what that means.
link |
01:03:15.360
So then you can work with engineering to have algorithms to try to produce that.
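
To make the "test set is the new wireframe" idea concrete, here is a minimal sketch of how a candidate algorithm's output could be scored against an editor-sourced test set. Precision at k is used as one plausible metric; the conversation does not say which metric is actually used, and the track identifiers and the name `precision_at_k` are illustrative assumptions.

```python
def precision_at_k(recommended, test_set, k):
    """Share of the top-k recommended tracks that appear in the curated test set.

    recommended: ranked list of track ids produced by some candidate algorithm.
    test_set: set of track ids a product manager sourced as good examples.
    """
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for track in top_k if track in test_set)
    return hits / len(top_k)


# Hypothetical curated examples of "songs to sing in the car".
car_singalong_test_set = {"track_a", "track_b", "track_c"}
# Hypothetical ranked output of an algorithm under evaluation.
candidate_output = ["track_a", "track_x", "track_b", "track_y"]

score = precision_at_k(candidate_output, car_singalong_test_set, k=4)
```

The point of the sketch is the division of labour: the test set encodes what the product should feel like, and engineering iterates on whatever produces `candidate_output`.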
link |
01:03:20.000
So we try to think a lot about how to structure product development for a machine learning
link |
01:03:25.600
age.
link |
01:03:26.320
And what we discovered was that a lot of it is actually in the expectation.
link |
01:03:30.560
And you can go two ways.
link |
01:03:33.120
So let's say that if you set the expectation with the user that this is a discovery product,
link |
01:03:40.880
like Discover Weekly, you're actually setting the expectation that most of what we show
link |
01:03:45.280
you will not be relevant.
link |
01:03:46.800
When you're in the discovery process, you're going to accept that actually if you find
link |
01:03:50.400
one gem every Monday that you totally love, you're probably going to be happy.
link |
01:03:55.200
Even though, statistically, one out of 10 is terrible or one out of 20 is terrible,
link |
01:04:00.240
from a user point of view, because the setting was discovery, it's fine.
link |
01:04:03.440
Sorry to interrupt real quick.
link |
01:04:05.360
I just actually learned about Discover Weekly, which is a Spotify, I don't know, it's a
link |
01:04:11.600
feature of Spotify that shows you cool songs to listen to.
link |
01:04:16.640
Maybe I can do issue tracking.
link |
01:04:18.160
I couldn't find it on my Spotify app.
link |
01:04:20.640
It's in your library.
link |
01:04:21.680
It's in the library.
link |
01:04:22.640
It's in the list of library.
link |
01:04:23.760
Because I was like, whoa, this is cool.
link |
01:04:25.040
I didn't know this existed.
link |
01:04:26.320
And I tried to find it.
link |
01:04:27.440
But okay.
link |
01:04:28.800
I will show it to you and feedback to our product team.
link |
01:04:31.920
There you go.
link |
01:04:32.720
But yeah, so yeah, sorry.
link |
01:04:34.480
Just to mention the expectation there is basically that you're going to discover new songs.
link |
01:04:42.160
Yeah.
link |
01:04:42.400
So then you can be quite adventurous in the recommendations you do.
link |
01:04:47.920
But we have another product called Daily Mix, which kind of implies that these are only
link |
01:04:53.120
going to be your favorites.
link |
01:04:54.560
So if you have one out of 10 that is good and nine out of 10 that doesn't work for you,
link |
01:04:58.320
you're going to think it's a horrible product.
link |
01:04:59.600
So actually a lot of the product development we learned over the years is about setting
link |
01:05:03.040
the right expectations.
link |
01:05:04.080
So for Daily Mix, you know, algorithmically, we would pick among things that feel very
link |
01:05:09.680
safe in your taste space.
link |
01:05:11.280
Whereas Discover Weekly, we go kind of wild because the expectation is most of this is
link |
01:05:15.520
not going to.
link |
01:05:16.400
So a lot of that, a lot of to answer your question there, a lot of should you let the
link |
01:05:20.960
user pick or not?
link |
01:05:21.600
It depends.
link |
01:05:23.360
We have some products where the whole point is that the user can click play, put the phone
link |
01:05:26.720
in the pocket, and it should be really good music for like an hour.
link |
01:05:30.000
We have other products where you probably need to say like, no, no, save, no, no.
link |
01:05:35.120
And it's very interactive.
link |
01:05:37.040
I see.
link |
01:05:37.440
That makes sense.
link |
01:05:38.000
And then the radio product, the stations product is one of these like click play, put in your
link |
01:05:41.920
pocket for hours.
link |
01:05:43.360
That's really interesting.
link |
01:05:44.160
So you're thinking of different test sets for different users and trying to create products
link |
01:05:50.880
that sort of optimize for those test sets that represent a specific set of users.
link |
01:05:57.840
Yes, I think one thing that I think is interesting is we invested quite heavily in editorial
link |
01:06:06.160
in people creating playlists using statistical data.
link |
01:06:09.520
And that was successful for us.
link |
01:06:10.800
And then we also invested in machine learning.
link |
01:06:13.600
And for the longest time within Spotify and within the rest of the industry, there was
link |
01:06:18.000
always this narrative of humans versus the machine, algo versus editorial.
link |
01:06:23.360
And editors would say like, well, if I had that data, if I could see your
link |
01:06:27.600
playlisting history and I made a choice for you, I would have made a better choice.
link |
01:06:31.680
And they would have because they're much smarter than these algorithms.
link |
01:06:35.200
The human is incredibly smart compared to our algorithms.
link |
01:06:38.880
They can take culture into account and so forth.
link |
01:06:41.440
The problem is that they can't make 200 million decisions per hour for every user that logs
link |
01:06:47.600
in.
link |
01:06:47.680
So the algo may be not as sophisticated, but much more efficient.
link |
01:06:51.760
So there was this contradiction.
link |
01:06:54.480
But then a few years ago, we started focusing on this kind of human in the loop thinking
link |
01:07:00.160
around machine learning.
link |
01:07:01.280
And we actually coined an internal term for it called algotorial, a combination of algorithms
link |
01:07:07.120
and editors, where if we take a concrete example, you think of the editor, this paid
link |
01:07:15.040
expert that we have that's really good at something like soul, hip hop, EDM, something,
link |
01:07:20.400
right?
link |
01:07:20.720
They're a true expert, known in the industry.
link |
01:07:22.800
So they have all the cultural knowledge.
link |
01:07:24.480
You think of them as the product manager.
link |
01:07:26.560
And you say that, let's say that you want to create a, you think that there's a product
link |
01:07:32.880
need in the world for something like songs to sing in the car or songs to sing in the
link |
01:07:36.160
shower.
link |
01:07:36.560
I'm taking that example because it exists.
link |
01:07:38.400
People love to scream songs in the car when they drive, right?
link |
01:07:42.560
So you want to create that product and you have this product manager who's a musical
link |
01:07:45.520
expert.
link |
01:07:46.640
They create, they come up with a concept, like I think this is a missing thing in humanity,
link |
01:07:50.800
like a playlist called songs to sing in the car.
link |
01:07:53.920
They create the framing, the image, the title, and they create a test set of, they create
link |
01:07:59.840
a group of songs, like a few thousand songs out of the catalog that they manually curate
link |
01:08:04.480
that are known songs that are great to sing in the car.
link |
01:08:07.520
And they can take like True Romance into account.
link |
01:08:09.840
They understand things that our algorithms do not at all.
link |
01:08:12.400
So they have this huge set of tracks.
link |
01:08:14.480
Then when we deliver that to you, we look at your taste vectors and you get the 20 tracks
link |
01:08:19.600
that are songs to sing in the car in your taste.
link |
01:08:22.560
So you have personalization and editorial input in the same process, if that makes sense.
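
The algotorial flow described above can be sketched roughly as follows: an editor supplies a curated pool of tracks, and each listener gets the subset closest to their taste vector. This is only an illustration under stated assumptions; cosine similarity, the three-dimensional vectors, and the names `cosine` and `personalize_pool` are hypothetical and are not confirmed by the conversation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def personalize_pool(taste_vector, pool, k=20):
    """Return the k tracks from the editor-curated pool closest to the taste vector.

    pool: dict mapping track id -> embedding (assumed to share the taste vector's space).
    """
    ranked = sorted(pool, key=lambda t: cosine(taste_vector, pool[t]), reverse=True)
    return ranked[:k]


# Hypothetical editor-curated pool for "songs to sing in the car".
editor_pool = {
    "track_a": [0.9, 0.1, 0.0],
    "track_b": [0.1, 0.9, 0.0],
    "track_c": [0.7, 0.3, 0.1],
}
listener_taste = [1.0, 0.2, 0.0]

top_tracks = personalize_pool(listener_taste, editor_pool, k=2)
```

The editor decides what belongs in the pool at all; the ranking step only decides which of those tracks a given listener sees.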
link |
01:08:29.520
Yeah, it makes total sense.
link |
01:08:30.880
And I have several questions around that.
link |
01:08:32.480
This is like fascinating.
link |
01:08:36.080
Okay.
link |
01:08:36.560
So first, it is a little bit surprising to me that the world expert humans are outperforming
link |
01:08:44.720
machines at specifying songs to sing in the car.
link |
01:08:50.960
So maybe you could talk to that a little bit.
link |
01:08:53.680
I don't know if you can put it into words, but what is it?
link |
01:08:57.760
How difficult is this problem?
link |
01:09:01.680
Do you really, I guess what I'm trying to ask is there, how difficult is it to encode
link |
01:09:06.720
the cultural references, the context of the song, the artists, all those things together?
link |
01:09:14.640
Can machine learning really not do that?
link |
01:09:17.360
I mean, I think machine learning is great at replicating patterns if you have the patterns.
link |
01:09:23.040
But if you try to write with me a spec of what the "great song to sing in the car"
link |
01:09:27.680
definition is, is it loud?
link |
01:09:30.320
Does it have many choruses?
link |
01:09:31.520
Should it have been in movies?
link |
01:09:32.800
It quickly gets incredibly complicated, right?
link |
01:09:35.680
Yeah.
link |
01:09:36.880
And a lot of it may not be in the structure of the song or the title.
link |
01:09:40.960
It could be cultural references because, you know, it was a history.
link |
01:09:44.880
So the definition problems quickly get, and I think that was the insight of Andrew Ng
link |
01:09:51.360
when he said the job of the product manager is to understand these things that algorithms
link |
01:09:55.440
don't and then define what that looks like.
link |
01:09:58.640
And then you have something to train towards, right?
link |
01:10:00.880
Then you have kind of the test set.
link |
01:10:02.720
And then so today the editors create this pool of tracks and then we personalize.
link |
01:10:06.960
You could easily imagine that once you have this set, you could have some automatic exploration
link |
01:10:11.120
on the rest of the catalog because then you understand what it is.
link |
01:10:14.480
And then the other side of it, where machine learning does help, is this taste vector.
link |
01:10:20.560
How hard is it to construct a vector that represents the things an individual human
link |
01:10:26.960
likes, this human preference?
link |
01:10:30.080
So you can, you know, music isn't like, it's not like Amazon, like things you usually buy.
link |
01:10:38.320
Music seems more amorphous.
link |
01:10:39.920
Like it's this thing that's hard to specify.
link |
01:10:42.560
Like what is, you know, if you look at my playlist, what is the music that I love?
link |
01:10:48.080
It's harder.
link |
01:10:49.360
It seems to be much more difficult to specify concretely.
link |
01:10:54.080
So how hard is it to build a taste vector?
link |
01:10:57.120
It is very hard in the sense that you need a lot of data.
link |
01:11:00.720
And I think what we found was that, so it's not a stationary problem.
link |
01:11:06.240
It changes over time.
link |
01:11:08.720
And so we've gone through the journey of, if you've done a lot of computer vision,
link |
01:11:15.680
obviously I've done a bunch of computer vision in my past.
link |
01:11:18.320
And we started kind of with the handcrafted heuristics for, you know, this is kind of
link |
01:11:24.160
indie music.
link |
01:11:24.800
This is this.
link |
01:11:25.360
And if you consume this, you'd probably like this.
link |
01:11:27.440
So we have, we started there and we have some of that still.
link |
01:11:31.200
Then what was interesting about the playlist data was that you could find these latent
link |
01:11:34.720
things that wouldn't necessarily even make sense to you.
link |
01:11:38.800
That could even capture maybe cultural references because they co-occurred.
link |
01:11:42.880
Things that wouldn't have appeared kind of mechanistically either in the content or so
link |
01:11:48.160
forth.
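
As a simplified illustration of the co-occurrence signal described here, the sketch below counts how often two tracks appear in the same playlist and uses those counts to find a track's closest neighbours. It deliberately stops short of learning embeddings or reducing dimensionality; the playlist data and the function names are made up for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(playlists):
    """Count how many playlists contain each unordered pair of tracks."""
    counts = Counter()
    for playlist in playlists:
        # sorted(set(...)) gives each pair a canonical order and ignores repeats.
        for a, b in combinations(sorted(set(playlist)), 2):
            counts[(a, b)] += 1
    return counts


def most_cooccurring(track, counts, n=3):
    """Return up to n tracks that share the most playlists with the given track."""
    scores = Counter()
    for (a, b), c in counts.items():
        if a == track:
            scores[b] += c
        elif b == track:
            scores[a] += c
    return [t for t, _ in scores.most_common(n)]


# Hypothetical user playlists.
playlists = [
    ["song_1", "song_2", "song_3"],
    ["song_1", "song_2"],
    ["song_2", "song_4"],
]
counts = cooccurrence_counts(playlists)
neighbours = most_cooccurring("song_1", counts, n=1)
```

Nothing in the counts depends on audio content or metadata, which is why a signal like this can pick up relationships that are cultural rather than musical.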
link |
01:11:48.400
So I think that, I think the core assumption is that there are patterns in almost
link |
01:12:01.280
everything.
link |
01:12:02.640
And if there are patterns, these embedding techniques are getting better and better now.
link |
01:12:06.960
Now, as everyone else, we're also using kind of deep embeddings where you can encode
link |
01:12:12.400
binary values and so forth.
link |
01:12:14.400
And what I think is interesting is this process to try to find things that do not
link |
01:12:21.280
necessarily, you wouldn't actually have guessed.
link |
01:12:23.920
So it is very hard in an engineering sense to find the right dimensions.
link |
01:12:28.560
It's an incredible scalability problem to do for hundreds of millions of users and to
link |
01:12:33.920
update it every day.
link |
01:12:35.920
But in theory, embeddings aren't that complicated.
link |
01:12:42.160
The fact that you try to find some principal components or something like that, dimensionality
link |
01:12:46.240
reduction and so forth.
link |
01:12:47.040
So the theory, I guess, is easy.
link |
01:12:48.240
The practice is very, very hard.
link |
01:12:50.480
And it's a huge engineering challenge.
link |
01:12:53.120
But fortunately, we have some amazing both research and engineering teams in this space.
link |
01:12:58.400
Yeah, I guess the question is all, I mean, it's similar.
link |
01:13:03.200
I deal with it in the autonomous vehicle space.
link |
01:13:05.360
The question is how hard is driving?
link |
01:13:07.680
And here, basically, the question is one of edge cases.
link |
01:13:14.560
So embedding probably works, not probably, but I would imagine works well in a lot of
link |
01:13:22.240
cases.
link |
01:13:24.000
So there's a bunch of questions that arise then.
link |
01:13:25.840
So do song preferences, does your taste vector depend on context, like mood, right?
link |
01:13:33.760
So there's different moods, and so how does that take in it?
link |
01:13:41.840
Is it possible to take that as a consideration?
link |
01:13:44.320
Or do you just leave that as an interface problem that allows the user to just control it?
link |
01:13:49.840
So when I'm looking for workout music, I kind of specify it by choosing certain playlists,
link |
01:13:55.440
doing certain searches.
link |
01:13:56.560
Yeah, so that's a great point.
link |
01:13:58.560
Back to the product development.
link |
01:14:00.080
You could try to spend a few years trying to predict which mood you're in automatically
link |
01:14:04.480
when you open Spotify, or you create a tab which is happy and sad, right?
link |
01:14:08.320
And you're going to be right 100% of the time with one click.
link |
01:14:10.880
Now, it's probably much better to let the user tell you if they're happy or sad, or
link |
01:14:14.880
if they want to work out.
link |
01:14:15.840
On the other hand, if your user interface becomes 2,000 tabs, you're introducing so
link |
01:14:20.480
much friction that no one will use the product.
link |
01:14:22.080
So then you have to get better, so it's this thing where I think maybe it was, I don't
link |
01:14:32.640
remember who coined it, but it's called fault tolerant UIs, right?
link |
01:14:35.040
You build a UI that is tolerant of being wrong, and then you can be much less right in your
link |
01:14:42.000
algorithms.
link |
01:14:43.120
So we've had to learn a lot of that.
link |
01:14:45.440
Building the right UI that fits where the machine learning is, and a great discovery
link |
01:14:52.160
there, which was by the teams during one of our hack days, was this thing of taking discovery,
link |
01:14:58.720
packaging it into a playlist, and saying that these are new tracks that we think you might
link |
01:15:04.880
like based on this.
link |
01:15:05.920
And setting the right expectation made it a great product.
link |
01:15:09.440
So I think we have this benefit that, for example, Tesla doesn't have that we can change
link |
01:15:15.920
the expectation.
link |
01:15:16.800
We can build a fault tolerant setting.
link |
01:15:18.640
It's very hard to be fault tolerant when you're driving at 100 miles per hour or something.
link |
01:15:23.760
And we have the luxury of being able to be wrong if we have the right UI,
link |
01:15:30.000
which gives us different abilities to take more risk.
link |
01:15:33.440
So I actually think the self driving problem is much harder.
link |
01:15:37.680
Oh, yeah, for sure.
link |
01:15:39.680
It's much less fun because people die.
link |
01:15:44.240
Exactly.
link |
01:15:45.200
And in Spotify, it's such a more fun problem because failure is beautiful in a way.
link |
01:15:55.040
It leads to exploration.
link |
01:15:56.320
So it's a really fun reinforcement learning problem.
link |
01:15:58.640
The worst case scenario is you get these WTF tweets like, how did I get this?
link |
01:16:02.800
This song, yeah.
link |
01:16:03.600
Which is a lot better than the self driving.
link |
01:16:05.440
Exactly, so what's the feedback that a user, what's the signal that a user provides into
link |
01:16:14.400
the system?
link |
01:16:15.440
So you mentioned skipping.
link |
01:16:19.360
What is like the strongest signal?
link |
01:16:22.000
You didn't mention clicking like.
link |
01:16:24.800
So we have a few signals that are important.
link |
01:16:27.600
Obviously playing, playing through.
link |
01:16:30.240
So one of the benefits of music, actually, even compared to podcasts or movies is the
link |
01:16:36.560
object itself is really only about three minutes.
link |
01:16:39.280
So you get a lot of chances to recommend and the feedback loop is every three minutes instead
link |
01:16:44.320
of every two hours or something.
link |
01:16:45.760
So you actually get kind of noisy, but quite fast feedback.
link |
01:16:50.880
And so you can see if people play through, which is the inverse of skip really.
link |
01:16:55.200
That's an important signal.
link |
01:16:56.560
On the other hand, much of the consumption happens when your phone is in your pocket.
link |
01:17:00.320
Maybe you're running or driving or you're playing on a speaker.
link |
01:17:03.040
And so you not skipping doesn't mean that you love that song.
link |
01:17:05.600
It may be that it wasn't bad enough that you would walk up and skip.
link |
01:17:08.960
So it's a noisy signal.
link |
01:17:10.560
Then we have the equivalent of the like, which is you saved it to your library.
link |
01:17:14.000
That's a pretty strong signal of affection.
link |
01:17:16.720
And then we have the more explicit signal of playlisting.
link |
01:17:21.280
Like you took the time to create a playlist, you put it in there.
link |
01:17:23.920
There's very little chance that, if you took all that trouble, this is not a really
link |
01:17:28.960
important track to you.
link |
01:17:30.480
And then we understand also what are the tracks it relates to.
link |
01:17:34.000
So we have the playlisting, we have the like, and then we have the listening or skip.
link |
01:17:39.120
And you have to have very different approaches to all of them because of different levels
link |
01:17:43.360
of noise.
link |
01:17:44.400
One is very voluminous, but noisy, and the other is rare, but you can probably trust it.
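The trade-off described here, frequent but noisy signals versus rare but trustworthy ones, can be sketched as a weighted sum per track. This is only an illustration: the signal names, the weights, and the `affinity_scores` helper are assumptions made for the example, not anything Spotify has described.

```python
from dataclasses import dataclass

# Hypothetical per-signal weights (illustrative assumptions only):
# rare, deliberate actions count for more than frequent, noisy ones.
SIGNAL_WEIGHTS = {
    "play_through": 0.2,   # voluminous but noisy (phone may be in a pocket)
    "skip": -0.3,          # noisy negative signal
    "save": 1.0,           # the "like": saved to the library
    "playlist_add": 2.0,   # explicit and effortful, so trusted the most
}


@dataclass(frozen=True)
class Event:
    track_id: str
    signal: str


def affinity_scores(events):
    """Sum the weighted signals per track into a rough affinity score."""
    scores = {}
    for event in events:
        weight = SIGNAL_WEIGHTS[event.signal]
        scores[event.track_id] = scores.get(event.track_id, 0.0) + weight
    return scores


events = (
    [Event("track_a", "play_through")] * 5   # many weak positive signals
    + [Event("track_b", "playlist_add")]     # one strong signal
    + [Event("track_c", "skip")] * 2         # a couple of weak negatives
)
scores = affinity_scores(events)
```

In this toy setup a single playlist add outweighs five passive play-throughs, which mirrors the point that the rare signal can be trusted more than the voluminous one.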
link |
01:17:49.760
Yeah, it's interesting because I think between those signals captures all the information
link |
01:17:55.680
you'd want to capture.
link |
01:17:57.040
I mean, there's a feeling, a shallow feeling for me that there's sometimes that I'll hear
link |
01:18:01.520
a song that's like, yes, this is, you know, this was the right song for the moment.
link |
01:18:05.920
But there's really no way to express that fact except by listening through it all the
link |
01:18:10.720
way and maybe playing it again at that time or something.
link |
01:18:14.240
But there's no need for a button that says this was the best song I could have heard
link |
01:18:19.680
at this moment.
link |
01:18:20.400
Well, we're playing around with that, with kind of the thumbs up concept saying like,
link |
01:18:24.080
I really like this.
link |
01:18:25.200
Just kind of talking to the algorithm.
link |
01:18:27.520
It's unclear if that's the best way for humans to interact.
link |
01:18:30.640
Maybe it is.
link |
01:18:31.200
Maybe they should think of Spotify as a person, an agent sitting there trying to serve you
link |
01:18:35.600
and you can say like, bad Spotify, good Spotify.
link |
01:18:38.720
Right now, the analogy we've had is more, you shouldn't think of us.
link |
01:18:42.880
We should be invisible.
link |
01:18:44.400
And the feedback is if you save it, it's kind of you work for yourself.
link |
01:18:48.320
You do a playlist because you think it's great and we can learn from that.
link |
01:18:50.960
It's kind of back to Tesla, how they kind of have this shadow mode.
link |
01:18:55.200
They sit in what you drive.
link |
01:18:56.720
We kind of took the same analogy.
link |
01:18:58.560
We sit in what you playlist and then maybe we can offer you an autopilot where you can
link |
01:19:02.800
take over for a while or something like that.
link |
01:19:04.640
And then back off if you say like, that's not good enough.
link |
01:19:08.240
But I think it's interesting to figure out what your mental model is.
link |
01:19:11.600
If Spotify is an AI that you talk to, which I think might be a bit too abstract for many
link |
01:19:18.880
consumers, or if you still think of it as it's my music app, but it's just more helpful.
link |
01:19:24.320
And it depends on the device it's running on, which brings us to smart speakers.
link |
01:19:31.040
So I have a lot of the Spotify listening I do is on devices I can talk to, whether it's
link |
01:19:38.400
from Amazon, Google or Apple.
link |
01:19:39.920
What's the role of Spotify on those devices?
link |
01:19:42.320
How do you think of it differently than on the phone or on the desktop?
link |
01:19:47.840
There are a few things to say about that. First of all, it's incredibly exciting.
link |
01:19:52.080
They're growing like crazy, especially here in the US.
link |
01:19:58.320
And it's solving a consumer need that I think is, you can think of it as just remote interactivity.
link |
01:20:09.200
You can control this thing from across the room.
link |
01:20:11.840
And it may feel like a small thing, but it turns out that friction matters to consumers.
link |
01:20:16.880
Being able to say play, pause and so forth from across the room is very powerful.
link |
01:20:22.000
So basically, you made the living room interactive now.
link |
01:20:26.000
And what we see in our data is that the number one use case for these speakers is music,
link |
01:20:33.600
music and podcasts.
link |
01:20:34.960
So fortunately for us, it's been important to these companies to have those use cases
link |
01:20:39.920
covered.
link |
01:20:40.640
So they want Spotify on these.
link |
01:20:42.080
We have very good relationships with them.
link |
01:20:45.840
And we're seeing tremendous success with them.
link |
01:20:51.200
What I think is interesting about them is it's already working.
link |
01:20:57.360
We kind of had this epiphany many years ago, back when we started using Sonos.
link |
01:21:02.720
If you went through all the trouble of setting up your Sonos system, you had this magical
link |
01:21:06.800
experience where you had all the music ever made in your living room.
link |
01:21:10.400
And we made this assumption that the home, everyone used to have a CD player at home,
link |
01:21:16.320
but they never managed to get their files working in the home.
link |
01:21:19.040
Having this network attached storage was too cumbersome for most consumers.
link |
01:21:22.960
So we made the assumption that the home would skip from the CD all the way to streaming
link |
01:21:26.480
boxes, where you would buy the stereo and it would have all the music built in.
link |
01:21:31.120
That took longer than we thought.
link |
01:21:32.640
But with the voice speakers, that was the unlocking that made kind of the connected
link |
01:21:36.080
speaker happen in the home.
link |
01:21:39.760
So it really exploded.
link |
01:21:41.520
And we saw this engagement that we predicted would happen.
link |
01:21:45.760
What I think is interesting, though, is where it's going from now.
link |
01:21:49.120
Right now, you think of them as voice speakers.
link |
01:21:51.920
But I think if you look at Google I/O, for example, they just added a camera to it, where
link |
01:21:58.640
when the alarm goes off, instead of saying, hey, Google, stop, you can just wave your
link |
01:22:04.240
hand.
link |
01:22:05.040
So I think they're going to think more of it as an agent or as an assistant, truly an
link |
01:22:11.920
assistant.
link |
01:22:12.400
And an assistant that can see you is going to be much more effective than a blind assistant.
link |
01:22:17.040
So I think these things will morph.
link |
01:22:18.480
And we won't necessarily think of them as, quote unquote, voice speakers anymore.
link |
01:22:22.560
Just as interactive access to the Internet in the home.
link |
01:22:29.200
But I still think that the biggest use case for those will be audio.
link |
01:22:34.080
So for that reason, we're investing heavily in it.
link |
01:22:36.640
And we built our own NLU stack. The challenge here is, how do you innovate
link |
01:22:43.520
in that world?
link |
01:22:44.240
It lowers friction for consumers, but it's also much more constrained.
link |
01:22:48.320
You have no pixels to play with in an audio only world.
link |
01:22:51.600
It's really the vocabulary that is the interface.
link |
01:22:54.880
So we started investing and playing around quite a lot with that, trying to understand
link |
01:22:58.560
what the future will be of you speaking and gesturing and waving at your music.
link |
01:23:03.360
And you're actually nudging closer to the autonomous vehicle space because from
link |
01:23:08.480
everything I've seen, the level of frustration people experience upon failure of natural
link |
01:23:14.080
language understanding is much higher than failure in other contexts.
link |
01:23:18.320
People get frustrated really fast.
link |
01:23:20.400
So if you screw that experience up even just a little bit, they give up really quickly.
link |
01:23:25.600
Yeah.
link |
01:23:26.320
And I think you see that in the data.
link |
01:23:28.320
While it's tremendously successful, the most common interactions are play, pause and next.
link |
01:23:36.160
The things where if you compare it to taking up your phone, unlocking it, bringing up the
link |
01:23:39.440
app and skipping, clicking skip, it was much lower friction.
link |
01:23:44.160
But then for longer, more complicated things like, "can you find me that song about...",
link |
01:23:49.280
people still bring up the phone and search and then play it on their speaker.
link |
01:23:51.920
So we tried again to build a fault tolerant UI where for the more complicated things,
link |
01:23:56.960
you can still pick up your phone, have powerful full keyboard search and then try to optimize
link |
01:24:02.480
for where there is actually lower friction. It's kind of like the Tesla autopilot
link |
01:24:07.280
thing.
link |
01:24:07.840
You have to be at the level where you're helpful.
link |
01:24:11.040
If you're too smart and just in the way, people are going to get frustrated.
link |
01:24:15.040
And first of all, I'm not obsessed with Stairway to Heaven.
link |
01:24:18.080
It's just a good song.
link |
01:24:19.440
But let me mention that as a use case because it's an interesting one.
link |
01:24:22.880
I've literally told one of... I don't want to say the name of the speaker because when people
link |
01:24:28.160
are listening to it, it'll make their speaker go off.
link |
01:24:30.320
But I talked to the speaker and I say, play Stairway to Heaven.
link |
01:24:34.720
And every time, well, not every time, but a large percentage of the time, it plays the wrong
link |
01:24:40.320
Stairway to Heaven.
link |
01:24:41.440
It plays, like, some cover of it. And that part of the experience,
link |
01:24:48.240
I actually wonder from a business perspective, does Spotify control that entire experience
link |
01:24:55.120
or no?
link |
01:24:56.160
It seems like the NLU, the natural language stuff is controlled by the speaker and then
link |
01:25:01.680
Spotify stays at a layer below that.
link |
01:25:04.640
It's a good and complicated question.
link |
01:25:07.040
Some of which is dependent on the partners.
link |
01:25:11.200
So it's hard to comment on the specifics.
link |
01:25:13.280
But the question is the right one.
link |
01:25:15.840
The challenge is if you can't use any of the personalization, I mean, we know which Stairway
link |
01:25:21.280
to Heaven.
link |
01:25:21.840
And the truth is maybe for one person, it is exactly the cover that they want.
link |
01:25:26.400
And they would be very frustrated if it played... I think we default to the right
link |
01:25:31.440
version.
link |
01:25:31.760
But you actually want to be able to do the cover for the person that just played
link |
01:25:35.280
the cover 50 times.
link |
01:25:36.320
Or Spotify is just going to seem stupid.
link |
01:25:38.400
So you want to be able to leverage the personalization.
link |
01:25:40.160
But you have this stack where you have the ASR and this thing called the N-best
link |
01:25:46.320
list of the best guesses here.
link |
01:25:48.480
And then the personalization comes in at the end.
link |
01:25:50.480
You actually want the personalization to be here when you're guessing about what they actually
link |
01:25:53.280
meant.
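The idea of bringing personalization in at the point where the system guesses what the user meant can be sketched as re-ranking an N-best list. This is a minimal illustration under assumed inputs: the `rerank` function, the track identifiers, the ASR scores, and the `personal_weight` parameter are all hypothetical and do not describe any real speaker or Spotify API.

```python
import math


def rerank(n_best, play_counts, personal_weight=0.1):
    """Re-rank ASR hypotheses using a listener's play history.

    n_best:      list of (track_id, asr_score) pairs, higher score is better
    play_counts: dict mapping track_id to how often this listener played it
    """
    def combined(item):
        track_id, asr_score = item
        # log1p dampens the history term so heavy listening cannot
        # completely swamp what the recognizer actually heard.
        history = math.log1p(play_counts.get(track_id, 0))
        return asr_score + personal_weight * history

    return sorted(n_best, key=combined, reverse=True)


# Hypothetical N-best list for the request "play Stairway to Heaven".
n_best = [("stairway_original", 0.60), ("stairway_cover", 0.55)]

default_choice = rerank(n_best, {})[0][0]
cover_fan_choice = rerank(n_best, {"stairway_cover": 50})[0][0]
```

With no history the original version wins, while a listener who has played the cover 50 times gets the cover, which is the behavior described in the conversation.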
link |
01:25:54.000
So we're working with these partners and it's a complicated thing
link |
01:26:00.160
where you want to you want to be able.
link |
01:26:02.880
So first of all, you want to be very careful with your users data.
link |
01:26:06.800
You don't want to share your users' data without their permission.
link |
01:26:09.200
But you want to share some data so that their experience gets better.
link |
01:26:12.640
So that these partners can understand enough, but not too much and so forth.
link |
01:26:16.400
So it's really the trick is that it's like a business driven relationship where you're
link |
01:26:21.760
doing product development across companies together, which is really complicated.
link |
01:26:26.960
But this is exactly why we built our own NLU so that we actually can make personalized
link |
01:26:32.960
guesses, because this is the biggest frustration from a user point of view.
link |
01:26:36.320
They don't understand about ASR and N-best lists and business deals.
link |
01:26:40.160
They're like, how hard can it be?
link |
01:26:41.440
I've told this thing 50 times, this version, and still it plays the wrong thing.
link |
01:26:45.120
It can't be hard.
link |
01:26:47.040
So we try to take the user approach.
link |
01:26:48.640
The user is not going to understand the complications of business, so we have to
link |
01:26:53.360
solve it.
link |
01:26:53.760
So let's talk about sort of a complicated subject that I myself am quite torn about,
link |
01:27:02.960
the idea sort of of paying artists.
link |
01:27:08.640
Right.
link |
01:27:09.840
I saw as of August 31st, 2018, over 11 billion dollars were paid to rights holders.
link |
01:27:17.200
So and further distributed to artists from Spotify.
link |
01:27:21.200
So a lot of money is being paid to artists.
link |
01:27:23.840
First of all, the whole time as a consumer for me, when I look at Spotify, I'm not sure
link |
01:27:30.800
I'm remembering correctly, but I think you said exactly how I feel, which is this is
link |
01:27:34.880
too good to be true.
link |
01:27:36.240
Like when I started using Spotify, I assumed you guys would go bankrupt in like a month.
link |
01:27:43.040
It's like this is too good.
link |
01:27:44.400
A lot of people did.
link |
01:27:47.040
I was like, this is amazing.
link |
01:27:48.960
So one question I have is sort of the bigger question.
link |
01:27:53.200
How do you make money in this complicated world?
link |
01:27:55.840
How do you deal with the relationship with record labels who are complicated?
link |
01:28:04.800
You essentially have the task of herding cats, but, like, rich and
link |
01:28:14.080
powerful cats, and also have the task of paying artists enough and paying those labels enough
link |
01:28:21.520
and still making money in the Internet space where people are not willing to pay hundreds
link |
01:28:26.480
of dollars a month.
link |
01:28:27.920
So how do you navigate the space?
link |
01:28:30.720
How do you navigate?
link |
01:28:31.600
That's a beautiful description.
link |
01:28:32.560
Herding rich cats.
link |
01:28:34.720
I haven't heard that before.
link |
01:28:37.200
It is very complicated, and I think certainly actually betting against Spotify has been
link |
01:28:42.880
statistically a very smart thing to do.
link |
01:28:45.040
Just looking at the line of roadkill in music streaming services, it's kind
link |
01:28:52.880
of I think if I understood the complexity when I joined Spotify, unfortunately, fortunately,
link |
01:28:58.560
I didn't know enough about the music industry to understand the complexities, because then
link |
01:29:03.440
I would have made a more rational guess that it wouldn't work.
link |
01:29:06.240
So, you know, ignorance is bliss.
link |
01:29:08.480
But I think there have been a few distinct challenges.
link |
01:29:13.200
I think, as I said, one of the things that made it work at all was that Sweden and the
link |
01:29:17.600
Nordics was a lost market.
link |
01:29:19.840
So there was no risk for labels to try this.
link |
01:29:25.120
I don't think it would have worked if the market was healthy.
link |
01:29:29.760
So that was the initial condition.
link |
01:29:33.120
Then we had this tremendous challenge with the model itself.
link |
01:29:36.160
So now most people were pirating.
link |
01:29:39.520
But for the people who bought a download or a CD, the artists would get all the revenue
link |
01:29:45.120
for all the future plays then, right?
link |
01:29:48.000
So you got it all up front, whereas the streaming model was like almost nothing day one, almost
link |
01:29:51.840
nothing day two.
link |
01:29:52.800
And then at some point, this curve of incremental revenue would intersect with your day one
link |
01:29:58.720
payment.
link |
01:29:59.840
And that took a long time to play out before the music labels understood
link |
01:30:05.280
that.
link |
01:30:05.780
But on the artist side, it took a lot of time to understand that actually, if I have a big
link |
01:30:09.600
hit that is going to be played for many years, this is a much better model because I get
link |
01:30:14.000
paid based on how much people use the product, not how much they thought they would use it
link |
01:30:18.000
day one or so forth.
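The point where the streaming curve crosses the up-front payment can be made concrete with a small calculation. The figures below are made-up assumptions for illustration only, not actual payout rates: amounts are expressed as integers in hundredths of a cent to avoid floating-point rounding.

```python
def plays_to_break_even(upfront_payment, payout_per_stream):
    """Smallest number of streams whose total payout reaches the up-front sum.

    Both arguments are integers in the same unit (here: hundredths of a cent).
    """
    # Integer ceiling division: rounds up without using floats.
    return -(-upfront_payment // payout_per_stream)


def cumulative_streaming_revenue(plays, payout_per_stream):
    """Total paid out after a given number of streams."""
    return plays * payout_per_stream


# Hypothetical numbers: a one-time payment equivalent to $1.00 for a download,
# versus an assumed $0.004 per stream.
UPFRONT = 10_000      # $1.00 in hundredths of a cent
PER_STREAM = 40       # $0.004 in hundredths of a cent

break_even = plays_to_break_even(UPFRONT, PER_STREAM)
```

Under these assumed numbers the two models pay the same after 250 plays, and every play after that favors streaming, which is the "big hit played for many years" case.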
link |
01:30:20.080
So it was a complicated model to get across.
link |
01:30:22.880
But time helped with that.
link |
01:30:24.000
And now the revenues to the music industry actually are bigger again. It's gone through
link |
01:30:30.640
this incredible dip and now they're back up.
link |
01:30:32.000
And so we're very proud of having been a part of that.
link |
01:30:37.920
So there have been distinct problems.
link |
01:30:39.520
I think when it comes to the labels, we have taken the painful approach.
link |
01:30:46.720
Some of our competition at the time, they kind of looked at other companies and said,
link |
01:30:52.400
if we just ignore the rights, we get really big, really fast.
link |
01:30:56.160
We're going to be too big for the labels to kind of, too big to fail.
link |
01:31:00.480
They're not going to kill us.
link |
01:31:01.120
We didn't take that approach.
link |
01:31:02.160
We went legal from day one and we negotiated and negotiated and negotiated.
link |
01:31:06.960
It was very slow.
link |
01:31:07.600
It was very frustrating.
link |
01:31:08.240
We were angry at seeing other companies taking shortcuts and seeming to get away with it.
link |
01:31:12.800
It was this game theory thing where over many rounds of playing the game, this would be
link |
01:31:18.160
the right strategy.
link |
01:31:19.200
And even though clearly there's a lot of frustrations at times during renegotiations, there is this
link |
01:31:25.680
weird trust where we have been honest and fair.
link |
01:31:31.760
We've never screwed them.
link |
01:31:32.480
They've never screwed us.
link |
01:31:33.680
It's 10 years, but there's this trust and like they know that if music doesn't get
link |
01:31:39.280
really big, if lots of people do not want to listen to music and want to pay for it,
link |
01:31:43.360
Spotify has no business model.
link |
01:31:44.960
So we actually are incredibly aligned.
link |
01:31:48.240
Other companies, not to be tense, but other companies have other business models where
link |
01:31:51.840
even if they made no money from music, they'd still be profitable companies.
link |
01:31:56.400
But Spotify won't.
link |
01:31:57.200
So I think the industry sees that we are actually aligned business wise.
link |
01:32:03.120
So there is this trust that allows us to do product development, even if it's scary,
link |
01:32:11.040
taking risks.
link |
01:32:12.560
The free model itself was an incredible risk for the music industry to take that they should
link |
01:32:17.200
get credit for.
link |
01:32:17.920
Now, some of it was that they had nothing to lose in the game.
link |
01:32:20.400
Some of it was that they had nothing to lose in Sweden.
link |
01:32:22.240
But frankly, a lot of the labels also took risk.
link |
01:32:25.840
And so I think we built up that trust. I think herding of cats sounds a bit...
link |
01:32:32.320
What's the word?
link |
01:32:33.120
It sounds like dismissive of the cats.
link |
01:32:35.280
Dismissive.
link |
01:32:35.920
No, every cat matters.
link |
01:32:37.200
They're all beautiful and very important.
link |
01:32:39.360
Exactly.
link |
01:32:39.920
They've taken a lot of risks and certainly it's been frustrating.
link |
01:32:44.960
So it's really game theory.
link |
01:32:47.600
If you play the game many times, then you can have the statistical outcome that you
link |
01:32:53.920
bet on.
link |
01:32:54.560
And it feels very painful when you're in the middle of that thing.
link |
01:32:57.520
I mean, there's risk, there's trust, there's relationships.
link |
01:33:00.480
From just having read the biography of Steve Jobs, similar kind of relationships were discussed
link |
01:33:07.200
in iTunes.
link |
01:33:08.400
The idea of selling a song for a dollar was very uncomfortable for labels.
link |
01:33:12.640
Exactly.
link |
01:33:13.760
And there was no, it was the same kind of thing.
link |
01:33:16.400
It was trust, it was game theory, and a lot of relationships that had to be built.
link |
01:33:21.840
And it's really a terrifyingly difficult process that Apple could go through a little
link |
01:33:28.880
bit because they could afford for that process to fail.
link |
01:33:32.720
For Spotify, it seems terrifying because you can't.
link |
01:33:37.600
Initially, I think a lot of it comes down to honestly Daniel and his tenacity in negotiating,
link |
01:33:44.240
which seems like an impossible task because he was completely unknown and so forth.
link |
01:33:50.800
But maybe that was also the reason that it worked.
link |
01:33:56.480
But I think game theory is probably the best way to think about it.
link |
01:34:03.120
You could go straight for this Nash equilibrium that someone is going to defect or you play
link |
01:34:08.800
it many times, you try to actually go for the top left, the cooperation cell.
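The contrast between defecting in a one-shot game and cooperating over many rounds can be sketched with an iterated prisoner's dilemma. The payoff values and the two strategies below are the standard textbook choices, used here as assumptions for illustration; they are not a model of any actual negotiation.

```python
# Standard textbook payoffs, (row player, column player); "C" = cooperate,
# "D" = defect. The top-left cell (C, C) is the cooperation cell.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """The one-shot Nash equilibrium strategy: defect every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return each player's total payoff."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        total_a += payoff_a
        total_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


cooperative = play(tit_for_tat, tit_for_tat, rounds=10)
defecting = play(always_defect, always_defect, rounds=10)
```

Over ten rounds, two cooperating players earn 30 each while two defectors earn 10 each, which is why repeated play can sustain the cooperation cell even though defecting is the equilibrium of a single round.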
link |
01:34:14.240
Is there any magical reason why Spotify seems to have won this?
link |
01:34:20.400
So a lot of people have tried to do what Spotify tried to do and Spotify has come out.
link |
01:34:25.360
Well, so the answer is that there's no magical reason because I don't believe in magic.
link |
01:34:30.000
But I think there are reasons.
link |
01:34:32.240
And I think some of them are that people have misunderstood a lot of what we actually do.
link |
01:34:40.400
The actual Spotify model is very complicated.
link |
01:34:43.520
They've looked at the premium model and said, it seems like you can charge $9.99 for music
link |
01:34:49.200
and people are going to pay, but that's not what happened.
link |
01:34:52.000
Actually, when we launched the original mobile product, everyone said they would never pay.
link |
01:34:56.640
What happened was they started on the free product and then their engagement grew so
link |
01:35:01.200
much that eventually they said, maybe it is worth $9.99, right?
link |
01:35:05.680
Your propensity to pay grows with your engagement.
link |
01:35:08.880
So we have this super complicated business model.
link |
01:35:11.600
We operate two different business models, advertising and premium at the same time.
link |
01:35:15.760
And I think that is hard to replicate.
link |
01:35:17.680
I struggle to think of other companies that run large scale advertising and subscription
link |
01:35:22.320
products at the same time.
link |
01:35:24.400
So I think the business model is actually much more complicated than people think it is.
link |
01:35:28.480
And so some people went after just the premium part without the free part and ran into a
link |
01:35:32.800
wall where no one wanted to pay.
link |
01:35:35.120
Some people went after just music should be free, just ads, which doesn't give you enough
link |
01:35:40.400
revenue and doesn't work for the music industry.
link |
01:35:42.880
So I think that combination is kind of opaque from the outside.
link |
01:35:46.560
So maybe I shouldn't say it here and reveal the secret, but that turns out to be harder
link |
01:35:51.040
to replicate than you would think.
link |
01:35:54.400
So there's a lot of brilliant business strategies out there.
link |
01:35:57.040
Brilliant business strategy here.
link |
01:36:00.240
Brilliance or luck?
link |
01:36:01.280
Probably more luck, but it doesn't really matter.
link |
01:36:03.520
It looks brilliant in retrospect.
link |
01:36:05.440
Let's call it brilliant.
link |
01:36:07.840
Yeah, when the books are written, they'll be brilliant.
link |
01:36:10.480
You've mentioned that your philosophy is to embrace change.
link |
01:36:16.720
So how will the music streaming and music listening world change over the next 10 years,
link |
01:36:23.600
20 years?
link |
01:36:24.640
You look out into the far future.
link |
01:36:26.960
What do you think?
link |
01:36:28.960
I think that music and for that matter, audio podcasts, audiobooks, I think it's one of
link |
01:36:35.200
the few core human needs.
link |
01:36:37.360
I think there is no good reason to me why it shouldn't be at the scale of something
link |
01:36:41.680
like messaging or social networking.
link |
01:36:44.160
I don't think it's a niche thing to listen to music or news or something.
link |
01:36:48.160
So I think scale is obviously one of the things that I really hope for.
link |
01:36:50.880
I think I hope that it's going to be billions of users.
link |
01:36:54.400
I hope eventually everyone in the world gets access to all the world's music ever made.
link |
01:36:58.720
So obviously, I think it's going to be a much bigger business.
link |
01:37:01.120
Otherwise, we wouldn't be betting this big.
link |
01:37:05.040
Now, if you look more at how it is consumed, what I'm hoping is back to this analogy of
link |
01:37:13.600
the software tool chain, where I think I sometimes internally I make this analogy to text messaging.
link |
01:37:22.800
Text messaging was also based on standards in the era of mobile carriers.
link |
01:37:28.480
You had the SMS, the 140 character, 120 character SMS.
link |
01:37:33.600
And it was great because everyone agreed on the standards.
link |
01:37:36.080
So as a consumer, you got a lot of distribution and interoperability, but it was a very constrained
link |
01:37:40.480
format.
link |
01:37:41.680
And when the industry wanted to add pictures to that format to do the MMS, I looked it
link |
01:37:45.840
up and I think it took from the late 80s to early 2000s.
link |
01:37:48.720
This is like a 15, 20 year product cycle to bring pictures into that.
link |
01:37:53.920
Now, once that entire value chain of creation and consumption got wrapped in one software
link |
01:38:00.240
stack within something like Snapchat or WhatsApp, the first week they added disappearing messages.
link |
01:38:07.280
Then two weeks later, they added stories.
link |
01:38:09.600
The pace of innovation when you're on one software stack and you can affect both creation
link |
01:38:14.560
and consumption, I think it's going to be rapid.
link |
01:38:17.120
So with these streaming services, we now, for the first time in history, have enough,
link |
01:38:22.320
I hope, people on one of these services.
link |
01:38:25.040
Actually, whether it's Spotify or Amazon or Apple or YouTube, and hopefully enough
link |
01:38:29.600
creators that you can actually start working with the format again.
link |
01:38:32.320
And that excites me.
link |
01:38:33.760
I think being able to change these constraints from 100 years, that could really do something
link |
01:38:39.200
interesting.
link |
01:38:40.160
I really hope it's not just going to be the iteration on the same thing for the next 10
link |
01:38:45.680
to 20 years as well.
link |
01:38:47.360
Yeah, changing the creation of music, the creation of audio, the creation of podcasts
link |
01:38:52.000
is a really fascinating possibility.
link |
01:38:54.400
I myself don't understand what it is about podcasts that's so intimate.
link |
01:38:59.680
It just is.
link |
01:39:00.480
I listen to a lot of podcasts.
link |
01:39:01.840
I think it touches on a deep human need for connection that people do feel like they're
link |
01:39:09.680
connected to when they listen.
link |
01:39:12.960
I don't understand what the psychology of that is, but in this world that's becoming
link |
01:39:17.600
more and more disconnected, it feels like this is fulfilling a certain kind of need.
link |
01:39:24.800
And empowering the creator as opposed to just the listener is really interesting.
link |
01:39:32.480
I'm really excited that you're working on this.
link |
01:39:34.240
Yeah, I think one of the things that is inspiring for our teams to work on podcasts is exactly
link |
01:39:38.800
that, whether you think, like I probably do, that there's something biological about perceiving yourself
link |
01:39:44.720
to be in the middle of the conversation that makes you listen in a different way.
link |
01:39:47.840
It doesn't really matter.
link |
01:39:48.640
People seem to perceive it differently.
link |
01:39:50.240
And there was this narrative for a long time that if you look at video, everything kind
link |
01:39:55.600
of in the foreground, it got shorter and shorter and shorter because of financial pressures
link |
01:39:59.840
and monetization and so forth.
link |
01:40:01.600
And eventually, at the end, there's almost like a 20-second clip, people just screaming
link |
01:40:06.240
something and I feel really good about the fact that you could have interpreted that
link |
01:40:14.640
as people have no attention span anymore.
link |
01:40:16.880
They don't want to listen to things.
link |
01:40:18.400
They're not interested in deeper stories.
link |
01:40:22.000
People are getting dumber.
link |
01:40:23.280
But then podcasts came along and it's almost like, no, no, the need still existed.
link |
01:40:28.000
But maybe it was the fact that you're not prepared to look at your phone like this for
link |
01:40:32.240
two hours.
link |
01:40:32.740
But if you can drive at the same time, it seems like people really want to dig deeper
link |
01:40:36.500
and they want to hear like the more complicated version.
link |
01:40:38.820
So to me, it is very inspiring that podcasts are actually long form.
link |
01:40:42.980
It gives me a lot of hope for humanity that people seem really interested in hearing deeper,
link |
01:40:48.340
more complicated conversations.
link |
01:40:49.940
This is... I don't understand it.
link |
01:40:52.100
It's fascinating.
link |
01:40:53.140
So the majority, for this podcast, listen to the whole thing.
link |
01:40:57.620
This whole conversation we've been talking for an hour and 45 minutes.
link |
01:41:02.500
And somebody will, I mean, most people will be listening to these words I'm speaking right
link |
01:41:06.580
now.
link |
01:41:06.580
It's crazy.
link |
01:41:07.080
You wouldn't have thought that 10 years ago with where the world seemed to go.
link |
01:41:10.740
That's very positive, I think.
link |
01:41:12.100
That's really exciting.
link |
01:41:13.300
And empowering the creator there is really exciting.
link |
01:41:17.700
Last question.
link |
01:41:18.740
You also have a passion for just mobile in general.
link |
01:41:22.660
How do you see the smartphone world, the digital space of smartphones and just everything that's
link |
01:41:32.660
on the move, whether it's Internet of Things and so on, changing over the next 10 years
link |
01:41:39.780
and so on?
link |
01:41:41.460
I think that one way to think about it is that computing might be moving out of these
link |
01:41:47.460
multipurpose devices, the computer we had and the phone, into specific purpose devices.
link |
01:41:55.140
And it will be ambient, in that, at least in my home, you just shout something at someone
link |
01:42:01.060
and there's always one of these speakers close enough.
link |
01:42:03.380
And so you start behaving differently.
link |
01:42:06.980
It's as if you have the Internet ambiently around you and you can ask it things.
link |
01:42:11.460
So I think computing will kind of get more integrated and we won't necessarily think
link |
01:42:15.780
of it as connected to a device in the same way that we do today.
link |
01:42:21.700
I don't know the path to that.
link |
01:42:22.900
Maybe we used to have these desktop computers and then we partially replaced that with the
link |
01:42:30.340
laptops and left the desktop at home or at work.
link |
01:42:32.740
And then we got these phones and we started leaving
link |
01:42:41.540
the laptop at home for a while.
link |
01:42:42.820
And maybe for stretches of time you're going to start using the watch and you can leave
link |
01:42:47.460
your phone at home for a run or something.
link |
01:42:50.580
And we're on this progressive path where I think what is happening with voice is that
link |
01:43:00.740
you have an interaction paradigm that doesn't require as large physical devices.
link |
01:43:06.820
So I definitely think there's a future where you can have your AirPods and your watch and
link |
01:43:12.820
you can do a lot of computing.
link |
01:43:15.860
And I don't think it's going to be this binary thing.
link |
01:43:20.020
I think it's going to be like many of us still have a laptop, we just use it less.
link |
01:43:23.940
And so you shift your consumption over.
link |
01:43:26.820
And I don't know about AR glasses and so forth.
link |
01:43:31.940
I'm excited about it.
link |
01:43:32.740
I spent a lot of time in that area, but I still think it's quite far away.
link |
01:43:35.700
AR, VR, all of that.
link |
01:43:37.540
Yeah, VR is happening and working.
link |
01:43:39.780
I think the recent Oculus Quest is quite impressive.
link |
01:43:43.940
I think AR is further away.
link |
01:43:45.300
At least that type of AR.
link |
01:43:48.100
But I do think your phone or watch or glasses understanding where you are and maybe what
link |
01:43:54.660
you're looking at and being able to give you audio cues about that.
link |
01:43:56.980
Or you can say like, what is this?
link |
01:43:58.580
And it tells you what it is.
link |
01:44:00.980
That I think might happen.
link |
01:44:02.340
You use your watch or your glasses as a mouse pointer on reality.
link |
01:44:08.020
I think it might be a while before...
link |
01:44:09.460
I might be wrong.
link |
01:44:10.180
I hope I'm wrong.
link |
01:44:10.820
I think it might be a while before we walk around with these big lab glasses that project
link |
01:44:14.820
things.
link |
01:44:15.620
I agree with you.
link |
01:44:16.820
It's actually really difficult when you have to understand the physical world enough to
link |
01:44:23.060
project onto it.
link |
01:44:25.300
I lied about the last question.
link |
01:44:26.740
Go ahead, because I just thought of audio and my favorite topic, which is the movie
link |
01:44:32.660
Her, do you think, whether it's part of Spotify or not, we'll have, I don't know if you've
link |
01:44:41.140
seen the movie Her.
link |
01:44:42.180
Absolutely.
link |
01:44:45.060
And there, audio is the primary form of interaction and the connection with another entity that
link |
01:44:53.300
you can actually have a relationship with, that you fall in love with based on voice
link |
01:44:59.300
alone, audio alone.
link |
01:45:00.740
Do you think that's possible, first of all, based on audio alone to fall in love with
link |
01:45:04.820
somebody?
link |
01:45:05.380
Somebody or...
link |
01:45:06.580
Well, yeah, let's go with somebody.
link |
01:45:08.020
Just have a relationship based on audio alone.
link |
01:45:11.700
And second question to that, can we create an artificial intelligence system that allows
link |
01:45:18.500
one to fall in love with it, and her or him with you?
link |
01:45:21.940
So this is my personal answer, speaking for me as a person, the answer is quite unequivocally
link |
01:45:29.940
yes on both.
link |
01:45:32.820
I think what we just said about podcasts and the feeling of being in the middle of a
link |
01:45:36.580
conversation, if you could have an assistant where, and we just said that feels like a
link |
01:45:42.660
very personal setting.
link |
01:45:43.940
So if you walk around with these headphones and this thing, you're speaking with this
link |
01:45:47.380
thing all of the time that feels like it's in your brain.
link |
01:45:49.940
I think it's going to be much easier to fall in love with than something that would be
link |
01:45:53.700
on your screen.
link |
01:45:54.740
I think that's entirely possible.
link |
01:45:56.340
And then from the, you can probably answer this better than me, but from the concept
link |
01:46:00.500
of if it's going to be possible to build a machine that can achieve that, I think whether
link |
01:46:07.060
you think of it as, if you can fake it, the philosophical zombie that simulates it enough
link |
01:46:12.740
or it somehow actually is, I think there's, it's only a question.
link |
01:46:17.700
It's only a question if you ask me about time, I'd have a different answer.
link |
01:46:20.500
But if you say, given some half-infinite time, absolutely.
link |
01:46:24.580
I think it's just atoms and arrangement of information.
link |
01:46:29.620
Well, I personally think that love is a lot simpler than people think.
link |
01:46:33.780
So we started with True Romance and ended in love.
link |
01:46:37.780
I don't see a better place to end.
link |
01:46:39.780
Beautiful.
link |
01:46:40.340
Gustav, thanks so much for talking today.
link |
01:46:41.860
Thank you so much.
link |
01:46:42.420
It was a lot of fun.
link |
01:46:43.140
It was fun.