Greg Brockman: OpenAI and AGI | Lex Fridman Podcast #17


link |
00:00:00.000
The following is a conversation with Greg Brockman.
link |
00:00:02.880
He's the cofounder and CTO of OpenAI,
link |
00:00:05.360
a world class research organization
link |
00:00:07.440
developing ideas in AI with a goal of eventually
link |
00:00:10.840
creating a safe and friendly artificial general
link |
00:00:14.200
intelligence, one that benefits and empowers humanity.
link |
00:00:18.800
OpenAI is not only a source of publications, algorithms, tools,
link |
00:00:23.080
and data sets.
link |
00:00:24.480
Their mission is a catalyst for an important public discourse
link |
00:00:28.160
about our future with both narrow and general intelligence
link |
00:00:32.720
systems.
link |
00:00:34.040
This conversation is part of the Artificial Intelligence
link |
00:00:36.660
podcast at MIT and beyond.
link |
00:00:39.560
If you enjoy it, subscribe on YouTube, iTunes,
link |
00:00:42.760
or simply connect with me on Twitter at Lex Fridman,
link |
00:00:45.680
spelled F R I D. And now, here's my conversation
link |
00:00:50.240
with Greg Brockman.
link |
00:00:52.800
So in high school, and right after, you
link |
00:00:54.440
wrote a draft of a chemistry textbook,
link |
00:00:56.680
I saw that it covers everything from the basic structure
link |
00:00:59.080
of the atom to quantum mechanics.
link |
00:01:01.400
So it's clear you have an intuition and a passion
link |
00:01:04.360
for both the physical world with chemistry and now robotics
link |
00:01:09.880
to the digital world with AI, deep learning, reinforcement
link |
00:01:14.200
learning, so on.
link |
00:01:15.400
Do you see the physical world and the digital world
link |
00:01:17.360
as different?
link |
00:01:18.640
And what do you think is the gap?
link |
00:01:20.520
A lot of it actually boils down to iteration speed.
link |
00:01:23.320
I think that a lot of what really motivates me
link |
00:01:25.240
is building things.
link |
00:01:26.520
I think about mathematics, for example,
link |
00:01:28.960
where you think really hard about a problem.
link |
00:01:30.880
You understand it.
link |
00:01:31.680
You write it down in this very obscure form
link |
00:01:33.460
that we call a proof.
link |
00:01:34.560
But then, this is in humanity's library.
link |
00:01:37.600
It's there forever.
link |
00:01:38.440
This is some truth that we've discovered.
link |
00:01:40.520
Maybe only five people in your field will ever read it.
link |
00:01:43.040
But somehow, you've kind of moved humanity forward.
link |
00:01:45.400
And so I actually used to really think
link |
00:01:46.900
that I was going to be a mathematician.
link |
00:01:48.600
And then I actually started writing this chemistry
link |
00:01:51.000
textbook.
link |
00:01:51.600
One of my friends told me, you'll never publish it
link |
00:01:53.600
because you don't have a PhD.
link |
00:01:54.840
So instead, I decided to build a website
link |
00:01:57.960
and try to promote my ideas that way.
link |
00:01:59.920
And then I discovered programming.
link |
00:02:01.440
And in programming, you think hard about a problem.
link |
00:02:05.280
You understand it.
link |
00:02:06.040
You write it down in a very obscure form
link |
00:02:08.000
that we call a program.
link |
00:02:10.000
But then once again, it's in humanity's library.
link |
00:02:12.200
And anyone can get the benefit from it.
link |
00:02:14.080
And the scalability is massive.
link |
00:02:15.540
And so I think that the thing that really appeals
link |
00:02:17.540
to me about the digital world is that you
link |
00:02:19.420
can have this insane leverage.
link |
00:02:21.920
A single individual with an idea is
link |
00:02:24.320
able to affect the entire planet.
link |
00:02:25.800
And that's something I think is really
link |
00:02:27.400
hard to do if you're moving around physical atoms.
link |
00:02:30.160
But you said mathematics.
link |
00:02:32.400
So if you look at the wet thing over here, our mind,
link |
00:02:36.800
do you ultimately see it as just math,
link |
00:02:39.760
as just information processing?
link |
00:02:41.720
Or is there some other magic, as you've seen,
link |
00:02:44.320
if you've seen through biology and chemistry and so on?
link |
00:02:47.000
Yeah, I think it's really interesting to think about
link |
00:02:48.640
humans as just information processing systems.
link |
00:02:50.920
And that seems like it's actually
link |
00:02:52.560
a pretty good way of describing a lot of how the world works
link |
00:02:57.160
or a lot of what we're capable of, to think that, again,
link |
00:03:00.640
if you just look at technological innovations
link |
00:03:02.760
over time, that in some ways, the most transformative
link |
00:03:05.480
innovation that we've had has been the computer.
link |
00:03:07.720
In some ways, the internet, that what has the internet done?
link |
00:03:10.520
The internet is not about these physical cables.
link |
00:03:12.720
It's about the fact that I am suddenly
link |
00:03:14.520
able to instantly communicate with any other human
link |
00:03:16.520
on the planet.
link |
00:03:17.640
I'm able to retrieve any piece of knowledge
link |
00:03:19.640
that in some ways the human race has ever had,
link |
00:03:22.640
and that those are these insane transformations.
link |
00:03:26.040
Do you see our society as a whole, the collective,
link |
00:03:29.320
as another extension of the intelligence of the human being?
link |
00:03:32.240
So if you look at the human being
link |
00:03:33.400
as an information processing system,
link |
00:03:35.040
you mentioned the internet, the networking.
link |
00:03:36.880
Do you see us all together as a civilization
link |
00:03:39.320
as a kind of intelligent system?
link |
00:03:41.640
Yeah, I think this is actually
link |
00:03:42.840
a really interesting perspective to take
link |
00:03:44.840
and to think about, that you sort of have
link |
00:03:46.680
this collective intelligence of all of society,
link |
00:03:49.480
the economy itself is this superhuman machine
link |
00:03:51.640
that is optimizing something, right?
link |
00:03:54.400
And in some ways, a company has a will of its own, right?
link |
00:03:57.960
That you have all these individuals
link |
00:03:59.040
who are all pursuing their own individual goals
link |
00:04:00.800
and thinking really hard
link |
00:04:01.960
and thinking about the right things to do,
link |
00:04:03.600
but somehow the company does something
link |
00:04:05.320
that is this emergent thing
link |
00:04:07.880
and that is a really useful abstraction.
link |
00:04:10.600
And so I think that in some ways,
link |
00:04:12.400
we think of ourselves as the most intelligent things
link |
00:04:14.840
on the planet and the most powerful things on the planet,
link |
00:04:17.440
but there are things that are bigger than us
link |
00:04:19.280
that are the systems that we all contribute to.
link |
00:04:21.400
And so I think actually, it's interesting to think about
link |
00:04:24.960
if you've read Isaac Asimov's Foundation, right?
link |
00:04:27.400
That there's this concept of psychohistory in there,
link |
00:04:30.000
which is effectively this,
link |
00:04:31.000
that if you have trillions or quadrillions of beings,
link |
00:04:33.880
then maybe you could actually predict what that being,
link |
00:04:36.520
that huge macro being will do
link |
00:04:39.040
and almost independent of what the individuals want.
link |
00:04:42.320
And I actually have a second angle on this
link |
00:04:44.200
that I think is interesting,
link |
00:04:45.040
which is thinking about technological determinism.
link |
00:04:48.360
One thing that I actually think a lot about with OpenAI,
link |
00:04:51.240
right, is that we're kind of coming on
link |
00:04:53.320
to this insanely transformational technology
link |
00:04:55.840
of general intelligence, right,
link |
00:04:57.320
that will happen at some point.
link |
00:04:58.760
And there's a question of how can you take actions
link |
00:05:01.520
that will actually steer it to go better rather than worse.
link |
00:05:04.840
And that I think one question you need to ask
link |
00:05:06.520
is as a scientist, as an inventor, as a creator,
link |
00:05:09.280
what impact can you have in general, right?
link |
00:05:11.680
You look at things like the telephone
link |
00:05:12.840
invented by two people on the same day.
link |
00:05:14.800
Like, what does that mean?
link |
00:05:15.920
Like, what does that mean about the shape of innovation?
link |
00:05:18.080
And I think that what's going on
link |
00:05:19.240
is everyone's building on the shoulders of the same giants.
link |
00:05:21.680
And so you can kind of, you can't really hope
link |
00:05:23.800
to create something no one else ever would.
link |
00:05:25.680
You know, if Einstein wasn't born,
link |
00:05:27.000
someone else would have come up with relativity.
link |
00:05:29.160
You know, he changed the timeline a bit, right,
link |
00:05:30.960
that maybe it would have taken another 20 years,
link |
00:05:32.960
but it wouldn't be that fundamentally humanity
link |
00:05:34.560
would never discover these fundamental truths.
link |
00:05:37.320
So there's some kind of invisible momentum
link |
00:05:40.400
that some people like Einstein or OpenAI is plugging into
link |
00:05:45.360
that anybody else can also plug into
link |
00:05:47.760
and ultimately that wave takes us into a certain direction.
link |
00:05:50.760
That's what you mean by technological determinism.
link |
00:05:51.840
That's right, that's right.
link |
00:05:52.800
And you know, this kind of seems to play out
link |
00:05:54.160
in a bunch of different ways,
link |
00:05:55.680
that there's some exponential that is being written
link |
00:05:58.000
and that the exponential itself, which one it is, changes.
link |
00:06:00.600
Think about Moore's Law, an entire industry
link |
00:06:02.400
set its clock to it for 50 years.
link |
00:06:04.760
Like, how can that be, right?
link |
00:06:06.160
How is that possible?
link |
00:06:07.320
And yet somehow it happened.
link |
00:06:09.240
And so I think you can't hope to ever invent something
link |
00:06:12.160
that no one else will.
link |
00:06:13.280
Maybe you can change the timeline a little bit.
link |
00:06:15.280
But if you really want to make a difference,
link |
00:06:17.360
I think that the thing that you really have to do,
link |
00:06:19.360
the only real degree of freedom you have
link |
00:06:21.280
is to set the initial conditions
link |
00:06:23.000
under which a technology is born.
link |
00:06:24.880
And so you think about the internet, right?
link |
00:06:26.640
That there are lots of other competitors
link |
00:06:27.800
trying to build similar things.
link |
00:06:29.360
And the internet won.
link |
00:06:30.720
And that the initial conditions
link |
00:06:33.200
were that it was created by this group
link |
00:06:34.640
that really valued people being able to be,
link |
00:06:38.200
anyone being able to plug in
link |
00:06:39.080
this very academic mindset of being open and connected.
link |
00:06:42.440
And I think that the internet for the next 40 years
link |
00:06:44.320
really played out that way.
link |
00:06:46.280
You know, maybe today things are starting
link |
00:06:48.400
to shift in a different direction.
link |
00:06:49.840
But I think that those initial conditions
link |
00:06:51.120
were really important to determine
link |
00:06:52.720
the next 40 years worth of progress.
link |
00:06:55.040
That's really beautifully put.
link |
00:06:56.440
So another example that I think about,
link |
00:06:58.800
you know, I recently looked at it.
link |
00:07:00.800
I looked at Wikipedia, the formation of Wikipedia.
link |
00:07:03.800
And I wondered what the internet would be like
link |
00:07:05.520
if Wikipedia had ads.
link |
00:07:07.720
You know, there's an interesting argument
link |
00:07:09.600
that why they chose not to make it,
link |
00:07:12.600
put advertisement on Wikipedia.
link |
00:07:14.240
I think Wikipedia's one of the greatest resources
link |
00:07:17.760
we have on the internet.
link |
00:07:18.880
It's extremely surprising how well it works
link |
00:07:21.200
and how well it was able to aggregate
link |
00:07:22.920
all this kind of good information.
link |
00:07:24.960
And essentially the creator of Wikipedia,
link |
00:07:27.280
I don't know, there's probably some debates there,
link |
00:07:29.320
but set the initial conditions.
link |
00:07:31.160
And now it carried itself forward.
link |
00:07:33.220
That's really interesting.
link |
00:07:34.060
So the way you're thinking about AGI
link |
00:07:36.480
or artificial intelligence is you're focused
link |
00:07:38.440
on setting the initial conditions for the progress.
link |
00:07:41.160
That's right.
link |
00:07:42.280
That's powerful.
link |
00:07:43.120
Okay, so looking to the future,
link |
00:07:45.520
if you create an AGI system,
link |
00:07:48.120
like one that can ace the Turing test, natural language,
link |
00:07:51.560
what do you think would be the interactions
link |
00:07:54.760
you would have with it?
link |
00:07:55.840
What do you think are the questions you would ask?
link |
00:07:57.720
Like what would be the first question you would ask?
link |
00:08:00.520
It, her, him.
link |
00:08:01.800
That's right.
link |
00:08:02.640
I think that at that point,
link |
00:08:03.920
if you've really built a powerful system
link |
00:08:05.920
that is capable of shaping the future of humanity,
link |
00:08:08.480
the first question that you really should ask
link |
00:08:10.240
is how do we make sure that this plays out well?
link |
00:08:12.280
And so that's actually the first question
link |
00:08:13.960
that I would ask a powerful AGI system is.
link |
00:08:17.600
So you wouldn't ask your colleague,
link |
00:08:19.160
you wouldn't ask like Ilya,
link |
00:08:20.760
you would ask the AGI system.
link |
00:08:22.280
Oh, we've already had the conversation with Ilya, right?
link |
00:08:24.600
And everyone here.
link |
00:08:25.720
And so you want as many perspectives
link |
00:08:27.460
and pieces of wisdom as you can
link |
00:08:29.680
for answering this question.
link |
00:08:31.200
So I don't think you necessarily defer
link |
00:08:32.440
to whatever your powerful system tells you,
link |
00:08:35.440
but you use it as one input
link |
00:08:37.080
to try to figure out what to do.
link |
00:08:39.240
But, and I guess fundamentally what it really comes down to
link |
00:08:41.800
is if you built something really powerful
link |
00:08:43.960
and you think about, for example,
link |
00:08:45.280
shortly after
link |
00:08:47.640
the creation of nuclear weapons, right?
link |
00:08:48.880
The most important question in the world was
link |
00:08:51.100
what's the world order going to be like?
link |
00:08:52.800
How do we set ourselves up in a place
link |
00:08:54.900
where we're going to be able to survive as a species?
link |
00:08:58.320
With AGI, I think the question is slightly different, right?
link |
00:09:00.640
That there is a question of how do we make sure
link |
00:09:02.720
that we don't get the negative effects,
link |
00:09:04.440
but there's also the positive side, right?
link |
00:09:06.240
You imagine that, like what will AGI be like?
link |
00:09:09.760
Like what will it be capable of?
link |
00:09:11.240
And I think that one of the core reasons
link |
00:09:13.520
that an AGI can be powerful and transformative
link |
00:09:15.760
is actually due to technological development, right?
link |
00:09:18.900
If you have something that's as capable as a human
link |
00:09:21.440
and that it's much more scalable,
link |
00:09:23.880
that you absolutely want that thing
link |
00:09:25.880
to go read the whole scientific literature
link |
00:09:27.640
and think about how to create cures for all the diseases,
link |
00:09:29.820
right?
link |
00:09:30.660
You want it to think about how to go
link |
00:09:31.480
and build technologies to help us create material abundance
link |
00:09:34.500
and to figure out societal problems
link |
00:09:37.320
that we have trouble with.
link |
00:09:38.160
Like how are we supposed to clean up the environment?
link |
00:09:40.000
And maybe you want this to go and invent
link |
00:09:42.840
a bunch of little robots that will go out
link |
00:09:44.120
and be biodegradable and turn ocean debris
link |
00:09:47.280
into harmless molecules.
link |
00:09:49.660
And I think that that positive side
link |
00:09:54.040
is something that I think people miss
link |
00:09:55.720
sometimes when thinking about what an AGI will be like.
link |
00:09:58.160
And so I think that if you have a system
link |
00:10:00.280
that's capable of all of that,
link |
00:10:01.640
you absolutely want its advice about how do I make sure
link |
00:10:03.960
that we're using your capabilities
link |
00:10:07.600
in a positive way for humanity.
link |
00:10:09.220
So what do you think about that psychology
link |
00:10:11.440
that looks at all the different possible trajectories
link |
00:10:14.800
of an AGI system, many of which,
link |
00:10:17.520
perhaps the majority of which are positive,
link |
00:10:19.960
and nevertheless focuses on the negative trajectories?
link |
00:10:23.340
I mean, you get to interact with folks,
link |
00:10:24.720
you get to think about this, maybe within yourself as well.
link |
00:10:28.860
You look at Sam Harris and so on.
link |
00:10:30.560
It seems to be, sorry to put it this way,
link |
00:10:32.760
but almost more fun to think about
link |
00:10:34.560
the negative possibilities.
link |
00:10:36.780
Whatever that's deep in our psychology,
link |
00:10:39.560
what do you think about that?
link |
00:10:40.840
And how do we deal with it?
link |
00:10:41.920
Because we want AI to help us.
link |
00:10:44.400
So I think there's kind of two problems
link |
00:10:47.360
entailed in that question.
link |
00:10:49.960
The first is more of the question of
link |
00:10:52.360
how can you even picture what a world
link |
00:10:54.620
with a new technology will be like?
link |
00:10:56.600
Now imagine we're in 1950,
link |
00:10:57.840
and I'm trying to describe Uber to someone.
link |
00:11:02.840
Apps and the internet.
link |
00:11:05.340
Yeah, I mean, that's going to be extremely complicated.
link |
00:11:08.920
But it's imaginable.
link |
00:11:10.160
It's imaginable, right?
link |
00:11:11.880
And now imagine being in 1950 and predicting Uber, right?
link |
00:11:15.280
And you need to describe the internet,
link |
00:11:17.680
you need to describe GPS,
link |
00:11:18.720
you need to describe the fact that
link |
00:11:20.480
everyone's going to have this phone in their pocket.
link |
00:11:23.920
And so I think that just the first truth
link |
00:11:26.160
is that it is hard to picture
link |
00:11:28.040
how a transformative technology will play out in the world.
link |
00:11:31.160
We've seen that before with technologies
link |
00:11:32.760
that are far less transformative than AGI will be.
link |
00:11:35.560
And so I think that one piece is that
link |
00:11:37.780
it's just even hard to imagine
link |
00:11:39.560
and to really put yourself in a world
link |
00:11:41.640
where you can predict what that positive vision
link |
00:11:44.640
would be like.
link |
00:11:46.920
And I think the second thing is that
link |
00:11:49.520
I think it is always easier to support the negative side
link |
00:11:54.280
than the positive side.
link |
00:11:55.120
It's always easier to destroy than create.
link |
00:11:58.160
And less in a physical sense
link |
00:12:00.760
and more just in an intellectual sense, right?
link |
00:12:03.080
Because I think that with creating something,
link |
00:12:05.680
you need to just get a bunch of things right.
link |
00:12:07.400
And to destroy, you just need to get one thing wrong.
link |
00:12:10.280
And so I think that what that means
link |
00:12:12.040
is that I think a lot of people's thinking dead ends
link |
00:12:14.240
as soon as they see the negative story.
link |
00:12:16.920
But that being said, I actually have some hope, right?
link |
00:12:20.360
I think that the positive vision
link |
00:12:23.160
is something that I think can be,
link |
00:12:26.000
is something that we can talk about.
link |
00:12:27.560
And I think that just simply saying this fact of,
link |
00:12:30.240
yeah, there's positive, there's negatives,
link |
00:12:32.000
everyone likes to dwell on the negative.
link |
00:12:33.600
People actually respond well to that message and say,
link |
00:12:35.400
huh, you're right, there's a part of this
link |
00:12:37.040
that we're not talking about, not thinking about.
link |
00:12:39.640
And that's actually something that's I think really
link |
00:12:42.280
been a key part of how we think about AGI at OpenAI.
link |
00:12:46.640
You can kind of look at it as like, okay,
link |
00:12:48.640
OpenAI talks about the fact that there are risks
link |
00:12:51.040
and yet they're trying to build this system.
link |
00:12:53.720
How do you square those two facts?
link |
00:12:56.120
So do you share the intuition that some people have,
link |
00:12:59.160
I mean from Sam Harris to even Elon Musk himself,
link |
00:13:02.720
that it's tricky as you develop AGI
link |
00:13:06.640
to keep it from slipping into the existential threats,
link |
00:13:10.440
into the negative?
link |
00:13:11.800
What's your intuition about how hard is it
link |
00:13:14.840
to keep AI development on the positive track?
link |
00:13:19.680
What's your intuition there?
link |
00:13:20.760
To answer that question, you can really look
link |
00:13:22.280
at how we structure OpenAI.
link |
00:13:24.000
So we really have three main arms.
link |
00:13:25.880
We have capabilities, which is actually doing
link |
00:13:28.000
the technical work and pushing forward
link |
00:13:29.880
what these systems can do.
link |
00:13:31.200
There's safety, which is working on technical mechanisms
link |
00:13:35.160
to ensure that the systems we build
link |
00:13:36.920
are aligned with human values.
link |
00:13:38.480
And then there's policy, which is making sure
link |
00:13:40.680
that we have governance mechanisms,
link |
00:13:42.040
answering that question of, well, whose values?
link |
00:13:45.280
And so I think that the technical safety one
link |
00:13:47.400
is the one that people kind of talk about the most, right?
link |
00:13:50.480
You talk about, like think about all of the dystopic AI
link |
00:13:53.840
movies, a lot of that is about not having
link |
00:13:55.800
good technical safety in place.
link |
00:13:57.560
And what we've been finding is that,
link |
00:13:59.840
you know, I think that actually a lot of people
link |
00:14:01.360
look at the technical safety problem
link |
00:14:02.680
and think it's just intractable, right?
link |
00:14:05.400
This question of what do humans want?
link |
00:14:07.840
How am I supposed to write that down?
link |
00:14:09.160
Can I even write down what I want?
link |
00:14:11.200
No way.
link |
00:14:13.040
And then they stop there.
link |
00:14:14.840
But the thing is, we've already built systems
link |
00:14:16.880
that are able to learn things that humans can't specify.
link |
00:14:20.920
You know, even the rules for how to recognize
link |
00:14:22.920
if there's a cat or a dog in an image.
link |
00:14:24.960
Turns out it's intractable to write that down,
link |
00:14:26.520
and yet we're able to learn it.
link |
00:14:28.440
And that what we're seeing with systems we build at OpenAI,
link |
00:14:31.040
and they're still in early proof of concept stage,
link |
00:14:33.800
is that you are able to learn human preferences.
link |
00:14:36.320
You're able to learn what humans want from data.
link |
00:14:38.960
And so that's kind of the core focus
link |
00:14:40.400
for our technical safety team,
link |
00:14:41.760
and I think that there actually,
link |
00:14:43.800
we've had some pretty encouraging updates
link |
00:14:45.680
in terms of what we've been able to make work.
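
As a rough sketch of the preference-learning idea described above: a small reward model can be trained from pairwise human comparisons, Bradley-Terry style, and the learned reward then stands in for an objective a human could not write down by hand. This is an illustrative toy only; the model, features, and simulated "human" labels below are assumptions, not OpenAI's actual implementation.

```python
# Toy sketch of learning "what humans want" from pairwise preference data.
# Illustrative only; all names, shapes, and labels here are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a behavior's feature vector to a scalar 'how good is this' score."""
    def __init__(self, feature_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feature_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)

def preference_loss(model, preferred, rejected):
    """Bradley-Terry loss: the human-preferred behavior should score higher."""
    return -F.logsigmoid(model(preferred) - model(rejected)).mean()

torch.manual_seed(0)
model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(500):
    a, b = torch.randn(64, 32), torch.randn(64, 32)
    human_prefers_a = a[:, 0] > b[:, 0]              # simulated human judgments
    preferred = torch.where(human_prefers_a.unsqueeze(1), a, b)
    rejected = torch.where(human_prefers_a.unsqueeze(1), b, a)
    loss = preference_loss(model, preferred, rejected)
    opt.zero_grad(); loss.backward(); opt.step()
# The learned reward can then be handed to an RL algorithm as the thing to
# optimize, instead of a hand-written specification of what humans want.
```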
link |
00:14:48.040
So you have an intuition and a hope that from data,
link |
00:14:51.680
you know, looking at the value alignment problem,
link |
00:14:53.640
from data we can build systems that align
link |
00:14:57.080
with the collective better angels of our nature.
link |
00:15:00.640
So align with the ethics and the morals of human beings.
link |
00:15:04.640
To even say this in a different way,
link |
00:15:05.920
I mean, think about how do we align humans, right?
link |
00:15:08.600
Think about like a human baby can grow up
link |
00:15:10.440
to be an evil person or a great person.
link |
00:15:12.920
And a lot of that is from learning from data, right?
link |
00:15:15.240
That you have some feedback as a child is growing up,
link |
00:15:17.760
they get to see positive examples.
link |
00:15:19.200
And so I think that just like,
link |
00:15:22.000
that the only example we have of a general intelligence
link |
00:15:25.400
that is able to learn from data
link |
00:15:28.040
to align with human values and to learn values,
link |
00:15:31.440
I think we shouldn't be surprised
link |
00:15:32.880
that we can do the same sorts of techniques
link |
00:15:36.000
or whether the same sort of techniques
link |
00:15:37.440
end up being how we solve value alignment for AGIs.
link |
00:15:41.080
So let's go even higher.
link |
00:15:42.720
I don't know if you've read the book, Sapiens,
link |
00:15:44.800
but there's an idea that, you know,
link |
00:15:48.280
that as a collective, as us human beings,
link |
00:15:49.960
we kind of develop together ideas that we hold.
link |
00:15:54.720
There's no, in that context, objective truth.
link |
00:15:57.880
We just kind of all agree to certain ideas
link |
00:15:59.960
and hold them as a collective.
link |
00:16:01.400
Did you have a sense that there is,
link |
00:16:03.440
in the world of good and evil,
link |
00:16:05.320
do you have a sense that to the first approximation,
link |
00:16:07.520
there are some things that are good
link |
00:16:10.240
and that you could teach systems to behave to be good?
link |
00:16:14.520
So I think that this actually blends into our third team,
link |
00:16:18.280
right, which is the policy team.
link |
00:16:19.880
And this is the one, the aspect I think people
link |
00:16:22.360
really talk about way less than they should, right?
link |
00:16:25.280
Because imagine that we build super powerful systems
link |
00:16:27.640
that we've managed to figure out all the mechanisms
link |
00:16:29.720
for these things to do whatever the operator wants.
link |
00:16:32.800
The most important question becomes,
link |
00:16:34.480
who's the operator, what do they want,
link |
00:16:36.720
and how is that going to affect everyone else, right?
link |
00:16:39.360
And I think that this question of what is good,
link |
00:16:43.080
what are those values, I mean,
link |
00:16:44.720
I think you don't even have to go to those,
link |
00:16:46.600
those very grand existential places
link |
00:16:48.400
to start to realize how hard this problem is.
link |
00:16:50.920
You just look at different countries
link |
00:16:52.880
and cultures across the world,
link |
00:16:54.520
and that there's a very different conception
link |
00:16:57.120
of how the world works and what kinds of ways
link |
00:17:01.920
that society wants to operate.
link |
00:17:03.400
And so I think that the really core question
link |
00:17:07.000
is actually very concrete,
link |
00:17:09.560
and I think it's not a question
link |
00:17:10.980
that we have ready answers to, right?
link |
00:17:12.720
It's how do you have a world
link |
00:17:14.720
where all of the different countries that we have,
link |
00:17:17.280
United States, China, Russia,
link |
00:17:19.760
and the hundreds of other countries out there
link |
00:17:22.760
are able to continue to not just operate
link |
00:17:26.620
in the way that they see fit,
link |
00:17:28.440
but in the world that emerges
link |
00:17:32.560
where you have these very powerful systems
link |
00:17:36.080
operating alongside humans,
link |
00:17:37.820
ends up being something that empowers humans more,
link |
00:17:39.820
that makes human existence be a more meaningful thing,
link |
00:17:44.140
and that people are happier and wealthier,
link |
00:17:46.440
and able to live more fulfilling lives.
link |
00:17:49.040
It's not an obvious thing for how to design that world
link |
00:17:51.600
once you have that very powerful system.
link |
00:17:53.640
So if we take a little step back,
link |
00:17:55.860
and we're having a fascinating conversation,
link |
00:17:58.260
and OpenAI is in many ways a tech leader in the world,
link |
00:18:01.920
and yet we're thinking about
link |
00:18:03.240
these big existential questions,
link |
00:18:05.480
which is fascinating, really important.
link |
00:18:07.060
I think you're a leader in that space,
link |
00:18:09.200
and that's a really important space
link |
00:18:10.880
of just thinking how AI affects society
link |
00:18:13.120
in a big picture view.
link |
00:18:14.400
So Oscar Wilde said, we're all in the gutter,
link |
00:18:17.360
but some of us are looking at the stars,
link |
00:18:19.040
and I think OpenAI has a charter
link |
00:18:22.360
that looks to the stars, I would say,
link |
00:18:24.640
to create intelligence, to create general intelligence,
link |
00:18:26.920
make it beneficial, safe, and collaborative.
link |
00:18:29.480
So can you tell me how that came about,
link |
00:18:33.720
how a mission like that and the path
link |
00:18:36.360
to creating a mission like that at OpenAI was founded?
link |
00:18:39.160
Yeah, so I think that in some ways
link |
00:18:41.680
it really boils down to taking a look at the landscape.
link |
00:18:45.160
So if you think about the history of AI,
link |
00:18:47.060
that basically for the past 60 or 70 years,
link |
00:18:49.960
people have thought about this goal
link |
00:18:51.680
of what could happen if you could automate
link |
00:18:54.000
human intellectual labor.
link |
00:18:56.700
Imagine you could build a computer system
link |
00:18:58.280
that could do that, what becomes possible?
link |
00:19:00.560
We have a lot of sci fi that tells stories
link |
00:19:02.440
of various dystopias, and increasingly you have movies
link |
00:19:04.960
like Her that tell you a little bit about,
link |
00:19:06.520
maybe more of a little bit utopic vision.
link |
00:19:09.480
You think about the impacts that we've seen
link |
00:19:12.560
from being able to have bicycles for our minds
link |
00:19:16.280
and computers, and I think that the impact
link |
00:19:20.360
of computers and the internet has just far outstripped
link |
00:19:23.480
what anyone really could have predicted.
link |
00:19:26.200
And so I think that it's very clear
link |
00:19:27.420
that if you can build an AGI,
link |
00:19:29.360
it will be the most transformative technology
link |
00:19:31.600
that humans will ever create.
link |
00:19:34.640
And so what it boils down to then is a question of,
link |
00:19:36.840
well, is there a path, is there hope,
link |
00:19:39.400
is there a way to build such a system?
link |
00:19:41.480
And I think that for 60 or 70 years,
link |
00:19:43.620
that people got excited and that ended up
link |
00:19:47.280
not being able to deliver on the hopes
link |
00:19:49.440
that people had pinned on them.
link |
00:19:51.400
And I think that then, that after two winters
link |
00:19:54.880
of AI development, that people I think kind of
link |
00:19:58.320
almost stopped daring to dream, right?
link |
00:20:00.520
That really talking about AGI or thinking about AGI
link |
00:20:03.240
became almost this taboo in the community.
link |
00:20:06.600
But I actually think that people took the wrong lesson
link |
00:20:08.660
from AI history.
link |
00:20:10.080
And if you look back, starting in 1959
link |
00:20:12.360
is when the Perceptron was released.
link |
00:20:14.240
And this is basically one of the earliest neural networks.
link |
00:20:17.680
It was released to what was perceived
link |
00:20:19.220
as this massive overhype.
link |
00:20:20.820
So in the New York Times in 1959,
link |
00:20:22.320
you have this article saying that the Perceptron
link |
00:20:26.380
will one day recognize people, call out their names,
link |
00:20:29.160
instantly translate speech between languages.
link |
00:20:31.440
And people at the time looked at this and said,
link |
00:20:33.800
this is, your system can't do any of that.
link |
00:20:36.080
And basically spent 10 years trying to discredit
link |
00:20:38.060
the whole Perceptron direction and succeeded.
link |
00:20:40.600
And all the funding dried up.
link |
00:20:41.800
And people kind of went in other directions.
link |
00:20:44.960
And in the 80s, there was this resurgence.
link |
00:20:46.900
And I'd always heard that the resurgence in the 80s
link |
00:20:49.280
was due to the invention of backpropagation
link |
00:20:51.480
and these algorithms that got people excited.
link |
00:20:53.680
But actually the causality was due to people
link |
00:20:55.720
building larger computers.
link |
00:20:57.140
That you can find these articles from the 80s
link |
00:20:59.080
saying that the democratization of computing power
link |
00:21:01.720
suddenly meant that you could run
link |
00:21:02.660
these larger neural networks.
link |
00:21:04.000
And then people started to do all these amazing things.
link |
00:21:06.280
Backpropagation algorithm was invented.
link |
00:21:08.000
And the neural nets people were running
link |
00:21:10.100
were these tiny little 20 neuron neural nets.
link |
00:21:13.040
What are you supposed to learn with 20 neurons?
link |
00:21:15.160
And so of course, they weren't able to get great results.
link |
00:21:18.640
And it really wasn't until 2012 that this approach,
link |
00:21:21.940
that's almost the most simple, natural approach
link |
00:21:24.680
that people had come up with in the 50s,
link |
00:21:27.720
in some ways even in the 40s before there were computers,
link |
00:21:30.320
with the McCulloch–Pitts neuron,
link |
00:21:32.120
suddenly this became the best way of solving problems.
link |
00:21:37.460
And I think there are three core properties
link |
00:21:39.260
that deep learning has that I think
link |
00:21:42.080
are very worth paying attention to.
link |
00:21:44.100
The first is generality.
link |
00:21:45.900
We have a very small number of deep learning tools.
link |
00:21:48.700
SGD, deep neural net, maybe some RL.
link |
00:21:53.180
And it solves this huge variety of problems.
link |
00:21:55.580
Speech recognition, machine translation,
link |
00:21:57.220
game playing, all of these problems, small set of tools.
link |
00:22:00.980
So there's the generality.
link |
00:22:02.740
There's a second piece, which is the competence.
link |
00:22:04.980
You want to solve any of those problems?
link |
00:22:07.020
Throw out 40 years worth of normal computer vision research,
link |
00:22:10.620
replace it with a deep neural net,
link |
00:22:11.780
it's going to work better.
link |
00:22:13.580
And there's a third piece, which is the scalability.
link |
00:22:16.860
One thing that has been shown time and time again
link |
00:22:18.680
is that if you have a larger neural network,
link |
00:22:21.740
throw more compute, more data at it, it will work better.
link |
00:22:25.120
Those three properties together feel like essential parts
link |
00:22:28.860
of building a general intelligence.
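
As a concrete, if toy, illustration of the generality point: the identical small-neural-net-plus-SGD recipe can be applied to two unrelated problems just by swapping the dataset. The datasets and hyperparameters below are arbitrary choices for the sketch, not anything specific to OpenAI's work.

```python
# Illustrative sketch: the same tiny "deep learning recipe" (a neural net
# trained with SGD) applied, unchanged, to two unrelated problems.
from sklearn.datasets import load_digits, load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

def same_recipe(X, y, name):
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    scaler = StandardScaler().fit(X_train)
    # One generic tool: a small neural net optimized with SGD.
    net = MLPClassifier(hidden_layer_sizes=(64,), solver="sgd",
                        learning_rate_init=0.1, max_iter=500, random_state=0)
    net.fit(scaler.transform(X_train), y_train)
    print(name, "accuracy:", net.score(scaler.transform(X_test), y_test))

# Two very different domains, identical code path.
same_recipe(*load_digits(return_X_y=True), name="handwritten digits")
same_recipe(*load_breast_cancer(return_X_y=True), name="tumor diagnosis")
```

Scaled-up versions of that same recipe are what the competence and scalability points refer to.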
link |
00:22:30.820
Now it doesn't just mean that if we scale up what we have,
link |
00:22:33.800
that we will have an AGI, right?
link |
00:22:35.180
There are clearly missing pieces.
link |
00:22:36.780
There are missing ideas.
link |
00:22:38.020
We need to have answers for reasoning.
link |
00:22:40.000
But I think that the core here is that for the first time,
link |
00:22:44.780
it feels that we have a paradigm that gives us hope
link |
00:22:47.940
that general intelligence can be achievable.
link |
00:22:50.580
And so as soon as you believe that,
link |
00:22:52.140
everything else comes into focus, right?
link |
00:22:54.460
If you imagine that you may be able to,
link |
00:22:56.580
and you know that the timeline I think remains uncertain,
link |
00:22:59.820
but I think that certainly within our lifetimes
link |
00:23:02.220
and possibly within a much shorter period of time
link |
00:23:04.660
than people would expect,
link |
00:23:06.580
if you can really build the most transformative technology
link |
00:23:09.340
that will ever exist,
link |
00:23:10.660
you stop thinking about yourself so much, right?
link |
00:23:12.620
You start thinking about just like,
link |
00:23:14.220
how do you have a world where this goes well?
link |
00:23:16.440
And that you need to think about the practicalities
link |
00:23:18.180
of how do you build an organization
link |
00:23:19.540
and get together a bunch of people and resources
link |
00:23:22.020
and to make sure that people feel motivated
link |
00:23:25.140
and ready to do it.
link |
00:23:26.780
But I think that then you start thinking about,
link |
00:23:29.260
well, what if we succeed?
link |
00:23:30.580
And how do we make sure that when we succeed,
link |
00:23:32.740
that the world is actually the place
link |
00:23:34.020
that we want ourselves to exist in?
link |
00:23:36.780
And almost in the Rawlsian veil sense of the word.
link |
00:23:39.500
And so that's kind of the broader landscape.
link |
00:23:42.340
And OpenAI was really formed in 2015
link |
00:23:45.140
with that high level picture of AGI might be possible
link |
00:23:50.140
sooner than people think,
link |
00:23:51.380
and that we need to try to do our best
link |
00:23:54.420
to make sure it's going to go well.
link |
00:23:55.820
And then we spent the next couple of years
link |
00:23:57.740
really trying to figure out what does that mean?
link |
00:23:59.180
How do we do it?
link |
00:24:00.500
And I think that typically with a company,
link |
00:24:03.060
you start out very small, it's you and a cofounder,
link |
00:24:06.460
and you build a product, you get some users,
link |
00:24:07.900
you get a product market fit.
link |
00:24:09.540
Then at some point you raise some money,
link |
00:24:11.620
you hire people, you scale, and then down the road,
link |
00:24:14.940
then the big companies realize you exist
link |
00:24:16.420
and try to kill you.
link |
00:24:17.420
And for OpenAI, it was basically everything
link |
00:24:19.860
in exactly the opposite order.
link |
00:24:21.260
Let me just pause for a second, you said a lot of things.
link |
00:24:26.260
And let me just admire the jarring aspect
link |
00:24:29.740
of what OpenAI stands for, which is daring to dream.
link |
00:24:33.740
I mean, you said it's pretty powerful.
link |
00:24:35.620
It caught me off guard because I think that's very true.
link |
00:24:38.620
The step of just daring to dream about the possibilities
link |
00:24:43.620
of creating intelligence in a positive, in a safe way,
link |
00:24:47.180
but just even creating intelligence is a very powerful
link |
00:24:50.700
and much needed, refreshing catalyst for the AI community.
link |
00:24:57.460
So that's the starting point.
link |
00:24:58.860
Okay, so then formation of OpenAI, what's that?
link |
00:25:02.900
I would just say that when we were starting OpenAI,
link |
00:25:05.740
that kind of the first question that we had is,
link |
00:25:07.820
is it too late to start a lab
link |
00:25:10.380
with a bunch of the best people?
link |
00:25:12.060
Right, is that even possible? Wow, okay.
link |
00:25:13.220
That was an actual question?
link |
00:25:14.540
That was the core question of,
link |
00:25:17.340
we had this dinner in July of 2015,
link |
00:25:19.380
and that was really what we spent the whole time
link |
00:25:21.220
talking about.
link |
00:25:22.300
And, you know, because you think about kind of where AI was
link |
00:25:26.780
is that it had transitioned from being an academic pursuit
link |
00:25:30.180
to an industrial pursuit.
link |
00:25:32.220
And so a lot of the best people were in these big
link |
00:25:34.220
research labs and that we wanted to start our own one
link |
00:25:36.980
that no matter how much resources we could accumulate
link |
00:25:40.540
would be pale in comparison to the big tech companies.
link |
00:25:43.500
And we knew that.
link |
00:25:44.700
And it was a question of, are we going to be actually
link |
00:25:47.020
able to get this thing off the ground?
link |
00:25:48.700
You need critical mass.
link |
00:25:49.740
You can't just have you and a cofounder build a product.
link |
00:25:52.100
You really need to have a group of five to 10 people.
link |
00:25:55.580
And we kind of concluded it wasn't obviously impossible.
link |
00:25:59.460
So it seemed worth trying.
link |
00:26:02.220
Well, you're also a dreamer, so who knows, right?
link |
00:26:04.780
That's right.
link |
00:26:05.620
Okay, so speaking of that, competing with the big players,
link |
00:26:11.460
let's talk about some of the tricky things
link |
00:26:14.020
as you think through this process of growing,
link |
00:26:17.420
of seeing how you can develop these systems
link |
00:26:20.060
at a scale that competes.
link |
00:26:22.580
So you recently formed OpenAI LP,
link |
00:26:26.540
a new capped-profit company that now carries the name OpenAI.
link |
00:26:30.780
So OpenAI is now this official company.
link |
00:26:33.260
The original nonprofit company still exists
link |
00:26:36.500
and carries the OpenAI nonprofit name.
link |
00:26:39.740
So can you explain what this company is,
link |
00:26:41.940
what the purpose of this creation is,
link |
00:26:44.220
and how did you arrive at the decision to create it?
link |
00:26:48.740
OpenAI, the whole entity and OpenAI LP as a vehicle
link |
00:26:53.220
is trying to accomplish the mission
link |
00:26:55.500
of ensuring that artificial general intelligence
link |
00:26:57.460
benefits everyone.
link |
00:26:58.740
And the main way that we're trying to do that
link |
00:27:00.180
is by actually trying to build general intelligence
link |
00:27:02.500
ourselves and make sure the benefits
link |
00:27:04.140
are distributed to the world.
link |
00:27:05.860
That's the primary way.
link |
00:27:07.100
We're also fine if someone else does this, right?
link |
00:27:09.540
Doesn't have to be us.
link |
00:27:10.580
If someone else is going to build an AGI
link |
00:27:12.540
and make sure that the benefits don't get locked up
link |
00:27:14.740
in one company or with one set of people,
link |
00:27:19.220
like we're actually fine with that.
link |
00:27:21.100
And so those ideas are baked into our charter,
link |
00:27:25.340
which is kind of the foundational document
link |
00:27:28.340
that describes kind of our values and how we operate.
link |
00:27:32.780
But it's also really baked into the structure of OpenAI LP.
link |
00:27:36.300
And so the way that we've set up OpenAI LP
link |
00:27:37.900
is that in the case where we succeed, right?
link |
00:27:42.100
If we actually build what we're trying to build,
link |
00:27:45.260
then investors are able to get a return,
link |
00:27:48.300
but that return is something that is capped.
link |
00:27:50.300
And so if you think of AGI in terms of the value
link |
00:27:52.940
that you could really create,
link |
00:27:54.100
you're talking about the most transformative technology
link |
00:27:56.260
ever created, it's going to create orders of magnitude
link |
00:27:58.780
more value than any existing company.
link |
00:28:01.820
And that all of that value will be owned by the world,
link |
00:28:05.900
like legally titled to the nonprofit
link |
00:28:07.820
to fulfill that mission.
link |
00:28:09.500
And so that's the structure.
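
To make the capped-return mechanics concrete, here is a deliberately simplified toy calculation. The only grounded number is the publicly announced 100x return cap for OpenAI LP's first-round investors; the invested amount, the value created, and the single-pool split below are hypothetical simplifications, not the actual legal mechanics.

```python
# Toy model of a capped-profit split. Hypothetical numbers, except the 100x
# first-round cap, which OpenAI announced publicly; real mechanics are more involved.
def split_value(total_value_created, invested, cap_multiple=100):
    """Return (value going to investors/employees, value owned by the nonprofit)."""
    investor_cap = invested * cap_multiple
    to_investors = min(total_value_created, investor_cap)
    to_nonprofit = total_value_created - to_investors
    return to_investors, to_nonprofit

# If $1B were invested and AGI-scale value (say $10T) were created, investors
# would receive at most $100B and the remaining ~$9.9T would be legally titled
# to the nonprofit to fulfill the mission.
print(split_value(total_value_created=10_000e9, invested=1e9))
# -> (100000000000.0, 9900000000000.0)
```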
link |
00:28:12.740
So the mission is a powerful one,
link |
00:28:15.140
and it's one that I think most people would agree with.
link |
00:28:18.860
It's how we would hope AI progresses.
link |
00:28:22.900
And so how do you tie yourself to that mission?
link |
00:28:25.340
How do you make sure you do not deviate from that mission,
link |
00:28:29.180
that other incentives that are profit driven
link |
00:28:35.260
don't interfere with the mission?
link |
00:28:36.740
So this was actually a really core question for us
link |
00:28:39.540
for the past couple of years,
link |
00:28:40.900
because I'd say that like the way that our history went
link |
00:28:43.540
was that for the first year,
link |
00:28:44.920
we were getting off the ground, right?
link |
00:28:46.200
We had this high level picture,
link |
00:28:47.900
but we didn't know exactly how we wanted to accomplish it.
link |
00:28:51.860
And really two years ago is when we first started realizing
link |
00:28:55.020
in order to build AGI,
link |
00:28:56.140
we're just going to need to raise way more money
link |
00:28:58.700
than we can as a nonprofit.
link |
00:29:00.180
And we're talking many billions of dollars.
link |
00:29:02.820
And so the first question is how are you supposed to do that
link |
00:29:06.860
and stay true to this mission?
link |
00:29:08.700
And we looked at every legal structure out there
link |
00:29:10.580
and concluded none of them were quite right
link |
00:29:11.940
for what we wanted to do.
link |
00:29:13.380
And I guess it shouldn't be too surprising
link |
00:29:14.580
if you're gonna do some like crazy unprecedented technology
link |
00:29:16.920
that you're gonna have to come with
link |
00:29:17.980
some crazy unprecedented structure to do it in.
link |
00:29:20.340
And a lot of our conversation was with people at OpenAI,
link |
00:29:26.140
the people who really joined
link |
00:29:27.220
because they believe so much in this mission
link |
00:29:29.100
and thinking about how do we actually
link |
00:29:31.260
raise the resources to do it
link |
00:29:33.020
and also stay true to what we stand for.
link |
00:29:35.900
And the place you gotta start is to really align
link |
00:29:37.940
on what is it that we stand for, right?
link |
00:29:39.540
What are those values?
link |
00:29:40.500
What's really important to us?
link |
00:29:41.820
And so I'd say that we spent about a year
link |
00:29:43.740
really compiling the OpenAI charter
link |
00:29:46.220
and that determines,
link |
00:29:47.540
and if you even look at the first line item in there,
link |
00:29:50.220
it says that, look, we expect we're gonna have to marshal
link |
00:29:52.340
huge amounts of resources,
link |
00:29:53.740
but we're going to make sure that we minimize
link |
00:29:55.720
conflict of interest with the mission.
link |
00:29:57.620
And that kind of aligning on all of those pieces
link |
00:30:00.700
was the most important step towards figuring out
link |
00:30:04.180
how do we structure a company
link |
00:30:06.020
that can actually raise the resources
link |
00:30:08.200
to do what we need to do.
link |
00:30:10.300
I imagine OpenAI, the decision to create OpenAI LP
link |
00:30:14.740
was a really difficult one.
link |
00:30:16.340
And there was a lot of discussions,
link |
00:30:17.900
as you mentioned, for a year,
link |
00:30:19.600
and there was different ideas,
link |
00:30:22.740
perhaps detractors within OpenAI,
link |
00:30:26.100
sort of different paths that you could have taken.
link |
00:30:28.900
What were those concerns?
link |
00:30:30.220
What were the different paths considered?
link |
00:30:32.020
What was that process of making that decision like?
link |
00:30:34.100
Yep, so if you look actually at the OpenAI charter,
link |
00:30:37.720
there's almost two paths embedded within it.
link |
00:30:40.900
There is, we are primarily trying to build AGI ourselves,
link |
00:30:44.900
but we're also okay if someone else does it.
link |
00:30:47.340
And this is a weird thing for a company.
link |
00:30:49.020
It's really interesting, actually.
link |
00:30:51.140
There is an element of competition
link |
00:30:53.260
that you do wanna be the one that does it,
link |
00:30:56.660
but at the same time, you're okay if somebody else does.
link |
00:30:59.020
We'll talk about that a little bit, that trade off,
link |
00:31:01.380
that dance that's really interesting.
link |
00:31:02.940
And I think this was the core tension
link |
00:31:04.600
as we were designing OpenAI LP,
link |
00:31:06.380
and really the OpenAI strategy,
link |
00:31:08.260
is how do you make sure that both you have a shot
link |
00:31:11.080
at being a primary actor,
link |
00:31:12.660
which really requires building an organization,
link |
00:31:15.820
raising massive resources,
link |
00:31:17.700
and really having the will to go
link |
00:31:19.420
and execute on some really, really hard vision, right?
link |
00:31:22.040
You need to really sign up for a long period
link |
00:31:23.800
to go and take on a lot of pain and a lot of risk.
link |
00:31:27.160
And to do that, normally you just import
link |
00:31:30.420
the startup mindset, right?
link |
00:31:31.780
And that you think about, okay,
link |
00:31:32.820
like how do we out execute everyone?
link |
00:31:34.300
You have this very competitive angle.
link |
00:31:36.220
But you also have the second angle of saying that,
link |
00:31:38.180
well, the true mission isn't for OpenAI to build AGI.
link |
00:31:41.660
The true mission is for AGI to go well for humanity.
link |
00:31:45.140
And so how do you take all of those first actions
link |
00:31:48.140
and make sure you don't close the door on outcomes
link |
00:31:51.380
that would actually be positive and fulfill the mission?
link |
00:31:54.460
And so I think it's a very delicate balance, right?
link |
00:31:56.700
And I think that going 100% one direction or the other
link |
00:31:59.620
is clearly not the correct answer.
link |
00:32:01.340
And so I think that even in terms of just how we talk
link |
00:32:03.700
about OpenAI and think about it,
link |
00:32:05.440
there's just like one thing that's always in the back
link |
00:32:07.980
of my mind is to make sure that we're not just saying
link |
00:32:11.260
OpenAI's goal is to build AGI, right?
link |
00:32:14.020
That it's actually much broader than that, right?
link |
00:32:15.580
That first of all, it's not just AGI,
link |
00:32:18.260
it's safe AGI that's very important.
link |
00:32:20.340
But secondly, our goal isn't to be the ones to build it.
link |
00:32:23.100
Our goal is to make sure it goes well for the world.
link |
00:32:24.700
And so I think that figuring out
link |
00:32:26.100
how do you balance all of those
link |
00:32:27.900
and to get people to really come to the table
link |
00:32:30.220
and compile a single document that encompasses all of that
link |
00:32:36.340
wasn't trivial.
link |
00:32:37.540
So part of the challenge here is your mission is,
link |
00:32:41.640
I would say, beautiful, empowering,
link |
00:32:44.220
and a beacon of hope for people in the research community
link |
00:32:47.500
and just people thinking about AI.
link |
00:32:49.180
So your decisions are scrutinized more than,
link |
00:32:53.140
I think, a regular profit driven company.
link |
00:32:55.900
Do you feel the burden of this
link |
00:32:57.380
in the creation of the charter
link |
00:32:58.540
and just in the way you operate?
link |
00:33:00.160
Yes.
link |
00:33:01.000
So why do you lean into the burden
link |
00:33:07.020
by creating such a charter?
link |
00:33:08.660
Why not keep it quiet?
link |
00:33:10.420
I mean, it just boils down to the mission, right?
link |
00:33:12.900
Like I'm here and everyone else is here
link |
00:33:15.180
because we think this is the most important mission.
link |
00:33:17.380
Dare to dream.
link |
00:33:18.980
All right, so do you think you can be good for the world
link |
00:33:23.340
or create an AGI system that's good
link |
00:33:25.980
when you're a for profit company?
link |
00:33:28.320
From my perspective, I don't understand
link |
00:33:30.660
why profit interferes with positive impact on society.
link |
00:33:37.620
I don't understand why Google,
link |
00:33:40.740
that makes most of its money from ads,
link |
00:33:42.900
can't also do good for the world
link |
00:33:45.020
or other companies, Facebook, anything.
link |
00:33:47.500
I don't understand why those have to interfere.
link |
00:33:50.220
You know, profit isn't the thing, in my view,
link |
00:33:55.100
that affects the impact of a company.
link |
00:33:57.200
What affects the impact of the company is the charter,
link |
00:34:00.340
is the culture, is the people inside,
link |
00:34:04.140
and profit is the thing that just fuels those people.
link |
00:34:07.100
So what are your views there?
link |
00:34:08.740
Yeah, so I think that's a really good question
link |
00:34:10.900
and there's some real longstanding debates
link |
00:34:14.180
in human society that are wrapped up in it.
link |
00:34:16.460
The way that I think about it is just think about
link |
00:34:18.640
what are the most impactful non profits in the world?
link |
00:34:23.980
What are the most impactful for profits in the world?
link |
00:34:26.780
Right, it's much easier to list the for profits.
link |
00:34:29.260
That's right, and I think that there's some real truth here
link |
00:34:32.420
that the system that we set up,
link |
00:34:34.600
the system for kind of how today's world is organized,
link |
00:34:38.320
is one that really allows for huge impact.
link |
00:34:41.300
And that kind of part of that is that you need to be,
link |
00:34:45.140
that for profits are self sustaining
link |
00:34:48.060
and able to kind of build on their own momentum.
link |
00:34:51.180
And I think that's a really powerful thing.
link |
00:34:53.060
It's something that when it turns out
link |
00:34:55.860
that we haven't set the guardrails correctly,
link |
00:34:57.900
causes problems, right?
link |
00:34:58.820
Think about logging companies that go into forests,
link |
00:35:01.600
the rainforest, that's really bad, we don't want that.
link |
00:35:04.680
And it's actually really interesting to me
link |
00:35:06.500
that kind of this question of how do you get
link |
00:35:08.940
positive benefits out of a for profit company,
link |
00:35:11.380
it's actually very similar to how do you get
link |
00:35:13.020
positive benefits out of an AGI, right?
link |
00:35:15.800
That you have this like very powerful system,
link |
00:35:17.980
it's more powerful than any human,
link |
00:35:19.700
and is kind of autonomous in some ways,
link |
00:35:21.860
it's superhuman in a lot of axes,
link |
00:35:23.740
and somehow you have to set the guardrails
link |
00:35:25.420
to get good things to happen.
link |
00:35:26.820
But when you do, the benefits are massive.
link |
00:35:29.380
And so I think that when I think about
link |
00:35:32.500
nonprofit versus for profit,
link |
00:35:34.420
I think just not enough happens in nonprofits,
link |
00:35:36.760
they're very pure, but it's just kind of,
link |
00:35:39.180
it's just hard to do things there.
link |
00:35:40.860
In for profits in some ways, like too much happens,
link |
00:35:43.980
but if kind of shaped in the right way,
link |
00:35:46.460
it can actually be very positive.
link |
00:35:47.820
And so with OpenAI LP, we're picking a road in between.
link |
00:35:52.100
Now the thing that I think is really important to recognize
link |
00:35:54.820
is that the way that we think about OpenAI LP
link |
00:35:57.140
is that in the world where AGI actually happens, right,
link |
00:36:00.420
in a world where we are successful,
link |
00:36:01.660
we build the most transformative technology ever,
link |
00:36:03.760
the amount of value we're gonna create will be astronomical.
link |
00:36:07.580
And so then in that case, that the cap that we have
link |
00:36:12.760
will be a small fraction of the value we create,
link |
00:36:15.540
and the amount of value that goes back to investors
link |
00:36:17.800
and employees looks pretty similar to what would happen
link |
00:36:20.020
in a pretty successful startup.
link |
00:36:23.780
And that's really the case that we're optimizing for, right?
link |
00:36:26.580
That we're thinking about in the success case,
link |
00:36:28.600
making sure that the value we create doesn't get locked up.
link |
00:36:32.220
And I expect that in other for profit companies
link |
00:36:34.980
that it's possible to do something like that.
link |
00:36:37.860
I think it's not obvious how to do it, right?
link |
00:36:39.780
I think that as a for profit company,
link |
00:36:41.500
you have a lot of fiduciary duty to your shareholders
link |
00:36:44.300
and that there are certain decisions
link |
00:36:45.700
that you just cannot make.
link |
00:36:47.560
In our structure, we've set it up
link |
00:36:49.140
so that we have a fiduciary duty to the charter,
link |
00:36:52.500
that we always get to make the decision
link |
00:36:54.460
that is right for the charter,
link |
00:36:57.460
even if it comes at the expense of our own stakeholders.
link |
00:37:00.700
And so I think that when I think about
link |
00:37:03.420
what's really important,
link |
00:37:04.380
it's not really about nonprofit versus for profit,
link |
00:37:06.300
it's really a question of if you build AGI
link |
00:37:09.620
and you kind of, humanity's now in this new age,
link |
00:37:13.100
who benefits, whose lives are better?
link |
00:37:15.780
And I think that what's really important
link |
00:37:17.180
is to have an answer that is everyone.
link |
00:37:20.340
Yeah, which is one of the core aspects of the charter.
link |
00:37:23.380
So one concern people have, not just with OpenAI,
link |
00:37:26.540
but with Google, Facebook, Amazon,
link |
00:37:28.420
anybody really that's creating impact at scale
link |
00:37:35.020
is how do we avoid, as your charter says,
link |
00:37:37.680
avoid enabling the use of AI or AGI
link |
00:37:40.100
to unduly concentrate power?
link |
00:37:43.660
Why would not a company like OpenAI
link |
00:37:45.940
keep all the power of an AGI system to itself?
link |
00:37:48.660
The charter.
link |
00:37:49.540
The charter.
link |
00:37:50.380
So how does the charter
link |
00:37:53.140
actualize itself in day to day?
link |
00:37:57.260
So I think that first, to zoom out,
link |
00:38:00.580
that the way that we structure the company
link |
00:38:01.860
is so that the power for sort of dictating the actions
link |
00:38:05.560
that OpenAI takes ultimately rests with the board,
link |
00:38:08.600
the board of the nonprofit.
link |
00:38:11.020
And the board is set up in certain ways
link |
00:38:12.300
with certain restrictions that you can read about
link |
00:38:14.260
in the OpenAI LP blog post.
link |
00:38:16.300
But effectively the board is the governing body
link |
00:38:19.220
for OpenAI LP.
link |
00:38:21.260
And the board has a duty to fulfill the mission
link |
00:38:24.440
of the nonprofit.
link |
00:38:26.420
And so that's kind of how we tie,
link |
00:38:28.820
how we thread all these things together.
link |
00:38:30.980
Now there's a question of, so day to day,
link |
00:38:32.900
how do people, the individuals,
link |
00:38:34.820
who in some ways are the most empowered ones, right?
link |
00:38:36.980
Now the board sort of gets to call the shots
link |
00:38:38.820
at the high level, but the people
link |
00:38:40.540
who are actually executing are the employees, right?
link |
00:38:43.140
People here on a day to day basis
link |
00:38:44.820
who have the keys to the whole technical kingdom.
link |
00:38:48.940
And there I think that the answer looks a lot like,
link |
00:38:51.700
well, how does any company's values get actualized, right?
link |
00:38:55.080
And I think that a lot of that comes down to
link |
00:38:56.680
that you need people who are here
link |
00:38:58.120
because they really believe in that mission
link |
00:39:01.300
and they believe in the charter
link |
00:39:02.780
and that they are willing to take actions
link |
00:39:05.420
that maybe are worse for them,
link |
00:39:07.060
but are better for the charter.
link |
00:39:08.580
And that's something that's really baked into the culture.
link |
00:39:11.420
And honestly, I think it's, you know,
link |
00:39:13.180
I think that that's one of the things
link |
00:39:14.540
that we really have to work to preserve as time goes on.
link |
00:39:18.140
And that's a really important part
link |
00:39:19.740
of how we think about hiring people
link |
00:39:21.620
and bringing people into OpenAI.
link |
00:39:23.020
So there's people here
link |
00:39:25.280
who could speak up and say, like, hold on a second,
link |
00:39:30.820
this is totally against what we stand for, culture wise.
link |
00:39:34.540
Yeah, yeah, for sure.
link |
00:39:35.380
I mean, I think that we actually have,
link |
00:39:37.060
I think that's like a pretty important part
link |
00:39:38.720
of how we operate and how we have,
link |
00:39:41.900
even again with designing the charter
link |
00:39:44.180
and designing OpenAI LP in the first place,
link |
00:39:46.700
that there has been a lot of conversation
link |
00:39:48.740
with employees here and a lot of times
link |
00:39:50.500
where employees said, wait a second,
link |
00:39:52.400
this seems like it's going in the wrong direction
link |
00:39:53.940
and let's talk about it.
link |
00:39:55.140
And so,
link |
00:39:57.380
you know, here's actually one thing
link |
00:39:58.900
that I think is very unique about us as a small company,
link |
00:40:02.140
is that if you're at a massive tech giant,
link |
00:40:04.400
that's a little bit hard for someone
link |
00:40:05.720
who's a line employee to go and talk to the CEO
link |
00:40:08.140
and say, I think that we're doing this wrong.
link |
00:40:10.900
And you know, you'll get companies like Google
link |
00:40:13.060
that have had some collective action from employees
link |
00:40:15.740
to make ethical change around things like Maven.
link |
00:40:19.420
And so maybe there are mechanisms
link |
00:40:20.700
at other companies that work.
link |
00:40:22.260
But here, super easy for anyone to pull me aside,
link |
00:40:24.500
to pull Sam aside, to pull Ilya aside,
link |
00:40:26.340
and people do it all the time.
link |
00:40:27.780
One of the interesting things in the charter
link |
00:40:29.820
is this idea that it'd be great
link |
00:40:31.660
if you could try to describe or untangle
link |
00:40:34.260
switching from competition to collaboration
link |
00:40:36.460
in late stage AGI development.
link |
00:40:38.820
It's really interesting,
link |
00:40:39.780
this dance between competition and collaboration.
link |
00:40:42.180
How do you think about that?
link |
00:40:43.420
Yeah, assuming that you can actually do
link |
00:40:45.020
the technical side of AGI development,
link |
00:40:47.060
I think there's going to be two key problems
link |
00:40:48.980
with figuring out how do you actually deploy it,
link |
00:40:50.460
make it go well.
link |
00:40:51.540
The first one of these is the run up
link |
00:40:53.180
to building the first AGI.
link |
00:40:56.380
You look at how self driving cars are being developed,
link |
00:40:58.940
and it's a competitive race.
link |
00:41:00.700
And the thing that always happens in a competitive race
link |
00:41:02.580
is that you have huge amounts of pressure
link |
00:41:04.200
to get rid of safety.
link |
00:41:06.700
And so that's one thing we're very concerned about,
link |
00:41:08.940
is that multiple teams figure out,
link |
00:41:12.020
we can actually get there,
link |
00:41:13.620
but if we took the slower path
link |
00:41:16.740
that is more guaranteed to be safe, we will lose.
link |
00:41:20.300
And so we're going to take the fast path.
link |
00:41:22.380
And so the more that we can both ourselves
link |
00:41:25.520
be in a position where we don't generate
link |
00:41:27.300
that competitive race, where we say,
link |
00:41:29.040
if the race is being run and that someone else
link |
00:41:31.540
is further ahead than we are,
link |
00:41:33.340
we're not going to try to leapfrog.
link |
00:41:35.640
We're going to actually work with them, right?
link |
00:41:37.220
We will help them succeed.
link |
00:41:38.840
As long as what they're trying to do
link |
00:41:40.460
is to fulfill our mission, then we're good.
link |
00:41:42.940
We don't have to build AGI ourselves.
link |
00:41:44.860
And I think that's a really important commitment from us,
link |
00:41:47.100
but it can't just be unilateral, right?
link |
00:41:49.100
I think that it's really important that other players
link |
00:41:51.420
who are serious about building AGI
link |
00:41:53.140
make similar commitments, right?
link |
00:41:54.700
I think that, again, to the extent that everyone believes
link |
00:41:57.820
that AGI should be something to benefit everyone,
link |
00:42:00.060
then it actually really shouldn't matter
link |
00:42:01.220
which company builds it.
link |
00:42:02.460
And we should all be concerned about the case
link |
00:42:04.140
where we just race so hard to get there
link |
00:42:06.060
that something goes wrong.
link |
00:42:07.620
So what role do you think government,
link |
00:42:10.540
our favorite entity, has in setting policy and rules
link |
00:42:13.820
about this domain, from research to the development
link |
00:42:18.300
to early stage to late stage AI and AGI development?
link |
00:42:22.900
So I think that, first of all,
link |
00:42:25.660
it's really important that government's in there, right?
link |
00:42:28.100
In some way, shape, or form.
link |
00:42:29.820
At the end of the day, we're talking about
link |
00:42:30.940
building technology that will shape how the world operates,
link |
00:42:35.140
and that there needs to be government
link |
00:42:37.300
as part of that answer.
link |
00:42:39.100
And so that's why we've done a number
link |
00:42:42.220
of different congressional testimonies,
link |
00:42:43.660
we interact with a number of different lawmakers,
link |
00:42:46.300
and that right now, a lot of our message to them
link |
00:42:50.060
is that it's not the time for regulation,
link |
00:42:54.380
it is the time for measurement, right?
link |
00:42:56.420
That our main policy recommendation is that people,
link |
00:42:59.100
and the government does this all the time
link |
00:43:00.700
with bodies like NIST, spend time trying to figure out
link |
00:43:04.900
just where the technology is, how fast it's moving,
link |
00:43:07.940
and can really become literate and up to speed
link |
00:43:11.220
with respect to what to expect.
link |
00:43:13.500
So I think that today, the answer really
link |
00:43:15.260
is about measurement, and I think that there will be a time
link |
00:43:19.260
and place where that will change.
link |
00:43:21.740
And I think it's a little bit hard to predict
link |
00:43:23.820
exactly what that trajectory should look like.
link |
00:43:27.140
So there will be a point at which regulation,
link |
00:43:31.060
federal in the United States, the government steps in
link |
00:43:34.220
and helps be the, I don't wanna say the adult in the room,
link |
00:43:39.500
to make sure that there are strict rules,
link |
00:43:42.420
maybe conservative rules that nobody can cross.
link |
00:43:45.260
Well, I think there's kind of maybe two angles to it.
link |
00:43:47.440
So today, with narrow AI applications
link |
00:43:49.820
that I think there are already existing bodies
link |
00:43:51.980
that are responsible and should be responsible
link |
00:43:53.980
for regulation, you think about, for example,
link |
00:43:55.880
with self driving cars, that you want the National Highway Traffic Safety Administration.
link |
00:44:00.340
NHTSA.
link |
00:44:01.180
Yeah, exactly, to be regulating that.
link |
00:44:02.980
That makes sense, right, that basically what we're saying
link |
00:44:04.980
is that we're going to have these technological systems
link |
00:44:08.160
that are going to be performing applications
link |
00:44:10.640
that humans already do, great.
link |
00:44:12.740
We already have ways of thinking about standards
link |
00:44:14.820
and safety for those.
link |
00:44:16.140
So I think actually empowering those regulators today
link |
00:44:18.860
is also pretty important.
link |
00:44:20.020
And then I think for AGI, that there's going to be a point
link |
00:44:24.780
where we'll have better answers.
link |
00:44:26.000
And I think that maybe a similar approach
link |
00:44:27.580
of first measurement and start thinking about
link |
00:44:30.500
what the rules should be.
link |
00:44:31.620
I think it's really important
link |
00:44:32.580
that we don't prematurely squash progress.
link |
00:44:36.260
I think it's very easy to kind of smother a budding field.
link |
00:44:40.140
And I think that's something to really avoid.
link |
00:44:42.120
But I don't think that the right way of doing it
link |
00:44:43.740
is to say, let's just try to blaze ahead
link |
00:44:46.900
and not involve all these other stakeholders.
link |
00:44:50.260
So you recently released a paper on GPT2 language modeling,
link |
00:44:58.820
but did not release the full model
link |
00:45:02.020
because you had concerns about the possible
link |
00:45:04.380
negative effects of the availability of such model.
link |
00:45:07.480
And outside of just that decision,
link |
00:45:10.700
it's super interesting because of the discussion
link |
00:45:14.340
at a societal level, the discourse it creates.
link |
00:45:16.980
So it's fascinating in that aspect.
link |
00:45:19.260
But let's stick to the specifics here at first,
link |
00:45:22.860
what are some negative effects that you envisioned?
link |
00:45:25.860
And of course, what are some of the positive effects?
link |
00:45:28.540
Yeah, so again, I think to zoom out,
link |
00:45:30.780
the way that we thought about GPT2
link |
00:45:33.980
is that with language modeling,
link |
00:45:35.760
we are clearly on a trajectory right now
link |
00:45:38.520
where we scale up our models
link |
00:45:40.860
and we get qualitatively better performance.
link |
00:45:44.440
GPT2 itself was actually just a scale up
link |
00:45:47.340
of a model that we'd released the previous June.
link |
00:45:50.660
We just ran it at much larger scale
link |
00:45:52.860
and we got these results where
link |
00:45:54.300
it suddenly started writing coherent prose,
link |
00:45:57.020
which was not something we'd seen previously.
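As a concrete aside, here is a minimal sketch of the next-token prediction objective that language models like GPT2 are trained on, assuming PyTorch; the embedding-plus-linear model and the toy sizes are stand-ins for illustration, not OpenAI's actual code.

```python
# Minimal sketch of the language modeling objective: given the tokens so far,
# maximize the probability of the next token. The tiny model here is a
# placeholder for a real transformer.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32          # toy sizes, purely illustrative
embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)  # stands in for a full transformer stack

tokens = torch.randint(0, vocab_size, (1, 16))   # a fake token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one position

logits = head(embed(inputs))                     # (1, 15, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # scaling up keeps this objective: bigger model, more data
```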
link |
00:46:00.020
And what are we doing now?
link |
00:46:01.300
Well, we're gonna scale up GPT2 by 10x, by 100x, by 1000x,
link |
00:46:05.740
and we don't know what we're gonna get.
link |
00:46:07.820
And so it's very clear that the model
link |
00:46:10.080
that we released last June,
link |
00:46:12.820
I think it's kind of like, it's a good academic toy.
link |
00:46:16.420
It's not something that we think
link |
00:46:18.900
can really have negative applications
link |
00:46:20.420
or to the extent that it can,
link |
00:46:21.660
that the positive of people being able to play with it
link |
00:46:24.340
far outweighs the possible harms.
link |
00:46:28.260
You fast forward to not GPT2, but GPT20,
link |
00:46:32.580
and you think about what that's gonna be like.
link |
00:46:34.680
And I think that the capabilities are going to be substantive.
link |
00:46:38.180
And so there needs to be a point in between the two
link |
00:46:41.100
where you say, this is something
link |
00:46:43.460
where we are drawing the line
link |
00:46:45.140
and that we need to start thinking about the safety aspects.
link |
00:46:47.940
And I think for GPT2, we could have gone either way.
link |
00:46:50.140
And in fact, when we had conversations internally
link |
00:46:52.700
that we had a bunch of pros and cons,
link |
00:46:54.740
and it wasn't clear which one outweighed the other.
link |
00:46:58.140
And I think that when we announced that,
link |
00:46:59.940
hey, we decided not to release this model,
link |
00:47:02.140
then there was a bunch of conversation
link |
00:47:03.560
where various people said,
link |
00:47:04.420
it's so obvious that you should have just released it.
link |
00:47:06.340
There were other people who said,
link |
00:47:07.180
it's so obvious you should not have released it.
link |
00:47:08.820
And I think that that almost definitionally means
link |
00:47:10.940
that holding it back was the correct decision.
link |
00:47:13.580
Right, if it's not obvious
link |
00:47:15.900
whether something is beneficial or not,
link |
00:47:17.620
you should probably default to caution.
link |
00:47:19.700
And so I think that the overall landscape
link |
00:47:22.420
for how we think about it
link |
00:47:23.700
is that this decision could have gone either way.
link |
00:47:25.900
There are great arguments in both directions,
link |
00:47:27.940
but for future models down the road
link |
00:47:30.060
and possibly sooner than you'd expect,
link |
00:47:32.300
because scaling these things up
link |
00:47:33.460
doesn't actually take that long,
link |
00:47:35.660
those ones you're definitely not going to want
link |
00:47:37.900
to release into the wild.
link |
00:47:39.560
And so I think that we almost view this as a test case
link |
00:47:42.600
and to see, can we even design,
link |
00:47:45.140
you know, how do you have a society
link |
00:47:46.580
or how do you have a system
link |
00:47:47.940
that goes from having no concept
link |
00:47:49.220
of responsible disclosure,
link |
00:47:50.500
where the mere idea of not releasing something
link |
00:47:53.400
for safety reasons is unfamiliar
link |
00:47:55.940
to a world where you say, okay, we have a powerful model,
link |
00:47:58.680
let's at least think about it,
link |
00:47:59.660
let's go through some process.
link |
00:48:01.220
And you think about the security community,
link |
00:48:02.660
it took them a long time
link |
00:48:03.860
to design responsible disclosure, right?
link |
00:48:05.660
You know, you think about this question of,
link |
00:48:07.160
well, I have a security exploit,
link |
00:48:08.740
I send it to the company,
link |
00:48:09.720
the company is like, tries to prosecute me
link |
00:48:11.980
or just ignores it, what do I do, right?
link |
00:48:16.020
And so, you know, the alternatives of,
link |
00:48:17.300
oh, I just always publish my exploits,
link |
00:48:19.060
that doesn't seem good either, right?
link |
00:48:20.180
And so it really took a long time
link |
00:48:21.580
and it was bigger than any individual, right?
link |
00:48:25.300
It's really about building a whole community
link |
00:48:27.060
that believes that, okay, we'll have this process
link |
00:48:28.740
where you send it to the company, you know,
link |
00:48:30.140
if they don't act in a certain time,
link |
00:48:31.660
then you can go public and you're not a bad person,
link |
00:48:34.420
you've done the right thing.
link |
00:48:36.220
And I think that in AI,
link |
00:48:38.620
part of the response to GPT2 just proves
link |
00:48:41.380
that we don't have any concept of this.
link |
00:48:44.140
So that's the high level picture.
link |
00:48:47.060
And so I think that,
link |
00:48:48.660
I think this was a really important move to make
link |
00:48:51.220
and we could have maybe delayed it for GPT3,
link |
00:48:53.980
but I'm really glad we did it for GPT2.
link |
00:48:56.020
And so now you look at GPT2 itself
link |
00:48:57.740
and you think about the substance of, okay,
link |
00:48:59.420
what are potential negative applications?
link |
00:49:01.300
So you have this model that's been trained on the internet,
link |
00:49:04.100
which, you know, it's also going to be
link |
00:49:05.340
a bunch of very biased data,
link |
00:49:06.500
a bunch of, you know, very offensive content in there,
link |
00:49:09.580
and you can ask it to generate content for you
link |
00:49:13.180
on basically any topic, right?
link |
00:49:14.540
You just give it a prompt and it'll just start writing
link |
00:49:16.700
and it writes content like you see on the internet,
link |
00:49:19.060
you know, even down to like saying advertisement
link |
00:49:21.820
in the middle of some of its generations.
link |
00:49:24.140
And you think about the possibilities
link |
00:49:26.140
for generating fake news or abusive content.
link |
00:49:29.220
And, you know, it's interesting seeing
link |
00:49:30.300
what people have done with, you know,
link |
00:49:31.820
we released a smaller version of GPT2
link |
00:49:34.340
and the people have done things like try to generate,
link |
00:49:37.460
you know, take my own Facebook message history
link |
00:49:40.700
and generate more Facebook messages like me
link |
00:49:43.340
and people generating fake politician content
link |
00:49:47.340
or, you know, there's a bunch of things there
link |
00:49:49.500
where you at least have to think,
link |
00:49:51.860
is this going to be good for the world?
link |
00:49:54.700
There's the flip side, which is I think
link |
00:49:56.300
that there's a lot of awesome applications
link |
00:49:57.780
that we really want to see,
link |
00:49:59.340
like creative applications in terms of
link |
00:50:02.380
if you have sci fi authors that can work with this tool
link |
00:50:05.340
and come up with cool ideas, like that seems awesome
link |
00:50:08.580
if we can write better sci fi through the use of these tools
link |
00:50:11.340
and we've actually had a bunch of people write into us
link |
00:50:13.020
asking, hey, can we use it for, you know,
link |
00:50:16.060
a variety of different creative applications?
link |
00:50:18.300
So the positive are actually pretty easy to imagine.
link |
00:50:21.780
They're, you know, the usual NLP applications
link |
00:50:26.820
are really interesting, but let's go there.
link |
00:50:30.860
It's kind of interesting to think about a world
link |
00:50:32.860
where, look at Twitter, where not just fake news,
link |
00:50:37.860
but smarter and smarter bots being able to spread
link |
00:50:42.980
information in an interesting, complex, networked way
link |
00:50:47.300
that just floods out us regular human beings
link |
00:50:50.700
with our original thoughts.
link |
00:50:52.780
So what are your views of this world with GPT20, right?
link |
00:51:00.180
How do we think about it?
link |
00:51:01.220
Again, it's like one of those things about in the 50s
link |
00:51:03.540
trying to describe the internet or the smartphone.
link |
00:51:08.700
What do you think about that world,
link |
00:51:09.940
the nature of information?
link |
00:51:12.900
One possibility is that we'll always try to design systems
link |
00:51:16.780
that identify robot versus human
link |
00:51:19.660
and we'll do so successfully and so we'll authenticate
link |
00:51:23.340
that we're still human and the other world is that
link |
00:51:25.700
we just accept the fact that we're swimming in a sea
link |
00:51:29.020
of fake news and just learn to swim there.
link |
00:51:32.220
Well, have you ever seen the popular meme of a robot
link |
00:51:39.860
with a physical arm and pen clicking the
link |
00:51:42.020
I'm not a robot button?
link |
00:51:43.460
Yeah.
link |
00:51:44.300
I think the truth is that really trying to distinguish
link |
00:51:48.620
between robot and human is a losing battle.
link |
00:51:52.200
Ultimately, you think it's a losing battle?
link |
00:51:53.860
I think it's a losing battle ultimately, right?
link |
00:51:55.560
I think that that is, in terms of the content,
link |
00:51:57.820
in terms of the actions that you can take.
link |
00:51:59.380
I mean, think about how captchas have gone, right?
link |
00:52:01.220
The captchas used to be a very nice, simple,
link |
00:52:02.980
you just have this image, all of our OCR is terrible,
link |
00:52:06.340
you put a couple of artifacts in it,
link |
00:52:08.900
humans are gonna be able to tell what it is.
link |
00:52:11.500
An AI system wouldn't be able to.
link |
00:52:13.300
Today, I can barely do captchas.
link |
00:52:15.740
And I think that this is just kind of where we're going.
link |
00:52:18.380
I think captchas were a moment in time thing
link |
00:52:20.420
and as AI systems become more powerful,
link |
00:52:22.500
that there being human capabilities that can be measured
link |
00:52:25.500
in a very easy, automated way that AIs
link |
00:52:28.900
will not be capable of,
link |
00:52:30.180
I think that's just like,
link |
00:52:31.140
it's just an increasingly hard technical battle.
link |
00:52:34.180
But it's not that all hope is lost, right?
link |
00:52:36.260
You think about how do we already authenticate ourselves,
link |
00:52:40.360
right, that we have systems, we have social security numbers
link |
00:52:43.460
if you're in the US or you have ways of identifying
link |
00:52:47.700
individual people and having real world identity
link |
00:52:50.180
tied to digital identity seems like a step
link |
00:52:53.060
towards authenticating the source of content
link |
00:52:56.220
rather than the content itself.
link |
00:52:58.260
Now, there are problems with that.
link |
00:52:59.980
How can you have privacy and anonymity
link |
00:53:02.340
in a world where the only content you can really trust is,
link |
00:53:05.460
or the only way you can trust content
link |
00:53:06.580
is by looking at where it comes from?
link |
00:53:08.560
And so I think that building out good reputation networks
link |
00:53:11.420
may be one possible solution.
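As one hedged illustration of authenticating the source rather than the content: if a real-world identity were tied to a public key, content could be signed by the author and verified by anyone. The sketch below assumes the third-party Python cryptography package, and it leaves out key distribution and the reputation network itself.

```python
# Illustrative sketch only: tie content to an identity by signing it with a
# key associated with that identity, then verify the signature downstream.
# This is the source-authentication primitive a reputation network could sit
# on top of, not a full design.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

author_key = ed25519.Ed25519PrivateKey.generate()  # held privately by the author
author_pub = author_key.public_key()               # published, e.g. in a registry

post = b"I wrote this, not a bot."
signature = author_key.sign(post)

try:
    author_pub.verify(signature, post)  # raises if the post or signature was altered
    print("content verifiably came from the holder of this key")
except InvalidSignature:
    print("signature does not match: source cannot be authenticated")
```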
link |
00:53:14.060
But yeah, I think that this question is not an obvious one.
link |
00:53:17.700
And I think that we, maybe sooner than we think,
link |
00:53:20.220
will be in a world where today I often will read a tweet
link |
00:53:23.820
and be like, hmm, do I feel like a real human wrote this?
link |
00:53:25.980
Or do I feel like this is genuine?
link |
00:53:27.560
I feel like I can kind of judge the content a little bit.
link |
00:53:30.180
And I think in the future, it just won't be the case.
link |
00:53:32.640
You look at, for example, the FCC comments on net neutrality.
link |
00:53:36.900
It came out later that millions of those were auto generated
link |
00:53:39.900
and that the researchers were able to use
link |
00:53:41.660
various statistical techniques to detect that.
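The researchers' actual techniques aren't described here, but as one illustrative example of the kind of statistical signal involved, template-generated comments tend to collapse into near-duplicates after simple normalization:

```python
# Illustrative only: one crude statistical signal for template-generated
# comments is an unusually high rate of near-duplicates after normalization.
# This is not the method the net-neutrality researchers actually used.
import re
from collections import Counter

def normalize(text: str) -> str:
    # lowercase and strip punctuation so trivially reworded copies of the
    # same template collapse to one key
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()

comments = [
    "I strongly oppose Title II regulation of the Internet!",
    "i strongly oppose title ii regulation of the internet",
    "Please protect net neutrality, it matters to small businesses.",
    "I strongly oppose Title II regulation of the Internet.",
]

counts = Counter(normalize(c) for c in comments)
for key, n in counts.most_common():
    if n > 1:
        print(f"{n} near-identical comments: {key[:60]}")
```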
link |
00:53:44.040
What do you do in a world
link |
00:53:45.100
where those statistical techniques don't exist?
link |
00:53:47.720
It's just impossible to tell the difference
link |
00:53:49.180
between humans and AIs.
link |
00:53:50.660
And in fact, the most persuasive arguments
link |
00:53:53.980
are written by AI.
link |
00:53:56.620
All that stuff, it's not sci fi anymore.
link |
00:53:58.660
You look at GPT2 making a great argument
link |
00:54:00.580
for why recycling is bad for the world.
link |
00:54:02.580
You gotta read that and be like, huh, you're right.
link |
00:54:04.460
We are addressing just the symptoms.
link |
00:54:06.540
Yeah, that's quite interesting.
link |
00:54:08.140
I mean, ultimately it boils down to the physical world
link |
00:54:11.380
being the last frontier of proving,
link |
00:54:13.720
so you said like basically networks of people,
link |
00:54:16.100
humans vouching for humans in the physical world.
link |
00:54:19.420
And somehow the authentication ends there.
link |
00:54:22.980
I mean, if I had to ask you,
link |
00:54:25.560
I mean, you're way too eloquent for a human.
link |
00:54:28.180
So if I had to ask you to authenticate,
link |
00:54:31.260
like prove how do I know you're not a robot
link |
00:54:33.180
and how do you know I'm not a robot?
link |
00:54:34.940
Yeah.
link |
00:54:35.780
I think that so far, in this space,
link |
00:54:40.540
this conversation we just had,
link |
00:54:42.140
the physical movements we did,
link |
00:54:44.020
the biggest gap between us and AI systems
link |
00:54:47.060
is the physical manipulation.
link |
00:54:49.380
So maybe that's the last frontier.
link |
00:54:51.300
Well, here's another question is why is,
link |
00:54:55.020
why is solving this problem important, right?
link |
00:54:57.300
Like what aspects are really important to us?
link |
00:54:59.100
And I think that probably where we'll end up
link |
00:55:01.220
is we'll hone in on what do we really want
link |
00:55:03.620
out of knowing if we're talking to a human.
link |
00:55:06.420
And I think that, again, this comes down to identity.
link |
00:55:09.460
And so I think that the internet of the future,
link |
00:55:11.780
I expect to be one that will have lots of agents out there
link |
00:55:14.900
that will interact with you.
link |
00:55:16.380
But I think that the question of is this
link |
00:55:19.260
flesh, real flesh and blood human
link |
00:55:21.580
or is this an automated system,
link |
00:55:23.860
may actually just be less important.
link |
00:55:25.820
Let's actually go there.
link |
00:55:27.420
So GPT2 is impressive, and let's look at GPT20.
link |
00:55:32.500
Why is it so bad that all my friends are GPT20?
link |
00:55:37.500
Why is it so important on the internet,
link |
00:55:43.300
do you think, to interact with only human beings?
link |
00:55:47.340
Why can't we live in a world where ideas can come
link |
00:55:50.620
from models trained on human data?
link |
00:55:52.940
Yeah, I think this is actually
link |
00:55:54.820
a really interesting question.
link |
00:55:55.700
This comes back to the how do you even picture a world
link |
00:55:58.100
with some new technology?
link |
00:55:59.580
And I think that one thing that I think is important
link |
00:56:02.060
is, you know, let's say honesty.
link |
00:56:04.780
And I think that if you have almost in the Turing test
link |
00:56:07.820
style sense of technology, you have AIs that are pretending
link |
00:56:12.420
to be humans and deceiving you.
link |
00:56:14.100
I think that feels like a bad thing, right?
link |
00:56:17.300
I think that it's really important that we feel like
link |
00:56:19.460
we're in control of our environment, right?
link |
00:56:20.980
That we understand who we're interacting with.
link |
00:56:23.140
And if it's an AI or a human, that's not something
link |
00:56:27.060
that we're being deceived about.
link |
00:56:28.420
But I think that the flip side of can I have as meaningful
link |
00:56:31.220
of an interaction with an AI as I can with a human?
link |
00:56:33.980
Well, I actually think here you can turn to sci fi.
link |
00:56:36.620
And Her, I think, is a great example of asking
link |
00:56:39.380
this very question, right?
link |
00:56:40.860
One thing I really love about Her is it really starts out
link |
00:56:42.940
almost by asking how meaningful
link |
00:56:44.660
are human virtual relationships, right?
link |
00:56:47.020
And then you have a human who has a relationship with an AI
link |
00:56:50.940
and that you really start to be drawn into that, right?
link |
00:56:54.100
That all of your emotional buttons get triggered
link |
00:56:56.700
in the same way as if there was a real human
link |
00:56:58.260
that was on the other side of that phone.
link |
00:57:00.180
And so I think that this is one way of thinking about it
link |
00:57:03.540
is that I think that we can have meaningful interactions
link |
00:57:06.900
and that if there's a funny joke,
link |
00:57:09.500
some sense it doesn't really matter
link |
00:57:10.580
if it was written by a human or an AI.
link |
00:57:12.660
But what you don't want and why I think
link |
00:57:14.660
we should really draw hard lines is deception.
link |
00:57:17.100
And I think that as long as we're in a world
link |
00:57:19.340
where why do we build AI systems at all, right?
link |
00:57:22.420
The reason we want to build them is to enhance human lives,
link |
00:57:24.740
to make humans be able to do more things,
link |
00:57:26.420
to have humans feel more fulfilled.
link |
00:57:28.820
And if we can build AI systems that do that, sign me up.
link |
00:57:32.940
So the process of language modeling,
link |
00:57:36.860
how far do you think it'd take us?
link |
00:57:38.540
Let's look at the movie Her.
link |
00:57:40.420
Do you think a dialogue, natural language conversation
link |
00:57:44.780
as formulated by the Turing test, for example,
link |
00:57:47.580
do you think that process could be achieved
link |
00:57:50.180
through this kind of unsupervised language modeling?
link |
00:57:52.900
So I think the Turing test in its real form
link |
00:57:56.700
isn't just about language, right?
link |
00:57:58.420
It's really about reasoning too, right?
link |
00:58:00.420
To really pass the Turing test,
link |
00:58:01.660
I should be able to teach calculus
link |
00:58:03.660
to whoever's on the other side
link |
00:58:05.340
and have it really understand calculus
link |
00:58:07.300
and be able to go and solve new calculus problems.
link |
00:58:11.100
And so I think that to really solve the Turing test,
link |
00:58:13.780
we need more than what we're seeing with language models.
link |
00:58:16.220
We need some way of plugging in reasoning.
link |
00:58:18.500
Now, how different will that be from what we already do?
link |
00:58:22.180
That's an open question, right?
link |
00:58:23.660
Might be that we need some sequence
link |
00:58:25.260
of totally radical new ideas,
link |
00:58:26.980
or it might be that we just need to kind of shape
link |
00:58:29.340
our existing systems in a slightly different way.
link |
00:58:32.740
But I think that in terms of how far language modeling
link |
00:58:35.020
will go, it's already gone way further
link |
00:58:37.260
than many people would have expected, right?
link |
00:58:39.460
I think that things like,
link |
00:58:40.700
and I think there's a lot of really interesting angles
link |
00:58:42.420
to poke in terms of how much does GPT2
link |
00:58:45.660
understand physical world?
link |
00:58:47.620
Like, you read a little bit about fire underwater in GPT2.
link |
00:58:52.060
So it's like, okay, maybe it doesn't quite understand
link |
00:58:53.900
what these things are, but at the same time,
link |
00:58:56.660
I think that you also see various things
link |
00:58:58.780
like smoke coming from flame,
link |
00:59:00.340
and a bunch of these things that GPT2,
link |
00:59:02.380
it has no body, it has no physical experience,
link |
00:59:04.580
it's just statically read data.
link |
00:59:06.980
And I think that the answer is like, we don't know yet.
link |
00:59:13.140
These questions, though, we're starting to be able
link |
00:59:15.020
to actually ask them to physical systems,
link |
00:59:17.300
to real systems that exist, and that's very exciting.
link |
00:59:19.580
Do you think, what's your intuition?
link |
00:59:20.860
Do you think if you just scale language modeling,
link |
00:59:25.220
like significantly scale,
link |
00:59:27.420
that reasoning can emerge from the same exact mechanisms?
link |
00:59:30.980
I think it's unlikely that if we just scale GPT2
link |
00:59:34.580
that we'll have reasoning in the full fledged way.
link |
00:59:38.260
And I think that there's like,
link |
00:59:39.420
the type signature's a little bit wrong, right?
link |
00:59:41.180
That like, there's something we do with,
link |
00:59:44.220
that we call thinking, right?
link |
00:59:45.460
Where we spend a lot of compute,
link |
00:59:47.300
like a variable amount of compute,
link |
00:59:48.820
to get to better answers, right?
link |
00:59:50.340
I think a little bit harder, I get a better answer.
link |
00:59:52.700
And that that kind of type signature
link |
00:59:54.860
isn't quite encoded in a GPT, right?
link |
00:59:58.620
GPT has kind of, like, spent a long time,
link |
01:00:01.580
like its evolutionary history,
link |
01:00:03.340
baking in all this information,
link |
01:00:04.380
getting very, very good at this predictive process.
link |
01:00:06.700
And then at runtime, I just kind of do one forward pass,
link |
01:00:10.020
and I'm able to generate stuff.
link |
01:00:12.940
And so, you know, there might be small tweaks
link |
01:00:15.260
to what we do in order to get the type signature, right?
link |
01:00:17.700
For example, well, you know,
link |
01:00:19.140
it's not really one forward pass, right?
link |
01:00:20.700
You know, you generate symbol by symbol,
link |
01:00:22.300
and so maybe you generate like a whole sequence
link |
01:00:24.340
of thoughts, and you only keep like the last bit
link |
01:00:26.540
or something.
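A hedged sketch of that kind of tweak, purely as illustration: wrap an ordinary autoregressive sampling call so the model spends a variable amount of compute writing intermediate thoughts, and only the final segment is kept as the answer. Here generate_tokens and the ANSWER: delimiter are hypothetical placeholders, not anything GPT2 actually supports.

```python
# Sketch of the "generate a whole sequence of thoughts, keep only the last
# bit" idea. `generate_tokens` is a hypothetical stand-in for any
# autoregressive model's sampling routine; the delimiter convention is an
# assumption for illustration.
from typing import Callable

def answer_with_thoughts(
    generate_tokens: Callable[[str, int], str],
    prompt: str,
    thinking_budget: int = 256,
    answer_marker: str = "ANSWER:",
) -> str:
    # Spend a variable amount of compute writing out intermediate reasoning...
    scratchpad = generate_tokens(prompt + "\nLet's think step by step.\n",
                                 thinking_budget)
    # ...then keep only what comes after the final answer marker.
    if answer_marker in scratchpad:
        return scratchpad.rsplit(answer_marker, 1)[-1].strip()
    return scratchpad.strip()  # fall back to the raw generation
```

The point is only the shape of the computation: variable thinking, then a fixed answer at the end.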
link |
01:00:27.860
But I think that at the very least,
link |
01:00:29.500
I would expect you have to make changes like that.
link |
01:00:31.820
Yeah, just exactly how we, you said, think,
link |
01:00:35.220
is the process of generating thought by thought
link |
01:00:38.060
in the same kind of way, like you said,
link |
01:00:40.060
keep the last bit, the thing that we converge towards.
link |
01:00:43.220
Yep.
link |
01:00:44.700
And I think there's another piece which is interesting,
link |
01:00:46.980
which is this out of distribution generalization, right?
link |
01:00:49.940
That like thinking somehow lets us do that, right?
link |
01:00:52.300
That we haven't experienced a thing, and yet somehow
link |
01:00:54.780
we just kind of keep refining our mental model of it.
link |
01:00:57.780
This is, again, something that feels tied
link |
01:01:00.340
to whatever reasoning is, and maybe it's a small tweak
link |
01:01:04.620
to what we do, maybe it's many ideas,
link |
01:01:06.380
and will take as many decades.
link |
01:01:07.820
Yeah, so the assumption there,
link |
01:01:10.940
generalization out of distribution,
link |
01:01:12.980
is that it's possible to create new ideas.
link |
01:01:16.620
Mm hmm.
link |
01:01:17.460
You know, it's possible that nobody's ever created
link |
01:01:19.780
any new ideas, and then with scaling GPT2 to GPT20,
link |
01:01:25.340
you would essentially generalize to all possible thoughts
link |
01:01:30.340
that us humans could have.
link |
01:01:31.780
I mean.
link |
01:01:33.180
Just to play devil's advocate.
link |
01:01:34.180
Right, right, right, I mean, how many new story ideas
link |
01:01:37.260
have we come up with since Shakespeare, right?
link |
01:01:39.060
Yeah, exactly.
link |
01:01:40.100
It's just all different forms of love and drama and so on.
link |
01:01:44.620
Okay.
link |
01:01:45.740
Not sure if you read Bitter Lesson,
link |
01:01:47.460
a recent blog post by Rich Sutton.
link |
01:01:49.340
Yep, I have.
link |
01:01:50.820
He basically says something that echoes some of the ideas
link |
01:01:54.380
that you've been talking about, which is,
link |
01:01:56.780
he says the biggest lesson that can be read
link |
01:01:58.980
from 70 years of AI research is that general methods
link |
01:02:01.980
that leverage computation are ultimately going to
link |
01:02:05.900
win out.
link |
01:02:07.820
Do you agree with this?
link |
01:02:08.860
So basically, and OpenAI in general,
link |
01:02:12.780
but the ideas you're exploring about coming up with methods,
link |
01:02:15.780
whether it's GPT2 modeling or whether it's OpenAI 5
link |
01:02:20.060
playing Dota, or a general method is better
link |
01:02:23.940
than a more fine tuned, expert tuned method.
link |
01:02:29.700
Yeah, so I think that, well one thing that I think
link |
01:02:32.140
was really interesting about the reaction
link |
01:02:33.740
to that blog post was that a lot of people have read this
link |
01:02:36.380
as saying that compute is all that matters.
link |
01:02:39.380
And that's a very threatening idea, right?
link |
01:02:41.300
And I don't think it's a true idea either.
link |
01:02:43.500
Right, it's very clear that we have algorithmic ideas
link |
01:02:45.740
that have been very important for making progress
link |
01:02:47.820
and to really build AGI.
link |
01:02:49.460
You wanna push as far as you can on the computational scale
link |
01:02:52.060
and you wanna push as far as you can on human ingenuity.
link |
01:02:55.500
And so I think you need both.
link |
01:02:56.980
But I think the way that you phrased the question
link |
01:02:58.260
is actually very good, right?
link |
01:02:59.580
That it's really about what kind of ideas
link |
01:03:02.140
should we be striving for?
link |
01:03:03.940
And absolutely, if you can find a scalable idea,
link |
01:03:07.540
you pour more compute into it, you pour more data into it,
link |
01:03:09.780
it gets better, like that's the real holy grail.
link |
01:03:13.740
And so I think that the answer to the question,
link |
01:03:16.580
I think, is yes, that that's really how we think about it
link |
01:03:19.900
and that part of why we're excited about the power
link |
01:03:22.700
of deep learning, the potential for building AGI
link |
01:03:25.260
is because we look at the systems that exist
link |
01:03:27.540
in the most successful AI systems
link |
01:03:29.700
and we realize that you scale those up,
link |
01:03:32.620
they're gonna work better.
link |
01:03:33.940
And I think that that scalability
link |
01:03:35.780
is something that really gives us hope
link |
01:03:37.020
for being able to build transformative systems.
link |
01:03:39.540
So I'll tell you, this is partially an emotional,
link |
01:03:43.780
a response that people often have,
link |
01:03:45.660
if compute is so important for state of the art performance,
link |
01:03:49.700
individual developers, maybe a 13 year old
link |
01:03:51.780
sitting somewhere in Kansas or something like that,
link |
01:03:54.420
they're sitting, they might not even have a GPU
link |
01:03:56.940
or may have a single GPU, a 1080 or something like that,
link |
01:03:59.980
and there's this feeling like, well,
link |
01:04:02.580
how can I possibly compete or contribute
link |
01:04:05.700
to this world of AI if scale is so important?
link |
01:04:09.780
So if you can comment on that and in general,
link |
01:04:12.460
do you think we need to also in the future
link |
01:04:14.780
focus on democratizing compute resources more
link |
01:04:19.980
or as much as we democratize the algorithms?
link |
01:04:22.620
Well, so the way that I think about it
link |
01:04:23.900
is that there's this space of possible progress, right?
link |
01:04:28.820
There's a space of ideas and sort of systems
link |
01:04:30.860
that will work that will move us forward
link |
01:04:32.900
and there's a portion of that space
link |
01:04:34.780
and to some extent, an increasingly significant portion
link |
01:04:37.020
of that space that does just require
link |
01:04:38.780
massive compute resources.
link |
01:04:40.980
And for that, I think that the answer is kind of clear
link |
01:04:44.660
and that part of why we have the structure that we do
link |
01:04:47.860
is because we think it's really important
link |
01:04:49.580
to be pushing the scale and to be building
link |
01:04:51.660
these large clusters and systems.
link |
01:04:53.740
But there's another portion of the space
link |
01:04:55.820
that isn't about the large scale compute
link |
01:04:57.780
that are these ideas that, and again,
link |
01:04:59.900
I think that for the ideas to really be impactful
link |
01:05:02.140
and really shine, that they should be ideas
link |
01:05:04.140
that if you scale them up, would work way better
link |
01:05:06.580
than they do at small scale.
link |
01:05:08.740
But that you can discover them
link |
01:05:10.420
without massive computational resources.
link |
01:05:12.700
And if you look at the history of recent developments,
link |
01:05:15.140
you think about things like the GAN or the VAE,
link |
01:05:17.620
that these are ones that I think you could come up with them
link |
01:05:20.860
without having, and in practice,
link |
01:05:22.660
people did come up with them without having
link |
01:05:24.460
massive, massive computational resources.
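As a concrete case of an idea whose proof of concept fits on one machine: a minimal GAN sketched on a 1-D toy distribution, assuming PyTorch. This is a from-scratch illustration, not the original implementation.

```python
# Minimal GAN sketch (illustrative, not the original implementation): a tiny
# generator learns to mimic samples from N(4, 1) while a tiny discriminator
# learns to tell real samples from generated ones.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) + 4.0   # samples from the target distribution N(4, 1)
    fake = G(torch.randn(64, 8))      # generator maps noise to candidate samples

    # Discriminator step: push real toward label 1, fake toward label 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator output 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# The generated mean should drift toward 4, even on a laptop CPU.
print("generated mean ~", G(torch.randn(1000, 8)).mean().item())
```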
link |
01:05:26.500
Right, I just talked to Ian Goodfellow,
link |
01:05:27.900
but the thing is the initial GAN
link |
01:05:31.500
produced pretty terrible results, right?
link |
01:05:34.140
So it was only because
link |
01:05:36.220
they were smart enough to know
link |
01:05:38.220
that it's quite surprising
link |
01:05:39.940
it can generate anything at all.
link |
01:05:43.100
Do you see a world, or is that too optimistic and dreamer
link |
01:05:45.980
like to imagine that the compute resources
link |
01:05:49.700
are something that's owned by governments
link |
01:05:52.180
and provided as utility?
link |
01:05:55.020
Actually, to some extent, this question reminds me
link |
01:05:57.100
of a blog post from one of my former professors at Harvard,
link |
01:06:01.140
this guy Matt Welsh, who was a systems professor.
link |
01:06:03.740
I remember sitting in his tenure talk, right,
link |
01:06:05.300
and that he had literally just gotten tenure.
link |
01:06:08.780
He went to Google for the summer
link |
01:06:10.940
and then decided he wasn't going back to academia, right?
link |
01:06:15.660
And kind of in his blog post, he makes this point that,
link |
01:06:18.340
look, as a systems researcher,
link |
01:06:20.780
that I come up with these cool system ideas, right,
link |
01:06:23.180
and I kind of build a little proof of concept,
link |
01:06:25.060
and the best thing I can hope for
link |
01:06:27.060
is that the people at Google or Yahoo,
link |
01:06:30.100
which was around at the time,
link |
01:06:31.580
will implement it and actually make it work at scale, right?
link |
01:06:35.380
That's like the dream for me, right?
link |
01:06:36.580
I build the little thing,
link |
01:06:37.420
and they turn it into the big thing that's actually working.
link |
01:06:39.980
And for him, he said, I'm done with that.
link |
01:06:43.340
I want to be the person who's actually doing building
link |
01:06:45.740
and deploying.
link |
01:06:47.300
And I think that there's a similar dichotomy here, right?
link |
01:06:49.540
I think that there are people who really actually find value,
link |
01:06:53.340
and I think it is a valuable thing to do
link |
01:06:55.180
to be the person who produces those ideas, right,
link |
01:06:57.420
who builds the proof of concept.
link |
01:06:58.820
And yeah, you don't get to generate
link |
01:07:00.540
the coolest possible GAN images,
link |
01:07:02.740
but you invented the GAN, right?
link |
01:07:04.460
And so there's a real trade off there,
link |
01:07:07.540
and I think that that's a very personal choice,
link |
01:07:09.020
but I think there's value in both sides.
link |
01:07:10.820
So do you think creating AGI or some new models,
link |
01:07:18.260
we would see echoes of the brilliance
link |
01:07:20.460
even at the prototype level?
link |
01:07:22.260
So you would be able to develop those ideas without scale,
link |
01:07:24.900
the initial seeds.
link |
01:07:27.300
So take a look at, you know,
link |
01:07:28.980
I always like to look at examples that exist, right?
link |
01:07:31.740
Look at real precedent.
link |
01:07:32.700
And so take a look at the June 2018 model that we released,
link |
01:07:37.020
that we scaled up to turn into GPT2.
link |
01:07:39.180
And you can see that at small scale,
link |
01:07:41.260
it set some records, right?
link |
01:07:42.780
This was the original GPT.
link |
01:07:44.820
We actually had some cool generations.
link |
01:07:46.820
They weren't nearly as amazing and really stunning
link |
01:07:49.820
as the GPT2 ones, but it was promising.
link |
01:07:51.980
It was interesting.
link |
01:07:53.020
And so I think it is the case
link |
01:07:54.500
that with a lot of these ideas,
link |
01:07:56.100
that you see promise at small scale.
link |
01:07:58.260
But there is an asterisk here, a very big asterisk,
link |
01:08:00.820
which is sometimes we see behaviors that emerge
link |
01:08:05.220
that are qualitatively different
link |
01:08:07.260
from anything we saw at small scale.
link |
01:08:09.060
And that the original inventor of whatever algorithm
link |
01:08:12.580
looks at and says, I didn't think it could do that.
link |
01:08:15.500
This is what we saw in Dota, right?
link |
01:08:17.420
So PPO was created by John Schulman,
link |
01:08:19.340
who's a researcher here.
link |
01:08:20.540
And with Dota, we basically just ran PPO
link |
01:08:24.660
at massive, massive scale.
link |
01:08:26.540
And there's some tweaks in order to make it work,
link |
01:08:29.100
but fundamentally, it's PPO at the core.
link |
01:08:31.540
And we were able to get this long term planning,
link |
01:08:35.300
these behaviors to really play out on a time scale
link |
01:08:38.700
that we just thought was not possible.
link |
01:08:40.780
And John looked at that and was like,
link |
01:08:42.700
I didn't think it could do that.
link |
01:08:44.220
That's what happens when you're at three orders
link |
01:08:45.460
of magnitude more scale than you tested at.
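For reference, the core of PPO that got scaled up is the clipped surrogate objective from Schulman et al.; here is a minimal sketch of just that loss term, assuming PyTorch, with none of the distributed rollout machinery that the scale refers to.

```python
# The heart of PPO: the clipped surrogate objective. This is just the loss
# term, not OpenAI Five's actual training code.
import torch

def ppo_clip_loss(new_logp: torch.Tensor,
                  old_logp: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    ratio = torch.exp(new_logp - old_logp)        # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()  # maximize => minimize negative
```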
link |
01:08:48.380
Yeah, but it still has the same flavors of,
link |
01:08:50.580
you know, at least echoes of the expected brilliance.
link |
01:08:55.980
Although I suspect with GPT scaled more and more,
link |
01:08:59.020
you might get surprising things.
link |
01:09:01.780
So yeah, you're right, it's interesting.
link |
01:09:04.740
It's difficult to see how far an idea will go
link |
01:09:07.940
when it's scaled.
link |
01:09:09.300
It's an open question.
link |
01:09:11.020
Well, so to that point with Dota and PPO,
link |
01:09:13.060
like, I mean, here's a very concrete one, right?
link |
01:09:14.980
It's like, it's actually one thing
link |
01:09:16.620
that's very surprising about Dota
link |
01:09:17.700
that I think people don't really pay that much attention to
link |
01:09:20.340
is the degree of generalization
link |
01:09:22.380
out of distribution that happens, right?
link |
01:09:24.580
That you have this AI that's trained against other bots
link |
01:09:27.860
for its entirety, the entirety of its existence.
link |
01:09:30.340
Sorry to take a step back.
link |
01:09:31.460
Can you talk through, you know, a story of Dota,
link |
01:09:37.260
a story of leading up to OpenAI Five and that path,
link |
01:09:42.060
and what was the process of self play
link |
01:09:43.900
and so on of training on this?
link |
01:09:45.420
Yeah, yeah, yeah.
link |
01:09:46.260
So with Dota.
link |
01:09:47.100
What is Dota?
link |
01:09:47.940
Yeah, Dota is a complex video game
link |
01:09:50.020
and we started trying to solve Dota
link |
01:09:52.700
because we felt like this was a step towards the real world
link |
01:09:55.660
relative to other games like chess or Go, right?
link |
01:09:58.020
Those very cerebral games
link |
01:09:59.180
where you just kind of have this board,
link |
01:10:00.500
very discrete moves.
link |
01:10:01.900
Dota starts to be much more continuous time
link |
01:10:04.060
that you have this huge variety of different actions
link |
01:10:06.220
that you have a 45 minute game
link |
01:10:07.660
with all these different units
link |
01:10:09.380
and it's got a lot of messiness to it
link |
01:10:11.820
that really hasn't been captured by previous games.
link |
01:10:14.500
And famously, all of the hard coded bots for Dota
link |
01:10:17.340
were terrible, right?
link |
01:10:18.380
It's just impossible to write anything good for it
link |
01:10:19.940
because it's so complex.
link |
01:10:21.260
And so this seemed like a really good place
link |
01:10:23.300
to push what's the state of the art
link |
01:10:25.260
in reinforcement learning.
link |
01:10:26.860
And so we started by focusing
link |
01:10:28.380
on the one versus one version of the game
link |
01:10:29.980
and we're able to solve that.
link |
01:10:32.380
We're able to beat the world champions
link |
01:10:33.900
and the skill curve was this crazy exponential, right?
link |
01:10:38.980
And it was like constantly we were just scaling up
link |
01:10:41.020
that we were fixing bugs
link |
01:10:42.260
and that you look at the skill curve
link |
01:10:44.340
and it was really a very, very smooth one.
link |
01:10:46.660
This is actually really interesting
link |
01:10:47.500
to see how that human iteration loop
link |
01:10:50.020
yielded very steady exponential progress.
link |
01:10:52.740
And to one side note, first of all,
link |
01:10:55.220
it's an exceptionally popular video game.
link |
01:10:57.140
The side effect is that there's a lot of incredible
link |
01:11:00.300
human experts at that video game.
link |
01:11:01.960
So the benchmark that you're trying to reach is very high.
link |
01:11:05.260
And the other, can you talk about the approach
link |
01:11:07.900
that was used initially and throughout
link |
01:11:10.140
training these agents to play this game?
link |
01:11:12.100
Yep, and so the approach that we used is self play.
link |
01:11:14.420
And so you have two agents that don't know anything.
link |
01:11:17.380
They battle each other,
link |
01:11:18.700
they discover something a little bit good
link |
01:11:20.820
and now they both know it.
link |
01:11:22.060
And they just get better and better and better
link |
01:11:23.400
without bound.
link |
01:11:24.540
And that's a really powerful idea, right?
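A minimal sketch of the self-play loop being described, where the environment and agent objects and their play_match and update methods are hypothetical placeholders rather than the OpenAI Five stack:

```python
# Sketch of the self-play idea: the agent trains against copies of itself,
# so the opponent improves exactly as fast as it does. The env and agent
# interfaces here are hypothetical placeholders.
import copy
import random

def self_play_training(env, agent, iterations=1000, snapshot_every=50):
    opponents = [copy.deepcopy(agent)]            # pool of past selves
    for it in range(iterations):
        opponent = random.choice(opponents)       # in practice, weighted toward recent ones
        trajectory = env.play_match(agent, opponent)  # collect one game of experience
        agent.update(trajectory)                  # e.g., a PPO update on that experience
        if it % snapshot_every == 0:
            opponents.append(copy.deepcopy(agent))    # anything one side learns,
    return agent                                      # the other soon has to face
```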
link |
01:11:27.100
That we then went from the one versus one version
link |
01:11:30.180
of the game and scaled up to five versus five, right?
link |
01:11:32.460
So you think about kind of like with basketball
link |
01:11:34.340
where you have this like team sport
link |
01:11:35.500
and you need to do all this coordination
link |
01:11:37.700
and we were able to push the same idea,
link |
01:11:40.940
the same self play to really get to the professional level
link |
01:11:45.940
at the full five versus five version of the game.
link |
01:11:48.980
And the things I think are really interesting here
link |
01:11:52.460
is that these agents, in some ways,
link |
01:11:54.820
they're almost like an insect like intelligence, right?
link |
01:11:56.820
Where they have a lot in common
link |
01:11:58.720
with how an insect is trained, right?
link |
01:12:00.180
An insect kind of lives in this environment
link |
01:12:01.840
for a very long time or the ancestors of this insect
link |
01:12:04.980
have been around for a long time
link |
01:12:05.900
and had a lot of experience that gets baked into this agent.
link |
01:12:09.740
And it's not really smart in the sense of a human, right?
link |
01:12:12.780
It's not able to go and learn calculus,
link |
01:12:14.620
but it's able to navigate its environment extremely well.
link |
01:12:16.980
And it's able to handle unexpected things
link |
01:12:18.460
in the environment that it's never seen before pretty well.
link |
01:12:22.060
And we see the same sort of thing with our Dota bots, right?
link |
01:12:24.780
That they're able to, within this game,
link |
01:12:26.740
they're able to play against humans,
link |
01:12:28.460
which is something that never existed
link |
01:12:29.980
in its evolutionary environment,
link |
01:12:31.380
totally different play styles from humans versus the bots.
link |
01:12:34.340
And yet it's able to handle it extremely well.
link |
01:12:37.220
And that's something that I think was very surprising to us,
link |
01:12:40.420
was something that doesn't really emerge
link |
01:12:43.460
from what we've seen with PPO at smaller scale, right?
link |
01:12:47.260
And the kind of scale we're running this stuff at was,
link |
01:12:49.780
let's say, like 100,000 CPU cores
link |
01:12:51.980
running with like hundreds of GPUs.
link |
01:12:54.140
It was probably about something like hundreds
link |
01:12:57.580
of years of experience going into this bot
link |
01:13:01.300
every single real day.
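To make "hundreds of years of experience every real day" concrete, here is a rough back-of-the-envelope with assumed numbers (180 years per day and 45-minute games are illustrative values, not reported figures):

```python
# Rough arithmetic, with assumed numbers, for what "hundreds of years of
# experience per real day" implies about the rollout farm's speedup.
years_per_real_day = 180                      # "hundreds of years" (assumed value)
game_days_per_real_day = years_per_real_day * 365.25
realtime_speedup = game_days_per_real_day     # roughly 65,700x faster than real time

minutes_per_game = 45                         # a typical Dota game length (assumed)
games_per_real_day = game_days_per_real_day * 24 * 60 / minutes_per_game
print(f"~{realtime_speedup:,.0f}x real time, ~{games_per_real_day:,.0f} games/day")
```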
link |
01:13:03.860
And so that scale is massive
link |
01:13:06.280
and we start to see very different kinds of behaviors
link |
01:13:08.500
out of the algorithms that we all know and love.
link |
01:13:10.820
Dota, you mentioned, beat the world expert one v one.
link |
01:13:15.260
And then you weren't able to win five v five this year.
link |
01:13:20.820
Yeah.
link |
01:13:21.660
At the best players in the world.
link |
01:13:24.180
So what's the comeback story?
link |
01:13:26.700
First of all, talk through that.
link |
01:13:27.740
That was an exceptionally exciting event.
link |
01:13:29.540
And what's the following months and this year look like?
link |
01:13:33.260
Yeah, yeah, so one thing that's interesting
link |
01:13:35.340
is that we lose all the time.
link |
01:13:38.700
Because we play.
link |
01:13:39.540
Who's we here?
link |
01:13:40.380
The Dota team at OpenAI.
link |
01:13:41.820
We play the bot against better players
link |
01:13:44.260
than our system all the time.
link |
01:13:45.920
Or at least we used to, right?
link |
01:13:47.500
Like the first time we lost publicly
link |
01:13:50.200
was when we went up on stage at The International
link |
01:13:52.340
and we played against some of the best teams in the world
link |
01:13:54.740
and we ended up losing both games,
link |
01:13:56.440
but we gave them a run for their money, right?
link |
01:13:58.660
That both games were kind of 30 minutes, 25 minutes
link |
01:14:01.540
and they went back and forth, back and forth,
link |
01:14:03.260
back and forth.
link |
01:14:04.180
And so I think that really shows
link |
01:14:06.020
that we're at the professional level
link |
01:14:08.140
and that kind of looking at those games,
link |
01:14:09.780
we think that the coin could have gone a different direction
link |
01:14:12.420
and we could have had some wins.
link |
01:14:14.140
That was actually very encouraging for us.
link |
01:14:16.140
And it's interesting because the international
link |
01:14:18.380
was at a fixed time, right?
link |
01:14:19.860
So we knew exactly what day we were going to be playing
link |
01:14:22.900
and we pushed as far as we could, as fast as we could.
link |
01:14:25.660
Two weeks later, we had a bot that had an 80% win rate
link |
01:14:28.160
versus the one that played at TI.
link |
01:14:30.260
So the march of progress, you should think of it
link |
01:14:32.460
as a snapshot rather than as an end state.
link |
01:14:34.920
And so in fact, we'll be announcing our finals pretty soon.
link |
01:14:39.180
I actually think that we'll announce our final match
link |
01:14:42.900
prior to this podcast being released.
link |
01:14:45.340
So we'll be playing against the world champions.
link |
01:14:49.900
And for us, it's really less about,
link |
01:14:52.700
like the way that we think about what's upcoming
link |
01:14:55.460
is the final milestone, the final competitive milestone
link |
01:14:59.180
for the project, right?
link |
01:15:00.460
That our goal in all of this
link |
01:15:02.220
isn't really about beating humans at Dota.
link |
01:15:05.340
Our goal is to push the state of the art
link |
01:15:06.980
in reinforcement learning.
link |
01:15:08.020
And we've done that, right?
link |
01:15:09.100
And we've actually learned a lot from our system
link |
01:15:10.820
and that we have, I think, a lot of exciting next steps
link |
01:15:13.940
that we want to take.
link |
01:15:14.860
And so kind of as a final showcase of what we built,
link |
01:15:17.480
we're going to do this match.
link |
01:15:18.900
But for us, it's not really the success or failure
link |
01:15:21.380
to see do we have the coin flip go in our direction
link |
01:15:24.480
or against.
link |
01:15:25.940
Where do you see the field of deep learning
link |
01:15:28.860
heading in the next few years?
link |
01:15:31.620
Where do you see the work and reinforcement learning
link |
01:15:35.620
perhaps heading, and more specifically with OpenAI,
link |
01:15:41.220
all the exciting projects that you're working on,
link |
01:15:44.460
what does 2019 hold for you?
link |
01:15:46.460
Massive scale.
link |
01:15:47.420
Scale.
link |
01:15:48.260
I will put an asterisk on that and just say,
link |
01:15:49.900
I think that it's about ideas plus scale.
link |
01:15:52.340
You need both.
link |
01:15:53.180
So that's a really good point.
link |
01:15:55.060
So the question, in terms of ideas,
link |
01:15:58.620
you have a lot of projects
link |
01:16:00.620
that are exploring different areas of intelligence.
link |
01:16:04.380
And the question is, when you think of scale,
link |
01:16:07.660
do you think about growing the scale
link |
01:16:09.820
of those individual projects
link |
01:16:10.940
or do you think about adding new projects?
link |
01:16:13.260
And, sorry, if you're thinking about
link |
01:16:16.060
adding new projects, or if you look at the past,
link |
01:16:19.020
what's the process of coming up with new projects
link |
01:16:21.380
and new ideas?
link |
01:16:22.220
Yep.
link |
01:16:23.060
So we really have a life cycle of projects here.
link |
01:16:25.380
So we start with a few people
link |
01:16:27.040
just working on a small scale idea.
link |
01:16:28.560
And language is actually a very good example of this.
link |
01:16:30.700
That it was really one person here
link |
01:16:32.620
who was pushing on language for a long time.
link |
01:16:35.020
I mean, then you get signs of life, right?
link |
01:16:36.820
And so this is like, let's say,
link |
01:16:38.860
with the original GPT, we had something that was interesting
link |
01:16:42.740
and we said, okay, it's time to scale this, right?
link |
01:16:44.940
It's time to put more people on it,
link |
01:16:46.100
put more computational resources behind it.
link |
01:16:48.160
And then we just kind of keep pushing and keep pushing.
link |
01:16:51.660
And the end state is something
link |
01:16:52.700
that looks like Dota or robotics,
link |
01:16:54.420
where you have a large team of 10 or 15 people
link |
01:16:57.220
that are running things at very large scale
link |
01:16:59.300
and that you're able to really have material engineering
link |
01:17:02.300
and sort of machine learning science coming together
link |
01:17:06.640
to make systems that work and get material results
link |
01:17:10.380
that just would have been impossible otherwise.
link |
01:17:12.380
So we do that whole life cycle.
link |
01:17:13.740
We've done it a number of times, typically end to end.
link |
01:17:16.780
It's probably two years or so to do it.
link |
01:17:20.540
The organization has been around for three years,
link |
01:17:21.900
so maybe we'll find that we also have
link |
01:17:23.140
longer life cycle projects, but we'll work up to those.
link |
01:17:29.740
So one team that we were actually just starting,
link |
01:17:31.580
Ilya and I are kicking off a new team
link |
01:17:33.400
called the Reasoning Team,
link |
01:17:34.620
and that this is to really try to tackle
link |
01:17:36.420
how do you get neural networks to reason?
link |
01:17:38.700
And we think that this will be a long term project.
link |
01:17:42.700
It's one that we're very excited about.
link |
01:17:44.720
In terms of reasoning, super exciting topic,
link |
01:17:48.400
what kind of benchmarks, what kind of tests of reasoning
link |
01:17:54.180
do you envision?
link |
01:17:55.280
What would, if you sat back with whatever drink
link |
01:17:58.980
and you would be impressed that this system
link |
01:18:01.220
is able to do something, what would that look like?
link |
01:18:03.900
Theorem proving.
link |
01:18:04.860
Theorem proving.
link |
01:18:06.460
So some kind of logic, and especially mathematical logic.
link |
01:18:10.540
I think so.
link |
01:18:11.380
I think that there's other problems that are dual
link |
01:18:14.180
to theorem proving in particular.
link |
01:18:15.980
You think about programming, you think about
link |
01:18:18.500
even security analysis of code,
link |
01:18:21.260
that these all kind of capture the same sorts
link |
01:18:23.720
of core reasoning and being able to do
link |
01:18:26.200
some out-of-distribution generalization.
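To make the benchmark concrete: a theorem-proving test set is a list of formal statements, and the system has to produce proofs that a proof checker accepts, ideally for statements unlike anything it trained on. A minimal example of what one such machine-checkable item looks like, written here in Lean 4 as an illustration of the task format (not anything from OpenAI):

```lean
-- The statement is the "problem"; the term after := is the proof the system must find.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Programming and security analysis fit the same template: a formal specification plus an artifact that either checks out or doesn't.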
link |
01:18:28.360
So it would be quite exciting if OpenAI Reasoning Team
link |
01:18:32.600
was able to prove that P equals NP.
link |
01:18:34.720
That would be very nice.
link |
01:18:36.040
It would be very, very, very exciting, especially.
link |
01:18:38.560
If it turns out that P equals NP,
link |
01:18:39.760
that'll be interesting too.
link |
01:18:41.060
It would be ironic and humorous.
link |
01:18:47.560
So what problem stands out to you
link |
01:18:49.880
as the most exciting and challenging and impactful
link |
01:18:53.960
to work on for us as a community in general
link |
01:18:56.380
and for OpenAI this year?
link |
01:18:58.520
You mentioned reasoning.
link |
01:18:59.600
I think that's a heck of a problem.
link |
01:19:01.440
Yeah, so I think reasoning's an important one.
link |
01:19:02.880
I think it's gonna be hard to get good results in 2019.
link |
01:19:05.840
Again, just like we think about the life cycle, takes time.
link |
01:19:08.760
I think for 2019, language modeling seems to be
link |
01:19:11.040
kind of on that ramp.
link |
01:19:12.640
It's at the point that we have a technique that works.
link |
01:19:14.960
We wanna scale 100x, 1,000x, see what happens.
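The "technique that works" here is next-token prediction: train a model to assign high probability to each token given everything before it, then keep scaling the model and data. A numpy sketch of that objective, with a made-up vocabulary and a placeholder standing in for a real Transformer (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 1000
tokens = rng.integers(0, VOCAB_SIZE, size=50)     # stand-in for a training document

def next_token_probs(context):
    """Placeholder for a real model (e.g. a Transformer) conditioned on `context`."""
    logits = rng.normal(size=VOCAB_SIZE)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Language-modeling loss: average negative log-likelihood of each token given
# all the tokens before it. Training pushes this number down; "scale 100x,
# 1,000x" means more parameters and more text, same objective.
nll = np.mean([-np.log(next_token_probs(tokens[:t])[tokens[t]])
               for t in range(1, len(tokens))])
print(f"average loss: {nll:.2f} nats/token")
```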
link |
01:19:18.080
Awesome.
link |
01:19:19.040
Do you think we're living in a simulation?
link |
01:19:21.600
I think it's hard to have a real opinion about it.
link |
01:19:24.840
It's actually interesting.
link |
01:19:26.320
I separate out things that I think can, like,
link |
01:19:29.520
yield materially different predictions about the world
link |
01:19:32.680
from ones that are just kind of fun to speculate about.
link |
01:19:35.880
I kind of view simulation as more like,
link |
01:19:37.960
is there a flying teapot between Mars and Jupiter?
link |
01:19:40.320
Like, maybe, but it's a little bit hard to know
link |
01:19:44.000
what that would mean for my life.
link |
01:19:45.120
So there is something actionable.
link |
01:19:47.000
So some of the best work OpenAI has done
link |
01:19:50.760
is in the field of reinforcement learning.
link |
01:19:52.780
And some of the success of reinforcement learning
link |
01:19:56.620
come from being able to simulate
link |
01:19:58.520
the problem you're trying to solve.
link |
01:20:00.120
So do you have a hope for reinforcement,
link |
01:20:03.680
for the future of reinforcement learning
link |
01:20:05.320
and for the future of simulation?
link |
01:20:07.080
Like whether it's, we're talking about autonomous vehicles
link |
01:20:09.120
or any kind of system.
link |
01:20:10.920
Do you see that scaling to where we'll be able
link |
01:20:13.560
to simulate systems and hence,
link |
01:20:16.440
be able to create a simulator that echoes our real world
link |
01:20:19.400
and prove once and for all,
link |
01:20:21.620
even though you're denying it,
link |
01:20:22.680
that we're living in a simulation?
link |
01:20:25.080
I feel like it's two separate questions, right?
link |
01:20:26.500
So kind of at the core there of like,
link |
01:20:28.400
can we use simulation for self driving cars?
link |
01:20:31.240
Take a look at our robotic system, Dactyl, right?
link |
01:20:33.860
That was trained in simulation using the Dota system,
link |
01:20:37.000
in fact, and it transfers to a physical robot.
link |
01:20:40.480
And I think everyone looks at our Dota system,
link |
01:20:42.320
they're like, okay, it's just a game.
link |
01:20:43.560
How are you ever gonna escape to the real world?
link |
01:20:45.260
And the answer is, well, we did it with a physical robot
link |
01:20:47.480
that no one could program.
link |
01:20:48.720
And so I think the answer is simulation
link |
01:20:50.240
goes a lot further than you think
link |
01:20:52.080
if you apply the right techniques to it.
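One of those "right techniques", at least in the published Dactyl work, is domain randomization: every training episode gets a differently perturbed simulator, so the policy has to work across a whole family of physics, of which the real robot is just one more member. A minimal sketch of the idea follows (the parameters and ranges here are made up for illustration, and the environment calls are hypothetical):

```python
import random

def sample_randomized_physics():
    """Draw a fresh set of simulator parameters for each training episode."""
    return {
        "object_mass_kg":     random.uniform(0.03, 0.30),
        "friction_coeff":     random.uniform(0.5, 1.5),
        "motor_gain_scale":   random.uniform(0.8, 1.2),
        "action_delay_steps": random.randint(0, 3),
        "observation_noise":  random.uniform(0.0, 0.02),
    }

for episode in range(3):
    physics = sample_randomized_physics()
    # env = make_sim_env(**physics)   # hypothetical: build a simulator with these dynamics
    # run_episode(policy, env)        # hypothetical: collect RL experience under them
    print(episode, physics)
```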
link |
01:20:54.240
Now, there's a question of,
link |
01:20:55.480
are the beings in that simulation gonna wake up
link |
01:20:57.520
and have consciousness?
link |
01:20:59.620
I think that one seems a lot harder to, again,
link |
01:21:02.380
reason about.
link |
01:21:03.220
I think that you really should think about
link |
01:21:05.400
where exactly does human consciousness come from
link |
01:21:07.940
in our own self-awareness?
link |
01:21:09.160
And is it just that once you have a complicated enough
link |
01:21:11.920
neural net, you have to worry about
link |
01:21:13.220
the agents feeling pain?
link |
01:21:15.840
And I think there's interesting speculation to do there,
link |
01:21:19.440
but again, I think it's a little bit hard to know for sure.
link |
01:21:23.120
Well, let me just keep with the speculation.
link |
01:21:25.040
Do you think to create intelligence, general intelligence,
link |
01:21:28.640
you need, one, consciousness, and two, a body?
link |
01:21:33.180
Do you think any of those elements are needed,
link |
01:21:35.040
or is intelligence something that's orthogonal to those?
link |
01:21:38.480
I'll stick to the non-grand answer first, right?
link |
01:21:41.920
So the non-grand answer is just to look at,
link |
01:21:44.360
what are we already making work?
link |
01:21:45.800
You look at GPT2, a lot of people would have said
link |
01:21:47.800
that to even get these kinds of results,
link |
01:21:49.480
you need real world experience.
link |
01:21:51.080
You need a body, you need grounding.
link |
01:21:52.560
How are you supposed to reason about any of these things?
link |
01:21:55.060
How are you supposed to like even kind of know
link |
01:21:56.500
about smoke and fire and those things
link |
01:21:58.040
if you've never experienced them?
link |
01:21:59.740
And GPT2 shows that you can actually go way further
link |
01:22:03.000
than that kind of reasoning would predict.
link |
01:22:06.880
So I think that in terms of, do we need consciousness?
link |
01:22:10.600
Do we need a body?
link |
01:22:11.840
It seems the answer is probably not, right?
link |
01:22:13.400
That we could probably just continue to push
link |
01:22:15.100
kind of the systems we have.
link |
01:22:16.140
They already feel general.
link |
01:22:18.280
They're not as competent or as general
link |
01:22:20.560
or able to learn as quickly as an AGI would,
link |
01:22:23.000
but they're at least like kind of proto-AGI in some way,
link |
01:22:27.420
and they don't need any of those things.
link |
01:22:29.800
Now let's move to the grand answer,
link |
01:22:31.960
which is, are our neural nets conscious already?
link |
01:22:36.520
Would we ever know?
link |
01:22:37.440
How can we tell, right?
link |
01:22:38.920
And here's where the speculation starts to become
link |
01:22:43.040
at least interesting or fun
link |
01:22:44.920
and maybe a little bit disturbing
link |
01:22:46.520
depending on where you take it.
link |
01:22:48.080
But it certainly seems that when we think about animals,
link |
01:22:51.280
that there's some continuum of consciousness.
link |
01:22:53.280
You know, my cat I think is conscious in some way, right?
link |
01:22:57.120
Not as conscious as a human.
link |
01:22:58.200
And you could imagine that you could build
link |
01:23:00.080
a little consciousness meter, right?
link |
01:23:01.220
You point at a cat, it gives you a little reading.
link |
01:23:03.080
Point at a human, it gives you much bigger reading.
link |
01:23:06.400
What would happen if you pointed one of those
link |
01:23:08.120
at a Dota neural net?
link |
01:23:09.960
And if you're training in this massive simulation,
link |
01:23:12.180
do the neural nets feel pain?
link |
01:23:13.680
You know, it becomes pretty hard to know
link |
01:23:16.960
that the answer is no.
link |
01:23:18.840
And it becomes pretty hard to really think about
link |
01:23:21.660
what that would mean if the answer were yes.
link |
01:23:25.440
And it's very possible, you know, for example,
link |
01:23:27.600
you could imagine that maybe the reason
link |
01:23:29.600
that humans have consciousness
link |
01:23:31.560
is because it's a convenient computational shortcut, right?
link |
01:23:35.160
If you think about it, if you have a being
link |
01:23:37.120
that wants to avoid pain,
link |
01:23:38.360
which seems pretty important to survive in this environment
link |
01:23:40.960
and wants to like, you know, eat food,
link |
01:23:43.800
then maybe the best way of doing it
link |
01:23:45.640
is to have a being that's conscious, right?
link |
01:23:47.240
That, you know, in order to succeed in the environment,
link |
01:23:49.640
you need to have those properties
link |
01:23:51.200
and how are you supposed to implement them
link |
01:23:52.760
and maybe consciousness is a way of doing that.
link |
01:23:55.440
If that's true, then actually maybe we should expect
link |
01:23:57.920
that really competent reinforcement learning agents
link |
01:24:00.060
will also have consciousness.
link |
01:24:02.120
But you know, that's a big if.
link |
01:24:03.360
And I think there are a lot of other arguments
link |
01:24:04.880
you can make in other directions.
link |
01:24:06.760
I think that's a really interesting idea
link |
01:24:08.520
that even GPT2 has some degree of consciousness.
link |
01:24:11.520
That's something, it's actually not as crazy
link |
01:24:14.320
to think about, it's useful to think about
link |
01:24:16.640
as we think about what it means
link |
01:24:18.320
to create intelligence of a dog, intelligence of a cat,
link |
01:24:22.240
and the intelligence of a human.
link |
01:24:24.480
So last question, do you think
link |
01:24:27.880
we will ever fall in love, like in the movie Her,
link |
01:24:32.040
with an artificial intelligence system
link |
01:24:34.480
or an artificial intelligence system
link |
01:24:36.300
falling in love with a human?
link |
01:24:38.640
I hope so.
link |
01:24:40.280
If there's any better way to end it, it's on love.
link |
01:24:43.760
So Greg, thanks so much for talking today.
link |
01:24:45.680
Thank you for having me.