
Greg Brockman: OpenAI and AGI | Lex Fridman Podcast #17



link |
00:00:00.000
The following is a conversation with Greg Brockman.
link |
00:00:02.920
He's the cofounder and CTO of OpenAI,
link |
00:00:05.400
a world class research organization
link |
00:00:07.440
developing ideas in AI with a goal of eventually
link |
00:00:10.840
creating a safe and friendly artificial general
link |
00:00:14.200
intelligence, one that benefits and empowers humanity.
link |
00:00:18.840
OpenAI is not only a source of publications, algorithms,
link |
00:00:22.520
tools, and data sets.
link |
00:00:24.480
Their mission is a catalyst for an important public discourse
link |
00:00:28.120
about our future with both narrow and general intelligence
link |
00:00:32.720
systems.
link |
00:00:34.000
This conversation is part of the Artificial Intelligence
link |
00:00:36.640
Podcast at MIT and beyond.
link |
00:00:39.480
If you enjoy it, subscribe on YouTube, iTunes,
link |
00:00:42.720
or simply connect with me on Twitter
link |
00:00:44.520
at Lex Fridman, spelled F R I D.
link |
00:00:48.040
And now here's my conversation with Greg Brockman.
link |
00:00:52.760
So in high school and right after you wrote
link |
00:00:54.640
a draft of a chemistry textbook,
link |
00:00:56.640
I saw that it covers everything
link |
00:00:58.120
from basic structure of the atom to quantum mechanics.
link |
00:01:01.400
So it's clear you have an intuition and a passion
link |
00:01:04.400
for both the physical world with chemistry and now robotics
link |
00:01:09.880
to the digital world with AI, deep learning,
link |
00:01:13.680
reinforcement learning, so on.
link |
00:01:15.360
Do you see the physical world and the digital world
link |
00:01:17.360
as different?
link |
00:01:18.600
And what do you think is the gap?
link |
00:01:20.480
A lot of it actually boils down to iteration speed,
link |
00:01:23.040
that I think that a lot of what really motivates me
link |
00:01:25.240
is building things, right?
link |
00:01:27.720
Think about mathematics, for example,
link |
00:01:29.320
where you think really hard about a problem.
link |
00:01:30.920
You understand it.
link |
00:01:31.680
You write it down in this very obscure form
link |
00:01:33.320
that we call a proof.
link |
00:01:34.560
But then this is in humanity's library, right?
link |
00:01:37.600
It's there forever.
link |
00:01:38.400
This is some truth that we've discovered.
link |
00:01:40.400
And maybe only five people in your field will ever read it.
link |
00:01:43.040
But somehow you've kind of moved humanity forward.
link |
00:01:45.440
And so I actually used to really think
link |
00:01:46.880
that I was going to be a mathematician.
link |
00:01:48.680
And then I actually started writing this chemistry textbook.
link |
00:01:51.400
One of my friends told me, you'll never publish it
link |
00:01:53.360
because you don't have a PhD.
link |
00:01:54.880
So instead, I decided to build a website
link |
00:01:58.000
and try to promote my ideas that way.
link |
00:01:59.960
And then I discovered programming.
link |
00:02:01.480
And in programming, you think hard about a problem.
link |
00:02:05.320
You understand it.
link |
00:02:06.080
You write it down in a very obscure form
link |
00:02:08.040
that we call a program.
link |
00:02:09.720
But then once again, it's in humanity's library, right?
link |
00:02:12.240
And anyone can get the benefit from it.
link |
00:02:14.120
And the scalability is massive.
link |
00:02:15.720
And so I think that the thing that really appeals to me
link |
00:02:17.720
about the digital world is that you
link |
00:02:19.440
can have this insane leverage, right?
link |
00:02:21.960
A single individual with an idea is able to affect
link |
00:02:24.960
the entire planet.
link |
00:02:26.120
And that's something I think is really hard to do
link |
00:02:28.240
if you're moving around physical atoms.
link |
00:02:30.240
But you said mathematics.
link |
00:02:32.440
So if you look at the wet thing over here, our mind,
link |
00:02:36.880
do you ultimately see it as just math,
link |
00:02:39.800
as just information processing?
link |
00:02:41.800
Or is there some other magic if you've
link |
00:02:44.880
seen through biology and chemistry and so on?
link |
00:02:46.960
I think it's really interesting to think about humans
link |
00:02:49.000
as just information processing systems.
link |
00:02:51.000
And it seems like it's actually a pretty good way
link |
00:02:54.080
of describing a lot of how the world works or a lot of what
link |
00:02:57.560
we're capable of to think that, again, if you just
link |
00:03:01.000
look at technological innovations over time,
link |
00:03:03.640
that in some ways, the most transformative innovation
link |
00:03:05.920
that we've had has been the computer, right?
link |
00:03:07.760
In some ways, the internet, what has the internet done?
link |
00:03:10.560
The internet is not about these physical cables.
link |
00:03:12.760
It's about the fact that I am suddenly
link |
00:03:14.560
able to instantly communicate with any other human
link |
00:03:16.560
on the planet.
link |
00:03:17.680
I'm able to retrieve any piece of knowledge
link |
00:03:19.680
that, in some ways, the human race has ever had,
link |
00:03:22.640
and that those are these insane transformations.
link |
00:03:26.080
Do you see our society as a whole, the collective,
link |
00:03:29.320
as another extension of the intelligence
link |
00:03:31.240
of the human being?
link |
00:03:32.280
So if you look at the human being as an information processing
link |
00:03:34.440
system, you mentioned the internet, the networking.
link |
00:03:36.920
Do you see us all together as a civilization
link |
00:03:39.360
as a kind of intelligent system?
link |
00:03:41.680
Yeah, I think this is actually a really interesting
link |
00:03:43.560
perspective to take and to think about
link |
00:03:45.840
that you sort of have this collective intelligence
link |
00:03:48.080
of all of society.
link |
00:03:49.520
The economy itself is this superhuman machine
link |
00:03:51.680
that is optimizing something, right?
link |
00:03:54.440
And it's almost, in some ways, a company
link |
00:03:56.400
has a will of its own, right?
link |
00:03:57.960
That you have all these individuals who are all
link |
00:03:59.400
pursuing their own individual goals
link |
00:04:00.800
and thinking really hard and thinking
link |
00:04:02.400
about the right things to do, but somehow the company does
link |
00:04:04.640
something that is this emergent thing
link |
00:04:07.880
and that it's a really useful abstraction.
link |
00:04:10.640
And so I think that in some ways,
link |
00:04:12.440
we think of ourselves as the most intelligent things
link |
00:04:14.880
on the planet and the most powerful things on the planet.
link |
00:04:17.480
But there are things that are bigger than us,
link |
00:04:19.320
that are these systems that we all contribute to.
link |
00:04:21.480
And so I think actually, it's interesting to think about,
link |
00:04:25.000
if you've read Isaac Asimov's Foundation, right,
link |
00:04:27.440
that there's this concept of psychohistory in there,
link |
00:04:30.160
which is effectively this, that if you have trillions
link |
00:04:31.920
or quadrillions of beings, then maybe you could actually
link |
00:04:35.200
predict what that huge macro being will do
link |
00:04:39.080
and almost independent of what the individuals want.
link |
00:04:42.400
And I actually have a second angle on this
link |
00:04:44.240
that I think is interesting, which is thinking about
link |
00:04:46.760
technological determinism.
link |
00:04:48.400
One thing that I actually think a lot about with OpenAI
link |
00:04:51.480
is that we're kind of coming onto this insanely
link |
00:04:54.720
transformational technology of general intelligence
link |
00:04:57.400
that will happen at some point.
link |
00:04:58.760
And there's a question of how can you take actions
link |
00:05:01.560
that will actually steer it to go better rather than worse?
link |
00:05:04.880
And that I think one question you need to ask is,
link |
00:05:06.720
as a scientist, as an inventor, or as a creator,
link |
00:05:09.320
what impact can you have in general?
link |
00:05:11.720
You look at things like the telephone
link |
00:05:12.880
invented by two people on the same day.
link |
00:05:14.840
Like what does that mean, like what does that mean
link |
00:05:16.600
about the shape of innovation?
link |
00:05:18.080
And I think that what's going on is everyone's building
link |
00:05:20.160
on the shoulders of the same giants.
link |
00:05:21.720
And so you can kind of, you can't really hope
link |
00:05:23.840
to create something no one else ever would.
link |
00:05:25.720
You know, if Einstein wasn't born,
link |
00:05:27.040
someone else would have come up with relativity.
link |
00:05:29.200
You know, he changed the timeline a bit, right?
link |
00:05:31.000
That maybe it would have taken another 20 years,
link |
00:05:33.000
but it wouldn't be that fundamentally humanity
link |
00:05:34.560
would never discover these fundamental truths.
link |
00:05:37.360
So there's some kind of invisible momentum
link |
00:05:40.440
that some people like Einstein or OpenAI is plugging into
link |
00:05:45.400
that anybody else can also plug into.
link |
00:05:47.800
And ultimately, that wave takes us into a certain direction.
link |
00:05:50.800
That's what you mean by determinism?
link |
00:05:51.840
That's right, that's right.
link |
00:05:52.840
And you know, this kind of seems to play out
link |
00:05:54.240
in a bunch of different ways.
link |
00:05:55.720
That there's some exponential that is being ridden
link |
00:05:58.040
and that the exponential itself, which one it is,
link |
00:05:59.960
changes, think about Moore's Law,
link |
00:06:01.520
an entire industry set its clock to it for 50 years.
link |
00:06:04.800
Like how can that be, right?
link |
00:06:06.200
How is that possible?
link |
00:06:07.360
And yet somehow it happened.
link |
00:06:09.320
And so I think you can't hope to ever invent something
link |
00:06:12.200
that no one else will.
link |
00:06:13.360
Maybe you can change the timeline a little bit.
link |
00:06:15.360
But if you really want to make a difference,
link |
00:06:17.400
I think that the thing that you really have to do,
link |
00:06:19.440
the only real degree of freedom you have
link |
00:06:21.320
is to set the initial conditions
link |
00:06:23.040
under which a technology is born.
link |
00:06:24.960
And so you think about the internet, right?
link |
00:06:26.680
That there are lots of other competitors
link |
00:06:27.840
trying to build similar things.
link |
00:06:29.400
And the internet won, and the initial conditions
link |
00:06:33.240
were that it was created by this group
link |
00:06:34.680
that really valued people being able to be,
link |
00:06:37.760
you know, anyone being able to plug in
link |
00:06:39.120
this very academic mindset of being open and connected.
link |
00:06:42.480
And I think that the internet for the next 40 years
link |
00:06:44.400
really played out that way.
link |
00:06:46.360
You know, maybe today,
link |
00:06:47.680
things are starting to shift in a different direction,
link |
00:06:49.840
but I think that those initial conditions
link |
00:06:51.120
were really important to determine
link |
00:06:52.720
the next 40 years worth of progress.
link |
00:06:55.080
That's really beautifully put.
link |
00:06:56.440
So another example of that I think about,
link |
00:06:58.800
you know, I recently looked at it.
link |
00:07:00.800
I looked at Wikipedia, the formation of Wikipedia.
link |
00:07:03.800
And I wonder what the internet would be like
link |
00:07:05.520
if Wikipedia had ads.
link |
00:07:07.760
You know, there's an interesting argument
link |
00:07:09.640
about why they chose not to put advertisements on Wikipedia.
link |
00:07:14.280
I think Wikipedia is one of the greatest resources
link |
00:07:17.800
we have on the internet.
link |
00:07:18.920
It's extremely surprising how well it works
link |
00:07:21.280
and how well it was able to aggregate
link |
00:07:22.960
all this kind of good information.
link |
00:07:25.000
And essentially the creator of Wikipedia,
link |
00:07:27.320
I don't know, there's probably some debates there,
link |
00:07:29.360
but set the initial conditions
link |
00:07:31.200
and now it carried itself forward.
link |
00:07:33.240
That's really interesting.
link |
00:07:34.080
So the way you're thinking about AGI
link |
00:07:36.520
or artificial intelligence is you're focused on
link |
00:07:38.640
setting the initial conditions for the progress.
link |
00:07:41.200
That's right.
link |
00:07:42.320
That's powerful.
link |
00:07:43.160
Okay, so look into the future.
link |
00:07:45.560
If you create an AGI system,
link |
00:07:48.160
like one that can ace the Turing test, natural language,
link |
00:07:51.560
what do you think would be the interactions
link |
00:07:54.800
you would have with it?
link |
00:07:55.840
What do you think are the questions you would ask?
link |
00:07:57.720
Like what would be the first question you would ask?
link |
00:08:00.560
It, her, him.
link |
00:08:01.840
That's right.
link |
00:08:02.680
I think that at that point,
link |
00:08:03.920
if you've really built a powerful system
link |
00:08:05.960
that is capable of shaping the future of humanity,
link |
00:08:08.480
the first question that you really should ask
link |
00:08:10.240
is how do we make sure that this plays out well?
link |
00:08:12.280
And so that's actually the first question
link |
00:08:13.960
that I would ask a powerful AGI system.
link |
00:08:17.600
So you wouldn't ask your colleague,
link |
00:08:19.160
you wouldn't ask like Ilya, you would ask the AGI system.
link |
00:08:22.280
Oh, we've already had the conversation with Ilya, right?
link |
00:08:24.640
And everyone here.
link |
00:08:25.720
And so you want as many perspectives
link |
00:08:27.480
and pieces of wisdom as you can
link |
00:08:29.720
for answering this question.
link |
00:08:31.200
So I don't think you necessarily defer to
link |
00:08:33.120
whatever your powerful system tells you,
link |
00:08:35.480
but you use it as one input
link |
00:08:37.120
to try to figure out what to do.
link |
00:08:39.280
But, and I guess fundamentally,
link |
00:08:40.920
what it really comes down to is
link |
00:08:42.160
if you built something really powerful
link |
00:08:43.960
and you think about, for example,
link |
00:08:45.280
shortly after
link |
00:08:47.640
the creation of nuclear weapons, right?
link |
00:08:48.880
The most important question in the world
link |
00:08:50.400
was what's the world order going to be like?
link |
00:08:52.800
How do we set ourselves up in a place
link |
00:08:54.880
where we're going to be able to survive as a species?
link |
00:08:58.320
With AGI, I think the question is slightly different, right?
link |
00:09:00.640
That there is a question of how do we make sure
link |
00:09:02.720
that we don't get the negative effects?
link |
00:09:04.440
But there's also the positive side, right?
link |
00:09:06.240
You imagine that, you know, like,
link |
00:09:08.040
like what will AGI be like?
link |
00:09:09.720
Like what will it be capable of?
link |
00:09:11.280
And I think that one of the core reasons
link |
00:09:13.520
that an AGI can be powerful and transformative
link |
00:09:15.760
is actually due to technological development, right?
link |
00:09:18.920
If you have something that's capable,
link |
00:09:20.560
that's as capable as a human and much more scalable,
link |
00:09:23.880
that you absolutely want that thing
link |
00:09:25.880
to go read the whole scientific literature
link |
00:09:27.640
and think about how to create cures for all the diseases, right?
link |
00:09:30.000
You want it to think about how to go
link |
00:09:31.480
and build technologies to help us
link |
00:09:33.360
create material abundance and to figure out societal problems
link |
00:09:37.320
that we have trouble with,
link |
00:09:38.160
like how are we supposed to clean up the environment?
link |
00:09:40.000
And, you know, maybe you want this
link |
00:09:42.200
to go and invent a bunch of little robots that will go out
link |
00:09:44.120
and be biodegradable and turn ocean debris
link |
00:09:47.280
into harmless molecules.
link |
00:09:49.640
And I think that that positive side
link |
00:09:54.040
is something that I think people miss
link |
00:09:55.720
sometimes when thinking about what an AGI will be like.
link |
00:09:58.160
And so I think that if you have a system
link |
00:10:00.280
that's capable of all of that,
link |
00:10:01.640
you absolutely want its advice about how do I make sure
link |
00:10:03.960
that we're using your capabilities
link |
00:10:07.600
in a positive way for humanity.
link |
00:10:09.200
So what do you think about that psychology
link |
00:10:11.400
that looks at all the different possible trajectories
link |
00:10:14.800
of an AGI system, many of which,
link |
00:10:17.520
perhaps the majority of which are positive
link |
00:10:19.960
and nevertheless focuses on the negative trajectories?
link |
00:10:23.320
I mean, you get to interact with folks,
link |
00:10:24.720
you get to think about this maybe within yourself as well.
link |
00:10:28.840
You look at Sam Harris and so on.
link |
00:10:30.560
It seems to be, sorry to put it this way,
link |
00:10:32.720
but almost more fun to think about the negative possibilities.
link |
00:10:37.800
Whatever it is that's deep in our psychology,
link |
00:10:39.560
what do you think about that?
link |
00:10:40.760
And how do we deal with it?
link |
00:10:41.920
Because we want AI to help us.
link |
00:10:44.400
So I think there's kind of two problems
link |
00:10:47.880
entailed in that question.
link |
00:10:49.960
The first is more of the question of,
link |
00:10:52.360
how can you even picture what a world
link |
00:10:54.600
with a new technology will be like?
link |
00:10:56.600
Now imagine we're in 1950
link |
00:10:57.880
and I'm trying to describe Uber to someone.
link |
00:11:01.040
Apps and the internet.
link |
00:11:05.360
Yeah, I mean, that's going to be extremely complicated,
link |
00:11:08.920
but it's imaginable.
link |
00:11:10.160
It's imaginable, right?
link |
00:11:11.400
But, and now imagine being in 1950
link |
00:11:14.000
and predicting Uber, right?
link |
00:11:15.280
And you need to describe the internet,
link |
00:11:17.680
you need to describe GPS,
link |
00:11:18.720
you need to describe the fact
link |
00:11:20.280
that everyone's going to have this phone in their pocket.
link |
00:11:23.920
And so I think that just the first truth
link |
00:11:26.160
is that it is hard to picture
link |
00:11:28.040
how a transformative technology will play out in the world.
link |
00:11:31.160
We've seen that before with technologies
link |
00:11:32.760
that are far less transformative than AGI will be.
link |
00:11:35.560
And so I think that one piece
link |
00:11:37.480
is that it's just even hard to imagine
link |
00:11:39.560
and to really put yourself in a world
link |
00:11:41.640
where you can predict what that positive vision
link |
00:11:44.600
would be like.
link |
00:11:46.920
And I think the second thing is that it is,
link |
00:11:49.520
I think it is always easier to support
link |
00:11:53.280
the negative side than the positive side.
link |
00:11:55.080
It's always easier to destroy than create.
link |
00:11:58.200
And, you know, less in a physical sense
link |
00:12:00.800
and more just in an intellectual sense, right?
link |
00:12:03.080
Because, you know, I think that with creating something,
link |
00:12:05.680
you need to just get a bunch of things right
link |
00:12:07.440
and to destroy, you just need to get one thing wrong.
link |
00:12:10.280
And so I think that what that means
link |
00:12:12.080
is that I think a lot of people's thinking dead ends
link |
00:12:14.240
as soon as they see the negative story.
link |
00:12:16.880
But that being said, I actually have some hope, right?
link |
00:12:20.360
I think that the positive vision
link |
00:12:23.160
is something that I think can be,
link |
00:12:26.000
is something that we can talk about.
link |
00:12:27.600
I think that just simply saying this fact of,
link |
00:12:30.200
yeah, like there's positive, there's negatives,
link |
00:12:32.000
everyone likes to dwell on the negative,
link |
00:12:33.600
people actually respond well to that message and say,
link |
00:12:35.360
huh, you're right, there's a part of this
link |
00:12:37.040
that we're not talking about, not thinking about.
link |
00:12:39.640
And that's actually something that's,
link |
00:12:41.240
I think really been a key part
link |
00:12:43.800
of how we think about AGI at OpenAI, right?
link |
00:12:46.640
You can kind of look at it as like, okay,
link |
00:12:48.160
like OpenAI talks about the fact that there are risks
link |
00:12:51.000
and yet they're trying to build this system.
link |
00:12:53.160
Like how do you square those two facts?
link |
00:12:56.080
So do you share the intuition that some people have,
link |
00:12:59.120
I mean, from Sam Harris to even Elon Musk himself,
link |
00:13:02.680
that it's tricky as you develop AGI
link |
00:13:06.600
to keep it from slipping into the existential threats,
link |
00:13:10.400
into the negative.
link |
00:13:11.760
What's your intuition about,
link |
00:13:13.640
how hard is it to keep AI development
link |
00:13:17.720
on the positive track?
link |
00:13:19.640
What's your intuition there?
link |
00:13:20.680
To answer that question,
link |
00:13:21.560
you can really look at how we structure OpenAI.
link |
00:13:23.960
So we really have three main arms.
link |
00:13:25.840
So we have capabilities,
link |
00:13:26.960
which is actually doing the technical work
link |
00:13:29.040
and pushing forward what these systems can do.
link |
00:13:31.160
There's safety, which is working on technical mechanisms
link |
00:13:35.120
to ensure that the systems we build
link |
00:13:36.920
are aligned with human values.
link |
00:13:38.480
And then there's policy,
link |
00:13:39.640
which is making sure that we have governance mechanisms,
link |
00:13:42.040
answering that question of, well, whose values?
link |
00:13:45.280
And so I think that the technical safety one
link |
00:13:47.360
is the one that people kind of talk about the most, right?
link |
00:13:50.480
You talk about, like think about,
link |
00:13:52.080
you know, all of the dystopic AI movies,
link |
00:13:54.200
a lot of that is about not having good
link |
00:13:55.960
technical safety in place.
link |
00:13:57.520
And what we've been finding is that, you know,
link |
00:13:59.960
I think that actually a lot of people
link |
00:14:01.360
look at the technical safety problem
link |
00:14:02.680
and think it's just intractable, right?
link |
00:14:05.400
This question of what do humans want?
link |
00:14:07.840
How am I supposed to write that down?
link |
00:14:09.160
Can I even write down what I want?
link |
00:14:11.240
No way.
link |
00:14:13.040
And then they stop there.
link |
00:14:14.800
But the thing is we've already built systems
link |
00:14:16.880
that are able to learn things that humans can't specify.
link |
00:14:20.920
You know, even the rules for how to recognize
link |
00:14:22.920
if there's a cat or a dog in an image.
link |
00:14:25.000
Turns out it's intractable to write that down
link |
00:14:26.520
and yet we're able to learn it.
link |
00:14:28.400
And that what we're seeing with systems we build at OpenAI
link |
00:14:31.040
and they're still in early proof of concept stage
link |
00:14:33.800
is that you are able to learn human preferences.
link |
00:14:36.320
You're able to learn what humans want from data.
link |
00:14:38.920
And so that's kind of the core focus
link |
00:14:40.400
for our technical safety team.
link |
00:14:41.760
And I think that they're actually,
link |
00:14:43.800
we've had some pretty encouraging updates
link |
00:14:45.640
in terms of what we've been able to make work.
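To make that concrete, here is a minimal sketch of the general idea of learning what people want from preference data, in the spirit of OpenAI's published preference-learning work but not their actual code: the trajectory features and the simulated "human" labels below are invented for illustration, and a simple Bradley-Terry logistic model recovers the hidden preference direction.

```python
# Minimal sketch (illustrative only): fit a reward model from pairwise
# human preferences. All data here is synthetic; "true_w" plays the role
# of the hidden human values the model is supposed to recover.
import numpy as np

rng = np.random.default_rng(0)
dim = 4
true_w = np.array([1.0, -2.0, 0.5, 0.0])            # hidden preference direction
A = rng.normal(size=(500, dim))                     # features of trajectory A in each pair
B = rng.normal(size=(500, dim))                     # features of trajectory B in each pair
prefer_A = (A @ true_w > B @ true_w).astype(float)  # simulated human choices

# Bradley-Terry model: P(A preferred over B) = sigmoid(r(A) - r(B)), with r(x) = w . x
w = np.zeros(dim)
lr = 0.1
for _ in range(2000):
    logits = (A - B) @ w
    p = 1.0 / (1.0 + np.exp(-logits))
    grad = (A - B).T @ (prefer_A - p) / len(p)      # gradient of the log-likelihood
    w += lr * (grad - 0.01 * w)                     # ascent step with a small L2 penalty

print("recovered direction:", np.round(w / np.linalg.norm(w), 2))
print("true direction:     ", np.round(true_w / np.linalg.norm(true_w), 2))
```

The learned reward can then stand in for an objective nobody could have written down by hand, which is the point Brockman is making about cats, dogs, and human preferences.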
link |
00:14:48.040
So you have an intuition and a hope that from data,
link |
00:14:51.680
you know, looking at the value alignment problem,
link |
00:14:53.640
from data we can build systems that align
link |
00:14:57.040
with the collective better angels of our nature.
link |
00:15:00.600
So align with the ethics and the morals of human beings.
link |
00:15:04.600
To even say this in a different way,
link |
00:15:05.880
I mean, think about how do we align humans, right?
link |
00:15:08.560
Think about like a human baby can grow up
link |
00:15:10.400
to be an evil person or a great person.
link |
00:15:12.880
And a lot of that is from learning from data, right?
link |
00:15:15.200
That you have some feedback as a child is growing up.
link |
00:15:17.720
They get to see positive examples.
link |
00:15:19.160
And so I think that just like the only example
link |
00:15:23.120
we have of a general intelligence
link |
00:15:25.400
that is able to learn from data
link |
00:15:28.040
to align with human values and to learn values,
link |
00:15:31.440
I think we shouldn't be surprised
link |
00:15:32.880
that we can use the same sorts of techniques,
link |
00:15:36.040
or that the same sorts of techniques
link |
00:15:37.440
end up being how we solve value alignment for AGIs.
link |
00:15:41.080
So let's go even higher.
link |
00:15:42.680
I don't know if you've read the book, Sapiens.
link |
00:15:44.800
But there's an idea that, you know,
link |
00:15:48.320
that as a collective, as us human beings,
link |
00:15:50.000
we kind of develop together ideas that we hold.
link |
00:15:54.720
There's no, in that context, objective truth.
link |
00:15:57.920
We just kind of all agree to certain ideas
link |
00:16:00.000
and hold them as a collective.
link |
00:16:01.440
Do you have a sense that there is
link |
00:16:03.480
in the world of good and evil,
link |
00:16:05.360
do you have a sense that to the first approximation,
link |
00:16:07.560
there are some things that are good
link |
00:16:10.280
and that you could teach systems to behave to be good?
link |
00:16:14.520
So I think that this actually blends into our third team,
link |
00:16:18.440
which is the policy team.
link |
00:16:19.880
And this is the one, the aspect that I think people
link |
00:16:22.320
really talk about way less than they should.
link |
00:16:25.280
Because imagine that we build super powerful systems
link |
00:16:27.640
that we've managed to figure out all the mechanisms
link |
00:16:29.720
for these things to do whatever the operator wants.
link |
00:16:32.800
The most important question becomes,
link |
00:16:34.480
who's the operator, what do they want,
link |
00:16:36.720
and how is that going to affect everyone else?
link |
00:16:39.400
And I think that this question of what is good,
link |
00:16:43.080
what are those values, I mean,
link |
00:16:44.720
I think you don't even have to go
link |
00:16:45.960
to those very grand existential places
link |
00:16:48.400
to realize how hard this problem is.
link |
00:16:50.920
You just look at different countries
link |
00:16:52.880
and cultures across the world.
link |
00:16:54.520
And that there's a very different conception
link |
00:16:57.120
of how the world works and what kinds of ways
link |
00:17:01.920
that society wants to operate.
link |
00:17:03.400
And so I think that the really core question
link |
00:17:07.000
is actually very concrete.
link |
00:17:09.560
And I think it's not a question
link |
00:17:10.960
that we have ready answers to,
link |
00:17:12.880
how do you have a world where all the different countries
link |
00:17:16.560
that we have, United States, China, Russia,
link |
00:17:19.720
and the hundreds of other countries out there
link |
00:17:22.720
are able to continue to not just operate
link |
00:17:26.600
in the way that they see fit,
link |
00:17:28.440
but that the world that emerges,
link |
00:17:32.520
where you have these very powerful systems,
link |
00:17:36.040
operating alongside humans,
link |
00:17:37.800
ends up being something that empowers humans more,
link |
00:17:39.800
that makes human existence be a more meaningful thing
link |
00:17:44.120
and that people are happier and wealthier
link |
00:17:46.400
and able to live more fulfilling lives.
link |
00:17:48.960
It's not an obvious thing for how to design that world
link |
00:17:51.560
once you have that very powerful system.
link |
00:17:53.600
So if we take a little step back,
link |
00:17:55.800
and we're having like a fascinating conversation
link |
00:17:58.200
and OpenAI is in many ways a tech leader in the world,
link |
00:18:01.880
and yet we're thinking about these big existential questions
link |
00:18:05.440
which is fascinating, really important.
link |
00:18:07.000
I think you're a leader in that space
link |
00:18:09.160
and that's a really important space
link |
00:18:10.840
of just thinking how AI affects society
link |
00:18:13.080
in a big picture view.
link |
00:18:14.360
So Oscar Wilde said, we're all in the gutter,
link |
00:18:17.320
but some of us are looking at the stars
link |
00:18:19.000
and I think OpenAI has a charter
link |
00:18:22.320
that looks to the stars, I would say,
link |
00:18:24.600
to create intelligence, to create general intelligence,
link |
00:18:26.880
make it beneficial, safe, and collaborative.
link |
00:18:29.440
So can you tell me how that came about?
link |
00:18:33.680
How did a mission like that and the path
link |
00:18:36.320
to creating a mission like that come about when OpenAI was founded?
link |
00:18:39.120
Yeah, so I think that in some ways
link |
00:18:41.640
it really boils down to taking a look at the landscape.
link |
00:18:45.040
So if you think about the history of AI
link |
00:18:47.040
that basically for the past 60 or 70 years,
link |
00:18:49.920
people have thought about this goal
link |
00:18:51.640
of what could happen if you could automate
link |
00:18:53.960
human intellectual labor.
link |
00:18:56.680
Imagine you can build a computer system
link |
00:18:58.280
that could do that, what becomes possible?
link |
00:19:00.560
We have a lot of sci fi that tells stories
link |
00:19:02.400
of various dystopias and increasingly you have movies
link |
00:19:04.920
like Her that tell you a little bit about
link |
00:19:06.480
maybe a little bit more of a utopic vision.
link |
00:19:09.440
You think about the impacts that we've seen
link |
00:19:12.560
from being able to have bicycles for our minds
link |
00:19:16.280
and computers and that I think that the impact
link |
00:19:20.360
of computers and the internet has just far outstripped
link |
00:19:23.480
what anyone really could have predicted.
link |
00:19:26.200
And so I think that it's very clear
link |
00:19:27.400
that if you can build an AGI,
link |
00:19:29.360
it will be the most transformative technology
link |
00:19:31.600
that humans will ever create.
link |
00:19:33.040
And so what it boils down to then is a question of,
link |
00:19:36.840
well, is there a path?
link |
00:19:38.680
Is there hope?
link |
00:19:39.520
Is there a way to build such a system?
link |
00:19:41.680
And I think that for 60 or 70 years
link |
00:19:43.640
that people got excited and that ended up not being able
link |
00:19:48.040
to deliver on the hopes that people had pinned on them.
link |
00:19:51.480
And I think that then, that after two winters
link |
00:19:54.880
of AI development, that people,
link |
00:19:57.600
I think kind of almost stopped daring to dream, right?
link |
00:20:00.560
That really talking about AGI or thinking about AGI
link |
00:20:03.280
became almost this taboo in the community.
link |
00:20:06.640
But I actually think that people took the wrong lesson
link |
00:20:08.720
from AI history.
link |
00:20:10.080
And if you look back, starting in 1959
link |
00:20:12.400
is when the Perceptron was released.
link |
00:20:14.240
And this is basically one of the earliest neural networks.
link |
00:20:17.720
It was released to what was perceived
link |
00:20:19.280
as this massive overhype.
link |
00:20:20.840
So in the New York Times in 1959,
link |
00:20:22.360
you have this article saying that the Perceptron
link |
00:20:26.400
will one day recognize people, call out their names,
link |
00:20:29.160
instantly translate speech between languages.
link |
00:20:31.480
And people at the time looked at this and said,
link |
00:20:33.800
this is, your system can't do any of that.
link |
00:20:36.120
And basically spent 10 years trying to discredit
link |
00:20:38.080
the whole Perceptron direction and succeeded.
link |
00:20:40.640
And all the funding dried up.
link |
00:20:41.840
And people kind of went in other directions.
link |
00:20:44.960
And in the 80s, there was this resurgence.
link |
00:20:46.920
And I'd always heard that the resurgence in the 80s
link |
00:20:49.320
was due to the invention of back propagation
link |
00:20:51.520
and these algorithms that got people excited.
link |
00:20:53.720
But actually the causality was due to people
link |
00:20:55.760
building larger computers.
link |
00:20:57.200
That you can find these articles from the 80s saying
link |
00:20:59.280
that the democratization of computing power
link |
00:21:01.760
suddenly meant that you could run these larger neural networks.
link |
00:21:04.040
And then people started to do all these amazing things,
link |
00:21:06.280
back propagation algorithm was invented.
link |
00:21:08.000
And the neural nets people were running
link |
00:21:10.120
were these tiny little like 20 neuron neural nets.
link |
00:21:13.000
What are you supposed to learn with 20 neurons?
link |
00:21:15.160
And so of course they weren't able to get great results.
link |
00:21:18.640
And it really wasn't until 2012 that this approach,
link |
00:21:21.960
that's almost the most simple, natural approach
link |
00:21:24.680
that people had come up with in the 50s, right?
link |
00:21:27.720
In some ways, even in the 40s before there were computers
link |
00:21:30.360
with the McCulloch-Pitts neuron,
link |
00:21:33.040
suddenly this became the best way of solving problems, right?
link |
00:21:37.480
And I think there are three core properties
link |
00:21:39.280
that deep learning has that I think
link |
00:21:42.120
are very worth paying attention to.
link |
00:21:44.120
The first is generality.
link |
00:21:45.920
We have a very small number of deep learning tools,
link |
00:21:48.760
SGD, deep neural net, maybe some, you know, RL.
link |
00:21:52.360
And it solves this huge variety of problems,
link |
00:21:55.600
speech recognition, machine translation,
link |
00:21:57.240
game playing, all of these problems,
link |
00:22:00.200
small set of tools.
link |
00:22:01.040
So there's the generality.
link |
00:22:02.760
There's a second piece, which is the competence.
link |
00:22:05.000
You wanna solve any of those problems?
link |
00:22:07.040
Throw out 40 years' worth of normal computer vision research,
link |
00:22:10.640
replace it with a deep neural net, it's gonna work better.
link |
00:22:13.640
And there's a third piece, which is the scalability, right?
link |
00:22:16.320
That one thing that has been shown time and time again
link |
00:22:18.720
is that you, if you have a larger neural network,
link |
00:22:21.760
throw more compute, more data at it, it will work better.
link |
00:22:25.120
Those three properties together feel like essential parts
link |
00:22:28.880
of building a general intelligence.
link |
00:22:30.800
Now, it doesn't just mean that if we scale up
link |
00:22:33.000
what we have, that we will have an AGI, right?
link |
00:22:35.200
There are clearly missing pieces.
link |
00:22:36.800
There are missing ideas.
link |
00:22:38.000
We need to have answers for reasoning.
link |
00:22:40.000
But I think that the core here is that for the first time,
link |
00:22:44.800
it feels that we have a paradigm
link |
00:22:46.880
that gives us hope that general intelligence
link |
00:22:48.960
can be achievable.
link |
00:22:50.560
And so as soon as you believe that,
link |
00:22:52.160
everything else comes into focus, right?
link |
00:22:54.480
If you imagine that you may be able to,
link |
00:22:56.560
and that the timeline I think remains uncertain,
link |
00:22:59.920
but I think that certainly within our lifetimes
link |
00:23:02.200
and possibly within a much shorter period of time
link |
00:23:04.640
than people would expect,
link |
00:23:06.560
if you can really build the most transformative technology
link |
00:23:09.360
that will ever exist, you stop thinking about yourself
link |
00:23:11.720
so much, right?
link |
00:23:12.560
And you start thinking about just like,
link |
00:23:14.240
how do you have a world where this goes well?
link |
00:23:16.440
And that you need to think about the practicalities
link |
00:23:18.160
of how do you build an organization
link |
00:23:19.560
and get together a bunch of people and resources
link |
00:23:22.000
and to make sure that people feel motivated
link |
00:23:25.160
and ready to do it.
link |
00:23:28.080
But I think that then you start thinking about,
link |
00:23:30.720
well, what if we succeed?
link |
00:23:32.080
And how do we make sure that when we succeed,
link |
00:23:34.280
that the world is actually the place
link |
00:23:35.600
that we want ourselves to exist in?
link |
00:23:38.200
And almost in the Rawlsian veil sense of the word.
link |
00:23:41.080
And so that's kind of the broader landscape.
link |
00:23:43.880
And OpenAI was really formed in 2015
link |
00:23:46.680
with that high level picture of AGI might be possible
link |
00:23:51.480
sooner than people think
link |
00:23:52.880
and that we need to try to do our best
link |
00:23:55.840
to make sure it's going to go well.
link |
00:23:57.480
And then we spent the next couple of years
link |
00:23:59.360
really trying to figure out what does that mean?
link |
00:24:00.840
How do we do it?
link |
00:24:01.960
And I think that typically with a company,
link |
00:24:04.800
you start out very small.
link |
00:24:07.320
So you want a cofounder and you build a product,
link |
00:24:09.000
you get some users, you get a product market fit,
link |
00:24:11.360
then at some point you raise some money,
link |
00:24:13.320
you hire people, you scale,
link |
00:24:14.840
and then down the road, then the big companies
link |
00:24:17.440
realize you exist and try to kill you.
link |
00:24:19.080
And for OpenAI, it was basically everything
link |
00:24:21.520
in exactly the opposite order.
link |
00:24:25.480
Let me just pause for a second.
link |
00:24:26.760
You said a lot of things.
link |
00:24:27.520
And let me just admire the jarring aspect
link |
00:24:31.240
of what OpenAI stands for, which is daring to dream.
link |
00:24:35.160
I mean, you said it's pretty powerful.
link |
00:24:37.120
You caught me off guard because I think that's very true.
link |
00:24:40.080
The step of just daring to dream
link |
00:24:44.040
about the possibilities of creating intelligence
link |
00:24:46.720
in a positive and a safe way,
link |
00:24:48.760
but just even creating intelligence
link |
00:24:50.640
is a much needed, refreshing catalyst
link |
00:24:56.280
for the AI community.
link |
00:24:57.360
So that's the starting point.
link |
00:24:58.800
Okay, so then formation of OpenAI, what's your point?
link |
00:25:02.840
I would just say that when we were starting OpenAI,
link |
00:25:05.640
that kind of the first question that we had is,
link |
00:25:07.760
is it too late to start a lab with a bunch of the best people?
link |
00:25:12.000
Right, is that even possible?
link |
00:25:13.160
That was an actual question.
link |
00:25:14.320
That was the core question of,
link |
00:25:17.280
we had this dinner in July of 2015,
link |
00:25:19.320
and that was really what we spent the whole time
link |
00:25:21.240
talking about.
link |
00:25:22.320
And because you think about kind of where AI was,
link |
00:25:26.800
is that it transitioned from being an academic pursuit
link |
00:25:30.200
to an industrial pursuit.
link |
00:25:32.240
And so a lot of the best people were in these big
link |
00:25:34.240
research labs and that we wanted to start our own one
link |
00:25:37.000
that no matter how many resources we could accumulate
link |
00:25:40.560
would pale in comparison to the big tech companies.
link |
00:25:43.520
And we knew that.
link |
00:25:44.720
And there's a question of,
link |
00:25:45.800
are we going to be actually able to get this thing
link |
00:25:47.720
off the ground?
link |
00:25:48.720
You need critical mass.
link |
00:25:49.760
You can't just do you and a cofounder build a product, right?
link |
00:25:52.120
You really need to have a group of five to 10 people.
link |
00:25:55.600
And we kind of concluded it wasn't obviously impossible.
link |
00:25:59.480
So it seemed worth trying.
link |
00:26:02.240
Well, you're also a dreamer, so who knows, right?
link |
00:26:04.800
That's right.
link |
00:26:05.640
Okay, so speaking of that,
link |
00:26:07.720
competing with the big players,
link |
00:26:11.520
let's talk about some of the tricky things
link |
00:26:14.080
as you think through this process of growing,
link |
00:26:17.480
of seeing how you can develop these systems
link |
00:26:20.080
at a scale that competes.
link |
00:26:22.640
So you recently formed OpenAI LP,
link |
00:26:26.560
a new capped-profit company that now carries the name OpenAI.
link |
00:26:30.800
So OpenAI is now this official company.
link |
00:26:33.280
The original nonprofit company still exists
link |
00:26:36.520
and carries the OpenAI nonprofit name.
link |
00:26:39.800
So can you explain what this company is,
link |
00:26:42.000
what the purpose of its creation is,
link |
00:26:44.280
and how did you arrive at the decision to create it?
link |
00:26:48.800
OpenAI, the whole entity and OpenAI LP as a vehicle
link |
00:26:53.280
is trying to accomplish the mission
link |
00:26:55.560
of ensuring that artificial general intelligence
link |
00:26:57.520
benefits everyone.
link |
00:26:58.800
And the main way that we're trying to do that
link |
00:27:00.240
is by actually trying to build
link |
00:27:01.840
general intelligence to ourselves
link |
00:27:03.240
and make sure the benefits are distributed to the world.
link |
00:27:05.920
That's the primary way.
link |
00:27:07.200
We're also fine if someone else does this, right?
link |
00:27:09.600
It doesn't have to be us.
link |
00:27:10.640
If someone else is going to build an AGI
link |
00:27:12.640
and make sure that the benefits don't get locked up
link |
00:27:14.840
in one company or with one set of people,
link |
00:27:19.280
like we're actually fine with that.
link |
00:27:21.160
And so those ideas are baked into our charter,
link |
00:27:25.400
which is kind of the foundational document
link |
00:27:28.400
that describes kind of our values and how we operate.
link |
00:27:31.920
And it's also really baked into the structure of OpenAI LP.
link |
00:27:36.360
And so the way that we've set up OpenAI LP
link |
00:27:37.960
is that in the case where we succeed, right?
link |
00:27:42.160
If we actually build what we're trying to build,
link |
00:27:45.320
then investors are able to get a return,
link |
00:27:47.800
but that return is something that is capped.
link |
00:27:50.400
And so if you think of AGI in terms of the value
link |
00:27:53.000
that you could really create,
link |
00:27:54.160
you're talking about the most transformative technology
link |
00:27:56.320
ever created, it's gonna create,
link |
00:27:58.000
orders of magnitude more value than any existing company.
link |
00:28:01.880
And that all of that value will be owned by the world,
link |
00:28:05.960
like legally titled to the nonprofit
link |
00:28:07.880
to fulfill that mission.
link |
00:28:09.560
And so that's the structure.
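As a rough worked example of how a capped return splits value between investors and the nonprofit: the 100x cap for first-round investors comes from the OpenAI LP announcement, while every dollar figure below is hypothetical.

```python
# Hypothetical arithmetic for a capped-profit return. Only the 100x
# first-round cap is taken from the OpenAI LP announcement; the amounts
# are invented for illustration.
investment = 10_000_000                      # an investor's hypothetical stake
cap_multiple = 100                           # first-round cap per the OpenAI LP blog post
uncapped_share = 5_000_000_000_000           # hypothetical share of AGI-scale value created

capped_return = min(uncapped_share, cap_multiple * investment)
to_nonprofit = uncapped_share - capped_return

print(f"investor receives:      ${capped_return:,}")   # $1,000,000,000
print(f"flows to the nonprofit: ${to_nonprofit:,}")    # everything above the cap
```

In the success case Brockman describes, the capped slice looks like a successful startup outcome, and everything above the cap is what ends up owned by the world through the nonprofit.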
link |
00:28:12.800
So the mission is a powerful one,
link |
00:28:15.200
and it's one that I think most people would agree with.
link |
00:28:18.920
It's how we would hope AI progresses.
link |
00:28:22.960
And so how do you tie yourself to that mission?
link |
00:28:25.440
How do you make sure you do not deviate from that mission
link |
00:28:29.240
that other incentives that are profit driven
link |
00:28:34.560
don't interfere with the mission?
link |
00:28:36.800
So this was actually a really core question for us
link |
00:28:39.560
for the past couple of years,
link |
00:28:40.920
because I'd say that the way that our history went
link |
00:28:43.560
was that for the first year,
link |
00:28:44.960
we were getting off the ground, right?
link |
00:28:46.240
We had this high level picture,
link |
00:28:47.960
but we didn't know exactly how we wanted to accomplish it.
link |
00:28:51.880
And really two years ago,
link |
00:28:53.440
it's when we first started realizing
link |
00:28:55.040
in order to build AGI,
link |
00:28:56.160
we're just gonna need to raise way more money
link |
00:28:58.720
than we can as a nonprofit.
link |
00:29:00.680
We're talking many billions of dollars.
link |
00:29:02.800
And so the first question is,
link |
00:29:05.440
how are you supposed to do that
link |
00:29:06.840
and stay true to this mission?
link |
00:29:08.680
And we looked at every legal structure out there
link |
00:29:10.560
and included none of them were quite right
link |
00:29:11.960
for what we wanted to do.
link |
00:29:13.400
And I guess it shouldn't be too surprising
link |
00:29:14.600
if you're gonna do some crazy unprecedented technology
link |
00:29:16.920
that you're gonna have to come
link |
00:29:17.920
with some crazy unprecedented structure to do it in.
link |
00:29:20.320
And a lot of our conversation was with people at OpenAI,
link |
00:29:26.080
the people who really joined
link |
00:29:27.240
because they believe so much in this mission
link |
00:29:29.160
and thinking about how do we actually raise the resources
link |
00:29:32.120
to do it and also stay true to what we stand for.
link |
00:29:35.920
And the place you gotta start is to really align
link |
00:29:38.000
on what is it that we stand for, right?
link |
00:29:39.560
What are those values?
link |
00:29:40.560
What's really important to us?
link |
00:29:41.840
And so I'd say that we spent about a year
link |
00:29:43.760
really compiling the OpenAI charter.
link |
00:29:46.240
And that determines,
link |
00:29:47.560
and if you even look at the first line item in there,
link |
00:29:50.240
it says that, look, we expect we're gonna have to marshal
link |
00:29:52.360
huge amounts of resources,
link |
00:29:53.760
but we're going to make sure
link |
00:29:55.160
that we minimize conflict of interest with the mission.
link |
00:29:57.920
And that kind of aligning on all of those pieces
link |
00:30:00.720
was the most important step towards figuring out
link |
00:30:04.240
how do we structure a company
link |
00:30:06.040
that can actually raise the resources
link |
00:30:08.240
to do what we need to do.
link |
00:30:10.360
I imagine OpenAI, the decision to create OpenAI LP
link |
00:30:14.760
was a really difficult one.
link |
00:30:16.360
And there were a lot of discussions,
link |
00:30:17.920
as you mentioned for a year.
link |
00:30:19.640
And there were different ideas,
link |
00:30:22.760
perhaps detractors within OpenAI,
link |
00:30:26.120
sort of different paths that you could have taken.
link |
00:30:28.920
What were those concerns?
link |
00:30:30.240
What were the different paths considered?
link |
00:30:32.040
What was that process of making that decision like?
link |
00:30:34.080
Yep.
link |
00:30:35.000
But so if you look actually at the OpenAI charter,
link |
00:30:37.200
that there's almost two paths embedded within it.
link |
00:30:40.880
There is, we are primarily trying to build AGI ourselves,
link |
00:30:44.880
but we're also okay if someone else does it.
link |
00:30:47.360
And this is a weird thing for a company.
link |
00:30:49.040
It's really interesting, actually.
link |
00:30:50.480
Yeah.
link |
00:30:51.320
But there is an element of competition
link |
00:30:53.280
that you do want to be the one that does it,
link |
00:30:56.680
but at the same time, you're okay if somebody else does it.
link |
00:30:59.040
We'll talk about that a little bit, that trade off,
link |
00:31:01.000
that dance that's really interesting.
link |
00:31:02.960
And I think this was the core tension
link |
00:31:04.600
as we were designing OpenAI LP
link |
00:31:06.360
and really the OpenAI strategy,
link |
00:31:08.240
is how do you make sure that both you have a shot
link |
00:31:11.080
at being a primary actor,
link |
00:31:12.640
which really requires building an organization,
link |
00:31:15.840
raising massive resources,
link |
00:31:17.720
and really having the will to go
link |
00:31:19.440
and execute on some really, really hard vision, right?
link |
00:31:22.000
You need to really sign up for a long period
link |
00:31:23.760
to go and take on a lot of pain and a lot of risk.
link |
00:31:27.120
And to do that,
link |
00:31:29.000
normally you just import the startup mindset, right?
link |
00:31:31.720
And that you think about, okay,
link |
00:31:32.760
like how do we out execute everyone?
link |
00:31:34.240
You have this very competitive angle.
link |
00:31:36.160
But you also have the second angle of saying that,
link |
00:31:38.120
well, the true mission isn't for OpenAI to build AGI.
link |
00:31:41.600
The true mission is for AGI to go well for humanity.
link |
00:31:45.080
And so how do you take all of those first actions
link |
00:31:48.080
and make sure you don't close the door on outcomes
link |
00:31:51.320
that would actually be positive and fulfill the mission?
link |
00:31:54.480
And so I think it's a very delicate balance, right?
link |
00:31:56.680
And I think that going 100% one direction or the other
link |
00:31:59.560
is clearly not the correct answer.
link |
00:32:01.320
And so I think that even in terms of just how we talk about
link |
00:32:03.920
OpenAI and think about it,
link |
00:32:05.400
there's just like one thing that's always
link |
00:32:07.600
in the back of my mind is to make sure
link |
00:32:09.680
that we're not just saying OpenAI's goal
link |
00:32:12.120
is to build AGI, right?
link |
00:32:14.000
That it's actually much broader than that, right?
link |
00:32:15.560
That first of all, it's not just AGI, it's safe AGI
link |
00:32:19.360
that's very important.
link |
00:32:20.320
But secondly, our goal isn't to be the ones to build it,
link |
00:32:23.120
our goal is to make sure it goes well for the world.
link |
00:32:24.720
And so I think that figuring out,
link |
00:32:26.120
how do you balance all of those
link |
00:32:27.960
and to get people to really come to the table
link |
00:32:30.280
and compile a single document that encompasses all of that
link |
00:32:36.360
wasn't trivial.
link |
00:32:37.560
So part of the challenge here is your mission is,
link |
00:32:41.680
I would say, beautiful, empowering,
link |
00:32:44.240
and a beacon of hope for people in the research community
link |
00:32:47.520
and just people thinking about AI.
link |
00:32:49.200
So your decisions are scrutinized
link |
00:32:51.880
more than, I think, a regular profit driven company.
link |
00:32:55.920
Do you feel the burden of this
link |
00:32:57.400
in the creation of the charter
link |
00:32:58.560
and just in the way you operate?
link |
00:33:00.200
Yes.
link |
00:33:03.040
So why do you lean into the burden
link |
00:33:07.040
by creating such a charter?
link |
00:33:08.640
Why not keep it quiet?
link |
00:33:10.440
I mean, it just boils down to the mission, right?
link |
00:33:12.920
Like, I'm here and everyone else is here
link |
00:33:15.200
because we think this is the most important mission, right?
link |
00:33:17.880
Dare to dream.
link |
00:33:19.000
All right, so do you think you can be good for the world
link |
00:33:23.360
or create an AGI system that's good
link |
00:33:26.000
when you're a for profit company?
link |
00:33:28.320
From my perspective, I don't understand why profit
link |
00:33:32.920
interferes with positive impact on society.
link |
00:33:37.640
I don't understand why Google
link |
00:33:40.760
that makes most of its money from ads
link |
00:33:42.920
can't also do good for the world
link |
00:33:45.040
or other companies, Facebook, anything.
link |
00:33:47.520
I don't understand why those have to interfere.
link |
00:33:50.240
You know, you can, profit isn't the thing in my view
link |
00:33:55.120
that affects the impact of a company.
link |
00:33:57.240
What affects the impact of the company is the charter,
link |
00:34:00.360
is the culture, is the people inside
link |
00:34:04.160
and profit is the thing that just fuels those people.
link |
00:34:07.360
What are your views there?
link |
00:34:08.760
Yeah, so I think that's a really good question
link |
00:34:10.920
and there's some real like longstanding debates
link |
00:34:14.200
in human society that are wrapped up in it.
link |
00:34:16.520
The way that I think about it is just think about
link |
00:34:18.680
what are the most impactful nonprofits in the world?
link |
00:34:24.000
What are the most impactful for-profits in the world?
link |
00:34:26.760
Right, it's much easier to list the for-profits.
link |
00:34:29.280
That's right.
link |
00:34:30.120
And I think that there's some real truth here
link |
00:34:32.400
that the system that we set up,
link |
00:34:34.600
the system for kind of how today's world is organized
link |
00:34:38.320
is one that really allows for huge impact
link |
00:34:41.760
and that kind of part of that is that you need to be,
link |
00:34:45.400
that for profits are self sustaining
link |
00:34:48.080
and able to kind of build on their own momentum.
link |
00:34:51.200
And I think that's a really powerful thing.
link |
00:34:53.080
It's something that when it turns out
link |
00:34:55.880
that we haven't set the guardrails correctly,
link |
00:34:57.920
causes problems, right?
link |
00:34:58.840
Think about logging companies that go into the rainforest,
link |
00:35:02.720
that's really bad, we don't want that.
link |
00:35:04.680
And it's actually really interesting to me
link |
00:35:06.520
that kind of this question of
link |
00:35:08.480
how do you get positive benefits out of a for profit company?
link |
00:35:11.400
It's actually very similar to
link |
00:35:12.600
how do you get positive benefits out of an AGI, right?
link |
00:35:15.800
That you have this like very powerful system,
link |
00:35:18.000
it's more powerful than any human
link |
00:35:19.680
and it's kind of autonomous in some ways.
link |
00:35:21.760
You know, it's super human in a lot of axes
link |
00:35:23.800
and somehow you have to set the guardrails
link |
00:35:25.400
to get good things to happen.
link |
00:35:26.800
But when you do, the benefits are massive.
link |
00:35:29.360
And so I think that when I think about nonprofit
link |
00:35:32.920
versus for profit, I think just not enough happens
link |
00:35:36.120
in nonprofits, they're very pure,
link |
00:35:37.800
but it's just kind of, you know,
link |
00:35:39.200
it's just hard to do things there.
link |
00:35:40.840
And for profits in some ways, like too much happens,
link |
00:35:44.000
but if kind of shaped in the right way,
link |
00:35:46.440
it can actually be very positive.
link |
00:35:47.840
And so with OpenAI LP, we're picking a road in between.
link |
00:35:52.160
Now, the thing that I think is really important to recognize
link |
00:35:54.880
is that the way that we think about OpenAI LP
link |
00:35:57.160
is that in the world where AGI actually happens, right?
link |
00:36:00.440
In a world where we are successful,
link |
00:36:01.720
we build the most transformative technology ever,
link |
00:36:03.800
the amount of value we're going to create will be astronomical.
link |
00:36:07.600
And so then in that case, that the cap that we have
link |
00:36:12.760
will be a small fraction of the value we create.
link |
00:36:15.520
And the amount of value that goes back to investors
link |
00:36:17.800
and employees looks pretty similar to what would happen
link |
00:36:20.000
in a pretty successful startup.
link |
00:36:23.760
And that's really the case that we're optimizing for, right?
link |
00:36:26.520
That we're thinking about in the success case,
link |
00:36:28.560
making sure that the value we create doesn't get locked up.
link |
00:36:32.120
And I expect that in other for profit companies
link |
00:36:34.920
that it's possible to do something like that.
link |
00:36:37.800
I think it's not obvious how to do it, right?
link |
00:36:39.720
And I think that as a for profit company,
link |
00:36:41.440
you have a lot of fiduciary duty to your shareholders
link |
00:36:44.240
and that there are certain decisions
link |
00:36:45.640
that you just cannot make.
link |
00:36:47.520
In our structure, we've set it up
link |
00:36:49.080
so that we have a fiduciary duty to the charter,
link |
00:36:52.440
that we always get to make the decision
link |
00:36:54.400
that is right for the charter,
link |
00:36:56.720
even if it comes at the expense
link |
00:36:58.800
of our own stakeholders.
link |
00:37:00.680
And so I think that when I think about
link |
00:37:03.400
what's really important,
link |
00:37:04.360
it's not really about nonprofit versus for profit.
link |
00:37:06.280
It's really a question of if you build AGI
link |
00:37:09.600
and you kind of, you know,
link |
00:37:10.600
humanity is now at this new age,
link |
00:37:13.080
who benefits, whose lives are better?
link |
00:37:15.760
And I think that what's really important
link |
00:37:17.120
is to have an answer that is everyone.
link |
00:37:20.320
Yeah, which is one of the core aspects of the charter.
link |
00:37:23.400
So one concern people have, not just with OpenAI,
link |
00:37:26.520
but with Google, Facebook, Amazon,
link |
00:37:28.400
anybody really that's creating impact at scale
link |
00:37:35.000
is how do we avoid, as your charter says,
link |
00:37:37.680
avoid enabling the use of AI or AGI
link |
00:37:40.080
to unduly concentrate power?
link |
00:37:43.640
Why would not a company like OpenAI
link |
00:37:45.920
keep all the power of an AGI system to itself?
link |
00:37:48.640
The charter.
link |
00:37:49.520
The charter.
link |
00:37:50.360
So, you know, how does the charter
link |
00:37:53.120
actualize itself in day to day?
link |
00:37:57.240
So I think that first to zoom out, right,
link |
00:38:00.480
that the way that we structure the company
link |
00:38:01.880
is so that the power for sort of, you know,
link |
00:38:04.560
dictating the actions that OpenAI takes
link |
00:38:06.720
ultimately rests with the board, right?
link |
00:38:08.600
The board of the nonprofit and the board is set up
link |
00:38:11.720
in certain ways, with certain restrictions
link |
00:38:13.480
that you can read about in the OpenAI LP blog post.
link |
00:38:16.280
But effectively the board is the governing body
link |
00:38:19.200
for OpenAI LP.
link |
00:38:21.200
And the board has a duty to fulfill the mission
link |
00:38:24.400
of the nonprofit.
link |
00:38:26.360
And so that's kind of how we tie,
link |
00:38:28.800
how we thread all these things together.
link |
00:38:30.960
Now there's a question of so day to day,
link |
00:38:32.880
how do people, the individuals,
link |
00:38:34.800
who in some ways are the most empowered ones, right?
link |
00:38:36.960
You know, the board sort of gets to call the shots
link |
00:38:38.800
at the high level, but the people who are actually executing
link |
00:38:41.920
are the employees, right?
link |
00:38:43.120
The people here on a day to day basis who have the,
link |
00:38:45.480
you know, the keys to the technical kingdom.
link |
00:38:48.960
And there I think that the answer looks a lot like,
link |
00:38:51.720
well, how does any company's values get actualized, right?
link |
00:38:55.120
And I think that a lot of that comes down to
link |
00:38:56.720
that you need people who are here
link |
00:38:58.160
because they really believe in that mission
link |
00:39:01.320
and they believe in the charter
link |
00:39:02.800
and that they are willing to take actions
link |
00:39:05.440
that maybe are worse for them, but are better for the charter.
link |
00:39:08.600
And that's something that's really baked into the culture.
link |
00:39:11.440
And honestly, I think it's, you know,
link |
00:39:13.200
I think that that's one of the things
link |
00:39:14.560
that we really have to work to preserve as time goes on.
link |
00:39:18.200
And that's a really important part of how we think
link |
00:39:20.760
about hiring people and bringing people into OpenAI.
link |
00:39:23.040
So there's people here, there's people here
link |
00:39:25.320
who could speak up and say, like, hold on a second,
link |
00:39:30.840
this is totally against what we stand for, culture wise.
link |
00:39:34.600
Yeah, yeah, for sure.
link |
00:39:35.440
I mean, I think that we actually have,
link |
00:39:37.120
I think that's like a pretty important part
link |
00:39:38.760
of how we operate and how we have,
link |
00:39:41.920
even again with designing the charter
link |
00:39:44.160
and designing OpenAI in the first place,
link |
00:39:46.680
that there has been a lot of conversation
link |
00:39:48.760
with employees here and a lot of times
link |
00:39:50.480
where employees said, wait a second,
link |
00:39:52.400
this seems like it's going in the wrong direction
link |
00:39:53.920
and let's talk about it.
link |
00:39:55.120
And so, you know,
link |
00:39:57.360
here's actually one thing
link |
00:39:58.880
that I think is very unique about us as a small company,
link |
00:40:02.080
is that if you're at a massive tech giant,
link |
00:40:04.360
it's a little bit hard for someone
link |
00:40:05.680
who's a line employee to go and talk to the CEO
link |
00:40:08.120
and say, I think that we're doing this wrong.
link |
00:40:10.520
And you know, you'll get companies like Google
link |
00:40:13.040
that have had some collective action from employees
link |
00:40:15.720
to make ethical change around things like Maven.
link |
00:40:19.400
And so maybe there are mechanisms
link |
00:40:20.680
that work at other companies,
link |
00:40:22.240
but here, it's super easy for anyone to pull me aside,
link |
00:40:24.480
to pull Sam aside, to pull Ilya aside,
link |
00:40:26.320
and people do it all the time.
link |
00:40:27.800
One of the interesting things in the charter
link |
00:40:29.800
is this idea that it'd be great
link |
00:40:31.640
if you could try to describe or untangle
link |
00:40:34.240
switching from competition to collaboration
link |
00:40:36.440
and late stage AGI development.
link |
00:40:38.920
It's really interesting,
link |
00:40:39.760
this dance between competition and collaboration,
link |
00:40:42.160
how do you think about that?
link |
00:40:43.400
Yeah, assuming that you can actually do
link |
00:40:45.000
the technical side of AGI development,
link |
00:40:47.040
I think there's going to be two key problems
link |
00:40:48.960
with figuring out how do you actually deploy it
link |
00:40:50.400
and make it go well.
link |
00:40:51.520
The first one of these is the run up
link |
00:40:53.160
to building the first AGI.
link |
00:40:56.360
You look at how self driving cars are being developed,
link |
00:40:58.920
and it's a competitive race.
link |
00:41:00.680
And the thing that always happens in a competitive race
link |
00:41:02.560
is that you have huge amounts of pressure
link |
00:41:04.160
to get rid of safety.
link |
00:41:06.800
And so that's one thing we're very concerned about, right?
link |
00:41:08.920
Is that people, multiple teams figuring out,
link |
00:41:12.000
we can actually get there,
link |
00:41:13.600
but you know, if we took the slower path
link |
00:41:16.680
that is more guaranteed to be safe, we will lose.
link |
00:41:20.240
And so we're going to take the fast path.
link |
00:41:22.360
And so the more that we can, both ourselves,
link |
00:41:25.480
be in a position where we don't generate
link |
00:41:27.280
that competitive race, where we say,
link |
00:41:29.000
if the race is being run and that someone else
link |
00:41:31.520
is further ahead than we are,
link |
00:41:33.280
we're not going to try to leapfrog.
link |
00:41:35.600
We're going to actually work with them, right?
link |
00:41:37.200
We will help them succeed.
link |
00:41:38.800
As long as what they're trying to do
link |
00:41:40.440
is to fulfill our mission, then we're good.
link |
00:41:42.920
We don't have to build AGI ourselves.
link |
00:41:44.800
And I think that's a really important commitment from us,
link |
00:41:47.080
but it can't just be unilateral, right?
link |
00:41:49.080
I think that it's really important
link |
00:41:50.400
that other players who are serious about building AGI
link |
00:41:53.120
make similar commitments, right?
link |
00:41:54.680
And I think that, you know, again,
link |
00:41:56.640
to the extent that everyone believes
link |
00:41:57.840
that AGI should be something to benefit everyone,
link |
00:42:00.080
then it actually really shouldn't matter
link |
00:42:01.240
which company builds it.
link |
00:42:02.440
And we should all be concerned about the case
link |
00:42:04.160
where we just race so hard to get there
link |
00:42:06.080
that something goes wrong.
link |
00:42:07.640
So what role do you think government,
link |
00:42:10.560
our favorite entity has in setting policy and rules
link |
00:42:13.840
about this domain, from research to the development
link |
00:42:18.320
to early stage, to late stage AI and AGI development?
link |
00:42:22.880
So I think that, first of all,
link |
00:42:25.640
it's really important that government's in there, right?
link |
00:42:28.080
In some way, shape, or form, you know,
link |
00:42:29.800
at the end of the day, we're talking about
link |
00:42:30.920
building technology that will shape how the world operates
link |
00:42:35.080
and that there needs to be government as part of that answer.
link |
00:42:39.040
And so that's why we've done a number
link |
00:42:42.160
of different congressional testimonies.
link |
00:42:43.600
We interact with a number of different lawmakers
link |
00:42:46.440
and that right now, a lot of our message to them
link |
00:42:50.040
is that it's not the time for regulation,
link |
00:42:54.360
it is the time for measurement, right?
link |
00:42:56.400
That our main policy recommendation is that people,
link |
00:42:59.080
and you know, the government does this all the time
link |
00:43:00.680
with bodies like NIST, spend time trying to figure out
link |
00:43:04.880
just where the technology is, how fast it's moving,
link |
00:43:07.920
and can really become literate and up to speed
link |
00:43:11.200
with respect to what to expect.
link |
00:43:13.520
So I think that today, the answer really
link |
00:43:15.240
is about measurement.
link |
00:43:17.320
And I think that there will be a time and place
link |
00:43:20.160
where that will change.
link |
00:43:21.720
And I think it's a little bit hard to predict exactly
link |
00:43:24.840
what that trajectory should look like.
link |
00:43:27.120
So there will be a point at which regulation,
link |
00:43:31.080
federal in the United States, the government steps in
link |
00:43:34.200
and helps be the, I don't wanna say the adult in the room,
link |
00:43:39.520
to make sure that there are strict rules,
link |
00:43:42.400
maybe conservative rules that nobody can cross.
link |
00:43:45.200
Well, I think there's kind of maybe two angles to it.
link |
00:43:47.400
So today with narrow AI applications,
link |
00:43:49.800
that I think there are already existing bodies
link |
00:43:51.960
that are responsible and should be responsible for regulation.
link |
00:43:54.880
You think about, for example, with self driving cars,
link |
00:43:57.040
that you want the National Highway Traffic Safety Administration.
link |
00:44:00.720
Yeah, exactly to be regulated in that.
link |
00:44:02.920
That makes sense, right?
link |
00:44:04.040
That basically what we're saying
link |
00:44:04.960
is that we're going to have these technological systems
link |
00:44:08.120
that are going to be performing applications
link |
00:44:10.600
that humans already do.
link |
00:44:12.280
Great, we already have ways of thinking about standards
link |
00:44:14.800
and safety for those.
link |
00:44:16.160
So I think actually empowering those regulators today
link |
00:44:18.880
is also pretty important.
link |
00:44:20.040
And then I think for AGI, that there's going to be a point
link |
00:44:24.760
where we'll have better answers.
link |
00:44:26.040
And I think that maybe a similar approach
link |
00:44:27.640
of first measurement and then starting to think about
link |
00:44:30.520
what the rules should be.
link |
00:44:31.640
I think it's really important
link |
00:44:32.640
that we don't prematurely squash progress.
link |
00:44:36.280
I think it's very easy to kind of smother a budding field.
link |
00:44:40.160
And I think that's something to really avoid.
link |
00:44:42.160
But I don't think that the right way of doing it
link |
00:44:43.760
is to say, let's just try to blaze ahead
link |
00:44:46.920
and not involve all these other stakeholders.
link |
00:44:51.480
So you've recently released a paper on GPT2
link |
00:44:56.240
language modeling, but did not release the full model
link |
00:45:02.040
because you had concerns about the possible negative effects
link |
00:45:05.280
of the availability of such a model.
link |
00:45:07.480
It's outside of just that decision,
link |
00:45:10.680
and it's super interesting because of the discussion
link |
00:45:14.360
at a societal level, the discourse it creates.
link |
00:45:17.000
So it's fascinating in that aspect.
link |
00:45:19.320
But if we think about the specifics here first,
link |
00:45:22.880
what are some negative effects that you envisioned?
link |
00:45:25.920
And of course, what are some of the positive effects?
link |
00:45:28.600
Yeah, so again, I think to zoom out,
link |
00:45:30.640
like the way that we thought about GPT2
link |
00:45:34.040
is that with language modeling,
link |
00:45:35.800
we are clearly on a trajectory right now
link |
00:45:38.560
where we scale up our models
link |
00:45:40.880
and we get qualitatively better performance, right?
link |
00:45:44.480
GPT2 itself was actually just a scale up
link |
00:45:47.360
of a model that we'd released the previous June, right?
link |
00:45:50.680
And we just ran it at much larger scale
link |
00:45:52.880
and we got these results
link |
00:45:53.880
where it suddenly started writing coherent prose,
link |
00:45:57.240
which was not something we'd seen previously.
link |
00:46:00.040
And what are we doing now?
link |
00:46:01.320
Well, we're gonna scale up GPT2 by 10x by 100x by 1000x
link |
00:46:05.760
and we don't know what we're gonna get.
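For readers unfamiliar with the objective being scaled here, the sketch below is a deliberately tiny, count-based illustration of language modeling: predict the next token given the preceding context, and score yourself by how surprised you are by the true next token. This is not OpenAI's code; GPT2 replaces the count table with a large Transformer trained on vastly more text, but the objective it optimizes is the same next-token prediction.

```python
# A minimal sketch of the language-modeling objective: predict the next
# token given the previous context. A toy bigram counter stands in for the
# large Transformer that GPT2 actually uses.

import math
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each previous word.
follow = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow[prev][nxt] += 1

def next_token_probs(prev):
    counts = follow[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Training minimizes the average negative log-likelihood (cross-entropy)
# of the actual next token; here we just measure it on the toy corpus.
pairs = list(zip(corpus, corpus[1:]))
nll = -sum(math.log(next_token_probs(p)[n]) for p, n in pairs) / len(pairs)
print("avg negative log-likelihood:", nll)
print("P(next | 'the') =", next_token_probs("the"))
```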
link |
00:46:07.840
And so it's very clear that the model
link |
00:46:10.120
that we released last June,
link |
00:46:12.840
I think it's kind of like, it's a good academic toy.
link |
00:46:16.440
It's not something that we think is something
link |
00:46:18.920
that can really have negative applications
link |
00:46:20.440
or to the extent that it can,
link |
00:46:21.680
that the positive of people being able to play with it
link |
00:46:24.360
far outweighs the possible harms.
link |
00:46:28.280
You fast forward to not GPT2, but GPT20,
link |
00:46:32.600
and you think about what that's gonna be like.
link |
00:46:34.720
And I think that the capabilities are going to be substantive.
link |
00:46:38.200
And so there needs to be a point in between the two
link |
00:46:41.120
where you say, this is something
link |
00:46:43.480
where we are drawing the line
link |
00:46:45.200
and that we need to start thinking about the safety aspects.
link |
00:46:48.000
And I think for GPT2, we could have gone either way.
link |
00:46:50.160
And in fact, when we had conversations internally
link |
00:46:52.720
that we had a bunch of pros and cons
link |
00:46:54.760
and it wasn't clear which one outweighed the other.
link |
00:46:58.160
And I think that when we announced
link |
00:46:59.840
that, hey, we decided not to release this model,
link |
00:47:02.160
then there was a bunch of conversation
link |
00:47:03.600
where various people said it's so obvious
link |
00:47:05.200
that you should have just released it.
link |
00:47:06.360
There are other people that said it's so obvious
link |
00:47:07.520
you should not have released it.
link |
00:47:08.840
And I think that that almost definitionally means
link |
00:47:10.960
that holding it back was the correct decision.
link |
00:47:13.800
If it's not obvious whether something is beneficial
link |
00:47:17.000
or not, you should probably default to caution.
link |
00:47:19.720
And so I think that the overall landscape
link |
00:47:22.440
for how we think about it
link |
00:47:23.760
is that this decision could have gone either way.
link |
00:47:25.920
There are great arguments in both directions.
link |
00:47:27.960
But for future models down the road,
link |
00:47:30.080
and possibly sooner than you'd expect,
link |
00:47:32.320
because scaling these things up doesn't actually
link |
00:47:33.880
take that long, those ones,
link |
00:47:36.800
you're definitely not going to want to release into the wild.
link |
00:47:39.600
And so I think that we almost view this as a test case
link |
00:47:42.640
and to see, can we even design,
link |
00:47:45.360
how do you have a society or how do you have a system
link |
00:47:47.960
that goes from having no concept of responsible disclosure
link |
00:47:50.520
where the mere idea of not releasing something
link |
00:47:53.440
for safety reasons is unfamiliar
link |
00:47:55.960
to a world where you say, okay,
link |
00:47:57.440
we have a powerful model.
link |
00:47:58.720
Let's at least think about it.
link |
00:47:59.720
Let's go through some process.
link |
00:48:01.280
And you think about the security community.
link |
00:48:02.680
It took them a long time
link |
00:48:03.880
to design responsible disclosure.
link |
00:48:05.960
You think about this question of,
link |
00:48:07.200
well, I have a security exploit.
link |
00:48:08.800
I send it to the company.
link |
00:48:09.760
The company is like, tries to prosecute me
link |
00:48:12.000
or just ignores it.
link |
00:48:14.760
What do I do?
link |
00:48:16.080
And so the alternatives of,
link |
00:48:17.320
oh, I'll just always publish my exploits.
link |
00:48:19.120
That doesn't seem good either.
link |
00:48:20.200
And so it really took a long time
link |
00:48:21.600
and it was bigger than any individual.
link |
00:48:25.320
It's really about building a whole community
link |
00:48:27.080
that believe that, okay, we'll have this process
link |
00:48:28.760
where you send it to the company
link |
00:48:30.160
if they don't act at a certain time,
link |
00:48:31.680
then you can go public
link |
00:48:33.120
and you're not a bad person.
link |
00:48:34.440
You've done the right thing.
link |
00:48:36.240
And I think that in AI,
link |
00:48:38.680
part of the response to GPT2 just proves
link |
00:48:41.400
that we don't have any concept of this.
link |
00:48:44.200
So that's the high level picture.
link |
00:48:47.080
And so I think that,
link |
00:48:48.720
I think this was a really important move to make.
link |
00:48:51.240
And we could have maybe delayed it for GPT3,
link |
00:48:54.000
but I'm really glad we did it for GPT2.
link |
00:48:56.080
And so now you look at GPT2 itself
link |
00:48:57.760
and you think about the substance of, okay,
link |
00:48:59.440
what are potential negative applications?
link |
00:49:01.320
So you have this model that's been trained on the internet,
link |
00:49:04.120
which is also going to include a bunch of very biased data,
link |
00:49:06.520
a bunch of very offensive content in there.
link |
00:49:09.600
And you can ask it to generate content for you
link |
00:49:13.240
on basically any topic, right?
link |
00:49:14.600
You just give it a prompt
link |
00:49:15.440
and it'll just start writing
link |
00:49:16.800
and it writes content like you see on the internet,
link |
00:49:19.120
you know, even down to like saying advertisement
link |
00:49:21.960
in the middle of some of its generations.
link |
00:49:24.200
And you think about the possibilities
link |
00:49:26.200
for generating fake news or abusive content.
link |
00:49:29.280
And, you know, it's interesting
link |
00:49:30.120
seeing what people have done with, you know,
link |
00:49:31.880
we released a smaller version of GPT2
link |
00:49:34.400
and the people have done things like try to generate,
link |
00:49:37.480
you know, take my own Facebook message history
link |
00:49:40.760
and generate more Facebook messages like me
link |
00:49:43.360
and people generating fake politician content
link |
00:49:47.360
or, you know, there's a bunch of things there
link |
00:49:49.520
where you at least have to think,
link |
00:49:51.920
is this going to be good for the world?
link |
00:49:54.720
There's the flip side, which is I think
link |
00:49:56.320
that there's a lot of awesome applications
link |
00:49:57.840
that we really want to see like creative applications
link |
00:50:01.640
in terms of if you have sci fi authors
link |
00:50:04.000
that can work with this tool and come up with cool ideas,
link |
00:50:06.760
like that seems awesome if we can write better sci fi
link |
00:50:09.720
through the use of these tools.
link |
00:50:11.360
And we've actually had a bunch of people write in to us
link |
00:50:13.080
asking, hey, can we use it for, you know,
link |
00:50:16.160
a variety of different creative applications?
link |
00:50:18.360
So the positive are actually pretty easy to imagine.
link |
00:50:21.880
There are, you know, the usual NLP applications
link |
00:50:26.880
that are really interesting, but let's go there.
link |
00:50:30.960
It's kind of interesting to think about a world
link |
00:50:32.960
where, look at Twitter, where not just fake news
link |
00:50:37.960
but smarter and smarter bots being able to spread
link |
00:50:43.040
in an interesting, complex, networked way, information
link |
00:50:47.400
that just floods out us regular human beings
link |
00:50:50.800
with our original thoughts.
link |
00:50:52.880
So what are your views of this world with GPT 20?
link |
00:50:58.760
Right, how do we think about, again,
link |
00:51:01.600
it's like one of those things about in the 50s
link |
00:51:03.560
trying to describe the internet or the smartphone.
link |
00:51:08.720
What do you think about that world,
link |
00:51:09.960
the nature of information?
link |
00:51:12.920
One possibility is that we'll always try to design systems
link |
00:51:16.760
that identify a robot versus human
link |
00:51:19.680
and we'll do so successfully.
link |
00:51:21.280
And so we'll authenticate that we're still human.
link |
00:51:24.600
And the other world is that we just accept the fact
link |
00:51:27.520
that we're swimming in a sea of fake news
link |
00:51:30.360
and just learn to swim there.
link |
00:51:32.120
Well, have you ever seen the, there's a, you know,
link |
00:51:34.800
popular meme of a robot with a physical arm and pen
link |
00:51:41.520
clicking the I'm not a robot button?
link |
00:51:43.440
Yeah.
link |
00:51:44.280
I think the truth is that really trying to distinguish
link |
00:51:48.560
between robot and human is a losing battle.
link |
00:51:52.160
Ultimately, you think it's a losing battle?
link |
00:51:53.800
I think it's a losing battle ultimately, right?
link |
00:51:55.520
I think that that's the case in terms of the content,
link |
00:51:57.800
in terms of the actions that you can take.
link |
00:51:59.360
I mean, think about how captchas have gone, right?
link |
00:52:01.200
Captchas used to be very nice and simple.
link |
00:52:02.920
You just have this image, all of our OCR is terrible.
link |
00:52:06.320
You put a couple of artifacts in it, you know,
link |
00:52:08.880
humans are gonna be able to tell what it is
link |
00:52:11.040
and an AI system wouldn't be able to. Today,
link |
00:52:13.840
like, I can barely do captchas.
link |
00:52:15.720
And I think that this is just kind of where we're going.
link |
00:52:18.360
I think captchas were a moment-in-time thing.
link |
00:52:20.400
And as AI systems become more powerful,
link |
00:52:22.520
that there being human capabilities that can be measured
link |
00:52:25.520
in a very easy automated way that the AIs will not be
link |
00:52:29.360
capable of, I think that's just like,
link |
00:52:31.120
it's just an increasingly hard technical battle.
link |
00:52:34.160
But it's not that all hope is lost, right?
link |
00:52:36.240
And you think about how do we already authenticate
link |
00:52:39.760
ourselves, right?
link |
00:52:40.600
That, you know, we have systems.
link |
00:52:41.760
We have social security numbers.
link |
00:52:43.440
If you're in the US, or, you know, you have
link |
00:52:46.560
you know, ways of identifying individual people
link |
00:52:48.920
and having real world identity tied to digital identity
link |
00:52:51.880
seems like a step towards, you know,
link |
00:52:54.880
authenticating the source of content
link |
00:52:56.200
rather than the content itself.
link |
00:52:58.240
Now, there are problems with that.
link |
00:53:00.000
How can you have privacy and anonymity in a world
link |
00:53:03.000
where the only content you can really trust is,
link |
00:53:05.440
or the only way you can trust content
link |
00:53:06.560
is by looking at where it comes from.
link |
00:53:08.560
And so I think that building out good reputation networks
link |
00:53:11.400
may be one possible solution.
link |
00:53:14.080
But yeah, I think that this question is not
link |
00:53:16.280
an obvious one.
link |
00:53:17.720
And I think that we, you know,
link |
00:53:19.320
maybe sooner than we think we'll be in a world
link |
00:53:20.880
where, you know, today I often will read a tweet
link |
00:53:23.800
and be like, do I feel like a real human wrote this?
link |
00:53:25.960
Or, you know, do I feel like this was like genuine?
link |
00:53:27.560
I feel like I can kind of judge the content a little bit.
link |
00:53:30.160
And I think in the future, it just won't be the case.
link |
00:53:32.640
You look at, for example, the FCC comments on net neutrality.
link |
00:53:36.880
It came out later that millions of those were auto generated
link |
00:53:39.880
and that the researchers were able to do various
link |
00:53:41.960
statistical techniques to detect that.
link |
00:53:44.040
What do you do in a world where those statistical techniques
link |
00:53:47.160
don't exist?
link |
00:53:48.000
It's just impossible to tell the difference
link |
00:53:49.120
between humans and AIs.
link |
00:53:50.640
And in fact, the most persuasive arguments
link |
00:53:53.960
are written by AI, all that stuff.
link |
00:53:57.200
It's not sci fi anymore.
link |
00:53:58.600
You look at GPT2 making a great argument for why recycling
link |
00:54:01.320
is bad for the world.
link |
00:54:02.560
You got to read that and be like, huh, you're right.
link |
00:54:04.440
We are addressing just the symptoms.
link |
00:54:06.520
Yeah, that's quite interesting.
link |
00:54:08.120
I mean, ultimately it boils down to the physical world
link |
00:54:11.320
being the last frontier of proving.
link |
00:54:13.680
So you said like basically networks of people,
link |
00:54:16.080
humans vouching for humans in the physical world.
link |
00:54:19.400
And somehow the authentication ends there.
link |
00:54:22.960
I mean, if I had to ask you,
link |
00:54:25.520
I mean, you're way too eloquent for a human.
link |
00:54:28.160
So if I had to ask you to authenticate,
link |
00:54:31.240
like prove how do I know you're not a robot
link |
00:54:33.120
and how do you know I'm not a robot?
link |
00:54:34.920
Yeah.
link |
00:54:35.760
I think that, so far, in this space,
link |
00:54:40.520
this conversation we just had,
link |
00:54:42.120
the physical movements we did
link |
00:54:44.000
the biggest gap between us and AI systems
link |
00:54:47.040
is the physical manipulation.
link |
00:54:49.360
So maybe that's the last frontier.
link |
00:54:51.280
Well, here's another question is,
link |
00:54:53.040
why is solving this problem important, right?
link |
00:54:57.320
Like what aspects are really important to us?
link |
00:54:59.080
And I think that probably where we'll end up
link |
00:55:01.200
is we'll hone in on what do we really want
link |
00:55:03.600
out of knowing if we're talking to a human.
link |
00:55:06.400
And I think that again, this comes down to identity.
link |
00:55:09.480
And so I think that the internet of the future,
link |
00:55:11.760
I expect to be one that will have lots of agents out there
link |
00:55:14.840
that will interact with you.
link |
00:55:16.320
But I think that the question of,
link |
00:55:17.880
is this real flesh and blood human
link |
00:55:21.520
or is this an automated system?
link |
00:55:23.800
May actually just be less important.
link |
00:55:25.800
Let's actually go there.
link |
00:55:27.360
So GPT2 is impressive, and let's look at GPT20.
link |
00:55:32.440
Why is it so bad that all my friends are GPT20?
link |
00:55:37.440
Why is it so important on the internet?
link |
00:55:43.320
Do you think to interact with only human beings?
link |
00:55:47.360
Why can't we live in a world where ideas can come
link |
00:55:50.640
from models trained on human data?
link |
00:55:52.960
Yeah, I think this is actually a really interesting question.
link |
00:55:55.720
This comes back to the,
link |
00:55:56.560
how do you even picture a world with some new technology?
link |
00:55:59.560
And I think that one thing that I think is important
link |
00:56:02.080
is, you know, let's say honesty.
link |
00:56:04.760
And I think that if you have, you know, almost in the
link |
00:56:07.520
Turing test style sense of technology,
link |
00:56:11.120
you have AIs that are pretending to be humans
link |
00:56:13.200
and deceiving you, I think that is, you know,
link |
00:56:15.800
that feels like a bad thing, right?
link |
00:56:17.560
I think that it's really important that we feel like
link |
00:56:19.720
we're in control of our environment, right?
link |
00:56:21.280
That we understand who we're interacting with.
link |
00:56:23.400
And if it's an AI or a human,
link |
00:56:25.880
that that's not something that we're being deceived about.
link |
00:56:28.680
But I think that the flip side of,
link |
00:56:30.240
can I have as meaningful of an interaction with an AI
link |
00:56:32.680
as I can with a human?
link |
00:56:34.240
Well, I actually think here you can turn to sci fi.
link |
00:56:36.880
And Her, I think, is a great example of asking this very
link |
00:56:40.040
question, right?
link |
00:56:40.880
And one thing I really love about Her is it really starts
link |
00:56:42.800
out almost by asking how meaningful are human
link |
00:56:45.800
virtual relationships, right?
link |
00:56:47.280
And then you have a human who has a relationship with an AI
link |
00:56:51.200
and that you really start to be drawn into that, right?
link |
00:56:54.320
And that all of your emotional buttons get triggered
link |
00:56:56.960
in the same way as if there was a real human that was on
link |
00:56:59.000
the other side of that phone.
link |
00:57:00.400
And so I think that this is one way of thinking about it,
link |
00:57:03.800
is that I think that we can have meaningful interactions
link |
00:57:07.160
and that if there's a funny joke,
link |
00:57:09.720
in some sense it doesn't really matter if it was written
link |
00:57:11.320
by a human or an AI. But what you don't want, and
link |
00:57:14.600
where I think we should really draw hard lines, is deception.
link |
00:57:17.360
And I think that as long as we're in a world where,
link |
00:57:20.200
you know, why do we build AI systems at all, right?
link |
00:57:22.640
The reason we want to build them is to enhance human lives,
link |
00:57:25.000
to make humans be able to do more things,
link |
00:57:26.680
to have humans feel more fulfilled.
link |
00:57:29.040
And if we can build AI systems that do that,
link |
00:57:32.040
you know, sign me up.
link |
00:57:33.200
So the process of language modeling,
link |
00:57:37.120
how far do you think it will take us?
link |
00:57:38.760
Let's look at the movie Her.
link |
00:57:40.680
Do you think a dialogue, natural language conversation
link |
00:57:45.040
as formulated by the Turing test, for example,
link |
00:57:47.840
do you think that process could be achieved through
link |
00:57:50.760
this kind of unsupervised language modeling?
link |
00:57:53.160
So I think the Turing test in its real form
link |
00:57:56.960
isn't just about language, right?
link |
00:57:58.680
It's really about reasoning too, right?
link |
00:58:00.560
That to really pass the Turing test,
link |
00:58:01.920
I should be able to teach calculus
link |
00:58:03.880
to whoever's on the other side
link |
00:58:05.520
and have it really understand calculus
link |
00:58:07.480
and be able to, you know, go and solve
link |
00:58:09.320
new calculus problems.
link |
00:58:11.280
And so I think that to really solve the Turing test,
link |
00:58:13.960
we need more than what we're seeing with language models.
link |
00:58:16.440
We need some way of plugging in reasoning.
link |
00:58:18.720
Now, how different will that be from what we already do?
link |
00:58:22.400
That's an open question, right?
link |
00:58:23.880
It might be that we need some sequence
link |
00:58:25.480
of totally radical new ideas,
link |
00:58:27.200
or it might be that we just need to kind of shape
link |
00:58:29.560
our existing systems in a slightly different way.
link |
00:58:33.040
But I think that in terms of how far
link |
00:58:34.640
language modeling will go,
link |
00:58:35.920
it's already gone way further
link |
00:58:37.520
than many people would have expected, right?
link |
00:58:39.760
I think that things like,
link |
00:58:40.960
and I think there's a lot of really interesting angles
link |
00:58:42.720
to poke in terms of how much does GPT2
link |
00:58:45.920
understand the physical world?
link |
00:58:47.880
Like, you know, you read a little bit
link |
00:58:49.360
about fire underwater in GPT2.
link |
00:58:52.360
So it's like, okay, maybe it doesn't quite understand
link |
00:58:54.200
what these things are.
link |
00:58:55.680
But at the same time, I think that you also see
link |
00:58:58.560
various things like smoke coming from flame,
link |
00:59:00.640
and you know, a bunch of these things that GPT2,
link |
00:59:02.680
it has no body, it has no physical experience,
link |
00:59:04.880
it's just statically read data.
link |
00:59:07.280
And I think that the answer is like,
link |
00:59:11.680
we don't know yet. And these questions, though,
link |
00:59:14.600
we're starting to be able to actually ask them
link |
00:59:16.240
of physical systems, of real systems that exist,
link |
00:59:18.720
and that's very exciting.
link |
00:59:19.880
Do you think, what's your intuition?
link |
00:59:21.160
Do you think if you just scale language modeling,
link |
00:59:24.040
like significantly scale, that reasoning can emerge
link |
00:59:29.320
from the same exact mechanisms?
link |
00:59:31.320
I think it's unlikely that if we just scale GPT2,
link |
00:59:34.960
that we'll have reasoning in the full fledged way.
link |
00:59:38.600
And I think that there's like,
link |
00:59:39.760
the type signature is a little bit wrong, right?
link |
00:59:41.520
That like, there's something we do with,
link |
00:59:44.560
that we call thinking, right?
link |
00:59:45.800
Where we spend a lot of compute,
link |
00:59:47.640
like a variable amount of compute
link |
00:59:49.160
to get to better answers, right?
link |
00:59:50.680
I think a little bit harder, I get a better answer.
link |
00:59:53.040
And that that kind of type signature
link |
00:59:55.160
isn't quite encoded in a GPT, right?
link |
00:59:58.880
GPT will kind of like, it's spent a long time
link |
01:00:01.880
in its, like, evolutionary history,
link |
01:00:03.640
baking in all this information,
link |
01:00:04.680
getting very, very good at this predictive process.
link |
01:00:07.000
And then at runtime, I just kind of do one forward pass
link |
01:00:10.320
and am able to generate stuff.
link |
01:00:13.240
And so, there might be small tweaks
link |
01:00:15.560
to what we do in order to get the type signature, right?
link |
01:00:18.040
For example, well, it's not really one forward pass, right?
link |
01:00:21.040
You generate symbol by symbol.
link |
01:00:22.640
And so, maybe you generate like a whole sequence of thoughts
link |
01:00:25.560
and you only keep like the last bit or something.
link |
01:00:28.200
But I think that at the very least,
link |
01:00:29.840
I would expect you have to make changes like that.
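As a purely speculative sketch of that "generate a whole sequence of thoughts and only keep the last bit" idea: the toy below samples tokens one at a time and then discards everything before a final answer marker. The vocabulary, the ANSWER: marker, and the random stand-in for the model are all assumptions made for illustration, not how GPT2 works; a real system would run a trained network's forward pass at each step instead of choosing at random.

```python
# Speculative sketch: spend variable compute generating intermediate
# "thoughts" token by token, then keep only the final answer.

import random

VOCAB = ["the", "answer", "is", "42", "hmm", "let's", "see", "ANSWER:", "<eos>"]

def sample_next_token(context):
    # Hypothetical stand-in: a real system would run a forward pass of a
    # trained network over `context` and sample from its output distribution.
    return random.choice(VOCAB)

def generate_with_scratch_thoughts(prompt, max_tokens=50):
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = sample_next_token(tokens)
        tokens.append(tok)
        if tok == "<eos>":
            break
    # Everything before the last ANSWER: marker is scratch thinking;
    # only what follows it is returned as the answer.
    if "ANSWER:" in tokens:
        cut = len(tokens) - 1 - tokens[::-1].index("ANSWER:")
        return tokens[cut + 1:]
    return tokens[len(prompt):]

print(generate_with_scratch_thoughts(["what", "is", "6", "times", "7", "?"]))
```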
link |
01:00:32.160
Yeah, exactly, how we, as you said, think
link |
01:00:35.520
is the process of generating thought by thought
link |
01:00:38.400
in the same kind of way, like you said,
link |
01:00:40.360
keep the last bit, the thing that we converge towards.
link |
01:00:45.000
And I think there's another piece which is interesting,
link |
01:00:47.280
which is this out of distribution generalization, right?
link |
01:00:50.240
That like thinking somehow lets us do that, right?
link |
01:00:52.600
That we have an experience of thing
link |
01:00:54.400
and yet somehow we just kind of keep refining
link |
01:00:56.080
our mental model of it.
link |
01:00:58.040
This is again, something that feels tied to
link |
01:01:01.160
whatever reasoning is.
link |
01:01:03.360
And maybe it's a small tweak to what we do.
link |
01:01:05.720
Maybe it's many ideas and will take as many decades.
link |
01:01:08.080
Yeah, so the assumption there, generalization
link |
01:01:11.920
out of distribution is that it's possible
link |
01:01:14.160
to create new ideas.
link |
01:01:18.160
It's possible that nobody's ever created any new ideas.
link |
01:01:20.840
And then with scaling GPT2 to GPT20,
link |
01:01:25.360
you would essentially generalize to all possible thoughts
link |
01:01:30.520
as humans can have, just to play devil's advocate.
link |
01:01:34.200
Right, I mean, how many new story ideas
link |
01:01:37.280
have we come up with since Shakespeare, right?
link |
01:01:39.120
Yeah, exactly.
link |
01:01:41.600
It's just all different forms of love and drama and so on.
link |
01:01:44.680
Okay.
link |
01:01:45.800
Not sure if you read The Bitter Lesson,
link |
01:01:47.520
a recent blog post by Rich Sutton.
link |
01:01:49.400
Yep, I have.
link |
01:01:50.880
He basically says something that echoes
link |
01:01:53.720
some of the ideas that you've been talking about,
link |
01:01:55.480
which is, he says the biggest lesson
link |
01:01:58.320
that can be read from 70 years of AI research
link |
01:02:00.680
is that general methods that leverage computation
link |
01:02:03.880
are ultimately going to win out.
link |
01:02:07.920
Do you agree with this?
link |
01:02:08.960
So basically OpenAI in general, and the ideas
link |
01:02:13.520
you're exploring about coming up with methods,
link |
01:02:15.880
whether it's GPT2 modeling or whether it's OpenAI Five,
link |
01:02:20.120
playing Dota, where a general method
link |
01:02:23.160
is better than a more fine tuned, expert tuned method.
link |
01:02:29.760
Yeah, so I think that, well, one thing that I think
link |
01:02:32.200
was really interesting about the reaction
link |
01:02:33.800
to that blog post was that a lot of people have read this
link |
01:02:36.480
as saying that compute is all that matters.
link |
01:02:39.440
And that's a very threatening idea, right?
link |
01:02:41.360
And I don't think it's a true idea either, right?
link |
01:02:43.720
It's very clear that we have algorithmic ideas
link |
01:02:45.800
that have been very important for making progress.
link |
01:02:47.920
And to really build AI, you wanna push as far as you can
link |
01:02:50.720
on the computational scale, and you wanna push
link |
01:02:52.760
as far as you can on human ingenuity.
link |
01:02:55.520
And so I think you need both.
link |
01:02:57.040
But I think the way that you phrase the question
link |
01:02:58.320
is actually very good, right?
link |
01:02:59.640
That it's really about what kind of ideas
link |
01:03:02.200
should we be striving for?
link |
01:03:04.040
And absolutely, if you can find a scalable idea,
link |
01:03:07.600
you pour more compute into it,
link |
01:03:08.640
you pour more data into it, it gets better.
link |
01:03:11.400
Like that's the real Holy Grail.
link |
01:03:13.800
And so I think that the answer to the question,
link |
01:03:16.600
I think is yes, that's really how we think about it.
link |
01:03:19.920
And that part of why we're excited about the power
link |
01:03:22.760
of deep learning and the potential for building AGI
link |
01:03:25.320
is because we look at the systems that exist
link |
01:03:27.600
in the most successful AI systems,
link |
01:03:29.720
and we realize that you scale those up,
link |
01:03:32.680
they're gonna work better.
link |
01:03:34.000
And I think that that scalability is something
link |
01:03:36.320
that really gives us hope
link |
01:03:37.160
for being able to build transformative systems.
link |
01:03:39.600
So I'll tell you, this is partially an emotional,
link |
01:03:43.240
you know, a response that people often have,
link |
01:03:45.760
if compute is so important for state of the art performance,
link |
01:03:49.280
you know, individual developers,
link |
01:03:50.760
maybe a 13 year old sitting somewhere in Kansas
link |
01:03:52.960
or something like that, you know, they're sitting,
link |
01:03:55.040
they might not even have a GPU
link |
01:03:56.760
or may have a single GPU, a 1080 or something like that.
link |
01:04:00.080
And there's this feeling like, well,
link |
01:04:02.640
how can I possibly compete or contribute to this world of AI
link |
01:04:07.280
if scale is so important?
link |
01:04:09.840
So if you can comment on that,
link |
01:04:11.920
and in general, do you think we need to also
link |
01:04:14.320
in the future focus on democratizing compute resources
link |
01:04:18.800
more or as much as we democratize the algorithms?
link |
01:04:22.680
Well, so the way that I think about it
link |
01:04:23.960
is that there's this space of possible progress, right?
link |
01:04:28.880
There's a space of ideas and sort of systems
link |
01:04:30.920
that will work, that will move us forward.
link |
01:04:32.960
And there's a portion of that space,
link |
01:04:34.840
and to some extent,
link |
01:04:35.760
an increasingly significant portion of that space
link |
01:04:37.960
that does just require massive compute resources.
link |
01:04:41.080
And for that, I think that the answer is kind of clear
link |
01:04:44.760
and that part of why we have the structure that we do
link |
01:04:47.960
is because we think it's really important
link |
01:04:49.640
to be pushing the scale
link |
01:04:50.600
and to be building these large clusters and systems.
link |
01:04:53.840
But there's another portion of the space
link |
01:04:55.920
that isn't about the large scale compute,
link |
01:04:57.880
that are these ideas that, and again,
link |
01:04:59.960
I think that for the ideas to really be impactful
link |
01:05:02.200
and really shine, that they should be ideas
link |
01:05:04.200
that if you scale them up,
link |
01:05:05.840
would work way better than they do at small scale.
link |
01:05:08.840
But you can discover them without massive
link |
01:05:11.160
computational resources.
link |
01:05:12.760
And if you look at the history of recent developments,
link |
01:05:15.200
you think about things like the GAN or the VAE,
link |
01:05:17.680
that these are ones that I think you could come up with
link |
01:05:20.920
without having, and in practice,
link |
01:05:22.720
people did come up with them without having
link |
01:05:24.520
massive, massive computational resources.
link |
01:05:26.560
Right, I just talked to Ian Goodfellow,
link |
01:05:28.000
but the thing is the initial GAN
link |
01:05:31.600
produced pretty terrible results, right?
link |
01:05:34.200
So only because it was in a very specific,
link |
01:05:36.880
only because they were smart enough to know
link |
01:05:38.640
that it was quite surprising to generate anything
link |
01:05:41.520
that they know.
link |
01:05:43.160
Do you see a world, or is that too optimistic and dreamer,
link |
01:05:46.040
like, to imagine that the compute resources
link |
01:05:49.760
are something that's owned by governments
link |
01:05:52.200
and provided as a utility?
link |
01:05:55.040
Actually, to some extent, this question reminds me
link |
01:05:57.120
of a blog post from one of my former professors
link |
01:06:00.280
at Harvard, this guy, Matt Welsh,
link |
01:06:02.440
who was a systems professor.
link |
01:06:03.760
I remember sitting in his tenure talk, right,
link |
01:06:05.280
and that he had literally just gotten tenure.
link |
01:06:08.800
He went to Google for the summer,
link |
01:06:10.960
and then decided he wasn't going back to academia, right?
link |
01:06:15.680
And kind of in his blog post, he makes this point
link |
01:06:17.760
that, look, as a systems researcher,
link |
01:06:20.800
that I come up with these cool system ideas,
link |
01:06:23.040
right, and kind of build a little proof of concept,
link |
01:06:25.080
and the best thing I could hope for
link |
01:06:27.080
is that the people at Google or Yahoo,
link |
01:06:30.120
which was around at the time,
link |
01:06:32.600
will implement it and actually make it work at scale, right?
link |
01:06:35.400
That's like the dream for me, right?
link |
01:06:36.640
I build the little thing, and they turn it into
link |
01:06:38.000
the big thing that's actually working.
link |
01:06:40.000
And for him, he said, I'm done with that.
link |
01:06:43.360
I want to be the person who's actually doing
link |
01:06:45.320
building and deploying.
link |
01:06:47.200
And I think that there's a similar dichotomy here, right?
link |
01:06:49.560
I think that there are people who really actually
link |
01:06:52.400
find value, and I think it is a valuable thing to do,
link |
01:06:55.240
to be the person who produces those ideas, right,
link |
01:06:57.440
who builds the proof of concept.
link |
01:06:58.840
And yeah, you don't get to generate
link |
01:07:00.600
the coolest possible GAN images,
link |
01:07:02.760
but you invented the GAN, right?
link |
01:07:04.480
And so there's a real trade off there.
link |
01:07:07.560
And I think that that's a very personal choice,
link |
01:07:09.040
but I think there's value in both sides.
link |
01:07:10.840
So do you think, in creating AGI,
link |
01:07:14.600
or some new models, we would see echoes of the brilliance
link |
01:07:20.440
even at the prototype level.
link |
01:07:22.240
So you would be able to develop those ideas
link |
01:07:24.080
without scale, the initial seeds.
link |
01:07:27.240
So take a look at, I always like to look at examples
link |
01:07:30.680
that exist, right, look at real precedent.
link |
01:07:32.680
And so take a look at the June 2018 model
link |
01:07:36.240
that we released, which we scaled up to turn into GPT2.
link |
01:07:39.200
And you can see that at small scale,
link |
01:07:41.280
it set some records, right?
link |
01:07:42.800
This was the original GPT.
link |
01:07:44.800
We actually had some cool generations.
link |
01:07:46.840
They weren't nearly as amazing and really stunning
link |
01:07:49.840
as the GPT2 ones, but it was promising.
link |
01:07:52.000
It was interesting.
link |
01:07:53.040
And so I think it is the case that with a lot
link |
01:07:55.280
of these ideas that you see promise at small scale,
link |
01:07:58.280
but there is an asterisk here, a very big asterisk,
link |
01:08:00.800
which is sometimes we see behaviors that emerge
link |
01:08:05.240
that are qualitatively different
link |
01:08:07.280
from anything we saw at small scale.
link |
01:08:09.080
And that the original inventor of whatever algorithm
link |
01:08:12.600
looks at and says, I didn't think it could do that.
link |
01:08:15.520
This is what we saw in Dota, right?
link |
01:08:17.400
So PPO was created by John Schulman,
link |
01:08:19.320
who's a researcher here.
link |
01:08:20.560
And with Dota, we basically just ran PPO
link |
01:08:24.680
at massive, massive scale.
link |
01:08:26.520
And there's some tweaks in order to make it work,
link |
01:08:29.120
but fundamentally it's PPO at the core.
link |
01:08:31.520
And we were able to get this longterm planning,
link |
01:08:35.280
these behaviors to really play out on a time scale
link |
01:08:38.680
that we just thought was not possible.
link |
01:08:40.760
And John looked at that and was like,
link |
01:08:42.680
I didn't think it could do that.
link |
01:08:44.240
That's what happens when you're at three orders
link |
01:08:45.480
of magnitude more scale than you tested at.
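For reference, the core update Greg is referring to is PPO's clipped surrogate objective (Schulman et al., 2017). The numpy sketch below shows just that loss on toy numbers; it is not OpenAI Five's training code, which wraps this objective in a huge distributed system with many additional engineering pieces.

```python
# Minimal sketch of PPO's clipped surrogate objective on toy numbers.

import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Average clipped surrogate objective (to be maximized)."""
    ratio = np.exp(logp_new - logp_old)          # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))

# Toy numbers: log-probs of the taken actions under the new and old policies,
# plus advantage estimates for those actions.
logp_old = np.log(np.array([0.2, 0.5, 0.1]))
logp_new = np.log(np.array([0.3, 0.4, 0.3]))
advantages = np.array([1.0, -0.5, 2.0])
print("clipped surrogate:", ppo_clip_objective(logp_new, logp_old, advantages))
```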
link |
01:08:48.400
Yeah, but it still has the same flavors of,
link |
01:08:50.600
you know, at least echoes of the expected brilliance.
link |
01:08:56.000
Although I suspect with GPT,
link |
01:08:57.880
as it's scaled more and more, you might get surprising things.
link |
01:09:01.800
So yeah, you're right.
link |
01:09:03.200
It's interesting that it's difficult to see
link |
01:09:06.360
how far an idea will go when it's scaled.
link |
01:09:09.320
It's an open question.
link |
01:09:11.080
Well, so to that point with Dota and PPO,
link |
01:09:13.080
like I mean, here's a very concrete one, right?
link |
01:09:15.040
It's like, it's actually one thing
link |
01:09:16.680
that's very surprising about Dota
link |
01:09:17.720
that I think people don't really pay that much attention to.
link |
01:09:20.400
It's the degree of generalization
link |
01:09:22.360
out of distribution that happens, right?
link |
01:09:24.560
That you have this AI that's trained
link |
01:09:26.320
against other bots for its entirety,
link |
01:09:28.880
the entirety of its existence.
link |
01:09:30.360
Sorry to take a step back.
link |
01:09:31.440
Can you talk through, you know, a story of Dota,
link |
01:09:37.240
a story of leading up to OpenAI Five and that path,
link |
01:09:42.040
and what was the process of self playing
link |
01:09:43.920
and so on of training on this?
link |
01:09:45.440
Yeah, yeah, yeah.
link |
01:09:46.280
So with Dota.
link |
01:09:47.120
What is Dota?
link |
01:09:47.960
Dota is a complex video game
link |
01:09:50.000
and we started training,
link |
01:09:51.320
we started trying to solve Dota
link |
01:09:52.720
because we felt like this was a step towards the real world
link |
01:09:55.680
relative to other games like Chess or Go, right?
link |
01:09:58.040
Those very cerebral games
link |
01:09:59.160
where you just kind of have this board
link |
01:10:00.480
of very discrete moves.
link |
01:10:01.880
Dota starts to be much more continuous time.
link |
01:10:04.040
So you have this huge variety of different actions
link |
01:10:06.200
that you have a 45 minute game
link |
01:10:07.680
with all these different units
link |
01:10:09.360
and it's got a lot of messiness to it
link |
01:10:11.840
that really hasn't been captured by previous games.
link |
01:10:14.480
And famously all of the hard coded bots for Dota
link |
01:10:17.320
were terrible, right?
link |
01:10:18.400
It's just impossible to write anything good for it
link |
01:10:19.920
because it's so complex.
link |
01:10:21.240
And so this seemed like a really good place
link |
01:10:23.280
to push what's the state of the art
link |
01:10:25.240
in reinforcement learning.
link |
01:10:26.800
And so we started by focusing on the one versus one
link |
01:10:29.000
version of the game and we're able to solve that.
link |
01:10:32.360
We're able to beat the world champions
link |
01:10:33.880
and the learning, the skill curve
link |
01:10:37.240
was this crazy exponential, right?
link |
01:10:38.960
It was like constantly we were just scaling up,
link |
01:10:41.000
that we were fixing bugs and that you look
link |
01:10:43.240
at the skill curve and it was really a very, very smooth one.
link |
01:10:46.600
So it's actually really interesting
link |
01:10:47.440
to see how that like human iteration loop
link |
01:10:50.000
yielded very steady exponential progress.
link |
01:10:52.680
And to one side note, first of all,
link |
01:10:55.160
it's an exceptionally popular video game.
link |
01:10:57.080
The side effect is that there's a lot
link |
01:10:59.400
of incredible human experts at that video game.
link |
01:11:01.920
So the benchmark that you're trying to reach is very high.
link |
01:11:05.200
And the other, can you talk about the approach
link |
01:11:07.840
that was used initially and throughout training
link |
01:11:10.600
these agents to play this game?
link |
01:11:12.040
Yep.
link |
01:11:12.880
And so the approach that we used is self play.
link |
01:11:14.400
And so you have two agents that don't know anything.
link |
01:11:17.320
They battle each other,
link |
01:11:18.640
they discover something a little bit good
link |
01:11:20.760
and now they both know it.
link |
01:11:22.000
And they just get better and better and better without bound.
link |
01:11:24.520
And that's a really powerful idea, right?
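A minimal sketch of that self-play loop, with rock-paper-scissors standing in for Dota: two copies of one shared policy play each other and the policy is nudged toward whichever action won, so every improvement is immediately known to both sides. The tabular update and the toy game are stand-in assumptions for illustration; the real system trains a large neural network with PPO rather than a probability table.

```python
# Self-play sketch: two copies of the same policy play each other and the
# shared policy is nudged toward whatever won.

import numpy as np

rng = np.random.default_rng(0)
policy = np.array([0.8, 0.1, 0.1])   # start with a bad, exploitable policy
BEATS = {0: 2, 1: 0, 2: 1}           # rock beats scissors, paper beats rock, ...
lr = 0.01

for step in range(20000):
    a = rng.choice(3, p=policy)      # agent copy A
    b = rng.choice(3, p=policy)      # agent copy B (same policy: self-play)
    if a == b:
        continue
    winner = a if BEATS[a] == b else b
    # Nudge the shared policy toward the winning action.
    policy = (1 - lr) * policy + lr * np.eye(3)[winner]

print("learned policy (drifts toward the uniform Nash):", policy.round(3))
```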
link |
01:11:27.040
That we then went from the one versus one version
link |
01:11:30.160
of the game and scaled up to five versus five, right?
link |
01:11:32.400
So you think about kind of like with basketball
link |
01:11:34.280
where you have this like team sport
link |
01:11:35.440
and you need to do all this coordination
link |
01:11:37.640
and we were able to push the same idea,
link |
01:11:40.920
the same self play to really get to the professional level
link |
01:11:45.920
at the full five versus five version of the game.
link |
01:11:48.880
And the things that I think are really interesting here
link |
01:11:52.400
is that these agents in some ways
link |
01:11:54.760
they're almost like an insect like intelligence, right?
link |
01:11:56.760
Where they have a lot in common with how an insect is trained,
link |
01:11:59.920
right?
link |
01:12:00.760
An insect kind of lives in this environment for a very long time
link |
01:12:02.640
or the ancestors of this insect have been around
link |
01:12:05.280
for a long time and had a lot of experience.
link |
01:12:07.000
I think it's baked into this agent.
link |
01:12:09.680
And it's not really smart in the sense of a human, right?
link |
01:12:12.720
It's not able to go and learn calculus,
link |
01:12:14.560
but it's able to navigate its environment extremely well.
link |
01:12:17.000
And it's able to handle unexpected things
link |
01:12:18.480
in the environment that's never seen before, pretty well.
link |
01:12:22.080
And we see the same sort of thing with our Dota bots, right?
link |
01:12:24.800
That they're able to, within this game,
link |
01:12:26.720
they're able to play against humans,
link |
01:12:28.440
which is something that never existed
link |
01:12:30.000
in its evolutionary environment.
link |
01:12:31.360
Totally different play styles from humans versus the bots.
link |
01:12:34.400
And yet it's able to handle it extremely well.
link |
01:12:37.200
And that's something that I think was very surprising to us
link |
01:12:40.400
was something that doesn't really emerge
link |
01:12:43.440
from what we've seen with PPO at smaller scale, right?
link |
01:12:47.200
And the kind of scale we're running this stuff at
link |
01:12:48.560
was something like 100,000 CPU cores,
link |
01:12:51.920
running with like hundreds of GPUs.
link |
01:12:54.040
It was probably about something like hundreds of years
link |
01:12:59.040
of experience going into this bot every single real day.
link |
01:13:03.800
And so that scale is massive.
link |
01:13:06.200
And we start to see very different kinds of behaviors
link |
01:13:08.400
out of the algorithms that we all know and love.
link |
01:13:10.760
Dota, you mentioned, beat the world expert 1v1.
link |
01:13:15.160
And then you weren't able to win 5v5 this year
link |
01:13:21.160
against the best players in the world.
link |
01:13:24.080
So what's the comeback story?
link |
01:13:26.640
First of all, talk through that.
link |
01:13:27.680
That was an exceptionally exciting event.
link |
01:13:29.480
And what's the following months in this year look like?
link |
01:13:33.160
Yeah, yeah.
link |
01:13:33.760
So one thing that's interesting is that we lose all the time.
link |
01:13:38.640
Because we play here.
link |
01:13:40.040
So the Dota team at OpenAI, we play the bot
link |
01:13:42.840
against players better than our system all the time.
link |
01:13:45.800
Or at least we used to, right?
link |
01:13:47.400
Like the first time we lost publicly was we went up
link |
01:13:50.680
on stage at the international and we played against some
link |
01:13:53.480
of the best teams in the world.
link |
01:13:54.800
And we ended up losing both games.
link |
01:13:56.320
But we gave them a run for their money, right?
link |
01:13:58.520
That both games were kind of 30 minutes, 25 minutes.
link |
01:14:01.440
And they went back and forth, back and forth, back and forth.
link |
01:14:04.200
And so I think that really shows that we're
link |
01:14:06.360
at the professional level.
link |
01:14:08.280
And that kind of looking at those games,
link |
01:14:09.640
we think that the coin could have gone a different direction
link |
01:14:12.280
and we could have had some wins.
link |
01:14:13.560
And so that was actually very encouraging for us.
link |
01:14:16.200
And you know, it's interesting because the international was
link |
01:14:18.360
at a fixed time, right?
link |
01:14:19.720
So we knew exactly what day we were going to be playing.
link |
01:14:22.680
And we pushed as far as we could, as fast as we could.
link |
01:14:25.480
Two weeks later, we had a bot that had an 80% win rate
link |
01:14:28.040
versus the one that played at TI.
link |
01:14:30.120
So the march of progress, you know,
link |
01:14:31.720
that you should think of as a snapshot rather
link |
01:14:33.480
than as an end state.
link |
01:14:34.920
And so in fact, we'll be announcing our finals pretty soon.
link |
01:14:39.000
I actually think that we'll announce our final match
link |
01:14:42.760
prior to this podcast being released.
link |
01:14:45.240
So there should be, we'll be playing against the world
link |
01:14:49.240
champions.
link |
01:14:49.720
And you know, for us, it's really less about,
link |
01:14:52.520
like the way that we think about what's upcoming
link |
01:14:55.400
is the final milestone, the final competitive milestone
link |
01:14:59.000
for the project, right?
link |
01:15:00.280
That our goal in all of this isn't really
link |
01:15:02.760
about beating humans at Dota.
link |
01:15:05.160
Our goal is to push the state of the art
link |
01:15:06.760
in reinforcement learning.
link |
01:15:07.800
And we've done that, right?
link |
01:15:08.920
And we've actually learned a lot from our system
link |
01:15:10.680
and that we have, you know, I think a lot of exciting
link |
01:15:13.320
next steps that we want to take.
link |
01:15:14.680
And so, you know, kind of the final showcase
link |
01:15:16.440
of what we built, we're going to do this match.
link |
01:15:18.760
But for us, it's not really the success or failure
link |
01:15:21.240
to see, you know, do we have the coin flip go
link |
01:15:23.800
in our direction or against.
link |
01:15:25.880
Where do you see the field of deep learning
link |
01:15:28.680
heading in the next few years?
link |
01:15:31.720
Where do you see the work in reinforcement learning
link |
01:15:35.480
perhaps heading and more specifically with OpenAI,
link |
01:15:41.160
all the exciting projects that you're working on,
link |
01:15:44.280
what does 2019 hold for you?
link |
01:15:46.360
Massive scale.
link |
01:15:47.400
Scale.
link |
01:15:47.880
I will put an asterisk on that and just say,
link |
01:15:49.480
you know, I think that it's about ideas plus scale.
link |
01:15:52.200
You need both.
link |
01:15:52.840
So that's a really good point.
link |
01:15:54.920
So the question, in terms of ideas,
link |
01:15:58.520
you have a lot of projects that are exploring
link |
01:16:02.200
different areas of intelligence.
link |
01:16:04.280
And the question is, when you think of scale,
link |
01:16:07.480
do you think about growing the scale
link |
01:16:09.560
of those individual projects,
link |
01:16:10.680
or do you think about adding new projects?
link |
01:16:13.160
And sorry, if you were thinking about adding new projects,
link |
01:16:17.320
or if you look at the past, what's the process
link |
01:16:19.800
of coming up with new projects and new ideas?
link |
01:16:21.960
Yep.
link |
01:16:22.680
So we really have a life cycle of projects here.
link |
01:16:25.240
So we start with a few people just working
link |
01:16:27.320
on a small scale idea.
link |
01:16:28.440
And language is actually a very good example of this,
link |
01:16:30.520
that it was really, you know, one person here
link |
01:16:32.440
who was pushing on language for a long time.
link |
01:16:34.840
I mean, then you get signs of life, right?
link |
01:16:36.680
And so this is like, let's say, you know,
link |
01:16:38.440
with the original GPT, we had something that was interesting.
link |
01:16:42.600
And we said, okay, it's time to scale this, right?
link |
01:16:44.760
It's time to put more people on it,
link |
01:16:45.960
put more computational resources behind it,
link |
01:16:48.120
and then we just kind of keep pushing and keep pushing.
link |
01:16:51.560
And the end state is something that looks like
link |
01:16:52.920
Dota or Robotics, where you have a large team of,
link |
01:16:55.400
you know, 10 or 15 people that are running things
link |
01:16:57.800
at very large scale, and that you're able to really have
link |
01:17:00.680
material engineering and, you know,
link |
01:17:04.280
sort of machine learning science coming together
link |
01:17:06.520
to make systems that work and get material results
link |
01:17:10.200
that just would have been impossible otherwise.
link |
01:17:12.200
So we do that whole life cycle.
link |
01:17:13.560
We've done it a number of times, you know, typically end to end.
link |
01:17:16.600
It's probably two years or so to do it.
link |
01:17:19.960
You know, the organization's been around for three years,
link |
01:17:21.720
so maybe we'll find that we also have
link |
01:17:23.000
longer life cycle projects.
link |
01:17:24.760
But, you know, we work up to those.
link |
01:17:27.480
So one team that we're actually just starting,
link |
01:17:30.280
Ilya and I are kicking off a new team
link |
01:17:32.200
called the Reasoning Team, and this is to really try to tackle
link |
01:17:35.080
how do you get neural networks to reason?
link |
01:17:37.400
And we think that this will be a long term project.
link |
01:17:41.400
It's one that we're very excited about.
link |
01:17:42.840
In terms of reasoning, super exciting topic,
link |
01:17:47.400
what kind of benchmarks, what kind of tests of reasoning
link |
01:17:52.200
do you envision?
link |
01:17:53.800
What would it take, if you sat back
link |
01:17:55.880
with whatever drink, and you would be impressed
link |
01:17:59.240
that this system is able to do something,
link |
01:18:01.640
what would that look like?
link |
01:18:02.760
Theorem proving.
link |
01:18:03.800
Theorem proving.
link |
01:18:04.840
So some kind of logic, and especially mathematical logic.
link |
01:18:09.480
I think so, right?
link |
01:18:10.440
And I think that there's kind of other problems
link |
01:18:12.440
that are dual to theorem proving in particular.
link |
01:18:14.520
You know, you think about programming,
link |
01:18:16.840
you think about even like security analysis of code,
link |
01:18:19.960
that these all kind of capture the same sorts of core reasoning
link |
01:18:24.200
and being able to do some out of distribution generalization.
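For a sense of what a theorem-proving target looks like, here is a tiny formal statement in Lean; it is only an illustration of the task format, not an actual OpenAI benchmark. The system's job would be to produce the proof on its own:

```lean
-- A toy goal of the kind a neural theorem prover would be asked to close.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```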
link |
01:18:28.440
It would be quite exciting if OpenAI Reasoning Team
link |
01:18:31.880
was able to prove that P equals NP.
link |
01:18:33.880
That would be very nice.
link |
01:18:35.080
It would be very, very exciting, especially
link |
01:18:37.720
if it turns out that P equals NP,
link |
01:18:39.080
that'll be interesting too.
link |
01:18:40.120
It would be ironic and humorous.
link |
01:18:45.160
So what problem stands out to you as the most exciting
link |
01:18:51.800
and challenging and impactful to work on for us as a community
link |
01:18:55.720
in general and for OpenAI this year?
link |
01:18:58.440
You mentioned reasoning.
link |
01:18:59.480
I think that's a heck of a problem.
link |
01:19:01.320
Yeah.
link |
01:19:01.480
So I think reasoning is an important one.
link |
01:19:02.760
I think it's going to be hard to get good results in 2019.
link |
01:19:05.480
You know, again, just like we think about the lifecycle,
link |
01:19:07.480
it takes time.
link |
01:19:08.600
I think for 2019, language modeling seems to be kind of
link |
01:19:11.320
on that ramp, right?
link |
01:19:12.520
It's at the point that we have a technique that works.
link |
01:19:14.760
We want to scale 100x, 1000x, see what happens.
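The "technique that works" is next-token prediction trained with cross-entropy; scaling it 100x or 1000x means the same objective with a far bigger model, dataset, and compute budget. Below is a self-contained toy version of that objective, using a smoothed bigram count model rather than a neural network, purely for illustration:

```python
import numpy as np

text = "the cat sat on the mat the cat ate"
tokens = text.split()
vocab = sorted(set(tokens))
idx = {w: i for i, w in enumerate(vocab)}

# Count bigrams with add-one smoothing, then normalize into next-token probabilities.
counts = np.ones((len(vocab), len(vocab)))
for prev, nxt in zip(tokens, tokens[1:]):
    counts[idx[prev], idx[nxt]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

# The language-modeling loss: average negative log-likelihood of each next token.
nll = -np.mean([np.log(probs[idx[p], idx[n]]) for p, n in zip(tokens, tokens[1:])])
print(f"per-token cross-entropy: {nll:.3f} nats")
```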
link |
01:19:18.040
Awesome.
link |
01:19:18.360
Do you think we're living in a simulation?
link |
01:19:21.800
I think it's hard to have a real opinion about it.
link |
01:19:25.560
It's actually interesting.
link |
01:19:26.200
I separate out things that I think can yield
link |
01:19:29.960
materially different predictions about the world
link |
01:19:32.520
from ones that are just kind of fun to speculate about.
link |
01:19:35.640
And I kind of view simulation as more like,
link |
01:19:37.800
is there a flying teapot between Mars and Jupiter?
link |
01:19:40.200
Like, maybe, but it's a little bit hard to know
link |
01:19:43.800
what that would mean for my life.
link |
01:19:45.000
So there is something actionable.
link |
01:19:46.360
So some of the best work OpenAI has done
link |
01:19:50.680
is in the field of reinforcement learning.
link |
01:19:52.760
And some of the success of reinforcement learning
link |
01:19:56.520
come from being able to simulate the problem you're trying
link |
01:19:59.080
to solve.
link |
01:20:00.040
So do you have a hope for reinforcement,
link |
01:20:03.560
for the future of reinforcement learning
link |
01:20:05.160
and for the future of simulation?
link |
01:20:06.920
Like, whether we're talking about autonomous vehicles
link |
01:20:09.000
or any kind of system, do you see that scaling?
link |
01:20:12.760
So we'll be able to simulate systems and, hence,
link |
01:20:16.280
be able to create a simulator that echoes our real world
link |
01:20:19.400
proving once and for all, even though you're denying it,
link |
01:20:22.520
that we're living in a simulation.
link |
01:20:24.840
I feel like there are a few questions in there, right?
link |
01:20:26.360
So, you know, kind of at the core there of, like,
link |
01:20:28.200
can we use simulation for self driving cars?
link |
01:20:31.080
Take a look at our robotic system, Dactyl, right?
link |
01:20:33.720
That was trained in simulation using the Dota system, in fact.
link |
01:20:37.720
And it transfers to a physical robot.
link |
01:20:40.280
And I think everyone looks at our Dota system,
link |
01:20:42.120
they're like, okay, it's just a game.
link |
01:20:43.400
How are you ever going to escape to the real world?
link |
01:20:45.080
And the answer is, well, we did it with the physical robot,
link |
01:20:47.320
that no one could program.
link |
01:20:48.600
And so I think the answer is simulation goes a lot further
link |
01:20:50.840
than you think if you apply the right techniques to it.
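One of the "right techniques" OpenAI has described for Dactyl is domain randomization: every training episode runs in a simulator whose physics parameters are re-sampled, so the policy cannot overfit to any single simulated world. The sketch below is a self-contained toy illustration with a made-up environment and a random stand-in policy, not OpenAI's code; it only shows the episode-level randomization structure.

```python
import random

def sample_physics():
    """Each episode draws its own physics parameters."""
    return {"friction": random.uniform(0.5, 1.5),
            "motor_gain": random.uniform(0.8, 1.2)}

class ToyPushEnv:
    """Toy task: push a block toward position 1.0 under randomized dynamics."""
    def __init__(self, friction, motor_gain):
        self.friction, self.motor_gain = friction, motor_gain
        self.pos, self.t = 0.0, 0

    def step(self, force):
        self.pos += self.motor_gain * force - self.friction * 0.01
        self.t += 1
        reward = -abs(1.0 - self.pos)
        return self.pos, reward, self.t >= 50

returns = []
for _ in range(1000):
    env = ToyPushEnv(**sample_physics())        # a freshly randomized world
    total, done = 0.0, False
    while not done:
        action = random.uniform(0.0, 0.05)      # stand-in for a learned policy
        _, reward, done = env.step(action)
        total += reward
    returns.append(total)

print(f"average return across randomized worlds: {sum(returns) / len(returns):.2f}")
```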
link |
01:20:54.040
Now, there's a question of, you know,
link |
01:20:55.400
are the beings in that simulation going to wake up
link |
01:20:57.400
and have consciousness?
link |
01:20:59.480
I think that one seems a lot harder to, again, reason about.
link |
01:21:02.840
I think that, you know, you really should think about, like,
link |
01:21:05.240
where exactly does human consciousness come from
link |
01:21:07.800
in our own self awareness?
link |
01:21:09.000
And, you know, is it just that, like,
link |
01:21:10.600
once you have, like, a complicated enough neural net,
link |
01:21:12.280
do you have to worry about the agents feeling pain?
link |
01:21:15.720
And, you know, I think there's, like,
link |
01:21:17.560
interesting speculation to do there.
link |
01:21:19.320
But, you know, again, I think it's a little bit hard to know for sure.
link |
01:21:22.920
Well, let me just keep with the speculation.
link |
01:21:24.840
Do you think to create intelligence, general intelligence,
link |
01:21:28.600
you need one consciousness and two a body?
link |
01:21:33.000
Do you think any of those elements are needed,
link |
01:21:34.920
or is intelligence something that's orthogonal to those?
link |
01:21:38.360
I'll stick to the kind of, like, the non-grand answer first,
link |
01:21:41.560
right?
link |
01:21:41.720
So the non-grand answer is just to look at,
link |
01:21:43.960
you know, what are we already making work?
link |
01:21:45.560
You look at GPT2, a lot of people would have said
link |
01:21:47.640
that to even get these kinds of results,
link |
01:21:49.320
you need real world experience.
link |
01:21:50.920
You need a body, you need grounding.
link |
01:21:52.440
How are you supposed to reason about any of these things?
link |
01:21:54.920
How are you supposed to, like, even kind of know
link |
01:21:56.360
about smoke and fire and those things
link |
01:21:57.960
if you've never experienced them?
link |
01:21:59.560
And GPT2 shows that you can actually go way further
link |
01:22:03.000
than that kind of reasoning would predict.
link |
01:22:05.640
So I think that in terms of, do we need consciousness?
link |
01:22:09.240
Do we need a body?
link |
01:22:10.360
It seems the answer is probably not, right?
link |
01:22:11.880
That we could probably just continue to push
link |
01:22:13.640
kind of the systems we have.
link |
01:22:14.680
They already feel general.
link |
01:22:16.520
They're not as competent or as general
link |
01:22:19.080
or able to learn as quickly as an AGI would,
link |
01:22:21.640
but, you know, they're at least like kind of proto AGI
link |
01:22:24.680
in some way, and they don't need any of those things.
link |
01:22:28.040
Now, let's move to the grand answer, which is, you know,
link |
01:22:31.640
if our neural nets are
link |
01:22:34.840
conscious already, would we ever know?
link |
01:22:37.240
How can we tell, right?
link |
01:22:38.680
And, you know, here's where the speculation starts
link |
01:22:40.920
to become, you know, at least interesting or fun
link |
01:22:44.760
and maybe a little bit disturbing,
link |
01:22:46.200
depending on where you take it.
link |
01:22:47.880
But it certainly seems that when we think about animals,
link |
01:22:51.080
that there's some continuum of consciousness.
link |
01:22:53.080
You know, my cat, I think, is conscious in some way, right?
link |
01:22:56.040
You know, not as conscious as a human.
link |
01:22:58.040
And you could imagine that you could build
link |
01:22:59.880
a little consciousness meter, right?
link |
01:23:01.000
You point at a cat, it gives you a little reading,
link |
01:23:02.840
you point at a human, it gives you much bigger reading.
link |
01:23:06.200
What would happen if you pointed one of those
link |
01:23:07.960
at a Dota neural net?
link |
01:23:09.800
And if you're training this massive simulation,
link |
01:23:11.960
do the neural nets feel pain?
link |
01:23:14.600
You know, it becomes pretty hard to know
link |
01:23:16.760
that the answer is no, and it becomes pretty hard
link |
01:23:20.040
to really think about what that would mean
link |
01:23:22.360
if the answer were yes.
link |
01:23:25.160
And it's very possible, you know, for example,
link |
01:23:27.400
you could imagine that maybe the reason
link |
01:23:29.400
that humans have consciousness
link |
01:23:31.400
is because it's a convenient computational shortcut, right?
link |
01:23:35.000
If you think about it, if you have a being
link |
01:23:36.920
that wants to avoid pain, which seems pretty important
link |
01:23:39.320
to survive in this environment
link |
01:23:41.000
and wants to, like, you know, eat food,
link |
01:23:43.640
then maybe the best way of doing it
link |
01:23:45.400
is to have a being that's conscious, right?
link |
01:23:47.080
That, you know, in order to succeed in the environment,
link |
01:23:49.480
you need to have those properties
link |
01:23:51.080
and how are you supposed to implement them?
link |
01:23:52.600
And maybe this consciousness is a way of doing that.
link |
01:23:55.240
If that's true, then actually maybe we should expect
link |
01:23:57.720
that really competent reinforcement learning agents
link |
01:23:59.880
will also have consciousness.
link |
01:24:01.960
But, you know, that's a big if.
link |
01:24:03.240
And I think there are a lot of other arguments
link |
01:24:04.760
that you can make in other directions.
link |
01:24:06.680
I think that's a really interesting idea
link |
01:24:08.360
that even GPT2 has some degree of consciousness.
link |
01:24:11.400
That's something that's actually not as crazy
link |
01:24:14.200
to think about.
link |
01:24:14.760
It's useful to think about as we think about
link |
01:24:17.720
what it means to create intelligence of a dog,
link |
01:24:19.800
intelligence of a cat, and the intelligence of a human.
link |
01:24:24.360
So, last question, do you think we will ever fall in love,
link |
01:24:30.760
like in the movie, Her, with an artificial intelligence system
link |
01:24:34.360
or an artificial intelligence system
link |
01:24:36.200
falling in love with a human?
link |
01:24:38.440
I hope so.
link |
01:24:40.120
I don't think there's any better way to end it than on love.
link |
01:24:43.640
So, Greg, thanks so much for talking today.
link |
01:24:45.560
Thank you for having me.