
Leslie Kaelbling: Reinforcement Learning, Planning, and Robotics | Lex Fridman Podcast #15



link |
00:00:00.000
The following is a conversation with Leslie Kaelbling. She is a roboticist and professor at
link |
00:00:05.360
MIT. She is recognized for her work in reinforcement learning, planning, robot navigation, and several
link |
00:00:12.080
other topics in AI. She won the IJCAI Computers and Thought Award and was the editor in chief
link |
00:00:18.560
of the prestigious Journal of Machine Learning Research. This conversation is part of the
link |
00:00:24.320
Artificial Intelligence podcast at MIT and beyond. If you enjoy it, subscribe on YouTube,
link |
00:00:30.400
iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F R I D M A N.
link |
00:00:36.960
And now, here's my conversation with Leslie Kaelbling.
link |
00:00:42.800
What made me get excited about AI, I can say that, is that I read Gödel, Escher, Bach when I was
link |
00:00:47.680
in high school. That was pretty formative for me because it exposed the interestingness of
link |
00:00:57.200
primitives and combination and how you can make complex things out of simple parts
link |
00:01:02.320
and ideas of AI and what kinds of programs might generate intelligent behavior. So...
link |
00:01:07.760
So you first fell in love with the AI reasoning and logic side, versus robots?
link |
00:01:12.720
Yeah, the robots came because my first job, so I finished an undergraduate degree in philosophy
link |
00:01:18.160
at Stanford and was about to finish a master's in computer science. And I got hired at SRI
link |
00:01:25.360
in their AI lab and they were building a robot. It was a kind of a follow on to Shakey,
link |
00:01:30.960
but all the Shakey people were not there anymore. And so my job was to try to get this robot to
link |
00:01:35.840
do stuff. And that's really kind of what got me interested in robots.
link |
00:01:39.280
So maybe taking a small step back to your bachelor's at Stanford in philosophy,
link |
00:01:44.400
you did a master's and PhD in computer science, but the bachelor's was in philosophy. So what was that
link |
00:01:49.440
journey like? What elements of philosophy do you think you bring to your work in computer science?
link |
00:01:55.200
So it's surprisingly relevant. So part of the reason that I didn't do a computer
link |
00:01:59.840
science undergraduate degree was that there wasn't one at Stanford at the time,
link |
00:02:03.440
but there's a part of philosophy, and in fact, Stanford has a special submajor in
link |
00:02:07.360
something now called symbolic systems, which is logic, model theory, formal semantics of
link |
00:02:13.280
natural language. And so that's actually a perfect preparation for work in AI and computer science.
link |
00:02:20.080
That's kind of interesting. So if you were interested in artificial intelligence,
link |
00:02:26.000
what kind of majors were people even thinking about taking? Was it neuroscience?
link |
00:02:31.840
So besides philosophy, what were you supposed to do if you were fascinated by the idea of creating
link |
00:02:37.120
intelligence? There weren't enough people who did that for that even to be a conversation.
link |
00:02:41.920
I mean, I think probably, probably philosophy. I mean, it's interesting in my class,
link |
00:02:48.320
my graduating class of undergraduate philosophers, probably maybe slightly less than half went on in
link |
00:02:56.800
computer science, slightly less than half went on in law and like one or two went on in philosophy.
link |
00:03:02.320
So it was a common kind of connection. Do you think AI researchers have a role
link |
00:03:06.960
to be part time philosophers or should they stick to the solid science and engineering
link |
00:03:11.440
without sort of taking the philosophizing tangents? I mean, you work with robots,
link |
00:03:16.240
you think about what it takes to create intelligent beings. Aren't you the perfect
link |
00:03:21.200
person to think about the big picture philosophy of it all? The parts of philosophy that are closest
link |
00:03:25.920
to AI, I think, or at least the closest to AI that I think about are stuff like
link |
00:03:29.680
belief and knowledge and denotation and that kind of stuff. And that's, you know,
link |
00:03:35.040
it's quite formal. And it's like just one step away from the kinds of computer science work that
link |
00:03:40.800
we do kind of routinely. I think that there are important questions still about what you can do
link |
00:03:50.160
with a machine and what you can't and so on. Although at least my personal view is that I'm
link |
00:03:54.560
completely a materialist. And I don't think that there's any reason why we can't make a robot be
link |
00:04:00.960
behaviorally indistinguishable from a human. And the question of whether it's
link |
00:04:06.560
distinguishable internally, whether it's a zombie or not in philosophy terms, I actually don't,
link |
00:04:12.720
I don't know. And I don't know if I care too much about that.
link |
00:04:15.280
Right. But there are philosophical notions. They're mathematical and philosophical because
link |
00:04:20.400
we don't know so much of how difficult it is. How difficult is the perception problem?
link |
00:04:25.600
How difficult is the planning problem? How difficult is it to operate in this world successfully?
link |
00:04:30.800
Because our robots are not currently as successful as human beings in many tasks.
link |
00:04:35.920
The question about the gap between current robots and human beings borders a little bit
link |
00:04:42.480
on philosophy. You know, the expanse of knowledge that's required to operate in a human world,
link |
00:04:50.160
required to operate in this world and the ability to form common sense knowledge, the ability to
link |
00:04:55.520
reason about uncertainty. Much of the work you've been doing, there's open questions there that,
link |
00:05:02.880
I don't know, require taking a certain big picture view.
link |
00:05:07.840
To me, that doesn't seem like a philosophical gap at all. To me, there is a big technical gap.
link |
00:05:12.960
There's a huge technical gap, but I don't see any reason why it's more than a technical gap.
link |
00:05:19.360
Perfect. So, when you mentioned AI, you mentioned SRI, and maybe can you describe to me when you
link |
00:05:28.400
first fell in love with robotics, with robots, or were inspired by them. So you mentioned Flakey, or Shakey and Flakey,
link |
00:05:38.400
and what was the robot that first captured your imagination of what's possible?
link |
00:05:42.800
Right. Well, so the first robot I worked with was Flakey. Shakey was a robot that the SRI
link |
00:05:47.120
people had built, but by the time, I think when I arrived, it was sitting in a corner of somebody's
link |
00:05:52.960
office dripping hydraulic fluid into a pan, but it's iconic and really everybody should read the
link |
00:06:00.240
Shakey tech report because it has so many good ideas in it. I mean, they invented A* search
link |
00:06:06.560
and symbolic planning and learning macro operators. They had low level kind of
link |
00:06:14.240
configuration space planning for their robot. They had vision. That's the basic ideas of
link |
00:06:19.360
a ton of things. Can you take a step back? Did Shakey have arms? What was the job? Shakey was a mobile
link |
00:06:26.240
robot, but it could push objects, and so it would move things around. With which actuator? With
link |
00:06:32.080
itself, with its base. Okay, great. And they had painted the baseboards black, so it used vision
link |
00:06:43.600
to localize itself in a map. It detected objects. It could detect objects that were surprising to
link |
00:06:49.360
it. It would plan and replan based on what it saw. It reasoned about whether to look and take
link |
00:06:55.520
pictures. I mean, it really had the basics of so many of the things that we think about now.
link |
00:07:03.280
How did it represent the space around it? So it had representations at a bunch of different levels
link |
00:07:08.960
of abstraction. So it had, I think, a kind of an occupancy grid of some sort at the lowest level.
link |
00:07:14.880
At the high level, it was abstract symbolic kind of rooms and connectivity. So where does Flakey
link |
00:07:21.440
come in? Yeah, okay. So I showed up at SRI and we were building a brand new robot. As I said,
link |
00:07:28.240
none of the people from the previous project were kind of there or involved anymore. So we were kind
link |
00:07:33.200
of starting from scratch and my advisor was Stan Rosenschein. He ended up being my thesis advisor
link |
00:07:40.880
and he was motivated by this idea of situated computation or situated automata. And the idea was
link |
00:07:49.600
that the tools of logical reasoning were important, but possibly only for the engineers
link |
00:07:58.480
or designers to use in the analysis of a system, but not necessarily to be manipulated in the head
link |
00:08:04.560
of the system itself. So I might use logic to prove a theorem about the behavior of my robot,
link |
00:08:10.480
even if the robot's not using logic in its head to prove theorems. So that was kind of the
link |
00:08:14.400
distinction. And so the idea was to kind of use those principles to make a robot do stuff. But
link |
00:08:23.440
a lot of the basic things we had to kind of learn for ourselves because I had zero background in
link |
00:08:29.600
robotics. I didn't know anything about control. I didn't know anything about sensors. So we
link |
00:08:33.600
reinvented a lot of wheels on the way to getting that robot to do stuff. Do you think that was
link |
00:08:37.280
an advantage or a hindrance? Oh no, I mean, I'm big in favor of wheel reinvention actually. I mean,
link |
00:08:44.800
I think you learn a lot by doing it. It's important though to eventually have the pointers
link |
00:08:49.680
to, so that you can see what's really going on. But I think you can appreciate much better the
link |
00:08:56.640
good solutions once you've messed around a little bit on your own and found a bad one.
link |
00:09:00.400
Yeah. I think you mentioned reinventing reinforcement learning and referring to
link |
00:09:04.880
rewards as pleasures. Pleasure, yeah. Which I think is a nice name for it.
link |
00:09:11.440
Yeah. It's more fun almost. Do you think you could tell the history of AI machine learning
link |
00:09:18.960
reinforcement learning and how you think about it from the fifties to now?
link |
00:09:23.600
One thing is that it oscillates, right? So things become fashionable and then they go out
link |
00:09:29.360
and then something else becomes cool and that goes out and so on. And I think there's, so there's
link |
00:09:33.680
some interesting sociological process that actually drives a lot of what's going on.
link |
00:09:38.880
The early days were kind of cybernetics and control, right? And the idea of homeostasis,
link |
00:09:46.240
right? People have made these robots that could, I don't know, try to plug into the wall when they
link |
00:09:51.680
needed power and then come loose and roll around and do stuff. And then I think over time, the
link |
00:09:59.200
thought, well, that was inspiring, but people said, no, no, no, we want to get maybe closer to what
link |
00:10:03.200
feels like real intelligence or human intelligence. And then maybe the expert systems people tried
link |
00:10:10.160
to do that, but maybe a little too superficially, right? So, oh, we get the surface understanding of
link |
00:10:20.240
what intelligence is like, because I understand how a steel mill works and I can try to explain
link |
00:10:24.960
it to you and you can write it down in logic and then we can make a computer do that.
link |
00:10:29.200
And then that didn't work out. But what's interesting, I think, is when a thing starts to not
link |
00:10:36.400
be working very well, it's not only that we change methods, we change problems, right? So it's not
link |
00:10:43.200
like we have better ways of doing the problem the expert systems people were trying to do. We
link |
00:10:47.040
have no ways of trying to do that problem. Oh, yeah, no, I think maybe a few, but we kind of
link |
00:10:54.880
give up on that problem and we switched to a different problem and we worked that for a while
link |
00:11:00.720
and we make progress. As a broad community. As a community, yeah. And there's a lot of people who
link |
00:11:04.320
would argue, you don't give up on the problem, it's just you decrease the number of people working
link |
00:11:09.520
on it. You almost kind of like put it on the shelf, say, we'll come back to this 20 years later.
link |
00:11:13.920
Yeah, I think that's right. Or you might decide that it's malformed. Like you might say,
link |
00:11:21.600
it's wrong to just try to make something that does superficial symbolic reasoning
link |
00:11:26.160
behave like a doctor. You can't do that until you've had the sensory motor experience of being
link |
00:11:33.200
a doctor or something. So there's arguments that say that that problem was not well formed. Or it
link |
00:11:38.320
could be that it is well formed, but we just weren't approaching it well. So you mentioned
link |
00:11:43.760
that your favorite part of logic and symbolic systems is that they give short names for large
link |
00:11:48.800
sets. So there is some use to this, the use of symbolic reasoning. So looking at expert systems
link |
00:11:56.960
and symbolic computing, what do you think are the roadblocks that were hit in the 80s and 90s?
link |
00:12:01.680
Ah, okay. So right. So the fact that I'm not a fan of expert systems doesn't mean that I'm not a
link |
00:12:07.920
fan of some kinds of symbolic reasoning, right? So let's see, roadblocks. Well, the main road
link |
00:12:15.520
block, I think, was the idea that humans could articulate their knowledge effectively
link |
00:12:22.080
into some kind of logical statements.
link |
00:12:26.240
So it's not just the cost, the effort, but really just the capability of doing it.
link |
00:12:31.280
Right. Because we're all experts in vision, right? But totally don't have introspective
link |
00:12:36.720
access into how we do that. Right. And it's true that, I mean, I think the idea was, well,
link |
00:12:44.960
of course, even people then would know, of course, I wouldn't ask you to please write
link |
00:12:48.240
down the rules that you use for recognizing a water bottle. That's crazy. And everyone
link |
00:12:52.800
understood that. But we might ask you to please write down the rules you use for deciding,
link |
00:12:58.640
I don't know, what tie to put on or how to set up a microphone or something like that.
link |
00:13:04.640
But even those things, I think people maybe, I think what they found, I'm not sure about
link |
00:13:10.880
this, but I think what they found was that the so called experts could give explanations
link |
00:13:16.000
that sort of post hoc explanations for how and why they did things, but they weren't
link |
00:13:20.080
necessarily very good. And then they depended on maybe some kinds of perceptual things,
link |
00:13:28.960
which again, they couldn't really define very well. So I think fundamentally, I think the
link |
00:13:35.840
underlying problem with that was the assumption that people could articulate how and why they
link |
00:13:40.960
make their decisions. Right. So it's almost encoding the knowledge
link |
00:13:45.680
from the expert, converting it to something that a machine could understand and reason with.
link |
00:13:51.440
No, no, no, no, not even just encoding, but getting it out of you.
link |
00:13:56.320
Right. Not, not, not writing it. I mean, yes, hard also to write it down for the computer,
link |
00:14:02.240
but I don't think that people can produce it. You can tell me a story about why you do stuff,
link |
00:14:08.320
but I'm not so sure that's the why. Great. So there are still on the
link |
00:14:14.400
hierarchical planning side, places where symbolic reasoning is very useful. So as you've talked
link |
00:14:24.560
about, so where's the gap? Yeah. Okay, good. So saying that humans can't provide a description
link |
00:14:34.960
of their reasoning processes. That's okay. Fine. But that doesn't mean that it's not good to do
link |
00:14:41.040
reasoning of various styles inside a computer. Those are just two orthogonal points. So then
link |
00:14:47.120
the question is what kind of reasoning should you do inside a computer? Right.
link |
00:14:52.240
And the answer is, I think you need to do all different kinds of reasoning inside a computer,
link |
00:14:56.240
depending on what kinds of problems you face. I guess the question is what kind of things can you
link |
00:15:02.400
encode symbolically so you can reason about? I think the idea about, and even symbolic,
link |
00:15:12.480
I don't even like that terminology because I don't know what it means technically and formally.
link |
00:15:17.760
I do believe in abstractions. So abstractions are critical, right? You cannot reason at completely
link |
00:15:24.160
fine grain about everything in your life, right? You can't make a plan at the level of images and
link |
00:15:29.520
torques for getting a PhD. So you have to reduce the size of the state space and you have to reduce
link |
00:15:36.960
the horizon if you're going to reason about getting a PhD or even buying the ingredients to
link |
00:15:42.160
make dinner. And so how can you reduce the spaces and the horizon of the reasoning you have to do?
link |
00:15:49.280
And the answer is abstraction, spatial abstraction, temporal abstraction. I think abstraction along
link |
00:15:53.760
the lines of goals is also interesting, like you might, well, abstraction and decomposition. Goals
link |
00:16:00.480
is maybe more of a decomposition thing. So I think that's where these kinds of, if you want to call
link |
00:16:05.600
it symbolic or discrete models come in. You talk about a room of your house instead of your pose.
link |
00:16:12.800
You talk about doing something during the afternoon instead of at 2:54. And you do that because it
link |
00:16:20.800
makes your reasoning problem easier. And also because
link |
00:16:27.120
you have, you don't have enough information to reason in high fidelity about your pose of your
link |
00:16:34.800
elbow at 2:35 this afternoon anyway. Right. When you're trying to get a PhD.
link |
00:16:39.520
Or when you're doing anything really. Yeah. Okay.
link |
00:16:42.720
Except for at that moment, at that moment, you do have to reason about the pose of your elbow,
link |
00:16:46.080
maybe, but then you, maybe you do that in some continuous joint space kind of model.
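As a rough sketch of the kind of spatial and temporal abstraction she is describing, here is a toy example (my own, not from the conversation; the room names, the pose-to-room mapping, and the connectivity graph are all hypothetical) of planning over abstract rooms rather than continuous poses:

```python
# Hypothetical toy example: plan over abstract states ("rooms") instead of
# fine-grained ones (continuous poses). All names and numbers are illustrative.
from collections import deque

def abstract_state(pose):
    """Many concrete poses collapse into one abstract room."""
    x, y = pose
    return "kitchen" if x < 5.0 else "hallway"

# Abstract transition model: which rooms connect to which.
ROOM_GRAPH = {
    "kitchen": ["hallway"],
    "hallway": ["kitchen", "office"],
    "office": ["hallway"],
}

def plan_rooms(start_room, goal_room):
    """Breadth-first search over the small room-level state space."""
    frontier = deque([[start_room]])
    visited = {start_room}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal_room:
            return path
        for nxt in ROOM_GRAPH[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

# The abstract plan is short; only while executing each step would you drop
# down to a continuous joint-space or pose-level controller.
print(plan_rooms(abstract_state((2.0, 3.0)), "office"))
# -> ['kitchen', 'hallway', 'office']
```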
link |
00:16:50.000
And so again, my biggest point about all of this is that the dogma is not
link |
00:16:58.640
the thing, right? It shouldn't be that I'm in favor of or against symbolic reasoning
link |
00:17:02.880
and you're in favor of or against neural networks. It should just be that computer science
link |
00:17:08.800
tells us what the right answer to all these questions is. If we were smart enough to figure
link |
00:17:12.560
it out. Well, yeah. When you try to actually solve the problem with computers, the right answer comes
link |
00:17:17.200
out. But you mentioned abstractions. I mean, neural networks form abstractions or rather
link |
00:17:24.480
there's, there's automated ways to form abstractions and there's expert driven ways to
link |
00:17:28.960
form abstractions and expert human driven ways. And humans just seem to be way better at forming
link |
00:17:35.280
abstractions currently, on certain problems. So when you're referring to 2:45 PM versus afternoon,
link |
00:17:44.000
how do we construct that taxonomy? Is there any room for automated construction of such
link |
00:17:50.080
abstractions? Oh, I think eventually, yeah. I mean, I think when we get to be better
link |
00:17:55.280
machine learning engineers, we'll build algorithms that build awesome abstractions.
link |
00:18:01.120
That are useful in this kind of way that you're describing. Yeah. So let's then step from
link |
00:18:05.760
the, the abstraction discussion and let's talk about POMDPs, partially observable
link |
00:18:14.800
Markov decision processes. So uncertainty. So first, what are Markov decision processes?
link |
00:18:20.080
What are Markov decision processes? And maybe how much of our world can be modeled as MDPs? How
link |
00:18:26.320
much, when you wake up in the morning and you're making breakfast, how do you, do you think of
link |
00:18:30.160
yourself as an MDP? So how do you think about MDPs and how they relate to our world? Well, so
link |
00:18:36.640
there's a stance question, right? So a stance is a position that I take with respect to a problem.
link |
00:18:42.240
So I, as a researcher or a person who designs systems, can decide to make a model of the world
link |
00:18:50.720
around me in some terms. So I take this messy world and I say, I'm going to treat it as if it
link |
00:18:57.600
were a problem of this formal kind, and then I can apply solution concepts or algorithms or whatever
link |
00:19:02.960
to solve that formal thing, right? So of course the world is not anything. It's not an MDP or a
link |
00:19:07.920
POMDP. I don't know what it is, but I can model aspects of it in some way or some other way.
link |
00:19:12.880
And when I model some aspect of it in a certain way, that gives me some set of algorithms I can
link |
00:19:17.600
use. You can model the world in all kinds of ways. Some are
link |
00:19:26.160
more accepting of uncertainty, more easily modeling uncertainty of the world. Some really force the
link |
00:19:33.360
world to be deterministic. And so certainly MDPs model the uncertainty of the world. Yes. Model
link |
00:19:42.000
some uncertainty. They model not present state uncertainty, but they model uncertainty in the
link |
00:19:47.520
way the future will unfold. Right. So what are Markov decision processes? So Markov decision
link |
00:19:54.560
process is a model. It's a kind of a model that you could make that says, I know completely the
link |
00:19:59.760
current state of my system. And what it means to be a state is that I have all
link |
00:20:05.760
the information right now that will let me make predictions about the future as well as I can.
link |
00:20:11.120
So that remembering anything about my history wouldn't make my predictions any better.
link |
00:20:17.760
And, but then it also says that then I can take some actions that might change the state of the
link |
00:20:23.280
world. And that I don't have a deterministic model of those changes. I have a probabilistic
link |
00:20:28.400
model of how the world might change. It's a useful model for some kinds of systems.
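To make the formalism concrete, here is a minimal sketch of an MDP as she defines it, a known current state plus a probabilistic transition model, with value iteration as the planner. The three-state example and all of the numbers are invented purely for illustration:

```python
import numpy as np

# Toy MDP: 3 states, 2 actions. T[s, a, s'] is the probability of landing in
# state s' after taking action a in state s; R[s, a] is the expected reward.
# The Markov assumption: s alone (no history) suffices to predict the future.
T = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
    [[0.0, 0.6, 0.4], [0.0, 0.1, 0.9]],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],   # state 2 is absorbing
])
R = np.array([
    [0.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
])
gamma = 0.95  # discount factor

# Value iteration: a planner that turns the model into a policy.
V = np.zeros(3)
for _ in range(200):
    Q = R + gamma * (T @ V)   # Q[s, a] = R[s, a] + gamma * sum_s' T[s, a, s'] * V[s']
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)     # the "model plus planner" stored as a policy
print(V, policy)
```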
link |
00:20:34.160
I think it's a, I mean, it's certainly not a good model for most problems, I think, because for most
link |
00:20:42.480
problems you don't actually know the state. For most problems you, it's partially observed. So
link |
00:20:48.720
that's now a different problem class. So, okay. That's where the POMDPs, the partially observable
link |
00:20:55.600
Markov decision processes, step in. So how do they address the fact that you can't
link |
00:21:01.360
observe, and have incomplete information about, most of the world around you? Right. So now the idea is
link |
00:21:07.760
we still kind of postulate that there exists a state. We think that there is some information
link |
00:21:12.880
about the world out there such that if we knew that we could make good predictions, but we don't
link |
00:21:18.000
know the state. And so then we have to think about how, but we do get observations. Maybe I get
link |
00:21:23.680
images or I hear things or I feel things, and those might be local or noisy. And so therefore
link |
00:21:29.680
they don't tell me everything about what's going on. And then I have to reason about given the
link |
00:21:34.720
history of actions I've taken and observations I've gotten, what do I think is going on in the
link |
00:21:39.920
world? And then given my own kind of uncertainty about what's going on in the world, I can decide
link |
00:21:43.760
what actions to take. And so how difficult is this problem of planning under uncertainty in your
link |
00:21:50.080
view and your long experience of modeling the world, trying to deal with this uncertainty in
link |
00:21:57.840
especially in real world systems? Optimal planning for even discrete POMDPs can be undecidable
link |
00:22:05.040
depending on how you set it up. And so lots of people say, I don't use POMDPs because they are
link |
00:22:12.960
intractable. And I think that that's kind of a very funny thing to say because the problem you
link |
00:22:19.600
have to solve is the problem you have to solve. So if the problem you have to solve is intractable,
link |
00:22:24.320
that's what makes us AI people, right? So we solve, we understand that the problem we're
link |
00:22:28.720
solving is wildly intractable, that we will never be able to solve it optimally,
link |
00:22:34.320
at least I don't. Yeah, right. So later we can come back to an idea about bounded optimality
link |
00:22:41.360
and so on. But anyway, we can't come up with optimal solutions to these problems.
link |
00:22:45.520
So we have to make approximations, approximations in modeling, approximations in the solution
link |
00:22:50.640
algorithms and so on. And so I don't have a problem with saying, yeah, my problem actually,
link |
00:22:56.960
it is a POMDP in continuous space with continuous observations. And it's so computationally complex,
link |
00:23:02.480
I can't even think about its, you know, big O whatever. But that doesn't prevent me from,
link |
00:23:09.600
it helps me, gives me some clarity to think about it that way and to then take steps to
link |
00:23:16.160
make approximation after approximation to get down to something that's like computable
link |
00:23:20.880
in some reasonable time. When you think about optimality, the community broadly has shifted on
link |
00:23:26.720
that, I think a little bit in how much they value the idea of optimality, of chasing an optimal
link |
00:23:34.880
solution. How have your views of chasing an optimal solution changed over the years when
link |
00:23:40.960
you work with robots? That's interesting. I think we have a little bit of a methodological crisis
link |
00:23:48.880
actually from the theoretical side. I mean, I do think that theory is important and that right now
link |
00:23:53.520
we're not doing much of it. So there's lots of empirical hacking around and training this and
link |
00:24:00.080
doing that and reporting numbers, but is it good? Is it bad? We don't know. It's very hard to say
link |
00:24:05.120
things. And if you look at like computer science theory, so for a while, everything was
link |
00:24:16.400
about solving problems optimally or completely. And then there were interesting relaxations. So
link |
00:24:22.320
people look at, oh, are there regret bounds or can I do some kind of approximation? Can I prove
link |
00:24:30.640
something that I can approximately solve this problem or that I get closer to the solution as
link |
00:24:34.880
I spend more time and so on? What's interesting I think is that we don't have good approximate
link |
00:24:42.560
solution concepts for very difficult problems. I like to say that I'm interested in doing a very
link |
00:24:51.520
bad job of very big problems. Right. So very bad job, very big problems. I like to do that,
link |
00:25:00.400
but I wish I could say something. I wish I had a, I don't know, some kind of a formal solution
link |
00:25:09.120
concept that I could use to say, oh, this algorithm actually, it gives me something.
link |
00:25:16.240
Like I know what I'm going to get. I can do something other than just run it and get out.
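For concreteness, the kind of formal solution concept being wished for here usually takes a shape like the following generic textbook forms (stated abstractly; these are not guarantees claimed for any particular algorithm in the conversation):

```latex
% An approximation guarantee: the returned policy \hat{\pi} is provably
% within \epsilon of optimal from every state.
V^{\hat{\pi}}(s) \;\ge\; V^{*}(s) - \epsilon \qquad \text{for all states } s.

% A regret bound: the cumulative shortfall relative to the optimum grows
% sublinearly in the horizon T, so the average per-step loss goes to zero.
\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \bigl( V^{*} - V^{\pi_t} \bigr) \;=\; O\!\bigl(\sqrt{T}\bigr).
```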
link |
00:25:19.840
So that, that notion is still somewhere deeply compelling to you. The notion that you can say,
link |
00:25:27.520
you can drop a thing on the table that says, you can expect this: this algorithm will
link |
00:25:32.400
give me some good results. I hope there's, I hope science will, I mean,
link |
00:25:37.280
there's engineering and there's science. I think that they're not exactly the same.
link |
00:25:42.320
And I think right now we're making huge engineering, like leaps and bounds. So the
link |
00:25:47.040
engineering is running away ahead of the science, which is cool. And often how it goes, right? So
link |
00:25:52.240
we're making things and nobody knows how and why they work roughly, but we need to turn that into
link |
00:25:59.680
science. There's some form. It's a, yeah, there's some room for formalizing. We need to know what
link |
00:26:05.440
the principles are. Why does this work? Why does that not work? I mean, for a while, people built
link |
00:26:09.840
bridges by trying, but now we can often predict whether it's going to work or not without building
link |
00:26:14.480
it. Can we do that for learning systems or for robots? So your hope is from a materialistic
link |
00:26:20.640
perspective that intelligence, artificial intelligence systems, robots are just fancier
link |
00:26:27.600
bridges. Belief space. What's the difference between belief space and state space? So you
link |
00:26:33.040
mentioned MDPs, POMDPs, reasoning about, you sense the world, there's a state.
link |
00:26:39.840
Uh, what, what's this belief space idea? That sounds so good.
link |
00:26:44.400
It sounds good. So belief space, that is instead of thinking about what's the state of the world
link |
00:26:51.600
and trying to control that as a robot, I think about what is the space of beliefs that I could
link |
00:26:58.880
have about the world. If I think of a belief as a probability distribution over ways
link |
00:27:03.520
the world could be, a belief state is a distribution. And then my control problem, if I'm reasoning
link |
00:27:10.080
about how to move through a world I'm uncertain about, my control problem is actually the problem
link |
00:27:16.160
of controlling my beliefs. So I think about taking actions, not just what effect they'll have on the
link |
00:27:21.360
world outside, but what effect they'll have on my own understanding of the world outside. And so
link |
00:27:26.080
that might compel me to ask a question or look somewhere to gather information, which may not
link |
00:27:32.800
really change the world state, but it changes my own belief about the world. That's a powerful way
link |
00:27:38.400
to, to empower the agent, to reason about the world, to explore the world. So what kind of
link |
00:27:46.320
problems does it allow you to solve to, to consider belief space versus just state space?
link |
00:27:52.400
Well, any problem that requires deliberate information gathering, right? So if in some
link |
00:27:58.400
problems like chess, there's no uncertainty, or maybe there's uncertainty about the opponent,
link |
00:28:05.040
there's no uncertainty about the state. And some problems, there's uncertainty,
link |
00:28:10.320
but you gather information as you go, right? You might say, Oh, I'm driving my autonomous car down
link |
00:28:16.080
the road and it doesn't know perfectly where it is, but the lidars are all going all the time.
link |
00:28:20.560
So I don't have to think about whether to gather information. But if you're a human driving down
link |
00:28:25.360
the road, you sometimes look over your shoulder to see what's going on behind you in the lane.
link |
00:28:31.360
And you have to decide whether you should do that now. And you have to trade off the fact that
link |
00:28:37.840
you're not seeing in front of you and you're looking behind you and how valuable is that
link |
00:28:41.520
information and so on. And so to make choices about information gathering, you have to reason in
link |
00:28:47.200
belief space. Also, I mean, also to just take into account your own uncertainty before trying to
link |
00:28:56.800
do things. So you might say, if I understand where I'm standing relative to the door jamb,
link |
00:29:05.360
pretty accurately, then it's okay for me to go through the door. But if I'm really
link |
00:29:08.640
not sure where the door is, then it might be better to not do that right now.
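A minimal sketch of the belief bookkeeping she describes, for a toy two-cell world (the world, the motion noise, and the door-sensor model are all invented for illustration), in the style of a Bayes filter:

```python
# Toy belief update: the robot is in one of two cells but never observes its
# cell directly, only a noisy "door detected" signal. Numbers are illustrative.

def predict(belief, move_prob=0.8):
    """Action update: trying to move from cell 0 to cell 1 succeeds with
    probability move_prob; cell 1 is absorbing in this toy example."""
    b0, b1 = belief
    return (b0 * (1 - move_prob), b0 * move_prob + b1)

def correct(belief, saw_door, p_door=(0.9, 0.2)):
    """Observation update: cell 0 has a door (detected 90% of the time),
    cell 1 triggers false positives 20% of the time."""
    likelihood = [p if saw_door else 1 - p for p in p_door]
    unnorm = [l * b for l, b in zip(likelihood, belief)]
    z = sum(unnorm)
    return tuple(u / z for u in unnorm)

belief = (0.5, 0.5)                        # maximal uncertainty about the state
belief = predict(belief)                   # what acting does to my belief
belief = correct(belief, saw_door=True)    # what looking does to my belief
print(belief)
# Choosing whether to look (gather information) or act can now be framed as
# controlling this belief, not the hidden state directly.
```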
link |
00:29:12.960
The degree of your uncertainty about the world is actually part of the thing you're trying to
link |
00:29:17.760
optimize in forming the plan, right? So this idea of a long horizon of planning for a PhD or just
link |
00:29:25.760
even how to get out of the house or how to make breakfast. You showed this presentation of the WTF,
link |
00:29:31.920
where's the fork, of a robot looking at a sink. And can you describe how we plan in this world
link |
00:29:40.640
of this idea of hierarchical planning we've mentioned? So yeah, how can a robot hope to
link |
00:29:47.360
plan about something with such a long horizon where the goal is quite far away?
link |
00:29:54.480
People since probably reasoning began have thought about hierarchical reasoning,
link |
00:29:59.840
the temporal hierarchy in particular. Well, there's spatial hierarchy, but let's talk
link |
00:30:03.040
about temporal hierarchy. So you might say, oh, I have this long execution I have to do,
link |
00:30:08.960
but I can divide it into some segments abstractly, right? So maybe you have to get out of the house,
link |
00:30:15.360
I have to get in the car, I have to drive and so on. And so you can plan if you can build
link |
00:30:22.720
abstractions. So this we started out by talking about abstractions. And we're back to that now,
link |
00:30:26.960
if you can build abstractions in your state space, and abstractions sort of temporal abstractions,
link |
00:30:34.560
then you can make plans at a high level. And you can say, I'm going to go to town and then I'll
link |
00:30:39.760
have to get gas and then I can go here and I can do this other thing. And you can reason about the
link |
00:30:43.520
dependencies and constraints among these actions, again, without thinking about the complete
link |
00:30:50.080
details. What we do in our hierarchical planning work is then say, all right, I make a plan at a
link |
00:30:56.800
high level of abstraction, I have to have some reason to think that it's feasible without working
link |
00:31:03.760
it out in complete detail. And that's actually the interesting step. I always like to talk about
link |
00:31:08.720
walking through an airport, like you can plan to go to New York and arrive at the airport, and then
link |
00:31:15.120
find yourself an office building later. You can't even tell me in advance what your plan is for
link |
00:31:20.080
walking through the airport, partly because you're too lazy to think about it, maybe, but partly
link |
00:31:24.960
also because you just don't have the information, you don't know what gate you're landing in, or
link |
00:31:28.800
what people are going to be in front of you or anything. So there's no point in planning in
link |
00:31:34.080
detail, but you have to have, you have to make a leap of faith that you can figure it out once you
link |
00:31:40.640
get there. And it's really interesting to me how you arrive at that. How do you, so you have learned
link |
00:31:50.000
over your lifetime to be able to make some kinds of predictions about how hard it is to achieve some
link |
00:31:54.720
kinds of sub goals. And that's critical. Like you would never plan to fly somewhere if you couldn't,
link |
00:32:00.720
didn't have a model of how hard it was to do some of the intermediate steps. So one of the things
link |
00:32:04.800
we're thinking about now is how do you do this kind of very aggressive generalization to situations
link |
00:32:12.480
that you haven't been in and so on to predict how long will it take to walk through the Kuala Lumpur
link |
00:32:16.880
airport. Like you could give me an estimate and it wouldn't be crazy. And you have to have an
link |
00:32:22.480
estimate of that in order to make plans that involve walking through the Kuala Lumpur airport,
link |
00:32:27.280
even if you don't need to know it in detail. So I'm really interested in these kinds of abstract
link |
00:32:32.480
models and how do we acquire them. But once we have them, we can use them to do hierarchical
link |
00:32:37.280
reasoning, which is, I think is very important. Yeah. There's this notion of goal regression and
link |
00:32:43.280
preimage backchaining, this idea of starting at the goal and just forming these big clouds of
link |
00:32:50.080
states. I mean, it's almost like saying, with the airport, you know, once you show up to the airport,
link |
00:33:01.920
you're like a few steps away from the goal. So like thinking of it this way, it's kind of
link |
00:33:08.160
interesting. I don't know if you have sort of further comments on that of starting at the goal.
link |
00:33:14.480
Yeah. I mean, it's interesting that Simon, Herb Simon back in the early days of AI talked a lot
link |
00:33:22.000
about means ends reasoning and reasoning back from the goal. There's a kind of an intuition that
link |
00:33:26.480
people have that the state space is big. The number of actions you could take is
link |
00:33:34.480
really big. So if you say, here I sit and I want to search forward from where I am, what are all
link |
00:33:39.120
the things I could do? That's just overwhelming. If you say, if you can reason at this other level
link |
00:33:44.720
and say, here's what I'm hoping to achieve, what could I do to make that true? That somehow the
link |
00:33:49.520
branching is smaller. Now what's interesting is that like in the AI planning community,
link |
00:33:54.880
that hasn't worked out in the class of problems that they look at and the methods that they tend
link |
00:33:59.120
to use. It hasn't turned out that it's better to go backward. It's still kind of my intuition that
link |
00:34:04.560
it is, but I can't prove that to you right now. Right. I share your intuition, at least for us
link |
00:34:10.720
mere humans. Speaking of which, maybe now we can take a little step into that philosophy circle.
link |
00:34:22.400
When you think about human life, you give those examples often. How hard do
link |
00:34:28.080
you think it is to formulate human life as a planning problem or aspects of human life? So
link |
00:34:33.360
when you look at robots, you're often trying to think about object manipulation,
link |
00:34:38.640
tasks about moving a thing. When you take a slight step outside the room, let the robot
link |
00:34:46.240
leave and go get lunch, or maybe try to pursue more fuzzy goals. How hard do you think is that
link |
00:34:54.480
problem? If you were to try, to maybe put it another way, to formulate human life as a planning
link |
00:35:00.800
problem. Well, that would be a mistake. I mean, it's not all a planning problem, right? I think
link |
00:35:05.760
it's really, really important that we understand that you have to put together pieces and parts
link |
00:35:11.920
that have different styles of reasoning and representation and learning. I think it seems
link |
00:35:18.640
probably clear to anybody that it can't all be this or all be that. Brains aren't all like this
link |
00:35:25.680
or all like that, right? They have different pieces and parts and substructure and so on.
link |
00:35:30.160
So I don't think that there's any good reason to think that there's going to be like one true
link |
00:35:34.320
algorithmic thing that's going to do the whole job. So it's a bunch of pieces together designed
link |
00:35:40.720
to solve a bunch of specific problems. Or maybe styles of problems. I mean, there's probably some
link |
00:35:49.120
reasoning that needs to go on in image space. I think, again, there's this model based versus
link |
00:35:57.360
model free idea, right? So in reinforcement learning, people talk about, oh, should I learn,
link |
00:36:02.480
I could learn a policy, just straight up a way of behaving. I could learn, it's popular,
link |
00:36:08.400
a value function. That's some kind of weird intermediate ground. Or I could learn a transition
link |
00:36:14.960
model, which tells me something about the dynamics of the world. If I take it, imagine that I learned
link |
00:36:20.240
a transition model and I couple it with a planner and I draw a box around that, I have a policy
link |
00:36:25.600
again. It's just stored a different way, right? But it's just as much of a policy as the other
link |
00:36:32.800
policy. It's just I've made, I think the way I see it is it's a time space trade off in computation,
link |
00:36:40.160
right? A more overt policy representation. Maybe it takes more space, but maybe I can
link |
00:36:46.160
compute quickly what action I should take. On the other hand, maybe a very compact model of
link |
00:36:51.200
the world dynamics plus a planner lets me compute what action to take, just more slowly. There's
link |
00:36:57.120
no, I mean, I don't think there's an argument to be had. It's just like a question of
link |
00:37:02.320
what form of computation is best for us for the various sub problems. Right. So, and, and so like
link |
00:37:10.320
learning to do algebra manipulations for some reason is, I mean, that's probably gonna want
link |
00:37:16.000
naturally a sort of a different representation than riding a unicycle. The time constraints
link |
00:37:21.280
on the unicycle are serious. The space is maybe smaller. I don't know. But so it could be the more
link |
00:37:27.760
human side of falling in love, having a relationship, that might be another, another style of how to
link |
00:37:36.000
model that. Yeah. Let's first solve the algebra and the object manipulation. What do you think
link |
00:37:43.280
is harder, perception or planning? Perception. Understanding, that's why. So what do you think
link |
00:37:52.160
is so hard about perception, about understanding the world around you? Well, I mean, I think the big
link |
00:37:56.480
question is representational. Hugely the question is representation. So perception has made great
link |
00:38:08.160
strides lately, right? And we can classify images and we can play certain kinds of games and predict
link |
00:38:15.360
how to steer the car and all this sort of stuff. Um, I don't think we have a very good idea of
link |
00:38:24.800
what perception should deliver, right? So if you, if you believe in modularity, okay, there's,
link |
00:38:29.760
there's a very strong view which says we shouldn't build in any modularity. We should make a giant
link |
00:38:38.000
gigantic neural network, train it end to end to do the thing. And that's the best way forward.
link |
00:38:44.320
And it's hard to argue with that except on a sample complexity basis, right? So you might say,
link |
00:38:51.440
Oh, well if I want to do end to end reinforcement learning on this giant, giant neural network,
link |
00:38:55.280
it's going to take a lot of data and a lot of like broken robots and stuff. So then the only answer
link |
00:39:05.520
is to say, okay, we have to build something in, build in some structure or some bias. We know
link |
00:39:11.760
from theory of machine learning, the only way to cut down the sample complexity is to kind of cut
link |
00:39:15.760
down, somehow cut down the hypothesis space. You can do that by building in bias. There's all kinds
link |
00:39:22.480
of reasons to think that nature built bias into humans. Um, convolution is a bias, right? It's a
link |
00:39:30.640
very strong bias and it's a very critical bias. So my own view is that we should look for more
link |
00:39:37.520
things that are like convolution, but that address other aspects of reasoning, right? So convolution
link |
00:39:42.880
helps us a lot with a certain kind of spatial reasoning. That's quite close to the imaging.
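One back-of-the-envelope way to see convolution as a bias that cuts down the hypothesis space, in the sample-complexity sense discussed above (the layer sizes here are arbitrary, chosen only to show the gap):

```python
# Rough parameter counts: convolution as a strong, useful inductive bias.
H, W, C_in, C_out, K = 64, 64, 3, 16, 3

# A fully connected layer mapping the whole image to a same-resolution feature map:
dense_params = (H * W * C_in) * (H * W * C_out)

# A convolutional layer with the same channels, sharing one KxK kernel across
# every spatial position (translation invariance is built in, not learned):
conv_params = K * K * C_in * C_out

print(f"dense: {dense_params:,} parameters")   # 805,306,368
print(f"conv:  {conv_params:,} parameters")    # 432
# The far smaller hypothesis space is what buys the lower sample complexity.
```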
link |
00:39:48.320
I think there's other ideas like that. Maybe some amount of forward search, maybe some notions of
link |
00:39:56.880
abstraction, maybe the notion that objects exist. Actually, I think that's pretty important. And a
link |
00:40:02.080
lot of people won't give you that to start with. Right? So almost like a convolution in the, uh,
link |
00:40:08.960
uh, in the object, semantic object space or some kind of, some kind of ideas in there.
link |
00:40:13.840
That's right. And people are starting like the graph, graph convolutions are an idea that are
link |
00:40:17.760
related to relational representations. And so, so I think there are. So, you know,
link |
00:40:25.840
I've come far afield from perception, but I think, um, I think the thing that's going to make
link |
00:40:30.720
perception take the next step is actually understanding better what it should produce.
link |
00:40:36.800
Right? So what are we going to do with the output of it? Right? It's fine when what we're going to
link |
00:40:40.640
do with the output is steer. It's less clear when we're just trying to make a one integrated
link |
00:40:46.960
intelligent agent, what should the output of perception be? We have no idea. And how should
link |
00:40:52.560
that hook up to the other stuff? We don't know. So I think the pressing question is,
link |
00:40:59.040
what kinds of structure can we build in that are like the moral equivalent of convolution
link |
00:41:03.520
that will make a really awesome superstructure that then learning can kind of progress on
link |
00:41:09.440
efficiently. I agree. Very compelling description of actually where we stand with the perception
link |
00:41:13.840
problem. You're teaching a course on embodied intelligence. What do you think it takes to
link |
00:41:19.120
build a robot with human level intelligence? I don't know. If we knew, we would do it.
link |
00:41:27.680
If you were to, I mean, okay. So do you think a robot needs to have a self awareness,
link |
00:41:36.000
consciousness, fear of mortality, or is it, is it simpler than that? Or is consciousness a simple
link |
00:41:44.160
thing? Like, do you think about these notions? I don't think much about consciousness. Even
link |
00:41:50.880
most philosophers who care about it will give you that you could have robots that are zombies,
link |
00:41:55.840
right? That behave like humans, but are not conscious. And I, at this moment would be happy
link |
00:42:00.320
enough with that. So I'm not really worried one way or the other. So the technical side,
link |
00:42:03.760
you're not thinking of the use of self awareness. Well, but I, okay, but then what does self
link |
00:42:09.920
awareness mean? I mean, that you need to have some part of the system that can observe other
link |
00:42:16.960
parts of the system and tell whether they're working well or not. That seems critical.
link |
00:42:21.200
So does that count as, I mean, does that count as self awareness or not? Well, it depends on whether
link |
00:42:27.360
you think that there's somebody at home who can articulate whether they're self aware. But clearly,
link |
00:42:33.120
if I have like, you know, some piece of code that's counting how many times this procedure gets
link |
00:42:37.600
executed, that's a kind of self awareness, right? So there's a big spectrum. It's clear you have to
link |
00:42:43.680
have some of it. Right. You know, we're quite far away in many dimensions, but is there a direction
link |
00:42:48.160
of research that's most compelling to you for, you know, trying to achieve human level intelligence
link |
00:42:54.720
in our robots? Well, to me, I guess the thing that seems most compelling to me at the moment is this
link |
00:43:00.880
question of what to build in and what to learn. Um, I think we're, we don't, we're missing a bunch
link |
00:43:10.160
of ideas and, and we, you know, people, you know, don't you dare ask me how many years it's going to
link |
00:43:17.200
be until that happens because I won't even participate in the conversation because I think
link |
00:43:22.320
we're missing ideas and I don't know how long it's going to take to find them. So I won't ask you how
link |
00:43:26.400
many years, but, uh, maybe I'll ask you when you'll be sufficiently impressed that we've
link |
00:43:34.240
achieved it. So what's, what's a good test of intelligence? Do you like the Turing test, the
link |
00:43:40.080
natural language in the robotic space? Is there something where you would sit back and think,
link |
00:43:46.400
Oh, that's, that's pretty impressive. Uh, as a test, as a benchmark, do you think about these
link |
00:43:52.000
kinds of problems? No, I resist. I mean, I think all the time that we spend arguing about those
link |
00:43:57.760
kinds of things could be better spent just making the robots work better. Uh, so you don't value
link |
00:44:03.520
competition. So, I mean, there's the nature of benchmarks and datasets or Turing
link |
00:44:10.000
test challenges where everybody kind of gets together and tries to build a better robot
link |
00:44:14.240
cause they want to out compete each other. Like the DARPA challenge with the autonomous vehicles.
link |
00:44:18.640
Do you see the value of that or it can get in the way? I think it can get in the way. I mean,
link |
00:44:25.040
some people, many people find it motivating. And so that's good. I find it anti motivating
link |
00:44:29.520
personally. Uh, but I think what, I mean, I think you get an interesting cycle where for a contest,
link |
00:44:37.440
a bunch of smart people get super motivated and they hack their brains out and much of what gets
link |
00:44:42.000
done is just hacks, but sometimes really cool ideas emerge. And then that gives us something
link |
00:44:47.200
to chew on after that. So I'm, it's not a thing for me, but I don't, I don't regret that other
link |
00:44:54.400
people do it. Yeah. It's like you said with everything else that it makes us good. So jumping
link |
00:44:59.120
topics a little bit, you started the journal of machine learning research and served as its editor
link |
00:45:05.440
in chief. Uh, how did the publication come about and what do you think about the current publishing
link |
00:45:13.760
model space in machine learning artificial intelligence? Okay, good. So it came about
link |
00:45:19.680
because there was a journal called machine learning, which still exists, which was owned by
link |
00:45:24.000
Cluer and there was, I was on the editorial board and we used to have these meetings annually where
link |
00:45:30.880
we would complain to Cluer that it was too expensive for the libraries and that people
link |
00:45:34.640
couldn't publish. And we would really like to have some kind of relief on those fronts and they would
link |
00:45:39.200
always sympathize, but not do anything. So, uh, we just decided to make a new journal and, uh,
link |
00:45:46.960
there was the Journal of AI Research, which was on the same model, which had been in existence
link |
00:45:52.720
for maybe five years or so, and it was going on pretty well. So, uh, we just made a new journal.
link |
00:45:59.920
It wasn't, I mean, um, I don't know, I guess it was work, but it wasn't that hard. So basically
link |
00:46:05.280
the editorial board, probably 75% of the editorial board of, uh, Machine Learning resigned and we
link |
00:46:14.560
founded the new journal, but it was sort of, it was more open. Yeah. Right. So it's completely
link |
00:46:21.760
open. It's open access. Actually, uh, uh, I had a postdoc, George Konidaris, who wanted to call
link |
00:46:28.960
these journals free for all, uh, because there were, I mean, it both has no page charges and has
link |
00:46:36.240
no, uh, uh, access restrictions. And the reason, and so lots of people, I mean, there were, there
link |
00:46:44.640
were people who were mad about the existence of this journal who thought it was a fraud or
link |
00:46:48.960
something. It would be impossible, they said, to run a journal like this. Basically, I mean,
link |
00:46:54.320
for a long time, I didn't even have a bank account. Uh, I paid for the lawyer to incorporate and the
link |
00:47:00.320
IP address, and it just cost a couple of hundred dollars a year to run. It's a little bit
link |
00:47:06.960
more now, but not that much more, but that's because I think computer scientists are competent
link |
00:47:13.920
and autonomous in a way that many scientists in other fields aren't. I mean, at doing these kinds
link |
00:47:19.440
of things, we already typeset our own papers. We all have students and people who can hack a
link |
00:47:24.480
website together in an afternoon. So the infrastructure for us was like, not a problem,
link |
00:47:29.280
but for other people in other fields, it's a harder thing to do. Yeah. And this kind of
link |
00:47:34.320
open access journal is nevertheless one of the most prestigious journals. So it's not like, uh,
link |
00:47:41.600
prestige and it can be achieved without any of the... Paper is not required for prestige.
link |
00:47:47.520
Yeah. It turns out. Yeah. So on the review process side, actually, a long time ago,
link |
00:47:53.600
I don't remember when I reviewed a paper where you were also a reviewer. And I remember reading
link |
00:47:59.440
your review being influenced by it and it was really well written. It influenced how I write
link |
00:48:04.080
future reviews. Uh, you disagreed with me actually. Uh, and you made it, uh, my review,
link |
00:48:11.520
much better. So, nevertheless, the review process, you know, has its, uh, flaws.
link |
00:48:19.280
And how do you think, what do you think works well? How can it be improved?
link |
00:48:23.600
So actually when I started JMLR, I wanted to do something completely different.
link |
00:48:28.720
And I didn't because it felt like we needed a traditional journal of record. And so we just
link |
00:48:34.800
made JMLR be almost like a normal journal, except for the open access parts of it, basically. Um,
link |
00:48:43.200
increasingly of course, publication is not even a sensible word. You can publish something by
link |
00:48:47.600
putting it on arXiv, so I can publish everything tomorrow. So making stuff public
link |
00:48:54.240
is, there's no barrier. We still need curation and evaluation. I don't have time to read all
link |
00:49:04.560
of arXiv. And you could argue that kind of social thumbs-upping of articles suffices,
link |
00:49:20.000
right? You might say, Oh, heck with this. We don't need journals at all. We'll put everything
link |
00:49:24.880
on arXiv and people will upvote and downvote the articles. And then your CV will say, Oh man,
link |
00:49:29.840
he got a lot of upvotes. So, uh, that's good. Um, but I think there's still
link |
00:49:39.040
value in careful reading and commentary of things. And it's hard to tell when people are
link |
00:49:46.320
upvoting and downvoting or arguing about your paper on Twitter and Reddit, whether they know
link |
00:49:53.280
what they're talking about, right? So then I have the second order problem of trying to decide whose
link |
00:49:57.760
opinions I should value and such. So I don't know what I w if I had infinite time, which I don't,
link |
00:50:04.480
and I'm not going to do this because I really want to make robots work. But if I felt inclined to do
link |
00:50:10.160
something more in the publication direction, I would do this other thing, which I thought about
link |
00:50:14.560
doing the first time, which is to get together some set of people whose opinions I value and
link |
00:50:19.840
who are pretty articulate. And I guess we would be public, although we could be private. I'm not sure.
link |
00:50:25.200
And we would review papers. We wouldn't publish them and you wouldn't submit them. We would just
link |
00:50:29.360
find papers and we would write reviews and we would make those reviews public. And maybe if you,
link |
00:50:37.040
you know, so we're Leslie's friends who review papers and maybe eventually if, if we, our opinion
link |
00:50:42.880
was sufficiently valued, like the opinion of JMLR is valued, then you'd say on your CV that Leslie's
link |
00:50:48.560
friends gave my paper a five star rating. And that would be just as good as saying, I got it,
link |
00:50:53.200
so, you know, accepted into this journal. So I think, I think we should have good public commentary
link |
00:51:01.840
and organize it in some way, but I don't really know how to do it. It's interesting times.
link |
00:51:06.320
The way you describe it actually is really interesting. I mean, we do it for movies,
link |
00:51:10.000
IMDb.com. There are experts, critics, who come in and write reviews, but there are also
link |
00:51:16.000
regular, non-critic humans who write reviews, and they're kept separate.
link |
00:51:19.840
I like OpenReview. The ICLR process, I think, is interesting.
link |
00:51:29.280
It's a step in the right direction, but it's still not as compelling as reviewing movies or
link |
00:51:35.760
video games. I mean, it might be silly to say, at least from my perspective,
link |
00:51:41.600
but it boils down to the user interface, how fun and easy it is to actually perform the reviews,
link |
00:51:46.720
how efficient, how much you as a reviewer get street cred for being a good reviewer.
link |
00:51:54.400
Those elements, those human elements come into play.
link |
00:51:57.200
No, it's a big investment to do a good review of a paper and the flood of papers is out of control.
link |
00:52:04.000
Right. So, you know, there aren't 3,000 new... I don't know how many new movies there are in a year.
link |
00:52:08.480
I don't know, but it's probably going to be fewer than the number of machine learning papers
link |
00:52:12.160
in a year now. And I'm worried... you know, right, so I'm an old person, so of course
link |
00:52:21.760
I'm going to say, things are moving too fast. I'm a stick in the mud. So I can say that,
link |
00:52:28.720
but my particular flavor of that is I think the horizon for researchers has gotten very short,
link |
00:52:35.760
that students want to publish a lot of papers. It's exciting, and
link |
00:52:41.760
there's value in that, and you get patted on the head for it and so on. Some of that is
link |
00:52:49.200
fine, but I'm worried that we're driving out people who would spend two years thinking about
link |
00:52:59.200
something. Back in my day, when we worked on our thesis, we did not publish papers. You did your
link |
00:53:06.080
thesis for years. You picked a hard problem and then you worked and chewed on it and did stuff
link |
00:53:11.360
and wasted time, for a long time. And when it was roughly done, you would write
link |
00:53:16.320
papers. And so I don't know... I don't think that everybody has to work in that mode,
link |
00:53:23.520
but I think there are some problems that are hard enough that it's important to have a long
link |
00:53:27.760
research horizon. And I'm worried that we don't incentivize that at all at this point.
link |
00:53:33.040
In this current structure. Yeah. So what are your hopes and fears about the
link |
00:53:41.840
future of AI, continuing on this theme? AI has gone through a few winters, ups and downs. Do
link |
00:53:50.080
you see another winter of AI coming? Are you more hopeful about making robots work, as you said?
link |
00:53:58.720
I think the cycles are inevitable, but I think each time we get higher, right? I mean, so, you
link |
00:54:05.680
know, it's like climbing some kind of landscape with a noisy optimizer. So it's clear that,
link |
00:54:15.280
you know, the deep learning stuff has made deep and important improvements. And so the high water
link |
00:54:22.560
mark is now higher. There's no question. But of course, I think people are overselling and
link |
00:54:29.360
eventually investors, I guess, and other people will look around and say, well, you're not quite
link |
00:54:37.040
delivering on this grand claim and that wild hypothesis. It's probably going to crash
link |
00:54:43.120
some amount, and then it's okay. I mean, I can't imagine that there's
link |
00:54:49.760
some awesome monotonic improvement from here to human-level AI. So, you know, I have to ask
link |
00:54:58.320
this question, and I can probably anticipate the answers, but do you have a worry, short term or
link |
00:55:05.520
long term, about the existential threats of AI, and maybe, short term, less existential but more about
link |
00:55:15.280
robots taking away jobs?
link |
00:55:17.120
Well, actually, let me talk a little bit about utility. Actually, I had an interesting
link |
00:55:25.520
conversation with some military ethicists who wanted to talk to me about autonomous weapons.
link |
00:55:30.880
And they were interesting, smart, well-educated guys who didn't know too much about AI
link |
00:55:37.440
or machine learning. And the first question they asked me was, has your robot ever done
link |
00:55:41.440
something you didn't expect? And I burst out laughing, because anybody who's ever done
link |
00:55:46.480
anything on a robot, right, knows that they don't do what you expect. And what I realized was that their
link |
00:55:51.680
model of how we program a robot was completely wrong. Their model of how we can program a robot
link |
00:55:56.960
was like Lego Mindstorms: Oh, go forward a meter, turn left, take a picture, do this, do
link |
00:56:03.120
that. And so if you have that model of programming, then it's true. It's kind of weird that your robot
link |
00:56:08.800
would do something that you didn't anticipate. But the fact is, and actually, so now this is my
link |
00:56:13.760
new educational mission. If I have to talk to non-experts, I try to teach them the idea that
link |
00:56:20.640
we don't operate at that level; we operate at least one or maybe many levels of abstraction above that. And we say,
link |
00:56:26.400
Oh, here's a hypothesis class, maybe it's a space of plans, or maybe it's a space of
link |
00:56:31.200
classifiers or whatever. But there's some set of answers and an objective function. And then we
link |
00:56:36.400
work on some optimization method that tries to find a solution in that class.
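[A minimal, hypothetical sketch of the setup being described here, not code from the conversation: the hypothesis class is a family of threshold rules, the objective is accuracy on some logged data, and the optimization method is plain random search. The programmer writes down the class and the objective, never the decision rule itself.]

```python
import random

# Hypothetical sketch of "hypothesis class + objective + optimization method".
# All data and names here are made up for the illustration.

data = [(0.1, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.7, 1), (0.9, 1)]  # (sensor reading, label)

def make_rule(threshold):
    # One member of the hypothesis class: fire when the reading exceeds the threshold.
    return lambda x: int(x > threshold)

def objective(rule):
    # Fraction of the logged examples the rule gets right.
    return sum(rule(x) == y for x, y in data) / len(data)

best_threshold, best_score = None, -1.0
for _ in range(1000):  # the optimization method: plain random search over the class
    threshold = random.uniform(0.0, 1.0)
    score = objective(make_rule(threshold))
    if score > best_score:
        best_threshold, best_score = threshold, score

# The threshold we end up with is a product of the search, not something anyone
# typed in, which is why even simple learned systems can surprise their authors.
print(best_threshold, best_score)
```

[The same structure holds when the hypothesis class is a space of plans or neural-network policies and the optimizer is a planner or gradient descent; only the pieces get bigger.]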
link |
00:56:43.200
And we don't know what solution is going to come out. Right. So I think it's important to
link |
00:56:47.600
communicate that. So I mean, of course, probably people who listen to this, they know that
link |
00:56:52.160
lesson. But I think it's really critical to communicate that lesson. And then lots of people
link |
00:56:56.320
are now talking about, you know, the value alignment problem. So you want to be sure as
link |
00:57:01.840
robots or software systems get more competent, that their objectives are aligned with your
link |
00:57:07.840
objectives, or that our objectives are compatible in some way, or we have a good way of mediating
link |
00:57:13.760
when they have different objectives. And so I think it is important to start thinking in terms
link |
00:57:19.120
like, you don't have to be freaked out by the robot apocalypse, to accept that it's important
link |
00:57:25.360
to think about objective functions and value alignment. Yes. And really,
link |
00:57:30.080
everyone who's done optimization knows that you have to be careful what you wish for, because,
link |
00:57:34.160
you know, sometimes you get the optimal solution, and you realize, man, that objective was
link |
00:57:38.800
wrong.
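[A toy, hypothetical illustration of that point, not from the conversation: the stated objective sounds reasonable, the optimizer solves it exactly, and the optimum exposes what the objective failed to say.]

```python
# Hypothetical example of a misspecified objective (made-up data). The dataset
# is 99% negative, the stated objective is plain accuracy, and the exact optimum
# over these candidate predictors is "always say negative": optimal for the
# objective we wrote down, useless for the problem we actually cared about.

labels = [0] * 99 + [1]  # 99 negatives, 1 positive

def accuracy(predict):
    return sum(predict(i) == y for i, y in enumerate(labels)) / len(labels)

candidates = {
    "always_negative": lambda i: 0,
    "always_positive": lambda i: 1,
}

best = max(candidates, key=lambda name: accuracy(candidates[name]))
print(best, accuracy(candidates[best]))  # -> always_negative 0.99
```

[The remedy is not a better optimizer but a better objective, for example one that weights the rare class, which is the shift from engineering algorithms to engineering objective functions discussed next.]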
link |
00:57:47.040
So pragmatically, in the short term, it seems to me that those are really interesting and critical questions. And the idea that we're going to go from being people who
link |
00:57:50.960
engineer algorithms to being people who engineer objective functions, I think that's
link |
00:57:56.480
definitely going to happen. And that's going to change our thinking and methodology. And so we're
link |
00:58:00.560
gonna... You started at Stanford in philosophy; that's where this could be headed. And I will go back to
link |
00:58:05.920
philosophy maybe. Well, I mean, they're mixed together, because, as we also know,
link |
00:58:11.920
as machine learning people, right? When you design, in fact, this is the lecture I gave in
link |
00:58:16.160
class today: when you design an objective function, you have to wear both hats. There's
link |
00:58:21.600
the hat that says, what do I want? And there's the hat that says, but I know what my optimizer
link |
00:58:26.080
can do, to some degree, and I have to take that into account. So it's always a trade-off,
link |
00:58:31.840
and we have to kind of be mindful of that. The part about taking people's jobs, I understand
link |
00:58:38.560
that that's important. I don't understand sociology or economics or people very well. So I
link |
00:58:45.440
don't know how to think about that. So that's... Yeah, so there might be a sociological aspect
link |
00:58:50.160
there, and an economic aspect, that are very difficult to think about. Okay. I mean, I think other people
link |
00:58:54.640
should be thinking about it, but that's just not my strength. So what do you think is the most
link |
00:58:58.480
exciting area of research in the short term, for the community and for yourself?
link |
00:59:03.600
Well, so I mean, there's the story I've been telling about how to engineer intelligent robots.
link |
00:59:10.160
So that's what we want to do. We all kind of want to do... well, I mean, some set of us want to do this.
link |
00:59:16.160
And the question is, what's the most effective strategy? And we've tried it. And there's a bunch
link |
00:59:20.560
of different things you could do at the extremes, right? One super extreme is,
link |
00:59:29.600
we do introspection, and we write a program. Okay, that has not worked
link |
00:59:35.920
out very well. Another extreme is we take a giant bunch of neural goo, and we try and train it up to
link |
00:59:41.280
do something. I don't think that's going to work either. So the question is, what's the middle
link |
00:59:46.960
ground? And again, this isn't a theological question or anything like that. It's just:
link |
00:59:54.240
what's the middle ground? And to me, it's clear
link |
01:00:00.640
it's a combination of learning and not learning. And what should that combination be? And what's
link |
01:00:05.760
the stuff we build in? So to me, that's the most compelling question. And when you say engineer
link |
01:00:10.080
robots, you mean engineering systems that work in the real world? That's the emphasis?
link |
01:00:16.560
Okay. Last question. Which robot or robots are your favorite from science fiction?
link |
01:00:24.480
So you can go with Star Wars, R2D2, or you can go with more modern,
link |
01:00:32.240
maybe HAL from... I don't think I have a favorite robot from science fiction.
link |
01:00:38.080
This is back to... you like to make robots work in the real world here, not in...
link |
01:00:45.520
I mean, I love the process and I care more about the process.
link |
01:00:50.000
The engineering process.
link |
01:00:51.600
Yeah. I mean, I do research because it's fun, not because I care about what we produce.
link |
01:00:57.520
Well, that's a beautiful note, actually. Leslie,
link |
01:01:00.640
thank you so much for talking today.
link |
01:01:02.000
Sure. It's been fun.