Leslie Kaelbling: Reinforcement Learning, Planning, and Robotics | Lex Fridman Podcast #15
The following is a conversation with Leslie Kaelbling. She is a roboticist and professor at
MIT. She is recognized for her work in reinforcement learning, planning, robot navigation, and several
other topics in AI. She won the IJCAI Computers and Thought Award and was the editor in chief
of the prestigious Journal of Machine Learning Research. This conversation is part of the
Artificial Intelligence podcast at MIT and beyond. If you enjoy it, subscribe on YouTube,
iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F R I D. And now,
here's my conversation with Leslie Kaelbling.

What made me get excited about AI, I can say
that, is I read Gödel, Escher, Bach when I was in high school. That was pretty formative for me
because it exposed the interestingness of primitives and combination and how you can
make complex things out of simple parts and ideas of AI and what kinds of programs might
generate intelligent behavior.

So you first fell in love with AI reasoning and logic, versus robots?

Yeah, the robots came because of my first job. So I finished an undergraduate degree in philosophy
at Stanford and was about to finish a master's in computer science, and I got hired at SRI
in their AI lab, and they were building a robot. It was kind of a follow-on to Shakey,
but all the Shakey people were not there anymore. And so my job was to try to get this robot to
do stuff, and that's really kind of what got me interested in robots.

So maybe taking a small
step back to your bachelor's at Stanford in philosophy: you did a master's and PhD in computer science,
but the bachelor's in philosophy. So what was that journey like? What elements of philosophy
do you think you bring to your work in computer science?

So it's surprisingly relevant. So part of the reason that I didn't do a computer science
undergraduate degree was that there wasn't one at Stanford at the time, but there's a part of
philosophy, and in fact Stanford has a special submajor in something now called Symbolic Systems,
which is logic, model theory, formal semantics of natural language. And so that's actually
a perfect preparation for work in AI and computer science.

That's kind of interesting. So if you were interested in artificial intelligence,
what kind of majors were people even thinking about taking? Was it neuroscience? So besides
philosophy, what were you supposed to do if you were fascinated by the idea of creating
intelligence?

There weren't enough people who did that for that even to be a conversation.
I mean, I think probably philosophy. I mean, it's interesting: in my graduating class of
undergraduate philosophers, probably maybe slightly less than half went on in computer
science, slightly less than half went on in law, and like one or two went on in philosophy.
So it was a common kind of connection.

Do you think AI researchers have a role to
be part-time philosophers, or should they stick to the solid science and engineering
without sort of taking the philosophizing tangents? I mean, you work with robots,
you think about what it takes to create intelligent beings. Aren't you the perfect person to think
about the big picture philosophy of it all?

The parts of philosophy that are closest to AI,
I think, or at least the closest to AI that I think about, are stuff like
belief and knowledge and denotation and that kind of stuff. It's quite formal, and it's
like just one step away from the kinds of computer science work that we do kind of routinely.
I think that there are important questions still about what you can do with a machine and what
you can't and so on. Although at least my personal view is that I'm completely a materialist,
and I don't think that there's any reason why we can't make a robot be
behaviorally indistinguishable from a human. And the question of whether it's
distinguishable internally, whether it's a zombie or not in philosophy terms, I actually don't,
I don't know, and I don't know if I care too much about that.

Right, but there are philosophical notions there, mathematical and philosophical,
because we don't know so much of how difficult that is. How difficult is the perception problem?
How difficult is the planning problem? How difficult is it to operate in this world successfully?
Because our robots are not currently as successful as human beings in many tasks.
The question about the gap between current robots and human beings borders a little bit
on philosophy. The expanse of knowledge that's required to operate in this world and the ability
to form common sense knowledge, the ability to reason about uncertainty, much of the work
you've been doing, there's open questions there that, I don't know, require you to activate a certain
big picture view.

To me, that doesn't seem like a philosophical gap at all.
To me, there is a big technical gap. There's a huge technical gap,
but I don't see any reason why it's more than a technical gap.

Perfect. When you mentioned AI, you mentioned SRI, and maybe can you describe to me when you
first fell in love with robotics, with robots, or were inspired? So you mentioned Flakey or Shakey,
and what was the robot that first captured your imagination of what's possible?

Right. The first robot I worked with was Flakey. Shakey was a robot that the SRI people had built,
but by the time, I think, when I arrived, it was sitting in a corner of somebody's office
dripping hydraulic fluid into a pan, but it's iconic. Really, everybody should read the Shakey
tech report because it has so many good ideas in it. They invented A* search and symbolic
planning and learning macro operators. They had low-level kind of configuration space planning for
the robot. They had vision. That's the basic ideas of a ton of things.

Can you take a step back?

Shakey was a mobile robot, but it could push objects,
and so it would move things around.

With which actuator?

With itself, with its base. They had painted the baseboards black,
so it used vision to localize itself in a map. It detected objects. It could detect objects that
were surprising to it. It would plan and replan based on what it saw. It reasoned about whether
to look and take pictures. It really had the basics of so many of the things that we think about now.

How did it represent the space around it?

It had representations at a bunch of different levels of abstraction,
so it had, I think, a kind of an occupancy grid of some sort at the lowest level.
At the high level, it was abstract, symbolic kind of rooms and connectivity.

So where does Flakey come in?

Yeah, okay. I showed up at SRI and we were building a brand new robot. As I said, none of the people
from the previous project were there or involved anymore, so we were starting from scratch.
My advisor was Stan Rosenschein. He ended up being my thesis advisor. He was motivated by this idea
of situated computation or situated automata. The idea was that the tools of logical reasoning were
important, but possibly only for the engineers or designers to use in the analysis of a system,
but not necessarily to be manipulated in the head of the system itself.
So I might use logic to prove a theorem about the behavior of my robot,
even if the robot's not using logic in its head to prove theorems. So that was kind of the
distinction. And so the idea was to kind of use those principles to make a robot do stuff.
But a lot of the basic things we had to kind of learn for ourselves, because I had zero
background in robotics. I didn't know anything about control. I didn't know anything about
sensors. So we reinvented a lot of wheels on the way to getting that robot to do stuff.

Do you think that was an advantage or hindrance?

Oh, no. I'm big in favor of wheel reinvention, actually. I mean, I think you learn a lot
by doing it. It's important though to eventually have the pointers so that you can see what's
really going on. But I think you can appreciate much better the good solutions once you've
messed around a little bit on your own and found a bad one.

Yeah, I think you mentioned reinventing reinforcement learning and referring to
rewards as pleasures, a pleasure, I think, which I think is a nice name for it.
It's more fun, almost. Do you think you could tell the history of AI, machine learning,
reinforcement learning, how you think about it from the 50s to now?

One thing is that it oscillates. So things become fashionable and then they go out and
then something else becomes cool and then it goes out and so on. So there's some interesting
sociological process that actually drives a lot of what's going on. Early days was cybernetics and
control and the idea of homeostasis, people who made these robots that could,
I don't know, try to plug into the wall when they needed power and then come loose and roll
around and do stuff. And then I think over time, they thought, well, that was inspiring, but people
said, no, no, no, we want to get maybe closer to what feels like real intelligence or human
intelligence. And then maybe the expert systems people tried to do that, but maybe a little
too superficially. So we get this surface understanding of what intelligence is like,
because I understand how a steel mill works and I can try to explain it to you and you can write
it down in logic and then we can make a computer infer that. And then that didn't work out.
But what's interesting, I think, is when a thing starts to not be working very well,
it's not only do we change methods, we change problems. So it's not like we have better ways
of doing the problem that the expert systems people were trying to do. We have no ways of
trying to do that problem. Oh, yeah, no, I think maybe a few. But we kind of give up on that problem
and we switch to a different problem. And we work that for a while and we make progress.

As a broad community.

As a community.

And there's a lot of people who would argue,
you don't give up on the problem. It's just the decrease in the number of people working on it.
You almost kind of like put it on the shelf. So we'll come back to this 20 years later.

Yeah, I think that's right. Or you might decide that it's malformed. Like you might say,
it's wrong to just try to make something that does superficial symbolic reasoning behave like a
doctor. You can't do that until you've had the sensory motor experience of being a doctor or
something. So there's arguments that say that that problem was not well formed. Or it could be
that it is well formed, but we just weren't approaching it well.

So you mentioned that your
favorite part of logic and symbolic systems is that they give short names for large sets.
So there is some use to this. They use symbolic reasoning. So looking at expert systems
and symbolic computing, what do you think are the roadblocks that were hit in the 80s and 90s?

Okay, so right. So the fact that I'm not a fan of expert systems doesn't mean that I'm not a fan
of some kind of symbolic reasoning. So let's see, roadblocks. Well, the main roadblock, I think,
was the idea that humans could articulate their knowledge effectively into some kind of
logical statements.

So it's not just the cost, the effort, but really just the capability of
doing it.

Right. Because we're all experts in vision, but totally don't have introspective access
into how we do that. Right. And it's true that, I mean, I think the idea was, well, of course,
even people then would know, of course, I wouldn't ask you to please write down the rules that you
use for recognizing a water bottle. That's crazy. And everyone understood that. But we might ask
you to please write down the rules you use for deciding, I don't know, what tie to put on
or how to set up a microphone or something like that. But even those things, I think people maybe,
I think what they found, I'm not sure about this, but I think what they found was that the
so-called experts could give explanations, sort of post hoc explanations, for how and why
they did things, but they weren't necessarily very good. And then they depended on maybe some
kinds of perceptual things, which again, they couldn't really define very well. So I think
fundamentally, the underlying problem with that was the assumption that people
could articulate how and why they make their decisions.

Right. So it's almost encoding the
knowledge, converting from expert to something that a machine can understand and reason with.

No, no, no, not even just encoding, but getting it out of you. Not writing it. I mean,
yes, hard also to write it down for the computer. But I don't think that people can
produce it. You can tell me a story about why you do stuff. But I'm not so sure that's the why.

Great. So there are still, on the hierarchical planning side,
places where symbolic reasoning is very useful. So as you've talked about, so
where's the gap?

Yeah, okay, good. So saying that humans can't provide a
description of their reasoning processes, that's okay, fine. But that doesn't mean that it's not
good to do reasoning of various styles inside a computer. Those are just two orthogonal points.
So then the question is, what kind of reasoning should you do inside a computer?
Right. And the answer is, I think you need to do all different kinds of reasoning inside
a computer, depending on what kinds of problems you face.

I guess the question is, what kind of
things can you encode symbolically so you can reason about?

I think the idea about, and
even symbolic, I don't even like that terminology because I don't know what it means technically
and formally. I do believe in abstractions. So abstractions are critical, right? You cannot
reason at completely fine grain about everything in your life, right? You can't make a plan at the
level of images and torques for getting a PhD. So you have to reduce the size of the state space
and you have to reduce the horizon if you're going to reason about getting a PhD or even buying
the ingredients to make dinner. And so how can you reduce the spaces and the horizon of the
reasoning you have to do? And the answer is abstraction: spatial abstraction, temporal
abstraction. I think abstraction along the lines of goals is also interesting, like you might,
or well, abstraction and decomposition. Goals is maybe more of a decomposition thing.
So I think that's where these kinds of, if you want to call it symbolic or discrete,
models come in. You talk about a room of your house instead of your pose. You talk about
doing something during the afternoon instead of at 2:54. And you do that because it makes
your reasoning problem easier and also because you don't have enough information
to reason in high fidelity about the pose of your elbow at 2:35 this afternoon anyway.

Right. When you're trying to get a PhD.

Right. Or when you're doing anything really.

Yeah, okay. Except for at that moment. At that moment,
you do have to reason about the pose of your elbow, maybe. But then maybe you do that in some
continuous joint space kind of model. And so again, my biggest point about all of this is that
there should be, the dogma is not the thing, right? It shouldn't be that I am in favor of or
against symbolic reasoning and you're in favor of or against neural networks. It should be that just
computer science tells us what the right answer to all these questions is if we were smart enough
to figure it out.

Yeah. When you try to actually solve the problem with computers, the right answer
comes out. You mentioned abstractions. I mean, neural networks form abstractions, or rather,
there's automated ways to form abstractions and there's expert-driven ways to form abstractions
and expert human-driven ways. And humans just seem to be way better at forming abstractions
currently on certain problems. So when you're referring to 2:45 a.m. versus afternoon,
how do we construct that taxonomy? Is there any room for automated construction of such
abstractions?

Oh, I think eventually, yeah. I mean, I think when we get to be better
machine learning engineers, we'll build algorithms that build awesome abstractions.

That are useful in this kind of way that you're describing.

Yeah.

So let's then step from
the abstraction discussion and let's talk about POMDPs,
partially observable Markov decision processes. So uncertainty. So first, what are Markov decision
processes? Maybe how much of our world can be modeled as
MDPs? How much, when you wake up in the morning and you're making breakfast, how do you think
of yourself as an MDP? So how do you think about MDPs and how they relate to our world?

Well, so there's a stance question, right? So a stance is a position that I take with
respect to a problem. So I as a researcher or a person who designs systems can decide to make
a model of the world around me in some terms. So I take this messy world and I say, I'm going to
treat it as if it were a problem of this formal kind, and then I can apply solution concepts
or algorithms or whatever to solve that formal thing, right? So of course, the world is not
anything. It's not an MDP or a POMDP. I don't know what it is, but I can model aspects of it
in some way or some other way. And when I model some aspect of it in a certain way, that gives me
some set of algorithms I can use.

You can model the world in all kinds of ways. Some
are more accepting of uncertainty, more easily modeling uncertainty of the world. Some really
force the world to be deterministic. And so certainly MDPs model the uncertainty of the world.

Yes. They model some uncertainty. They model not present state uncertainty, but they model uncertainty
in the way the future will unfold.

Right. So what are Markov decision processes?

So a Markov decision process is a model. It's a kind of a model that you can make that says,
I know completely the current state of my system. And what it means to be a state is that I have
all the information right now that will let me make predictions about the future as well as I
can, so that remembering anything about my history wouldn't make my predictions any better.
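As a rough illustration of the kind of model being described in this answer, here is a minimal Python sketch of an MDP transition model. It is only a sketch under stated assumptions: the states ("hallway", "room"), the actions ("go", "stay"), and every probability are invented for the example and do not come from the conversation. The point it shows is that the next state is sampled from a distribution that depends only on the current state and the chosen action, with no history passed in.

```python
import random

# Hypothetical toy model: states, actions and probabilities are invented.
# TRANSITIONS[state][action] is a list of (next_state, probability) pairs.
TRANSITIONS = {
    "hallway": {
        "go": [("room", 0.8), ("hallway", 0.2)],
        "stay": [("hallway", 1.0)],
    },
    "room": {
        "go": [("hallway", 0.9), ("room", 0.1)],
        "stay": [("room", 1.0)],
    },
}


def step(state, action, rng):
    """Sample the next state.

    Only the current state and the action are consulted (the Markov
    property); the outcome is probabilistic rather than deterministic.
    """
    outcomes = TRANSITIONS[state][action]
    next_states = [s for s, _ in outcomes]
    probabilities = [p for _, p in outcomes]
    return rng.choices(next_states, weights=probabilities, k=1)[0]


def rollout(start, actions, seed=0):
    """Apply a fixed sequence of actions and return the visited states."""
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for action in actions:
        state = step(state, action, rng)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(rollout("hallway", ["go", "go", "stay", "go"]))
```

Because `step` never receives the trajectory, remembering more of the past cannot change its predictions, which is the sense of "state" used here.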
But then it also says that then I can take some actions that might change the state of the world,
and that I don't have a deterministic model of those changes. I have a probabilistic model
of how the world might change. It's a useful model for some kinds of systems. I mean, it's
certainly not a good model for most problems, I think, because for most problems, you don't
actually know the state. For most problems, it's partially observed. So that's now a different
problem class.

So okay, that's where the POMDPs, the partially observable Markov decision
processes, step in. So how do they address the fact that you can't observe most, that you have incomplete
information about most of the world around you?

Right. So now the idea is we still kind of postulate
that there exists a state. We think that there is some information about the world out there
such that if we knew that we could make good predictions, but we don't know the state.
And so then we have to think about how, but we do get observations. Maybe I get images or I hear
things or I feel things, and those might be local or noisy. And so therefore they don't tell me
everything about what's going on. And then I have to reason about, given the history of actions
I've taken and observations I've gotten, what do I think is going on in the world? And then
given my own kind of uncertainty about what's going on in the world, I can decide what actions to
take.

And so how difficult is this problem of planning under uncertainty, in your view and your
long experience of modeling the world, trying to deal with this uncertainty,
especially in real-world systems?

Optimal planning for even discrete POMDPs can be
undecidable depending on how you set it up. And so lots of people say, I don't use POMDPs
because they are intractable. And I think that that's a kind of a very funny thing to say, because
the problem you have to solve is the problem you have to solve. So if the problem you have to
solve is intractable, that's what makes us AI people, right? So we solve, we understand that
the problem we're solving is wildly intractable, that we will never be able to solve it optimally,
at least I don't. Yeah, right. So later we can come back to an idea about bounded optimality
and something. But anyway, we can't come up with optimal solutions to these problems.
So we have to make approximations: approximations in modeling, approximations in solution algorithms,
and so on. And so I don't have a problem with saying, yeah, my problem actually is a POMDP in
continuous space with continuous observations. And it's so computationally complex, I can't
even think about its, you know, big O whatever. But that doesn't prevent me, it helps me,
gives me some clarity to think about it that way, and to then take steps to make approximation
after approximation to get down to something that's like computable in some reasonable time.

When you think about optimality, you know, the community broadly has shifted on that, I think,
a little bit in how much they value the idea of optimality, of chasing an optimal solution.
How have your views of chasing an optimal solution changed over the years when you work with robots?

That's interesting. I think we have a little bit of a methodological crisis, actually,
from the theoretical side. I mean, I do think that theory is important and that right now we're not
doing much of it. So there's lots of empirical hacking around and training this and doing that
and reporting numbers. But is it good? Is it bad? We don't know. It's very hard to say things.
And if you look at like computer science theory, so people talked for a while,
everyone was about solving problems optimally or completely. And then there were interesting
relaxations. So people look at, oh, are there regret bounds? Or can I do some kind of,
you know, approximation? Can I prove something that I can approximately solve this problem or
that I get closer to the solution as I spend more time and so on? What's interesting, I think,
is that we don't have good approximate solution concepts for very difficult problems. Right?
I like to, you know, I like to say that I'm interested in doing a very bad job of very big
problems.

Right. So very bad job, very big problems.

I like to do that. But I wish I could say
something. I wish I had a, I don't know, some kind of a formal solution concept
that I could use to say, oh, this algorithm actually, it gives me something. Like, I know
what I'm going to get. I can do something other than just run it and get out.

So that notion
is still somewhere deeply compelling to you. The notion that you can say, you can drop a
thing on the table that says this, you can expect this, this algorithm will give me some good results.

I hope there's, I hope science will, I mean, there's engineering and there's science.
I think that they're not exactly the same. And I think right now we're making huge engineering,
like, leaps and bounds. So the engineering is running way ahead of the science, which is cool,
and often how it goes, right? So we're making things and nobody knows how and why they work,
roughly. But we need to turn that into science.

There's some form. It's, yeah,
there's some room for formalizing.

We need to know what the principles are. Why does this work?
Why does that not work? I mean, for a while people built bridges by trying, but now we can often
predict whether it's going to work or not without building it. Can we do that for learning systems
or for robots?

See, your hope is from a materialistic perspective that intelligence,
artificial intelligence systems, robots are kind of just fancier bridges.
Belief space. What's the difference between belief space and state space? So we mentioned
MDPs, POMDPs, you reasoning about, you sense the world, there's a state. What's this belief
space idea? Yeah. Okay, that sounds good. It sounds good.

So belief space, that is, instead of
thinking about what's the state of the world and trying to control that as a robot, I think about
what is the space of beliefs that I could have about the world. If I think of a belief
as a probability distribution over the ways the world could be, a belief state is a distribution,
and then my control problem, if I'm reasoning about how to move through a world I'm uncertain about,
my control problem is actually the problem of controlling my beliefs. So I think about taking
actions, not just what effect they'll have on the world outside, but what effect they'll have on my
own understanding of the world outside. And so that might compel me to ask a question or look
somewhere to gather information, which may not really change the world state, but it changes
my own belief about the world.

That's a powerful way to empower the agent to reason about the
world, to explore the world. What kind of problems does it allow you to solve, to
consider belief space versus just state space?

Well, any problem that requires deliberate
information gathering. So in some problems, like chess, there's no uncertainty, or maybe
there's uncertainty about the opponent. There's no uncertainty about the state.
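The idea of a belief as a probability distribution over world states, updated by actions that gather information without changing the world, can be sketched with a discrete Bayes filter. This is a minimal illustration, not the speaker's method: the two states ("object_left", "object_right"), the "look" action, the sensor accuracy of 0.8, and the 0.9 confidence threshold are all hypothetical, and the threshold rule is a simple stand-in for real belief-space planning.

```python
def update_belief(belief, action, observation, transition, observation_model):
    """One step of a discrete Bayes filter.

    belief:            dict state -> probability
    transition:        dict (state, action) -> dict next_state -> probability
    observation_model: dict (next_state, action) -> dict observation -> probability
    """
    states = list(belief)
    # Prediction: push the belief through the action's transition model.
    predicted = {
        s2: sum(belief[s] * transition[(s, action)].get(s2, 0.0) for s in states)
        for s2 in states
    }
    # Correction: weight each state by how well it explains the observation.
    unnormalized = {
        s2: observation_model[(s2, action)].get(observation, 0.0) * predicted[s2]
        for s2 in states
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: p / total for s2, p in unnormalized.items()}


def choose_action(belief, threshold=0.9):
    """Hypothetical rule: gather information until the belief is confident."""
    return "reach" if max(belief.values()) >= threshold else "look"


# Invented model: looking leaves the world unchanged, and the sensor reports
# the correct side with probability 0.8.
STATES = ["object_left", "object_right"]
TRANSITION = {(s, "look"): {s: 1.0} for s in STATES}
OBSERVATION = {
    ("object_left", "look"): {"see_left": 0.8, "see_right": 0.2},
    ("object_right", "look"): {"see_left": 0.2, "see_right": 0.8},
}

if __name__ == "__main__":
    belief = {"object_left": 0.5, "object_right": 0.5}
    while choose_action(belief) == "look":
        belief = update_belief(belief, "look", "see_left", TRANSITION, OBSERVATION)
        print(belief)
```

Here "look" never changes the world state, yet it moves the belief from an even split to about 0.8 and then about 0.94 for "object_left", at which point the threshold rule stops gathering information and acts.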
And some problems, there's uncertainty, but you gather information as you go. You might say,
oh, I'm driving my autonomous car down the road, and it doesn't know perfectly where it is, but
the lidars are all going all the time. So I don't have to think about whether to gather information.
But if you're a human driving down the road, you sometimes look over your shoulder to see what's
going on behind you in the lane. And you have to decide whether you should do that now. And you
have to trade off the fact that you're not seeing in front of you, and you're looking behind you,
and how valuable is that information, and so on. And so to make choices about information
gathering, you have to reason in belief space. Also to just take into account your own uncertainty
before trying to do things. So you might say, if I understand where I'm standing relative to the
doorjamb pretty accurately, then it's okay for me to go through the door. But if I'm really not
sure where the door is, then it might be better to not do that right now.

The degree of your
uncertainty about the world is actually part of the thing you're trying to optimize in forming the
plan, right? So this idea of a long horizon of planning for a PhD or just even how to get out
of the house or how to make breakfast, you showed this presentation of the WTF, where's the fork,
of a robot looking at a sink. And can you describe how we plan in this world, this idea of hierarchical
planning we've mentioned? Yeah, how can a robot hope to plan about something with such a long
horizon where the goal is quite far away?

People since probably reasoning began have thought about
hierarchical reasoning, the temporal hierarchy in particular. Well, there's spatial hierarchy,
but let's talk about temporal hierarchy. So you might say, oh, I have this long
execution I have to do, but I can divide it into some segments abstractly, right? So maybe
I have to get out of the house, I have to get in the car, I have to drive, and so on. And so
you can plan if you can build abstractions. So this, we started out by talking about abstractions,
and we're back to that now. If you can build abstractions in your state space,
and abstractions, sort of temporal abstractions, then you can make plans at a high level. And you
can say, I'm going to go to town, and then I'll have to get gas, and I can go here, and I can do
this other thing. And you can reason about the dependencies and constraints among these actions,
again, without thinking about the complete details. What we do in our hierarchical planning work is
then say, all right, I make a plan at a high level of abstraction. I have to have some
reason to think that it's feasible without working it out in complete detail. And that's
actually the interesting step. I always like to talk about walking through an airport, like
you can plan to go to New York and arrive at the airport, and then find yourself in an office
building later. You can't even tell me in advance what your plan is for walking through the airport,
partly because you're too lazy to think about it maybe, but partly also because you just don't
have the information. You don't know what gate you're landing in or what people are going to be
in front of you or anything. So there's no point in planning in detail. But you have to have,
you have to make a leap of faith that you can figure it out once you get there. And it's really
interesting to me how you arrive at that. How do you, so you have learned over your lifetime to be
able to make some kinds of predictions about how hard it is to achieve some kinds of subgoals.
And that's critical. Like you would never plan to fly somewhere if you couldn't,
didn't have a model of how hard it was to do some of the intermediate steps.
So one of the things we're thinking about now is how do you do this kind of very aggressive
generalization to situations that you haven't been in and so on, to predict how long will it
take to walk through the Kuala Lumpur airport? Like you could give me an estimate and it wouldn't
be crazy. And you have to have an estimate of that in order to make plans that involve
walking through the Kuala Lumpur airport, even if you don't need to know it in detail.
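The pattern described above, choosing a high-level plan from rough estimates of how hard each abstract step is and working out the details only once you get there, can be sketched as follows. This is a toy illustration under stated assumptions, not the hierarchical planner from the speaker's research: the step names, the minute estimates, the two candidate plans, and the `observe` callback are all hypothetical.

```python
# Hypothetical duration estimates (minutes) for abstract steps. In the
# conversation these correspond to learned guesses such as "how long does it
# take to walk through an airport"; the numbers here are invented.
ESTIMATED_MINUTES = {
    "leave_house": 10,
    "drive_to_airport": 40,
    "walk_through_airport": 30,
    "fly": 180,
    "walk_to_station": 20,
    "take_train": 300,
}

# Two invented abstract plans for the same trip.
ABSTRACT_PLANS = {
    "by_plane": ["leave_house", "drive_to_airport", "walk_through_airport", "fly"],
    "by_train": ["leave_house", "walk_to_station", "take_train"],
}


def estimated_cost(plan):
    """Cost of an abstract plan using only the coarse per-step estimates."""
    return sum(ESTIMATED_MINUTES[step] for step in plan)


def choose_plan(plans):
    """Pick the plan with the lowest estimated cost, without refining any step."""
    return min(plans, key=lambda name: estimated_cost(plans[name]))


def refine(step, situation):
    """Expand one abstract step using information only available on arrival."""
    if step == "walk_through_airport":
        return [f"follow signs to gate {situation['gate']}"]
    return [step]  # other steps are left abstract in this sketch


def execute(plan, observe):
    """Refine each step lazily, at the moment it is reached."""
    actions = []
    for step in plan:
        situation = observe(step)  # e.g. read the departure board
        actions.extend(refine(step, situation))
    return actions


if __name__ == "__main__":
    best = choose_plan(ABSTRACT_PLANS)
    print(best, estimated_cost(ABSTRACT_PLANS[best]))
    print(execute(ABSTRACT_PLANS[best], lambda step: {"gate": "B7"}))
```

The gate is unknown when `choose_plan` runs, so the airport step is committed to on the strength of its estimate alone and only expanded inside `execute`.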
So I'm really interested in these kinds of abstract models and how do we acquire them.
link |
But once we have them, we can use them to do hierarchical reasoning, which I think is very
link |
important. Yeah, there's this notion of goal regression and preimage backchaining.
link |
This idea of starting at the goal and just forming these big clouds of states. I mean,
link |
it's almost like saying to the airport, you know, once you show up to the airport,
link |
you're like a few steps away from the goal. So thinking of it this way is kind of interesting.
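A minimal sketch of the goal-regression idea mentioned here, assuming a tiny made-up deterministic domain (the state names and the `TRANSITIONS` table are hypothetical, not from the conversation). Starting at the goal, each step computes a preimage: the set of states from which a single action leads into the set already known to reach the goal. The growing layers are the "clouds of states".

```python
# Hypothetical toy domain: state -> {action: next_state}.
TRANSITIONS = {
    "home": {"taxi": "airport_curb"},
    "airport_curb": {"walk": "security"},
    "security": {"walk": "gate"},
    "gate": {"board": "plane"},
    "plane": {},
    "office": {"taxi": "airport_curb", "walk": "home"},
}


def preimage(targets):
    """States outside `targets` with at least one action leading into `targets`."""
    return {
        state
        for state, actions in TRANSITIONS.items()
        if state not in targets and any(nxt in targets for nxt in actions.values())
    }


def backchain(goal):
    """Grow layers of states backward from the goal until no new state is added."""
    reached = {goal}
    layers = [{goal}]
    while True:
        layer = preimage(reached)
        if not layer:
            return layers
        layers.append(layer)
        reached |= layer


layers = backchain("plane")
for steps, layer in enumerate(layers):
    print(steps, sorted(layer))
```

In this toy domain, once you are at the airport curb you are three layers from the goal, whichever state you started in.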
link |
I don't know if you have further comments on that of starting at the goal. Yeah, I mean,
link |
it's interesting that Herb Simon back in the early days of AI talked a lot about
link |
means-ends reasoning and reasoning back from the goal. There's a kind of an intuition that people
link |
have that the state space is big, the number of actions you could take is really big.
link |
So if you say, here I sit and I want to search forward from where I am, what are all the things
link |
I could do? That's just overwhelming. If you say, if you can reason at this other level and say,
link |
here's what I'm hoping to achieve, what can I do to make that true that somehow the
link |
branching is smaller? Now, what's interesting is that like in the AI planning community,
link |
that hasn't worked out in the class of problems that they look at and the methods that they tend
link |
to use, it hasn't turned out that it's better to go backward. It's still kind of my intuition
link |
that it is, but I can't prove that to you right now. Right. I share your intuition, at least for us
link |
mere humans. Speaking of which, maybe now we take a little step into that
link |
philosophy circle, how hard would it, when you think about human life, you give those examples
link |
often, how hard do you think it is to formulate human life as a planning problem or aspects of
link |
human life? So when you look at robots, you're often trying to think about object manipulation,
link |
tasks about moving a thing. When you take a slight step outside the room, let the robot
link |
leave and go get lunch or maybe try to pursue more fuzzy goals. How hard do you think is that
link |
problem? If you were to try to maybe put another way, try to formulate human life as a planning
link |
problem. Well, that would be a mistake. I mean, it's not all a planning problem, right? I think
link |
it's really, really important that we understand that you have to put together pieces and parts
link |
that have different styles of reasoning and representation and learning. I think it seems
link |
probably clear to anybody that it can't all be this or all be that. Brains aren't all like this
link |
or all like that, right? They have different pieces and parts and substructure and so on.
link |
So I don't think that there's any good reason to think that there's going to be like one true
link |
algorithmic thing that's going to do the whole job. Just a bunch of pieces together,
link |
designed to solve a bunch of specific problems. Or maybe styles of problems. I mean,
link |
there's probably some reasoning that needs to go on in image space. I think, again,
link |
there's this model-based versus model-free idea, right? So in reinforcement learning,
link |
people talk about, oh, should I learn? I could learn a policy just straight up a way of behaving.
link |
I could learn, and it's popular to learn, a value function. That's some kind of weird intermediate ground.
link |
Or I could learn a transition model, which tells me something about the dynamics of the world.
link |
If I take a, imagine that I learn a transition model and I couple it with a planner and I
link |
draw a box around that, I have a policy again. It's just stored a different way, right?
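A minimal sketch of that point, assuming a made-up five-cell corridor (the `transition` model, the `GOAL` cell, and the breadth-first planner are illustrative choices, not anything described in the conversation). A transition model coupled with a planner maps states to actions, which is exactly what a stored policy table does.

```python
# Hypothetical 1-D corridor: states 0..4, goal at state 4.
GOAL = 4
ACTIONS = (-1, +1)


def transition(state, action):
    """Assumed transition model: move one cell, clipped to the corridor."""
    return min(max(state + action, 0), GOAL)


def planned_policy(state, horizon=8):
    """Policy computed on demand: breadth-first search through the model."""
    frontier = [(state, [])]
    seen = {state}
    for _ in range(horizon):
        next_frontier = []
        for current, path in frontier:
            for action in ACTIONS:
                nxt = transition(current, action)
                if nxt == GOAL:
                    # Return the first action of the shortest plan found.
                    return (path + [action])[0]
                if nxt not in seen:
                    seen.add(nxt)
                    next_frontier.append((nxt, path + [action]))
        frontier = next_frontier
    return None


# The same policy stored explicitly: more memory, but a constant-time lookup.
table_policy = {state: planned_policy(state) for state in range(GOAL)}
print(table_policy)
```

Both objects answer the same question, "what should I do in this state?"; they differ only in how much is stored versus how much is computed at decision time.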
link |
But it's just as much of a policy as the other policy. It's just I've made, I think,
link |
the way I see it is it's a time space trade off in computation, right? A more overt policy
link |
representation. Maybe it takes more space, but maybe I can compute quickly what action I should
link |
take. On the other hand, maybe a very compact model of the world dynamics plus a planner
link |
lets me compute what action to take too, just more slowly. There's no, I mean, I don't think,
link |
there's no argument to be had. It's just like a question of what form of computation is best
link |
for us. For the various sub problems. Right. So, and so like learning to do algebra manipulations
link |
for some reason is, I mean, that's probably going to want naturally a sort of a different
link |
representation than riding a unicycle. The time constraints on the unicycle are serious.
link |
The space is maybe smaller. I don't know. But so there could be the more human side of
link |
falling in love, having a relationship. That might be another style. No idea how to model
link |
that. Yeah, let's first solve the algebra and the object manipulation. What do you think
link |
is harder, perception or planning? Perception. That's why I'm understanding that's why.
link |
So what do you think is so hard about perception by understanding the world around you?
link |
Well, I mean, I think the big question is representational. Hugely, the question is
link |
representation. So perception has made great strides lately, right? And we can classify images and we
link |
can play certain kinds of games and predict how to steer the car and all this sort of stuff.
link |
I don't think we have a very good idea of what perception should deliver, right? So if you
link |
believe in modularity, okay, there's a very strong view which says
link |
we shouldn't build in any modularity, we should make a giant gigantic neural network,
link |
train it end to end to do the thing. And that's the best way forward.
link |
And it's hard to argue with that except on a sample complexity basis, right? So you might say,
link |
oh, well, if I want to do end to end reinforcement learning on this giant giant neural network,
link |
it's going to take a lot of data and a lot of like broken robots and stuff. So
link |
then the only answer is to say, okay, we have to build something in, build in some structure
link |
or some bias. We know from the theory of machine learning that the only way to cut down the sample
link |
complexity is to kind of cut down somehow cut down the hypothesis space, you can do that by
link |
building in bias. There's all kinds of reasons to think that nature built bias into humans.
link |
Convolution is a bias, right? It's a very strong bias and it's a very critical bias.
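A back-of-the-envelope sketch of why convolution counts as a strong bias, assuming a single-channel 32x32 image and a 3x3 kernel (both sizes are arbitrary choices for illustration). Sharing one small filter across every image location shrinks the hypothesis space from roughly a million free weights to nine.

```python
def dense_params(height, width):
    """Weights in a fully connected map from an HxW image to an HxW output."""
    n = height * width
    return n * n


def conv_params(kernel_size):
    """Weights in one shared kernel_size x kernel_size filter (one channel, no bias term)."""
    return kernel_size * kernel_size


dense = dense_params(32, 32)
conv = conv_params(3)
print(dense, conv)
```

Fewer free parameters means fewer hypotheses to choose among, which is the sample-complexity argument made above.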
link |
So my view is that we should look for more things that are like convolution, but that address other
link |
aspects of reasoning, right? So convolution helps us a lot with a certain kind of spatial
link |
reasoning that's quite close to the imaging. I think there's other ideas like that,
link |
maybe some amount of forward search, maybe some notions of abstraction, maybe the notion that
link |
objects exist, actually, I think that's pretty important. And a lot of people won't give you
link |
that to start with, right? So almost like a convolution in the
link |
object, semantic object space, or some kind of ideas in there. That's right.
link |
And people are, like, graph convolutions are an idea that is related to
link |
relational representations. And so I think there are, so, I've come far afield from perception,
link |
but I think the thing that's going to make perception take that kind of next step is
link |
actually understanding better what it should produce, right? So what are we going to do with
link |
the output of it, right? It's fine when what we're going to do with the output is steer,
link |
it's less clear when we're just trying to make one integrated intelligent agent,
link |
what should the output of perception be? We have no idea. And how should that hook up to the other
link |
stuff? We don't know. So I think the pressing question is, what kinds of structure can we
link |
build in that are like the moral equivalent of convolution that will make a really awesome
link |
superstructure that then learning can kind of progress on efficiently?
link |
I agree. Very compelling description of actually where we stand with the perception problem.
link |
You're teaching a course on embodied intelligence. What do you think it takes to
link |
build a robot with human level intelligence? I don't know. If we knew, we would do it.
link |
If you were to, I mean, okay, so do you think a robot needs to have a self awareness,
link |
consciousness, fear of mortality? Or is it, is it simpler than that? Or is consciousness a simple
link |
thing? Do you think about these notions? I don't think much about consciousness. Even most philosophers
link |
who care about it will give you that you could have robots that are zombies, right, that behave
link |
like humans but are not conscious. And I, at this moment, would be happy enough for that. So I'm not
link |
really worried one way or the other. So the technical side, you're not thinking of the use of self
link |
awareness? Well, but I, okay, but then what does self awareness mean? I mean, that you need to have
link |
some part of the system that can observe other parts of the system and tell whether they're
link |
working well or not. That seems critical. So does that count as, I mean, does that count as
link |
self awareness or not? Well, it depends on whether you think that there's somebody at home who can
link |
articulate whether they're self aware. But clearly, if I have like, you know, some piece of code
link |
that's counting how many times this procedure gets executed, that's a kind of self awareness,
link |
right? So there's a big spectrum. It's clear you have to have some of it.
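The "piece of code that's counting how many times this procedure gets executed" can be sketched in a few lines. The `counted` decorator and the `grasp` procedure are hypothetical names used only for illustration.

```python
import functools


def counted(func):
    """Wrap func so the wrapper records how many times it has been called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@counted
def grasp(obj):
    """Hypothetical robot procedure, standing in for any monitored routine."""
    return f"grasped {obj}"


grasp("cup")
grasp("block")
print(grasp.calls)
```

This sits at the very low end of the spectrum: one part of the system observing another, with nobody "at home".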
link |
Right. You know, we're quite far away on many dimensions, but what is the direction of research
link |
that's most compelling to you for, you know, trying to achieve human level intelligence
link |
in our robots? Well, to me, I guess the thing that seems most compelling to me at the moment is this
link |
question of what to build in and what to learn. I think we're, we don't, we're missing a bunch of
link |
ideas. And, and we, you know, people, you know, don't you dare ask me how many years it's going
link |
to be until that happens, because I won't even participate in the conversation. Because I think
link |
we're missing ideas and I don't know how long it's going to take to find them. So I won't ask you
link |
how many years, but maybe I'll ask you what it, when you will be sufficiently impressed that we've
link |
achieved it. So what's a good test of intelligence? Do you like the Turing test and natural language
link |
in the robotic space? Is there something where you would sit back and think, oh, that's pretty
link |
impressive as a test, as a benchmark. Do you think about these kinds of problems?
link |
No, I resist. I mean, I think all the time that we spend arguing about those kinds of things could
link |
be better spent just making their robots work better. So you don't value competition. So I mean,
link |
there's a nature of benchmark, benchmarks and data sets, or Turing test challenges, where
link |
everybody kind of gets together and tries to build a better robot because they want to outcompete
link |
each other, like the DARPA challenge with the autonomous vehicles. Do you see the value of that?
link |
Or can it get in the way? I think it can get in the way. I mean, some people, many people find it
link |
motivating. And so that's good. I find it anti motivating personally. But I think you get an
link |
interesting cycle where for a contest, a bunch of smart people get super motivated and they hack
link |
their brains out. And much of what gets done is just hacks, but sometimes really cool ideas emerge.
link |
And then that gives us something to chew on after that. So it's not a thing for me, but I don't
link |
regret that other people do it. Yeah, it's like you said, with everything else that
link |
makes us good. So jumping topics a little bit, you started the Journal of Machine Learning Research
link |
and served as its editor in chief. How did the publication come about?
link |
And what do you think about the current publishing model space in machine learning
link |
artificial intelligence? Okay, good. So it came about because there was a journal called Machine
link |
Learning, which still exists, which was owned by Kluwer. And I was on the editorial
link |
board and we used to have these meetings annually where we would complain to Kluwer that
link |
it was too expensive for the libraries and that people couldn't publish. And we would really
link |
like to have some kind of relief on those fronts. And they would always sympathize,
link |
but not do anything. So we just decided to make a new journal. And there was the Journal of AI
link |
Research, which was on the same model, which had been in existence for maybe five years or so,
link |
and it was going along pretty well. So we just made a new journal. It wasn't, I mean,
link |
I don't know, I guess it was work, but it wasn't that hard. So basically the editorial board,
link |
probably 75% of the editorial board of Machine Learning resigned. And we founded the new journal.
link |
But it was sort of it was more open. Yeah, right. So it's completely open. It's open access.
link |
Actually, I had a postdoc, George Konidaris, who wanted to call these journals free for all.
link |
Because there were I mean, it both has no page charges and has no
link |
access restrictions. And the reason and so lots of people, I mean, for there were,
link |
there were people who are mad about the existence of this journal who thought it was a fraud or
link |
something, it would be impossible, they said, to run a journal like this with basically,
link |
I mean, for a long time, I didn't even have a bank account. I paid for the
link |
lawyer to incorporate and the IP address. And it just cost a couple hundred dollars a year
link |
to run. It's a little bit more now, but not that much more. But that's because I think computer
link |
scientists are competent and autonomous in a way that many scientists in other fields aren't.
link |
I mean, at doing these kinds of things, we already typeset our own papers,
link |
we all have students and people who can hack a website together in the afternoon.
link |
So the infrastructure for us was like, not a problem, but for other people in other fields,
link |
it's a harder thing to do. Yeah. And this kind of open access journal is nevertheless,
link |
one of the most prestigious journals. So it's not like, prestige can be achieved
link |
without any of the paper. Paper is not required for prestige, it turns out. Yeah.
link |
So on the review process side, actually, a long time ago, I don't remember when, I reviewed a paper
link |
where you were also a reviewer and I remember reading your review and being influenced by it.
link |
It was really well written. It influenced how I write future reviews. You disagreed with me,
link |
actually. And you made it my review, but much better. But nevertheless, the review process
link |
has its flaws. And what do you think works well? How can it be improved?
link |
So actually, when I started JMLR, I wanted to do something completely different.
link |
And I didn't because it felt like we needed a traditional journal of record and so we just
link |
made JMLR be almost like a normal journal, except for the open access parts of it, basically.
link |
Increasingly, of course, publication is not even a sensible word. You can publish something by
link |
putting it on arXiv so I can publish everything tomorrow. So making stuff public is
link |
there's no barrier. We still need curation and evaluation. I don't have time to read all of
link |
arXiv. And you could argue that kind of social thumbs-upping of articles suffices, right? You
link |
might say, oh, heck with this, we don't need journals at all. We'll put everything on arXiv
link |
and people will upvote and downvote the articles and then your CV will say, oh, man, he got a lot
link |
of upvotes. So that's good. But I think there's still value in careful reading and commentary of
link |
things. And it's hard to tell when people are upvoting and downvoting or arguing about your
link |
paper on Twitter and Reddit, whether they know what they're talking about. So then I have the
link |
second order problem of trying to decide whose opinions I should value and such. So I don't
link |
know. If I had infinite time, which I don't, and I'm not going to do this because I really want to
link |
make robots work, but if I felt inclined to do something more in a publication direction,
link |
I would do this other thing, which I thought about doing the first time, which is to get
link |
together some set of people whose opinions I value and who are pretty articulate. And I guess we
link |
would be public, although we could be private, I'm not sure. And we would review papers. We wouldn't
link |
publish them and you wouldn't submit them. We would just find papers and we would write reviews
link |
and we would make those reviews public. And maybe if you, you know, so we're Leslie's friends who
link |
review papers and maybe eventually if we, our opinion was sufficiently valued, like the opinion
link |
of JMLR is valued, then you'd say on your CV that Leslie's friends gave my paper a five-star rating
link |
and that would be just as good as saying I got it accepted into this journal. So I think we
link |
should have good public commentary and organize it in some way, but I don't really know how to
link |
do it. It's interesting times. The way you describe it actually is really interesting. I mean,
link |
we do it for movies, IMDb.com. There are experts, critics come in, they write reviews, but there's
link |
also regular, non-critic humans who write reviews, and they're separated. I like OpenReview.
link |
The ICLR process, I think, is interesting. It's a step in the right direction, but it's still
link |
not as compelling as reviewing movies or video games. I mean, it sometimes almost, it might be
link |
silly, at least from my perspective to say, but it boils down to the user interface, how fun and
link |
easy it is to actually perform the reviews, how efficient, how much you as a reviewer get
link |
street cred for being a good reviewer. Those human elements come into play.
link |
No, it's a big investment to do a good review of a paper and the flood of papers is out of control.
link |
There aren't 3,000 new, I don't know how many new movies are there in a year, I don't know,
link |
but there's probably going to be less than how many machine learning papers there are in a year now.
link |
Right, so I'm like an old person, so of course I'm going to say,
link |
things are moving too fast, I'm a stick in the mud. So I can say that, but my particular flavor
link |
of that is, I think the horizon for researchers has gotten very short, that students want to
link |
publish a lot of papers and it's exciting and there's value in that and you get patted on the
link |
head for it and so on. And some of that is fine, but I'm worried that we're driving out people who
link |
would spend two years thinking about something. Back in my day, when we worked on our theses,
link |
we did not publish papers, you did your thesis for years, you picked a hard problem and then you
link |
worked and chewed on it and did stuff and wasted time for a long time. And when it was roughly,
link |
when it was done, you would write papers. And so I don't know how to, and I don't think that
link |
everybody has to work in that mode, but I think there's some problems that are hard enough
link |
that it's important to have a longer research horizon and I'm worried that
link |
we don't incentivize that at all at this point. In this current structure. So what do you see
link |
what are your hopes and fears about the future of AI and continuing on this theme? So AI has
link |
gone through a few winters, ups and downs. Do you see another winter of AI coming?
link |
Or are you more hopeful about making robots work, as you said? I think the cycles are inevitable,
link |
but I think each time we get higher, right? I mean, it's like climbing some kind of
link |
landscape with a noisy optimizer. So it's clear that the deep learning stuff has
link |
made deep and important improvements. And so the high-water mark is now higher. There's no question.
link |
But of course, I think people are overselling and eventually investors, I guess, and other people
link |
look around and say, well, you're not quite delivering on this grand claim and that wild
link |
hypothesis. It's like probably it's going to crash some amount and then it's okay. I mean,
link |
it's okay. I mean, but I don't I can't imagine that there's like some awesome monotonic improvement
link |
from here to human level AI. So in, you know, I have to ask this question, I probably anticipate
link |
answers, the answers. But do you have a worry, short term or long term, about the existential
link |
threats of AI and maybe short term, less existential, but more robots taking away jobs?
link |
Well, actually, let me talk a little bit about utility. Actually, I had an interesting conversation
link |
with some military ethicists who wanted to talk to me about autonomous weapons.
link |
And they're, they were interesting, smart, well educated guys who didn't know too much about AI or
link |
machine learning. And the first question they asked me was, has your robot ever done something you
link |
didn't expect? And I like burst out laughing because anybody who's ever done something with a robot
link |
right knows that they don't do much. And what I realized was that their model of how we program
link |
a robot was completely wrong. Their model of how we program a robot was like
link |
Lego Mindstorms, like, Oh, go forward a meter, turn left, take a picture,
link |
do this, do that. And so if you have that model of programming, then it's true, it's kind of weird
link |
that your robot would do something that you didn't anticipate. But the fact is, and actually,
link |
so now this is my new educational mission, if I have to talk to non experts, I try to teach them
link |
the idea that we don't operate that way; we operate at least one or maybe many levels of abstraction
link |
above that. And we say, Oh, here's a hypothesis class, maybe it's a space of plans, or maybe it's a
link |
space of classifiers, or whatever. But there's some set of answers and an objective function. And
link |
then we work on some optimization method that tries to optimize a solution in that class.
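A minimal sketch of those three ingredients, assuming a toy one-dimensional classification task (the `DATA` points, the threshold `HYPOTHESES`, and the exhaustive-search optimizer are all made up for illustration). The programmer writes the hypothesis class, the objective, and the optimizer, but not the answer itself.

```python
# Hypothetical labelled data: (feature value, label).
DATA = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1), (2.0, 0)]

# Hypothesis class: threshold classifiers of the form "predict 1 if x > t".
HYPOTHESES = [0.0, 1.0, 2.0, 3.0, 4.0]


def objective(threshold):
    """Number of training mistakes made by the classifier with this threshold."""
    return sum((x > threshold) != bool(y) for x, y in DATA)


def optimize():
    """Optimizer: exhaustive search over the hypothesis class."""
    return min(HYPOTHESES, key=objective)


best = optimize()
print(best, objective(best))
```

Nothing in the source code names the chosen threshold; it falls out of the data and the objective, which is why the resulting behavior can surprise the person who wrote the program.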
link |
And we don't know what solution is going to come out. Right. So I think it's important to
link |
communicate that. So I mean, of course, probably people who listen to this, they know that lesson.
link |
But I think it's really critical to communicate that lesson. And then lots of people are now
link |
talking about, you know, the value alignment problem. So you want to be sure, as robots or
link |
software systems get more competent, that their objectives are aligned with your objectives,
link |
or that our objectives are compatible in some way, or we have a good way of mediating when they have
link |
different objectives. And so I think it is important to start thinking in terms, like,
link |
you don't have to be freaked out by the robot apocalypse to accept that it's important to think
link |
about objective functions and value alignment. And that you have to really, everyone who's done
link |
optimization knows that you have to be careful what you wish for that, you know, sometimes you get
link |
the optimal solution. And you realize, man, that was that objective was wrong. So pragmatically,
link |
in the shortest term, it seems to me that those are really interesting and critical questions.
link |
And the idea that we're going to go from being people who engineer algorithms to being people
link |
who engineer objective functions, I think that's, that's definitely going to happen. And that's
link |
going to change our thinking and methodology and stuff.
link |
You started at Stanford in philosophy, then switched to computer science,
link |
and we will go back to philosophy maybe. Well, I mean, they're mixed together because, because,
link |
as we also know, as machine learning people, right? When you design, in fact, this is the
link |
lecture I gave in class today, when you design an objective function, you have to wear both hats.
link |
There's the hat that says, what do I want? And there's the hat that says, but I know what my
link |
optimizer can do to some degree. And I have to take that into account. So it's, it's always a
link |
trade off. And we have to kind of be mindful of that. The part about taking people's jobs,
link |
that I understand that that's important, I don't understand sociology or economics or people
link |
very well. So I don't know how to think about that. So that's, yeah, so there might be a
link |
sociological aspect there, the economic aspect that's very difficult to think about. Okay.
link |
I mean, I think other people should be thinking about it, but I'm just, that's not my strength.
link |
So what do you think is the most exciting area of research in the short term,
link |
for the community and for your, for yourself? Well, so, I mean, there's this story I've been
link |
telling about how to engineer intelligent robots. So that's what we want to do. We all kind of want
link |
to do, well, I mean, some set of us want to do this. And the question is, what's the most effective
link |
strategy? And we've tried, and there's a bunch of different things you could do at the extremes,
link |
right? One super extreme is we do introspection and we write a program. Okay, that has not worked
link |
out very well. Another extreme is we take a giant bunch of neural goo and we try and train it up to
link |
do something. I don't think that's going to work either. So the question is, what's the middle
link |
ground? And again, this isn't a theological question or anything like that. It's just,
link |
like, how do, just how do we, what's the best way to make this work out? And I think it's clear,
link |
it's a combination of learning, to me, it's clear, it's a combination of learning and not learning.
link |
And what should that combination be? And what's the stuff we build in? So to me,
link |
that's the most compelling question. And when you say engineer robots, you mean
link |
engineering systems that work in the real world. That's the emphasis.
link |
Last question, which robots or robot is your favorite from science fiction?
link |
So you can go with Star Wars, R2-D2, or you can go with more modern, maybe HAL.
link |
No, sir, I don't think I have a favorite robot from science fiction.
link |
This is, this is back to, you like to make robots work in the real world here, not, not in.
link |
I mean, I love the process. And I care more about the process.
link |
The engineering process.
link |
Yeah. I mean, I do research because it's fun, not because I care about what we produce.
link |
Well, that's, that's a beautiful note, actually. And Leslie, thank you so much for talking today.
link |
Sure, it's been fun.