
Richard Karp: Algorithms and Computational Complexity | Lex Fridman Podcast #111



link |
00:00:00.000
The following is a conversation with Richard Karp, a professor at Berkeley and one of the most
link |
00:00:05.040
important figures in the history of theoretical computer science. In 1985, he received the
link |
00:00:11.600
Turing Award for his research in the theory of algorithms, including the development of the
link |
00:00:16.720
Edmonds-Karp algorithm for solving the max flow problem on networks, the Hopcroft-Karp algorithm
link |
00:00:23.360
for finding maximum cardinality matchings in bipartite graphs, and his landmark paper in
link |
00:00:29.920
complexity theory called Reducibility Among Combinatorial Problems, in which he proved 21
link |
00:00:36.800
problems to be NP-complete. This paper was probably the most important catalyst in the
link |
00:00:42.080
explosion of interest in the study of NP-completeness and the P versus NP problem in general.
link |
00:00:48.720
Quick summary of the ads. Two sponsors: Eight Sleep mattress and Cash App. Please consider supporting
link |
00:00:54.560
this podcast by going to eightsleep.com/lex and downloading Cash App and using code LEXPODCAST.
link |
00:01:01.840
Click the links, buy the stuff. It really is the best way to support this podcast.
link |
00:01:07.920
If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcast,
link |
00:01:12.320
support it on Patreon, or connect with me on Twitter at lexfridman. As usual, I'll do a few
link |
00:01:17.760
minutes of ads now and never any ads in the middle that can break the flow of the conversation.
link |
00:01:22.880
This show is sponsored by Eight Sleep and its Pod Pro mattress that you can check out at
link |
00:01:28.640
eightsleep.com/lex to get $200 off. It controls temperature with an app. It can cool down to
link |
00:01:36.640
as low as 55 degrees and each side of the bed separately. Research shows the temperature has
link |
00:01:42.720
a big impact on the quality of our sleep. Anecdotally, it's been a game changer for me. I love it.
link |
00:01:48.640
It's been a couple of weeks now. I've just been really enjoying it, both in the fact that I'm
link |
00:01:53.280
getting better sleep and in that it's a smart mattress. Essentially, I kind of imagine it as
link |
00:01:59.120
being the early days of artificial intelligence being a part of every aspect of our lives and
link |
00:02:04.560
certainly infusing AI in one of the most important aspects of life, which is sleep.
link |
00:02:09.040
I think it has a lot of potential for being beneficial. The Pod Pro is packed with sensors that track
link |
00:02:15.040
heart rate, heart rate variability, and respiratory rate, showing it all in their app. The app's health
link |
00:02:21.680
metrics are amazing, but the cooling alone is honestly worth the money. I don't always sleep,
link |
00:02:27.520
but when I do, I choose the Eight Sleep Pod Pro mattress. Check it out at eightsleep.com/lex
link |
00:02:33.120
to get $200 off. And remember, just visiting the site and considering the purchase helps convince
link |
00:02:41.520
the folks at eight sleep that this silly old podcast is worth sponsoring in the future.
link |
00:02:47.280
This show is also presented by the great and powerful Cash App, the number one finance app
link |
00:02:54.240
in the app store. When you get it, use code LEXPODCAST. Cash App lets you send money to friends,
link |
00:03:00.000
buy Bitcoin, and invest in the stock market with as little as $1. It's one of the best designed
link |
00:03:06.000
interfaces of an app that I've ever used. To me, a good design is when everything is easy and natural.
link |
00:03:12.240
Bad design is when the app gets in the way, either because it's buggy or because it tries too hard
link |
00:03:17.840
to be helpful. I'm looking at you, Clippy from Microsoft, even though I love you. Anyway,
link |
00:03:23.360
there's a big part of my brain and heart that loves to design things and also to appreciate
link |
00:03:28.320
great design by others. So again, if you get Cash App from the App Store or Google Play and use the
link |
00:03:33.680
code LEXPODCAST, you get $10, and Cash App will also donate $10 to FIRST, an organization that is helping
link |
00:03:40.800
to advance robotics and STEM education for young people around the world. And now here's my conversation
link |
00:03:47.920
with Richard Karp. You wrote that at the age of 13, you were first exposed to plane geometry
link |
00:03:55.040
and were wonderstruck by the power and elegance of formal proofs. Are there problems, proofs,
link |
00:04:01.520
properties, ideas in plane geometry from that time that you remember being mesmerized by
link |
00:04:07.680
or just enjoyed going through to prove various aspects? So Michael Rabin told me this story
link |
00:04:16.320
about an experience he had when he was a young student who was tossed out of his classroom
link |
00:04:24.000
for bad behavior and was wandering through the corridors of his school
link |
00:04:29.440
and came upon two older students who were studying the problem of finding the shortest
link |
00:04:36.560
distance between two non overlapping circles. And Michael thought about it and said,
link |
00:04:45.520
you take the straight line between the two centers and the segment between the two circles is the
link |
00:04:54.800
shortest because a straight line is the shortest distance between the two centers and any other
link |
00:05:02.240
line connecting the circles would lie on a longer path. And he thought, and I agreed, that this was
link |
00:05:11.120
just elegant, that pure reasoning could come up with such a result. Certainly the shortest distance
link |
00:05:21.120
between the two centers of the circles is a straight line. Could you once again say
link |
00:05:27.840
what's the next step in that proof? Well, any segment joining the two circles
link |
00:05:34.800
if you extend it by taking the radius on each side, you get a path with three
link |
00:05:45.520
edges, which connects the two centers. And this has to be at least as long as the shortest path,
link |
00:05:52.560
which is the straight line. The straight line. Yeah. Well, yeah, that's quite simple. So what
link |
00:05:59.200
is it about that elegance that you just find compelling? Well, just that you could establish
link |
00:06:08.000
a fact about geometry beyond dispute by pure reasoning. I also enjoy the challenge of solving
link |
00:06:21.920
puzzles in plane geometry. It was much more fun than the earlier mathematics courses,
link |
00:06:27.440
which were mostly about arithmetic operations and manipulating them.
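The two-circle argument described above is easy to check numerically. A minimal sketch in Python (the function name and example values are my own, hypothetical choices):

```python
import math

def shortest_distance_between_circles(c1, r1, c2, r2):
    """Shortest distance between two non-overlapping circles.

    Per the argument above: the shortest segment lies along the
    straight line between the centers, because any other connecting
    segment, extended by a radius on each side, yields a three-edge
    path between the centers that is at least as long as that line.
    """
    return math.dist(c1, c2) - r1 - r2

# Circles of radius 1 centered at (0, 0) and (5, 0):
print(shortest_distance_between_circles((0, 0), 1, (5, 0), 1))  # 3.0
```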
link |
00:06:33.200
Was there something about geometry itself, the slightly visual component of it that you can
link |
00:06:38.560
visualize? Oh, yes, absolutely. Although I lacked three dimensional vision. I wasn't very good at
link |
00:06:46.320
three dimensional vision. You mean being able to visualize three dimensional objects?
link |
00:06:49.760
Or surfaces, hyperplanes, and so on. So there, I didn't have an intuition. But
link |
00:07:04.880
for example, the fact that the sum of the angles of a triangle is 180 degrees
link |
00:07:09.040
is proved convincingly. And it comes as a surprise that that can be done.
link |
00:07:21.440
Why is that surprising? Well, it is a surprising idea, I suppose. Why is that proved difficult?
link |
00:07:32.240
It's not. That's the point. It's so easy, and yet it's so convincing.
link |
00:07:36.160
Do you remember what is the proof that it adds up to 180?
link |
00:07:43.280
You start at a corner and draw a line parallel to the opposite side,
link |
00:07:54.240
and that line sort of trisects the angle between the other two sides. And you get a
link |
00:08:09.440
half plane, which has to add up to 180 degrees. And it consists in the angles, by the equality of
link |
00:08:18.720
alternate angles, what's it called? You get a correspondence between the angles created along
link |
00:08:28.880
the side of the triangle and the three angles of the triangle. Has geometry had an impact on,
link |
00:08:37.600
when you look into the future of your work with combinatorial algorithms, has it had some kind
link |
00:08:42.240
of impact in terms of the puzzles, the visual aspects that were first so compelling to you?
link |
00:08:51.360
Not Euclidean geometry, particularly. I think I use tools like linear programming and integer
link |
00:09:00.720
programming a lot, but those require high dimensional visualization. And so I tend to go
link |
00:09:09.920
by the algebraic properties. Right. You go by the linear algebra and not by the
link |
00:09:18.160
visualization. Well, the interpretation in terms of, for example, finding the highest point on
link |
00:09:24.960
a polyhedron, as in linear programming, is motivating. But again, I don't have the high
link |
00:09:35.600
dimensional intuition that would particularly inform me. So I sort of lean on the algebra.
link |
00:09:44.640
So to linger on that point, what kind of visualization do you do when you're trying to
link |
00:09:51.760
think about, we'll get to combinatorial algorithms, but just algorithms in general.
link |
00:09:58.400
What's inside your mind when you're thinking about designing algorithms?
link |
00:10:01.920
Or even just tackling any mathematical problem?
link |
00:10:09.360
Well, I think that usually an algorithm involves a repetition of some inner loop.
link |
00:10:20.320
And so I can sort of visualize the distance from the desired solution
link |
00:10:26.160
as iteratively reducing until you finally hit the exact solution.
link |
00:10:33.120
And try to take steps that get you closer to the…
link |
00:10:35.440
Try to take steps that get closer and having the certainty of converging. So it's basically the
link |
00:10:45.120
mechanics of the algorithm is often very simple. But especially when you're trying something out
link |
00:10:52.320
on the computer, so for example, I did some work on the traveling salesman problem. And
link |
00:11:00.400
I could see there was a particular function that had to be minimized, and it was
link |
00:11:05.680
fascinating to see the successive approaches to the optimum.
link |
00:11:11.840
You mean, so first of all, traveling salesman problem is where you have to visit
link |
00:11:15.120
every city, visiting each only once. Yeah, that's right. Find the shortest path through
link |
00:11:23.520
the cities. Yeah, which is sort of a canonical standard, a really nice problem that's really
link |
00:11:29.840
hard. Exactly. So can you say again, what was nice about being able to think about the
link |
00:11:37.040
objective function there and maximizing it or minimizing it?
link |
00:11:40.720
Well, just so that as the algorithm proceeded, you were making progress,
link |
00:11:48.080
continual progress, and eventually getting to the optimum point.
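The monotone progress Karp describes can be seen in a simple local-improvement scheme for the traveling salesman problem. This 2-opt sketch is not the method he used, just an illustration of a tour length that only ever decreases (names and example coordinates are hypothetical):

```python
import itertools
import math

def tour_length(points, tour):
    """Total length of a closed tour visiting the points in the given order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points):
    """Repeatedly reverse a segment of the tour whenever that shortens it.

    The tour length decreases monotonically, which is the kind of
    steady approach to the optimum described above.
    """
    tour = list(range(len(points)))
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(tour)), 2):
            candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            if tour_length(points, candidate) < tour_length(points, tour) - 1e-12:
                tour = candidate
                improved = True
    return tour

points = [(0, 0), (2, 0), (0, 1), (2, 1)]  # four hypothetical cities
best = two_opt(points)
print(tour_length(points, best))  # 6.0, the perimeter of the rectangle
```

Local improvement like this only guarantees a local optimum in general; on this tiny instance it reaches the true shortest tour.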
link |
00:11:53.760
So there's two parts, maybe. Maybe you can correct me. But first is like getting an intuition
link |
00:12:00.160
about what the solution would look like, or even maybe coming up with a solution. And two is
link |
00:12:06.080
proving that this thing is actually going to be pretty good. What part is harder for you?
link |
00:12:13.440
Where's the magic happen? Is it in the first sets of intuitions, or is it in the
link |
00:12:19.680
messy details of actually showing that it is going to get to the exact solution,
link |
00:12:25.360
and it's going to run at a certain complexity?
link |
00:12:31.040
Well, the magic is just the fact that the gap from the optimum decreases monotonically,
link |
00:12:42.160
and you can see it happening. And various metrics of what's going on are improving
link |
00:12:49.680
all along until finally you hit the optimum. Perhaps later we'll talk about the assignment
link |
00:12:55.600
problem that I can illustrate a little better. Now zooming out again, as you write, Don Knuth
link |
00:13:04.320
has called attention to a breed of people who derive great aesthetic pleasure from contemplating
link |
00:13:11.440
the structure of computational processes. So Don calls these folks geeks. And you write that you
link |
00:13:17.440
remember the moment you realized you were such a person, you were showing the Hungarian algorithm
link |
00:13:23.040
to solve the assignment problem. So perhaps you can explain what the assignment problem is,
link |
00:13:28.960
and what the Hungarian algorithm is. So in the assignment problem, you have
link |
00:13:37.280
n boys and n girls, and you are given the desirability or the cost of matching
link |
00:13:47.360
the ith boy with the jth girl for all i and j. You're given a matrix of numbers,
link |
00:13:55.600
and you want to find the one to one matching of the boys with the girls, such that the sum
link |
00:14:05.600
of the associated costs will be minimized. So the best way to match the boys with the girls,
link |
00:14:13.120
or men with jobs, or any two sets. Any possible matching is possible?
link |
00:14:21.040
Yeah, all one to one correspondences are permissible. If there is a connection that
link |
00:14:28.240
is not allowed, then you can think of it as having an infinite cost. So what you do is
link |
00:14:37.120
depend on the observation that the identity of the optimal assignment, or as we call it,
link |
00:14:51.200
the optimal permutation is not changed if you subtract
link |
00:15:00.240
a constant from any row or column of the matrix. You can see that the comparison between
link |
00:15:07.040
the different assignments is not changed by that. Because if you decrease a particular row,
link |
00:15:15.440
all the elements of a row by some constant, the cost of every solution decreases
link |
00:15:21.680
by an amount equal to that constant. So the idea of the algorithm is to start with a matrix of
link |
00:15:28.800
nonnegative numbers and keep subtracting from rows or entire columns in such a way that you
link |
00:15:42.720
subtract the same constant from all the elements of that row or column, while maintaining the
link |
00:15:49.680
property that all the elements are nonnegative. Simple. Yeah. And so what you have to do is
link |
00:16:06.320
find small moves which will decrease the total cost while subtracting constants from rows or
link |
00:16:16.880
columns. And there's a particular way of doing that by computing the shortest path through
link |
00:16:22.320
the elements in the matrix. And you just keep going in this way until you finally get a full
link |
00:16:31.680
permutation of zeros while the matrix is nonnegative, and then you know that that has to be the cheapest.
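For a small instance, the assignment problem can be solved directly by brute force over all permutations, which also makes the row-subtraction invariance easy to verify. A sketch (the Hungarian algorithm does this in polynomial time; exhaustive search and the cost matrix here are only for illustration):

```python
import itertools

def min_assignment(cost):
    """Brute-force the assignment problem: try every one-to-one matching.

    cost[i][j] is the cost of pairing row i (a boy) with column j (a girl).
    """
    n = len(cost)
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return best, sum(cost[i][best[i]] for i in range(n))

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
perm, total = min_assignment(cost)
print(perm, total)  # (1, 0, 2) with total cost 5

# The key invariance: subtracting a constant from a row changes every
# matching's cost by the same amount, so the optimal permutation is unchanged.
shifted = [[c - 1 for c in cost[0]]] + [row[:] for row in cost[1:]]
assert min_assignment(shifted)[0] == perm
```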
link |
00:16:38.400
Is that as simple as it sounds? So the shortest path through the matrix part?
link |
00:16:45.520
Yeah. The simplicity lies in how you find it. I oversimplified slightly. What you will end up doing is
link |
00:16:55.040
subtracting a constant from some rows or columns and adding the same constant back to other rows
link |
00:17:02.560
and columns. So as not to reduce any of the zero elements, you leave them unchanged.
link |
00:17:10.720
But each individual step modifies several rows and columns by the same amount,
link |
00:17:23.920
but overall decreases the cost. So there's something about that elegance that made you go,
link |
00:17:30.560
aha, this is beautiful. It's amazing that something so simple can solve a problem like this.
link |
00:17:38.000
Yeah, it's really cool. If I had mechanical ability, I would probably like to do woodworking or
link |
00:17:45.840
other activities where you sort of shape something into something beautiful and orderly.
link |
00:17:55.360
And there's something about the orderly, systematic nature of that iterative algorithm
link |
00:18:03.680
that is pleasing to me. So what do you think about this idea of geeks as Don Knuth calls them?
link |
00:18:12.800
Is it something specific to a mindset that allows you to discover the elegance in
link |
00:18:20.240
computational processes or can all of us discover this beauty? Are you born this way?
link |
00:18:26.400
I think so. I always like to play with numbers. I used to amuse myself by multiplying four digit
link |
00:18:37.440
decimal numbers in my head and putting myself to sleep by starting with one and
link |
00:18:45.360
doubling the number as long as I could go and testing my memory, my ability to retain the
link |
00:18:51.360
information. And I also read somewhere that you wrote that you enjoyed showing off to your friends
link |
00:18:59.760
by, I believe, multiplying four digit numbers, a couple of four digit numbers.
link |
00:19:06.880
Yeah, I had a summer job at a beach resort outside of Boston. And the other employee,
link |
00:19:15.760
I was the barker at a Skee-Ball game. I used to sit at a microphone saying, come one,
link |
00:19:26.080
come all, come in and play Skee-Ball, five cents to play, a nickel to win, and so on.
link |
00:19:30.880
That's what a barker, I wasn't sure if I should know, but barker, you're the charming,
link |
00:19:38.160
outgoing person that's getting people to come in.
link |
00:19:41.280
Yeah, well, I wasn't particularly charming, but I could be very repetitious and loud.
link |
00:19:47.040
And the other employees were sort of juvenile delinquents who had no academic
link |
00:19:56.720
bent, but somehow I found that I could impress them by performing this mental arithmetic.
link |
00:20:06.560
You know, there's something to that. Some of the most popular videos on the
link |
00:20:12.800
internet are from a YouTube channel called Numberphile that shows off different mathematical
link |
00:20:19.520
ideas. There's still something really profoundly interesting to people about math, the beauty
link |
00:20:26.640
of it. Something, even if they don't understand the basic concept of even being discussed,
link |
00:20:33.520
there's something compelling to it. What do you think that is? Any lessons you drew from your
link |
00:20:39.440
early teen years when you were showing off to your friends with the numbers? What is it that
link |
00:20:46.800
attracts us to the beauty of mathematics, do you think? The general population, not just the
link |
00:20:53.280
computer scientists and mathematicians? I think that you can do amazing things. You can
link |
00:20:59.840
test whether large numbers are prime. You can solve little puzzles about cannibals and
link |
00:21:11.280
missionaries. And there's a kind of achievement. It's puzzle solving. And at a higher level,
link |
00:21:20.640
the fact that you can do this reasoning, that you can prove in an absolutely ironclad way that
link |
00:21:27.200
the sum of the angles of a triangle is 180 degrees. Yeah. It's a nice escape from the messiness of
link |
00:21:35.280
the real world where nothing can be proved. And we'll talk about it, but sometimes the ability
link |
00:21:41.200
to map the real world into such problems where you can prove it is a powerful step.
link |
00:21:47.280
It's amazing that we can do it. Of course, another attribute of geeks is they're not necessarily
link |
00:21:51.680
endowed with emotional intelligence. So they can live in a world of abstractions without having to
link |
00:22:01.520
master the complexities of dealing with people.
link |
00:22:07.120
Just to linger on the historical note, as a PhD student in 1955, you joined the Computational
link |
00:22:12.800
Lab at Harvard where Howard Aiken had built the Mark I and the Mark IV computers.
link |
00:22:18.160
Just to take a step back into that history, what were those computers like?
link |
00:22:26.240
The Mark IV filled a large room much bigger than this large office that we're talking in now.
link |
00:22:36.800
And you could walk around inside it. There were rows of relays. You could just walk around the
link |
00:22:44.480
interior. And the machine would sometimes fail because of bugs, which literally meant
link |
00:22:55.040
flying creatures landing on the switches. So I never used that machine for any
link |
00:23:02.800
practical purpose. The lab eventually acquired one of the earlier commercial computers.
link |
00:23:14.480
This is already in the 60s? No, in the mid 50s.
link |
00:23:17.680
The mid 50s? Late 50s.
link |
00:23:19.680
There were already commercial computers then?
link |
00:23:21.600
Yeah, we had a Univac with 2,000 words of storage. So you had to work hard to allocate
link |
00:23:30.000
the memory properly. Also, the access time from one word to another depended on the
link |
00:23:38.000
numbers of the particular words. And so there was an art to sort of arranging the storage
link |
00:23:45.520
allocation to make fetching data rapid. Were you attracted to this actual physical
link |
00:23:54.480
world implementation of mathematics? So it's a mathematical machine that's actually doing
link |
00:24:01.360
the math physically? No, not at all. I was attracted to the underlying algorithms.
link |
00:24:10.800
But did you draw any inspiration? So what did you imagine was the future of these giant
link |
00:24:19.280
computers? Could you have imagined that 60 years later we'd have billions of these computers
link |
00:24:24.080
all over the world? I couldn't imagine that, but there was a sense in the laboratory
link |
00:24:32.720
that this was the wave of the future. In fact, my mother influenced me. She told me that data
link |
00:24:40.240
processing was going to be really big and I should get into it. She's a smart woman.
link |
00:24:47.120
Yeah, she was a smart woman. And there was just a feeling that this was going to change the world
link |
00:24:53.840
but I didn't think of it in terms of personal computing. I had no anticipation that we would
link |
00:25:01.520
be walking around with computers in our pockets or anything like that. Did you see computers as
link |
00:25:09.600
tools, as mathematical mechanisms to analyze sort of theoretical computer science or as the AI folks,
link |
00:25:18.000
which is an entire other community of dreamers, as something that could one day have human level
link |
00:25:25.360
intelligence? Well, AI wasn't very much on my radar. I did read Turing's paper about
link |
00:25:35.920
the Turing test, Computing Machinery and Intelligence. Yeah, the Turing test.
link |
00:25:40.320
What did you think about that paper? Was that just like science fiction?
link |
00:25:43.280
I thought that it wasn't a very good test because it was too subjective. I didn't feel that the
link |
00:25:54.400
Turing test was really the right way to calibrate how intelligent an algorithm could be.
link |
00:26:00.800
But to linger on that, do you think it's because you've come up with some incredible
link |
00:26:04.960
tests later on, tests on algorithms that are strong, reliable, robust across a bunch of
link |
00:26:14.480
different classes of algorithms? But returning to this emotional mess that is intelligence,
link |
00:26:21.120
do you think it's possible to come up with a test that's as ironclad as some of the
link |
00:26:28.240
computational complexity work? Well, I think the greater question is whether it's possible to
link |
00:26:34.000
achieve human level intelligence. So first of all, let me, at the philosophical level,
link |
00:26:42.000
do you think it's possible to create algorithms that reason and would seem to us to have the
link |
00:26:51.920
same kind of intelligence as human beings? It's an open question. It seems to me that
link |
00:27:00.160
most of the achievements operate within a very limited set of ground rules and for a very limited,
link |
00:27:13.520
precise task, which is a quite different situation from the processes that go on in the minds of
link |
00:27:21.680
humans, where they have to function in changing environments. They have emotions,
link |
00:27:29.920
they have physical attributes for exploring their environment,
link |
00:27:39.840
they have intuition, they have desires, emotions, and I don't see anything in the
link |
00:27:50.880
current achievements of what's called AI that come close to that capability.
link |
00:27:56.800
I don't think there's any computer program which surpasses a six month old child in terms of
link |
00:28:07.200
comprehension of the world. Do you think this complexity of human intelligence,
link |
00:28:15.440
all the cognitive abilities you have, all the emotion, do you think that could be reduced one
link |
00:28:20.960
day or just fundamentally reduced to a set of algorithms or an algorithm? Can a Turing machine
link |
00:28:29.920
achieve human level intelligence? I am doubtful about that. I guess the argument in favor of it
link |
00:28:38.560
is that the human brain seems to achieve what we call intelligence, cognitive abilities of
link |
00:28:49.600
different kinds. If you buy the premise that the human brain is just an enormous interconnected
link |
00:28:57.600
set of switches, so to speak, then in principle, you should be able to diagnose what that
link |
00:29:05.120
interconnection structure is like, characterize the individual switches and build a simulation outside.
link |
00:29:12.560
While that may be true in principle, that cannot be the way we're eventually going to tackle this
link |
00:29:20.960
problem. That does not seem like a feasible way to go about it. There is however an existence
link |
00:29:31.600
proof that if you believe that the brain is just a network of neurons operating by rules,
link |
00:29:44.160
I guess you could say that that's an existence proof of the capabilities of a mechanism.
link |
00:29:52.160
But it would be almost impossible to acquire the information unless we got enough insight into the
link |
00:30:01.120
operation of the brain. There's so much mystery there. What do you make of consciousness, for
link |
00:30:06.720
example? As an example of something we completely have no clue about, the fact that we have this
link |
00:30:13.920
subjective experience, is it possible that this network of this circuit of switches is able to
link |
00:30:22.480
create something like consciousness? To know its own identity. To know itself. I think if you try
link |
00:30:32.960
to define that rigorously, you'd have a lot of trouble. I know that there are many who
link |
00:30:43.520
believe that general intelligence can be achieved. There are even some who feel certain
link |
00:30:55.760
that the singularity will come and we will be surpassed by the machines which will then learn
link |
00:31:02.640
more and more about themselves and reduce humans to an inferior breed. I am doubtful that this
link |
00:31:09.600
will ever be achieved. Just for the fun of it, could you linger on why, what's your intuition,
link |
00:31:17.920
why you're doubtful? There are quite a few people that are extremely worried about this existential
link |
00:31:24.800
threat of artificial intelligence of us being left behind by the superintelligent new species.
link |
00:31:32.160
What's your intuition, why that's not quite likely? Just because none of the achievements in
link |
00:31:42.720
speech or robotics or natural language processing or creation of
link |
00:31:50.640
flexible computer assistants or any of that comes anywhere near close to that level of cognition.
link |
00:31:59.280
What do you think about the idea that if we look at Moore's law and exponential improvement,
link |
00:32:06.000
it could allow something that would surprise us? Our intuition falls apart with exponential improvement
link |
00:32:13.520
because we're only able to think in linear improvement. We're not able to imagine a world
link |
00:32:20.080
that goes from the Mark I computer to an iPhone X. We could be really surprised by the exponential
link |
00:32:32.560
growth or on the flip side, is it possible that also intelligence is actually way, way, way, way
link |
00:32:40.720
harder even with exponential improvement to be able to crack? I don't think any constant factor
link |
00:32:49.040
improvement could change things. Given our current comprehension of what cognition requires,
link |
00:33:04.720
it seems to me that multiplying the speed of the switches by a factor of a thousand or a million
link |
00:33:10.640
will not be useful until we really understand the organizational principle behind the network of
link |
00:33:20.480
switches. Well, let's jump into the network of switches and talk about combinatorial algorithms
link |
00:33:27.200
if we could. Let's step back for the very basics. What are combinatorial algorithms and what are
link |
00:33:33.920
some major examples of problems they aim to solve? A combinatorial algorithm is one which
link |
00:33:43.040
deals with a system of discrete objects that can occupy various states or take on various
link |
00:33:53.280
values from a discrete set of values and need to be arranged or selected in such a way as to
link |
00:34:08.560
achieve some, to minimize some cost function or to prove the existence of some combinatorial
link |
00:34:18.160
configuration. An example would be coloring the vertices of a graph. What's a graph?
link |
00:34:27.120
Let's step back. It's fun to ask one of the greatest computer scientists of all time,
link |
00:34:36.000
the most basic questions in the beginning of most books. For people who might not know,
link |
00:34:41.200
but in general, how you think about it, what is a graph? A graph? That's simple. It's a set of
link |
00:34:47.760
points, certain pairs of which are joined by lines called edges. They represent the,
link |
00:34:58.720
in different applications, represent the interconnections between
link |
00:35:04.720
discrete objects. They could be the interconnections between switches in a digital circuit
link |
00:35:11.600
or interconnections indicating the communication patterns of a human community.
link |
00:35:19.120
And they could be directed or undirected. And then, as you've mentioned before, might have
link |
00:35:24.240
costs. They can be directed or undirected. You can think of them as, if a graph were
link |
00:35:33.440
representing a communication network, then the edge could be undirected, meaning that information
link |
00:35:39.840
could flow along it in both directions or it could be directed with only one way communication.
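Graphs like the ones described here, whether directed or undirected, weighted or not, are commonly stored as adjacency maps. A minimal sketch with a hypothetical weighted, undirected road network:

```python
# A tiny weighted graph, stored as an adjacency map: for each vertex,
# the neighbors reachable from it and the weight of the connecting edge.
# An undirected edge is simply recorded in both directions.
road_network = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "C": 1},
    "C": {"A": 2, "B": 1},
}

def neighbors(graph, v):
    """Vertices joined to v by an edge, with the edge weights."""
    return graph[v]

print(neighbors(road_network, "B"))  # {'A': 5, 'C': 1}
```

For a directed graph, each edge would be recorded in one direction only.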
link |
00:35:46.800
A road system is another example of a graph with weights on the edges. And then a lot of problems
link |
00:35:54.960
of optimizing the efficiency of such networks or learning about the performance of such networks
link |
00:36:05.520
are the object of a combinatorial algorithm. So it could be scheduling classes at a school
link |
00:36:15.520
where the vertices, the nodes of the network are the individual classes and the edges indicate
link |
00:36:27.520
the constraints which say that certain classes cannot take place at the same time or certain
link |
00:36:33.440
teachers are available only for certain classes, etc. Or I talked earlier about the assignment
link |
00:36:42.480
problem of matching the boys with the girls, where you have there a graph with an edge from
link |
00:36:52.400
each boy to each girl with a weight indicating the cost. Or in logical design of computers,
link |
00:37:02.080
you might want to find a set of so-called gates, switches that perform logical functions,
link |
00:37:11.760
which can be interconnected to realize some function. So you might ask,
link |
00:37:17.280
how many gates do you need in order for a circuit to give a yes output if at least
link |
00:37:34.160
a given number of inputs are ones and no if fewer are present.
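The yes/no circuit Karp mentions computes a threshold function; its input/output behavior (though not a gate-level realization, which is the actual design question) can be sketched as:

```python
def threshold_gate(inputs, k):
    """Output 1 if at least k of the binary inputs are 1, else 0.

    The circuit-design question is how few gates suffice to realize
    this function; here we only describe its behavior.
    """
    return 1 if sum(inputs) >= k else 0

print(threshold_gate([1, 0, 1, 1], 3))  # 1
print(threshold_gate([1, 0, 0, 1], 3))  # 0
```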
link |
00:37:42.000
My favorite is probably all the work with network flows. So anytime you have,
link |
00:37:49.120
I don't know why it's so compelling, but there's something just beautiful about it.
link |
00:37:52.320
It seems like there's so many applications and communication networks
link |
00:37:57.360
in traffic flow that you can map into these. And then you could think of pipes and water
link |
00:38:04.560
going through pipes and you could optimize it in different ways. There's something always
link |
00:38:08.240
visually and intellectually compelling to me about it. And of course, you've done work there.
link |
00:38:15.840
Yeah. So there the edges represent channels along which some commodity can flow. It might be
link |
00:38:26.080
gas, it might be water, it might be information. Maybe supply chain as well, like products
link |
00:38:32.880
flowing from one operation to another. And the edges have a capacity,
link |
00:38:40.160
which is the rate at which the commodity can flow. And a central problem is to determine,
link |
00:38:49.040
given a network of these channels, in this case edges that are communication channels,
link |
00:38:54.160
the challenge is to find the maximum rate at which the information can flow along these
link |
00:39:04.960
channels to get from a source to a destination. And that's a fundamental combinatorial problem
link |
00:39:12.560
that I've worked on. Jointly with the scientist Jack Edmonds, I think we're the first to give
link |
00:39:21.360
a formal proof that this maximum flow problem through a network can be solved in polynomial time.
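The result Karp describes here is the basis of the Edmonds-Karp algorithm: compute max flow by repeatedly augmenting along shortest paths found with breadth-first search. A minimal sketch (the toy network and function names are my own, not from the conversation):

```python
from collections import deque

def edmonds_karp(capacity, source, sink):
    """Max flow by shortest augmenting paths (BFS), the idea behind
    Edmonds-Karp. `capacity[u][v]` is the capacity of edge u -> v."""
    # Residual graph: forward capacities plus reverse edges at 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: the flow is maximum
        # Bottleneck = smallest residual capacity along the path.
        bottleneck = float('inf')
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push bottleneck units along the path, updating residuals.
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

# Toy network: source s, sink t, capacities on each directed edge.
cap = {'s': {'a': 3, 'b': 2}, 'a': {'b': 1, 't': 2}, 'b': {'t': 3}, 't': {}}
print(edmonds_karp(cap, 's', 't'))  # 5
```

Always choosing a shortest augmenting path is what bounds the number of iterations by a polynomial in the graph size, independent of the edge capacities.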
link |
00:39:30.560
Which I remember the first time I learned that, just learning that in maybe even grad school.
link |
00:39:39.600
I don't think it was even undergrad. No. Algorithm, yeah. Do network flows get taught in
link |
00:39:45.680
basic algorithms courses? Yes, probably. Okay. So yeah, I remember being very surprised
link |
00:39:53.600
that max flow is a polynomial time algorithm. Yeah. That there's a nice fast algorithm that
link |
00:39:58.320
solves max flow. But so there is an algorithm named after you and Edmonds, the Edmonds-Karp
link |
00:40:07.120
algorithm for max flow. So what was it like tackling that problem and trying to arrive
link |
00:40:13.520
at a polynomial time solution? And maybe you can describe the algorithm, maybe you can describe
link |
00:40:17.920
what's the running time complexity that you showed. Yeah. Well, first of all, what is a
link |
00:40:23.600
polynomial time algorithm? Yeah. Perhaps we could discuss that. So yeah, let's actually just even,
link |
00:40:30.800
yeah, that's what is algorithmic complexity? What are the major classes of algorithm complexity?
link |
00:40:38.080
So in a problem like the assignment problem or school scheduling or any of these applications,
link |
00:40:48.960
you have a set of input data, which might, for example, be a set of vertices connected by edges
link |
00:41:01.280
where you're given for each edge the capacity of the edge. And you have algorithms which are,
link |
00:41:12.080
think of them as computer programs with operations such as addition, subtraction,
link |
00:41:18.400
multiplication, division, comparison of numbers and so on. And you're trying to construct an
link |
00:41:26.000
algorithm based on those operations, which will determine in a minimum number of computational
link |
00:41:36.560
steps the answer to the problem. In this case, the computational step is one of those operations.
link |
00:41:43.280
And the answer to the problem is, let's say, the configuration of the network that
link |
00:41:50.400
carries the maximum amount of flow. And an algorithm is said to run in polynomial time
link |
00:41:59.840
if, as a function of the size of the input, the number of vertices, the number of edges,
link |
00:42:06.880
and so on, the number of basic computational steps grows only as some fixed power of that size.
link |
00:42:14.720
A linear algorithm would execute a number of steps linearly proportional to the size.
link |
00:42:24.560
A quadratic algorithm would take steps proportional to the square of the size, and so on.
link |
00:42:30.560
And algorithms whose running time is bounded by some fixed power of the size are called polynomial
link |
00:42:38.320
algorithms. And that's supposed to be a relatively fast class of algorithms.
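As a rough illustration of the growth rates just described (the step counts are schematic, my own, and not from the conversation):

```python
# Schematic step counts for inputs of size n.
def linear_steps(n):
    return n          # e.g. one scan over n items

def quadratic_steps(n):
    return n * n      # e.g. comparing every pair of n items

def exponential_steps(n):
    return 2 ** n     # e.g. trying every subset -- not a fixed power of n

for n in (10, 20, 30):
    print(n, linear_steps(n), quadratic_steps(n), exponential_steps(n))
# The first two are polynomial; the last grows faster than any fixed
# power of n, which is what excludes it from the class P.
```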
link |
00:42:44.720
That's right. Theoreticians take that to be the definition of an algorithm being
link |
00:42:52.400
efficient and we're interested in which problems can be solved by such efficient algorithms.
link |
00:43:02.160
One can argue whether that's the right definition of efficient because you could have an algorithm
link |
00:43:08.000
whose running time is the 10,000th power of the size of the input and that wouldn't be
link |
00:43:14.000
really efficient. And in practice, it's oftentimes reducing from an n squared algorithm to an n log
link |
00:43:22.240
n or a linear time is practically the jump that you want to make to allow a real world system
link |
00:43:30.080
to solve a problem. Yeah, that's also true because especially as we get very large networks,
link |
00:43:34.880
the size can be in the millions and then anything above n log n where n is the size
link |
00:43:45.280
would be too much for practical solution. Okay, so that's polynomial time algorithms.
link |
00:43:52.480
What other classes of algorithms are there? So usually they designate polynomial time
link |
00:44:01.040
by the letter P. Yeah. There's also NP, NP complete and NP hard. Yeah. So can you try to
link |
00:44:08.000
disentangle those by trying to define them simply? Right. So a polynomial time algorithm
link |
00:44:16.880
is one whose running time is bounded by a polynomial and the size of the input.
link |
00:44:22.400
Then the class of such algorithms is called P. In the worst case, by the way, we should say,
link |
00:44:31.040
right? Yeah, that's very important that in this theory, when we measure the complexity of an
link |
00:44:39.680
algorithm, we really measure the growth of the number of steps in the worst case. So you may
link |
00:44:49.280
have an algorithm that runs very rapidly in most cases, but if there's any case where it gets into
link |
00:44:57.920
a very long computation, that would increase the computational complexity by this measure.
link |
00:45:05.280
And that's a very important issue because there are, as we may discuss later, there are some
link |
00:45:11.600
very important algorithms which don't have a good standing from the point of view of their
link |
00:45:16.560
worst case performance and yet are very effective. So theoreticians are interested in P, the class of
link |
00:45:24.880
problem solvable in polynomial time. Then there's NP, which is the class of problems
link |
00:45:35.920
which may be hard to solve, but where when confronted with a solution,
link |
00:45:43.920
you can check it in polynomial time. Let me give you an example there.
link |
00:45:49.120
So if we look at the assignment problem, so you have n boys, you have n girls, the number of numbers
link |
00:45:57.120
that you need to write down to specify the problem instance is n squared. And the question is
link |
00:46:05.680
how many steps are needed to solve it? And Jack Edmonds and I were the first to show that it
link |
00:46:15.760
could be done in time n cubed. Earlier algorithms required n to the fourth. So as a polynomial
link |
00:46:25.600
function of the size of the input, this is a fast algorithm. Now to illustrate the class NP,
link |
00:46:32.160
the question is how long would it take to verify that a solution is optimal?
link |
00:46:42.560
So for example, if the input was a graph, we might want to find the largest
link |
00:46:52.080
clique in the graph. A clique is a set of vertices such that
link |
00:46:58.720
each vertex in the set is adjacent to each of the others. So the clique is a complete subgraph.
link |
00:47:08.800
Yeah, so if it's a Facebook social network, everybody's friends with everybody else. It's
link |
00:47:14.080
a clique. That would be what's called a complete graph. It would be.
link |
00:47:18.160
No, I mean within that clique. Within that clique, yeah. They're all friends.
link |
00:47:25.520
So a complete graph is when everybody's friends with everybody.
link |
00:47:31.280
So the problem might be to determine whether in a given graph there exists a clique of a certain
link |
00:47:40.160
size. Well, that turns out to be a very hard problem. But if somebody hands you a clique
link |
00:47:48.480
and asks you to check whether it is, hands you a set of vertices and asks you to check whether
link |
00:47:56.320
it's a clique, you could do that simply by exhaustively looking at all of the edges
link |
00:48:02.800
between the vertices in the clique and verifying that they're all there.
link |
00:48:07.920
And that's a polynomial time algorithm.
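The verification step just described is short enough to write out. A sketch of the polynomial-time clique check (the example graph is my own):

```python
from itertools import combinations

def is_clique(vertices, edges):
    """Check every pair of the given vertices for an edge -- roughly
    k*(k-1)/2 lookups for k vertices, so polynomial time."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

# Example graph: a triangle 1-2-3 plus a pendant vertex 4.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
print(is_clique([1, 2, 3], edges))  # True
print(is_clique([1, 2, 4], edges))  # False
```

Finding a largest clique, by contrast, has no known polynomial-time algorithm, which is the asymmetry the conversation turns on.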
link |
00:48:10.240
That's a polynomial. So the problem of finding the clique
link |
00:48:15.120
appears to be extremely hard. But the problem of verifying a clique
link |
00:48:22.800
to see if it reaches a target number of vertices is easy to verify. So finding the
link |
00:48:32.400
clique is hard. Checking it is easy. Problems of that nature are called
link |
00:48:37.920
nondeterministic polynomial time problems. And that's the class NP.
link |
00:48:45.120
And what about NP complete and NP hard?
link |
00:48:48.240
Okay. Let's talk about problems where you're getting a yes or no answer rather than a numerical
link |
00:48:54.880
value. So either there is a perfect matching of the boys with the girls or there isn't.
link |
00:49:03.120
It's clear that every problem in P is also in NP. If you can solve the problem exactly,
link |
00:49:12.480
then you can certainly verify the solution. On the other hand, there are problems in the class
link |
00:49:21.600
NP. This is the class of problems that are easy to check, although they may be hard to solve.
link |
00:49:28.720
It's not at all clear that problems in NP lie in P. So for example, if we're looking at scheduling
link |
00:49:36.880
classes at a school, the fact that you can verify when handed a schedule for the school,
link |
00:49:45.600
whether it meets all the requirements, that doesn't mean that you can find the schedule
link |
00:49:49.920
rapidly. So intuitively, NP, nondeterministic polynomial, checking rather than finding, is
link |
00:49:59.040
going to include more. Checking is easier and therefore
link |
00:50:07.520
the class of problems that can be checked appears to be much larger than the class of problems that
link |
00:50:13.120
can be solved. Then you keep adding appears to and sort of these additional words that designate
link |
00:50:22.080
that we don't know for sure yet. We don't know for sure. So the theoretical question, which is
link |
00:50:27.360
considered to be the most central problem in theoretical computer science or at least computational
link |
00:50:34.080
complexity theory, combinatorial algorithm theory. The question is whether P is equal to NP. If P
link |
00:50:43.920
were equal to NP, it would be amazing. It would mean that every problem where a solution can be
link |
00:50:55.600
rapidly checked can actually be solved in polynomial time. We don't really believe that's
link |
00:51:02.160
true. If you're scheduling classes at a school, we expect that if somebody hands you a satisfying
link |
00:51:12.400
schedule, you can verify that it works. That doesn't mean that you should be able to find such a
link |
00:51:17.840
schedule. So intuitively, NP encompasses a lot more problems than P. So can we take a small tangent
link |
00:51:27.920
and break apart that intuition? Do you first of all think that the biggest open problem in
link |
00:51:35.360
computer science, maybe mathematics, is whether P equals NP? Do you think P equals NP or do you
link |
00:51:43.520
think P is not equal to NP? If you had to bet all your money on it. I would bet that P is unequal to
link |
00:51:50.560
NP simply because there are problems that have been around for centuries and have been studied
link |
00:51:56.720
intensively in mathematics and even more so in the last 50 years since the P versus NP question was stated.
link |
00:52:05.440
And no polynomial time algorithms have been found for these easy to check problems.
link |
00:52:13.520
So one example is a problem that goes back to the mathematician Gauss who was interested in
link |
00:52:21.040
factoring large numbers. So we say that a number is prime if it cannot be written as the
link |
00:52:30.880
product of two or more numbers unequal to one. So we can factor a number like
link |
00:52:39.440
91 as seven times 13. But if I give you 20 digit or 30 digit numbers, you're probably going to be
link |
00:52:51.840
at a loss to have any idea whether they can be factored. So the problem of factoring very large
link |
00:52:59.200
numbers does not appear to have an efficient solution. But once you have found the factors,
link |
00:53:11.600
expressed the number as a product of smaller numbers, you can quickly verify that they are
link |
00:53:18.480
factors of the number. And your intuition is a lot of brilliant people have tried to find
link |
00:53:25.200
algorithms for this one particular problem. There are many others like it that are really well
link |
00:53:29.840
studied and it would be great to find an efficient algorithm for.
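The asymmetry Karp points to, quick to verify but apparently slow to find, can be sketched with the 91 = 7 x 13 example (the naive trial-division search is my own illustration and far from the best known factoring method):

```python
def verify_factors(n, factors):
    """Checking a proposed factorization takes just a few multiplications."""
    product = 1
    for f in factors:
        product *= f
    return product == n and all(f > 1 for f in factors)

def trial_division(n):
    """Naive factor search: try divisors up to sqrt(n). The number of
    candidates grows exponentially in the number of digits of n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None  # no divisor found: n is prime

print(verify_factors(91, [7, 13]))  # True
print(trial_division(91))           # (7, 13)
```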
link |
00:53:34.000
Right. And in fact, we have some results that I was instrumental in obtaining following up on
link |
00:53:44.080
work by the mathematician Stephen Cook to show that within the class NP of easy to check problems,
link |
00:53:55.600
there's a huge number that are equivalent in the sense that either all of them or none of them lie
link |
00:54:01.840
in P. And this happens only if P is equal to NP. So if P is unequal to NP, we would also know
link |
00:54:10.080
that for virtually all the standard combinatorial problems, none of them
link |
00:54:22.960
can be solved in polynomial time. Can you explain how that's possible to tie together so many
link |
00:54:29.840
problems in a nice bunch that if one is proven to be efficient, then all are?
link |
00:54:35.600
The first and most important stage of progress was a result by Stephen Cook
link |
00:54:46.800
who showed that a certain problem called the satisfiability problem of propositional logic
link |
00:54:55.840
is as hard as any problem in the class NP. So the propositional logic problem
link |
00:55:02.640
is expressed in terms of expressions involving the logical operations and or and not operating
link |
00:55:14.160
on variables that can be true or false. So an instance of the problem would be
link |
00:55:23.440
some formula involving and or and not. And the question would be whether there is an
link |
00:55:30.480
assignment of truth values to the variables in the problem that would make the formula true.
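A brute-force sketch of the satisfiability question as just defined, trying every truth assignment; the four clauses below encode the A/B example discussed here (the representation of a literal as a (variable, wanted value) pair is my own):

```python
from itertools import product

def satisfiable(clauses, variables):
    """Try every truth assignment; a clause (a list of literals) is true
    if at least one literal matches the assignment."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[var] == want for var, want in clause)
               for clause in clauses):
            return True
    return False

# (A or B) and (A or not B) and (not A or B) and (not A or not B)
clauses = [[('A', True), ('B', True)],
           [('A', True), ('B', False)],
           [('A', False), ('B', True)],
           [('A', False), ('B', False)]]
print(satisfiable(clauses, ['A', 'B']))  # False: no assignment works
```

Brute force takes 2^n assignments for n variables; whether satisfiability can be decided in polynomial time is exactly the P versus NP question discussed below.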
link |
00:55:37.280
So for example, if I take the formula A or B and A or not B and not A or B and not A or not B
link |
00:55:49.120
and take the conjunction of all four of those so called expressions, you can determine that
link |
00:55:56.480
no assignment of truth values to the variables A and B will allow that conjunction of
link |
00:56:05.680
what are called clauses to be true. So that's an example of a formula in
link |
00:56:14.800
propositional logic involving expressions based on the operations and or and not.
link |
00:56:21.680
That's an example of a problem which is not satisfiable. There is no solution that satisfies all
link |
00:56:29.360
of those constraints. And that's like one of the cleanest and fundamental problems in computer
link |
00:56:34.640
science. It's like a nice statement of a really hard problem. It's a nice statement of a really
link |
00:56:39.120
hard problem. And what Cook showed is that every problem in NP can be reexpressed as an instance
link |
00:56:53.360
of the satisfiability problem. So to do that, he used the observation that a very simple
link |
00:57:02.960
abstract machine called the Turing machine can be used to describe any algorithm.
link |
00:57:14.800
An algorithm for any realistic computer can be translated into an equivalent algorithm
link |
00:57:22.720
on one of these Turing machines which are extremely simple.
link |
00:57:27.120
So a Turing machine, there's a tape and you can walk along that. You have data on a tape and you
link |
00:57:33.440
have basic instructions, a finite list of instructions which say if you're reading a
link |
00:57:40.240
particular symbol on the tape and you're in a particular state, then you can move to
link |
00:57:48.720
a different state and change the state of the number or the element that you were looking at,
link |
00:57:53.600
the cell of the tape that you were looking at. And that was like a metaphor and a mathematical
link |
00:57:57.840
construct that Turing put together to represent all possible computation. All possible computation.
link |
00:58:03.440
Now one of these so called Turing machines is too simple to be useful in practice,
link |
00:58:09.200
but for theoretical purposes we can depend on the fact that an algorithm for any computer can be
link |
00:58:17.920
translated into one that would run on a Turing machine. And then using that fact,
link |
00:58:26.160
he could describe any possible nondeterministic polynomial time algorithm. Any algorithm for
link |
00:58:37.920
a problem in NP could be expressed as a sequence of moves of the Turing machine
link |
00:58:45.360
described in terms of reading a symbol on the tape while you're in a given state and moving
link |
00:58:55.840
to a new state and leaving behind a new symbol. And given that the fact that any
link |
00:59:04.800
nondeterministic polynomial time algorithm can be described by a list of such instructions,
link |
00:59:11.920
you could translate the problem into the language of the satisfiability problem.
link |
00:59:18.400
Is that amazing to you, by the way? If you take yourself back when you were first thinking about
link |
00:59:22.400
the space of problems, how amazing is that? It's astonishing. When you look at Cook's proof,
link |
00:59:30.160
it's not too difficult to figure out why this is so, but the implications are staggering.
link |
00:59:40.400
It tells us that this, of all the problems in NP, all the problems where solutions are easy to
link |
00:59:48.320
check, they can all be rewritten in terms of the satisfiability problem.
link |
00:59:59.280
Yeah, it's adding so much more weight to the P equals NP question because all it takes is to show
link |
01:00:08.080
that one algorithm in this class is solvable in polynomial time. So the P versus NP question can be reexpressed as simply asking whether the
link |
01:00:15.920
satisfiability problem of propositional logic is solvable in polynomial time.
link |
01:00:23.680
But there's more. I encountered Cook's paper when he published it in a conference in 1971.
link |
01:00:33.760
And yeah, so when I saw Cook's paper and saw this reduction of each of the problems in NP
link |
01:00:44.240
by a uniform method to the satisfiability problem of propositional logic,
link |
01:00:52.400
that meant that the satisfiability problem was a universal combinatorial problem.
link |
01:00:57.360
And it occurred to me through experience I had had in trying to solve other combinatorial problems
link |
01:01:07.600
that there were many other problems which seemed to have that universal structure.
link |
01:01:15.920
And so I began looking for reductions from the satisfiability to other problems.
link |
01:01:25.680
One of the other problems would be the so called integer programming problem of
link |
01:01:37.440
determining whether there's a solution to a set of linear inequalities involving integer
link |
01:01:46.240
variables. Just like linear programming, but there's a constraint that the variables must
link |
01:01:52.400
remain integers. In fact, the integers must be zero or one; they can only take on those values.
link |
01:01:58.400
And that makes the problem much harder. Yes, that makes the problem much harder.
link |
01:02:03.680
And it was not difficult to show that the satisfiability problem can be restated
link |
01:02:11.520
as an integer programming problem. So can you pause on that? Was that one of the first
link |
01:02:16.560
mappings that you tried to do? And how hard is that mapping? You said it wasn't hard to show, but
link |
01:02:24.480
that's a big leap. It is a big leap, yeah. Well, let me give you another example.
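A sketch of the restatement just mentioned: each clause becomes a linear inequality over zero-one variables, with a positive literal contributing x and a negated literal contributing 1 - x, and the clause holds when the sum is at least 1 (the formula and names are my own illustration):

```python
from itertools import product

def clause_holds(clause, assignment):
    """Integer-programming form of a clause: the sum of x_v (positive
    literal) and 1 - x_v (negated literal) must be >= 1, x_v in {0, 1}."""
    total = sum(assignment[v] if positive else 1 - assignment[v]
                for v, positive in clause)
    return total >= 1

# (A or B) and (not A or not B):  x_A + x_B >= 1  and  (1-x_A) + (1-x_B) >= 1
clauses = [[('A', True), ('B', True)], [('A', False), ('B', False)]]
variables = ['A', 'B']
solutions = [bits for bits in product([0, 1], repeat=len(variables))
             if all(clause_holds(c, dict(zip(variables, bits)))
                    for c in clauses)]
print(solutions)  # [(0, 1), (1, 0)]: exactly one of A, B is true
```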
link |
01:02:32.880
Another problem in NP is whether a graph contains a clique of a given size.
link |
01:02:39.120
And now the question is, can we reduce the propositional logic problem to the problem of
link |
01:02:54.320
whether there's a clique of a certain size? Well, if you look at the propositional logic
link |
01:03:00.720
problem, it can be expressed as a number of clauses, each of which is of the form
link |
01:03:13.680
A or B or C, where A is either one of the variables in the problem or the negation of one of the
link |
01:03:19.840
variables. And an instance of the propositional logic problem can be rewritten using operations of
link |
01:03:34.080
Boolean logic as the conjunction of a set of clauses, the AND of a set of ORs,
link |
01:03:43.440
where each clause is a disjunction, an OR, of variables or negated variables.
link |
01:03:53.760
So the question in the satisfiability problem is whether those clauses can be simultaneously
link |
01:04:04.880
satisfied. Now, to satisfy all those clauses, you have to find one of the terms in each clause,
link |
01:04:13.760
which is going to be true in your truth assignment, but you can't make the same
link |
01:04:22.480
variable both true and false. So if you have the variable A in one clause and you want to
link |
01:04:30.240
satisfy that clause by making A true, you can't also make the complement of A true in some other
link |
01:04:38.800
clause. And so the goal is to make every single clause true if it's possible to satisfy this,
link |
01:04:45.120
and the way you make it true is at least… One term in the clause must be true.
link |
01:04:52.160
So now to convert this problem to something called the independent set problem, where you're
link |
01:05:01.600
just sort of asking for a set of vertices in a graph such that no two of them are adjacent,
link |
01:05:08.800
sort of the opposite of the clique problem.
link |
01:05:10.640
So we've seen that we can now express that as finding a set of terms one in each clause
link |
01:05:32.080
without picking both the variable and the negation of that variable because if the variable is
link |
01:05:40.480
assigned the truth value, the negated variable has to have the opposite truth value. And so
link |
01:05:47.600
we can construct the graph where the vertices are the terms in all of the clauses and you have
link |
01:05:58.880
an edge between two occurrences of terms, either if they're both in the same clause
link |
01:06:19.440
because you're only picking one element from each clause and also an edge between them if they
link |
01:06:26.240
represent opposite values of the same variable because you can't make a variable both true
link |
01:06:31.920
and false. And so you get a graph where you have all of these occurrences of variables,
link |
01:06:38.000
you have edges, which mean that you're not allowed to choose both ends of the edge,
link |
01:06:44.720
either because they're in the same clause or they're negations of one another.
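The construction just described can be written out directly: one vertex per occurrence of a literal, an edge between two occurrences in the same clause, and an edge between opposite occurrences of the same variable; the formula is then satisfiable exactly when the graph has an independent set with one vertex per clause (the example formulas are my own):

```python
from itertools import combinations

def build_graph(clauses):
    """One vertex per literal occurrence: (clause_index, variable, sign)."""
    vertices = [(i, v, sign) for i, clause in enumerate(clauses)
                for v, sign in clause]
    edges = set()
    for a, b in combinations(vertices, 2):
        same_clause = a[0] == b[0]
        opposite_literals = a[1] == b[1] and a[2] != b[2]
        if same_clause or opposite_literals:
            edges.add(frozenset((a, b)))
    return vertices, edges

def has_independent_set(vertices, edges, k):
    # Brute force: some k vertices with no edge between any pair.
    return any(all(frozenset(p) not in edges for p in combinations(pick, 2))
               for pick in combinations(vertices, k))

# Satisfiable formula: (A or B) and (not A or B)
clauses = [[('A', True), ('B', True)], [('A', False), ('B', True)]]
V, E = build_graph(clauses)
print(has_independent_set(V, E, len(clauses)))  # True
```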
link |
01:06:49.680
That's a really powerful idea that you can take a graph and connect it to a logic equation
link |
01:07:03.840
and do that mapping for all possible formulations of a particular problem on a graph.
link |
01:07:09.840
I mean, that still is hard for me to believe. That's possible. What do you make of that?
link |
01:07:22.160
There's such a union of, there's such a friendship among all these problems across
link |
01:07:28.720
that somehow are akin to combinatorial algorithms that they're all somehow related.
link |
01:07:35.920
I know it can be proven, but what do you make of it that that's true?
link |
01:07:43.040
Well, that they just have the same expressive power. You can take any one of them and
link |
01:07:50.240
translate it into the terms of the other. The fact that they have the same expressive power
link |
01:07:55.520
also somehow means that they can be translatable. Right. And what I did in the 1971 paper was to
link |
01:08:03.920
take 21 fundamental problems, commonly occurring problems of packing, covering, matching, and so
link |
01:08:13.760
forth, lying in the class NP and show that the satisfiability problem can be reexpressed as any
link |
01:08:23.600
of those, that any of those have the same expressive power.
link |
01:08:28.560
And that was like throwing down the gauntlet of saying there's probably many more problems like
link |
01:08:35.280
this, but that's just saying that, look, they're all the same. They're all the same,
link |
01:08:41.040
but not exactly. They're all the same in terms of whether they are rich enough to express any of
link |
01:08:50.640
the others. But that doesn't mean that they have the same computational complexity. But what we
link |
01:08:58.880
can say is that either all of these problems or none of them are solvable in polynomial time.
link |
01:09:05.600
Yes, so where do NP complete and NP hard fit in as classes?
link |
01:09:10.880
Oh, that's just a small technicality. So when we're talking about decision problems,
link |
01:09:16.480
that means that the answer is just yes or no. There is a clique of size 15 or there's not a
link |
01:09:23.360
clique of size 15. On the other hand, an optimization problem would be asking, find the largest
link |
01:09:31.120
clique. The answer would not be yes or no, it would be 15. So when you're asking for the,
link |
01:09:40.320
when you're putting a valuation on the different solutions and you're asking for the one with the
link |
01:09:45.600
highest valuation, that's an optimization problem. And there's a very close affinity between the
link |
01:09:51.360
two kinds of problems. But the counterpart of being the hardest decision problem, the hardest yes,
link |
01:10:00.960
no problem, the counterpart of that is to minimize or maximize an objective function. And so a problem
link |
01:10:11.920
that's hardest in the class when viewed in terms of optimization, those are called NP hard, rather
link |
01:10:20.400
than NP complete. And NP complete is for decision problems.
link |
01:10:26.080
So if somebody shows that P equals NP,
link |
01:10:35.520
what do you think that proof will look like if you were to put on yourself, if it's possible to show
link |
01:10:42.800
that as a proof or to demonstrate an algorithm? All I can say is that it will involve concepts
link |
01:10:52.080
that we do not now have and approaches that we don't have. Do you think those concepts are out
link |
01:10:57.680
there in terms of inside complexity theory, inside of computational analysis of algorithms?
link |
01:11:04.640
Do you think there are concepts that are totally outside of the box that we haven't considered yet?
link |
01:11:09.040
I think that if there is a proof that P is equal to NP or that P is not equal to NP,
link |
01:11:14.640
it'll depend on concepts that are now outside the box.
link |
01:11:22.160
Now, if that's shown, either way, P equals NP or P not, well, actually P equals NP, what impact,
link |
01:11:30.800
you kind of mentioned it a little bit, but can you linger on it, what kind of impact would it have
link |
01:11:36.640
on theoretical computer science and perhaps software based systems in general?
link |
01:11:42.080
Well, I think it would have enormous impact on the world in either case. If P is unequal to NP,
link |
01:11:50.560
which is what we expect, then we know that for the great majority of the combinatorial problems
link |
01:11:58.160
that come up, since they're known to be NP complete, we're not going to be able to solve them by
link |
01:12:05.680
efficient algorithms. However, there's a little bit of hope in that it may be that we can solve
link |
01:12:14.720
most instances. All we know is that if a problem is not in P, then it can't be solved efficiently
link |
01:12:21.520
on all instances. But basically, if we find that P is unequal to NP, it will mean that we
link |
01:12:33.520
can't expect always to get the optimal solutions to these problems. And we have to depend on
link |
01:12:40.160
heuristics that perhaps work most of the time or give us good approximate solutions, but not always.
link |
01:12:48.320
So we would turn our eye towards the heuristics with a little bit more acceptance and comfort
link |
01:12:55.360
on our hearts. Exactly. Okay, so let me ask a romanticized question. What to you is one of the
link |
01:13:04.000
most or the most beautiful combinatorial algorithm in your own life or just in general in the field
link |
01:13:10.720
that you've ever come across or have developed yourself? I like the stable matching problem,
link |
01:13:17.600
or the stable marriage problem very much. What's the stable matching problem? Imagine that you
link |
01:13:28.000
want to marry off n boys with n girls. And each boy has an ordered list of his preferences
link |
01:13:40.960
among the girls, his first choice, his second choice, through his nth choice. And each girl
link |
01:13:52.720
also has an ordering of the boys, her first choice, second choice, and so on. And we'll say
link |
01:13:59.680
that a one to one matching of the boys with the girls is stable if there are
link |
01:14:11.920
no two couples in the matching such that the boy in the first couple
link |
01:14:18.480
prefers the girl in the second couple to his mate and she prefers that boy to her current mate. In
link |
01:14:26.960
other words, the matching is stable if there is no pair who want to run away with each other
link |
01:14:35.360
leaving their partners behind. Gosh, yeah. Actually, this is relevant to matching
link |
01:14:48.400
residents with hospitals and some other real life problems, although not quite in the form that I
link |
01:14:54.800
describe. So it turns out that, for any set of preferences, a stable
link |
01:15:03.200
matching exists. And moreover, it can be computed by a simple algorithm in which each boy starts
link |
01:15:17.440
making proposals to girls. And if the girl receives a proposal, she accepts it tentatively,
link |
01:15:25.760
but she can drop it later if she gets a better proposal
link |
01:15:34.320
from her point of view. And the boys start going down their lists proposing to their first,
link |
01:15:40.080
second, third choices until stopping when a proposal is accepted. But the girls,
link |
01:15:51.040
meanwhile, are watching the proposals that are coming into them and the girl will drop her
link |
01:15:57.440
current partner if she gets a better proposal. And the boys never go back through the list?
link |
01:16:06.160
They never go back. Yeah. So once they've been denied, they don't try again because the girls
link |
01:16:16.720
are always improving their status as they get more, as they receive better and better proposals.
link |
01:16:22.640
The boys are going down their list starting with their top preferences. And one can prove that
link |
01:16:36.240
the process will come to an end where everybody will get matched with somebody and you won't have
link |
01:16:45.200
any pairs that want to abscond from each other.
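The proposal process described here is the Gale-Shapley algorithm. A compact sketch (the preference lists are my own toy example):

```python
def stable_matching(boy_prefs, girl_prefs):
    """Boys propose down their lists; each girl tentatively holds the
    best proposal she has seen so far, dropping a partner for a better offer."""
    girl_rank = {g: {b: r for r, b in enumerate(prefs)}
                 for g, prefs in girl_prefs.items()}
    next_choice = {b: 0 for b in boy_prefs}  # position in each boy's list
    engaged_to = {}                          # girl -> boy
    free_boys = list(boy_prefs)
    while free_boys:
        boy = free_boys.pop()
        girl = boy_prefs[boy][next_choice[boy]]
        next_choice[boy] += 1                # boys never go back up the list
        current = engaged_to.get(girl)
        if current is None:
            engaged_to[girl] = boy           # tentative acceptance
        elif girl_rank[girl][boy] < girl_rank[girl][current]:
            engaged_to[girl] = boy           # the girl trades up
            free_boys.append(current)
        else:
            free_boys.append(boy)            # rejected: try the next choice
    return {b: g for g, b in engaged_to.items()}

boys = {'x': ['a', 'b'], 'y': ['a', 'b']}
girls = {'a': ['y', 'x'], 'b': ['x', 'y']}
print(stable_matching(boys, girls))
```

The loop terminates because every proposal advances some boy's pointer down his list, and the resulting matching is stable, with the proposing side doing at least as well as in any other stable matching, as discussed below.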
link |
01:16:50.160
Do you find that proof or the algorithm itself beautiful or is it the fact that with the simplicity
link |
01:16:56.560
of just the two marching, I mean, the simplicity of the underlying rule of the algorithm,
link |
01:17:02.400
is that the beautiful part? Both, I would say. And you also have the observation that
link |
01:17:10.160
you might ask, who is better off the boys who are doing the proposing or the girls who are
link |
01:17:15.520
reacting to proposals? And it turns out that it's the boys who are doing the best,
link |
01:17:22.720
in that each boy is doing at least as well as he could do in any other stable matching.
link |
01:17:30.400
So there's a sort of lesson for the boys that you should go out and be proactive and make those
link |
01:17:37.040
proposals, go for broke. I don't know if this is directly mappable philosophically to our society,
link |
01:17:45.040
but certainly seems like a compelling notion. And like you said, there's probably a lot of
link |
01:17:51.440
actual real world problems that this could be mapped to.
link |
01:17:54.480
Yeah, well, you get complications. For example, what happens when a husband and wife want to
link |
01:18:01.680
be assigned to the same hospital? So you have to take those constraints into account. And then
link |
01:18:10.720
the problem becomes NP hard. Why is it a problem for the husband and wife to be assigned to the
link |
01:18:18.640
same hospital? No, it's desirable. Or at least go to the same city. So if you're assigning
link |
01:18:27.920
residents to hospitals. And then you have some preferences for the husband and wife or for the
link |
01:18:33.920
hospitals. The residents have their own preferences. Residents, both male and female, have their own
link |
01:18:42.080
preferences. The hospitals have their preferences. But if resident A, the boy, is going to Philadelphia,
link |
01:18:56.000
then you'd like his wife also to be assigned to a hospital in Philadelphia. So...
link |
01:19:04.320
Which step makes it an NP hard problem that you mentioned?
link |
01:19:07.840
The fact that you have this additional constraint. That it's not just the preferences of individuals,
link |
01:19:14.880
but the fact that the two partners to a marriage have to be assigned to the same place.
link |
01:19:22.080
I'm being a little dense. The perfect matching? No, not the... The stable matching
link |
01:19:32.320
is what you referred to. That's when two partners are trying to...
link |
01:19:36.000
Okay. What's confusing you is that in the first
link |
01:19:39.200
interpretation of the problem, I had boys matching with girls.
link |
01:19:44.080
In the second interpretation, you have humans matching with institutions.
link |
01:19:49.440
And there's a coupling within the, gotcha, within the humans. Any added little
link |
01:19:58.080
constraint will make it an NP hard problem. Well, yeah.
link |
01:20:03.280
Okay. By the way, the algorithm you mentioned wasn't one of yours?
link |
01:20:07.600
No, no, that was due to Gale and Shapley. And my friend David Gale passed away before he could
link |
01:20:16.240
get part of the Nobel Prize, but his partner, Shapley, shared in the Nobel Prize with somebody
link |
01:20:23.200
else for economics. For ideas stemming from the stable matching idea.
link |
01:20:32.480
So you've also developed yourself some elegant, beautiful algorithms. Again, picking your children.
link |
01:20:39.600
So the Rabin-Karp algorithm for string searching, pattern matching,
link |
01:20:44.320
the Edmonds-Karp algorithm for max flows we mentioned, the Hopcroft-Karp algorithm for finding
link |
01:20:48.960
maximum cardinality matchings in bipartite graphs. Are there ones that stand out to you,
link |
01:20:55.360
ones you're most proud of, or just whether it's beauty, elegance, or just being the right discovery
link |
01:21:05.680
development in your life that you're especially proud of?
link |
01:21:08.480
I like the Rabin-Karp algorithm because it illustrates the power of randomization.
link |
01:21:17.520
So the problem there is to decide whether a given long string of symbols from some
link |
01:21:35.040
alphabet contains a given word, whether a particular word occurs within some very much longer word.
link |
01:21:45.280
And so the idea of the algorithm is to associate with the word that we're looking for,
link |
01:21:56.000
a fingerprint, some number, or some combinatorial object that
link |
01:22:07.760
describes that word, and then to look for an occurrence of that same fingerprint as you
link |
01:22:13.760
slide along the longer word. And what we do is we associate with each word a number. So first
link |
01:22:26.880
of all, we think of the letters that occur in a word as the digits of, let's say,
link |
01:22:32.880
decimal or whatever base here, whatever number of different symbols there are in the alphabet.
link |
01:22:41.840
That's the base of the numbers, yeah. Right. So every word can then be thought of as a number
link |
01:22:48.080
with the letters being the digits of that number. And then we pick a random prime number in a certain
link |
01:22:56.880
range and we take that word viewed as a number and take the remainder on dividing that number by
link |
01:23:10.320
the prime. So coming up with a nice hash function. It's a kind of hash function. Yeah.
link |
01:23:17.920
It gives you a little shortcut for that particular word. Yeah. So that's the,
link |
01:23:25.040
that's the... It's very different from other algorithms of its kind that try to do
link |
01:23:31.760
search, string matching. Yeah. Which usually are combinatorial and don't involve the idea of
link |
01:23:40.400
taking a random fingerprint. Yes. And doing the fingerprinting has two advantages. One is that
link |
01:23:48.560
as we slide along the long word, digit by digit, we can, we keep a window of a certain size,
link |
01:23:57.280
the size of the word we're looking for. And we compute the fingerprint of every
link |
01:24:05.200
stretch of that length. And it turns out that just a couple of arithmetic
link |
01:24:10.400
operations will take you from the fingerprint of one part to what you get when you slide over by
link |
01:24:17.360
one position. So the computation of all the fingerprints is simple. And secondly, it's
link |
01:24:29.520
unlikely if the prime is chosen randomly from a certain range that you will get
link |
01:24:36.000
two of the segments in question having the same fingerprint. And so there's a small probability
link |
01:24:43.280
of error which can be checked after the fact. And also the ease of doing the computation
link |
01:24:48.560
because you're working with these fingerprints, which are remainders modulo some big prime.
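A compact sketch of this fingerprinting scheme in Python. For simplicity this version fixes one large prime rather than choosing it at random, which is where the algorithm's probabilistic guarantee actually comes from; the sliding-window update uses just the couple of arithmetic operations described above:

```python
def rabin_karp(text, pattern, base=256, prime=1_000_000_007):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    # high = base^(m-1) mod prime, used to strip the window's leading digit
    high = pow(base, m - 1, prime)
    p_hash = 0   # fingerprint of the pattern
    t_hash = 0   # fingerprint of the current window of text
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % prime
        t_hash = (t_hash * base + ord(text[i])) % prime
    for i in range(n - m + 1):
        # fingerprints can collide, so verify before reporting a match
        if t_hash == p_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # slide the window by one: drop text[i], append text[i+m]
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % prime
    return -1
```

Each slide costs a constant number of arithmetic operations, so scanning the whole long word takes time linear in its length, plus the rare verification step.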
link |
01:24:55.520
So that's the magical thing about randomized algorithms is that if you add a little bit
link |
01:25:00.480
of randomness, it somehow allows you to take a pretty naive approach, a simple looking approach
link |
01:25:07.120
and allow it to run extremely well. So can you maybe take a step back and say what is a randomized
link |
01:25:15.280
algorithm, this category of algorithms? Well, it's just the ability to draw a random number from
link |
01:25:23.680
such, from some range or to associate a random number with some object or to draw
link |
01:25:33.040
a random element from some set. So another example is very simple: if we're conducting a presidential
link |
01:25:44.480
election and we would like to pick the winner. In principle, we could draw a random sample of
link |
01:25:57.360
all of the voters in the country. And if it was of substantial size, say a few thousand,
link |
01:26:05.520
then the most popular candidate in that group would be very likely to be the correct choice
link |
01:26:12.080
that would come out of counting all the millions of votes. And of course, we can't do this because
link |
01:26:17.680
first of all, everybody has to feel that his or her vote counted. And secondly, we can't really do
link |
01:26:23.280
a purely random sample from that population. And I guess thirdly, there could be a tie in which case
link |
01:26:31.840
we wouldn't have a significant difference between two candidates.
link |
01:26:36.160
But those things aside, if you didn't have all that messiness of human beings,
link |
01:26:40.320
you could prove that that kind of random picking would solve the problem with a very low probability
link |
01:26:49.840
of error. Another example is testing whether a number is prime. So if I want to test whether
link |
01:26:58.560
17 is prime, I could pick any number between 1 and 17 and raise it to the 16th power,
link |
01:27:10.480
modulo 17, and I should get back 1. That's a famous formula due to Fermat
link |
01:27:18.720
about what's called Fermat's little theorem, that if you take any A, any number A in the range
link |
01:27:29.200
1 through n minus 1 and raise it to the n minus 1st power, modulo n, you'll get back 1
link |
01:27:39.440
if n is prime. So if you don't get back 1, that's a proof that the number is not prime.
link |
01:27:52.320
And you can show that, suitably defined, the probability that you will get
link |
01:28:03.360
a violation of Fermat's result is very high, and so this gives
link |
01:28:14.960
you a way of rapidly proving that a number is not prime. It's a little more complicated than that
link |
01:28:22.720
because there are certain values of n where something a little more elaborate has to be done,
link |
01:28:28.480
but that's the basic idea. You're taking an identity that holds for primes, and therefore,
link |
01:28:36.000
if it ever fails on any instance for a nonprime n, you know that the number is not prime.
link |
01:28:43.280
It's a quick choice, a fast choice, fast proof that a number is not prime.
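A hedged sketch of that Fermat test in Python. The "more elaborate" handling alluded to above is deliberately omitted: certain composite values of n (the Carmichael numbers, such as 561) can fool the plain Fermat test, which is why practical testers use the slightly more involved Miller-Rabin variant.

```python
import random

def probably_prime(n, trials=20):
    """Fermat test sketch: if a^(n-1) mod n != 1 for some a, n is surely composite.

    Passing all trials is strong evidence of primality, not a proof.
    """
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)      # random base in [2, n-2]
        if pow(a, n - 1, n) != 1:           # Fermat's little theorem violated
            return False                    # certain: n is not prime
    return True                             # probably prime
```

Note the one-sided guarantee: a "not prime" answer is always correct, while a "prime" answer carries a small probability of error that shrinks with each additional trial.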
link |
01:28:48.640
Can you maybe elaborate a little bit more what's your intuition why randomness works so well
link |
01:28:54.080
and results in such simple algorithms? Well, the example of conducting an election
link |
01:29:00.720
where you could take, in theory, you could take a sample and depend on the validity of the sample
link |
01:29:06.960
to really represent the whole is just the basic fact of statistics, which gives a lot of opportunities.
link |
01:29:14.480
I actually exploited that sort of random sampling idea in designing an algorithm for
link |
01:29:27.360
counting the number of solutions that satisfy a particular formula in propositional logic.
link |
01:29:36.480
In particular, so some version of the satisfiability problem? A version of the satisfiability problem.
link |
01:29:47.760
Is there some interesting insight that you want to elaborate on? What's some aspect of
link |
01:29:51.920
that algorithm that might be useful to describe? You have a collection of
link |
01:30:01.520
formulas and you want to count the number of solutions that satisfy at least one of the formulas.
link |
01:30:20.320
And you can count the number of solutions that satisfy any particular one of the formulas,
link |
01:30:27.120
but you have to account for the fact that that solution might be counted many times if it solves
link |
01:30:36.720
more than one of the formulas. And so what you do is you sample from the formulas according to
link |
01:30:48.000
the number of solutions that satisfy each individual one. In that way, you draw a random
link |
01:30:54.720
solution, but then you correct by looking at the number of formulas that satisfy that random solution
link |
01:31:04.400
and don't double count. So you can think of it this way. So you have a matrix of zeros and ones
link |
01:31:15.840
and you want to know how many columns of that matrix contain at least one one.
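The sampling estimator described next can be sketched in Python, assuming the matrix is given as a list of 0/1 rows: draw a one-entry with probability proportional to its row's count of ones, and score a hit only when that row is the earliest row with a one in that column, so each covered column contributes exactly once in expectation.

```python
import random

def estimate_covered_columns(matrix, samples=10_000):
    """Estimate the number of columns containing at least one 1."""
    row_ones = [sum(row) for row in matrix]
    total = sum(row_ones)                 # total number of (row, one-entry) pairs
    if total == 0:
        return 0.0
    hits = 0
    for _ in range(samples):
        # pick a row weighted by its count of ones
        r = random.choices(range(len(matrix)), weights=row_ones)[0]
        # pick a random column holding a 1 in that row
        c = random.choice([j for j, v in enumerate(matrix[r]) if v])
        # count it only if r is the earliest row covering column c
        first = min(i for i in range(len(matrix)) if matrix[i][c])
        hits += (first == r)
    # hits/samples estimates covered/total, so scale back up by total
    return total * hits / samples
```

The expected fraction of hits is exactly (covered columns) / (total ones), so multiplying the hit rate by the total gives an unbiased estimate whose accuracy improves with the number of samples.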
link |
01:31:20.240
And you can count in each row how many ones there are. So what you can do is draw from
link |
01:31:28.640
the rows according to the number of ones. If a row has more ones, it gets drawn more frequently.
link |
01:31:35.840
But then, if you draw from that row, you have to go up the column and look at
link |
01:31:42.480
where that same one is repeated in different rows and only count it as a success or a hit
link |
01:31:51.120
if it's the earliest row that contains the one. And that gives you a robust statistical estimate
link |
01:32:00.160
of the total number of columns that contain at least one of the ones. So that is an example of
link |
01:32:08.000
the same principle that was used in studying random sampling. Another viewpoint is that
link |
01:32:17.440
if you have a phenomenon that occurs almost all the time, and you're looking for an
link |
01:32:26.720
occurrence, then a random sample is most likely to hit one. A random
link |
01:32:32.960
choice is likely to work. So that comes up in solving identities, solving algebraic
link |
01:32:41.920
identities. You get two formulas that may look very different. You want to know if they're
link |
01:32:47.200
really identical. What you can do is just pick a random value and evaluate the formulas at that
link |
01:32:55.440
value and see if they agree. And you depend on the fact that if the formulas are distinct,
link |
01:33:04.080
then they're going to disagree a lot. And so, therefore, a random choice will exhibit the
link |
01:33:09.600
disagreement. If there are many ways for the two to disagree and you only need to find one
link |
01:33:17.600
disagreement, then random choice is likely to yield it.
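A minimal sketch of that identity-testing idea, assuming the two formulas are given as integer functions of one variable (this is the style of argument behind the Schwartz-Zippel lemma, though that name isn't used in the conversation):

```python
import random

def probably_identical(f, g, trials=30, lo=-10**6, hi=10**6):
    """Randomized check that two polynomial formulas agree.

    If f and g are distinct polynomials of modest degree, they can agree
    on at most that many points, so a random evaluation in a large range
    exposes the difference with high probability.
    """
    for _ in range(trials):
        x = random.randint(lo, hi)
        if f(x) != g(x):
            return False       # certain: the formulas differ
    return True                # probably identical
```

As with the primality test, the guarantee is one-sided: a disagreement is conclusive proof the formulas differ, while agreement on every trial is only strong evidence that they are the same.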
link |
01:33:21.440
And in general, so we've just talked about randomized algorithms, but we can look at
link |
01:33:26.720
the probabilistic analysis of algorithms. And that gives us an opportunity to step back and,
link |
01:33:32.640
as we said, everything we've been talking about is worst case analysis.
link |
01:33:37.840
Can you maybe comment on the usefulness and the power of worst case analysis versus
link |
01:33:46.240
best case analysis, average case probabilistic. How do we think about the future of theoretical
link |
01:33:53.360
computer science, computer science, and the kind of analysis we do of algorithms? Does
link |
01:33:59.200
worst case analysis still have a place, an important place, or do we want to try to move
link |
01:34:04.160
forward towards kind of average case analysis? And what are the challenges there?
link |
01:34:08.640
So, if worst case analysis shows that an algorithm is always good, that's fine. If worst case analysis
link |
01:34:21.760
is used to show that the solution is not always good,
link |
01:34:28.960
then you have to step back and do something else to ask, how often will you get a good solution?
link |
01:34:34.640
Just to pause on that for a second. That's so beautifully put, because I think we
link |
01:34:41.200
tend to judge algorithms. We throw them in the trash the moment their worst case is shown to be bad.
link |
01:34:47.840
Right. And that's unfortunate. I think a good example is going back to the satisfiability
link |
01:34:58.080
problem. There are very powerful programs called SAT solvers, which in practice,
link |
01:35:06.560
fairly reliably solve instances with many millions of variables that arise in
link |
01:35:12.880
digital design or in proving programs correct and other applications.
link |
01:35:20.080
And so, in many application areas, even though satisfiability, as we've already
link |
01:35:27.120
discussed, is NP complete, the SAT solvers will work so well that the people in that discipline
link |
01:35:37.120
tend to think of satisfiability as an easy problem. So, in other words, just for some
link |
01:35:44.720
reason that we don't entirely understand, the instances that people formulate in designing
link |
01:35:51.280
digital circuits or other applications are such that satisfiability is not hard to check.
link |
01:36:04.480
And even searching for a satisfying solution can be done efficiently in practice.
link |
01:36:11.520
And there are many examples. For example, we talked about the traveling salesman problem.
link |
01:36:17.040
So, just to refresh our memories, the problem is you've got a set of cities. You have
link |
01:36:24.480
pairwise distances between cities. And you want to find a tour through all the cities that
link |
01:36:31.840
minimizes the total cost of all the edges traversed, all the trips between cities.
link |
01:36:38.800
The problem is NP hard, but people using integer programming codes together with some
link |
01:36:48.720
other mathematical tricks can solve geometric instances of the problem where the cities are,
link |
01:36:58.080
let's say, points in the plane and get optimal solutions to problems with tens of thousands
link |
01:37:04.320
of cities. Actually, it'll take a few computer months to solve a problem of that size, but for
link |
01:37:10.800
problems of size 1,000 or two, it'll rapidly get optimal solutions, provably optimal solutions,
link |
01:37:19.120
even though, again, we know that it's unlikely that the traveling salesman problem can be
link |
01:37:26.160
solved in polynomial time. Are there methodologies like rigorous
link |
01:37:31.760
systematic methodologies for, you said in practice. In practice, this algorithm sounds
link |
01:37:39.440
pretty good. Are there systematic ways of saying in practice, this one's pretty good?
link |
01:37:43.680
So, in other words, average case analysis. Or you've also mentioned that average case
link |
01:37:48.960
kind of requires you to understand what the typical cases, typical instances, and that
link |
01:37:53.920
might be really difficult. That's very difficult. So, after I did my original work on showing all
link |
01:38:03.680
these problems to be NP complete, I looked around for a way to shed some positive light on
link |
01:38:11.600
combinatorial algorithms. And what I tried to do was to study problems' behavior on the average,
link |
01:38:21.760
or with high probability. But I had to make some assumptions about what's the probability space,
link |
01:38:29.600
what's the sample space, what do we mean by typical problems. That's very hard to say. So,
link |
01:38:35.600
I took the easy way out and made some very simplistic assumptions. So, I assumed, for example,
link |
01:38:41.840
that if we were generating a graph with a certain number of vertices and edges,
link |
01:38:46.480
then we would generate the graph by simply choosing one edge at a time at random until we got the
link |
01:38:54.160
right number of edges. That's a particular model of random graphs that has been studied
link |
01:38:59.840
mathematically a lot. And within that model, I could prove all kinds of wonderful things,
link |
01:39:06.640
I and others who also worked on this. So, we could show that we know exactly how many edges
link |
01:39:15.040
there have to be in order for there to be a so-called Hamiltonian circuit. That's a cycle that
link |
01:39:26.480
visits each vertex exactly once. We know that if the number of edges is a little bit more than n log n,
link |
01:39:37.360
where n is the number of vertices, then such a cycle is very likely to exist,
link |
01:39:43.120
and we can give a heuristic that will find it with high probability. And the community
link |
01:39:53.360
in which I was working got a lot of results along these lines. But the field tended to be rather
link |
01:40:03.280
lukewarm about accepting these results as meaningful because we were making such a
link |
01:40:08.640
simplistic assumption about the kinds of graphs that we would be dealing with. So, we could show
link |
01:40:14.480
all kinds of wonderful things. It was a great playground. I enjoyed doing it. But after a while,
link |
01:40:20.000
I concluded that it didn't have a lot of bite in terms of the practical application.
link |
01:40:30.960
Okay. So, it was too much in the world of toy problems. Is there a way to find
link |
01:40:41.600
nice representative real world impactful instances of a problem on which demonstrate
link |
01:40:47.440
that an algorithm is good? So, what the machine learning world
link |
01:40:52.800
at its best tries to do is find a data set from the real world and show the performance, and all
link |
01:40:59.600
the conferences are focused on beating the performance on that real world data set.
link |
01:41:07.040
Is there an equivalent in complexity analysis?
link |
01:41:11.680
Not really. Don Knuth started to collect examples of graphs coming from various places. So, he
link |
01:41:21.840
would have a whole zoo of different graphs that he could choose from and he could study the
link |
01:41:28.640
performance of algorithms on different types of graphs. But there, it's really important and
link |
01:41:36.480
compelling to be able to define a class of graphs. So, the actual act of defining a class of graphs
link |
01:41:43.920
that you're interested in, it seems to be a non trivial step before talking about instances that
link |
01:41:49.440
we should care about in the real world. Yeah. There's nothing available there that would be
link |
01:41:57.040
analogous to the training set for supervised learning where you sort of assume that the world
link |
01:42:03.760
has given you a bunch of examples to work with. We don't really have that for problems,
link |
01:42:14.480
for combinatorial problems on graphs and networks.
link |
01:42:18.160
You know, there's been a huge growth, a big growth of data sets available. Do you think,
link |
01:42:24.320
and I might be contradicting my own question while saying it,
link |
01:42:30.560
but will there be some aspect, an empirical aspect of theoretical computer science,
link |
01:42:36.880
which, given that these data sets are huge, will start using them for analysis?
link |
01:42:44.000
Sort of, you know, if you want to say something about a graph algorithm, you might take
link |
01:42:49.600
a social network like Facebook, look at subgraphs of that, and prove something about the
link |
01:42:56.960
Facebook graph and be respected. And at the same time, be respected in the theoretical computer
link |
01:43:02.720
science community. That hasn't been achieved yet, I'm afraid. Is that P equals NP? Is that
link |
01:43:09.280
impossible? Is it impossible to publish a successful paper in the theoretical computer
link |
01:43:15.520
science community that shows some performance on a real world data set? Or is that really just
link |
01:43:24.160
those two different worlds? They haven't really come together. I would say that there is a field
link |
01:43:31.680
of experimental algorithmics where people, sometimes they're given some family of examples.
link |
01:43:38.800
Sometimes they just generate them at random and they report on performance. But there's no
link |
01:43:48.720
convincing evidence that the sample is representative of anything at all.
link |
01:43:57.360
So let me ask, in terms of breakthroughs and open problems, what are the most
link |
01:44:03.520
compelling open problems to you? And what possible breakthroughs do you see in the near term
link |
01:44:08.160
in terms of theoretical computer science? Well, there are all kinds of relationships
link |
01:44:15.360
among complexity classes that can be studied. Just to mention one thing, I wrote a paper with
link |
01:44:23.520
Richard Lipton in 1979 where we asked the following question. If you take a combinatorial problem
link |
01:44:38.400
in NP, let's say, and you pick the size of the problem. Say it's a traveling salesman problem
link |
01:44:55.360
of size 52. And you ask, could you get an efficient, a small Boolean circuit tailored for
link |
01:45:06.960
that size, 52, where you could feed the edges of the graph in as Boolean inputs and get as an
link |
01:45:16.080
output the question of whether or not there's a tour of a certain length. And that would,
link |
01:45:22.000
in other words, briefly, what you would say in that case is that the problem has small circuits,
link |
01:45:28.560
polynomial size circuits. Now, we know that if P is equal to NP, then, in fact, these problems
link |
01:45:37.440
will have small circuits. But what about the converse? Could a problem have small circuits,
link |
01:45:43.120
meaning that an algorithm tailored to any particular size could work well,
link |
01:45:49.520
and yet not be a polynomial time algorithm? That is, you couldn't write it as a single
link |
01:45:54.000
uniform algorithm good for all sizes. Just to clarify, small circuits for a problem of particular
link |
01:46:00.960
size or even further constraint, small circuit for a particular... No, for all the inputs of that
link |
01:46:08.880
size. Of that size. Is that a trivial problem for a particular instance? So coming up with an automated
link |
01:46:15.280
way of coming up with a circuit, I guess that's... That would be hard. But there's the existential
link |
01:46:24.640
question. Everybody talks nowadays about existential questions, existential challenges.
link |
01:46:32.960
Yeah. You could ask the question, does the Hamiltonian circuit problem have
link |
01:46:46.320
a small circuit for every size, for each size, a different small circuit?
link |
01:46:53.120
In other words, could you tailor solutions depending on the size and get polynomial size?
link |
01:47:00.480
Even if P is not equal to NP. Right. That would be fascinating if that's true.
link |
01:47:08.640
Yeah. What we proved is that if that were possible, then something strange would happen
link |
01:47:16.480
in complexity theory, some high level class which I could briefly describe, something strange would
link |
01:47:27.680
happen. So I'll take a stab at describing what I mean. Sure. Let's go there.
link |
01:47:34.000
So we have to define this hierarchy in which the first level of the hierarchy is P
link |
01:47:41.360
and the second level is NP. And what is NP? NP involves statements of the form
link |
01:47:47.520
there exists a something, such that something holds. So for example, there exists a coloring
link |
01:47:59.840
such that a graph can be colored with only that number of colors or there exists a Hamiltonian
link |
01:48:08.240
circuit. There's a statement about this graph. Yeah. So the NP deals with statements of that
link |
01:48:22.480
kind, that there exists a solution. Now, you could imagine a more complicated
link |
01:48:29.280
expression which says for all x, there exists a y such that some proposition holds involving
link |
01:48:44.960
both x and y. So that would say, for example, in game theory, for all strategies for the first
link |
01:48:54.400
player, there exists a strategy for the second player such that the first player wins. That would
link |
01:49:00.960
be at the second level of the hierarchy. The third level would be there exists an A such
link |
01:49:06.240
that for all B, there exists a C, such that something holds. And you can imagine going higher and higher
link |
01:49:11.200
in the hierarchy. And you'd expect that the complexity classes that correspond to those
link |
01:49:20.000
different cases would get bigger and bigger. Sorry, they'd get harder and harder to solve.
link |
01:49:35.360
And what Lipton and I showed was that if NP had small circuits, then this hierarchy would collapse
link |
01:49:43.840
down to the second level. In other words, you wouldn't get any more mileage by complicating
link |
01:49:49.360
your expressions with three quantifiers or four quantifiers or any number.
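Stated compactly, the Karp-Lipton theorem being described is:

```latex
\[
  \mathrm{NP} \subseteq \mathrm{P/poly}
  \;\Longrightarrow\;
  \mathrm{PH} = \Sigma_2^{p}
\]
```

that is, if NP problems have polynomial-size circuits (one circuit per input size, not necessarily produced by a single uniform algorithm), then the whole hierarchy of alternating exists/for-all quantifiers collapses to its second level.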
link |
01:49:55.360
I'm not sure what to make of that exactly. Well, I think it would be evidence that NP doesn't have
link |
01:50:01.840
small circuits because something so bizarre would happen. But again, it's only evidence,
link |
01:50:08.480
not proof. Well, yeah, that's not even evidence because you're saying P is not equal to NP because
link |
01:50:17.440
something bizarre has to happen. I mean, that's proof by the lack of
link |
01:50:24.240
bizarreness in our science. But it seems like just the very notion of P equals NP would be
link |
01:50:32.560
bizarre. So any way you arrive at it, there's no way around it, you have to fight the dragon at some point.
link |
01:50:38.160
Yeah. Okay. Well, for whatever it's worth, that's what we proved.
link |
01:50:43.360
Awesome. So that's a potential space of interesting problems. Let me
link |
01:50:51.360
ask you about this other world of machine learning, of deep learning. What's your
link |
01:50:58.560
thoughts on the history and the current progress of machine learning field that's often progressed
link |
01:51:03.840
sort of separately as a space of ideas and space of people than the theoretical computer science
link |
01:51:10.800
or just even computer science world? Yeah. It's really very different from the theoretical computer
link |
01:51:17.040
science world because the results about algorithmic performance tend to be empirical. It's more akin
link |
01:51:26.320
to the world of SAT solvers where we observe that for formulas arising in practice, the solver does
link |
01:51:34.720
well. So it's of that type. We're moving into the empirical evaluation of algorithms.
link |
01:51:45.280
Now, it's clear that there have been huge successes in image processing, robotics,
link |
01:51:52.800
natural language processing a little less so. But across the spectrum of
link |
01:51:57.280
game playing is another one. There have been great successes. And one of those
link |
01:52:07.120
effects is that it's not too hard to become a millionaire if you can get a reputation in
link |
01:52:11.840
machine learning and there'll be all kinds of companies that will be willing to offer you
link |
01:52:16.080
the moon because they think that if they have AI at their disposal,
link |
01:52:23.520
then they can solve all kinds of problems. But there are limitations.
link |
01:52:34.960
One is that the solutions that you get to
link |
01:52:41.200
supervised learning problems through convolutional neural networks seem to perform
link |
01:52:54.160
amazingly well even for inputs that are outside the training set.
link |
01:53:03.040
But we don't have any theoretical understanding of why that's true.
link |
01:53:07.120
Secondly, the solutions, the networks that you get are very hard to understand and so very little
link |
01:53:17.520
insight comes out. So yeah, they may seem to work on your training set and you may be able to
link |
01:53:26.480
discover whether your photos occur in a different sample of inputs or not.
link |
01:53:33.760
But we don't really know what's going on. The features that distinguish
link |
01:53:40.480
the photographs or the objects are not easy to characterize.
link |
01:53:49.440
Well, it's interesting because you mentioned coming up with a small circuit
link |
01:53:53.680
to solve a particular size problem. It seems that neural networks are kind of small circuits.
link |
01:53:59.360
In a way, yeah. But they're not programs. Sort of like the things you've designed are algorithms,
link |
01:54:05.680
programs, algorithms. Neural networks aren't able to develop algorithms to solve a problem.
link |
01:54:14.400
Well, they are algorithms. It's just that they're...
link |
01:54:17.840
But sort of, yeah, it could be a semantic question, but there's not
link |
01:54:27.360
an algorithmic style manipulation of the input. Perhaps you could argue there is.
link |
01:54:35.360
Yeah, well, it feels a lot more like a function of the input.
link |
01:54:40.400
It's a function. It's a computable function.
link |
01:54:42.800
And once you have the network, you can simulate it on a given input and figure out the output.
link |
01:54:51.120
But if you're trying to recognize images, then you don't know what features of the image are really
link |
01:55:01.440
determining what the circuit is doing. The circuit is sort of very intricate and it's
link |
01:55:15.120
not clear that the simple characteristics that you're looking for, the edges of the objects or
link |
01:55:23.920
whatever they may be, are emerging from the structure of the circuit.
link |
01:55:29.360
Well, it's not clear to us humans, but it's clear to the circuit.
link |
01:55:32.880
Yeah, well, right. I mean, it's not clear to sort of the
link |
01:55:41.520
elephant how the human brain works, but it's clear to us humans, we can explain to each other
link |
01:55:48.160
our reasoning and that's why the cognitive science and psychology field exist.
link |
01:55:52.640
Maybe the whole thing of being explainable to humans is a little bit overrated.
link |
01:55:57.600
Oh, maybe, yeah. I guess you can say the same thing about our brain that when we perform
link |
01:56:04.960
acts of cognition, we have no idea how we do it really. We do, though. I mean, for
link |
01:56:12.000
at least for the visual system, the auditory system and so on, we do
link |
01:56:17.040
get some understanding of the principles that they operate under, but
link |
01:56:21.840
for many deeper cognitive tasks, we don't have that.
link |
01:56:25.520
That's right. Let me ask, you've also been doing work on bioinformatics.
link |
01:56:33.040
Does it amaze you that the fundamental building blocks, so if we take a step back
link |
01:56:38.000
and look at us humans, the building blocks used by evolution to build us intelligent
link |
01:56:43.520
human beings is all contained there in our DNA?
link |
01:56:46.880
It's amazing, and what's really amazing is that we are beginning to learn how to edit
link |
01:56:59.440
DNA, which is very, very, very fascinating. This ability to
link |
01:57:10.560
take a sequence, find it in the genome, and do something to it.
link |
01:57:19.120
That's really taking our biological systems towards the worlds of algorithms.
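The "find a sequence in the genome" step described above is, at its simplest, substring search over a string of bases. The genome and target below are tiny made-up examples; real genomes run to billions of bases and real editing tools tolerate mismatches:

```python
# Toy sketch of locating a target sequence in a genome: exact
# substring search over a DNA string. All data here is illustrative.

def find_sites(genome, target):
    """Return every position where target occurs in genome."""
    sites = []
    start = genome.find(target)
    while start != -1:
        sites.append(start)
        start = genome.find(target, start + 1)
    return sites

genome = "ACGTTGCAACGTACGTTG"
print(find_sites(genome, "ACGTTG"))  # → [0, 12]
```

The "do something to it" step, cutting or rewriting at each found site, is then an edit applied at those positions.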
link |
01:57:25.040
But it raises a lot of questions. You have to distinguish between doing it on an individual
link |
01:57:33.760
or doing it on somebody's germline, which means that all of their descendants will be affected.
link |
01:57:38.560
So, that's an ethical…
link |
01:57:42.160
Yeah, so it raises very severe ethical questions.
link |
01:57:50.480
Even doing it on individuals, there's a lot of hubris involved that you can assume that
link |
01:58:02.480
knocking out a particular gene is going to be beneficial because you don't know what the
link |
01:58:06.320
side effects are going to be. So, we have this wonderful new world of gene editing,
link |
01:58:20.080
which is very, very impressive, and it could be used in agriculture, it could be used in medicine
link |
01:58:30.000
in various ways, but very serious ethical problems arise.
link |
01:58:37.360
What are, to you, the most interesting places where algorithms…
link |
01:58:42.720
Sort of the ethical side is an exceptionally challenging thing that I think we're going to
link |
01:58:47.200
have to tackle with all of genetic engineering. But on the algorithmic side, there's a lot of
link |
01:58:54.080
benefit that's possible. So, are there areas where you see exciting possibilities for algorithms to
link |
01:59:01.120
help model, optimize, and study biological systems? Yeah, I mean, we can certainly analyze genomic data to
link |
01:59:12.560
figure out which genes are operative in the cell and under what conditions and which proteins affect
link |
01:59:20.240
one another, which proteins physically interact. We can sequence proteins and modify them.
link |
01:59:32.480
Is there some aspect of that that's a computer science problem,
link |
01:59:35.760
or is that still fundamentally a biology problem? Well, it's a big data,
link |
01:59:41.440
it's a statistical big data problem for sure. So, the biological data sets are increasing
link |
01:59:49.120
our ability to study our ancestry, to study the tendencies towards disease, to personalize
link |
02:00:03.280
treatment according to what's in our genomes and what tendencies for disease we have,
link |
02:00:10.240
to be able to predict what troubles might come upon us in the future and anticipate them,
link |
02:00:16.160
to understand, for a woman, whether her proclivity for breast cancer is strong
link |
02:00:32.480
enough that you would want to take action to avoid it. You dedicate your 1985 Turing Award
link |
02:00:40.560
lecture to the memory of your father. What's your fondest memory of your dad?
link |
02:00:52.960
Seeing him standing in front of a class at the blackboard drawing perfect circles
link |
02:00:59.280
by hand and showing his ability to attract the interest of the motley collection of
link |
02:01:13.520
eighth grade students that he was teaching. When did you get a chance to see him draw the perfect
link |
02:01:22.240
circles? On rare occasions, I would get a chance to sneak into his classroom and observe it.
link |
02:01:33.280
I think he was at his best in the classroom. I think he really came to life
link |
02:01:40.720
and had fun, not only teaching, but engaging in chit chat with the students and
link |
02:01:50.000
you know, ingratiating himself with the students. And what I inherited from that is
link |
02:01:58.720
a great desire to be a teacher. I retired recently and a lot of my former students came,
link |
02:02:08.240
students with whom I had done research or who had read my papers or who had been in my classes.
link |
02:02:14.400
And when they talked about me, they talked not about my 1979 paper or 1992 paper,
link |
02:02:29.520
but about what they came away with from my classes, not just the details, but the approach
link |
02:02:36.240
and the manner of teaching. And so I take pride that, at least in my early years as a
link |
02:02:45.280
faculty member at Berkeley, I was exemplary in preparing my lectures and I always came in
link |
02:02:55.440
prepared to the teeth and able, therefore, to deviate according to what happened in the class
link |
02:03:00.560
and to really, really provide a model for the students.
link |
02:03:08.640
So is there advice you could give out for others on how to be a good teacher? So preparation is
link |
02:03:17.760
one thing you've mentioned being exceptionally well prepared, but there are other things,
link |
02:03:21.360
pieces of advice that you can impart. Well, the top three would be preparation,
link |
02:03:26.480
preparation and preparation. Why is preparation so important, I guess?
link |
02:03:32.640
It's because it gives you the ease to deal with any situation that comes up in the classroom. And
link |
02:03:40.560
if you discover that you're not getting through one way, you can do it another way. If the students
link |
02:03:46.400
have questions, you can handle the questions. Ultimately, you're also feeling the crowd,
link |
02:03:54.160
the students of what they're struggling with, what they're picking up, just looking at them
link |
02:03:58.880
through the questions, but even just through their eyes. And because of the preparation,
link |
02:04:04.160
you can dance. You can dance. You can say it another way or give it another angle.
link |
02:04:11.520
Are there, in particular, ideas and algorithms of computer science that you find were big aha
link |
02:04:18.720
moments for students where they, for some reason, once they got it, it clicked for them and they
link |
02:04:23.920
fell in love with computer science? Or is it individual? Is it different for everybody?
link |
02:04:29.200
It's different for everybody. You have to work differently with students. Some of them just
link |
02:04:36.000
don't need much influence. They're just running with what they're doing and they just need an
link |
02:04:43.120
ear now and then. Others need a little prodding. Others need to be persuaded to collaborate
link |
02:04:49.840
among themselves rather than working alone. They have their personal ups and downs,
link |
02:04:57.200
so you have to deal with each student as a human being and bring out the best.
link |
02:05:06.560
Humans are complicated. Perhaps a silly question. If you could relive a moment in your life outside
link |
02:05:14.080
of family because it made you truly happy or perhaps because it changed the direction of your
link |
02:05:19.520
life in a profound way, what moment would you pick? I was kind of a lazy student as an undergraduate
link |
02:05:28.240
and even in my first year in graduate school. I think it was when I started doing research.
link |
02:05:37.120
I had a couple of summer jobs where I was able to contribute and I had an idea.
link |
02:05:43.840
Then there was one particular course on mathematical methods and operations research
link |
02:05:51.440
where I just gobbled up the material and I scored 20 points higher than anybody else in the class
link |
02:05:57.520
and came to the attention of the faculty. It made me realize that I had some ability
link |
02:06:04.480
that was going somewhere. You realize you're pretty good at this thing.
link |
02:06:10.720
I don't think there's a better way to end it, Richard. It was a huge honor.
link |
02:06:15.200
Thank you for decades of incredible work. Thank you for talking to us.
link |
02:06:18.880
Thank you. It's been a great pleasure. You're a superb interviewer.
link |
02:06:23.440
Thanks for listening to this conversation with Richard Karp and thank you to our sponsors,
link |
02:06:42.000
Eight Sleep and Cash App. Please consider supporting this podcast by going to eightsleep.com to check out
link |
02:06:39.680
their awesome mattress and downloading Cash App and using code LEX podcast.
link |
02:06:45.920
Click the links, buy the stuff, even just visiting the site, but also considering the purchase
link |
02:06:51.520
helps them know that this podcast is worth supporting in the future.
link |
02:06:55.760
It really is the best way to support this journey I'm on. If you enjoy this thing,
link |
02:07:00.640
subscribe on YouTube, review it with five stars on Apple Podcasts, support it on Patreon,
link |
02:07:05.280
or connect with me on Twitter at Lex Fridman if you can figure out how to spell that.
link |
02:07:11.280
And now let me leave you with some words from Isaac Asimov.
link |
02:07:14.960
I do not fear computers. I fear lack of them. Thank you for listening and hope to see you next time.