Tuomas Sandholm: Poker and Game Theory | Lex Fridman Podcast #12

The following is a conversation with Tuomas Sandholm. He's a professor at CMU and co-creator of Libratus, which is the first AI system to beat top human players in the game of Heads Up No Limit Texas Hold'em. He has published over 450 papers on game theory and machine learning, including a best paper in 2017 at NIPS, now renamed to NeurIPS, which is where I caught up with him for this conversation. His research and companies have had wide-reaching impact in the real world, especially because he and his group not only propose new ideas, but also build systems to prove that these ideas work in the real world.

This conversation is part of the MIT course on artificial general intelligence and the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D.

And now, here's my conversation with Tuomas Sandholm.
Can you describe, at a high level, the game of poker, Texas Hold'em, Heads Up Texas Hold'em, for people who might not be familiar with this card game?

So Heads Up No Limit Texas Hold'em has really emerged in the AI community as a main benchmark for testing these application-independent algorithms for imperfect-information game solving. And this is a game that's actually played by humans. You don't see it that much on TV or in casinos, for various reasons, but you do see it in some expert-level casinos, and you see it in the best poker movies of all time. It's actually an event in the World Series of Poker, but mostly it's played online, and typically for pretty big sums of money. And this is a game that usually only experts play. So if you go to your home game on a Friday night, it probably is not gonna be Heads Up No Limit Texas Hold'em. It might be No Limit Texas Hold'em in some cases, but typically for a big group, and it's not as competitive.

Heads Up means it's two players, so it's really like me against you. Am I better, or are you better? Much like chess or Go in that sense, but an imperfect-information game, which makes it much harder, because I have to deal with issues of you knowing things that I don't know, and I knowing things that you don't know, instead of the pieces being nicely laid out on the board for both of us to see.
So in Texas Hold'em, there's two cards that only you see, that belong to you.

Yeah.

And they gradually lay out some cards that add up, overall, to five cards that everybody can see.

Yeah. So the imperfect nature of the information is the two cards that you're holding in your hand.

So, as you said, you first get two cards in private each, and then there's a betting round. Then you get three cards in public on the table; then there's a betting round. Then you get the fourth card in public on the table; there's a betting round. Then you get the fifth card on the table; there's a betting round. So there's a total of four betting rounds, and four tranches of information revelation, if you will. Only the first tranche is private, and then it's public from there.
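The round structure just described, four betting rounds with staggered public-card reveals, can be written down as a small schedule. This is a minimal illustrative sketch using standard poker street names, not code from any poker engine:

```python
# Sketch of the Heads Up No Limit Texas Hold'em round structure described
# above. Each entry: (street name, number of new public cards dealt before
# that betting round). Street names are standard poker terms.

STREETS = [
    ("preflop", 0),  # each player holds two private cards; nothing public yet
    ("flop",    3),  # three community cards revealed
    ("turn",    1),  # fourth community card
    ("river",   1),  # fifth community card
]

def public_cards_after(street_index):
    """Total community cards visible once the given street has been dealt."""
    return sum(n for _, n in STREETS[:street_index + 1])

# Four betting rounds, five community cards in total, and only the first
# tranche of information (the hole cards) is private.
assert len(STREETS) == 4
assert public_cards_after(3) == 5
```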
And this is probably by far the most popular game, in AI and with the general public, in terms of imperfect information. So it's probably the most popular spectator game, which is why it's a super exciting game to tackle.

So it's on the order of chess, I would say, in terms of popularity, in terms of AI setting it as the bar of what is intelligence. So in 2017, Libratus, how do you pronounce it? A little Latin there.

A little bit of Latin.

Libratus beat four expert human players. Can you describe that event? What did you learn from it? What was the process, in general, for people who have not read the papers and the study?
Yeah, so the event was that we invited four of the top ten players, and these were specialist players in Heads Up No Limit Texas Hold'em, which is very important, because this game is actually quite different than the multiplayer version. We brought them in to Pittsburgh to play at the Rivers Casino for 20 days. We wanted to get 120,000 hands in, because we wanted statistical significance. That's a lot of hands for humans to play, even for these top pros, who normally play fairly quickly, so we couldn't just have one of them play that many hands. For 20 days, they were playing basically morning to evening. And we raised $200,000 as a little incentive for them to play. And the setting was such that they didn't all just get $50,000: we actually paid them out based on how each of them did against the AI. So they had an incentive to play as hard as they could, whether they were way ahead or way behind or right at the mark of beating the AI.

And you don't make any money, unfortunately.

Right, no, we can't make any money. Originally, a couple of years earlier, I actually explored whether we could play for money, because that would, of course, be interesting as well, to play against the top people for money. But the Pennsylvania Gaming Board said no, so we couldn't. So this is much like an exhibit, like for a musician or a boxer or something like that.

Nevertheless, they were keeping track of the money, and the AI came out ahead by close to $2 million, I think. So if it had been for real money, if you had been able to earn money, that would have been a quite impressive and inspiring achievement.
Just a few details: what were the players looking at? Were they behind a computer? What was the interface like?

Yes, they were playing much like they normally do. These top players, when they play this game, they play mostly online, so they're used to playing through a UI, and they did the same thing here. So there was this layout: you could imagine there's a table on a screen, there's the human sitting there, and there's the AI sitting there. The screen shows everything that's happening: the cards coming out, the bets being made. And we also had the betting history for the human, so if the human forgot what had happened in the hand so far, they could actually reference back and so forth.

Is there a reason they were given access to the betting history?

Well, it didn't really matter; they wouldn't have forgotten anyway. These are top-quality people. But we just wanted to put it out there, so it's not a question of the human forgetting and the AI somehow trying to get an advantage that way.
So what was that like? I mean, that was an incredible accomplishment. What did it feel like before the event? Did you have doubt, hope? Where was your confidence at?

Yeah, great question. So 18 months earlier, I had organized a similar brains-versus-AI competition with a previous AI called Claudico, and we couldn't beat the humans. This time around, it was only 18 months later, and I knew that this new AI, Libratus, was way stronger. But it's hard to say how you'll do against the top humans. So I thought we had about a 50-50 shot. And the international betting sites put us as a four-to-one or five-to-one underdog. So it's kind of interesting that people really believe in people over AI. People don't just overbelieve in themselves; they have overconfidence in other people as well, compared to the performance of AI. And yeah, so we were a four-to-one or five-to-one underdog, and even after three days of beating the humans in a row, we were still 50-50 on the international betting sites.
Do you think there's something special and magical about poker and the way people think about it? I mean, even for chess there are no Hollywood movies, while poker is the star of many movies. And there's this feeling that certain human facial expressions and body language, eye movements, all these tells, are critical to poker. Like you can look into somebody's soul and understand their betting strategy and so on. So, possibly, do you think that is why people had confidence that humans would outperform? Because AI systems cannot, in this construct, perceive these kinds of tells; they're only looking at betting patterns and nothing else, betting patterns and statistics. So, if you step back to human players, human versus human: what's the role of these tells, of these ideas that we romanticize?
Yeah, so I'll split it into two parts. One is: why do humans trust humans more than AI, and have overconfidence in humans? I think that's not really related to the tell question. It's just that they've seen these top players, how good they are, and they're really fantastic, so it's just hard to believe that an AI could beat them. I think that's where that comes from. And that's actually maybe a more general lesson about AI: until you've seen it outperform a human, it's hard to believe that it could.

But then the tells: a lot of these top players are so good at hiding tells that, among the top players, it's actually not really worth it for them to invest a lot of effort trying to find tells in each other, because they're so good at hiding them. So yes, at the kind of Friday-evening game, tells are gonna be a huge thing. You can read other people, and if you're a good reader, you'll read them like an open book. But at the top levels of poker, the tells become a smaller and smaller aspect of the game as you go up.
The number of strategies, the number of possible actions, is very large, 10 to the power of 100 plus. So, and I've read a few of the related papers, it has to form some abstractions of the various hands and actions. What kind of abstractions are effective for the game of poker?
Yeah, so you're exactly right. When you have a game tree that's 10 to the 161, especially in an imperfect-information game, it's way too large to solve directly, even with our fastest equilibrium-finding algorithms. So you wanna abstract it first. And abstraction in games is much trickier than abstraction in MDPs or other single-agent settings, because you have these abstraction pathologies: if I have a finer-grained abstraction, the strategy that I can get from it for the real game might actually be worse than the strategy I can get from a coarser-grained abstraction. So you have to be very careful.

Now, the kinds of abstractions we're talking about, just to zoom out: there's hand abstraction, and then there's betting abstraction.

Yeah, betting actions, yeah. So, to talk in terms of general games: there's information abstraction, which is the abstraction of what chance does, and this would be the cards in the case of poker. And then there's action abstraction, which is abstracting the actions of the actual players, which would be the bets in the case of poker.

Yourself and the other players?

Yes, yourself and the other players.
And for information abstraction, we were completely automated. So these are algorithms, and they do what we call potential-aware abstraction, where we don't just look at the value of the hand, but also at how it might materialize into good or bad hands over time. It's a certain kind of bottom-up process, with integer programming and clustering and various other aspects of how you build this abstraction. And then for the action abstraction, there it's largely based on how humans and other AIs have played this game in the past. But in the beginning, we actually used an automated action-abstraction technology, which is provably convergent, in that it finds the optimal combination of bet sizes. It's not very scalable, though, so we couldn't use it for the whole game, but we used it for the first couple of betting actions.
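The potential-aware idea, clustering hands by how their equity can evolve rather than by current strength alone, can be sketched with a toy clustering example. Everything here is illustrative: the hand names, the three-bucket equity histograms, and the plain k-means stand in for the much richer integer-programming-and-clustering pipeline described above:

```python
# Toy sketch of potential-aware information abstraction: cluster hands by
# the *distribution* over the equities they can reach later, not by their
# current average strength. Hand names and histograms are made up.

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def kmeans(points, k, iters=20):
    """Minimal k-means (deterministic: seeds are the first k points)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: l2(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign

# Histogram over future-equity buckets [low, medium, high] for each hand.
# The made hand and the draws have similar *average* equity, but very
# different potential, which is exactly what this abstraction separates.
hands = {
    "medium_made": [0.0, 1.0, 0.0],    # always ends up medium strength
    "flush_draw":  [0.5, 0.0, 0.5],    # polarized: misses or gets strong
    "open_ended":  [0.45, 0.1, 0.45],  # another polarized draw
}
labels = kmeans(list(hands.values()), k=2)
# The two draws cluster together, apart from the made hand.
```

A strength-only abstraction could lump all three hands together, since their average equities are close; clustering the full histograms keeps the polarized draws apart from the made hand.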
So what's more important: the strength of the hand, the information abstraction, or how you play them, the actions? You know, the romanticized notion, again, is that it doesn't matter what hands you have; the actions, the betting, may be how you win, no matter what hands you have.

Yeah, so that's why you have to play a lot of hands, so that the role of luck gets smaller. Otherwise you could get lucky, get some good hands, and then you're gonna win the match. Even with thousands of hands you can get lucky, because there's so much variance in No Limit Texas Hold'em: if we both go all in, it's a huge pot, so there's huge variance, these massive swings, in No Limit Texas Hold'em. So that's why you have to play not just thousands, but over 100,000 hands to get statistical significance.
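The variance point above can be made concrete with a back-of-envelope significance calculation. The numbers below (a 0.1 big-blind-per-hand edge, a 10 big-blind-per-hand standard deviation) are illustrative assumptions, not figures from the match:

```python
# Back-of-envelope: how many hands does it take to detect a skill edge
# through no-limit poker variance? The edge and standard deviation below
# are illustrative assumptions, not figures from the Libratus match.

def hands_needed(edge_bb_per_hand, std_bb_per_hand, z=1.96):
    """Hands needed for the observed mean win rate to clear zero at ~95%
    confidence: n >= (z * std / edge) ** 2."""
    return (z * std_bb_per_hand / edge_bb_per_hand) ** 2

# Suppose a 0.1 big-blind-per-hand edge against a ~10 big-blind-per-hand
# standard deviation (all-in pots make the per-hand deviation enormous):
n = hands_needed(0.1, 10.0)  # on the order of 40,000 hands
```

Shrink the assumed edge or raise the assumed variance and the requirement quickly passes 100,000 hands, which is the scale of the match described above.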
So let me ask this question another way: if you didn't even look at your hands, but the opponents didn't know that, how well would you be able to do?

Oh, that's a good question. There's actually, I heard this story, that there's this Norwegian female poker player called Annette Obrestad who actually won a tournament by doing exactly that, but that would be extremely rare. So you cannot really play well that way.

Okay, so the hands do have some role to play, okay.
So Libratus does not, as far as I understand, use learning methods, deep learning. Is there room for learning? There's no reason why Libratus couldn't be combined with an AlphaGo-type approach for estimating the quality of states, a value function estimator. What are your thoughts on this, maybe as compared to another algorithm, which I'm not that familiar with, DeepStack, an engine that does use deep learning? It's unclear how well it does, but it nevertheless uses deep learning. So what are your thoughts about learning methods to aid the way Libratus plays the game of poker?
Yeah, so as you said, Libratus did not use learning methods, and it played very well without them. Since then, we have actually, actually here, a couple of papers on things that do use learning techniques, and deep learning in particular, sort of the way you're talking about, where it's learning an evaluation function. But in imperfect-information games, unlike, let's say, in Go, or now also in chess and shogi, it's not sufficient to learn an evaluation for a state, because the value of an information set depends not only on the exact state, but also on both players' beliefs. If I have a bad hand, I'm much better off if the opponent thinks I have a good hand; and if I have a good hand, I'm much better off if the opponent believes I have a bad hand. So the value of a state is not just a function of the cards. It depends on, if you will, the path of play, but only to the extent that it's captured in the belief distributions. So that's why it's not as simple as it is in perfect-information games. And I don't wanna say it's simple there either; of course it's very complicated computationally there too. But at least conceptually it's very straightforward: there's a state, there's an evaluation function, and you can try to learn it. Here, you have to do something more.
And what we do, in one of these papers, is allow the opponent to take different strategies at the leaves of the search tree, if you will. That is a different way of doing it: it doesn't assume a particular way that the opponent plays, but it allows the opponent to choose from a set of different continuation strategies. And that forces us to not be too optimistic in the lookahead search. That's one way you can do sound lookahead search in imperfect-information games, which is very difficult.

And you were asking about DeepStack. What they did was very different than what we do, either in Libratus or in this new work. They were randomly generating various situations, then doing the lookahead from there to the end of the game, as if that was the start of a different game, and then using deep learning to learn the values of those states. But the states were not just the physical states; they include belief distributions.
When you talk about lookahead, for DeepStack or for Libratus, does it mean considering every possibility for how the game can evolve? Are we talking about this exponential growth of a tree?

Yes, we're talking about exactly that, much like you do in alpha-beta search or Monte Carlo tree search, but with different techniques. So there's a different search algorithm, and then we have to deal with the leaves differently. If you think about what Libratus did, we didn't have to worry about this, because we only did it at the end of the game. We would always terminate in a real situation, and we would know what the payoff is; it didn't do these depth-limited lookaheads. But now, in this new paper, which I think is called depth-limited search for imperfect-information games, we can actually do sound depth-limited lookahead. So we can start to do the lookahead from the beginning of the game on, because doing it for this whole long game is too complicated. In Libratus, we were just doing it for the end.
And then, on the other side, this belief distribution: is it explicitly modeled what kind of beliefs the opponent might have?

Yeah, it is explicitly modeled, but it's not assumed. The beliefs are actually output, not input. Of course, the starting beliefs are input, but they just follow from the rules of the game: we know that the dealer deals uniformly from the deck, so I know that every pair of cards that you might have is equally likely. I know that for a fact; it just follows from the rules of the game. Except, of course, the two cards that I have: I know you don't have those. You have to take that into account. That's called card removal, and that's very important.

Is the dealing always coming from a single deck in Heads Up, so you can assume...

Single deck, so you know that if I have the ace of spades, I know you don't have the ace of spades.
Great. So in the beginning, your belief is basically the fact that it's a fair dealing of hands. But how do you start to adjust that belief?

Well, that's where the beauty of game theory comes in. So Nash equilibrium, which John Nash introduced in 1950, introduces what rational play is when you have more than one player. These are pairs of strategies, where strategies are contingency plans, one for each player, such that neither player wants to deviate to a different strategy, given that the other doesn't deviate. And as a side effect, you get the beliefs, from Bayes' rule. So in these imperfect-information games, Nash equilibrium doesn't just define strategies; it also defines beliefs for both of us, at each state. The states are grouped into what are called information sets: at each information set in the game, there's a set of different states that we might be in, but I don't know which one we're in. Nash equilibrium tells me exactly what the probability distribution over those real-world states is, in my mind.
How does Nash equilibrium give you that distribution?

I'll do a simple example. You know the game Rock, Paper, Scissors? We can draw it as: player one moves first, and then player two moves. But of course, it's important that player two doesn't know what player one moved; otherwise player two would win every time. So we can draw that as an information set, where player one makes one of three moves first, and then there's an information set for player two, so player two doesn't know which of those nodes they're in. But once we know the strategy for player one, Nash equilibrium will say that you play one third Rock, one third Paper, one third Scissors. From that, I can derive my beliefs on the information set: they're one third, one third, one third. So Bayes' rule gives you that.
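The Rock, Paper, Scissors example can be checked directly: against the uniform mixture no pure move gains anything (the equilibrium condition), and Bayes' rule turns player one's mixture into player two's beliefs. A minimal sketch:

```python
# Check the Rock, Paper, Scissors example: the uniform mixture is a Nash
# equilibrium, and Bayes' rule then gives uniform beliefs at player two's
# information set.

from fractions import Fraction

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    """Player one's payoff: +1 win, -1 loss, 0 tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

uniform = {m: Fraction(1, 3) for m in MOVES}

# Against a uniform opponent, every pure move earns the same expected
# payoff, so no deviation is profitable: the equilibrium condition.
expected = {a: sum(uniform[b] * payoff(a, b) for b in MOVES) for a in MOVES}

# Player two's belief over the nodes in its information set is, by Bayes'
# rule, just player one's mixed strategy: one third each.
beliefs = {m: uniform[m] / sum(uniform.values()) for m in MOVES}
```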
But is that specific to a particular player, or is it something you quickly update for the specific player?

No, the game theory isn't really player-specific. That's also why we don't need any data. We don't need any history of how these particular humans played in the past, or how any AI or human had played before. It's all about rationality. So the AI just thinks about what a rational opponent would do, and what it would do if it is rational. That's the idea of game theory. So it's really a data-free, opponent-free approach.

So it comes from the design of the game, as opposed to the design of the player.

Exactly. There's no opponent modeling per se. I mean, we've done some work on combining opponent modeling with game theory, so you can exploit weak players even more, but that's another strand. And in Libratus, we didn't turn that on, because I decided that these players are too good. When you start to exploit an opponent, you typically open yourself up to exploitation, and these guys have so few holes to exploit, and they're the world's leading experts in counter-exploitation. So I decided that we're not gonna turn that stuff on.
Actually, I saw a few of your papers on exploiting opponents. It sounded very interesting to explore. Do you think there's room for exploitation, generally, outside of Libratus? Are there differences between people that could be exploited, maybe not just in poker, but in general interactions and negotiations, all these other domains that you're considering?

We've done some work on that, and I really like the hybrid approach there too. So you figure out what a rational opponent would do. And, by the way, that's safe in these zero-sum games, two-player zero-sum games, because if the opponent does something irrational, yes, it might throw off my beliefs, but the amount the player can gain by throwing off my beliefs is always less than what they lose by playing poorly.
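That safety property can be seen concretely in a tiny two-player zero-sum game: the equilibrium (maximin) strategy guarantees the game value against any opponent play, rational or not. The 2x2 payoff matrix below is made up for illustration:

```python
# A tiny two-player zero-sum game illustrating why equilibrium play is
# safe: the maximin strategy guarantees the game value no matter what the
# opponent does. The payoff matrix is made up for illustration.

from fractions import Fraction

# Row player's payoffs; the column player receives the negative.
A = [[Fraction(2), Fraction(-1)],
     [Fraction(-1), Fraction(1)]]

# Equilibrium row mixture, solvable by hand for 2x2 games by making both
# of the opponent's columns yield the same payoff: p = (2/5, 3/5).
p = [Fraction(2, 5), Fraction(3, 5)]
game_value = Fraction(1, 5)

def row_payoff_vs(q):
    """Row player's expected payoff playing p against any column mix q."""
    return sum(p[i] * q[j] * A[i][j] for i in range(2) for j in range(2))

# Rational or irrational, the opponent cannot push the row player below
# the game value, so deviating from equilibrium can only cost them.
for q in ([Fraction(1), Fraction(0)],
          [Fraction(0), Fraction(1)],
          [Fraction(1, 2), Fraction(1, 2)]):
    assert row_payoff_vs(q) >= game_value
```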
But still, if somebody's weak as a player, you might wanna play differently, to exploit them more. You can think about it this way: a game-theoretic strategy is unbeatable, but it doesn't maximally beat the opponent, so the winnings per hand might be better with a different strategy. And the hybrid is that you start from a game-theoretic approach, and then, as you gain data about the opponent in certain parts of the game tree, in those parts of the game tree you start to tweak your strategy more and more towards exploitation, while still staying fairly close to the game-theoretic strategy, so as to not open yourself up to exploitation too much.
How do you do that? Do you try to vary up strategies, make it unpredictable? Like, what is it, tit-for-tat strategies in the Prisoner's Dilemma, or...?

Well, that's a repeated game, the simple Prisoner's Dilemma, repeated games. But even there, there's no proof that says that that's the best thing. Experimentally, though, it actually does well.
So what kinds of games are there, first of all? I don't know if this is something that you could just summarize. There are perfect-information games, where all the information is on the table. There are imperfect-information games. There are repeated games that you play over and over. There are zero-sum games and non-zero-sum games. And then there's the really important distinction you're making: two players versus more players. So what other games are there? And what's the difference, for example, between a two-player game and one with more players? What are the key differences, in your view?
So let me start from the basics. A repeated game is a game where the same exact game is played over and over. In these extensive-form games, think about the tree form, maybe with these information sets to represent incomplete information, you can have kind of repetitive interactions. Even repeated games are a special case of that, by the way. But the game doesn't have to be exactly the same. It's like in sourcing auctions: yes, we're gonna see the same supply base year to year, but what I'm buying is a little different every time, and the supply base is a little different every time. So it's not really repeated. A purely repeated game is actually very rare in the world, so repeated games are really a very coarse model of what's going on.

Then, if you move up from simple repeated matrix games, not all the way to extensive-form games but in between, there are stochastic games. You can think of these as little matrix games where, when you take an action and your opponent takes an action, they determine not which game I'm going to next, but the distribution over the next games I might be going to. So that's a stochastic game. So it goes matrix games, repeated games, stochastic games, extensive-form games: that is from less to more general. And poker is an example of the last one, so it's really in the most general setting, extensive-form games. And that's what the AI community has been working on, and being benchmarked on, with this Heads Up No Limit Texas Hold'em.
Can you describe extensive-form games? What's the model here?

Yeah. If you're familiar with the tree form, it's really the tree form, like the search tree in chess.

Versus a matrix, yeah.

And the matrix is called the matrix form, or bimatrix form, or normal-form game. Here you have the tree form, so you can actually do certain types of reasoning there that you lose when you go to the normal form. There's a certain form of equivalence: if you go from the tree form and say that every possible contingency plan is a strategy, then I can actually go back to the normal form, but I lose some information from the lack of sequentiality.
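A minimal data-structure sketch of the tree form: nodes that a player cannot tell apart share an information set. This uses the sequential Rock, Paper, Scissors drawing mentioned earlier; the class and field names are just illustrative:

```python
# Minimal sketch of an extensive-form game: a tree of nodes, where nodes
# that a player cannot distinguish share an information set. Here, Rock,
# Paper, Scissors drawn sequentially: player two's three decision nodes
# sit in one information set because player one's move is hidden.

class Node:
    def __init__(self, player=None, infoset=None, payoff=None):
        self.player = player    # whose turn it is (None at terminals)
        self.infoset = infoset  # label shared by indistinguishable nodes
        self.payoff = payoff    # player one's payoff at terminal nodes
        self.children = {}      # action -> child Node

def rps_tree():
    beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
    root = Node(player=1, infoset="P1")
    for a in beats:
        # Every player-two node gets the same infoset label: player two
        # cannot condition its move on player one's hidden choice.
        node = Node(player=2, infoset="P2")
        for b in beats:
            u = 0 if a == b else (1 if beats[a] == b else -1)
            node.children[b] = Node(payoff=u)
        root.children[a] = node
    return root

root = rps_tree()
# All three of player two's decision nodes fall in a single information set.
assert {child.infoset for child in root.children.values()} == {"P2"}
```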
Then the multiplayer-versus-two-player distinction is an important one. Two-player zero-sum games are conceptually easier and computationally easier. They're still huge, like this one, but they're conceptually and computationally easier in that, conceptually, you don't have to worry about which equilibrium the other guy is going to play when there are multiple, because any equilibrium strategy is a best response to any other equilibrium strategy. So I can play a different equilibrium from you, and we'll still get the right values of the game. That falls apart even with two players when you have general-sum games.

Even without cooperation, just in general?

Even without cooperation. So there's a big gap from two-player zero-sum to two-player general-sum, or even to three-player zero-sum. That's a big gap, at least in theory.
Can you maybe, non-mathematically, provide the intuition for why it all falls apart with three or more players? It seems like you should still be able to have a Nash equilibrium that's instructive, that holds.

Okay, so it is true that all finite games have a Nash equilibrium. This is what John Nash actually proved. So they do have a Nash equilibrium; that's not the problem. The problem is that there can be many, and then there's the question of which equilibrium to select. If you select your strategy from a different equilibrium and I select mine, then what does that mean? In these non-zero-sum games, we may lose some joint benefit by just being stupid: we could actually both be better off if we did something else. And with three players you get other problems as well, like collusion. Maybe you and I can gang up on a third player, and we can do radically better by colluding. So there are lots of issues that come up there.
So Noah Brown, the student you worked with on this, mentioned, I looked through the AMA on Reddit, that the ability of poker players to collaborate would make the game harder. He was asked the question, or both of you were asked the question: how would you make the game of poker harder, beyond being solvable by current AI methods? And he said that there are not many ways of making poker more difficult, but collaboration or cooperation between players would make it extremely difficult. So can you provide the intuition behind why that is, if you agree with that idea?
Yeah, so I've done a lot of work on coalitional games
link |
and we actually have a paper here
link |
with my other student Gabriele Farina
link |
and some other collaborators at NIPS on that.
link |
Actually just came back from the poster session
link |
where we presented this.
link |
But so when you have a collusion, it's a different problem.
link |
And it typically gets even harder then.
link |
Even the game representations,
link |
some of the game representations don't really allow modeling collusion.
link |
So we actually introduced a new game representation for that.
link |
Is that kind of cooperation part of the model?
link |
Do you have information
link |
about the fact that other players are cooperating
link |
or is it just chaos where nothing is known?
link |
So there's some things unknown.
link |
Can you give an example of a collusion type game?
link |
So think about bridge.
link |
It's like when you and I are on a team,
link |
our payoffs are the same.
link |
The problem is that we can't talk.
link |
So when I get my cards, I can't whisper to you
link |
what my cards are.
link |
That would not be allowed.
link |
So we have to somehow coordinate our strategies
link |
ahead of time and only ahead of time.
link |
And then there's certain signals we can talk about,
link |
but they have to be such that the other team
link |
also understands them.
link |
So that's an example where the coordination
link |
is already built into the rules of the game.
link |
But in many other situations like auctions
link |
or negotiations or diplomatic relationships, poker,
link |
it's not really built in, but it still can be very helpful
link |
for the colluders.
link |
I've read you write somewhere,
link |
that in negotiations you come to the table with a prior,
link |
like a strategy of what you're willing to do
link |
and not willing to do, those kinds of things.
link |
So how do you start to now moving away from poker,
link |
moving beyond poker into other applications
link |
like negotiations, how do you start applying this
link |
to other domains, even real world domains
link |
that you've worked on?
link |
Yeah, I actually have two startup companies
link |
doing exactly that.
link |
One is called Strategic Machine,
link |
and that's for business applications:
link |
gaming, sports, various types of things
link |
in finance, electricity markets and so on.
link |
And the other is called Strategy Robot,
link |
where we are taking these to military security,
link |
cyber security and intelligence applications.
link |
I think you worked a little bit in,
link |
how do you put it, advertisement,
link |
sort of suggesting ads kind of thing, auction.
link |
That's another company, optimized markets.
link |
But that's much more about a combinatorial market
link |
and optimization based technology.
link |
That's not using these game theoretic reasoning technologies.
link |
I see. Okay, so at a high level,
link |
how do you think about our ability to use
link |
game theoretic concepts to model human behavior?
link |
Do you think human behavior is amenable
link |
to this kind of modeling outside of the poker games,
link |
and where have you seen it done successfully in your work?
link |
I'm not sure the goal really is modeling humans.
link |
Like for example, if I'm playing a zero sum game,
link |
I don't really care that the opponent
link |
is actually following my model of rational behavior,
link |
because if they're not, that's even better for me.
link |
Right, so with opponents in games,
link |
the prerequisite is that you formalize
link |
the interaction in some way
link |
that can be amenable to analysis.
link |
And you've done this amazing work with mechanism design,
link |
designing games that have certain outcomes.
link |
But, so I'll tell you an example
link |
from my world of autonomous vehicles, right?
link |
We're studying pedestrians,
link |
and pedestrians and cars negotiate
link |
in this nonverbal communication.
link |
There's this weird game dance of tension
link |
where pedestrians are basically saying,
link |
I trust that you won't kill me,
link |
and so as a jaywalker, I will step onto the road
link |
even though I'm breaking the law, and there's this tension.
link |
And the question is, we really don't know
link |
how to model that well in trying to model intent.
link |
And so people sometimes bring up ideas
link |
of game theory and so on.
link |
Do you think that aspect of human behavior
link |
can use these kinds of imperfect information approaches,
link |
modeling, how do you start to attack a problem like that
link |
when you don't even know how to design the game
link |
to describe the situation in order to solve it?
link |
Okay, so I haven't really thought about jaywalking,
link |
but one thing that I think could be a good application
link |
in autonomous vehicles is the following.
link |
So let's say that you have fleets of autonomous cars
link |
operating by different companies.
link |
So maybe here's the Waymo fleet and here's the Uber fleet.
link |
If you think about the rules of the road,
link |
they define certain legal rules,
link |
but that still leaves a huge strategy space open.
link |
Like as a simple example, when cars merge,
link |
how humans merge, they slow down and look at each other to negotiate.
link |
Wouldn't it be better if these situations
link |
would already be prenegotiated
link |
so we can actually merge at full speed
link |
and we know that this is the situation,
link |
this is how we do it, and it's all gonna be faster.
link |
But there are way too many situations to negotiate manually.
link |
So you could use automated negotiation,
link |
this is the idea at least,
link |
you could use automated negotiation
link |
to negotiate all of these situations
link |
or many of them in advance.
link |
And of course it might be that,
link |
hey, maybe you're not gonna always let me go first.
link |
Maybe you said, okay, well, in these situations,
link |
I'll let you go first, but in exchange,
link |
you're gonna let me go first
link |
in these other situations.
link |
So it's this huge combinatorial negotiation.
link |
And do you think there's room in that example of merging
link |
to model this whole situation
link |
as an imperfect information game
link |
or do you really want to consider it to be a perfect information game?
link |
No, that's a good question, yeah.
link |
That's a good question.
link |
Do you pay the price of assuming
link |
that you don't know everything?
link |
Yeah, I don't know.
link |
It's certainly much easier.
link |
Games with perfect information are much easier.
link |
So if you can get away with it, you should.
link |
But if the real situation is of imperfect information,
link |
then you're gonna have to deal with imperfect information.
link |
Great, so what lessons have you learned
link |
from the Annual Computer Poker Competition?
link |
An incredible accomplishment of AI.
link |
You look at the history of Deep Blue, AlphaGo,
link |
these kind of moments when AI stepped up
link |
in an engineering effort and a scientific effort combined
link |
to beat the best of human players.
link |
So what do you take away from this whole experience?
link |
What have you learned about designing AI systems
link |
that play these kinds of games?
link |
And what does that mean for AI in general,
link |
for the future of AI development?
link |
Yeah, so that's a good question.
link |
So there's so much to say about it.
link |
I do like this type of performance oriented research.
link |
Although in my group, we go all the way from like idea
link |
to theory, to experiments, to big system building,
link |
to commercialization, so we span that spectrum.
link |
But I think that in a lot of situations in AI,
link |
you really have to build the big systems
link |
and evaluate them at scale
link |
before you know what works and doesn't.
link |
And we've seen that in the computational
link |
game theory community, that there are a lot of techniques
link |
that look good in the small,
link |
but then they cease to look good in the large.
link |
And we've also seen that there are a lot of techniques
link |
that look superior in theory.
link |
And I really mean in terms of convergence rates,
link |
like first order methods, better convergence rates,
link |
like the CFR based algorithms,
link |
yet the CFR based algorithms are the fastest in practice.
link |
So it really tells me that you have to test this in reality.
link |
The theory isn't tight enough, if you will,
link |
to tell you which algorithms are better than the others.
link |
And you have to look at these things in the large,
link |
because any sort of projections you do from the small
link |
can at least in this domain be very misleading.
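The CFR family mentioned above is built on regret matching. A toy self-play sketch on rock-paper-scissors (my own minimal illustration under simplifying assumptions, not anything from Libratus) shows the core update: the average strategy drifts toward the uniform equilibrium.

```python
import random

ACTIONS = 3  # rock, paper, scissors
# UTIL[a][b] = payoff for playing action a against action b
UTIL = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def strategy_from_regrets(regrets):
    """Regret matching: play in proportion to positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS

def train(iterations, seed=0):
    rng = random.Random(seed)
    regrets = [0.0] * ACTIONS
    strategy_sum = [0.0] * ACTIONS
    for _ in range(iterations):
        strat = strategy_from_regrets(regrets)
        for a in range(ACTIONS):
            strategy_sum[a] += strat[a]
        me = rng.choices(range(ACTIONS), weights=strat)[0]
        opp = rng.choices(range(ACTIONS), weights=strat)[0]  # symmetric self-play
        for a in range(ACTIONS):
            # regret: how much better action a would have done than my play
            regrets[a] += UTIL[a][opp] - UTIL[me][opp]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]  # the *average* strategy

avg = train(100_000)
print(avg)  # each entry near 1/3
```

In CFR this update runs at every information set of the game tree, weighted by counterfactual reach probabilities; the toy above keeps only the regret-matching core.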
link |
So that's from a science
link |
and engineering perspective. From a personal perspective,
link |
it's been just a wild experience
link |
in that with the first poker competition,
link |
the first brains versus AI,
link |
man machine poker competition that we organized.
link |
There had been, by the way, for other poker games,
link |
there had been previous competitions,
link |
but this was for Heads Up No Limit, this was the first.
link |
And I probably became the most hated person
link |
in the world of poker.
link |
And I didn't mean to, I just saw.
link |
For cracking the game or something.
link |
Yeah, a lot of people felt that it was a real threat
link |
to the whole game, the whole existence of the game.
link |
If AI becomes better than humans,
link |
people would be scared to play poker
link |
because there are these superhuman AIs running around
link |
taking their money and all of that.
link |
So I just, it's just really aggressive.
link |
The comments were super aggressive.
link |
I got everything just short of death threats.
link |
Do you think the same was true for chess?
link |
Because right now they just completed
link |
the world championships in chess,
link |
and humans just started ignoring the fact
link |
that there's AI systems now that outperform humans
link |
and they still enjoy the game, it's still a beautiful game.
link |
That's what I think.
link |
And I think the same thing happens in poker.
link |
And so I didn't think of myself
link |
as somebody who was gonna kill the game,
link |
and I don't think I did.
link |
I've really learned to love this game.
link |
I wasn't a poker player before,
link |
but learned so many nuances about it from these AIs,
link |
and they've really changed how the game is played.
link |
So they have these very Martian ways of playing poker,
link |
and the top humans are now incorporating
link |
those types of strategies into their own play.
link |
So if anything, to me, our work has made poker
link |
a richer, more interesting game for humans to play,
link |
not something that is gonna steer humans
link |
away from it entirely.
link |
Just a quick comment on something you said,
link |
which is, if I may say so,
link |
in academia is a little bit rare sometimes.
link |
It's pretty brave to put your ideas to the test
link |
in the way you described,
link |
saying that sometimes good ideas don't work
link |
when you actually try to apply them at scale.
link |
So where does that come from?
link |
I mean, if you could do advice for people,
link |
what drives you in that sense?
link |
Were you always this way?
link |
I mean, it takes a brave person.
link |
I guess is what I'm saying, to test their ideas
link |
and to see if this thing actually works
link |
against human top human players and so on.
link |
Yeah, I don't know about brave,
link |
but it takes a lot of work.
link |
It takes a lot of work and a lot of time
link |
to organize, to make something big
link |
and to organize an event and stuff like that.
link |
And what drives you in that effort?
link |
Because you could still, I would argue,
link |
get a best paper award at NIPS as you did in 17
link |
without doing this.
link |
That's right, yes.
link |
And so in general, I believe it's very important
link |
to do things in the real world and at scale.
link |
And that's really where the proof is,
link |
the proof is in the pudding, if you will.
link |
In this particular case,
link |
it was kind of a competition between different groups
link |
and for many years as to who can be the first one
link |
to beat the top humans at Heads Up No Limit, Texas Holdem.
link |
So it became kind of like a competition who can get there.
link |
Yeah, so a little friendly competition
link |
could do wonders for progress.
link |
So the topic of mechanism design,
link |
which is really interesting, also kind of new to me,
link |
except as an observer of, I don't know, politics,
link |
I'm an observer of mechanisms.
link |
But you have a paper on automated mechanism design
link |
that I quickly read.
link |
So mechanism design is designing the rules of the game
link |
so you get a certain desirable outcome.
link |
And you have this work on doing so in an automatic fashion
link |
as opposed to fine tuning it.
link |
So what have you learned from those efforts?
link |
If you look, say, I don't know,
link |
at complex systems like our political system,
link |
can we design our political system
link |
to have, in an automated fashion,
link |
to have outcomes that we want?
link |
Can we design something like traffic lights to be smart
link |
where it gets outcomes that we want?
link |
So what are the lessons that you draw from that work?
link |
Yeah, so I still very much believe
link |
in the automated mechanism design direction.
link |
But it's not a panacea.
link |
There are impossibility results in mechanism design
link |
saying that there is no mechanism that accomplishes
link |
objective X in class C.
link |
So it's not going to,
link |
there's no way using any mechanism design tools,
link |
manual or automated,
link |
to do certain things in mechanism design.
link |
Can you describe that again?
link |
So meaning it's impossible to achieve that?
link |
Yes, it's impossible, not just unlikely.
link |
So these are not statements about human ingenuity
link |
who might come up with something smart.
link |
These are proofs that if you want to accomplish
link |
properties X in class C,
link |
that is not doable with any mechanism.
link |
The good thing about automated mechanism design
link |
is that we're not really designing for a class.
link |
We're designing for a specific setting at a time.
link |
So even if there's an impossibility result
link |
for the whole class,
link |
it just doesn't mean that all of the cases
link |
in the class are impossible.
link |
It just means that some of the cases are impossible.
link |
So we can actually carve these islands of possibility
link |
within these known impossible classes.
link |
And we've actually done that.
link |
So one of the famous results in mechanism design
link |
is the Myerson-Satterthwaite theorem
link |
by Roger Myerson and Mark Satterthwaite from 1983.
link |
It's an impossibility of efficient trade
link |
under imperfect information.
link |
We show that you can, in many settings,
link |
avoid that and get efficient trade anyway.
link |
Depending on how they design the game, okay.
link |
Depending how you design the game.
link |
And of course, it doesn't in any way
link |
contradict the impossibility result.
link |
The impossibility result is still there,
link |
but it just finds spots within this impossible class
link |
where in those spots, you don't have the impossibility.
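A back-of-the-envelope way to see the Myerson-Satterthwaite inefficiency, assuming the textbook split-the-difference double auction with values uniform on [0, 1] (my simplification, not the setting of the paper discussed): in the known linear equilibrium of that mechanism (Chatterjee-Samuelson), trade happens only when the buyer's value exceeds the seller's cost by at least 1/4, so a sizable share of mutually beneficial trades never happen.

```python
import random

rng = random.Random(42)
N = 100_000
efficient = traded = 0
for _ in range(N):
    v = rng.random()  # buyer's value
    c = rng.random()  # seller's cost
    if v > c:                 # trade would create surplus
        efficient += 1
        if v - c >= 0.25:     # trade actually happens in equilibrium
            traded += 1

missed = 1 - traded / efficient
print(f"fraction of beneficial trades missed: {missed:.2f}")  # roughly 0.44 analytically
```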
link |
Sorry if I'm going a bit philosophical,
link |
but what lessons do you draw towards,
link |
like I mentioned, politics or human interaction
link |
and designing mechanisms for outside of just
link |
these kinds of trading or auctioning
link |
or purely formal games or human interaction,
link |
like a political system?
link |
How, do you think it's applicable to, yeah, politics
link |
or to business, to negotiations, these kinds of things,
link |
designing rules that have certain outcomes?
link |
Yeah, yeah, I do think so.
link |
Have you seen that successfully done?
link |
They haven't really, oh, you mean mechanism design
link |
or automated mechanism design?
link |
Automated mechanism design.
link |
So mechanism design itself
link |
has had fairly limited success so far.
link |
There are certain cases,
link |
but most of the real world situations
link |
are actually not sound from a mechanism design perspective,
link |
even in those cases where they've been designed
link |
by very knowledgeable mechanism design people,
link |
the people are typically just taking some insights
link |
from the theory and applying those insights
link |
into the real world,
link |
rather than applying the mechanisms directly.
link |
So one famous example is the FCC spectrum auctions.
link |
So I've also had a small role in that
link |
and very good economists have been working,
link |
excellent economists who know game theory
link |
have been working on that,
link |
yet the rules that are designed in practice there,
link |
they're such that bidding truthfully
link |
is not the best strategy.
link |
Usually mechanism design,
link |
we try to make things easy for the participants.
link |
So telling the truth is the best strategy,
link |
but even in those very high stakes auctions
link |
where you have tens of billions of dollars
link |
worth of spectrum being auctioned,
link |
truth telling is not the best strategy.
link |
Nobody knows even a single optimal bidding strategy
link |
for those auctions.
link |
What's the challenge of coming up with an optimal strategy,
link |
because there's a lot of players and there's imperfect information?
link |
It's not so much that a lot of players,
link |
but many items for sale,
link |
and these mechanisms are such that even with just two items
link |
or one item, bidding truthfully
link |
wouldn't be the best strategy.
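The truthfulness point can be sketched with the classic contrast between second-price (Vickrey) and first-price sealed-bid auctions (a toy two-bidder setup of my own, not the FCC rules): bidding your true value is weakly dominant in the second-price format but not in the first-price one.

```python
import random

def second_price_utility(my_bid, my_value, rival_bid):
    # Winner pays the rival's bid.
    return my_value - rival_bid if my_bid > rival_bid else 0.0

def first_price_utility(my_bid, my_value, rival_bid):
    # Winner pays their own bid.
    return my_value - my_bid if my_bid > rival_bid else 0.0

rng = random.Random(1)
for _ in range(10_000):
    v = rng.random()
    rival = rng.random()
    shade = v * rng.random()  # an arbitrary deviation below the true value
    # Second price: truthful bidding never does worse than shading.
    assert second_price_utility(v, v, rival) >= second_price_utility(shade, v, rival)

# First price: shading can strictly beat truthful bidding.
v, rival = 0.8, 0.3
assert first_price_utility(0.5, v, rival) > first_price_utility(v, v, rival)
print("truthful bidding is weakly dominant in the second-price trials")
```

Combinatorial spectrum auctions layer many interdependent items on top of this, which is why, as noted, even a single optimal bidding strategy is unknown there.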
link |
If you look at the history of AI,
link |
it's marked by seminal events.
link |
AlphaGo beating a world champion human Go player,
link |
I would put Libratus winning at Heads Up No Limit Holdem
link |
as one such event.
link |
And what do you think is the next such event,
link |
whether it's in your life or in the broader AI community
link |
that you think might be out there
link |
that would surprise the world?
link |
So that's a great question,
link |
and I don't really know the answer.
link |
In terms of game solving,
link |
Heads Up No Limit Texas Holdem
link |
really was the one remaining widely agreed upon benchmark.
link |
So that was the big milestone.
link |
Now, are there other things?
link |
Yeah, certainly there are,
link |
but there's not one that the community
link |
has kind of focused on.
link |
So what could be other things?
link |
There are groups working on StarCraft.
link |
There are groups working on Dota 2.
link |
These are video games.
link |
Or you could have games like Diplomacy or Hanabi.
link |
These are like recreational games,
link |
but none of them are really acknowledged
link |
as kind of the main next challenge problem,
link |
like chess or Go or Heads Up No Limit Texas Holdem was.
link |
So I don't really know in the game solving space
link |
what is or what will be the next benchmark.
link |
I kind of hope that there will be a next benchmark
link |
because really the different groups
link |
working on the same problem
link |
really drove these application independent techniques
link |
forward very quickly over 10 years.
link |
Do you think there's an open problem
link |
that excites you that you start moving away
link |
from games into real world games,
link |
like say the stock market trading?
link |
Yeah, so that's kind of how I am.
link |
So I am probably not going to work
link |
as hard on these recreational benchmarks.
link |
I'm doing two startups on game solving technology,
link |
Strategic Machine and Strategy Robot,
link |
and we're really interested
link |
in pushing this stuff into practice.
link |
What do you think would be really
link |
a powerful result that would be surprising
link |
that would be, if you can say,
link |
I mean, five years, 10 years from now,
link |
something that statistically you would say
link |
is not very likely,
link |
but if there's a breakthrough, would achieve?
link |
Yeah, so I think that overall,
link |
we're in a very different situation in game theory
link |
than we are in, let's say, machine learning.
link |
So in machine learning, it's a fairly mature technology
link |
and it's very broadly applied
link |
and proven success in the real world.
link |
In game solving, there are almost no applications yet.
link |
We have just become superhuman,
link |
which in machine learning you could argue happened in the 90s,
link |
and at least on supervised learning,
link |
certain complex supervised learning applications.
link |
Now, I think the next challenge problem,
link |
I know you're not asking about it this way,
link |
you're asking about the technology breakthrough,
link |
but I think that big, big breakthrough
link |
is to be able to show that, hey,
link |
maybe most of, let's say, military planning
link |
or most of business strategy
link |
will actually be done strategically
link |
using computational game theory.
link |
That's what I would like to see
link |
as the next five or 10 year goal.
link |
Maybe you can explain to me again,
link |
forgive me if this is an obvious question,
link |
but machine learning methods,
link |
neural networks suffer from not being transparent,
link |
not being explainable.
link |
Game theoretic methods, Nash equilibria,
link |
do they generally, when you see the different solutions,
link |
are they, when you talk about military operations,
link |
are they, once you see the strategies,
link |
do they make sense, are they explainable,
link |
or do they suffer from the same problems
link |
as neural networks do?
link |
So that's a good question.
link |
I would say a little bit yes and no.
link |
And what I mean by that is that
link |
these game theoretic strategies,
link |
let's say, Nash equilibrium,
link |
it has provable properties.
link |
So it's unlike, let's say, deep learning
link |
where you kind of cross your fingers,
link |
hopefully it'll work.
link |
And then after the fact, when you have the weights,
link |
you're still crossing your fingers,
link |
and I hope it will work.
link |
Here, you know that the solution quality is there.
link |
There's provable solution quality guarantees.
link |
Now, that doesn't necessarily mean
link |
that the strategies are human understandable.
link |
That's a whole other problem.
link |
So I think that deep learning and computational game theory
link |
are in the same boat in that sense,
link |
that both are difficult to understand.
link |
But at least the game theoretic techniques,
link |
they have these guarantees of solution quality.
link |
So do you see business operations, strategic operations,
link |
or even military operations in the future
link |
at least having strong candidate strategies
link |
proposed by automated systems?
link |
But that's more of a belief than a substantiated fact.
link |
Depending on where you land in optimism or pessimism,
link |
that's a really, to me, that's an exciting future,
link |
especially if there's provable things
link |
in terms of optimality.
link |
So looking into the future,
link |
there's a few folks worried,
link |
especially when you look at the game of poker,
link |
which is probably one of the last benchmarks
link |
in terms of games being solved.
link |
They worry about the future
link |
and the existential threats of artificial intelligence.
link |
So the negative impact in whatever form on society.
link |
Is that something that concerns you as much,
link |
or are you more optimistic about the positive impacts of AI?
link |
Oh, I am much more optimistic about the positive impacts.
link |
So just in my own work, what we've done so far,
link |
we run the nationwide kidney exchange.
link |
Hundreds of people are walking around alive today.
link |
And it's increased employment.
link |
You have a lot of people now running kidney exchanges
link |
and at the transplant centers,
link |
interacting with the kidney exchange.
link |
You have extra surgeons, nurses, anesthesiologists,
link |
hospitals, all of that.
link |
So employment is increasing from that
link |
and the world is becoming a better place.
link |
Another example is combinatorial sourcing auctions.
link |
We did 800 large scale combinatorial sourcing auctions
link |
from 2001 to 2010 in a previous startup of mine
link |
called CombineNet.
link |
And we increased the supply chain efficiency
link |
on that $60 billion of spend by 12.6%.
link |
So that's over $6 billion of efficiency improvement.
link |
And this is not like shifting value
link |
from somebody to somebody else,
link |
just efficiency improvement, like in trucking,
link |
less empty driving, so there's less waste,
link |
less carbon footprint and so on.
link |
So a huge positive impact in the near term,
link |
but sort of to stay in it for a little longer,
link |
because I think game theory has a role to play here.
link |
Oh, let me actually come back on that as one thing.
link |
I think AI is also going to make the world much safer.
link |
So that's another aspect that often gets overlooked.
link |
Well, let me ask this question.
link |
Maybe you can speak to the safer.
link |
So I talked to Max Tegmark and Stuart Russell,
link |
who are very concerned about existential threats of AI.
link |
And often the concern is about value misalignment.
link |
So AI systems basically working,
link |
operating towards goals that are not the same
link |
as human civilization, human beings.
link |
So it seems like game theory has a role to play there
link |
to make sure the values are aligned with human beings.
link |
I don't know if that's how you think about it.
link |
If not, how do you think AI might help with this problem?
link |
How do you think AI might make the world safer?
link |
Yeah, I think this value misalignment
link |
is a fairly theoretical worry.
link |
And I haven't really seen it in,
link |
because I do a lot of real applications.
link |
I don't see it anywhere.
link |
The closest I've seen it
link |
was the following type of mental exercise really,
link |
where I had this argument in the late eighties
link |
when we were building
link |
these transportation optimization systems.
link |
And somebody had heard that it's a good idea
link |
to have high utilization of assets.
link |
So they told me, hey, why don't you put that as objective?
link |
And we didn't even put it as an objective
link |
because I just showed him that,
link |
if you had that as your objective,
link |
the solution would be to load your trucks full
link |
and drive in circles.
link |
Nothing would ever get delivered.
link |
You'd have a hundred percent utilization.
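The anecdote can be written as a toy objective-function bug (my own construction, not the actual transportation system): score plans purely by asset utilization and the degenerate "drive in circles fully loaded" plan wins, even though nothing gets delivered.

```python
plans = {
    # name: (hours_loaded, hours_total, deliveries_made)
    "deliver_goods":    (6.0, 10.0, 8),
    "drive_in_circles": (10.0, 10.0, 0),
}

def utilization(plan):
    loaded, total, _ = plan
    return loaded / total

best_by_utilization = max(plans, key=lambda n: utilization(plans[n]))
print(best_by_utilization)            # drive_in_circles wins on utilization
print(plans[best_by_utilization][2])  # ...with 0 deliveries

# An objective that tracks what we actually want avoids the degeneracy.
best_by_deliveries = max(plans, key=lambda n: plans[n][2])
print(best_by_deliveries)             # deliver_goods
```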
link |
So yeah, I know this phenomenon.
link |
I've known this for over 30 years,
link |
but I've never seen it actually be a problem in reality.
link |
And yes, if you have the wrong objective,
link |
the AI will optimize that to the hilt
link |
and it's gonna hurt more than some human
link |
who's kind of trying to solve it in a half baked way
link |
with some human insight too.
link |
But I just haven't seen that materialize in practice.
link |
There's this gap that you've actually put your finger on
link |
very clearly just now between theory and reality.
link |
That's very difficult to put into words, I think.
link |
It's what you can theoretically imagine,
link |
the worst possible case or even, yeah, I mean bad cases.
link |
And what usually happens in reality.
link |
So for example, to me,
link |
maybe it's something you can comment on having grown up
link |
and I grew up in the Soviet Union.
link |
There's currently 10,000 nuclear weapons in the world.
link |
And for many decades,
link |
it's theoretically surprising to me
link |
that nuclear war has not broken out.
link |
Do you think about this aspect
link |
from a game theoretic perspective in general,
link |
why in theory you could see
link |
how things would go terribly wrong
link |
and somehow yet they have not?
link |
Yeah, how do you think about it?
link |
So I do think about that a lot.
link |
I think the biggest two threats that we're facing as mankind,
link |
one is climate change and the other is nuclear war.
link |
So those are my main two worries that I worry about.
link |
And I've thought about trying to do something
link |
about climate change twice.
link |
Actually, for two of my startups,
link |
I've actually commissioned studies
link |
of what we could do on those things.
link |
And we didn't really find a sweet spot,
link |
but I'm still keeping an eye out on that.
link |
If there's something where we could actually
link |
provide a market solution or optimization solution
link |
or some other technology solution to problems.
link |
Right now, like for example,
link |
pollution credit markets were what we were looking at then.
link |
And it was much more the lack of political will
link |
that made those markets not so successful,
link |
rather than bad market design.
link |
So I could go in and make a better market design,
link |
but that wouldn't really move the needle
link |
on the world very much if there's no political will.
link |
And in the US, the market,
link |
at least the Chicago market was just shut down and so on.
link |
So then it doesn't really help
link |
how great your market design was.
link |
And then the nuclear side, it's more,
link |
so global warming is a more encroaching problem.
link |
Nuclear weapons have been here.
link |
It's an obvious problem that's just been sitting there.
link |
So how do you think about,
link |
what is the mechanism design there
link |
that just made everything seem stable?
link |
And are you still extremely worried?
link |
I am still extremely worried.
link |
So you probably know the simple game theory of MAD,
link |
mutually assured destruction.
link |
It doesn't require any computation; with small matrices
link |
you can actually convince yourself
link |
that the game is such that nobody wants to initiate.
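That small-matrix argument can be written out directly (stylized payoffs of my own choosing): once retaliation is assumed automatic, holding back is a best response to holding back, so nobody wants to initiate.

```python
ACTIONS = ["hold", "launch"]
# payoff[(row_action, col_action)] = (row_payoff, col_payoff)
# Retaliation is assumed automatic, so any launch destroys both sides.
payoff = {
    ("hold", "hold"):     (0, 0),
    ("hold", "launch"):   (-100, -100),
    ("launch", "hold"):   (-100, -100),
    ("launch", "launch"): (-100, -100),
}

def is_equilibrium(r, c):
    ur, uc = payoff[(r, c)]
    return (ur >= max(payoff[(x, c)][0] for x in ACTIONS) and
            uc >= max(payoff[(r, y)][1] for y in ACTIONS))

assert is_equilibrium("hold", "hold")  # nobody wants to initiate
print("(hold, hold) is a Nash equilibrium")
```

Note that in this stylized matrix (launch, launch) is also an equilibrium, which is one sense in which the analysis is coarse-grained.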
link |
Yeah, that's a very coarse grained analysis.
link |
And it really works in a situation where
link |
you have two superpowers or a small number of superpowers.
link |
Now things are very different.
link |
You have smaller nukes.
link |
So the threshold of initiating is smaller
link |
and you have smaller countries and non nation actors
link |
who may get a nuke and so on.
link |
So I think it's riskier now than it was maybe ever before.
link |
And what idea, application of AI,
link |
you've talked about a little bit,
link |
but what is the most exciting to you right now?
link |
I mean, you're here at NIPS, NeurIPS.
link |
Now you have a few excellent pieces of work,
link |
but what are you thinking into the future
link |
with several companies you're doing?
link |
What's the most exciting thing or one of the exciting things?
link |
The number one thing for me right now
link |
is coming up with these scalable techniques
link |
for game solving and applying them into the real world.
link |
I'm still very interested in market design as well.
link |
And we're doing that in the optimized markets,
link |
but number one for me right now
link |
is Strategic Machine and Strategy Robot,
link |
getting that technology out there
link |
and seeing, as you're in the trenches doing applications,
link |
what needs to be actually filled,
link |
what technology gaps still need to be filled.
link |
So it's so hard to just put your feet up on the table
link |
and imagine what needs to be done.
link |
But when you're actually doing real applications,
link |
the applications tell you what needs to be done.
link |
And I really enjoy that interaction.
link |
Is it a challenging process to apply
link |
some of the state of the art techniques you're working on
link |
and have the various players in industry
link |
or the military or people who could really benefit from it
link |
actually use it? What's that process like?
link |
For example, in autonomous vehicles, people work with automotive companies,
link |
and they in many ways are a little bit old fashioned.
link |
They really want to use this technology.
link |
It clearly will have a significant benefit,
link |
but the systems aren't quite in place
link |
to easily have them integrated in terms of data,
link |
in terms of compute, in terms of all these kinds of things.
link |
So is that one of the bigger challenges that you're facing
link |
and how do you tackle that challenge?
link |
Yeah, I think that's always a challenge.
link |
That's the kind of slowness and inertia, really,
link |
of let's do things the way we've always done them.
link |
You just have to find the internal champions
link |
at the customer who understand that,
link |
hey, things can't be the same way in the future.
link |
Otherwise bad things are going to happen.
link |
And in autonomous vehicles,
link |
it's actually very interesting
link |
that the car makers are doing that
link |
and they're very traditional,
link |
but at the same time you have tech companies
link |
who have nothing to do with cars or transportation
link |
like Google and Baidu really pushing on autonomous cars.
link |
I find that fascinating.
link |
Clearly you're super excited
link |
about actually these ideas having an impact in the world.
link |
In terms of the technology, in terms of ideas and research,
link |
are there directions that you're also excited about?
link |
Whether that's on some of the approaches you talked about
link |
for the imperfect information games,
link |
whether it's applying deep learning
link |
to some of these problems,
link |
is there something that you're excited
link |
in the research side of things?
link |
Yeah, yeah, lots of different things
link |
in game solving.
link |
So solving even bigger games,
link |
games where you have more hidden actions,
link |
where the player actions are hidden as well.
link |
Poker is a game where really the chance actions are hidden
link |
or some of them are hidden,
link |
but the player actions are public.
link |
Multiplayer games of various sorts,
link |
collusion, opponent exploitation,
link |
and even longer games.
link |
So games that basically go forever,
link |
but they're not repeated.
link |
So extensive form games that go forever.
link |
What would that even look like?
link |
How do you represent that?
link |
How do you solve that?
link |
What's an example of a game like that?
link |
Or is this some of the stochastic games
link |
that you mentioned?
link |
Let's say business strategy.
link |
So it's not just modeling like a particular interaction,
link |
but thinking about the business from here to eternity.
link |
Or let's say military strategy.
link |
So it's not like war is gonna go away.
link |
How do you think about military strategy
link |
that's gonna go forever?
link |
How do you even model that?
link |
How do you know whether a move was good
link |
that somebody made and so on?
link |
So that's kind of one direction.
link |
I'm also very interested in learning
link |
much more scalable techniques for integer programming.
link |
So we had an ICML paper this summer on that.
link |
The first automated algorithm configuration paper
link |
that has theoretical generalization guarantees.
link |
So if I see this many training examples
link |
and I tune my algorithm in this way,
link |
it's going to have good performance
link |
on the real distribution, which I've not seen.
link |
So it's kind of interesting
link |
that algorithm configuration has been going on now
link |
for at least 17 years seriously.
link |
And there has not been any generalization theory before.
link |
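The generalization claim can be illustrated with a toy empirical configuration loop: tune a parameter on sampled training instances and hope the choice carries over to the unseen distribution. The candidate settings, the `Instance` class, and the `runtime` model below are hypothetical stand-ins, not the setup from the ICML paper.

```python
import random

random.seed(0)  # for reproducibility

CANDIDATE_SETTINGS = [0.1, 0.5, 0.9]  # e.g. a hypothetical branching-rule parameter

class Instance:
    """A stand-in problem instance with a single 'hardness' feature."""
    def __init__(self, hardness):
        self.hardness = hardness

def runtime(setting, instance):
    # Toy stand-in for solving an integer program: lower is better.
    # Performance degrades as the setting moves away from the instance,
    # plus a little noise.
    return abs(setting - instance.hardness) + random.random() * 0.1

# Sample training instances from the (unknown to the configurator) distribution.
train = [Instance(random.uniform(0.4, 0.6)) for _ in range(50)]

def configure(settings, instances):
    """Return the setting minimizing average runtime on the training samples."""
    return min(settings,
               key=lambda s: sum(runtime(s, i) for i in instances) / len(instances))

best = configure(CANDIDATE_SETTINGS, train)
print(best)  # 0.5, the setting closest to the sampled hardness range
```

The generalization theory Sandholm mentions is about exactly this gap: how many training samples are enough so that the empirically best setting is also near-optimal on the true, unseen distribution.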
Well, this is really exciting
link |
and it's a huge honor to talk to you.
link |
Thank you so much, Tuomas.
link |
Thank you for bringing Libratus to the world
link |
and all the great work you're doing.
link |
Well, thank you very much.
link |
No more questions.